TrendMD is (as its website states) “a content recommendation engine for scholarly publishers, which powers personalized recommendations for thousands of sites”. An interesting blog post by Matt Cockerill of TrendMD (published February 2016) claims “TrendMD’s collaborative filtering engine improves clickthrough rates 272% compared to a standard ‘similar article’ algorithm in an A/B trial”. That sounds pretty impressive. What is collaborative filtering and is it improving TrendMD’s results?
In the experiment described in the blog post, TrendMD was compared to PubMed’s similar articles feature using a controlled random test (users didn’t know which method was being used to create the related articles displayed), and TrendMD was shown to increase clickthroughs quite dramatically.
It is clear from the graphic that click-throughs increased, but were they to the most relevant articles? The blog compares this against PubMed’s “similar articles” feature (not actually related citations, as the graphic states), which is generated using word counts, and explains that collaborative filtering instead makes use of click data to predict which links are likely to be most useful.
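To make the distinction concrete, here is a minimal sketch of item-based collaborative filtering of the general kind the blog describes. This is not TrendMD’s actual algorithm, and the article names and click sessions are invented for illustration: the idea is simply that articles clicked by the same readers in the same session get recommended together, regardless of how similar their words are.

```python
from collections import defaultdict

# Hypothetical click sessions: each set holds the articles one reader
# clicked during a single visit. These are made-up article labels.
sessions = [
    {"A", "B", "C"},
    {"A", "B"},
    {"B", "C", "D"},
    {"A", "D"},
]

# Count how often each ordered pair of articles was clicked in the
# same session. This co-click matrix is the "collaborative" signal.
co_clicks = defaultdict(int)
for session in sessions:
    for x in session:
        for y in session:
            if x != y:
                co_clicks[(x, y)] += 1

def recommend(article, k=2):
    """Rank other articles by how often they co-occur with `article`,
    breaking ties alphabetically so the result is deterministic."""
    scores = {y: n for (x, y), n in co_clicks.items() if x == article}
    return sorted(scores, key=lambda y: (-scores[y], y))[:k]

print(recommend("A"))  # → ['B', 'C']: B co-occurs with A twice, C once
```

Note that nothing here looks at the text of the articles at all, which is exactly why such a system can surface the “coffee beans” alongside the “coffee maker”: it recommends whatever readers of an article went on to click, whether or not it is semantically close.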
The article states:
it is clear that the most useful further reading links are not always the most semantically related. Indeed, if articles are too closely related, there may be diminishing returns from discovering more articles in precisely the same niche. By analogy, if I’ve just bought a coffee maker, I probably don’t want to buy another coffee maker, but I may well be interested in buying coffee beans, or descaler.
This comment appears to muddle very different ways in which the Internet operates. Academic researchers reading about (say) kidney disease are very interested in more articles about kidney disease. That may not be what you or I would click next, but we aren’t researchers. They aren’t in the least interested in reading about things that sufferers of kidney disease might be interested in, such as pain relief. In other words, that word “useful” raises a question: useful for whom? Did anyone ask researchers whether the links they clicked were more or less useful?