
The Journal Impact Factor has been discussed and criticized for years. A recent Scholarly Kitchen article looks at yet another proposal for improving it (Optical Illusions, 21 July 2016). This is by no means the first suggested improvement to the metric – a search on Scholarly Kitchen itself reveals several posts on this topic each year.

Perhaps the biggest problem with the Journal Impact Factor is this: most journals, from Nature to the smallest title, show a similar shape when the citation counts of their individual articles are plotted. A few articles are cited a great deal, followed by a very long tail of articles that receive few or even zero citations. We all know this, yet we persist in believing that a Journal Impact Factor is in some way representative of each article in that journal.

It has been well known for years that the Journal Impact Factor is a poor measure of research. One famous article by Per Seglen, “Why the impact factor of journals should not be used for evaluating research”, is still valid today. Seglen’s criticisms, written almost 20 years ago, concern the wide distribution of citations among the articles of a single journal. He found that “the most cited 15% of the articles account for 50% of the citations, and the most cited 50% of the articles account for 90% of the citations. In other words, the most cited half of the articles are cited, on average, 10 times as often as the least cited half.” Seglen illustrates this point with a graph that looks very similar to the graphs in the 2016 Scholarly Kitchen post:

[Figure: Citation rates for three journals by article, 1983-84]

As Seglen says, “for a journal’s impact factor to be reasonably representative of its articles, the citation rate of individual articles … should show a narrow distribution around the mean value (the journal’s impact factor).” But as you can see, the distribution of article citations is heavily skewed. Using a single mean value to stand for all the articles is simply not valid. A mean works well for a roughly symmetric distribution such as human height, but not for the citation counts of all the articles in one journal.
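To make the point concrete, here is a minimal sketch, using simulated and purely illustrative numbers rather than Seglen’s actual data, of how the mean of a heavily skewed citation distribution sits far above what the typical article receives:

```python
# Minimal sketch with simulated, illustrative numbers (not Seglen's data):
# compare a journal-level mean with what the typical article receives when
# the citation distribution is heavily skewed.
import numpy as np

rng = np.random.default_rng(seed=42)

# Assume 1,000 articles whose citation counts follow a log-normal
# distribution, a common stand-in for skewed citation data.
citations = rng.lognormal(mean=1.0, sigma=1.2, size=1000).astype(int)
citations = np.sort(citations)[::-1]  # most cited first

mean_citations = citations.mean()        # the "impact factor" style average
median_citations = np.median(citations)  # what the typical article gets

# Share of all citations earned by the most cited 15% of articles.
top_15 = citations[: int(0.15 * len(citations))]
share_top_15 = top_15.sum() / citations.sum()

print(f"Mean (journal-level average): {mean_citations:.1f}")
print(f"Median article:               {median_citations:.1f}")
print(f"Top 15% of articles collect {share_top_15:.0%} of all citations")
```

With a distribution like this, the mean lands well above the median, so the journal-level figure says little about any individual article.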

Although Seglen’s graphic dates from the 1980s, the pattern of skewed distribution is very similar today (there are further examples in the Scholarly Kitchen post). Why, then, do Journal Impact Factors persist? One possible reason is that publishers love them. If a publisher can provide a metric showing that its journal is cited more than other journals, then institutional libraries will subscribe to it. Useful for publishers’ sales, then, but not useful for academics.

Could we switch to article impact factors? Interestingly, an article-level metric already exists: the Relative Citation Ratio (RCR). Launched at the end of 2015, it was created by the federally funded National Institutes of Health (NIH), and the paper explaining it was deposited on the preprint site bioRxiv (not in a subscription journal). The RCR improves on the Journal Impact Factor by being article-based rather than journal-based, and it also overcomes another limitation of impact factors: it is the number of citations a paper receives divided by the average number of citations an article normally gets within that field. In other words, it sets an article’s citations against a calculated norm for its discipline – a relative ranking that makes sense within disciplines, although it would not be valid across disciplines.
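The core of the RCR, as described above, is a simple ratio. The sketch below illustrates that idea with invented field averages; the real NIH method is more elaborate (it uses each article’s co-citation network to define its field), so this is only a schematic of the principle, not the published algorithm:

```python
# Schematic of the idea behind the Relative Citation Ratio: an article's
# citations divided by the citation rate that is typical for its field.
# The field averages below are invented for illustration only; the real
# NIH RCR derives its field benchmark from a co-citation network.

field_average_citations = {
    "cell biology": 9.4,   # hypothetical average citations per article
    "public health": 4.1,  # hypothetical average citations per article
}

def relative_citation_ratio(article_citations: float, field: str) -> float:
    """Citations of one article relative to the norm for its field."""
    return article_citations / field_average_citations[field]

# Two articles with the same raw citation count rank differently,
# because their fields cite at very different rates.
print(round(relative_citation_ratio(12, "cell biology"), 2))   # 1.28: just above the field norm
print(round(relative_citation_ratio(12, "public health"), 2))  # 2.93: well above the field norm
```

Because the denominator is field-specific, the same raw citation count can mean very different things in different disciplines, which is exactly why the ranking holds within a discipline but not across disciplines.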

Sensibly, the NIH has called for researchers to improve it. The RCR was immediately criticized as being valid only within biomedicine (which is what it was set up to measure), but it is nonetheless already a great improvement on the Journal Impact Factor. It seems to be gaining acceptance: it was adopted by all the Digital Science companies (including ReadCube and Figshare) in July 2016.

If citation indexes switched to counting citations by article rather than by journal, this would also, incidentally, eliminate the problem revealed by Phil Davis in another Scholarly Kitchen post: Elsevier and Thomson Reuters define an “article” differently, which leads to major differences between the Journal Impact Factors produced by the two companies’ rival interpretations.

What would be the effect of removing the Journal Impact Factor? Most importantly, it would no longer matter in which journal an article was published. If the academic community believes in citations above all else, then let’s not cling to the superseded idea that publishing with a more prestigious publisher makes an article better.

If the idea of a Journal Impact Factor is discredited, why do so many people persist in trying to fine-tune it when a journal-based metric can never be representative? And why did the new RCR metric not get reported in Scholarly Kitchen? Is it perhaps because it might further weaken the influence of publishers selling subscription-based journals to institutions?