Illustration by William Warby (Unsplash)

This short guide, in the “What Everyone Needs to Know” series from Oxford University Press, provides an outline of bibliometrics: measuring the impact and value of research. How to measure research is a topic that has been discussed and argued over for many years, so it certainly warrants a round-up of current thinking and practice. Did it provide me with all I wanted to know, as the series title suggests? At first glance, yes. The authors, Cassidy Sugimoto and Vincent Lariviere, are both academics in information science, with several papers to their credit.

For a short book, it is rather dense. In an attempt to cover a lot of ground, it packs very brief coverage of many of the ideas and principles behind bibliometrics into a few pages. The result is a rather breathless, whistlestop tour that frequently mentions names without having the space to explain them in detail. I don’t object to the concision, but I think the arrangement of the material makes it more difficult to use well, and the argument omits some key points. As I read through the book, I started to see some areas where it could be improved.

How the book is arranged

The book is in four parts: The Basics; The Data; The Indicators; and The Big Picture. The last part reviews the earlier three parts and adds some perspective, such as revealing limitations to the metrics already described. This arrangement makes the book rather unwieldy: important topics are often covered twice, in separate locations. For a book that will be consulted as much as read through, it means readers might not notice an important qualification. Many users will use the index to find the first reference and believe that the subject has been covered, but not see how the last section continues the discussion about limitations and drawbacks. And, of course, an index is only useful if it includes what is discussed in the text. One topic missing from the index is ORCID.

ORCID – a mixed message

ORCID is an example of the dual approach described above. It is discussed in the text on page 103, and then on page 130, but neither reference is in the index. Worse, ORCID is described on page 103 as a limitation rather than a benefit, yet on page 130 ORCID is described as a “success story”. Which is it? Were different sections of the book perhaps written separately by the two authors, without reconciling the text when they were brought together? I thought that ORCID was today accepted as a universal benefit for the academic community. Many researchers have similar or identical names, and ORCID – a free service that enables authors to be credited for their work without ambiguity – is the best way to resolve this problem. Sugimoto and Lariviere even describe the problem of ambiguous names on page 27. So why is using ORCID a limitation?

The assignment of publications to individuals appears through ORCID … this provides one limitation to the service [Impactstory], in that authors must both have an identifier within ORCID and curate their profiles within the ORCID platform, in order to ensure accuracy of their profile on Impactstory. [page 103]

Is this a limitation? Why should a guide to measuring research not promote the widely accepted, non-commercial tool for disambiguating names? It’s a strange tactic in a book that elsewhere talks about the problem of ambiguous names. AI tools should make use of industry standards where they exist. It’s almost as if, in an attempt to be neutral, the authors refrain from recommending good practice.

Should we trust authors?

Apparently not, since (page 33) the authors criticize Google Scholar’s feature by which authors can edit their own details:

Each author has the ability to curate his or her own list—merging duplicate records and deleting inaccurate records. However, there is relatively little incentive to do this, as merging and deleting both reduce the productivity and impact scores associated with the scholar’s record.

If you believe all authors are hell-bent on achieving the highest number of citations, regardless of veracity, you might write this. Yet Sugimoto and Lariviere state the opposite case on page 124:

Those who produce the scholarship are in the best position to evaluate the accuracy of their own records.

I would agree, which makes the statement on page 33 all the more mystifying.

Google Scholar

It seems unfair to criticize Google Scholar because its statistics for articles and citation rates change, since this, as the authors point out (page 33), is due to deleting inappropriately attributed sources and merging duplicates. Yet there is plenty of evidence for fake journals existing in the scholarly record for several years (and, presumably, being indexed by WoS and Scopus) – see a Scholarly Kitchen article describing one example.

COUNTER

You might think that COUNTER is a pre-eminent tool for the measurement of research. It is accepted by libraries and publishers as a (hopefully) authoritative record of the number of times an article, book or chapter has been accessed and/or read in full. But this book makes only one mention of COUNTER: on page 102, where COUNTER is not explained, and all we are told is that it is an “older metric”. I would have thought COUNTER would require a full explanation in any introductory guide to measuring research.

Digital Object Identifiers (DOIs)

Similarly, DOIs seem to come in for criticism:

In particular, the reliance on unique identifiers obscures a large proportion of the conversation around research that occurs on social media without being tied to a DOI (page 105)

Yet elsewhere (for example, on page 15) the use of social media as a bibliometric tool is questioned.

Open Access

The index does not mention open access, although OA is discussed in the text, and it is difficult to imagine any consideration of bibliometrics that does not take open access into account. One finding is noted in passing: that open access articles tend to have more citations (page 70). You would think this a significant finding that justifies open-access publishing, but there is no further mention of it. I would have thought bibliometrics should be not just about measurements, but also about identifying what conclusions can reliably be drawn from that evidence.

These examples lead me to question the fundamental premise of the book. Although many types of citation metric are mentioned and explained, the authors seem at times to avoid the widespread movement by many researchers towards a more equitable science. By equitable is meant, for example, supporting open access publishing because it provides content free at the point of use, which enables a much wider readership, including professionals, journalists, and independent scholars. Is this book simply to describe metrics, or to suggest ways in which metrics can reinforce attitudes and bias or demonstrate wider access?

Journals or articles?

Do researchers want to know which journal an article appeared in, or are they more interested in the subject, regardless of where the article appeared? Sugimoto and Lariviere suggest the former:

Journal-level classifications provide more coherent classifications that model onto most scholars’ understandings of the knowledge space. [page 49]

Should we provide the measurements that many (perhaps most) researchers ask for, or is there a case for mentioning some of the many arguments that have been made against journal-level metrics, for example that they introduce an expectancy bias? An article does not become more trustworthy because it was published in Nature, one would hope.

References in this book

For a work that is all about the scholarly method, Measuring Research is surprisingly lacking in citations – there are none. The authors explain:

Due to the style of the series, we omitted explicit references from the chapters, but provide below full references to all the works that informed the writing of this book.

That is simply not the case. For example, “initial research suggests that altmetric indicators provide more timely … results [p65].” Which initial research? How can I find out? I cannot see which source was used for this statement. Again, on page 116: “Studies across time and discipline have consistently shown …” but it is up to me to try to discover what these studies might have been.

Conclusion

This is a good introduction, but it could, I believe, have been better. I would suggest four principles to inform the practice of bibliometrics:

  1. Identify the best tools available and point out their strengths and weaknesses.
  2. Assume that authors will generally act in the interests of research rather than expecting the worst.
  3. Point out that metrics are not value-free. Do researchers want journal metrics or article metrics, or perhaps some other form, such as author metrics?
  4. Lobby for better tools, especially tools such as ORCID and DOIs that are not owned by one commercial provider and that will improve the quality and reliability of indicators over time.