A fascinating webinar by MyScienceWork examined this question, and certainly provided some answers, some directly, and others by implication. The webinar is available on YouTube, but if you are in a hurry, I can tell you the panel’s answer, which is yes, it can; but it emerged during the session that it will require a lot more thought on the part of practitioners for AI to have a more fundamental impact beyond simple checking.
One obvious conclusion of the session is that even within the relatively constrained area of academic publishing, there is too much going on to be covered in one webinar. Phil Gooch’s slide showing tools in the sector could have had two or three times the number of companies for each area, and still not be complete.
The panel was well balanced: an AI solutions developer (Phil Gooch), a user (Mirjam Eckert, Chief Publishing Officer at Frontiers) and the STM Director of Research Integrity, Joris van Rossum. The webinar format was unusual; instead of each participant presenting one after the other, the webinar comprised several questions, answered by each panellist in turn, thereby splitting the session into smaller chunks, a format that worked quite well.
The webinar focused on the current vogue in AI, based around pattern recognition, whether of images or words. This made the subject easier to follow, although even this limitation presented an almost unmanageable range to cover, since AI can be used everywhere from submission to discovery (full disclosure: I work for a company that produces these kinds of tools).
Mirjam Eckert described the in-house system used at Frontiers, covering both technical checks of papers and finding reviewers. She has the luxury of an in-house development team building custom tools for the company’s own use; Joris van Rossum pointed out that few publishers will have that capability, so some kind of co-operation is vital.
Phil Gooch clarified how the most effective AI systems currently combine rules and pattern recognition; in other words, there is some human decision-making going on. For example, a system might count the number of self-citations by an author in a submission. Of course, the problem is to decide how many self-citations is too many. Three? Five? Ten? The machine finds whatever you ask it to, but a human has to make a decision about what is significant, usually enshrined in a rule (for example, “more than five self-citations results in a fail”).
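That rules-plus-pattern-recognition split can be made concrete with a minimal sketch. The function names, reference format, and the threshold of five are all hypothetical illustrations, not any vendor’s actual system:

```python
def count_self_citations(author: str, references: list[str]) -> int:
    """Pattern-matching step: the machine counts how often the submitting
    author's name appears in the reference list."""
    return sum(1 for ref in references if author.lower() in ref.lower())

def self_citation_check(author: str, references: list[str],
                        max_allowed: int = 5) -> bool:
    """Rule step: a human-chosen threshold decides what counts as 'too many'.
    Returns True if the submission passes the check."""
    return count_self_citations(author, references) <= max_allowed

refs = [
    "Smith, J. (2019). Prior work on X.",
    "Smith, J. (2020). More prior work on X.",
    "Jones, A. (2018). Unrelated work.",
]
print(self_citation_check("Smith, J.", refs))  # True: 2 self-citations is under the threshold
```

The counting is mechanical; all of the judgement lives in `max_allowed`, which is exactly the human decision the panel described.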
For the most part, the webinar painted an optimistic view of AI. The tools are there, people use them, what’s the worry?
Mentioned, but not, I think, given the significance it deserves, was the importance of a corpus with which to train the system. Frontiers benefits from a vast corpus of its own publications, although Ms Eckert claimed their corpus was based on 233 million publications, three times the size of Web of Science or Scopus, and about double most people’s estimate of the number of academic publications in existence. But even if there are 233 million academic publications, such a vast corpus may not give perfect results, as will be seen below.
The message of the webinar looked pretty clear: yes, AI can change the scholarly landscape. Yet there were a couple of clouds on the sunny AI horizon. Joris van Rossum showed some text-heavy slides about ethics in AI, which weren’t very intelligible, but his very job title at STM was revealing: Director of Research Integrity. As you get beyond simple technical checks (and even these are never 100% right or wrong), questions of bias start to enter. This is nothing to do with AI, but a lot to do with human assumptions and decision-making. At STM, the Research Integrity group has produced a white paper and some slides about “AI ethics in scholarly communication”.
Phil Gooch was notable for not ascribing all the magic to AI; he made it clear there are a lot of rules applied. In response to a question, “Can AI predict the impact of an article?”, he answered that you have to be very careful about how the AI is applied. This is a subject for a webinar in its own right, but will be fundamental to the adoption of AI in scholarly publishing (and everywhere else, for that matter). If AI comprises pattern matching and rules, then it all comes down to how those rules are applied. You could analyse millions of articles, for example, and propose the following logic:
- Articles with higher citations tend to be published in higher-ranking journals
- Therefore, my system will check the author of each new submission. If they have a track record of publishing in higher-ranking journals, I will boost the predicted impact of the submission.
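The two-step logic above can be sketched in a few lines. The journal names, scores, and boost factor are all made up for illustration; the point is how directly the rule rewards a prior track record:

```python
# Hypothetical set of "higher-ranking" journals.
HIGH_RANKING = {"Journal A", "Journal B"}

def predict_impact(base_score: float, publication_history: list[str]) -> float:
    """Rule step: authors with a track record in high-ranking journals get
    their predicted impact boosted. New authors can never earn the boost,
    which is how the status quo gets baked in."""
    track_record = sum(1 for j in publication_history if j in HIGH_RANKING)
    return base_score + 2.0 * track_record  # arbitrary additive boost

print(predict_impact(10.0, ["Journal A", "Journal C"]))  # established author: 12.0
print(predict_impact(10.0, []))                          # new author: 10.0
```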
What’s wrong with that? It will probably predict impact quite well. But it is a classic example of baking in the status quo. It makes it far more difficult for new authors to have an impact. It might reflect how authors are evaluated currently: I’m sure one of the factors used by many researchers when evaluating a paper is where the author has published, but does it represent how authors should be evaluated? Another example is identifying potential reviewers for a submission. Any corpus-based reviewer finder will inevitably reflect the bias of the corpus on which it is based. Even today, most science researchers are male. Any reviewer finder is likely to generate a predominance of male reviewers. With thought, it is possible to understand these biases and avoid them, but it requires an awareness of the kind of corpus-based AI being used.
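A toy illustration, with made-up names and proportions, of how a corpus-based reviewer finder reproduces the skew of its pool, and how a deliberate constraint can counter it:

```python
# Hypothetical pool mirroring a field where ~70% of researchers are male;
# here the matcher's candidate ordering happens to put the male candidates first.
pool = [("Reviewer %d" % i, "male") for i in range(70)] + \
       [("Reviewer %d" % i, "female") for i in range(70, 100)]

def naive_shortlist(pool, k=10):
    """Unconstrained matcher: takes the top-k candidates, with no awareness
    of the demographic skew of the pool it draws from."""
    return pool[:k]

def balanced_shortlist(pool, k=10):
    """Bias-aware variant: interleave candidates by gender so neither group
    dominates the suggestions."""
    males = [p for p in pool if p[1] == "male"]
    females = [p for p in pool if p[1] == "female"]
    out = []
    for m, f in zip(males, females):
        out.extend([m, f])
        if len(out) >= k:
            break
    return out[:k]

naive = naive_shortlist(pool)
balanced = balanced_shortlist(pool)
print(sum(1 for _, g in naive if g == "male"), "male of", len(naive))        # 10 male of 10
print(sum(1 for _, g in balanced if g == "male"), "male of", len(balanced))  # 5 male of 10
```

The correction is trivial once the skew is noticed; the hard part, as the panel implied, is knowing to look for it in the first place.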
Decisions about bias are too important to leave to developers; they should be made by the staff using the tools. In fact, Phil Gooch suggested a kind of AI Bootcamp, at which publishers (not just the technical people creating the tools) could become familiar with the implications of the AI tools being used on their content. There is no shortage of books pointing out the limitations of many present-day AI implementations (for example, books by Cathy O’Neil and Kate Crawford), but too often these books are so vitriolic that you feel we might be better off if AI had never been invented. I can’t help feeling that an AI Bootcamp, with a dispassionate presentation of how pattern recognition works and how it can reveal bias, would be far more effective at promoting an ethical AI than trying to create a set of best-practice principles for scholarly use of AI via committees.