Graphic by Anna Vital at Adioma

Creating a startup looks so simple; an infographic can summarise it in a few steps. So why aren’t startups more successful in academic publishing?

This is a topic big enough for a whole book, but Gabe Stein of Knowledge Futures Group wrote a post about it, and Clarke and Esposito responded to it. Here are a few off-the-cuff thoughts.

First, what Stein said. Stein presents a list of errors made by startups, and his own view of the market. I’ll restrict myself to what Stein believes about academic publishing. While I’m keen to look at how the market should or might be, I’m focusing here on how the market currently is; I won’t follow Stein and “imagine a future”.

  1. Stein correctly describes academic publishing as a “byzantine” affair. Libraries pay the bills, but governments pay for research, and universities pay the salaries for researchers, who don’t directly pay for anything. Innovation, in other words, becomes a three-way or even four-way discussion.
  2. As a result, Stein describes the challenge for users of switching to new workflows as a “social switching cost”. Most startups “capitulate to the status quo”. Many startups, such as Hindawi, F1000 and Publons, are acquired by the big publishing conglomerates. This, however, is simply how business works: it doesn’t prevent innovation.
  3. Stein claims that the core product in higher education is peer-reviewed research. Here, in my opinion, is another curiosity of the academic market: many claims made about it are wide of the mark. Teaching is a vital part of the HE market. For a typical teaching-based university such as Staffs University in the UK, tuition fees represent around 90% of income. Even for Cambridge, a top-level research university, tuition fees of £339m per year represent around 30% of income (excluding investments and external companies such as Cambridge University Press). You cannot say the core product is peer-reviewed research. And even within peer-reviewed research, arts and humanities subjects are flagrantly shoe-horned into a scientific model which is often not very appropriate.
  4. Stein describes disruptive startups. Clarke and Esposito claim the most successful innovations did not disrupt the market, but their examples, such as Hindawi and MDPI, are certainly disruptive. They have made the traditional publishers realise the vast opportunity provided by open-access publishing at scale.  
  5. I don’t believe the current incumbents are complacent. I would suggest that publishers, libraries and researchers do not evaluate new technology very well, but I am encouraged by the growth of bibliometrics expertise in libraries. Perhaps more specifically, some societies and publishers have not had to worry about saving money in an era of rapidly growing real income from journal publishing.
  6. According to Stein, academic publishing is therefore not a consumer-driven market. I don’t agree. Researchers do put up with a very poor deal, frequently struggling to get access to full-text articles and books, but consumers in many markets tolerate poor products and service simply because they have no concept of anything different.
  7. Ease of use is not the key barrier to adoption. One major problem here is the lack of well-run case studies and assessments of new technology.

What’s my view? 

Most startups provide a point solution. The benefits of adopting each one are outweighed by the learning curve and the need to interrupt existing workflows. The biggest challenge for startups in academic publishing is that they do just one thing well. Companies like Connected Papers or scite.ai are excellent at what they do. But the entire research process would require 20 or 30 such individual solutions to complete the academic user journey, and that is simply not feasible. Humans do not like complexity. Until and unless one company provides a co-ordinated collection of tools, these tools are unlikely to reach mass adoption.

Because of the split between libraries, universities, researchers and publishers, it is difficult to get widespread adoption for new tools. Researchers are not used to paying for tools directly. Libraries have too little knowledge of the tools available. Publishers are too busy maintaining their current revenue streams to worry too much about improving the academic research workflow. Researchers want access to the entirety of scholarly content, but the only openly available resource indexing the bulk of academic content is provided by a third party (Google) with no guarantee of long-term continuation of the service.

Collaboration within academic publishing

Many of the core technical advances in academic publishing, in fact the things that make it possible for many of the startups to exist, are either not-for-profit collaborative initiatives, such as Crossref and COUNTER (at least in principle), or funded not-for-profit initiatives, such as Semantic Scholar and OpenAlex. However, even here, initiatives are fragmented and lack scale. For example, COUNTER provides industry-wide tools for subscription publishing, but implementation is challenging. COUNTER also missed the boat with the rise of open access, and lost the opportunity to be the default tool for measuring usage. There is far too little collaboration between the existing players. As Marcum and Schonfeld showed clearly in Along Came Google, libraries were offered the chance to create a universal digital library of academic works … and turned their backs. In my opinion, the opportunity for real change in academic publishing will come when stakeholders (publishers, libraries, institutions) recognise that collaboration is in their best interest.

XML is a standard?

If you want evidence of the lack of genuine collaboration in academic publishing, look at XML, which was developed over 25 years ago. Academic publishing requires content to be exchanged several times during the submission, publication and discovery process, so there is a need for a common standard for document exchange. One flavour of XML, JATS, is widely regarded as the closest thing to a standard for interchange of content. Yet there is no single way, even within JATS, to tag an article as a “research” article. One major text conversion company told me they are frequently given the task of converting one publisher’s JATS to another publisher’s JATS. A vast proportion of the expertise marshalled so expensively by startups is used to try to interpret PDFs and to identify the very metadata that was thrown away when the PDF was created out of something more structured.
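To make the JATS point concrete, here is a rough sketch in Python of what a downstream tool might have to do just to decide whether an article is a research article. Everything specific in it is an assumption for illustration: the two "publisher" fragments are invented, and the lists of accepted values are hypothetical, not taken from any real publisher. The only JATS features relied on are the `article-type` attribute on `<article>` and subject headings inside `<article-categories>`.

```python
import xml.etree.ElementTree as ET

# Hypothetical value lists: real publishers may use other labels entirely.
RESEARCH_ARTICLE_TYPES = {"research-article", "research", "original-article"}
RESEARCH_SUBJECT_HEADINGS = {"research article", "original research", "original article"}

# Invented example: publisher A signals the type in the article-type attribute.
PUBLISHER_A = """<article article-type="research-article">
  <front><article-meta>
    <title-group><article-title>Example A</article-title></title-group>
  </article-meta></front>
</article>"""

# Invented example: publisher B uses a generic attribute and a subject heading.
PUBLISHER_B = """<article article-type="other">
  <front><article-meta>
    <article-categories>
      <subj-group subj-group-type="heading"><subject>Original Research</subject></subj-group>
    </article-categories>
  </article-meta></front>
</article>"""


def is_research_article(jats_xml: str) -> bool:
    """Guess whether a JATS document is a research article.

    Checks the article-type attribute first, then falls back to any
    subject heading found under article-categories.
    """
    root = ET.fromstring(jats_xml)

    article_type = (root.get("article-type") or "").strip().lower()
    if article_type in RESEARCH_ARTICLE_TYPES:
        return True

    for subject in root.iterfind(".//article-categories/subj-group/subject"):
        if (subject.text or "").strip().lower() in RESEARCH_SUBJECT_HEADINGS:
            return True

    return False


if __name__ == "__main__":
    for name, doc in [("A", PUBLISHER_A), ("B", PUBLISHER_B)]:
        print(name, is_research_article(doc))
```

Both fragments are arguably valid JATS, yet the tool needs two code paths and two hand-maintained vocabularies to reach the same answer. Multiply that by every publisher and every metadata field, and the cost of the missing agreement becomes clear.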

When academia agrees some more standard workflows and processes, then innovation will become easier to implement, and change will take place more rapidly. Then, perhaps, we can go back to our infographic, and see how simple it is to get to the final stage of success, and to win the trophy.