City University

The biennial ISKO-UK conference, held this year at City University, London, provided a good opportunity to assess the state of things in knowledge management. Of course, it’s not possible to summarise all the sessions, nor would I want to. Instead, I hope to give a feel for the way information professionals currently see the world, particularly in light of recent developments in AI.

The format of the conference was for the most part predictable: plenary sessions followed by parallel strands for more specialised topics. The most interesting of all, though, were the round-table sessions, a format I was not familiar with. Each 45-minute session was not a presentation but a discussion, with the presenter responding to questions. This proved to be a surprisingly lively affair: the sessions were never intended to become lectures, and I certainly found the format very stimulating – you could pursue questions much more thoroughly than in a plenary session.

So, what did we learn about how information professionals think in 2019?

First, there is still a detachment between the information professional and the subject specialist. To some extent, the separation between IT and the understanding of customers in organisations has been bridged by having product managers, but I see little sign of this in information management. With some exceptions (notably Patrick Lambe’s fascinating case studies of information failure), I detected a woeful concentration on building tools rather than understanding what those tools would be used for.

The clearest indication of this divide is the continuing assumption, unstated at this conference, that the answer to any information problem is to build a new taxonomy or, second best, to review an existing one. When I asked one presenter, a taxonomist, why she was building a new proprietary subject taxonomy in a field where process was as important as static topics, she was adamant that her job was to build taxonomies, not to answer questions about the subject matter. If information professionals are arbitrarily detached from normal business processes in this way, it’s perhaps not surprising that their solutions are all about building classification systems.

Some presentations showed a real interest in users and their behaviour. Marianne Lykke, of Aalborg University, presented the findings of a fascinating survey of enterprise search in one organisation. The people she interviewed were not full-time information professionals, but staff members with knowledge of the domain and plenty of expertise in searching. Intriguingly, their searches focused on people, even when they were looking to find out about topics: these people might be experts within the organisation, who were presumed to know something about (or to have written something about) the topic in question. It would be fascinating to know whether searching for people is a common tactic in enterprise search.

As for linked data, the common topic was how to build it, not what it was being used for. Two information professionals from universities confidently stated that linked data was the solution to providing synonym searching. It may be one answer, but it’s a mighty expensive and elaborate way of solving a simple problem. The attendees’ confidence in linked data struck me as odd: these people were not going to be the ones writing SPARQL queries.

One information manager at a corporation described his company’s worldwide presence and multiple websites. One challenge they have is that users cannot find the online training courses run by the company, because there is no single search term in use: it could be “online learning” or “e-learning”, for example, and as a result users often did not find the relevant course. His solution was to create a linked data resource and to teach all the company’s course providers to use the same term when describing courses. That seemed like a sledgehammer to crack a nut.
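To make the scale of that sledgehammer concrete, here is a minimal sketch of the linked-data route to synonym searching, using Python’s rdflib library. The vocabulary, URIs and labels are invented for illustration, not taken from the presentation.

```python
from rdflib import Graph

# A hypothetical SKOS vocabulary: one concept with a preferred label
# and the synonym recorded as an alternative label.
TTL = """
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <http://example.org/> .

ex:elearning a skos:Concept ;
    skos:prefLabel "e-learning"@en ;
    skos:altLabel  "online learning"@en .
"""

g = Graph()
g.parse(data=TTL, format="turtle")

# Expand a user's search term to every label of the matching concept,
# using a SPARQL 1.1 property path over both label properties.
QUERY = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?label WHERE {
    ?concept (skos:prefLabel|skos:altLabel) "online learning"@en ;
             (skos:prefLabel|skos:altLabel) ?label .
}
"""

for row in g.query(QUERY):
    print(row.label)  # prints "e-learning" and "online learning"
```

By contrast, a one-line synonym map in the search engine’s configuration – simply treating “online learning” and “e-learning” as equivalent terms – would crack the same nut with none of the RDF machinery.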

The keynote presentation, by Jem Rayfield, now at Ontotext, was not about the why of linked data, but was revealing nonetheless. He admitted the challenge of using SPARQL for querying linked data (his words were “SPARQL and RDF are perceived to be complex, difficult and unstable”), but in recommending GraphQL, a newer query language, he both damned the old tool (“because I don’t have years to write proper SPARQL”) and at the same time revealed major differences between the way SPARQL and GraphQL work. I wasn’t sure if this was a plug for GraphQL or not, but it didn’t inspire me. The only example of a search he carried out was to look for news stories entitled “Russia meddles in US elections”, with the Daily Beast as the first hit. It didn’t look like a very likely use case.
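Those differences are easier to see side by side. The sketch below expresses the same hypothetical request – news articles mentioning a topic – in both languages; the vocabulary, schema and field names are invented, not Rayfield’s.

```python
# SPARQL matches triple patterns against an open graph: anything in
# the data that happens to fit the pattern comes back.
SPARQL_QUERY = """
PREFIX ex: <http://example.org/>
SELECT ?headline WHERE {
    ?article a ex:NewsArticle ;
             ex:mentions ex:Russia ;
             ex:headline ?headline .
}
"""

# GraphQL selects named fields from a fixed, typed schema: you can
# only ask for what the schema's resolvers have been written to serve.
GRAPHQL_QUERY = """
{
  newsArticles(mentions: "Russia") {
    headline
  }
}
"""

print(SPARQL_QUERY)
print(GRAPHQL_QUERY)
```

The convenience of the GraphQL form comes at the price of that fixed schema, which is presumably why the differences Rayfield demonstrated run so deep.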

Revealingly, even research students concentrated on the “how” rather than the “why”. Ludi Price described her PhD, three years of analysing the folksonomies of fandom websites, specifically one populated by readers of Marvel comics. The particular site she studied included moderators (called “tag wranglers”), but those moderators were very clear that their role was to “describe not prescribe” user tags: “we can’t change tags; we can’t tell users how to tag in any official capacity”. So even if the user-supplied tags were in the wrong field, the moderators would not change them. The question I wanted to ask was: why do users contribute to these sites; why spend so much time posting comments about comics? But this was never answered. The process of tag wrangling was described in the literature either as “curated folksonomy” or “democratic indexing”. I’m not sure I would describe it as either of those.
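Whatever you call it, the wranglers’ “describe not prescribe” rule has a simple computational shape: variant tags are linked to a canonical tag for retrieval, while the user-supplied tags themselves are never edited. A minimal sketch, with invented tags and works:

```python
# The wranglers' synonym table: variant tag -> canonical tag.
WRANGLED = {
    "ironman": "Iron Man",
    "Iron-Man": "Iron Man",
    "tony stark": "Iron Man",
}

works = [
    {"title": "Work A", "tags": ["ironman", "angst"]},
    {"title": "Work B", "tags": ["Iron-Man"]},
]

def canonical(tag):
    """Map a user tag to its canonical form; unknown tags pass through."""
    return WRANGLED.get(tag, tag)

def search(tag):
    """Find works whose tags wrangle to the requested canonical tag."""
    return [w["title"] for w in works
            if any(canonical(t) == tag for t in w["tags"])]

print(search("Iron Man"))  # ['Work A', 'Work B'] – originals untouched
```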

For me, the most remarkable discovery came from a participant from a software company working with Cochrane on developing new tools for systematic reviews. The Cochrane name is very well known, and how they organise their searches will be highly influential on the industry. What seemed to be implied was alarming.

The team are looking to remove the ambiguity of natural language – in this case, in clinical trials and reports of new drugs – by imposing a curated, machine-readable classification system so tightly constructed that, its architects claim, it will give better results than searching the full text.
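It is worth seeing why such a claim is even plausible. The toy sketch below contrasts naive full-text matching with search over curated annotations; the records and fields are invented, and real clinical schemas are far richer.

```python
records = [
    {
        "title": "Trial of drug X in hypertension",
        "annotations": {"intervention": "drug X", "condition": "hypertension"},
        "full_text": "...patients previously given aspirin were excluded...",
    },
    {
        "title": "Aspirin for secondary prevention",
        "annotations": {"intervention": "aspirin", "condition": "stroke"},
        "full_text": "...aspirin 75 mg daily versus placebo...",
    },
]

def full_text_search(term):
    # Naive full-text match: hits both records, because the first one
    # merely mentions aspirin in its exclusion criteria.
    return [r["title"] for r in records if term in r["full_text"]]

def annotation_search(term):
    # Curated match: only records where aspirin IS the intervention.
    return [r["title"] for r in records
            if r["annotations"]["intervention"] == term]

print(full_text_search("aspirin"))   # two hits, one spurious
print(annotation_search("aspirin"))  # one precise hit
```

Precision of that kind is the whole argument; what it quietly gives up is everything the annotators did not think to encode.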

Here we have come full circle. Forty or more years ago, when digital content first became available for searching, the industry was transformed by the existence of full text: no longer would we be dependent on abstracts and keywords. Today, one prestigious organisation is saying that the authority of its in-house process is so great that it is preferable to the full text. In fact, they would prefer searchers not even to access the full text, because they are so confident of the quality of their tagging. They will have machines to carry out the annotation, and “citizen science” volunteers to annotate the machine output. When information managers position themselves as the only route to search and discovery, I think something has gone wrong in the world of information retrieval.