
A new report by Ken Chad, Aggregate and amplify, looks at theses and dissertations (terms that confusingly have reversed meanings on different sides of the Atlantic).
Why is it that theses and dissertations (TDs) have been so neglected? Chad suggests it is because theses do not form part of university ranking exercises, but I think the main reason has simply been that it is difficult to access them. The report suggests several justifications for making them available:
- To prevent plagiarism
- To give an indication of current research (because they can appear more quickly than monographs or research articles)
- To see if a thesis has been written on a particular subject.
- To provide information about trends over time, e.g. which institutions are important for a particular field of research
The report goes on to list the major repositories of theses that are larger than simply at institutional level – notably, EThOS, a British Library initiative, and a commercial ProQuest collection, ProQuest Dissertations and Theses (PQDT).
Most institutions in the UK appear to make some attempt to capture and make available their own theses, usually in the Institutional Repository (IR). But IRs are focused on local collections, which limits their value to the researcher. Hence the report’s main recommendation, which is to create aggregated repositories – hence the title, “Aggregate and amplify”.
What difference does it make if theses are discoverable or not? Well, if you are accepted on a PhD course, you are supposed to carry out a literature search. It’s not easy to make sure nobody has written about your main subject if you have to check 200+ local institutional repositories to make sure – and that is simply at the UK national level.
But, you might say, PhDs are less and less important as a unique record of research – most PhDs these days simply represent a number of articles strung together for the purpose of gaining a doctorate. This is a very short-sighted view; it may be true in some scientific areas, but in humanities and social sciences (HSS) there is much less of a tradition of submitting research in the form of academic articles in this way.
The literature review is a standard component of planning a PhD. For example, the helpful website The PhDproofreaders offers a free nine-stage guide to planning a PhD, with the literature review as step 4:

However, this doesn’t specifically mention that you should check to see if anyone has written a thesis on the same subject as you.
Using the British Library EThOS service, it is simple to check if a relevant thesis exists – as long as the words are in the title. Let’s say, for example, I am going to write a thesis on Shakespeare and Machiavelli. As it happens, there is a thesis on this very subject, as I can see from the following EThOS record:

The EThOS system performs reasonably well, but has limitations. There is an advanced search interface, but this only appears to index the title and abstract fields. I checked this by noticing the Shakespeare and Machiavelli PhD referred to above included the word “sorezzatura” in the abstract. My guess is that this was an error in the OCR of the original thesis, since the Italian word is “sprezzatura”. A search for “sprezzatura” (the correct spelling in the thesis full text) retrieved no hits, while a search for “sorezzatura” (a spelling found only in the abstract) retrieved this document. I would argue that anything less than a full-text index severely limits the value of the EThOS repository for researchers. Better still would be to use some AI tools and to provide concept indexing – but that is for another post.
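The difference between the two kinds of index can be shown with a minimal sketch. This is not how EThOS is implemented; the record, its abstract and its full text are made up for illustration, and `build_index` and `search` are hypothetical helpers. It simply assumes a basic inverted index (a map from each word to the records that contain it), built once over title and abstract only and once over the full text as well.

```python
import re
from collections import defaultdict


def build_index(records, fields):
    """Map each lowercased word to the ids of records whose chosen fields contain it."""
    index = defaultdict(set)
    for record_id, record in records.items():
        for field in fields:
            for word in re.findall(r"\w+", record.get(field, "").lower()):
                index[word].add(record_id)
    return index


def search(index, term):
    """Return the sorted ids of records indexed under the given term."""
    return sorted(index.get(term.lower(), set()))


# Hypothetical record: the abstract carries the misspelling,
# the full text carries the correct Italian word.
records = {
    "thesis-1": {
        "title": "Shakespeare and Machiavelli",
        "abstract": "A study of sorezzatura in the history plays.",
        "full_text": "The courtier's sprezzatura shapes the prince's rhetoric.",
    }
}

# Index built from title and abstract only (the behaviour I observed).
metadata_index = build_index(records, ["title", "abstract"])
# Index that also covers the full text (what I am arguing for).
full_text_index = build_index(records, ["title", "abstract", "full_text"])

print(search(metadata_index, "sprezzatura"))
print(search(metadata_index, "sorezzatura"))
print(search(full_text_index, "sprezzatura"))
```

With the title-and-abstract index, only the misspelt word finds the thesis; once the full text is indexed, the correct spelling finds it too.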
Next steps
The University of Surrey did a deal with ProQuest: “We were approached by Cathy Boylan about possibly being involved in the ProQuest historical digitisation project … They wanted to harvest from our repository and also offered to scan up to 2,000 of our theses which we had not got around to yet”. But I would guess that those 2,000 theses will be available on the ProQuest platform, and perhaps on the University of Surrey local collection, but they will not be made available on any direct competitor to the ProQuest PQDT product, for example on the EThOS platform at the British Library. [Note: Ken Chad pointed out to me after this post was written that the digitised texts are in fact available on EThOS]. A better long-term goal would be to identify a funding model that enabled theses to be progressively digitized and made available to all researchers on an aggregate platform. A project for JISC, perhaps?
So to conclude, I agree whole-heartedly with Ken Chad that we should “aggregate and amplify”. My recommendations for individual institutions are:
- To ensure that all PhDs provided by that institution are offered on condition that the thesis full text is deposited in the local institution and in EThOS
- To remove all embargoes on publication apart from cases where the identity of people referred to in the thesis has to be protected (in which case the thesis could be redacted)
- To use funding to promote full public access rather than furthering the strategies of proprietary repository owners.
M. Gibney
What about the Networked Digital Library of Theses and Dissertations (NDLTD)? It’s free and open and anyone can include their IR’s ETDs in it for aggregation. It has existed for many years already and solves the problem (if everyone uses it). http://search.ndltd.org/
Michael Upshall
Thanks for this reference. Ken Chad did mention NDLTD in his report, and I should have included it. The best solution for researchers is to have one world-wide repository, and all credit to NDLTD for providing such a tool.
Ken Chad
Thanks Michael - great grist to the mill. One correction – and sorry this wasn’t clear in the paper – concerning your comment about the 2,000 digitised theses at the University of Surrey. Although the work was undertaken by ProQuest, they only ever retain *non-exclusive rights*, and the author or the institution can make theses available to *any* other third parties, including of course the British Library’s EThOS service. Indeed the fact they are included in the Surrey Institutional Repository means they are also included in *both* the (global) ProQuest database and the (national) EThOS service. I believe that ‘amplification’ beyond the UK (e.g. EThOS) is an important benefit to authors.
Michael Upshall
Thanks for this clarification, Ken. Clearly the Surrey digitisation project is a benefit to all users!
Michael Upshall
One update from a quick comparison of two of the largest thesis repositories, NDLTD and ProQuest PQDT. NDLTD contains 5,990,596 theses and dissertations, while PQDT claims to have “nearly five million”. Nonetheless, PQDT claims to be the “world’s most comprehensive curated collection of graduate dissertations and theses”. Clearly, they are both very big. Note that NDLTD appears to be an index of theses, rather than a collection of full text. I found a reference, for example, to Germaine Greer’s doctoral thesis described above, but the entry was simply a link to the full text available at Cambridge.