I don’t think the title “Deep Text” does this book any favours – a more accurate description might be “Text Analytics within the Enterprise” – less catchy, but certainly more intelligible, and more indicative of what this book covers. From the title, you might think this is yet another business book inventing a catchphrase and spinning the idea out to 220 pages. In reality this is a detailed and thoughtful overview of the use of text analytics for content-based organisations, written by a highly experienced practitioner.
Who is this book for? It is aimed at readers who are involved in making business decisions (which means investment decisions) but who also need to understand something about the technology involved. Large organisations will have senior management who would not open a book of this kind; very small organisations will not have the resources to build anything. Somewhere in the middle is the organisation Reamy is aiming at: one trying to make sense of new technology without being able to turn to the resources of a research or semantic team. The book is aimed at a business market, but it refers throughout to good IT practices, such as lean development and build-to-fail models.
What makes the book readable is that Tom Reamy isn’t afraid to speak his mind. While most consultants have spent years learning to bite their tongue and provide the advice that the client asks for, Reamy states in no uncertain terms what he thinks has worked – and what hasn’t. For example, he is clear that “most metadata projects – particularly asking authors to add keywords to documents as they publish them into content management systems – have been failures.” That’s a bit of an indictment of a process that has been undertaken by many publishers, but it is now quite widely agreed that the result is no more than a folksonomy. But, Reamy continues, “the other component that was supposed to improve search is adding taxonomies to the mix. I have to admit that I used to believe that this was the best answer, and spent a few years developing taxonomies for organizations which, while they helped somewhat, were rarely worth the effort and time.” You have to admire the author’s honesty, as he goes on to clarify: “The basic problem was not with the taxonomy, but with trying to apply the taxonomy to documents, in other words, manual tagging with all its well-known problems.”