Alien Reading: Text Mining, Language Standardization, and the Humanities


It’s clear that current text-mining technologies are still limited and cannot read text the way humans do. The article argues that there is little interaction between the scholars who apply these computational methods to literary history and those in media studies who critically analyze the history and culture from which these technologies emerged. For instance, scholars who use text mining in literary and cultural history often do not consider how the technologies they use might be shaped by the military and commercial contexts from which they emerged. As the article suggests, a dialogue between media studies and text mining would allow scholars to engage with these linguistic technologies in a way that keeps their alienness in sight, foregrounds their biases, and focuses on the historical and cultural context of how computers read text.

If we critically examine the texts to which these methods are applied, we see that the kinds of text these technologies parse share certain characteristics. They are primarily written in a standard dialect and orthography; they tend to privilege the informational over the aesthetic dimensions of language; and they consist primarily of prose. This is a very interesting point, since these texts are not randomly selected to be read by a parser: the technologies were designed for these specific kinds of text. Some of them are the sorts of text that the military-industrial apparatus would have a clear interest in mining. In fact, the military was partially responsible for the birth of these technologies. Users of text-mining technologies, let alone the general public, are unlikely to know that. It’s scary to learn about the powers behind technologies that ultimately reshape society as a whole. It is more frightening still that they directly and indirectly influence many aspects of our lives, and our children’s, without our awareness.

In addition to experimenting with text-mining methods, we should also focus on research that situates them historically, both in the short term, looking at the institutional contexts from which they emerged, and in the long term, looking at how they relate to the histories of linguistic thought, philosophy, communication, and labor organization. I strongly believe there should be some openness about what goes on behind the scenes of these technologies. Someone should critically examine the context and the motives behind them, not just their applications and direct impact. Further, research on the historical and cultural context of these technologies helps keep them in check (in terms of biases) and helps the general public better understand them.