Assignment 18
Hearst: What is Text Mining?
Hearst distinguishes between true text mining and text mining as it is currently practiced. True text mining would discover previously unknown patterns and interpret context within articles, while current text mining is very similar to data mining and only gathers known facts about an article (word frequencies, etc.).
Contrast in Text Mining over a 20-Year Span
Text mining as described in Hearst's "Untangling Text Data Mining" was still frequency- and pair-based: algorithms only looked for specific words within the data and presented that information back to the user. Although this was carried out on text, these algorithms were no more sophisticated than data mining. There was little to no use for them in the humanities, since the results were still largely fact-based. Judging from Binder’s present-day article, the field of text mining has made incredible strides, as algorithms are now able to connect sentiments to specific portions of text with respectable accuracy. This newer version of text mining has the potential to be used for humanities analysis, and it is also a step closer to interpreting the context of a specific text without human intervention. It is interesting to look at the progress of the field over the course of decades, but I am still a little skeptical about whether text mining will be able to interpret context at the level of a person.
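To make the frequency- and pair-based approach mentioned above more concrete, here is a minimal Python sketch. It is my own illustration and not code from Hearst's or Binder's articles; the function names (`tokenize`, `word_frequencies`, `pair_frequencies`) and the sample sentence are assumptions made up for this example. It only counts words and adjacent word pairs, which is roughly the kind of "known fact" gathering described above.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Count how often each word appears in the text."""
    return Counter(tokenize(text))


def pair_frequencies(text):
    """Count adjacent word pairs.

    Simplification: pairs are formed across sentence boundaries too,
    because punctuation is discarded during tokenization.
    """
    tokens = tokenize(text)
    return Counter(zip(tokens, tokens[1:]))


# Hypothetical sample text, invented for illustration only.
sample = "Text mining finds patterns. Text mining counts words."

print(word_frequencies(sample).most_common(2))
print(pair_frequencies(sample).most_common(1))
```

A sketch like this reports what is already on the page (which words and pairs occur, and how often) but says nothing about what the text means, which is the limitation the contrast above is pointing at.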