Marti Hearst: Reading Commentary
What is Text Mining?
In this text, Hearst defines text mining and distinguishes it from both web search and data mining. Hearst also explains how it has been applied in various fields, highlighting its frequent and promising use in bioscience. I had previously assumed that text mining was used only in the humanities and linguistics, so I was surprised to see how much value Hearst finds for it in the sciences.
Untangling Text Data Mining
Although many researchers in computational linguistics, NLP, and related fields focus on developing complex methods to analyze text effectively, Hearst makes a compelling argument that such complexity may not be necessary to find significant patterns about the world. Hearst cites several examples of successful text mining with simple methods, with applications ranging from health to the American patent community. It seems that a researcher who has a refined approach and knows what kinds of trends to look for could use simple text mining to uncover interesting or important correlations.
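To make the idea of a "simple method" concrete for myself, here is a minimal Python sketch that counts how often pairs of terms appear in the same document. This is my own illustration, not a method taken from Hearst's paper: the sample sentences, the term list, and the function name cooccurrence_counts are all hypothetical.

```python
from collections import Counter
from itertools import combinations
import re


def cooccurrence_counts(documents, terms):
    """Count how many documents mention each pair of terms together."""
    wanted = {t.lower() for t in terms}
    counts = Counter()
    for doc in documents:
        # Lowercase the text and keep only alphabetic words.
        words = set(re.findall(r"[a-z]+", doc.lower()))
        # Sort so each pair is always stored in the same order.
        present = sorted(words & wanted)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


# Made-up sample documents, for illustration only.
docs = [
    "Magnesium deficiency was noted in patients with migraine.",
    "Stress is a known trigger for migraine.",
    "Magnesium levels fall under stress.",
    "Migraine patients often report stress and low magnesium.",
]

pairs = cooccurrence_counts(docs, ["migraine", "magnesium", "stress"])
for pair, n in pairs.most_common():
    print(pair, n)
```

Even a count this basic only becomes useful once the researcher has chosen which terms to track, which matches my reading of Hearst's point that knowing what to look for matters more than the sophistication of the method.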