09-19-OpenRefine blog entry-graduate-Sally-Chen
OpenRefine is easy to learn: the basic functions can be picked up quickly, and a data cleanup can be completed in a relatively short time. Minor inconsistencies, such as stray spaces or differences in capitalization, can split one category into several and skew the categorization, so it is important to clean the data before further analysis. At first I could not find a way to export CSV data from an OpenRefine project and assumed one would need to export to an Excel document and then convert it to CSV; in fact, the Export menu offers a comma-separated value option directly. Simple exploration of the two data files showed that comédie is the most popular drama genre, and by setting a filter on a genre I can see the most popular authors or show titles within it. If researchers convert the text of the revenue records into numbers, they can see the characteristics of the data more intuitively, which would facilitate further analysis and visualization.
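To make the two cleanup ideas above concrete, here is a minimal Python sketch of what the cleaning does, not of how OpenRefine does it internally. The sample values are invented for illustration and are not taken from the data files. The function names `normalize_label` and `parse_amount` are hypothetical, and the amount parser assumes a comma is a thousands separator, which may not hold for the actual revenue records.

```python
from collections import Counter
from typing import Optional


def normalize_label(value: str) -> str:
    """Trim, collapse internal whitespace, and lowercase a category label."""
    return " ".join(value.split()).lower()


def parse_amount(text: str) -> Optional[float]:
    """Convert a text amount to a float, or return None if it is not numeric.

    Assumption: commas are thousands separators (e.g. "1,250.50").
    """
    cleaned = text.strip().replace(",", "")
    try:
        return float(cleaned)
    except ValueError:
        return None


# Hypothetical genre values, invented for illustration only.
raw_genres = ["comédie", "Comédie ", " comédie", "tragédie", "Tragédie"]

# Before cleaning, each spelling variant is counted as its own category.
print(Counter(raw_genres))

# After cleaning, the variants collapse into their true categories.
print(Counter(normalize_label(g) for g in raw_genres))

# Text amounts become numbers that can be summed or plotted.
print(parse_amount(" 1,250.50 "))
```

Before normalization the counter treats every spelling variant as a separate genre, which is the kind of bias in categorization described above; after normalization the variants are counted together.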