Reading Assignment - Data Scholarship in the Humanities: Qingyu Cai
This chapter, Data Scholarship in the Humanities, introduces different aspects of data and scholarship and the relationship between them, including research methods and practices, sources and resources, knowledge infrastructures, and external factors.
In discussing sources and resources, the author argues that it is difficult to set boundaries on what is and is not a potential source of data for humanities scholarship, because almost anything can be used as evidence of human activity. The author distinguishes data along several lines: physical versus digital, digital versus digitized, surrogates versus full content, static images versus searchable representations, and searchable strings versus enhanced content. Each form of data has its own meaning, and researchers should choose whichever fits their needs. In addition, new technologies have brought techniques that diversify the forms data can take, which supports humanities research. For instance, digitization protects the original object, since researchers can examine a digitized copy closely without damaging the original. Digitized objects also have more potential to be turned into other forms of representation. Moreover, a digitized copy can be accessed by several people simultaneously, so researchers no longer have to wait their turn to see the original object. Thanks to today's technology, the digital humanities give researchers more convenient ways, and more possibilities, to analyze objects (in other words, the data) in various forms.
The author also mentions uneven access to data, material collections, and digital objects, which amounts to a kind of digital segregation. However, this kind of segregation is not necessarily a bad thing. Data, especially data about human activities, tend to be personal and private, and access to them touches on human rights. If there were no limits on who can access the data, the people represented in a data set could be put at risk. We tend to believe that companies and institutions are more responsible and ethical in dealing with data, which is not necessarily true. Still, there seems to be no better way to decide who can access the data.