Zucker_Reading Commentary 10/1
D. Boyd, K. Crawford: Six Provocations for Big Data
Boyd and Crawford begin this article by questioning the validity of Big Data and the potential for misinterpreting data, since some researchers believe they can view data from a ‘30,000-foot view’. Furthermore, patterns in networked data identified through aggregation and data scraping often yield biased results. The following statement sums up this point quite accurately in my opinion: ‘Data is increasingly digital air: the oxygen we breathe and the carbon dioxide that we exhale. It can be a source of both sustenance and pollution’ [Boyd and Crawford, 2]. While access to Big Data sets is increasing, there are many concerns about who is gaining access, how they are using and interpreting the data, and the ethics behind publishing findings. The authors provide six provocations for discussing the issues of Big Data, a few of which I will speak to.
The first is the lack of historic context in Big Data. Because Big Data is so new, it tends to focus solely on data collected in the past decade rather than including historic data. While the task of inputting historic data is tedious, its absence is problematic when discussing social networks, for instance when trying to argue that millennials are more connected than ever. Such a claim would be impossible to verify, as we have neither comparable data for previous generations nor methodologies for comparing social networks before and during the onset of social media applications such as Twitter and Facebook.
Big Data has always been and will continue to be ‘subjective’ until we have an accurate understanding of how humans function socially. In addition, the gender imbalance in the field introduces further bias into which questions are asked and of whom. How can we begin to understand marginalized populations or diverse identities when they are not represented among those who interpret, clean, and publish data? One of my greatest concerns with Big Data is just this: the role of bias and how Big Data is being misinterpreted, even though it is this very data that is used to determine how we design the built environment.