[by Nicole] thoughts on Big Data
No. 5: “Just Because it is Accessible Doesn’t Make it Ethical”
As someone who “deactivated” their Facebook years ago and has to repeatedly “re-deactivate” it after someone tells me, “Your Facebook got reactivated again!”, I understand the concerns about having personal information in the hands of social media sites. As a fairly private person, I am bothered that Facebook refuses to let me cut ties and permanently remove my profile from the internet, but I also acknowledge that I put my personal information on the web, a notoriously public domain, in the first place.
I feel that people sometimes forget that the internet is public. Very few spaces online, if any, have any semblance of privacy. You should be prepared to share with the world any information you willingly put online. For that reason, I am sympathetic to the researchers who have to tiptoe around the “ethicality” of accessing public, online information, as if the onus were completely on them and not on the people who actually posted that information.
Accessing information that was not voluntarily made accessible, like online shopping habits, is a little more ethically ambiguous to me. With data like this, I am more sympathetic to the “anonymous” subjects. Even so, it is still a little difficult for me to believe that the internet should be deemed a “private” place.
No. 3: “Bigger Data are Not Always Better Data”
The issue of understanding the inconsistencies of big datasets is particularly interesting to me as a computer scientist, because we are limited by the algorithms used to process the data. While I agree that it is important to know what the data is telling us about the people who created it, I wonder whether this is really a new problem. The authors state that social media data is skewed by overseeing organizations like Twitter, but when is public data ever not skewed? They also say that errors in data sources are magnified, but doesn’t this just make errors easier to find? Finally, they note that small data is sometimes best, but this does not mean that big data is not powerful. There are many cases where small datasets lead to inappropriate conclusions.
I guess my point is: big data comes with new challenges, but isn’t this to be expected? I see this less as a provocation than as a statement of the obvious.