Commentary 10/01
Data as Capta
It’s definitely interesting to dive into the definition of data (especially within humanities data and data humanism). However, I’m not too confident about the difference between the “taken” and the “given”, as both data and capta can be representation as well as interpretation if we look beyond the numbers and observations themselves and also consider the methodology of collection, the types of analysis and filtering, and the marks and channels through which the data presents itself to the researcher and the reader. Whether or not one works in the humanities / digital humanities, one should always consider presentation integrity, data-ink ratio, cognitive association, etc. throughout the analysis (something like Edward Tufte’s principles). That raises a question: when Drucker wrote “Humanistic inquiry acknowledges the situated, partial, and constitutive character of knowledge production, the recognition that knowledge is constructed, taken, not simply given as a natural representation of pre-existing fact”, does she take accuracy into consideration, or is she simply looking at knowledge as a personal gain based on the interest and opinion of the researcher? When we consider capta (knowledge “taken”), we are suggesting that knowledge should never be taken for granted and that there shouldn’t be assumptions about the “truth” of data. However, what allows us to make the additional moves to process the data into capta, and what then supports our claim that the “new data”, the capta, is still representative of the native / raw data? I agree with Drucker’s approach that in info-vis / data-vis we should be selective and mindful of what information we process and present. Yet I think there is a lot of ambiguity, and a question of objectivity, that she doesn’t necessarily address in the text.
Six Provocations for Big Data
Large data vs. smart data was the focus of last week’s reading, and Six Provocations for Big Data follows that discussion and goes deeper into the context of these types of data and their limitations, especially what big data alone cannot answer within humanities research. I agree with Boyd and Crawford on the fact that numbers do not speak for themselves. However, similar to my response to Drucker above, to what extent should our own views on how data should be presented, and on what matters within the larger dataset, be considered when evaluating infographics and data visualizations? Data filtering is almost always subjective, as it neglects components of the data that we often dismiss as completely obsolete based on personal experience or generalized claims. Also, I’m curious about their view on data accessibility. What contributes more to the authenticity of data: smart filtering of a specific population, or an overall consideration of the large population? And how does the answer change across different humanities research approaches (idiographic / normative)?