Non-normative Data From The Global South And Epistemically Produced Invisibility In Computationally Mediated Inquiry

Sayan Bhattacharyya (sayan@illinois.edu), Price Lab for Digital Humanities, University of Pennsylvania, United States of America

This paper is an intervention that addresses an epistemological conundrum likely to become increasingly common and acute as the digital humanities both grow more diverse and increasingly encompass knowledge that has been produced outside of the parameters of Euro-American normativity. This contribution is in the spirit of addressing the need for cultural critique in the digital humanities along the lines that Alan Liu has called for (2012). I make use of an instantiated example, a text analysis tool for visualizing properties of a particular digitized text corpus in relation to trends of usage of specific words in the corpus, but I argue that the key insights are generalizable to a large spectrum of digital humanities tools.

Techno-social ensembles, acting as apparatuses of knowledge production through which computationally inferred knowledge is produced, are themselves power-laden. Data that is epistemically heterogeneous can be rendered illegible or less legible within a representational scheme that enforces standardization, creating a situation in which it can be visible only at the cost of relinquishing, in favor of the dominant episteme’s normative assumptions, the variability that constitutes its heterogeneity — as these normative assumptions tend to privilege the homogeneity of data. I name and describe this problematic in a way that fosters a dialog between philosophy and critical theory on the one hand and digital humanities on the other hand, placing it on a theoretical footing in relation to which that dialog can happen.

I describe possible approaches to this problematic, both conceptually and in the form of actionable solutions that follow from the conceptual issues. I also suggest a way to redress the unintended illegibility or invisibility that epistemologically heterogeneous and non-normative knowledge (for example, many knowledge artifacts from the global South) can undergo in computationally mediated knowledge apparatuses. In the first, critical section of the paper (“critical” in the sense of pertaining to critique), I show, building on insights that I have described elsewhere, how even powerful, state-of-the-art tools for text analysis and visualization may undercount the cumulatively retrieved records of occurrence for non-western-language material written in western script within western-language text (Bhattacharyya 2017). Considering such a tool as a knowledge apparatus, I show that the problem arises because the knowledge objects in question, non-western-language words, typically present far more representational variation, when transliterated into morphological expressions in the Latin alphabet, than such tools implicitly assume their normative knowledge objects, namely western-language material, to present. I describe the mechanism by which the problem arises in this particular knowledge apparatus, and I argue that the problem is homological, and therefore generalizable, beyond the particular constellation of words, scripts and languages to a wider set of similar configurations in the humanities, especially when data from the global South is at play.
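The undercounting mechanism described above can be illustrated with a minimal sketch. It is not the tool discussed in this paper: the sample sentence, the spellings, and the function names are all hypothetical, chosen only to show how a lookup that assumes one spelling per word misses occurrences recorded under other transliterations.

```python
import re
from collections import Counter

# Hypothetical sample text: one word appears under three illustrative
# Latin-script spellings (these spellings are assumptions, not corpus data).
text = (
    "The zamindar met another zemindar while a third zumeendar waited, "
    "and the zamindar left."
)

# Naive tokenization: lowercase alphabetic runs only.
tokens = re.findall(r"[a-z]+", text.lower())
counts = Counter(tokens)


def count_exact(counts, query):
    """Count occurrences of a single spelling, as an exact-match lookup would."""
    return counts[query]


def count_pooled(counts, variants):
    """Count occurrences across a user-supplied set of variant spellings."""
    return sum(counts[v] for v in variants)


# Hypothetical variant set that a reader or annotator might supply.
variants = {"zamindar", "zemindar", "zumeendar"}

exact_hits = count_exact(counts, "zamindar")
pooled_hits = count_pooled(counts, variants)

print(f"exact match: {exact_hits}, pooled over variants: {pooled_hits}")
```

In this toy case the exact-match lookup retrieves only half of the occurrences that the pooled count finds. The variant set has to come from somewhere outside the tool, which is where the annotation practice discussed later in the paper becomes relevant.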

Computational inquiry into humanistic knowledge is particularly vulnerable to this general problem when it concerns non-normative knowledge objects, such as those from the global South: an apparatus for knowledge production tends to render invisible certain kinds of inscriptions that, for one reason or another, do not conform to the epistemic normativity that the apparatus presupposes. Cultural forces, through the sociotechnical ensemble of which they are a part, shape computational, algorithmic inquiry, so that the problem becomes especially acute in the digital humanities at scale. I argue that epistemological problems concerning legibility caused by the logic of scale and accumulation on the one hand, and the complementary logic of networks on the other hand, have a relation to the logics of hierarchical production and nonhierarchical (network-based) production, to whose increasing complementarity in the sociocultural sphere Luc Boltanski and Eve Chiapello, among others, have drawn attention (2005).

I will end by describing possible ways of addressing the issue in the context of undergraduate classes in comparative literature, of the kind in which I have used a tool of the above sort. These possibilities point towards one possible kind of decolonial approach in the digital humanities. I will suggest that the most promising solution has to do with “persistent annotation”: a way for users (students, in my use case) to annotate the invisibilities and illegibilities as and when they discover them, in the form of a written record that persists from one term of teaching (one iteration of a course) to another. A sophisticated implementation of this solution would incorporate such a record, in the form of a user-contributable manifest, into the software tool itself (for instance, by including a visible pointer to the manifest within the tool’s GUI). For my small use case, however, something as simple as a document carried over and renewed from semester to semester in the content-management system for the class can suffice as such a manifest. I will argue that this is roughly similar, in principle, to the way that one can make edits or, more generally and more similarly to this situation, editing suggestions in Wikipedia, whether non-anonymously or anonymously as desired (but, even in the case of anonymity, with an audit trail of accountability visible to a monitoring party). The epistemological stakes of this kind of approach in the case of Wikipedia have been addressed by Lih (2009) and can provide a useful point of comparison.
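One way to picture such a manifest is as a small append-only file that each cohort of students extends. The sketch below is only an assumption about what a simple version might look like: the field names, the JSON format, and the helper functions are hypothetical and are not part of any existing tool.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Annotation:
    """One student-contributed note about an invisibility or illegibility."""
    term: str            # course iteration in which the note was made
    query: str           # the spelling that was searched for
    missed_forms: list   # other spellings found to be missing from the results
    note: str            # free-text description of what was observed
    contributor: str = "anonymous"  # retained so that an audit trail exists


def load_manifest(path):
    """Return all annotations recorded so far (an empty list if none exist)."""
    p = Path(path)
    if not p.exists():
        return []
    return json.loads(p.read_text(encoding="utf-8"))


def add_annotation(path, annotation):
    """Append one annotation and write the manifest back; earlier entries are kept."""
    entries = load_manifest(path)
    entries.append(asdict(annotation))
    Path(path).write_text(
        json.dumps(entries, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return entries


if __name__ == "__main__":
    # Demonstration in a temporary directory; all values are hypothetical.
    manifest = Path(tempfile.mkdtemp()) / "manifest.json"
    add_annotation(manifest, Annotation(
        term="term 1", query="zamindar", missed_forms=["zemindar"],
        note="Search for one spelling omitted a variant spelling."))
    add_annotation(manifest, Annotation(
        term="term 2", query="zamindar", missed_forms=["zumeendar"],
        note="A further variant found by a later cohort.",
        contributor="student-17"))
    print(len(load_manifest(manifest)), "annotations carried across terms")
```

Because entries are only ever appended, notes from earlier iterations of the course remain visible to later ones, and the contributor field leaves room for the kind of accountability trail mentioned above.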

While my specific use case pertains to textuality, I will also make points of connection with instances of the illegibility or invisibility of non-normative knowledge in other modalities of computational media in the context of certain specific kinds of data or cultural knowledge. Shannon Mattern, for example, has examined the question of how computationally mediated representations of spatial data can produce illegibility or invisibility (2015), and Irit Rogoff has shown how curatorial practice can do the same for visual artifacts (2005, 2009). Finally, I will conclude by arguing that a connection exists between coloniality, legibility and accountability. Jon Wilson has recently argued that, rather than imperial certainty and confidence, coloniality was often distinguished by administrative anxiety about governance over strangers who are epistemically ‘other’ (2017) — an anxiety partially redressed by external informants who are tolerated, but only when their participation is underwritten by mechanisms of trust and accountability legible within the imperial episteme. There is an interesting parallel here with digital tools created by well-intentioned tool builders ending up governing the legibility of non-normative cultural artifacts that have their origin in zones of epistemic otherness.


Appendix A

Bibliography
  1. Bhattacharyya, Sayan. “Words in a World of Scaling-up: Epistemic Normativity and Text as Data.” Sanglap: Journal of Literary and Cultural Inquiry 4, no. 1 (2017). http://sanglap-journal.in/index.php/sanglap/article/view/157.
  2. Boltanski, Luc, and Eve Chiapello. The New Spirit of Capitalism. Translated by Gregory Elliott. London: Verso, 2005.
  3. Lih, Andrew. The Wikipedia Revolution: How a Bunch of Nobodies Created the World’s Greatest Encyclopedia. New York: Hyperion, 2009.
  4. Liu, Alan. “Where Is Cultural Criticism in the Digital Humanities?” In Debates in the Digital Humanities, edited by Matthew Gold. Minneapolis: University of Minnesota Press, 2012.
  5. Mattern, Shannon. “Gaps in the Map: Why We’re Mapping Everything, and Why Not Everything Can, or Should, Be Mapped.” Words in Space, September 18, 2015. http://wordsinspace.net/shannon/2015/09/18/gaps-in-the-map-why-were-mapping-everything-and-why-not-everything-can-or-should-be-mapped/.
  6. Rogoff, Irit. “GeoCultures: Circuits of Art and Globalization.” Open!: Platform for Art, Culture and the Public Domain, no. 16 (2009). https://www.onlineopen.org/download.php?id=53.
  7. ———. “Looking Away: Participations in Visual Culture.” In After Criticism: New Responses to Art and Performance, edited by Gavin Butt. Malden, MA: Blackwell, 2005.
  8. Wilson, Jon. India Conquered: Britain’s Raj & the Chaos of Empire. Simon and Schuster, 2017.