Ethical Constraints in Digital Humanities and Computational Social Science

Anagha Uppal (auppal@vols.utk.edu), University of Tennessee, Knoxville, United States of America

As it has developed, the field of Digital Humanities has enjoyed a particular set of advantages in making advances and gaining approval within the scientific community, allowing it to serve as a “means to revitalize the humanities” in the face of decreased funding and declining appreciation for their contributions (Reid, 2011, pp. 352-353). For both Digital Humanities and Computational Social Science, the principal advantages are:

  • Easy and fast access, via the Internet, to data resources and databases.
  • Inexpensive computational power, including large amounts of inexpensive memory and physical storage.
  • New forms of data (especially text) that can be easily obtained from many sources, particularly social media and blogs.
  • Open-source software and a culture of code-sharing.
  • Modern advocacy and acceptance of interdisciplinary and multidisciplinary research (Alvarez, 2016, pp. 3-4).

Watts (2013, p. 7) adds to this list the prospect, at least in theory, of shorter timescales and lower costs for experiments.

But alongside these advantages come challenges in the use of such data and methods that, if ignored, have the capacity to harm both the public and the advancement of knowledge. From the perspective of the researcher, the necessary tools and applications, often drawn from “multiple research traditions,” are rarely all familiar to any individual researcher (Watts, 2013, pp. 5-6). Data acquisition is becoming more and more difficult, with much proprietary big data (such as the Social Security Administration database or IRS database that would be useful for the study of job networks and the economy) locked away and expensive. Data, once made available, is also messy, unreliable and easily falsified. To be usable, it must be grounded in offline findings or other web data. When decentralized online data is found to be false, there is no system of institutional accountability, which further increases uncertainty and erodes trust in the use of the web to crowdsource the production of data and knowledge (Conte et al., 2012, p. 336). Additionally, as the use of social network sites becomes more common, users become more adept at toggling privacy controls and choosing which content to share publicly and which to keep hidden, so the availability of social media data decreases (Giglietto & Rossi, 2012, p. 25).

For study participants, the weightiest concerns relate to how data is acquired and to its privacy, confidentiality, security and reliability. Because social media data is used extensively in DH studies, we consider where the line falls at which such information may appropriately be used without users’ consent, confronting open questions about public versus private arenas of publishing and about account holders’ motivations. Although it is important to retain the approval of users and to collect private data ethically, failure to do so is most damaging when those with access to the collected data are able to identify users and strip participants of their privacy. We therefore discuss individual-level data and ways to preserve people’s confidentiality.
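As one hedged illustration of what preserving confidentiality in individual-level data can involve, the sketch below replaces account names with keyed hashes and then checks how small the smallest group of records sharing the same quasi-identifiers is (the k of k-anonymity). It is not a method taken from the works reviewed here; the field names, the sample records and the key are all hypothetical, and a real study would need a fuller disclosure-risk assessment.

```python
import hashlib
import hmac
from collections import Counter


def pseudonymize(user_id: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256), truncated to 16 hex characters."""
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def smallest_group(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the same
    quasi-identifier values; a result of 1 means someone is unique."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values(), default=0)


# Hypothetical records; "age_band" and "region" are assumed quasi-identifiers.
records = [
    {"user": "alice", "age_band": "20-29", "region": "TN"},
    {"user": "bob", "age_band": "20-29", "region": "TN"},
    {"user": "carol", "age_band": "30-39", "region": "TN"},
]

key = b"hypothetical-secret-key"  # assumption: stored separately from the released data
released = [{**r, "user": pseudonymize(r["user"], key)} for r in records]
k = smallest_group(released, ["age_band", "region"])
```

In this toy data the third record is the only one in its age band, so hiding the account name alone does not prevent that person from being singled out through the remaining attributes.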

We also review ways of benefiting from data that comes from online sources, despite its inherent exclusion of those of low income and low socioeconomic status throughout much of the world, including the U.S. Also excluded are independent researchers, students and those associated with small organizations, especially interdisciplinarians: conducting this work often requires special supercomputers, and many humanities researchers have neither access to such resources nor the skillset to use them. A number of papers have been written about data use ethics in other fields of research. This paper attempts to review and combine their concerns for the specific purposes of Digital Humanities and Computational Social Science. Through an extended literature review, it collects ethical questions surrounding data use and applies them to two infamous case studies: AOL’s release of search data in 2006 and Facebook’s emotional contagion study, published in 2014.

It is feasible to imagine that the computational advantages and the promise of DH and CSS will lead to a world in which analysis extends beyond text to sound, images and video; in which data is visualized richly enough that a maximum number of people can overcome confirmation bias, understand complex research results and contribute; and in which large-scale crowd-sourced data collection and sophisticated citizen science are commonplace enough to allow us to answer high-impact questions. As we move towards such a world, a periodic reconsideration of ethics is judicious: the topic remains ever timely, with violations resulting in vast scandals and increasing public distrust (most recently a bout of data breaches, such as Uber’s; Shaban, 2017).


Appendix A

Bibliography
  1. Alvarez, R. M. (2016). Introduction. In R. M. Alvarez (Ed.), Computational Social Science: Discovery and Prediction (pp. 1-24). Cambridge University Press.
  2. Conte, R., Gilbert, N., Bonelli, G., Cioffi-Revilla, C., Deffuant, G., Kertesz, J., . . . Helbing, D. (2012). Manifesto of computational social science. European Physical Journal-Special Topics, 214(1), 325-346. doi:10.1140/epjst/e2012-01697-8
  3. Giglietto, F., & Rossi, L. (2012). Ethics and Interdisciplinarity in Computational Social Science. Methodological Innovations Online, 7(1), 25-36. doi:10.4256/mio.2012.003
  4. Manovich, L. (2011). Trending: The Promises and the Challenges of Big Social Data. In M. K. Gold (Ed.), Debates in the Digital Humanities (1 ed., Vol. 1, pp. 460-475). University of Minnesota Press.
  5. Reid, A. (2011). Graduate Education and the Ethics of the Digital Humanities. In M. K. Gold (Ed.), Debates in the Digital Humanities (1 ed., Vol. 1, pp. 350-367). University of Minnesota Press.
  6. Shaban, H. (2017). Uber is sued over massive data breach after paying hackers to keep quiet. The Washington Post. Retrieved 28 November 2017, from https://www.washingtonpost.com/news/the-switch/wp/2017/11/24/uber-is-sued-over-massive-data-breach-after-paying-hackers-to-keep-quiet/
  7. Watts, D. J. (2013). Computational Social Science: Exciting Progress and Future Directions. The Bridge: Linking Engineering and Society, 43(4), 5-10.