Shedding Light on Indigenous Knowledge Concepts and World Perception through Visual Analysis

Alejandro Benito, University of Salamanca, Spain; Amelie Dorn, Austrian Centre for Digital Humanities, Austrian Academy of Sciences, Austria; Roberto Therón, University of Salamanca, Spain; Eveline Wandl-Vogt, Austrian Centre for Digital Humanities, Austrian Academy of Sciences, Austria; Antonio Losada, University of Salamanca, Spain

The way we conceptualise our world depends on various factors: it differs with culture, time and language, and may even change over the years [5,6]. In this paper, we introduce a visual analysis tool that supports the exploration of indigenous knowledge concepts in a historic language collection, the Database of Bavarian Dialects in Austria (DBÖ, dboe@ema), originally collected, in part, by means of systematic questionnaires in the area of the former Austro-Hungarian Empire. The collection we focus on in this work consists of 109 (original-conceptual) and 9 (supplementary) questionnaires, designed between 1913 and 1920, together with their answers (about 5 million paper slips). Around 11,100 persons of regional importance, with various professional backgrounds and different roles in the compilation process, were involved over almost a century [for further information, cf. 1,8].

Our tool results from a series of iterations [3] of a custom-made, agile and collaborative workflow, inspired by the work of other authors [4] and designed specifically for the Digital Humanities (DH). The workflow places data visualisation as the main dialogue facilitator between the different stakeholders participating in the project. By applying user-centered design [2] techniques such as design probes [7], we can direct the development of several micro-prototypes towards answering fine-grained research questions. The prototype presented here is the result of one full iteration of this iterative and incremental software development cycle.

On the technical side, we employ different distant reading techniques to give the user a realistic view of the contents of a questionnaire, together with visual mechanisms that help her form a mental image of the cultural connections between the terms at the time the questionnaires were made.

Our visualisation plays with lights, colours and shadows to display related concepts. The relationship is obtained by analysing coincident terms in the questions: the more often two or more terms appear together, the more prominent they all look in the visualisation. The main visual component of our pilot tool is an adjacency matrix adapted to the needs of the multivariate analysis task at hand. The matrix represents a single questionnaire of the collection, and its rows and columns represent the questions that make it up. Each cell is coloured to show the number of distinct concepts two questions have in common (richer coincidences are shown in darker colours), forming visual patterns that inform the user about the general distribution and importance of the concepts across the questionnaire.
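The matrix described above can be illustrated with a minimal sketch. It assumes that each question has already been reduced to a set of concepts; the function names, the toy data and the linear mapping from counts to colour intensity are hypothetical and are not taken from the prototype.

```python
from itertools import combinations


def shared_concept_matrix(questions):
    """Build a symmetric question-by-question matrix.

    `questions` is a list of sets, one set of concepts per question.
    Cell [i][j] holds the number of distinct concepts that questions
    i and j have in common. The diagonal is left at 0 (an assumption).
    """
    n = len(questions)
    matrix = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        shared = len(questions[i] & questions[j])
        matrix[i][j] = matrix[j][i] = shared
    return matrix


def colour_intensity(matrix):
    """Scale counts to [0, 1]; higher values would be drawn darker."""
    peak = max(max(row) for row in matrix)
    if peak == 0:
        return [[0.0 for _ in row] for row in matrix]
    return [[value / peak for value in row] for row in matrix]


# Hypothetical toy questionnaire with three questions.
questions = [
    {"gelb", "rot", "Blume"},
    {"gelb", "Wiese"},
    {"rot", "gelb", "Blume", "Wiese"},
]
matrix = shared_concept_matrix(questions)
intensity = colour_intensity(matrix)
```

In this toy example the first and third questions share three concepts, so their cell receives the darkest colour.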

The main matrix view is accompanied by two other views, placed to its right and at the bottom respectively. The first offers an overview of the individual concepts in the questionnaire according to the number of times they appear, each one represented by a coloured circle. Less frequent (and therefore, in our approach, less important) concepts are moved to the top of the visualisation, whereas the more important ones are placed at the bottom. Whenever the user hovers over an element, the cells in which that concept appears are highlighted with an effect that imitates the refraction of light, allowing for rapid identification of particularities during exploration. The bottom view presents the specific concept associations in a similar way: more populated associations appear bigger, whereas the more common ones are placed to the left. We provide an example below related to the use of colour terms:

Although colour is the theme of only a single questionnaire (Q53), colour terms occur in questions throughout the entire collection, offering valuable insights into their connection to cultural concepts. Within a single questionnaire, patterns and groupings of concepts across questions are revealed (see Figure 1). Interestingly, in Q53 the most frequently occurring colour term, bleich (pale), clusters in questions towards the end of the questionnaire.

Figure 1: Visual distribution of ‘bleich’ (pale) grouped across questions in questionnaire 53.

Additionally, yellow (gelb) is the term/concept occurring most frequently across questions in questionnaire 85, thus playing an important role in the description of “The flora of our meadows / Die Pflanzenwelt unserer Fluren” (Q85) (see Fig. 2). Further, frequent collocations of colour terms in questions are revealed, which also shed light on the structuring of language and part of the conceptualisation of certain topics (see Fig. 3).

Figure 2: Distribution of ‘gelb’ (yellow) across questions in questionnaire 85.

Figure 3: Visualisation of co-occurrence of terms ‘rot-gelb’ (red-yellow) across questions in questionnaire 85.
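The kind of counts behind the distributions and co-occurrences illustrated in the figures can be sketched as follows. This is an illustration under stated assumptions, not the prototype's implementation: questions are modelled as sets of concepts, associations are restricted to pairs of terms for brevity, and all names and data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def concept_frequencies(questions):
    """Count in how many questions each concept appears."""
    return Counter(concept for q in questions for concept in q)


def association_counts(questions):
    """Count in how many questions each pair of concepts co-occurs.

    Pairs are stored as alphabetically sorted tuples so that
    ("gelb", "rot") and ("rot", "gelb") map to the same key.
    """
    pairs = Counter()
    for q in questions:
        pairs.update(combinations(sorted(q), 2))
    return pairs


def questions_with(questions, *concepts):
    """Indices of questions containing all given concepts, i.e. the
    rows and columns a hover interaction would highlight."""
    wanted = set(concepts)
    return [i for i, q in enumerate(questions) if wanted <= q]


# Hypothetical toy questionnaire with four questions.
questions = [
    {"gelb", "rot", "Blume"},
    {"gelb", "Wiese"},
    {"rot", "gelb", "Blume", "Wiese"},
    {"bleich"},
]
frequencies = concept_frequencies(questions)
associations = association_counts(questions)
highlighted = questions_with(questions, "rot", "gelb")
```

Sorting `frequencies` or `associations` by count would give the kind of ordering used to place concepts and associations in the side views.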

Note: A preview of the prototype is available (Google Chrome only); remarks are welcome.


Datenbank der bairischen Mundarten in Österreich (DBÖ) | Database of Bavarian Dialects in Austria (DBÖ). Austrian Academy of Sciences: 11.2017.

Datenbank der bairischen Mundarten in Österreich electronically mapped (dbo@ema) | Database of Bavarian Dialects in Austria electronically mapped (dbo@ema). Ed. by Eveline Wandl-Vogt: Austrian Academy of Sciences: 2012 / 11.2017.

Appendix A

  1. Abgaz, Yalemisew, et al.: “A Semantic Model for Traditional Data Collection Questionnaires Enabling Cultural Analysis.” Proceedings of W23 - 6th Workshop on Linked Data in Linguistics: Towards Linguistic Data Science, LREC 2018, 21-29. [last accessed: 26.04.2018]
  2. Abras, C., Maloney-Krichmar, D. and Preece, J., 2004. User-centered design. Bainbridge, W. Encyclopedia of Human-Computer Interaction. Thousand Oaks: Sage Publications, 37(4), pp.445-456.
  3. Benito, A., Therón, R., Losada, A., Wandl-Vogt, E. and Dorn, A., Exploring Lemma Interconnections in Historical Dictionaries. 2nd Workshop on Visualization for the Digital Humanities. October 2017 - Phoenix, Arizona, USA.
  4. Bernard, J., Daberkow, D., Fellner, D., Fischer, K., Koepler, O., Kohlhammer, J., Runnwerth, M., Ruppert, T., Schreck, T. and Sens, I., 2015. VisInfo: a digital library system for time series research data based on exploratory search—a user-centered design approach. International Journal on Digital Libraries, 16(1), pp.37-59.
  5. ‘Concepts of the World’: Publishing in Mexico’s Indigenous Languages. [last accessed: 26.04.2018]
  6. De Beule, J. and De Vylder, B., 2005, January. Does language shape the way we conceptualize the world?. In Proceedings of the Annual Meeting of the Cognitive Science Society (Vol. 27, No. 27).
  7. Gaver, B., Dunne, T. and Pacenti, E., 1999. Design: cultural probes. interactions, 6(1), pp.21-29.
  8. Wandl-Vogt, Eveline. “…wie man ein Jahrhundertprojekt zeitgemäß hält: Datenbankgestützte Dialektlexikografie am Institut für Österreichische Dialekt- und Namenlexika (I DINAMLEX) (mit 10 Abbildungen).” In P. Ernst (Ed.), Bausteine zur Wissenschaftsgeschichte von Dialektologie / Germanistischer Sprachwissenschaft im 19. und 20. Jahrhundert. Beiträge zum 2. Kongress der Internationalen Gesellschaft für Dialektologie des Deutschen, Wien, 20. – 23. September 2006. Wien: Praesens, 2008, pp. 93–112.