ELEXIS: Yet Another Research Infrastructure. Or Why We Need A Special Infrastructure for E-Lexicography In The Digital Humanities.

Tanja Wissik (Tanja.Wissik@oeaw.ac.at), Austrian Academy of Sciences, Austria and Ksenia Zaytseva (Ksenia.Zaytseva@oeaw.ac.at), Austrian Academy of Sciences, Austria and Thierry Declerck (declerck@dfki.de), Austrian Academy of Sciences, Austria

In this presentation, we will discuss the recently started European project ELEXIS – European Lexicographic Infrastructure and its potential in the context of digital humanities.

The use of computers in modern lexicography is intertwined with the history of the digital humanities (cf. Schreibman et al. 2004), and lexical data have become indispensable in more and more DH projects, especially with the rise of the Semantic Web and Linked Open Data (cf. Oldman et al. 2016).

However, current lexicographic resources, both modern and historical, are structured to different degrees and are not equally suitable for use in other fields, such as Natural Language Processing. As a result, they are not directly usable in DH projects that rely on Semantic Web applications and methods.

Therefore, ELEXIS will develop strategies, tools and standards for extracting, structuring and linking lexicographic resources to unlock their full potential for Linked Open Data and the Semantic Web, as well as in the context of digital humanities.

The ELEXIS project is carried out by a consortium of partners from various fields (e.g. lexicography, computational linguistics, natural language processing, digital humanities, and artificial intelligence). The consortium consists of the following scientific institutions, language institutes, standardisation bodies, and publishing houses: “Jožef Stefan” Institute (Slovenia), Lexical Computing CZ s.r.o. (Czech Republic), Instituut voor de Nederlandse Taal (Netherlands), La Sapienza University of Rome (Italy), National University of Ireland, Galway (Ireland), Austrian Academy of Sciences (Austria), Belgrade Center for Digital Humanities (Serbia), Hungarian Academy of Sciences, Research Institute for Linguistics (Hungary), Institute for Bulgarian Language »Prof Lyubomir Andreychin« (Bulgaria), Universidade Nova de Lisboa (Portugal), K Dictionaries (Israel), Istituto di Linguistica Computazionale "A. Zampolli" (Italy), The Society for Danish Language and Literature (Denmark), University of Copenhagen, Centre for Language Technology (Denmark), Trier University, Center for Digital Humanities (Germany), Institute of the Estonian Language (Estonia), Real Academia Española (Spain).

The ELEXIS project aims to integrate, extend and harmonise national and regional efforts in the field of lexicography, both modern and historical. Its goal is to create a sustainable infrastructure that will enable efficient access to high-quality lexical data in the digital age and bridge the gap between more advanced and lesser-resourced scholarly communities working on lexicographic resources.

ELEXIS intends to take an innovative approach to the production and development of lexico-semantic resources by creating intelligent applications for crucial tasks such as linking lexical resources, word sense disambiguation and cross-lingual mapping, building on applied methods and techniques from the fields of NLP and Artificial Intelligence.

The ELEXIS infrastructure will help researchers create, access, share, link, analyse, and interpret heterogeneous lexicographic data across national borders, paving the way for ambitious, trans-national, data-driven advancements in the field, while significantly reducing the duplication of efforts across disciplinary boundaries. In order to ensure the sustainability of the technical infrastructure after the end of the project, the created infrastructure will be integrated into the already existing infrastructures CLARIN and DARIAH, since most of the partners are members of CLARIN and DARIAH national consortia.

Besides the technical infrastructure, ELEXIS will establish a network for knowledge exchange and will develop and implement free online training courses for lexicography. Furthermore, ELEXIS will give researchers and research teams trans-national access to research facilities and lexicographical resources which are not fully accessible online or where professional on-the-spot expertise is needed in order to ensure and optimise mutual knowledge exchange. The trans-national access will have an impact especially on under-resourced languages and will strengthen the infrastructure and collaborative network provided by ELEXIS.

Even though the infrastructure is currently planned as a European infrastructure, expanding it beyond Europe is under consideration in order to cater for the needs of DH researchers around the globe.

Appendix A

  1. Schreibman, S., Siemens, R. and Unsworth, J. (eds.) (2004). A Companion to Digital Humanities. Oxford: Blackwell. http://www.digitalhumanities.org/companion/
  2. Oldman, D., Doerr, M. and Gradmann, S. (2016). Zen and the Art of Linked Data: New Strategies for a Semantic Web of Humanist Knowledge. In Schreibman, S. et al. (eds.), A New Companion to Digital Humanities, 2nd Edition. Oxford: Wiley-Blackwell.