The 91st Volume — How the Digitised Index for the Collected Works of Leo Tolstoy Adds A New Angle for Research

Boris V. Orekhov (nevmenandr@gmail.com), National Research University Higher School of Economics, Russian Federation and Frank Fischer (ffischer@hse.ru), National Research University Higher School of Economics, Russian Federation

1. Introduction

The collected works of Leo Tolstoy were printed and published in 90 volumes of some 46,000 pages between 1928 and 1958. The visibility and usability of these volumes were increased by the project "Tolstoy Digital", a TEI-encoded version of this vast resource (Skorinkin & Mozhaev 2016).

This talk, however, is not about the 90 volumes themselves, but about the 91st volume of this edition, a supplementary volume containing indexes of works and proper names drawn from both the fictional works and the many volumes of Tolstoy's letters.

"The 91st Volume" is a web application based on the digitised index of proper names for the 90-volume edition of Tolstoy's collected works ( http://index.tolstoy.ru/ ). The digitised data features additional properties, which can be explored by the enthusiast as well as the specialist.

This talk aims not only to present a new tool for literary scholars, but also to generalise how this kind of resource can be used to gain new insights into larger text collections.

2. Level 1: Enhanced Searches

First and foremost, the index retains its original functionality, which is to map names to volumes and pages. The collected works of a canonical writer are not primarily meant to be read one by one, line by line. A 90-volume collection contains not only entertaining narratives; it can also be viewed as a set of facts, dates, names, mentions, etc. An index is the key to this data, and in the pre-digital age it was the only means of orientation.

In the web app version of the "91st Volume", the index is even more convenient to use than in the paper version, as it allows "fuzzy" searches that match any part of a name. Entering "ava", for instance, returns terms like "Poltava", "Bavariâ", or "Abdulla-al'-Mamun Zuravardi". The higher the frequency of a name within the whole collection, the higher up it is displayed in the results. Searches of this type are already an enhancement over the traditional index lookup.
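The search behaviour described above can be sketched as follows. This is a minimal illustration and not the code of the web application: the function name `search_index`, the layout of the index as a dictionary mapping each name to a list of (volume, page) references, and all page references in the toy data are assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical data layout: name -> list of (volume, page) references.
Index = Dict[str, List[Tuple[int, int]]]


def search_index(index: Index, query: str) -> List[Tuple[str, int]]:
    """Return (name, frequency) pairs for names containing `query`.

    Matching is a case-insensitive substring test; results are ordered
    by descending number of references, ties broken alphabetically.
    """
    needle = query.casefold()
    hits = [
        (name, len(refs))
        for name, refs in index.items()
        if needle in name.casefold()
    ]
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


# Toy data: the names come from the text, the references are invented.
toy_index: Index = {
    "Poltava": [(5, 12), (47, 301), (83, 9)],
    "Bavariâ": [(48, 77), (60, 140)],
    "Abdulla-al'-Mamun Zuravardi": [(81, 20)],
    "Dresden": [(1, 269)],
}

results = search_index(toy_index, "ava")
for name, frequency in results:
    print(frequency, name)
```

With this toy data, the three names containing "ava" are listed in order of frequency, while "Dresden" is not matched.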

If we cannot define in advance what we are looking for, we still have the lists of all names in the index (which add up to more than 16,000 entries). Once we have found what we were looking for, we do not need to take a book from its shelf and open it at the right page, but can jump directly to the corresponding page.

A graphical word-cloud representation is also featured and conveys a first idea about the most frequent words in the corpus.

3. Level 2: Studying Life and Works of Leo Tolstoy by Means of Network Analysis

Turning an index of names into a network is a new approach to facilitate the study of contexts. The co-occurrence of names in the same environment (on the same page, in the same chapter, etc.) reveals similarities and relations between different entities, which, on the scale of 90 volumes, helps us to understand larger contexts.

"The 91st Volume" unfolds a rather unconventional social network of Leo Tolstoy. It shows not only Tolstoy's connections with other people (e.g., his pen pals), but also the connections of people from the point of view of Tolstoy.

The co-occurrence of proper names on the same page within the 90 volumes establishes an edge of the emerging network, as it creates a link between two entities. For example, the Hindu scripture "Bhagavat-gita" can be found five times on the pages of the Complete Works, and it shares these five pages with a total of 43 other names. The proximity of these mentions is, of course, not accidental: in our example they form a kind of "Indian cluster" containing works like "Gitopadeša", "Dhammapada", "Vamana Purana", or names like Ramakrišna Šri Paramagamza.
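The construction of such a network from page-level co-occurrence might be sketched like this. It is an illustrative assumption, not the implementation behind the web app: the function names `cooccurrence_edges` and `neighbours`, the dictionary layout of the index, and the (volume, page) references in the toy data are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
from typing import Dict, List, Tuple

Index = Dict[str, List[Tuple[int, int]]]  # name -> (volume, page) refs


def cooccurrence_edges(index: Index) -> Counter:
    """Count, for every pair of names, the pages on which both occur."""
    # Invert the index: (volume, page) -> set of names found there.
    names_on_page = defaultdict(set)
    for name, refs in index.items():
        for ref in refs:
            names_on_page[ref].add(name)

    edges = Counter()
    for names in names_on_page.values():
        # Sorting makes each undirected edge appear under a single key.
        for a, b in combinations(sorted(names), 2):
            edges[(a, b)] += 1
    return edges


def neighbours(edges: Counter, name: str) -> Counter:
    """Return the names linked to `name`, weighted by shared pages."""
    linked = Counter()
    for (a, b), weight in edges.items():
        if a == name:
            linked[b] += weight
        elif b == name:
            linked[a] += weight
    return linked


# Toy data: names from the text, invented page references.
toy_index: Index = {
    "Bhagavat-gita": [(41, 10), (41, 11)],
    "Dhammapada": [(41, 10), (41, 11)],
    "Vamana Purana": [(41, 10)],
    "Poltava": [(5, 12)],
}

edges = cooccurrence_edges(toy_index)
print(neighbours(edges, "Bhagavat-gita").most_common())
```

The edge weight (number of shared pages) is one possible measure of how strongly two names are associated; ranking the neighbours of a name by this weight yields the kind of small per-name graph mentioned in this section.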

For Tolstoy, the texts mentioned are part of a set of carriers of philosophical knowledge, and are associated with names like Xenophon, Montaigne, Montesquieu, Pascal, Skovoroda, Socrates. These networks provide great opportunities for understanding the whole range of Tolstoy's interests and ideas. They present a panoramic picture revealing general trends and larger thematic clusters. For each individual name there is also a small graph showing the most significant names associated with it.

Another new kind of access to the 90 volumes is a heat map that shows the density of proper names used in each of them (the more names mentioned, the warmer the colouring).

In the first volume of the collection, which contains Tolstoy's youthful experiments, a red splash suddenly appears in the middle of a rather calm blue background on page 269. Viewing this page reveals that it contains a list of European cities: Rome, Naples, Dresden, Berlin.
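The density values underlying such a heat map could be computed along the following lines. This is a sketch under stated assumptions, not the app's actual method: the function name `page_density`, the dictionary layout of the index, and the entry for "Kazan'" on page 15 are hypothetical, and only the four cities on page 269 of volume 1 are taken from the text.

```python
from collections import Counter
from typing import Dict, List, Tuple

Index = Dict[str, List[Tuple[int, int]]]  # name -> (volume, page) refs


def page_density(index: Index, volume: int) -> Counter:
    """Count how many index references fall on each page of `volume`."""
    density = Counter()
    for refs in index.values():
        for ref_volume, page in refs:
            if ref_volume == volume:
                density[page] += 1
    return density


# Toy data: the four cities on page 269 follow the text; the
# remaining entries are invented for contrast.
toy_index: Index = {
    "Rome": [(1, 269)],
    "Naples": [(1, 269)],
    "Dresden": [(1, 269)],
    "Berlin": [(1, 269), (47, 12)],
    "Kazan'": [(1, 15)],
}

density = page_density(toy_index, volume=1)
hottest_page, name_count = density.most_common(1)[0]
print(hottest_page, name_count)
```

Mapping each count to a colour scale (few names cold, many names warm) then gives one cell of the heat map per page.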

4. Level 3: Editorial Evolution of the "Complete Works"

The index also allows scholars to study the genesis of the "Complete Works of Leo Tolstoy", i.e., the difficulties that had to be overcome when working on this edition (as laid out in Osterman 2002). The "91st Volume" allows us to understand how editorial principles changed over time, especially with regard to the depth of commentary.

For example, the 13th volume, with draft versions of "War and Peace", has only a sparse commentary, whereas the commentary in the 47th volume (diaries and notebooks) is the most detailed in the entire 90-volume edition. Quantifications like this allow us to draw conclusions about the process of editing the Complete Works over three decades.

As mentioned above, the web app retains all the capabilities of the traditional index, and at the same time extends its potential through computer-based information management, a multi-purpose search engine and different kinds of visualisations. The app is to be understood as a suggestion to apply the newly developed methods to the collected works of other authors.


Appendix A

Bibliography
  1. Osterman L. (2002). The Battle for Tolstoy: History of the Publication of Tolstoy's Complete Works. [Srazhenie za Tolstogo. Istorija izdanija Polnogo sobranija sochinenij Tolstogo.]
  2. Skorinkin D., Mozhaev E. (2016). TEI markup for the 90-volume edition of Leo Tolstoy’s complete works. In: TEI Conference and Members' Meeting 2016. Book of Abstracts. Vienna: Austrian Centre for Digital Humanities, pp. 107–109.