Polysystem Theory and Macroanalysis. A Case Study of Sienkiewicz in Italian.

Jan Rybicki (jkrybicki@gmail.com), Jagiellonian University, Krakow, Poland, Poland and Katarzyna Biernacka-Licznar (katarzyna.biernacka-licznar@uwr.edu.pl), Uniwersytet Wrocławski, Wrocław, Poland and Monika Woźniak (moniwozniak@gmail.com), Università degli Studi di Roma "La Sapienza", Italia

1. Introduction

Even-Zohar's polysystem theory is a well-established approach to understanding how entire translated literatures interact (or not) with the body of the receiving native literary culture. Even-Zohar identifies a number of possible interactions depending on the relative "strength" and "age" of the two (or more) literatures, and translated literatures may assume "peripheral" or "central" positions within the target literary polysystem. According to this scholar, translations are usually peripheral to native literature; but he also cites examples where a given literary polysystem places some imported subsystems in a central position, while other “foreign imports” remain in the periphery (Even-Zohar 1990).

Even-Zohar thus deals with literary creation en masse rather than, as is often the case in academic approaches to literary translation, on single books original and translated. The obvious parallel to Distant Reading has already been drawn (Helgesson and Vermeulen 2015, 25-26); but it might also be tempting to do the same for a related approach, macroanalysis, if we are to follow the distinction made by the exponent of the latter term (Jockers 2013, 48). Both bring together investigations into masses of literary material unattainable by traditional close reading; yet macroanalysis looks inside many books at once using quantitative methods applied to their lexical layers that have been called “stylometry” well before both Moretti and Jockers.

2. Material

From our personal mixed Polish-Italian perspective, few cases could serve as a better pretext to try to negotiate this marriage between polysystem theory and computational stylistics than that of Quo vadis (1896), the historical romance by Henryk Sienkiewicz, Poland’s first literary Nobel Prize winner of 1905. Its international success – long gone with the wind but unparalleled by any other Polish novel to this day – resulted in a veritable explosion in terms of numbers of translations into various languages. In many countries, several different translations simultaneously vied for the public’s attention. Yet “several” does not even begin to describe the situation in Italy, where at least three hundred different editions can be still found today (Woźniak 2016). In the first two years of the existence of Quo vadis on the Italian market (1899-1900), as many as eight different translations were already available to the readers (Berti and Gagetti 2016).

No wonder: not only was the novel set in the Italian capital and not only did it deal with a subject already very present in Italian culture old and new; the book’s (and its author’s) brand of conservative Catholicism must have appealed to some of the most influential circles of the country. Yet the novel was also praised by some of Italy’s progressive critics, who saw, in Sienkiewicz’s persecuted Christians, the struggle of their contemporary revolutionary movements, and who liked to read his depiction of Imperial Rome’s decadence as a diatribe against the existing power structure (Marinelli 1984).

This profusion of Italian renderings is also the reason why building their representative selection was no easy task. Only a single translation was available online; a search in Polish and Italian libraries provided almost seventy candidate texts: signed or unsigned by a translator, published by a variety of publishers, often in several somewhat different editions. In the end, twenty-four translations produced until mid-20th c. have been identified as more or less independent of each other, although some of these still share over 50% of material, as evidenced by comparison of texts for identical 5- or longer word clusters with WCopyFind (Bloomfield 2011-2016). When applied to genuinely different translations, the similarity ratio is of the order of 5-7%.

The natively Italian literary polysystem was represented by close to 1300 different literary texts, mostly selected and adapted from Progetto Manuzio, one of the most comprehensive Internet collections of electronic texts in Italian. To include as many texts as possible, this set of Italian writing included dramas, epic poems and opera libretti as well as novels and novellas from the 15th to the 21st century. Several translations of other novels by Sienkiewicz were also added to the collection, and another big body of translations of a single author, Shakespeare, was included as well.

3. Methods

The stylometric method applied has been described by Eder (2017) and applied to other literary corpora by Rybicki (2014, 2016). Basing on Burrows’s Delta procedure (2002), a list of most-frequent words (MFWs) is produced for the entire corpus. These words are then counted in the individual texts, and their frequencies are compared in text pairs to produce a matrix of distance measures; in this study, the distances were established by means of the modified Cosine Delta (Smith and Aldridge 2011), which is now seen as the most reliable version (Evert et al. 2017). The distance matrix then undergoes Cluster Analysis (Ward’s hierarchical clustering), resulting in grouping the texts into “clusters” of greatest similarity; this is repeated for reiterations from 100 to 2000 MFWs at 100-word increments, and a consensus between the individual iterations is produced to show each text’s most consistent nearest neighbors, next-to-nearest neighbors and next-to-next-to nearest neighbors. The procedure is performed by means of stylo (Eder et al. 2016), a stylometric package for R (R Core Team 2016). The results are visualized by means of network analysis, applying the “Force Atlas 2” gravitational algorithm (Jacomy et al. 2008) in Gephi (Bastian et al. 2009) to the above-mentioned scores. Instead of applying a “human-made” classification of the resulting network of nodes and edges (i.e. identifying authors, genres and literary periods based on external and traditional literary history), the task of dividing the network into groups of greatest internal similarity was entrusted to Gephi’s modularity function, which finds communities within a weighted network (Blondel et al. 2008). The main experiment was conducted by successively increasing the number of communities shown until the expected separate cluster of translations of Quo vadis became a separate entity in the network, and the degree of its discreteness could thus be assessed.

4. Results

Dividing the network into just two modularity groups failed to isolate Sienkiewicz from the main Italian community. Instead, the main division was that between 19 th/20 th-century novels, translated or originally Italian, and everything else – the one notable exception to this rule was the prose of Pirandello, classified with the earlier texts. At three modularity groups, Italian drama detached itself from early prose. At four, the first writer became a separate community, but this was the native Deledda rather than the alien Sienkiewicz. At five, 19 th- and 20 th/21 st-century novels became two distinct groups; at six, another native Italian, Salgari, received his own class; at seven, pre-19 th-century works detached themselves from later prose. It is only at ten communities that a translated rather than an Italian author became a separate subsystem (to use Even-Zohar’s term) – in fact, not one but two: Sienkiewicz (not just his Quo vadis) and Shakespeare (Fig. 1).

Figure 1. Network analysis of distances between most-frequent-word usage. Thick and short lines (edges) denote small distance (or high similarity). For simplicity, only the final 10-community modularity is shown.
Figure 1. Network analysis of distances between most-frequent-word usage. Thick and short lines (edges) denote small distance (or high similarity). For simplicity, only the final 10-community modularity is shown.

5. Discussion

It seems too much of a coincidence that two major subsystems (translations of Shakespeare and translations of Sienkiewicz) become separated from the main body of literature in Italian at the same time, and that this happens only after two native authors receive their own subsystems. If such a mechanism were to be observed in even more extensive collection of texts (when they finally become available), Even-Zohar’s hypothesis of the usually peripheral position of translated literature could find its stylometric illustration. At the same time, this experiment confirms not only that original novels are more similar to translated ones than the former to original drama; but also that certain original authors are more different from other original authors than those translated from another language.

Obviously, this hypothesis must be tested in the future in other literary polysystems to claim that the affinity between polysystem theory and macroanalysis is anything more than metaphorical. Even-Zohar speaks of reception of literary works within a broader national culture; macroanalysis counts context-free words. Still, in its attempts to bring distant and close reading together, stylometry has been clutching at even weaker straws. Stylometrists continue to make similar leaps (of faith?) between their graphs and trees and networks on the one hand, and traditional literary history on the other. They usually believe that frequencies of very frequent words provide insights into more abstract characteristics of texts than their mere lexical or even grammatical difference: and these abstracts so far include authorship, genre, chronology, or gender. This study might just have added a new one. At the very least, it is an invitation to apply Even-Zohar’s concepts in various “distant” approaches to literature.

6. Acknowledgements

This research was made as part of the project: “Miejsce Quo vadis? w kulturze włoskiej. Przekłady, adaptacje, kultura popularna” (0136/ NPRH4/H2b/83/2016), funded by Poland’s National Program for Advances in the Humanities (NPRH).


Appendix A

Bibliography
  1. Bastian, M., Heymann, S., and Jacomy, M. (2009). “Gephi: an open source software for exploring and manipulating networks.” Proceedings of the International AAAI Conference on Weblogs and Social Media, San Jose, Ca.
  2. Berti, G. de, and Gagetti, E. (2016). “La fortuna di ‘Quo vadis’ in Italia nel primo quarto del Novecento.” In Woźniak, M., Biernacka-Licznar K., eds, Quo Vadis. Da caso letterario a fenomeno di massa. Ispirazioni - adattamenti - contesti. Roma: Ponte Sisto, 50-59.
  3. Blondel, V.D., Guillaume, J.-L., Lambiotte, R., Lefebvre, E. (2008). “Fast Unfolding of Communities in Large Networks,“ Journal of Statistical Mechanics: Theory and Experiment 10: 1000.
  4. Bloomfield, L. (2011-2016). WCopyFind. The Plagiarism Resource Site, http://plagiarism.bloomfieldmedia.com. Accessed 24. Nov. 2017.
  5. Burrows, J.F. (2002). “Delta: A Measure of Stylistic Difference and a Guide to Likely Authorship.” Literary and Linguistic Computing 17: 267-287.
  6. Eder, M. (2017). “Visualization in stylometry: Cluster analysis using networks.” Digital Scholarship in the Humanities 32(1): 50-64.
  7. Eder, M., Kestemont, M., and Rybicki, J. (2016). “Stylometry with R: A package for computational text analysis.” The R Journal 8(1): 107–121.
  8. Even-Zohar, I. (1990). “The Position of Translated Literature within the Literary Polysystem.” In Polysystem Studies [= Poetics Today] 11(1): 45-51.
  9. Evert, S., Proisl, T., Jannidis, F., Reger, I., Pielström, S., Schöch, C., and Vitt, T. (2017). “Understanding and explaining Delta measures for authorship attribution.” Digital Scholarship in the Humanities 32 (sup. 2): 4-16.
  10. Helgesson, S. and Vermeulen, P. (2015). „Introduction. World Literature in the Making.” In Helgesson, S. and Vermeulen, P. eds, Institutions of World Literature, Writing, Translation, Markets. London: Routledge, 1-22.
  11. Jacomy, M., Venturini, T., Heymann, S., and Bastian, M. (2008). “ForceAtlas2, a Continuous Graph Layout Algorithm for Handy Network Visualization Designed for the Gephi Software.” PLoS ONE 9(6): e98679. DOI=10.1371/journal.pone.0098679.
  12. Jockers, M. (2013). Macroanalysis. Digital Methods and Literary History, Champaign: University of Illinois Press 2013.
  13. Marinelli, L. (1984). “’Quo vadis.’ Traducibilità e tradimento,” Europa Orientalis 3: 131-146.
  14. R Core Team. (2016). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.
  15. Rybicki, J. (2014). “Pierwszy rzut oka na stylometryczną mapę literatury polskiej,” Teksty drugie 2: 106-128.
  16. Rybicki, J. (2016). “Vive la différence: Tracing the (Authorial) Gender Signal by Multivariate Analysis of Word Frequencies,” Digital Scholarship in the Humanities 31(4): 746-761.
  17. Smith, P. and Aldridge, W. (2011). “Improving authorship attribution: Optimizing Burrows’ Delta method.” Journal of Quantitative Linguistics, 18(1): 63–88.
  18. Woźniak, M. (2016). “Quo vadis: da caso letterario a fenomeno di massa. Dove ci ha portato Sienkiewicz?” In Woźniak, M., Biernacka-Licznar K., eds, Quo Vadis. Da caso letterario a fenomeno di massa. Ispirazioni - adattamenti - contesti. Ponte Sisto, Roma, 6-15.