REED London and the Promise of Critical Infrastructure
Alan Liu has called upon digital humanists to think more critically about infrastructure - the “social cum technological milieu that at once enables the fulfillment of human experience and enforces constraints on that experience” (Liu, 2017). Liu’s invitation comes at the moment when researchers involved in large-scale, long-term projects are shifting focus from remediation and the creation of digital incunabula to transmediation and the development of systems that support sustained discourse across ever-morphing digital networks, when we are recognizing the potential for “dynamism of the base or serialized form of the text—the state in which it is stored—as opposed to dynamic modes of presentation” (Brown, 2016: 288). REED London is one such project with a polyvalent dataset that spans over 500 years’ worth of archival records, embracing from the start the need to establish a stable, responsive production and presentation environment primed for use by a wide range of scholarly audiences. Thus we find that we are immediately testing those infrastructural constraints. In this paper, members of the REED London project team will address the challenges we face as we develop and implement a framework that trains us to think about our collected data in relation to much larger networks of disparate resources and user needs.
REED London develops from a partnership between the Records of Early English Drama (REED) and the Canadian Writing Research Collaboratory (CWRC). Together we are establishing an openly accessible online scholarly and pedagogical resource of London-centric documentary, editorial, and bibliographic materials related to performance, theatre, and music spanning the period 1100-1642. With support from the Andrew W. Mellon Foundation and a CANARIE Research Software Program grant, a team of researchers in the digital humanities and performance history from the U.S., Canada, and the U.K. are building a stable, extensible editorial production and publication environment that will create new possibilities for scholarly presentation of archival materials gathered from legal, ecclesiastical, civic, political, and personal archival sources in and around London. The REED London project combines materials from three printed REED collections (Inns of Court, Ecclesiastical London, and Civic London to 1558), the prosopographical material from REED’s Patrons & Performances (P&P), the bibliographical materials of the Early Modern London Theatres (EMLoT) database, and in-progress and planned digital collections focusing on London area performance spaces, most notably the Globe, Rose, and Curtain theatres and Civic London 1559-1642.
REED is an internationally renowned scholarly project that has worked to locate, transcribe, and edit evidence of drama, secular music, and other communal entertainment in Britain from the Middle Ages until 1642. Since 1979 REED has published twenty-seven printed collections of transcribed records plus contextual materials. REED has long recognized the importance of online access to its resources, first with P&P and EMLoT, and more recently with the born-digital collection Staffordshire. REED has wrestled with the balance between what was once considered its “core” print publication activities and “adjunct” digital efforts, in the process migrating its data across a succession of programs and formats from Basic and dBASE to TEI P5 XML and MySQL (Hagen, MacLean, and Pasin, 2014). REED has developed its digital resources in ways that complicate integration (P&P exists in a Drupal instance; EMLoT was built in a version of Django that is now out-of-date; REED Staffordshire was lightly tagged in TEI and relies for entity management on EATSML, an XML format used by the Entity Authority Tool Set (EATS) to serialise its data). The components of REED London must therefore first be made intra-operable before they can become interoperable (Jakacki, 2016). The partnership with CWRC supports broader adoption of standards for TEI text markup, RDF metadata specifications, and named entity aggregation, most immediately with the ingestion of EMLoT and the printed Inns of Court collection.
CWRC is an online infrastructure project designed to enable unprecedented avenues for studying the words that most move people in and about Canada. Built with funding from the Canada Foundation for Innovation, the CWRC platform supports best practices in the production of online collections, editions, born-digital essays, anthologies, monographs, articles, or bibliographies, and supports the inclusion of visual, audio, and video sources (About CWRC/CSÉC). It supports collaboration through the use of interoperable data formats and interlinking of materials, and for teams like REED London provides invaluable tools for communicating, tracking activity, and managing workflow. We envision that as the partnership develops and as REED London advances through production toward publication we will take full advantage of CWRC’s functionality. From the start we have worked directly in CWRC’s unique editor, CWRC-Writer, which allows us to edit REED London records, essays, and bibliographical material using more diplomatic and critical TEI P5 XML markup and, at the same time, to create semantic web annotations with RDF that identify, manage, and interlink the entities contained within. The platform is also helping us to develop a better editorial workflow through role-based management of access to data and editing, team communications, and the tracking and reporting of team activities.
To ensure REED London’s stability and sustainability while extending its content and value to new generations of scholars, the project is being built within the CWRC environment. The scope of REED London would not be possible without the sophisticated, integrated platform that CWRC provides. The focus of our first year is the design and construction of a collaborative online production and publication environment. Extending from CWRC’s existing integrated content management and preservation system, the enhanced environment will accommodate the range of record texts and editorial and bibliographical content from the source materials, while a customized browser-based CWRC-Writer platform will support the team’s goal of developing online editorial collaboration and review. The resulting streamlined production and publication environment will yield multi-faceted, user-centered editions, in which modular archival and editorial components can cohere according to various criteria in response to scholars’ research and teaching needs. In this way we are establishing a platform that produces new forms of “edition” that combine customized textual and contextual materials, exportable customized datasets, and dynamic data visualizations. It also means that we will be able to realize the promise of extending the value of these materials to colleagues in fields beyond performance history, including political, religious, and cultural studies, and linguistics.
The partnership between CWRC and REED allows us to explore the potential for new research applications associated with prosopography, networks, and deep contextualization. REED London’s wealth of references to very itinerant individuals across contemporaneous records means that we will be able to discern patterns through linking, analysis, and visualization. We will leverage REED’s named entities for linking people, places, events, and organizations. Our team has healthy debates about the problematic present of linked data. Brown has stated that “linking up with other data means connecting one ontology to another, and this brings with it a pressure toward generalization rather than specificity” (Brown, Simpson, et al., 2015). Cummings has posited that “being able to seamlessly integrate highly complex and changing digital structures from a variety of heterogeneous sources through interoperable methods without either significant conditions or intermediary agents is a deluded fantasy” (Cummings, 2014). Still, as a group we hope that by publishing our ontologies as a means of relating these entities as linked open data, we will be able to contribute to larger dialogues about class and society in Britain, certainly over the 500 years covered by REED London, but also about the development of Britain and Europe. CWRC content will be aggregated by the Advanced Research Consortium (ARC), and REED London will benefit from that aggregation, as we anticipate that people who figure in the REED London corpus, such as Elizabeth I, Francis Bacon, and Inigo Jones, will be discoverable by scholars searching for these known figures across other linked resources. Perhaps more important, REED London records include extended references to thousands of Londoners who were in some way connected to performance, but who were not defined by that connection: civic officials, guild members, lawyers, clerks, priests, etc. The work of this project thus holds as yet unrealized value for a much broader understanding of British historical subjects.
Working within CWRC’s platform and optimizing CWRC-Writer has allowed the core REED London team to move efficiently to an advanced planning phase. By the end of 2017 we will have designed templates for all record formats from Inns of Court and mapped database fields from EMLoT to align with the record parts from the print collections. We will have harvested a preliminary “white list” of named entities (people, places, organizations) from all three print collection indexes, P&P, and Staffordshire. Because of this efficient onramp we will be able to focus in the first half of 2018 on ingesting data, records, and contextual materials from Inns of Court and EMLoT. We will test the REED-specific entity list on ingested materials. We will also begin to user-test the editorial workflow system with the larger project team of REED editors and staff. By June 2018 we will have begun semantic tagging and experimentation with the CWRC HuViz semantic web visualization tool. At the DH 2018 conference we will report on further customization of the CWRC interface and on our plans for data discovery and research collaboration, and we will present preliminary plans for user-responsive editions and data linkage.
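As a rough illustration of what harvesting a preliminary white list from several indexes involves, the following Python sketch merges entity names from different sources and collapses spelling variants. It is not the project's actual tooling: the sample entries, the `normalize` rule, and the `build_white_list` function are all hypothetical and stand in for whatever reconciliation the team performs within CWRC.

```python
import unicodedata

# Hypothetical sample entries, invented for illustration only; they do not
# reproduce the contents of any REED index.
SOURCES = {
    "Inns of Court": [("person", "Bacon, Francis"), ("place", "Gray's Inn")],
    "P&P": [("person", "bacon,  francis"), ("person", "Elizabeth I")],
    "Staffordshire": [("place", "Gray\u2019s Inn"), ("organization", "Queen's Men")],
}


def normalize(name: str) -> str:
    """Build a comparison key: NFKC form, straight apostrophes, case-folded, single-spaced."""
    text = unicodedata.normalize("NFKC", name).replace("\u2019", "'")
    return " ".join(text.casefold().split())


def build_white_list(sources):
    """Merge (type, name) pairs from every source, keeping one record per entity.

    The first spelling encountered becomes the display label, and each record
    lists the sources in which the entity was found.
    """
    merged = {}
    for source, entries in sources.items():
        for kind, name in entries:
            key = (kind, normalize(name))
            record = merged.setdefault(
                key, {"type": kind, "label": name.strip(), "sources": []}
            )
            if source not in record["sources"]:
                record["sources"].append(source)
    return list(merged.values())


if __name__ == "__main__":
    for record in build_white_list(SOURCES):
        print(record["type"], "|", record["label"], "|", ", ".join(record["sources"]))
```

Matching on a normalized string is only a first pass. Distinguishing two people who share a name, or recognizing one person recorded under variant spellings, still requires editorial judgment, which is why a list of this kind would be tested against ingested materials rather than trusted outright.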
Appendix A
- Brown, S. (2016). Tensions and Tenets of Socialized Scholarship. Digital Scholarship in the Humanities, 31 (2): 283-300.
- Brown, S., Simpson, J., CWRC Project Team, and Inke Project Team. (2015). An Entity By Any Other Name: Linked Open Data as a Basis for a Decentered, Dynamic Scholarly Publishing Ecology. Scholarly and Research Communication, 6 (2). http://src-online.ca/index.php/src/article/view/212/409
- Cummings, J. (2014). The Compromises and Flexibility of TEI Customisation. In Mills, C., Pidd, M. and Ward, E. (eds), Proceedings of the Digital Humanities Congress 2012.
- CWRC: About CWRC/CSÉC webpage. http://www.cwrc.ca/about/#whatis
- CWRC Humanities Visualizer webpage. http://www.cwrc.ca/uncategorized/huviz-tool/
- Entity Authority Tool Set (EATS) website. https://eats.readthedocs.io/en/latest/index.html
- Hagen, T., MacLean, S., and Pasin, M. (2014). Moving Early Modern Theatre Online: the Records of Early English Drama introduces the Early Modern London Theatres. http://static.michelepasin.org/public_articles/2014-REED_McLean-Pasin.pdf
- Jakacki, D. (2017). REED London: Humanistic Roots, Humanistic Futures. Paper given at MLA 2017. http://dx.doi.org/10.17613/M67794
- Jakacki, D. (2016). REED and the Prospect of Networked Data. Paper given at the Conference of the Canadian Society for Renaissance Studies. http://dx.doi.org/10.17613/M6CK59
- Liu, A. (2017). Toward Critical Infrastructure Studies. Paper given at the University of Connecticut. https://www.youtube.com/watch?v=2ojrtVx7iCw
- Records of Early English Drama project website. http://reed.utoronto.ca
- REED Patrons and Performances website. https://reed.library.utoronto.ca
- REED Staffordshire Collection website. https://ereed.library.utoronto.ca/collections/staff/