IncipitSearch - Interlinking Musicological Repositories

Anna Neovesky (), und Frederic von Vlahovits (),

Open research data is facilitating broader ways of using, reusing, enriching, and linking research results. Many services use metadata to bring the information of different repositories together. Europeana, for example, links material from various thematic focal points with diverse origins and makes a wide range of collections, archives and source objects searchable. Other platforms interlink and aggregate material for one distinct discipline or thematic interest.

To connect musicological collections and repositories, we created a metasearch that builds up on annotated music. IncipitSearch is a tool and a service specifically tailored for research on music incipits, the initial sequences of notes that characterise a work. It is simultaneously a centralised data endpoint, where multiple aggregated catalogues can be accessed and searched by their music incipits, as well as a decentralised software and data cluster.

1. Open Data and Meta Search Engines: Perspectives for Digital Musicology?

Open research data is facilitating broader ways of using, reusing, enriching, and linking research results. Many services use metadata to bring the information of different repositories together. Europeana ( https://europeana.eu), for example, links material from various thematic focal points with diverse origins and makes a wide range of collections, archives and source objects searchable. Other platforms interlink and aggregate material for one distinct thematic interest such as Ariadne ( http://ariadne-infrastructure.eu), which makes manifold archaeological contents accessible, or correspSearch ( http://correspsearch.net), which enables to search through collections of editions of letters.

Meanwhile, musicological projects do not only often have digital components, too. Ambitious global catalogue projects like the Répertoire International des Sources Musicales (RISM, https://opac.rism.info) or national library services such as the catalogues of the Italy's Servizio Bibliotecario Nazionale (SBN, http://opac.sbn.it) or the Deutsche Nationalbibliothek, (DNB, https://portal.dnb.de) substantially rely more and more on the digital representation of their data. In addition, overall demand of digital research platforms has lead to born digital editorial projects, e.g. Freischütz Digital, a genuinely digital edition of Carl Maria von Weber’s Freischütz ( http://freischuetz-digital.de) exploring the possibilities of multimedial digital musicological work editions, or the digital thematic work catalogue of the complete edition of Gluck’s works ( http://gluck-gesamtausgabe.de). The researcher’s stronger trust and belief in the benefits of open and accessible research data has led to a stronger emergence of open data policies in musicological projects. In order to interlink existing data repositories and encourage new proposals, a digital data hub is needed. But how can musicological data collections be connected and linked together? In our approach, we concentrated on musical incipits, the initial sequences of notes, that function as identifier for works, and created IncipitSearch, a metasearch for musical incipits.

2. Encoding Music Incipits

One of the main goals of musicological catalogues is making musical works findable and researchable. The main problem that often occurs, especially for music composed before 1800, is that it originally was composed for a singular religious or secular cultural event, e.g. at an aristocratic court to be performed only once or just a few times. Music was additionally bound to formalised genre standards and therefore unambiguous titles were not required. But how to search for a Sonata in D of a composer who has composed 20 sonatas in D? As early as the 1960s, Music librarians introduced the idea to generate a human and machine readable standardised format to identify music by its melodic beginning. For that purpose, Barrey S. Brook and Murray Gold developed the Plaine & Easie Code that allows the transcription of the beginning notes of a musical piece into a combination of numbers and letters. What Brook and Gould pointed out in 1964 was already a distinct definition of and guide to the Plaine & Easie code system. They introduced it as “an accurate shorthand for musical notation, especially useful for incipits and excerpts.” With some foresight they also stated that “it must be so devised to be readily transferable to electronic data-processing equipment for key transposition, fact-finding, tabulaturing and other research purposes.” (Brook and Gold, 1964)

Plaine & Easie Code is a simple to parse plaintext format and therefore suitable to deliver important metadata for manifold musicological interests. IncipitSearch adopts this standard and at the moment is purely concentrated on Plaine & Easie. However, the future goal is to be capable of reading incipits notated in other formats as well, e.g. MEI ( http://music-encoding.org) or abc notation ( http://abcnotation.com).

3. Searching Music Incipits

Musc information retrieval systems either build up on audio or symbolic music notation. In digital musicology, that deals with notation and critical digital edition of works, the search in notated music is widely used (Typke et. al. 2005).

RISM is undoubtedly the most established repository for musical data. It contains over one million records of historic music materials and over 1,7 million musical incipits (for manuscripts only), which can be accessed using an incipit search ( RISM search). Further incipit search engines build up on the RISM datasets. For example, Utrecht University has developed an extended and experimental search approach offering extended functionalities for user input as well as using sophisticated matching and ranking methods (Van Nuss et. al. 2017).

But other musical incipits exist which cannot be accessed via RISM because they either have not been implemented as data yet or because they are not a type of resource the RISM collection is focusing on and will not be added to the catalogue, such as work catalogues.

4. IncipitSearch

4.1. Scope and Functionalities

The efforts to implement incipits in the digital work catalogue of the complete edition of Gluck’s works and to make them searchable have led to the idea of connecting this research data with other repositories and creating even easier ways to instantiate new machine readable incipit repositories. Both digital and analogue catalogues, editions and collections which provide their data in a standardised format can be interlinked with IncipitSearch.

IncipitSearch addresses music that can be displayed in common western music notation. Its main focus lies on music composed prior to 1800. Nevertheless, through its openness it can be furthermore used as a platform to explore challenges in researching culturally and historically different forms of musical notation.

IncipitSearch is a tool and a service specifically tailored for research on music incipits. It is simultaneously a centralised data endpoint where multiple aggregated catalogues of incipits can be accessed as well as a decentralised open source software that can be integrated as stand-alone search in other platforms. A microservice based software architecture allows high flexibility in usage and extension of individual components (Haft et. al. 2015).

IncipitSearch enables users to enter search queries in the search field by playing them on a virtual piano keyboard while Plaine & Easie Code can also be directly entered into the search field. Search with transposition or with exact matching can be selected ( https://incipitsearch.adwmainz.net). Next to the found concordant incipts, the result list displays backlinks to the entry in the respective catalogue.

Screenshot of the search interface of IncipitSearch.
Figure 1. Screenshot of the search interface of IncipitSearch.

4.2. Metadata Schema

To enable a standard suitable for the different types of musicological repositories such as digital and analogue catalogues, editions and collections and to provide an output of the collected data, we have developed an easy to understand RDF schema using the schema.org vocabulary. Besides being recommended by the W3C, cross-linking possibilities for data and the possibility to rely on various vocabularies for specific topics, the interoperability and the multiple serialisation formats for RDF are advantageous.

Schema.org provides a vocabulary for the description of web pages. The initiative of several major search engine companies aims to develop a simple vocabulary to add semantic information to webpages. These vocabularies were designed in collaboration with domain experts. For the markup of music information, the data type MusicComposition ( http://schema.org/MusicComposition) supplies most elements to describe a work and its parts. To add the possibility of describing music incipits, we have expanded the vocabulary with further elements. The format can be used directly for data interchange - a feature request for the extension of schema.org with incipit declaration is planned.

The metadata format functions as an acquisition format for the repositories. It can be used to add information to the catalogue by adding music incipits to existing resource as well as a schema for the annotation and digital publication of analogue catalogues. Moreover, it will provide the aggregated data in a standardised format to enable further usage.

5. Conclusion

At the moment, IncipitSearch aggregates the incipit data of the catalogue of Gluck’s works, the SBN OPAC, the RISM OPAC and includes a sample data set of the thematic Breitkopf Catalogo delle Sinfonie 1762.

IncipitSearch builds on the potential of open musical data and provides the possibility to interlink musicological repositories of various types. This is accomplished by concentrating on musical incipits and using a standardised data interface, a straightforward metadata schema and encapsulated software components.

Through consistent usage of authority control and metadata standards, IncipitSearch is an open source tool and service warranting sustainability, transparency, and accessibility of research data.

6. External Links


Appendix A

Bibliography
  1. Brook, B.S., Gold, M. (1964). Notating Music with Ordinary Typewriter Characters (A Plaine and Easie Code System for Musicke). Fontes Artis Musicae, vol. 11, no. 3, 1964, pp. 142–159. www.jstor.org/stable/23504533.
  2. Haft, M., Neovesky, A. and Reimers, G (2016). Digitale Nachhaltigkeit von Forschungsdaten durch Microservices. FORGE 2016. Forschungsdaten in den Geisteswissenschaften: Conference Abstract, pp. 23 -24. https://www.fdm.uni-hamburg.de/ueber-uns/a-nachrichten/aktivitaeten/forge16/praesentationen/programmheft.pdf#page=23.
  3. Typke, R., Wiering, F. and Veltkamp, R.C. (2005). A survey of music information retrieval systems. Proceedings of the 6th International Conference on Music Information Retrieval, ISMIR 2005, pp. 153 - 160. http://ismir2005.ismir.net/proceedings/1020.pdf.
  4. Van Nuss, J., Giezeman, G.-J., Wiering, F. (2017). Melody retrieval and composer attribution using sequence alignment on RISM incipits. Proceedings of the International Conference on Technologies for Music Notation and Representation. Universidade da Coruña, pp. 79–90. http://doi.org/10.5281/zenodo.924135