Interrogating the Roots of American Settler Colonialism: Experiments in Network Analysis and Text Mining

Ashley Sanders Garcia, The Claremont Colleges, United States of America

Even as the United States fought for independence in the American Revolution, it was already in the process of becoming a settler colonial power in its own right. This short paper interrogates the origins of American settler colonialism through text mining three corpora of personal and official documents. In order to understand and address present structural inequity in the United States, scholars, policy-makers, educators, and the public need to examine the country’s long history as a settler colonial society.

Through topic modeling and text mining methods, my research highlights the underlying goals and desires that prompted land acquisition, settlement, and cycles of violence between Euro-American settlers and Native Americans in the trans-Appalachian west between 1776 and 1820. This project explores three collections, or corpora, of documents, separated by the positions of the historical authors and document type: settler correspondence and records; official government documents; and writings of political elites in the eastern United States. The documents included in each corpus were transcribed and published in bound volumes during the nineteenth century and are now in the public domain. The first corpus consists of correspondence, journals, and memorials from settlers, colonial officials, and military leaders in the territories (colonies) between 1776 and 1820. This is the smallest corpus of the three, at two million words. Few documents from representative settlers have been transcribed and published, so the corpus over-represents leaders in the settler communities; however, the petitions from the settlers to Congress give voice to the most pressing challenges, needs, and hopes of the settlers themselves. The second corpus, of approximately four million words, consists of official government records, including treaties with Native American communities, military records, documents related to public lands and governance of the territories, as well as pension and other petitions submitted to Congress in the late eighteenth and early nineteenth centuries. The third corpus is by far the largest of the three, at approximately 39 million words, and consists of the papers of the foremost political leaders in the eastern United States. The letters of the members of the Continental Congress are included, as are the writings of George Washington, James Madison, Thomas Jefferson, Benjamin Franklin, and John Adams. Not surprisingly, these statesmen wrote far more than settlers, who were primarily concerned with agricultural cultivation, hunting, and defending their families on the frontier.

The aforementioned sources form the corpora for text mining and analysis experiments. My study extracts and compares the perspectives of American settlers, administrators, and political leaders on significant topics in the study of settler colonialism, such as land value; property acquisition and sales; and the presence, actions, and views of Native Americans. Early experiments using the LDA algorithm (Blei, 2012) in MALLET to topic model the corpora, and Lexos to visualize the topic clouds, have already revealed significant patterns.

While recognizing that topic models are more effective with large corpora, my research began with a small experiment. Using MALLET, I created a ten-topic model of the twenty-five published petitions from settlers to Congress (1787-1798) in the Territorial Papers of the United States. This model suggests that one of the primary motivations for Euro-American emigrants to move to the western territories was to achieve what they described as a competency, or the means to rear their children “in a comfortable manner” and “raise a subsistence by their [own] industry” (Petition from the Inhabitants of Vincennes to Congress, 1787). The topic related to land reveals the dominant concerns that settlers expressed. They implored Congress to recognize their existing land claims, ensure reasonable land prices, provide military protection from Native American raids, and ensure justice through the provision of judges. These measures, they believed, would foster access to land, enable trade, establish legitimacy, and provide settlers with the means to achieve their modest goals.
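For readers unfamiliar with how an LDA topic model is fitted, the following is a minimal sketch of collapsed Gibbs sampling in plain Python. It is not the MALLET workflow used in this study: MALLET also relies on Gibbs sampling, but its implementation is far more optimized and handles stopword removal and hyperparameter tuning. The function names (`lda_gibbs`, `top_words`), the parameter values, and the toy documents are all assumptions invented for illustration; the tokens are placeholders, not drawn from the petitions.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                # Resample the topic and restore the counts.
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


def top_words(topic_word, n=5):
    """Return the n most frequent words assigned to each topic."""
    return [
        [w for w, c in counts.most_common(n) if c > 0]
        for counts in topic_word
    ]


# Toy, invented documents (hypothetical tokens, not from the corpus).
docs = [
    "land claims price land survey acre".split(),
    "land price acre purchase claims".split(),
    "judges courts justice law judges".split(),
    "courts law justice petition judges".split(),
]
doc_topic, topic_word = lda_gibbs(docs, num_topics=2)
for k, words in enumerate(top_words(topic_word)):
    print(f"topic {k}: {' '.join(words)}")
```

With only four short documents the resulting topics are unstable, which illustrates why topic models are generally more effective with large corpora.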

Even though their objectives differed from those of the settlers, government officials, both in the east and on the ground in the western territories, were equally motivated to acquire land beyond the Appalachian Mountains. In the aftermath of the American Revolution, the government was in dire financial straits. Political leaders urged agents to obtain western lands from Native communities so that the territory could be sold to pay off the burdensome war debts. Consequently, backcountry government officials decried settler violence against neighboring Indigenous communities, even as they took advantage of the unruly settlers’ actions to compel land cessions that the United States government desperately needed.

Figure 1: Topics related to land in the Continental Congress members’ correspondence

There was a high price to be paid for white American independence, though, as the topics generated from the Continental Congress members’ correspondence demonstrate (Figure 1). The words “transmitted,” “negotiations,” “ceding,” “extinguishment,” “extinguishing,” and, ominously, “funeral” stand out among the more benign “northwest,” “lands,” and “western.” Most of these words are more or less neutral when considered out of context, but, given their use in relation to the settler colonial endeavor, they evidence the brutal effects of American land acquisition and expropriation from Native communities.

These topics and the related documents not only direct attention to specific sources for close reading, but also yield new terms of interest to explore at a distance and in a broad comparative framework. In addition to the results of topic modeling the aforementioned corpora, this presentation will share experiments using part-of-speech tagging and collocations to explore concepts such as land, family, independence, competency, and war, in order to understand the ways in which settlers and political and military leaders conceived of each of these concepts.
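As an illustration of what a collocation experiment involves, the sketch below counts the words that occur within a fixed window around a term of interest such as “land.” It is a simplified stand-in, not the tooling used in this project: the function name `collocates`, the window size, the stopword list, and the sample sentences are all assumptions invented for the example, and the sample text is not quoted from any historical document.

```python
import re
from collections import Counter


def collocates(text, node, window=4, stopwords=frozenset()):
    """Count words appearing within `window` tokens of each `node` occurrence."""
    # Crude tokenizer: lowercase alphabetic runs only.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            # Skip the node word itself and any stopwords.
            if j != i and tokens[j] not in stopwords:
                counts[tokens[j]] += 1
    return counts


# Invented sample text (hypothetical, not from the corpora).
sample = (
    "The settlers asked Congress to confirm their land claims. "
    "Congress set the price of public land by the acre."
)
stop = frozenset({"the", "of", "by", "their", "to"})
result = collocates(sample, "land", window=2, stopwords=stop)
print(result.most_common())
```

Running the same function over each corpus separately would allow the collocates of a term to be compared across settlers, officials, and political elites; raw counts like these are usually weighted by a measure of association before interpretation.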

This talk offers an initial glimpse into the early stages of a much larger project that seeks to create an interactive interface for documents from the first four decades of the United States’ formation as a nation and nascent empire, based on topic models and text mining approaches such as named entity recognition and collocation analysis. The interface will eventually allow users to drill down into documents that contain specific sought-after features, such as individuals’ names, gender identity, topics of interest, etc. This interface, it is hoped, will enable historians, students, genealogists, and interested members of the public to explore some of the most important documents related to the complicated, conflicting, and, occasionally, complementary objectives of American settlers and other political actors. The policies these agents developed between 1776 and 1820 not only shaped American settler colonialism in the eighteenth and nineteenth centuries, but also continue to reverberate more than two centuries later.

Appendix A

  1. Blei, D. M. (2012) “Probabilistic Topic Models.” Communications of the ACM 55.4 (April 2012): 77-84.
  2. Blei, D. M. (2012) “Topic Modeling and Digital Humanities.” Journal of Digital Humanities 2.1 (Winter 2012).
  3. The Inhabitants of Vincennes to Congress, July 26, 1787, in Territorial Papers of the United States, Volume 2 (Washington, D.C.: United States Government Printing Office, 1934): 58-60.