The HathiTrust Research Center (HTRC) has selected four projects to participate in its special round of Advanced Collaborative Support (ACS), funded by the Andrew W. Mellon Foundation through the Scholar-Curated Worksets for Analysis, Reuse & Dissemination (SCWAReD) project.
The projects will seek to build HTRC worksets drawn from materials related to historically under-resourced and marginalized textual communities, and in doing so, to identify gaps in the HathiTrust collection where such communities are not represented in the digital library. The worksets will be analyzed using text and data mining techniques. The worksets, derived data outputs, and associated documentation will be shared at the end of the projects as illustrative research models of the text and data mining process. The four research models will join a flagship model that is being developed concurrently in collaboration with co-PI Maryemma Graham and her History of Black Writing project at the University of Kansas.
The four awarded projects are:
Mining the Native American Authored Works in HathiTrust for Insights
Kun Lu, Raina Heaton, and Raymond Orr (University of Oklahoma)
This project seeks to compile a collection of Native American authored works in HathiTrust and apply various text mining methods to the collection to reveal the coverage, subjects, perspectives, and writing styles of Native authors. A list of Native authors and their works will be compiled from an existing database created by a member of the project team and from other online resources. This list will be aligned with the HathiTrust digital library to create a workset of Native American authored works in HathiTrust for further analysis. Then, a variety of text mining methods will be used to analyze the subjects, topics, language use, and writing styles of Native American authors. Comparative analysis will be carried out to understand the characteristics of this textual community. The project is expected to develop a database of Native American authors and the bibliographic information of their works, create a reusable workset of Native American authored works in HathiTrust, identify potential gaps in the HathiTrust corpus on this textual community, and provide insights into the characteristics of the community by text mining their works.
The Black Fantastic: Curated Vocabularies, Artifact Analysis and Identification
Clarissa West-White (Bethune-Cookman University) and Seretha Williams (Augusta University)
This project focuses on identifying Black Fantastic texts in the HathiTrust Digital Library. The project proposes that characteristics of the Black Fantastic—the cultural production of African Diasporic artists and creators who engage with the intersections of race and technology in their work—exist in historical and current cultural artifacts, including those created by and about future-forward personalities, such as Dr. Mary McLeod Bethune. It builds on previous and ongoing work to create a bibliography of the Black Fantastic that is featured in Third Stone Journal. Works in HathiTrust will be analyzed along with Black Fantastic artifacts from other collections, such as the Dr. Mary McLeod Bethune collection in the Bethune-Cookman University archives. By working across collections, the project will test methods for locating Black Fantastic texts and lives.
Creating Period-Specific Worksets for Latin American Fiction
José Eduardo González (University of Nebraska, Lincoln)
This project seeks to create large datasets to research the history of Latin American fiction and to question the traditional periodization of this literature by attempting to detect the boundaries between literary periods and subgenre distinctions in Latin American fiction. It will look critically at the techniques for detecting genre distinctions that have developed over the last few years and evaluate how they apply to the particular development of the Latin American literary system. While many of the subgenres in the English-speaking literary market, such as detective fiction, the Gothic novel, and speculative fiction, have followers in Latin America, the genres that have traditionally been considered important for the changes in the literary history of the region are less formulaic and more closely linked to national and regional historical and/or social developments. Instead of attempting to identify canonical documents that typify a genre, this project will examine how documents diverge from a particular canon in order to explore the social and cultural reasons an author might accept or deviate from a dominant style.
The National Negro Health Digital Project: Recovering and Restoring a Black Public Health Corpus
Kim Gallon (Purdue University)
This project draws on HathiTrust’s collection of public health documents on Black health to explore how early twentieth-century Black public health officials communicated and addressed health disparities that impacted African American communities. The major goal of the project is to create a series of worksets and visualizations that scholars and students of African American health and medicine, along with public health experts and physicians, can use to deepen historical narratives about Black health that might offer insight into the development of contemporary health communications targeted toward African American communities. The project also establishes some of the research for Technologies of Recovery: Black DH Theory and Praxis, a book in progress. Finally, the work will fill a gap in the history of African American public health.