HathiTrust Research Center (HTRC) UnCamp 2018


Slides and presentations now posted at: https://osf.io/view/htrc_uncamp2018/

Follow @hathitrust, tweet with #HTRCUC18

Questions? Contact us at htrc-uncamp18@lists.illinois.edu


Location: University of California Libraries, Berkeley, CA

Dates: January 25-26, 2018 (Thursday & Friday)

0.5 day pre-conferences:

  • Jan. 25: 9:00 am - 12:00 pm

1.5 day UnCamp:

  • Jan. 25: 1:00 pm - 5:00 pm
  • Jan. 26: 9:00 am - 5:00 pm


The full schedule for UnCamp is available on Sched.


Registration is now open for HTRC UnCamp 2018:
  • Early registration is $100 and available through November 29, 2017
  • Standard registration is $150 and begins on November 30, 2017
  • Registration fee includes pre-conference sessions

Keynote Speakers

HTRC is excited to welcome keynote speakers Elizabeth Lorang, David Mimno and Leen-Kiat Soh! Dr. Lorang and Dr. Soh will present their Image Analysis for Archival Discovery (Aida) project, supported by the National Endowment for the Humanities (NEH) and the Institute of Museum and Library Services (IMLS). Dr. Mimno will discuss his text analysis work using HathiTrust and HTRC data.

Dr. Liz Lorang Dr. Leen-Kiat Soh

Elizabeth Lorang (Left) is an Associate Professor in the University Libraries and a Faculty Fellow in the Center for Digital Research in the Humanities at the University of Nebraska-Lincoln. Her research and practice emphasize the critical analysis, application and creation of information and information structures. With Leen-Kiat Soh, she leads the Aida project.

Leen-Kiat Soh (Right) is a Professor in the Department of Computer Science and Engineering at the University of Nebraska. His research interests are in multiagent systems, intelligent image analysis, computer-aided education, and computer science education. His work is supported by the National Science Foundation (NSF), IMLS and the National Geospatial-Intelligence Agency (NGA).

Keynote Title: "Increasing Our Vision for 21st-Century Digital Libraries"

Abstract: Through the frames of digital library and collections processing histories, Lorang and Soh will consider how digital libraries enable researchers to find materials within their collections, look at the intersections of research and development with application and practice in digital libraries, and discuss the roles of digital libraries in opening up or closing off the types of questions people can ask--as well as those they might imagine. Within this context, they will introduce the work of Image Analysis for Archival Discovery (Aida), its research questions, methods, and current work, and they will look to the future to propose some expanded visions for digital libraries development.


Dr. David Mimno

David Mimno is an Assistant Professor in the Department of Information Science at Cornell University. He holds a PhD from the University of Massachusetts, Amherst and was previously head programmer at the Perseus Project at Tufts University and a researcher at Princeton University. His work is supported by the Sloan Foundation and NSF.

Keynote Title: "Consistency and Confidence in the Million-book Library"

Abstract: The promise of digitized million-book libraries is that we can get reliable measurements of complicated historical and cultural processes. In this talk, I'll present a general framework for many of the most popular analytics of large-scale text, including topic models and word embeddings. Based on this intuition, I will show both the promise and potential pitfalls of such analyses. Through several case studies, I will present recommendations on how researchers should get the most consistent, confident results, and how we might collectively make HathiTrust more reliable.

Call for Submissions

For the first time, HTRC invites proposals for the 2018 UnCamp. Proposals for panel presentations, lightning talks, and posters may address any aspect of digital text collections, computational text analysis, copyright and open access, digital pedagogy, and related topics, especially as these relate to the HTRC.
Submission deadline extended: November 1, 2017

Please see the full CFP and submission details on the CFP page.


The HathiTrust Research Center is a community-driven organization that supports HathiTrust (HT) by facilitating non-profit and educational uses of the corpus. It enables computational analysis of public domain works and, on limited terms, non-consumptive use of in-copyright works from the collection. The HTRC is a collaborative research center launched jointly by Indiana University and the University of Illinois, along with the HathiTrust Digital Library. It helps researchers meet the technical challenges of working with massive amounts of digital text by developing cutting-edge software tools and cyberinfrastructure that enable advanced computational access to the growing digital record of human knowledge.

Progress in analysis of the HathiTrust Digital Library depends on the quality and accessibility of the content within the HathiTrust Research Center (HTRC) toolset for a wide variety of researchers across the research community. The development, deployment, maintenance, and sustainability of the HTRC tools raise many new challenges. The best way to address them is to engage our research community at the HTRC UnCamp, to better understand current use cases and research projects so that HTRC services can be improved and adapted to meet community needs.

In addition, it is essential that scientists, researchers, and students are able to learn and adopt a new set of skills and methodologies related to the mass-scale analysis of the HT corpus. The HTRC UnCamp 2018 seeks to provide a forum for discussion of these challenges as found in the HathiTrust member community, within scholarly commons centers and the DH curriculum, and across the wider university community.

HTRC UnCamp 2018 aims to facilitate the creation of a national community focused on improving research use of the HathiTrust corpus through computational analysis. The UnCamp will discuss topics relevant to understanding and using the HathiTrust Digital Library corpus within the modern computational research ecosystem. This includes discussion of practices and experiences in mass-scale data mining, visualization, and analysis of the HT collection, with the goal of improving the quality of access and use of the collection by means of the HTRC Data Capsule and other affiliated research tools.

Topics of interest include but are not limited to:

Computational Text Analysis
Possible areas: Computational Text Analysis (CTA) basics, Visualizing HathiTrust data, Tools and methodologies for CTA in HathiTrust, Using Bookworm, CTA and HathiTrust case studies

Worksets and Corpus Creation
Possible areas: HathiTrust as a corpus or data for CTA, how to create, reuse, or publish a focused corpus/workset from HathiTrust, research reproducibility and sharing text as data

Digital Pedagogy and Text Analysis Curricula
Possible areas: Teaching computational text analysis, HathiTrust & HTRC in the classroom, instructional case studies

Fair Use, Copyright, and Non-Consumptive Research in HathiTrust
Possible areas: Copyright and fair use issues related to non-consumptive research, orphaned works, HathiTrust Data Capsule, case studies

Demystifying HathiTrust Metadata
Possible areas: Introduction to HathiTrust metadata, future directions for HTRC metadata, leveraging HathiTrust metadata for analysis and corpus building, metadata tools

HathiTrust Development, News, and Updates
Possible areas: Developing tools and uses for HathiTrust, future directions for HathiTrust, what's new in HathiTrust, HathiTrust community, case studies of tool development

Pre-Conference Program

This year's UnCamp will feature pre-conference sessions for the first time. These sessions will be held on the morning of Thursday, January 25, before the official start of UnCamp. Pre-conference sessions are no additional charge, and the full program can be found on the pre-conference information page.

Organizing Committee:

  • Erik Mitchell, University of California
  • Cody Hennesy, University of California
  • Robert McDonald, Indiana University
  • J. Stephen Downie, University of Illinois
  • Harriett Green, University of Illinois
  • John Unsworth, University of Virginia
  • John Walsh, Indiana University
  • Kathryn Stine, California Digital Library
  • Jean Ferguson, University of California
  • Stacy Reardon, University of California
  • Ashley Bacchi, University of California
  • Ian Knabe, University of California
  • Lisa Rowlison de Ortiz, University of California
  • Quinn Dombrowski, University of California
  • Evan Muzzall, University of California
  • Patricia Frontiera, University of California

Program Committee:

  • Robert H. McDonald, Indiana University (Co-Chair)
  • Cody Hennesy, University of California (Co-Chair)
  • Eleanor Dickson, University of Illinois
  • Ryan Dubnicek, University of Illinois
  • Valerie Glenn, HathiTrust Digital Library
  • Inna Kouper, Indiana University
  • Stacy Reardon, University of California
  • Kathryn Stine, California Digital Library
  • Inclusivity Chair: J. Stephen Downie, University of Illinois

Travel and Accommodations

Full details are available on the travel and accommodations page.


HTRC UnCamp 2018 will be hosted on the University of California, Berkeley campus. The primary venue will be the newly renovated Moffitt Library, with breakout events in nearby campus locations including the Berkeley Institute for Data Science (BIDS) and Morrison Library (both located in Doe Library), the campus D-Lab in Barrows Hall, and the Academic Innovation Studio (AIS) in Dwinelle Hall. All UnCamp events and pre-conferences will be located within a 5-10 minute walk of Moffitt Library.

Code of Conduct

HTRC UnCamp seeks to provide a safe and inclusive environment for everyone, regardless of gender, gender identity, gender expression, sexual orientation, disability, physical appearance, body size, race, ethnicity, or religion. This year’s inclusivity chair is J. Stephen Downie, who is responsible for encouraging diversity and inclusion for all UnCamp participants.

Prior to UnCamp, please review the full UnCamp Code of Conduct.


Thank you to our sponsors:

  • UC Berkeley Libraries
  • CDL
  • UIUC iSchool
  • University of Illinois Library
  • IU School of ICE
  • IU Libraries
