Top News
Board of Governors Elections
HathiTrust has announced the members of its new Board of Governors. The full announcement, as well as information about the elections process, is available on the HathiTrust website. The composition of the Board, which officially begins work April 16, is as follows:
Representatives appointed from the founding partner institutions:
Representatives elected at-large:
Serving 5-year terms from 2012-2016
Serving 4-year terms from 2012-2015
Serving 3-year terms from 2012-2014
Call for Nominations: User Support Working Group
The User Support Working Group is seeking nominations from partner institutions for up to 4 new members. Nominations should be sent to Jeremy York (jjyork@umich.edu) and include the nominee’s name, title, and a short description of current job duties. Additional information that might be relevant to participation in the group may be included as well. User Support members are on call at least one day per week and follow up on inquiries throughout the week, which requires 2 to 4 hours of work. Staff who participate in the group will:
- Gain knowledge about HathiTrust’s user base, the typical problems and questions that are raised, and how they are resolved.
- Become aware of new ways HathiTrust is being used, and features and functionality that users desire.
- Gain knowledge of HathiTrust organizational and technical infrastructure, and policies and procedures relating to copyright, access, collection development, deposit of materials, and preservation.
The charge for the working group is available at http://www.hathitrust.org/wg_user-support_charge.
Data API Modifications
Effective May 1, support for legacy Data API URLs in the following form will be removed:
http://services.hathitrust.org/api/htd/pathinfo-arguments
After May 1, URLs should be submitted according to the current Data API schema without the “api” path element:
http://services.hathitrust.org/htd/pathinfo-arguments
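Clients that store or generate legacy URLs could rewrite them before the cutoff. The following is a minimal Python sketch of that rewrite, not an official HathiTrust tool: the function name `migrate_data_api_url` is hypothetical, and `pathinfo-arguments` is the same placeholder used in the URL forms above.

```python
from urllib.parse import urlsplit, urlunsplit

# Path prefixes taken from the two URL forms described in this update.
LEGACY_PREFIX = "/api/htd/"
CURRENT_PREFIX = "/htd/"
DATA_API_HOST = "services.hathitrust.org"


def migrate_data_api_url(url: str) -> str:
    """Drop the legacy "api" path element from a Data API URL.

    URLs that are already in the current form, or that point at another
    host, are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc == DATA_API_HOST and parts.path.startswith(LEGACY_PREFIX):
        new_path = CURRENT_PREFIX + parts.path[len(LEGACY_PREFIX):]
        parts = parts._replace(path=new_path)
    return urlunsplit(parts)


if __name__ == "__main__":
    print(migrate_data_api_url(
        "http://services.hathitrust.org/api/htd/pathinfo-arguments"
    ))
```

Because the function leaves current-form URLs untouched, it can be applied to a mixed list of old and new URLs without checking each one first.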
Data API Security
Over the next several months HathiTrust will be implementing security enhancements to the Data API. The enhancements will require developers using the API to acquire an OAuth 1.0 access key that identifies them, and a secret key that must be used to “sign” URLs to retrieve HathiTrust resources via the Data API. HathiTrust will also provide a Web client that employs a user’s login credentials as a proxy for these keys to facilitate non-programmatic uses. In March, staff at the University of Michigan integrated 2-legged OAuth into the Data API and began to develop the Data API client. OAuth is planned for release in April 2012. Once it is released, there will be an approximately 6-month transition period, ending October 1, 2012, during which signed access to the Data API will be possible but not required. After October 1, all requests to the Data API will need to be properly signed with an access key retrieved from HathiTrust. Complete documentation of the security enhancements and methods of obtaining keys and accessing the Web client is forthcoming.
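Until the official documentation is published, the general shape of a 2-legged OAuth 1.0 signed request can be sketched as follows. This is an illustration of the generic OAuth 1.0 scheme (RFC 5849), not HathiTrust's implementation. It assumes the HMAC-SHA1 signature method and assumes the OAuth parameters are passed in the query string; neither detail is confirmed in this update. The function name `sign_url` and the key values are hypothetical.

```python
import base64
import hashlib
import hmac
import secrets
import time
from urllib.parse import quote


def _enc(value) -> str:
    # OAuth 1.0 percent-encoding: only unreserved characters stay literal.
    return quote(str(value), safe="-._~")


def sign_url(url, access_key, secret_key, method="GET",
             extra_params=None, nonce=None, timestamp=None):
    """Return `url` with OAuth 1.0 parameters and an HMAC-SHA1 signature.

    Assumes `url` is already normalized (lowercase scheme and host) and
    carries no query string; request parameters go in `extra_params`.
    """
    params = dict(extra_params or {})
    params.update({
        "oauth_consumer_key": access_key,
        "oauth_nonce": nonce or secrets.token_hex(8),
        "oauth_signature_method": "HMAC-SHA1",  # assumed method
        "oauth_timestamp": int(time.time()) if timestamp is None else timestamp,
        "oauth_version": "1.0",
    })
    # Encode every name and value, then sort the encoded pairs.
    pairs = sorted((_enc(k), _enc(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    # Signature base string: METHOD & encoded URL & encoded parameter string.
    base_string = "&".join([method.upper(), _enc(url), _enc(normalized)])
    # 2-legged OAuth has no token secret, so the key ends with a bare "&".
    key = _enc(secret_key) + "&"
    digest = hmac.new(key.encode(), base_string.encode(), hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode()
    return f"{url}?{normalized}&oauth_signature={_enc(signature)}"


if __name__ == "__main__":
    # Placeholder keys; real keys would be issued by HathiTrust.
    print(sign_url("http://services.hathitrust.org/htd/pathinfo-arguments",
                   access_key="EXAMPLE_ACCESS_KEY",
                   secret_key="EXAMPLE_SECRET_KEY"))
```

The nonce and timestamp are generated fresh on each call, so two requests for the same resource produce different signed URLs. In practice an established OAuth library would normally be preferable to hand-rolled signing.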
Ingest
Local Digitization
University of Michigan staff are preparing tools that will allow partners to build complete ingest packages for materials they wish to deposit in HathiTrust. The tools will include functionality to remediate images and build METS files to HathiTrust specifications, and validate files prior to submission to HathiTrust. Several institutions have agreed to test the tools in the coming months. It is hoped that over time all partners and other entities that contribute content to HathiTrust will use the tools to create their submission packages, thereby distributing the effort needed to ingest materials produced from different sources.
Working Groups and Committees
Collections
The Collections Committee’s report on duplicate volumes in HathiTrust is now available. As described in last month’s update, the report recommends that HathiTrust retain all duplicate copies ingested into the repository for the time being, with periodic reassessment. The Strategic Advisory Board has requested that the Committee make further recommendations about the criteria that should be applied in future assessments and identify the future costs and risks of retaining duplicates in the corpus. The Committee also hopes to finalize its recommendations concerning a process for responding to requests and offers within the next several months.
User Experience Advisory Group
The UX Advisory Group conducted informal usability testing to evaluate the impact of changes proposed to the PageTurner interface to incorporate a volume version (date of last ingest). The group plans to discuss the results and make recommendations on the changes in April, with implementation to follow shortly thereafter.
User Support Working Group
The table below contains a summary of the issues received by the User Support Working Group in March.
Issue Type | March | February |
Content | 203 | 106 |
  Quality | 193 | 97 |
  Non-partner Digital Deposit | 0 | 3 |
  Collections | 9 | 2 |
Cataloging | 49 | 24 |
Access and Use | 195 | 131 |
  Copyright | 137 | 73 |
  Permissions | 17 | 20 |
  Takedown | 1 | 1 |
  Print on Demand | 0 | 1 |
  Inter-library loan | 2 | 0 |
  Full-PDF or e-copy requests | 19 | 17 |
  Datasets | 2 | 1 |
  Data Availability and APIs | 2 | 0 |
  Reuse of content | 6 | 0 |
Web applications | 11 | 22 |
  Functionality problems | 4 | 7 |
  Problems with login specifically | 1 | 0 |
  General Questions about login | 3 | 5 |
  Partners setting up login | 3 | 3 |
  Usability issues | 0 | 1 |
  Feature requests | 0 | 0 |
Partner Ingest | 5 | 5 |
General | 101 | 152 |
  Partnership | 7 | 11 |
  Infrastructure | 0 | 2 |
  Miscellaneous | 94 | 139 |
*See User Support Working Group Issue Types for a description of the types of issues included in each category.
Projects
Bibliographic Data Management
California Digital Library achieved a milestone in March, loading all bibliographic records submitted by HathiTrust contributing institutions into the Zephir production environment. The goals of this dry-run load were to test the functionality of the new metadata management system (Zephir), to test the production infrastructure, and to compare the production loading time with a previous load on a development server. The metadata management team continued to reconcile bibliographic records in Zephir with those in the current system at the University of Michigan to ensure that all data was accounted for, addressing record discrepancies and ingest errors as they were encountered. The team also began to verify that the bibliographic record collation processes in Zephir produce the same record clusters as the collation processes at Michigan.
jPach (formerly HathiTrust Publishing)
Staff of the University of Michigan formally named the journal publishing platform Michigan will use in conjunction with HathiTrust: jPach. Design principles and requirements for jPach, plus a description of the platform’s modules, are posted on the University of Michigan Library website. The project page on the HathiTrust website now includes a full project timeline.
Michigan staff continued work to generate valid JATS XML from DOCX files, render JATS XML files in PageTurner, and create a METS profile for the jPach Submission Information Package.
HathiTrust Research Center (HTRC)
The HathiTrust Research Center released a report of its activities over the last 6 months. More information about the Research Center can be found on the HTRC web page.
IMLS Quality Grant
Project staff continued whole-volume review of digital volumes in the first production sample (pre-1923 English-language Google-digitized volumes), looking for errors such as missing, duplicate, and out-of-order pages, as well as generally “bad” pages, defined in relation to the severity scale established for page-level review. Staff also continued page-level review of the project’s 4th 1,000-volume sample, consisting of non-Roman language volumes. Physical review of Michigan volumes sampled in the second production run (post-1923 Google-digitized English-language volumes) continued in March. Students have completed review of 543 of the 600 Michigan volumes present in the 1,000-volume sample. Further information about the grant project is available from the project website.
Development Updates
Full-text Search
Staff at the University of Michigan completed work on the next iteration of advanced full-text search, which will allow users to build queries with greater Boolean complexity and will make it easier to revise advanced searches. The new features will be released in early April. Staff also made significant progress on plans to improve the relevance ranking of search results.
Storage Hardware Replacement Cycle
Michigan staff installed new storage at the Indiana and Michigan sites that will both accommodate 2012 volume projections and replace storage scheduled for retirement. Storage due for retirement will be taken offline starting in April.
Web Hosting Infrastructure Changes
Developers and system administrators at Michigan began preparations to move HathiTrust’s Drupal-based informational website and VuFind-based catalog from their initial hosting environments, currently on Michigan library infrastructure, to dedicated HathiTrust hardware, where they will run alongside other HathiTrust applications. This move will simplify application integration.
Outages
No outages were reported in March 2012.
HathiTrust sends notice upon discovery and resolution of unscheduled outages and in advance of scheduled outages and maintenance work that may result in an outage. We welcome and encourage additional recipients for these notices. If your institution is not receiving outage notifications and would like to, please contact feedback@issues.hathitrust.org.
New Growth
As of April 1:
Institution | March | Total |
Columbia University | 6 | 64,183 |
Cornell University | 896 | 392,356 |
Duke University | 1 | 4,523 |
Harvard University | 1 | 53,675 |
Indiana University | 480 | 187,635 |
Library of Congress | 5 | 89,416 |
North Carolina State University | 0 | 3,196 |
University of North Carolina - Chapel Hill | 1 | 8,088 |
Northwestern University | 554 | 6,820 |
New York Public Library | 31 | 259,537 |
Penn State University | 18 | 43,280 |
Princeton University | 171 | 250,789 |
Purdue University | 41 | 23,981 |
University of California | 758 | 3,329,769 |
The University of Chicago | 309 | 13,206 |
University of Illinois | 1,001 | 15,504 |
Universidad Complutense | 3,083 | 111,823 |
University of Michigan | 4,124 | 4,529,978 |
University of Minnesota | 2,696 | 95,064 |
University of Wisconsin | 1,297 | 534,870 |
University of Virginia | 0 | 48,921 |
Utah State | 0 | 90 |
Yale University | 0 | 23,678 |
Total | 15,473 | 10,090,382 |
Public Domain (~28%)
Total* | 5,458 | 2,783,946** |
* Includes volumes opened through copyright review and rights holder permissions
** Corrected 5/11/2012. Previous number included 1,389 images from the Minnesota Digital Library
Papers and Presentations
Presentations
Jeremy York, "HathiTrust: Aspiring to Build the Universal Library". UKSG Annual Conference, March 26, 2012.
Jeremy York, "HathiTrust and the Research Library of the Future". American Antiquarian Society Conference on Needs and Opportunities, March 31, 2012.
April Forecast
- Release user interface enhancements for advanced full-text search
- Continue work on relevance ranking of full-text search results
- Complete work on Data API security
You can follow HathiTrust on Twitter.