How to Add ETAS Records to Your Catalog

April 8, 2020

Updated October 14, 2021 to correct the order of the columns in the Report Structure table.  The correct order is: oclc, local_id, item_type, rights, access

Update: Recording of HathiTrust ETAS Discovery Office Hours
Learn more on how to add other types of HathiTrust records to your catalog


How to identify your ETAS records in your local catalog to users

This is an evolving page and will be updated as more examples and external documentation become available.

Once your institution has been approved for the HathiTrust Emergency Temporary Access Service (ETAS) (https://www.hathitrust.org/etas-approved-libraries) there are three possible paths to take to identify and make discoverable those records from HathiTrust that are newly open for Full View to your users. 

Note: If you have not applied or been approved, please visit this page for further information on that process

Method A: The BibAPI
Method B: Using the HathiFiles
Method C: Third-Party Discovery Services

With Any Method: Automatic Login 

Examples of Each Method 

 

Overlap Report

Upon approval for the ETA Service, a new overlap report will be generated for your institution identifying which of your print holdings have a digital equivalent in HathiTrust. This overlap report is the same as the overlap reports you can otherwise request at any time: it shows the access designations ("allow" and "deny" values) that would apply under normal (non-ETAS) conditions. Under those normal conditions, you would have access only to the "allow" records. Under the new service, you have access to both the "allow" and the "deny" records.

 

Overlap Report Key

| Category | Definition |
| --- | --- |
| oclc | An OCLC number for the item(s) |
| local_id | The institutional identifier in the local system |
| item_type | One of three values: ‘serial’, ‘multi’, ‘mono’, representing, respectively, serials, multipart monographs, and monographs |
| rights | A code (also referred to as “rights attribute”) that describes the copyright status, license or access |
| access | An access code that describes whether or not users can view the item. The access code is derived from the rights attribute. Two possible values: ‘allow’ or ‘deny’ |


In the example below, this library would now have access to every item shown, whereas before only one item (OCLC number 5577640, row 2) was available via HathiTrust. See the full list of codes.

Overlap Report Example

| oclc | local_id | item_type | rights | access |
| --- | --- | --- | --- | --- |
| 1234380 | 99467233402401 | serial | ic | deny |
| 5577640 | 994021423402401 | multi | pd | allow |
| 30688167 | 994628483402401 | multi | ic | deny |
| 5040497 | 993390943402401 | serial | ic | deny |

 

Items that do not have matching HathiTrust items will have blanks in the report in both the ‘rights’ and ‘access’ columns for that item’s row. The table below shows an example of how this will appear:

 

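| oclc | local_id | item_type | rights | access |
| --- | --- | --- | --- | --- |
| 2635 | b10000203 | mono | ic | deny |
| 2698 | b10000215 | mono | | |
| 2747 | b10000227 | mono | ic | deny |
| 2867 | b10000239 | mono | | |
| 2950 | b10000240 | mono | | |
| 23523324 | b10000252 | mono | | |
| 3014 | b10000254 | multi | pdus | allow |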

 

In this example, there are four items (OCLC numbers 2698, 2867, 2950 and 23523324) that HathiTrust does not have or cannot match to your record (i.e., the “access” value is blank), two that were not viewable before (“deny”) and are now temporarily available under the ETA Service (OCLC numbers 2635 and 2747), and one (OCLC number 3014) that was previously viewable and remains so (“allow”).
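The reading of the report described above can be sketched in a few lines of Python. This is only an illustration: it assumes the report is tab-delimited and begins with a header row naming the five columns, and the function name and group labels are invented for this example.

```python
import csv
import io

def classify_overlap(tsv_text):
    """Group OCLC numbers by what the 'access' column says about each row."""
    groups = {"no_match": [], "etas_only": [], "already_open": []}
    # Assumption: a header row with the column names used in the tables above.
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        access = (row.get("access") or "").strip()
        if access == "allow":
            groups["already_open"].append(row["oclc"])   # viewable before ETAS
        elif access == "deny":
            groups["etas_only"].append(row["oclc"])      # newly viewable under ETAS
        else:
            groups["no_match"].append(row["oclc"])       # no matching HathiTrust item
    return groups

# Three rows taken from the example table above.
sample = "\n".join([
    "oclc\tlocal_id\titem_type\trights\taccess",
    "2635\tb10000203\tmono\tic\tdeny",
    "2698\tb10000215\tmono\t\t",
    "3014\tb10000254\tmulti\tpdus\tallow",
])
result = classify_overlap(sample)
print(result)
```

Only the rows in the "etas_only" group need new or relabelled links; the "already_open" rows were viewable before ETAS.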

 

Linking Syntax

There are two preferred URI syntaxes that will resolve for the patron to either the catalog record page or the item page for the PageTurner application:

 

Title Level:

https://catalog.hathitrust.org/Record/{ht_bib_key/clusterID}

 

Item Level:

https://hdl.handle.net/2027/{htid}


See details on where to find htid and ht_bib_key values under Method B: Using the HathiFiles below.
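As a small illustration of the two syntaxes, the helpers below fill in the URL templates. The function names are hypothetical, and the sample identifiers are the ones that appear in the BibAPI example later on this page.

```python
def title_url(ht_bib_key):
    """Title-level link to the HathiTrust catalog record page."""
    return f"https://catalog.hathitrust.org/Record/{ht_bib_key}"

def item_url(htid):
    """Item-level link that resolves through the handle server."""
    return f"https://hdl.handle.net/2027/{htid}"

print(title_url("000578050"))
print(item_url("mdp.39015025315527"))
```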
 

 

Method A: The BibAPI

One approach to integrating these newly available records into your catalog is to use the HathiTrust Bibliographic API.

The BibAPI is intended for small batch calls of up to 20 items at a time. Calls should normally be made using the OCLC number, although other identifiers are also accepted, and can retrieve either ‘Full’ or ‘Brief’ records.

Syntax:

Brief:

https://catalog.hathitrust.org/api/volumes/brief/<id type>/<id value>.json

Full:

https://catalog.hathitrust.org/api/volumes/full/<id type>/<id value>.json

 

The two calls differ in the amount of information retrieved about each record: the ‘full’ call also returns the full MARC-XML inside the JSON response.

Example API call and results from the Bib API:

Below is an example (from the BibAPI documentation linked above) of what a call for the brief record for OCLC number 00424023 returns, using this URI: https://catalog.hathitrust.org/api/volumes/brief/oclc/00424023.json

(The call for the full record would be: https://catalog.hathitrust.org/api/volumes/full/oclc/00424023.json. In the interest of space, the full record results are not included here.)

{
    "records":{
        "000578050":{
            "recordURL":"https:\/\/catalog.hathitrust.org\/Record\/000578050",
            "titles":["Infinite series"],
            "isbns":["9780030110405","9780030110405"],
            "issns":[],
            "oclcs":["424023"],
            "lccns":["62009520"],
            "publishDates":["1962"]
        }
    },
    "items":[
        {
            "orig":"University of Michigan",
            "fromRecord":"000578050",
            "htid":"mdp.39015025315527",
            "itemURL":"https:\/\/hdl.handle.net\/2027\/mdp.39015025315527",
            "rightsCode":"ic",
            "lastUpdate":"20200225",
            "enumcron":false,
            "usRightsString":"Limited (search-only)"
        },
        {
            "orig":"University of California",
            "fromRecord":"000578050",
            "htid":"uc1.b4405602",
            "itemURL":"https:\/\/hdl.handle.net\/2027\/uc1.b4405602",
            "rightsCode":"ic",
            "lastUpdate":"20190118",
            "enumcron":false,
            "usRightsString":"Limited (search-only)"
        }
    ]
}

 

Important fields from API results: 

  • Volume Identifier - ‘htid’ 

    • This is the permanent HathiTrust item identifier. Each item identifier is unique. Used to make the ‘itemURL’.

  • HathiTrust Record Number - ‘fromRecord’

    •  HathiTrust's record number for the associated bibliographic record. HathiTrust record numbers are not permanent and can change over time.

    • Used to make the ‘recordURL’.

  • OCLC Number - ‘oclcs’

    • OCLC number(s) for the bibliographic record. Multiple values are separated by a comma. At least one OCLC number will match that used for the API call.

  • Rights Code - ‘rightsCode’

    • A code (also referred to as “rights attribute”) that describes the copyright status, license or access. 

  • Access String - ‘usRightsString’

    • Identifies whether the item is temporarily available through ETAS (e.g. ‘Limited (search-only)’) or normally available (e.g. ‘Full View’).

  • Catalog Record URL: ‘recordURL’

    • The URL for the catalog page.

  • Item Record URL: ‘itemURL’

    • The URL for a specific digital item

 

Below is a generic idea of the logic that can be used to implement the BibAPI into local discovery layers:

  • Pass an array of up to 20 OCLC numbers -- just the digits -- to the API

  • A call is made for each number, and then the API fetches data from HathiTrust based on that OCLC number passed in.

  • The brief or full record will be returned as a json response.

    • If no data is retrieved, or the number is not present in HathiTrust, return a null value. 

  • Parse through the returned rights values and distinguish between ‘Full View’ and ‘Limited View’

  • Update the linking text from ‘Limited View’ to text identifying that the item is available through ETAS. Example: ‘Temporary Access’
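The steps above can be sketched in Python using only the standard library. Treat this as a rough outline, not a reference implementation: the function names, the "Temporary Access" relabelling rule and the trimmed sample response are assumptions made for illustration, and the sketch presumes the OCLC numbers come from your own ETAS-eligible holdings.

```python
import json
import urllib.request

API_BASE = "https://catalog.hathitrust.org/api/volumes"
MAX_BATCH = 20  # batch size stated for the BibAPI on this page

def clean_oclc(value):
    """Keep only the digits of an OCLC number such as 'ocm00424023'."""
    return "".join(ch for ch in str(value) if ch.isdigit())

def bib_api_url(oclc_number, record_type="brief"):
    """Build the single-identifier URL from the Syntax section."""
    return f"{API_BASE}/{record_type}/oclc/{oclc_number}.json"

def fetch_record(oclc_number, record_type="brief"):
    """Fetch and decode one response (this performs a network request)."""
    with urllib.request.urlopen(bib_api_url(oclc_number, record_type)) as resp:
        return json.load(resp)

def etas_links(response):
    """Return (itemURL, link text) pairs, or None when nothing matched."""
    items = (response or {}).get("items") or []
    if not items:
        return None
    links = []
    for item in items:
        rights = item.get("usRightsString", "")
        # Hypothetical local rule: anything not reported as full view is
        # relabelled, since the API still reports the normal (non-ETAS) status.
        label = "Full View" if rights.lower().startswith("full") else "Temporary Access"
        links.append((item["itemURL"], label))
    return links

def lookup(oclc_numbers, fetch=fetch_record):
    """Look up a small batch, one call per OCLC number."""
    if len(oclc_numbers) > MAX_BATCH:
        raise ValueError(f"pass at most {MAX_BATCH} OCLC numbers per batch")
    return {n: etas_links(fetch(clean_oclc(n))) for n in oclc_numbers}

# Trimmed copy of the example response above, used instead of a live call.
sample = {
    "records": {"000578050": {"titles": ["Infinite series"]}},
    "items": [{
        "htid": "mdp.39015025315527",
        "itemURL": "https://hdl.handle.net/2027/mdp.39015025315527",
        "rightsCode": "ic",
        "usRightsString": "Limited (search-only)",
    }],
}
print(lookup(["ocm00424023"], fetch=lambda number: sample))
```

Passing a stand-in `fetch` function, as in the last line, lets the relabelling logic be tried without contacting HathiTrust; omit the argument to make live calls.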

 

Note: For the ETA Service, we have created a new ‘Temporary Access’ label on the HathiTrust catalog site. However, the ‘usRightsString’ values returned by the API have not been changed, so any link text should be modified locally to reflect this new access.

 

Method B: Using the HathiFiles

Another method for integrating these newly available item records into your catalog is to download the most recent monthly HathiFiles (Links: HathiFiles and the HathiFiles Description ) and filter the results against the overlap report to extract only items that are now accessible to your library. 

 

The HathiFiles is a tab-delimited file containing an entry for every item in HathiTrust. 

ID values from the HathiFiles that are important for the ETA Service:

 

  • Volume Identifier - ‘htid’ 

    • This is the permanent HathiTrust item identifier. Each item identifier is unique.

  • HathiTrust Record Number - ‘ht_bib_key’

    •  HathiTrust's record number for the associated bibliographic record. HathiTrust record numbers are not permanent and can change over time.

  • OCLC Number - ‘oclc_num’

    • OCLC number(s) for the bibliographic record. Multiple values are separated by a comma.

  • Rights Code - ‘rights’

    • A code (also referred to as “rights attribute”) that describes the copyright status, license or access. 

 

For a more detailed breakdown of all of the fields present, please see the HathiFiles Description.

Example Applications of the HathiFiles:

Cornell Library regularly makes use of the HathiFiles. For ETAS, they have modified the process slightly to take the overlap report into account and used that information to create links that force login for authentication. More details on what they did can be found here: http://blogs.cornell.edu/discoveryandaccess/2020/04/01/adding-hathitrust-emergency-access-links/ 

Temple University Library has also used the HathiFiles, though in a slightly different fashion. For members that have had issues with the size of the HathiFiles, Temple’s approach may be instructive. Chad Nelson, a developer for Temple University Libraries, has written a blog post that goes into detail about his process using Temple’s overlap report and the HathiFiles together to create a smaller file (12MB) that is then used to identify associated ht_bib_key values for items newly available to them via ETAS. You can find that post here: https://chads.space/words/libraries/2020/04/27/temple-libraries-hathi-trust.html

 

Below is a modified version of the summary of the post:

 

1. Download the latest HathiTrust monthly file (e.g. hathi_full_20200401.txt.gz) from the HathiFiles page.

2. Pare the monthly file down to just the needed data (OCLC number and HathiTrust bib key) with:

  • gunzip the monthly full .gz file 

  • csvcut to limit to just the needed columns [ht_bib_key (column 4) and OCLC number (column 8)]

  • csvgrep to eliminate rows without required fields [removes any row with an empty value in either of the two columns remaining after the previous step]

  • sort and uniq to eliminate duplicates [de-dupes the remaining values so only unique entries remain]

gunzip -c hathi_full_20200401.txt.gz | \
  csvcut -t -c 4,8 -z 1310720 | \
  csvgrep -c 1,2 -r ".+" | \
  sort | uniq > hathi_full_dedupe.csv

 

3. Take your overlap report and extract the unique set of OCLC numbers:

csvgrep -t -c 4 -r ".+" [overlap report].tsv | \
  csvcut -c 1 | csvsort | uniq \
  > overlap_all_unique.csv

 

4. Then filter the pared down HathiTrust data from step 2 using the overlap OCLC numbers from step 3 as the filter input:

csvgrep -c 2 -f overlap_all_unique.csv \
  hathi_full_dedupe.csv > hathi_filtered_by_overlap.csv

 

5. The output file, hathi_filtered_by_overlap.csv, is a two-column CSV of related OCLC numbers and HathiTrust bib keys that represent the items in your library available through ETAS. You can use it to construct links to HathiTrust items based on the OCLC numbers in your catalog records.
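For libraries without csvkit, roughly the same filtering can be sketched in plain Python. This is an alternative outline, not the process from the blog post: it assumes the overlap report has a header row, that ht_bib_key is the fourth field and oclc_num the eighth field of each HathiFiles line (check the HathiFiles Description for your file), and the function names and sample rows are invented for illustration.

```python
import csv
import io

def overlap_oclcs(overlap_tsv):
    """Unique OCLC numbers from overlap rows whose rights column is not blank."""
    reader = csv.DictReader(overlap_tsv, delimiter="\t")
    return {row["oclc"] for row in reader if (row.get("rights") or "").strip()}

def filter_hathifile(hathifile, wanted, bib_key_col=3, oclc_col=7):
    """Yield unique (oclc, ht_bib_key) pairs whose OCLC number is in `wanted`.

    Column positions are zero-based and are an assumption; adjust as needed.
    """
    seen = set()
    for line in hathifile:
        # Plain split avoids the csv module's field size limit on long fields.
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= max(bib_key_col, oclc_col):
            continue
        bib_key = fields[bib_key_col]
        # A field may hold several comma-separated OCLC numbers.
        for oclc in fields[oclc_col].split(","):
            pair = (oclc.strip(), bib_key)
            if bib_key and pair[0] in wanted and pair not in seen:
                seen.add(pair)
                yield pair

# Made-up sample data standing in for the real files.
overlap = io.StringIO(
    "oclc\tlocal_id\titem_type\trights\taccess\n"
    "2635\tb10000203\tmono\tic\tdeny\n"
    "2698\tb10000215\tmono\t\t\n"
)
hathifile = io.StringIO(
    "mdp.001\tdeny\tic\t000000001\t\tMIU\t1\t2635\n"
    "mdp.002\tdeny\tic\t000000002\t\tMIU\t2\t9999\n"
)
wanted = overlap_oclcs(overlap)
pairs = sorted(filter_hathifile(hathifile, wanted))
print(pairs)
```

Unlike the csvgrep step, this sketch splits multi-valued OCLC fields on commas before matching, so a record listing several OCLC numbers can still match a single number from the overlap report. With real files, pass open file handles (for example from gzip.open in text mode) in place of the StringIO objects.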

 

Method C: Third-Party Discovery Services

In the event that your library makes use of third-party discovery services like OCLC WorldCat Discovery, Primo, Summon or EBSCO EDS and would rather activate ETAS via one of those discovery layers, we are in conversation with those partners to make this a viable alternative for temporary access. This space will be updated in the event of any developments. 

OCLC:

If you are using WorldCat Discovery to set holdings on HathiTrust titles, you will need to create a custom KBART collection in your knowledge base with the HathiTrust titles, URLs and OCNs. If you need assistance with creating this collection and/or file, please contact OCLC Support here.

When reaching out to OCLC Support, please include your HathiTrust overlap report and use the subject line "HathiTrust access during COVID shutdown".

PRIMO:

For Primo, the University of Minnesota has created a package (available on GitHub or NPM) that supplements the results for locally held items with links to associated HathiTrust records. The “Primo explore HathiTrust availability” package can, when search results are displayed, pass each record's OCLC numbers to the HathiTrust Bib API. If at least one match is found, a link to the HathiTrust record is appended to the availability section. A recent update to this package allows the copyright status of the records to be ignored, so that matches will include all locally held items, not just the public domain ones. To enable this behavior, follow the steps below:

 

1) Upgrade to version 2.4.0 of the primo-explore-hathitrust-availability package. 

2) Set an "ignore-copyright" attribute on the component in your template. For example: 

<hathi-trust-availability ignore-copyright="true"></hathi-trust-availability>

 

With Any Method: Use Automatic Login Syntax

 

By appending a single sign-on parameter to a HathiTrust URL or item handle, it is possible to construct a link to a HathiTrust item that automatically passes users from HathiTrust member institutions through their institution's authentication service. If users have already authenticated, they are effectively "automatically" logged into HathiTrust. If they have not authenticated yet, they are prompted to do so. See more details at Automatic Login for Partner Institutions.

 

 

Note on proxies: A proxy server is not necessary for access to HathiTrust, and use of proxies to access HathiTrust is actively discouraged. The only requirement is to authenticate via a SAML identity provider (IdP). Sending users through proxy servers is more likely to cause login and/or throttling issues.  

 

If your institution automatically proxies via your discovery layer or an add-on like Zotero, we recommend adding a NeverProxy directive for all of HathiTrust within the respective config files. 

ex: 

NeverProxy   .?hathitrust.org

 

 

URLs must be constructed in the following way (all examples use the University of Michigan $ENTITY_ID):

Generic syntax for automatic login that then redirects to an item:

https://hdl.handle.net/2027/$ID?urlappend=%3Bsignon=swle:$ENTITY_ID

Example: https://hdl.handle.net/2027/mdp.39015008098132?urlappend=%3Bsignon=swle:...

 

Generic Syntax for automatic login that redirects to the main HathiTrust catalog page:

https://www.hathitrust.org/?signon=swle:$ENTITY_ID

Example: https://www.hathitrust.org/?signon=swle:https://shibboleth.umich.edu/idp...

 

Generic Syntax for automatic login that redirects to a record catalog page:

https://catalog.hathitrust.org/Record/$RECORD_ID?signon=swle:$ENTITY_ID

Example: https://catalog.hathitrust.org/Record/001111513?signon=swle:https://shib...

 

Terms:
  • $ID is the HathiTrust item ID (for example mdp.39015008098132)

  • $RECORD_ID is the catalog record ID (for example 001111513)

  • $ENTITY_ID is your Shibboleth identity provider (IdP) entity ID.
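The three generic syntaxes can be filled in programmatically. The sketch below substitutes the values into the templates exactly as written above; the function names and the sample entity ID are hypothetical, and no extra URL encoding is applied, which is an assumption based on the unencoded examples shown.

```python
def item_login_url(htid, entity_id):
    """Automatic login, then redirect to an item."""
    return f"https://hdl.handle.net/2027/{htid}?urlappend=%3Bsignon=swle:{entity_id}"

def home_login_url(entity_id):
    """Automatic login, then redirect to the main HathiTrust page."""
    return f"https://www.hathitrust.org/?signon=swle:{entity_id}"

def record_login_url(record_id, entity_id):
    """Automatic login, then redirect to a catalog record page."""
    return f"https://catalog.hathitrust.org/Record/{record_id}?signon=swle:{entity_id}"

# Hypothetical entity ID; replace it with your institution's IdP entity ID.
ENTITY_ID = "https://idp.example.edu/idp/shibboleth"

print(item_login_url("mdp.39015008098132", ENTITY_ID))
print(home_login_url(ENTITY_ID))
print(record_login_url("001111513", ENTITY_ID))
```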

Automatic Login for Institutions serving multiple campuses or organizations

If you are managing a catalog or website that serves users from multiple campuses or organizations, the above structure won't work because it only allows you to list one entity ID in the syntax. Instead, you can set up automatic login links that direct your users to our login screen first, where they have to select their organization from the dropdown list under "Find your partner institution." 

URLs must be constructed in the following way:

 

Examples of Each Method:

Method A: The BibAPI -- Columbia University

Method B: The HathiFiles -- Cornell University

Method C: Third Party Vendors

Primo: Oxford University

OCLC WMS: University of Delaware

EBSCO Discovery: Indiana University

 

 

 

 

 

You are browsing an archive of the HathiTrust website. In July 2023, we launched a new site at www.hathitrust.org.