New Hardware for searching 5 million+ volumes of full-text

On November 19, 2009, we put new hardware into production to provide full-text search across about 4.6 million volumes. We currently have about 5.3 million volumes indexed. Below is a brief description of our current production hardware. Future posts will give details about performance and background on our experiments with different system architectures and configurations.

Hardware details

Solr Server configuration

  • Dell PowerEdge R710
  • 2 x Quad Core Intel Xeon E5540 2.53GHz processors (Nehalem)
  • 72 GB RAM
  • Red Hat Enterprise Linux 5.4 (kernel: 2.6.18 x86_64)
  • Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
  • Solr 1.3.0.2009.09.03.11.14.39 (1.4-dev 793569)
  • Tomcat 5.5.27

Storage

  • Isilon IQ NAS cluster (20 I/X-series nodes)
  • 480 SATA drives (750 GB or 1 TB each) providing 420 TB raw storage
  • 4 GB RAM per node, giving 80 GB of coherent cache in aggregate
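
The storage figures above can be cross-checked with a little arithmetic. The sketch below is only an illustration: it assumes decimal units, that the quoted totals are exact, and that drives are spread evenly across nodes. None of those assumptions is stated in the post, so the resulting drive mix is an inference rather than a reported figure.

```python
# Figures quoted in the storage list above.
NODES = 20
RAM_PER_NODE_GB = 4
DRIVES = 480
RAW_TB = 420

# Aggregate cache: each node contributes its RAM.
aggregate_cache_gb = NODES * RAM_PER_NODE_GB

# Drives per node, ASSUMING an even spread across nodes.
drives_per_node = DRIVES // NODES

# Drive mix, ASSUMING decimal units and exact totals:
#   small + large = DRIVES
#   0.75 * small + 1.0 * large = RAW_TB
# Subtracting the second equation from the first gives 0.25 * small = DRIVES - RAW_TB.
small_drives = round((DRIVES - RAW_TB) / (1.0 - 0.75))
large_drives = DRIVES - small_drives

print("aggregate cache (GB):", aggregate_cache_gb)
print("drives per node:", drives_per_node)
print("750 GB drives:", small_drives, "| 1 TB drives:", large_drives)
```

Under those assumptions, the 80 GB cache figure follows directly from 20 nodes at 4 GB each, and the 420 TB total is consistent with an even split between the two drive sizes.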

Network

  • NFS traffic uses a dedicated, private GbE network with a 9K MTU on a Dell PowerConnect 5448 switch
  • NFS clients are single-homed, and mounts are automatically distributed across all cluster nodes

Current Solr Architecture and Configuration

Search Servers

  • 4 servers with one Tomcat and 3 shards per server; 10 of the 12 shards are currently in use
  • 16 GB allocated to the JVM
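
To make the shard layout concrete, here is a minimal sketch of how a distributed query across the active shards might be assembled using Solr's standard `shards` request parameter. The host names, port, and shard paths are hypothetical placeholders, and the choice of which 10 of the 12 shards are active is an assumption. The post does not describe how queries are actually routed.

```python
from urllib.parse import urlencode

SERVERS = 4            # search servers, from the list above
SHARDS_PER_SERVER = 3  # shards hosted by the single Tomcat on each server
SHARDS_IN_USE = 10     # 10 of the 12 shards are active


def shard_addresses(port=8080):
    """Return host:port/path for all 12 shards (names are hypothetical)."""
    addrs = []
    for server in range(1, SERVERS + 1):
        for slot in range(1, SHARDS_PER_SERVER + 1):
            shard = (server - 1) * SHARDS_PER_SERVER + slot
            addrs.append(f"solr-search-{server}.example.org:{port}/shard{shard}")
    return addrs


def distributed_query_url(query, rows=10):
    """Build a select URL that fans out to the active shards.

    ASSUMPTION: the first 10 shards are the active ones, and the first
    shard also acts as the aggregator for the request.
    """
    active = shard_addresses()[:SHARDS_IN_USE]
    params = urlencode({"q": query, "rows": rows, "shards": ",".join(active)})
    return f"http://{active[0]}/select?{params}"


if __name__ == "__main__":
    print(distributed_query_url("whale"))
```

Any shard can receive the request and merge results from the others, so in practice the aggregating shard could be chosen in rotation rather than fixed as it is here.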

Indexing Server

  • 1 server with 12 Tomcats and 12 shards; 10 of the 12 Tomcats/shards are currently in use
  • 6 GB allocated to each of the 10 JVMs
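
The JVM allocations above imply a rough memory budget for each server role. The sketch below assumes that every server has the 72 GB of RAM listed in the server configuration and that the quoted allocations are the only JVM memory that matters. Both are simplifications, since a JVM uses some memory beyond its heap.

```python
PHYSICAL_RAM_GB = 72  # from the server specification; ASSUMED to apply to every server


def ram_outside_jvm(jvm_count, heap_gb, physical_gb=PHYSICAL_RAM_GB):
    """Return the RAM (GB) not claimed by JVM allocations on one server."""
    return physical_gb - jvm_count * heap_gb


# Search server: one Tomcat (one JVM) with 16 GB.
search_left_gb = ram_outside_jvm(jvm_count=1, heap_gb=16)

# Indexing server: 10 active JVMs with 6 GB each.
indexing_left_gb = ram_outside_jvm(jvm_count=10, heap_gb=6)

print("search server, RAM outside the JVM (GB):", search_left_gb)
print("indexing server, RAM outside the JVMs (GB):", indexing_left_gb)
```

On these assumptions, a search server leaves most of its RAM to the operating system, which can use it to cache index files, while the indexing server commits most of its RAM to its JVMs.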

 
