Available Indexes

The slower Solr response times are due to the size of the corpus and the related disk I/O for queries containing common words (the total index size across all the shards is now about 3 terabytes). We suspect that the slower *elapsed* times are due to network problems, which we are in the process of solving.

As for the load on the servers, the Solr servers are dedicated to serving search, so there is no real load from anything else. So far we have not had a demanding query rate: we average about one query every 5-10 seconds. If we start seeing a sustained rate of more than 1 query per second, we will consider replicating the index. Each shard would be replicated, and we would load balance between the replicas. In the near future we will have a second copy of the index in Indiana, primarily for failover purposes, but we will also load balance between the instance here and the one in Indiana, so that should give us some extra time before we have to consider further replication.
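
To make the "sustained rate of over 1 query per second" threshold concrete, here is a minimal Python sketch of a sliding-window rate check. It is only an illustration, not how our monitoring actually works: the class name QueryRateMonitor, the 5-minute window, and the method names are all assumptions made for the example.

```python
from collections import deque


class QueryRateMonitor:
    """Hypothetical helper: average query rate over a sliding time window."""

    def __init__(self, window_seconds=300.0, threshold_qps=1.0):
        # window_seconds is an assumed definition of "sustained" (5 minutes).
        self.window_seconds = window_seconds
        self.threshold_qps = threshold_qps
        self.timestamps = deque()

    def record(self, t):
        """Record a query arriving at time t (seconds) and drop old entries."""
        self.timestamps.append(t)
        cutoff = t - self.window_seconds
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()

    def rate(self):
        """Average queries per second over the window."""
        return len(self.timestamps) / self.window_seconds

    def should_replicate(self):
        """True when the windowed rate exceeds the threshold."""
        return self.rate() > self.threshold_qps


monitor = QueryRateMonitor()

# Roughly the current load: one query every 7.5 seconds for 5 minutes.
for i in range(40):
    monitor.record(7.5 * i)
quiet_rate = monitor.rate()
quiet_flag = monitor.should_replicate()

# A hypothetical busy period: two queries per second for 5 minutes.
for i in range(600):
    monitor.record(1000.0 + 0.5 * i)
busy_rate = monitor.rate()
busy_flag = monitor.should_replicate()
```

At one query every 5-10 seconds the rate is 0.1-0.2 queries per second, so the 1 query per second threshold is roughly 5 to 10 times the current load.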

Tom

You are browsing an archive of the HathiTrust website. In July 2023, we launched a new site at www.hathitrust.org.