Available Indexes

Hi Salman,

You are the second person to comment on an unexpected decrease in performance with CommonGrams, so I'm interested in trying to determine the cause.

CommonGrams should improve performance for simple phrase queries containing common words if your bottleneck is disk I/O for reading the positions index. How large is your 200K word index? How large is your *prx file?
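
For anyone following along, here is a rough sketch of the idea behind CommonGrams at index time. It is a simplified illustration in Python, not the actual Lucene/Solr filter: the function name `common_grams`, the `_` separator, and the tiny common-word list are assumptions made for the example. Every token is kept, and a bigram is added for each adjacent pair in which at least one token is a common word.

```python
def common_grams(tokens, common_words):
    """Simplified index-time sketch (not the real Lucene filter).

    Keeps every original token and adds a bigram for each adjacent pair
    in which at least one of the two tokens is a common word.
    """
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i + 1 < len(tokens):
            nxt = tokens[i + 1]
            if tok in common_words or nxt in common_words:
                # Hypothetical separator; chosen here only for readability.
                out.append(tok + "_" + nxt)
    return out


if __name__ == "__main__":
    common = {"the", "of"}  # illustrative common-word list
    print(common_grams(["the", "rain", "of", "spain"], common))
```

A phrase query can then look up a bigram such as "the_rain", which occurs far less often than "the" alone, so far fewer positions have to be read from disk. The extra bigram terms are also why the index grows and the number of unique terms goes up.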

You should see faster response times for simple phrase queries. Are you getting slower response times for simple phrase queries? What kinds of queries are slower with the CommonGrams index?

It's possible that wildcard queries are slower with the CommonGrams index because of the increase in the number of unique terms. CommonGrams will also probably not work properly with proximity searches, although I don't remember enough of the implementation details of proximity searches and slop factors to tell you in which cases they will and won't work.

As for the increase in index size, we saw a 50-60% increase. I wonder if there is some interaction with your filter chain.

Tom

You are browsing an archive of the HathiTrust website. In July 2023, we launched a new site at www.hathitrust.org.