If you use this technique, I think you should try setting DefaultSimilarity.setDiscountOverlaps(true). My tests showed that CommonGrams hurts relevance somewhat, because the injected tokens inflate the field length and so adversely influence lengthNorm. If you set that parameter, tokens with positionIncrement=0 are discounted from the length, and the problem goes away.
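
To make the lengthNorm effect concrete, here is a rough sketch in plain Java. It is not Lucene code: it only models the norm as 1/sqrt(number of counted terms) and ignores boosts and the lossy norm encoding. The class name, method name, and token counts are made up for illustration.

```java
// Simplified model of a 1/sqrt(length) field norm.
// Assumption: boosts and the compressed norm encoding are left out.
final class LengthNormSketch {

    /**
     * @param totalTokens      all tokens emitted for the field, injected ones included
     * @param overlapTokens    tokens emitted with positionIncrement == 0
     * @param discountOverlaps whether overlap tokens are excluded from the length
     */
    static double lengthNorm(int totalTokens, int overlapTokens, boolean discountOverlaps) {
        int counted = discountOverlaps ? totalTokens - overlapTokens : totalTokens;
        return 1.0 / Math.sqrt(counted);
    }

    public static void main(String[] args) {
        // Hypothetical field: 100 original words plus 44 injected tokens.
        int total = 144;
        int overlaps = 44;
        System.out.println(lengthNorm(total, overlaps, false)); // overlaps counted
        System.out.println(lengthNorm(total, overlaps, true));  // overlaps discounted
    }
}
```

With the overlaps counted, the field looks 44% longer than it really is and the norm drops accordingly; with discounting, the norm matches what the same text would get without the injected tokens.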
You are browsing an archive of the HathiTrust website. In July 2023, we launched a new site at www.hathitrust.org.