Timestamp:
2012-09-05T21:06:33+12:00
Author:
ak19
Message:

Searching with wildcards on a lucene collection now displays term info in the search results, but only at section level, not yet at document level. Document-level term info can be obtained by setting the MultiTermQuery rewrite method to one that can throw an exception if the number of terms exceeds BooleanQuery.getMaxClauseCount(). With that setting, searching with wildcards would work as it does in GS2 again. (It works in GS2 because GS2's LuceneWrapper uses lucene core library 2.3.2, whereas GS3's LuceneWrapper3 uses lucene core library 3.3.0.)

File:
1 edited

  • main/trunk/greenstone2/common-src/indexers/lucene-gs/src/org/greenstone/LuceneWrapper3/GS2LuceneQuery.java

--- r24732
+++ r26155
@@ -54,4 +54,7 @@
 import org.apache.lucene.util.Version;
 
+import org.apache.lucene.search.MultiTermQuery;
+import org.apache.lucene.search.MultiTermQuery.ConstantScoreAutoRewrite;
+
 public class GS2LuceneQuery extends SharedSoleneQuery
 {
@@ -147,4 +150,35 @@
 
         Query query = parseQuery(reader, query_parser, query_string, fuzziness);
+
+        // GS2's LuceneWrapper uses lucene-2.3.2. GS3's LuceneWrapper3 works with lucene-3.3.0.
+        // This change in lucene core library for GS3 had the side-effect that searching on
+        // "econom*" didn't display what terms it was searching for, whereas it had done so in GS2.
+
+        // The details of this problem and its current solution are explained in the ticket
+        // http://trac.greenstone.org/ticket/845
+
+        // We need to change the settings for rewriteMethod in order to get searches on wildcards to
+        // produce search terms again when the query is rewritten.
+
+        if(query instanceof MultiTermQuery) {
+
+        // default docCountPercent=0.1; default termCountCutoff=350
+
+        // Creating custom cutoff values, taking into account of existing cutoff values
+        MultiTermQuery.ConstantScoreAutoRewrite customRewriteMethod = new MultiTermQuery.ConstantScoreAutoRewrite();
+        customRewriteMethod.setDocCountPercent(100.0);//MultiTermQuery.ConstantScoreAutoRewrite.DEFAULT_DOC_COUNT_PERCENT);
+        customRewriteMethod.setTermCountCutoff(350);
+
+        MultiTermQuery multiTermQuery = (MultiTermQuery)query;
+        multiTermQuery.setRewriteMethod(customRewriteMethod);
+
+        // the above works when searching with wildcards over sections, the following also
+        // works on book searches, but has been discouraged as it can throw an exception if
+        // the number of terms exceeds BooleanQuery.getMaxClauseCount().
+        // http://lucene.apache.org/core/3_6_1/api/core/org/apache/lucene/search/MultiTermQuery.html
+
+        //multiTermQuery.setRewriteMethod(MultiTermQuery.CONSTANT_SCORE_BOOLEAN_QUERY_REWRITE);//MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);
+        }
+
         query = query.rewrite(reader);
 
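
The trade-off described in the commit message and code comments (a boolean-style rewrite exposes the matched terms but can fail when a wildcard matches too many of them) can be illustrated with a minimal stand-alone sketch. This is not Lucene code and not part of the changeset: the class `WildcardExpansionSketch`, the method `expandPrefix`, and `TooManyClausesException` are hypothetical names used only for illustration, and the clause limit is passed in as a plain parameter rather than read from `BooleanQuery.getMaxClauseCount()`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Simplified model of wildcard expansion; none of these names are Lucene API.
class WildcardExpansionSketch {

    /** Hypothetical analogue of the error raised when an expanded query has too many clauses. */
    static class TooManyClausesException extends RuntimeException {
        TooManyClausesException(String message) {
            super(message);
        }
    }

    /**
     * Collects every indexed term that starts with the given prefix.
     * Throws once the number of matching terms would exceed maxClauseCount,
     * mirroring the risk the commit message attributes to a boolean rewrite.
     */
    static List<String> expandPrefix(SortedSet<String> indexedTerms, String prefix, int maxClauseCount) {
        List<String> matches = new ArrayList<>();
        // In a sorted set, all terms sharing the prefix are contiguous,
        // starting at the first term >= prefix.
        for (String term : indexedTerms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            if (matches.size() >= maxClauseCount) {
                throw new TooManyClausesException(
                    "prefix '" + prefix + "' matches more than " + maxClauseCount + " terms");
            }
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>(Arrays.asList(
            "ecology", "economic", "economics", "economies", "economy", "education"));

        // With a generous limit, the expanded terms are available for display.
        System.out.println(expandPrefix(terms, "econom", 1024));

        // With a small limit, the same expansion fails instead of returning terms.
        try {
            expandPrefix(terms, "econom", 2);
        } catch (TooManyClausesException e) {
            System.out.println("expansion failed: " + e.getMessage());
        }
    }
}
```

In this model, the expanded term list is what a search interface could show as term info for a query such as "econom*". The exception branch corresponds to the failure that the code comment gives as the reason the boolean rewrite is discouraged.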