Changeset 26155

Timestamp:
05.09.2012 21:06:33
Author:
ak19
Message:

Searching with wildcards on a Lucene collection now displays term info in the search results, but only at section level, not yet at document level. Document-level term info can be obtained by configuring the MultiTermQuery rewrite method to a setting that can throw an exception if the number of terms exceeds BooleanQuery.getMaxClauseCount(). With that setting, searching with wildcards would work as it does in GS2 again. (It works in GS2 because GS2's LuceneWrapper uses Lucene core library 2.3.2, whereas GS3's LuceneWrapper3 uses Lucene core library 3.3.0.)

Files:
1 modified

  • main/trunk/greenstone2/common-src/indexers/lucene-gs/src/org/greenstone/LuceneWrapper3/GS2LuceneQuery.java

--- main/trunk/greenstone2/common-src/indexers/lucene-gs/src/org/greenstone/LuceneWrapper3/GS2LuceneQuery.java (r24732)
+++ main/trunk/greenstone2/common-src/indexers/lucene-gs/src/org/greenstone/LuceneWrapper3/GS2LuceneQuery.java (r26155)
@@ -54,4 +54,7 @@
 import org.apache.lucene.util.Version;
 
+import org.apache.lucene.search.MultiTermQuery;
+import org.apache.lucene.search.MultiTermQuery.ConstantScoreAutoRewrite;
+
 public class GS2LuceneQuery extends SharedSoleneQuery
 {
@@ -147,4 +150,35 @@
 
         Query query = parseQuery(reader, query_parser, query_string, fuzziness);
+
+        // GS2's LuceneWrapper uses lucene-2.3.2. GS3's LuceneWrapper3 works with lucene-3.3.0.
+        // This change in lucene core library for GS3 had the side-effect that searching on
+        // "econom*" didn't display what terms it was searching for, whereas it had done so in GS2.
+
+        // The details of this problem and its current solution are explained in the ticket
+        // http://trac.greenstone.org/ticket/845
+
+        // We need to change the settings for rewriteMethod in order to get searches on wildcards to
+        // produce search terms again when the query is rewritten.
+
+        if(query instanceof MultiTermQuery) {
+
+        // default docCountPercent=0.1; default termCountCutoff=350
+
+        // Creating custom cutoff values, taking into account of existing cutoff values
+        MultiTermQuery.ConstantScoreAutoRewrite customRewriteMethod = new MultiTermQuery.ConstantScoreAutoRewrite();
+        customRewriteMethod.setDocCountPercent(100.0);//MultiTermQuery.ConstantScoreAutoRewrite.DEFAULT_DOC_COUNT_PERCENT);
+        customRewriteMethod.setTermCountCutoff(350);
+
+        MultiTermQuery multiTermQuery = (MultiTermQuery)query;
+        multiTermQuery.setRewriteMethod(customRewriteMethod);
+
+        // the above works when searching with wildcards over sections, the following also
+        // works on book searches, but has been discouraged as it can throw an exception if
+        // the number of terms exceeds BooleanQuery.getMaxClauseCount().
+        // http://lucene.apache.org/core/3_6_1/api/core/org/apache/lucene/search/MultiTermQuery.html
+
+        //multiTermQuery.setRewriteMethod(MultiTermQuery.CONSTANT_SCORE_BOOLEAN_QUERY_REWRITE);//MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);
+        }
+
         query = query.rewrite(reader);
 
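
The effect of the two cutoff settings in this change can be illustrated with a small Lucene-free model. This is only a sketch of how the auto-rewrite decision is understood to work in Lucene 3.x: the query is rewritten to a boolean query over the expanded terms (so the terms stay visible) unless either the number of terms or the number of documents visited reaches its cutoff, in which case a filter is used and the terms are no longer exposed. The names `AutoRewriteModel`, `Strategy` and `choose` are hypothetical and appear nowhere in Greenstone or Lucene, and the cutoff arithmetic is an assumption that should be checked against the Lucene source.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified, Lucene-free model of the auto-rewrite cutoff decision.
// Hypothetical illustration only; this is not the library's code.
final class AutoRewriteModel {

    enum Strategy { BOOLEAN_OF_TERMS, FILTER }

    /**
     * @param termDocFreqs    expanded wildcard terms mapped to the number of documents containing each
     * @param maxDoc          total number of documents in the index
     * @param docCountPercent cutoff as a percentage of maxDoc (the change sets 100.0; default 0.1)
     * @param termCountCutoff cutoff on the number of expanded terms (350 in the change)
     * @param maxClauseCount  stand-in for BooleanQuery.getMaxClauseCount() (1024 by default)
     */
    static Strategy choose(Map<String, Integer> termDocFreqs, int maxDoc,
                           double docCountPercent, int termCountCutoff, int maxClauseCount) {
        // Assumed arithmetic: percentage of the index converted to a document count.
        int docCountCutoff = (int) ((docCountPercent / 100.0) * maxDoc);
        int termCountLimit = Math.min(maxClauseCount, termCountCutoff);

        int termsSeen = 0;
        int docVisitCount = 0;
        for (int docFreq : termDocFreqs.values()) {
            termsSeen++;
            docVisitCount += docFreq;
            // Either cutoff being reached switches to the filter strategy.
            if (termsSeen >= termCountLimit || docVisitCount >= docCountCutoff) {
                return Strategy.FILTER;
            }
        }
        return Strategy.BOOLEAN_OF_TERMS;
    }

    public static void main(String[] args) {
        Map<String, Integer> terms = new LinkedHashMap<>();
        terms.put("economy", 3);
        terms.put("economic", 2);
        terms.put("economics", 1);

        System.out.println(choose(terms, 100, 0.1, 350, 1024));   // default percentage
        System.out.println(choose(terms, 100, 100.0, 350, 1024)); // percentage used in this change
        System.out.println(choose(terms, 5, 100.0, 350, 1024));   // same settings, much smaller index
    }
}
```

Under this model, raising the percentage to 100.0 keeps the term-preserving strategy for a large index, while a small index can still reach the document cutoff with the same settings.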