- Timestamp:
- 2015-06-30T19:17:36+12:00
- Location:
- gs3-extensions/solr/trunk/src/collect/solr-jdbm-demo/etc/conf
- Files:
- 42 added
- 2 edited
gs3-extensions/solr/trunk/src/collect/solr-jdbm-demo/etc/conf/schema.xml
r28092 r30001 46 46 --> 47 47 48 <schema name="example" version="1.4"> 48 <schema name="example" version="1.5"> 49 49 <!-- attribute "name" is the name of this schema and is only used for display purposes. 50 Applications should change this to reflect the nature of the search collection. 51 version="1.4" is Solr's version number for the schema syntax and semantics. It should 52 not normally be changed by applications. 53 1.0: multiValued attribute did not exist, all fields are multiValued by nature 50 version="x.y" is Solr's version number for the schema syntax and 51 semantics. It should not normally be changed by applications. 52 53 1.0: multiValued attribute did not exist, all fields are multiValued 54 by nature 54 55 1.1: multiValued attribute introduced, false by default 55 1.2: omitTermFreqAndPositions attribute introduced, true by default except for text fields. 56 1.2: omitTermFreqAndPositions attribute introduced, true by default 57 except for text fields. 56 58 1.3: removed optional field compress feature 57 1.4: default auto-phrase (QueryParser feature) to off 59 1.4: autoGeneratePhraseQueries attribute introduced to drive QueryParser 60 behavior when a single string produces multiple tokens. Defaults 61 to off for version >= 1.4 62 1.5: omitNorms defaults to true for primitive field types 63 (int, float, boolean, string...) 58 64 --> 59 65 66 <fields> 67 <!-- Valid attributes for fields: 68 name: mandatory - the name for the field 69 type: mandatory - the name of a field type from the 70 <types> fieldType section 71 indexed: true if this field should be indexed (searchable or sortable) 72 stored: true if this field should be retrievable 73 docValues: true if this field should have doc values. Doc values are 74 useful for faceting, grouping, sorting and function queries. Although not 75 required, doc values will make the index faster to load, more 76 NRT-friendly and more memory-efficient. 
They however come with some 77 limitations: they are currently only supported by StrField, UUIDField 78 and all Trie*Fields, and depending on the field type, they might 79 require the field to be single-valued, be required or have a default 80 value (check the documentation of the field type you're interested in 81 for more information) 82 multiValued: true if this field may contain multiple values per document 83 omitNorms: (expert) set to true to omit the norms associated with 84 this field (this disables length normalization and index-time 85 boosting for the field, and saves some memory). Only full-text 86 fields or fields that need an index-time boost need norms. 87 Norms are omitted for primitive (non-analyzed) types by default. 88 termVectors: [false] set to true to store the term vector for a 89 given field. 90 When using MoreLikeThis, fields used for similarity should be 91 stored for best performance. 92 termPositions: Store position information with the term vector. 93 This will increase storage costs. 94 termOffsets: Store offset information with the term vector. This 95 will increase storage costs. 96 required: The field is required. It will throw an error if the 97 value does not exist 98 default: a value that should be used if no value is specified 99 when adding a document. 100 --> 101 102 <!-- field names should consist of alphanumeric or underscore characters only and 103 not start with a digit. This is not currently strictly enforced, 104 but other field names will not have first class support from all components 105 and back compatibility is not guaranteed. Names with both leading and 106 trailing underscores (e.g. _version_) are reserved. 107 --> 108 109 <!-- If you remove this field, you must _also_ disable the update log in solrconfig.xml 110 or Solr won't start. 
_version_ and update log are required for SolrCloud 111 --> 112 113 <field name="docOID" type="string" indexed="true" stored="true" required="true" /> 114 115 <field name="ZZ" type="text_en_splitting" indexed="true" stored="false" multiValued="true" /> 116 <field name="TX" type="text_en_splitting" indexed="true" stored="false" multiValued="true" /> 117 <field name="TI" type="text_en_splitting" indexed="true" stored="false" multiValued="true" /> 118 <field name="SU" type="text_en_splitting" indexed="true" stored="false" multiValued="true" /> 119 <field name="ORG" type="text_en_splitting" indexed="true" stored="false" multiValued="true" /> 120 121 122 <field name="_version_" type="long" indexed="true" stored="true"/> 123 124 <!-- points to the root document of a block of nested documents. Required for nested 125 document support, may be removed otherwise 126 --> 127 <field name="_root_" type="string" indexed="true" stored="false"/> 128 129 <!-- Only remove the "id" field if you have a very good reason to. While not strictly 130 required, it is highly recommended. A <uniqueKey> is present in almost all Solr 131 installations. See the <uniqueKey> declaration below where <uniqueKey> is set to "id". 
132 --> 133 <!-- 134 <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" /> 135 --> 136 137 <!-- 138 <field name="sku" type="text_en_splitting_tight" indexed="true" stored="true" omitNorms="true"/> 139 <field name="name" type="text_general" indexed="true" stored="true"/> 140 <field name="manu" type="text_general" indexed="true" stored="true" omitNorms="true"/> 141 <field name="cat" type="string" indexed="true" stored="true" multiValued="true"/> 142 <field name="features" type="text_general" indexed="true" stored="true" multiValued="true"/> 143 <field name="includes" type="text_general" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true" /> 144 145 <field name="weight" type="float" indexed="true" stored="true"/> 146 <field name="price" type="float" indexed="true" stored="true"/> 147 <field name="popularity" type="int" indexed="true" stored="true" /> 148 <field name="inStock" type="boolean" indexed="true" stored="true" /> 149 --> 150 <field name="store" type="location" indexed="true" stored="true"/> 151 152 <!-- Common metadata fields, named specifically to match up with 153 SolrCell metadata when parsing rich documents such as Word, PDF. 154 Some fields are multiValued only because Tika currently may return 155 multiple values for them. 
Some metadata is parsed from the documents, 156 but there are some which come from the client context: 157 "content_type": From the HTTP headers of incoming stream 158 "resourcename": From SolrCell request param resource.name 159 --> 160 161 <!-- 162 <field name="title" type="text_general" indexed="true" stored="true" multiValued="true"/> 163 <field name="subject" type="text_general" indexed="true" stored="true"/> 164 <field name="description" type="text_general" indexed="true" stored="true"/> 165 <field name="comments" type="text_general" indexed="true" stored="true"/> 166 <field name="author" type="text_general" indexed="true" stored="true"/> 167 <field name="keywords" type="text_general" indexed="true" stored="true"/> 168 <field name="category" type="text_general" indexed="true" stored="true"/> 169 <field name="resourcename" type="text_general" indexed="true" stored="true"/> 170 <field name="url" type="text_general" indexed="true" stored="true"/> 171 <field name="content_type" type="string" indexed="true" stored="true" multiValued="true"/> 172 <field name="last_modified" type="date" indexed="true" stored="true"/> 173 <field name="links" type="string" indexed="true" stored="true" multiValued="true"/> 174 --> 175 176 <!-- Main body of document extracted by SolrCell. 177 NOTE: This field is not indexed by default, since it is also copied to "text" 178 using copyField below. This is to save space. Use this field for returning and 179 highlighting document content. Use the "text" field to search the content. --> 180 <field name="content" type="text_general" indexed="false" stored="true" multiValued="true"/> 181 182 183 <!-- catchall field, containing all other searchable text fields (implemented 184 via copyField further on in this schema --> 185 <field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/> 186 187 <!-- catchall text field that indexes tokens both normally and in reverse for efficient 188 leading wildcard queries. 
--> 189 <field name="text_rev" type="text_general_rev" indexed="true" stored="false" multiValued="true"/> 190 191 <!-- non-tokenized version of manufacturer to make it easier to sort or group 192 results by manufacturer. copied from "manu" via copyField --> 193 <field name="manu_exact" type="string" indexed="true" stored="false"/> 194 195 <field name="payloads" type="payloads" indexed="true" stored="true"/> 196 197 198 <!-- 199 Some fields such as popularity and manu_exact could be modified to 200 leverage doc values: 201 <field name="popularity" type="int" indexed="true" stored="true" docValues="true" /> 202 <field name="manu_exact" type="string" indexed="false" stored="false" docValues="true" /> 203 <field name="cat" type="string" indexed="true" stored="true" docValues="true" multiValued="true"/> 204 205 206 Although it would make indexing slightly slower and the index bigger, it 207 would also make the index faster to load, more memory-efficient and more 208 NRT-friendly. 209 --> 210 211 <!-- Dynamic field definitions allow using convention over configuration 212 for fields via the specification of patterns to match field names. 213 EXAMPLE: name="*_i" will match any field ending in _i (like myid_i, z_i) 214 RESTRICTION: the glob-like pattern in the name attribute must have 215 a "*" only at the start or the end. 
--> 216 217 <dynamicField name="*_i" type="int" indexed="true" stored="true"/> 218 <dynamicField name="*_is" type="int" indexed="true" stored="true" multiValued="true"/> 219 <dynamicField name="*_s" type="string" indexed="true" stored="true" /> 220 <dynamicField name="*_ss" type="string" indexed="true" stored="true" multiValued="true"/> 221 <dynamicField name="*_l" type="long" indexed="true" stored="true"/> 222 <dynamicField name="*_ls" type="long" indexed="true" stored="true" multiValued="true"/> 223 <dynamicField name="*_t" type="text_general" indexed="true" stored="true"/> 224 <dynamicField name="*_txt" type="text_general" indexed="true" stored="true" multiValued="true"/> 225 <dynamicField name="*_en" type="text_en" indexed="true" stored="true" multiValued="true"/> 226 <dynamicField name="*_b" type="boolean" indexed="true" stored="true"/> 227 <dynamicField name="*_bs" type="boolean" indexed="true" stored="true" multiValued="true"/> 228 <dynamicField name="*_f" type="float" indexed="true" stored="true"/> 229 <dynamicField name="*_fs" type="float" indexed="true" stored="true" multiValued="true"/> 230 <dynamicField name="*_d" type="double" indexed="true" stored="true"/> 231 <dynamicField name="*_ds" type="double" indexed="true" stored="true" multiValued="true"/> 232 233 <!-- Type used to index the lat and lon components for the "location" FieldType --> 234 <dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false" /> 235 236 <dynamicField name="*_dt" type="date" indexed="true" stored="true"/> 237 <dynamicField name="*_dts" type="date" indexed="true" stored="true" multiValued="true"/> 238 <dynamicField name="*_p" type="location" indexed="true" stored="true"/> 239 240 <!-- some trie-coded dynamic fields for faster range queries --> 241 <dynamicField name="*_ti" type="tint" indexed="true" stored="true"/> 242 <dynamicField name="*_tl" type="tlong" indexed="true" stored="true"/> 243 <dynamicField name="*_tf" type="tfloat" indexed="true" 
stored="true"/> 244 <dynamicField name="*_td" type="tdouble" indexed="true" stored="true"/> 245 <dynamicField name="*_tdt" type="tdate" indexed="true" stored="true"/> 246 247 <dynamicField name="*_pi" type="pint" indexed="true" stored="true"/> 248 <dynamicField name="*_c" type="currency" indexed="true" stored="true"/> 249 250 <dynamicField name="ignored_*" type="ignored" multiValued="true"/> 251 <dynamicField name="attr_*" type="text_general" indexed="true" stored="true" multiValued="true"/> 252 253 <dynamicField name="random_*" type="random" /> 254 255 <!-- dynamic field for sort/facet fields, which are strings by default. ie not tokenised --> 256 <dynamicField name="by*" type="string" indexed="true" stored="false" multiValued="false" /> 257 258 <!-- uncomment the following to ignore any fields that don't already match an existing 259 field name or dynamic field, rather than reporting them as an error. 260 alternately, change the type="ignored" to some other type e.g. "text" if you want 261 unknown fields indexed and/or stored by default --> 262 <!--dynamicField name="*" type="ignored" multiValued="true" /--> 263 264 </fields> 265 266 267 <!-- Field to use to determine and enforce document uniqueness. 268 Unless this field is marked with required="false", it will be a required field 269 --> 270 <uniqueKey>docOID</uniqueKey> 271 272 <!-- DEPRECATED: The defaultSearchField is consulted by various query parsers when 273 parsing a query string that isn't explicit about the field. Machine (non-user) 274 generated queries are best made explicit, or they can use the "df" request parameter 275 which takes precedence over this. 276 Note: Un-commenting defaultSearchField will be insufficient if your request handler 277 in solrconfig.xml defines "df", which takes precedence. That would need to be removed. 
278 <defaultSearchField>text</defaultSearchField> --> 279 280 <!-- DEPRECATED: The defaultOperator (AND|OR) is consulted by various query parsers 281 when parsing a query string to determine if a clause of the query should be marked as 282 required or optional, assuming the clause isn't already marked by some operator. 283 The default is OR, which is generally assumed so it is not a good idea to change it 284 globally here. The "q.op" request parameter takes precedence over this. 285 <solrQueryParser defaultOperator="OR"/> --> 286 287 <!-- copyField commands copy one field to another at the time a document 288 is added to the index. It's used either to index the same field differently, 289 or to add multiple fields to the same field for easier/faster searching. --> 290 <!-- 291 <copyField source="cat" dest="text"/> 292 <copyField source="name" dest="text"/> 293 <copyField source="manu" dest="text"/> 294 <copyField source="features" dest="text"/> 295 <copyField source="includes" dest="text"/> 296 <copyField source="manu" dest="manu_exact"/> 297 --> 298 299 <!-- Copy the price into a currency enabled field (default USD) --> 300 <!-- 301 <copyField source="price" dest="price_c"/> 302 --> 303 304 <!-- Text fields from SolrCell to search by default in our catch-all field --> 305 <!-- 306 <copyField source="title" dest="text"/> 307 <copyField source="author" dest="text"/> 308 <copyField source="description" dest="text"/> 309 <copyField source="keywords" dest="text"/> 310 <copyField source="content" dest="text"/> 311 <copyField source="content_type" dest="text"/> 312 <copyField source="resourcename" dest="text"/> 313 <copyField source="url" dest="text"/> 314 --> 315 316 <!-- Create a string version of author for faceting --> 317 <!-- 318 <copyField source="author" dest="author_s"/> 319 --> 320 321 <!-- Above, multiple source fields are copied to the [text] field. 
322 Another way to map multiple source fields to the same 323 destination field is to use the dynamic field syntax. 324 copyField also supports a maxChars to copy setting. --> 325 326 <!-- <copyField source="*_t" dest="text" maxChars="3000"/> --> 327 328 <!-- copy name to alphaNameSort, a field designed for sorting by name --> 329 <!-- <copyField source="name" dest="alphaNameSort"/> --> 330 60 331 <types> 61 332 <!-- field type definitions. The "name" attribute is … … 63 334 attribute and any other attributes determine the real 64 335 behavior of the fieldType. 65 Class names starting with "solr" refer to java classes in the66 org.apache.solr.analysis package.336 Class names starting with "solr" refer to java classes in a 337 standard package such as org.apache.solr.analysis 67 338 --> 68 339 69 <!-- The StrField type is not analyzed, but indexed/stored verbatim. --> 70 <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/> 340 <!-- The StrField type is not analyzed, but indexed/stored verbatim. 341 It supports doc values but in that case the field needs to be 342 single-valued and either required or have a default value. 343 --> 344 <fieldType name="string" class="solr.StrField" sortMissingLast="true" /> 71 345 72 346 <!-- boolean type: "true" or "false" --> 73 <fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" omitNorms="true"/>74 <!--Binary data type. The data should be sent/retrieved in as Base64 encoded Strings --> 75 < fieldtype name="binary" class="solr.BinaryField"/>76 77 <!-- The optional sortMissingLast and sortMissingFirst attributes are78 currently supported on types that are sorted internally as strings. 
79 This includes "string","boolean","sint","slong","sfloat","sdouble","pdate"347 <fieldType name="boolean" class="solr.BoolField" sortMissingLast="true"/> 348 349 <!-- sortMissingLast and sortMissingFirst attributes are optional attributes are 350 currently supported on types that are sorted internally as strings 351 and on numeric types. 352 This includes "string","boolean", and, as of 3.5 (and 4.x), 353 int, float, long, date, double, including the "Trie" variants. 80 354 - If sortMissingLast="true", then a sort on this field will cause documents 81 355 without the field to come after documents with the field, … … 91 365 <!-- 92 366 Default numeric field types. For faster range queries, consider the tint/tfloat/tlong/tdouble types. 367 368 These fields support doc values, but they require the field to be 369 single-valued and either be required or have a default value. 93 370 --> 94 <fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true"positionIncrementGap="0"/>95 <fieldType name="float" class="solr.TrieFloatField" precisionStep="0" omitNorms="true"positionIncrementGap="0"/>96 <fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true"positionIncrementGap="0"/>97 <fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" omitNorms="true"positionIncrementGap="0"/>371 <fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/> 372 <fieldType name="float" class="solr.TrieFloatField" precisionStep="0" positionIncrementGap="0"/> 373 <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/> 374 <fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" positionIncrementGap="0"/> 98 375 99 376 <!-- … … 107 384 A precisionStep of 0 disables indexing at different precision levels. 
108 385 --> 109 <fieldType name="tint" class="solr.TrieIntField" precisionStep="8" omitNorms="true"positionIncrementGap="0"/>110 <fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" omitNorms="true"positionIncrementGap="0"/>111 <fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" omitNorms="true"positionIncrementGap="0"/>112 <fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" omitNorms="true"positionIncrementGap="0"/>386 <fieldType name="tint" class="solr.TrieIntField" precisionStep="8" positionIncrementGap="0"/> 387 <fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" positionIncrementGap="0"/> 388 <fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" positionIncrementGap="0"/> 389 <fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" positionIncrementGap="0"/> 113 390 114 391 <!-- The format for this date field is of the form 1995-12-31T23:59:59Z, and … … 134 411 Note: For faster range queries, consider the tdate type 135 412 --> 136 <fieldType name="date" class="solr.TrieDateField" omitNorms="true"precisionStep="0" positionIncrementGap="0"/>413 <fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/> 137 414 138 415 <!-- A Trie based date field for faster date range queries and date faceting. --> 139 <fieldType name="tdate" class="solr.TrieDateField" omitNorms="true" precisionStep="6" positionIncrementGap="0"/> 140 416 <fieldType name="tdate" class="solr.TrieDateField" precisionStep="6" positionIncrementGap="0"/> 417 418 419 <!--Binary data type. The data should be sent/retrieved in as Base64 encoded Strings --> 420 <fieldtype name="binary" class="solr.BinaryField"/> 141 421 142 422 <!-- 143 423 Note: 144 These should only be used for compatibility with existing indexes (created with older Solr versions)145 or if "sortMissingFirst" or "sortMissingLast" functionality is needed. 
Use Trie based fields instead.146 424 These should only be used for compatibility with existing indexes (created with lucene or older Solr versions). 425 Use Trie based fields instead. As of Solr 3.5 and 4.x, Trie based fields support sortMissingFirst/Last 426 147 427 Plain numeric field types that store and index the text 148 value verbatim (and hence don't support range queries, since the428 value verbatim (and hence don't correctly support range queries, since the 149 429 lexicographic ordering isn't equal to the numeric ordering) 150 430 --> 151 <fieldType name="pint" class="solr.IntField" omitNorms="true"/> 152 <fieldType name="plong" class="solr.LongField" omitNorms="true"/> 153 <fieldType name="pfloat" class="solr.FloatField" omitNorms="true"/> 154 <fieldType name="pdouble" class="solr.DoubleField" omitNorms="true"/> 155 <fieldType name="pdate" class="solr.DateField" sortMissingLast="true" omitNorms="true"/> 156 157 158 <!-- 159 Note: 160 These should only be used for compatibility with existing indexes (created with older Solr versions) 161 or if "sortMissingFirst" or "sortMissingLast" functionality is needed. Use Trie based fields instead. 162 163 Numeric field types that manipulate the value into 164 a string value that isn't human-readable in its internal form, 165 but with a lexicographic ordering the same as the numeric ordering, 166 so that range queries work correctly. 
167 --> 168 <fieldType name="sint" class="solr.SortableIntField" sortMissingLast="true" omitNorms="true"/> 169 <fieldType name="slong" class="solr.SortableLongField" sortMissingLast="true" omitNorms="true"/> 170 <fieldType name="sfloat" class="solr.SortableFloatField" sortMissingLast="true" omitNorms="true"/> 171 <fieldType name="sdouble" class="solr.SortableDoubleField" sortMissingLast="true" omitNorms="true"/> 172 431 <fieldType name="pint" class="solr.IntField"/> 432 <fieldType name="plong" class="solr.LongField"/> 433 <fieldType name="pfloat" class="solr.FloatField"/> 434 <fieldType name="pdouble" class="solr.DoubleField"/> 435 <fieldType name="pdate" class="solr.DateField" sortMissingLast="true"/> 173 436 174 437 <!-- The "RandomSortField" is not used to store or search any 175 438 data. You can declare fields of this type it in your schema 176 439 to generate pseudo-random orderings of your docs for sorting 177 purposes. The ordering is generated based on the field name178 and the version of the index,As long as the index version440 or function purposes. The ordering is generated based on the field 441 name and the version of the index. As long as the index version 179 442 remains unchanged, and the same field name is reused, 180 443 the ordering of the docs will be consistent. 181 444 If you want different psuedo-random orderings of documents, 182 445 for the same version of the index, use a dynamicField and 183 change the name446 change the field name in the request. 184 447 --> 185 448 <fieldType name="random" class="solr.RandomSortField" indexed="true" /> … … 198 461 199 462 <!-- One can also specify an existing Analyzer class that has a 200 default constructor via the class attribute on the analyzer element 463 default constructor via the class attribute on the analyzer element. 
464 Example: 201 465 <fieldType name="text_greek" class="solr.TextField"> 202 466 <analyzer class="org.apache.lucene.analysis.el.GreekAnalyzer"/> … … 219 483 <analyzer type="index"> 220 484 <tokenizer class="solr.StandardTokenizerFactory"/> 221 <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>485 <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" /> 222 486 <!-- in this example, we will only use synonyms at query time 223 487 <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/> … … 227 491 <analyzer type="query"> 228 492 <tokenizer class="solr.StandardTokenizerFactory"/> 229 <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>493 <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" /> 230 494 <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/> 231 495 <filter class="solr.LowerCaseFilterFactory"/> … … 235 499 <!-- A text field with defaults appropriate for English: it 236 500 tokenizes with StandardTokenizer, removes English stop words 237 ( stopwords_en.txt), down cases, protects words from protwords.txt, and501 (lang/stopwords_en.txt), down cases, protects words from protwords.txt, and 238 502 finally applies Porter's stemming. The query time analyzer 239 503 also applies synonyms from synonyms.txt. --> … … 245 509 --> 246 510 <!-- Case insensitive stop word removal. 
247 add enablePositionIncrements=true in both the index and query248 analyzers to leave a 'gap' for more accurate phrase queries.249 511 --> 250 512 <filter class="solr.StopFilterFactory" 251 513 ignoreCase="true" 252 words="stopwords_en.txt" 253 enablePositionIncrements="true" 514 words="lang/stopwords_en.txt" 254 515 /> 255 516 <filter class="solr.LowerCaseFilterFactory"/> … … 259 520 <filter class="solr.EnglishMinimalStemFilterFactory"/> 260 521 --> 261 <filter class="solr.PorterStemFilterFactory"/> 522 <!--<filter class="solr.PorterStemFilterFactory"/>--> 523 <filter class="solr.EnglishMinimalStemFilterFactory"/> 262 524 </analyzer> 263 525 <analyzer type="query"> … … 266 528 <filter class="solr.StopFilterFactory" 267 529 ignoreCase="true" 268 words="stopwords_en.txt" 269 enablePositionIncrements="true" 530 words="lang/stopwords_en.txt" 270 531 /> 271 532 <filter class="solr.LowerCaseFilterFactory"/> … … 275 536 <filter class="solr.EnglishMinimalStemFilterFactory"/> 276 537 --> 277 <filter class="solr.PorterStemFilterFactory"/> 538 <!--<filter class="solr.PorterStemFilterFactory"/>--> 539 <filter class="solr.EnglishMinimalStemFilterFactory"/> 278 540 </analyzer> 279 541 </fieldType> … … 286 548 non-alphanumeric chars. This means certain compound word 287 549 cases will work, for example query "wi fi" will match 288 document "WiFi" or "wi-fi". However, other cases will still 289 not match, for example if the query is "wifi" and the 290 document is "wi fi" or if the query is "wi-fi" and the 291 document is "wifi". 550 document "WiFi" or "wi-fi". 292 551 --> 293 552 <fieldType name="text_en_splitting" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true"> … … 298 557 --> 299 558 <!-- Case insensitive stop word removal. 
300 add enablePositionIncrements=true in both the index and query301 analyzers to leave a 'gap' for more accurate phrase queries.302 559 --> 303 560 <filter class="solr.StopFilterFactory" 304 561 ignoreCase="true" 305 words="stopwords_en.txt" 306 enablePositionIncrements="true" 562 words="lang/stopwords_en.txt" 307 563 /> 308 564 <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/> 309 565 <filter class="solr.LowerCaseFilterFactory"/> 310 566 <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/> 311 <filter class="solr.PorterStemFilterFactory"/> 567 <!--<filter class="solr.PorterStemFilterFactory"/>--> 568 <filter class="solr.EnglishMinimalStemFilterFactory"/> 312 569 </analyzer> 313 570 <analyzer type="query"> … … 316 573 <filter class="solr.StopFilterFactory" 317 574 ignoreCase="true" 318 words="stopwords_en.txt" 319 enablePositionIncrements="true" 575 words="lang/stopwords_en.txt" 320 576 /> 321 577 <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/> 322 578 <filter class="solr.LowerCaseFilterFactory"/> 323 579 <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/> 324 <filter class="solr.PorterStemFilterFactory"/> 580 <!--<filter class="solr.PorterStemFilterFactory"/>--> 581 <filter class="solr.EnglishMinimalStemFilterFactory"/> 325 582 </analyzer> 326 583 </fieldType> … … 332 589 <tokenizer class="solr.WhitespaceTokenizerFactory"/> 333 590 <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/> 334 <filter class="solr.StopFilterFactory" ignoreCase="true" words=" stopwords_en.txt"/>591 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/> 335 592 <filter class="solr.WordDelimiterFilterFactory" 
generateWordParts="0" generateNumberParts="0" catenateWords="1" catenateNumbers="1" catenateAll="0"/> 336 593 <filter class="solr.LowerCaseFilterFactory"/> … … 348 605 <analyzer type="index"> 349 606 <tokenizer class="solr.StandardTokenizerFactory"/> 350 <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/> 607 <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" /> 351 608 <filter class="solr.LowerCaseFilterFactory"/> 352 609 <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true" … … 356 613 <tokenizer class="solr.StandardTokenizerFactory"/> 357 614 <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/> 358 <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/> 615 <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" /> 359 616 <filter class="solr.LowerCaseFilterFactory"/> 360 617 </analyzer> … … 396 653 information on pattern and replacement string syntax. 397 654 
398 http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/package-summary.html 655 http://java.sun.com/j2se/1.6.0/docs/api/java/util/regex/package-summary.html 399 656 --> 400 657 <filter class="solr.PatternReplaceFilterFactory" … … 437 694 </fieldType> 438 695 439 <fieldType name="text_path" class="solr.TextField" positionIncrementGap="100"> 440 <analyzer> 441 <tokenizer class="solr.PathHierarchyTokenizerFactory"/> 696 <!-- 697 Example of using PathHierarchyTokenizerFactory at index time, so 698 queries for paths match documents at that path, or in descendent paths 699 --> 700 <fieldType name="descendent_path" class="solr.TextField"> 701 <analyzer type="index"> 702 <tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="/" /> 703 </analyzer> 704 <analyzer type="query"> 705 <tokenizer class="solr.KeywordTokenizerFactory" /> 706 </analyzer> 707 </fieldType> 708 <!-- 709 Example of using PathHierarchyTokenizerFactory at query time, so 710 queries for paths match documents at that path, or in ancestor paths 711 --> 712 <fieldType name="ancestor_path" class="solr.TextField"> 713 <analyzer type="index"> 714 <tokenizer class="solr.KeywordTokenizerFactory" /> 715 </analyzer> 716 <analyzer type="query"> 717 <tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="/" /> 442 718 </analyzer> 443 719 </fieldType> … … 463 739 <fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/> 464 740 465 <!-- 466 A Geohash is a compact representation of a latitude longitude pair in a single field. 467 See http://wiki.apache.org/solr/SpatialSearch 741 <!-- An alternative geospatial field type new to Solr 4. It supports multiValued and polygon shapes. 
742 For more information about this and other Spatial fields new to Solr 4, see: 743 http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4 744 --> 745 <fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType" 746 geo="true" distErrPct="0.025" maxDistErr="0.000009" units="degrees" /> 747 748 <!-- Money/currency field type. See http://wiki.apache.org/solr/MoneyFieldType 749 Parameters: 750 defaultCurrency: Specifies the default currency if none specified. Defaults to "USD" 751 precisionStep: Specifies the precisionStep for the TrieLong field used for the amount 752 providerClass: Lets you plug in other exchange provider backend: 753 solr.FileExchangeRateProvider is the default and takes one parameter: 754 currencyConfig: name of an xml file holding exchange rates 755 solr.OpenExchangeRatesOrgProvider uses rates from openexchangerates.org: 756 ratesFileLocation: URL or path to rates JSON file (default latest.json on the web) 757 refreshInterval: Number of minutes between each rates fetch (default: 1440, min: 60) 468 758 --> 469 <fieldtype name="geohash" class="solr.GeoHashField"/> 759 <fieldType name="currency" class="solr.CurrencyField" precisionStep="8" defaultCurrency="USD" currencyConfig="currency.xml" /> 760 761 762 763 <!-- some examples for different languages (generally ordered by ISO code) --> 764 765 <!-- Arabic --> 766 <fieldType name="text_ar" class="solr.TextField" positionIncrementGap="100"> 767 <analyzer> 768 <tokenizer class="solr.StandardTokenizerFactory"/> 769 <!-- for any non-arabic --> 770 <filter class="solr.LowerCaseFilterFactory"/> 771 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ar.txt" /> 772 <!-- normalizes ﻯ to ﻱ, etc --> 773 <filter class="solr.ArabicNormalizationFilterFactory"/> 774 <filter class="solr.ArabicStemFilterFactory"/> 775 </analyzer> 776 </fieldType> 777 778 <!-- Bulgarian --> 779 <fieldType name="text_bg" class="solr.TextField" positionIncrementGap="100"> 780 
<analyzer> 781 <tokenizer class="solr.StandardTokenizerFactory"/> 782 <filter class="solr.LowerCaseFilterFactory"/> 783 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_bg.txt" /> 784 <filter class="solr.BulgarianStemFilterFactory"/> 785 </analyzer> 786 </fieldType> 787 788 <!-- Catalan --> 789 <fieldType name="text_ca" class="solr.TextField" positionIncrementGap="100"> 790 <analyzer> 791 <tokenizer class="solr.StandardTokenizerFactory"/> 792 <!-- removes l', etc --> 793 <filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_ca.txt"/> 794 <filter class="solr.LowerCaseFilterFactory"/> 795 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ca.txt" /> 796 <filter class="solr.SnowballPorterFilterFactory" language="Catalan"/> 797 </analyzer> 798 </fieldType> 799 800 <!-- CJK bigram (see text_ja for a Japanese configuration using morphological analysis) --> 801 <fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="100"> 802 <analyzer> 803 <tokenizer class="solr.StandardTokenizerFactory"/> 804 <!-- normalize width before bigram, as e.g. 
half-width dakuten combine --> 805 <filter class="solr.CJKWidthFilterFactory"/> 806 <!-- for any non-CJK --> 807 <filter class="solr.LowerCaseFilterFactory"/> 808 <filter class="solr.CJKBigramFilterFactory"/> 809 </analyzer> 810 </fieldType> 811 812 <!-- Kurdish --> 813 <fieldType name="text_ckb" class="solr.TextField" positionIncrementGap="100"> 814 <analyzer> 815 <tokenizer class="solr.StandardTokenizerFactory"/> 816 <filter class="solr.SoraniNormalizationFilterFactory"/> 817 <!-- for any latin text --> 818 <filter class="solr.LowerCaseFilterFactory"/> 819 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ckb.txt"/> 820 <filter class="solr.SoraniStemFilterFactory"/> 821 </analyzer> 822 </fieldType> 823 824 <!-- Czech --> 825 <fieldType name="text_cz" class="solr.TextField" positionIncrementGap="100"> 826 <analyzer> 827 <tokenizer class="solr.StandardTokenizerFactory"/> 828 <filter class="solr.LowerCaseFilterFactory"/> 829 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_cz.txt" /> 830 <filter class="solr.CzechStemFilterFactory"/> 831 </analyzer> 832 </fieldType> 833 834 <!-- Danish --> 835 <fieldType name="text_da" class="solr.TextField" positionIncrementGap="100"> 836 <analyzer> 837 <tokenizer class="solr.StandardTokenizerFactory"/> 838 <filter class="solr.LowerCaseFilterFactory"/> 839 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_da.txt" format="snowball" /> 840 <filter class="solr.SnowballPorterFilterFactory" language="Danish"/> 841 </analyzer> 842 </fieldType> 843 844 <!-- German --> 845 <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100"> 846 <analyzer> 847 <tokenizer class="solr.StandardTokenizerFactory"/> 848 <filter class="solr.LowerCaseFilterFactory"/> 849 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball" /> 850 <filter class="solr.GermanNormalizationFilterFactory"/> 851 <filter 
class="solr.GermanLightStemFilterFactory"/> 852 <!-- less aggressive: <filter class="solr.GermanMinimalStemFilterFactory"/> --> 853 <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="German2"/> --> 854 </analyzer> 855 </fieldType> 856 857 <!-- Greek --> 858 <fieldType name="text_el" class="solr.TextField" positionIncrementGap="100"> 859 <analyzer> 860 <tokenizer class="solr.StandardTokenizerFactory"/> 861 <!-- greek specific lowercase for sigma --> 862 <filter class="solr.GreekLowerCaseFilterFactory"/> 863 <filter class="solr.StopFilterFactory" ignoreCase="false" words="lang/stopwords_el.txt" /> 864 <filter class="solr.GreekStemFilterFactory"/> 865 </analyzer> 866 </fieldType> 867 868 <!-- Spanish --> 869 <fieldType name="text_es" class="solr.TextField" positionIncrementGap="100"> 870 <analyzer> 871 <tokenizer class="solr.StandardTokenizerFactory"/> 872 <filter class="solr.LowerCaseFilterFactory"/> 873 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_es.txt" format="snowball" /> 874 <filter class="solr.SpanishLightStemFilterFactory"/> 875 <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/> --> 876 </analyzer> 877 </fieldType> 878 879 <!-- Basque --> 880 <fieldType name="text_eu" class="solr.TextField" positionIncrementGap="100"> 881 <analyzer> 882 <tokenizer class="solr.StandardTokenizerFactory"/> 883 <filter class="solr.LowerCaseFilterFactory"/> 884 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_eu.txt" /> 885 <filter class="solr.SnowballPorterFilterFactory" language="Basque"/> 886 </analyzer> 887 </fieldType> 888 889 <!-- Persian --> 890 <fieldType name="text_fa" class="solr.TextField" positionIncrementGap="100"> 891 <analyzer> 892 <!-- for ZWNJ --> 893 <charFilter class="solr.PersianCharFilterFactory"/> 894 <tokenizer class="solr.StandardTokenizerFactory"/> 895 <filter class="solr.LowerCaseFilterFactory"/> 896 <filter 
class="solr.ArabicNormalizationFilterFactory"/> 897 <filter class="solr.PersianNormalizationFilterFactory"/> 898 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_fa.txt" /> 899 </analyzer> 900 </fieldType> 901 902 <!-- Finnish --> 903 <fieldType name="text_fi" class="solr.TextField" positionIncrementGap="100"> 904 <analyzer> 905 <tokenizer class="solr.StandardTokenizerFactory"/> 906 <filter class="solr.LowerCaseFilterFactory"/> 907 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_fi.txt" format="snowball" /> 908 <filter class="solr.SnowballPorterFilterFactory" language="Finnish"/> 909 <!-- less aggressive: <filter class="solr.FinnishLightStemFilterFactory"/> --> 910 </analyzer> 911 </fieldType> 912 913 <!-- French --> 914 <fieldType name="text_fr" class="solr.TextField" positionIncrementGap="100"> 915 <analyzer> 916 <tokenizer class="solr.StandardTokenizerFactory"/> 917 <!-- removes l', etc --> 918 <filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_fr.txt"/> 919 <filter class="solr.LowerCaseFilterFactory"/> 920 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_fr.txt" format="snowball" /> 921 <filter class="solr.FrenchLightStemFilterFactory"/> 922 <!-- less aggressive: <filter class="solr.FrenchMinimalStemFilterFactory"/> --> 923 <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="French"/> --> 924 </analyzer> 925 </fieldType> 926 927 <!-- Irish --> 928 <fieldType name="text_ga" class="solr.TextField" positionIncrementGap="100"> 929 <analyzer> 930 <tokenizer class="solr.StandardTokenizerFactory"/> 931 <!-- removes d', etc --> 932 <filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_ga.txt"/> 933 <!-- removes n-, etc. position increments is intentionally false! 
--> 934 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/hyphenations_ga.txt"/> 935 <filter class="solr.IrishLowerCaseFilterFactory"/> 936 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ga.txt"/> 937 <filter class="solr.SnowballPorterFilterFactory" language="Irish"/> 938 </analyzer> 939 </fieldType> 940 941 <!-- Galician --> 942 <fieldType name="text_gl" class="solr.TextField" positionIncrementGap="100"> 943 <analyzer> 944 <tokenizer class="solr.StandardTokenizerFactory"/> 945 <filter class="solr.LowerCaseFilterFactory"/> 946 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_gl.txt" /> 947 <filter class="solr.GalicianStemFilterFactory"/> 948 <!-- less aggressive: <filter class="solr.GalicianMinimalStemFilterFactory"/> --> 949 </analyzer> 950 </fieldType> 951 952 <!-- Hindi --> 953 <fieldType name="text_hi" class="solr.TextField" positionIncrementGap="100"> 954 <analyzer> 955 <tokenizer class="solr.StandardTokenizerFactory"/> 956 <filter class="solr.LowerCaseFilterFactory"/> 957 <!-- normalizes unicode representation --> 958 <filter class="solr.IndicNormalizationFilterFactory"/> 959 <!-- normalizes variation in spelling --> 960 <filter class="solr.HindiNormalizationFilterFactory"/> 961 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_hi.txt" /> 962 <filter class="solr.HindiStemFilterFactory"/> 963 </analyzer> 964 </fieldType> 965 966 <!-- Hungarian --> 967 <fieldType name="text_hu" class="solr.TextField" positionIncrementGap="100"> 968 <analyzer> 969 <tokenizer class="solr.StandardTokenizerFactory"/> 970 <filter class="solr.LowerCaseFilterFactory"/> 971 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_hu.txt" format="snowball" /> 972 <filter class="solr.SnowballPorterFilterFactory" language="Hungarian"/> 973 <!-- less aggressive: <filter class="solr.HungarianLightStemFilterFactory"/> --> 974 </analyzer> 975 </fieldType> 976 977 
<!-- Armenian --> 978 <fieldType name="text_hy" class="solr.TextField" positionIncrementGap="100"> 979 <analyzer> 980 <tokenizer class="solr.StandardTokenizerFactory"/> 981 <filter class="solr.LowerCaseFilterFactory"/> 982 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_hy.txt" /> 983 <filter class="solr.SnowballPorterFilterFactory" language="Armenian"/> 984 </analyzer> 985 </fieldType> 986 987 <!-- Indonesian --> 988 <fieldType name="text_id" class="solr.TextField" positionIncrementGap="100"> 989 <analyzer> 990 <tokenizer class="solr.StandardTokenizerFactory"/> 991 <filter class="solr.LowerCaseFilterFactory"/> 992 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_id.txt" /> 993 <!-- for a less aggressive approach (only inflectional suffixes), set stemDerivational to false --> 994 <filter class="solr.IndonesianStemFilterFactory" stemDerivational="true"/> 995 </analyzer> 996 </fieldType> 997 998 <!-- Italian --> 999 <fieldType name="text_it" class="solr.TextField" positionIncrementGap="100"> 1000 <analyzer> 1001 <tokenizer class="solr.StandardTokenizerFactory"/> 1002 <!-- removes l', etc --> 1003 <filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_it.txt"/> 1004 <filter class="solr.LowerCaseFilterFactory"/> 1005 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_it.txt" format="snowball" /> 1006 <filter class="solr.ItalianLightStemFilterFactory"/> 1007 <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="Italian"/> --> 1008 </analyzer> 1009 </fieldType> 1010 1011 <!-- Japanese using morphological analysis (see text_cjk for a configuration using bigramming) 1012 1013 NOTE: If you want to optimize search for precision, use default operator AND in your query 1014 parser config with <solrQueryParser defaultOperator="AND"/> further down in this file. Use 1015 OR if you would like to optimize for recall (default). 
1016 --> 1017 <fieldType name="text_ja" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="false"> 1018 <analyzer> 1019 <!-- Kuromoji Japanese morphological analyzer/tokenizer (JapaneseTokenizer) 1020 1021 Kuromoji has a search mode (default) that does segmentation useful for search. A heuristic 1022 is used to segment compounds into its parts and the compound itself is kept as synonym. 1023 1024 Valid values for attribute mode are: 1025 normal: regular segmentation 1026 search: segmentation useful for search with synonyms compounds (default) 1027 extended: same as search mode, but unigrams unknown words (experimental) 1028 1029 For some applications it might be good to use search mode for indexing and normal mode for 1030 queries to reduce recall and prevent parts of compounds from being matched and highlighted. 1031 Use <analyzer type="index"> and <analyzer type="query"> for this and mode normal in query. 1032 1033 Kuromoji also has a convenient user dictionary feature that allows overriding the statistical 1034 model with your own entries for segmentation, part-of-speech tags and readings without a need 1035 to specify weights. Notice that user dictionaries have not been subject to extensive testing. 1036 1037 User dictionary attributes are: 1038 userDictionary: user dictionary filename 1039 userDictionaryEncoding: user dictionary encoding (default is UTF-8) 1040 1041 See lang/userdict_ja.txt for a sample user dictionary file. 1042 1043 Punctuation characters are discarded by default. Use discardPunctuation="false" to keep them. 1044 1045 See http://wiki.apache.org/solr/JapaneseLanguageSupport for more on Japanese language support. 
1046 --> 1047 <tokenizer class="solr.JapaneseTokenizerFactory" mode="search"/> 1048 <!--<tokenizer class="solr.JapaneseTokenizerFactory" mode="search" userDictionary="lang/userdict_ja.txt"/>--> 1049 <!-- Reduces inflected verbs and adjectives to their base/dictionary forms (辞書形) --> 1050 <filter class="solr.JapaneseBaseFormFilterFactory"/> 1051 <!-- Removes tokens with certain part-of-speech tags --> 1052 <filter class="solr.JapanesePartOfSpeechStopFilterFactory" tags="lang/stoptags_ja.txt" /> 1053 <!-- Normalizes full-width romaji to half-width and half-width kana to full-width (Unicode NFKC subset) --> 1054 <filter class="solr.CJKWidthFilterFactory"/> 1055 <!-- Removes common tokens typically not useful for search, but have a negative effect on ranking --> 1056 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ja.txt" /> 1057 <!-- Normalizes common katakana spelling variations by removing any last long sound character (U+30FC) --> 1058 <filter class="solr.JapaneseKatakanaStemFilterFactory" minimumLength="4"/> 1059 <!-- Lower-cases romaji characters --> 1060 <filter class="solr.LowerCaseFilterFactory"/> 1061 </analyzer> 1062 </fieldType> 1063 1064 <!-- Latvian --> 1065 <fieldType name="text_lv" class="solr.TextField" positionIncrementGap="100"> 1066 <analyzer> 1067 <tokenizer class="solr.StandardTokenizerFactory"/> 1068 <filter class="solr.LowerCaseFilterFactory"/> 1069 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_lv.txt" /> 1070 <filter class="solr.LatvianStemFilterFactory"/> 1071 </analyzer> 1072 </fieldType> 1073 1074 <!-- Dutch --> 1075 <fieldType name="text_nl" class="solr.TextField" positionIncrementGap="100"> 1076 <analyzer> 1077 <tokenizer class="solr.StandardTokenizerFactory"/> 1078 <filter class="solr.LowerCaseFilterFactory"/> 1079 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_nl.txt" format="snowball" /> 1080 <filter 
class="solr.StemmerOverrideFilterFactory" dictionary="lang/stemdict_nl.txt" ignoreCase="false"/> 1081 <filter class="solr.SnowballPorterFilterFactory" language="Dutch"/> 1082 </analyzer> 1083 </fieldType> 1084 1085 <!-- Norwegian --> 1086 <fieldType name="text_no" class="solr.TextField" positionIncrementGap="100"> 1087 <analyzer> 1088 <tokenizer class="solr.StandardTokenizerFactory"/> 1089 <filter class="solr.LowerCaseFilterFactory"/> 1090 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_no.txt" format="snowball" /> 1091 <filter class="solr.SnowballPorterFilterFactory" language="Norwegian"/> 1092 <!-- less aggressive: <filter class="solr.NorwegianLightStemFilterFactory" variant="nb"/> --> 1093 <!-- singular/plural: <filter class="solr.NorwegianMinimalStemFilterFactory" variant="nb"/> --> 1094 <!-- The "light" and "minimal" stemmers support variants: nb=Bokmål, nn=Nynorsk, no=Both --> 1095 </analyzer> 1096 </fieldType> 1097 1098 <!-- Portuguese --> 1099 <fieldType name="text_pt" class="solr.TextField" positionIncrementGap="100"> 1100 <analyzer> 1101 <tokenizer class="solr.StandardTokenizerFactory"/> 1102 <filter class="solr.LowerCaseFilterFactory"/> 1103 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_pt.txt" format="snowball" /> 1104 <filter class="solr.PortugueseLightStemFilterFactory"/> 1105 <!-- less aggressive: <filter class="solr.PortugueseMinimalStemFilterFactory"/> --> 1106 <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="Portuguese"/> --> 1107 <!-- most aggressive: <filter class="solr.PortugueseStemFilterFactory"/> --> 1108 </analyzer> 1109 </fieldType> 1110 1111 <!-- Romanian --> 1112 <fieldType name="text_ro" class="solr.TextField" positionIncrementGap="100"> 1113 <analyzer> 1114 <tokenizer class="solr.StandardTokenizerFactory"/> 1115 <filter class="solr.LowerCaseFilterFactory"/> 1116 <filter class="solr.StopFilterFactory" ignoreCase="true" 
words="lang/stopwords_ro.txt" /> 1117 <filter class="solr.SnowballPorterFilterFactory" language="Romanian"/> 1118 </analyzer> 1119 </fieldType> 1120 1121 <!-- Russian --> 1122 <fieldType name="text_ru" class="solr.TextField" positionIncrementGap="100"> 1123 <analyzer> 1124 <tokenizer class="solr.StandardTokenizerFactory"/> 1125 <filter class="solr.LowerCaseFilterFactory"/> 1126 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ru.txt" format="snowball" /> 1127 <filter class="solr.SnowballPorterFilterFactory" language="Russian"/> 1128 <!-- less aggressive: <filter class="solr.RussianLightStemFilterFactory"/> --> 1129 </analyzer> 1130 </fieldType> 1131 <!-- Russian with morphology--> 1132 <fieldType name="text_ru_morph" class="solr.TextField" positionIncrementGap="100"> 1133 <analyzer> 1134 <tokenizer class="solr.StandardTokenizerFactory"/> 1135 <filter class="solr.LowerCaseFilterFactory"/> 1136 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ru.txt" format="snowball" /> 1137 <filter class="org.apache.lucene.morphology.russian.RussianFilterFactory"/> 1138 </analyzer> 1139 </fieldType> 1140 1141 <!-- Swedish --> 1142 <fieldType name="text_sv" class="solr.TextField" positionIncrementGap="100"> 1143 <analyzer> 1144 <tokenizer class="solr.StandardTokenizerFactory"/> 1145 <filter class="solr.LowerCaseFilterFactory"/> 1146 <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_sv.txt" format="snowball" /> 1147 <filter class="solr.SnowballPorterFilterFactory" language="Swedish"/> 1148 <!-- less aggressive: <filter class="solr.SwedishLightStemFilterFactory"/> --> 1149 </analyzer> 1150 </fieldType> 1151 1152 <!-- Thai --> 1153 <fieldType name="text_th" class="solr.TextField" positionIncrementGap="100"> 1154 <analyzer> 1155 <tokenizer class="solr.StandardTokenizerFactory"/> 1156 <filter class="solr.LowerCaseFilterFactory"/> 1157 <filter class="solr.ThaiWordFilterFactory"/> 1158 <filter 
class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_th.txt" /> 1159 </analyzer> 1160 </fieldType> 1161 1162 <!-- Turkish --> 1163 <fieldType name="text_tr" class="solr.TextField" positionIncrementGap="100"> 1164 <analyzer> 1165 <tokenizer class="solr.StandardTokenizerFactory"/> 1166 <filter class="solr.TurkishLowerCaseFilterFactory"/> 1167 <filter class="solr.StopFilterFactory" ignoreCase="false" words="lang/stopwords_tr.txt" /> 1168 <filter class="solr.SnowballPorterFilterFactory" language="Turkish"/> 1169 </analyzer> 1170 </fieldType> 1171 470 1172 </types> 471 472 473 <fields> 474 <!-- Valid attributes for fields: 475 name: mandatory - the name for the field 476 type: mandatory - the name of a previously defined type from the 477 <types> section 478 indexed: true if this field should be indexed (searchable or sortable) 479 stored: true if this field should be retrievable 480 multiValued: true if this field may contain multiple values per document 481 omitNorms: (expert) set to true to omit the norms associated with 482 this field (this disables length normalization and index-time 483 boosting for the field, and saves some memory). Only full-text 484 fields or fields that need an index-time boost need norms. 485 termVectors: [false] set to true to store the term vector for a 486 given field. 487 When using MoreLikeThis, fields used for similarity should be 488 stored for best performance. 489 termPositions: Store position information with the term vector. 490 This will increase storage costs. 491 termOffsets: Store offset information with the term vector. This 492 will increase storage costs. 493 default: a value that should be used if no value is specified 494 when adding a document. 
495 --> 496 497 <field name="docOID" type="string" indexed="true" stored="true" required="true" /> 498 499 <field name="ZZ" type="text_en_splitting" indexed="true" stored="false" multiValued="true" /> 500 <field name="TX" type="text_en_splitting" indexed="true" stored="false" multiValued="true" /> 501 <field name="TI" type="text_en_splitting" indexed="true" stored="false" multiValued="true" /> 502 <field name="SU" type="text_en_splitting" indexed="true" stored="false" multiValued="true" /> 503 <field name="ORG" type="text_en_splitting" indexed="true" stored="false" multiValued="true" /> 504 505 <!-- 506 <field name="sku" type="text_en_splitting_tight" indexed="true" stored="true" omitNorms="true"/> 507 <field name="name" type="text_general" indexed="true" stored="true"/> 508 <field name="alphaNameSort" type="alphaOnlySort" indexed="true" stored="false"/> 509 <field name="manu" type="text_general" indexed="true" stored="true" omitNorms="true"/> 510 <field name="cat" type="string" indexed="true" stored="true" multiValued="true"/> 511 <field name="features" type="text_general" indexed="true" stored="true" multiValued="true"/> 512 <field name="includes" type="text_general" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true" /> 513 514 <field name="weight" type="float" indexed="true" stored="true"/> 515 <field name="price" type="float" indexed="true" stored="true"/> 516 <field name="popularity" type="int" indexed="true" stored="true" /> 517 <field name="inStock" type="boolean" indexed="true" stored="true" /> 518 --> 519 520 <!-- 521 The following store examples are used to demonstrate the various ways one might _CHOOSE_ to 522 implement spatial. It is highly unlikely that you would ever have ALL of these fields defined. 1173 1174 <!-- Similarity is the scoring routine for each document vs. a query. 1175 A custom Similarity or SimilarityFactory may be specified here, but 1176 the default is fine for most applications. 
1177 For more info: http://wiki.apache.org/solr/SchemaXml#Similarity 523 1178 --> 524 <field name="store" type="location" indexed="true" stored="true"/> 525 526 <!-- Common metadata fields, named specifically to match up with 527 SolrCell metadata when parsing rich documents such as Word, PDF. 528 Some fields are multiValued only because Tika currently may return 529 multiple values for them. 530 --> 531 <!-- 532 <field name="title" type="text_general" indexed="true" stored="true" multiValued="true"/> 533 <field name="subject" type="text_general" indexed="true" stored="true"/> 534 <field name="description" type="text_general" indexed="true" stored="true"/> 535 <field name="comments" type="text_general" indexed="true" stored="true"/> 536 <field name="author" type="text_general" indexed="true" stored="true"/> 537 <field name="keywords" type="text_general" indexed="true" stored="true"/> 538 <field name="category" type="text_general" indexed="true" stored="true"/> 539 <field name="content_type" type="string" indexed="true" stored="true" multiValued="true"/> 540 <field name="last_modified" type="date" indexed="true" stored="true"/> 541 <field name="links" type="string" indexed="true" stored="true" multiValued="true"/> 542 --> 543 544 545 <!-- catchall field, containing all other searchable text fields (implemented 546 via copyField further on in this schema --> 547 <field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/> 548 549 <!-- catchall text field that indexes tokens both normally and in reverse for efficient 550 leading wildcard queries. --> 551 <field name="text_rev" type="text_general_rev" indexed="true" stored="false" multiValued="true"/> 552 553 <!-- non-tokenized version of manufacturer to make it easier to sort or group 554 results by manufacturer. 
copied from "manu" via copyField --> 555 <field name="manu_exact" type="string" indexed="true" stored="false"/> 556 557 <field name="payloads" type="payloads" indexed="true" stored="true"/> 558 559 <!-- Uncommenting the following will create a "timestamp" field using 560 a default value of "NOW" to indicate when each document was indexed. 561 --> 562 <!-- 563 <field name="timestamp" type="date" indexed="true" stored="true" default="NOW" multiValued="false"/> 564 --> 565 566 567 <!-- Dynamic field definitions. If a field name is not found, dynamicFields 568 will be used if the name matches any of the patterns. 569 RESTRICTION: the glob-like pattern in the name attribute must have 570 a "*" only at the start or the end. 571 EXAMPLE: name="*_i" will match any field ending in _i (like myid_i, z_i) 572 Longer patterns will be matched first. if equal size patterns 573 both match, the first appearing in the schema will be used. --> 574 <dynamicField name="*_i" type="int" indexed="true" stored="true"/> 575 <dynamicField name="*_s" type="string" indexed="true" stored="true"/> 576 <dynamicField name="*_l" type="long" indexed="true" stored="true"/> 577 <dynamicField name="*_t" type="text_general" indexed="true" stored="true"/> 578 <dynamicField name="*_txt" type="text_general" indexed="true" stored="true" multiValued="true"/> 579 <dynamicField name="*_b" type="boolean" indexed="true" stored="true"/> 580 <dynamicField name="*_f" type="float" indexed="true" stored="true"/> 581 <dynamicField name="*_d" type="double" indexed="true" stored="true"/> 582 583 <!-- Type used to index the lat and lon components for the "location" FieldType --> 584 <dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/> 585 586 <dynamicField name="*_dt" type="date" indexed="true" stored="true"/> 587 <dynamicField name="*_p" type="location" indexed="true" stored="true"/> 588 589 <!-- some trie-coded dynamic fields for faster range queries --> 590 <dynamicField name="*_ti" 
type="tint" indexed="true" stored="true"/> 591 <dynamicField name="*_tl" type="tlong" indexed="true" stored="true"/> 592 <dynamicField name="*_tf" type="tfloat" indexed="true" stored="true"/> 593 <dynamicField name="*_td" type="tdouble" indexed="true" stored="true"/> 594 <dynamicField name="*_tdt" type="tdate" indexed="true" stored="true"/> 595 596 <dynamicField name="*_pi" type="pint" indexed="true" stored="true"/> 597 598 <dynamicField name="ignored_*" type="ignored" multiValued="true"/> 599 <dynamicField name="attr_*" type="text_general" indexed="true" stored="true" multiValued="true"/> 600 601 <dynamicField name="random_*" type="random" /> 602 <!-- dynamic field for sort/facet fields, which are strings by default. ie not tokenised. Can't be multivalued - ie can only have one value per document --> 603 <dynamicField name="by*" type="string" indexed="true" stored="false" multiValued="false" /> 604 <!-- uncomment the following to ignore any fields that don't already match an existing 605 field name or dynamic field, rather than reporting them as an error. 606 alternately, change the type="ignored" to some other type e.g. "text" if you want 607 unknown fields indexed and/or stored by default --> 608 <!--dynamicField name="*" type="ignored" multiValued="true" /--> 609 610 </fields> 611 612 <!-- Field to use to determine and enforce document uniqueness. 613 Unless this field is marked with required="false", it will be a required field 614 --> 615 <uniqueKey>docOID</uniqueKey> 616 617 <!-- field for the QueryParser to use when an explicit fieldname is absent --> 618 <defaultSearchField>text</defaultSearchField> 619 620 <!-- SolrQueryParser configuration: defaultOperator="AND|OR" --> 621 <solrQueryParser defaultOperator="OR"/> 622 623 <!-- copyField commands copy one field to another at the time a document 624 is added to the index. It's used either to index the same field differently, 625 or to add multiple fields to the same field for easier/faster searching. 
--> 626 627 <!-- 628 <copyField source="cat" dest="text"/> 629 <copyField source="name" dest="text"/> 630 <copyField source="manu" dest="text"/> 631 <copyField source="features" dest="text"/> 632 <copyField source="includes" dest="text"/> 633 <copyField source="manu" dest="manu_exact"/> 634 --> 635 636 <!-- Above, multiple source fields are copied to the [text] field. 637 Another way to map multiple source fields to the same 638 destination field is to use the dynamic field syntax. 639 copyField also supports a maxChars to copy setting. --> 640 641 <!-- <copyField source="*_t" dest="text" maxChars="3000"/> --> 642 643 <!-- copy name to alphaNameSort, a field designed for sorting by name --> 644 <!-- <copyField source="name" dest="alphaNameSort"/> --> 645 646 647 <!-- Similarity is the scoring routine for each document vs. a query. 648 A custom similarity may be specified here, but the default is fine 649 for most applications. --> 650 <!-- <similarity class="org.apache.lucene.search.DefaultSimilarity"/> --> 651 <!-- ... OR ... 652 Specify a SimilarityFactory class name implementation 653 allowing parameters to be used. 654 --> 655 <!-- 656 <similarity class="com.example.solr.CustomSimilarityFactory"> 657 <str name="paramkey">param value</str> 658 </similarity> 659 --> 660 1179 <!-- 1180 <similarity class="com.example.solr.CustomSimilarityFactory"> 1181 <str name="paramkey">param value</str> 1182 </similarity> 1183 --> 661 1184 662 1185 </schema> -
gs3-extensions/solr/trunk/src/collect/solr-jdbm-demo/etc/conf/solrconfig.xml
r27850 r30001 30 30 --> 31 31 32 <!-- Set this to 'false' if you want solr to continue working after33 it has encountered an severe configuration error. In a34 production environment, you may want solr to keep working even35 if one handler is mis-configured.36 37 You may also set this to false using by setting the system38 property:39 40 -Dsolr.abortOnConfigurationError=false41 -->42 <abortOnConfigurationError>${solr.abortOnConfigurationError:true}</abortOnConfigurationError>43 44 32 <!-- Controls what version of Lucene various components of Solr 45 33 adhere to. Generally, you want to use the latest version to … … 47 35 that you fully re-index after changing this setting as it can 48 36 affect both how text is indexed and queried. 49 50 <luceneMatchVersion> LUCENE_33</luceneMatchVersion>51 52 <!-- libdirectives can be used to instruct Solr to load an Jars37 --> 38 <luceneMatchVersion>4.7</luceneMatchVersion> 39 40 <!-- <lib/> directives can be used to instruct Solr to load an Jars 53 41 identified and use them to resolve any "plugins" specified in 54 42 your solrconfig.xml or schema.xml (ie: Analyzers, Request … … 57 45 All directories and paths are resolved relative to the 58 46 instanceDir. 47 48 Please note that <lib/> directives are processed in the order 49 that they appear in your solrconfig.xml file, and are "stacked" 50 on top of each other when building a ClassLoader - so if you have 51 plugin jars with dependencies on other jars, the "lower level" 52 dependency jars should be loaded first. 59 53 60 54 If a "./lib" directory exists in your instanceDir, all files … … 64 58 <lib dir="./lib" /> 65 59 --> 66 <!-- A dir option by itself adds any files found in the directory to 67 the classpath, this is useful for including all jars in a 60 61 <!-- A 'dir' option by itself adds any files found in the directory 62 to the classpath, this is useful for including all jars in a 68 63 directory. 
  69       -->
  70       <lib dir="../../contrib/extraction/lib" />
  71       <!-- When a regex is specified in addition to a directory, only the
       64
       65  When a 'regex' is specified in addition to a 'dir', only the
  72   66  files in that directory which completely match the regex
  73   67  (anchored on both ends) will be included.
  74       -->
  75       <lib dir="../../dist/" regex="apache-solr-cell-\d.*\.jar" />
  76       <lib dir="../../dist/" regex="apache-solr-clustering-\d.*\.jar" />
  77       <lib dir="../../dist/" regex="apache-solr-dataimporthandler-\d.*\.jar" />
  78
  79       <!-- If a dir option (with or without a regex) is used and nothing
  80       is found that matches, it will be ignored
  81       -->
  82       <lib dir="../../contrib/clustering/lib/" />
  83       <lib dir="/total/crap/dir/ignored" />
  84       <!-- an exact path can be used to specify a specific file. This
  85       will cause a serious error to be logged if it can't be loaded.
       68
       69  If a 'dir' option (with or without a regex) is used and nothing
       70  is found that matches, a warning will be logged.
       71
       72  The examples below can be used to load some solr-contribs along
       73  with their external dependencies.
       74  -->
       75  <lib dir="../../../contrib/extraction/lib" regex=".*\.jar" />
       76  <lib dir="../../../dist/" regex="solr-cell-\d.*\.jar" />
       77
       78  <lib dir="../../../contrib/clustering/lib/" regex=".*\.jar" />
       79  <lib dir="../../../dist/" regex="solr-clustering-\d.*\.jar" />
       80
       81  <lib dir="../../../contrib/langid/lib/" regex=".*\.jar" />
       82  <lib dir="../../../dist/" regex="solr-langid-\d.*\.jar" />
       83
       84  <lib dir="../../../contrib/velocity/lib" regex=".*\.jar" />
       85  <lib dir="../../../dist/" regex="solr-velocity-\d.*\.jar" />
       86
       87  <!-- an exact 'path' can be used instead of a 'dir' to specify a
       88  specific jar file. This will cause a serious error to be logged
       89  if it can't be loaded.
  86   90  -->
  87   91  <!--
  88       <lib path="../a-jar-that-does-not-exist.jar" />
       92  <lib path="../a-jar-that-does-not-exist.jar" />
  89   93  -->
  90   94
   …    …
 101  105  <!-- The DirectoryFactory to use for indexes.
 102  106
 103       solr.StandardDirectoryFactory, the default, is filesystem
 104       based. solr.RAMDirectoryFactory is memory based, not
      107  solr.StandardDirectoryFactory is filesystem
      108  based and tries to pick the best implementation for the current
      109  JVM and platform. solr.NRTCachingDirectoryFactory, the default,
      110  wraps solr.StandardDirectoryFactory and caches small files in memory
      111  for better NRT performance.
      112
      113  One can force a particular implementation via solr.MMapDirectoryFactory,
      114  solr.NIOFSDirectoryFactory, or solr.SimpleFSDirectoryFactory.
      115
      116  solr.RAMDirectoryFactory is memory based, not
 105  117  persistent, and doesn't work with replication.
 106  118  -->
 107  119  <directoryFactory name="DirectoryFactory"
 108       class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>
 109
 110
 111       <!-- Index Defaults
 112
 113       Values here affect all index writers and act as a default
 114       unless overridden.
 115
 116       WARNING: See also the <mainIndex> section below for parameters
 117       that overfor Solr's main Lucene index.
 118       -->
 119       <indexDefaults>
 120
 121       <useCompoundFile>false</useCompoundFile>
 122
      120  class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}">
      121
      122
      123  <!-- These will be used if you are using the solr.HdfsDirectoryFactory,
      124  otherwise they will be ignored. If you don't plan on using hdfs,
      125  you can safely remove this section. -->
      126  <!-- The root directory that collection data should be written to. -->
      127  <str name="solr.hdfs.home">${solr.hdfs.home:}</str>
      128  <!-- The hadoop configuration files to use for the hdfs client. -->
      129  <str name="solr.hdfs.confdir">${solr.hdfs.confdir:}</str>
      130  <!-- Enable/Disable the hdfs cache. -->
      131  <str name="solr.hdfs.blockcache.enabled">${solr.hdfs.blockcache.enabled:true}</str>
      132
      133  </directoryFactory>
      134
      135  <!-- The CodecFactory for defining the format of the inverted index.
      136  The default implementation is SchemaCodecFactory, which is the official Lucene
      137  index format, but hooks into the schema to provide per-field customization of
      138  the postings lists and per-document values in the fieldType element
      139  (postingsFormat/docValuesFormat). Note that most of the alternative implementations
      140  are experimental, so if you choose to customize the index format, its a good
      141  idea to convert back to the official format e.g. via IndexWriter.addIndexes(IndexReader)
      142  before upgrading to a newer version to avoid unnecessary reindexing.
      143  -->
      144  <codecFactory class="solr.SchemaCodecFactory"/>
      145
      146  <!-- To enable dynamic schema REST APIs, use the following for <schemaFactory>:
      147
      148  <schemaFactory class="ManagedIndexSchemaFactory">
      149  <bool name="mutable">true</bool>
      150  <str name="managedSchemaResourceName">managed-schema</str>
      151  </schemaFactory>
      152
      153  When ManagedIndexSchemaFactory is specified, Solr will load the schema from
      154  he resource named in 'managedSchemaResourceName', rather than from schema.xml.
      155  Note that the managed schema resource CANNOT be named schema.xml. If the managed
      156  schema does not exist, Solr will create it after reading schema.xml, then rename
      157  'schema.xml' to 'schema.xml.bak'.
      158
      159  Do NOT hand edit the managed schema - external modifications will be ignored and
      160  overwritten as a result of schema modification REST API calls.
      161
      162  When ManagedIndexSchemaFactory is specified with mutable = true, schema
      163  modification REST API calls will be allowed; otherwise, error responses will be
      164  sent back for these requests.
      165  -->
      166  <schemaFactory class="ClassicIndexSchemaFactory"/>
      167
      168  <!-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      169  Index Config - These settings control low-level behavior of indexing
      170  Most example settings here show the default value, but are commented
      171  out, to more easily see where customizations have been made.
      172
      173  Note: This replaces <indexDefaults> and <mainIndex> from older versions
      174  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -->
      175  <indexConfig>
      176  <!-- maxFieldLength was removed in 4.0. To get similar behavior, include a
      177  LimitTokenCountFilterFactory in your fieldType definition. E.g.
      178  <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="10000"/>
      179  -->
      180  <!-- Maximum time to wait for a write lock (ms) for an IndexWriter. Default: 1000 -->
      181  <!-- <writeLockTimeout>1000</writeLockTimeout> -->
      182
      183  <!-- The maximum number of simultaneous threads that may be
      184  indexing documents at once in IndexWriter; if more than this
      185  many threads arrive they will wait for others to finish.
      186  Default in Solr/Lucene is 8. -->
      187  <!-- <maxIndexingThreads>8</maxIndexingThreads> -->
      188
      189  <!-- Expert: Enabling compound file will use less files for the index,
      190  using fewer file descriptors on the expense of performance decrease.
      191  Default in Lucene is "true". Default in Solr is "false" (since 3.6) -->
      192  <!-- <useCompoundFile>false</useCompoundFile> -->
      193
      194  <!-- ramBufferSizeMB sets the amount of RAM that may be used by Lucene
      195  indexing for buffering added documents and deletions before they are
      196  flushed to the Directory.
      197  maxBufferedDocs sets a limit on the number of documents buffered
      198  before flushing.
      199  If both ramBufferSizeMB and maxBufferedDocs is set, then
      200  Lucene will flush based on whichever limit is hit first.
      201  The default is 100 MB. -->
      202  <!-- <ramBufferSizeMB>100</ramBufferSizeMB> -->
      203  <!-- <maxBufferedDocs>1000</maxBufferedDocs> -->
      204
      205  <!-- Expert: Merge Policy
      206  The Merge Policy in Lucene controls how merging of segments is done.
      207  The default since Solr/Lucene 3.3 is TieredMergePolicy.
      208  The default since Lucene 2.3 was the LogByteSizeMergePolicy,
      209  Even older versions of Lucene used LogDocMergePolicy.
      210  -->
      211  <!--
      212  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
      213  <int name="maxMergeAtOnce">10</int>
      214  <int name="segmentsPerTier">10</int>
      215  </mergePolicy>
      216  -->
      217
      218  <!-- Merge Factor
      219  The merge factor controls how many segments will get merged at a time.
      220  For TieredMergePolicy, mergeFactor is a convenience parameter which
      221  will set both MaxMergeAtOnce and SegmentsPerTier at once.
      222  For LogByteSizeMergePolicy, mergeFactor decides how many new segments
      223  will be allowed before they are merged into one.
      224  Default is 10 for both merge policies.
      225  -->
      226  <!--
 123  227  <mergeFactor>10</mergeFactor>
 124       <!-- Sets the amount of RAM that may be used by Lucene indexing
 125       for buffering added documents and deletions before they are
 126       flushed to the Directory. -->
 127       <ramBufferSizeMB>32</ramBufferSizeMB>
 128       <!-- If both ramBufferSizeMB and maxBufferedDocs is set, then
 129       Lucene will flush based on whichever limit is hit first.
 130       -->
 131       <!-- <maxBufferedDocs>1000</maxBufferedDocs> -->
 132
 133       <maxFieldLength>10000</maxFieldLength>
 134       <writeLockTimeout>1000</writeLockTimeout>
 135       <commitLockTimeout>10000</commitLockTimeout>
 136
 137       <!-- Expert: Merge Policy
 138
 139       The Merge Policy in Lucene controls how merging is handled by
 140       Lucene. The default in Solr 3.3 is TieredMergePolicy.
 141
 142       The default in 2.3 was the LogByteSizeMergePolicy,
 143       previous versions used LogDocMergePolicy.
 144
 145       LogByteSizeMergePolicy chooses segments to merge based on
 146       their size. The Lucene 2.2 default, LogDocMergePolicy chose
 147       when to merge based on number of documents
 148
 149       Other implementations of MergePolicy must have a no-argument
 150       constructor
 151       -->
 152       <!--
 153       <mergePolicy class="org.apache.lucene.index.TieredMergePolicy"/>
 154       -->
      228  -->
 155  229
 156  230  <!-- Expert: Merge Scheduler
 157
 158  231  The Merge Scheduler in Lucene controls how merges are
 159  232  performed. The ConcurrentMergeScheduler (Lucene 2.3 default)
   …    …
 164  237  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
 165  238  -->
 166  239
 167  240  <!-- LockFactory
 168  241
   …    …
 178  251  simple = SimpleFSLockFactory - uses a plain file for locking
 179  252
 180       (For backwards compatibility with Solr 1.2, 'simple' is the
 181       default if not specified.)
      253  Defaults: 'native' is default for Solr3.6 and later, otherwise
      254  'simple' is the default
 182  255
 183  256  More details on the nuances of each LockFactory...
 184  257  http://wiki.apache.org/lucene-java/AvailableLockFactories
 185  258  -->
 186       <lockType>native</lockType>
 187
 188       <!-- Expert: Controls how often Lucene loads terms into memory
 189       Default is 128 and is likely good for most everyone.
 190       -->
 191       <!-- <termIndexInterval>256</termIndexInterval> -->
 192       </indexDefaults>
 193
 194       <!-- Main Index
 195
 196       Values here override the values in the <indexDefaults> section
 197       for the main on disk index.
 198       -->
 199       <mainIndex>
 200
 201       <useCompoundFile>false</useCompoundFile>
 202       <ramBufferSizeMB>32</ramBufferSizeMB>
 203       <mergeFactor>10</mergeFactor>
      259  <lockType>${solr.lock.type:native}</lockType>
 204  260
 205  261  <!-- Unlock On Startup
   …    …
 208  264  This defeats the locking mechanism that allows multiple
 209  265  processes to safely access a lucene index, and should be used
 210       with care.
 211
 212       This is not needed if lock type is 'none' or 'single'
      266  with care. Default is "false".
      267
      268  This is not needed if lock type is 'single'
 213  269  -->
      270  <!--
 214  271  <unlockOnStartup>false</unlockOnStartup>
      272  -->
 215  273
 216       <!-- If true, IndexReaders will be reopened (often more efficient)
 217       instead of closed and then opened.
 218       -->
 219       <reopenReaders>true</reopenReaders>
      274  <!-- Expert: Controls how often Lucene loads terms into memory
      275  Default is 128 and is likely good for most everyone.
      276  -->
      277  <!-- <termIndexInterval>128</termIndexInterval> -->
      278
      279  <!-- If true, IndexReaders will be opened/reopened from the IndexWriter
      280  instead of from the Directory. Hosts in a master/slave setup
      281  should have this set to false while those in a SolrCloud
      282  cluster need to be set to true. Default: true
      283  -->
      284  <!--
      285  <nrtMode>true</nrtMode>
      286  -->
 220  287
 221  288  <!-- Commit Deletion Policy
 222
 223       Custom deletion policies can specified here. The class must
      289  Custom deletion policies can be specified here. The class must
 224  290  implement org.apache.lucene.index.IndexDeletionPolicy.
 225  291
 226       http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/index/IndexDeletionPolicy.html
 227
 228       The standard Solr IndexDeletionPolicy implementation supports
      292  The default Solr IndexDeletionPolicy implementation supports
 229  293  deleting index commit points on number of commits, age of
 230  294  commit point and optimized status.
   …    …
 233  297  of the criteria.
 234  298  -->
      299  <!--
 235  300  <deletionPolicy class="solr.SolrDeletionPolicy">
      301  -->
 236  302  <!-- The number of commit points to be kept -->
 237       <str name="maxCommitsToKeep">1</str>
      303  <!-- <str name="maxCommitsToKeep">1</str> -->
 238  304  <!-- The number of optimized commit points to be kept -->
 239       <str name="maxOptimizedCommitsToKeep">0</str>
      305  <!-- <str name="maxOptimizedCommitsToKeep">0</str> -->
 240  306  <!--
 241  307  Delete all commit points once they have reached the given age.
   …    …
 246  312  <str name="maxCommitAge">1DAY</str>
 247  313  -->
      314  <!--
 248  315  </deletionPolicy>
      316  -->
 249  317
 250  318  <!-- Lucene Infostream
   …    …
 253  321  of detailed information when indexing.
 254  322
 255       Setting The value to true will instruct the underlying Lucene
 256       IndexWriter to write its debugging info the specified file
 257       -->
 258       <infoStream file="INFOSTREAM.txt">false</infoStream>
 259
 260       </mainIndex>
      323  Setting the value to true will instruct the underlying Lucene
      324  IndexWriter to write its info stream to solr's log. By default,
      325  this is enabled here, and controlled through log4j.properties.
      326  -->
      327  <infoStream>true</infoStream>
      328  </indexConfig>
      329
 261  330
 262  331  <!-- JMX
   …    …
 281  350  <updateHandler class="solr.DirectUpdateHandler2">
 282  351
      352  <!-- Enables a transaction log, used for real-time get, durability, and
      353  and solr cloud replica recovery. The log can grow as big as
      354  uncommitted changes to the index, so use of a hard autoCommit
      355  is recommended (see below).
      356  "dir" - the target directory for transaction logs, defaults to the
      357  solr data directory. -->
      358  <updateLog>
      359  <str name="dir">${solr.ulog.dir:}</str>
      360  </updateLog>
      361
 283  362  <!-- AutoCommit
 284  363
 285       Perform a <commit/> automatically under certain conditions.
      364  Perform a hard commit automatically under certain conditions.
 286  365  Instead of enabling autoCommit, consider using "commitWithin"
 287  366  when adding documents.
   …    …
 292  371  commit before automatically triggering a new commit.
 293  372
 294       maxTime - Maximum amount of time that is allowed to pass
 295       since a document was added before automaticly
      373  maxTime - Maximum amount of time in ms that is allowed to pass
      374  since a document was added before automatically
 296  375  triggering a new commit.
 297       -->
 298       <!--
 299       <autoCommit>
 300       <maxDocs>10000</maxDocs>
 301       <maxTime>1000</maxTime>
 302       </autoCommit>
 303       -->
      376  openSearcher - if false, the commit causes recent index changes
      377  to be flushed to stable storage, but does not cause a new
      378  searcher to be opened to make those changes visible.
      379
      380  If the updateLog is enabled, then it's highly recommended to
      381  have some sort of hard autoCommit to limit the log size.
      382  -->
      383  <autoCommit>
      384  <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
      385  <openSearcher>false</openSearcher>
      386  </autoCommit>
      387
      388  <!-- softAutoCommit is like autoCommit except it causes a
      389  'soft' commit which only ensures that changes are visible
      390  but does not ensure that data is synced to disk. This is
      391  faster and more near-realtime friendly than a hard commit.
      392  -->
      393
      394  <autoSoftCommit>
      395  <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
      396  </autoSoftCommit>
 304  397
 305  398  <!-- Update Related Event Listeners
   …    …
 334  427  </listener>
 335  428  -->
      429
 336  430  </updateHandler>
 337  431
   …    …
 373  467  -->
 374  468
 375
      469  <!-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      470  Query section - these settings control query time things like caches
      471  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -->
 376  472  <query>
 377  473  <!-- Max Boolean Clauses
   …    …
 448  544  autowarmCount="0"/>
 449  545
      546  <!-- custom cache currently used by block join -->
      547  <cache name="perSegFilter"
      548  class="solr.search.LRUCache"
      549  size="10"
      550  initialSize="0"
      551  autowarmCount="10"
      552  regenerator="solr.NoOpRegenerator" />
      553
 450  554  <!-- Field Value Cache
 451  555
   …    …
 587  691  should behave when processing requests for this SolrCore.
 588  692
 589       handleSelect affects the behavior of requests such as /select?qt=XXX
      693  handleSelect is a legacy option that affects the behavior of requests
      694  such as /select?qt=XXX
 590  695
 591  696  handleSelect="true" will cause the SolrDispatchFilter to process
 592       the request and will result in consistent error handling and
 593       formatting for all types of requests.
      697  the request and dispatch the query to a handler specified by the
      698  "qt" param, assuming "/select" isn't already registered.
 594  699
 595  700  handleSelect="false" will cause the SolrDispatchFilter to
 596       ignore "/select" requests and fallback to using the legacy
 597       SolrServlet and it's Solr 1.1 style error formatting
 598       -->
 599       <requestDispatcher handleSelect="true" >
      701  ignore "/select" requests, resulting in a 404 unless a handler
      702  is explicitly registered with the name "/select"
      703
      704  handleSelect="true" is not recommended for new users, but is the default
      705  for backwards compatibility
      706  -->
      707  <requestDispatcher handleSelect="false" >
 600  708  <!-- Request Parsing
 601  709
   …    …
 607  715  and stream.url parameters for specifying remote streams.
 608  716
 609       multipartUploadLimitInKB - specifies the max size of
      717  multipartUploadLimitInKB - specifies the max size (in KiB) of
 610  718  Multipart File Uploads that Solr will allow in a Request.
      719
      720  formdataUploadLimitInKB - specifies the max size (in KiB) of
      721  form data (application/x-www-form-urlencoded) sent via
      722  POST. You can use POST to pass request parameters not
      723  fitting into the URL.
      724
      725  addHttpRequestToContext - if set to true, it will instruct
      726  the requestParsers to include the original HttpServletRequest
      727  object in the context map of the SolrQueryRequest under the
      728  key "httpRequest". It will not be used by any of the existing
      729  Solr components, but may be useful when developing custom
      730  plugins.
 611  731
 612  732  *** WARNING ***
   …    …
 617  737  -->
 618  738  <requestParsers enableRemoteStreaming="true"
 619       multipartUploadLimitInKB="2048000" />
      739  multipartUploadLimitInKB="2048000"
      740  formdataUploadLimitInKB="2048"
      741  addHttpRequestToContext="false"/>
 620  742
 621  743  <!-- HTTP Caching
   …    …
 678  800  http://wiki.apache.org/solr/SolrRequestHandler
 679  801
 680       incoming queries will be dispatched to the correct handler
 681       based on the path or the qt (query type) param.
 682
 683       Names starting with a '/' are accessed with the a path equal to
 684       the registered name. Names without a leading '/' are accessed
 685       with: http://host/app/[core/]select?qt=name
 686
 687       If a /select request is processed with out a qt param
 688       specified, the requestHandler that declares default="true" will
 689       be used.
 690
      802  Incoming queries will be dispatched to a specific handler by name
      803  based on the path specified in the request.
      804
      805  Legacy behavior: If the request path uses "/select" but no Request
      806  Handler has that name, and if handleSelect="true" has been specified in
      807  the requestDispatcher, then the Request Handler is dispatched based on
      808  the qt parameter. Handlers without a leading '/' are accessed this way
      809  like so: http://host/app/[core/]select?qt=name If no qt is
      810  given, then the requestHandler that declares default="true" will be
      811  used or the one named "standard".
      812
 691  813  If a Request Handler is declared with startup="lazy", then it will
 692  814  not be initialized until the first request that uses it.
   …    …
 702  824  queries across multiple shards
 703  825  -->
 704       <requestHandler name="search" class="solr.SearchHandler" default="true">
      826  <!--<requestHandler name="/select" class="solr.SearchHandler">-->
      827  <requestHandler name="/select" class="org.greenstone.solrserver.Greenstone3SearchHandler">
 705  828  <!-- default values for query parameters can be specified, these
 706  829  will be overridden by parameters in the request
   …    …
 709  832  <str name="echoParams">explicit</str>
 710  833  <int name="rows">10</int>
      834  <str name="df">text</str>
 711  835  </lst>
 712  836  <!-- In addition to defaults, "appends" params can be specified
   …    …
 764  888  </requestHandler>
 765  889
 766       <!-- A Robust Example
 767
      890  <!-- A request handler that returns indented JSON by default -->
      891  <!--<requestHandler name="/query" class="solr.SearchHandler">-->
      892  <requestHandler name="/query" class="org.greenstone.solrserver.Greenstone3SearchHandler">
      893  <lst name="defaults">
      894  <str name="echoParams">explicit</str>
      895  <str name="wt">json</str>
      896  <str name="indent">true</str>
      897  <str name="df">text</str>
      898  </lst>
      899  </requestHandler>
      900
      901
      902  <!-- realtime get handler, guaranteed to return the latest stored fields of
      903  any document, without the need to commit or open a new searcher. The
      904  current implementation relies on the updateLog feature being enabled.
      905
      906  ** WARNING **
      907  Do NOT disable the realtime get handler at /get if you are using
      908  SolrCloud otherwise any leader election will cause a full sync in ALL
      909  replicas for the shard in question. Similarly, a replica recovery will
      910  also always fetch the complete index from the leader because a partial
      911  sync will not be possible in the absence of this handler.
      912  -->
      913  <requestHandler name="/get" class="solr.RealTimeGetHandler">
      914  <lst name="defaults">
      915  <str name="omitHeader">true</str>
      916  <str name="wt">json</str>
      917  <str name="indent">true</str>
      918  </lst>
      919  </requestHandler>
      920
      921
      922  <!-- A Robust Example
      923
 768  924  This example SearchHandler declaration shows off usage of the
 769  925  SearchHandler with many defaults declared
   …    …
 779  935  <!-- VelocityResponseWriter settings -->
 780  936  <str name="wt">velocity</str>
 781
 782  937  <str name="v.template">browse</str>
 783  938  <str name="v.layout">layout</str>
 784  939  <str name="title">Solritas</str>
 785  940
      941  <!-- Query settings -->
 786  942  <str name="defType">edismax</str>
      943  <str name="qf">
      944  text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
      945  title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0
      946  </str>
      947  <str name="df">text</str>
      948  <str name="mm">100%</str>
 787  949  <str name="q.alt">*:*</str>
 788  950  <str name="rows">10</str>
 789  951  <str name="fl">*,score</str>
      952
 790  953  <str name="mlt.qf">
 791  954  text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
      955  title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0
 792  956  </str>
 793       <str name="mlt.fl">text,features,name,sku,id,manu,cat</str>
      957  <str name="mlt.fl">text,features,name,sku,id,manu,cat,title,description,keywords,author,resourcename</str>
 794  958  <int name="mlt.count">3</int>
 795  959
 796       <str name="qf">
 797       text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
 798       </str>
 799
      960  <!-- Faceting defaults -->
 800  961  <str name="facet">on</str>
 801  962  <str name="facet.field">cat</str>
 802  963  <str name="facet.field">manu_exact</str>
      964  <str name="facet.field">content_type</str>
      965  <str name="facet.field">author_s</str>
 803  966  <str name="facet.query">ipod</str>
 804  967  <str name="facet.query">GB</str>
 805  968  <str name="facet.mincount">1</str>
 806  969  <str name="facet.pivot">cat,inStock</str>
      970  <str name="facet.range.other">after</str>
 807  971  <str name="facet.range">price</str>
 808  972  <int name="f.price.facet.range.start">0</int>
 809  973  <int name="f.price.facet.range.end">600</int>
 810  974  <int name="f.price.facet.range.gap">50</int>
 811       <str name="f.price.facet.range.other">after</str>
      975  <str name="facet.range">popularity</str>
      976  <int name="f.popularity.facet.range.start">0</int>
      977  <int name="f.popularity.facet.range.end">10</int>
      978  <int name="f.popularity.facet.range.gap">3</int>
 812  979  <str name="facet.range">manufacturedate_dt</str>
 813  980  <str name="f.manufacturedate_dt.facet.range.start">NOW/YEAR-10YEARS</str>
   …    …
 817  984  <str name="f.manufacturedate_dt.facet.range.other">after</str>
 818  985
 819
 820  986  <!-- Highlighting defaults -->
 821  987  <str name="hl">on</str>
 822       <str name="hl.fl">text features name</str>
      988  <str name="hl.fl">content features title name</str>
      989  <str name="hl.encoder">html</str>
      990  <str name="hl.simple.pre"><b></str>
      991  <str name="hl.simple.post"></b></str>
      992  <str name="f.title.hl.fragsize">0</str>
      993  <str name="f.title.hl.alternateField">title</str>
 823  994  <str name="f.name.hl.fragsize">0</str>
 824  995  <str name="f.name.hl.alternateField">name</str>
      996  <str name="f.content.hl.snippets">3</str>
      997  <str name="f.content.hl.fragsize">200</str>
      998  <str name="f.content.hl.alternateField">content</str>
      999  <str name="f.content.hl.maxAlternateFieldLength">750</str>
     1000
     1001  <!-- Spell checking defaults -->
     1002  <str name="spellcheck">on</str>
     1003  <str name="spellcheck.extendedResults">false</str>
     1004  <str name="spellcheck.count">5</str>
     1005  <str name="spellcheck.alternativeTermCount">2</str>
     1006  <str name="spellcheck.maxResultsForSuggest">5</str>
     1007  <str name="spellcheck.collate">true</str>
     1008  <str name="spellcheck.collateExtendedResults">true</str>
     1009  <str name="spellcheck.maxCollationTries">5</str>
     1010  <str name="spellcheck.maxCollations">3</str>
 825 1011  </lst>
     1012
     1013  <!-- append spellchecking to our list of components -->
 826 1014  <arr name="last-components">
 827 1015  <str>spellcheck</str>
 828 1016  </arr>
 829       <!--
 830       <str name="url-scheme">httpx</str>
 831       -->
 832 1017  </requestHandler>
 833 1018
 834       <!-- XML Update Request Handler.
     1019
     1020  <!-- Update Request Handler.
 835 1021
 836 1022  http://wiki.apache.org/solr/UpdateXmlMessages
 837 1023
 838 1024  The canonical Request Handler for Modifying the Index through
 839       commands specified using XML.
     1025  commands specified using XML, JSON, CSV, or JAVABIN
 840 1026
 841 1027  Note: Since solr1.1 requestHandlers requires a valid content
 842 1028  type header if posted in the body. For example, curl now
 843 1029  requires: -H 'Content-type:text/xml; charset=utf-8'
 844       -->
 845       <requestHandler name="/update"
 846       class="solr.XmlUpdateRequestHandler">
     1030
     1031  To override the request content type and force a specific
     1032  Content-type, use the request parameter:
     1033  ?update.contentType=text/csv
     1034
     1035  This handler will pick a response format to match the input
     1036  if the 'wt' parameter is not explicit
     1037  -->
     1038  <requestHandler name="/update" class="solr.UpdateRequestHandler">
 847 1039  <!-- See below for information on defining
 848 1040  updateRequestProcessorChains that can be used by name
   …    …
 854 1046  </lst>
 855 1047  -->
 856       </requestHandler>
 857       <!-- Binary Update Request Handler
 858       http://wiki.apache.org/solr/javabin
 859       -->
 860       <requestHandler name="/update/javabin"
 861       class="solr.BinaryUpdateRequestHandler" />
 862
 863       <!-- CSV Update Request Handler
 864       http://wiki.apache.org/solr/UpdateCSV
 865       -->
 866       <requestHandler name="/update/csv"
 867       class="solr.CSVRequestHandler"
 868       startup="lazy" />
 869
 870       <!-- JSON Update Request Handler
 871       http://wiki.apache.org/solr/UpdateJSON
 872       -->
 873       <requestHandler name="/update/json"
 874       class="solr.JsonUpdateRequestHandler"
 875       startup="lazy" />
     1048  </requestHandler>
     1049
     1050  <!-- for back compat with clients using /update/json and /update/csv -->
     1051  <requestHandler name="/update/json" class="solr.UpdateRequestHandler">
     1052  <lst name="defaults">
     1053  <str name="stream.contentType">application/json</str>
     1054  </lst>
     1055  </requestHandler>
     1056  <requestHandler name="/update/csv" class="solr.UpdateRequestHandler">
     1057  <lst name="defaults">
     1058  <str name="stream.contentType">application/csv</str>
     1059  </lst>
     1060  </requestHandler>
 876 1061
 877 1062  <!-- Solr Cell Update Request Handler
   …    …
 884 1069  class="solr.extraction.ExtractingRequestHandler" >
 885 1070  <lst name="defaults">
 886       <!-- All the main content goes into "text"... if you need to return
 887       the extracted text or do highlighting, use a stored field. -->
 888       <str name="fmap.content">text</str>
 889 1071  <str name="lowernames">true</str>
 890 1072  <str name="uprefix">ignored_</str>
   …    …
 896 1078  </lst>
 897 1079  </requestHandler>
     1080
 898 1081
 899 1082  <!-- Field Analysis Request Handler
   …    …
 925 1108
 926 1109  An analysis handler that provides a breakdown of the analysis
 927       process of provided docuemnts. This handler expects a (single)
     1110  process of provided documents. This handler expects a (single)
 928 1111  content stream with the following format:
 929 1112
   …    …
 971 1154  -->
 972 1155  <!-- If you wish to hide files under ${solr.home}/conf, explicitly
 973       register the ShowFileRequestHandler using:
     1156  register the ShowFileRequestHandler using the definition below.
     1157  NOTE: The glob pattern ('*') is the only pattern supported at present, *.xml will
     1158  not exclude all files ending in '.xml'. Use it to exclude _all_ updates
 974 1159  -->
 975 1160  <!--
   …    …
 979 1164  <str name="hidden">synonyms.txt</str>
 980 1165  <str name="hidden">anotherfile.txt</str>
     1166  <str name="hidden">*</str>
 981 1167  </lst>
 982 1168  </requestHandler>
   …    …
 985 1171  <!-- ping/healthcheck -->
 986 1172  <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
     1173  <lst name="invariants">
     1174  <str name="q">solrpingquery</str>
     1175  </lst>
 987 1176  <lst name="defaults">
 988       <str name="qt">search</str>
 989       <str name="q">solrpingquery</str>
 990 1177  <str name="echoParams">all</str>
 991 1178  </lst>
     1179  <!-- An optional feature of the PingRequestHandler is to configure the
     1180  handler with a "healthcheckFile" which can be used to enable/disable
     1181  the PingRequestHandler.
     1182  relative paths are resolved against the data dir
     1183  -->
     1184  <!-- <str name="healthcheckFile">server-enabled.txt</str> -->
 992 1185  </requestHandler>
 993 1186
   …    …
1003 1196
1004 1197  The SolrReplicationHandler supports replicating indexes from a
1005       "master" used for indexing and "salves" used for queries.
     1198  "master" used for indexing and "slaves" used for queries.
1006 1199
1007 1200  http://wiki.apache.org/solr/SolrReplication
1008 1201
1009       In the example below, remove the <lst name="master"> section if
1010       this is just a slave and remove the <lst name="slave"> section
1011       if this is just a master.
1012       -->
1013       <!--
1014       <requestHandler name="/replication" class="solr.ReplicationHandler" >
     1202  It is also necessary for SolrCloud to function (in Cloud mode, the
     1203  replication handler is used to bulk transfer segments when nodes
     1204  are added or need to recover).
     1205
     1206  https://wiki.apache.org/solr/SolrCloud/
     1207  -->
     1208  <requestHandler name="/replication" class="solr.ReplicationHandler" >
     1209  <!--
     1210  To enable simple master/slave replication, uncomment one of the
     1211  sections below, depending on whether this solr instance should be
     1212  the "master" or a "slave". If this instance is a "slave" you will
     1213  also need to fill in the masterUrl to point to a real machine.
     1214  -->
     1215  <!--
1015 1216  <lst name="master">
1016 1217  <str name="replicateAfter">commit</str>
   …    …
1018 1219  <str name="confFiles">schema.xml,stopwords.txt</str>
1019 1220  </lst>
     1221  -->
     1222  <!--
1020 1223  <lst name="slave">
1021       <str name="masterUrl">http://localhost:8983/solr/replication</str>
     1224  <str name="masterUrl">http://your-master-hostname:8983/solr</str>
1022 1225  <str name="pollInterval">00:00:60</str>
1023 1226  </lst>
1024       </requestHandler>
1025       -->
     1227  -->
     1228  </requestHandler>
1026 1229
1027 1230  <!-- Search Components
   …    …
1067 1270
1068 1271  -->
1069 1272
1070 1273  <!-- Spell Check
1071 1274
   …    …
1077 1280  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
1078 1281
1079       <str name="queryAnalyzerFieldType">textSpell</str>
     1282  <str name="queryAnalyzerFieldType">text_general</str>
1080 1283
1081 1284  <!-- Multiple "Spell Checkers" can be declared and used by this
   …    …
1083 1286  -->
1084 1287
1085       <!-- a spellchecker built from a field of the main index, and
1086       written to disk
1087       -->
     1288  <!-- a spellchecker built from a field of the main index -->
1088 1289  <lst name="spellchecker">
1089 1290  <str name="name">default</str>
     1291  <str name="field">text</str>
     1292  <str name="classname">solr.DirectSolrSpellChecker</str>
     1293  <!-- the spellcheck distance measure used, the default is the internal levenshtein -->
     1294  <str name="distanceMeasure">internal</str>
     1295  <!-- minimum accuracy needed to be considered a valid spellcheck suggestion -->
     1296  <float name="accuracy">0.5</float>
     1297  <!-- the maximum #edits we consider when enumerating terms: can be 1 or 2 -->
     1298  <int name="maxEdits">2</int>
     1299  <!-- the minimum shared prefix when enumerating terms -->
     1300  <int name="minPrefix">1</int>
     1301  <!-- maximum number of inspections per result. -->
     1302  <int name="maxInspections">5</int>
     1303  <!-- minimum length of a query term to be considered for correction -->
     1304  <int name="minQueryLength">4</int>
     1305  <!-- maximum threshold of documents a query term can appear to be considered for correction -->
     1306  <float name="maxQueryFrequency">0.01</float>
     1307  <!-- uncomment this to require suggestions to occur in 1% of the documents
     1308  <float name="thresholdTokenFrequency">.01</float>
     1309  -->
     1310  </lst>
     1311
     1312  <!-- a spellchecker that can break or combine words. See "/spell" handler below for usage -->
     1313  <lst name="spellchecker">
     1314  <str name="name">wordbreak</str>
     1315  <str name="classname">solr.WordBreakSolrSpellChecker</str>
1090 1316  <str name="field">name</str>
1091       <str name="spellcheckIndexDir">spellchecker</str>
1092       <!-- uncomment this to require terms to occur in 1% of the documents in order to be included in the dictionary
1093       <float name="thresholdTokenFrequency">.01</float>
1094       -->
     1317  <str name="combineWords">true</str>
     1318  <str name="breakWords">true</str>
     1319  <int name="maxChanges">10</int>
1095 1320  </lst>
1096 1321
   …    …
1100 1325  <str name="name">jarowinkler</str>
1101 1326  <str name="field">spell</str>
     1327  <str name="classname">solr.DirectSolrSpellChecker</str>
1102 1328  <str name="distanceMeasure">
1103 1329  org.apache.lucene.search.spell.JaroWinklerDistance
1104 1330  </str>
1105       <str name="spellcheckIndexDir">spellcheckerJaro</str>
1106 1331  </lst>
1107 1332  -->
   …    …
1118 1343  <str name="name">freq</str>
1119 1344  <str name="field">lowerfilt</str>
1120       <str name="spellcheckIndexDir">spellcheckerFreq</str>
     1345  <str name="classname">solr.DirectSolrSpellChecker</str>
1121 1346  <str name="comparatorClass">freq</str>
1122       <str name="buildOnCommit">true</str>
1123 1347  -->
1124 1348
   …    …
1134 1358  -->
1135 1359  </searchComponent>
1136 1360
1137 1361  <!-- A request handler for demonstrating the spellcheck component.
1138 1362 … … 1150 1374 <requestHandler name="/spell" class="solr.SearchHandler" startup="lazy"> 1151 1375 <lst name="defaults"> 1152 <str name="spellcheck.onlyMorePopular">false</str> 1153 <str name="spellcheck.extendedResults">false</str> 1154 <str name="spellcheck.count">1</str> 1376 <str name="df">text</str> 1377 <!-- Solr will use suggestions from both the 'default' spellchecker 1378 and from the 'wordbreak' spellchecker and combine them. 1379 collations (re-written queries) can include a combination of 1380 corrections from both spellcheckers --> 1381 <str name="spellcheck.dictionary">default</str> 1382 <str name="spellcheck.dictionary">wordbreak</str> 1383 <str name="spellcheck">on</str> 1384 <str name="spellcheck.extendedResults">true</str> 1385 <str name="spellcheck.count">10</str> 1386 <str name="spellcheck.alternativeTermCount">5</str> 1387 <str name="spellcheck.maxResultsForSuggest">5</str> 1388 <str name="spellcheck.collate">true</str> 1389 <str name="spellcheck.collateExtendedResults">true</str> 1390 <str name="spellcheck.maxCollationTries">10</str> 1391 <str name="spellcheck.maxCollations">5</str> 1155 1392 </lst> 1156 1393 <arr name="last-components"> … … 1159 1396 </requestHandler> 1160 1397 1398 <searchComponent name="suggest" class="solr.SuggestComponent"> 1399 <lst name="suggester"> 1400 <str name="name">mySuggester</str> 1401 <str name="lookupImpl">FuzzyLookupFactory</str> <!-- org.apache.solr.spelling.suggest.fst --> 1402 <str name="dictionaryImpl">DocumentDictionaryFactory</str> <!-- org.apache.solr.spelling.suggest.HighFrequencyDictionaryFactory --> 1403 <str name="field">cat</str> 1404 <str name="weightField">price</str> 1405 <str name="suggestAnalyzerFieldType">string</str> 1406 </lst> 1407 </searchComponent> 1408 1409 <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy"> 1410 <lst name="defaults"> 1411 <str name="suggest">true</str> 1412 <str name="suggest.count">10</str> 1413 </lst> 1414 <arr name="components"> 1415 
<str>suggest</str> 1416 </arr> 1417 </requestHandler> 1161 1418 <!-- Term Vector Component 1162 1419 … … 1172 1429 already specified request handlers. 1173 1430 --> 1174 <requestHandler name=" tvrh" class="solr.SearchHandler" startup="lazy">1431 <requestHandler name="/tvrh" class="solr.SearchHandler" startup="lazy"> 1175 1432 <lst name="defaults"> 1433 <str name="df">text</str> 1176 1434 <bool name="tv">true</bool> 1177 1435 </lst> … … 1183 1441 <!-- Clustering Component 1184 1442 1443 You'll need to set the solr.clustering.enabled system property 1444 when running solr to run with clustering enabled: 1445 1446 java -Dsolr.clustering.enabled=true -jar start.jar 1447 1185 1448 http://wiki.apache.org/solr/ClusteringComponent 1186 1187 This relies on third party jars which are notincluded in the 1188 release. To use this component (and the "/clustering" handler) 1189 Those jars will need to be downloaded, and you'll need to set 1190 the solr.cluster.enabled system property when running solr... 1191 1192 java -Dsolr.clustering.enabled=true -jar start.jar 1193 --> 1194 <searchComponent name="clustering" 1449 http://carrot2.github.io/solr-integration-strategies/ 1450 --> 1451 <searchComponent name="clustering" 1195 1452 enable="${solr.clustering.enabled:false}" 1196 1453 class="solr.clustering.ClusteringComponent" > 1197 <!-- Declare an engine -->1198 1454 <lst name="engine"> 1199 <!-- The name, only one can be named "default" --> 1200 <str name="name">default</str> 1201 1202 <!-- Class name of Carrot2 clustering algorithm. 1203 1204 Currently available algorithms are: 1205 1455 <str name="name">lingo</str> 1456 1457 <!-- Class name of a clustering algorithm compatible with the Carrot2 framework. 
1458 1459 Currently available open source algorithms are: 1206 1460 * org.carrot2.clustering.lingo.LingoClusteringAlgorithm 1207 1461 * org.carrot2.clustering.stc.STCClusteringAlgorithm 1208 1462 * org.carrot2.clustering.kmeans.BisectingKMeansClusteringAlgorithm 1209 1210 See http://project.carrot2.org/algorithms.html for the 1211 algorithm's characteristics. 1463 1464 See http://project.carrot2.org/algorithms.html for more information. 1465 1466 A commercial algorithm Lingo3G (needs to be installed separately) is defined as: 1467 * com.carrotsearch.lingo3g.Lingo3GClusteringAlgorithm 1212 1468 --> 1213 1469 <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str> 1214 1470 1215 <!-- Overriding values for Carrot2 default algorithm attributes. 1216 1217 For a description of all available attributes, see: 1218 http://download.carrot2.org/stable/manual/#chapter.components. 1219 Use attribute key as name attribute of str elements 1220 below. These can be further overridden for individual 1221 requests by specifying attribute key as request parameter 1222 name and attribute value as parameter value. 1223 --> 1224 <str name="LingoClusteringAlgorithm.desiredClusterCountBase">20</str> 1225 1226 <!-- Location of Carrot2 lexical resources. 1227 1228 A directory from which to load Carrot2-specific stop words 1229 and stop labels. Absolute or relative to Solr config directory. 1230 If a specific resource (e.g. stopwords.en) is present in the 1231 specified dir, it will completely override the corresponding 1232 default one that ships with Carrot2. 1471 <!-- Override location of the clustering algorithm's resources 1472 (attribute definitions and lexical resources). 1473 1474 A directory from which to load algorithm-specific stop words, 1475 stop labels and attribute definition XMLs. 
1233 1476 1234 1477 For an overview of Carrot2 lexical resources, see: 1235 1478 http://download.carrot2.org/head/manual/#chapter.lexical-resources 1236 --> 1237 <str name="carrot.lexicalResourcesDir">clustering/carrot2</str> 1238 1239 <!-- The language to assume for the documents. 1240 1241 For a list of allowed values, see: 1242 http://download.carrot2.org/stable/manual/#section.attribute.lingo.MultilingualClustering.defaultLanguage 1479 1480 For an overview of Lingo3G lexical resources, see: 1481 http://download.carrotsearch.com/lingo3g/manual/#chapter.lexical-resources 1243 1482 --> 1244 <str name=" MultilingualClustering.defaultLanguage">ENGLISH</str>1483 <str name="carrot.resourcesDir">clustering/carrot2</str> 1245 1484 </lst> 1485 1486 <!-- An example definition for the STC clustering algorithm. --> 1246 1487 <lst name="engine"> 1247 1488 <str name="name">stc</str> 1248 1489 <str name="carrot.algorithm">org.carrot2.clustering.stc.STCClusteringAlgorithm</str> 1490 </lst> 1491 1492 <!-- An example definition for the bisecting kmeans clustering algorithm. 
--> 1493 <lst name="engine"> 1494 <str name="name">kmeans</str> 1495 <str name="carrot.algorithm">org.carrot2.clustering.kmeans.BisectingKMeansClusteringAlgorithm</str> 1249 1496 </lst> 1250 1497 </searchComponent> … … 1263 1510 <lst name="defaults"> 1264 1511 <bool name="clustering">true</bool> 1265 <str name="clustering.engine">default</str>1266 1512 <bool name="clustering.results">true</bool> 1267 <!-- The title field-->1513 <!-- Field name with the logical "title" of a each document (optional) --> 1268 1514 <str name="carrot.title">name</str> 1515 <!-- Field name with the logical "URL" of a each document (optional) --> 1269 1516 <str name="carrot.url">id</str> 1270 <!-- The field to cluster on --> 1271 <str name="carrot.snippet">features</str> 1272 <!-- produce summaries --> 1273 <bool name="carrot.produceSummary">true</bool> 1274 <!-- the maximum number of labels per cluster --> 1275 <!--<int name="carrot.numDescriptions">5</int>--> 1276 <!-- produce sub clusters --> 1277 <bool name="carrot.outputSubClusters">false</bool> 1278 1279 <str name="defType">edismax</str> 1280 <str name="qf"> 1281 text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4 1282 </str> 1283 <str name="q.alt">*:*</str> 1284 <str name="rows">10</str> 1285 <str name="fl">*,score</str> 1286 </lst> 1517 <!-- Field name with the logical "content" of a each document (optional) --> 1518 <str name="carrot.snippet">features</str> 1519 <!-- Apply highlighter to the title/ content and use this for clustering. --> 1520 <bool name="carrot.produceSummary">true</bool> 1521 <!-- the maximum number of labels per cluster --> 1522 <!--<int name="carrot.numDescriptions">5</int>--> 1523 <!-- produce sub clusters --> 1524 <bool name="carrot.outputSubClusters">false</bool> 1525 1526 <!-- Configure the remaining request handler parameters. 
--> 1527 <str name="defType">edismax</str> 1528 <str name="qf"> 1529 text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4 1530 </str> 1531 <str name="q.alt">*:*</str> 1532 <str name="rows">10</str> 1533 <str name="fl">*,score</str> 1534 </lst> 1287 1535 <arr name="last-components"> 1288 1536 <str>clustering</str> … … 1303 1551 <lst name="defaults"> 1304 1552 <bool name="terms">true</bool> 1553 <bool name="distrib">false</bool> 1305 1554 </lst> 1306 1555 <arr name="components"> … … 1318 1567 scoring. 1319 1568 --> 1320 <!--1321 1569 <searchComponent name="elevator" class="solr.QueryElevationComponent" > 1322 -->1323 1570 <!-- pick a fieldType to analyze queries --> 1324 1325 <!--1326 1571 <str name="queryFieldType">string</str> 1327 1572 <str name="config-file">elevate.xml</str> 1328 1573 </searchComponent> 1329 --> 1574 1330 1575 <!-- A request handler for demonstrating the elevator component --> 1331 <!--1332 1576 <requestHandler name="/elevate" class="solr.SearchHandler" startup="lazy"> 1333 1577 <lst name="defaults"> 1334 1578 <str name="echoParams">explicit</str> 1579 <str name="df">text</str> 1335 1580 </lst> 1336 1581 <arr name="last-components"> … … 1338 1583 </arr> 1339 1584 </requestHandler> 1340 --> 1585 1341 1586 <!-- Highlighting Component 1342 1587 … … 1386 1631 <!-- Configure the standard fragListBuilder --> 1387 1632 <fragListBuilder name="simple" 1388 default="true"1389 1633 class="solr.highlight.SimpleFragListBuilder"/> 1390 1634 1391 1635 <!-- Configure the single fragListBuilder --> 1392 1636 <fragListBuilder name="single" 1393 1637 class="solr.highlight.SingleFragListBuilder"/> 1394 1638 1639 <!-- Configure the weighted fragListBuilder --> 1640 <fragListBuilder name="weighted" 1641 default="true" 1642 class="solr.highlight.WeightedFragListBuilder"/> 1643 1395 1644 <!-- default tag FragmentsBuilder --> 1396 1645 <fragmentsBuilder name="default" … … 1417 1666 </lst> 1418 1667 </fragmentsBuilder> 1668 1669 <boundaryScanner name="default" 
1670 default="true" 1671 class="solr.highlight.SimpleBoundaryScanner"> 1672 <lst name="defaults"> 1673 <str name="hl.bs.maxScan">10</str> 1674 <str name="hl.bs.chars">.,!? 	 </str> 1675 </lst> 1676 </boundaryScanner> 1677 1678 <boundaryScanner name="breakIterator" 1679 class="solr.highlight.BreakIteratorBoundaryScanner"> 1680 <lst name="defaults"> 1681 <!-- type should be one of CHARACTER, WORD(default), LINE and SENTENCE --> 1682 <str name="hl.bs.type">WORD</str> 1683 <!-- language and country are used when constructing Locale object. --> 1684 <!-- And the Locale object will be used when getting instance of BreakIterator --> 1685 <str name="hl.bs.language">en</str> 1686 <str name="hl.bs.country">US</str> 1687 </lst> 1688 </boundaryScanner> 1419 1689 </highlighting> 1420 1690 </searchComponent> … … 1451 1721 </updateRequestProcessorChain> 1452 1722 --> 1453 1723 1724 <!-- Language identification 1725 1726 This example update chain identifies the language of the incoming 1727 documents using the langid contrib. The detected language is 1728 written to field language_s. No field name mapping is done. 1729 The fields used for detection are text, title, subject and description, 1730 making this example suitable for detecting languages form full-text 1731 rich documents injected via ExtractingRequestHandler. 
1732 See more about langId at http://wiki.apache.org/solr/LanguageDetection 1733 --> 1734 <!-- 1735 <updateRequestProcessorChain name="langid"> 1736 <processor class="org.apache.solr.update.processor.TikaLanguageIdentifierUpdateProcessorFactory"> 1737 <str name="langid.fl">text,title,subject,description</str> 1738 <str name="langid.langField">language_s</str> 1739 <str name="langid.fallback">en</str> 1740 </processor> 1741 <processor class="solr.LogUpdateProcessorFactory" /> 1742 <processor class="solr.RunUpdateProcessorFactory" /> 1743 </updateRequestProcessorChain> 1744 --> 1745 1746 <!-- Script update processor 1747 1748 This example hooks in an update processor implemented using JavaScript. 1749 1750 See more about the script update processor at http://wiki.apache.org/solr/ScriptUpdateProcessor 1751 --> 1752 <!-- 1753 <updateRequestProcessorChain name="script"> 1754 <processor class="solr.StatelessScriptUpdateProcessorFactory"> 1755 <str name="script">update-script.js</str> 1756 <lst name="params"> 1757 <str name="config_param">example config parameter</str> 1758 </lst> 1759 </processor> 1760 <processor class="solr.RunUpdateProcessorFactory" /> 1761 </updateRequestProcessorChain> 1762 --> 1763 1454 1764 <!-- Response Writers 1455 1765 … … 1475 1785 <queryResponseWriter name="php" class="solr.PHPResponseWriter"/> 1476 1786 <queryResponseWriter name="phps" class="solr.PHPSerializedResponseWriter"/> 1477 <queryResponseWriter name="velocity" class="solr.VelocityResponseWriter"/>1478 1787 <queryResponseWriter name="csv" class="solr.CSVResponseWriter"/> 1479 --> 1788 <queryResponseWriter name="schema.xml" class="solr.SchemaXmlResponseWriter"/> 1789 --> 1790 1791 <queryResponseWriter name="json" class="solr.JSONResponseWriter"> 1792 <!-- For the purposes of the tutorial, JSON responses are written as 1793 plain text so that they are easy to read in *any* browser. 1794 If you expect a MIME type of "application/json" just remove this override. 
1795 --> 1796 <str name="content-type">text/plain; charset=UTF-8</str> 1797 </queryResponseWriter> 1798 1480 1799 <!-- 1481 1800 Custom response writers can be declared as needed... 1482 1801 --> 1483 <!-- 1484 <queryResponseWriter name="custom" class="com.example.MyResponseWriter"/> 1485 --> 1802 <queryResponseWriter name="velocity" class="solr.VelocityResponseWriter" startup="lazy"/> 1803 1486 1804 1487 1805 <!-- XSLT response writer transforms the XML output by any xslt file found … … 1518 1836 class="com.mycompany.MyValueSourceParser" /> 1519 1837 --> 1838 1839 1840 <!-- Document Transformers 1841 http://wiki.apache.org/solr/DocTransformers 1842 --> 1843 <!-- 1844 Could be something like: 1845 <transformer name="db" class="com.mycompany.LoadFromDatabaseTransformer" > 1846 <int name="connection">jdbc://....</int> 1847 </transformer> 1848 1849 To add a constant value to all docs, use: 1850 <transformer name="mytrans2" class="org.apache.solr.response.transform.ValueAugmenterFactory" > 1851 <int name="value">5</int> 1852 </transformer> 1853 1854 If you want the user to still be able to change it with _value:something_ use this: 1855 <transformer name="mytrans3" class="org.apache.solr.response.transform.ValueAugmenterFactory" > 1856 <double name="defaultValue">5</double> 1857 </transformer> 1858 1859 If you are using the QueryElevationComponent, you may wish to mark documents that get boosted. The 1860 EditorialMarkerFactory will do exactly that: 1861 <transformer name="qecBooster" class="org.apache.solr.response.transform.EditorialMarkerFactory" /> 1862 --> 1863 1520 1864 1521 1865 <!-- Legacy config for the admin interface --> 1522 1866 <admin> 1523 1867 <defaultQuery>*:*</defaultQuery> 1524 1525 <!-- configure a healthcheck file for servers behind a1526 loadbalancer1527 -->1528 <!--1529 <healthcheck type="file">server-enabled</healthcheck>1530 -->1531 1868 </admin> 1532 1869