|
|
@31786
|
7 years |
davidb |
extra param in call; change to case-folding _htrctokentext
|
|
|
@31783
|
7 years |
davidb |
Solr Doc Add changed to include volume-level metadata within every …
|
|
|
@31779
|
7 years |
davidb |
Change in how POS words are checked against the Whitelist. Previously …
|
|
|
@31677
|
7 years |
davidb |
Suppress processing governmentDocument for now in JSON metadata record, …
|
|
|
@31675
|
7 years |
davidb |
More careful set of metadata fields indexed
|
|
|
@31597
|
7 years |
davidb |
Additional _s and _ss fields to help with faceting. Temporarily …
|
|
|
@31510
|
7 years |
davidb |
Turns out some language fields can be empty. Need to test for this
|
|
|
@31509
|
7 years |
davidb |
LangPos determination changed to lock into first match, rather than …
|
|
|
@31505
|
7 years |
davidb |
Added in storing of top-level document metadata as separate solr-doc
|
|
|
@31499
|
7 years |
davidb |
Better exception handling
|
|
|
@31378
|
7 years |
davidb |
Fixed loop limit test
|
|
|
@31375
|
7 years |
davidb |
Initial cut at including POS information to solr index
|
|
|
@31308
|
7 years |
davidb |
Minor tidy-up
|
|
|
@31274
|
7 years |
davidb |
Need to use JSONArray, not JSONObject, for a multifield item
|
|
|
@31273
|
7 years |
davidb |
Code moved to store fields for multilingual use using dynamic Solr …
|
|
|
@31260
|
7 years |
davidb |
Language counting
|
|
|
@31258
|
7 years |
davidb |
POS Label count, similar to Whitelist word count
|
|
|
@31254
|
7 years |
davidb |
Experimenting with Lucene lowercase filter
|
|
|
@31252
|
7 years |
davidb |
Support for icu-tokenize property added, plus relevant refactoring.
|
|
|
@31245
|
7 years |
davidb |
Refactored so processing of words from TokenPosCount now done by the …
|
|
|
@31244
|
7 years |
davidb |
Tidy up
|
|
|
@31243
|
7 years |
davidb |
Experimenting with Lucene/Solr's ICU tokenizer
|
|
|
@31242
|
7 years |
davidb |
Method name refactor
|
|
|
@31220
|
7 years |
davidb |
Use of whitelist Bloom filter added to words going into Solr index
|
|
|
@31176
|
7 years |
davidb |
Support added for producing whitelist word count
|
|
|
@31015
|
8 years |
davidb |
Restructuring of projects into one
|
|
copied from other-projects/hathitrust/solr-extracted-features/trunk/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java
|
|
|
@31007
|
8 years |
davidb |
Class name refactoring
|