|
|
@33426
|
4 years |
davidb |
Folder to detail how to stand up the HTRC DevEnv locally
|
|
|
@32175
|
5 years |
davidb |
No longer needed
|
|
|
@32174
|
5 years |
davidb |
Useful utility
|
|
|
@32173
|
5 years |
davidb |
Shift to using Solr7 setup
|
|
|
@32172
|
5 years |
davidb |
Tweaked to use Solr7 config files with Jetty installation
|
|
|
@32171
|
5 years |
davidb |
Increased number of max bool clauses
|
|
|
@32170
|
5 years |
davidb |
Shifting to newer version of Solr
|
|
|
@32120
|
5 years |
davidb |
Not entirely sure about this tweak to which SOLR env var is being …
|
|
|
@32119
|
5 years |
davidb |
Another useful script to run with Solr
|
|
|
@32118
|
5 years |
davidb |
Config files needed to top up Jetty with solr servlet
|
|
|
@32117
|
5 years |
davidb |
Some changes that are needed to run Jetty with security on admin, and …
|
|
|
@32116
|
5 years |
davidb |
Useful script that makes it easier to recall how to talk to the …
|
|
|
@32109
|
5 years |
davidb |
Changes made after testing through YARN
|
|
|
@32108
|
5 years |
davidb |
Useful breadcrumb for compiling
|
|
|
@32107
|
5 years |
davidb |
Rekindling the ability to run a JSON-filelist Spark run via YARN
|
|
|
@32106
|
5 years |
davidb |
Rekindle ability to process a json-filelist.txt using Spark
|
|
|
@32104
|
5 years |
davidb |
Serial version
|
|
|
@32103
|
5 years |
davidb |
Tidy up of output
|
|
|
@32102
|
5 years |
davidb |
Version to project local JSON list serially
|
|
|
@32101
|
5 years |
davidb |
Tweaks to allow serial ingest to run
|
|
|
@31786
|
6 years |
davidb |
extra param in call; change to case-folding _htrctokentext
|
|
|
@31785
|
6 years |
davidb |
Change to allow solr command to optionally issue 'restart' instead of …
|
|
|
@31784
|
6 years |
davidb |
Output to highlight skipping per-page indexing
|
|
|
@31783
|
6 years |
davidb |
Solr Doc Add changed to include volume-level metadata within every …
|
|
|
@31782
|
6 years |
davidb |
more careful separation into field types htrcstring and htrcstrings
|
|
|
@31779
|
6 years |
davidb |
Change in how POS words are checked against the Whitelist. Previously …
|
|
|
@31772
|
6 years |
davidb |
Accidentally committed
|
|
|
@31693
|
6 years |
davidb |
Changes to how workset information is pulled from sparql-endpoint for each …
|
|
|
@31677
|
6 years |
davidb |
Suppress processing governmentDocument for now in JSON metadata record, …
|
|
|
@31676
|
6 years |
davidb |
To make it easier to remember how to kill off a YARN task at the …
|
|
|
@31675
|
6 years |
davidb |
More careful set of metadata fields indexed
|
|
|
@31645
|
6 years |
davidb |
Some initial work on drawing in workset info from sparql-endpoint. …
|
|
|
@31626
|
6 years |
davidb |
Links to blog entries added
|
|
|
@31625
|
6 years |
davidb |
Tidy up
|
|
|
@31624
|
6 years |
davidb |
Combined volume md and full-text page searching
|
|
|
@31623
|
6 years |
davidb |
Removed commented out static HTML POS section
|
|
|
@31622
|
6 years |
davidb |
Adding in CORS support to Solr
|
|
|
@31621
|
6 years |
davidb |
Step towards making HTML/JS work on a different server, with AJAX …
|
|
|
@31619
|
6 years |
davidb |
Further minor tidy up
|
|
|
@31618
|
6 years |
davidb |
Code tidy up
|
|
|
@31614
|
6 years |
davidb |
Separate off stream query page
|
|
|
@31613
|
6 years |
davidb |
Multiple word support in POS search box. Tidy up of anchor for search …
|
|
|
@31601
|
6 years |
davidb |
To get the look and feel of the HTRC portal web site, supporting files …
|
|
|
@31598
|
6 years |
davidb |
Easier to remember what to do
|
|
|
@31597
|
6 years |
davidb |
Additional _s and _ss fields to help with faceting. Temporarily …
|
|
|
@31571
|
6 years |
davidb |
Simple search-all-langs feature added
|
|
|
@31570
|
6 years |
davidb |
Solr-stream based search
|
|
|
@31524
|
6 years |
davidb |
Main changes: Fix for page/seqnum; group by id; show-hide other …
|
|
|
@31510
|
6 years |
davidb |
Turns out some language fields can be empty. Need to test for this
|
|
|
@31509
|
6 years |
davidb |
LangPos determination changed to lock into first match, rather than …
|
|
|
@31506
|
6 years |
davidb |
Forgot to add initialization line. Doh!
|
|
|
@31505
|
6 years |
davidb |
Added in storing of top-level document metadata as separate solr-doc
|
|
|
@31504
|
6 years |
davidb |
Adjusted call to work with added parameter
|
|
|
@31503
|
6 years |
davidb |
Monitor for missing POS keys, and print out details first time each …
|
|
|
@31502
|
6 years |
davidb |
Comment out section, useful for controlling a smaller run
|
|
|
@31501
|
6 years |
davidb |
No longer used
|
|
|
@31500
|
6 years |
davidb |
Synchronize on reading in of white-list and universal-lang-pos
|
|
|
@31499
|
6 years |
davidb |
Better exception handling
|
|
|
@31498
|
6 years |
davidb |
Tidy up on print statements
|
|
|
@31466
|
6 years |
davidb |
Fix to work out solr_host rather than assume it is gc0
|
|
|
@31465
|
6 years |
davidb |
Adjustment to run solr with more memory
|
|
|
@31464
|
6 years |
davidb |
More general version of script that lets you specify the collection …
|
|
|
@31455
|
6 years |
davidb |
deprecated
|
|
|
@31454
|
6 years |
davidb |
Deprecated
|
|
|
@31453
|
6 years |
davidb |
Added size() method
|
|
|
@31452
|
6 years |
davidb |
Additional Spark progs to run
|
|
|
@31451
|
6 years |
davidb |
shift to using solr-base-url and a specified solr-collection
|
|
|
@31450
|
6 years |
davidb |
Some debugging output to help see what is happening with …
|
|
|
@31393
|
6 years |
davidb |
Fixed typo
|
|
|
@31392
|
6 years |
davidb |
Support for Catalog page added
|
|
|
@31385
|
6 years |
davidb |
Next and previous pages
|
|
|
@31384
|
6 years |
davidb |
After next phase of development
|
|
|
@31383
|
6 years |
davidb |
Files for initial functioning search page
|
|
|
@31378
|
6 years |
davidb |
Fixed loop limit test
|
|
|
@31377
|
6 years |
davidb |
Switch to using URI not string
|
|
|
@31376
|
6 years |
davidb |
Universal language mappings for opennlp POS model tags
|
|
|
@31375
|
6 years |
davidb |
Initial cut at including POS information to solr index
|
|
|
@31374
|
6 years |
davidb |
simplified command line usage
|
|
|
@31373
|
6 years |
davidb |
Changes made to operate on solr1 and solr2 boxes
|
|
|
@31372
|
6 years |
davidb |
Reworked to use sequenceFiles
|
|
|
@31371
|
6 years |
davidb |
Trying to get saveAsSequenceFile working
|
|
|
@31370
|
6 years |
davidb |
Fixed incorrect version number. Using htrcstring so field values not …
|
|
|
@31369
|
6 years |
davidb |
Trial new save
|
|
|
@31368
|
6 years |
davidb |
downsample-100 added
|
|
|
@31367
|
6 years |
davidb |
Changes to work with solr1 and solr2
|
|
|
@31366
|
6 years |
davidb |
Updated to latest released version of Solr
|
|
|
@31365
|
6 years |
davidb |
Quick code added to downsample
|
|
|
@31364
|
6 years |
davidb |
removed sample() line
|
|
|
@31363
|
6 years |
davidb |
Control num of partitions on sort
|
|
|
@31362
|
6 years |
davidb |
use Spark sample() to make for smaller test with Sequence files
|
|
|
@31361
|
6 years |
davidb |
Change from String to Text
|
|
|
@31360
|
6 years |
davidb |
Seems to be Text class not a String class coming out of the sequenceFiles
|
|
|
@31359
|
6 years |
davidb |
Changed over to use sequenceFiles as input
|
|
|
@31358
|
6 years |
davidb |
Make workset download save as file
|
|
|
@31357
|
6 years |
davidb |
Ensure all output sent to browser
|
|
|
@31356
|
6 years |
davidb |
Tidy up on appending missing volumes
|
|
|
@31355
|
6 years |
davidb |
Changed to using containsKey rather than get to avoid null pointer …
|
|
|
@31354
|
6 years |
davidb |
import tidy-up
|
|
|
@31353
|
6 years |
davidb |
Added debug print statement
|
|
|
@31352
|
6 years |
davidb |
collection-to-workset now with id-check added to filter
|
|
|