|
|
@31676
|
7 years |
davidb |
To make it easier to remember how to kill off a YARN task at the …
|
|
|
@31675
|
7 years |
davidb |
More careful set of metadata fields indexed
|
|
|
@31645
|
7 years |
davidb |
Some initial work on drawing in workset info from sparql-endpoint. …
|
|
|
@31626
|
7 years |
davidb |
Links to blog entries added
|
|
|
@31625
|
7 years |
davidb |
Tidy up
|
|
|
@31624
|
7 years |
davidb |
Combined volume md and full-text page searching
|
|
|
@31623
|
7 years |
davidb |
Removed commented out static HTML POS section
|
|
|
@31622
|
7 years |
davidb |
Adding in CORS support to Solr
|
|
|
@31621
|
7 years |
davidb |
Step towards making HTML/JS work on a different server, with AJAX …
|
|
|
@31619
|
7 years |
davidb |
Further minor tidy up
|
|
|
@31618
|
7 years |
davidb |
Code tidy up
|
|
|
@31614
|
7 years |
davidb |
Separate off stream query page
|
|
|
@31613
|
7 years |
davidb |
Multiple word support in POS search box. Tidy up of anchor for search …
|
|
|
@31601
|
7 years |
davidb |
To get the look and feel of the HTRC portal web site, supporting files …
|
|
|
@31598
|
7 years |
davidb |
Easier to remember what to do
|
|
|
@31597
|
7 years |
davidb |
Additional _s and _ss fields to help with faceting. Temporarily …
|
|
|
@31571
|
7 years |
davidb |
Simple search-all-langs feature added
|
|
|
@31570
|
7 years |
davidb |
Solr-stream based search
|
|
|
@31524
|
7 years |
davidb |
Main changes: Fix for page/seqnum; group by id; show-hide other …
|
|
|
@31510
|
7 years |
davidb |
Turns out some languages fields can be empty. Need to test for this
|
|
|
@31509
|
7 years |
davidb |
LangPos determination changed to lock into first match, rather than …
|
|
|
@31506
|
7 years |
davidb |
Forgot to add initialization line. Doh!
|
|
|
@31505
|
7 years |
davidb |
Added in storing of top-level document metadata as separate solr-doc
|
|
|
@31504
|
7 years |
davidb |
Adjusted call to work with added parameter
|
|
|
@31503
|
7 years |
davidb |
Monitor for missing POS keys, and print out details first time each …
|
|
|
@31502
|
7 years |
davidb |
Comment out section, useful for controlling a smaller run
|
|
|
@31501
|
7 years |
davidb |
No longer used
|
|
|
@31500
|
7 years |
davidb |
Synchronize on reading in of white-list and universal-lang-pos
|
|
|
@31499
|
7 years |
davidb |
Better exception handling
|
|
|
@31498
|
7 years |
davidb |
Tidy up on print statements
|
|
|
@31466
|
7 years |
davidb |
Fix to work out solr_host rather than assume it is gc0
|
|
|
@31465
|
7 years |
davidb |
Adjustment to run solr with more memory
|
|
|
@31464
|
7 years |
davidb |
More general version of script that lets you specify the collection …
|
|
|
@31455
|
7 years |
davidb |
deprecated
|
|
|
@31454
|
7 years |
davidb |
Deprecated
|
|
|
@31453
|
7 years |
davidb |
Added size() method
|
|
|
@31452
|
7 years |
davidb |
Additional Spark progs to run
|
|
|
@31451
|
7 years |
davidb |
shift to using solr-base-url and a specified solr-collection
|
|
|
@31450
|
7 years |
davidb |
Some debugging output to help see what is happening with …
|
|
|
@31385
|
7 years |
davidb |
Next and previous pages
|
|
|
@31384
|
7 years |
davidb |
After next phase of development
|
|
|
@31383
|
7 years |
davidb |
Files for initial functioning search page
|
|
|
@31378
|
7 years |
davidb |
Fixed loop limit test
|
|
|
@31377
|
7 years |
davidb |
Switch to using URI not string
|
|
|
@31376
|
7 years |
davidb |
Universal language mappings for opennlp POS model tags
|
|
|
@31375
|
7 years |
davidb |
Initial cut at including POS information to solr index
|
|
|
@31374
|
7 years |
davidb |
simplified command line usage
|
|
|
@31373
|
7 years |
davidb |
Changes made to operate on solr1 and solr2 boxes
|
|
|
@31372
|
7 years |
davidb |
Reworked to use sequenceFiles
|
|
|
@31371
|
7 years |
davidb |
Trying to get saveAsSequenceFile working
|
|
|
@31370
|
7 years |
davidb |
Fixed incorrect version number. Using htrcstring so field values not …
|
|
|
@31369
|
7 years |
davidb |
Trial new save
|
|
|
@31368
|
7 years |
davidb |
downsample-100 added
|
|
|
@31367
|
7 years |
davidb |
Changes to work with solr1 and solr2
|
|
|
@31366
|
7 years |
davidb |
Updated to latest released version of Solr
|
|
|
@31365
|
7 years |
davidb |
Quick code added to downsample
|
|
|
@31364
|
7 years |
davidb |
removed sample() line
|
|
|
@31363
|
7 years |
davidb |
Control num of partitions on sort
|
|
|
@31362
|
7 years |
davidb |
use Spark sample() to make for smaller test with Sequence files
|
|
|
@31361
|
7 years |
davidb |
Change from String to Text
|
|
|
@31360
|
7 years |
davidb |
Seems to be a Text class, not a String class, coming out of the sequenceFiles
|
|
|
@31359
|
7 years |
davidb |
Changed over to use sequenceFiles as input
|
|
|
@31320
|
7 years |
davidb |
build Document rather than parse JSON string
|
|
|
@31319
|
7 years |
davidb |
Changed to replace existing MongoDB entry. Fixed up print statement
|
|
|
@31318
|
7 years |
davidb |
change to using contains()
|
|
|
@31317
|
7 years |
davidb |
added debug statement
|
|
|
@31316
|
7 years |
davidb |
fixed typo
|
|
|
@31315
|
7 years |
davidb |
Further tweak
|
|
|
@31314
|
7 years |
davidb |
Another go at avoiding concurrency update exception
|
|
|
@31313
|
7 years |
davidb |
Alternative to avoid concurrency update exception
|
|
|
@31312
|
7 years |
davidb |
MongoDB can't have 'period' and 'dollar' in key, as reserved characters
|
|
|
@31311
|
7 years |
davidb |
Processing print statement added
|
|
|
@31310
|
7 years |
davidb |
Initial cut at files for working with MongoDB
|
|
|
@31309
|
7 years |
davidb |
Sparked MongoDB connector added
|
|
|
@31308
|
7 years |
davidb |
Minor tidy-up
|
|
|
@31307
|
7 years |
davidb |
convenience scripts
|
|
|
@31306
|
7 years |
davidb |
Final part of the mongodb shard puzzle -- router servers
|
|
|
@31305
|
7 years |
davidb |
Next good commit point. Initial testing of shard replset scripts
|
|
|
@31304
|
7 years |
davidb |
Changes made when (it turned out) the real source of the error was an …
|
|
|
@31303
|
7 years |
davidb |
Adding in support to start and stop router server
|
|
|
@31302
|
7 years |
davidb |
Initial commit of scripts, after some testing, and subsequent changes …
|
|
|
@31301
|
7 years |
davidb |
Fix for gsliscluster1
|
|
|
@31300
|
7 years |
davidb |
Need to use NETWORK not PACKAGE
|
|
|
@31299
|
7 years |
davidb |
Additionally setup MongoDB
|
|
|
@31298
|
7 years |
davidb |
Initial cut at setup file for MongoDB
|
|
|
@31297
|
7 years |
davidb |
|
|
|
@31294
|
7 years |
davidb |
Version for language counting the catalog assignment language …
|
|
|
@31278
|
7 years |
davidb |
To avoid null pointer on ids.iterator()
|
|
|
@31277
|
7 years |
davidb |
Tweak to minimum value
|
|
|
@31276
|
7 years |
davidb |
Min num partition guard put in
|
|
|
@31275
|
7 years |
davidb |
Changes to allow gc slave nodes to work with local disk versions of …
|
|
|
@31274
|
7 years |
davidb |
Need to use JSONArray not JSONObject for a multifield item
|
|
|
@31273
|
7 years |
davidb |
Code moved to store fields for multilingual use using dynamic Solr …
|
|
|
@31272
|
7 years |
davidb |
Use disk and memory to store main language RDD
|
|
|
@31271
|
7 years |
davidb |
Updating of POS code to new files-per-partition parameter, plus some …
|
|
|
@31270
|
7 years |
davidb |
Changed over to repartition approach
|
|
|
@31269
|
7 years |
davidb |
Some variable name changes, and printing tidy up
|
|
|
@31268
|
7 years |
davidb |
Adjustments to memory allocation in response to test runs on 10% of dataset
|
|
|
@31267
|
7 years |
davidb |
Values trialed on gsliscluster1. Rekindling idea of per-vol processing
|
|
|
@31266
|
7 years |
davidb |
Rekindling of per-volume approach. Also some tweaking to verbosity …
|
|
|