|
|
@31367
|
7 years |
davidb |
Changes to work with solr1 and solr2
|
|
|
@31366
|
7 years |
davidb |
Updated to latest released version of Solr
|
|
|
@31365
|
7 years |
davidb |
Quick code added to downsample
|
|
|
@31364
|
7 years |
davidb |
removed sample() line
|
|
|
@31363
|
7 years |
davidb |
Control num of partitions on sort
|
|
|
@31362
|
7 years |
davidb |
use Spark sample() to make for smaller test with Sequence files
|
|
|
@31361
|
7 years |
davidb |
Change from String to Text
|
|
|
@31360
|
7 years |
davidb |
Seems to be Text class not a String class coming out of the sequenceFiles
|
|
|
@31359
|
7 years |
davidb |
Changed over to use sequenceFiles as input
|
|
|
@31358
|
7 years |
davidb |
Make workset download save as file
|
|
|
@31357
|
7 years |
davidb |
Ensure all output sent to browser
|
|
|
@31356
|
7 years |
davidb |
Tidy up on appending missing volumes
|
|
|
@31355
|
7 years |
davidb |
Changed to using containsKey rather than get to avoid null pointer …
|
|
|
@31354
|
7 years |
davidb |
import tidy-up
|
|
|
@31353
|
7 years |
davidb |
Added debug print statement
|
|
|
@31352
|
7 years |
davidb |
collection-to-workset now with id-check added to filter
|
|
|
@31351
|
7 years |
davidb |
PowerPoint slides showing mashup features
|
|
|
@31350
|
7 years |
davidb |
Use new 'convert-col' action
|
|
|
@31349
|
7 years |
davidb |
Change over to proxied main web server
|
|
|
@31348
|
7 years |
davidb |
Restructure of how convert-to works
|
|
|
@31347
|
7 years |
davidb |
First stage of developing HT collection to HTRC workset. Code to …
|
|
|
@31342
|
7 years |
davidb |
Some initial progress on collection to workset conversion
|
|
|
@31341
|
7 years |
davidb |
Code tidy-up
|
|
|
@31340
|
7 years |
davidb |
Test worked OK. Removing debug code
|
|
|
@31339
|
7 years |
davidb |
Debugging statement
|
|
|
@31338
|
7 years |
davidb |
additional close()
|
|
|
@31337
|
7 years |
davidb |
Output the downloaded rsync file
|
|
|
@31336
|
7 years |
davidb |
Changes in response to testing
|
|
|
@31335
|
7 years |
davidb |
Too expensive to hold pairtree filename in hashmap, so change to …
|
|
|
@31334
|
7 years |
davidb |
Initial cut at rsync download
|
|
|
@31333
|
7 years |
davidb |
Minor word tweak
|
|
|
@31332
|
7 years |
davidb |
needed in Jetty CORS support
|
|
|
@31331
|
7 years |
davidb |
Reworked to use CORS and $.ajax() so TamperMonkey doesn't intercede …
|
|
|
@31330
|
7 years |
davidb |
Initial cut at files that explain how to install the user-script
|
|
|
@31329
|
7 years |
davidb |
Tweaks after testing INSTALL.sh
|
|
|
@31328
|
7 years |
davidb |
Install the necessary files in the jetty webapps dir
|
|
|
@31327
|
7 years |
davidb |
name change to be more consistent
|
|
|
@31326
|
7 years |
davidb |
Further tweaks
|
|
|
@31325
|
7 years |
davidb |
Further tweaks
|
|
|
@31324
|
7 years |
davidb |
More accurate name
|
|
|
@31323
|
7 years |
davidb |
Download script plus setup instructions
|
|
|
@31322
|
7 years |
davidb |
Location for the Java byte compiled code to link in with rest of servlet
|
|
|
@31321
|
7 years |
davidb |
useful scripts
|
|
|
@31320
|
7 years |
davidb |
build Document rather than parse JSON string
|
|
|
@31319
|
7 years |
davidb |
Changed to replace existing MongoDB entry. Fixed up print statement
|
|
|
@31318
|
7 years |
davidb |
change to using contains()
|
|
|
@31317
|
7 years |
davidb |
added debug statement
|
|
|
@31316
|
7 years |
davidb |
fixed typo
|
|
|
@31315
|
7 years |
davidb |
Further tweak
|
|
|
@31314
|
7 years |
davidb |
Another go at avoiding concurrency update exception
|
|
|
@31313
|
7 years |
davidb |
Alternative to avoid concurrency update exception
|
|
|
@31312
|
7 years |
davidb |
MongoDB can't have 'period' and 'dollar' in key, as reserved characters
|
|
|
@31311
|
7 years |
davidb |
Processing print statement added
|
|
|
@31310
|
7 years |
davidb |
Initial cut at files for working with MongoDB
|
|
|
@31309
|
7 years |
davidb |
Spark MongoDB connector added
|
|
|
@31308
|
7 years |
davidb |
Minor tidy-up
|
|
|
@31307
|
7 years |
davidb |
convenience scripts
|
|
|
@31306
|
7 years |
davidb |
Final part of the mongodb shard puzzle -- router servers
|
|
|
@31305
|
7 years |
davidb |
Next good commit point. Initial testing of shard replset scripts
|
|
|
@31304
|
7 years |
davidb |
Changes made when (it turned out) the real source of the error was an …
|
|
|
@31303
|
7 years |
davidb |
Adding in support to start and stop router server
|
|
|
@31302
|
7 years |
davidb |
Initial commit of scripts, after some testing, and subsequent changes …
|
|
|
@31301
|
7 years |
davidb |
Fix for gsliscluster1
|
|
|
@31300
|
7 years |
davidb |
Need to use NETWORK not PACKAGE
|
|
|
@31299
|
7 years |
davidb |
Additionally setup MongoDB
|
|
|
@31298
|
7 years |
davidb |
Initial cut at setup file for MongoDB
|
|
|
@31297
|
7 years |
davidb |
|
|
|
@31296
|
7 years |
davidb |
Make loading in of ID file more portable
|
|
|
@31295
|
7 years |
davidb |
name change of webapp
|
|
|
@31294
|
7 years |
davidb |
Version for language counting the catalog assignment language …
|
|
|
@31283
|
7 years |
davidb |
Fixed typo
|
|
|
@31282
|
7 years |
davidb |
Jetty jar-runnable server
|
|
|
@31281
|
7 years |
davidb |
|
|
|
@31280
|
7 years |
davidb |
|
|
|
@31279
|
7 years |
davidb |
First cut at servlet
|
|
|
@31278
|
7 years |
davidb |
To avoid null pointer on ids.iterator()
|
|
|
@31277
|
7 years |
davidb |
Tweak to minimum value
|
|
|
@31276
|
7 years |
davidb |
Min num partition guard put in
|
|
|
@31275
|
7 years |
davidb |
Changes to allow gc slave nodes to work with local disk versions of …
|
|
|
@31274
|
7 years |
davidb |
Need to use JSONArray not JSONObject for a multifield item
|
|
|
@31273
|
7 years |
davidb |
Code moved to store fields for multilingual use using dynamic Solr …
|
|
|
@31272
|
7 years |
davidb |
Use disk and memory to store main language RDD
|
|
|
@31271
|
7 years |
davidb |
Updating of POS code to new files-per-partition parameter, plus some …
|
|
|
@31270
|
7 years |
davidb |
Changed over to repartition approach
|
|
|
@31269
|
7 years |
davidb |
Some variable name changes, and printing tidy up
|
|
|
@31268
|
7 years |
davidb |
Adjustments to memory allocation in response to test runs on 10% of dataset
|
|
|
@31267
|
7 years |
davidb |
Values trialed on gsliscluster1. Rekindling idea of per-vol processing
|
|
|
@31266
|
7 years |
davidb |
Rekindling of per-volume approach. Also some tweaking to verbosity …
|
|
|
@31264
|
7 years |
davidb |
Switching to 'long' in counts to allow higher number representation
|
|
|
@31263
|
7 years |
davidb |
Change to using long for higher word counts
|
|
|
@31261
|
7 years |
davidb |
Overlooked changes from POS to lang
|
|
|
@31260
|
7 years |
davidb |
Language counting
|
|
|
@31259
|
7 years |
davidb |
Lambda sort had wrong boolean arg to sort descending. Now fixed
|
|
|
@31258
|
7 years |
davidb |
POS Label count, similar to Whitelist word count
|
|
|
@31257
|
7 years |
davidb |
Fixed typo
|
|
|
@31256
|
7 years |
davidb |
Earlier check of output directory to prevent large scale processing, …
|
|
|
@31255
|
7 years |
davidb |
Changed to using lambda functions
|
|
|
@31254
|
7 years |
davidb |
Experimenting with Lucene lowercase filter
|
|
|
@31253
|
7 years |
davidb |
Identified a typo, and changed to being true anyway
|
|
|
@31252
|
7 years |
davidb |
Support for icu-tokenize property added, plus relevant refactoring.
|
|
|