source: other-projects/hathitrust

Diff Rev Age Author Log Message
(edit) @31366   7 years davidb Updated to latest released version of Solr
(edit) @31365   7 years davidb Quick code added to downsample
(edit) @31364   7 years davidb removed sample() line
(edit) @31363   7 years davidb Control num of partitions on sort
(edit) @31362   7 years davidb use Spark sample() to make for smaller test with Sequence files
(edit) @31361   7 years davidb Change from String to Text
(edit) @31360   7 years davidb Seems to be Text class not a String class coming out of the sequenceFiles
(edit) @31359   7 years davidb Changed over to use sequenceFiles as input
(edit) @31358   7 years davidb Make workset download save as file
(edit) @31357   7 years davidb Ensure all output sent to browser
(edit) @31356   7 years davidb Tidy up on appending missing volumes
(edit) @31355   7 years davidb Changed to using containsKey rather than get to avoid null pointer …
(edit) @31354   7 years davidb import tidy-up
(edit) @31353   7 years davidb Added debug print statement
(edit) @31352   7 years davidb collection-to-workset now with id-check added to filter
(edit) @31351   7 years davidb Powerpoint slides showing mashup features
(edit) @31350   7 years davidb Use new 'convert-col' action
(edit) @31349   7 years davidb Change over to proxied main web server
(edit) @31348   7 years davidb Restructure of how convert-to works
(edit) @31347   7 years davidb First stage of developing HT collection to HTRC workset. Code to …
(edit) @31342   7 years davidb Some initial progress on collection to workset conversion
(edit) @31341   7 years davidb Code tidy-up
(edit) @31340   7 years davidb Test worked OK. Removing debug code
(edit) @31339   7 years davidb Debugging statement
(edit) @31338   7 years davidb additional close()
(edit) @31337   7 years davidb Output the downloaded rsync file
(edit) @31336   7 years davidb Changes in response to testing
(edit) @31335   7 years davidb Too expensive to hold pairtree filename in hashmap, so change to …
(edit) @31334   7 years davidb Initial cut at rsync download
(edit) @31333   7 years davidb Minor word tweak
(edit) @31332   7 years davidb needed in Jetty CORS support
(edit) @31331   7 years davidb Reworked to use CORS and $.ajax() so TamperMonkey doesn't intercede …
(edit) @31330   7 years davidb Initial cut at files that explain how to install the user-script
(edit) @31329   7 years davidb Tweaks after testing INSTALL.sh
(edit) @31328   7 years davidb Install the necessary files in the jetty webapps dir
(edit) @31327   7 years davidb name change to be more consistent
(edit) @31326   7 years davidb Further tweaks
(edit) @31325   7 years davidb Further tweaks
(edit) @31324   7 years davidb More accurate name
(edit) @31323   7 years davidb Download script plus setup instructions
(edit) @31322   7 years davidb Location for the Java byte compiled code to link in with rest of servlet
(edit) @31321   7 years davidb useful scripts
(edit) @31320   7 years davidb build Document rather than parse JSON string
(edit) @31319   7 years davidb Changed to replace existing MongoDB entry. Fixed up print statement
(edit) @31318   7 years davidb change to using contains()
(edit) @31317   7 years davidb added debug statement
(edit) @31316   7 years davidb fixed typo
(edit) @31315   7 years davidb Further tweak
(edit) @31314   7 years davidb Another go at avoiding concurrency update exception
(edit) @31313   7 years davidb Alternative to avoid concurrency update exception
(edit) @31312   7 years davidb MongoDB can't have 'period' and 'dollar' in key, as reserved characters
(edit) @31311   7 years davidb Processing print statement added
(edit) @31310   7 years davidb Initial cut at files for working with MongoDB
(edit) @31309   7 years davidb Sparked MongoDB connector added
(edit) @31308   7 years davidb Minor tidy-up
(edit) @31307   7 years davidb convenience scripts
(edit) @31306   7 years davidb Final part of the mongodb shard puzzle -- router servers
(edit) @31305   7 years davidb Next good commit point. Initial testing of shard replset scripts
(edit) @31304   7 years davidb Changes made when (it turned out) the real source of the error was an …
(edit) @31303   7 years davidb Adding in support to start and stop router server
(edit) @31302   7 years davidb Initial commit of scripts, after some testing, and subsequent changes …
(edit) @31301   7 years davidb Fix for gsliscluster1
(edit) @31300   7 years davidb Need to use NETWORK not PACKAGE
(edit) @31299   7 years davidb Additionally setup MongoDB
(edit) @31298   7 years davidb Initial cut at setup file for MongoDB
(edit) @31297   7 years davidb
(edit) @31296   7 years davidb Make loading in of ID file more portable
(edit) @31295   7 years davidb name change of webapp
(edit) @31294   7 years davidb Version for language counting the catalog assignment language …
(edit) @31283   7 years davidb Fixed typo
(edit) @31282   7 years davidb Jetty jar-runnable server
(edit) @31281   7 years davidb
(edit) @31280   7 years davidb
(edit) @31279   7 years davidb First cut at servlet
(edit) @31278   7 years davidb To avoid null pointer on ids.iterator()
(edit) @31277   7 years davidb Tweak to minimum value
(edit) @31276   7 years davidb Min num partition guard put in
(edit) @31275   7 years davidb Changes to allow gc slave nodes to work with local disk versions of …
(edit) @31274   7 years davidb Need to use JSONArray not JSONObject for a multifield item
(edit) @31273   7 years davidb Code moved to store fields for multilingual use using dynamic Solr …
(edit) @31272   7 years davidb Use disk and memory to store main language RDD
(edit) @31271   7 years davidb Updating of POS code to new files-per-partition parameter, plus some …
(edit) @31270   7 years davidb Changed over to repartition approach
(edit) @31269   7 years davidb Some variable name changes, and printing tidy up
(edit) @31268   7 years davidb Adjustments to memory allocation in response to test runs on 10% of dataset
(edit) @31267   7 years davidb Values trialed on gsliscluster1. Rekindling idea of per-vol processing
(edit) @31266   7 years davidb Rekindling of per-volume approach. Also some tweaking to verbosity …
(edit) @31264   7 years davidb Switching to 'long' in counts to allow higher number representation
(edit) @31263   7 years davidb Change to using long for higher word counts
(edit) @31261   7 years davidb Overlooked changes from POS to lang
(edit) @31260   7 years davidb Language counting
(edit) @31259   7 years davidb Lambda sort had wrong boolean arg to sort descending. Now fixed
(edit) @31258   7 years davidb POS Label count, similar to Whitelist word count
(edit) @31257   7 years davidb Fixed typo
(edit) @31256   7 years davidb Earlier check of output directory to prevent large scale processing, …
(edit) @31255   7 years davidb Changed to using lambda functions
(edit) @31254   7 years davidb Experimenting with Lucene lowercase filter
(edit) @31253   7 years davidb Identified a typo, and changed to being true anyway
(edit) @31252   7 years davidb Support for icu-tokenize property added, plus relevant refactoring.
(edit) @31251   7 years davidb Code tidy up. Timed experiment showed sorting by key with …