source: other-projects/hathitrust/solr-extracted-features/trunk/src/main/java/org/hathitrust/PrepareForIngest.java

Diff Rev Age Author Log Message
(edit) @30988   8 years davidb Changed flag to 'read-only' and changed the field name full text saved …
(edit) @30986   8 years davidb Debugging for double accumulator added
(edit) @30985   8 years davidb Changed to run main processing method as action rather than transform. …
(edit) @30984   8 years davidb Introduction of Spark accumulator to measure progress. Output of POST …
(edit) @30979   8 years davidb _solr_url needs to be stored in class!
(edit) @30977   8 years davidb Only have RDD if an output directory was specified on the command-line …
(edit) @30976   8 years davidb Change to reflect changed order of command-line arguments
(edit) @30975   8 years davidb Introduction of new solr-url command line argument, leading to some …
(edit) @30951   8 years davidb Save a JSONObject as a file in the output directory
(edit) @30949   8 years davidb Use better name than 'foo'. Further fix to JSON name generated
(edit) @30945   8 years davidb Getting closer to writing out JSON files
(edit) @30944   8 years davidb Force higher partition count (6) than default, which seems to be 2
(edit) @30943   8 years davidb Extra debug info
(edit) @30942   8 years davidb Improved output printing for slave node
(edit) @30941   8 years davidb Moved to getFileSystemInstance() method to play nice on cluster
(edit) @30937   8 years davidb Expanded set of ClusterFileIO methods
(edit) @30934   8 years davidb Providing json-filelist now a compulsory argument, rather than an option
(edit) @30918   8 years davidb More flexible command-line args
(add) @30898   8 years davidb Scripts for downloading sample JSON data from public domain extracted …