root/other-projects/hathitrust

Rev Chgset Date Author Log Message
(edit) @31320 [31320] 3 years davidb build Document rather than parse JSON string
(edit) @31319 [31319] 3 years davidb Changed to replace existing MongoDB entry. Fixed up print statement
(edit) @31318 [31318] 3 years davidb change to using contains()
(edit) @31317 [31317] 3 years davidb added debug statement
(edit) @31316 [31316] 3 years davidb fixed typo
(edit) @31315 [31315] 3 years davidb Further tweak
(edit) @31314 [31314] 3 years davidb Another go at avoiding concurrency update exception
(edit) @31313 [31313] 3 years davidb Alternative to avoid concurrency update exception
(edit) @31312 [31312] 3 years davidb MongoDB can't have 'period' and 'dollar' in key, as reserved characters
(edit) @31311 [31311] 3 years davidb Processing print statement added
(edit) @31310 [31310] 3 years davidb Initial cut at files for working with MongoDB
(edit) @31309 [31309] 3 years davidb Sparked MongoDB connector added
(edit) @31308 [31308] 3 years davidb Minor tidy-up
(edit) @31307 [31307] 3 years davidb convenience scripts
(edit) @31306 [31306] 3 years davidb Final part of the mongodb shard puzzle -- router servers
(edit) @31305 [31305] 3 years davidb Next good commit point. Initial testing of shard replset scripts
(edit) @31304 [31304] 3 years davidb Changes made when (it turned out) the real source of the error was an error …
(edit) @31303 [31303] 3 years davidb Adding in support to start and stop router server
(edit) @31302 [31302] 3 years davidb Initial commit of scripts, after some testing, and subsequent changes to …
(edit) @31301 [31301] 3 years davidb Fix for gsliscluster1
(edit) @31300 [31300] 3 years davidb Need to use NETWORK not PACKAGE
(edit) @31299 [31299] 3 years davidb Additionally setup MongoDB
(edit) @31298 [31298] 3 years davidb Initial cut at setup file for MongoDB
(edit) @31297 [31297] 3 years davidb
(edit) @31296 [31296] 3 years davidb Make loading in of ID file more portable
(edit) @31295 [31295] 3 years davidb name change of webapp
(edit) @31294 [31294] 3 years davidb Version for language counting the catalog assignment language metadata. …
(edit) @31283 [31283] 3 years davidb Fixed typo
(edit) @31282 [31282] 3 years davidb Jetty jar-runable server
(edit) @31281 [31281] 3 years davidb
(edit) @31280 [31280] 3 years davidb
(edit) @31279 [31279] 3 years davidb First cut at servlet
(edit) @31278 [31278] 3 years davidb To avoid null pointer on ids.iterator()
(edit) @31277 [31277] 3 years davidb Tweak to minimum value
(edit) @31276 [31276] 3 years davidb Min num partition guard put in
(edit) @31275 [31275] 3 years davidb Changes to allow gc slave nodes to work with local disk versions of …
(edit) @31274 [31274] 3 years davidb Need to use JSONArray not JSONObject for a multifield item
(edit) @31273 [31273] 3 years davidb Code moved to store fields for multilingual use using dynamic Solr fields …
(edit) @31272 [31272] 3 years davidb Use disk and memory to store main language RDD
(edit) @31271 [31271] 3 years davidb Updating of POS code to new files-per-partition parameter, plus some other …
(edit) @31270 [31270] 3 years davidb Changed over to repartition approach
(edit) @31269 [31269] 3 years davidb Some variable name changes, and printing tidy up
(edit) @31268 [31268] 3 years davidb Adjustments to memory allocation in response to test runs on 10% of …
(edit) @31267 [31267] 3 years davidb Values trialed on gsliscluster1. Rekindling idea of per-vol processing
(edit) @31266 [31266] 3 years davidb Rekindling of per-volume approach. Also some tweaking to verbosity debug …
(edit) @31264 [31264] 3 years davidb Switching to 'long' in counts to allow higher number representation
(edit) @31263 [31263] 3 years davidb Change to using long for higher word counts
(edit) @31261 [31261] 3 years davidb Overlooked changes from POS to lang
(edit) @31260 [31260] 3 years davidb Language counting
(edit) @31259 [31259] 3 years davidb Lambda sort had wrong boolean arg to sort descending. Now fixed
(edit) @31258 [31258] 3 years davidb POS Label count, similar to Whitelist word count
(edit) @31257 [31257] 3 years davidb Fixed typo
(edit) @31256 [31256] 3 years davidb Earlier check of output directory to prevent large scale processing, when …
(edit) @31255 [31255] 3 years davidb Changed to using lambda functions
(edit) @31254 [31254] 3 years davidb Experimenting with Lucene lowercase filter
(edit) @31253 [31253] 3 years davidb Identified a typo, and changed to being true anyway
(edit) @31252 [31252] 3 years davidb Support for icu-tokenize property added, plus relevant refactoring.
(edit) @31251 [31251] 3 years davidb Code tidy up. Timed experiment showed sorting by key with num_partitions …
(edit) @31250 [31250] 3 years davidb Minor mods
(edit) @31247 [31247] 3 years davidb Change sort order. Pick better output directory name
(edit) @31246 [31246] 3 years davidb Experimenting with sorting
(edit) @31245 [31245] 3 years davidb Refactored so processing of words from TokenPosCount now done by the same …
(edit) @31244 [31244] 3 years davidb Tidy up
(edit) @31243 [31243] 3 years davidb Experimenting with Lucene/Solr's ICU tokenizer
(edit) @31242 [31242] 3 years davidb Method name refactor
(edit) @31235 [31235] 3 years davidb More fine-grained testing to help nema setup
(edit) @31234 [31234] 3 years davidb More selective control of what to source/setup depending on hostname
(edit) @31233 [31233] 3 years davidb Changes to operate on nema as well as gsliscluster1 and gc0-9
(edit) @31232 [31232] 3 years davidb Hand edited version of state.json from gsliscluster1 suitable for running …
(edit) @31231 [31231] 3 years davidb Changes to allow SOLR to run on nodes in /hdfsd05/dbbridge/solr-ef
(edit) @31228 [31228] 3 years davidb Change to see if code can be made more unified. If so, then …
(edit) @31227 [31227] 3 years davidb Code tidy up
(edit) @31226 [31226] 3 years davidb Fixed bloom test for init
(edit) @31225 [31225] 3 years davidb Relocated bloomfilter creation to within call() method, so done on the …
(edit) @31224 [31224] 3 years davidb Debug added
(edit) @31223 [31223] 3 years davidb Exception printStackTrace
(edit) @31222 [31222] 3 years davidb Changed to using ClusterFileIO supporting methods
(edit) @31221 [31221] 3 years davidb Missing argument added in
(edit) @31220 [31220] 3 years davidb Use of whitelist Bloom filter added to words going into Solr index
(edit) @31215 [31215] 3 years davidb Changed back to Guava 20 API, now mvn shading allows me to have this in …
(edit) @31214 [31214] 3 years davidb Not needed now using mvn shading
(edit) @31213 [31213] 3 years davidb Tidy up
(edit) @31212 [31212] 3 years davidb Changed from mvn assembly to shadowing, which has more control
(edit) @31211 [31211] 3 years davidb Changing back to regular Guava classes. Looking to use maven shading to …
(edit) @31209 [31209] 3 years davidb checkArgument added in
(edit) @31207 [31207] 3 years davidb And some more tweaking
(edit) @31206 [31206] 3 years davidb More tweaking of Guava cloned code
(edit) @31205 [31205] 3 years davidb Next added in part of new Guava code
(edit) @31204 [31204] 3 years davidb Splicing in Guava version 20 of BloomFilter into code as own class (now …
(edit) @31203 [31203] 3 years davidb Use class provided stringFunnel
(edit) @31202 [31202] 3 years davidb Turns out Spark uses Guava 14.0 not 20.0. Additional code to fill in some …
(edit) @31201 [31201] 3 years davidb Trigger serialization of whitelist in main program
(edit) @31200 [31200] 3 years davidb Better output statement
(edit) @31199 [31199] 3 years davidb Renaming of classname to reflect filename rename
(edit) @31198 [31198] 3 years davidb File renaming to make way for newer version of classes needed in the main …
(edit) @31197 [31197] 3 years davidb File renaming to make way for newer version of classes needed in the main …
(edit) @31196 [31196] 3 years davidb File renaming to make way for newer version of classes needed in the main …
(edit) @31195 [31195] 3 years davidb File renaming to make way for newer version of classes needed in the main …
(edit) @31194 [31194] 3 years davidb Serialize in and out methods added
(edit) @31193 [31193] 3 years davidb Peter's white-list file