source: other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures@ 31372

Name Size Rev Age Author Last Change
__PerPageJSONForeach.java 2.4 KB 31011   7 years davidb Further RDD flatMap/map restructuring and refactoring, for per-page
ClusterFileIO.java 6.3 KB 31310   7 years davidb Initial cut at files for working with MongoDB
JSONClusterFileIO.java 782 bytes 31272   7 years davidb Use disk and memory to store main language RDD
PerPageJSONFlatmap.java 4.8 KB 31266   7 years davidb Rekindling of per-volume approach. Also some tweaking to verbosity …
PerPageJSONMap.java 2.5 KB 31045   7 years davidb More careful treatment of what to do when a JSON file isn't there
PerVolumeCatalogLangSequenceFileMap.java 1.6 KB 31360   7 years davidb Seems to be Text class not a String class coming out of the sequenceFiles
PerVolumeCatalogLangStreamFlatmap.java 2.4 KB 31294   7 years davidb Version for language counting the catalog assignment language …
PerVolumeJSON.java 7.5 KB 31372   7 years davidb Reworked to use sequenceFiles
PerVolumeLangStreamFlatmap.java 3.2 KB 31269   7 years davidb Some variable name changes, and printing tidy up
PerVolumeMongoDBDocumentsMap.java 7.5 KB 31320   7 years davidb build Document rather than parse JSON string
PerVolumePOSStreamFlatmap.java 3.2 KB 31271   7 years davidb Updating of POS code to new files-per-partition parameter, plus some …
PerVolumeWordStreamFlatmap.java 3.3 KB 31273   7 years davidb Code moved to store fields for multilingual use using dynamic Solr …
ProcessForCatalogLangCount.java 12.2 KB 31371   7 years davidb Trying to get saveAsSequenceFile working
ProcessForLangCount.java 6.7 KB 31272   7 years davidb Use disk and memory to store main language RDD
ProcessForMongoDBIngest.java 6.1 KB 31319   7 years davidb Changed to replace existing MongoDB entry. Fixed up print statement
ProcessForPOSCount.java 7.0 KB 31271   7 years davidb Updating of POS code to new files-per-partition parameter, plus some …
ProcessForSolrIngest.java 12.5 KB 31372   7 years davidb Reworked to use sequenceFiles
ProcessForWhitelist.java 7.9 KB 31308   7 years davidb Minor tidy-up
SolrDocJSON.java 15.3 KB 31308   7 years davidb Minor tidy-up
TestWhitelistBloomFilter.java 3.7 KB 31200   7 years davidb Better output statement
TestWhitelistDictionaryMain.java 1.0 KB 31199   7 years davidb Renaming of classname to reflect filename rename
TestWhitelistHashmap.java 1.3 KB 31199   7 years davidb Renaming of classname to reflect filename rename
WhitelistBloomFilter.java 4.2 KB 31227   7 years davidb Code tidy up