#
# ChangeLog for /
#
# Generated by Trac 1.4.2
# 2024-04-28T11:18:14+12:00

Tue, 27 Dec 2016 05:52:41 GMT davidb [31267]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/ef-solr.properties (modified)

    Values trialed on gsliscluster1. Rekindling idea of per-vol processing

Tue, 27 Dec 2016 05:51:42 GMT davidb [31266]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerPageJSONFlatmap.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeJSON.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (modified)

    Rekindling of per-volume approach. Also some tweaking to verbosity ...

Wed, 21 Dec 2016 13:05:16 GMT Georgiy Litvinov [31265]
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/service/OAIPMH.java (modified)

    Fixed null pointer exception while rebuilding collection via web editor

Wed, 21 Dec 2016 00:47:56 GMT davidb [31264]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForLangCount.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForPOSCount.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForWhitelist.java (modified)

    Switching to 'long' in counts to allow higher number representation

Wed, 21 Dec 2016 00:26:31 GMT davidb [31263]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForPOSCount.java (modified)

    Change to using long for higher word counts

Tue, 20 Dec 2016 12:49:14 GMT Georgiy Litvinov [31262]
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/service/CrossCollectionSearch.java (modified)

    Add snippets to cross collection search results

Tue, 20 Dec 2016 11:14:48 GMT davidb [31261]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForLangCount.java (modified)

    Overlooked changes from POS to lang

Tue, 20 Dec 2016 11:12:10 GMT davidb [31260]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-RUN-MASTER-SPARK-LANG-COUNT.sh (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeLangStreamFlatmap.java (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForLangCount.java (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForPOSCount.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

    Language counting

Tue, 20 Dec 2016 10:45:28 GMT davidb [31259]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForPOSCount.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForWhitelist.java (modified)

    Lambda sort had wrong boolean arg to sort descending. Now fixed

Tue, 20 Dec 2016 10:39:40 GMT davidb [31258]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-RUN-MASTER-SPARK-POS-COUNT.sh (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumePOSStreamFlatmap.java (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForPOSCount.java (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

    POS Label count, similar to Whitelist word count

Tue, 20 Dec 2016 03:52:52 GMT davidb [31257]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForWhitelist.java (modified)

    Fixed typo

Tue, 20 Dec 2016 03:44:40 GMT davidb [31256]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForWhitelist.java (modified)

    Earlier check of output directory to prevent large scale processing, ...

Tue, 20 Dec 2016 02:37:26 GMT davidb [31255]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForWhitelist.java (modified)

    Changed to using lambda functions

Tue, 20 Dec 2016 02:29:56 GMT davidb [31254]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

    Experimenting with Lucene lowercase filter

Tue, 20 Dec 2016 01:57:38 GMT davidb [31253]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/ef-solr.properties (modified)

    Identified a typo, and changed to being true anyway

Tue, 20 Dec 2016 01:15:05 GMT davidb [31252]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/ef-solr.properties (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerPageJSONFlatmap.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeJSON.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeWordStreamFlatmap.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForWhitelist.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

    Support for icu-tokenize property added, plus relevant refactoring.

Mon, 19 Dec 2016 02:13:52 GMT davidb [31251]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForWhitelist.java (modified)

    Code tidy up. Timed experiment showed sorting by key with ...

Mon, 19 Dec 2016 02:03:27 GMT davidb [31250]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForWhitelist.java (modified)

    Minor mods

Mon, 19 Dec 2016 01:37:31 GMT kjdon [31249]
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/action/DocumentAction.java (modified)

    when doing highlightqueryterms we pass in the id of the document we ...

Sun, 18 Dec 2016 23:49:09 GMT kjdon [31248]
    * main/trunk/greenstone3/web/interfaces/default/transform/pages/document.xsl (modified)

    removed an extraneous )

Sun, 18 Dec 2016 07:38:59 GMT davidb [31247]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForWhitelist.java (modified)

    Change sort order. Pick better output directory name

Sun, 18 Dec 2016 05:25:02 GMT davidb [31246]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForWhitelist.java (modified)

    Experimenting with sorting

Sun, 18 Dec 2016 04:18:13 GMT davidb [31245]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

    Refactored so processing of words from TokenPosCount now done by the ...

Sun, 18 Dec 2016 03:57:05 GMT davidb [31244]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

    Tidy up

Sat, 17 Dec 2016 04:25:08 GMT davidb [31243]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/pom.xml (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

    Experimenting with Lucene/Solr's ICU tokenizer

Sat, 17 Dec 2016 02:53:23 GMT davidb [31242]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeWordStreamFlatmap.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

    Method name refactor

Thu, 15 Dec 2016 04:34:22 GMT ak19 [31241]
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/service/OAIPMH.java (modified)

    Corrected misunderstanding on distinction between whether we're ...

Wed, 14 Dec 2016 21:28:39 GMT kjdon [31240]
    * main/trunk/greenstone3/web/interfaces/default/transform/layouts/header.xsl (modified)

    indented nicely in emacs. unfortunately it means the javascript code ...

Wed, 14 Dec 2016 21:24:21 GMT kjdon [31239]
    * main/trunk/greenstone3/web/interfaces/default/transform/layouts/header.xsl (modified)

    split up the home-help-prefs template into multiple templates for ...

Wed, 14 Dec 2016 21:16:26 GMT kjdon [31238]
    * main/trunk/greenstone3/web/interfaces/default/transform/layouts/toc.xsl (modified)
    * main/trunk/greenstone3/web/interfaces/default/transform/pages/document.xsl (modified)

    added option disableSearchTermHighlighting=true so the user can ...

Wed, 14 Dec 2016 07:22:44 GMT ak19 [31237]
    * main/trunk/greenstone3/resources/oai/OAIConfig.xml.in (modified)

    Forgot to declare the GS3 OAI repository's deletion policy (Deleted ...

Wed, 14 Dec 2016 06:42:53 GMT ak19 [31236]
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/service/OAIPMH.java (modified)

    More checking of oaiinf_db: need to make sure it exists before trying ...

Tue, 13 Dec 2016 22:29:31 GMT davidb [31235]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SETUP/setup-solr.bash (modified)

    More fine-grained testing to help nema setup

Tue, 13 Dec 2016 22:20:57 GMT davidb [31234]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SETUP.bash (modified)

    More selective control of what to source/setup depending on hostname

Tue, 13 Dec 2016 22:12:29 GMT davidb [31233]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SETUP.bash (modified)

    Changes to operate on nema as well as gsliscluster1 and gc0-9

Tue, 13 Dec 2016 22:11:00 GMT davidb [31232]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/CONF/nema-state.json (added)

    Hand edited version of state.json from gsliscluster1 suitable for ...

Tue, 13 Dec 2016 09:41:54 GMT davidb [31231]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/nema-solr-start-all.sh (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/remote-solr-check-local-shardsize-all.sh (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/remote-solr-rsync2nema-local-shard-all.sh (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/remote-solr-setup-local-disk-all.sh (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/remote-solr-start-all.sh (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SETUP/setup-solr.bash (modified)

    Changes to allow SOLR to run on nodes in /hdfsd05/dbbridge/solr-ef

Tue, 13 Dec 2016 07:36:01 GMT ak19 [31230]
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/service/OAIPMH.java (modified)
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/util/FlatDatabaseWrapper.java (modified)
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/util/GDBMWrapper.java (modified)
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/util/GSFile.java (modified)
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/util/JDBMWrapper.java (modified)
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/util/OAIXML.java (modified)
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/util/SimpleCollectionDatabase.java (modified)

    Commit for GS3 server side part of OAI deletion policy ...

Tue, 13 Dec 2016 03:22:24 GMT ak19 [31229]
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/core/OAIReceptionist.java (modified)

    Converting informative logger messages emitted with logger.error() ...

Tue, 13 Dec 2016 01:02:01 GMT davidb [31228]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ClusterFileIO.java (modified)

    Change to see if code can be made more unified. If so, then ...

Tue, 13 Dec 2016 01:00:15 GMT davidb [31227]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ClusterFileIO.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/WhitelistBloomFilter.java (modified)

    Code tidy up

Tue, 13 Dec 2016 00:53:48 GMT davidb [31226]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerPageJSONFlatmap.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeJSON.java (modified)

    Fixed bloom test for init

Tue, 13 Dec 2016 00:46:23 GMT davidb [31225]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerPageJSONFlatmap.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeJSON.java (modified)

    Relocated bloomfilter creation to within call() method, so done on ...

Mon, 12 Dec 2016 10:30:27 GMT davidb [31224]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/WhitelistBloomFilter.java (modified)

    Debug added

Mon, 12 Dec 2016 10:28:08 GMT davidb [31223]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ClusterFileIO.java (modified)

    Exception printStackTrace

Mon, 12 Dec 2016 10:22:33 GMT davidb [31222]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ClusterFileIO.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/WhitelistBloomFilter.java (modified)

    Changed to using ClusterFileIO supporting methods

Mon, 12 Dec 2016 07:20:25 GMT davidb [31221]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerPageJSONFlatmap.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeJSON.java (modified)

    Missing argument added in

Mon, 12 Dec 2016 07:18:04 GMT davidb [31220]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerPageJSONFlatmap.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeJSON.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

    Use of whitelist Bloom filter added to words going into Solr index

Mon, 12 Dec 2016 07:12:02 GMT ak19 [31219]
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/Demo-Lucene/etc/oai-inf-tmp.gdb (added)

    Forgot to add to model-collect with previous commit.

Mon, 12 Dec 2016 06:45:39 GMT ak19 [31218]
    * main/trunk/greenstone2/perllib/oaiinfo.pm (modified)

    Changes to get new perl code to work on the Mac Mountain Lion.

Mon, 12 Dec 2016 06:06:08 GMT ak19 [31217]
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/Associated-Files/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/CDS-ISIS/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/Customization/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/DSpace-To-GS/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/Demo-MGPP/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/Demo-Section-Tagging/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/Enhanced-PDF/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/MARC-Exploded/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/MARC-Singlefile/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/METS/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/Multimedia/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/OAI-Local/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/PDFBox/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/Scanned-Img-Advanced/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/Scanned-Img-Basic/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/Simple-Image/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/Small-HTML/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/Tudor-Basic/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/Tudor-Enhanced/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/Web-Tudor/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/Word-PDF-Basic/etc/oai-inf-tmp.gdb (added)
    * other-projects/nightly-tasks/diffcol/trunk/model-collect/Word-PDF-Formatting/etc/oai-inf-tmp.gdb (added)

    Adding the new oai-inf.db files, created by rebuilding the model ...

Mon, 12 Dec 2016 04:45:45 GMT ak19 [31216]
    * main/trunk/greenstone2/perllib/oaiinfo.pm (modified)

    Adding a datestamp field to the new oai-inf.db. Now the timestamp and ...

Mon, 12 Dec 2016 04:12:56 GMT davidb [31215]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/WhitelistBloomFilter.java (modified)

    Changed back to Guava 20 API, now mvn shading allows me to have this ...

Mon, 12 Dec 2016 04:08:51 GMT davidb [31214]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/com (deleted)

    Not needed now using mvn shading

Mon, 12 Dec 2016 04:08:06 GMT davidb [31213]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/pom.xml (modified)

    Tidy up

Mon, 12 Dec 2016 04:06:50 GMT davidb [31212]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/COMPILE.bash (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/pom.xml (modified)

    Changed from mvn assembly to shadowing, which has more control

Mon, 12 Dec 2016 03:01:59 GMT davidb [31211]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/WhitelistBloomFilter.java (modified)

    Changing back to regular Guava classes. Looking to use maven shading ...

Mon, 12 Dec 2016 02:29:33 GMT ak19 [31210]
    * main/trunk/greenstone2/perllib/DBDrivers/JDBM.pm (modified)

    Forgot to remove an extra debug statement.

Mon, 12 Dec 2016 02:24:54 GMT davidb [31209]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/com/google/common/hash/BloomFilterAdvanced.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/com/google/common/hash/BloomFilterAdvancedStrategies.java (modified)

    checkArgument added in

Mon, 12 Dec 2016 02:20:40 GMT ak19 [31208]
    * main/trunk/greenstone2/perllib/DBDrivers/JDBM.pm (modified)
    * main/trunk/greenstone2/perllib/dbutil/jdbm.pm (modified)
    * main/trunk/greenstone2/perllib/oaiinfo.pm (modified)

    Kathy found that the lowercased dbutil modules are not used (jdbm.pm, ...

Mon, 12 Dec 2016 02:10:24 GMT davidb [31207]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/com/google/common/hash/BloomFilterAdvanced.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/com/google/common/hash/BloomFilterAdvancedStrategies.java (modified)

    And some more tweaking

Mon, 12 Dec 2016 02:05:33 GMT davidb [31206]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/com/google/common/hash/BloomFilterAdvanced.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/com/google/common/hash/BloomFilterAdvancedStrategies.java (modified)

    More tweaking of Guava cloned code

Mon, 12 Dec 2016 02:01:26 GMT davidb [31205]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/com/google/common/hash/BloomFilterAdvanced.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/com/google/common/hash/BloomFilterAdvancedStrategies.java (added)

    Next added in part of new Guava code

Mon, 12 Dec 2016 01:28:20 GMT davidb [31204]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/com (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/com/google (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/com/google/common (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/com/google/common/hash (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/com/google/common/hash/BloomFilterAdvanced.java (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/WhitelistBloomFilter.java (modified)

    Splicing in Guava version 20 of BloomFilter into code as own class ...

Mon, 12 Dec 2016 00:57:01 GMT davidb [31203]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/WhitelistBloomFilter.java (modified)

    Use class provided stringFunnel

Mon, 12 Dec 2016 00:53:06 GMT davidb [31202]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/WhitelistBloomFilter.java (modified)

    Turns out Spark uses Guava 14.0 not 20.0. Additional code to fill in ...

Sun, 11 Dec 2016 21:35:42 GMT davidb [31201]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/WhitelistBloomFilter.java (added)

    Trigger serialization of whitelist in main program

Sun, 11 Dec 2016 21:35:05 GMT davidb [31200]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/TestWhitelistBloomFilter.java (modified)

    Better output statement

Sun, 11 Dec 2016 21:04:55 GMT davidb [31199]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/TestWhitelistBloomFilter.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/TestWhitelistDictionaryMain.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/TestWhitelistHashmap.java (modified)

    Renaming of classname to reflect filename rename

Sun, 11 Dec 2016 21:03:20 GMT davidb [31198]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/TestWhitelistHashmap.java (moved)

    File renaming to make way for newer version of classes needed in the ...

Sun, 11 Dec 2016 21:02:37 GMT davidb [31197]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/DictionaryWhitelist.java (deleted)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/WhitelistBloomFilter.java (deleted)

    File renaming to make way for newer version of classes needed in the ...

Sun, 11 Dec 2016 21:01:30 GMT davidb [31196]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/TestWhitelistBloomFilter.java (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/TestWhitelistDictionaryMain.java (added)

    File renaming to make way for newer version of classes needed in the ...

Sun, 11 Dec 2016 21:00:08 GMT davidb [31195]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/TESTWhitelistHashmap.java (moved)

    File renaming to make way for newer version of classes needed in the ...

Sun, 11 Dec 2016 20:51:07 GMT davidb [31194]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/WhitelistBloomFilter.java (modified)

    Serialize in and out methods added

Sun, 11 Dec 2016 20:32:50 GMT davidb [31193]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/wcsa-whitelist1.csv.gz (added)

    Peter's white-list file

Fri, 09 Dec 2016 09:29:13 GMT ak19 [31192]
    * main/trunk/greenstone2/perllib/oaiinfo.pm (modified)

    Replacing unnecessary functions and removing unused functions.

Fri, 09 Dec 2016 09:18:32 GMT ak19 [31191]
    * main/trunk/greenstone2/perllib/inexport.pm (modified)
    * main/trunk/greenstone2/perllib/oaiinfo.pm (modified)

    Correction to previous commit.

Fri, 09 Dec 2016 08:37:52 GMT ak19 [31190]
    * main/trunk/gli/src/org/greenstone/gatherer/collection/CollectionManager.java (modified)
    * main/trunk/greenstone2/bin/script/activate.pl (modified)
    * main/trunk/greenstone2/perllib/basebuilder.pm (modified)
    * main/trunk/greenstone2/perllib/inexport.pm (modified)
    * main/trunk/greenstone2/perllib/oaiinfo.pm (added)

    First major commit to do with the new oaiinfo db that keeps track of ...

Fri, 09 Dec 2016 08:28:50 GMT ak19 [31189]
    * main/trunk/greenstone2/perllib/inexport.pm (modified)

    Corrections and cleanups ahead of major commit to do with the new ...

Fri, 09 Dec 2016 08:24:55 GMT ak19 [31188]
    * main/trunk/greenstone2/perllib/DBDrivers/BaseDBDriver.pm (modified)
    * main/trunk/greenstone2/perllib/DBDrivers/GDBM.pm (modified)
    * main/trunk/greenstone2/perllib/DBDrivers/JDBM.pm (modified)
    * main/trunk/greenstone2/perllib/dbutil.pm (modified)
    * main/trunk/greenstone2/perllib/dbutil/jdbm.pm (modified)

    This commit is related to but not specific to the upcoming commit to ...

Fri, 09 Dec 2016 08:21:29 GMT ak19 [31187]
    * main/trunk/greenstone2/perllib/FileUtils.pm (modified)
    * main/trunk/greenstone2/perllib/arcinfo.pm (modified)
    * main/trunk/greenstone2/perllib/util.pm (modified)

    Useful changes not specifically related to upcoming oaiinfo db ...

Thu, 08 Dec 2016 18:26:05 GMT Georgiy Litvinov [31186]
    * main/trunk/greenstone3/web/interfaces/default/transform/gslib.xsl (modified)

    Removed useless search param from previous commit

Thu, 08 Dec 2016 15:37:47 GMT Georgiy Litvinov [31185]
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/LibraryServlet.java (modified)

    Remove both s1.collection s1.group from cache in case either of them ...

Wed, 07 Dec 2016 20:21:25 GMT davidb [31184]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-RUN-MASTER-SPARK-GEN-WHITELIST.sh (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/_RUN.sh (modified)

    New provision to run different main classes in _RUN.sh; New top-level ...

Wed, 07 Dec 2016 20:19:36 GMT davidb [31183]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/.classpath (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/.settings/org.eclipse.jdt.core.prefs (modified)

    Bump up to project using Java 1.8

Wed, 07 Dec 2016 13:03:05 GMT Georgiy Litvinov [31182]
    * main/trunk/greenstone3/web/interfaces/default/transform/gslib.xsl (modified)

    Added xslt code to search by CCS in group from group page

Wed, 07 Dec 2016 13:02:55 GMT Georgiy Litvinov [31181]
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/service/CollectionGroups.java (modified)
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/service/CrossCollectionSearch.java (modified)

    Added java code to use groups in cross collection search

Wed, 07 Dec 2016 02:44:54 GMT ak19 [31180]
    * main/trunk/greenstone2/macros/catalan2.dm (modified)

    Catalan language translations for the auxiliary module of GS2 ...

Mon, 05 Dec 2016 00:42:21 GMT kjdon [31179]
    * main/trunk/greenstone3/web/interfaces/default/transform/config_format.xsl (modified)

    adding in gsf:space template, useful when you want to force a space ...

Mon, 05 Dec 2016 00:39:59 GMT kjdon [31178]
    * main/trunk/greenstone3/web/interfaces/default/transform/query-common.xsl (modified)

    having selected='' is invalid for HTML, need to have ...

Sat, 03 Dec 2016 08:23:51 GMT davidb [31177]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/pom.xml (modified)

    Adding in Google jar that supports Bloom filters

Sat, 03 Dec 2016 08:16:38 GMT davidb [31176]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

    Support added for producing whitelist word count

Sat, 03 Dec 2016 08:15:52 GMT davidb [31175]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/DictionaryWhitelist.java (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeWordStreamFlatmap.java (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForWhitelist.java (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/WhitelistBloomFilter.java (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/WhitelistHashmap.java (added)

    Trial to find memory difference between Hashmap and Bloom filters

Sat, 03 Dec 2016 01:16:01 GMT davidb [31174]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/FULL-EF-HDFS-extra10-njp-missing.sh (added)

    One of the last scripts developed for getting ef dataset into HDFS

Sat, 03 Dec 2016 01:14:20 GMT davidb [31173]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-aeu.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-bc.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-caia.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-chi.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-coo.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-coo1.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-dul1.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-emu.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-gri.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-hvd.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-iau.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-ien.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-inu.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-keio.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-ku01.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-loc.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-mcg.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-mdp.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-miua.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-miun.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-mmet.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-mou.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-nc01.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-ncs1.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-njp.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-nnc1.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-nnc2.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-nyp.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-osu.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-psia.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-pst.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-pur1.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-txa.txt (added)
    * 
other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-uc1-filename.txt (added) * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-uc1.txt (added) * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-uc2.txt (added) * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-ucm.txt (added) * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-ucw.txt (added) * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-udel.txt (added) * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-ufl1.txt (added) * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-ufl2.txt (added) * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-uiuc.txt (added) * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-uiug.txt (added) * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-uiuo.txt (added) * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-uma.txt (added) * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-umn.txt (added) * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-usu.txt (added) * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-uva.txt (added) * 
other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-wau.txt (added) * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-wu.txt (added) * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-yale.txt (added) * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/file-size-local/ef-full-yul.txt (added) individual file sizes per top-level folder Fri, 02 Dec 2016 20:40:17 GMT davidb [31172] * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/FULL-FILE-SIZE-COUNT.sh (added) to help track down missing files in HDFS copy Thu, 01 Dec 2016 21:15:59 GMT davidb [31171] * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/FILE-SIZE-CHECK-SUBFOLDERS.pl (added) Util to help find where missing files are Thu, 01 Dec 2016 21:15:25 GMT davidb [31170] * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/PAIRTREE-TL-TARGET-DEPTH2-FOREACH-DEPTH3-HDFS-PUT.sh (added) Targeted sub-dir copy Thu, 01 Dec 2016 21:14:47 GMT davidb [31169] * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/FILE-SIZE-CHECK.sh (modified) * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/FULL-EF-HDFS.sh (modified) Improved logic Thu, 01 Dec 2016 07:13:03 GMT Georgiy Litvinov [31168] * gs3-extensions/solr/trunk/src/perllib/solrutil.pm (modified) Increased java heap limit to prevent Out of memory error