#
# ChangeLog for other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts
#
# Generated by Trac 1.4.2
# 2024-05-16T07:40:14+12:00

Sat, 13 Jan 2018 08:18:08 GMT davidb [32102]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/SERIAL-INGEST.sh (added)

	Version to project local JSON list serially

Thu, 02 Mar 2017 10:31:07 GMT davidb [31452]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-RUN-MASTER-SPARK-CATALOG-LANG-COUNT.sh (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-RUN-YARN-SPARK-CATALOG-LANG-COUNT.sh (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-RUN-YARN-SPARK.sh (added)

	Additional Spark progs to run

Tue, 27 Dec 2016 05:54:09 GMT davidb [31268]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-RUN-MASTER-SPARK.sh (modified)

	Adjustments to memory allocation in response to test runs on 10% of ...

Tue, 20 Dec 2016 11:12:10 GMT davidb [31260]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-RUN-MASTER-SPARK-LANG-COUNT.sh (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeLangStreamFlatmap.java (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForLangCount.java (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForPOSCount.java (modified)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

	Language counting

Tue, 20 Dec 2016 10:39:40 GMT davidb [31258]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-RUN-MASTER-SPARK-POS-COUNT.sh (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumePOSStreamFlatmap.java (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForPOSCount.java (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

	POS Label count, similar to Whitelist word count

Wed, 07 Dec 2016 20:21:25 GMT davidb [31184]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-RUN-MASTER-SPARK-GEN-WHITELIST.sh (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/_RUN.sh (modified)

	New provision to run different main classes in _RUN.sh; New top-level ...

Thu, 10 Nov 2016 10:15:32 GMT davidb [31102]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-QUERY.sh (added)

	Command line way of running a Solr test query

Thu, 10 Nov 2016 03:20:02 GMT davidb [31093]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-RUN-MASTER-SPARK.sh (modified)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/_RUN.sh (modified)

	Changes triggered by running on gsliscluster1

Sun, 06 Nov 2016 20:09:03 GMT davidb [31065]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-DOWNLOAD-EVERY-N.sh (modified)

	Additional echo output

Sat, 05 Nov 2016 02:04:01 GMT davidb [31062]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-DOWNLOAD-EVERY-N.sh (modified)

	Added in -W option so check-sum calculation is skipped

Thu, 03 Nov 2016 22:01:29 GMT davidb [31058]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-DOWNLOAD-EVERY-N.sh (modified)

	echo for additional information added

Thu, 03 Nov 2016 21:59:03 GMT davidb [31057]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/_RUN.sh (modified)

	Tweak to jps output formatting

Thu, 03 Nov 2016 01:26:13 GMT davidb [31053]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-DOWNLOAD-EVERY-N.sh (modified)

	Addition of second argument, optional, for where to save the files

Wed, 02 Nov 2016 08:30:49 GMT davidb [31044]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/_RUN.sh (modified)

	Fixed up error when output_dir is empty

Wed, 02 Nov 2016 08:24:32 GMT davidb [31043]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-RUN-MASTER-SPARK.sh (added)

	Version for processing full EF set

Wed, 02 Nov 2016 07:18:22 GMT davidb [31042]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/PD-RUN-MASTER-LOCAL.sh (moved)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/PD-RUN-MASTER-SPARK.sh (moved)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/_RUN.sh (moved)

	Name changes, preparing the way for FULL-RUN versions

Wed, 02 Nov 2016 04:16:04 GMT davidb [31035]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-DOWNLOAD-EVERY-N.sh (modified)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-SELECT-EVERY-N.sh (modified)

	Changes after testing scripts

Wed, 02 Nov 2016 04:10:29 GMT davidb [31034]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-DOWNLOAD-EVERY-N.sh (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-GET-FILE-LIST.sh (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-SELECT-EVERY-N.sh (added)

	Development of scripts for working with Full EF dataset

Wed, 02 Nov 2016 04:10:17 GMT davidb [31033]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/PD-GET-FILE-LIST.sh (moved)

	Development of scripts for working with Full EF dataset

Tue, 01 Nov 2016 01:06:05 GMT davidb [31015]
	* other-projects/hathitrust/wcsa (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest (moved)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/vagrant-solr-cluster (moved)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/vagrant-spark-hdfs-cluster (moved)
	* other-projects/hathitrust/wcsa/extracted-features-solr/web-portal (moved)

	Restructuring of projects into one

Tue, 25 Oct 2016 01:52:52 GMT davidb [30919]
	* other-projects/hathitrust/solr-extracted-features/trunk/scripts/PD-DOWNLOAD-EVERY-1000.sh (modified)
	* other-projects/hathitrust/solr-extracted-features/trunk/scripts/PD-DOWNLOAD-EVERY-10000.sh (modified)

	More consistent naming of folders used