#
# ChangeLog for other-projects/hathitrust/wcsa
#
# Generated by Trac 1.4.2
# 2024-03-29T03:40:43+13:00

Mon, 07 Nov 2016 10:27:52 GMT davidb [31078]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/CONF (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/CONF/htrc_configs.tar.gz (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/CONF/zoo.cfg (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/remote-solr-start-all.sh (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/remote-solr-stop-all.sh (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/remote-spark-start-all.sh (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/remote-spark-stop-all.sh (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/remote-zookeeper-start.sh (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/remote-zookeeper-stop.sh (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SETUP (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SETUP.bash (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SETUP/setup-solr.bash (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SETUP/setup-spark.bash (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SETUP/setup-zookeeper.bash (added)

	Some setup files and scripts to make running Spark and Solr easier on ...

Mon, 07 Nov 2016 09:34:31 GMT davidb [31077]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/vagrant-spark-hdfs-cluster/Vagrantfile (modified)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/vagrant-spark-hdfs-cluster/manifests/base-hadoop.pp (modified)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/vagrant-spark-hdfs-cluster/modules/hadoop/manifests/init.pp (modified)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/vagrant-spark-hdfs-cluster/modules/hadoop/templates/hadoop-env.sh (modified)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/vagrant-spark-hdfs-cluster/modules/hadoop/templates/hdfs-site.xml (modified)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/vagrant-spark-hdfs-cluster/modules/hadoop/templates/masters (modified)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/vagrant-spark-hdfs-cluster/modules/hadoop/templates/setup-hadoop.bash (modified)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/vagrant-spark-hdfs-cluster/modules/spark/manifests/init.pp (modified)

	Move up to JDK1.8. Tidy up of Vagrant machine names. Support for ...

Sun, 06 Nov 2016 20:09:03 GMT davidb [31065]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-DOWNLOAD-EVERY-N.sh (modified)

	Additional echo output

Sat, 05 Nov 2016 02:04:01 GMT davidb [31062]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-DOWNLOAD-EVERY-N.sh (modified)

	Added in -W option so check-sum calculation is skipped

Thu, 03 Nov 2016 22:01:29 GMT davidb [31058]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-DOWNLOAD-EVERY-N.sh (modified)

	echo for additional information added

Thu, 03 Nov 2016 21:59:03 GMT davidb [31057]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/_RUN.sh (modified)

	Tweak to jps output formatting

Thu, 03 Nov 2016 01:26:13 GMT davidb [31053]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-DOWNLOAD-EVERY-N.sh (modified)

	Addition of second argument, optional, for where to save the files

Thu, 03 Nov 2016 00:46:49 GMT davidb [31051]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/packages/GET-PACKAGES.sh (modified)

	Added in JDK to list of possible packages needed

Wed, 02 Nov 2016 09:52:43 GMT davidb [31046]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/ef-solr.properties (modified)

	Added property to control how severe a JSON IO problem is

Wed, 02 Nov 2016 08:34:47 GMT davidb [31045]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/JSONClusterFileIO.java (modified)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerPageJSONFlatmap.java (modified)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerPageJSONMap.java (modified)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (modified)

	More careful treatment of what to do when a JSON file isn't there

Wed, 02 Nov 2016 08:30:49 GMT davidb [31044]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/_RUN.sh (modified)

	Fixed up error when output_dir is empty

Wed, 02 Nov 2016 08:24:32 GMT davidb [31043]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-RUN-MASTER-SPARK.sh (added)

	Version for processing full EF set

Wed, 02 Nov 2016 07:18:22 GMT davidb [31042]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/PD-RUN-MASTER-LOCAL.sh (moved)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/PD-RUN-MASTER-SPARK.sh (moved)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/_RUN.sh (moved)

	Name changes, preparing the way for FULL-RUN versions

Wed, 02 Nov 2016 07:07:40 GMT davidb [31041]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (modified)

	Test needs to be more careful if -read-only specified

Wed, 02 Nov 2016 04:20:52 GMT davidb [31036]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/RUN-PD-MASTER-LOCAL.bash (moved)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/RUN-PD-MASTER-SPARK.bash (moved)

	Renaming to prepare way for YARN version of script

Wed, 02 Nov 2016 04:16:04 GMT davidb [31035]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-DOWNLOAD-EVERY-N.sh (modified)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-SELECT-EVERY-N.sh (modified)

	Changes after testing scripts

Wed, 02 Nov 2016 04:10:29 GMT davidb [31034]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-DOWNLOAD-EVERY-N.sh (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-GET-FILE-LIST.sh (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-SELECT-EVERY-N.sh (added)

	Development of scripts for working with Full EF dataset

Wed, 02 Nov 2016 04:10:17 GMT davidb [31033]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/PD-GET-FILE-LIST.sh (moved)

	Development of scripts for working with Full EF dataset

Wed, 02 Nov 2016 01:28:39 GMT davidb [31030]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerPageJSONFlatmap.java (modified)

	Tweak to some verbosity level 2 printing

Wed, 02 Nov 2016 01:19:23 GMT davidb [31029]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/ef-solr.properties (modified)

	Newline at end of file added

Wed, 02 Nov 2016 01:17:45 GMT davidb [31028]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/_RUN.bash (modified)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/ef-solr.properties (modified)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerPageJSONMap.java (modified)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (modified)

	Support for randomly choosing Solr endpoints added in

Wed, 02 Nov 2016 00:06:15 GMT davidb [31027]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (modified)

	Mixed typo in property name used

Wed, 02 Nov 2016 00:01:16 GMT davidb [31026]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (modified)

	Corrected flag setting

Tue, 01 Nov 2016 22:59:37 GMT davidb [31025]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/ef-solr.properties (modified)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (modified)

	Use property process-json-mode to determine which sort of Spark ...

Tue, 01 Nov 2016 22:37:07 GMT davidb [31024]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/ef-solr.properties (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (modified)

	Support for Java properties file

Tue, 01 Nov 2016 01:14:51 GMT davidb [31022]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal-trunk (deleted)

	No longer used

Tue, 01 Nov 2016 01:14:21 GMT davidb [31021]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal (copied)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal-trunk (copied)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/trunk (deleted)

	Folder restructure to remove 'trunk' part

Tue, 01 Nov 2016 01:13:11 GMT davidb [31020]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/vagrant-solr-cluster-trunk (deleted)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/vagrant-spark-hdfs-cluster-trunk (deleted)

	No longer used

Tue, 01 Nov 2016 01:12:15 GMT davidb [31019]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/vagrant-solr-cluster (moved)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/vagrant-spark-hdfs-cluster (moved)

	Part 2 of two-step folder restructure

Tue, 01 Nov 2016 01:10:29 GMT davidb [31018]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/vagrant-solr-cluster-trunk (moved)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/vagrant-spark-hdfs-cluster-trunk (moved)

	Part 1 of two-step folder restructure

Tue, 01 Nov 2016 01:08:24 GMT davidb [31017]
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal (moved)

	Moved to correct position

Tue, 01 Nov 2016 01:06:05 GMT davidb [31015]
	* other-projects/hathitrust/wcsa (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk (added)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest (moved)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/vagrant-solr-cluster (moved)
	* other-projects/hathitrust/wcsa/extracted-features-solr/trunk/vagrant-spark-hdfs-cluster (moved)
	* other-projects/hathitrust/wcsa/extracted-features-solr/web-portal (moved)

	Restructuring of projects into one