#
# ChangeLog for other-projects/hathitrust
#
# Generated by Trac 1.4.2
# 2024-03-29T05:47:01+13:00

Tue, 16 Jan 2018 09:39:16 GMT davidb [32106]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeJSON.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeJSONList.java (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeUtil.java (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngestJSONFilelist.java (added)

    Rekindle ability to process a json-filelist.txt using Spark

Sat, 13 Jan 2018 09:49:32 GMT davidb [32104]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/serial-ef-solr.properties (added)

    Serial version

Sat, 13 Jan 2018 08:19:28 GMT davidb [32103]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSerialSolrIngest.java (modified)

    Tidy up of output

Sat, 13 Jan 2018 08:18:08 GMT davidb [32102]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/SERIAL-INGEST.sh (added)

    Version to project local JSON list serially

Fri, 12 Jan 2018 05:16:31 GMT davidb [32101]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ClusterFileIO.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSerialSolrIngest.java (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/UniversalPOSLangMap.java (modified)

    Tweaks to allow serial ingest to run

Sat, 08 Jul 2017 09:04:06 GMT davidb [31786]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerPageJSONFlatmap.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

    extra param in call; change to case-folding _htrctokentext

Sat, 08 Jul 2017 07:32:54 GMT davidb [31785]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/remote-solr-start-all.sh (modified)

    Change to allow solr command to optionally issue 'restart' instead of ...

Fri, 07 Jul 2017 11:33:37 GMT davidb [31784]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeJSON.java (modified)

    Output to highlight skipping per-page indexing

Fri, 07 Jul 2017 11:31:25 GMT davidb [31783]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeJSON.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

    Solr Doc Add changed to include volume-level metadata within every ...

Fri, 07 Jul 2017 11:13:46 GMT davidb [31782]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/CONF/htrc_configs.tar.gz (modified)

    more careful separation into field types htrcstring and htrcstrings

Fri, 07 Jul 2017 04:11:22 GMT davidb [31779]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

    Change in how POS words are checked against the Whitelist. ...

Sun, 02 Jul 2017 04:43:06 GMT davidb [31772]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/tmp (deleted)

    Accidentally committed

Thu, 18 May 2017 10:29:16 GMT davidb [31693]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.html (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.js (modified)

    Changes to workset information is pulled from sparql-endpoint for ...

Thu, 11 May 2017 10:48:39 GMT davidb [31677]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

    Suppress processing governmentDocument for now in JSON metadata ...

Thu, 11 May 2017 10:47:16 GMT davidb [31676]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/FULL-YARN-KILL-EXAMPLE.sh (added)

    To make it easier to remember how to kill off a YARN task at the ...

Thu, 11 May 2017 10:19:06 GMT davidb [31675]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeJSON.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

    More careful set of metadata fields indexed

Thu, 04 May 2017 02:43:27 GMT davidb [31645]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.js (modified)

    Some initial work on drawing in workset info from sparql-endpoint. ...

Thu, 20 Apr 2017 13:03:09 GMT davidb [31626]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/WEB-INF/web.xml (modified)

    Links to blog entries added

Thu, 20 Apr 2017 13:02:40 GMT davidb [31625]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/stream-query.html (modified)

    Tidy up

Thu, 20 Apr 2017 13:02:28 GMT davidb [31624]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.html (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.js (modified)

    Combined volume md and full-text page searching

Thu, 20 Apr 2017 10:59:24 GMT davidb [31623]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.html (modified)

    Removed commented out static HTML POS section

Thu, 20 Apr 2017 10:55:39 GMT davidb [31622]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/WEB-INF/web.xml (modified)

    Adding in CORS support to Solr

Thu, 20 Apr 2017 09:35:48 GMT davidb [31621]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.html (modified)

    Step towards making HTML/JS work with on different server, with AJAX ...

Thu, 20 Apr 2017 08:54:40 GMT davidb [31619]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.js (modified)

    Further minor tidy up

Thu, 20 Apr 2017 08:41:04 GMT davidb [31618]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.js (modified)

    Code tidy up

Thu, 20 Apr 2017 06:17:56 GMT davidb [31614]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.js (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/stream-index.js (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/stream-query.html (modified)

    Separate off stream query page

Thu, 20 Apr 2017 05:55:58 GMT davidb [31613]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/main.css (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.html (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.js (modified)

    Multiple word support in POS search box. Tidy up of anchor for search ...

Thu, 13 Apr 2017 09:36:19 GMT davidb [31601]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/bootstrap.css (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/bootstrap.js (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/bowser.js (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/flat.css (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/font-awesome.css (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/ga-download-tracking.js (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/hathi.jpg (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/highlight.js (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/htrcwarnings.js (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1 (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/AUTHORS.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/LICENSE.txt (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/external (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/external/jquery (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/external/jquery/jquery.js (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/images (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/images/ui-bg_diagonals-thick_18_b81900_40x40.png (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/images/ui-bg_diagonals-thick_20_666666_40x40.png (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/images/ui-bg_glass_100_f6f6f6_1x400.png (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/images/ui-bg_glass_100_fdf5ce_1x400.png (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/images/ui-bg_glass_65_ffffff_1x400.png (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/images/ui-bg_gloss-wave_35_f6a828_500x100.png (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/images/ui-bg_highlight-soft_100_eeeeee_1x100.png (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/images/ui-bg_highlight-soft_75_ffe45c_1x100.png (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/images/ui-icons_222222_256x240.png (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/images/ui-icons_228ef1_256x240.png (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/images/ui-icons_ef8c08_256x240.png (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/images/ui-icons_ffd27a_256x240.png (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/images/ui-icons_ffffff_256x240.png (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/index.html (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/jquery-ui.css (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/jquery-ui.js (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/jquery-ui.min.css (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/jquery-ui.min.js (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/jquery-ui.structure.css (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/jquery-ui.structure.min.css (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/jquery-ui.theme.css (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/jquery-ui.theme.min.css (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery-ui-lightness-1.12.1/package.json (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/jquery.js (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/main.css (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/stupidtable.js (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/tmp (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/tmp/jquery-ui-1.12.1.lightness.zip (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/tomorrow.css (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/HTRC_Mashup--Home_files/uploadws.js (added)

    To get the look and feel of the HTRC portal web site, supporting ...

Tue, 11 Apr 2017 11:44:07 GMT davidb [31598]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/FULL-YARN-INGEST.sh (added)

    Easier to remember what to do

Tue, 11 Apr 2017 11:41:07 GMT davidb [31597]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeJSON.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

    Additional _s and _ss fields to help with faceting. Temporarily ...

Mon, 03 Apr 2017 11:23:59 GMT davidb [31571]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.html (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.js (modified)

    Simple search-all-langs feature added

Mon, 03 Apr 2017 11:04:19 GMT davidb [31570]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.css (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.html (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.js (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/stream-query.html (added)

    Solr-stream based search

Mon, 20 Mar 2017 06:56:14 GMT davidb [31524]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.css (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.html (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.js (modified)

    Main changes: Fix for page/seqnum; group by id; show-hide other ...

Mon, 13 Mar 2017 08:00:16 GMT davidb [31510]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

    Turns out some language fields can be empty. Need to test for this

Mon, 13 Mar 2017 07:50:06 GMT davidb [31509]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/UniversalPOSLangMap.java (modified)

    LangPos determination changed to lock into first match, rather than ...

Mon, 13 Mar 2017 03:02:07 GMT davidb [31506]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/UniversalPOSLangMap.java (modified)

    Forgot to add initialization line. Doh!

Mon, 13 Mar 2017 02:31:40 GMT davidb [31505]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeJSON.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

    Added in storing of top-level document metadata as separate solr-doc

Mon, 13 Mar 2017 02:31:09 GMT davidb [31504]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerPageJSONMap.java (modified)

    Adjusted call to work with added parameter

Mon, 13 Mar 2017 02:30:13 GMT davidb [31503]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/UniversalPOSLangMap.java (modified)

    Monitor for missing POS keys, and print out details first time each ...

Mon, 13 Mar 2017 01:16:15 GMT davidb [31502]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (modified)

    Comment out section, useful for controlling a smaller run

Mon, 13 Mar 2017 01:09:39 GMT davidb [31501]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/__PerPageJSONForeach.java (deleted)

    No longer used

Mon, 13 Mar 2017 00:56:36 GMT davidb [31500]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeJSON.java (modified)

    Synchronize on reading in of white-list and universal-lang-pos

Mon, 13 Mar 2017 00:54:54 GMT davidb [31499]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

    Better exception handling

Mon, 13 Mar 2017 00:53:24 GMT davidb [31498]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/UniversalPOSLangMap.java (modified)

    Tidy up on print statements

Mon, 06 Mar 2017 10:18:58 GMT davidb [31466]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/remote-solr-delete-collection.sh (modified)

    Fix to work out solr_host rather than assume it is gc0

Mon, 06 Mar 2017 10:18:24 GMT davidb [31465]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/remote-solr-start-all.sh (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SETUP.bash (modified)

    Adjustment to run solr with more memory

Mon, 06 Mar 2017 10:10:21 GMT davidb [31464]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/remote-solr-delete-collection.sh (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/remote-solr-init-collection.sh (added)

    More general version of script that lets you specify the collection ...

Sun, 05 Mar 2017 06:44:56 GMT davidb [31455]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/_remote-solr-delete-full-ef-collection.sh (moved)

    deprecated

Sun, 05 Mar 2017 06:41:35 GMT davidb [31454]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/_remote-solr-init-full-ef-collection.sh (moved)

    Deprecated

Thu, 02 Mar 2017 20:29:44 GMT davidb [31453]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/UniversalPOSLangMap.java (modified)

    Added size() method

Thu, 02 Mar 2017 10:31:07 GMT davidb [31452]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-RUN-MASTER-SPARK-CATALOG-LANG-COUNT.sh (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-RUN-YARN-SPARK-CATALOG-LANG-COUNT.sh (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/scripts/FULL-RUN-YARN-SPARK.sh (added)

    Additional Spark progs to run

Thu, 02 Mar 2017 10:28:38 GMT davidb [31451]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeJSON.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (modified)

    shift to using solr-base-url and a specified solr-collection

Tue, 28 Feb 2017 10:37:29 GMT davidb [31450]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeJSON.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (modified)

    Some debugging output to help see what is happening with ...

Wed, 08 Feb 2017 21:40:52 GMT davidb [31393]
    * other-projects/hathitrust/wcsa/vol-checker/WebContent/index.html (modified)

    Fixed typo

Wed, 08 Feb 2017 21:40:39 GMT davidb [31392]
    * other-projects/hathitrust/wcsa/vol-checker/WebContent/HT-HTRC_Mashup.user.js (modified)

    Support for Catalog page added

Thu, 02 Feb 2017 10:40:09 GMT davidb [31385]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.html (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.js (modified)

    Next and previous pages

Thu, 02 Feb 2017 08:41:53 GMT davidb [31384]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/INSTALL.sh (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/UPDATE-SERVLET.sh (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.html (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.js (modified)

    After next phase of development

Thu, 02 Feb 2017 06:01:55 GMT davidb [31383]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/INSTALL.sh (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/WEB-INF (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/WEB-INF/web.xml (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/admin.html (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/etc (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/etc/jetty.xml (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/etc/realm.properties (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.css (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.html (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/index.js (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/iso-639-1.js (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/luke_lang_mappings.html (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/web-portal/luke_lang_mappings.js (added)

    Files for initial functioning search page

Tue, 31 Jan 2017 11:16:08 GMT davidb [31378]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)

    Fixed loop limit test

Tue, 31 Jan 2017 09:55:17 GMT davidb [31377]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/pom.xml (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/UniversalPOSLangMap.java (modified)

    Switch to using URI not string

Tue, 31 Jan 2017 08:40:43 GMT davidb [31376]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/opennlp-lang-pos-mappings (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/opennlp-lang-pos-mappings/da-ddt.map (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/opennlp-lang-pos-mappings/de-tiger.map (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/opennlp-lang-pos-mappings/en-ptb.map (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/opennlp-lang-pos-mappings/nl-alpino.map (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/opennlp-lang-pos-mappings/pt-bosque.map (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/opennlp-lang-pos-mappings/se-talbanken.map (added)

    Universal language mappings for opennlp POS model tags

Tue, 31 Jan 2017 08:35:50 GMT davidb [31375]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/POSString.java (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerPageJSONFlatmap.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeJSON.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/SolrDocJSON.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/UniversalPOSLangMap.java (added)

    Initial cut at including POS information to solr index

Mon, 30 Jan 2017 11:22:39 GMT davidb [31374]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (modified)

    simplified command line usage

Mon, 30 Jan 2017 11:08:35 GMT davidb [31373]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/remote-solr-setup-local-disk-all.sh (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/remote-solr-start-all.sh (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SETUP.bash (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SETUP/setup-solr.bash (modified)

    Changes made to operate on solr1 and solr2 boxes

Mon, 30 Jan 2017 11:06:39 GMT davidb [31372]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeJSON.java (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (modified)

    Reworked to use sequenceFiles

Mon, 30 Jan 2017 11:06:08 GMT davidb [31371]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForCatalogLangCount.java (modified)

    Trying to get saveAsSequenceFile working

Mon, 30 Jan 2017 10:30:16 GMT davidb [31370]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/CONF/htrc_configs.tar.gz (modified)

    Fixed incorrect version number. Using htrcstring so field values not ...

Sun, 29 Jan 2017 21:34:01 GMT davidb [31369]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForCatalogLangCount.java (modified)

    Trial new save

Sun, 29 Jan 2017 21:02:27 GMT davidb [31368]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForCatalogLangCount.java (modified)

    downsample-100 added

Sun, 29 Jan 2017 10:39:41 GMT davidb [31367]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/remote-zookeeper-start.sh (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SCRIPTS/remote-zookeeper-stop.sh (modified)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/SETUP.bash (modified)

    Changes to work with solr1 and solr2

Sun, 29 Jan 2017 09:19:47 GMT davidb [31366]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/gslis-cluster/GET-PACKAGES-SOLR.sh (modified)

    Updated to latest released version of Solr

Sun, 29 Jan 2017 08:51:30 GMT davidb [31365]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForCatalogLangCount.java (modified)

    Quick code added to downsample

Fri, 27 Jan 2017 20:57:21 GMT davidb [31364]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForCatalogLangCount.java (modified)

    removed sample() line

Fri, 27 Jan 2017 08:24:16 GMT davidb [31363]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForCatalogLangCount.java (modified)

    Control num of partitions on sort

Fri, 27 Jan 2017 03:38:08 GMT davidb [31362]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForCatalogLangCount.java (modified)

    use Spark sample() to make for smaller test with Sequence files

Thu, 26 Jan 2017 21:26:16 GMT davidb [31361]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForCatalogLangCount.java (modified)

    Change from String to Text

Thu, 26 Jan 2017 10:50:19 GMT davidb [31360]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeCatalogLangSequenceFileMap.java (modified)

    Seems to be Text class not a String class coming out of the ...

Thu, 26 Jan 2017 10:08:16 GMT davidb [31359]
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/PerVolumeCatalogLangSequenceFileMap.java (added)
    * other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForCatalogLangCount.java (modified)

    Changed over to use sequenceFiles as input

Tue, 24 Jan 2017 11:04:07 GMT davidb [31358]
    * other-projects/hathitrust/wcsa/vol-checker/src/org/hathitrust/extractedfeatures/VolumeCheck.java (modified)

    Make workset download save as file

Tue, 24 Jan 2017 10:31:32 GMT davidb [31357]
    * other-projects/hathitrust/wcsa/vol-checker/src/org/hathitrust/extractedfeatures/VolumeCheck.java (modified)

    Ensure all output sent to browser

Tue, 24 Jan 2017 10:12:00 GMT davidb [31356]
    * other-projects/hathitrust/wcsa/vol-checker/src/org/hathitrust/extractedfeatures/VolumeCheck.java (modified)

    Tidy up on appending missing volumes

Tue, 24 Jan 2017 10:08:13 GMT davidb [31355]
    * other-projects/hathitrust/wcsa/vol-checker/src/org/hathitrust/extractedfeatures/VolumeCheck.java (modified)

    Changed to using containsKey rather than get to avoid null pointer ...

Tue, 24 Jan 2017 09:43:53 GMT davidb [31354] * other-projects/hathitrust/wcsa/vol-checker/src/org/hathitrust/extractedfeatures/VolumeCheck.java (modified) import tidy-up Tue, 24 Jan 2017 09:28:23 GMT davidb [31353] * other-projects/hathitrust/wcsa/vol-checker/src/org/hathitrust/extractedfeatures/VolumeCheck.java (modified) Added debug print statement Tue, 24 Jan 2017 08:54:12 GMT davidb [31352] * other-projects/hathitrust/wcsa/vol-checker/src/org/hathitrust/extractedfeatures/VolumeCheck.java (modified) collection-to-workset now with id-check added to filter Tue, 24 Jan 2017 08:41:55 GMT davidb [31351] * other-projects/hathitrust/wcsa/vol-checker/WebContent/htrc-mashup.ppsx (added) * other-projects/hathitrust/wcsa/vol-checker/WebContent/htrc-mashup.pptx (added) PowerPoint slides showing mashup features Tue, 24 Jan 2017 08:41:15 GMT davidb [31350] * other-projects/hathitrust/wcsa/vol-checker/WebContent/HT-HTRC_Mashup.user.js (modified) Use new 'convert-col' action Tue, 24 Jan 2017 08:39:56 GMT davidb [31349] * other-projects/hathitrust/wcsa/vol-checker/WebContent/index.html (modified) Change over to proxied main web server Tue, 24 Jan 2017 01:01:55 GMT davidb [31348] * other-projects/hathitrust/wcsa/vol-checker/src/org/hathitrust/extractedfeatures/VolumeCheck.java (modified) Restructure of how convert-to works Tue, 24 Jan 2017 00:40:12 GMT davidb [31347] * other-projects/hathitrust/wcsa/vol-checker/src/org/hathitrust/extractedfeatures/VolumeCheck.java (modified) First stage of developing HT collection to HTRC workset. Code to ... 
Mon, 23 Jan 2017 11:11:15 GMT davidb [31342] * other-projects/hathitrust/wcsa/vol-checker/WebContent/HT-HTRC_Mashup.user.js (modified) Some initial progress on collection to workset conversion Mon, 23 Jan 2017 09:21:38 GMT davidb [31341] * other-projects/hathitrust/wcsa/vol-checker/src/org/hathitrust/extractedfeatures/VolumeCheck.java (modified) Code tidy-up Mon, 23 Jan 2017 09:16:06 GMT davidb [31340] * other-projects/hathitrust/wcsa/vol-checker/src/org/hathitrust/extractedfeatures/VolumeCheck.java (modified) Test worked OK. Removing debug code Mon, 23 Jan 2017 09:00:15 GMT davidb [31339] * other-projects/hathitrust/wcsa/vol-checker/src/org/hathitrust/extractedfeatures/VolumeCheck.java (modified) Debugging statement Mon, 23 Jan 2017 08:38:29 GMT davidb [31338] * other-projects/hathitrust/wcsa/vol-checker/src/org/hathitrust/extractedfeatures/VolumeCheck.java (modified) additional close() Mon, 23 Jan 2017 08:24:31 GMT davidb [31337] * other-projects/hathitrust/wcsa/vol-checker/src/org/hathitrust/extractedfeatures/VolumeCheck.java (modified) Output the downloaded rsync file Mon, 23 Jan 2017 07:46:54 GMT davidb [31336] * other-projects/hathitrust/wcsa/vol-checker/src/org/hathitrust/extractedfeatures/VolumeCheck.java (modified) Changes in response to testing Mon, 23 Jan 2017 07:37:32 GMT davidb [31335] * other-projects/hathitrust/wcsa/vol-checker/src/org/hathitrust/extractedfeatures/VolumeCheck.java (modified) Too expensive to hold pairtree filename in hashmap, so change to ... Mon, 23 Jan 2017 05:03:51 GMT davidb [31334] * other-projects/hathitrust/wcsa/vol-checker/src/org/hathitrust/extractedfeatures/VolumeCheck.java (modified) Initial cut at rsync download Mon, 23 Jan 2017 03:03:20 GMT davidb [31333] * other-projects/hathitrust/wcsa/vol-checker/WebContent/index.html (modified) Minor word tweak