#
# ChangeLog for gs3-extensions
#
# Generated by Trac 1.4.2
# 2024-04-28T18:55:34+12:00

Thu, 26 Sep 2019 08:39:38 GMT ak19 [33527]
  * gs3-extensions/maori-lang-detection/hdfs-cc-work (moved)

  Name change for folder

Thu, 26 Sep 2019 08:38:14 GMT ak19 [33526]
  * gs3-extensions/maori-lang-detection/bin/script/get_Maori_WET_records_from_CCSep2018_on.sh (deleted)
  * gs3-extensions/maori-lang-detection/bin/script/get_maori_WET_records_for_crawl.sh (deleted)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/scripts/get_Maori_WET_records_from_CCSep2018_on.sh (modified)

  Moved hadoop related scripts from bin/script into hdfs-instructions

Thu, 26 Sep 2019 08:35:38 GMT ak19 [33525]
  * gs3-extensions/maori-lang-detection/hdfs-instructions/scripts/get_Maori_WET_records_from_CCSep2018_on.sh (moved)

  Rename before latest version

Thu, 26 Sep 2019 08:34:12 GMT ak19 [33524]
  * gs3-extensions/maori-lang-detection/hdfs-instructions/Readme.txt (modified)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/conf (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/conf/ia-hadoop-tools-pom.xml (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/conf/spark-defaults.conf.in (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/gitprojects (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/gitprojects/cc-index-table.tar (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/gitprojects/ia-hadoop-tools.tar (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/gitprojects/ia-web-commons.tar (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/jars (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/jars/aws-java-sdk-1.11.616.jar (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/jars/aws-java-sdk-1.7.4.jar (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/jars/guava.jar (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/jars/hadoop-aws-2.7.6.jar (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/patches (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/patches/CCIndexWarcExport.java (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/patches/CCIndexWarcExport.java.orig (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/scripts (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/scripts/GS_README (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/scripts/export_maori_index_csv.sh (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/scripts/export_maori_subset.sh (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/scripts/export_maori_subset_from_scratch.sh (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/scripts/get_Maori_WET_records_in_cc_from_Sep2018.sh (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/scripts/get_maori_WET_records_for_crawl.sh (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/scripts/limit10_export_index.sh (added)

  1. Further adjustments to documenting what we did to get things to ...

Thu, 26 Sep 2019 07:00:36 GMT ak19 [33523]
  * gs3-extensions/maori-lang-detection/bin/script/gen-all-dumps.sh (modified)

  Instructional comment

Thu, 26 Sep 2019 07:00:23 GMT ak19 [33522]
  * gs3-extensions/maori-lang-detection/bin/script/get_Maori_WET_records_from_CCSep2018_on.sh (modified)

  Some comments and an improvement

Tue, 24 Sep 2019 09:40:16 GMT ak19 [33519]
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/CCWETProcessor.java (modified)

  Code still writes out the global seedURLs.txt and regex-urlfilter.txt ...

Tue, 24 Sep 2019 09:13:47 GMT ak19 [33518]
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/CCWETProcessor.java (modified)

  Intermediate commit: got the seed urls file temporarily written out ...

Tue, 24 Sep 2019 08:30:40 GMT ak19 [33517]
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/CCWETProcessor.java (modified)
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/WETProcessor.java (modified)

  1. Blacklists were introduced so that too many instances of ...

Tue, 24 Sep 2019 08:14:16 GMT ak19 [33516]
  * gs3-extensions/maori-lang-detection/bin/script/gen-all-dumps.sh (added)

  Before I accidentally lose it, committing the script Dr Bainbridge ...

Tue, 24 Sep 2019 07:50:40 GMT ak19 [33515]
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/CCWETProcessor.java (modified)

  Removed an unused function

Tue, 24 Sep 2019 07:44:04 GMT ak19 [33514]
  * gs3-extensions/maori-lang-detection/hdfs-instructions (added)
  * gs3-extensions/maori-lang-detection/hdfs-instructions/Readme.txt (added)

  Committing README on starting off with the vagrant VM for hadoop- ...

Tue, 24 Sep 2019 07:15:01 GMT ak19 [33513]
  * gs3-extensions/maori-lang-detection/bin/script/get_Maori_WET_records_from_CCSep2018_on.sh (added)

  Higher level script that runs against each named crawl since Sep 2018 ...

Mon, 23 Sep 2019 11:16:28 GMT ak19 [33503]
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/CCWETProcessor.java (modified)
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/WETProcessor.java (modified)

  More efficient blacklisting/greylisting/whitelisting now by reading ...

Mon, 23 Sep 2019 11:11:29 GMT ak19 [33502]
  * gs3-extensions/maori-lang-detection/conf/url-blacklist-filter.txt (added)
  * gs3-extensions/maori-lang-detection/conf/url-greylist-filter.txt (added)

  Current url pattern blacklist and greylist filter files. Used by ...

Mon, 23 Sep 2019 09:28:06 GMT ak19 [33501]
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/CCWETProcessor.java (added)
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/WETProcessor.java (modified)

  Refactored code into 2 classes: The existing WETProcessor, which ...

Mon, 23 Sep 2019 05:59:07 GMT ak19 [33499]
  * gs3-extensions/maori-lang-detection/MoreReading/Vagrant-Spark-Hadoop.txt (modified)

  Explicitly adding in IAM policy configuration details instead of just ...

Mon, 23 Sep 2019 04:43:22 GMT ak19 [33498]
  * gs3-extensions/maori-lang-detection/bin/script/get_maori_WET_records_for_crawl.sh (modified)

  Corrections to script. Modified the tests checking for file/dir ...

Sun, 22 Sep 2019 09:17:48 GMT ak19 [33497]
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/WETProcessor.java (modified)

  First version of discard url filter file. Inefficient implementation. ...

Sun, 22 Sep 2019 07:23:28 GMT ak19 [33496]
  * gs3-extensions/maori-lang-detection/MoreReading/Vagrant-Spark-Hadoop.txt (modified)

  Minor changes to reading list file

Sun, 22 Sep 2019 07:19:36 GMT ak19 [33495]
  * gs3-extensions/maori-lang-detection/bin/script/get_maori_WET_records_for_crawl.sh (modified)

  Pruned out unused commands, added comments, marked unused variables ...

Sat, 21 Sep 2019 10:49:56 GMT ak19 [33494]
  * gs3-extensions/maori-lang-detection/bin/script/get_maori_WET_records_for_crawl.sh (added)

  All in one script that takes as parameter a common crawl identifier ...

Wed, 18 Sep 2019 08:20:09 GMT ak19 [33489]
  * gs3-extensions/maori-lang-detection/bin/script/drop_nutch_solrcore.sh (added)

  Handy file to not have to keep manually repeating commands when ...

Tue, 17 Sep 2019 02:48:36 GMT ak19 [33488]
  * gs3-extensions/maori-lang-detection/bin/script/unique_mri_domains_from_cc.sh (modified)
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/WETProcessor.java (modified)

  new function createSeedURLsFiles() in WETProcessor that replaces the ...

Mon, 16 Sep 2019 07:45:01 GMT ak19 [33480]
  * gs3-extensions/maori-lang-detection/conf/config.properties (modified)
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/WETProcessor.java (modified)

  Much harder to remove pages where words are fused together as some ...

Fri, 13 Sep 2019 10:57:38 GMT ak19 [33471]
  * gs3-extensions/maori-lang-detection/bin/script/unique_mri_domains_from_cc.sh (modified)
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/WETProcessor.java (modified)

  Very minor changes.

Fri, 13 Sep 2019 10:53:23 GMT ak19 [33470]
  * gs3-extensions/maori-lang-detection/bin/script/unique_mri_domains_from_cc.sh (added)

  A new script to reduce keepURLs.txt to unique URLs, 1 from each ...

Fri, 13 Sep 2019 09:46:09 GMT ak19 [33469]
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/WETProcessor.java (modified)

  Don't want URLs with the word product(s) in them (but production ...

Fri, 13 Sep 2019 07:24:27 GMT ak19 [33468]
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/WETProcessor.java (modified)

  More meaningful to (also) write out the keep vs discard URLs into ...

Fri, 13 Sep 2019 05:44:41 GMT ak19 [33467]
  * gs3-extensions/maori-lang-detection/MoreReading/CommonCrawl.txt (modified)
  * gs3-extensions/maori-lang-detection/MoreReading/Vagrant-Spark-Hadoop.txt (modified)
  * gs3-extensions/maori-lang-detection/conf/config.properties (modified)
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/Utility.java (modified)
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/WETProcessor.java (modified)

  Improved the code to use a static block to load the needed properties ...

Thu, 12 Sep 2019 09:37:39 GMT ak19 [33466]
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/NZTLDProcessor.java (modified)
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/Utility.java (added)
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/WETProcessor.java (modified)

  1. WETProcessor.main() now processes a folder of *.warc.wet(.gz) ...

Thu, 12 Sep 2019 08:00:14 GMT ak19 [33465]
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/WETProcessor.java (added)

  Committing first version of the WETProcessor.java which takes a ...

Thu, 05 Sep 2019 07:01:36 GMT ak19 [33457]
  * gs3-extensions/maori-lang-detection/MoreReading/CommonCrawl.txt (modified)
  * gs3-extensions/maori-lang-detection/MoreReading/Vagrant-Spark-Hadoop.txt (modified)

  Got stage 1, the WARC to WET conversion, working, after necessary ...

Thu, 05 Sep 2019 05:26:27 GMT ak19 [33456]
  * gs3-extensions/maori-lang-detection/MoreReading/CommonCrawl.txt (modified)
  * gs3-extensions/maori-lang-detection/MoreReading/Vagrant-Spark-Hadoop.txt (modified)

  Link to discussion on how to convert WARC to WET

Fri, 30 Aug 2019 06:27:21 GMT ak19 [33448]
  * gs3-extensions/maori-lang-detection/MoreReading/Vagrant-Spark-Hadoop.txt (modified)

  Minor clarification and inclusion of helpful command

Thu, 29 Aug 2019 07:12:39 GMT ak19 [33446]
  * gs3-extensions/maori-lang-detection/MoreReading/Vagrant-Spark-Hadoop.txt (modified)
  * gs3-extensions/maori-lang-detection/bin/hadoop-spark-scripts/export_maori_subset.sh (added)
  * gs3-extensions/maori-lang-detection/bin/hadoop-spark-scripts/export_maori_subset_from_scratch.sh (added)

  1. Committing working version of export_maori_subset.sh which takes ...

Thu, 29 Aug 2019 05:01:12 GMT ak19 [33445]
  * gs3-extensions/maori-lang-detection/bin/hadoop-spark-scripts (added)
  * gs3-extensions/maori-lang-detection/bin/hadoop-spark-scripts/export_maori_index_csv.sh (added)

  The first working hadoop spark script for processing common crawl ...

Wed, 28 Aug 2019 08:22:34 GMT ak19 [33443]
  * gs3-extensions/maori-lang-detection/MoreReading/Vagrant-Spark-Hadoop.txt (modified)

  More notes

Wed, 28 Aug 2019 07:30:38 GMT ak19 [33442]
  * gs3-extensions/maori-lang-detection/lib/gutil.jar (modified)

  Updated gutil.jar file (with SafeProcess debugging)

Wed, 28 Aug 2019 07:30:00 GMT ak19 [33441]
  * gs3-extensions/maori-lang-detection/MoreReading/Vagrant-Spark-Hadoop.txt (modified)

  Adding further notes to do with running the CC-index examples on spark.

Wed, 28 Aug 2019 07:17:42 GMT ak19 [33440]
  * gs3-extensions/maori-lang-detection/MoreReading/CommonCrawl.txt (modified)
  * gs3-extensions/maori-lang-detection/MoreReading/Vagrant-Spark-Hadoop.txt (added)

  Split file to move vagrant-spark-hadoop notes into own file.

Mon, 19 Aug 2019 08:31:23 GMT ak19 [33428]
  * gs3-extensions/maori-lang-detection/MoreReading/CommonCrawl.txt (modified)

  Working commoncrawl cc-warc-examples' WET wordcount example using ...

Fri, 16 Aug 2019 10:15:40 GMT ak19 [33425]
  * gs3-extensions/maori-lang-detection/MoreReading/CommonCrawl.txt (modified)

  A few more links now that I got past getting the vagrant VM with ...

Thu, 15 Aug 2019 08:07:04 GMT ak19 [33423]
  * gs3-extensions/maori-lang-detection/MoreReading/CommonCrawl.txt (modified)

  Adding in the link to the vagrant VM with Hadoop, Spark for cluster ...

Thu, 15 Aug 2019 05:52:19 GMT ak19 [33422]
  * gs3-extensions/maori-lang-detection/MoreReading/CommonCrawl.txt (modified)

  Some more links.

Thu, 15 Aug 2019 04:20:03 GMT ak19 [33419]
  * gs3-extensions/maori-lang-detection/MoreReading/CommonCrawl.txt (modified)

  Last evening, I had found some links about how language-detection is ...

Tue, 13 Aug 2019 09:57:58 GMT ak19 [33414]
  * gs3-extensions/maori-lang-detection/MoreReading/CommonCrawl.txt (modified)

  Adding important links

Tue, 13 Aug 2019 09:57:42 GMT ak19 [33413]
  * gs3-extensions/maori-lang-detection/bin/script/create-uniq-WET-urls-file.sh (added)
  * gs3-extensions/maori-lang-detection/bin/script/create-uniq-nz-urls-file.sh (added)
  * gs3-extensions/maori-lang-detection/bin/script/get_commoncrawl_nz_urls.sh (modified)

  Splitting the get_commoncrawl_nz_urls.sh script back into 2 scripts, ...

Tue, 13 Aug 2019 09:54:31 GMT ak19 [33412]
  * gs3-extensions/maori-lang-detection/conf/config.properties (modified)

  config command for wgetting a single file

Tue, 13 Aug 2019 09:50:29 GMT ak19 [33411]
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/NZTLDProcessor.java (modified)

  Newer version now doesn't mirror sites with wget but gets WET files ...

Tue, 13 Aug 2019 09:48:19 GMT ak19 [33410]
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/NZTLDProcessor.java (modified)

  Committing some variable name changes before I replace this file with ...

Tue, 13 Aug 2019 03:59:29 GMT ak19 [33409]
  * gs3-extensions/maori-lang-detection/MoreReading/CommonCrawl.txt (modified)
  * gs3-extensions/maori-lang-detection/MoreReading/WebScraping.txt (added)
  * gs3-extensions/maori-lang-detection/MoreReading/macrons_with_emacs.txt (added)
  * gs3-extensions/maori-lang-detection/MoreReading/other.txt (modified)

  Forgot to commit 2 files with links and shuffling some links around ...

Tue, 13 Aug 2019 03:09:28 GMT ak19 [33408]
  * gs3-extensions/maori-lang-detection/MoreReading/other.txt (modified)

  Some rough notes. Will move into appropriate file later.

Tue, 13 Aug 2019 02:40:50 GMT ak19 [33407]
  * gs3-extensions/maori-lang-detection/lib/gutil.jar (modified)

  gutil.jar was rebuilt yesterday in GS3 after a bugfix. Recommitting ...

Mon, 12 Aug 2019 08:37:44 GMT ak19 [33405]
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/NZTLDProcessor.java (modified)

  Even though we're probably not going to use this code after all, will ...

Mon, 12 Aug 2019 08:35:48 GMT ak19 [33404]
  * gs3-extensions/maori-lang-detection/MoreReading/other.txt (modified)

  1. Links to other Java ways of extracting text from web content. 2. ...

Sun, 11 Aug 2019 10:03:14 GMT ak19 [33402]
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/NZTLDProcessor.java (added)

  Beginnings of the Java class to wget sites and process its pages to ...

Sun, 11 Aug 2019 09:16:41 GMT ak19 [33401]
  * gs3-extensions/maori-lang-detection/logs (added)
  * gs3-extensions/maori-lang-detection/src/MaoriTextDetector.class (deleted)

  MaoriTextDetector.class file now generated inside its package folder ...

Sun, 11 Aug 2019 09:15:26 GMT ak19 [33400]
  * gs3-extensions/maori-lang-detection/conf/log4j.properties (added)
  * gs3-extensions/maori-lang-detection/conf/log4j.properties.in (added)
  * gs3-extensions/maori-lang-detection/lib/log4j-1.2.8.jar (added)

  1. Setting up log4j.properties based on the macronizer's basic one ...

Sun, 11 Aug 2019 08:48:54 GMT ak19 [33399]
  * gs3-extensions/maori-lang-detection/conf (added)
  * gs3-extensions/maori-lang-detection/conf/config.properties (moved)
  * gs3-extensions/maori-lang-detection/lib/gutil.jar (added)

  Putting properties files into the conf folder and keeping the lib ...

Sun, 11 Aug 2019 07:35:57 GMT ak19 [33398]
  * gs3-extensions/maori-lang-detection/README.txt (modified)
  * gs3-extensions/maori-lang-detection/src/org (added)
  * gs3-extensions/maori-lang-detection/src/org/greenstone (added)
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea (added)
  * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/MaoriTextDetector.java (moved)

  Committing the actual package structure and the updated README after ...

Sun, 11 Aug 2019 07:30:49 GMT ak19 [33397]
  * gs3-extensions/maori-lang-detection/src/MaoriTextDetector.java (modified)

  1. Changing package structure and instructions on compiling/running ...

Sun, 11 Aug 2019 06:20:14 GMT ak19 [33396]
  * gs3-extensions/solr/trunk/src/collect/solr-jdbm-demo/resources/collectionConfig_ka.properties (modified)
  * main/trunk/greenstone3/web/sites/localsite/collect/lucene-jdbm-demo/resources/collectionConfig_es.properties (modified)
  * main/trunk/greenstone3/web/sites/localsite/collect/lucene-jdbm-demo/resources/collectionConfig_fr.properties (modified)
  * main/trunk/greenstone3/web/sites/localsite/collect/lucene-jdbm-demo/resources/collectionConfig_gu.properties (modified)
  * main/trunk/greenstone3/web/sites/localsite/collect/lucene-jdbm-demo/resources/collectionConfig_ja.properties (modified)
  * main/trunk/greenstone3/web/sites/localsite/collect/lucene-jdbm-demo/resources/collectionConfig_ka.properties (modified)
  * main/trunk/greenstone3/web/sites/localsite/collect/lucene-jdbm-demo/resources/collectionConfig_pl.properties (modified)
  * main/trunk/greenstone3/web/sites/localsite/resources/siteConfig_ka.properties (modified)

  Georgian language gs3colcfg module of GS interface. Many thanks to ...

Fri, 09 Aug 2019 08:37:23 GMT ak19 [33394]
  * gs3-extensions/maori-lang-detection/bin/script/get_commoncrawl_nz_urls.sh (modified)
  * gs3-extensions/maori-lang-detection/feasibility.txt (added)
  * gs3-extensions/maori-lang-detection/lib (added)
  * gs3-extensions/maori-lang-detection/lib/config.properties (added)

  1. Started a file on feasibility with the data now available and some ...

Fri, 09 Aug 2019 06:57:12 GMT ak19 [33393]
  * gs3-extensions/maori-lang-detection/MoreReading/CommonCrawl.txt (modified)
  * gs3-extensions/maori-lang-detection/bin/script/get_commoncrawl_nz_urls.sh (modified)

  Modified the get_commoncrawl_nz_urls.sh to also create a reduced urls ...

Thu, 08 Aug 2019 03:15:11 GMT ak19 [33392]
  * gs3-extensions/solr/trunk/src/perllib/solrbuilder.pm (modified)
  * gs3-extensions/solr/trunk/src/perllib/solrserver.pm (modified)
  * main/trunk/greenstone2/bin/script/activate.pl (modified)

  Kathy found a problem whereby she wanted to run consecutive buildcols ...

Wed, 07 Aug 2019 07:11:12 GMT ak19 [33391]
  * gs3-extensions/maori-lang-detection/MoreReading/CommonCrawl.txt (modified)

  Some rough bash scripting lines that work but aren't complete.

Wed, 07 Aug 2019 05:31:10 GMT ak19 [33390]
  * gs3-extensions/maori-lang-detection/bin/script/get_commoncrawl_nz_urls.sh (modified)

  Minor message telling the user to wait for a task that takes some time.

Mon, 05 Aug 2019 23:46:09 GMT kjdon [33388]
  * gs3-extensions/solr/trunk/src/src/java/org/greenstone/gsdl3/util/SolrQueryWrapper.java (modified)

  tidied up some debug statements

Wed, 31 Jul 2019 09:09:31 GMT ak19 [33379]
  * gs3-extensions/maori-lang-detection/bin/script/get_commoncrawl_nz_urls.sh (added)

  New script to automate getting a file listing of the common crawl URL ...

Wed, 31 Jul 2019 07:05:15 GMT ak19 [33378]
  * gs3-extensions/maori-lang-detection/bin (added)
  * gs3-extensions/maori-lang-detection/bin/script (added)
  * gs3-extensions/maori-lang-detection/bin/script/gen_SentenceDetection_model.sh (moved)

  New bin/script folder and relocating gen_SentenceDetection_model.sh ...

Wed, 31 Jul 2019 07:04:00 GMT ak19 [33377]
  * gs3-extensions/maori-lang-detection/README.txt (modified)
  * gs3-extensions/maori-lang-detection/gen_SentenceDetection_model.sh (modified)

  Changes to get gen_SentenceDetection_model.sh to run still from the ...

Wed, 31 Jul 2019 06:39:24 GMT ak19 [33376]
  * gs3-extensions/maori-lang-detection/MoreReading (added)
  * gs3-extensions/maori-lang-detection/MoreReading/CommonCrawl.txt (added)
  * gs3-extensions/maori-lang-detection/MoreReading/Heritrix-and-WCT.txt (added)
  * gs3-extensions/maori-lang-detection/MoreReading/other.txt (added)

  Links and extracts I've read so far on the Web Curator Tool (WCT), ...

Mon, 29 Jul 2019 03:25:03 GMT kjdon [33372]
  * gs3-extensions/solr/trunk/src/perllib/solrbuilder.pm (modified)

  when writing out facets in buildConfig, need to get them from ...

Mon, 29 Jul 2019 00:08:14 GMT kjdon [33371]
  * gs3-extensions/solr/trunk/src/perllib/solrbuildproc.pm (modified)

  separate sort and facet fields as the former needs to be single ...

Sun, 28 Jul 2019 23:59:24 GMT kjdon [33370]
  * gs3-extensions/solr/trunk/src/perllib/solrbuilder.pm (modified)

  use the new get_or_create_shortname instead of create_shortname

Sun, 28 Jul 2019 23:12:39 GMT kjdon [33368]
  * gs3-extensions/solr/trunk/src/conf/schema.xml.in (modified)

  sort fields cannot be multivalued. Facet fields need to be. SO have ...

Wed, 24 Jul 2019 22:43:32 GMT davidb [33359]
  * gs3-extensions/solr/trunk/src/perllib/solrbuilder.pm (modified)

  solr needs to add shortnames to the fieldnamemap otherwise it won't ...

Wed, 24 Jul 2019 09:03:29 GMT ak19 [33358]
  * gs3-extensions/maori-lang-detection/README.txt (modified)

  More minor changes to README

Wed, 24 Jul 2019 09:00:47 GMT ak19 [33357]
  * gs3-extensions/maori-lang-detection/README.txt (modified)
  * gs3-extensions/maori-lang-detection/gen_SentenceDetection_model.sh (modified)

  Minor changes

Wed, 24 Jul 2019 08:57:39 GMT ak19 [33356]
  * gs3-extensions/maori-lang-detection/gen_SentenceDetection_model.sh (modified)

  Updating script. Correction to a filepath different in the svn folder ...

Wed, 24 Jul 2019 08:54:50 GMT ak19 [33355]
  * gs3-extensions/maori-lang-detection/README.txt (modified)
  * gs3-extensions/maori-lang-detection/gen_SentenceDetection_model.sh (added)
  * gs3-extensions/maori-lang-detection/models-trainingdata-and-sampletxts (added)
  * gs3-extensions/maori-lang-detection/models-trainingdata-and-sampletxts/langdetect-183.bin (moved)
  * gs3-extensions/maori-lang-detection/models-trainingdata-and-sampletxts/mri-sent.train (added)
  * gs3-extensions/maori-lang-detection/models-trainingdata-and-sampletxts/mri-sent_trained.bin (added)
  * gs3-extensions/maori-lang-detection/models-trainingdata-and-sampletxts/sample_maori_shorttext.txt (added)
  * gs3-extensions/maori-lang-detection/models-trainingdata-and-sampletxts/sample_mri_paragraphs.txt (added)
  * gs3-extensions/maori-lang-detection/mri-opennlp-corpus.tar.gz (added)
  * gs3-extensions/maori-lang-detection/src/MaoriTextDetector.class (modified)
  * gs3-extensions/maori-lang-detection/src/MaoriTextDetector.java (modified)

  Changes for adding in the new gen_SentenceDetection_model.sh script, ...

Tue, 23 Jul 2019 05:29:18 GMT ak19 [33350]
  * gs3-extensions/maori-lang-detection/README.txt (modified)
  * gs3-extensions/maori-lang-detection/src/MaoriTextDetector.class (modified)
  * gs3-extensions/maori-lang-detection/src/MaoriTextDetector.java (modified)

  Better comments. Tested macronised vs unmacronised Māori language ...

Sat, 20 Jul 2019 11:43:53 GMT ak19 [33339]
  * gs3-extensions/maori-lang-detection/README.txt (modified)

  Updated README.

Sat, 20 Jul 2019 11:24:46 GMT ak19 [33338]
  * gs3-extensions/maori-lang-detection/src/MaoriTextDetector.class (modified)
  * gs3-extensions/maori-lang-detection/src/MaoriTextDetector.java (modified)

  1. After renaming the java class, changed all occurrences of the old ...

Sat, 20 Jul 2019 11:21:41 GMT ak19 [33337]
  * gs3-extensions/maori-lang-detection/src/MaoriTextDetector.class (moved)
  * gs3-extensions/maori-lang-detection/src/MaoriTextDetector.java (moved)

  Renaming the class to MaoriTextDetector, since it doesn't detect ...

Sat, 20 Jul 2019 10:58:17 GMT ak19 [33336]
  * gs3-extensions/maori-lang-detection/src/MaoriDetector.class (modified)
  * gs3-extensions/maori-lang-detection/src/MaoriDetector.java (modified)

  Major rewrite to make this class more useful to callers. ...

Fri, 19 Jul 2019 10:17:21 GMT ak19 [33335]
  * gs3-extensions/maori-lang-detection (added)
  * gs3-extensions/maori-lang-detection/README.txt (added)
  * gs3-extensions/maori-lang-detection/apache-opennlp-1.9.1-bin.tar.gz (added)
  * gs3-extensions/maori-lang-detection/langdetect-183.bin (added)
  * gs3-extensions/maori-lang-detection/src (added)
  * gs3-extensions/maori-lang-detection/src/MaoriDetector.class (added)
  * gs3-extensions/maori-lang-detection/src/MaoriDetector.java (added)

  First java file for Māori language detection using openNLP with the ...

Thu, 18 Jul 2019 11:05:31 GMT ak19 [33330]
  * gs3-extensions/solr/trunk/src/collect/solr-jdbm-demo/index.zip (modified)

  Also rebuilt the solr demo collection with the changes to ...

Thu, 18 Jul 2019 10:45:22 GMT ak19 [33327]
  * gs3-extensions/solr/trunk/src/perllib/solrbuilder.pm (modified)
  * gs3-extensions/solr/trunk/src/perllib/solrbuildproc.pm (modified)
  * main/trunk/greenstone2/perllib/lucenebuildproc.pm (modified)

  In order to get map coordinate metadata stored correctly in solr, ...

Tue, 09 Jul 2019 04:54:05 GMT ak19 [33315]
  * gs3-extensions/solr/trunk/src/perllib/solrserver.pm (modified)

  1. Bugfix to issue discovered on windows: when the GS3 server isn't ...

Mon, 08 Jul 2019 02:05:50 GMT kjdon [33307]
  * gs3-extensions/solr/trunk/src/webapps/solr.war (modified)

  updating solr.war to include my latest changes. TODO: does this war ...

Mon, 08 Jul 2019 01:59:12 GMT kjdon [33306]
  * gs3-extensions/solr/trunk/src/src/java/org/greenstone/gsdl3/service/GS2SolrSearch.java (modified)

  we need to use (the new) level_ids list to determine which cores we ...

Thu, 09 May 2019 08:31:08 GMT ak19 [33065]
  * gs3-extensions/solr/trunk/src/collect/solr-jdbm-demo/resources/collectionConfig_ka.properties (added)
  * main/trunk/greenstone3/web/sites/localsite/collect/lucene-jdbm-demo/resources/collectionConfig_ka.properties (added)
  * main/trunk/greenstone3/web/sites/localsite/resources/siteConfig_ka.properties (added)

  3 new Georgian language files added, 2 of which automatically ...

Sun, 10 Mar 2019 20:37:51 GMT davidb [32891]
  * gs3-extensions/iiif-servlet/trunk/src/gsdl-src/java/org/greenstone/gsdl3/IIIFServerBridge.java (modified)
  * gs3-extensions/iiif-servlet/trunk/src/src/main/java/edu/illinois/library/cantaloupe/resource/iiif/v2/GSInformationResource.java (modified)
  * gs3-extensions/iiif-servlet/trunk/src/src/main/java/edu/illinois/library/cantaloupe/resource/iiif/v2/IdentifierToGSAssocfile.java (modified)

  Additional error checking

Sun, 10 Mar 2019 20:37:21 GMT davidb [32890]
  * gs3-extensions/iiif-servlet/trunk/src/PREPARE-CANTALOUPE.sh (modified)

  No longer use the OAIConfig file

Sun, 10 Mar 2019 04:36:54 GMT davidb [32889]
  * gs3-extensions/iiif-servlet/trunk/src/PREPARE-GSDL-AND-COMPILE-CORE.sh (modified)

  Some adjustments after testing

Sun, 10 Mar 2019 04:14:22 GMT davidb [32888]
  * gs3-extensions/iiif-servlet/trunk/src/PREPARE-GSDL-AND-COMPILE-CORE.sh (modified)

  Also want to check and untar cantaloupe in this PREPARE file

Fri, 08 Mar 2019 00:59:41 GMT davidb [32886]
  * gs3-extensions/iiif-servlet/trunk/src/src/main/java/edu/illinois/library/cantaloupe/resource/iiif/v2/GSImageResource.java (modified)
  * gs3-extensions/iiif-servlet/trunk/src/src/main/java/edu/illinois/library/cantaloupe/resource/iiif/v2/GSInformationResource.java (modified)
  * gs3-extensions/iiif-servlet/trunk/src/src/main/java/edu/illinois/library/cantaloupe/resource/iiif/v2/IdentifierToGSAssocfile.java (added)

  Copy refactoring

Fri, 08 Mar 2019 00:45:35 GMT davidb [32885]
  * gs3-extensions/iiif-servlet/trunk/src/cantaloupe.properties.svn (deleted)

  Now in main Greenstone resources/iiif area