#
# ChangeLog for /
#
# Generated by Trac 1.4.2
# 2024-06-23T15:22:16+12:00

Fri, 15 Nov 2019 04:55:56 GMT davidb [33676]
	* gs2-extensions/malware-checker/trunk/perllib (added)
	* gs2-extensions/malware-checker/trunk/perllib/plugins (added)
	* gs2-extensions/malware-checker/trunk/perllib/plugins/MalwareCheckerConverter.pm (added)
	* gs2-extensions/malware-checker/trunk/perllib/plugins/PDFv3Plugin.pm (added)

	Some initial work getting a plugin going that calls Alex's ...

Thu, 14 Nov 2019 11:22:34 GMT ak19 [33675]
	* other-projects/maori-lang-detection/MoreReading/mongodb.txt (modified)

	Committing the newer query results (but from before today's ...

Thu, 14 Nov 2019 11:21:31 GMT ak19 [33674]
	* other-projects/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpToMongoDB.java (modified)
	* other-projects/maori-lang-detection/src/org/greenstone/atea/TextLanguageDetector.java (modified)
	* other-projects/maori-lang-detection/src/org/greenstone/atea/morphia/LanguageInfo.java (added)
	* other-projects/maori-lang-detection/src/org/greenstone/atea/morphia/SentenceInfo.java (modified)

	Changes to support the top 5 predicted langcodes and their confidence ...

Thu, 14 Nov 2019 11:17:21 GMT ak19 [33673]
	* main/trunk/model-sites-dev/WaikatoEducationDept (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/README.txt (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/README.txt (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/math.gs3coll (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/math.gs3coll/etc (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/math.gs3coll/etc/collectionConfig.bak (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/math.gs3coll/etc/collectionConfig.xml (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/math.gs3coll/etc/oai-inf.jdb (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/math.gs3coll/etc/oai-inf.jdb.bak (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/math.gs3coll/etc/oai-inf.lg (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/math.gs3coll/gli.col (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/math.gs3coll/images (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/math.gs3coll/import (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/math.gs3coll/import/metadata.xml (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/math.gs3coll/log (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/math.gs3coll/metadata (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/math.gs3coll/metadata/educationresources.mds (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/math.gs3coll/metadata/ex.mds (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/math.gs3coll/metadata/greenstone.mds (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/math.gs3coll/metadata/profile.xml (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/math.gs3coll/script (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/math.gs3coll/style (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/math.gs3coll/tmp (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/procMath.bat (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TC4.12_Math Resources_Activities/screenshot-input-mathfolder-organisation.png (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/README.txt (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/procCrates.bat (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/sciencea.gs3coll (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/sciencea.gs3coll/etc (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/sciencea.gs3coll/etc/collectionConfig.bak (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/sciencea.gs3coll/etc/collectionConfig.xml (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/sciencea.gs3coll/etc/oai-inf.jdb (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/sciencea.gs3coll/etc/oai-inf.jdb.bak (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/sciencea.gs3coll/etc/oai-inf.lg (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/sciencea.gs3coll/gli.col (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/sciencea.gs3coll/images (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/sciencea.gs3coll/import (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/sciencea.gs3coll/import/metadata.xml (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/sciencea.gs3coll/log (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/sciencea.gs3coll/metadata (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/sciencea.gs3coll/metadata/educationresources.mds (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/sciencea.gs3coll/metadata/ex.mds (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/sciencea.gs3coll/metadata/greenstone.mds (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/sciencea.gs3coll/metadata/profile.xml (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/sciencea.gs3coll/script (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/sciencea.gs3coll/style (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/sciencea.gs3coll/tmp (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/TT4.02_Science Resources_Crates/screenshot-input-sciencefolder-organisation.png (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/iconjpg.png (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/ijpg.gif (added)
	* main/trunk/model-sites-dev/WaikatoEducationDept/siteConfig.xml (added)

	Waikato Education Department's Science Activities and Maths ...

Thu, 14 Nov 2019 01:14:28 GMT kjdon [33672]
	* main/trunk/greenstone3/src/java/org/greenstone/gsdl3/service/GS2Construct.java (modified)

	modified slightly so that the error messages come from the dictionary ...

Thu, 14 Nov 2019 01:12:44 GMT kjdon [33671]
	* main/trunk/greenstone3/src/java/org/greenstone/gsdl3/util/Dictionary.java (modified)

	added a static getTextString method - currently this is in ...

Thu, 14 Nov 2019 01:10:45 GMT kjdon [33670]
	* main/trunk/greenstone3/src/java/org/greenstone/gsdl3/util/GSXML.java (modified)

	added editEnabled att string

Thu, 14 Nov 2019 01:10:20 GMT kjdon [33669]
	* main/trunk/greenstone3/src/java/org/greenstone/gsdl3/util/GSXSLT.java (modified)

	removed an annoying debug message

Wed, 13 Nov 2019 21:03:01 GMT kjdon [33668]
	* main/trunk/greenstone3/web/WEB-INF/classes/interface_default2.properties (modified)

	a few changes to debuginfo texts

Wed, 13 Nov 2019 20:55:56 GMT kjdon [33667]
	* main/trunk/greenstone3/web/interfaces/core/transform/expand-gslib.xsl (moved)

	preProcess.xsl renamed to expand-gslib.xsl to better indicate what it ...

Wed, 13 Nov 2019 10:08:37 GMT ak19 [33666]
	* other-projects/maori-lang-detection/MoreReading/mongodb.txt (modified)
	* other-projects/maori-lang-detection/conf/sites-too-big-to-exhaustively-crawl.txt (modified)
	* other-projects/maori-lang-detection/crawledNode6.tar (modified)
	* other-projects/maori-lang-detection/hdfs-cc-work/conf/regex-urlfilter.GS_TEMPLATE (modified)
	* other-projects/maori-lang-detection/src/org/greenstone/atea/CCWETProcessor.java (modified)
	* other-projects/maori-lang-detection/src/org/greenstone/atea/Utility.java (modified)
	* other-projects/maori-lang-detection/to_crawl.tar.gz (added)

	Having finished sending all the crawl data to mongodb 1. Recrawled ...

Wed, 13 Nov 2019 04:18:55 GMT davidb [33665]
	* gs2-extensions/malware-checker/trunk/java/AddComment.sh (modified)
	* gs2-extensions/malware-checker/trunk/java/GetFileScanReport.sh (modified)
	* gs2-extensions/malware-checker/trunk/java/ScanFile.sh (modified)

	Fixed jar name

Wed, 13 Nov 2019 04:17:30 GMT davidb [33664]
	* gs2-extensions/malware-checker/trunk/java/pom.xml (added)
	* gs2-extensions/malware-checker/trunk/java/src (added)
	* gs2-extensions/malware-checker/trunk/java/src/main (added)
	* gs2-extensions/malware-checker/trunk/java/src/main/java (added)
	* gs2-extensions/malware-checker/trunk/java/src/main/java/org (added)
	* gs2-extensions/malware-checker/trunk/java/src/main/java/org/greenstone (added)
	* gs2-extensions/malware-checker/trunk/java/src/main/java/org/greenstone/virustotal (added)
	* gs2-extensions/malware-checker/trunk/java/src/main/java/org/greenstone/virustotal/AddComment.java (added)
	* gs2-extensions/malware-checker/trunk/java/src/main/java/org/greenstone/virustotal/ApiDetails.java (added)
	* gs2-extensions/malware-checker/trunk/java/src/main/java/org/greenstone/virustotal/GetFileScanReport.java (added)
	* gs2-extensions/malware-checker/trunk/java/src/main/java/org/greenstone/virustotal/ScanFile.java (added)

	Initial version code for running VirusTotal API against files, CLI ...

Wed, 13 Nov 2019 04:12:12 GMT davidb [33663]
	* gs2-extensions/malware-checker/trunk/java/AddComment.sh (added)
	* gs2-extensions/malware-checker/trunk/java/GetFileScanReport.sh (modified)
	* gs2-extensions/malware-checker/trunk/java/ScanFile.sh (modified)

	Changes after testing the scripts

Wed, 13 Nov 2019 04:04:10 GMT davidb [33662]
	* gs2-extensions/malware-checker/trunk/java/CLEAN.sh (added)
	* gs2-extensions/malware-checker/trunk/java/COMPILE.sh (added)
	* gs2-extensions/malware-checker/trunk/java/GetFileScanReport.sh (added)
	* gs2-extensions/malware-checker/trunk/java/ScanFile.sh (added)

	Scripts to compile and run java code

Wed, 13 Nov 2019 03:54:16 GMT davidb [33661]
	* gs2-extensions/malware-checker/trunk/java/packages (added)
	* gs2-extensions/malware-checker/trunk/java/packages/apache-maven-3.6.2-bin.tar.gz (added)

	Compiling needs to use Maven

Wed, 13 Nov 2019 03:53:42 GMT davidb [33660]
	* gs2-extensions/malware-checker/trunk/java (added)

	For Java source code

Wed, 13 Nov 2019 03:40:40 GMT davidb [33659]
	* gs2-extensions/malware-checker/trunk (added)

	Top-level folder for new extension based on VirusTotal API which ...

Wed, 13 Nov 2019 03:40:25 GMT davidb [33658]
	* gs2-extensions/malware-checker (added)

	Top-level folder for new extension based on VirusTotal API which ...

Tue, 12 Nov 2019 08:33:57 GMT ak19 [33657]
	* other-projects/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpToMongoDB.java (modified)

	Some fixes after brief testing against 1/3 of the crawl. Restarted ...

Tue, 12 Nov 2019 08:11:05 GMT ak19 [33656]
	* other-projects/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpToMongoDB.java (modified)

	Final minor changes before I start processing the crawls of node2.

Tue, 12 Nov 2019 07:56:53 GMT ak19 [33655]
	* other-projects/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpToMongoDB.java (modified)

	Minor change to print statement

Tue, 12 Nov 2019 07:54:06 GMT ak19 [33654]
	* other-projects/maori-lang-detection/lib/logging-slf4j-1.5.8.jar (deleted)

	Removing jar file that wasn't used after all.

Tue, 12 Nov 2019 07:51:48 GMT ak19 [33653]
	* other-projects/maori-lang-detection/MoreReading/mongodb.txt (modified)
	* other-projects/maori-lang-detection/lib/classgraph-4.8.52.jar (added)
	* other-projects/maori-lang-detection/lib/core-1.5.8.jar (added)
	* other-projects/maori-lang-detection/lib/logging-slf4j-1.5.8.jar (added)
	* other-projects/maori-lang-detection/lib/slf4j-api-1.7.9.jar (added)
	* other-projects/maori-lang-detection/src/org/greenstone/atea/MongoDBAccess.java (modified)
	* other-projects/maori-lang-detection/src/org/greenstone/atea/morphia/SentenceInfo.java (moved)
	* other-projects/maori-lang-detection/src/org/greenstone/atea/morphia/WebpageInfo.java (moved)
	* other-projects/maori-lang-detection/src/org/greenstone/atea/morphia/WebsiteInfo.java (moved)

	1. As suggested by Dr Bainbridge, made the code changes to use ...

Tue, 12 Nov 2019 07:41:13 GMT ak19 [33652]
	* other-projects/maori-lang-detection/src/org/greenstone/atea/MongoDBAccess.java (modified)
	* other-projects/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpToMongoDB.java (modified)
	* other-projects/maori-lang-detection/src/org/greenstone/atea/TextDumpPage.java (modified)
	* other-projects/maori-lang-detection/src/org/greenstone/atea/TextLanguageDetector.java (modified)
	* other-projects/maori-lang-detection/src/org/greenstone/atea/morphia (added)

	Introducing morphia subpackage

Tue, 12 Nov 2019 05:11:39 GMT ak19 [33651]
	* other-projects/maori-lang-detection/src/org/greenstone/atea/MongoDBAccess.java (modified)
	* other-projects/maori-lang-detection/src/org/greenstone/atea/TextLanguageDetector.java (modified)
	* other-projects/maori-lang-detection/src/org/greenstone/atea/WebpageInfo.java (modified)

	1. Bugfix: overlappingSentences works. 2. storing numSentencesInMaor

Mon, 11 Nov 2019 23:06:56 GMT kjdon [33650]
	* main/trunk/greenstone3/src/java/org/greenstone/gsdl3/core/TransformingReceptionist.java (modified)

	updated to match the new xsl file names; lots of variable renames to ...

Mon, 11 Nov 2019 23:04:46 GMT kjdon [33649]
	* main/trunk/greenstone3/web/interfaces/default/transform/expand-gsf-pass1.xsl (moved)
	* main/trunk/greenstone3/web/interfaces/default/transform/expand-gsf.xsl (moved)

	renamed config_format and text_fragment_format to better represent ...

Mon, 11 Nov 2019 23:04:03 GMT kjdon [33648]
	* main/trunk/greenstone3/web/WEB-INF/classes/interface_default2.properties (modified)
	* main/trunk/greenstone3/web/interfaces/default/transform/pages/debuginfo.xsl (modified)

	changed the debuginfo xsl and strings to match the new o=xxx debug ...

Mon, 11 Nov 2019 20:30:46 GMT kjdon [33647]
	* main/trunk/greenstone3/src/java/org/greenstone/gsdl3/core/TransformingReceptionist.java (modified)

	added/changed a few of the output values for debugging the transform

Mon, 11 Nov 2019 05:46:24 GMT ak19 [33646]
	* other-projects/maori-lang-detection/MoreReading/mongodb.txt (modified)

	Saving the mongodb queries and learning links that Dr Bainbridge ...

Mon, 11 Nov 2019 05:45:29 GMT ak19 [33645]
	* other-projects/maori-lang-detection/src/org/greenstone/atea/MongoDBAccess.java (modified)

	Fix to 2 bugs when sending data to MongoDB: 1. overlappingSentences ...

Sun, 10 Nov 2019 22:50:29 GMT ak19 [33644]
	* other-projects/maori-lang-detection/MoreReading/mongodb.txt (added)

	Just committing the growing mongodb.txt file with links and ...

Sun, 10 Nov 2019 22:46:48 GMT ak19 [33643]
	* other-projects/maori-lang-detection/conf/config.properties.in (moved)
	* other-projects/maori-lang-detection/conf/log4j.properties (deleted)
	* other-projects/maori-lang-detection/conf/log4j.properties.in (modified)

	Brought the template log4j.properties.in back up to speed. I forgot ...

Sun, 10 Nov 2019 22:06:48 GMT ak19 [33642]
	* other-projects/maori-lang-detection/lib/mongo-java-driver-3.9.1.jar (added)

	Forgot to commit the java driver for mongodb when I committed the ...

Sun, 10 Nov 2019 21:53:36 GMT kjdon [33641]
	* main/trunk/greenstone3/src/java/org/greenstone/gsdl3/core/TransformingReceptionist.java (modified)

	commented out some debug statements

Sun, 10 Nov 2019 21:48:41 GMT kjdon [33640]
	* main/trunk/greenstone3/src/java/org/greenstone/gsdl3/core/TransformingReceptionist.java (modified)

	oops, I must have 'tidied' up the file and then not compiled it to ...

Sun, 10 Nov 2019 21:23:42 GMT kjdon [33639]
	* main/trunk/greenstone3/web/interfaces/default/transform/config_format.xsl (modified)

	need to select child nodes, otherwise the gsf:default node ends up in ...

Sun, 10 Nov 2019 21:22:57 GMT kjdon [33638]
	* main/trunk/greenstone3/web/interfaces/default/transform/gslib.xsl (modified)
	* main/trunk/greenstone3/web/interfaces/default/transform/layouts/formatmanager.xsl (modified)
	* main/trunk/greenstone3/web/interfaces/default/transform/layouts/xml-to-string.xsl (moved)

	gslib doesn't use xml-to-string.xsl. it's only used by formatmanager, ...

Sun, 10 Nov 2019 21:21:00 GMT kjdon [33637]
	* main/trunk/greenstone3/web/interfaces/default/transform/layouts/main.xsl (modified)

	we can now use gsf and gslib in layout files.

Sun, 10 Nov 2019 21:04:37 GMT kjdon [33636]
	* main/trunk/greenstone3/web/interfaces/default/transform/pages/about.xsl (modified)

	include means the stylesheet gets added inline, import means it gets ...

Sun, 10 Nov 2019 20:38:55 GMT ak19 [33635]
	* other-projects/maori-lang-detection (moved)

	Maori-language-detection doesn't use Greenstone 3 at present, it's ...

Fri, 08 Nov 2019 10:59:07 GMT ak19 [33634]
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/MongoDBAccess.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpToCSV.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpToMongoDB.java (added)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/SentenceInfo.java (added)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/TextDumpPage.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/TextLanguageDetector.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/WebpageInfo.java (added)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/WebsiteInfo.java (added)

	Rewrote NutchTextDumpProcessor as NutchTextDumpToMongoDB.java, which ...

Fri, 08 Nov 2019 06:43:39 GMT ak19 [33633]
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/MongoDBAccess.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpToCSV.java (moved)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/TextLanguageDetector.java (modified)

	1. TextLanguageDetector now has methods for collecting all sentences ...

Thu, 07 Nov 2019 01:53:54 GMT kjdon [33632]
	* main/trunk/greenstone3/src/java/org/greenstone/gsdl3/core/TransformingReceptionist.java (modified)

	overhaul of TransformingReceptionist. changed the order of inlining ...

Thu, 07 Nov 2019 01:52:21 GMT kjdon [33631]
	* main/trunk/greenstone3/src/java/org/greenstone/gsdl3/util/XMLTransformer.java (modified)

	added a bit more error reporting

Thu, 07 Nov 2019 01:44:16 GMT kjdon [33630]
	* main/trunk/greenstone3/src/java/org/greenstone/gsdl3/util/GSXSLT.java (modified)

	minor comment changes

Thu, 07 Nov 2019 01:20:36 GMT kjdon [33629]
	* main/trunk/greenstone3/src/java/org/greenstone/gsdl3/util/GSXML.java (modified)

	added methods using Parameter2 - for params with text node values

Thu, 07 Nov 2019 00:52:27 GMT kjdon [33628]
	* main/trunk/greenstone3/web/interfaces/default/transform/pages/query.xsl (modified)

	not sure why documentNode was a gsf:template here. Can't be like that ...

Wed, 06 Nov 2019 20:28:41 GMT kjdon [33627]
	* main/trunk/greenstone3/src/java/org/greenstone/gsdl3/util/GSFile.java (modified)

	removed unnecessary comments

Tue, 05 Nov 2019 08:59:46 GMT ak19 [33626]
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/MongoDBAccess.java (modified)

	TODOs

Tue, 05 Nov 2019 08:58:44 GMT ak19 [33625]
	* gs3-extensions/maori-lang-detection/conf/keep-since-not-product-sites.txt (added)
	* gs3-extensions/maori-lang-detection/conf/possible-product-sites.txt (added)

	A file listing domains with seedurls containing /mi(/) that are ...

Tue, 05 Nov 2019 08:48:50 GMT ak19 [33624]
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/CCWETProcessor.java (modified)

	Some cleanup surrounding the now renamed function createSeedURLsFile, ...

Tue, 05 Nov 2019 08:04:09 GMT ak19 [33623]
	* gs3-extensions/maori-lang-detection/MoreReading/crawling-Nutch.txt (modified)
	* gs3-extensions/maori-lang-detection/conf/config.properties (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/CCWETProcessor.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/MongoDBAccess.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpProcessor.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/TextDumpPage.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/Utility.java (modified)

	1. Incorporated Dr Nichols' earlier suggestion of storing page ...

Tue, 05 Nov 2019 02:42:46 GMT ak19 [33622]
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/MongoDBAccess.java (moved)

	File rename

Mon, 04 Nov 2019 07:35:59 GMT ak19 [33621]
	* gs3-extensions/maori-lang-detection/MoreReading/crawling-Nutch.txt (modified)

	Committing jotted down mongodb related instructions from what Dr ...

Mon, 04 Nov 2019 01:24:25 GMT ak19 [33620]
	* gs3-extensions/maori-lang-detection/crawledNode6.tar (added)

	Final crawl, done on vagrant VM node6. Crawl site IDs 01407-01462.

Sun, 03 Nov 2019 22:36:56 GMT kjdon [33619]
	* main/trunk/greenstone3/src/java/org/greenstone/gsdl3/core/URLFilter.java (modified)

	need to handle the case where a collection file (eg image) gets ...

Fri, 01 Nov 2019 07:14:18 GMT ak19 [33618]
	* gs3-extensions/maori-lang-detection/hdfs-cc-work/GS_README.TXT (modified)

	Adding in the download URL

Fri, 01 Nov 2019 04:13:18 GMT ak19 [33617]
	* gs3-extensions/maori-lang-detection/crawledNode5.tar (modified)

	Node5 is now full and here is the finished crawl (up to and including ...

Thu, 31 Oct 2019 07:05:07 GMT ak19 [33616]
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/MongoDBConnection.java (added)

	Beginnings of Java class that is to interact with MongoDB. I don't ...

Thu, 31 Oct 2019 07:03:55 GMT ak19 [33615]
	* gs3-extensions/maori-lang-detection/MoreReading/crawling-Nutch.txt (modified)
	* gs3-extensions/maori-lang-detection/conf/config.properties (modified)
	* gs3-extensions/maori-lang-detection/conf/log4j.properties (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/CCWETProcessor.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/MaoriTextDetector.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpProcessor.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/TextDumpPage.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/WETProcessor.java (modified)

	1. Worked out how to configure log4j to log both to console and ...

Wed, 30 Oct 2019 22:22:21 GMT kjdon [33614]
	* main/trunk/greenstone3/web/interfaces/default/transform/config_format.xsl (modified)

	added a new line

Wed, 30 Oct 2019 22:18:44 GMT kjdon [33613]
	* main/trunk/greenstone2/collect/modelcol/etc/collectionConfig.xml (modified)

	added allowdocumentediting and allowmapgpsediting options, plus also ...

Wed, 30 Oct 2019 22:00:37 GMT kjdon [33612]
	* main/trunk/greenstone3/src/java/org/greenstone/gsdl3/LibraryServlet.java (modified)

	work to do with params. add in default values to params if they are ...

Wed, 30 Oct 2019 21:55:04 GMT kjdon [33611]
	* main/trunk/greenstone3/src/java/org/greenstone/gsdl3/util/GSParams.java (modified)

	added global setting to params - these are for params that are valid ...

Wed, 30 Oct 2019 21:54:05 GMT kjdon [33610]
	* main/trunk/greenstone3/src/java/org/greenstone/gsdl3/util/GSXML.java (modified)

	USER_SESSION_CACHE_ATT moved to GSParams, as it is stored in session ...

Wed, 30 Oct 2019 10:03:19 GMT ak19 [33609]
	* gs3-extensions/maori-lang-detection/crawledNode2.tar (moved)
	* gs3-extensions/maori-lang-detection/crawledNode3.tar (moved)
	* gs3-extensions/maori-lang-detection/crawledNode4.tar (moved)
	* gs3-extensions/maori-lang-detection/crawledNode5.tar (added)

	The tar files containing the crawled sites data shouldn't be called ...

Wed, 30 Oct 2019 10:02:26 GMT ak19 [33608]
	* gs3-extensions/maori-lang-detection/hdfs-cc-work/GS_README.TXT (modified)
	* gs3-extensions/maori-lang-detection/hdfs-cc-work/scripts/batchcrawl.sh (modified)
	* gs3-extensions/maori-lang-detection/hdfs-cc-work/scripts/exportHBase.sh (added)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/MaoriTextDetector.java (modified)

	1. New script to export from HBase so that we could in theory ...

Tue, 29 Oct 2019 05:33:49 GMT ak19 [33607]
	* gs3-extensions/maori-lang-detection/crawledNode4.tar.gz (modified)

	Updated with the remaining successfully crawled sites on node4 before ...

Tue, 29 Oct 2019 02:18:51 GMT ak19 [33606]
	* gs3-extensions/maori-lang-detection/crawledNode2.tar.gz (moved)
	* gs3-extensions/maori-lang-detection/crawledNode3.tar.gz (added)

	1. Committing crawl data from node3 (2nd VM for nutch crawling). 2. ...

Tue, 29 Oct 2019 01:54:24 GMT ak19 [33605]
	* gs3-extensions/maori-lang-detection/crawledNode4.tar.gz (added)

	Node 4 VM still works, but committing first set of crawled sites on there

Thu, 24 Oct 2019 10:22:30 GMT ak19 [33604]
	* gs3-extensions/maori-lang-detection/conf/sites-too-big-to-exhaustively-crawl.txt (modified)
	* gs3-extensions/maori-lang-detection/conf/url-whitelist-filter.txt (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/CCWETProcessor.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/Utility.java (modified)

	1. Better output into possible-product-sites.txt including the ...

Thu, 24 Oct 2019 09:04:37 GMT ak19 [33603]
	* gs3-extensions/maori-lang-detection/MoreReading/crawling-Nutch.txt (modified)
	* gs3-extensions/maori-lang-detection/conf/GeoLiteCity.dat (added)
	* gs3-extensions/maori-lang-detection/lib/geoip-api-1.2.10.jar (added)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/CCWETProcessor.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/Utility.java (modified)

	Incorporating Dr Nichols' suggestion to help weed out product sites: ...

Wed, 23 Oct 2019 10:49:34 GMT ak19 [33602]
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/MRIWebPageStats.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpProcessor.java (modified)

	1. The final csv file, mri-sentences.csv, is now written out. 2. Only ...

Wed, 23 Oct 2019 10:22:14 GMT ak19 [33601]
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpProcessor.java (modified)

	Creates the 2nd csv file, with info about webpages. At present stores ...

Wed, 23 Oct 2019 10:05:38 GMT ak19 [33600]
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/MRIWebPageStats.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpProcessor.java (modified)

	Work in progress of writing out CSV files. In future, may write the ...

Tue, 22 Oct 2019 07:49:48 GMT ak19 [33599]
	* gs3-extensions/maori-lang-detection/crawled-1-of-3.tar.gz (added)

	First one-third sites crawled. Committing to SVN despite the tarred ...

Tue, 22 Oct 2019 07:19:54 GMT ak19 [33598]
	* gs3-extensions/maori-lang-detection/hdfs-cc-work/GS_README.TXT (modified)
	* gs3-extensions/maori-lang-detection/hdfs-cc-work/vagrant-for-nutch2.tar.gz (modified)

	More instructions on setting up Nutch now that I've remembered to ...

Tue, 22 Oct 2019 07:05:50 GMT ak19 [33597]
	* gs3-extensions/maori-lang-detection/hdfs-cc-work/conf/regex-urlfilter.GS_TEMPLATE (modified)

	Committing active version of template file which has a newline at end ...

Tue, 22 Oct 2019 05:44:05 GMT ak19 [33596]
	* gs3-extensions/maori-lang-detection/hdfs-cc-work/conf/nutch-site.xml (added)
	* gs3-extensions/maori-lang-detection/hdfs-cc-work/conf/regex-urlfilter.GS_TEMPLATE (added)

	Adding in the nutch-site.xml and regex-urlfilter.GS_TEMPLATE template ...

Tue, 22 Oct 2019 01:05:46 GMT kjdon [33595]
	* main/trunk/greenstone3/web/interfaces/default/transform/gslib.xsl (modified)

	new displayBaskets template - to avoid replicating code in query and ...

Tue, 22 Oct 2019 01:00:34 GMT kjdon [33594]
	* main/trunk/greenstone3/web/interfaces/default/transform/pages/classifier.xsl (modified)

	call gslib:displayBasket instead of replicating the code here

Tue, 22 Oct 2019 00:59:53 GMT kjdon [33593]
	* main/trunk/greenstone3/web/interfaces/default/transform/pages/query.xsl (modified)

	the test for facets should be facetList/facet/count, as the facets ...

Tue, 22 Oct 2019 00:51:02 GMT kjdon [33592]
	* main/trunk/greenstone3/web/interfaces/default/transform/pages/query.xsl (modified)

	reindented the file

Mon, 21 Oct 2019 22:51:11 GMT kjdon [33591]
	* main/trunk/greenstone3/web/WEB-INF/classes/interface_default.properties (modified)

	added in some strings for 'this collection contains x documents and ...

Mon, 21 Oct 2019 22:12:22 GMT kjdon [33590]
	* main/trunk/greenstone3/web/sites/localsite/collect/lucene-jdbm-demo/etc/collectionConfig.xml (modified)

	added 'this collection contains X documents and was last built Y days ...

Mon, 21 Oct 2019 08:45:10 GMT cpb16 [33589]
	* other-projects/is-sheet-music-encore/trunk/COMPX520-MAP-DOWNLOADER-PNG.sh (added)
	* other-projects/is-sheet-music-encore/trunk/COMPX520-MAP-RUN-PNG-hi-res.sh (added)
	* other-projects/is-sheet-music-encore/trunk/EndToEndSystem.sh (modified)
	* other-projects/is-sheet-music-encore/trunk/FormattedListForAppendix (added)
	* other-projects/is-sheet-music-encore/trunk/FormattedListForAppendix/AppendixFormattedListGenerator.class (added)
	* other-projects/is-sheet-music-encore/trunk/FormattedListForAppendix/AppendixFormattedListGenerator.java (added)
	* other-projects/is-sheet-music-encore/trunk/FormattedListForAppendix/BatchFORMATTED.txt (added)
	* other-projects/is-sheet-music-encore/trunk/FormattedListForAppendix/BookIDList.txt (added)
	* other-projects/is-sheet-music-encore/trunk/FormattedListForAppendix/FORMATTEDBookIDList.txt (added)
	* other-projects/is-sheet-music-encore/trunk/FormattedListForAppendix/FORMATTEDMusicIDList.txt (added)
	* other-projects/is-sheet-music-encore/trunk/FormattedListForAppendix/FORMATTEDlSerialIDList.txt (added)
	* other-projects/is-sheet-music-encore/trunk/FormattedListForAppendix/Makefile (added)
	* other-projects/is-sheet-music-encore/trunk/FormattedListForAppendix/MapIDList.txt (added)
	* other-projects/is-sheet-music-encore/trunk/FormattedListForAppendix/MusicIDList.txt (added)
	* other-projects/is-sheet-music-encore/trunk/FormattedListForAppendix/SerialIDList.txt (added)
	* other-projects/is-sheet-music-encore/trunk/Makefile (modified)
	* other-projects/is-sheet-music-encore/trunk/image-identification-dev-02/#test.txt# (added)
	* other-projects/is-sheet-music-encore/trunk/image-identification-dev-02/.idea/workspace.xml (modified)
	* other-projects/is-sheet-music-encore/trunk/image-identification-dev-02/000 Inverse Binarized Original.jpg (added)
	* other-projects/is-sheet-music-encore/trunk/image-identification-dev-02/001 De-noise.jpg (added)
	* other-projects/is-sheet-music-encore/trunk/image-identification-dev-02/002 heal objects in mask.jpg (added)
	* other-projects/is-sheet-music-encore/trunk/image-identification-dev-02/003 Isolate large.jpg (added)
	* other-projects/is-sheet-music-encore/trunk/image-identification-dev-02/100 Large Items Removed.jpg (added)
	* other-projects/is-sheet-music-encore/trunk/image-identification-dev-02/202 heal objects in mask.jpg (added)
	* other-projects/is-sheet-music-encore/trunk/image-identification-dev-02/203 Open.jpg (added)
	* other-projects/is-sheet-music-encore/trunk/image-identification-dev-02/204 Dilate.jpg (added)
	* other-projects/is-sheet-music-encore/trunk/image-identification-dev-02/205 Close Again (Final).jpg (added)
	* other-projects/is-sheet-music-encore/trunk/image-identification-dev-02/4000 Rect found.jpg (added)
	* other-projects/is-sheet-music-encore/trunk/image-identification-dev-02/houghtest-bin.jpg (added)
	* other-projects/is-sheet-music-encore/trunk/image-identification-dev-02/houghtest-lines.jpg (added)
	* other-projects/is-sheet-music-encore/trunk/image-identification-dev-02/image-identification-development/src/Main.java (modified)
	* other-projects/is-sheet-music-encore/trunk/image-identification-dev-02/image-identification-development/src/MainMorph.java (modified)
	* other-projects/is-sheet-music-encore/trunk/image-identification-dev-02/original.jpg (added)
	* other-projects/is-sheet-music-encore/trunk/image-identification-dev-02/out/production/image-identification-dev-02/Main.class (modified)
	* other-projects/is-sheet-music-encore/trunk/image-identification-dev-02/out/production/image-identification-dev-02/MainMorph.class (modified)
	* other-projects/is-sheet-music-encore/trunk/image-identification-dev-02/test.zip (added)
	* other-projects/is-sheet-music-encore/trunk/image-identification-development/Makefile~ (added)
	* other-projects/is-sheet-music-encore/trunk/image-identification-development/backup (added)
	* other-projects/is-sheet-music-encore/trunk/image-identification-development/backup/MainBackup.java (added)
	* other-projects/is-sheet-music-encore/trunk/image-identification-development/backup/MainHoughLine.java (added)
	* other-projects/is-sheet-music-encore/trunk/image-identification-development/backup/MainWithOldComments.java (added)
	* other-projects/is-sheet-music-encore/trunk/image-identification-terminal/Makefile (modified)
	* other-projects/is-sheet-music-encore/trunk/image-identification-terminal/javaAccuracyCalculator.class (modified)
	* other-projects/is-sheet-music-encore/trunk/image-identification-terminal/javaAccuracyCalculator.java (modified)
	* other-projects/is-sheet-music-encore/trunk/image-identification-terminal/javaClassifierComparison.java (modified)
	* other-projects/is-sheet-music-encore/trunk/image-identification-terminal/runClassifer.sh (modified)

	final01. Need Map results still

Fri, 18 Oct 2019 10:20:09 GMT ak19 [33588]
	* gs3-extensions/maori-lang-detection/models-trainingdata-and-sampletxts/mri-sent_trained.bin (modified)

	Committing the MRI sentence model that I'm actually using, the one in ...

Fri, 18 Oct 2019 10:16:25 GMT ak19 [33587]
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/MRIWebPageStats.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/MaoriTextDetector.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpProcessor.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/TextLanguageDetector.java (modified)

	1. Better stats reporting on crawled sites: not just if a page was in ...

Fri, 18 Oct 2019 09:20:06 GMT ak19 [33586]
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/MaoriTextDetector.java (modified)
	* gs3-extensions/maori-lang-detection/src/org/greenstone/atea/TextLanguageDetector.java (added)

	Refactored MaoriTextDetector.java class into more general ...

Fri, 18 Oct 2019 08:41:32 GMT ak19 [33585] * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/MaoriTextDetector.java (modified) Much simpler way of using sentence and language detection model to ... Fri, 18 Oct 2019 08:20:39 GMT ak19 [33584] * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/MaoriTextDetector.java (modified) Committing experimental version 2 using the sentence detector model, ... Fri, 18 Oct 2019 08:20:18 GMT ak19 [33583] * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/MaoriTextDetector.java (modified) Committing experimental version 1 using the sentence detector model, ... Thu, 17 Oct 2019 10:12:38 GMT ak19 [33582] * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/CCWETProcessor.java (modified) * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/MRIWebPageStats.java (added) * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpProcessor.java (modified) * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/TextDumpPage.java (modified) NutchTextDumpProcessor prints each crawled site's stats: number of ... Thu, 17 Oct 2019 08:53:20 GMT ak19 [33581] * gs3-extensions/maori-lang-detection/bin/script/gen_SentenceDetection_model.sh (modified) Minor fix. Noticed when looking for work I did on MRI sentence detection Thu, 17 Oct 2019 08:44:46 GMT ak19 [33580] * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpProcessor.java (modified) * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/TextDumpPage.java (modified) Finally fixed the thus-far identified bugs when parsing dump.txt. Thu, 17 Oct 2019 08:05:21 GMT ak19 [33579] * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpProcessor.java (modified) * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/TextDumpPage.java (modified) Debugging. Solved one problem. 
Thu, 17 Oct 2019 06:31:53 GMT ak19 [33578] * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpProcessor.java (modified) * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/TextDumpPage.java (modified) Corrections for compiling the 2 new classes. Thu, 17 Oct 2019 06:12:15 GMT ak19 [33577] * gs3-extensions/maori-lang-detection/src/org/greenstone/atea/MaoriTextDetector.java (modified) Forgot to adjust usage statement to say that silent mode was already ...