#
# ChangeLog for /
#
# Generated by Trac 1.4.2
# 2024-06-10T15:03:41+12:00

Fri, 28 Feb 2020 09:07:29 GMT ak19 [33986]
    * other-projects/maori-lang-detection/mongodb-data/piechart_data.txt (modified)

    Dr Bainbridge investigated the original data set more

Thu, 27 Feb 2020 08:49:00 GMT ak19 [33985]
    * other-projects/maori-lang-detection/mongodb-data/piechart_data.txt (added)

    Data to back the piechart I need to make that will illustrate how we ...

Thu, 27 Feb 2020 08:44:06 GMT ak19 [33984]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/AllDomainCount.java (added)

    Simple class to summarise some basic counts of the input common crawl ...

Thu, 27 Feb 2020 07:26:53 GMT ak19 [33983]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/CCWETProcessor.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpToMongoDB.java (modified)

    More sensible name for method which had too long kept its old name ...

Wed, 26 Feb 2020 08:59:55 GMT ak19 [33982]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/SummaryTool.java (modified)

    SummaryTool.java now processed the handcrafted UNIQUE domains counts ...

Wed, 26 Feb 2020 08:19:23 GMT ak19 [33981]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/CountryCodeCountsMapData.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/SummaryTool.java (modified)

    As Dr Bainbridge suggested, code now opens a new firefox tab with a ...

Wed, 26 Feb 2020 08:11:58 GMT ak19 [33980]
    * other-projects/maori-lang-detection/mongodb-data/6counts_sitesWithPagesContainingMRI_manualShortlist.json (modified)

    Additional comments

Wed, 26 Feb 2020 08:00:38 GMT ak19 [33979]
    * other-projects/maori-lang-detection/mongodb-data/6counts_sitesWithPagesContainingMRI_manualShortlist.json (modified)

    Clearly stating that counts are of unique domains

Wed, 26 Feb 2020 06:57:05 GMT ak19 [33978]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/CountryCodeCountsMapData.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/SummaryTool.java (modified)

    Opens all geoJSON maps in new tabs instead of waiting for user to ...

Wed, 26 Feb 2020 05:37:08 GMT ak19 [33977]
    * other-projects/maori-lang-detection/mongodb-data/random260_results.txt (modified)

    Added something on precision vs recall being applicable to our ...

Wed, 26 Feb 2020 05:28:09 GMT ak19 [33976]
    * other-projects/maori-lang-detection/mongodb-data/random260_results.txt (modified)

    Adding in what I could remember of Dr Bainbridge's statement about ...

Tue, 25 Feb 2020 01:46:51 GMT kjdon [33975]
    * main/trunk/greenstone3/build.xml (modified)

    some mods to do with allowing multiple oaiservers. need OAIConfig- ...

Tue, 25 Feb 2020 01:14:52 GMT kjdon [33974]
    * main/trunk/greenstone3/build.properties.svn (modified)

    added in new oai.servlets field - if you want to run two oaiservlets, ...

Tue, 25 Feb 2020 01:01:18 GMT kjdon [33973]
    * main/trunk/greenstone3/web/WEB-INF/web.xml (modified)

    tidied up the file a bit. added new servlet_url param to oaiserver - ...

Tue, 25 Feb 2020 00:47:48 GMT kjdon [33972]
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/service/OAIPMH.java (modified)

    fixed a typo in a comment

Tue, 25 Feb 2020 00:47:12 GMT kjdon [33971]
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/OAIServer.java (modified)

    get servlet_url param and pass to getOAIConfigXML, as now the files ...

Tue, 25 Feb 2020 00:46:03 GMT kjdon [33970]
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/util/OAIXML.java (modified)

    changed OAIConfig naming to OAIConfig-oaiserver.xml - so multiple ...

Tue, 25 Feb 2020 00:39:10 GMT kjdon [33969]
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/util/OAIXML.java (modified)

    we no longer use OAIConfig.xml as the filename, now we use eg ...

Tue, 25 Feb 2020 00:37:20 GMT kjdon [33968]
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/core/OAIMessageRouter.java (modified)

    pass in oai_config from server, rather than reading it in itself

Tue, 25 Feb 2020 00:36:08 GMT kjdon [33967]
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/service/OAIPMH.java (modified)

    you might want to change the oaiserver url, eg if you have 2 oai ...

Fri, 21 Feb 2020 08:00:55 GMT ak19 [33966]
    * other-projects/maori-lang-detection/mongodb-data/random260.ods (added)
    * other-projects/maori-lang-detection/mongodb-data/random260_manualList_globalDomains_whereAPageContainsMRI.txt (modified)
    * other-projects/maori-lang-detection/mongodb-data/random260_results.txt (added)

    Added the origSequence and basicDomain columns to the random 260 web ...

Fri, 21 Feb 2020 07:59:07 GMT ak19 [33965]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/ManualURLInspection.java (modified)

    1. Adding a basicDomain column (stripped of http/https and www ...

Fri, 21 Feb 2020 06:57:38 GMT ak19 [33964]
    * other-projects/maori-lang-detection/mongodb-data/random260_manualList_globalDomains_whereAPageContainsMRI.txt (modified)

    2 records were missing a value for the qualityLevel column.

Thu, 20 Feb 2020 09:12:43 GMT ak19 [33963]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/ManualURLInspection.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/MongoDBQueryer.java (modified)

    Added a new helper method to MongoDBQueryer.java to add numPagesInMRI ...

Thu, 20 Feb 2020 09:07:20 GMT ak19 [33962]
    * other-projects/maori-lang-detection/mongodb-data/random260_manualList_globalDomains_whereAPageContainsMRI.txt (modified)

    2 fields changed, as one was missed out and the other incorrectly ...

Thu, 20 Feb 2020 07:24:19 GMT ak19 [33961]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/ManualURLInspection.java (modified)

    New category, LINK_TEXT, introduced for the random web page URL samples.

Thu, 20 Feb 2020 07:22:38 GMT ak19 [33960]
    * other-projects/maori-lang-detection/mongodb-data/random260_manualList_globalDomains_whereAPageContainsMRI.txt (modified)

    Reviewed all the random sample web page URLs marked ...

Thu, 20 Feb 2020 07:06:41 GMT ak19 [33959]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/CountryCodeCountsMapData.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/ManualURLInspection.java (modified)

    URIEncoding the mapData makes it unparseable by geojson.io

Thu, 20 Feb 2020 06:32:28 GMT ak19 [33958]
    * main/trunk/greenstone3/web/interfaces/default/transform/gslib.xsl (modified)
    * main/trunk/greenstone3/web/interfaces/default/transform/pages/depositor_home.xsl (modified)
    * main/trunk/greenstone3/web/interfaces/default/transform/pages/home.xsl (modified)

    There were other xsl files using the original depositorTitleAndLink ...

Thu, 20 Feb 2020 06:24:26 GMT ak19 [33957]
    * main/trunk/greenstone3/web/interfaces/default/transform/gslib.xsl (modified)
    * main/trunk/greenstone3/web/interfaces/default/transform/pages/depositor_home.xsl (modified)
    * main/trunk/greenstone3/web/interfaces/default/transform/pages/home.xsl (modified)

    1. depositor related interface display modified to work with recent ...

Thu, 20 Feb 2020 05:28:54 GMT ak19 [33956]
    * main/trunk/greenstone3/web/interfaces/default/transform/pages/home.xsl (modified)

    Related to commit 33953: made lots of accidental commits in rev ...

Thu, 20 Feb 2020 05:26:04 GMT ak19 [33955]
    * main/trunk/greenstone3/web/sites/localsite/collect/lucene-jdbm-demo/etc/collectionConfig.xml (modified)
    * main/trunk/greenstone3/web/sites/localsite/collect/lucene-jdbm-demo/etc/oai-inf.jdb (modified)

    Undoing accidental commit of unintended files.

Thu, 20 Feb 2020 05:21:55 GMT ak19 [33954]
    * main/trunk/greenstone3/web/etc/usersDB/log/log.ctrl (modified)
    * main/trunk/greenstone3/web/etc/usersDB/log/log1.dat (modified)
    * main/trunk/greenstone3/web/etc/usersDB/log/logmirror.ctrl (modified)
    * main/trunk/greenstone3/web/etc/usersDB/seg0/c10.dat (modified)
    * main/trunk/greenstone3/web/etc/usersDB/seg0/c230.dat (modified)
    * main/trunk/greenstone3/web/etc/usersDB/seg0/c340.dat (modified)
    * main/trunk/greenstone3/web/etc/usersDB/seg0/c351.dat (modified)

    Accidentally committed with other files. Undoing.

Thu, 20 Feb 2020 05:19:57 GMT ak19 [33953]
    * main/trunk/greenstone3/web/WEB-INF/classes/interface_default.properties (modified)
    * main/trunk/greenstone3/web/etc/usersDB/log/log.ctrl (modified)
    * main/trunk/greenstone3/web/etc/usersDB/log/log1.dat (modified)
    * main/trunk/greenstone3/web/etc/usersDB/log/logmirror.ctrl (modified)
    * main/trunk/greenstone3/web/etc/usersDB/seg0/c10.dat (modified)
    * main/trunk/greenstone3/web/etc/usersDB/seg0/c230.dat (modified)
    * main/trunk/greenstone3/web/etc/usersDB/seg0/c340.dat (modified)
    * main/trunk/greenstone3/web/etc/usersDB/seg0/c351.dat (modified)
    * main/trunk/greenstone3/web/interfaces/default/transform/gslib.xsl (modified)
    * main/trunk/greenstone3/web/interfaces/default/transform/pages/home.xsl (modified)
    * main/trunk/greenstone3/web/sites/localsite/collect/lucene-jdbm-demo/etc/collectionConfig.xml (modified)
    * main/trunk/greenstone3/web/sites/localsite/collect/lucene-jdbm-demo/etc/oai-inf.jdb (modified)

    Depositor link not used

Tue, 18 Feb 2020 10:35:35 GMT ak19 [33952]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/ManualURLInspection.java (modified)

    Minor changes for processing

Tue, 18 Feb 2020 10:33:29 GMT ak19 [33951]
    * other-projects/maori-lang-detection/mongodb-data/random260_manualList_globalDomains_whereAPageContainsMRI.txt (modified)

    Reviewed the qualityLevel column where LITTLE_TEXT was assigned.

Tue, 18 Feb 2020 10:28:55 GMT ak19 [33950]
    * other-projects/maori-lang-detection/mongodb-data/random260_manualList_globalDomains_whereAPageContainsMRI.txt (modified)

    Reviewed the qualityLevel column where MIXED_TEXT was assigned.

Tue, 18 Feb 2020 10:22:53 GMT ak19 [33949]
    * other-projects/maori-lang-detection/mongodb-data/random260_manualList_globalDomains_whereAPageContainsMRI.txt (modified)

    Reviewed the qualityLevel column where NAV was assigned.

Tue, 18 Feb 2020 09:56:44 GMT ak19 [33948]
    * other-projects/maori-lang-detection/mongodb-data/random260_manualList_globalDomains_whereAPageContainsMRI.txt (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/ManualURLInspection.java (modified)

    Reviewed the random sampled web page URLs marked as ...

Tue, 18 Feb 2020 09:07:33 GMT ak19 [33947]
    * other-projects/maori-lang-detection/mongodb-data/random260_manualList_globalDomains_whereAPageContainsMRI.txt (modified)

    Some more questionmarked field values assigned.

Tue, 18 Feb 2020 08:58:42 GMT ak19 [33946]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/ManualURLInspection.java (modified)

    1. New function to handle user input assigning the newly introduced ...

Tue, 18 Feb 2020 08:48:14 GMT ak19 [33945]
    * other-projects/maori-lang-detection/mongodb-data/random260_manualList_globalDomains_whereAPageContainsMRI.txt (modified)

    Added a 4th column for all 260 sample web page URLs and have used the ...

Tue, 18 Feb 2020 03:44:21 GMT ak19 [33944]
    * other-projects/maori-lang-detection/mongodb-data/random260_manualList_globalDomains_whereAPageContainsMRI.txt (modified)

    Added the isReallyInMRI column after manually inspecting the ...

Tue, 18 Feb 2020 02:56:07 GMT davidb [33943]
    * main/trunk/greenstone3/src/packages/javagdbm/java/Makefile.in (modified)

    Further tweaking of javah check after it failed to work on Bedrock LSB

Tue, 18 Feb 2020 02:55:32 GMT davidb [33942]
    * main/trunk/greenstone2/common-src/indexers/mg/java/org/greenstone/mg/Makefile.in (modified)
    * main/trunk/greenstone2/common-src/indexers/mgpp/java/org/greenstone/mgpp/Makefile.in (modified)

    Further tweaking of javah check after it failed to work on Bedrock LSB

Tue, 18 Feb 2020 02:18:00 GMT ak19 [33941]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/ManualURLInspection.java (modified)

    1. Uppercase 3rd field (Y/N/? field) read back in from file before ...

Mon, 17 Feb 2020 09:16:40 GMT ak19 [33940]
    * other-projects/maori-lang-detection/lib/commons-csv-1.7.jar (deleted)
    * other-projects/maori-lang-detection/lib/commons-csv-1.8.jar (added)
    * other-projects/maori-lang-detection/mongodb-data/random260_manualList_globalDomains_whereAPageContainsMRI.txt (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/ManualURLInspection.java (added)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/MongoDBQueryer.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/SummaryTool.java (modified)

    1. In order to make it easier to do the manual work of inspecting 260 ...

Mon, 17 Feb 2020 03:22:08 GMT ak19 [33939]
    * other-projects/maori-lang-detection/mongodb-data/isMRI_full_manualList_globalDomains_whereAPageContainsMRI.txt (added)
    * other-projects/maori-lang-detection/mongodb-data/random255_domainsNZ_IsMRI.txt (deleted)
    * other-projects/maori-lang-detection/mongodb-data/random260_manualList_globalDomains_whereAPageContainsMRI.txt (added)

    1. Old random samples file doesn't apply as we're not sampling by ...

Mon, 17 Feb 2020 03:10:00 GMT ak19 [33938]
    * other-projects/maori-lang-detection/conf/log4j.properties.in (modified)
    * other-projects/maori-lang-detection/lib/gutil.jar (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/CountryCodeCountsMapData.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/SummaryTool.java (modified)

    1. Don't regenerate random sample of web page urls and full web page ...

Mon, 17 Feb 2020 03:06:40 GMT ak19 [33937]
    * other-projects/maori-lang-detection/mongodb-data/6counts_sitesWithPagesContainingMRI_manualShortlist.json (added)

    New counts of manual sites after reingesting into MongoDB. Forgot to ...

Mon, 17 Feb 2020 03:05:55 GMT ak19 [33936]
    * other-projects/maori-lang-detection/mongodb-data/6counts_sitesWithPagesContainingMRI_manualShortlist.jsonOLD (moved)
    * other-projects/maori-lang-detection/mongodb-data/ManualShortlisting2_afterMongoDBReingest.txt (modified)

    Renaming old file to place with new counts after reingesting into ...

Sun, 16 Feb 2020 05:16:39 GMT davidb [33935]
    * main/trunk/greenstone3/build.xml (modified)

    Additional check added into get-isis target

Sun, 16 Feb 2020 04:34:29 GMT davidb [33934]
    * main/trunk/greenstone3/src/packages/javagdbm/java/au/com/pharos/gdbm/GdbmFile.java (modified)

    Removal of static code block calling ancient/deprecated static ...

Sun, 16 Feb 2020 01:19:46 GMT davidb [33933]
    * main/trunk/greenstone2/common-src/indexers/mg/java/org/greenstone/mg/Makefile.in (modified)
    * main/trunk/greenstone2/common-src/indexers/mgpp/java/org/greenstone/mgpp/Makefile.in (modified)

    Changed 8-spaces to tag chars in Makefile.in. Original problem ...

Sat, 15 Feb 2020 06:14:24 GMT davidb [33932]
    * main/trunk/greenstone3/gs3-setup.bat (modified)

    Commented out Java version warning message, as it presents as ...

Sat, 15 Feb 2020 06:10:54 GMT davidb [33931]
    * main/trunk/greenstone3/gs3-setup.sh (modified)

    Two changes to setup file. The first was to move the test for ant to ...

Sat, 15 Feb 2020 06:00:05 GMT davidb [33930]
    * main/trunk/search4j/libsearch4j.cpp (modified)

    Code used to assume that major number was a single digit, as in 1.6 ...

Sat, 15 Feb 2020 05:57:27 GMT davidb [33929]
    * main/trunk/greenstone3/src/packages/javagdbm/java/Makefile.in (modified)

    Newer JDKs don't have javah => make file change that takes account of ...

Sat, 15 Feb 2020 05:55:27 GMT davidb [33928]
    * main/trunk/greenstone3/src/packages/javagdbm/aclocal.m4 (added)
    * main/trunk/greenstone3/src/packages/javagdbm/configure (modified)
    * main/trunk/greenstone3/src/packages/javagdbm/configure.in (modified)

    Streamlining of how test for JDK/javac is done

Sat, 15 Feb 2020 01:57:35 GMT davidb [33927]
    * main/trunk/greenstone2/common-src/indexers/mg/java/org/greenstone/mg/Makefile.in (modified)
    * main/trunk/greenstone2/common-src/indexers/mgpp/java/org/greenstone/mgpp/Makefile.in (modified)

    Reworking of javah test

Fri, 14 Feb 2020 10:03:21 GMT ak19 [33926]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/CountryCodeCountsMapData.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/SummaryTool.java (modified)

    Investigated some other options for screen capturing and Google ...

Fri, 14 Feb 2020 07:41:20 GMT ak19 [33925]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/CountryCodeCountsMapData.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/SummaryTool.java (modified)

    1. Bugfix: oversight, should return uri encoded URL for mapData, ...

Fri, 14 Feb 2020 06:22:40 GMT ak19 [33924]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/SummaryTool.java (modified)

    Adding in Dr Bainbridge's command to check the JSON generated is ...

Fri, 14 Feb 2020 05:45:24 GMT davidb [33923]
    * main/trunk/greenstone2/common-src/packages/jdbm/README.txt (added)
    * main/trunk/greenstone2/common-src/packages/jdbm/gs-jdbm-1.0.tar.gz (modified)

    Removed non-UTF8 valid char from comment; regenerated tar file

Fri, 14 Feb 2020 05:13:49 GMT davidb [33922]
    * main/trunk/model-sites-dev/multimodal-mdl/README.txt (added)

    Notes about using this site

Fri, 14 Feb 2020 05:11:22 GMT davidb [33921]
    * main/trunk/greenstone2/common-src/indexers/mg/java/org/greenstone/mg/Makefile.in (modified)
    * main/trunk/greenstone2/common-src/indexers/mgpp/java/org/greenstone/mgpp/Makefile.in (modified)

    Newer Java's don't have 'javah' any more. The functionality has been ...

Fri, 14 Feb 2020 03:55:49 GMT davidb [33920]
    * gs2-extensions/imagemagick/trunk/src/packages/CASCADE-MAKE/GS.sh (modified)

    Found to be needed when compiling up on a Google Compute Engine (GCE) ...

Thu, 13 Feb 2020 09:40:41 GMT ak19 [33919]
    * other-projects/maori-lang-detection/MoreReading/mongodb.txt (modified)
    * other-projects/maori-lang-detection/lib/jna-platform.jar (added)
    * other-projects/maori-lang-detection/lib/jna.jar (added)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/CountryCodeCountsMapData.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/MongoDBQueryer.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/SummaryTool.java (modified)

    SummaryTool now uses the CountryCodeCountsMapData.java class to ...

Thu, 13 Feb 2020 06:34:14 GMT ak19 [33918]
    * other-projects/maori-lang-detection/mongodb-data/ManualShortlisting2_afterMongoDBReingest.txt (modified)
    * other-projects/maori-lang-detection/mongodb-data/manualList_globalDomains_whereAPageContainsMRI.txt (modified)

    Country codes added to each domain's URL of the manual site/domain ...

Thu, 13 Feb 2020 05:18:13 GMT ak19 [33917]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/MongoDBQueryer.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/SummaryTool.java (modified)

    Added some better reporting when confirming sample size was correct

Thu, 13 Feb 2020 04:42:11 GMT ak19 [33916]
    * other-projects/maori-lang-detection/mongodb-data/ManualShortlisting2_afterMongoDBReingest.txt (modified)

    Updated the rest of the file after reingest

Thu, 13 Feb 2020 04:12:06 GMT ak19 [33915]
    * other-projects/maori-lang-detection/mongodb-data/6counts_nonProductSites1_manualShortlist.json (added)
    * other-projects/maori-lang-detection/mongodb-data/ManualShortlisting2_afterMongoDBReingest.txt (moved)

    Forgot to add a (manual) counts file created last week, and am now ...

Thu, 13 Feb 2020 04:09:07 GMT ak19 [33914]
    * other-projects/maori-lang-detection/MoreReading/mongodb.txt (modified)
    * other-projects/maori-lang-detection/mongodb-data/ManualShortlisting.txt (modified)
    * other-projects/maori-lang-detection/mongodb-data/ManualShortlisting2.txt (modified)
    * other-projects/maori-lang-detection/mongodb-data/manualList_globalDomains_whereAPageContainsMRI.txt (added)

    Shortlisted just the domain sites by country into ...

Wed, 12 Feb 2020 08:27:02 GMT ak19 [33913]
    * other-projects/maori-lang-detection/MoreReading/mongodb.txt (modified)
    * other-projects/maori-lang-detection/hdfs-cc-work/GS_README.TXT (modified)
    * other-projects/maori-lang-detection/mongodb-data/tables.txt (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/MongoDBQueryer.java (modified)

    1. Adjusted table mongodb query statements to be more exact, but same ...

Wed, 12 Feb 2020 06:53:48 GMT ak19 [33912]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/MongoDBQueryer.java (added)

    Forgot to svn add the new MongoDBQueryer.java class with commit ...

Wed, 12 Feb 2020 06:12:42 GMT ak19 [33911]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/MongoDBAccess.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/SummaryTool.java (moved)

    Correct commit message for previous and current commit: 1. After ...

Wed, 12 Feb 2020 06:05:50 GMT ak19 [33910]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/MongoDBAccess.java (modified)

    1. Implementing tables 3 to 5. 2. Rolled back the introduction of the ...

Wed, 12 Feb 2020 06:02:44 GMT ak19 [33909]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/MongoDBAccess.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpToMongoDB.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/WebPageURLsListing.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/morphia/WebsiteInfo.java (modified)

    1. Implementing tables 3 to 5. 2. Rolled back the introduction of the ...

Sun, 09 Feb 2020 20:41:10 GMT kjdon [33908]
    * main/trunk/greenstone3/web/interfaces/default/transform/expand-gsf.xsl (modified)

    meta values are already escaped. Don't want to escape them again ...

Wed, 05 Feb 2020 10:38:57 GMT ak19 [33907]
    * other-projects/maori-lang-detection/mongodb-data/ManualShortlisting2.txt (added)

    See previous commit message. This will be the file with the results ...

Wed, 05 Feb 2020 10:36:37 GMT ak19 [33906]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/MongoDBAccess.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpToMongoDB.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/WebPageURLsListing.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/morphia/WebsiteInfo.java (modified)

    Code is intermediate state. 1. Introduced basicDomain field to ...

Wed, 05 Feb 2020 05:49:16 GMT ak19 [33905]
    * other-projects/maori-lang-detection/MoreReading/mongodb.txt (modified)
    * other-projects/maori-lang-detection/hdfs-cc-work/GS_README.TXT (modified)

    More notes

Wed, 05 Feb 2020 05:48:33 GMT ak19 [33904]
    * other-projects/maori-lang-detection/conf/sites-too-big-to-exhaustively-crawl.txt (modified)
    * other-projects/maori-lang-detection/conf/url-greylist-filter.txt (modified)
    * other-projects/maori-lang-detection/crawledNode6.tar (modified)
    * other-projects/maori-lang-detection/to_crawl.tar.gz (modified)

    Shouldn't greylist anglican.org, as this prevented crawling of ...

Tue, 04 Feb 2020 02:50:43 GMT ak19 [33903]
    * other-projects/maori-lang-detection/journal-paper/MRI_slideNotes.txt (added)

    My notes when preparing for today's meetings. Some of this may be ...

Tue, 04 Feb 2020 00:05:30 GMT kjdon [33902]
    * main/trunk/greenstone2/perllib/classify/AZCompactList.pm (modified)
    * main/trunk/greenstone2/perllib/classify/AZList.pm (modified)
    * main/trunk/greenstone2/perllib/classify/AZSectionList.pm (modified)
    * main/trunk/greenstone2/perllib/classify/DateList.pm (modified)
    * main/trunk/greenstone2/perllib/classify/Hierarchy.pm (modified)
    * main/trunk/greenstone2/perllib/classify/SectionList.pm (modified)
    * main/trunk/greenstone2/perllib/classify/SimpleList.pm (modified)

    pass in new casefold and accentfold options to ...

Tue, 04 Feb 2020 00:04:35 GMT kjdon [33901]
    * main/trunk/greenstone2/perllib/classify/BaseClassifier.pm (modified)

    new casefold_metadata_for_formatting and ...

Tue, 04 Feb 2020 00:03:37 GMT kjdon [33900]
    * main/trunk/greenstone2/perllib/strings.properties (modified)

    BaseClassifier casefold/accentfold options

Tue, 04 Feb 2020 00:03:05 GMT kjdon [33899]
    * main/trunk/greenstone2/perllib/classify/List.pm (modified)

    pass in new casefold and accentfold options (BaseClassifier) to ...

Mon, 03 Feb 2020 23:59:00 GMT kjdon [33898]
    * main/trunk/greenstone2/perllib/sorttools.pm (modified)

    format_metadata_for_sorting now takes two additional args - casefold ...

Mon, 03 Feb 2020 21:06:11 GMT kjdon [33897]
    * main/trunk/greenstone3/src/java/org/greenstone/gsdl3/util/XMLConverter.java (modified)

    elsewhere in the code - GSXML.xmlSafe, we are escaping ' => ' we ...

Mon, 03 Feb 2020 10:29:59 GMT ak19 [33896]
    * other-projects/maori-lang-detection/MoreReading/mongodb.txt (modified)

    Clarification in comments

Mon, 03 Feb 2020 10:20:53 GMT ak19 [33895]
    * other-projects/maori-lang-detection/mongodb-data/5b_counts_containsMRI_groupedByNZorOverseasNoFilter.json (moved)

    Minor rename

Mon, 03 Feb 2020 10:20:33 GMT ak19 [33894]
    * other-projects/maori-lang-detection/mongodb-data/5b_count_containsMRI_groupedByNZorOverseasNoFilter.json (added)
    * other-projects/maori-lang-detection/mongodb-data/5b_geojson-features_containsMRI_groupedByNZorOverseasNoFilter.json (added)
    * other-projects/maori-lang-detection/mongodb-data/5b_map_containsMRI_groupedByNZorOverseasNoFilter.png (added)
    * other-projects/maori-lang-detection/mongodb-data/5b_multipoint_containsMRI_groupedByNZorOverseasNoFilter.json (added)
    * other-projects/maori-lang-detection/mongodb-data/6counts_sitesWithPagesContainingMRI_manualShortlist.json (moved)
    * other-projects/maori-lang-detection/mongodb-data/6geojson-features_sitesWithPagesContainingMRI_manualShortlist.json (moved)
    * other-projects/maori-lang-detection/mongodb-data/6map_sitesWithPagesContainingMRI_manualShortlist.png (moved)
    * other-projects/maori-lang-detection/mongodb-data/6multipoint_sitesWithPagesContainingMRI_manualShortlist.json (moved)
    * other-projects/maori-lang-detection/mongodb-data/tables.txt (modified)

    1. Adding map, counts.json and geo-json files for 5b count of sites ...

Mon, 03 Feb 2020 09:41:47 GMT ak19 [33893]
    * other-projects/maori-lang-detection/mongodb-data/8TableOfNumDetectedVsManualSITESWithMRI.ods (modified)
    * other-projects/maori-lang-detection/mongodb-data/8table_siteCountSummary.png (modified)

    1. Left out region code column. 2. Two more sheets of work in ...

Mon, 03 Feb 2020 09:28:44 GMT ak19 [33892]
    * other-projects/maori-lang-detection/mongodb-data/8TableOfNumDetectedVsManualSITESWithMRI.ods (moved)

    Sheets renamed and spreadsheet renamed

Mon, 03 Feb 2020 09:27:37 GMT ak19 [33891]
    * other-projects/maori-lang-detection/mongodb-data/6table_nonProductSites1_manualShortlist.json (modified)
    * other-projects/maori-lang-detection/mongodb-data/8table_siteCountSummary.png (added)
    * other-projects/maori-lang-detection/mongodb-data/ManualShortlisting.txt (added)
    * other-projects/maori-lang-detection/mongodb-data/TableOfNumDetectedVsManualSITESWithMRI.ods (added)

    Site level detected vs manual inspected data: working shown in file ...

Mon, 03 Feb 2020 07:31:33 GMT ak19 [33890]
    * other-projects/maori-lang-detection/mongodb-data/6table_nonProductSites1_manualShortlist.json (modified)

    Finished going through NZ sites listing of numPagesContainingMRI > 0 ...

Mon, 03 Feb 2020 02:48:40 GMT ak19 [33889]
    * other-projects/maori-lang-detection/mongodb-data/1a_table_miInUrlPath.csv (modified)
    * other-projects/maori-lang-detection/mongodb-data/1a_table_miInUrlPath.png (added)
    * other-projects/maori-lang-detection/mongodb-data/1b_table_noMiInUrlPath.csv (modified)
    * other-projects/maori-lang-detection/mongodb-data/1b_table_noMiInUrlPath.png (added)
    * other-projects/maori-lang-detection/mongodb-data/1table_allCrawledSites.csv (modified)
    * other-projects/maori-lang-detection/mongodb-data/1table_allCrawledSites.png (added)
    * other-projects/maori-lang-detection/mongodb-data/2table_sitesWithPagesInMRI.csv (modified)
    * other-projects/maori-lang-detection/mongodb-data/2table_sitesWithPagesInMRI.png (added)
    * other-projects/maori-lang-detection/mongodb-data/3table_sitesWithPagesContainingMRI.csv (modified)
    * other-projects/maori-lang-detection/mongodb-data/3table_sitesWithPagesContainingMRI.png (added)
    * other-projects/maori-lang-detection/mongodb-data/4table_tentativeNonProductSites.csv (modified)
    * other-projects/maori-lang-detection/mongodb-data/4table_tentativeNonProductSites.png (added)
    * other-projects/maori-lang-detection/mongodb-data/5b_table_containsMRI_groupedByNZorOverseasNoFilter.csv (added)
    * other-projects/maori-lang-detection/mongodb-data/5b_table_containsMRI_groupedByNZorOverseasNoFilter.png (added)
    * other-projects/maori-lang-detection/mongodb-data/5table_tentativeNonProductSites1.csv (modified)
    * other-projects/maori-lang-detection/mongodb-data/5table_tentativeNonProductSites1.png (added)
    * other-projects/maori-lang-detection/mongodb-data/tables.txt (modified)

    1. Additional column: totalPagesAcrossMatchingSites. 2. Screengrab of ...

Mon, 03 Feb 2020 00:08:44 GMT kjdon [33888]
    * main/trunk/greenstone3/web/interfaces/default/transform/expand-gsf.xsl (modified)

    added propertyFile attribute to gsf:interfaceText so that you can ...

Fri, 31 Jan 2020 10:49:11 GMT ak19 [33887]
    * other-projects/maori-lang-detection/src/org/greenstone/atea/MongoDBAccess.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/Utility.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/WebPageURLsListing.java (modified)

    1. Added support for writing out tables in csv format too. 2. Second ...