#
# ChangeLog for other-projects/maori-lang-detection/conf
#
# Generated by Trac 1.4.2
# 2024-06-07T23:41:33+12:00

Wed, 05 Feb 2020 05:48:33 GMT ak19 [33904]
    * other-projects/maori-lang-detection/conf/sites-too-big-to-exhaustively-crawl.txt (modified)
    * other-projects/maori-lang-detection/conf/url-greylist-filter.txt (modified)
    * other-projects/maori-lang-detection/crawledNode6.tar (modified)
    * other-projects/maori-lang-detection/to_crawl.tar.gz (modified)

    Shouldn't greylist anglican.org, as this prevented crawling of ...

Mon, 13 Jan 2020 06:45:21 GMT ak19 [33823]
    * other-projects/maori-lang-detection/MoreReading/mongodb.txt (modified)
    * other-projects/maori-lang-detection/conf/url-blacklist-filter.txt (modified)
    * other-projects/maori-lang-detection/mongodb-data (added)
    * other-projects/maori-lang-detection/mongodb-data/1a_counts_miInUrlPath.json (added)
    * other-projects/maori-lang-detection/mongodb-data/1a_geojson-features_miInUrlPath.json (added)
    * other-projects/maori-lang-detection/mongodb-data/1a_multipoint_miInUrlPath.json (added)
    * other-projects/maori-lang-detection/mongodb-data/1b_counts_noMiInUrlPath.json (added)
    * other-projects/maori-lang-detection/mongodb-data/1b_geojson-features_noMiInUrlPath.json (added)
    * other-projects/maori-lang-detection/mongodb-data/1b_multipoint_noMiInUrlPath.json (added)
    * other-projects/maori-lang-detection/mongodb-data/1counts_allCrawledSites.json (added)
    * other-projects/maori-lang-detection/mongodb-data/1geojson-features_allCrawledSites.json (added)
    * other-projects/maori-lang-detection/mongodb-data/1map_allCrawledSites.png (added)
    * other-projects/maori-lang-detection/mongodb-data/1multipoint_allCrawledSites.json (added)
    * other-projects/maori-lang-detection/mongodb-data/2counts_sitesWithPagesInMRI.json (added)
    * other-projects/maori-lang-detection/mongodb-data/2geojson-features_sitesWithPagesInMRI.json (added)
    * other-projects/maori-lang-detection/mongodb-data/2map_sitesWithPagesInMRI.png (added)
    * other-projects/maori-lang-detection/mongodb-data/2multipoint_sitesWithPagesInMRI.json (added)
    * other-projects/maori-lang-detection/mongodb-data/3counts_sitesWithPagesContainingMRI.json (added)
    * other-projects/maori-lang-detection/mongodb-data/3geojson-features_sitesWithPagesContainingMRI.json (added)
    * other-projects/maori-lang-detection/mongodb-data/3map_sitesWithPagesContainingMRI.png (added)
    * other-projects/maori-lang-detection/mongodb-data/3multipoint_sitesWithPagesContainingMRI.json (added)
    * other-projects/maori-lang-detection/mongodb-data/4counts_tentativeNonProductSites.json (added)
    * other-projects/maori-lang-detection/mongodb-data/4geojson-features_tentativeNonProductSites.json (added)
    * other-projects/maori-lang-detection/mongodb-data/4map_exclTentativeAutotranslatedSites.png (added)
    * other-projects/maori-lang-detection/mongodb-data/4multipoint_tentativeNonProductSites.json (added)
    * other-projects/maori-lang-detection/mongodb-data/5counts_tentativeNonProductSites1.json (added)
    * other-projects/maori-lang-detection/mongodb-data/5geojson-features_tentativeNonProductSites1.json (added)
    * other-projects/maori-lang-detection/mongodb-data/5map_exclTentativeAutotranslatedSites1.png (added)
    * other-projects/maori-lang-detection/mongodb-data/5multipoint_tentativeNonProductSites1.json (added)
    * other-projects/maori-lang-detection/mongodb-data/6counts_nonProductSites1_manualShortlist.json (added)
    * other-projects/maori-lang-detection/mongodb-data/6geojson-features_nonProductSites1_manualShortlist.json (added)
    * other-projects/maori-lang-detection/mongodb-data/6map_exclAutotranslatedSites1_manualShortlist.png (added)
    * other-projects/maori-lang-detection/mongodb-data/6multipoint_nonProductSites1_manualShortlist.json (added)

    Recommitting mongo-data folder with renamed files with numbering.

Wed, 18 Dec 2019 08:36:07 GMT ak19 [33812]
    * other-projects/maori-lang-detection/conf/countrycodes.json (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/CountryCodeCountsMapData.java (modified)

    Better handling of multi-line comment symbols, so I can now include ...

Fri, 13 Dec 2019 07:08:14 GMT ak19 [33805]
    * other-projects/maori-lang-detection/conf/countrycodes.json (moved)
    * other-projects/maori-lang-detection/mongodb-data/countrycodes1.json (deleted)
    * other-projects/maori-lang-detection/mongodb-data/counts_sitesWithPagesInMRI.json (added)
    * other-projects/maori-lang-detection/mongodb-data/geojson-features_sitesWithPagesInMRI.json (added)
    * other-projects/maori-lang-detection/mongodb-data/map_sitesWithPagesInMRI.png (added)
    * other-projects/maori-lang-detection/mongodb-data/multipoint_sitesWithPagesInMRI.json (added)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/CountryCodeCountsMapData.java (modified)

    1. Moving the static countrycodes.json file to conf folder and ...

Thu, 12 Dec 2019 05:04:10 GMT ak19 [33800]
    * other-projects/maori-lang-detection/MoreReading/mongodb.txt (modified)
    * other-projects/maori-lang-detection/conf/url-blacklist-filter.txt (modified)
    * other-projects/maori-lang-detection/crawledNode2.tar (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/CountryCodeCountsMapData.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpToMongoDB.java (modified)

    Removed an adult site from crawled contents and added its url to ...

Wed, 13 Nov 2019 10:08:37 GMT ak19 [33666]
    * other-projects/maori-lang-detection/MoreReading/mongodb.txt (modified)
    * other-projects/maori-lang-detection/conf/sites-too-big-to-exhaustively-crawl.txt (modified)
    * other-projects/maori-lang-detection/crawledNode6.tar (modified)
    * other-projects/maori-lang-detection/hdfs-cc-work/conf/regex-urlfilter.GS_TEMPLATE (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/CCWETProcessor.java (modified)
    * other-projects/maori-lang-detection/src/org/greenstone/atea/Utility.java (modified)
    * other-projects/maori-lang-detection/to_crawl.tar.gz (added)

    Having finished sending all the crawl data to mongodb 1. Recrawled ...

Sun, 10 Nov 2019 22:46:48 GMT ak19 [33643]
    * other-projects/maori-lang-detection/conf/config.properties.in (moved)
    * other-projects/maori-lang-detection/conf/log4j.properties (deleted)
    * other-projects/maori-lang-detection/conf/log4j.properties.in (modified)

    Brought the template log4j.properties.in back up to speed. I forgot ...

Sun, 10 Nov 2019 20:38:55 GMT ak19 [33635]
    * other-projects/maori-lang-detection (moved)

    Maori-language-detection doesn't use Greenstone 3 at present, it's ...

Tue, 05 Nov 2019 08:58:44 GMT ak19 [33625]
    * gs3-extensions/maori-lang-detection/conf/keep-since-not-product-sites.txt (added)
    * gs3-extensions/maori-lang-detection/conf/possible-product-sites.txt (added)

    A file listing domains with seedurls containing /mi(/) that are ...