source: other-projects/maori-lang-detection/conf @ 33804

Name                                     Size     Rev    Age      Author  Last Change
../
config.properties.in                     1.3 KB   33643  5 years  ak19    Brought the template log4j.properties.in back up to speed. I forgot it …
GeoLiteCity.dat                          19.6 MB  33603  5 years  ak19    Incorporating Dr Nichols suggestion to help weed out product sites: if …
keep-since-not-product-sites.txt         69.7 KB  33625  5 years  ak19    A file listing domains with seedurls containing /mi(/) that are …
log4j.properties.in                      2.6 KB   33643  5 years  ak19    Brought the template log4j.properties.in back up to speed. I forgot it …
possible-product-sites.txt               46.0 KB  33625  5 years  ak19    A file listing domains with seedurls containing /mi(/) that are …
sites-too-big-to-exhaustively-crawl.txt  11.2 KB  33666  5 years  ak19    Having finished sending all the crawl data to mongodb 1. Recrawled the …
url-blacklist-filter.txt                 3.1 KB   33800  4 years  ak19    Removed an adult site from crawled contents and added its url to …
url-greylist-filter.txt                  1.7 KB   33569  5 years  ak19    1. batchcrawl.sh now does what it should have from the start, which is …
url-whitelist-filter.txt                 1.3 KB   33604  5 years  ak19    1. Better output into possible-product-sites.txt including the …