source: other-projects/maori-lang-detection/conf@34004
Name | Size | Rev | Age | Author | Last Change |
---|---|---|---|---|---|
url-whitelist-filter.txt | 1.3 KB | 33604 | 5 years | | 1. Better output into possible-product-sites.txt including the … |
url-greylist-filter.txt | 1.8 KB | 33904 | 4 years | | Shouldn't greylist anglican.org, as this prevented crawling of … |
url-blacklist-filter.txt | 3.2 KB | 33823 | 4 years | | Recommitting mongo-data folder with renamed files with numbering. |
sites-too-big-to-exhaustively-crawl.txt | 11.3 KB | 33904 | 4 years | | Shouldn't greylist anglican.org, as this prevented crawling of … |
possible-product-sites.txt | 46.0 KB | 33625 | 5 years | | A file listing domains with seedurls containing /mi(/) that are … |
log4j.properties.in | 2.7 KB | 33938 | 4 years | | 1. Don't regenerate random sample of web page urls and full web page … |
keep-since-not-product-sites.txt | 69.7 KB | 33625 | 5 years | | A file listing domains with seedurls containing /mi(/) that are … |
GeoLiteCity.dat | 19.6 MB | 33603 | 5 years | | Incorporating Dr Nichols suggestion to help weed out product sites: if … |
countrycodes.json | 40.7 KB | 33812 | 4 years | | Better handling of multi-line comment symbols, so I can now include … |
config.properties.in | 1.3 KB | 33643 | 5 years | | Brought the template log4j.properties.in back up to speed. I forgot it … |