source: other-projects/maori-lang-detection/conf@33868
Name | Size | Rev | Age | Author | Last Change |
---|---|---|---|---|---|
../ | |||||
url-whitelist-filter.txt | 1.3 KB | 33604 | 5 years | | 1. Better output into possible-product-sites.txt including the … |
config.properties.in | 1.3 KB | 33643 | 4 years | | Brought the template log4j.properties.in back up to speed. I forgot it … |
url-greylist-filter.txt | 1.7 KB | 33569 | 5 years | | 1. batchcrawl.sh now does what it should have from the start, which is … |
log4j.properties.in | 2.6 KB | 33643 | 4 years | | Brought the template log4j.properties.in back up to speed. I forgot it … |
url-blacklist-filter.txt | 3.2 KB | 33823 | 4 years | | Recommitting mongo-data folder with renamed files with numbering. |
sites-too-big-to-exhaustively-crawl.txt | 11.2 KB | 33666 | 4 years | | Having finished sending all the crawl data to mongodb 1. Recrawled the … |
countrycodes.json | 40.7 KB | 33812 | 4 years | | Better handling of multi-line comment symbols, so I can now include … |
possible-product-sites.txt | 46.0 KB | 33625 | 4 years | | A file listing domains with seedurls containing /mi(/) that are … |
keep-since-not-product-sites.txt | 69.7 KB | 33625 | 4 years | | A file listing domains with seedurls containing /mi(/) that are … |
GeoLiteCity.dat | 19.6 MB | 33603 | 5 years | | Incorporating Dr Nichols suggestion to help weed out product sites: if … |