source: gs3-extensions/maori-lang-detection@33561
Name | Size | Rev | Age | Author | Last Change |
---|---|---|---|---|---|
../ | | | | | |
src | | 33561 | 5 years | | 1. sites-too-big-to-exhaustively-crawl.txt is now a comma separated … |
MoreReading | | 33558 | 5 years | | Committing cumulative changes since last commit. |
models-trainingdata-and-sampletxts | | 33355 | 5 years | | Changes for adding in the new gen_SentenceDetection_model.sh script, … |
logs | | 33401 | 5 years | | MaoriTextDetector.class file now generated inside its package folder … |
lib | | 33442 | 5 years | | Updated gutil.jar file (with SafeProcses debugging) |
hdfs-cc-work | | 33545 | 5 years | | Mainly changes to crawling-Nutch.txt and some minor changes to other … |
conf | | 33561 | 5 years | | 1. sites-too-big-to-exhaustively-crawl.txt is now a comma separated … |
ccrawl-data | | 33549 | 5 years | | All the downloaded commoncrawl MRI warc.wet.gz data from Sep 2018 … |
bin | | 33526 | 5 years | | Moved hadoop related scripts from bin/script into hdfs-instructions |
README.txt | 14.0 KB | 33398 | 5 years | | Committing the actual package structure and the updated README after … |
mri-opennlp-corpus.tar.gz | 8.3 MB | 33355 | 5 years | | Changes for adding in the new gen_SentenceDetection_model.sh script, … |
feasibility.txt | 761 bytes | 33394 | 5 years | | 1. Started a file on feasibility with the data now available and some … |
apache-opennlp-1.9.1-bin.tar.gz | 10.6 MB | 33335 | 5 years | | First java file for Māori language detection using openNLP with the … |