source: gs3-extensions/maori-lang-detection @ 33565

Name                                Size       Rev    Age      Author  Last Change
../
bin                                 -          33526  5 years  ak19    Moved hadoop related scripts from bin/script into hdfs-instructions
ccrawl-data                         -          33549  5 years  ak19    All the downloaded commoncrawl MRI warc.wet.gz data from Sep 2018 …
conf                                -          33565  5 years  ak19    CCWETProcessor: domain url now goes in as a seedURL after the …
hdfs-cc-work                        -          33564  5 years  ak19    batchcrawl.sh now does the crawl and logs output of the crawl, dumps …
lib                                 -          33562  5 years  ak19    1. The sites-too-big-to-exhaustively-crawl.txt is now a csv file of a …
logs                                -          33401  5 years  ak19    MaoriTextDetector.class file now generated inside its package folder …
models-trainingdata-and-sampletxts  -          33355  5 years  ak19    Changes for adding in the new gen_SentenceDetection_model.sh script, …
MoreReading                         -          33565  5 years  ak19    CCWETProcessor: domain url now goes in as a seedURL after the …
src                                 -          33565  5 years  ak19    CCWETProcessor: domain url now goes in as a seedURL after the …
apache-opennlp-1.9.1-bin.tar.gz     10.6 MB    33335  5 years  ak19    First java file for Māori language detection using openNLP with the …
feasibility.txt                     761 bytes  33394  5 years  ak19    1. Started a file on feasibility with the data now available and some …
mri-opennlp-corpus.tar.gz           8.3 MB     33355  5 years  ak19    Changes for adding in the new gen_SentenceDetection_model.sh script, …
README.txt                          14.0 KB    33398  5 years  ak19    Committing the actual package structure and the updated README after …