source: other-projects/maori-lang-detection @ 33914
Name | Size | Rev | Age | Author | Last Change |
---|---|---|---|---|---|
bin | | 33581 | 5 years | | Minor fix. Noticed when looking for work I did on MRI sentence detection |
ccrawl-data | | 33572 | 5 years | | Only meant to store the wet.gz versions of these files, not also the … |
conf | | 33904 | 4 years | | Shouldn't greylist anglican.org, as this prevented crawling of … |
hdfs-cc-work | | 33913 | 4 years | | 1. Adjusted table mongodb query statements to be more exact, but same … |
journal-paper | | 33903 | 4 years | | My notes when preparing for today's meetings. Some of this may be … |
lib | | 33788 | 4 years | | Adding all the jar files needed to work in Java with geojson Simple … |
logs | | 33401 | 5 years | | MaoriTextDetector.class file now generated inside its package folder … |
models-trainingdata-and-sampletxts | | 33588 | 5 years | | Committing the MRI sentence model that I'm actually using, the one in … |
mongodb-data | | 33914 | 4 years | | Shortlisted just the domain sites by country into ManualShortlist2.txt … |
MoreReading | | 33914 | 4 years | | Shortlisted just the domain sites by country into ManualShortlist2.txt … |
src | | 33913 | 4 years | | 1. Adjusted table mongodb query statements to be more exact, but same … |
apache-opennlp-1.9.1-bin.tar.gz | 10.6 MB | 33335 | 5 years | | First java file for Māori language detection using openNLP with the … |
crawledNode2.tar | 606.8 MB | 33800 | 4 years | | Removed an adult site from crawled contents and added its url to … |
crawledNode3.tar | 370.6 MB | 33609 | 5 years | | The tar files containing the crawled sites data shouldn't be called … |
crawledNode4.tar | 374.6 MB | 33609 | 5 years | | The tar files containing the crawled sites data shouldn't be called … |
crawledNode5.tar | 544.3 MB | 33617 | 5 years | | Node5 is now full and here is the finished crawl (up to and including … |
crawledNode6.tar | 126.0 MB | 33904 | 4 years | | Shouldn't greylist anglican.org, as this prevented crawling of … |
feasibility.txt | 761 bytes | 33394 | 5 years | | 1. Started a file on feasibility with the data now available and some … |
mri-opennlp-corpus.tar.gz | 8.3 MB | 33355 | 5 years | | Changes for adding in the new gen_SentenceDetection_model.sh script, … |
README.txt | 14.0 KB | 33398 | 5 years | | Committing the actual package structure and the updated README after … |
to_crawl.tar.gz | 1.4 MB | 33904 | 4 years | | Shouldn't greylist anglican.org, as this prevented crawling of … |