source: other-projects/maori-lang-detection/src/org/greenstone/atea @ 33887

Name Size Rev Age Author Last Change
morphia 33811   4 years ak19 Returning to using a single variable, urlContainsLangCodeInPath, to …
CCWETProcessor.java 40.3 KB 33666   4 years ak19 Having finished sending all the crawl data to mongodb 1. Recrawled the …
CountryCodeCountsMapData.java 22.7 KB 33869   4 years ak19 First cut at the RandomURLsForDomainGenerator.java class and the …
MaoriTextDetector.java 12.9 KB 33615   4 years ak19 1. Worked out how to configure log4j to log both to console and …
MongoDBAccess.java 26.4 KB 33887   4 years ak19 1. Added support for writing out tables in csv format too. 2. Second …
MRIWebPageStats.java 1.7 KB 33602   5 years ak19 1. The final csv file, mri-sentences.csv, is now written out. 2. Only …
NutchTextDumpToCSV.java 16.5 KB 33634   4 years ak19 Rewrote NutchTextDumpProcessor as NutchTextDumpToMongoDB.java, which …
NutchTextDumpToMongoDB.java 15.8 KB 33811   4 years ak19 Returning to using a single variable, urlContainsLangCodeInPath, to …
NZTLDProcessor.java 15.3 KB 33466   5 years ak19 1. WETProcessor.main() now processes a folder of *.warc.wet(.gz) …
RandomURLsForDomainGenerator.java 3.3 KB 33883   4 years ak19 Clarifications
TextDumpPage.java 6.4 KB 33652   4 years ak19 Introducing morphia subpackage
TextLanguageDetector.java 17.7 KB 33698   4 years ak19 Links to more reading
Utility.java 5.7 KB 33887   4 years ak19 1. Added support for writing out tables in csv format too. 2. Second …
WebPageURLsListing.java 13.1 KB 33887   4 years ak19 1. Added support for writing out tables in csv format too. 2. Second …
WETProcessor.java 13.1 KB 33615   4 years ak19 1. Worked out how to configure log4j to log both to console and …