source: gs3-extensions/maori-lang-detection/bin

Change Rev Age Author Log Message
(edit) @33526   5 years ak19 Moved hadoop related scripts from bin/script into hdfs-instructions
(edit) @33523   5 years ak19 Instructional comment
(edit) @33522   5 years ak19 Some comments and an improvement
(edit) @33516   5 years ak19 Before I accidentally lose it, committing the script Dr Bainbridge …
(edit) @33513   5 years ak19 Higher level script that runs against each named crawl since Sep 2018 …
(edit) @33498   5 years ak19 Corrections to script. Modified the tests checking for file/dir …
(edit) @33495   5 years ak19 Pruned out unused commands, added comments, marked unused variables to …
(edit) @33494   5 years ak19 All in one script that takes as parameter a common crawl identifier of …
(edit) @33489   5 years ak19 Handy file to not have to keep manually repeating commands when …
(edit) @33488   5 years ak19 new function createSeedURLsFiles() in WETProcessor that replaces the …
(edit) @33471   5 years ak19 Very minor changes.
(edit) @33470   5 years ak19 A new script to reduce keepURLs.txt to unique URLs, 1 from each unique …
(edit) @33446   5 years ak19 1. Committing working version of export_maori_subset.sh which takes …
(edit) @33445   5 years ak19 The first working hadoop spark script for processing common crawl …
(edit) @33413   5 years ak19 Splitting the get_commoncrawl_nz_urls.sh script back into 2 scripts, …
(edit) @33394   5 years ak19 1. Started a file on feasibility with the data now available and some …
(edit) @33393   5 years ak19 Modified the get_commoncrawl_nz_urls.sh to also create a reduced urls …
(edit) @33390   5 years ak19 Minor message telling the user to wait for a task that takes some time.
(edit) @33379   5 years ak19 New script to automate getting a file listing of the common crawl URL …
(add) @33378   5 years ak19 New bin/script folder and relocating gen_SentenceDetection_model.sh to …