Changeset 33408 for gs3-extensions
- Timestamp: 2019-08-13T15:09:28+12:00
- File: 1 edited
gs3-extensions/maori-lang-detection/MoreReading/other.txt
r33404  r33408
19      19
20      20        https://gist.github.com/svemir/4207353
        21      + (Hadoop related) A Common Crawl Experiment
21      22
22      23        https://gist.github.com/Smerity/afe7430fdb4371015466
…       …
32      33
33      34        https://dmorgan.info/posts/common-crawl-python/
        35      + https://groups.google.com/forum/#!topic/common-crawl/pdI3w09AAbQ
        36      +
        37      + Example:
        38      + WARC:
        39      + tikauka:[142]/Scratch/anupama/maori-lang-detection>wget https://commoncrawl.s3.amazonaws.com/crawl-data/CC-MAIN-2019-30/segments/1563195526237.47/crawldiagnostics/CC-MAIN-20190719115720-20190719141720-00077.warc.gz
        40      + WET:
        41      + tikauka:[142]/Scratch/anupama/maori-lang-detection>wget https://commoncrawl.s3.amazonaws.com/crawl-data/CC-MAIN-2019-30/segments/1563195526237.47/wet/CC-MAIN-20190719115720-20190719141720-00508.warc.wet.gz
        42      + tikauka:[142]/Scratch/anupama/maori-lang-detection>gunzip CC-MAIN-20190719115720-20190719141720-00508.warc.wet.gz
        43      +
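The WET download in the added lines is a gzip-compressed sequence of WARC-style records holding extracted plain text. As a rough illustration of how such a file might be inspected once downloaded, here is a minimal Python sketch using only the standard library. It is not part of the changeset: the function names iter_warc_records and iter_wet_file are hypothetical, and the parser assumes each record starts with a "WARC/" version line, followed by "Name: value" headers, a blank line, and a payload of Content-Length bytes.

```python
import gzip


def iter_warc_records(stream):
    """Yield (headers, payload) pairs from an uncompressed WARC/WET byte stream.

    Hypothetical helper: assumes each record has a "WARC/" version line,
    "Name: value" header lines, a blank line, then Content-Length payload bytes.
    """
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if not line.strip():
            continue  # blank separator lines between records
        if not line.startswith(b"WARC/"):
            raise ValueError(f"unexpected line: {line!r}")
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # blank line ends the header block
            name, _, value = line.decode("utf-8", "replace").partition(":")
            headers[name.strip()] = value.strip()
        length = int(headers.get("Content-Length", 0))
        payload = stream.read(length)
        yield headers, payload


def iter_wet_file(path):
    """Iterate over the records of a still-gzipped .warc.wet.gz file."""
    with gzip.open(path, "rb") as f:
        yield from iter_warc_records(f)
```

With the file from the wget command above left compressed, `for headers, text in iter_wet_file("CC-MAIN-20190719115720-20190719141720-00508.warc.wet.gz")` would walk the records; after gunzip, the same parser can be applied to `open(path, "rb")` instead. The WARC-Target-URI header, where present, gives the page URL for each extracted text.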