source: main/trunk/model-sites-dev/atea/collect/commoncrawl-all-is-mri/README.txt@ 34470

Last change on this file since 34470 was 34470, checked in by davidb, 4 years ago

README and script to get going with this collection

File size: 213 bytes
Greenstone3 collection resulting from Anu's work with CommonCrawl web dumps.

The PRE-IMPORT-PREPARE.sh script currently grabs the archives.tar.gz and index.tar.gz
from the Atea Google Drive area and untars them.
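
For orientation, the untar step can be sketched roughly as below. This is not the
contents of PRE-IMPORT-PREPARE.sh: the function name unpack_collection and its two
directory arguments are hypothetical, and the sketch assumes both tarballs have
already been downloaded into a local directory (the download itself is left out,
since the Google Drive location is not given here).

```shell
#!/bin/sh
# Hypothetical sketch only; NOT the real PRE-IMPORT-PREPARE.sh.
# Assumption: archives.tar.gz and index.tar.gz already sit in SRC_DIR.

# Usage: unpack_collection SRC_DIR DEST_DIR
unpack_collection() {
    src_dir=$1
    dest_dir=$2

    # Create the destination directory if it does not exist yet.
    mkdir -p "$dest_dir" || return 1

    for name in archives index; do
        tarball="$src_dir/$name.tar.gz"

        # Stop early with a clear message if a tarball is missing.
        if [ ! -f "$tarball" ]; then
            echo "missing $tarball" >&2
            return 1
        fi

        # -x extract, -z gunzip, -f read from file, -C extract into DEST_DIR.
        tar -xzf "$tarball" -C "$dest_dir" || return 1
    done
}
```

Called as `unpack_collection ./downloads .` (both paths are placeholders), it unpacks
both tarballs into the current directory and returns non-zero if either is missing.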