Changeset 33825

Timestamp: 13.01.2020 21:47:33
Author: ak19
Message: Beginnings of first draft of write up.
Location: other-projects/maori-lang-detection
Files: 1 added, 1 modified

  • other-projects/maori-lang-detection/hdfs-cc-work/GS_README.TXT

    r33824  r33825
    20      20        ---
    21      21
            22      + APPENDIX: Legend of mongodb-data folder's contents
    22      23        APPENDIX: Reading data from hbase tables and backing up hbase
    23      24

    983     984
    984     985       --------------------------------------------------------
    985             - APPENDIX: Legend of mongodb-data folder's contents 
            986     + APPENDIX: Legend of mongodb-data folder's contents
    986     987       --------------------------------------------------------
    987     988       1. allCrawledSites: all sites from CommonCrawl where the content-language=MRI, which we then crawled with Nutch with depth=10. Some obvious auto-translated websites were skipped.