Timestamp:
2019-11-08T23:59:07+13:00 (4 years ago)
Author:
ak19
Message:

Rewrote NutchTextDumpProcessor as NutchTextDumpToMongoDB.java, which uses MongoDBAccess; that class now has insertWebpageInfo() and insertWebsiteInfo(). However, local testing has been unsuccessful, even though authentication should be working, as I'm following the online examples that use the Credential object. The client supposedly connects to the database, but database.listCollections() fails with an Unauthorized error, so nothing subsequent can be expected to work. I could do my preliminary testing against a small sample subset of crawled sites on vagrant, where no authentication is set up. But if someone else one day wants to run this against a mongodb where authentication is set up (the way TSG set it up for the mongodb they gave me access to), it still wouldn't work.
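
One possible cause, offered only as an assumption and not confirmed by anything in this changeset, is that the credentials are being checked against a different database than the one where the user was defined. In the MongoDB Java driver, the second argument of MongoCredential.createCredential(userName, database, password) names that authentication database, and the connection-string equivalent is the authSource option. The sketch below uses only the Java standard library to build such a connection string; the class name MongoUriSketch, the helper names, and all sample values (user, password, host, database names) are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the changeset: builds a MongoDB
// connection string with an explicit authentication database.
public class MongoUriSketch {

    // Percent-encode a URI component. URLEncoder performs form encoding,
    // which turns a space into '+', so that is rewritten to %20 afterwards.
    // A literal '+' in the input is already emitted as %2B by URLEncoder.
    static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // authDb is the database in which the user was created (often "admin").
    // It may differ from db, the database the application reads and writes.
    static String buildUri(String user, String password, String host,
                           int port, String db, String authDb) {
        return "mongodb://" + enc(user) + ":" + enc(password)
                + "@" + host + ":" + port
                + "/" + enc(db)
                + "?authSource=" + enc(authDb);
    }

    public static void main(String[] args) {
        // All values here are made-up sample inputs.
        System.out.println(
            buildUri("someuser", "p@ss word", "localhost", 27017, "somedb", "admin"));
    }
}
```

If the user turns out to be defined in the same database that holds the data, authDb would simply be that database's name. The resulting string is the kind of value that recent versions of the driver accept in MongoClients.create(String); whether that matches the driver version used by MongoDBAccess is not known from this changeset.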

File:
1 edited

  • gs3-extensions/maori-lang-detection/src/org/greenstone/atea/NutchTextDumpToCSV.java

    r33633 r33634
    244 244
    245 245        if(text.equals("")) {
    246     -      page.addMRILanguageStatus(false);
        246 +      //page.addMRILanguageStatus(false);
    247 247        continue;
    248 248        }
    …
    250 250        boolean isMRI = maoriTxtDetector.isTextInMaori(text);
    251 251
    252     -      page.addMRILanguageStatus(isMRI);
        252 +      //page.addMRILanguageStatus(isMRI);
    253 253
    254 254