Timestamp:
2019-11-12T20:51:48+13:00 (4 years ago)
Author:
ak19
Message:
  1. As suggested by Dr Bainbridge, changed the code to use Morphia as the ODM for MongoDB (an Object Document Mapper, or ODM, is to MongoDB what an ORM is to an RDBMS).
  2. Added the jar files needed to get this to work.
  3. Further changes to store site folder names of the form ##### as the primary key of the Websites collection. However, a future commit may store a reference to a WebsiteInfo object (representing a JSON document in the Websites MongoDB collection) inside a WebpageInfo object.
  4. The MongoDB collections are now called Websites and Webpages, not websites and webpages.
  5. The geolocation of a site is now stored as a field in the Websites MongoDB collection, and containsMRI is now stored as a field in the Webpages MongoDB collection.
  6. Tried out some MongoDB query commands based on what Dr Bainbridge did yesterday.
File:
1 moved
  • other-projects/maori-lang-detection/src/org/greenstone/atea/morphia/WebsiteInfo.java

    --- r33652
    +++ r33653
    @@ -1,7 +1,10 @@
    -package org.greenstone.atea;
    +package org.greenstone.atea.morphia;
     
    +import dev.morphia.annotations.*;
    +
    +@Entity("Websites")
     public class WebsiteInfo {
    -
    -    public final int id;
    +    //public final int id;
    +    @Id
         public final String siteFolderName;
         public final String domain;
    @@ -18,10 +21,10 @@
         public final boolean urlContainsLangCodeInpath;
     
    -    public WebsiteInfo(int siteCount, String siteFolderName, String domainOfSite,
    +    public WebsiteInfo(/*int siteCount,*/ String siteFolderName, String domainOfSite,
                   int totalPages, int countOfWebPagesWithBodyText, int numPagesInMRI,
                   long siteCrawledTimestamp, boolean siteCrawlUnfinished, boolean redoCrawl,
                   String geoLocationCountryCode, boolean urlContainsLangCodeInpath)
         {
    -    this.id = siteCount;
    +    //this.id = siteCount;
         this.siteFolderName = siteFolderName;
         this.domain = domainOfSite;
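
The commit message describes an ODM as the MongoDB counterpart of an ORM, and the diff marks siteFolderName with @Id inside an @Entity("Websites") class. As a rough illustration of what such a mapper does with those annotations, here is a minimal, stdlib-only sketch. It does not use Morphia: the Entity and Id annotations below are local stand-ins declared in the snippet itself, and WebsiteInfoSketch, OdmSketch, collectionOf, toDocument, and the sample values ("00001", "example.org", "NZ") are hypothetical names and data chosen for illustration. The one fact relied on is that MongoDB stores a document's primary key under the `_id` field.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

// Local stand-ins for illustration only; these are NOT Morphia's annotations.
@Retention(RetentionPolicy.RUNTIME) @interface Entity { String value(); }
@Retention(RetentionPolicy.RUNTIME) @interface Id {}

// Hypothetical, cut-down analogue of WebsiteInfo with only three fields.
@Entity("Websites")
class WebsiteInfoSketch {
    @Id final String siteFolderName;       // becomes the document's _id
    final String domain;
    final String geoLocationCountryCode;

    WebsiteInfoSketch(String siteFolderName, String domain, String geoLocationCountryCode) {
        this.siteFolderName = siteFolderName;
        this.domain = domain;
        this.geoLocationCountryCode = geoLocationCountryCode;
    }
}

class OdmSketch {
    // Collection name comes from the class-level annotation.
    static String collectionOf(Object entity) {
        Entity e = entity.getClass().getAnnotation(Entity.class);
        if (e == null) {
            throw new IllegalArgumentException("not annotated with @Entity");
        }
        return e.value();
    }

    // Copy each instance field into a map; the @Id field is keyed as "_id".
    static Map<String, Object> toDocument(Object entity) {
        Map<String, Object> doc = new LinkedHashMap<>();
        for (Field f : entity.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue;
            }
            f.setAccessible(true);
            String key = f.isAnnotationPresent(Id.class) ? "_id" : f.getName();
            try {
                doc.put(key, f.get(entity));
            } catch (IllegalAccessException ex) {
                throw new IllegalStateException(ex);
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        WebsiteInfoSketch site = new WebsiteInfoSketch("00001", "example.org", "NZ");
        System.out.println(collectionOf(site) + " <- " + toDocument(site));
    }
}
```

A real ODM such as Morphia also handles type conversion, nested objects, and reading documents back into objects; the sketch covers only the object-to-document direction for flat fields.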