Changeset 33815

Timestamp: 19.12.2019 17:17:16 (5 weeks ago)
Author: ak19
Message: Removed old results from before bugfix and improvement to urlContainsLangCodeInPath
Files: 1 modified

  • other-projects/maori-lang-detection/hdfs-cc-work/GS_README.TXT

    r33814 r33815  
    @@ -724,22 +724,13 @@
     20371
     
    -
    -# Number of sites with URLs containing /mi(/)
    -db.getCollection('Websites').find({urlContainsLangCodeInPath:true}).count()
    -X 153
    -# Number of sites with URLs containing /mi(/) OR http(s)://mi.*
    +# Number of sites with crawled web pages that have URLs containing /mi(/) OR http(s)://mi.*
     db.getCollection('Websites').find({urlContainsLangCodeInPath:true}).count()
     670
     
    -# Number of websites that are outside NZ that contain /mi(/) in any of its sub-urls
    -db.getCollection('Websites').find({urlContainsLangCodeInPath:true, geoLocationCountryCode: {$ne : "NZ"} }).count()
    -X 147
    -# Number of websites that are outside NZ that contain /mi(/) OR http(s)://mi.* in any of its sub-urls
    +# Number of websites that are outside NZ that contain /mi(/) OR http(s)://mi.*
    +# in any of its crawled webpage urls
     db.getCollection('Websites').find({urlContainsLangCodeInPath:true, geoLocationCountryCode: {$ne : "NZ"} }).count()
     656
     
    -# 6 sites with URLs containing /mi(/) that are in NZ
    -db.getCollection('Websites').find({urlContainsLangCodeInPath:true, geoLocationCountryCode: "NZ"}).count()
    -X 6
     # 14 sites with URLs containing /mi(/) OR http(s)://mi.* that are in NZ
     14
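
The README comments describe the updated rule as matching URLs that contain /mi(/) OR http(s)://mi.*. The changeset does not show how urlContainsLangCodeInPath is actually computed, so the following is only a rough sketch of one way such a check might look. The function name hasMiLangCode is hypothetical, and the real implementation may treat edge cases (letter case, query strings, "www." prefixes) differently.

```javascript
// Hypothetical helper, NOT the project's actual code: an assumed reading of
// the rule "/mi(/) OR http(s)://mi.*" taken from the README comments.
function hasMiLangCode(url) {
  let parsed;
  try {
    parsed = new URL(url); // WHATWG URL parser, built into Node.js and browsers
  } catch (e) {
    return false; // unparseable strings cannot match
  }
  // Only http and https URLs are covered by the described rule.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return false;
  }
  // "/mi" or "/mi/" as a whole path segment, so "/mission/" does not count.
  const inPath = parsed.pathname.split("/").includes("mi");
  // "mi" as the leading subdomain, e.g. http://mi.example.org/
  const inHost = parsed.hostname.startsWith("mi.");
  return inPath || inHost;
}
```

Under this assumed reading, a website would be flagged when any of its crawled page URLs passes the check. Matching whole path segments rather than the substring "/mi" avoids counting paths such as /mission/ or /military/.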