Changeset 33815


Timestamp: 2019-12-19T17:17:16+13:00
Author: ak19
Message: Removed old results from before bugfix and improvement to urlContainsLangCodeInPath

File: 1 edited

  • other-projects/maori-lang-detection/hdfs-cc-work/GS_README.TXT

    r33814  r33815
    724     724     20371
    725     725
    726
    727             # Number of sites with URLs containing /mi(/)
    728             db.getCollection('Websites').find({urlContainsLangCodeInPath:true}).count()
    729             X 153
    730             # Number of sites with URLs containing /mi(/) OR http(s)://mi.*
            726     # Number of sites with crawled web pages that have URLs containing /mi(/) OR http(s)://mi.*
    731     727     db.getCollection('Websites').find({urlContainsLangCodeInPath:true}).count()
    732     728     670
    733     729
    734             # Number of websites that are outside NZ that contain /mi(/) in any of its sub-urls
    735             db.getCollection('Websites').find({urlContainsLangCodeInPath:true, geoLocationCountryCode: {$ne : "NZ"} }).count()
    736             X 147
    737             # Number of websites that are outside NZ that contain /mi(/) OR http(s)://mi.* in any of its sub-urls
            730     # Number of websites that are outside NZ that contain /mi(/) OR http(s)://mi.*
            731     # in any of its crawled webpage urls
    738     732     db.getCollection('Websites').find({urlContainsLangCodeInPath:true, geoLocationCountryCode: {$ne : "NZ"} }).count()
    739     733     656
    740     734
    741             # 6 sites with URLs containing /mi(/) that are in NZ
    742             db.getCollection('Websites').find({urlContainsLangCodeInPath:true, geoLocationCountryCode: "NZ"}).count()
    743             X 6
    744     735     # 14 sites with URLs containing /mi(/) OR http(s)://mi.* that are in NZ
    745     736     14
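
The comments in this diff describe urlContainsLangCodeInPath as matching URLs that contain /mi(/) OR http(s)://mi.*. The changeset does not show the code that sets that flag, so the following is only a minimal sketch of one way such a check might be written. The function name urlContainsLangCode, its langCode parameter, and the example.com URLs are hypothetical and are not taken from the project.

```javascript
// Hypothetical helper, NOT the project's actual implementation.
// Assumption: "/mi(/)" means a path segment that is exactly the language
// code, and "http(s)://mi.*" means a host whose first label is the code.
function urlContainsLangCode(url, langCode = "mi") {
  let parsed;
  try {
    parsed = new URL(url); // WHATWG URL parser, built into Node.js and browsers
  } catch (e) {
    return false; // not an absolute URL, so there is nothing to match
  }
  // "/mi" or "/mi/..." anywhere in the path, but not "/mining/"
  const inPath = parsed.pathname.split("/").includes(langCode);
  // "mi.example.com", but not "mit.example.com"
  const inHost = parsed.hostname.startsWith(langCode + ".");
  return inPath || inHost;
}

console.log(urlContainsLangCode("https://example.com/mi/about"));
console.log(urlContainsLangCode("http://mi.example.com/"));
console.log(urlContainsLangCode("https://example.com/mining/"));
```

Matching whole path segments and whole host labels, rather than searching for the substring "mi", is a design choice made for this sketch so that words such as "mining" are not counted; the real check may differ.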