Changeset 33868

Timestamp:
23.01.2020 21:16:44 (5 weeks ago)
Author:
ak19
Message:

With the updated code for generating the maps from 6a and 6b manual site counts, generated corrected maps for num PAGES in MRI and num PAGES containing MRI and their geojson files. (Also some tabbing to 6table file).

Location:
other-projects/maori-lang-detection/mongodb-data
Files:
6 added
1 modified

  • other-projects/maori-lang-detection/mongodb-data/6table_nonProductSites1_manualShortlist.json

    r33854 r33868  
    47 47
    48 48
    49
    50
    51
    52 49 --------------
    53 50
    54 https://www.statisticshowto.datasciencecentral.com/probability-and-statistics/find-sample-size/#CI1 
    55 https://stats.stackexchange.com/questions/207584/sample-size-choice-with-binary-outcome 
    56 https://www.statisticshowto.datasciencecentral.com/z-alpha2-za2/ 
    57  
    58 N (NZ pages where isMRI comes out true) = 4360 
    59 solving for n, the sample size 
    60 confidence level = 90% 
    61 m, margin of error = 5% 
    62  
    63 From the "z alpha/2" table, for 90% confidence, we get a z alpha/2 value of 1.6449 (or 1.645). 
    64  
    65 Then the sample size, n, we need is = 1.6449^2 * 4360 / ( 1.6449^2 + (4 * 4359) * 0.05^2) = 255 (rounded up) 
    66  
    67  
    68 For N = 681,  
    69 sample size n is = 1.6449^2 * 681 / ( 1.6449^2 + (4 * 680) * 0.05^2) = 194 (rounded up) 
    70  
    71  
    72 sample size for NZ: 255 (90% confidence with 5% margine of error, Including a finite correction factor) 
    73 sample size for US: 194 
     51    https://www.statisticshowto.datasciencecentral.com/probability-and-statistics/find-sample-size/#CI1 
     52    https://stats.stackexchange.com/questions/207584/sample-size-choice-with-binary-outcome 
     53    https://www.statisticshowto.datasciencecentral.com/z-alpha2-za2/ 
     54 
     55    N (NZ pages where isMRI comes out true) = 4360 
     56    solving for n, the sample size 
     57    confidence level = 90% 
     58    m, margin of error = 5% 
     59 
     60    From the "z alpha/2" table, for 90% confidence, we get a z alpha/2 value of 1.6449 (or 1.645). 
     61 
     62    Then the sample size, n, we need is = 1.6449^2 * 4360 / ( 1.6449^2 + (4 * 4359) * 0.05^2) = 255 (rounded up) 
     63 
     64 
     65    For N = 681,  
     66    sample size n is = 1.6449^2 * 681 / ( 1.6449^2 + (4 * 680) * 0.05^2) = 194 (rounded up) 
     67 
     68 
     69    sample size for NZ: 255 (90% confidence with 5% margine of error, Including a finite correction factor) 
     70    sample size for US: 194 
    74 71
    75 72 */
     
    77 74
    78 75
    79 "_id","siteCount","numPagesInMRICount","numPagesContainingMRICount","URLs of pages detected as inMRI"
     76 "_id","siteCount containsMRI","numPagesInMRICount","numPagesContainingMRICount","URLs of pages detected as inMRI"
    80 77 "nz","176.0","4360","9641"
    81 78 "us","29.0","681","953"
     
    90 87
    91 88 Total sites containing MRI: 216
     89 [of which 96 isMRI sites from NZ]
    92 90 Total pages detected as being in MRI: 5062
    93 91 Total pages detected as containing MRI sentences: 10706
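
The sample-size arithmetic quoted in the comment block of the diff can be checked with a short script. This is a minimal sketch, not part of the changeset: the function name `sample_size` and its parameter names are hypothetical, and it assumes the factor of 4 in the quoted formula comes from taking the worst-case proportion p = 0.5 (so that p(1 - p) = 1/4). It reuses the z value 1.6449 and the 5% margin of error stated in the comment.

```python
import math


def sample_size(population: int, z: float = 1.6449, margin: float = 0.05) -> int:
    """Sample size with a finite population correction (hypothetical helper).

    Mirrors the formula quoted in the comment:
        n = z^2 * N / (z^2 + 4 * (N - 1) * m^2)
    Assumes the worst-case proportion p = 0.5, which gives the factor 4.
    """
    z2 = z * z
    n = z2 * population / (z2 + 4 * (population - 1) * margin ** 2)
    # Round up, as the comment does, so the margin of error is not exceeded.
    return math.ceil(n)


if __name__ == "__main__":
    print("NZ:", sample_size(4360))  # N = 4360 pages detected as isMRI
    print("US:", sample_size(681))   # N = 681 pages detected as isMRI
```

With the values from the comment (N = 4360 for NZ and N = 681 for US), this reproduces the sample sizes 255 and 194.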