Changeset 33868


Timestamp: 01/23/20 21:16:44 (16 months ago)
Author: ak19
Message:

With the updated code for generating the maps from the 6a and 6b manual site counts, generated corrected maps for num PAGES in MRI and num PAGES containing MRI, along with their geojson files. (Also added some tabbing to the 6table file.)

Location: other-projects/maori-lang-detection/mongodb-data
Files: 6 added, 1 edited

  • other-projects/maori-lang-detection/mongodb-data/6table_nonProductSites1_manualShortlist.json

    r33854  r33868
    47      47
    48      48
    49
    50
    51
    52      49      --------------
    53      50
    54              https://www.statisticshowto.datasciencecentral.com/probability-and-statistics/find-sample-size/#CI1
    55              https://stats.stackexchange.com/questions/207584/sample-size-choice-with-binary-outcome
    56              https://www.statisticshowto.datasciencecentral.com/z-alpha2-za2/
    57
    58              N (NZ pages where isMRI comes out true) = 4360
    59              solving for n, the sample size
    60              confidence level = 90%
    61              m, margin of error = 5%
    62
    63              From the "z alpha/2" table, for 90% confidence, we get a z alpha/2 value of 1.6449 (or 1.645).
    64
    65              Then the sample size, n, we need is = 1.6449^2 * 4360 / ( 1.6449^2 + (4 * 4359) * 0.05^2) = 255 (rounded up)
    66
    67
    68              For N = 681,
    69              sample size n is = 1.6449^2 * 681 / ( 1.6449^2 + (4 * 680) * 0.05^2) = 194 (rounded up)
    70
    71
    72              sample size for NZ: 255 (90% confidence with 5% margine of error, Including a finite correction factor)
    73              sample size for US: 194
            51          https://www.statisticshowto.datasciencecentral.com/probability-and-statistics/find-sample-size/#CI1
            52          https://stats.stackexchange.com/questions/207584/sample-size-choice-with-binary-outcome
            53          https://www.statisticshowto.datasciencecentral.com/z-alpha2-za2/
            54
            55          N (NZ pages where isMRI comes out true) = 4360
            56          solving for n, the sample size
            57          confidence level = 90%
            58          m, margin of error = 5%
            59
            60          From the "z alpha/2" table, for 90% confidence, we get a z alpha/2 value of 1.6449 (or 1.645).
            61
            62          Then the sample size, n, we need is = 1.6449^2 * 4360 / ( 1.6449^2 + (4 * 4359) * 0.05^2) = 255 (rounded up)
            63
            64
            65          For N = 681,
            66          sample size n is = 1.6449^2 * 681 / ( 1.6449^2 + (4 * 680) * 0.05^2) = 194 (rounded up)
            67
            68
            69          sample size for NZ: 255 (90% confidence with 5% margine of error, Including a finite correction factor)
            70          sample size for US: 194
    74      71
    75      72      */

    77      74
    78      75
    79              "_id","siteCount","numPagesInMRICount","numPagesContainingMRICount","URLs of pages detected as inMRI"
            76      "_id","siteCount containsMRI","numPagesInMRICount","numPagesContainingMRICount","URLs of pages detected as inMRI"
    80      77      "nz","176.0","4360","9641"
    81      78      "us","29.0","681","953"

    90      87
    91      88      Total sites containing MRI: 216
            89      [of which 96 isMRI sites from NZ]
    92      90      Total pages detected as being in MRI: 5062
    93      91      Total pages detected as containing MRI sentences: 10706
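
The sample-size arithmetic recorded in the comment block of this file can be checked with a short script. This is a minimal sketch and not code from the repository: the function name `sample_size` and its parameter names are hypothetical. It assumes a worst-case proportion of p = 0.5, which is what the factor of 4 in the formula implies, and applies the finite population correction mentioned in the comment.

```python
import math


def sample_size(population, z=1.6449, margin=0.05):
    """Sample size for estimating a proportion in a finite population.

    Hypothetical helper, not part of the changeset. Assumes p = 0.5,
    so p * (1 - p) = 1/4, which yields the factor of 4 below:
        n = z^2 * N / (z^2 + 4 * (N - 1) * m^2)
    """
    z_sq = z ** 2
    n = z_sq * population / (z_sq + 4 * (population - 1) * margin ** 2)
    # Round up: a fractional sample is not possible.
    return math.ceil(n)


# N values taken from the "nz" and "us" rows (numPagesInMRICount).
print("NZ:", sample_size(4360))
print("US:", sample_size(681))
```

With z = 1.6449 (90% confidence) and a 5% margin of error, this reproduces the two figures in the comment: 255 for NZ and 194 for US.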