Changeset 33868 for other-projects/maori-lang-detection/mongodb-data
- Timestamp:
- 2020-01-23T21:16:44+13:00
- Location:
- other-projects/maori-lang-detection/mongodb-data
- Files:
- 6 added
- 1 edited
Legend:
- Unmodified
- Added
- Removed
other-projects/maori-lang-detection/mongodb-data/6table_nonProductSites1_manualShortlist.json
r33854 r33868
    47     47
    48     48
    49        -
    50        -
    51        -
    52     49   --------------
    53     50
    54        - https://www.statisticshowto.datasciencecentral.com/probability-and-statistics/find-sample-size/#CI1
    55        - https://stats.stackexchange.com/questions/207584/sample-size-choice-with-binary-outcome
    56        - https://www.statisticshowto.datasciencecentral.com/z-alpha2-za2/
    57        -
    58        - N (NZ pages where isMRI comes out true) = 4360
    59        - solving for n, the sample size
    60        - confidence level = 90%
    61        - m, margin of error = 5%
    62        -
    63        - From the "z alpha/2" table, for 90% confidence, we get a z alpha/2 value of 1.6449 (or 1.645).
    64        -
    65        - Then the sample size, n, we need is = 1.6449^2 * 4360 / ( 1.6449^2 + (4 * 4359) * 0.05^2) = 255 (rounded up)
    66        -
    67        -
    68        - For N = 681,
    69        - sample size n is = 1.6449^2 * 681 / ( 1.6449^2 + (4 * 680) * 0.05^2) = 194 (rounded up)
    70        -
    71        -
    72        - sample size for NZ: 255 (90% confidence with 5% margine of error, Including a finite correction factor)
    73        - sample size for US: 194
           51 + https://www.statisticshowto.datasciencecentral.com/probability-and-statistics/find-sample-size/#CI1
           52 + https://stats.stackexchange.com/questions/207584/sample-size-choice-with-binary-outcome
           53 + https://www.statisticshowto.datasciencecentral.com/z-alpha2-za2/
           54 +
           55 + N (NZ pages where isMRI comes out true) = 4360
           56 + solving for n, the sample size
           57 + confidence level = 90%
           58 + m, margin of error = 5%
           59 +
           60 + From the "z alpha/2" table, for 90% confidence, we get a z alpha/2 value of 1.6449 (or 1.645).
           61 +
           62 + Then the sample size, n, we need is = 1.6449^2 * 4360 / ( 1.6449^2 + (4 * 4359) * 0.05^2) = 255 (rounded up)
           63 +
           64 +
           65 + For N = 681,
           66 + sample size n is = 1.6449^2 * 681 / ( 1.6449^2 + (4 * 680) * 0.05^2) = 194 (rounded up)
           67 +
           68 +
           69 + sample size for NZ: 255 (90% confidence with 5% margine of error, Including a finite correction factor)
           70 + sample size for US: 194
    74     71
    75     72   */
     …      …
    77     74
    78     75
    79        - "_id","siteCount ","numPagesInMRICount","numPagesContainingMRICount","URLs of pages detected as inMRI"
           76 + "_id","siteCount containsMRI","numPagesInMRICount","numPagesContainingMRICount","URLs of pages detected as inMRI"
    80     77   "nz","176.0","4360","9641"
    81     78   "us","29.0","681","953"
     …      …
    90     87
    91     88   Total sites containing MRI: 216
           89 + [of which 96 isMRI sites from NZ]
    92     90   Total pages detected as being in MRI: 5062
    93     91   Total pages detected as containing MRI sentences: 10706
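The sample-size arithmetic recorded in the comment block of this file can be checked with a short script. The following is a minimal sketch, not code from the changeset: the function names `z_alpha_over_2` and `sample_size` are hypothetical, and it assumes the formula in the comment is the usual worst-case (p = 0.5) sample size for a proportion with a finite population correction, n = z^2 * N / (z^2 + 4 * (N - 1) * m^2).

```python
import math
from statistics import NormalDist  # standard library, Python 3.8+


def z_alpha_over_2(confidence: float) -> float:
    """Two-sided critical value of the standard normal distribution.

    Hypothetical helper: replaces the "z alpha/2" table lookup.
    """
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def sample_size(population: int, z: float = 1.6449, margin: float = 0.05) -> int:
    """Sample size for population N, rounded up.

    Implements n = z^2 * N / (z^2 + 4 * (N - 1) * m^2), the expression
    written out in the file comment. The defaults are the values used
    there: z = 1.6449 (90% confidence) and m = 0.05 (5% margin of error).
    """
    z2 = z * z
    n = z2 * population / (z2 + 4 * (population - 1) * margin ** 2)
    return math.ceil(n)


if __name__ == "__main__":
    # N = 4360 (NZ pages where isMRI is true) and N = 681 (US)
    print("NZ:", sample_size(4360))
    print("US:", sample_size(681))
    # Critical value for 90% confidence, to compare with the table value
    print("z:", round(z_alpha_over_2(0.90), 4))
```

With the values from the comment, this gives 255 for N = 4360 and 194 for N = 681, matching the figures recorded in the file.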