Timestamp: 2020-02-03T15:48:40+13:00
Files: 1 edited
--- other-projects/maori-lang-detection/mongodb-data/tables.txt (r33878)
+++ other-projects/maori-lang-detection/mongodb-data/tables.txt (r33889)
@@ -1,5 +1,5 @@
 Instructions for producing the tables:
 a. Copy the Javascript version of results for each mongodb query listed below into a text editor.
-b. Then regex replace \/\*\s*\d+\s*\*\/ with ","and embed all the JS inside [].
+b. OPTIONAL: Then regex replace \/\*\s*\d+\s*\*\/ with a comma (','), remove the very first comma, and embed all the JS inside [].
 c. Paste that Javascript into https://json-csv.com/ to get the CSV tables
 
@@ -17,5 +17,6 @@
 /*domain: { $addToSet: '$domain' },*/
 numPagesInMRICount: { $sum: '$numPagesInMRI' },
-numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' }
+numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' },
+totalPagesAcrossSites: { $sum: '$totalPages'}
 }
 },
@@ -35,5 +36,6 @@
 /*domain: { $addToSet: '$domain' },*/
 numPagesInMRICount: { $sum: '$numPagesInMRI' },
-numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' }
+numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' },
+totalPagesAcrossMatchingSites: { $sum: '$totalPages'}
 }
 },
@@ -53,5 +55,6 @@
 /*domain: { $addToSet: '$domain' },*/
 numPagesInMRICount: { $sum: '$numPagesInMRI' },
-numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' }
+numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' },
+totalPagesAcrossMatchingSites: { $sum: '$totalPages'}
 }
 },
@@ -75,5 +78,6 @@
 /*domain: { $addToSet: '$domain' },*/
 numPagesInMRICount: { $sum: '$numPagesInMRI' },
-numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' }
+numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' },
+totalPagesAcrossSitesWithPositiveMRICount: { $sum: '$totalPages'}
 }
 },
@@ -97,5 +101,6 @@
 /*domain: { $addToSet: '$domain' },*/
 numPagesInMRICount: { $sum: '$numPagesInMRI' },
-numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' }
+numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' },
+totalPagesAcrossSitesWithPositiveContainsMRI: { $sum: '$totalPages'}
 }
 },
@@ -122,5 +127,6 @@
 /*domain: { $addToSet: '$domain' },*/
 numPagesInMRICount: { $sum: '$numPagesInMRI' },
-numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' }
+numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' },
+totalPagesAcrossMatchingSites: { $sum: '$totalPages'}
 }
 },
@@ -151,5 +157,6 @@
 /*domain: { $addToSet: '$domain' },*/
 numPagesInMRICount: { $sum: '$numPagesInMRI' },
-numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' }
+numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' },
+totalPagesAcrossMatchingSites: { $sum: '$totalPages'}
 }
 },
@@ -175,12 +182,13 @@
 /*domain: { $addToSet: '$domain' },*/
 numPagesInMRICount: { $sum: '$numPagesInMRI' },
-numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' }
-}
-},
-{ $sort : { count : -1} }
-]);
-
-
-To find NZ web pages in MRI the following may be BETTER,
+numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' },
+totalPagesAcrossMatchingSites: { $sum: '$totalPages'}
+}
+},
+{ $sort : { count : -1} }
+]);
+
+
+To find NZ web pages IN MRI the following may be BETTER,
 as it looks for sites with positive numPagesINMRI rather than sites that only have positive containingMRI:
 
@@ -201,8 +209,9 @@
 domain: { $addToSet: '$domain' },
 numPagesInMRICount: { $sum: '$numPagesInMRI' },
-numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' }
-}
-},
-{ $sort : { count : -1} }
-]);
-
+numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' },
+totalPagesAcrossMatchingSites: { $sum: '$totalPages'}
+}
+},
+{ $sort : { count : -1} }
+]);
+
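Step b of the revised instructions (replace each numbered result marker with a comma, drop the very first comma, wrap everything in []) could be scripted rather than done by hand in a text editor. The following is a minimal sketch only: the function name `resultsToJsonArray` and the sample values are made up for illustration and are not part of tables.txt, and it assumes the pasted results are already plain JSON (no shell-only wrappers such as ObjectId(...)).

```javascript
// Hypothetical helper, not part of tables.txt: converts pasted query results
// of the form "/* 1 */ {...} /* 2 */ {...}" into one JSON array string.
function resultsToJsonArray(shellOutput) {
  // Same regex as in step b: every "/* N */" marker becomes a comma.
  const withCommas = shellOutput.replace(/\/\*\s*\d+\s*\*\//g, ",");
  // Remove the very first comma, which would otherwise precede the first object.
  const withoutLeadingComma = withCommas.replace(/^\s*,/, "");
  // Embed all the objects inside [].
  return "[" + withoutLeadingComma + "]";
}

// Made-up sample results; the field names follow the queries in this changeset.
const sample = `/* 1 */
{ "_id" : "NZ", "count" : 2, "totalPagesAcrossMatchingSites" : 40 }

/* 2 */
{ "_id" : "AU", "count" : 1, "totalPagesAcrossMatchingSites" : 7 }
`;

const rows = JSON.parse(resultsToJsonArray(sample));
console.log(rows.length);
```

The resulting array string is what step c expects to be pasted into the JSON-to-CSV converter.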