source: other-projects/maori-lang-detection/mongodb-data/counts_tentativeNonProductSites.json@ 33806

Last change on this file since 33806 was 33806, checked in by ak19, 4 years ago

Further MongoDB querying revealed that excluding tentative product sites (a site that has /mi in its path and originates from outside NZ) from the sites with numPagesCONTAININGMRI > 0 gives a result barely different from just querying numPagesCONTAININGMRI > 0. Sadly, a brief check of the domains in both result sets still turned up several auto-translated results. So maybe the test excluding tentativeProductSites should be repeated with numPagesINMRI > 0, to see whether that test can better discriminate between auto-translated sites and sites with proper Maori language webpages.
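The filtering and per-country counting described in this commit message can be sketched in plain Python. This is only an illustration under stated assumptions, not the actual MongoDB query that produced this file: the record keys `country` and `has_mi_in_path`, the helper names, and the sample records are all hypothetical, and the page-count field names are simply copied as spelled in the commit message.

```python
from collections import Counter


def is_tentative_product_site(site):
    # Heuristic from the commit message: the site has /mi in its path
    # and is hosted outside NZ. Both keys are hypothetical names.
    return site["has_mi_in_path"] and site["country"] != "nz"


def count_by_country(sites, field="numPagesCONTAININGMRI", exclude_product_sites=True):
    # Count sites per country where the chosen page-count field is > 0,
    # optionally dropping tentative product sites first.
    counts = Counter()
    for site in sites:
        if site.get(field, 0) <= 0:
            continue
        if exclude_product_sites and is_tentative_product_site(site):
            continue
        counts[site["country"]] += 1
    # most_common() orders by descending count, like the listing in this file.
    return counts.most_common()


# Made-up sample records, for illustration only.
sample_sites = [
    {"country": "nz", "has_mi_in_path": True, "numPagesCONTAININGMRI": 12, "numPagesINMRI": 9},
    {"country": "us", "has_mi_in_path": True, "numPagesCONTAININGMRI": 3, "numPagesINMRI": 3},
    {"country": "us", "has_mi_in_path": False, "numPagesCONTAININGMRI": 5, "numPagesINMRI": 0},
    {"country": "de", "has_mi_in_path": False, "numPagesCONTAININGMRI": 0, "numPagesINMRI": 0},
]

with_exclusion = count_by_country(sample_sites)
without_exclusion = count_by_country(sample_sites, exclude_product_sites=False)
in_mri_with_exclusion = count_by_country(sample_sites, field="numPagesINMRI")
```

Passing `field="numPagesINMRI"` corresponds to the repeat test the commit message proposes; comparing the three results on real data would show how much each criterion changes the per-country counts.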

File size: 1.6 KB
/* 1 */
{
    "_id" : "us",
    "count" : 475.0
}

/* 2 */
{
    "_id" : "cn",
    "count" : 114.0
}

/* 3 */
{
    "_id" : "nz",
    "count" : 98.0
}

/* 4 */
{
    "_id" : "fr",
    "count" : 36.0
}

/* 5 */
{
    "_id" : "de",
    "count" : 26.0
}

/* 6 */
{
    "_id" : "nl",
    "count" : 22.0
}

/* 7 */
{
    "_id" : "au",
    "count" : 17.0
}

/* 8 */
{
    "_id" : "ca",
    "count" : 13.0
}

/* 9 */
{
    "_id" : "dk",
    "count" : 8.0
}

/* 10 */
{
    "_id" : "es",
    "count" : 7.0
}

/* 11 */
{
    "_id" : "gb",
    "count" : 7.0
}

/* 12 */
{
    "_id" : "cz",
    "count" : 4.0
}

/* 13 */
{
    "_id" : "it",
    "count" : 3.0
}

/* 14 */
{
    "_id" : "at",
    "count" : 3.0
}

/* 15 */
{
    "_id" : "ch",
    "count" : 2.0
}

/* 16 */
{
    "_id" : "ro",
    "count" : 2.0
}

/* 17 */
{
    "_id" : "il",
    "count" : 2.0
}

/* 18 */
{
    "_id" : "unknown",
    "count" : 2.0
}

/* 19 */
{
    "_id" : "hk",
    "count" : 2.0
}

/* 20 */
{
    "_id" : "jp",
    "count" : 2.0
}

/* 21 */
{
    "_id" : "ie",
    "count" : 2.0
}

/* 22 */
{
    "_id" : "ua",
    "count" : 2.0
}

/* 23 */
{
    "_id" : "se",
    "count" : 1.0
}

/* 24 */
{
    "_id" : "gr",
    "count" : 1.0
}

/* 25 */
{
    "_id" : "ru",
    "count" : 1.0
}

/* 26 */
{
    "_id" : "eu",
    "count" : 1.0
}

/* 27 */
{
    "_id" : "bg",
    "count" : 1.0
}

/* 28 */
{
    "_id" : "fi",
    "count" : 1.0
}

/* 29 */
{
    "_id" : "sg",
    "count" : 1.0
}

/* 30 */
{
    "_id" : "tr",
    "count" : 1.0
}

/* 31 */
{
    "_id" : "mx",
    "count" : 1.0
}

/* 32 */
{
    "_id" : "ir",
    "count" : 1.0
}