source: other-projects/maori-lang-detection/mongodb-data/counts_sitesWithPagesInMRI.json@ 33813

Last change on this file since 33813 was 33813, checked in by ak19, 4 years ago

Following yesterday's bugfix, http(s):mi.* type URLs are now included when setting the urlContainsLangCodeInPath property of the Websites mongodb collection. With that change, and with the updated and improved mongodb queries and their results, I have regenerated the latest geojson json data and maps.

File size: 1.6 KB
/*
Number of websites that have 1 or more pages detected as being in Maori, a positive numPagesInMRI.

db.getCollection('Websites').find({numPagesInMRI: { $gt: 0}}).count()
= 361

Count of country codes for sites that have at least one page detected as MRI:

db.Websites.aggregate([
    {
        $match: {
            numPagesInMRI: {$gt: 0}
        }
    },
    { $unwind: "$geoLocationCountryCode" },
    {
        $group: {
            _id: {$toLower: '$geoLocationCountryCode'},
            count: { $sum: 1 }
        }
    },
    { $sort : { count : -1} }
]);
*/

/* 1 */
{
    "_id" : "us",
    "count" : 206.0
}

/* 2 */
{
    "_id" : "nz",
    "count" : 53.0
}

/* 3 */
{
    "_id" : "cn",
    "count" : 32.0
}

/* 4 */
{
    "_id" : "fr",
    "count" : 18.0
}

/* 5 */
{
    "_id" : "au",
    "count" : 11.0
}

/* 6 */
{
    "_id" : "nl",
    "count" : 10.0
}

/* 7 */
{
    "_id" : "de",
    "count" : 5.0
}

/* 8 */
{
    "_id" : "dk",
    "count" : 4.0
}

/* 9 */
{
    "_id" : "gb",
    "count" : 3.0
}

/* 10 */
{
    "_id" : "ca",
    "count" : 3.0
}

/* 11 */
{
    "_id" : "ua",
    "count" : 2.0
}

/* 12 */
{
    "_id" : "ie",
    "count" : 2.0
}

/* 13 */
{
    "_id" : "es",
    "count" : 2.0
}

/* 14 */
{
    "_id" : "sg",
    "count" : 2.0
}

/* 15 */
{
    "_id" : "unknown",
    "count" : 2.0
}

/* 16 */
{
    "_id" : "gr",
    "count" : 1.0
}

/* 17 */
{
    "_id" : "hk",
    "count" : 1.0
}

/* 18 */
{
    "_id" : "jp",
    "count" : 1.0
}

/* 19 */
{
    "_id" : "bg",
    "count" : 1.0
}

/* 20 */
{
    "_id" : "mx",
    "count" : 1.0
}

/* 21 */
{
    "_id" : "ro",
    "count" : 1.0
}
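
The aggregation quoted in the header comment runs four stages: $match, $unwind, $group with $toLower, and $sort. As a rough illustration of what each stage contributes, here is a minimal plain-JavaScript sketch that applies the same steps to an in-memory array. It is not part of the repository: the function name countSitesByCountry and the sample documents are hypothetical, and only the field names numPagesInMRI and geoLocationCountryCode come from the query itself.

```javascript
// Hypothetical sample documents: the field names follow the query in the
// header comment, but the values are invented purely for illustration.
const websites = [
  { numPagesInMRI: 12, geoLocationCountryCode: "NZ" },
  { numPagesInMRI: 3, geoLocationCountryCode: "US" },
  { numPagesInMRI: 0, geoLocationCountryCode: "NZ" },          // dropped by $match
  { numPagesInMRI: 7, geoLocationCountryCode: "nz" },          // merged with "NZ" by $toLower
  { numPagesInMRI: 1, geoLocationCountryCode: ["AU", "US"] },  // $unwind yields two rows
];

// In-memory approximation of the pipeline ($match, $unwind, $group, $sort).
function countSitesByCountry(docs) {
  const counts = new Map();
  for (const doc of docs) {
    // $match: keep only sites with at least one page detected as MRI.
    if (!(doc.numPagesInMRI > 0)) continue;

    // $unwind: an array yields one row per element, a scalar yields one row,
    // and a missing or null field (or an empty array) yields no rows.
    const raw = doc.geoLocationCountryCode;
    if (raw === undefined || raw === null) continue;
    const codes = Array.isArray(raw) ? raw : [raw];

    for (const code of codes) {
      // $group: key on the lower-cased country code and add 1 per row.
      const key = String(code).toLowerCase();
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }
  // $sort: descending by count.
  return [...counts.entries()]
    .map(([_id, count]) => ({ _id, count }))
    .sort((a, b) => b.count - a.count);
}

const result = countSitesByCountry(websites);
console.log(result);
```

Because $unwind emits one row per array element, a site with several country codes would be counted once per code, so the per-country counts need not add up to the number of matching sites in general. In the results listed above they do add up to 361, the same figure as the find(...).count() in the header comment.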