Changeset 33675
- Timestamp: 2019-11-15T00:22:34+13:00
- Files: 1 edited
--- other-projects/maori-lang-detection/MoreReading/mongodb.txt (r33666)
+++ other-projects/maori-lang-detection/MoreReading/mongodb.txt (r33675)
@@ -347,5 +347,15 @@
 https://docs.mongodb.com/manual/reference/method/db.collection.find/#find-projection
 
 Mongo Studio 3T documentation:
+https://studio3t.com/download/ (also has uninstall information)
+https://studio3t.com/download-thank-you/?OS=x64
+
+Google: MongoDB visualization
+MongoDB visualization map
+MongoDB Charts
+(Open source visualisation tools)
+
+json map visualizer
+geojson.tools
 -------------------
 
@@ -358,5 +368,6 @@
 # Num webpages
 db.getCollection('Webpages').find({}).count()
-75139
+X75139
+117496
 
 # Find number of websites who have 1 or more pages in Maori (a positive numPagesInMRI)
@@ -367,9 +378,13 @@
 db.getCollection('Webpages').find({isMRI:true}).count()
 X5224
-5215
+X5215
+db.getCollection('Webpages').find({isMRI:true}).count()
+7818
 
 # Number of pages that contain any number of MRI sentences
 db.getCollection('Webpages').find({containsMRI: true}).count()
-12858
+X12858
+20371
+
 
 # Number of sites with URLs containing /mi(/)
@@ -389,3 +404,10 @@
 db.getCollection('Websites').find({urlContainsLangCodeInpath:true}).sort({geoLocationCountryCode: 1})
 
 Actually, I want to sort by count. See https://docs.mongodb.com/manual/reference/operator/aggregation/sortByCount/
+
+
+
+* Identify where Maori language is online.
+* How can we identify high quality sites that would be good for a corpus.
+(Related work for other languages to quantifiably answer that)
+
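The last hunk notes a wish to sort by count instead of by country code. A minimal sketch of what that could look like with the `$sortByCount` aggregation stage follows. The collection and field names are copied from the queries in the diff. The `sortByCount` helper and the sample documents are hypothetical and exist only to show, without a running server, what the stage computes. This is not necessarily the query the author ended up using.

```javascript
// Pipeline that could be passed to
//   db.getCollection('Websites').aggregate(pipeline)
// in the mongo shell. $sortByCount groups by the given expression,
// counts each group, and sorts the groups by count, largest first.
const pipeline = [
  { $match: { urlContainsLangCodeInpath: true } },
  { $sortByCount: "$geoLocationCountryCode" }
];

// Hypothetical in-memory equivalent of the $sortByCount stage,
// for illustration only (not part of MongoDB).
function sortByCount(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    const key = doc[field];
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts.entries()]
    .map(([key, count]) => ({ _id: key, count: count }))
    .sort((a, b) => b.count - a.count);
}

// Made-up sample documents, not real data from the Websites collection.
const sample = [
  { geoLocationCountryCode: "NZ", urlContainsLangCodeInpath: true },
  { geoLocationCountryCode: "AU", urlContainsLangCodeInpath: true },
  { geoLocationCountryCode: "NZ", urlContainsLangCodeInpath: true },
  { geoLocationCountryCode: "US", urlContainsLangCodeInpath: true },
  { geoLocationCountryCode: "AU", urlContainsLangCodeInpath: true },
  { geoLocationCountryCode: "NZ", urlContainsLangCodeInpath: true },
  { geoLocationCountryCode: "NZ", urlContainsLangCodeInpath: false }
];

// Emulate the $match stage, then the $sortByCount stage.
const matched = sample.filter(d => d.urlContainsLangCodeInpath === true);
const result = sortByCount(matched, "geoLocationCountryCode");
console.log(result);
```

Each output document has an `_id` holding the country code and a `count` holding the number of matching sites, so the countries with the most sites appear first.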