source: other-projects/maori-lang-detection/mongodb-data@38957
Name | Size | Rev | Age | Last Change | Author |
---|---|---|---|---|---|
1a_counts_miInUrlPath.json | 1.5 KB | 33848 | 4 years | Tables of mongodb counts (1-5 table) and manual counts (6table). … | |
1a_geojson-features_miInUrlPath.json | 4.6 KB | 33823 | 4 years | Recommitting mongo-data folder with renamed files with numbering. | |
1a_multipoint_miInUrlPath.json | 501 bytes | 33823 | 4 years | Recommitting mongo-data folder with renamed files with numbering. | |
1a_table_miInUrlPath.csv | 794 bytes | 33889 | 4 years | 1. Additional column: totalPagesAcrossMatchingSites. 2. Screengrab of … | |
1a_table_miInUrlPath.png | 85.2 KB | 33889 | 4 years | 1. Additional column: totalPagesAcrossMatchingSites. 2. Screengrab of … | |
1b_counts_noMiInUrlPath.json | 2.2 KB | 33848 | 4 years | Tables of mongodb counts (1-5 table) and manual counts (6table). … | |
1b_geojson-features_noMiInUrlPath.json | 8.2 KB | 33823 | 4 years | Recommitting mongo-data folder with renamed files with numbering. | |
1b_multipoint_noMiInUrlPath.json | 862 bytes | 33823 | 4 years | Recommitting mongo-data folder with renamed files with numbering. | |
1b_table_noMiInUrlPath.csv | 1.1 KB | 33889 | 4 years | 1. Additional column: totalPagesAcrossMatchingSites. 2. Screengrab of … | |
1b_table_noMiInUrlPath.png | 105.4 KB | 33889 | 4 years | 1. Additional column: totalPagesAcrossMatchingSites. 2. Screengrab of … | |
1counts_allCrawledSites.json | 2.4 KB | 33823 | 4 years | Recommitting mongo-data folder with renamed files with numbering. | |
1geojson-features_allCrawledSites.json | 9.6 KB | 33823 | 4 years | Recommitting mongo-data folder with renamed files with numbering. | |
1map_allCrawledSites.png | 282.7 KB | 33846 | 4 years | Cropped out the json portion | |
1multipoint_allCrawledSites.json | 1008 bytes | 33823 | 4 years | Recommitting mongo-data folder with renamed files with numbering. | |
1table_allCrawledSites.csv | 1.2 KB | 33889 | 4 years | 1. Additional column: totalPagesAcrossMatchingSites. 2. Screengrab of … | |
1table_allCrawledSites.png | 113.3 KB | 33889 | 4 years | 1. Additional column: totalPagesAcrossMatchingSites. 2. Screengrab of … | |
2counts_sitesWithPagesInMRI.json | 1.6 KB | 33823 | 4 years | Recommitting mongo-data folder with renamed files with numbering. | |
2geojson-features_sitesWithPagesInMRI.json | 5.0 KB | 33823 | 4 years | Recommitting mongo-data folder with renamed files with numbering. | |
2map_sitesWithPagesInMRI.png | 266.4 KB | 33846 | 4 years | Cropped out the json portion | |
2multipoint_sitesWithPagesInMRI.json | 535 bytes | 33823 | 4 years | Recommitting mongo-data folder with renamed files with numbering. | |
2table_sitesWithPagesInMRI.csv | 737 bytes | 33889 | 4 years | 1. Additional column: totalPagesAcrossMatchingSites. 2. Screengrab of … | |
2table_sitesWithPagesInMRI.png | 86.7 KB | 33889 | 4 years | 1. Additional column: totalPagesAcrossMatchingSites. 2. Screengrab of … | |
3counts_sitesWithPagesContainingMRI.json | 2.3 KB | 33823 | 4 years | Recommitting mongo-data folder with renamed files with numbering. | |
3geojson-features_sitesWithPagesContainingMRI.json | 7.2 KB | 33823 | 4 years | Recommitting mongo-data folder with renamed files with numbering. | |
3map_sitesWithPagesContainingMRI.png | 272.2 KB | 33846 | 4 years | Cropped out the json portion | |
3multipoint_sitesWithPagesContainingMRI.json | 762 bytes | 33823 | 4 years | Recommitting mongo-data folder with renamed files with numbering. | |
3table_sitesWithPagesContainingMRI.csv | 1019 bytes | 33889 | 4 years | 1. Additional column: totalPagesAcrossMatchingSites. 2. Screengrab of … | |
3table_sitesWithPagesContainingMRI.png | 106.0 KB | 33889 | 4 years | 1. Additional column: totalPagesAcrossMatchingSites. 2. Screengrab of … | |
4counts_tentativeNonProductSites.json | 2.4 KB | 33872 | 4 years | 1. Added the file containing the 255 random NZ page URLs to sample. 2. … | |
4geojson-features_tentativeNonProductSites.json | 6.4 KB | 33823 | 4 years | Recommitting mongo-data folder with renamed files with numbering. | |
4map_exclTentativeAutotranslatedSites.png | 272.7 KB | 33846 | 4 years | Cropped out the json portion | |
4multipoint_tentativeNonProductSites.json | 675 bytes | 33823 | 4 years | Recommitting mongo-data folder with renamed files with numbering. | |
4table_tentativeNonProductSites.csv | 869 bytes | 33889 | 4 years | 1. Additional column: totalPagesAcrossMatchingSites. 2. Screengrab of … | |
4table_tentativeNonProductSites.png | 89.9 KB | 33889 | 4 years | 1. Additional column: totalPagesAcrossMatchingSites. 2. Screengrab of … | |
5b_counts_containsMRI_groupedByNZorOverseasNoFilter.json | 2.5 KB | 33895 | 4 years | Minor rename | |
5b_geojson-features_containsMRI_groupedByNZorOverseasNoFilter.json | 7.0 KB | 33894 | 4 years | 1. Adding map, counts.json and geo-json files for 5b count of sites by … | |
5b_map_containsMRI_groupedByNZorOverseasNoFilter.png | 268.6 KB | 33894 | 4 years | 1. Adding map, counts.json and geo-json files for 5b count of sites by … | |
5b_multipoint_containsMRI_groupedByNZorOverseasNoFilter.json | 738 bytes | 33894 | 4 years | 1. Adding map, counts.json and geo-json files for 5b count of sites by … | |
5b_table_containsMRI_groupedByNZorOverseasNoFilter.csv | 943 bytes | 33889 | 4 years | 1. Additional column: totalPagesAcrossMatchingSites. 2. Screengrab of … | |
5b_table_containsMRI_groupedByNZorOverseasNoFilter.png | 96.9 KB | 33889 | 4 years | 1. Additional column: totalPagesAcrossMatchingSites. 2. Screengrab of … | |
5counts_tentativeNonProductSites1.json | 19.3 KB | 33877 | 4 years | Reordering to have proper descending order of counts | |
5geojson-features_tentativeNonProductSites1.json | 5.9 KB | 33823 | 4 years | Recommitting mongo-data folder with renamed files with numbering. | |
5map_exclTentativeAutotranslatedSites1.png | 271.0 KB | 33846 | 4 years | Cropped out the json portion | |
5multipoint_tentativeNonProductSites1.json | 627 bytes | 33823 | 4 years | Recommitting mongo-data folder with renamed files with numbering. | |
5table_tentativeNonProductSites1.csv | 812 bytes | 33889 | 4 years | 1. Additional column: totalPagesAcrossMatchingSites. 2. Screengrab of … | |
5table_tentativeNonProductSites1.png | 85.5 KB | 33889 | 4 years | 1. Additional column: totalPagesAcrossMatchingSites. 2. Screengrab of … | |
6a_counts_manualShortlist_numPagesInMRI.json | 487 bytes | 33848 | 4 years | Tables of mongodb counts (1-5 table) and manual counts (6table). … | |
6a_geojson-features_manualShortlist_numPagesInMRI.json | 2.2 KB | 33874 | 4 years | Renaming 2 files correctly | |
6a_map_numPagesInMRI_fromManualInspectedSites.png | 275.3 KB | 33868 | 4 years | With the updated code for generating the maps from 6a and 6b manual … | |
6a_multipoint_manualShortlist_numPagesInMRI.json | 253 bytes | 33874 | 4 years | Renaming 2 files correctly | |
6b_counts_manualShortlist_numPagesContainingMRI.json | 536 bytes | 33848 | 4 years | Tables of mongodb counts (1-5 table) and manual counts (6table). … | |
6b_geojson-features_manualShortlist_numPagesContainingMRI.json | 2.4 KB | 33875 | 4 years | Renaming 2 more files correctly | |
6b_map_numPagesContainingMRI_fromManualInspectedSites.png | 267.8 KB | 33868 | 4 years | With the updated code for generating the maps from 6a and 6b manual … | |
6b_multipoint_manualShortlist_numPagesContainingMRI.json | 277 bytes | 33875 | 4 years | Renaming 2 more files correctly | |
6counts_nonProductSites1_manualShortlist.json | 698 bytes | 33915 | 4 years | Forgot to add a (manual) counts file created last week, and am now … | |
6counts_sitesWithPagesContainingMRI_manualShortlist.json | 952 bytes | 33980 | 4 years | Additional comments | |
6counts_sitesWithPagesContainingMRI_manualShortlist.jsonOLD | 698 bytes | 33936 | 4 years | Renaming old file to place with new counts after reingesting into MongoDB. | |
6geojson-features_sitesWithPagesContainingMRI_manualShortlist.json | 2.4 KB | 33894 | 4 years | 1. Adding map, counts.json and geo-json files for 5b count of sites by … | |
6map_sitesWithPagesContainingMRI_manualShortlist.png | 265.4 KB | 33894 | 4 years | 1. Adding map, counts.json and geo-json files for 5b count of sites by … | |
6multipoint_sitesWithPagesContainingMRI_manualShortlist.json | 277 bytes | 33894 | 4 years | 1. Adding map, counts.json and geo-json files for 5b count of sites by … | |
6table_nonProductSites1_manualShortlist.json | 29.7 KB | 33891 | 4 years | Site level detected vs manual inspected data: working shown in file … | |
7miInURLPath_exclNZ_byCountryCode.json | 20.8 KB | 33844 | 4 years | Regenerated | |
8table_siteCountSummary.png | 87.0 KB | 33893 | 4 years | 1. Left out region code column. 2. Two more sheets of work in progress … | |
8TableOfNumDetectedVsManualSITESWithMRI.ods | 25.7 KB | 33893 | 4 years | 1. Left out region code column. 2. Two more sheets of work in progress … | |
googlescholar.txt | 10.3 KB | 34089 | 4 years | So far accumulated URLs to docs on Google scholar about or somewhat … | |
InfoOnEmptyPagesNotInMongoDB.csv | 61.6 MB | 34004 | 4 years | Renaming csv file to have csv extension | |
InfoOnEmptyPagesNotInMongoDB.ods | 10.0 MB | 34097 | 4 years | Open office version of similarly named spreadsheet, just with columns … | |
isMRI_full_manualList_globalDomains_whereAPageContainsMRI.txt | 445.8 KB | 33939 | 4 years | 1. Old random samples file doesn't apply as we're not sampling by … | |
manualList_globalDomains_whereAPageContainsMRI.txt | 3.3 KB | 33918 | 4 years | Country codes added to each domain's URL of the manual site/domain … | |
ManualShortlisting2_afterMongoDBReingest.txt | 86.3 KB | 33936 | 4 years | Renaming old file to place with new counts after reingesting into MongoDB. | |
ManualShortlisting.txt | 76.8 KB | 33914 | 4 years | Shortlisted just the domain sites by country into ManualShortlist2.txt … | |
pieChart01a_seedURLsForCrawling.png | 21.8 KB | 34006 | 4 years | Committing more data I've collected for generating pie charts and the … | |
pieChart01b_obtainingSeedURLs.png | 58.2 KB | 34006 | 4 years | Committing more data I've collected for generating pie charts and the … | |
pieChart01c_obtainingSeedURLs.svg | 7.5 KB | 34006 | 4 years | Committing more data I've collected for generating pie charts and the … | |
pieChart2a_CrawledWebPages_EmptyVsInMongoDB.png | 140.1 KB | 34007 | 4 years | Prepared more data for the piecharts. This time for empty web pages vs … | |
pieChart2b_CrawledWebPages_EmptyVsInMongoDB.svg | 17.3 KB | 34007 | 4 years | Prepared more data for the piecharts. This time for empty web pages vs … | |
pieChart3a_SimplerCrawledWebPages_EmptyVsInMongoDB.png | 98.5 KB | 34007 | 4 years | Prepared more data for the piecharts. This time for empty web pages vs … | |
pieChart3b_SimplerCrawledWebPages_EmptyVsInMongoDB.svg | 13.2 KB | 34007 | 4 years | Prepared more data for the piecharts. This time for empty web pages vs … | |
pieChart3c_screenshot_SimplerCrawledWebPages_EmptyVsInMongoDB.png | 41.1 KB | 34127 | 4 years | Spelling correction in filename: screeMshot to screeNshot | |
pieChart4a_sitesPreparedForCrawling.png | 57.7 KB | 34011 | 4 years | Piechart data for sites prepared for crawling and the piecharts for these | |
pieChart4b_sitesPreparedForCrawling.svg | 7.4 KB | 34011 | 4 years | Piechart data for sites prepared for crawling and the piecharts for these | |
pieChart4c_screenshotSitesPreparedForCrawling.png | 25.9 KB | 34011 | 4 years | Piechart data for sites prepared for crawling and the piecharts for these | |
pieChart5a_sitesPreparedForCrawling.png | 72.7 KB | 34011 | 4 years | Piechart data for sites prepared for crawling and the piecharts for these | |
pieChart5b_sitesPreparedForCrawling.svg | 8.7 KB | 34011 | 4 years | Piechart data for sites prepared for crawling and the piecharts for these | |
pieChart5c_screenshotSitesPreparedForCrawling.png | 28.7 KB | 34011 | 4 years | Piechart data for sites prepared for crawling and the piecharts for these | |
piechart_data2.txt | 4.7 KB | 34011 | 4 years | Piechart data for sites prepared for crawling and the piecharts for these | |
piechart_data.txt | 25.3 KB | 34089 | 4 years | So far accumulated URLs to docs on Google scholar about or somewhat … | |
random260.csv | 30.5 KB | 34120 | 4 years | CSV version of .ods file, so openoffice isn't required | |
random260.ods | 27.9 KB | 33966 | 4 years | Added the origSequence and basicDomain columns to the random 260 web … | |
random260_manualList_globalDomains_whereAPageContainsMRI.txt | 30.5 KB | 33966 | 4 years | Added the origSequence and basicDomain columns to the random 260 web … | |
random260_results.txt | 2.3 KB | 33977 | 4 years | Added something on precision vs recall being applicable to our … | |
tables.txt | 7.6 KB | 33913 | 4 years | 1. Adjusted table mongodb query statements to be more exact, but same … | |