Last change on this file since 34006 was 34006, checked in by ak19, 4 years ago:
Committing more data I've collected for generating pie charts and the pie-charts for the first dataset, which is how the seed URLs for crawling were obtained.

File size: 844 bytes
https://www.rapidtables.com/tools/pie-chart.html

Title: 38724 out of >11.4 billion URLs in 12-month CommonCrawl data had content_language=MRI
data names: discarded_10290 greylisted_2751 pruned_4 crawlSeeds_25679
data values: 10290 2751 4 25679
slice text: (Percentage)


------
https://www.meta-chart.com/pie#/data

Number of slices -> 4
Series Unit: URLs

Slice 1: discarded (red) 10290
Slice 2: greyListed (grey) 2751
Slice 3: further pruned away (yellow) 4
Slice 4: final crawl seeds (green) 25679

https://www.meta-chart.com/pie#/labels
Graph title: Processing the 38724 out of >11.4 billion URLs in the 12-month CommonCrawl data which had content_language=MRI
Slice Display data label display setting: Name, Value and Percent

https://www.meta-chart.com/pie#/display
Export as SVG and PNG
Leave the Sort setting at the bottom as "ORIG (default)"
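
------
Sanity check of the slice values (not part of the original notes): a minimal Python sketch that confirms the four slices add up to the 38724 URLs in the title and computes the percentage each slice should show. The dictionary keys and the function name slice_percentages are illustrative assumptions; only the numbers come from the notes above.

```python
# Slice counts copied from the "data values" line of the notes.
# The key names are shortened from the "data names" line and are illustrative.
slices = {
    "discarded": 10290,
    "greylisted": 2751,
    "pruned": 4,
    "crawlSeeds": 25679,
}


def slice_percentages(counts):
    """Return each slice's share of the total as a percentage (2 decimals)."""
    total = sum(counts.values())
    return {name: round(100 * value / total, 2) for name, value in counts.items()}


total = sum(slices.values())
percentages = slice_percentages(slices)

# Print one line per slice, in the same order as entered in the chart tools.
for name, value in slices.items():
    print(f"{name}: {value} URLs ({percentages[name]}%)")
print(f"total: {total} URLs")
```

The total should match the 38724 in the chart title. The rounded percentages may not add up to exactly 100 because each slice is rounded separately.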