Changeset 35018
- Timestamp: 2021-04-04T20:48:30+12:00
- Location: main/trunk/model-sites-dev/eurovision-lod/collect/eurovision/prepare/errata-categories
- Files: 2 edited
main/trunk/model-sites-dev/eurovision-lod/collect/eurovision/prepare/errata-categories/esc-wikipedia-download-and-detect-missing-cat-entries.py
--- r35016
+++ r35018
@@ -1,5 +1,5 @@
 #!/usr/bin/env python
 
-# Work out where the gaps are in the Wikipedia ESC Category pages that are meant to
+# Works out where the gaps are in the Wikipedia ESC Category pages that are meant to
 # list all the countries that competed in a given year
 
@@ -8,5 +8,4 @@
 import os
 import re
-# import shutil
 
 import argparse
@@ -104,9 +103,4 @@
 util.write_text_file_from_template(sparql_values_template_filename,"**missing-country-year-uris**",missing_uri_lines_text,sparql_values_output_filename)
 
-# problem_category_in_year_filename = "../problem-lod-lists/dbpedia-problem-category-in-year.sparql"
-# shutil.copyfile(sparql_values_output_filename,problem_category_in_year_filename)
-
-
-
 
 if __name__ == "__main__":
@@ -122,5 +116,7 @@
 
 # 2. Download Wikipedia ESC *Category* pages (listscountries in a given year)
-# (This appears to be at times incomplete, however is the hook used in the SPARQL query)
+# (This appears to be at times incomplete. As it the Category pages that
+# are used as the hook used in the SPARQL query produce the list of all ESC entries
+# using this approach alone will lead to countries being overloooked)
 
 # 3. Extract list of countries from (1)
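
Note: the diff above calls util.write_text_file_from_template(template, placeholder, replacement, output), but the body of that helper is not part of this changeset. The sketch below is only a guess at its behaviour, inferred from the call site: it reads a template, substitutes one placeholder string, and writes the result. The find_missing_entries helper and the "<...>" URI formatting are likewise hypothetical illustrations of the gap detection described in the script's comments, not code from the repository.

```python
def write_text_file_from_template(template_filename, placeholder,
                                  replacement_text, output_filename):
    """Hypothetical stand-in for util.write_text_file_from_template.

    Reads the template, replaces every occurrence of `placeholder`
    with `replacement_text`, writes the result and returns it.
    """
    with open(template_filename, encoding="utf-8") as f:
        template_text = f.read()
    output_text = template_text.replace(placeholder, replacement_text)
    with open(output_filename, "w", encoding="utf-8") as f:
        f.write(output_text)
    return output_text


def find_missing_entries(expected, listed):
    """Return, sorted, the entries in `expected` that `listed` lacks.

    Assumed inputs: `expected` comes from the per-year contest page,
    `listed` from the (possibly incomplete) Category page.
    """
    return sorted(set(expected) - set(listed))


def format_uri_lines(uris):
    """Join URIs one per line in SPARQL IRI syntax (assumed format)."""
    return "\n".join("<%s>" % uri for uri in uris)
```

Under these assumptions, the entries found by find_missing_entries would be formatted with format_uri_lines and passed as the replacement text for the "**missing-country-year-uris**" placeholder.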
main/trunk/model-sites-dev/eurovision-lod/collect/eurovision/prepare/errata-categories/esc-wikipedia-download-and-process-votes.py
--- r35017
+++ r35018
@@ -2,5 +2,10 @@
 
 ### TODO
+#
 ### Grab image for competition year out of infobox???
+#
+### If hyperlink in table => add in Wikipedia Song URL mapped to DBPedia
+#
+
 
 import json