Changeset 35093
- Timestamp:
- 2021-04-22T14:50:00+12:00
- Location:
- main/trunk/model-sites-dev/eurovision-lod/collect/eurovision/transform/pages
- Files:
- 3 edited
main/trunk/model-sites-dev/eurovision-lod/collect/eurovision/transform/pages/about.xsl
r35066 r35093 18 18 <gsf:script src="sites/{$site_name}/collect/{$collName}/js/jquery.show-more.js"/> 19 19 20 20 21 <div id="about-desc"> 21 22 <h2>Introduction</h2> … … 23 24 <p style="padding-bottom: 10px;"> 24 25 The <a href="https://eurovision.tv">Eurovision Song 25 Conte nt</a> is a live-broadcast televised event that26 Contest</a> is a live-broadcast televised event that 26 27 was first held in 1956 featuring artists singing original songs from 27 28 7 countries. Since then it has grown into an event involving … … 43 44 The contest has grown significantly from 44 45 that modest start with 7 countries (and one cameraman), 45 with over 40 countries competing these daysâ even46 Australiatakes part now, through a specially46 with over 40 countries competing these daysâAustralia 47 even takes part now, through a specially 47 48 arranged invitation. It's an annual celebration of 48 49 European culture and the highlight of many people's … … 507 508 508 509 <p> 509 Access to and the analysis of how countries have voted over the years 510 To fulfill our vision of developing this DL collection 511 as a rich resource through which people can explore the 512 phenomenon we went looking for voting data that was 513 available in a machine-readable format. 514 We found data compiled through a manual curation process 515 about how countries have voted going back to 1975 is available through the 516 <a href="https://www.kaggle.com/datagraver/eurovision-song-contest-scores-19752019">Kaggle website as an Excel spreadsheet</a>. 517 </p> 518 <p> 519 To incorporate this as metadata into the DL, we wrote 520 some Python code to transform the data into the internal 521 serialized metadata format used by Greenstone. Prior to 522 this project, the only serialized form for this was XML, 523 which is processed by the MetadataXML plugin. As it was 524 more convenient to generate JSON from our Python code, 525 we took the step of adding in a new plugin to 526 Greenstone3: MetadataJSON. 
527 </p> 528 529 <h3>Page Scraping</h3> 530 531 <p> 532 Despite our best intentions work soley with 533 machine-readable dataâprimarily as you have seen in the 534 form of Linked Open Data, but also utilizing a 535 spreadsheet of voting dataâto form the Eurovision DL, 536 in looking to expand the metadata in the DL to cover 537 details concerning the draw position of acts, and their 538 overall placing, we have resorted to page-scraping 539 content from Wikipedia itself. This was because such 540 information was not part of the entity extraction 541 process that occurs when Wikipedia is mapped to DBpedia. 542 </p> 543 544 <p> 545 A review of Wikipedia article pages about the event in 546 any given year showed these pages to be especially well 547 curated, and included a table in each that listed the 548 information we sought. While there was some variation 549 in how this table was expressed in HTML, with a 550 considerably portion of the heavy lifting being done by 551 the Python library BeautifulSoup4, it was not too 552 complex a task to develop a program that extracted this 553 information and turned it into the newly developed 554 Greenstone JSON metadata format. 555 </p> 510 556 511 To fulfill our vision of developing this DL collection as a rich resource to 512 through which people can explore the phenomenon. 513 514 </p> 557 <h3>Patching in Missing Data</h3> 558 515 559 516 <h3>Patching in Missing Data: Page Scraping</h3> 517 518 519 <p> 520 Despite our best intentions to work solely with .... 521 .. missing categories ... 522 523 totting up how many entrie per year ... 524 thousands of entries 525 560 <p> 561 Another difficulty we have encountered is that 562 not every country who had an entry in Eurovision 563 in a given year has its own standalone article page. 
564 This leads to missing entries in the category 565 page for the contest in a given year, which is 566 problematic to us, because it is this category 567 information that we draw upon in our SPARQL query 568 to populate the DL with all the acts. 569 </p> 570 <p> 571 The information about all the countries competing 572 in a given year does, however, appear in the 573 article page for the contest in that year. In fact 574 it's in the same table we targetted to extract out 575 draw position and placement. We therefore 576 wrote a further page-scraping program to compare 577 the countries in that table with the countries 578 listed on the category page for the contest in 579 that year. For any entries we find in the 580 table, but not in the Category page, we 581 produce a metadata record for the DL 582 with basic information about the entry: 583 country, year, song title, artist, 584 draw-position, placement, and (where available) 585 their total score. 586 </p> 587 <p> 588 Comparable with the problem titles and artist/entrants, 589 we have formulated a SPARQL query that enumerates 590 these missing category entrants: 591 <!-- 526 592 We took the opportunity to add in further fields: Performing Position, Placement, Voting Total, thumbnail flag image. 593 594 595 An unintended side-affect of this is that we have also been able to expand 596 --> 597 527 598 528 599 <ul> -
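The comparison described in the about.xsl text above (countries found in the contest-year table versus countries listed on the category page) amounts to a set difference. The following is a minimal Python sketch of that idea only, not the project's actual page-scraping program: the function name `find_missing_entrants`, the record field names, and the sample values are all hypothetical placeholders, and the scraping step itself is assumed to have already produced the two inputs.

```python
def find_missing_entrants(table_rows, category_countries):
    """Return the table rows whose country does not appear in the category listing.

    table_rows: list of dicts, each with at least a "country" key
                (hypothetical layout for rows scraped from the year's article table).
    category_countries: iterable of country names taken from the category page.
    """
    # Normalise whitespace and case so trivial formatting differences do not
    # produce false "missing" reports.
    listed = {name.strip().casefold() for name in category_countries}
    return [
        row for row in table_rows
        if row["country"].strip().casefold() not in listed
    ]


# Hypothetical placeholder data, not real contest results.
table_rows = [
    {"country": "Country A", "year": "1999", "draw": 1, "place": 3},
    {"country": "Country B", "year": "1999", "draw": 2, "place": 7},
    {"country": "Country C", "year": "1999", "draw": 3, "place": 1},
]
category_countries = ["Country A", "country c "]

missing = find_missing_entrants(table_rows, category_countries)
print([row["country"] for row in missing])
```

Each row returned would then be the basis for one metadata record (country, year, song title, artist, draw position, placement and, where available, total score), as the text describes.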
main/trunk/model-sites-dev/eurovision-lod/collect/eurovision/transform/pages/sgvizler.xsl
r35089 r35093 52 52 <link rel="stylesheet" href="sites/{$site_name}/collect/{$collName}/style/fuseki.css" type="text/css" /> 53 53 --> 54 <gsf:script src="sites/{$site_name}/collect/{$collName}/js/jquery.show-more.js"/> 54 55 <gsf:script src="sites/{$site_name}/collect/{$collName}/js/eurovision.js"/> 55 56 <gsf:script src="sites/{$site_name}/collect/{$collName}/js/dataviz.js"/> … … 58 59 <link rel="stylesheet" href="sites/{$site_name}/collect/{$collName}/css/dataviz.css" type="text/css" /> 59 60 <style> 60 div.page { margin-left: 12px; margin-right: 12px; margin-top: 6px; margin-bottom: 6px;} 61 62 p { padding-top: 6px; padding-bottom: 6px;} 63 a { text-decoration: underline; } 64 li { padding-bottom: 6px; margin-bottom: 6px; } 61 #gs_content div.page { margin-left: 12px; margin-right: 12px; margin-top: 6px; margin-bottom: 6px;} 62 #gs_content div.showmore { padding-left: 0px; padding-right: 0px; padding-top: 0px; padding-bottom: 0px;} 63 64 #gs_content p { padding-top: 6px; padding-bottom: 6px;} 65 #gs_content a { text-decoration: underline; } 66 #gs_content li { padding-bottom: 6px; margin-bottom: 6px; } 65 67 </style> 66 68 … … 91 93 </p> 92 94 93 <p> 94 We use this two-step process so it is possible to 95 change what query is run, and how the resulting data is visualized. 96 The first text-box below is for the SPARQL query. The following 97 3 text-boxes control aspects of the visualization. 98 If you haven't worked with the underlying tools before, 99 we suggest you work your way through the sample visualizations 100 provided, trying out small edits to see how that affects 101 what is produced. 
102 </p> 103 <p> 104 Rather than visualize results, if you would like to 105 directly access and/or export the data to peroforms 106 other forms of analysis, then you'll probably want to 107 use the: 108 <ul> 109 <li> 110 <a href="{$library_name}/collection/{$collName}/page/sparql">Data Analysis page</a> 111 </li> 112 </ul> 113 </p> 95 <div id="sgvizler-show-more" class="showmore"> 96 <p> 97 We use this two-step process so it is possible to 98 change what query is run, and how the resulting data is visualized. 99 The first text-box below is for the SPARQL query. The following 100 3 text-boxes control aspects of the visualization. 101 If you haven't worked with the underlying tools before, 102 we suggest you work your way through the sample visualizations 103 provided, trying out small edits to see how that affects 104 what is produced. 105 </p> 106 <p> 107 Rather than visualize results, if you would like to 108 directly access and/or export the data to peroforms 109 other forms of analysis, then you'll probably want to 110 use the: 111 <ul> 112 <li> 113 <a href="{$library_name}/collection/{$collName}/page/sparql">Data Analysis page</a> 114 </li> 115 </ul> 116 </p> 117 </div> 118 <gsf:script> 119 $('#sgvizler-show-more').showMore({ 120 minheight: 0, 121 buttontxtmore:"show more ...", 122 buttontxtless:"... 
show less" 123 }); 124 </gsf:script> 114 125 115 126 </div> … … 183 194 <script type="text/javascript"> 184 195 <xsl:text disable-output-escaping="yes"> 185 $(document).ready( 186 196 $(document).ready( 197 function() { 187 198 ssv_load("ssv-orig"); 188 ssv_execute( );199 ssv_execute(ssv_no_auto_focus); 189 200 } 190 201 ); … … 275 286 <b>Number of times entered, sorted by frequency:</b><br/> 276 287 <button type="button" class="load-ssq" id="load-ssv-orig" onclick="ssv_load('ssv-orig')">Load query above</button> 277 <button type="button" class="exec-ssq" id="exec-ssv-orig" onclick="ssv_execute( )">Visualize Results</button><br/>288 <button type="button" class="exec-ssq" id="exec-ssv-orig" onclick="ssv_execute(ssv_auto_focus)">Visualize Results</button><br/> 278 289 <p> 279 290 Plot as a bar graph the number of times each country has competed in the … … 285 296 <b>Made the Finals:</b><br/> 286 297 <button type="button" class="load-ssq" id="load-ssv-made-the-final" onclick="ssv_load('ssv-made-the-final')">Load query above</button> 287 <button type="button" class="exec-ssq" id="exec-ssv-made-the-final" onclick="ssv_execute( )">Visualize Results</button><br/>298 <button type="button" class="exec-ssq" id="exec-ssv-made-the-final" onclick="ssv_execute(ssv_auto_focus)">Visualize Results</button><br/> 288 299 <p>Plot as a bar graph the number of times each country has made it to the finals.</p> 289 300 </li> … … 292 303 <b>List of Winners:</b><br/> 293 304 <button type="button" class="load-ssq" id="load-ssv-list-of-winners" onclick="ssv_load('ssv-list-of-winners')">Load query above</button> 294 <button type="button" class="exec-ssq" id="exec-ssv-list-of-winners" onclick="ssv_execute( )">Visualize Tabulated Results</button><br/>305 <button type="button" class="exec-ssq" id="exec-ssv-list-of-winners" onclick="ssv_execute(ssv_auto_focus)">Visualize Tabulated Results</button><br/> 295 306 <p>The songs that have won through the ages.</p> 296 307 </li> … … 299 310 <b>List of 
Last Place Entrants:</b><br/> 300 311 <button type="button" class="load-ssq" id="load-ssv-list-of-losers" onclick="ssv_load('ssv-list-of-losers')">Load query above</button> 301 <button type="button" class="exec-ssq" id="exec-ssv-list-of-losers" onclick="ssv_execute( )">Visualize Tabulated Results</button><br/>312 <button type="button" class="exec-ssq" id="exec-ssv-list-of-losers" onclick="ssv_execute(ssv_auto_focus)">Visualize Tabulated Results</button><br/> 302 313 <p>The songs that have won through the ages.</p> 303 314 </li> … … 306 317 <b>Top 3 Acts per Year with (where available) Details of Musical Content:</b><br/> 307 318 <button type="button" class="load-ssq" id="load-ssv-top-3-with-mir-content" onclick="ssv_load('ssv-top-3-with-mir-content')">Load query above</button> 308 <button type="button" class="exec-ssq" id="exec-ssv-top-3-with-mir-content" onclick="ssv_execute( )">Visualize Tabulated Results</button><br/>319 <button type="button" class="exec-ssq" id="exec-ssv-top-3-with-mir-content" onclick="ssv_execute(ssv_auto_focus)">Visualize Tabulated Results</button><br/> 309 320 <p>List the Top 3 entries per year, including musical details such as tempo, time-signature, and key where alignment with content in MusicBrainz was possible.</p> 310 321 </li> … … 313 324 <b>The ignominy of "nul point":</b><br/> 314 325 <button type="button" class="load-ssq" id="load-ssv-got-nul-point" onclick="ssv_load('ssv-got-nul-point')">Load query above</button> 315 <button type="button" class="exec-ssq" id="exec-ssv-got-nul-point" onclick="ssv_execute( )">Visualize Results</button><br/>326 <button type="button" class="exec-ssq" id="exec-ssv-got-nul-point" onclick="ssv_execute(ssv_auto_focus)">Visualize Results</button><br/> 316 327 <p> 317 328 Plot a bar graph showing which countries, and how … … 324 335 <b>The even more galling circumstance of getting "nul point" having won the previous year:</b><br/> 325 336 <button type="button" class="load-ssq" 
id="load-ssv-got-nul-point-after-winning" onclick="ssv_load('ssv-got-nul-point-after-winning')">Load query above</button> 326 <button type="button" class="exec-ssq" id="exec-ssv-got-nul-point-after-winning" onclick="ssv_execute( )">Visualize Results</button><br/>337 <button type="button" class="exec-ssq" id="exec-ssv-got-nul-point-after-winning" onclick="ssv_execute(ssv_auto_focus)">Visualize Results</button><br/> 327 338 <p> 328 339 Have any countries ever been in the situation of going from Hero (i.e., winning) to Zero (nul point) in back to back years in … … 342 353 <input type="text" id="ssv-voting-dataflow-jury-endyear" value="2019" style="width: 80px; padding-left: 6px"/> 343 354 </div> 344 <button type="button" class="exec-ssq" id="exec-ssv-voting-dataflow-jury" onclick="ssv_execute( 'ssv-voting-dataflow-jury')">Visualize Results</button>355 <button type="button" class="exec-ssq" id="exec-ssv-voting-dataflow-jury" onclick="ssv_execute(ssv_auto_focus,'ssv-voting-dataflow-jury')">Visualize Results</button> 345 356 </div> 346 357 <p> … … 374 385 </span> 375 386 376 <button type="button" class="exec-ssq" id="exec-ssv-voting-dataflow-tele" onclick="ssv_execute( 'ssv-voting-dataflow-tele')">Visualize Results</button><br/>387 <button type="button" class="exec-ssq" id="exec-ssv-voting-dataflow-tele" onclick="ssv_execute(ssv_auto_focus,'ssv-voting-dataflow-tele')">Visualize Results</button><br/> 377 388 </div> 378 389 <p> … … 397 408 </span> 398 409 399 <button type="button" class="exec-ssq" id="exec-ssv-jury-tele-diff" onclick="ssv_execute( 'ssv-jury-tele-diff')">Visualize Results</button><br/>410 <button type="button" class="exec-ssq" id="exec-ssv-jury-tele-diff" onclick="ssv_execute(ssv_auto_focus,'ssv-jury-tele-diff')">Visualize Results</button><br/> 400 411 </div> 401 412 <p> … … 417 428 <b>The Curse of Being the Second Performer in the Lineup?</b><br/> 418 429 <button type="button" class="load-ssq" id="load-ssv-draw-bias" 
onclick="ssv_load('ssv-draw-bias')">Load query above</button> 419 <button type="button" class="exec-ssq" id="exec-ssv-draw-bias" onclick="ssv_execute( )">Visualize Results</button><br/>430 <button type="button" class="exec-ssq" id="exec-ssv-draw-bias" onclick="ssv_execute(ssv_auto_focus)">Visualize Results</button><br/> 420 431 <p> 421 432 Plot as a bar graph how many times an entrant … … 432 443 <b>Normalized Plot of ... The Curse of Being the Second Performer in the Lineup?</b><br/> 433 444 <button type="button" class="load-ssq" id="load-ssv-draw-bias-normalized" onclick="ssv_load('ssv-draw-bias-normalized')">Load query above</button> 434 <button type="button" class="exec-ssq" id="exec-ssv-draw-bias-normalized" onclick="ssv_execute( )">Visualize Results</button><br/>445 <button type="button" class="exec-ssq" id="exec-ssv-draw-bias-normalized" onclick="ssv_execute(ssv_auto_focus)">Visualize Results</button><br/> 435 446 <p> 436 447 Same as the above, only to better take account of -
main/trunk/model-sites-dev/eurovision-lod/collect/eurovision/transform/pages/sparql.xsl
r35061 r35093 48 48 <gsf:style src="sites/{$site_name}/collect/{$collName}/css/dataviz.css" /> 49 49 <style> 50 div { padding-left: 12px; padding-right: 12px; padding-top: 6px; padding-bottom: 6px;}51 div.showmore { padding-left: 0px; padding-right: 0px; padding-top: 0px; padding-bottom: 0px;}52 53 p { padding-top: 6px; padding-bottom: 6px;}54 a { text-decoration: underline; }55 li { padding-bottom: 6px; margin-bottom: 6px; }50 #gs_content div { padding-left: 12px; padding-right: 12px; padding-top: 6px; padding-bottom: 6px;} 51 #gs_content div.showmore { padding-left: 0px; padding-right: 0px; padding-top: 0px; padding-bottom: 0px;} 52 53 #gs_content p { padding-top: 6px; padding-bottom: 6px;} 54 #gs_content a { text-decoration: underline; } 55 #gs_content li { padding-bottom: 6px; margin-bottom: 6px; } 56 56 </style> 57 57 … … 389 389 390 390 391 <div id="ssq-from-winning-to-losing" style="display: none;"> 392 <!-- --> 393 <xsl:text> 394 395 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> 396 PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> 397 PREFIX dc: <http://purl.org/dc/elements/1.1/> 398 PREFIX gsdlextracted: <http://greenstone.org/gsdlextracted#> 399 400 SELECT (?country as ?Country) (?year AS ?WinningYear) (?total as ?WinningGrandTotal) (?next_year AS ?LosingYear) (?next_total as ?LosingGrandTotal) 401 WHERE { 402 GRAPH <</xsl:text><xsl:value-of select="$graphURI"/><xsl:text>> { 403 ?esc_entrant_uri gsdlextracted:Country ?country. 404 ?esc_entrant_uri gsdlextracted:VoteGrandTotal ?total. 405 406 ?esc_entrant_uri gsdlextracted:Place ?place. 407 BIND(xsd:integer(?place) AS ?place_int). 408 FILTER(?place_int = 1). 409 410 ?esc_entrant_uri gsdlextracted:Year ?year. 411 BIND(xsd:integer(?year) AS ?year_int). 412 BIND(?year_int + 1 AS ?next_year_int). 413 BIND(str(?next_year_int) AS ?next_year). 414 415 ?next_esc_entrant_uri gsdlextracted:Country ?country. 416 ?next_esc_entrant_uri gsdlextracted:Year ?next_year. 
417 418 ?next_esc_entrant_uri gsdlextracted:VoteGrandTotal ?next_total. 419 420 ?next_esc_entrant_uri gsdlextracted:ReverseFinishingPos ?rev_finishing_pos. 421 BIND(xsd:integer(?rev_finishing_pos) AS ?rev_finishing_pos_int). 422 FILTER(?rev_finishing_pos_int = 1). 423 424 425 } 426 } 427 ORDER BY ?country 428 </xsl:text> 429 </div> 391 430 392 431 <div id="ssq-country-info" style="display: none;"> … … 528 567 529 568 <p> 530 The number of times a country has won Eurovision across the years. The includes 531 the years when only Juries voted (1956-2000) through to the introduction 532 of Televotes, where a variety of forms have been used such as only 533 Televotes, a pre-combined score based on Televotes and Jury votes, to 534 (from 2016 onwards) where the Jury and Tele votes are explicitly 535 given individually per country, then combined. 569 The number of times a country has won Eurovision 570 across the years. The includes the years when only 571 Juries voted (1956-2000) through to the introduction 572 of Televotes, where a variety of forms have been 573 used such as only Televotes, a pre-combined score 574 based on Televotes and Jury votes, to (from 2016 575 onwards) where the Jury and Tele votes are 576 explicitly given individually per country, then 577 combined. 536 578 </p> 537 579 </li> 538 580 581 582 583 <li> 584 <b>From Hero to Zero!</b><br/> 585 <button type="button" class="load-ssq" id="load-ssq-from-winning-to-losing" onclick="ssq_load('ssq-from-winning-to-losing')">Load query above</button> 586 <button type="button" class="exec-ssq" id="exec-ssq-from-winning-to-losing" onclick="ssq_execute()">Get Results</button><br/> 587 588 <p> 589 List countries where, having won in one year, they 590 have gone on to lose with the lowest score of the 591 contest the following year. Note this definition 592 doesn't quite meet the given heading of <i>From Here 593 to Zero</i>. 
There is in fact only one country, to 594 date, where this has happened (nul-point the 595 following year). View the table to see who it is 596 ... then why not test out your SPARQL querying 597 skills and see if you can modify the query used so 598 it only returns the Hero to Zero case? 599 </p> 600 </li> 539 601 540 602 <li>
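The join performed by the `ssq-from-winning-to-losing` query above (a first-place entry in one year matched against the same country's last-place entry in the following year) can be sketched in plain Python. This mirrors the query's logic only; the record layout, the function name `hero_to_last`, and the sample values are hypothetical. The `zero_only` flag shows one possible way to narrow the results to the nul-point case, which in SPARQL would presumably correspond to an additional FILTER on the following year's total.

```python
def hero_to_last(entries, zero_only=False):
    """Find countries that won one year and finished last the next.

    entries: list of dicts with keys "country", "year", "place",
             "reverse_pos" and "total" (hypothetical field names standing in
             for Country, Year, Place, ReverseFinishingPos and VoteGrandTotal).
    zero_only: if True, keep only cases where the following year's total is 0.
    """
    # Index entries by (country, year) so the following year can be looked up.
    by_key = {(e["country"], e["year"]): e for e in entries}
    results = []
    for entry in entries:
        if entry["place"] != 1:
            continue  # only winners are of interest
        following = by_key.get((entry["country"], entry["year"] + 1))
        if following is None or following["reverse_pos"] != 1:
            continue  # did not compete, or did not finish last
        if zero_only and following["total"] != 0:
            continue
        results.append((entry["country"], entry["year"], entry["total"],
                        following["year"], following["total"]))
    return sorted(results)  # ordered by country, like ORDER BY ?country


# Hypothetical placeholder data, not real contest results.
entries = [
    {"country": "Country A", "year": 2001, "place": 1, "reverse_pos": 20, "total": 100},
    {"country": "Country A", "year": 2002, "place": 20, "reverse_pos": 1, "total": 0},
    {"country": "Country B", "year": 2002, "place": 1, "reverse_pos": 20, "total": 120},
    {"country": "Country B", "year": 2003, "place": 20, "reverse_pos": 1, "total": 5},
    {"country": "Country C", "year": 2003, "place": 1, "reverse_pos": 20, "total": 90},
    {"country": "Country C", "year": 2004, "place": 10, "reverse_pos": 11, "total": 40},
]

all_cases = hero_to_last(entries)
zero_cases = hero_to_last(entries, zero_only=True)
print(all_cases)
print(zero_cases)
```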