1 | <?xml version="1.0" encoding="UTF-8"?>
|
---|
2 | <!DOCTYPE videocollection [
|
---|
3 | <!ENTITY ndash "–">
|
---|
4 | <!ENTITY mdash "—">
|
---|
5 | ]>
|
---|
6 | <xsl:stylesheet version="1.0"
|
---|
7 | xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
|
---|
8 | xmlns:java="http://xml.apache.org/xslt/java"
|
---|
9 | xmlns:util="xalan://org.greenstone.gsdl3.util.XSLTUtil"
|
---|
10 | xmlns:gslib="http://www.greenstone.org/skinning"
|
---|
11 | xmlns:gsf="http://www.greenstone.org/greenstone3/schema/ConfigFormat"
|
---|
12 | extension-element-prefixes="java util"
|
---|
13 | exclude-result-prefixes="java util">
|
---|
14 |
|
---|
15 |
|
---|
16 | <xsl:template name="coll-description">
|
---|
17 | <gsf:style src="sites/{$site_name}/collect/{$collName}/css/eurovision.css"/>
|
---|
18 | <gsf:script src="sites/{$site_name}/collect/{$collName}/js/jquery.show-more.js"/>
|
---|
19 |
|
---|
20 |
|
---|
21 | <div id="about-desc">
|
---|
22 | <h2>Introduction</h2>
|
---|
23 | <!--
|
---|
24 | <p style="padding-bottom: 10px;">
|
---|
25 | The <a href="https://eurovision.tv">Eurovision Song
|
---|
26 | Contest</a> is a live-broadcast televised event that
|
---|
27 | was first held in 1956 featuring artists singing original songs from
|
---|
28 | 7 countries. Since then it has grown into an event involving
|
---|
29 | over 40 countries, and streamed all around the world. ...
|
---|
30 |
|
---|
31 | </p>
|
---|
32 | -->
|
---|
33 |
|
---|
34 | <p style="padding-bottom: 10px;">
|
---|
35 | <i style="padding-right:6px;">A help to shore up a post war Europe in 1956 it
|
---|
36 | all began, where there were only seven countries and one
|
---|
37 | camera man!</i>
|
---|
38 | </p>
|
---|
39 | <p>
|
---|
40 | The <a href="https://eurovision.tv">Eurovision Song
|
---|
41 | Contest</a> is a long-running, live-broadcast televised multi-national
|
---|
42 | competition with a collaborative mission, not dissimilar
|
---|
43 | in spirit to the Olympics.
|
---|
44 | The contest has grown significantly from
|
---|
45 | that modest start with 7 countries (and one cameraman),
|
---|
46 | with over 40 countries competing these days&mdash;Australia
|
---|
47 | even takes part now, through a specially
|
---|
48 | arranged invitation. It's an annual celebration of
|
---|
49 | European culture and the highlight of many people's
|
---|
50 | year.
|
---|
51 | </p>
|
---|
52 |
|
---|
53 | <div id="about-show-more">
|
---|
54 | <p>
|
---|
55 | At Eurovision there is no division because wherever
|
---|
56 | you come from, Eurovision is home. The Eurovision Song
|
---|
57 | Contest is widely known as a safe space for LGBTQIA+
|
---|
58 | people and a platform for free expression. For example,
|
---|
59 | trans woman
|
---|
60 | <a href="https://en.wikipedia.org/wiki/Dana_International">Dana International</a>
|
---|
61 | won as far back as 1998.
|
---|
62 | There have been songs in many different languages over the
|
---|
63 | years, although most are in English these days. This
|
---|
64 | doesn't matter, however, because music is a language we all
|
---|
65 | know how to speak.
|
---|
66 | </p>
|
---|
67 | <p>
|
---|
68 | In its latest incarnation, after
|
---|
69 | all the performances are over, artists wait
|
---|
70 | nervously as via live television link-ups the show's hosts visit each
|
---|
71 | of the 40+ countries in turn collecting all points cast
|
---|
72 | by the country-appointed juries. This includes
|
---|
73 | the all-important top score that can be cast, 12 points
|
---|
74 | (douze points!), a double-increment up from the
|
---|
75 | 10 points awarded to the song a country ranks second,
|
---|
76 | followed by 8, 7, 6 ... 1 points awarded.
|
---|
77 | With over 20 countries competing in a final, this means
|
---|
78 | that not every performer gets points from a given country.
|
---|
79 | Next comes "the popular vote"
|
---|
80 | where fans, still grouped by country, have
|
---|
81 | the votes they cast by phone, SMS or the Eurovision app
|
---|
82 | tallied and mapped into the same format of 12 points for 1st
|
---|
83 | place, and so on.
|
---|
84 | This all culminates in a new winner being crowned, with
|
---|
85 | the competition typically being hosted the following year
|
---|
86 | in that country.
|
---|
87 | </p>
|
---|
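<!--
Editorial sketch, not used by the stylesheet: the rank-to-points mapping
described in the paragraph above, written as a small Python function. The
function name is illustrative, and the sketch assumes each country's ranking
has already been settled (ties resolved) before points are assigned.

```python
# Points awarded to a country's ten favourite songs, best first.
POINTS = [12, 10, 8, 7, 6, 5, 4, 3, 2, 1]


def award_points(ranking):
    """Map one country's ranked list of songs (best first) to points.

    zip() stops after ten songs, so anything ranked lower receives no
    entry at all. This is why not every finalist gets points from
    every country.
    """
    return {song: pts for song, pts in zip(ranking, POINTS)}
```

The same mapping applies to both the jury ranking and the popular-vote
ranking of a country, as the text describes.
-->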
88 | </div>
|
---|
89 | <gsf:script>
|
---|
90 | $('#about-show-more').showMore({
|
---|
91 | minheight: 0,
|
---|
92 | buttontxtmore:"show more ...",
|
---|
93 | buttontxtless:"... show less"
|
---|
94 | });
|
---|
95 | </gsf:script>
|
---|
96 |
|
---|
97 |
|
---|
98 | <h2>Features of this Website</h2>
|
---|
99 |
|
---|
100 | <p>
|
---|
101 | This (unofficial) website has been developed by a small
|
---|
102 | team of dedicated Digital Library researchers who also
|
---|
103 | happen to be <i>huge</i> fans of Eurovision. We wish to
|
---|
104 | share our love for the competition, and at the same time
|
---|
105 | demonstrate what is possible when&mdash;harnessing some of that
|
---|
106 | passion!&mdash;the techniques of
|
---|
107 | <a href="https://en.wikipedia.org/wiki/Linked_data">Linked
|
---|
108 | Open Data</a> are applied
|
---|
109 | to the Open Source
|
---|
110 | <a href="https://www.greenstone.org">Greenstone3</a>
|
---|
111 | Digital Library platform. For the technically interested,
|
---|
112 | see the
|
---|
113 | <a href="{$library_name}/collection/{$collName}/page/about#it-all-started-with">
|
---|
114 | <i style="padding-right: 6px;">It All Started with a Little <strike>Sparkle</strike>SPARQL</i></a>
|
---|
115 | section below for details about how the digital library was formed.
|
---|
116 | </p>
|
---|
117 |
|
---|
118 | <!--
|
---|
119 | <p>
|
---|
120 | For those who want to jump right in and access information about, as well as see and hear some of the past performances,
|
---|
121 | we suggest you
|
---|
122 | start by exploring the assembled information through
|
---|
123 | the browsing tabs, such as
|
---|
124 | <a href="{$library_name}/collection/{$collName}/browse/CL3">browse by countries</a>
|
---|
125 | if you want (for instance) to reminisce about songs your country have entered in the past, or
|
---|
126 | <a href="{$library_name}/collection/{$collName}/browse/CL4">browse by years</a> if
|
---|
127 | you are curious about who were the countries competing in that inaugural year of 1956.
|
---|
128 | Alternatively, use the quick-search box to query the DL collection for a term that you sparks
|
---|
129 | interest, such as
|
---|
130 | <a href="{$library_name}/collection/{$collName}/search/TextQuery?qs=1&rt=rd&s1.level=Doc&startPage=1&s1.query=love&s1.index=ZZ">love</a>
|
---|
131 | and
|
---|
132 | <a href="{$library_name}/collection/{$collName}/search/TextQuery?qs=1&rt=rd&s1.level=Doc&startPage=1&s1.query=amore&s1.index=ZZ">amore</a>,
|
---|
133 | or maybe something more frivolous such as
|
---|
134 | <a href="{$library_name}/collection/{$collName}/search/TextQuery?qs=1&rt=rd&s1.level=Doc&startPage=1&s1.query=la&s1.index=ZZ">la</a>.
|
---|
135 |
|
---|
136 | </p>
|
---|
137 | -->
|
---|
138 |
|
---|
139 | <p>
|
---|
140 | For those who want to jump right in and access information about, as well as see and hear some of the past performances,
|
---|
141 | we suggest you
|
---|
142 | start by exploring the assembled information through
|
---|
143 | the browsing tabs. For example:
|
---|
144 | <ul>
|
---|
145 | <li><a href="{$library_name}/collection/{$collName}/browse/CL3">Browse by countries</a>
|
---|
146 | if you want (for instance) to reminisce about songs your country has entered in the past; or</li>
|
---|
147 | <li><a href="{$library_name}/collection/{$collName}/browse/CL4">Browse by years</a> if
|
---|
148 | you are curious about which countries competed in that inaugural year of 1956.</li>
|
---|
149 | </ul>
|
---|
150 | </p>
|
---|
151 | <p>
|
---|
152 | Alternatively, use the quick-search box to query the DL collection for a term that sparks
|
---|
153 | your interest. For example:
|
---|
154 | <ul>
|
---|
155 | <li>
|
---|
156 | <a href="{$library_name}/collection/{$collName}/search/TextQuery?qs=1&amp;rt=rd&amp;s1.level=Doc&amp;startPage=1&amp;s1.query=love&amp;s1.index=ZZ">love</a>
|
---|
157 | and
|
---|
158 | <a href="{$library_name}/collection/{$collName}/search/TextQuery?qs=1&amp;rt=rd&amp;s1.level=Doc&amp;startPage=1&amp;s1.query=amore&amp;s1.index=ZZ">amore</a>,
|
---|
159 | or maybe something more frivolous such as
|
---|
160 | <a href="{$library_name}/collection/{$collName}/search/TextQuery?qs=1&amp;rt=rd&amp;s1.level=Doc&amp;startPage=1&amp;s1.query=la&amp;s1.index=ZZ">la</a>.
|
---|
161 | </li>
|
---|
162 | </ul>
|
---|
163 | </p>
|
---|
164 |
|
---|
165 |
|
---|
166 | <h2>Data Analysis and Visualization</h2>
|
---|
167 |
|
---|
168 | <gsf:script src="ext/jena/sgvizler2/sgvizler2.js"/>
|
---|
169 |
|
---|
170 | <gsf:script>
|
---|
171 | $(document).ready(
|
---|
172 | function() {
|
---|
173 |
|
---|
174 | // Example triple
|
---|
175 | // "s": { "type": "uri" , "value": "http://127.0.0.1:8383/greenstone3/library/collection/eurovision/document/HASH0191e9cc7bfdf14743472257s10" } ,
|
---|
176 | // "p": { "type": "uri" , "value": "gsdlextracted:Country" } ,
|
---|
177 | // "o": { "type": "literal" , "value": "United Kingdom" }
|
---|
178 |
|
---|
179 | sgvizler2.containerDraw('sgvizler2-country-count');
|
---|
180 | }
|
---|
181 | );
|
---|
182 | </gsf:script>
|
---|
183 |
|
---|
184 | <xsl:variable name="graphURI">https://so-we-must-think.space<xsl:value-of select="$siteURL"/><xsl:value-of select="$library_name"/>/collection/<xsl:value-of select="$collName"/></xsl:variable>
|
---|
185 | <div id="sgvizler2-country-count"
|
---|
186 | data-sgvizler-endpoint="//sowemustthink.space/greenstone3-lod3/greenstone/query"
|
---|
187 | data-sgvizler-chart="google.visualization.BarChart"
|
---|
188 | data-sgvizler-chart-options="title=Number of Songs from each Country|legend.position=none|height=900|chartArea.height=840|fontSize=11"
|
---|
189 | data-sgvizler-log="2"
|
---|
190 | style="width:900px; height:300px; margin-left: auto; margin-right: auto; overflow-y: scroll; overflow-x: hidden;">
|
---|
191 | <xsl:attribute name="data-sgvizler-query">
|
---|
192 | PREFIX rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt;
|
---|
193 | PREFIX gsdlextracted: &lt;http://greenstone.org/gsdlextracted#&gt;
|
---|
194 |
|
---|
195 | SELECT ?country (COUNT(?country) AS ?freqCount)
|
---|
196 | WHERE {
|
---|
197 | GRAPH &lt;<xsl:value-of select="$graphURI"/>&gt; {
|
---|
198 | {
|
---|
199 | SELECT DISTINCT ?country ?year WHERE {
|
---|
200 | ?s gsdlextracted:Country ?country.
|
---|
201 | ?s gsdlextracted:Year ?year.
|
---|
202 | } ORDER BY ?country ?year
|
---|
203 | }
|
---|
204 | }
|
---|
205 | }
|
---|
206 | GROUP BY ?country ORDER BY ASC(?country)
|
---|
207 | </xsl:attribute>
|
---|
208 | <xsl:text> Loading ...</xsl:text>
|
---|
209 | </div>
|
---|
210 |
|
---|
211 |
|
---|
212 | <p style="padding-top: 10px;">
|
---|
213 | All the metadata in the digital library is simultaneously
|
---|
214 | published as linked data, meaning it is possible to
|
---|
215 | extract and analyze the data contained here in a variety
|
---|
216 | of ways. To aid in such analysis we have
|
---|
217 | added in a data visualization layer to the digital
|
---|
218 | library. This is how the bar-graph above has been
|
---|
219 | created, which shows how many times each country has
|
---|
220 | competed, alphabetically sorted.
|
---|
221 | </p>
|
---|
222 | <p>
|
---|
223 | Through our:
|
---|
224 | <ul>
|
---|
225 | <li>
|
---|
226 | <a href="{$library_name}/collection/{$collName}/page/sgvizler">Visualizer page</a>
|
---|
227 | </li>
|
---|
228 | </ul>
|
---|
229 | </p>
|
---|
230 | <p>
|
---|
231 | we provide samples you can try out to give you an idea of
|
---|
232 | the sorts of visualization that can be produced. More
|
---|
233 | importantly, these samples are editable so you are free to
|
---|
234 | change them however you wish. On the visualization page
|
---|
235 | you'll find a sample that shows you how often different
|
---|
236 | countries have won Eurovision, but perhaps you'd like to
|
---|
237 | find out who has lost the most often? We also provide a
|
---|
238 | sample dataflow visualization of jury voting patterns over
|
---|
239 | the last decade, which makes for interesting viewing!
|
---|
240 | Adjust the values used to discover how this compares
|
---|
241 | with other time periods.
|
---|
242 | </p>
|
---|
243 |
|
---|
244 | <div id="viz-show-more" style="margin-bottom: 10px;">
|
---|
245 |
|
---|
246 | <p>
|
---|
247 | In addition to the visualizer, through the:
|
---|
248 | <ul>
|
---|
249 | <li>
|
---|
250 | <a href="{$library_name}/collection/{$collName}/page/sparql">Data Analysis page</a>
|
---|
251 | </li>
|
---|
252 | </ul>
|
---|
253 | you will find a set of samples you can test-drive to give you an idea of the
|
---|
254 | sorts of raw data analysis that can be done. The syntax used is called
|
---|
255 | <a href="https://en.wikipedia.org/wiki/SPARQL" target="_blank">SPARQL</a> (pronounced "sparkle"). If you are unfamiliar
|
---|
256 | with this syntax, there are a variety of tutorials available online where you can learn about the query language, such as
|
---|
257 | the one done by <a href="https://jena.apache.org/tutorials/sparql.html" target="_blank">Apache Jena</a>, an Open Source
|
---|
258 | initiative that provides a variety of Semantic Web and Linked Data tools.
|
---|
259 | As before, these samples are editable so you are free to
|
---|
260 | change them however you wish to adjust the analysis undertaken, or once you've mastered the
|
---|
261 | query syntax, develop completely original forms of
|
---|
262 | analysis.
|
---|
263 | </p>
|
---|
264 |
|
---|
265 |
|
---|
266 | <p>
|
---|
267 | We suggest starting with viewing <a href="{$library_name}/collection/{$collName}/page/sgvizler">sample visualizations</a> to see what's possible,
|
---|
268 | and making minor edits to them to adjust what is visualized.
|
---|
269 | Then, if you want to start visualizing the data in a more substantially different way
|
---|
270 | or else export the data for more detailed analysis under your own control,
|
---|
271 | switch to the <a href="{$library_name}/collection/{$collName}/page/sparql">SPARQL-based data analysis</a> page to ensure the underlying
|
---|
272 | data retrieved is as you intended. Then take the newly developed SPARQL query back to the visualizer page, and through the
|
---|
273 | additional text-input fields provided there, develop the visualization.
|
---|
274 |
|
---|
275 | </p>
|
---|
276 |
|
---|
277 | </div>
|
---|
278 |
|
---|
279 | <gsf:script>
|
---|
280 | $('#viz-show-more').showMore({
|
---|
281 | minheight: 0,
|
---|
282 | buttontxtmore:"show more ...",
|
---|
283 | buttontxtless:"... show less"
|
---|
284 | });
|
---|
285 | </gsf:script>
|
---|
286 |
|
---|
287 | <!--
|
---|
288 | <p>
|
---|
289 | If you'd like to dig into the data behind this Digital Library collection, this can be done directly
|
---|
290 | using the <a href="{$library_name}/collection/{$collName}/page/sparql">SPARQL Query interface</a>.
|
---|
291 | This is a good place to go to see what sort of data is being stored, and we provide some sample
|
---|
292 | queries to get you going. But if you like to see the data presented more visually, we suggest
|
---|
293 | you try out the <a href="{$library_name}/collection/{$collName}/page/sgvizler">SGVizler page</a>,
|
---|
294 | which takes things to the next level, using pie-charts, histograms and other forms of
|
---|
295 | visualization to present the data.
|
---|
296 | </p>
|
---|
297 |
|
---|
298 | -->
|
---|
299 |
|
---|
300 | <h2 id="it-all-started-with">It All Started with a Little <strike>Sparkle</strike>SPARQL</h2>
|
---|
301 |
|
---|
302 |
|
---|
303 | <p>
|
---|
304 | In terms of how this collection was developed using the
|
---|
305 | Greenstone3 Digital Library (DL) architecture, we are
|
---|
306 | being a touch irreverent to say <i>it all started with a
|
---|
307 | little SPARQL</i>.
|
---|
308 | It is certainly true to say that, operationally, the DL
|
---|
309 | was created using a SPARQL query that draws down JSON
|
---|
310 | records from
|
---|
311 | <a href="https://dbpedia.org" target="_blank">DBpedia</a>
|
---|
312 | about all the different entrants in Eurovision. This
|
---|
313 | is then ingested into Greenstone using its document- and
|
---|
314 | metadata-processing pipeline: expand through the <i>show
|
---|
315 | more ...</i> button below to see the actual query.
|
---|
316 | But in truth, our starting point of the SPARQL query is
|
---|
317 | only possible due to the Herculean efforts of the
|
---|
318 | contributors to the Wikipedia pages about
|
---|
319 | the Eurovision Song Contest, and following on from
|
---|
320 | that the endeavors of the DBpedia project to
|
---|
321 | transform a substantial portion of that information
|
---|
322 | into machine-readable linked data.
|
---|
323 | </p>
|
---|
324 |
|
---|
325 | <p>
|
---|
326 | Continuing the technical development of the DL,
|
---|
327 | to the DBpedia-extracted content, we then added in voting metadata&mdash;again
|
---|
328 | using the Greenstone document- and metadata-processing
|
---|
329 | pipeline&mdash;this time in the form of a CSV-based spreadsheet derived from the
|
---|
330 | <a href="https://www.kaggle.com/datagraver/eurovision-song-contest-scores-19752019" target="_blank">Kaggle Eurovision Voting dataset 1975-2019</a>.
|
---|
331 | </p>
|
---|
332 |
|
---|
333 |
|
---|
334 | <div id="dl-tech-show-more">
|
---|
335 | <p>
|
---|
336 | Here's the SPARQL query that retrieves, for every year
|
---|
337 | Eurovision has been held, the countries that took part.
|
---|
338 | At under 20 lines of code, we think it's pretty awesome!
|
---|
339 | The information retrieved includes the country, year,
|
---|
340 | title of the song, and name of the entrant (the
|
---|
341 | act/artist), amongst other things. All useful core
|
---|
342 | information to seed the digital library collection. As
|
---|
343 | the 2020 Eurovision event did not run due to the
|
---|
344 | Covid-19 pandemic, and (at the time of writing) the 2021
|
---|
345 | event is yet to occur, we have opted to filter the matches
|
---|
346 | returned to be prior to 2020.
|
---|
347 | </p>
|
---|
348 | <!--
|
---|
349 | # bind( REPLACE(str(?country_in_year), ".*(\\d{4})", "$1") AS ?year).
|
---|
350 |
|
---|
351 | PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
|
---|
352 | xsd:
|
---|
353 | skos:
|
---|
354 | prov:
|
---|
355 |
|
---|
356 | dbc:
|
---|
357 | dbp:
|
---|
358 |
|
---|
359 | dct:
|
---|
360 | -->
|
---|
361 | <pre style="background-color: #fff; color: #000; padding: 12px; margin-right: 6px;">
|
---|
362 | SELECT ?countries_in_esc_by_year ?country_in_year (?year AS ?Year) (?country AS ?Country) ?entrant (?entrant_label AS ?Creator) ?song (?song_label AS ?Title) (?was_derived_from AS ?WikipediaURL)
|
---|
363 | WHERE {
|
---|
364 | ?countries_in_esc_by_year skos:broader dbc:Countries_in_the_Eurovision_Song_Contest_by_year.
|
---|
365 |
|
---|
366 | ?country_in_year dct:subject ?countries_in_esc_by_year.
|
---|
367 | ?country_in_year dbp:year ?year.
|
---|
368 | FILTER ( xsd:integer(?year) &lt; 2020).
|
---|
369 |
|
---|
370 | ?country_in_year dbp:country ?country.
|
---|
371 |
|
---|
372 | ?country_in_year dbp:entrant ?entrant.
|
---|
373 | ?entrant rdfs:label ?entrant_label
|
---|
374 | FILTER (lang(?entrant_label) = 'en').
|
---|
375 |
|
---|
376 | ?country_in_year dbp:song ?song.
|
---|
377 | ?song rdfs:label ?song_label
|
---|
378 | FILTER (lang(?song_label) = 'en').
|
---|
379 |
|
---|
380 | OPTIONAL {
|
---|
381 | ?song prov:wasDerivedFrom ?was_derived_from
|
---|
382 | }
|
---|
383 | }
|
---|
384 | ORDER BY DESC(?countries_in_esc_by_year)
|
---|
385 | </pre>
|
---|
386 |
|
---|
387 | <p>
|
---|
388 | You can try this query out yourself if you like. Select the entirety of the SPARQL query
|
---|
389 | in the above text box, and press <i>Control-C</i> to place it in your Copy-buffer.
|
---|
390 | Next, visit the DBpedia SPARQL endpoint given below, and in the main text box of the page
|
---|
391 | that appears, press <i>Control-V</i>
|
---|
392 | to paste in your SPARQL query. Finally, click on the <i>Execute Query</i> button
|
---|
393 | to initiate the search.
|
---|
394 | <ul>
|
---|
395 | <li>
|
---|
396 | <a href="https://dbpedia.org/sparql/" target="_blank">DBpedia's SPARQL endpoint</a>
|
---|
397 | </li>
|
---|
398 | </ul>
|
---|
399 | </p>
|
---|
400 | <p>
|
---|
401 | Through the SPARQL Endpoint you can change the output format that is used to, for example, JSON or Turtle.
|
---|
402 | For convenience, if you are just interested in seeing what the outcome of running the query is, displayed as a web page:
|
---|
403 | <ul>
|
---|
404 | <li>
|
---|
405 | <a href="https://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&amp;query=SELECT+%3Fcountries_in_esc_by_year+%3Fcountry_in_year+%3Fyear+as+%3FYear+%3Fcountry+as+%3FCountry+%3Fentrant+%3Fentrant_label+as+%3FCreator+%3Fsong+%3Fsong_label+as+%3FTitle+%3Fwas_derived_from+as+%3FWikipediaURL%0D%0AWHERE+%7B%0D%0A++++%3Fcountries_in_esc_by_year+skos%3Abroader+dbc%3ACountries_in_the_Eurovision_Song_Contest_by_year.%0D%0A%0D%0A++++%3Fcountry_in_year+dct%3Asubject+%3Fcountries_in_esc_by_year.%0D%0A++++bind%28+REPLACE%28str%28%3Fcountry_in_year%29%2C+%22.*%28%5C%5Cd%7B4%7D%29%22%2C+%22%241%22%29+as+%3Fyear%29.%0D%0A++++FILTER+%28+xsd%3Ainteger%28%3Fyear%29+%3C+2020%29.%0D%0A%0D%0A++++%3Fcountry_in_year+dbp%3Acountry+%3Fcountry.%0D%0A%0D%0A++++%3Fcountry_in_year+dbp%3Aentrant+%3Fentrant.%0D%0A++++%3Fentrant+rdfs%3Alabel+%3Fentrant_label%0D%0A++++++FILTER+%28lang%28%3Fentrant_label%29+%3D+%27en%27%29.%0D%0A%0D%0A++++%3Fcountry_in_year+dbp%3Asong+%3Fsong.%0D%0A++++%3Fsong+rdfs%3Alabel+%3Fsong_label%0D%0A++++++FILTER+%28lang%28%3Fsong_label%29+%3D+%27en%27%29.%0D%0A%0D%0A++++OPTIONAL+%7B%0D%0A++++++%3Fsong+prov%3AwasDerivedFrom+%3Fwas_derived_from%0D%0A++++%7D%0D%0A%7D%0D%0AORDER+BY+DESC%28%3Fcountries_in_esc_by_year%29&amp;format=text%2Fhtml&amp;timeout=30000&amp;signal_void=on&amp;signal_unconnected=on" target="_blank">Click here to run the query directly</a>
|
---|
406 | </li>
|
---|
407 | </ul>
|
---|
408 | </p>
|
---|
409 |
|
---|
410 | <h2>Triplestore Errata</h2>
|
---|
411 |
|
---|
412 | <p>
|
---|
413 | The above SPARQL query is a good starting point to
|
---|
414 | extract all the Eurovision entries over the years;
|
---|
415 | however, a more careful study of the returned results
|
---|
416 | revealed a few complications that needed to be
|
---|
417 | addressed. One issue stems from the fact that in its
|
---|
418 | inaugural year, countries were allowed to send two
|
---|
419 | entries each. For 1956, for every URI representing a
|
---|
420 | country in that year there are two titles and two
|
---|
421 | entrants represented. As initially expressed, the
|
---|
422 | SPARQL query does not cater for this circumstance and
|
---|
423 | results in 2 x 2 = 4 combinations of artist and title
|
---|
424 | per country.
|
---|
425 | </p>
|
---|
426 | <p>
|
---|
427 | The way to address this is to include an additional
|
---|
428 | constraint that ensures that the URI representing
|
---|
429 | <i>?song</i> includes the relationship <i>dbp:artist</i>
|
---|
430 | for <i>?entrant</i>, effectively locking in to the
|
---|
431 | artist that performed that particular song. Studying
|
---|
432 | the result of this change, however, showed up a more
|
---|
433 | wide-reaching problem which was that not all the
|
---|
434 | <i>?country_in_year</i> URI entries expressed relationships
|
---|
435 | to songs and artists that were themselves URIs: sometimes
|
---|
436 | they were represented as a string literal, meaning the
|
---|
437 | added constraint would fail, and reject entirely the
|
---|
438 | details about a country's entry in that
|
---|
439 | year. Compounding this, we also saw that some of the
|
---|
440 | processing work by DBpedia to turn the manually curated
|
---|
441 | information in Wikipedia into machine-readable form
|
---|
442 | erroneously handled the formation of some of the song
|
---|
443 | titles and artists.
|
---|
444 | </p>
|
---|
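<!--
Editorial sketch, not part of the page: the 1956 duplication and the effect
of the added artist constraint can be imitated with plain Python lists. All
data values and function names here are hypothetical stand-ins; the real
fix is expressed in SPARQL against DBpedia, as the text explains.

```python
# Hypothetical 1956 data for one country-year URI: two songs, two entrants.
songs = ["song_A", "song_B"]
entrants = ["artist_X", "artist_Y"]
# Hypothetical stand-in for the dbp:artist triples (song performed by artist).
artist_of = {"song_A": "artist_X", "song_B": "artist_Y"}


def unconstrained_rows(songs, entrants):
    """Pair every song with every entrant, as the initial query does."""
    return [(s, e) for s in songs for e in entrants]


def constrained_rows(songs, entrants, artist_of):
    """Keep only the pairs where the entrant is the artist of that song."""
    return [(s, e) for s in songs for e in entrants if artist_of.get(s) == e]
```

With two songs and two entrants the first function yields four rows and the
second yields two. A song that has no artist link (for example, one held as
a plain string literal) disappears from the constrained result entirely,
which is the wider problem the paragraph above describes.
-->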
445 | <p>
|
---|
446 | The fact that the erroneous entries were strings (even
|
---|
447 | integer numbers at times!) and not URIs gave us a way in
|
---|
448 | to see how widespread the problem was. Using adapted
|
---|
449 | versions of the main SPARQL query we had formulated,
|
---|
450 | we were able to produce lists of the affected entries.
|
---|
451 | The lists are available here through the following
|
---|
452 | links:
|
---|
453 | <ul>
|
---|
454 | <li>
|
---|
455 | <a target="_blank" href="sites/{$site_name}/collect/{$collName}/prepare/problem-lod-lists/dbpedia-problem-songs.html">Problem Songs (titles are literals not URIs/IRIs)</a>
|
---|
456 | </li>
|
---|
457 | <li>
|
---|
458 | <a target="_blank" href="sites/{$site_name}/collect/{$collName}/prepare/problem-lod-lists/dbpedia-problem-entrants.html">Problem Entrants (artists are literals not URIs/IRIs)</a>
|
---|
459 | </li>
|
---|
460 | </ul>
|
---|
461 | </p>
|
---|
462 |
|
---|
463 | <p>
|
---|
464 | The generation of these lists also provided the key to
|
---|
465 | the approach we used to compensate for the complications
|
---|
466 | these issues introduced. Skipping ahead slightly to the
|
---|
467 | formation of the Digital Library collection with
|
---|
468 | Greenstone3, we make use of this software architecture's
|
---|
469 | Triplestore Extension, which means that in addition to
|
---|
470 | the main DL and Open Archives Initiative (OAI) server
|
---|
471 | endpoints, there is also a triplestore backend. While
|
---|
472 | the triplestore extension was designed to provide SPARQL
|
---|
473 | access to the metadata and document content of the DL
|
---|
474 | collections, its existence means we can include in it a
|
---|
475 | graph that represents the necessary errata information
|
---|
476 | we need to "course correct" the SPARQL query
|
---|
477 | so that it performs as intended.
|
---|
478 | </p>
|
---|
479 |
|
---|
480 | <p>
|
---|
481 | This does admittedly complicate the expression of the
|
---|
482 | query, but the additions are manageable. The expanded
|
---|
483 | query makes use of SPARQL's federated search feature:
|
---|
484 | the query starts as before with the retrieval of triples
|
---|
485 | from the DBpedia endpoint; based on resolved values of
|
---|
486 | entities such as <i>?country_in_year</i> and <i>?song</i>,
|
---|
487 | it then optionally retrieves matching items from the DL
|
---|
488 | SPARQL endpoint. The final step is to use a conditional
|
---|
489 | clause (if-statement) to test to see if the DBpedia
|
---|
490 | version of the song is a literal, and if it is and if
|
---|
491 | there is a bound value for the DL retrieved one, then it
|
---|
492 | selects that one in preference.
|
---|
493 | </p>
|
---|
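<!--
Editorial sketch, not part of the page: the final selection step described
above, written in Python rather than SPARQL. The function names and the
simple "starts with http" test for a URI are assumptions made for this
illustration only; in the query itself the equivalent test would plausibly
be built from SPARQL's IF, isLiteral and BOUND functions.

```python
from typing import Optional


def is_uri(value: str) -> bool:
    """Crude check used only in this sketch: URIs start with http(s)://."""
    return value.startswith(("http://", "https://"))


def choose_song(dbpedia_value: str, errata_value: Optional[str]) -> str:
    """Prefer the DL errata value when DBpedia returned a literal.

    errata_value is None when the optional lookup against the DL
    endpoint found no match (an unbound variable in SPARQL terms).
    """
    if not is_uri(dbpedia_value) and errata_value is not None:
        return errata_value
    return dbpedia_value
```

A DBpedia value that is already a URI is always kept, and a literal with no
errata entry is passed through unchanged rather than dropped.
-->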
494 |
|
---|
495 | <p>
|
---|
496 | The DBpedia SPARQL endpoint doesn't allow for federated
|
---|
497 | queries, and so we initiate the SPARQL queries through
|
---|
498 | the DL's SPARQL endpoint, using SERVICE blocks to specify
|
---|
499 | the parts of the query that are run on the DBpedia endpoint.
|
---|
500 | <ul>
|
---|
501 | <li>
|
---|
502 | <a href="{$library_name}/collection/{$collName}/page/sparql">DL's (local) SPARQL endpoint</a>
|
---|
503 | </li>
|
---|
504 | </ul>
|
---|
505 | </p>
|
---|
506 |
|
---|
507 | <h3>Adding in Voting Metadata</h3>
|
---|
508 |
|
---|
509 | <p>
|
---|
510 | To fulfill our vision of developing this DL collection
|
---|
511 | as a rich resource through which people can explore the
|
---|
512 | phenomenon, we went looking for voting data that was
|
---|
513 | available in a machine-readable format.
|
---|
514 | We found that data compiled through a manual curation process
|
---|
515 | about how countries have voted, going back to 1975, is available through the
|
---|
516 | <a href="https://www.kaggle.com/datagraver/eurovision-song-contest-scores-19752019">Kaggle website as an Excel spreadsheet</a>.
|
---|
517 | </p>
|
---|
518 | <p>
|
---|
519 | To incorporate this as metadata into the DL, we wrote
|
---|
520 | some Python code to transform the data into the internal
|
---|
521 | serialized metadata format used by Greenstone. Prior to
|
---|
522 | this project, the only serialized form for this was XML,
|
---|
523 | which is processed by the MetadataXML plugin. As it was
|
---|
524 | more convenient to generate JSON from our Python code,
|
---|
525 | we took the step of adding in a new plugin to
|
---|
526 | Greenstone3: MetadataJSON.
|
---|
527 | </p>
|
---|
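<!--
Editorial sketch, not the project's actual converter: one way a voting
spreadsheet exported as CSV could be regrouped into JSON records with
Python's standard library. The column names (Year, From, To, Points) and
the output record layout are hypothetical; the real Kaggle columns and the
format expected by the MetadataJSON plugin are not given in this page.

```python
import csv
import io
import json


def votes_csv_to_json(csv_text: str) -> str:
    """Group voting rows by (year, receiving country) and emit JSON.

    Each output record holds the year, the receiving country, and a
    mapping from voting country to the points it gave.
    """
    records = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["Year"], row["To"])
        record = records.setdefault(
            key,
            {"Year": int(row["Year"]), "Country": row["To"], "PointsFrom": {}},
        )
        record["PointsFrom"][row["From"]] = int(row["Points"])
    return json.dumps(list(records.values()), indent=2)
```

Grouping by year and receiving country keeps one record per entry, which
matches how the rest of the collection is organised around a country's
entry in a given year.
-->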
528 |
|
---|
529 | <h3>Page Scraping</h3>
|
---|
530 |
|
---|
531 | <p>
|
---|
532 | Despite our best intentions to work solely with
|
---|
533 | machine-readable data&mdash;primarily, as you have seen, in the
|
---|
534 | form of Linked Open Data, but also utilizing a
|
---|
535 | spreadsheet of voting data&mdash;to form the Eurovision DL,
|
---|
536 | in looking to expand the metadata in the DL to cover
|
---|
537 | details concerning the draw position of acts, and their
|
---|
538 | overall placing, we have resorted to page-scraping
|
---|
539 | content from Wikipedia itself. This was because such
|
---|
540 | information was not part of the entity extraction
|
---|
541 | process that occurs when Wikipedia is mapped to DBpedia.
|
---|
542 | </p>
|
---|
543 |
|
---|
544 | <p>
|
---|
545 | A review of Wikipedia article pages about the event in
|
---|
546 | any given year showed these pages to be especially well
|
---|
547 | curated, and included a table in each that listed the
|
---|
548 | information we sought. While there was some variation
|
---|
549 | in how this table was expressed in HTML, with a
|
---|
550 | considerable portion of the heavy lifting being done by
|
---|
551 | the Python library BeautifulSoup4, it was not too
|
---|
552 | complex a task to develop a program that extracted this
|
---|
553 | information and turned it into the newly developed
|
---|
554 | Greenstone JSON metadata format.
|
---|
555 | </p>
|
---|
556 |
|
---|
557 | <h3>Patching in Missing Data</h3>
|
---|
558 |
|
---|
559 |
|
---|
560 | <p>
|
---|
561 | Another difficulty we have encountered is that
|
---|
562 | not every country that had an entry in Eurovision
|
---|
563 | in a given year has its own standalone article page.
|
---|
564 | This leads to missing entries in the category
|
---|
565 | page for the contest in a given year, which is
|
---|
566 | problematic to us, because it is this category
|
---|
567 | information that we draw upon in our SPARQL query
|
---|
568 | to populate the DL with all the acts.
|
---|
569 | </p>
|
---|
570 | <p>
|
---|
571 | The information about all the countries competing
|
---|
572 | in a given year does, however, appear in the
|
---|
573 | article page for the contest in that year. In fact
|
---|
574 | it's in the same table we targeted to extract
|
---|
575 | draw position and placement. We therefore
|
---|
576 | wrote a further page-scraping program to compare
|
---|
577 | the countries in that table with the countries
|
---|
578 | listed on the category page for the contest in
|
---|
579 | that year. For any entries we find in the
|
---|
580 | table, but not in the Category page, we
|
---|
581 | produce a metadata record for the DL
|
---|
582 | with basic information about the entry:
|
---|
583 | country, year, song title, artist,
|
---|
584 | draw-position, placement, and (where available)
|
---|
585 | their total score.
|
---|
586 | </p>
|
---|
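<!--
Editorial sketch, not the project's scraping program: once the country
names have been extracted from both pages, the comparison described above
reduces to a set difference. The function name and the case-insensitive,
whitespace-trimmed matching are assumptions made for this illustration.

```python
def missing_from_category(table_countries, category_countries):
    """Return countries in the contest table but absent from the category page.

    Names are compared after trimming whitespace and folding case, and
    the result is sorted for stable output.
    """
    listed = {c.strip().casefold() for c in category_countries}
    return sorted(
        c for c in table_countries if c.strip().casefold() not in listed
    )
```

Each name returned would then be turned into a basic metadata record, as
the paragraph above explains.
-->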
587 | <p>
|
---|
588 | As with the problem titles and artists/entrants,
|
---|
589 | we have formulated a SPARQL query that enumerates
|
---|
590 | these missing category entrants:
|
---|
591 | <!--
|
---|
592 | We took the opportunity to add in further fields: Performing Position, Placement, Voting Total, thumbnail flag image.
|
---|
593 |
|
---|
594 |
|
---|
595 | An unintended side-affect of this is that we have also been able to expand
|
---|
596 | -->
|
---|
597 |
|
---|
598 |
|
---|
599 | <ul>
|
---|
600 | <li>
|
---|
601 | <a href="sites/{$site_name}/collect/{$collName}/prepare/problem-lod-lists/dbpedia-problem-category-in-year.html">Problem Category pages (some countries not listed in a given year despite competing)</a>
|
---|
602 | </li>
|
---|
603 | </ul>
|
---|
604 | </p>
|
---|
605 |
|
---|
606 |
|
---|
607 | </div>
|
---|
608 | <gsf:script>
|
---|
609 | $('#dl-tech-show-more').showMore({
|
---|
610 | minheight: 0,
|
---|
611 | buttontxtmore:"show more ...",
|
---|
612 | buttontxtless:"... show less"
|
---|
613 | });
|
---|
614 | </gsf:script>
|
---|
615 |
|
---|
616 |
|
---|
617 | <div>
|
---|
618 | <h2>The Gory Details</h2>
|
---|
619 | <!--
|
---|
620 | <p>
|
---|
621 | The resulting SPARQL query result set (JSON format
|
---|
622 | selected for output) is then ingested into a Greenstone
|
---|
623 | DL collection, and used in a variety of ways. For now
|
---|
624 | an (admittedly cryptic) list of technical steps that
|
---|
625 | were developed and/or deployed to provide the
|
---|
626 | functionality encountered in interacting with this site.
|
---|
627 |
|
---|
628 | <ul>
|
---|
629 | <li>New SPARQL plugin for <i>download_from.pl</i> developed, used in GLI to enter the above query</li>
|
---|
630 | <li>New SPARQL <i>Document Processing</i> plugin developed</li>
|
---|
631 | <li>Greenstone3 Apache Jena Triple Store Extension activated</li>
|
---|
632 | <li>SGVizler used to display Google Visualizations such as the pie-chart above.</li>
|
---|
633 | <li>Metadata in document view enhanced through Greenstone Format Statements micro-data</li>
|
---|
634 | <li>Custom <i>interface</i> developed</li>
|
---|
635 | </ul>
|
---|
636 | </p>
|
---|
637 | -->
|
---|
638 | <p>
|
---|
639 | Viewing the
|
---|
640 | <a download="collectionConfig.xml"
|
---|
641 | href="sites/{$site_name}/collect/{$collName}/etc/collectionConfig.xml">collection
|
---|
642 | configuration file</a> provides a good insight into how
|
---|
643 | all of these technical aspects are brought together.
|
---|
644 | </p>
|
---|
645 |
|
---|
646 | <p>
|
---|
647 | Full details of how the collection ticks are
|
---|
648 | provided through our Subversion repository. In addition to
|
---|
649 | our
|
---|
650 | <a href="https://trac.greenstone.org/browser/main/trunk/greenstone3">Greenstone3
|
---|
651 | code base</a> we have:
|
---|
652 |
|
---|
653 | <ul>
|
---|
654 | <li>The site: <a href="https://trac.greenstone.org/browser/main/trunk/model-sites-dev/eurovision-lod">eurovision-lod</a></li>
|
---|
655 | <li>The interface: <a href="https://trac.greenstone.org/browser/main/trunk/model-interfaces-dev/eurovision-lod">eurovision-lod</a></li>
|
---|
656 | <li>The triplestore extension: <a href="https://trac.greenstone.org/browser/gs2-extensions/apache-jena/trunk/src">apache-jena</a></li>
|
---|
657 | </ul>
|
---|
658 |
|
---|
659 | </p>
|
---|
660 |
|
---|
661 | </div>
|
---|
662 |
|
---|
663 | <!--
|
---|
664 | <div id="technicaldev-turnstyle" style="margin-top: 12px;">
|
---|
665 | <div class="turnstyle-header" style="background-image: none; background-color: hsl(195, 47%, 35%);">
|
---|
666 | DL Technical Development
|
---|
667 | </div>
|
---|
668 |
|
---|
669 | <div style="display: none; padding-left: 6px; padding-top: 6px; margin-left: 2px; margin-right: 2px; border-left: white solid 1px; border-right: white solid 1px; border-bottom: white solid 1px;">
|
---|
670 | <p>
|
---|
671 | In terms of how this collection was developed using the
|
---|
672 | Greenstone DL architecture, the starting point is the
|
---|
673 | formulation of a SPARQL query to retrieve from DBpedia
|
---|
674 | entries about all the entrants in the contest over the
|
---|
675 | years:
|
---|
676 |
|
---|
677 | </p>
|
---|
678 |
|
---|
679 | </div>
|
---|
680 | </div>
|
---|
681 |
|
---|
682 | <script>
|
---|
683 | <xsl:text disable-output-escaping="yes">
|
---|
684 | $(function(){
|
---|
685 | transformToTurnstyleBlock("technicaldev");
|
---|
686 | });
|
---|
687 | </xsl:text>
|
---|
688 | </script>
|
---|
689 | -->
|
---|
690 |
|
---|
691 | <!--
|
---|
692 | <div id="LOD-turnstyle" style="margin-top: 12px;">
|
---|
693 | <div class="turnstyle-header" style="background-image: none; background-color: hsl(195, 47%, 35%);">
|
---|
694 | Linked Open Data
|
---|
695 | </div>
|
---|
696 |
|
---|
697 | <div style="display: none; padding-left: 6px; padding-top: 6px; margin-left: 2px; margin-right: 2px; border-left: white solid 1px; border-right: white solid 1px; border-bottom: white solid 1px;">
|
---|
698 |
|
---|
699 |
|
---|
700 | <h2>Eurovision LOD SPARQL Endpoints</h2>
|
---|
701 | <p>
|
---|
702 | The source data can be accessed via the DBpedia SPARQL endpoint. The ingested
|
---|
703 | data (with correction) is available through the collection's local
|
---|
704 | SPARQL endpoint:
|
---|
705 | <ul>
|
---|
706 | <li>
|
---|
707 | <a href="https://dbpedia.org/sparql/">DBpedia's SPARQL endpoint</a>
|
---|
708 | </li>
|
---|
709 | <li>
|
---|
710 | <a href="{$library_name}/collection/{$collName}/page/sparql">DL's (local) SPARQL endpoint</a>
|
---|
711 | </li>
|
---|
712 | </ul>
|
---|
713 | </p>
|
---|
714 |
|
---|
715 | <h2>Eurovision LOD Errata</h2>
|
---|
716 | </div>
|
---|
717 | </div>
|
---|
718 |
|
---|
719 | <script>
|
---|
720 | <xsl:text disable-output-escaping="yes">
|
---|
721 | $(function(){
|
---|
722 | transformToTurnstyleBlock("LOD");
|
---|
723 | });
|
---|
724 | </xsl:text>
|
---|
725 | </script>
|
---|
726 | -->
|
---|
727 |
|
---|
728 | <!--
|
---|
729 | <div id="voting-turnstyle" style="margin-top: 12px;">
|
---|
730 | <div class="turnstyle-header" style="background-image: none; background-color: hsl(195, 47%, 35%);">
|
---|
731 | Voting Data
|
---|
732 | </div>
|
---|
733 |
|
---|
734 | <div style="display: none; padding-left: 6px; padding-top: 6px; margin-left: 2px; margin-right: 2px; border-left: white solid 1px; border-right: white solid 1px; border-bottom: white solid 1px;">
|
---|
735 | <p>
|
---|
736 | The voting data used in this collection is sourced from Kaggle, which in turn
|
---|
737 | is derived from work available through Data Graver:
|
---|
738 | <ul>
|
---|
739 | <li><a href="https://www.kaggle.com/datagraver/eurovision-song-contest-scores-19752019">Kaggle Eurovision Voting dataset 1975-2019</a></li>
|
---|
740 | <li><a href="https://data.world/datagraver/eurovision-song-contest-scores-1975-2019">Data Graver</a></li>
|
---|
741 | <li><a href="https://docs.google.com/spreadsheets/d/1veXpiF54hQGP4OVuf1xjowumIe8HUOhI/edit#gid=528591420">Google Spreadsheet (internal use only)</a></li>
|
---|
742 |
|
---|
743 | </ul>
|
---|
744 | </p>
|
---|
745 | </div>
|
---|
746 | </div>
|
---|
747 |
|
---|
748 | <script>
|
---|
749 | <xsl:text disable-output-escaping="yes">
|
---|
750 | $(function(){
|
---|
751 | transformToTurnstyleBlock("voting");
|
---|
752 | });
|
---|
753 | </xsl:text>
|
---|
754 | </script>
|
---|
755 |
|
---|
756 | -->
|
---|
757 |
|
---|
758 | </div>
|
---|
759 |
|
---|
760 | </xsl:template>
|
---|
761 |
|
---|
762 |
|
---|
763 | </xsl:stylesheet>
|
---|
764 |
|
---|