Timestamp:
2013-11-02T23:05:23+13:00
Author:
davidb
Message:

Changes after testing in preparation for SMAM keynote

File:
1 edited

  • main/trunk/model-sites-dev/multimodal-mdl/collect/js-dsp-my-ipod/transform/pages/about.xsl

    r28433 r28593  
    4747
    4848
    49     <script src="{$httpPath}/script/browser-detect.js" type="text/javascript">
     49    <script src="{$httpPath}/script2/browser-detect.js" type="text/javascript">
    5050    /* space filler needed */
    5151    </script>
     
    5555
    5656
    57 <h2>A Musical Web Observatory: </h2>
     57<h2>I Can't <i>Believe</i> It's All Javascript!</h2>
    5858
    59 <h3><i><font size="+2">Connecting People with Data, Documents, and Algorithms</font></i></h3>
    60 
     59<!--
    6160<div style="background-color: #DD3611; color: white; padding: 4pt; margin-bottom: 6pt;">
    6261<table>
     
    139138</table>
    140139</div>
     140-->
    141141
    142 <p class="about">This project is currently work in progress.  The sequence of development so far has been:
     142<p class="about">A music DL collection that explores what is possible with only client-side processing.
     143
    143144
    144145<style>
     
    149150
    150151<ol>
    151   <li>
    152   <i>Starting point:</i> Manual (command-line) aggregation of disparate
    153     resources prior to building the DL.  A set of bespoke Greenstone
    154     document processing plugins corrals the heterogeneous gathered
    155     data into a unified (homogeneous) canonical form that the DL can
    156     access and display.  Everything presented in the DL is
    157     either pre-computed (such as the self-similarity heat-maps) or else
    158     computed at build-time.
    159   </li>
    160 
    161    <li>
    162      <i>Audio-fingerprinting:</i> as before, but now metadata about the
    163      audio songs is enriched through a set of audio-fingerprinting web
    164      services.  Everything presented in the DL is still pre-computed
    165      or else computed at build-time—however, the inclusion of a
    166      &quot;Discovery&quot; block in the document view allows a user to begin
    167      to access and explore, through linked-data, information related to
    168      the song.
    169    </li>
    170 
    171   <li>
    172     <i>Client-side audio processing (and visualization):</i> the
    173     pre-computed self-similarity heat-maps are dropped from the
    174     collection building process in favour of the same information
    175     being computed through Javascript running in the user's web
    176     browser.
    177   </li>
    178 
    179   <li>
    180     <i>Embedded Meandre workflows:</i> the Meandre Workbench is
    181     integrated into Greenstone.  Audio documents in the Greenstone
    182     digital library can now be dispatched and be processed by the
    183     selected Meandre, and output from the workflow returned to the
    184     Greenstone document view—for example, playing audio that has been
    185     processed and output from the workflow.
    186   </li>
    187152
    188153  <li>
     
    193158
    194159  <li>
    195     <i>Client/Server hybrid workflows:</i> the Greenstone/Meandre
    196     integration is extended to support the dynamic
    197     transmission of <tt>executeCallBack()</tt> methods, written
    198     in Javascript in the user's browser, to be run on the
    199     Meandre server as part of the active workflow.
    200   </li>
    201 
    202   <li>
    203     <i>Forging ahead:</i> the next area to be worked on is upgrading
    204     the level of Greenstone/Meandre integration so that data produced
    205     by the workflows can be incorporated back into the underlying
    206     digital library itself, rather than being (as it currently stands)
    207     transitory data that only lives for the duration of the web page
    208     being viewed.  This will form an implementation stepping-stone to
    209     a more generalized ability to have data retrieved from other
    210     external resources (located through the Discovery block
    211     linked-data portion of the document view) ingested into the DL
    212     collection.
     160    <i>Visualization:</i> the
     161    pre-computed self-similarity heat-maps are dropped from the
     162    collection building process in favour of the same information
     163    being computed through Javascript running in the user's web
     164    browser.
    213165  </li>
    214166
    215167
     168  <li>
     169    <i>Annotation:</i> ...
     170
     171  </li>
    216172</ol>
    217173
    218174</p>
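The client/server hybrid workflow step listed above hinges on shipping a Javascript <tt>executeCallBack()</tt> method from the browser to the Meandre server. The source does not show how this is done; the following is a minimal sketch assuming the function's source text is serialized into a message and rebuilt on the server (the names <tt>packCallback</tt> and <tt>unpackCallback</tt> are illustrative, not part of Greenstone or Meandre):

```javascript
// Serialize a callback for transmission to the workflow server.
function packCallback(fn) {
  return JSON.stringify({ type: "executeCallBack", source: fn.toString() });
}

// Rebuild the function on the server side; in a real workflow engine this
// would be evaluated inside a sandboxed scripting environment, not bare eval.
function unpackCallback(message) {
  const { source } = JSON.parse(message);
  return new Function("return (" + source + ")")();
}
```

A real deployment would need to sandbox the rebuilt function; <tt>new Function</tt> is used here only to make the round trip concrete.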
    219175
    220 
    221 
    222 <!--
    223 
    224 <p class="about">This digital library collection demonstrates the
    225 integration of a variety of sources of structural metadata from both
    226 automatically derived content analysis and manually labelled
    227 ground-truth data to form a rich interactive web application suitable
    228 for musicologists (as the target end-users) to explore a collated set
    229 of music files.</p>
    230 
    231 <p class="about">While the base digital library system, <a
    232 href="http://www.greenstone.org" target="_blank">Greenstone</a>, is
    233 designed to operate with any web browser, the audio features this
    234 particular DL collection demonstrates relies on audio processing
    235 features currently only available in Firefox.  Using one of the other
    236 web browsers you will be able to access most features of the digital
    237 library collection, but when you access a particular song you may find
    238 playing the audio—in particular the panned audio
    239 effect—does not work.</p>
    240 -->
    241 
    242 
    243 <!--
    244 <p class="about">The developed system combines the open source capabilities of
    245 Greenstone, \cite{Greenstone}, NEMA \cite{Nema}, Salami \cite{Salami},
    246 and AudioDB \cite{AudioDB}.
    247 
    248 %%
    249 NEMA and Salami have been previously described in this article, and contribute
    250 manual and audio content based metadata.
    251 %%
    252 AudioDB is a raw-audio content based searching algorithm based on Local-Sensitivity Hashing \cite{LSH}.
    253 %%
    254 Greenstone is a versatile digital library system with an
    255 extensible service-based architecture.  In
    256 addition to the text-based searching and browsing capability to
    257 organize content, Greenstone provides the framework in which to harness the
    258 structural metadata from NEMA and Salami, and the audio-content based
    259 search functionality of audioDB.
    260 
    261 -->
    262 
    263 
    264 
    265 <h3>A Walkthrough</h3>
    266 
    267 
    268 <p class="about">The following walkthrough is for the initial incarnation of the DL collection, where all
    269 the information presented is either precomputed or computed at build time.</p>
    270 
    271 <p class="about">Taking as a starting point a set of music files identified as worthy
    272 of study, the <a href="#browse">figure below</a> shows the result of browsing
    273 the formed digital library collection <a href="dev?a=b&amp;rt=s&amp;s=ClassifierBrowse&amp;c=salami-audioDB&amp;cl=CL1" target="_blank">by title</a> from a web browser.
    274 The figure is a useful snapshot with which to orientate ourselves to
    275 the main structure and features of the digital library.  Functionality that
    276 recurs throughout the site is accessible through the header of the page.</p>
    277 
    278 <p class="about">This includes:
    279 <ul>
    280 
    281 <li>help and preferences (top-right);</li>
    282 <li>a quick-search option (located just
    283 below) with links to more sophisticated searching options; and</li>
    284 <li>pin-pointing where within the site a user is currently located
    285 (top-left).</li>
    286 </ul>
    287 
    288 
    289 <a name="browse" />
    290 <table style="width: 700px; margin-left: auto; margin-right: auto; margin-bottom: 6pt;">
    291   <tr>
    292     <td style="border: solid 1px;">
    293       <img style="width: 700px" src="{$httpPath}/images/figs/cropped/salami-browse.png" />
    294     </td>
    295   </tr>
    296   <tr style="background-color: #bbeebb">
    297     <td>
    298       <i>Browsing in the digital library
    299   <a href="dev?a=b&amp;rt=s&amp;s=ClassifierBrowse&amp;c=salami-audioDB&amp;cl=CL1" target="_blank">by titles</a>.</i>
    300     </td>
    301   </tr>
    302 </table>
    303 
    304 
    305 <p class="about">The content specific to this location within the site (in this case
    306 browsing by title) is shown beneath the main banner.  Various
    307 groupings of titles can be accessed by clicking on the bookshelf icons
    308 vertically aligned in the main part of the page: currently <a
    309 href="dev?a=b&amp;rt=s&amp;s=ClassifierBrowse&amp;c=salami-audioDB&amp;cl=CL1#CL1.2">C–D</a>
    310 is open, with the remaining letters of the alphabet below this,
    311 accessed through scrolling.</p>
    312 
    313 
    314 
    315 
    316 <p class="about">Interested in the song <i>Candela</i>, our curious musicologist clicks
    317 on <a href="dev?a=d&amp;ed=1&amp;book=off&amp;c=salami-audioDB&amp;d=D145&amp;dt=simple&amp;sib=1&amp;p.a=b&amp;p.sa=&amp;p.s=ClassifierBrowse" target="_blank">the link for this</a>.  This brings up the document view for this song:</p>
    318 
    319 <a name="self-similarity" />
    320 <table style="width: 700px; margin-left: auto; margin-right: auto; margin-bottom: 6pt;">
    321   <tr>
    322     <td style="border: solid 1px;">
    323       <img style="width: 700px" src="{$httpPath}/images/figs/cropped/salami-self-similarity2.png" />
    324     </td>
    325   </tr>
    326   <tr style="background-color: #bbeebb">
    327     <td>
    328       <i>The <a href="dev?a=d&amp;ed=1&amp;book=off&amp;c=salami-audioDB&amp;d=D145&amp;dt=simple&amp;sib=1&amp;p.a=b&amp;p.sa=&amp;p.s=ClassifierBrowse" target="_blank">musicologically enriched document view</a> for</i> Candela.
    329     </td>
    330   </tr>
    331 </table>
    332 
    333 <p class="about">Normally in a digital
    334 library the document view brings up a page that is strongly derived
    335 from textual metadata.  If the document viewed is a text document,
    336 some summary information such as title and author is typically
    337 presented, say in tabular form, before the main text is presented.
    338 Even in the case of multimedia digital libraries, the view presented
    339 is still strongly derived from textual metadata: this time including
    340 details such as the length of the video, the TV company that produced
    341 it, whether captions are available, and so forth, accompanied with an
    342 embedded video player for viewing the content—essentially more
    343 textual metadata (in this case the URL to the video content) which in
    344 terms of the user-interface is largely divorced from the other
    345 elements displayed on the page.</p>
    346 
    347 <p class="about">This contrasts sharply with the document view developed in this
    348 digital library.
    349 Naturally it allows the song to be played (akin to the embedded video player),
    350 however this is largely of secondary importance to the
    351 other functionality available, which is much more closely
    352 integrated.</p>
    353 
    354 <p class="about">The most striking visual component to the document view is a
    355 self-similarity &quot;heat map&quot; where the duration of the song forms both
    356 the <i>x</i>- and <i>y</i>-axis, and a red pixel located at a given <i>(x,y)</i>
    357 co-ordinate in the map represents a location where two parts of the
    358 song are strongly similar, shifting proportionally to blue to
    359 represent dissimilarity.  Given this configuration, the leading diagonal
    360 of the matrix (<i>x=y</i>) is always coloured red as this represents the
    361 comparison of one part of the song with itself.</p>
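The source does not specify which audio features or similarity measure the heat map is built from; the sketch below assumes per-frame feature vectors compared with cosine similarity, which reproduces the property just described (the leading diagonal is always maximally similar):

```javascript
// Compute an n-by-n self-similarity matrix from n per-frame feature vectors.
// Assumes non-zero feature vectors; values near 1 map to red, near 0 to blue.
function selfSimilarity(frames) {
  const dot = (a, b) => a.reduce((sum, v, i) => sum + v * b[i], 0);
  const norm = a => Math.sqrt(dot(a, a));
  return frames.map(a => frames.map(b => dot(a, b) / (norm(a) * norm(b))));
}
```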
    362 
    363 <p class="about">When the user moves the mouse cursor around the self-similarity map
    364 a highlighting circle is shown to emphasize the area the user is over,
    365 with a black dot at the centre (visible in
    366 the <a href="#self-similarity">above figure</a>); annotated vertically
    367 and horizontally are the two time-offsets in seconds that that point
    368 in the map corresponds to.  Clicking the cursor at this point results
    369 in the audio being played <i>simultaneously</i> from these two points.
    370 To aid the musicologist in listening to the two parts of the song, one
    371 part is panned to the left speaker, and the other to the right (this
    372 was implemented using the extended audio API provided by Firefox, and
    373 so this particular feature only works when viewing the collection with
    374 this browser—see implementation details below).  In our figure,
    375 the musicologist has zeroed in on the location <i>x=33</i>, <i>y=97</i>
    376 which corresponds to the start of a strong red diagonal that occurs
    377 some distance off the leading diagonal.  Listening to the two sounds
    378 played (most reliably done with headphones on), they hear that these
    379 two sections of the song are indeed repeating sections of the guitar
    380 piece <i>Candela</i> with a minor variation in the latter section
    381 where a recorder is also playing in the arrangement.</p>
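The click-to-playback behaviour described above reduces to mapping a pixel position in the map to two time offsets. A minimal sketch, assuming a square map of <tt>mapSize</tt> pixels spanning the full song duration (the function and parameter names are illustrative, not taken from the collection's code):

```javascript
// Map a click at pixel (x, y) in a square self-similarity map to the two
// playback start times (in seconds) that are played simultaneously,
// one panned to each speaker.
function mapClickToOffsets(x, y, mapSize, durationSecs) {
  const toSecs = px => (px / mapSize) * durationSecs;
  return { leftStart: toSecs(x), rightStart: toSecs(y) };
}
```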
    382 
    383 <p class="about">The structured audio time-lines (labelled A, B, ..., and 6, 5, 2,
    384 ... in the figure) located above the self-similarity map are another
    385 area of enriched musical content in the digital library. The upper
    386 line shows the ground-truth data for this song generated by
    387 the <a href="http://www.music-ir.org/?q=node/14" target="_blank">Salami project</a>;
    388 the lower line is generated by an automatic content
    389 analysis algorithm.</p>
    390 
    391 <p class="about">While there is some agreement between these two lines, there are
    392 also significant differences.  The play and search buttons within the
    393 structured time-lines (the latter represented by a magnifying glass)
    394 allow the user to investigate these structures further.  We shall
    395 return to the search functionality shortly (which is content based,
    396 using <a href="http://omras2.doc.gold.ac.uk/software/audiodb/" target="_blank">AudioDB</a>),
    397 but in the meantime note that with the time-lines positioned above the
    398 self-similarity map, there is further opportunity to study the
    399 differences between the two structured time-lines.  It is certainly the
    400 case that there are strong visual cues in the map that line up with
    401 the algorithmic time-line, even though they do not align with a
    402 boundary in the ground-truth data, and the user can click on these
    403 parts of the similarity map to hear what is occurring at these
    404 points.</p>
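One way to make the agreement between the two time-lines concrete is to check which algorithmically detected boundaries fall within some tolerance of a ground-truth boundary. This is an illustrative sketch, not a feature of the collection:

```javascript
// Return the algorithmic boundaries (in seconds) that lie within
// toleranceSecs of some ground-truth boundary.
function matchedBoundaries(groundTruth, algorithmic, toleranceSecs) {
  return algorithmic.filter(t =>
    groundTruth.some(g => Math.abs(g - t) <= toleranceSecs));
}
```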
    405 
    406 
    407 
    408 <a name="resultspage" />
    409 <table style="width: 700px; margin-left: auto; margin-right: auto; margin-bottom: 6pt;">
    410   <tr>
    411     <td style="border: solid 1px;">
    412       <img style="width: 350px" src="{$httpPath}/images/figs/cropped/salami-audiodb-search-resultspage-halfcut.png" />
    413     </td>
    414     <td style="border: solid 1px;">
    415       <img style="width: 350px" src="{$httpPath}/images/figs/cropped/salami-structured-search-resultspage-halfcut.png" />
    416     </td>
    417   </tr>
    418   <tr style="background-color: #bbeebb">
    419     <td colspan="2">
    420       <i>Audio content based search results: left) the
    421     <a href="dev?a=q&amp;sa=&amp;rt=rd&amp;s=AudioQuery&amp;c=salami-audioDB&amp;startPage=1&amp;s1.maxDocs=100&amp;s1.hitsPerPage=20&amp;q=D206.dir&amp;s1.query=D206.dir&amp;s1.offset=263&amp;s1.length=24&amp;mysongWindowDuration=6" target="_blank">result list</a> from
    422       an audio content based query taken from an extract of </i>
    423       Michelle; <i>and right)
    424       the <a href="dev?a=q&amp;sa=&amp;rt=rd&amp;s=TextQuery&amp;c=salami-audioDB&amp;startPage=1&amp;s1.query=%22b+b+c+b+c%22&amp;s1.index=JS" target="_blank">result of a structured music search</a> for
    425       content containing &quot;b b c b c&quot; as a sequence.</i>
    426     </td>
    427   </tr>
    428 </table>
    429 
    430 <p class="about">Returning to the search capability provided by the structured
    431 time-lines, the <a href="#resultspage">above figure (left)</a> shows the result of
    432 using this feature while studying the song <i>Michelle</i> by The
    433 Beatles.  In this case the user selected the section of the
    434 ground-truth time-line corresponding to the section starting &quot;I want you
    435 ...&quot;, but could have equally used the algorithmically calculated
    436 time-line, or in fact paused the song playing at any point, and
    437 started a match from there.</p>
    438 
    439 
    440 <p class="about">Not surprisingly <i>Michelle</i> is returned as the top hit (at
    441 92.1%)—we shall see shortly that this is because the system found
    442 several sections of the song that matched this—the next hit being
    443 <i>Bigger Than JC</i> at 74.4%, and so on down, where only one hit
    444 per song occurs.  Clicking on the top hit,
    445 the <a href="#audiodb">figure below</a> shows the document view that is
    446 displayed, focusing in on the key area of this screen.  This time the
    447 time-line area has an additional bar: the points within the song that
    448 AudioDB found to be similar.  Like the other time-lines, a play button
    449 is present on these segments so the user can play the matching points
    450 directly.  In this case
    451 clicking on them reveals the matching sections found correspond to
    452 melodically repeating sections of the song, only with different lyrics
    453 (&quot;I love you ...&quot; and &quot;I need you ...&quot;).</p>
    454 
    455 
    456 <p class="about">A further form of musical content-based searching is available through
    457 the main header to the digital library.  Instead of searching by title
    458 or artist (which are also available), the user can click on the
    459 &quot;search by&quot; menu next to the quick-search text box and select
    460 &quot;ground-truth structure&quot; instead.
    461 The <a href="#resultspage">figure above</a> (right) shows the result of using this
    462 option, where the user has entered &quot;b b c b c&quot; as the query, in
    463 other words searching for songs that have two sections in a row the
    464 same, then a new section, then returning to the original section,
    465 before progressing to a recurrence of the second distinct section.  In
    466 pop and rock music this is a popular sequence corresponding to verse,
    467 verse, chorus, verse, chorus.  A more intriguing framing of a query
    468 along these lines—equally possible in the digital library—would be
    469 to go to the fielded search page (accessible through the <i>advanced
    470   search</i> link below the quick-search box), enter the same
    471 music structure query, but this time combine it with other fielded
    472 query terms (Genre=Blues OR Jazz, Year&lt;1950).  Results from this
    473 query would let the musicologist explore potential evidence of this
    474 pattern of sections being used in two of the key musical genres that
    475 influenced the development of Rock and Roll music.</p>
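The structure query can be pictured as a contiguous match over a song's sequence of section labels. This naive sketch stands in for the actual Greenstone index lookup, whose mechanics the source does not detail:

```javascript
// Does the song's section-label sequence (e.g. "a a b b c b c a")
// contain the query labels (e.g. "b b c b c") as a contiguous run?
function hasSectionSequence(songStructure, query) {
  // Pad with spaces so "b" cannot accidentally match inside another label.
  const padded = " " + songStructure.trim() + " ";
  return padded.includes(" " + query.trim() + " ");
}
```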
    476 
    477 
    478 <a name="audiodb" />
    479 <table style="width: 700px; margin-left: auto; margin-right: auto; margin-bottom: 6pt;">
    480   <tr>
    481     <td style="border: solid 1px;">
    482       <img style="width: 700px" src="{$httpPath}/images/figs/cropped/salami-audiodb-search2.png" />
    483     </td>
    484   </tr>
    485   <tr style="background-color: #bbeebb">
    486     <td>
    487       <i>The document view for</i> Michelle <i><a href="dev?a=d&amp;c=salami-audioDB&amp;d=D206&amp;dt=simple&amp;p.frameOffset=133,393,264,262,392&amp;p.frameLength=24&amp;p.a=q&amp;p.s=AudioQuery&amp;hl=on&amp;ed=1#D206" target="_blank">augmented with the locations of the matches</a> of the audio query.</i>
    488     </td>
    489   </tr>
    490 </table>
    491 
    492 
    493 
    494 
    495 <h3>Implementation details</h3>
    496 
    497 <p class="about">The core interactive elements in the document view were
    498 implemented using SVG combined with Javascript.  The left- and
    499 right-panning interactively available from the self-similarity map was
    500 implemented by processing the raw audio stream, made accessible by the
    501 Firefox Audio extension
    502 <a href="https://wiki.mozilla.org/Audio_Data_API" target="_blank">API</a>.</p>
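Because the Audio Data API exposed raw sample frames to Javascript, the panning effect reduces to writing each of the two playback streams into one channel of an interleaved stereo buffer. A sketch under that assumption (the collection's actual code is not shown in the source):

```javascript
// Hard-pan two mono sample streams: the first to the left channel, the
// second to the right, producing an interleaved [L, R, L, R, ...] buffer.
function panToStereo(leftSamples, rightSamples) {
  const n = Math.max(leftSamples.length, rightSamples.length);
  const out = new Float32Array(n * 2);
  for (let i = 0; i < n; i++) {
    out[2 * i] = leftSamples[i] || 0;      // left channel
    out[2 * i + 1] = rightSamples[i] || 0; // right channel (pad with silence)
  }
  return out;
}
```

In today's browsers the same effect would more naturally use the standard Web Audio API rather than the Firefox-only extension the collection relied on.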
    503 
    504 <p class="about">AudioDB content based searching was integrated into Greenstone through
    505 two components of the digital library software architecture: its
    506 build-time document processing plugin system, and its runtime
    507 message-passing service-based framework.  The developed plugin accepts
    508 a wide range of audio formats (including OGG and MP3), and converts
    509 them to WAV, the format needed by AudioDB for processing.  The new
    510 search service took the form of a proxy, accepting messages in the
    511 XML syntax used by Greenstone, turning them into the necessary calls
    512 to the AudioDB command-line interface, and then converting the output
    513 from AudioDB back into the XML syntax expected by the digital library
    514 architecture.  Finally, the two parts were packaged to operate as a
    515 Greenstone extension; the software is available at:
    516 <a href="http://trac.greenstone.org/gs3-extensions/audioDB/trunk/src" target="_blank">http://trac.greenstone.org/gs3-extensions/audioDB/trunk/src</a>.</p>
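The proxy pattern described above amounts to translating fields parsed from an incoming Greenstone XML message into a command line for the search back-end. In this sketch both the message fields and the flag names are hypothetical placeholders, not the real audioDB interface:

```javascript
// Build an argument vector from a query object parsed out of the
// Greenstone XML message.  All field and flag names here are invented
// for illustration; the real audioDB CLI differs.
function buildAudioDBCommand(query) {
  return ["audioDB",
          "--database", query.collection + ".adb",
          "--query", query.wavFile,
          "--offset", String(query.frameOffset),
          "--length", String(query.frameLength)];
}
```

The proxy would then run this command, capture its output, and wrap the results back into the XML response format the digital library expects.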
    517176
    518177</div>