Changeset 34240


Timestamp:
2020-07-03T12:16:43+12:00 (4 years ago)
Author:
ak19
Message:

First bugfix for preserving HTML in the collection description. This fixes the bug outside GLI: it was the XSLT that was stripping out HTML tags. Added extensive comments to explain the pitfalls in slightly differing solutions, and why the current one was selected. Also documented links that were instructive, including one on the mode attribute of apply-templates, which, when present, seems to be safe to use with the most general template (the one that matches any element, *).

After this fix, one problem still remains with HTML tags in collDescription displayItems (apart from the additional problems GLI poses). As can be seen when you do o=xml on an appropriately crafted about page, the runtime system shifts to the bottom any text-only content that precedes HTML inside the collection description in collConfig.xml. That is, plain text that comes before the HTML in collConfig.xml appears after the HTML in the output. The workaround for now is for the user to put such plain text in paragraph tags, so that all the collection description content appears in the order it has in collConfig.xml. In future, need to look at what is reordering the text relative to the HTML in such cases.
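
The workaround described in the message could look roughly like the fragment below. This is only a sketch: the `lang` attribute and the sample text are assumptions made for illustration, not taken from a real collection. The point is simply that the leading plain text is wrapped in `<p>` so that it keeps its position ahead of the markup that follows.

```xml
<displayItemList>
  <!-- Hypothetical description; the lang attribute and wording are illustrative only. -->
  <!-- Bare text placed directly before the <ul> would be moved below it at runtime, -->
  <!-- so the text is wrapped in a paragraph element instead. -->
  <displayItem name="description" lang="en">
    <p>Welcome to the collection.</p>
    <ul>
      <li>First highlight</li>
      <li>Second highlight</li>
    </ul>
  </displayItem>
</displayItemList>
```

With every piece of content inside an element, there is no free-standing text node left for the runtime to reorder.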

File:
1 edited

  • main/trunk/greenstone3/web/interfaces/default/transform/gslib.xsl

    r33958 r34240  
    606 606    <xsl:apply-templates select="pageResponse/collection|serviceCluster"/>
    607 607  </xsl:template>
    608  
     608
     609  <!--
     610      Solution to problem where incoming collection description contains HTML and we want to preserve
     611      that HTML instead of XSLT's default behaviour of stripping out all tags and just concatenating their text
     612      content.
     613      The actual solution is here:
     614      https://stackoverflow.com/questions/19998180/xsl-copy-nodes-without-xmlns
     615      - Close but adds namespaces to the first HTML tag:
     616      https://stackoverflow.com/questions/6199345/how-to-copy-all-child-nodes-of-any-type-of-a-template-context-element
     617      Other links that are a learning experience:
     618      - https://support.microsoft.com/en-us/help/264665/how-to-display-html-in-xsl-style-sheet
     619      (tags need to come in in entity form for xslt to remove the entities in output)
     620      - https://forums.asp.net/t/1414309.aspx?Decoding+HTML+after+applying+an+XLST+transform+to+an+XML+control
     621      (same problem)
     622      - https://docs.oracle.com/javase/tutorial/jaxp/xslt/transformingXML.html
     623      (general information)
     624      - https://stackoverflow.com/questions/5876382/using-xslt-to-copy-all-nodes-in-xml-with-support-for-special-cases
     625      (close to solution)
     626  -->
    609 627  <xsl:template match="collection|serviceCluster">
    610     <xsl:value-of select="displayItemList/displayItem[@name='description']" disable-output-escaping="yes"/>
     628    <!-- original way: does not preserve html tags -->
     629    <!--<xsl:value-of select="displayItemList/displayItem[@name='description']" disable-output-escaping="yes"/>-->   
     630    <xsl:apply-templates select="displayItemList/displayItem[@name='description']"/>
     631
     632    <!-- Don't do this. It seems to remove any text nodes directly within this displayItem
     633     before copying only the subnodes -->
     634    <!-- <xsl:apply-templates select="displayItemList/displayItem[@name='description']/*" mode="copy-no-namespaces"/> -->
     635
     636     <!-- Don't do this: it will also copy the <displayItem> element itself into the HTML output -->
     637     <!--<xsl:copy-of select="displayItemList/displayItem[@name='description']"/>-->
     638     
     639     <!-- The other way: requires the input to already be entity encoded for xslt to get it right,
     640      to get it ending up as tags in the HTML generated. An example of it working below.
     641      But that means we have to get the runtime code to send entity encoded elements, which
     642      is not what we want.
     643     -->
     644    <!--<xsl:text disable-output-escaping="yes">&lt;b&gt;hello&lt;/b&gt;</xsl:text>-->
     645   
    611 646 <!-- Uncomment this section if you want the collection service links and their descriptions to appear -->
    612 647    <!--<xsl:apply-templates select="serviceList">
     
    614 649    </xsl:apply-templates>-->
    615 650  </xsl:template>
    616  
     651
     652  <!-- preserve any HTML tags *within* the collection description
     653       Why is this adding an xmlns namespace to the first HTML tag encountered and converting
     654       non html entities like the apostrophe character into their entity forms?
     655       This seems to assume entities in the input should be converted:
     656       https://stackoverflow.com/questions/31517944/xsl-disable-output-escaping-copy-of
     657       But I'm wondering why when there's no entity in the input, copy-of produces entities
     658       for chars like apostrophe in the output?
     659       Note: there's xsl:copy and xsl:copy-of, we want copy-of!
     660       - https://www.w3schools.com/xml/ref_xsl_el_copy.asp
     661       - vs https://www.w3schools.com/XML/ref_xsl_el_copy-of.asp
     662       
     663       Info on avoiding doe/disable-output-escaping at https://saxonica.plan.io/issues/3214
     664       https://stackoverflow.com/questions/31517944/xsl-disable-output-escaping-copy-of
     665  -->
     666  <xsl:template match="displayItem[@name='description']">
     667    <!-- don't do this: it adds an xmlns namespace to the first/root html element in displayItem -->
     668    <!--<xsl:copy-of select="node()"/>-->
     669    <xsl:apply-templates select="node()" mode="copy-no-namespaces"/>
     670   </xsl:template>
     671   <!-- On mode attribute: https://www.w3schools.com/xml/ref_xsl_el_apply-templates.asp
     672    https://stackoverflow.com/questions/4486869/can-one-give-me-the-example-for-mode-of-template-in-xsl
     673    https://docs.microsoft.com/en-us/previous-versions/dotnet/netframework-4.0/ms256045(v=vs.100)?redirectedfrom=MSDN
     674   -->
     675   <xsl:template match="*" mode="copy-no-namespaces">
     676     <xsl:element name="{local-name()}">
     677       <xsl:copy-of select="@*"/>
     678       <xsl:apply-templates select="node()" mode="copy-no-namespaces"/>
     679     </xsl:element>
     680   </xsl:template>
     681   
     682   <xsl:template match="comment()| processing-instruction()" mode="copy-no-namespaces">
     683     <xsl:copy/>
     684   </xsl:template>
     685   
    617 686  <xsl:template match="serviceList">   
    618 687    <xsl:param name="collName"/>
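
To make the effect of the change concrete, here is a small illustration using a hypothetical description item as the stylesheet might receive it. The content is invented for the example. The comments describe what the removed `xsl:value-of` line and the added `copy-no-namespaces` templates would each emit for this input, assuming HTML output.

```xml
<!-- Hypothetical input; the text and link target are illustrative only. -->
<displayItem name="description">Intro text <b>in bold</b> and <a href="about.html">a link</a>.</displayItem>

<!-- r33958: xsl:value-of takes the string value of the element, so markup is dropped:
       Intro text in bold and a link.
-->

<!-- r34240: child nodes are processed in mode copy-no-namespaces; each element is
     rebuilt from its local name, attributes are copied, and text passes through:
       Intro text <b>in bold</b> and <a href="about.html">a link</a>.
-->
```

Text nodes need no template of their own in the new mode, because the built-in rule for text applies in every mode and copies the text through.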