Changeset 34240

Timestamp:
03.07.2020 12:16:43 (5 weeks ago)
Author:
ak19
Message:

First bugfix for preserving HTML in the collection description. This fixes the bug outside GLI: it was the XSLT that was stripping out the HTML tags. Added extensive comments to explain the pitfalls in slightly differing solutions and why the current one was selected. Also documented links that were instructive, including some on the mode attribute of apply-templates, which, when present, seems to make it safe to use the most general template (the one that matches any element, *).

After this fix, one problem still remains with HTML tags in collDescription displayItems (apart from the additional problems GLI poses). As can be seen with o=xml on an appropriately crafted about page, the runtime system shifts to the bottom any text-only content that precedes HTML inside the collection description in collConfig.xml. (That is, the non-HTML text appears after the HTML even though it comes before the HTML in collConfig.xml.) The workaround for now is for the user to put such plain text in paragraph tags, so that all the collection description content appears in the order it has in collConfig.xml. In future, need to look at what is reordering the text relative to the HTML in such cases.
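A minimal sketch of the paragraph-tag workaround mentioned in the message, assuming a description displayItem in collConfig.xml. The lang attribute and the sample text are illustrative assumptions; only the displayItemList/displayItem[@name='description'] structure is taken from the changeset itself.

```xml
<!-- Hypothetical collConfig.xml fragment; text and lang attribute are assumptions. -->
<displayItemList>
  <!-- If "Some introductory text." were left as bare text before the first
       HTML element, the runtime may move it below the HTML. Wrapping it in
       <p> keeps it in document order. -->
  <displayItem name="description" lang="en">
    <p>Some introductory text.</p>
    <p>A second paragraph with <b>markup</b> that should follow it.</p>
  </displayItem>
</displayItemList>
```

With every piece of text inside an element, there is no bare text node left directly under the displayItem for the runtime to reorder.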

Files:
1 modified

  • main/trunk/greenstone3/web/interfaces/default/transform/gslib.xsl

--- main/trunk/greenstone3/web/interfaces/default/transform/gslib.xsl (r33958)
+++ main/trunk/greenstone3/web/interfaces/default/transform/gslib.xsl (r34240)
@@ -606,13 +606,82 @@
     <xsl:apply-templates select="pageResponse/collection|serviceCluster"/>
   </xsl:template>
-
+
+  <!--
+      Solution to problem where incoming collection description contains HTML and we want to preserve
+      that HTML instead of XML's default behaviour stripping out all tags and just concatenating their text
+      content.
+      The actual solution is here:
+      https://stackoverflow.com/questions/19998180/xsl-copy-nodes-without-xmlns
+      - Close but adds namespacs to the first HTML tag:
+      https://stackoverflow.com/questions/6199345/how-to-copy-all-child-nodes-of-any-type-of-a-template-context-element
+      Other links that are a learning experience:
+      - https://support.microsoft.com/en-us/help/264665/how-to-display-html-in-xsl-style-sheet
+      (tags need to come in in entity form for xslt to remove the entities in output)
+      - https://forums.asp.net/t/1414309.aspx?Decoding+HTML+after+applying+an+XLST+transform+to+an+XML+control
+      (same problem)
+      - https://docs.oracle.com/javase/tutorial/jaxp/xslt/transformingXML.html
+      (general information)
+      - https://stackoverflow.com/questions/5876382/using-xslt-to-copy-all-nodes-in-xml-with-support-for-special-cases
+      (close to solution)
+  -->
   <xsl:template match="collection|serviceCluster">
-    <xsl:value-of select="displayItemList/displayItem[@name='description']" disable-output-escaping="yes"/>
+    <!-- original way: does not preserve html tags -->
+    <!--<xsl:value-of select="displayItemList/displayItem[@name='description']" disable-output-escaping="yes"/>-->
+    <xsl:apply-templates select="displayItemList/displayItem[@name='description']"/>
+
+    <!-- Don't do this. It seems to remove any text nodes directly within this displayItem
+     before copying only the subdnodes -->
+    <!-- <xsl:apply-templates select="displayItemList/displayItem[@name='description']/*" mode="copy-no-namespaces"/> -->
+
+     <!-- Don't do this: it will also copy the <displayItem> element itself into the HTML output -->
+     <!--<xsl:copy-of select="displayItemList/displayItem[@name='description']"/>-->
+
+     <!-- The other way: requires the input to already be entity encoded for xslt to get it right,
+      to get it ending up as tags in the HTML generated. An example of it working below.
+      But that means we have to get the runtime code to send entity encoded elements, which
+      is not what we want.
+     -->
+    <!--<xsl:text disable-output-escaping="yes">&lt;b&gt;hello&lt;/b&gt;</xsl:text>-->
+
 <!-- Uncomment this section if you want the collection service links and their descriptions to appear -->
     <!--<xsl:apply-templates select="serviceList">
 
     </xsl:apply-templates>-->
   </xsl:template>
-
+
+  <!-- preserve any HTML tags *within* the collection description
+       Why is this adding an xmlns namespace to the first HTML tag encountered and converting
+       non html entities like the apostrophe character into their entity forms?
+       This seems to assume entities in the input should be converted:
+       https://stackoverflow.com/questions/31517944/xsl-disable-output-escaping-copy-of
+       But I'm wondering why when there's no entity in the input, copy-of produces entities
+       for chars like apostrophe in the output?
+       Note: there's xsl:copy and xsl:copy-of, we want copy-of!
+       - https://www.w3schools.com/xml/ref_xsl_el_copy.asp
+       - vs https://www.w3schools.com/XML/ref_xsl_el_copy-of.asp
+
+       Info on avoiding doe/disable-output-escaping at https://saxonica.plan.io/issues/3214
+       https://stackoverflow.com/questions/31517944/xsl-disable-output-escaping-copy-of
+  -->
+  <xsl:template match="displayItem[@name='description']">
+    <!-- don't do this: it adds an xmlns namespace to the first/root html element in displayItem -->
+    <!--<xsl:copy-of select="node()"/>-->
+    <xsl:apply-templates select="node()" mode="copy-no-namespaces"/>
+   </xsl:template>
+   <!-- On mode attribute: https://www.w3schools.com/xml/ref_xsl_el_apply-templates.asp
+    https://stackoverflow.com/questions/4486869/can-one-give-me-the-example-for-mode-of-template-in-xsl
+    https://docs.microsoft.com/en-us/previous-versions/dotnet/netframework-4.0/ms256045(v=vs.100)?redirectedfrom=MSDN
+   -->
+   <xsl:template match="*" mode="copy-no-namespaces">
+     <xsl:element name="{local-name()}">
+       <xsl:copy-of select="@*"/>
+       <xsl:apply-templates select="node()" mode="copy-no-namespaces"/>
+     </xsl:element>
+   </xsl:template>
+
+   <xsl:template match="comment()| processing-instruction()" mode="copy-no-namespaces">
+     <xsl:copy/>
+   </xsl:template>
+
   <xsl:template match="serviceList">
     <xsl:param name="collName"/>