Timestamp:
2022-09-15T16:31:38+12:00
Author:
anupama
Message:

Commit 2/2: Escaping apostrophes again, this time in the English collectionConfig.properties. (No language files other than French and English were affected.) Previous commit message: The previous set of related commits was a big mistake. The apostrophes needed escaping in collectionConfig.properties after all, because GTI won't even load entire translation chunks where apostrophes aren't escaped. So it may be ugly that GTI presents them unescaped, but at least it presents them when the apostrophes are escaped with a backslash.

File:
1 edited

  • documented-examples/trunk/wrdpdf-e/resources/collectionConfig.properties

    --- collectionConfig.properties (r36619)
    +++ collectionConfig.properties (r36621)
    @@ -2,9 +2,9 @@
     document_text=documents
     
    -shortDescription=<p>This collection demonstrates Greenstone's ability to build collections from documents provided in different formats. It contains a number of papers written by various members of the NZDL project in PDF, MSWord, RTF, and Postscript formats.</p>
    +shortDescription=<p>This collection demonstrates Greenstone\'s ability to build collections from documents provided in different formats. It contains a number of papers written by various members of the NZDL project in PDF, MSWord, RTF, and Postscript formats.</p>
     
     description1=<p>The documents in this collection have been produced by members of the Department of Computer Science, University of Waikato. The University of Waikato holds copyright. They may be distributed freely, without any restrictions.</p>
     
    -description2=<h3>How the collection works</h3> <p> This collection's configuration file, <tt>collectionConfig.xml</tt>, contains the four plugins <i>WordPlugin</i>, <i>RTFPlugin</i>, <i>PDFPlugin</i> and <i>PostScriptPlugin</i> (along with the standard four, <i>GreenstoneXMLPlugin</i>, <i>MetadataXMLPlugin</i>, <i>ArchivesInfPlugin</i> and <i>DirectoryPlugin</i>). These four plugins all extract <i>Title</i> and <i>Source</i> (i.e. filename) metadata.</p>
    +description2=<h3>How the collection works</h3> <p> This collection\'s configuration file, <tt>collectionConfig.xml</tt>, contains the four plugins <i>WordPlugin</i>, <i>RTFPlugin</i>, <i>PDFPlugin</i> and <i>PostScriptPlugin</i> (along with the standard four, <i>GreenstoneXMLPlugin</i>, <i>MetadataXMLPlugin</i>, <i>ArchivesInfPlugin</i> and <i>DirectoryPlugin</i>). These four plugins all extract <i>Title</i> and <i>Source</i> (i.e. filename) metadata.</p>
     
     description3=<p>Greenstone contains third-party software that is used to convert Word, RTF, PDF and PostScript files into HTML. The Greenstone team does not maintain these modules, although we do try to include the latest versions with each Greenstone release. Bugs arise with unusual Word documents (e.g. from older Macintosh systems), and sometimes the text is badly extracted. Some PDF files have no machine-readable text at all, comprising instead a sequence of page <i>images</i> from which text can only be extracted by optical character recognition (OCR), which Greenstone does not attempt. If you encounter these problems, you can either remove the offending documents from your collection, or try using some of the advanced plugin options to process the documents in different ways. For more information, see the Enhanced PDF and Word tutorials on the <a href='http\://wiki.greenstone.org/wiki/index.php/Tutorial_exercises'>Greenstone wiki</a>. Alternatively, a new Greenstone 3 collection will add in a pre-configured <i>UnknownConverterPlugin</i> that will use <i>apache tika</i> by default to process docx files. You can reconfigure it, or add another UnknownConverterPlugin and configure it appropriately, to process other document types, refer to <a href="http\://wiki.greenstone.org/doku.php?id=en\:plugin\:unknownconverterplugin">The UnknownConverterPlugin</a> page on the Greenstone wiki.</p>
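
The kind of change made in this changeset (turning ' into \' inside property values) could be scripted roughly as follows. This is only a sketch under stated assumptions, not the tool or method the commit author used: the function name escape_apostrophes is hypothetical, and the sketch escapes every unescaped apostrophe in a value, including ones used as HTML attribute quotes, whereas the changeset above leaves the href='...' quotes in description3 untouched.

```python
import re

# Matches an apostrophe that is not immediately preceded by a backslash.
# Assumption: a single preceding backslash always means "already escaped";
# an escaped backslash followed by an apostrophe (\\') is not handled.
_UNESCAPED_APOSTROPHE = re.compile(r"(?<!\\)'")


def escape_apostrophes(line: str) -> str:
    """Escape apostrophes in the value part of one key=value properties line.

    Blank lines, comment lines (starting with # or !) and lines without
    an '=' separator are returned unchanged.
    """
    stripped = line.lstrip()
    if not stripped or stripped[0] in "#!" or "=" not in line:
        return line
    key, sep, value = line.partition("=")
    # A callable replacement avoids any backslash processing by re.sub.
    return key + sep + _UNESCAPED_APOSTROPHE.sub(lambda m: "\\'", value)


if __name__ == "__main__":
    sample = "shortDescription=<p>This collection demonstrates Greenstone's ability</p>"
    print(escape_apostrophes(sample))
```

Because already-escaped apostrophes are skipped, running the function a second time over the same line should leave it unchanged, which matters when a file is processed by more than one commit.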