Changeset 36619 for documented-examples


Timestamp:
2022-09-15T16:13:19+12:00
Author:
anupama
Message:

Commit 2/2: undoing escaping apostrophes in English, having committed the same for French previously. Previous commit message still applies: Maybe I shouldn't have escaped the apostrophes in the collectionConfig.properties files (only affects French and English), as it doesn't get unescaped when presented to the translator in GTI. It really only helped when using emacs to edit the collectionConfig.properties files, because single quotes acted like they were starting string literals and highlighting colours changed.

Location:
documented-examples/trunk
Files:
12 edited
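
The change described in the commit message (turning \' back into ' while leaving other escapes such as \: alone) can be sketched as below. This is only an illustration under assumptions, not the procedure the author actually used: the function name and the sample line are hypothetical, and the sketch works on one line of text at a time.

```python
import re

# Matches \' when the backslash really escapes the apostrophe, i.e. when it is
# preceded by an even number (possibly zero) of other backslashes.
# A doubled backslash followed by ' is therefore left untouched.
_ESCAPED_APOSTROPHE = re.compile(r"(?<!\\)((?:\\\\)*)\\'")


def unescape_apostrophes(line: str) -> str:
    """Replace \\' with ' and keep every other escape (for example \\:) as is."""
    return _ESCAPED_APOSTROPHE.sub(r"\1'", line)


if __name__ == "__main__":
    # Hypothetical sample, shortened from the kind of line shown in this changeset.
    before = r"description2=<p>Options for BibTexPlugin\: Greenstone\'s default</p>"
    print(unescape_apostrophes(before))
```

Only the apostrophe escape is touched, so the \: sequences that remain in the edited files are preserved.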

  • documented-examples/trunk/bibtex-e/resources/collectionConfig.properties

    r36615 r36619  
    3131shortDescription=<p>This collection, which contains 135 BibTeX entries, is a collection of working papers published from 1997 to 2006 at <a href="http\://www.cs.waikato.ac.nz/">Department of Computer Science</a>, <a href="http\://www.waikato.ac.nz/">the University of Waikato</a>.</p>
    3232
    33 description1=<h3>How the collection works</h3><p>The collection configuration file (the collection\'s <tt>etc/collectionConfig.xml</tt>) begins with the specification <i>groupsize 200</i>. This groups 200 documents together into a single archive file. Bibliography collections typically have many small documents, and grouping them together prevents Greenstone\'s internal file structures from becoming bloated and occupying more disk space than necessary.</p>
     33description1=<h3>How the collection works</h3><p>The collection configuration file (the collection's <tt>etc/collectionConfig.xml</tt>) begins with the specification <i>groupsize 200</i>. This groups 200 documents together into a single archive file. Bibliography collections typically have many small documents, and grouping them together prevents Greenstone's internal file structures from becoming bloated and occupying more disk space than necessary.</p>
    3434
    35 description2=<p>Apart from the standard plugins, this collection uses <i>BibTexPlugin</i>, which processes references in the BibTeX format (well known to computer scientists). Two options have been set for BibTexPlugin\: <i>-OIDtype assigned -OIDmetadata Number</i>. This means the metadata element "Number" will be used as the record identifier, instead of Greenstone\'s default hash identifiers. These options are available for all plugins.</p>
     35description2=<p>Apart from the standard plugins, this collection uses <i>BibTexPlugin</i>, which processes references in the BibTeX format (well known to computer scientists). Two options have been set for BibTexPlugin\: <i>-OIDtype assigned -OIDmetadata Number</i>. This means the metadata element "Number" will be used as the record identifier, instead of Greenstone's default hash identifiers. These options are available for all plugins.</p>
    3636
    3737description3=<p>Fielded searching, with a form-based interface, is selected by <i>format SearchTypes "form,plain" </i> in the configuration file. In fact, a plain textual full-text search index is included in this collection as well (since <i>form</i> comes first, it is the default interface; you reach the <i>plain</i> search through the <i>Preferences</i> page).</p>
    3838
    39 description4=<p>The <i>buildtype</i> option shows that the default search engine <i>mgpp</i> is used. The <i>indexes</i> line specifies indexes for "text", and "metadata". In this case, "text" will be the original BibTeX record. "metadata" is a special keyword signifying that an index should be built for any metadata item found in the collection. Thus when the "field" menus in the collection\'s <a href="library/collection/bibtex-e/search/FieldQuery">search page</a> are pulled down, they show <i>full records</i> followed by an entry for each metadata element. In the collection\'s <tt>resources/collectionConfig.properties</tt> file, collection-level metadata <i>collectionmeta</i> can be specified for any index to determine what it is called (except for <i>metadata</i>, which produces many menu items). In this case, the <i>collectionConfig.properties</i> file specifies that the <i>text</i> index (referred to by collection\'s configuration file, <tt>collectionConfig.xml</tt>) should be named "full records" because it contains the original bibliographic record.</p>
     39description4=<p>The <i>buildtype</i> option shows that the default search engine <i>mgpp</i> is used. The <i>indexes</i> line specifies indexes for "text", and "metadata". In this case, "text" will be the original BibTeX record. "metadata" is a special keyword signifying that an index should be built for any metadata item found in the collection. Thus when the "field" menus in the collection's <a href="library/collection/bibtex-e/search/FieldQuery">search page</a> are pulled down, they show <i>full records</i> followed by an entry for each metadata element. In the collection's <tt>resources/collectionConfig.properties</tt> file, collection-level metadata <i>collectionmeta</i> can be specified for any index to determine what it is called (except for <i>metadata</i>, which produces many menu items). In this case, the <i>collectionConfig.properties</i> file specifies that the <i>text</i> index (referred to by collection's configuration file, <tt>collectionConfig.xml</tt>) should be named "full records" because it contains the original bibliographic record.</p>
    4040
    4141description5=<p>An additional keyword, "allfields", could also be used in the <i>indexes</i> line, specifying that combined searching over all indexes should be available.</p>
    4242
    43 description6=<p>The <i>levels</i> lines specifies only document level, as bibliographic records don\'t have internal structure.</p>
     43description6=<p>The <i>levels</i> lines specifies only document level, as bibliographic records don't have internal structure.</p>
    4444
    4545description7=<p>This collection contains <i>Title, Author</i>, and <i>Date</i> browsers. The <i>AZCompactList</i> classifier used for the <i>Author</i> browser is like <i>AZList</i> but generates a bookshelf for duplicate items. The BibTeX plugin records each author as <i>Author</i> metadata; it also puts a list containing all authors into the <i>Creator</i> metadata element. Consequently the <i>AZCompactList</i> classifier is based on <i>Author</i>. However, Greenstone has a standard button reading <i>authors</i> whose name is (confusingly) "Creator", so this button name is specified for the classifier.</p>
     
    4747description8=<p>The format statements for the search results list and the title browser are both determined by the <i>VList</i> specification. It gives a document icon that links to the document itself (which in this collection is the full reference); the title in bold; <i>Creator</i> metadata if there is any, otherwise <i>Editor</i> metadata; and <i>Month, Year</i> metadata if there is any. <a href="library/collection/bibtex-e/search/FieldQuery?a=q&sa=&rt=rd&s1.level=Doc&s1.case=1&s1.stem=0&s1.matchMode=some&s1.sortBy=1&s1.maxDocs=50&s1.fqv=Jain&s1.fqf=TX&s1.fqv=&s1.fqf=NU&s1.fqv=&s1.fqf=NU&s1.fqv=&s1.fqf=NU&s1.hitsPerPage=20">Here</a> is an example.</p>
    4848
    49 description9=<p>The format statement for the author browser (<i>CL2VList</i>) is more complex. The <i>AZCompactList</i> classifier generates a tree whose nodes are either leaf nodes, representing documents, or internal nodes. A metadata item called <i>numleafdocs</i> gives the total number of documents below an internal node. This format statement checks whether numleafdocs exists. If so the node must be an internal node, in which case the node is labeled by its <i>Title</i>. But beware\: this classifier is generated on <i>Author</i> metadata, so its title -- the title of the classifier -- is actually the author\'s name! This means that the bookshelf nodes <a href="library/collection/bibtex-e/browse/CL2">here</a> are labeled by author\'s name. The leaf nodes, however, are labeled the same way as documents (i.e. references) are in the search results list.</p>
     49description9=<p>The format statement for the author browser (<i>CL2VList</i>) is more complex. The <i>AZCompactList</i> classifier generates a tree whose nodes are either leaf nodes, representing documents, or internal nodes. A metadata item called <i>numleafdocs</i> gives the total number of documents below an internal node. This format statement checks whether numleafdocs exists. If so the node must be an internal node, in which case the node is labeled by its <i>Title</i>. But beware\: this classifier is generated on <i>Author</i> metadata, so its title -- the title of the classifier -- is actually the author's name! This means that the bookshelf nodes <a href="library/collection/bibtex-e/browse/CL2">here</a> are labeled by author's name. The leaf nodes, however, are labeled the same way as documents (i.e. references) are in the search results list.</p>
    5050
    51 description10=<p>The documents themselves (here is an <a href="library/collection/bibtex-e/document/98_9">example</a>) are generated by two format statements, one (a long one) called <i>DocumentHeading</i>, and another called <i>DocumentContent</i>. The <i>DocumentHeading</i>, which is the top two-thirds of the page, contains the document\'s <i>Title</i> followed by a table that gives all the metadata elements that the BibTeX plugin can generate. The role of all the <i>gsf\:switch</i> statements in the collection cofiguration file, <tt>collectionConfig.xml</tt>, is to determine which elements are defined.</p>
     51description10=<p>The documents themselves (here is an <a href="library/collection/bibtex-e/document/98_9">example</a>) are generated by two format statements, one (a long one) called <i>DocumentHeading</i>, and another called <i>DocumentContent</i>. The <i>DocumentHeading</i>, which is the top two-thirds of the page, contains the document's <i>Title</i> followed by a table that gives all the metadata elements that the BibTeX plugin can generate. The role of all the <i>gsf\:switch</i> statements in the collection cofiguration file, <tt>collectionConfig.xml</tt>, is to determine which elements are defined.</p>
    5252
    5353description11=<p>The <i>DocumentContent</i> has been overridden. When the document is displayed initially, only a hyperlink reading <i>Show/Hide BibTex Record</i> appears -- clicking this invokes JavaScript to toggle the display of the raw BibTex record (showing the BibText version of the reference), which is hidden by default.</p>
  • documented-examples/trunk/dls-e/resources/collectionConfig.properties

    r36615 r36619  
    1313description0=<p>The editors of this collection are Human Info NGO, HumanityCD Ltd, and participating organizations. Contact us at Humanitarian and Development Libraries Project, Oosterveldiaan 196, B-2610 Antwerp, Belgium, Tel 32-3-448.05.54, Fax 32-3-449.75.74, email <a href=mailto\:[email protected]>[email protected]</a>.
    1414
    15 description1=<h3>How the collection works</h3><p>The DLS collection is fairly complex. If you\'re just starting out you might prefer to look at some other collections first (e.g. <a href="library/collection/wrdpdf-e/page/about">Word and PDF demonstration</a>, or the <a href="library/collection/gsarch-e/page/about">Greenstone Archives</a>, or the <a href="library/collection/image-e/page/about">Simple Image collection</a>).</p>
     15description1=<h3>How the collection works</h3><p>The DLS collection is fairly complex. If you're just starting out you might prefer to look at some other collections first (e.g. <a href="library/collection/wrdpdf-e/page/about">Word and PDF demonstration</a>, or the <a href="library/collection/gsarch-e/page/about">Greenstone Archives</a>, or the <a href="library/collection/image-e/page/about">Simple Image collection</a>).</p>
    1616
    17 description2=<p>The collection configuration file, <tt>collectionConfig.xml</tt>, like all collection configuration files, begins with the <i>creator</i> metadata element that gives the email address of the collection\'s creator, and another metadata ("public") that determines whether the collection will appear on the home page of the Greenstone installation. Note that setting "public" to "false" only removes it from the home page; it will still be accessible in the library to anyone that knows the URL to the collection.</p>
     17description2=<p>The collection configuration file, <tt>collectionConfig.xml</tt>, like all collection configuration files, begins with the <i>creator</i> metadata element that gives the email address of the collection's creator, and another metadata ("public") that determines whether the collection will appear on the home page of the Greenstone installation. Note that setting "public" to "false" only removes it from the home page; it will still be accessible in the library to anyone that knows the URL to the collection.</p>
    1818
    1919description3=<p><b>Plugins</b>. The "plugin" lines in the collection configuration file give the plugins used by the collection. The documents in the DLS collection are in HTML, so <i>HTMLPlugin</i> must be included. The <i>description_tags</i> option processes tags in the text that define sections and section titles as described below.</p>
     
    2121description4=<p>The other plugins, <i>GreenstoneXMLPlugin, MetadataXMLPlugin, ArchivesInfPlugin, and DirectoryPlugin</i>, are used by Greenstone for internal purposes and are standard in almost all collections.</p>
    2222
    23 description5=<p><b>Searchable indexes</b>. The block of lines starting with <i>indexes</i> specifies what searchable indexes will be available. In this collection there are three\: you can see them when you pull down the "Search for" menu on the collection\'s <a href="library/collection/dls-e/search/TextQuery">search page</a>. The first index is called "chapters", the second "section titles", and the third "entire documents". The names of these three indexes are given by three properties (section_text, section_Title and document_text) in the translatable <tt>collectionConfig.properties</tt> file located in the collection\'s <tt>resources</tt> subfolder.</p>
     23description5=<p><b>Searchable indexes</b>. The block of lines starting with <i>indexes</i> specifies what searchable indexes will be available. In this collection there are three\: you can see them when you pull down the "Search for" menu on the collection's <a href="library/collection/dls-e/search/TextQuery">search page</a>. The first index is called "chapters", the second "section titles", and the third "entire documents". The names of these three indexes are given by three properties (section_text, section_Title and document_text) in the translatable <tt>collectionConfig.properties</tt> file located in the collection's <tt>resources</tt> subfolder.</p>
    2424
    2525description6=<p>The contents of the indexes -- that is, the specification of what it is that will be searched -- are defined by the <i>indexes</i> line at the beginning of this block. This specifies three indexes, two at the section level (beginning with <i>section\:</i>) and one at the document level (beginning with <i>document\:</i>). The difference is that a multi-word query will only match a section-level index if all query terms appear in the same section, whereas it will match a document-level index if the terms appear anywhere within the document (which typically comprises several sections). The first and third indexes are <i>section\:text</i> and <i>document\:text</i>, and the <i>\:text</i> means that the full text of sections and documents respectively will be searched. The second is <i>section\:Title</i>, which means that <i>Title</i> metadata will be searched -- in this case, section titles (rather than document titles). The three indexes appear in the order in which they are specified on the <i>indexes</i> line.</p>
     
    2929description8=<p>The first classifier provides access by subject. It is a <i>Hierarchy</i> classifier whose hierarchy is defined in the file <tt>etc/dls.Subject.txt</tt> (the <i>hfile</i> argument); this file is discussed below. This classifier is based on <i>dls.Subject</i> metadata, and when several books appear at a leaf of the hierarchy they are sorted by <i>dls.Title</i> metadata (as you can see when you open classifier browser <tt>CL1.4.1</tt>).  The second classifier provides access by title. It is also a <i>Hierarchy</i> classifier, this time based on <i>dls.AZList</i> metadata, whose hierarchy is defined in <tt>etc/dls.AZList.txt</tt>. This file is discussed below.  The third provides access by organization\: it is a <i>List</i> classifier based on <i>dls.Organization</i> metadata. The <i>-bookshelf_type always</i> option creates a new bookshelf for each organization, even if only one document belongs to that category.  The fourth provides access by "Howto" text\: it is a <i>List</i> classifier based on <i>dls.Keyword</i> metadata. The <i>-bookshelf_type never</i> option prevents bookshelves being created even if two documents share the same keywords.</p>
    3030
    31 description9=<p><b>Cover images</b>. Greenstone looks for a cover image for each document, whose name is the same as the document\'s but with a <i>.jpg</i> extension. This image is associated with the document, and may be displayed on the document page (see below). Cover images can be switched off by setting the -no_cover_image flag for each plugin.</p>
     31description9=<p><b>Cover images</b>. Greenstone looks for a cover image for each document, whose name is the same as the document's but with a <i>.jpg</i> extension. This image is associated with the document, and may be displayed on the document page (see below). Cover images can be switched off by setting the -no_cover_image flag for each plugin.</p>
    3232
    3333description10=<p><b>Format statements</b>. The <i>format</i> elements (&lt;format;&gt;, &lt;browse&gt;, &lt;search&gt; and &lt;display&gt; XML elements), called "format statements", govern how various parts of the collection should be displayed. The <i>VList</i> format statement applies to lists of items displayed vertically, such as the lists of titles, subjects and organisations, and the table of contents for the target documents. It is overridden for the search results list by the <i>SearchVList</i> format statement, and also for the <i>Howto</i> classifier by the <i>CL4VList</i> statement (CL4 specifies the fourth classifier).</p>
     
    3737description12=<p>Greenstone 3 uses XML for format statements, allowing librarians with XML experience to more easily understand and use format statements than Greenstone 2 which worked with a custom way of specifying format statements. For more information on understanding format statements and writing your own format statements for collections, refer to <a href="http\://wiki.greenstone.org/doku.php?id=en\:user\:gs3_format_statements">Greenstone 3 Format Statements</a> on the Greenstone wiki.</p>
    3838
    39 description13=<p><b>Collection-level metadata</b>. The <i>&lt;displayItem&gt;</i> elements under the top-level <i>&lt;displayItemList&gt;</i> in the configuration file are also standard in all Greenstone collections. They give general information about the collection, defining its name, and a description that appears on its home page. The description text (defined in the translatable <tt>resources/collectionConfig.properties</tt> files) can be seen on the DLS collection\'s home page (this text is part of it).</p>
     39description13=<p><b>Collection-level metadata</b>. The <i>&lt;displayItem&gt;</i> elements under the top-level <i>&lt;displayItemList&gt;</i> in the configuration file are also standard in all Greenstone collections. They give general information about the collection, defining its name, and a description that appears on its home page. The description text (defined in the translatable <tt>resources/collectionConfig.properties</tt> files) can be seen on the DLS collection's home page (this text is part of it).</p>
    4040
    41 description14=<p><b>Language translations</b>. In the collection configuration file, lines that look like <tt>&lt;displayItem assigned="true" dictionary="collectionConfig" key="..." name="..."/&gt;</tt> allow for translatable collection-level metadata, that are defined in the <tt>resources/collectionConfig.properties</tt> text files and can be translated in the same location such as by creating French and Spanish versions (in <tt>resources/collectionConfig_fr.properties</tt> and <tt>resources/collectionConfig_es.properties</tt>, respectively). Note that we advise translators to go through the GTI (Greenstone Translation Interface) system if they want to contribute translations to Greenstone as used by everyone, such as translations to Greenstone\'s demo collections and these documented example collections. The properties files allow for accented characters (e.g. French <i>é</i>). The files are in UTF-8, and these characters are represented by multi-byte sequences (&lt;C3&gt;&lt;A9&gt; in this case). Alternatively they could be represented by their HTML entity names (like <i>& eacute ;</i>). It makes no difference for how they appear on the screen.</p>
     41description14=<p><b>Language translations</b>. In the collection configuration file, lines that look like <tt>&lt;displayItem assigned="true" dictionary="collectionConfig" key="..." name="..."/&gt;</tt> allow for translatable collection-level metadata, that are defined in the <tt>resources/collectionConfig.properties</tt> text files and can be translated in the same location such as by creating French and Spanish versions (in <tt>resources/collectionConfig_fr.properties</tt> and <tt>resources/collectionConfig_es.properties</tt>, respectively). Note that we advise translators to go through the GTI (Greenstone Translation Interface) system if they want to contribute translations to Greenstone as used by everyone, such as translations to Greenstone's demo collections and these documented example collections. The properties files allow for accented characters (e.g. French <i>é</i>). The files are in UTF-8, and these characters are represented by multi-byte sequences (&lt;C3&gt;&lt;A9&gt; in this case). Alternatively they could be represented by their HTML entity names (like <i>& eacute ;</i>). It makes no difference for how they appear on the screen.</p>
    4242
    4343description15=<p><b>Description tags</b>. The description tags recognized by <i>HTMLPlugin</i> are inserted into the HTML source text of the documents to define where sections begin and end, and to specify section titles. They look like this\: <pre> &lt;!-- &lt;Section&gt; &lt;Description&gt; &lt;Metadata name="Title"&gt; Realizing human rights for poor people\: Strategies for achieving the international development targets &lt;/Metadata&gt; &lt;/Description&gt; --&gt; (text of section goes here) &lt;!-- &lt;/Section&gt; --&gt; </pre> The &lt;!-- ... --&gt; markers are used to ensure that these tags are marked as comments in HTML and therefore do not affect document formatting. In the <i>Description</i> part other kinds of metadata can be specified, but this is not done for the style of collection we are describing here. Exactly the same specification (including the &lt;!-- ... --&gt; markers) can be used in Word documents too.</p>
    4444
    45 description16=<p><b>Metadata Files</b>. Metadata for all documents in the DLS collection is provided in metadata.xml files, one per document folder. In this collection\'s <tt>import/r0087e</tt> is the <tt>metadata.xml</tt> file for one book -- <i>Income generation and money management\: training women as entrepreneurs</i> -- which is a block of about ten lines encased in &lt;<i>FileSet</i>&gt; ... &lt;<i>/FileSet</i>&gt; tags. It defines <i>dls.Title</i>, <i>dls.Language</i>, <i>dls.Subject</i> and <i>dls.AZList</i> metadata. More than one value can be specified for any metadata item. For example, this book has two dls.Subject classifications. Both of these are stored as metadata values for this particular document (because <i>mode=accumulate</i> is specified; the alternative, and the default, is <i>mode=override</i>).</p>
     45description16=<p><b>Metadata Files</b>. Metadata for all documents in the DLS collection is provided in metadata.xml files, one per document folder. In this collection's <tt>import/r0087e</tt> is the <tt>metadata.xml</tt> file for one book -- <i>Income generation and money management\: training women as entrepreneurs</i> -- which is a block of about ten lines encased in &lt;<i>FileSet</i>&gt; ... &lt;<i>/FileSet</i>&gt; tags. It defines <i>dls.Title</i>, <i>dls.Language</i>, <i>dls.Subject</i> and <i>dls.AZList</i> metadata. More than one value can be specified for any metadata item. For example, this book has two dls.Subject classifications. Both of these are stored as metadata values for this particular document (because <i>mode=accumulate</i> is specified; the alternative, and the default, is <i>mode=override</i>).</p>
    4646
    4747description17=<p><b>Hierarchy files</b>. Hierarchy files contain a succession of lines each of which has three items. The first item is a text string which is matched against the metadata that occurs in the <i>metadata.xml</i> file described above. The second item is a number that defines the position in the hierarchy. The third item is a text string that describes the node of the hierarchy on the web pages that Greenstone generates.</p>
  • documented-examples/trunk/garish-e/resources/collectionConfig.properties

    r36615 r36619  
    1717description1=<p>Greenstone 3 uses default stylesheets, which can be overridden for all collections in a site or for any particular collection. This documented example collection covers the last case.</p>
    1818
    19 description2=<h3>How the collection works</h3><p>The <b>global</b> format statement contains a link to the collection\'s custom stylesheet, which is located inside the collection\: \n\
     19description2=<h3>How the collection works</h3><p>The <b>global</b> format statement contains a link to the collection's custom stylesheet, which is located inside the collection\: \n\
    2020<pre>&lt;xsl\:template name="additionalHeaderContent"&gt; \n\
    2121    &lt;xsl\:variable name="httpCollection"&gt; \n\
     
    2727</p>
    2828
    29 description3=<p>Next, a folder named <i>style</i> is created within the collection and a new text file, called <i>custom-style.css</i>, is created within that folder. The css suffix indicates it\'s a <i>Cascading Style Sheet</i>. CSS files define the look of web pages such as the colours, borders, fonts, heading styles and more. CSS files are just text files, and can thus be edited with any text editor.</p>
     29description3=<p>Next, a folder named <i>style</i> is created within the collection and a new text file, called <i>custom-style.css</i>, is created within that folder. The css suffix indicates it's a <i>Cascading Style Sheet</i>. CSS files define the look of web pages such as the colours, borders, fonts, heading styles and more. CSS files are just text files, and can thus be edited with any text editor.</p>
    3030
    31 description4=<p>The default Greenstone CSS style sheets define certain styles for all collections, that are <i>overridden</i> for the collection by defining CSS rules within its new custom stylesheet. It is by linking the CSS file in the Greenstone collection\'s <b>global</b> format statement as above, that the general Greenstone CSS styling rules get overridden at the collection level.</p>
     31description4=<p>The default Greenstone CSS style sheets define certain styles for all collections, that are <i>overridden</i> for the collection by defining CSS rules within its new custom stylesheet. It is by linking the CSS file in the Greenstone collection's <b>global</b> format statement as above, that the general Greenstone CSS styling rules get overridden at the collection level.</p>
    3232
    3333description5=<p>You can quickly learn how to write CSS at <a href="https\://www.w3schools.com/css/default.asp">W3schools</a> and other online sites.</p>
  • documented-examples/trunk/gsarch-e/resources/collectionConfig.properties

    r36615 r36619  
    1313description1=<h3>How the collection works</h3><p>The Greenstone Archives collection uses the <i>Email</i> plugin, which parses files in email formats. In this case, there is a file per month per mailing list, and each file contains many email messages. The <i>Email</i> plugin splits these into individual documents, and produces <i>Title</i>, <i>Subject</i>, <i>From</i>, <i>FromName</i>, <i>FromAddr</i>, <i>Date</i>, <i>DateText</i>, <i>InReplyTo</i>, and optionally <i>Headers</i>, metadata.</p>
    1414
    15 description2=<p>The collection configuration file, <tt>etc/collectionConfig.xml</tt> specifies <i>&lt;importOption name="groupsize" value="200"/&gt;</i>. This groups documents together into groups of 200. Email collections typically have many small documents, and grouping them together prevents Greenstone\'s internal file structures from becoming bloated and occupying more disk space than necessary. Notice that the <i>Email</i> plugin first splits the input files up into individual Emails, then <i>groupsize</i> groups them together again. This allows the collection designer to control what is going on.</p>
     15description2=<p>The collection configuration file, <tt>etc/collectionConfig.xml</tt> specifies <i>&lt;importOption name="groupsize" value="200"/&gt;</i>. This groups documents together into groups of 200. Email collections typically have many small documents, and grouping them together prevents Greenstone's internal file structures from becoming bloated and occupying more disk space than necessary. Notice that the <i>Email</i> plugin first splits the input files up into individual Emails, then <i>groupsize</i> groups them together again. This allows the collection designer to control what is going on.</p>
    1616
    1717description3=<p>The <i>indexes</i> line specifies 3 searchable indexes, which can be seen by clicking beside the word "Messages" on the <a href="library/collection/gsarch-e/search/TextQuery">search page</a> to reveal a drop-down menu. The first (called <i>Messages</i>) is created from the document text, while the others are formed from <i>From</i> and <i>Subject</i> metadata.</p>
    1818
    19 description4=<p>There are three classifiers, based on <i>Subject</i>, <i>FromName</i>, and <i>Date</i> metadata. The <i>AZCompactList</i> classifier used for the first two is like <i>AZList</i> but generates a bookshelf for duplicate items, as illustrated <a href="library/collection/gsarch-e/browse/CL1">here</a>. This is represented by a tree structure whose nodes are either leaf nodes, representing documents, or internal nodes. A metadata item called numleafdocs gives the total number of documents below an internal node. The format statement for the first classifier, called <i>CL1Vlist</i>, checks whether this item exists. If so the node must be an internal one, in which case it is labeled by its <i>Title</i>. Otherwise the node\'s label starts with the <i>Subject</i> which links to the document, then gives <i>FromName</i> metadata, with a link to "Search by Sender", followed by the <i>DateText</i>.</p>
     19description4=<p>There are three classifiers, based on <i>Subject</i>, <i>FromName</i>, and <i>Date</i> metadata. The <i>AZCompactList</i> classifier used for the first two is like <i>AZList</i> but generates a bookshelf for duplicate items, as illustrated <a href="library/collection/gsarch-e/browse/CL1">here</a>. This is represented by a tree structure whose nodes are either leaf nodes, representing documents, or internal nodes. A metadata item called numleafdocs gives the total number of documents below an internal node. The format statement for the first classifier, called <i>CL1Vlist</i>, checks whether this item exists. If so the node must be an internal one, in which case it is labeled by its <i>Title</i>. Otherwise the node's label starts with the <i>Subject</i> which links to the document, then gives <i>FromName</i> metadata, with a link to "Search by Sender", followed by the <i>DateText</i>.</p>
    2020
    2121description5=<p>The second classifier (<i>CL2Vlist</i>) is similar, but shows slightly different information -- the result can be seen <a href="library/collection/gsarch-e/browse/CL2">here</a>. For internal nodes, the actual number of leaf documents (<i>numleafdocs</i>) is given in parentheses after the <i>Title</i>. For document nodes the <i>FromName</i>, with a link to "Search By Sender", <i>Subject</i> (linked to the document), and <i>DateText</i> metadata is shown.</p>
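
Note on the gsarch-e text above: the classifier tree in description4 and description5 (leaf nodes for documents, internal nodes for bookshelves, and a `numleafdocs` count on internal nodes) can be illustrated with a small sketch. This is not Greenstone's implementation. The `Node` class and the `numleafdocs` and `cl2_label` functions are hypothetical names chosen for illustration; only the metadata names (`Title`, `FromName`, `Subject`, `DateText`, `numleafdocs`) come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical classifier node: a leaf document or an internal bookshelf."""
    metadata: Dict[str, str]
    children: List["Node"] = field(default_factory=list)


def numleafdocs(node: Node) -> int:
    """Count the leaf documents at or below a node (a leaf counts as one)."""
    if not node.children:
        return 1
    return sum(numleafdocs(child) for child in node.children)


def cl2_label(node: Node) -> str:
    """Label a node roughly the way description5 describes for CL2Vlist.

    Internal nodes show their Title followed by the leaf count in
    parentheses; document nodes show FromName, Subject and DateText.
    """
    if node.children:
        return f"{node.metadata['Title']} ({numleafdocs(node)})"
    m = node.metadata
    return f"{m['FromName']}, {m['Subject']}, {m['DateText']}"
```

In Greenstone itself the format statement only tests whether `numleafdocs` exists to tell internal nodes from documents; the sketch makes the same distinction by checking for children.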
  • documented-examples/trunk/image-e/resources/collectionConfig.properties

    r36615 r36619  
    66sampleoid=D2
    77
    8 shortDescription=<p>This is a basic image collection that contains no text and no explicit metadata. Several JPEG files are placed in the import directory prior to importing and building the collection, that\'s all.</p>
     8shortDescription=<p>This is a basic image collection that contains no text and no explicit metadata. Several JPEG files are placed in the import directory prior to importing and building the collection, that's all.</p>
    99
    1010description1=<p>The images in this collection have been produced by members of the Department of Computer Science, University of Waikato. The University of Waikato holds copyright. They may be distributed freely, without any restrictions.</p>
     
    1414description3=<p>There is only one plugin, <i>ImagePlugin</i>, aside from the others that are always present (crucially <i>GreenstoneXMLPlugin</i>, <i>MetadataXMLPlugin</i>, <i>ArchivesInfPlugin</i>, <i>DirectoryPlugin</i>). <i>ImagePlugin</i> relies on the existence of two programs from the ImageMagick suite (<a href="http\://www.imagemagick.org">www.imagemagick.org</a>)\: <i>convert</i> and <i>identify</i>. Greenstone 3 binaries come bundled with Imagemagick as one of the components that can be optionally installed. Greenstone will not be able to build the collection correctly unless an ImageMagick is installed on your computer.</p>
    1515
    16 description4=<p><i>ImagePlugin</i> automatically creates a thumbnail and generates the following metadata for each image in the collection\:  <blockquote> <table border=0 cellspacing=0> <tr><td width=125 valign=top><i>Image</i></td><td>Name of file containing the image <tr> <tr><td valign=top><i>ImageWidth</i></td><td>Width of image (in pixels) <tr> <tr><td valign=top><i>ImageHeight</i></td><td>Height of image (in pixels) <tr> <tr><td valign=top><i>Thumb</i></td><td> Name of gif file containing thumbnail of image <tr> <tr><td valign=top><i>ThumbWidth</i></td><td>Width of thumbnail image (in pixels) <tr> <tr><td valign=top><i>ThumbHeight</i></td><td>Height of thumbnail image (in pixels) <tr> <tr><td valign=top><i>thumbicon</i></td><td>Full pathname specification of thumbnail image <tr> <tr><td valign=top><i>assocfilepath</i></td><td>Pathname of image directory in the collection\'s <i>assoc</i> directory <tr> </table> </blockquote></p>
     16description4=<p><i>ImagePlugin</i> automatically creates a thumbnail and generates the following metadata for each image in the collection\:  <blockquote> <table border=0 cellspacing=0> <tr><td width=125 valign=top><i>Image</i></td><td>Name of file containing the image <tr> <tr><td valign=top><i>ImageWidth</i></td><td>Width of image (in pixels) <tr> <tr><td valign=top><i>ImageHeight</i></td><td>Height of image (in pixels) <tr> <tr><td valign=top><i>Thumb</i></td><td> Name of gif file containing thumbnail of image <tr> <tr><td valign=top><i>ThumbWidth</i></td><td>Width of thumbnail image (in pixels) <tr> <tr><td valign=top><i>ThumbHeight</i></td><td>Height of thumbnail image (in pixels) <tr> <tr><td valign=top><i>thumbicon</i></td><td>Full pathname specification of thumbnail image <tr> <tr><td valign=top><i>assocfilepath</i></td><td>Pathname of image directory in the collection's <i>assoc</i> directory <tr> </table> </blockquote></p>
    1717
    18 description5=<p>The image is stored as an "associated file" in the <i>assoc</i> subdirectory of the collection\'s <i>index</i> directory. (<i>Index</i> is where all files necessary to serve the collection are placed, to make it self-contained.) For any document, its thumbnail and image are both in a subdirectory whose filename is given by <i>assocfilepath</i>. The metadata element <i>thumbicon</i> is set to the full pathname specification of the thumbnail image, and can be used in the same way as <i>srcicon</i> (see the MSWord and PDF demonstration collection).</p>
     18description5=<p>The image is stored as an "associated file" in the <i>assoc</i> subdirectory of the collection's <i>index</i> directory. (<i>Index</i> is where all files necessary to serve the collection are placed, to make it self-contained.) For any document, its thumbnail and image are both in a subdirectory whose filename is given by <i>assocfilepath</i>. The metadata element <i>thumbicon</i> is set to the full pathname specification of the thumbnail image, and can be used in the same way as <i>srcicon</i> (see the MSWord and PDF demonstration collection).</p>
    1919
    2020description6=<p>The <tt>browse</tt> format statement in the collection configuration file, <tt>collectionConfig.xml</tt>, dictates how the document will appear, and <a href="library/collection/image-e/document/D2">this</a> is the result. There is no document text (if there were, it would be producible by <i>&lt;xsl\:call-template name="documentNodeText"/&gt;</i> in format statements). What is shown is the image itself, along with some metadata extracted from it.</p>
  • documented-examples/trunk/lomdemo-e/resources/collectionConfig.properties

    r36615 r36619  
    88text_and_rawtext=All text
    99
    10 shortDescription=<p>This collection is a sample excerpt of educational resources from the University of Calgary\'s Learning Commons Educational Object Repository (no longer active). Taken from the subject areas of the arts and science, 38 items from the repository were exported in the IEEE LOM (Learning Object Metadata) format and digested into a Greenstone collection. For sample LOM metadata, see the records <tt>arts/657841.xml</tt> or <tt>record science/582041.xml</tt>.</p>
     10shortDescription=<p>This collection is a sample excerpt of educational resources from the University of Calgary's Learning Commons Educational Object Repository (no longer active). Taken from the subject areas of the arts and science, 38 items from the repository were exported in the IEEE LOM (Learning Object Metadata) format and digested into a Greenstone collection. For sample LOM metadata, see the records <tt>arts/657841.xml</tt> or <tt>record science/582041.xml</tt>.</p>
    1111
    1212description1=<p>Traditional educational learning object repositories base searching and browsing around the provided metadata. This demonstration collection goes one step further and provides <i>full-text</i> indexing of the on-line resources, where possible.</p>
    1313
    14 description2=<p>Browse around the collection\'s items ordered by subject then title, or view the items chronologically. Alternatively search the text or titles of the items in the collection, optionally restricted to arts or science. When you view an item from the collection various views of it are available. You start by viewing its Learning Object Metadata (LOM) record in a tabulated form and divided into sections\: these sections can be expanded or contracted to reveal more or less information as desired. Use the tabs at the top of the table to change the view of the learning object. There will always be a tab for "XML Record" which displays the metadata in its original IEEE LOM format. Depending on whether or not the learning object references an on-line resource that is available for indexing, a third tab may be present that displays the source document.</p>
     14description2=<p>Browse around the collection's items ordered by subject then title, or view the items chronologically. Alternatively search the text or titles of the items in the collection, optionally restricted to arts or science. When you view an item from the collection various views of it are available. You start by viewing its Learning Object Metadata (LOM) record in a tabulated form and divided into sections\: these sections can be expanded or contracted to reveal more or less information as desired. Use the tabs at the top of the table to change the view of the learning object. There will always be a tab for "XML Record" which displays the metadata in its original IEEE LOM format. Depending on whether or not the learning object references an on-line resource that is available for indexing, a third tab may be present that displays the source document.</p>
    1515
    1616description3=<h3>How the collection works</h3><p>The records were exported from the Calgary Repository in LOM format. LOMPlugin is used to process the records. Using the <tt>-download_srcdocs</tt> option to the plugin will search for <tt>general^identifier^entry</tt> or <tt>technical^location</tt>, and attempt to download the source document into a <i>_gsdldown.all</i> folder (<tt>import/arts/_gsdldown.all</tt>) in the same folder as the LOM record.</p>
  • documented-examples/trunk/manifest-demo-e/resources/collectionConfig.properties

    r36615 r36619  
    3030<pre>perl -S import.pl -site localsite documented-examples/manifest-demo-e \n\
    3131perl -S buildcol.pl -site localsite -activate documented-examples/manifest-demo-e</pre> \n\
    32 <i>Note\:</i> If you forget to pass in the <tt>-activate</tt> flag to the <tt>buildcol</tt> command, use a file explorer to go into your Greenstone 3 installations\'s <tt>web/sites/localsite/collect/documented-examples/manifest-demo-e</tt> folder, and rename the <tt>building</tt> subfolder there to <tt>index</tt>.)<br /> \n\
     32<i>Note\:</i> If you forget to pass in the <tt>-activate</tt> flag to the <tt>buildcol</tt> command, use a file explorer to go into your Greenstone 3 installations's <tt>web/sites/localsite/collect/documented-examples/manifest-demo-e</tt> folder, and rename the <tt>building</tt> subfolder there to <tt>index</tt>.)<br /> \n\
    3333<br /> \n\
    3434Preview the collection. Contains 8 documents, 5 from BOSTID and 3 from EC Courier.</p><br />
     
    4747perl -S incremental-buildcol.pl -site localsite -activate documented-examples/manifest-demo-e</pre> \n\
    4848 \n\
    49 Note that we haven\'t actually deleted the docs from the import folder. Just from the collection\'s <tt>archives</tt> and <tt>index</tt> subfolders.<br /> \n\
     49Note that we haven't actually deleted the docs from the import folder. Just from the collection's <tt>archives</tt> and <tt>index</tt> subfolders.<br /> \n\
    5050Now the EC Courier documents should be gone.</p><br />
    5151
  • documented-examples/trunk/marc-e/resources/collectionConfig.properties

    r36615 r36619  
    1010description2=<p>The <i>VList</i> format statement controls the display of search results and all classifiers. For bookshelves, the number of leaf documents is displayed on the right-hand side. For documents, <i>dc.Title</i> is displayed, along with <i>dc.Creator</i> and <i>dc.Publisher</i>. <i>[sibling\:dc.Creator]</i> is used as dc.Creator has multiple values, and specifies that all values be output, not just the first one.</p>
    1111
    12 description3=<p>The MARC plugin uses a special file to map MARC field numbers to Greenstone-style metadata. This file resides in the greenstone3 installation folder\'s <i>gs2build/etc</i> directory, and is called <tt>marc2dc.txt</tt>. It lists the correspondences between MARC field numbers and Greenstone metadata. Any MARC fields that are not listed simply do not appear as metadata, though they are still present in the Greenstone document. Each line in the file has the format <blockquote> &lt;MARC field number&gt; -&gt; GreenstoneMetadataName </blockquote> Lines in the file that begin with "\#" are comments.</p>
     12description3=<p>The MARC plugin uses a special file to map MARC field numbers to Greenstone-style metadata. This file resides in the greenstone3 installation folder's <i>gs2build/etc</i> directory, and is called <tt>marc2dc.txt</tt>. It lists the correspondences between MARC field numbers and Greenstone metadata. Any MARC fields that are not listed simply do not appear as metadata, though they are still present in the Greenstone document. Each line in the file has the format <blockquote> &lt;MARC field number&gt; -&gt; GreenstoneMetadataName </blockquote> Lines in the file that begin with "\#" are comments.</p>
    1313
    1414description4=<p>The standard version of this file is loosely based on the MARC to Dublin Core mapping found at <a href="http\://www.loc.gov/marc/marc2dc.html">http\://www.loc.gov/marc/marc2dc.html</a> (which assumes USMARC/MARC21).</p>
     
    1616description5=<p>Multiple MARC fields may map to a single Dublin Core field. For example, fields 720 ("Uncontrolled name"), 100 ("Personal name"), 110 ("Corporate name") and 111 ("Meeting name") all map to <i>dc.Creator</i>. Actual MARC records normally define only one of these fields, and anyway Greenstone allows multi-valued metadata.</p>
    1717
    18 description6=<p>Some mappings are dependent on subfields. For example, MARC field 260 contains information about publication and distribution. Subfields "c" (Date of Publication) and "g" (Date of manufacture) are mapped to <i>dc.Date</i>, using the following mapping line\: <blockquote> 260$c$g -&gt; dc.Date </blockquote>  Greenstone also provides a file for mapping MARC to <b>qualified</b> dublin core\: in your Greenstone 3 installation folder\'s <tt>gs2build/etc/marc2qdc.txt</tt>. This can be used by the MARC plugin by setting the <i>-metadata_mapping_file</i> option to "marc2qdc.txt".</p>
     18description6=<p>Some mappings are dependent on subfields. For example, MARC field 260 contains information about publication and distribution. Subfields "c" (Date of Publication) and "g" (Date of manufacture) are mapped to <i>dc.Date</i>, using the following mapping line\: <blockquote> 260$c$g -&gt; dc.Date </blockquote>  Greenstone also provides a file for mapping MARC to <b>qualified</b> dublin core\: in your Greenstone 3 installation folder's <tt>gs2build/etc/marc2qdc.txt</tt>. This can be used by the MARC plugin by setting the <i>-metadata_mapping_file</i> option to "marc2qdc.txt".</p>
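
Note on the marc-e text above: the mapping-file format in description3 and description6 (`<MARC field number> -> GreenstoneMetadataName`, `#` comments, and subfield lines such as `260$c$g -> dc.Date`) can be read with a few lines of Python. This is a minimal sketch, not the MARC plugin's own parser. The function name `parse_mapping` and the choice to key the result by `(field, subfield)` are assumptions made for illustration, and the real `marc2dc.txt` may contain syntax that this sketch does not handle.

```python
from typing import Dict, Tuple


def parse_mapping(text: str) -> Dict[Tuple[str, str], str]:
    """Parse lines such as '260$c$g -> dc.Date' into (field, subfield) keys.

    A field listed without subfields is stored under the subfield key ''.
    """
    mapping: Dict[Tuple[str, str], str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        left, sep, right = line.partition("->")
        if not sep:
            continue  # not a mapping line; ignored in this sketch
        field_number, *subfields = left.strip().split("$")
        for sub in subfields or [""]:
            mapping[(field_number, sub)] = right.strip()
    return mapping
```

Because several keys may carry the same target name, this layout also covers the case in description5 where multiple MARC fields map to a single Dublin Core field.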
  • documented-examples/trunk/oai-e/resources/collectionConfig.properties

    r36615 r36619  
    1111
    1212
    13 shortDescription=<p>This collection demonstrates Greenstone\'s <i>ImportFrom</i> feature. Using the <a href="http\://www.openarchives.org">Open Archive Protocol</a> (version 1.1), it retrieves metadata from <a href="http\://rocky.dlib.vt.edu/~jcdlpix">rocky.dlib.vt.edu/~jcdlpix</a>, a collection of photographs taken at the inaugural <a href="http\://www.acm.org/jcdl/jcdl01/">Joint Conference on Digital Libraries</a>. A Greenstone collection is built from the records exported from this OAI data provider. The implementation is flexible enough to cope with the minor syntax differences between OAI 1.1 and OAI 2.0.</p>
     13shortDescription=<p>This collection demonstrates Greenstone's <i>ImportFrom</i> feature. Using the <a href="http\://www.openarchives.org">Open Archive Protocol</a> (version 1.1), it retrieves metadata from <a href="http\://rocky.dlib.vt.edu/~jcdlpix">rocky.dlib.vt.edu/~jcdlpix</a>, a collection of photographs taken at the inaugural <a href="http\://www.acm.org/jcdl/jcdl01/">Joint Conference on Digital Libraries</a>. A Greenstone collection is built from the records exported from this OAI data provider. The implementation is flexible enough to cope with the minor syntax differences between OAI 1.1 and OAI 2.0.</p>
    1414
    15 description1=<h3>How the collection works</h3><p>The collection configuration file, <tt>collectionConfig.xml</tt>, includes an <i>acquire</i> line that is interpreted by a special program called <i>importfrom.pl</i>. Like other Greenstone programs, this takes as argument the name of the collection, and provides a summary of other arguments when invoked with argument <i>-help</i>. It reads the collection configuration file, finds the acquire line, and processes it. In this case, it is run with the command\: <pre> importfrom.pl oai-e </pre> (the collection\'s name is <i>oai-e</i>).</p>
     15description1=<h3>How the collection works</h3><p>The collection configuration file, <tt>collectionConfig.xml</tt>, includes an <i>acquire</i> line that is interpreted by a special program called <i>importfrom.pl</i>. Like other Greenstone programs, this takes as argument the name of the collection, and provides a summary of other arguments when invoked with argument <i>-help</i>. It reads the collection configuration file, finds the acquire line, and processes it. In this case, it is run with the command\: <pre> importfrom.pl oai-e </pre> (the collection's name is <i>oai-e</i>).</p>
    1616
    17 description2=<p>The <i>acquire</i> line in the configuration file specifies the OAI protocol and gives the base URL of an OAI repository. The <i>importfrom</i> program downloads all the metadata in that repository into the collection\'s <i>import</i> directory. The <i>getdoc</i> argument instructs it to also download the collection\'s source documents, whose URLs are given in each document\'s Dublin Core <i>Identifier</i> field (this is a common convention). The metadata files, which each contain an XML record for one source document, are placed in the <i>import</i> file structure along with the documents themselves, and the document filename is the same as the filename in the URL. The <i>Identifier</i> field is overridden to give the local filename, and its original value is retained in a new field called <i>OrigURL</i>.</p>
     17description2=<p>The <i>acquire</i> line in the configuration file specifies the OAI protocol and gives the base URL of an OAI repository. The <i>importfrom</i> program downloads all the metadata in that repository into the collection's <i>import</i> directory. The <i>getdoc</i> argument instructs it to also download the collection's source documents, whose URLs are given in each document's Dublin Core <i>Identifier</i> field (this is a common convention). The metadata files, which each contain an XML record for one source document, are placed in the <i>import</i> file structure along with the documents themselves, and the document filename is the same as the filename in the URL. The <i>Identifier</i> field is overridden to give the local filename, and its original value is retained in a new field called <i>OrigURL</i>.</p>
    1818
    19 description3=<p>This <i>oai-e</i> collection\'s own <tt>etc/oai.txt</tt> is an example of a downloaded metadata file.</p>
     19description3=<p>This <i>oai-e</i> collection's own <tt>etc/oai.txt</tt> is an example of a downloaded metadata file.</p>
    2020
    21 description4=<p>Once the OAI information has been imported, the collection is processed in the usual way. Besides the four standard plugins (GreenstoneXMLPlugin, MetadataXMLPlugin, ArchivesInfPlugin and DirectoryPlugin), the configuration file specifies the OAI plugin, which processes OAI metadata, and the image plugin, because in this case the collection\'s source documents are image files. The OAI plugin has been supplied with an <i>input_encoding</i> argument because data in this archive contains extended characters. It also has a <i>default_language</i> argument. Greenstone normally determines the language of documents automatically, but these metadata records are too small for this to be done reliably\: hence English is specified explicitly in the <i>language</i> argument. The OAI plugin parses the metadata and passes it to the appropriate source document file, which is then processed by an appropriate plugin -- in this case <i>ImagePlugin</i>. This plugin specifies the resolution for the screen versions of the images.</p>
     21description4=<p>Once the OAI information has been imported, the collection is processed in the usual way. Besides the four standard plugins (GreenstoneXMLPlugin, MetadataXMLPlugin, ArchivesInfPlugin and DirectoryPlugin), the configuration file specifies the OAI plugin, which processes OAI metadata, and the image plugin, because in this case the collection's source documents are image files. The OAI plugin has been supplied with an <i>input_encoding</i> argument because data in this archive contains extended characters. It also has a <i>default_language</i> argument. Greenstone normally determines the language of documents automatically, but these metadata records are too small for this to be done reliably\: hence English is specified explicitly in the <i>language</i> argument. The OAI plugin parses the metadata and passes it to the appropriate source document file, which is then processed by an appropriate plugin -- in this case <i>ImagePlugin</i>. This plugin specifies the resolution for the screen versions of the images.</p>
    2222
    2323description5=<p>Extracted metadata from OAI records are mapped to Dublin Core Metadata Set by default. As a result, classifiers and indexes in this collection are built with Dublin meatadata elements.</p>
    2424
    25 description6=<p>The collection configuration file, <tt>collectionConfig.xml</tt>, specifies a single full-text index containing <i>dc.Description</i> metadata and overrides Greenstone\'s custom <i>gsf</i> format templates <tt>DocumentHeading</tt> and <tt>DocumentContent</tt> (XSL). When a document is displayed, the <i>DocumentHeading</i> format statement puts out its <i>dc.Subject</i>. Then the <i>DocumentContent</i> statement follows this with <i>screenicon</i>, which is produced by <i>ImagePlugin</i> and gives a screen-resolution version of the image; it can be hyperlinked to the <i>dc.OrigURL</i> metadata -- that is, the original version of the image on the remote OAI site. Since this is no longer available on the web, it is now hyperlinked to the full version of the image file. This is followed by the image\'s <i>dc.Description</i>, also with a hyperlink; the image\'s size and type, again generated as metadata by <i>ImagePlugin</i>; and then <i>dc.Subject</i>, <i>dc.Publisher</i>, and <i>dc.Rights</i> metadata. <a href="library/collection/oai-e/document/01dle6">This</a> is the result.</p>
     25description6=<p>The collection configuration file, <tt>collectionConfig.xml</tt>, specifies a single full-text index containing <i>dc.Description</i> metadata and overrides Greenstone's custom <i>gsf</i> format templates <tt>DocumentHeading</tt> and <tt>DocumentContent</tt> (XSL). When a document is displayed, the <i>DocumentHeading</i> format statement puts out its <i>dc.Subject</i>. Then the <i>DocumentContent</i> statement follows this with <i>screenicon</i>, which is produced by <i>ImagePlugin</i> and gives a screen-resolution version of the image; it can be hyperlinked to the <i>dc.OrigURL</i> metadata -- that is, the original version of the image on the remote OAI site. Since this is no longer available on the web, it is now hyperlinked to the full version of the image file. This is followed by the image's <i>dc.Description</i>, also with a hyperlink; the image's size and type, again generated as metadata by <i>ImagePlugin</i>; and then <i>dc.Subject</i>, <i>dc.Publisher</i>, and <i>dc.Rights</i> metadata. <a href="library/collection/oai-e/document/01dle6">This</a> is the result.</p>
    2626
    2727description7=<p>There are two browsing classifiers, one based on <i>dc.Subject</i> metadata and the other on <i>dc.Description</i> metadata (but with a button named "captions"). Recall that the <i>AZCompactList</i> classifier is like <i>AZList</i> but generates a bookshelf for duplicate items. In this collection there are a lot of images but only a few different values for <i>dc.Subject</i> metadata.</p>
    2828
    29 description8=<p>It\'s a little surprising that <i>AZCompactList</i> is used (instead of <i>AZList</i>) for the <i>dc.Description</i> index too, because <i>dc.Description</i> metadata is usually unique for each image. However, in this collection the same description has occasionally been given to several images, and some of the divisions in an <i>AZList</i> would contain a large number of images, slowing down transmission of that page. To avoid this, the compact version of the list is used with some arguments (<i>mincompact</i>, <i>maxcompact</i>, <i>mingroup</i>, <i>minnesting</i>) to control the display -- e.g. groups (represented by bookshelves) are not formed unless they have at least 5 (<i>mingroup</i>) items. To find out the meaning of the other arguments for this classifier, execute the command <i>classinfo.pl AZCompactList</i>. The programs <i>classinfo.pl</i> (for classifiers) and <i>pluginfo.pl</i> (for plugins) are useful tools for learning about the capabilities of Greenstone modules. Note incidentally the backslash in the configuration file, used to indicate a continuation of the previous line.</p>
     29description8=<p>It's a little surprising that <i>AZCompactList</i> is used (instead of <i>AZList</i>) for the <i>dc.Description</i> index too, because <i>dc.Description</i> metadata is usually unique for each image. However, in this collection the same description has occasionally been given to several images, and some of the divisions in an <i>AZList</i> would contain a large number of images, slowing down transmission of that page. To avoid this, the compact version of the list is used with some arguments (<i>mincompact</i>, <i>maxcompact</i>, <i>mingroup</i>, <i>minnesting</i>) to control the display -- e.g. groups (represented by bookshelves) are not formed unless they have at least 5 (<i>mingroup</i>) items. To find out the meaning of the other arguments for this classifier, execute the command <i>classinfo.pl AZCompactList</i>. The programs <i>classinfo.pl</i> (for classifiers) and <i>pluginfo.pl</i> (for plugins) are useful tools for learning about the capabilities of Greenstone modules. Note incidentally the backslash in the configuration file, used to indicate a continuation of the previous line.</p>
    3030
    3131description9=<p>The <i>VList</i> format specification shows the image thumbnail, hyperlinked to the associated document, followed by <i>dc.Description</i> metadata; the result can be seen <a href="library/collection/oai-e/browse/CL2">here</a>. The <i>Vlists</i> for the classifiers use <i>numleafdocs</i> to switch between an icon representing several documents (which will appear as a bookshelf) and the thumbnail itself, if there is only one image.</p>
     
    3535description11=<p>Configuration of the server is done via the <i>oai.cfg</i> file in the Greenstone <i>etc</i> directory. This file specifies general information about the repository, and lists collections to be made accessible to OAI clients. By default, collections are not accessible. To enable a collection, add its name to the <i>oaicollection</i> list.</p>
    3636
    37 description12=<p>Greenstone\'s OAI server currently supports Dublin Core, qualified Dublin Core and rfc1807 metadata sets. The <i>oaimetadata</i> line specifies which sets should be used. For collections that use other metadata sets, metadata mapping rules should be provided to map the existing metadata to the sets in use. See the <i>oai.cfg</i> file for details.</p>
     37description12=<p>Greenstone's OAI server currently supports Dublin Core, qualified Dublin Core and rfc1807 metadata sets. The <i>oaimetadata</i> line specifies which sets should be used. For collections that use other metadata sets, metadata mapping rules should be provided to map the existing metadata to the sets in use. See the <i>oai.cfg</i> file for details.</p>
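
Note on the oai-e text above: description8 says that `AZCompactList` does not form a group (bookshelf) unless it has at least `mingroup` items, 5 in that collection. The rule can be sketched as below. This is an illustration of that one rule only, not the classifier's code. The function name `compact` and its input shape are hypothetical, and the other arguments mentioned in the text (`mincompact`, `maxcompact`, `minnesting`) are not modelled.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def compact(items: List[Tuple[str, str]], mingroup: int = 5):
    """Split (doc_id, value) pairs into bookshelves and ungrouped documents.

    Values shared by at least `mingroup` documents become a bookshelf;
    all other documents stay as individual entries.
    """
    by_value: Dict[str, List[str]] = defaultdict(list)
    for doc_id, value in items:
        by_value[value].append(doc_id)

    bookshelves: Dict[str, List[str]] = {}
    singles: List[Tuple[str, str]] = []
    for value, doc_ids in by_value.items():
        if len(doc_ids) >= mingroup:
            bookshelves[value] = doc_ids
        else:
            singles.extend((doc_id, value) for doc_id in doc_ids)
    return bookshelves, singles
```

For example, five images sharing one `dc.Description` value would be shelved together, while two images sharing another value would still be listed individually.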
  • documented-examples/trunk/pagedimg-e/resources/collectionConfig.properties

    r36615 r36619  
    44shortDescription=<p>This collection contains a few newspapers from the <a href='http\://www.nzdl.org/cgi-bin/library?a=p&amp;p=about&amp;c=niupepa'>Niupepa</a> collection of Maori newspapers.</p>
    55
    6 description1=<h3>How the collection works</h3> <p>Each newspaper issue consists of a set of images, one per page, and a set of text files for the OCR\'d text. An item file links the set of pages into a single newspaper document. PagedImagePlugin is used to process the item files.</p>
     6description1=<h3>How the collection works</h3> <p>Each newspaper issue consists of a set of images, one per page, and a set of text files for the OCR'd text. An item file links the set of pages into a single newspaper document. PagedImagePlugin is used to process the item files.</p>
    77
    88description2=<p>There are two styles of item files, and this collection demonstrates both. The first uses a text based format, and consists of a list of metadata for the document, and a list of pages. Some examples are\: <i>Te Waka o Te Iwi, Vol. 1, No. 1</i> (in <tt>import/09/09_1_1.item</tt>) and <i>Te Whetu o Te Tau, Vol. 1, No. 3</i> (in <tt>import/10/10_1_3.item</tt>. This format allows specification of document level metadata, and a single list of pages.</p>
     
    1414description5=<p><tt>plugin PagedImagePlugin -documenttype hierarchy -process_exp xml.*\.item$ ... <br/> plugin PagedImagePlugin -documenttype paged ...</tt></p>
    1515
    16 description6=<p>XML based newpapers have been grouped into a folder called <tt>xml</tt>. This enables us to process these files differently, by utilising the <tt>process_exp</tt> option which all plugins support. The first PagedImagePlugin in the list looks for item files underneath the xml folder. These documents will be processed as hierarchical documents. Item files that don\'t match the process expression (i.e. aren\'t underneath the xml folder) will be passed onto the second PagedImagePlugin, and these are treated as paged documents.</p>
     16description6=<p>XML based newpapers have been grouped into a folder called <tt>xml</tt>. This enables us to process these files differently, by utilising the <tt>process_exp</tt> option which all plugins support. The first PagedImagePlugin in the list looks for item files underneath the xml folder. These documents will be processed as hierarchical documents. Item files that don't match the process expression (i.e. aren't underneath the xml folder) will be passed onto the second PagedImagePlugin, and these are treated as paged documents.</p>
    1717
    18 description7=<p><b>Formatting</b> <p>Unlike in Greenstone 2, where the document formatting was modified to customize the display, in Greenstone 3 we rely for the rest on Greenstone\'s default behaviour.</p>
     18description7=<p><b>Formatting</b> <p>Unlike in Greenstone 2, where the document formatting was modified to customize the display, in Greenstone 3 we rely for the rest on Greenstone's default behaviour.</p>
    1919
  • documented-examples/trunk/style-e/resources/collectionConfig.properties

    r36615 r36619  
    1616
    1717
    18 shortDescription=<p>This collection demonstrates Greenstone\'s use of Cascading Style Sheets (CSS) for visual formatting in web browsers. On every page, you can change the style-sheet in effect, to modify that page\'s appearance. This collection contains the same material as the original Greenstone demo collection.</p>
     18shortDescription=<p>This collection demonstrates Greenstone's use of Cascading Style Sheets (CSS) for visual formatting in web browsers. On every page, you can change the style-sheet in effect, to modify that page's appearance. This collection contains the same material as the original Greenstone demo collection.</p>
    1919
    20 description1=<p>A combination of JavaScript and the overriding of GS3 XSL templates in Greenstone 3\'s <i>global</i> format statement is used by the collection to provide the stylesheet switching. As in some other <i>Documented Example Collections</i>, GLI\'s <tt>Format &gt; Format Features &gt; global</tt> can be used to define the <b>additionalHeaderContent</b> template. Doing so overrides the existing <i>additionalHeaderContent</i> template, and appends any specified HTML elements to the HTML header.</p>
     20description1=<p>A combination of JavaScript and the overriding of GS3 XSL templates in Greenstone 3's <i>global</i> format statement is used by the collection to provide the stylesheet switching. As in some other <i>Documented Example Collections</i>, GLI's <tt>Format &gt; Format Features &gt; global</tt> can be used to define the <b>additionalHeaderContent</b> template. Doing so overrides the existing <i>additionalHeaderContent</i> template, and appends any specified HTML elements to the HTML header.</p>
    2121
    2222description2=<p>In this case, the <b>additionalHeaderContent</b> specifies the custom collection stylesheet currently active and the JavaScript to facilitate the stylesheet switching when a link is clicked. The <b>create-banner</b> XSL template in the <i>global</i> format statement is also overridden to provide links to the multiple stylesheets within the existing GS3 banner section, and invoke the custom JavaScript when any link is clicked. \n\
     
    2424  &lt;xsl\:template name="additionalHeaderContent"&gt; \n\
    2525    &lt;xsl\:variable name="httpCollection"&gt; \n\
    26       &lt;xsl\:value-of select="/page/pageResponse/collection/metadataList/metadata[@name=\'httpPath\']"/&gt; \n\
     26      &lt;xsl\:value-of select="/page/pageResponse/collection/metadataList/metadata[@name='httpPath']"/&gt; \n\
    2727    &lt;/xsl\:variable&gt; \n\
    2828    &lt;link rel="stylesheet" href="{$httpCollection}/style/gs3-style-default-extra.css" type="text/css"  \n\
     
    3535    &lt;div class="choose_style"&gt; \n\
    3636        Choose a style\: \n\
    37         &lt;a href="#" onclick="replaceStyle(\'gs3-style-default-extra\');return false;"&gt;Default Greenstone&lt;/a&gt;, \n\
    38         &lt;a href="#" onclick="replaceStyle(\'gs3-style-blue\');return false;"&gt;Blue&lt;/a&gt;, \n\
    39         &lt;a href="#" onclick="replaceStyle(\'gs3-style-olive-purple\');return false;"&gt;OlivePurple&lt;/a&gt;,        \n\
    40         &lt;a href="#" onclick="replaceStyle(\'');return false;"&gt;None&lt;/a&gt; \n\
     37        &lt;a href="#" onclick="replaceStyle('gs3-style-default-extra');return false;"&gt;Default Greenstone&lt;/a&gt;, \n\
     38        &lt;a href="#" onclick="replaceStyle('gs3-style-blue');return false;"&gt;Blue&lt;/a&gt;, \n\
     39        &lt;a href="#" onclick="replaceStyle('gs3-style-olive-purple');return false;"&gt;OlivePurple&lt;/a&gt;,      \n\
     40        &lt;a href="#" onclick="replaceStyle('');return false;"&gt;None&lt;/a&gt; \n\
    4141        &lt;/div&gt; \n\
    4242    &lt;div id="gs_banner" class="ui-widget-header ui-corner-bottom"&gt;         \n\
     
    5454description3=<p>If you want to download any of these stylesheets for your own collections, here are links to them\: \n\
    5555<ul> \n\
    56 <li><a href='https\://trac.greenstone.org/browser/documented-examples/trunk/style-e/style/gs3-style-default-extra.css\'>GS3 default extra</a> - builds on top of GS3\'s default style</li> \n\
    57 <li><a href='https\://trac.greenstone.org/browser/documented-examples/trunk/style-e/style/gs3-style-blue.css\'>Blue theme</a> - modifies the GS3 default style for a blue colouring</li> \n\
    58 <li><a href='https\://trac.greenstone.org/browser/documented-examples/trunk/style-e/style/gs3-style-olive-purple.css\'>olive-purple theme</a> - modifies the GS3 default style for a vivid colouring of vine green and purples</li> \n\
     56<li><a href='https\://trac.greenstone.org/browser/documented-examples/trunk/style-e/style/gs3-style-default-extra.css'>GS3 default extra</a> - builds on top of GS3's default style</li> \n\
     57<li><a href='https\://trac.greenstone.org/browser/documented-examples/trunk/style-e/style/gs3-style-blue.css'>Blue theme</a> - modifies the GS3 default style for a blue colouring</li> \n\
     58<li><a href='https\://trac.greenstone.org/browser/documented-examples/trunk/style-e/style/gs3-style-olive-purple.css'>olive-purple theme</a> - modifies the GS3 default style for a vivid colouring of vine green and purples</li> \n\
    5959<li>None - clears all CSS styling from the current page (needs reload to get the default GS3 style back)</li> \n\
    6060</ul> \n\
     
    6464<ul> \n\
    6565<li>To use a stylesheet as the default, place it in greenstone/web/interfaces/default/style and rename it to <tt>style.css</tt>. This will affect all collections.</li> \n\
    66 <li>To use a stylesheet for a particular collection, place it in <tt>greenstone/web/sites/localsite/collect/&lt;collection&gt;/style</tt> then specify the stylesheet link in the <b>additionalHeaderContent</b> of GLI\'s <i>global</i> format statement (<tt>Format &gt; Format Features &gt; global</tt>) as follows\: \n\
     66<li>To use a stylesheet for a particular collection, place it in <tt>greenstone/web/sites/localsite/collect/&lt;collection&gt;/style</tt> then specify the stylesheet link in the <b>additionalHeaderContent</b> of GLI's <i>global</i> format statement (<tt>Format &gt; Format Features &gt; global</tt>) as follows\: \n\
    6767<pre>&lt;xsl\:template name="additionalHeaderContent"&gt; \n\
    6868    &lt;xsl\:variable name="httpCollection"&gt; \n\
    69       &lt;xsl\:value-of select="/page/pageResponse/collection/metadataList/metadata[@name=\'httpPath\']"/&gt; \n\
     69      &lt;xsl\:value-of select="/page/pageResponse/collection/metadataList/metadata[@name='httpPath']"/&gt; \n\
    7070    &lt;/xsl\:variable&gt; \n\
    7171    &lt;link href="{$httpCollection}/style/stylesheet-name.css" rel="stylesheet" type="text/css"/&gt; \n\
  • documented-examples/trunk/wrdpdf-e/resources/collectionConfig.properties

    r36615 r36619  
    22document_text=documents
    33
    4 shortDescription=<p>This collection demonstrates Greenstone\'s ability to build collections from documents provided in different formats. It contains a number of papers written by various members of the NZDL project in PDF, MSWord, RTF, and Postscript formats.</p>
     4shortDescription=<p>This collection demonstrates Greenstone's ability to build collections from documents provided in different formats. It contains a number of papers written by various members of the NZDL project in PDF, MSWord, RTF, and Postscript formats.</p>
    55
    66description1=<p>The documents in this collection have been produced by members of the Department of Computer Science, University of Waikato. The University of Waikato holds copyright. They may be distributed freely, without any restrictions.</p>
    77
    8 description2=<h3>How the collection works</h3> <p> This collection\'s configuration file, <tt>collectionConfig.xml</tt>, contains the four plugins <i>WordPlugin</i>, <i>RTFPlugin</i>, <i>PDFPlugin</i> and <i>PostScriptPlugin</i> (along with the standard four, <i>GreenstoneXMLPlugin</i>, <i>MetadataXMLPlugin</i>, <i>ArchivesInfPlugin</i> and <i>DirectoryPlugin</i>). These four plugins all extract <i>Title</i> and <i>Source</i> (i.e. filename) metadata.</p>
     8description2=<h3>How the collection works</h3> <p> This collection's configuration file, <tt>collectionConfig.xml</tt>, contains the four plugins <i>WordPlugin</i>, <i>RTFPlugin</i>, <i>PDFPlugin</i> and <i>PostScriptPlugin</i> (along with the standard four, <i>GreenstoneXMLPlugin</i>, <i>MetadataXMLPlugin</i>, <i>ArchivesInfPlugin</i> and <i>DirectoryPlugin</i>). These four plugins all extract <i>Title</i> and <i>Source</i> (i.e. filename) metadata.</p>
    99
    1010description3=<p>Greenstone contains third-party software that is used to convert Word, RTF, PDF and PostScript files into HTML. The Greenstone team does not maintain these modules, although we do try to include the latest versions with each Greenstone release. Bugs arise with unusual Word documents (e.g. from older Macintosh systems), and sometimes the text is badly extracted. Some PDF files have no machine-readable text at all, comprising instead a sequence of page <i>images</i> from which text can only be extracted by optical character recognition (OCR), which Greenstone does not attempt. If you encounter these problems, you can either remove the offending documents from your collection, or try using some of the advanced plugin options to process the documents in different ways. For more information, see the Enhanced PDF and Word tutorials on the <a href='http\://wiki.greenstone.org/wiki/index.php/Tutorial_exercises'>Greenstone wiki</a>. Alternatively, a new Greenstone 3 collection will add in a pre-configured <i>UnknownConverterPlugin</i> that will use <i>apache tika</i> by default to process docx files. You can reconfigure it, or add another UnknownConverterPlugin and configure it appropriately, to process other document types, refer to <a href="http\://wiki.greenstone.org/doku.php?id=en\:plugin\:unknownconverterplugin">The UnknownConverterPlugin</a> page on the Greenstone wiki.</p>
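
Reviewer note: the change above should leave the loaded values unchanged, assuming these collectionConfig.properties files are read with the standard java.util.Properties loader (an assumption; this changeset does not show how Greenstone reads them). That loader silently drops a backslash before a character that is not a recognised escape, so \' and ' yield the same string. A minimal sketch of this, where the class name, the helper method and the property name "key" are hypothetical and used only for illustration:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class ApostropheEscapeDemo {

    // Hypothetical helper: parse a properties snippet and return the value of "key".
    static String loadValue(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            // A StringReader should not fail, but load() declares IOException.
            throw new UncheckedIOException(e);
        }
        return props.getProperty("key");
    }

    public static void main(String[] args) {
        // "\\'" in Java source is the two characters \' in the properties text.
        String escaped = loadValue("key=Greenstone\\'s default behaviour");
        String plain = loadValue("key=Greenstone's default behaviour");

        System.out.println(escaped);
        System.out.println(escaped.equals(plain));
    }
}
```

The escaped form therefore only matters to tools that display the raw file text, which matches the commit message's remark that the backslashes showed up unchanged for translators in GTI.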