Timestamp:
2022-09-14T19:35:58+12:00
Author:
anupama
Message:

Forgot to ensure English appears as last edited on GTI. Commit 1/2

File:
1 edited

  • documented-examples/trunk/gsarch-e/resources/collectionConfig.properties

    r36598 r36614  
    1 name=Greenstone Archives collection
    2 SearchBySender=search by sender
    3 Subject=Subject
    4 Date=Date
    5 From=From
    6 ReplyTo=In reply to
    7 index_text=Messages
    8 index_subject=Subject lines
    9 index_from=From fields
     1 name=QQQQGreenstone Archives collection
     2 SearchBySender=QQQQsearch by sender
     3 Subject=QQQQSubject
     4 Date=QQQQDate
     5 From=QQQQFrom
     6 ReplyTo=QQQQIn reply to
     7 index_text=QQQQMessages
     8 index_subject=QQQQSubject lines
     9 index_from=QQQQFrom fields
    10 10
    11 shortDescription=<p>This is a collection of email messages from the Greenstone mailing list archives, from November/December, 2008.</p>
     11 shortDescription=QQQQ<p>This is a collection of email messages from the Greenstone mailing list archives, from November/December, 2008.</p>
    12 12
    13 description1=<h3>How the collection works</h3><p>The Greenstone Archives collection uses the <i>Email</i> plugin, which parses files in email formats. In this case, there is a file per month per mailing list, and each file contains many email messages. The <i>Email</i> plugin splits these into individual documents, and produces <i>Title</i>, <i>Subject</i>, <i>From</i>, <i>FromName</i>, <i>FromAddr</i>, <i>Date</i>, <i>DateText</i>, <i>InReplyTo</i>, and optionally <i>Headers</i>, metadata.</p>
     13 description1=QQQQ<h3>How the collection works</h3><p>The Greenstone Archives collection uses the <i>Email</i> plugin, which parses files in email formats. In this case, there is a file per month per mailing list, and each file contains many email messages. The <i>Email</i> plugin splits these into individual documents, and produces <i>Title</i>, <i>Subject</i>, <i>From</i>, <i>FromName</i>, <i>FromAddr</i>, <i>Date</i>, <i>DateText</i>, <i>InReplyTo</i>, and optionally <i>Headers</i>, metadata.</p>
    14 14
    15 description2=<p>The collection configuration file, <tt>etc/collectionConfig.xml</tt> specifies <i>&lt;importOption name="groupsize" value="200"/&gt;</i>. This groups documents together into groups of 200. Email collections typically have many small documents, and grouping them together prevents Greenstone\'s internal file structures from becoming bloated and occupying more disk space than necessary. Notice that the <i>Email</i> plugin first splits the input files up into individual Emails, then <i>groupsize</i> groups them together again. This allows the collection designer to control what is going on.</p>
     15 description2=QQQQ<p>The collection configuration file, <tt>etc/collectionConfig.xml</tt> specifies <i>&lt;importOption name="groupsize" value="200"/&gt;</i>. This groups documents together into groups of 200. Email collections typically have many small documents, and grouping them together prevents Greenstone\'s internal file structures from becoming bloated and occupying more disk space than necessary. Notice that the <i>Email</i> plugin first splits the input files up into individual Emails, then <i>groupsize</i> groups them together again. This allows the collection designer to control what is going on.</p>
    16 16
    17 description3=<p>The <i>indexes</i> line specifies 3 searchable indexes, which can be seen by clicking beside the word "Messages" on the <a href="library/collection/gsarch-e/search/TextQuery">search page</a> to reveal a drop-down menu. The first (called <i>Messages</i>) is created from the document text, while the others are formed from <i>From</i> and <i>Subject</i> metadata.</p>
     17 description3=QQQQ<p>The <i>indexes</i> line specifies 3 searchable indexes, which can be seen by clicking beside the word "Messages" on the <a href="library/collection/gsarch-e/search/TextQuery">search page</a> to reveal a drop-down menu. The first (called <i>Messages</i>) is created from the document text, while the others are formed from <i>From</i> and <i>Subject</i> metadata.</p>
    18 18
    19 description4=<p>There are three classifiers, based on <i>Subject</i>, <i>FromName</i>, and <i>Date</i> metadata. The <i>AZCompactList</i> classifier used for the first two is like <i>AZList</i> but generates a bookshelf for duplicate items, as illustrated <a href="library/collection/gsarch-e/browse/CL1">here</a>. This is represented by a tree structure whose nodes are either leaf nodes, representing documents, or internal nodes. A metadata item called numleafdocs gives the total number of documents below an internal node. The format statement for the first classifier, called <i>CL1Vlist</i>, checks whether this item exists. If so the node must be an internal one, in which case it is labeled by its <i>Title</i>. Otherwise the node\'s label starts with the <i>Subject</i> which links to the document, then gives <i>FromName</i> metadata, with a link to "Search by Sender", followed by the <i>DateText</i>.</p>
     19 description4=QQQQ<p>There are three classifiers, based on <i>Subject</i>, <i>FromName</i>, and <i>Date</i> metadata. The <i>AZCompactList</i> classifier used for the first two is like <i>AZList</i> but generates a bookshelf for duplicate items, as illustrated <a href="library/collection/gsarch-e/browse/CL1">here</a>. This is represented by a tree structure whose nodes are either leaf nodes, representing documents, or internal nodes. A metadata item called numleafdocs gives the total number of documents below an internal node. The format statement for the first classifier, called <i>CL1Vlist</i>, checks whether this item exists. If so the node must be an internal one, in which case it is labeled by its <i>Title</i>. Otherwise the node\'s label starts with the <i>Subject</i> which links to the document, then gives <i>FromName</i> metadata, with a link to "Search by Sender", followed by the <i>DateText</i>.</p>
    20 20
    21 description5=<p>The second classifier (<i>CL2Vlist</i>) is similar, but shows slightly different information -- the result can be seen <a href="library/collection/gsarch-e/browse/CL2">here</a>. For internal nodes, the actual number of leaf documents (<i>numleafdocs</i>) is given in parentheses after the <i>Title</i>. For document nodes the <i>FromName</i>, with a link to "Search By Sender", <i>Subject</i> (linked to the document), and <i>DateText</i> metadata is shown.</p>
     21 description5=QQQQ<p>The second classifier (<i>CL2Vlist</i>) is similar, but shows slightly different information -- the result can be seen <a href="library/collection/gsarch-e/browse/CL2">here</a>. For internal nodes, the actual number of leaf documents (<i>numleafdocs</i>) is given in parentheses after the <i>Title</i>. For document nodes the <i>FromName</i>, with a link to "Search By Sender", <i>Subject</i> (linked to the document), and <i>DateText</i> metadata is shown.</p>
    22 22
    23 description6=<p>The third classifier is a <i>DateList</i>, which allows selection by month and year.</p>
     23 description6=QQQQ<p>The third classifier is a <i>DateList</i>, which allows selection by month and year.</p>
    24 24
    25 description7=<p>Finally, the <tt>documentHeading</tt> is overridden to show the header fields\: <i>FromName</i>, <i>DateText</i>, <i>Subject</i>, <i>InReplyTo</i> (as the default documentHeading would not show the <i>InReplyTo</i> Field, nor to label the fields). The default <tt>documentContent</tt> already displays the message text (with the call to &lt;xsl\:call-template name="documentNodeText"/&gt;). <i>FromName</i> is linked to a search on that name, while <i>InReplyTo</i> links to the email message that it refers to.</p>
     25 description7=QQQQ<p>Finally, the <tt>documentHeading</tt> is overridden to show the header fields\: <i>FromName</i>, <i>DateText</i>, <i>Subject</i>, <i>InReplyTo</i> (as the default documentHeading would not show the <i>InReplyTo</i> Field, nor to label the fields). The default <tt>documentContent</tt> already displays the message text (with the call to &lt;xsl\:call-template name="documentNodeText"/&gt;). <i>FromName</i> is linked to a search on that name, while <i>InReplyTo</i> links to the email message that it refers to.</p>