Changeset 36577
- Timestamp:
- 2022-09-06T00:02:57+12:00
- Location:
- documented-examples/trunk
- Files:
- 3 edited
documented-examples/trunk/dls-e/resources/collectionConfig.properties
r36575 r36577
1 name= Development Library Subset collection
2 section_Title= section titles
3 section_text= chapters
4 document_text= entire documents
5 document= Document
6 textdate= publication date\:
7 textnumpages= no. of pages\:
8 textsource= source ref\:
1 name=QQQQDevelopment Library Subset collection
2 section_Title=QQQQsection titles
3 section_text=QQQQchapters
4 document_text=QQQQentire documents
5 document=QQQQDocument
6 textdate=QQQQpublication date\:
7 textnumpages=QQQQno. of pages\:
8 textsource=QQQQsource ref\:
9 9
10 10
11 shortDescription= <p>The Humanitarian Development Libraries represent a large collection of practical information aimed at helping reduce poverty, increasing human potential, and providing a practical and useful education for all. This subset contains about 25 publications--documents, reports, and periodical articles--in various areas of human development, from agricultural practice to economic policies, from water and sanitation to society and culture, from education to manufacturing, from disaster mitigation to micro-enterprises.</p>
11 shortDescription=QQQQ<p>The Humanitarian Development Libraries represent a large collection of practical information aimed at helping reduce poverty, increasing human potential, and providing a practical and useful education for all. This subset contains about 25 publications--documents, reports, and periodical articles--in various areas of human development, from agricultural practice to economic policies, from water and sanitation to society and culture, from education to manufacturing, from disaster mitigation to micro-enterprises.</p>
12 12
13 description0= <p>The editors of this collection are Human Info NGO, HumanityCD Ltd, and participating organizations. 
Contact us at Humanitarian and Development Libraries Project, Oosterveldiaan 196, B-2610 Antwerp, Belgium, Tel 32-3-448.05.54, Fax 32-3-449.75.74, email <a href=mailto\:[email protected]>[email protected]</a>.13 description0=QQQQ<p>The editors of this collection are Human Info NGO, HumanityCD Ltd, and participating organizations. Contact us at Humanitarian and Development Libraries Project, Oosterveldiaan 196, B-2610 Antwerp, Belgium, Tel 32-3-448.05.54, Fax 32-3-449.75.74, email <a href=mailto\:[email protected]>[email protected]</a>. 14 14 15 description1= <h3>How the collection works</h3><p>The DLS collection is fairly complex. If you\'re just starting out you might prefer to look at some other collections first (e.g. <a href="library/collection/wrdpdf-e/page/about">Word and PDF demonstration</a>, or the <a href="library/collection/gsarch-e/page/about">Greenstone Archives</a>, or the <a href="library/collection/image-e/page/about">Simple Image collection</a>).</p>15 description1=QQQQ<h3>How the collection works</h3><p>The DLS collection is fairly complex. If you\'re just starting out you might prefer to look at some other collections first (e.g. <a href="library/collection/wrdpdf-e/page/about">Word and PDF demonstration</a>, or the <a href="library/collection/gsarch-e/page/about">Greenstone Archives</a>, or the <a href="library/collection/image-e/page/about">Simple Image collection</a>).</p> 16 16 17 description2= <p>The collection configuration file, <tt>collectionConfig.xml</tt>, like all collection configuration files, begins with the <i>creator</i> metadata element that gives the email address of the collection\'s creator, and another metadata ("public") that determines whether the collection will appear on the home page of the Greenstone installation. 
Note that setting "public" to "false" only removes it from the home page; it will still be accessible in the library to anyone that knows the URL to the collection.</p>17 description2=QQQQ<p>The collection configuration file, <tt>collectionConfig.xml</tt>, like all collection configuration files, begins with the <i>creator</i> metadata element that gives the email address of the collection\'s creator, and another metadata ("public") that determines whether the collection will appear on the home page of the Greenstone installation. Note that setting "public" to "false" only removes it from the home page; it will still be accessible in the library to anyone that knows the URL to the collection.</p> 18 18 19 description3= <p><b>Plugins</b>. The "plugin" lines in the collection configuration file give the plugins used by the collection. The documents in the DLS collection are in HTML, so <i>HTMLPlugin</i> must be included. The <i>description_tags</i> option processes tags in the text that define sections and section titles as described below.</p>19 description3=QQQQ<p><b>Plugins</b>. The "plugin" lines in the collection configuration file give the plugins used by the collection. The documents in the DLS collection are in HTML, so <i>HTMLPlugin</i> must be included. The <i>description_tags</i> option processes tags in the text that define sections and section titles as described below.</p> 20 20 21 description4= <p>The other plugins, <i>GreenstoneXMLPlugin, MetadataXMLPlugin, ArchivesInfPlugin, and DirectoryPlugin</i>, are used by Greenstone for internal purposes and are standard in almost all collections.</p>21 description4=QQQQ<p>The other plugins, <i>GreenstoneXMLPlugin, MetadataXMLPlugin, ArchivesInfPlugin, and DirectoryPlugin</i>, are used by Greenstone for internal purposes and are standard in almost all collections.</p> 22 22 23 description5= <p><b>Searchable indexes</b>. 
The block of lines starting with <i>indexes</i> specifies what searchable indexes will be available. In this collection there are three\: you can see them when you pull down the "Search for" menu on the collection\'s <a href="library/collection/dls-e/search/TextQuery">search page</a>. The first index is called "chapters", the second "section titles", and the third "entire documents". The names of these three indexes are given by three properties (section_text, section_Title and document_text) in the translatable <tt>collectionConfig.properties</tt> file located in the collection\'s <tt>resources</tt> subfolder.</p>23 description5=QQQQ<p><b>Searchable indexes</b>. The block of lines starting with <i>indexes</i> specifies what searchable indexes will be available. In this collection there are three\: you can see them when you pull down the "Search for" menu on the collection\'s <a href="library/collection/dls-e/search/TextQuery">search page</a>. The first index is called "chapters", the second "section titles", and the third "entire documents". The names of these three indexes are given by three properties (section_text, section_Title and document_text) in the translatable <tt>collectionConfig.properties</tt> file located in the collection\'s <tt>resources</tt> subfolder.</p> 24 24 25 description6= <p>The contents of the indexes -- that is, the specification of what it is that will be searched -- are defined by the <i>indexes</i> line at the beginning of this block. This specifies three indexes, two at the section level (beginning with <i>section\:</i>) and one at the document level (beginning with <i>document\:</i>). The difference is that a multi-word query will only match a section-level index if all query terms appear in the same section, whereas it will match a document-level index if the terms appear anywhere within the document (which typically comprises several sections). 
The first and third indexes are <i>section\:text</i> and <i>document\:text</i>, and the <i>\:text</i> means that the full text of sections and documents respectively will be searched. The second is <i>section\:Title</i>, which means that <i>Title</i> metadata will be searched -- in this case, section titles (rather than document titles). The three indexes appear in the order in which they are specified on the <i>indexes</i> line.</p>25 description6=QQQQ<p>The contents of the indexes -- that is, the specification of what it is that will be searched -- are defined by the <i>indexes</i> line at the beginning of this block. This specifies three indexes, two at the section level (beginning with <i>section\:</i>) and one at the document level (beginning with <i>document\:</i>). The difference is that a multi-word query will only match a section-level index if all query terms appear in the same section, whereas it will match a document-level index if the terms appear anywhere within the document (which typically comprises several sections). The first and third indexes are <i>section\:text</i> and <i>document\:text</i>, and the <i>\:text</i> means that the full text of sections and documents respectively will be searched. The second is <i>section\:Title</i>, which means that <i>Title</i> metadata will be searched -- in this case, section titles (rather than document titles). The three indexes appear in the order in which they are specified on the <i>indexes</i> line.</p> 26 26 27 description7= <p><b>Classifiers</b>. The block of lines labeled <i>classify</i> define the browsing indexes, called "classifiers" in Greenstone. There are four of them, corresponding to four buttons on the navigation bar at the top of each page in the collection (e.g. 
the <a href="library/collection/dls-e/search/TextQuery">search page</a>)\: <i>subjects</i>, <i>titles</i>, <i>organisations</i>, and <i>howto</i> The <i>search</i> button comes first, then come the four classifiers, in order.</p>27 description7=QQQQ<p><b>Classifiers</b>. The block of lines labeled <i>classify</i> define the browsing indexes, called "classifiers" in Greenstone. There are four of them, corresponding to four buttons on the navigation bar at the top of each page in the collection (e.g. the <a href="library/collection/dls-e/search/TextQuery">search page</a>)\: <i>subjects</i>, <i>titles</i>, <i>organisations</i>, and <i>howto</i> The <i>search</i> button comes first, then come the four classifiers, in order.</p> 28 28 29 description8= <p>The first classifier provides access by subject. It is a <i>Hierarchy</i> classifier whose hierarchy is defined in the file <tt>etc/dls.Subject.txt</tt> (the <i>hfile</i> argument); this file is discussed below. This classifier is based on <i>dls.Subject</i> metadata, and when several books appear at a leaf of the hierarchy they are sorted by <i>dls.Title</i> metadata (as you can see when you open classifier browser <tt>CL1.4.1</tt>). The second classifier provides access by title. It is also a <i>Hierarchy</i> classifier, this time based on <i>dls.AZList</i> metadata, whose hierarchy is defined in <tt>etc/dls.AZList.txt</tt>. This file is discussed below. The third provides access by organization\: it is a <i>List</i> classifier based on <i>dls.Organization</i> metadata. The <i>-bookshelf_type always</i> option creates a new bookshelf for each organization, even if only one document belongs to that category. The fourth provides access by "Howto" text\: it is a <i>List</i> classifier based on <i>dls.Keyword</i> metadata. The <i>-bookshelf_type never</i> option prevents bookshelves being created even if two documents share the same keywords.</p>29 description8=QQQQ<p>The first classifier provides access by subject. 
It is a <i>Hierarchy</i> classifier whose hierarchy is defined in the file <tt>etc/dls.Subject.txt</tt> (the <i>hfile</i> argument); this file is discussed below. This classifier is based on <i>dls.Subject</i> metadata, and when several books appear at a leaf of the hierarchy they are sorted by <i>dls.Title</i> metadata (as you can see when you open classifier browser <tt>CL1.4.1</tt>). The second classifier provides access by title. It is also a <i>Hierarchy</i> classifier, this time based on <i>dls.AZList</i> metadata, whose hierarchy is defined in <tt>etc/dls.AZList.txt</tt>. This file is discussed below. The third provides access by organization\: it is a <i>List</i> classifier based on <i>dls.Organization</i> metadata. The <i>-bookshelf_type always</i> option creates a new bookshelf for each organization, even if only one document belongs to that category. The fourth provides access by "Howto" text\: it is a <i>List</i> classifier based on <i>dls.Keyword</i> metadata. The <i>-bookshelf_type never</i> option prevents bookshelves being created even if two documents share the same keywords.</p> 30 30 31 description9= <p><b>Cover images</b>. Greenstone looks for a cover image for each document, whose name is the same as the document\'s but with a <i>.jpg</i> extension. This image is associated with the document, and may be displayed on the document page (see below). Cover images can be switched off by setting the -no_cover_image flag for each plugin.</p>31 description9=QQQQ<p><b>Cover images</b>. Greenstone looks for a cover image for each document, whose name is the same as the document\'s but with a <i>.jpg</i> extension. This image is associated with the document, and may be displayed on the document page (see below). Cover images can be switched off by setting the -no_cover_image flag for each plugin.</p> 32 32 33 description10= <p><b>Format statements</b>. 
The <i>format</i> elements (<format;>, <browse>, <search> and <display> XML elements), called "format statements", govern how various parts of the collection should be displayed. The <i>VList</i> format statement applies to lists of items displayed vertically, such as the lists of titles, subjects and organisations, and the table of contents for the target documents. It is overridden for the search results list by the <i>SearchVList</i> format statement, and also for the <i>Howto</i> classifier by the <i>CL4VList</i> statement (CL4 specifies the fourth classifier).</p>33 description10=QQQQ<p><b>Format statements</b>. The <i>format</i> elements (<format;>, <browse>, <search> and <display> XML elements), called "format statements", govern how various parts of the collection should be displayed. The <i>VList</i> format statement applies to lists of items displayed vertically, such as the lists of titles, subjects and organisations, and the table of contents for the target documents. It is overridden for the search results list by the <i>SearchVList</i> format statement, and also for the <i>Howto</i> classifier by the <i>CL4VList</i> statement (CL4 specifies the fourth classifier).</p> 34 34 35 description11= <p>The <i>DocumentText</i> statement governs how the document text is formatted, with <i>Title</i> metadata ([<i>Title</i>]) in HTML <i>heading</i> format followed by the text of the document [<i>Text</i>]. By default, cover images are shown with each document (<i>DocumentImages</i>), and the <i>DocumentButtons</i> are available\: the <i>Expand Text, Expand Contents, Detach</i> and <i>Highlight</i> buttons are shown with each document.</p>35 description11=QQQQ<p>The <i>DocumentText</i> statement governs how the document text is formatted, with <i>Title</i> metadata ([<i>Title</i>]) in HTML <i>heading</i> format followed by the text of the document [<i>Text</i>]. 
By default, cover images are shown with each document (<i>DocumentImages</i>), and the <i>DocumentButtons</i> are available\: the <i>Expand Text, Expand Contents, Detach</i> and <i>Highlight</i> buttons are shown with each document.</p> 36 36 37 description12= <p>Greenstone 3 uses XML for format statements, allowing librarians with XML experience to more easily understand and use format statements than Greenstone 2 which worked with a custom way of specifying format statements. For more information on understanding format statements and writing your own format statements for collections, refer to <a href="http\://wiki.greenstone.org/doku.php?id=en\:user\:gs3_format_statements">Greenstone 3 Format Statements</a> on the Greenstone wiki.</p>37 description12=QQQQ<p>Greenstone 3 uses XML for format statements, allowing librarians with XML experience to more easily understand and use format statements than Greenstone 2 which worked with a custom way of specifying format statements. For more information on understanding format statements and writing your own format statements for collections, refer to <a href="http\://wiki.greenstone.org/doku.php?id=en\:user\:gs3_format_statements">Greenstone 3 Format Statements</a> on the Greenstone wiki.</p> 38 38 39 description13= <p><b>Collection-level metadata</b>. The <i><displayItem></i> elements under the top-level <i><displayItemList></i> in the configuration file are also standard in all Greenstone collections. They give general information about the collection, defining its name, and a description that appears on its home page. The description text (defined in the translatable <tt>resources/collectionConfig.properties</tt> files) can be seen on the DLS collection\'s home page (this text is part of it).</p>39 description13=QQQQ<p><b>Collection-level metadata</b>. The <i><displayItem></i> elements under the top-level <i><displayItemList></i> in the configuration file are also standard in all Greenstone collections. 
They give general information about the collection, defining its name, and a description that appears on its home page. The description text (defined in the translatable <tt>resources/collectionConfig.properties</tt> files) can be seen on the DLS collection\'s home page (this text is part of it).</p> 40 40 41 description14= <p><b>Language translations</b>. In the collection configuration file, lines that look like <tt><displayItem assigned="true" dictionary="collectionConfig" key="..." name="..."/></tt> allow for translatable collection-level metadata, that are defined in the <tt>resources/collectionConfig.properties</tt> text files and can be translated in the same location such as by creating French and Spanish versions (in <tt>resources/collectionConfig_fr.properties</tt> and <tt>resources/collectionConfig_es.properties</tt>, respectively). Note that we advise translators to go through the GTI (Greenstone Translation Interface) system if they want to contribute translations to Greenstone as used by everyone, such as translations to Greenstone\'s demo collections and these documented example collections. The properties files allow for accented characters (e.g. French <i>é</i>). The files are in UTF-8, and these characters are represented by multi-byte sequences (<C3><A9> in this case). Alternatively they could be represented by their HTML entity names (like <i>& eacute ;</i>). It makes no difference for how they appear on the screen.</p>41 description14=QQQQ<p><b>Language translations</b>. In the collection configuration file, lines that look like <tt><displayItem assigned="true" dictionary="collectionConfig" key="..." 
name="..."/></tt> allow for translatable collection-level metadata, that are defined in the <tt>resources/collectionConfig.properties</tt> text files and can be translated in the same location such as by creating French and Spanish versions (in <tt>resources/collectionConfig_fr.properties</tt> and <tt>resources/collectionConfig_es.properties</tt>, respectively). Note that we advise translators to go through the GTI (Greenstone Translation Interface) system if they want to contribute translations to Greenstone as used by everyone, such as translations to Greenstone\'s demo collections and these documented example collections. The properties files allow for accented characters (e.g. French <i>é</i>). The files are in UTF-8, and these characters are represented by multi-byte sequences (<C3><A9> in this case). Alternatively they could be represented by their HTML entity names (like <i>& eacute ;</i>). It makes no difference for how they appear on the screen.</p> 42 42 43 description15= <p><b>Description tags</b>. The description tags recognized by <i>HTMLPlugin</i> are inserted into the HTML source text of the documents to define where sections begin and end, and to specify section titles. They look like this\: <pre> <!-- <Section> <Description> <Metadata name="Title"> Realizing human rights for poor people\: Strategies for achieving the international development targets </Metadata> </Description> --> (text of section goes here) <!-- </Section> --> </pre> The <!-- ... --> markers are used to ensure that these tags are marked as comments in HTML and therefore do not affect document formatting. In the <i>Description</i> part other kinds of metadata can be specified, but this is not done for the style of collection we are describing here. Exactly the same specification (including the <!-- ... --> markers) can be used in Word documents too.</p>43 description15=QQQQ<p><b>Description tags</b>. 
The description tags recognized by <i>HTMLPlugin</i> are inserted into the HTML source text of the documents to define where sections begin and end, and to specify section titles. They look like this\: <pre> <!-- <Section> <Description> <Metadata name="Title"> Realizing human rights for poor people\: Strategies for achieving the international development targets </Metadata> </Description> --> (text of section goes here) <!-- </Section> --> </pre> The <!-- ... --> markers are used to ensure that these tags are marked as comments in HTML and therefore do not affect document formatting. In the <i>Description</i> part other kinds of metadata can be specified, but this is not done for the style of collection we are describing here. Exactly the same specification (including the <!-- ... --> markers) can be used in Word documents too.</p> 44 44 45 description16= <p><b>Metadata Files</b>. Metadata for all documents in the DLS collection is provided in metadata.xml files, one per document folder. In this collection\'s <tt>import/r0087e</tt> is the <tt>metadata.xml</tt> file for one book -- <i>Income generation and money management\: training women as entrepreneurs</i> -- which is a block of about ten lines encased in <<i>FileSet</i>> ... <<i>/FileSet</i>> tags. It defines <i>dls.Title</i>, <i>dls.Language</i>, <i>dls.Subject</i> and <i>dls.AZList</i> metadata. More than one value can be specified for any metadata item. For example, this book has two dls.Subject classifications. Both of these are stored as metadata values for this particular document (because <i>mode=accumulate</i> is specified; the alternative, and the default, is <i>mode=override</i>).</p>45 description16=QQQQ<p><b>Metadata Files</b>. Metadata for all documents in the DLS collection is provided in metadata.xml files, one per document folder. 
In this collection\'s <tt>import/r0087e</tt> is the <tt>metadata.xml</tt> file for one book -- <i>Income generation and money management\: training women as entrepreneurs</i> -- which is a block of about ten lines encased in <<i>FileSet</i>> ... <<i>/FileSet</i>> tags. It defines <i>dls.Title</i>, <i>dls.Language</i>, <i>dls.Subject</i> and <i>dls.AZList</i> metadata. More than one value can be specified for any metadata item. For example, this book has two dls.Subject classifications. Both of these are stored as metadata values for this particular document (because <i>mode=accumulate</i> is specified; the alternative, and the default, is <i>mode=override</i>).</p> 46 46 47 description17= <p><b>Hierarchy files</b>. Hierarchy files contain a succession of lines each of which has three items. The first item is a text string which is matched against the metadata that occurs in the <i>metadata.xml</i> file described above. The second item is a number that defines the position in the hierarchy. The third item is a text string that describes the node of the hierarchy on the web pages that Greenstone generates.</p>47 description17=QQQQ<p><b>Hierarchy files</b>. Hierarchy files contain a succession of lines each of which has three items. The first item is a text string which is matched against the metadata that occurs in the <i>metadata.xml</i> file described above. The second item is a number that defines the position in the hierarchy. The third item is a text string that describes the node of the hierarchy on the web pages that Greenstone generates.</p> 48 48 49 description18= <p>For example, the following shows three lines from the subject hierarchy file <tt>etc/dls.Subject.txt</tt>. \n\49 description18=QQQQ<p>For example, the following shows three lines from the subject hierarchy file <tt>etc/dls.Subject.txt</tt>. 
\n\ 50 50 <pre> "Animal Husbandry and Animal Product Processing " \n\ 51 51 7 "Animal Husbandry and Animal Product Processing " "Animal Husbandry and Animal Product Processing|Cattle " \n\ … … 55 55 </p> 56 56 57 description19= <p>These three lines define one top level bookshelf (at position 7), titled "Animal Husbandry and Animal Product Processing ", with two bookshelves underneath it, titled "Cattle " and "Other animals (micro-livestock, little known animals, silkworms, reptiles, frogs, snails, game, etc.) " respectively.</p>57 description19=QQQQ<p>These three lines define one top level bookshelf (at position 7), titled "Animal Husbandry and Animal Product Processing ", with two bookshelves underneath it, titled "Cattle " and "Other animals (micro-livestock, little known animals, silkworms, reptiles, frogs, snails, game, etc.) " respectively.</p> 58 58 59 description20= <p>In this case, the first strings (and therefore the entries in metadata.xml files) contain the entire hierarchy values. Levels in the hierarchy are separated by "| ". They could be used directly by a <i>Hierarchy</i> classifier without the use of the hierarchy file. However, then the entries would be ordered alphabetically, not in the special order defined by the file.</p>59 description20=QQQQ<p>In this case, the first strings (and therefore the entries in metadata.xml files) contain the entire hierarchy values. Levels in the hierarchy are separated by "| ". They could be used directly by a <i>Hierarchy</i> classifier without the use of the hierarchy file. However, then the entries would be ordered alphabetically, not in the special order defined by the file.</p> 60 60 61 description21= <p>The <tt>etc/dls.AZList.txt</tt> hierarchy file used by the titles classifier contains a similar structure. Ordinarily, a titles browser would use a <i>List</i> (or <i>AZList</i>) classifier. 
In this case, we want to predefine the A-Z groupings, and include a separate entry for periodicals, as can be seen in classifier browser <a href="library/collection/dls-e/browse/CL2/7">here</a>.</p>61 description21=QQQQ<p>The <tt>etc/dls.AZList.txt</tt> hierarchy file used by the titles classifier contains a similar structure. Ordinarily, a titles browser would use a <i>List</i> (or <i>AZList</i>) classifier. In this case, we want to predefine the A-Z groupings, and include a separate entry for periodicals, as can be seen in classifier browser <a href="library/collection/dls-e/browse/CL2/7">here</a>.</p> -
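The hierarchy-file layout described in description17 through description20 above (a quoted match string, a position number, and a quoted display title on each entry, with "|" separating hierarchy levels inside the match string) can be illustrated with a small parser. This is only a sketch in Python, not Greenstone's own code: the names HierarchyEntry, parse_hierarchy_line and levels are hypothetical, the dotted child position 7.1 is an assumption (the diff elides the real child positions), and trailing spaces inside the quoted strings are simply stripped.

```python
import shlex
from typing import List, NamedTuple, Tuple


class HierarchyEntry(NamedTuple):
    key: str                   # string matched against metadata values
    position: Tuple[int, ...]  # assumed dotted form, e.g. "7.1" -> (7, 1)
    title: str                 # label shown for this node of the hierarchy


def parse_hierarchy_line(line: str) -> HierarchyEntry:
    """Split one entry into its three items, honouring double quotes."""
    parts = shlex.split(line)
    if len(parts) != 3:
        raise ValueError(f"expected 3 items, got {len(parts)}: {line!r}")
    key, position, title = (p.strip() for p in parts)
    return HierarchyEntry(key, tuple(int(n) for n in position.split(".")), title)


def levels(key: str) -> List[str]:
    """Break a match string into its hierarchy levels at each '|'."""
    return [level.strip() for level in key.split("|")]


# Illustrative entries; the child position 7.1 is assumed, not taken from the file.
SAMPLE = [
    '"Animal Husbandry and Animal Product Processing|Cattle" 7.1 "Cattle"',
    '"Animal Husbandry and Animal Product Processing" 7 '
    '"Animal Husbandry and Animal Product Processing"',
]

# Sorting by position gives the order defined by the file rather than alphabetical order.
entries = sorted((parse_hierarchy_line(line) for line in SAMPLE),
                 key=lambda entry: entry.position)
for entry in entries:
    print(entry.position, levels(entry.key), entry.title)
```

Sorting on the parsed position is what distinguishes a file-driven hierarchy from the alphabetical ordering mentioned in description20.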
documented-examples/trunk/oai-e/resources/collectionConfig.properties
r36575 r36577
1 name= OAI demo
2 Rights= Rights
3 Caption= Caption
4 Publisher= Publisher
5 original= original
6 Subject= Subject
7 available= available
8 sampleoid= 01dle6
9 index_Description= photo captions
10 document= Document
1 name=QQQQOAI demo
2 Rights=QQQQRights
3 Caption=QQQQCaption
4 Publisher=QQQQPublisher
5 original=QQQQoriginal
6 Subject=QQQQSubject
7 available=QQQQavailable
8 sampleoid=QQQQ01dle6
9 index_Description=QQQQphoto captions
10 document=QQQQDocument
11 11
12 12
13 shortDescription= <p>This collection demonstrates Greenstone\'s <i>ImportFrom</i> feature. Using the <a href="http\://www.openarchives.org">Open Archive Protocol</a> (version 1.1), it retrieves metadata from <a href="http\://rocky.dlib.vt.edu/~jcdlpix">rocky.dlib.vt.edu/~jcdlpix</a>, a collection of photographs taken at the inaugural <a href="http\://www.acm.org/jcdl/jcdl01/">Joint Conference on Digital Libraries</a>. A Greenstone collection is built from the records exported from this OAI data provider. The implementation is flexible enough to cope with the minor syntax differences between OAI 1.1 and OAI 2.0.</p>
13 shortDescription=QQQQ<p>This collection demonstrates Greenstone\'s <i>ImportFrom</i> feature. Using the <a href="http\://www.openarchives.org">Open Archive Protocol</a> (version 1.1), it retrieves metadata from <a href="http\://rocky.dlib.vt.edu/~jcdlpix">rocky.dlib.vt.edu/~jcdlpix</a>, a collection of photographs taken at the inaugural <a href="http\://www.acm.org/jcdl/jcdl01/">Joint Conference on Digital Libraries</a>. A Greenstone collection is built from the records exported from this OAI data provider. The implementation is flexible enough to cope with the minor syntax differences between OAI 1.1 and OAI 2.0.</p>
14 14
15 description1= <h3>How the collection works</h3><p>The collection configuration file, <tt>collectionConfig.xml</tt>, includes an <i>acquire</i> line that is interpreted by a special program called <i>importfrom.pl</i>. 
Like other Greenstone programs, this takes as argument the name of the collection, and provides a summary of other arguments when invoked with argument <i>-help</i>. It reads the collection configuration file, finds the acquire line, and processes it. In this case, it is run with the command\: <pre> importfrom.pl oai-e </pre> (the collection\'s name is <i>oai-e</i>).</p>15 description1=QQQQ<h3>How the collection works</h3><p>The collection configuration file, <tt>collectionConfig.xml</tt>, includes an <i>acquire</i> line that is interpreted by a special program called <i>importfrom.pl</i>. Like other Greenstone programs, this takes as argument the name of the collection, and provides a summary of other arguments when invoked with argument <i>-help</i>. It reads the collection configuration file, finds the acquire line, and processes it. In this case, it is run with the command\: <pre> importfrom.pl oai-e </pre> (the collection\'s name is <i>oai-e</i>).</p> 16 16 17 description2= <p>The <i>acquire</i> line in the configuration file specifies the OAI protocol and gives the base URL of an OAI repository. The <i>importfrom</i> program downloads all the metadata in that repository into the collection\'s <i>import</i> directory. The <i>getdoc</i> argument instructs it to also download the collection\'s source documents, whose URLs are given in each document\'s Dublin Core <i>Identifier</i> field (this is a common convention). The metadata files, which each contain an XML record for one source document, are placed in the <i>import</i> file structure along with the documents themselves, and the document filename is the same as the filename in the URL. The <i>Identifier</i> field is overridden to give the local filename, and its original value is retained in a new field called <i>OrigURL</i>.</p>17 description2=QQQQ<p>The <i>acquire</i> line in the configuration file specifies the OAI protocol and gives the base URL of an OAI repository. 
The <i>importfrom</i> program downloads all the metadata in that repository into the collection\'s <i>import</i> directory. The <i>getdoc</i> argument instructs it to also download the collection\'s source documents, whose URLs are given in each document\'s Dublin Core <i>Identifier</i> field (this is a common convention). The metadata files, which each contain an XML record for one source document, are placed in the <i>import</i> file structure along with the documents themselves, and the document filename is the same as the filename in the URL. The <i>Identifier</i> field is overridden to give the local filename, and its original value is retained in a new field called <i>OrigURL</i>.</p> 18 18 19 description3= <p>This <i>oai-e</i> collection\'s own <tt>etc/oai.txt</tt> is an example of a downloaded metadata file.</p>19 description3=QQQQ<p>This <i>oai-e</i> collection\'s own <tt>etc/oai.txt</tt> is an example of a downloaded metadata file.</p> 20 20 21 description4= <p>Once the OAI information has been imported, the collection is processed in the usual way. Besides the four standard plugins (GreenstoneXMLPlugin, MetadataXMLPlugin, ArchivesInfPlugin and DirectoryPlugin), the configuration file specifies the OAI plugin, which processes OAI metadata, and the image plugin, because in this case the collection\'s source documents are image files. The OAI plugin has been supplied with an <i>input_encoding</i> argument because data in this archive contains extended characters. It also has a <i>default_language</i> argument. Greenstone normally determines the language of documents automatically, but these metadata records are too small for this to be done reliably\: hence English is specified explicitly in the <i>language</i> argument. The OAI plugin parses the metadata and passes it to the appropriate source document file, which is then processed by an appropriate plugin -- in this case <i>ImagePlugin</i>. 
This plugin specifies the resolution for the screen versions of the images.</p>21 description4=QQQQ<p>Once the OAI information has been imported, the collection is processed in the usual way. Besides the four standard plugins (GreenstoneXMLPlugin, MetadataXMLPlugin, ArchivesInfPlugin and DirectoryPlugin), the configuration file specifies the OAI plugin, which processes OAI metadata, and the image plugin, because in this case the collection\'s source documents are image files. The OAI plugin has been supplied with an <i>input_encoding</i> argument because data in this archive contains extended characters. It also has a <i>default_language</i> argument. Greenstone normally determines the language of documents automatically, but these metadata records are too small for this to be done reliably\: hence English is specified explicitly in the <i>language</i> argument. The OAI plugin parses the metadata and passes it to the appropriate source document file, which is then processed by an appropriate plugin -- in this case <i>ImagePlugin</i>. This plugin specifies the resolution for the screen versions of the images.</p> 22 22 23 description5= <p>Extracted metadata from OAI records are mapped to Dublin Core Metadata Set by default. As a result, classifiers and indexes in this collection are built with Dublin meatadata elements.</p>23 description5=QQQQ<p>Extracted metadata from OAI records are mapped to Dublin Core Metadata Set by default. As a result, classifiers and indexes in this collection are built with Dublin meatadata elements.</p> 24 24 25 description6= <p>The collection configuration file, <tt>collectionConfig.xml</tt>, specifies a single full-text index containing <i>dc.Description</i> metadata and overrides Greenstone\'s custom <i>gsf</i> format templates <tt>DocumentHeading</tt> and <tt>DocumentContent</tt> (XSL). When a document is displayed, the <i>DocumentHeading</i> format statement puts out its <i>dc.Subject</i>. 
Then the <i>DocumentContent</i> statement follows this with <i>screenicon</i>, which is produced by <i>ImagePlugin</i> and gives a screen-resolution version of the image; it can be hyperlinked to the <i>dc.OrigURL</i> metadata -- that is, the original version of the image on the remote OAI site. Since this is no longer available on the web, it is now hyperlinked to the full version of the image file. This is followed by the image\'s <i>dc.Description</i>, also with a hyperlink; the image\'s size and type, again generated as metadata by <i>ImagePlugin</i>; and then <i>dc.Subject</i>, <i>dc.Publisher</i>, and <i>dc.Rights</i> metadata. <a href="library/collection/oai-e/document/01dle6">This</a> is the result.</p>25 description6=QQQQ<p>The collection configuration file, <tt>collectionConfig.xml</tt>, specifies a single full-text index containing <i>dc.Description</i> metadata and overrides Greenstone\'s custom <i>gsf</i> format templates <tt>DocumentHeading</tt> and <tt>DocumentContent</tt> (XSL). When a document is displayed, the <i>DocumentHeading</i> format statement puts out its <i>dc.Subject</i>. Then the <i>DocumentContent</i> statement follows this with <i>screenicon</i>, which is produced by <i>ImagePlugin</i> and gives a screen-resolution version of the image; it can be hyperlinked to the <i>dc.OrigURL</i> metadata -- that is, the original version of the image on the remote OAI site. Since this is no longer available on the web, it is now hyperlinked to the full version of the image file. This is followed by the image\'s <i>dc.Description</i>, also with a hyperlink; the image\'s size and type, again generated as metadata by <i>ImagePlugin</i>; and then <i>dc.Subject</i>, <i>dc.Publisher</i>, and <i>dc.Rights</i> metadata. 
<a href="library/collection/oai-e/document/01dle6">This</a> is the result.</p> 26 26 27 description7= <p>There are two browsing classifiers, one based on <i>dc.Subject</i> metadata and the other on <i>dc.Description</i> metadata (but with a button named "captions"). Recall that the <i>AZCompactList</i> classifier is like <i>AZList</i> but generates a bookshelf for duplicate items. In this collection there are a lot of images but only a few different values for <i>dc.Subject</i> metadata.</p>27 description7=QQQQ<p>There are two browsing classifiers, one based on <i>dc.Subject</i> metadata and the other on <i>dc.Description</i> metadata (but with a button named "captions"). Recall that the <i>AZCompactList</i> classifier is like <i>AZList</i> but generates a bookshelf for duplicate items. In this collection there are a lot of images but only a few different values for <i>dc.Subject</i> metadata.</p> 28 28 29 description8= <p>It\'s a little surprising that <i>AZCompactList</i> is used (instead of <i>AZList</i>) for the <i>dc.Description</i> index too, because <i>dc.Description</i> metadata is usually unique for each image. However, in this collection the same description has occasionally been given to several images, and some of the divisions in an <i>AZList</i> would contain a large number of images, slowing down transmission of that page. To avoid this, the compact version of the list is used with some arguments (<i>mincompact</i>, <i>maxcompact</i>, <i>mingroup</i>, <i>minnesting</i>) to control the display -- e.g. groups (represented by bookshelves) are not formed unless they have at least 5 (<i>mingroup</i>) items. To find out the meaning of the other arguments for this classifier, execute the command <i>classinfo.pl AZCompactList</i>. The programs <i>classinfo.pl</i> (for classifiers) and <i>pluginfo.pl</i> (for plugins) are useful tools for learning about the capabilities of Greenstone modules. 
Note incidentally the backslash in the configuration file, used to indicate a continuation of the previous line.</p>29 description8=QQQQ<p>It\'s a little surprising that <i>AZCompactList</i> is used (instead of <i>AZList</i>) for the <i>dc.Description</i> index too, because <i>dc.Description</i> metadata is usually unique for each image. However, in this collection the same description has occasionally been given to several images, and some of the divisions in an <i>AZList</i> would contain a large number of images, slowing down transmission of that page. To avoid this, the compact version of the list is used with some arguments (<i>mincompact</i>, <i>maxcompact</i>, <i>mingroup</i>, <i>minnesting</i>) to control the display -- e.g. groups (represented by bookshelves) are not formed unless they have at least 5 (<i>mingroup</i>) items. To find out the meaning of the other arguments for this classifier, execute the command <i>classinfo.pl AZCompactList</i>. The programs <i>classinfo.pl</i> (for classifiers) and <i>pluginfo.pl</i> (for plugins) are useful tools for learning about the capabilities of Greenstone modules. Note incidentally the backslash in the configuration file, used to indicate a continuation of the previous line.</p> 30 30 31 description9= <p>The <i>VList</i> format specification shows the image thumbnail, hyperlinked to the associated document, followed by <i>dc.Description</i> metadata; the result can be seen in the <a href="library/collection/oai-e/browse/CL2">CL2</a> classifier browser. 
The <i>Vlists</i> for the classifiers use <i>numleafdocs</i> to switch between an icon representing several documents (which will appear as a bookshelf) and the thumbnail itself, if there is only one image.</p>31 description9=QQQQ<p>The <i>VList</i> format specification shows the image thumbnail, hyperlinked to the associated document, followed by <i>dc.Description</i> metadata; the result can be seen in the <a href="library/collection/oai-e/browse/CL2">CL2</a> classifier browser. The <i>Vlists</i> for the classifiers use <i>numleafdocs</i> to switch between an icon representing several documents (which will appear as a bookshelf) and the thumbnail itself, if there is only one image.</p> 32 32 33 description10= <h3>The Greenstone OAI server</h3><p>Greenstone comes with a built-in OAI data provider. This runs as a CGI program called "oaiserver.cgi", and is installed in the Greenstone <i>cgi-bin</i> directory. It can be accessed via the same URL as the Greenstone library (replacing "library.cgi" with "oaiserver.cgi"). If you are using the Windows local library server, you must install a web server (such as Apache) to run the OAI server.</p>33 description10=QQQQ<h3>The Greenstone OAI server</h3><p>Greenstone comes with a built-in OAI data provider. This runs as a CGI program called "oaiserver.cgi", and is installed in the Greenstone <i>cgi-bin</i> directory. It can be accessed via the same URL as the Greenstone library (replacing "library.cgi" with "oaiserver.cgi"). If you are using the Windows local library server, you must install a web server (such as Apache) to run the OAI server.</p> 34 34 35 description11= <p>Configuration of the server is done via the <i>oai.cfg</i> file in the Greenstone <i>etc</i> directory. This file specifies general information about the repository, and lists collections to be made accessible to OAI clients. By default, collections are not accessible. 
To enable a collection, add its name to the <i>oaicollection</i> list.</p>35 description11=QQQQ<p>Configuration of the server is done via the <i>oai.cfg</i> file in the Greenstone <i>etc</i> directory. This file specifies general information about the repository, and lists collections to be made accessible to OAI clients. By default, collections are not accessible. To enable a collection, add its name to the <i>oaicollection</i> list.</p> 36 36 37 description12= <p>Greenstone\'s OAI server currently supports Dublin Core, qualified Dublin Core and rfc1807 metadata sets. The <i>oaimetadata</i> line specifies which sets should be used. For collections that use other metadata sets, metadata mapping rules should be provided to map the existing metadata to the sets in use. See the <i>oai.cfg</i> file for details.</p>37 description12=QQQQ<p>Greenstone\'s OAI server currently supports Dublin Core, qualified Dublin Core and rfc1807 metadata sets. The <i>oaimetadata</i> line specifies which sets should be used. For collections that use other metadata sets, metadata mapping rules should be provided to map the existing metadata to the sets in use. See the <i>oai.cfg</i> file for details.</p> -
documented-examples/trunk/wiki-e/resources/collectionConfig.properties
r36575 r36577 1 name= MediaWiki collection2 index_Title= Titles3 index_text= Text4 index_Source= Filenames1 name=QQQQMediaWiki collection 2 index_Title=QQQQTitles 3 index_text=QQQQText 4 index_Source=QQQQFilenames 5 5 6 6 7 shortDescription= <p>This demonstration collection is made from the Greenstone Wiki website. It shows off the new feature of building a Greenstone collection from a MediaWiki website in Greenstone.</p>7 shortDescription=QQQQ<p>This demonstration collection is made from the Greenstone Wiki website. It shows off the new feature of building a Greenstone collection from a MediaWiki website in Greenstone.</p> 8 8 9 description1= <h3>How the collection works</h3><p>The collection configuration file, <tt>collectionConfig.xml</tt>, contains the plugins <i>MediaWikiPlugin</i>, <i>ImagePlugin</i>, <i>ZipPlugin</i>, <i>PDFPlugin</i>, <i>PowerPointPlugin</i>, <i>WordPlugin</i> (along with the standard plugins <i>GreenstoneXMLPlugin</i>, <i>ArchivesInfPlugin</i> and <i>DirectoryPlugin</i>). The <i>MediaWikiPlugin</i> handles the HTML pages downloaded from a MediaWiki website, while <i>ImagePlugin</i>, <i>ZipPlugin</i>, <i>PDFPlugin</i>, <i>PowerPointPlugin</i> and <i>WordPlugin</i> handle the image, zip, PDF, PowerPoint and Word files associated with the Greenstone Wiki.</p>9 description1=QQQQ<h3>How the collection works</h3><p>The collection configuration file, <tt>collectionConfig.xml</tt>, contains the plugins <i>MediaWikiPlugin</i>, <i>ImagePlugin</i>, <i>ZipPlugin</i>, <i>PDFPlugin</i>, <i>PowerPointPlugin</i>, <i>WordPlugin</i> (along with the standard plugins <i>GreenstoneXMLPlugin</i>, <i>ArchivesInfPlugin</i> and <i>DirectoryPlugin</i>). 
The <i>MediaWikiPlugin</i> handles the HTML pages downloaded from a MediaWiki website, while <i>ImagePlugin</i>, <i>ZipPlugin</i>, <i>PDFPlugin</i>, <i>PowerPointPlugin</i> and <i>WordPlugin</i> handle the image, zip, PDF, PowerPoint and Word files associated with the Greenstone Wiki.</p> 10 10 11 description2= <p>To build a collection from a MediaWiki website of your choice, you would first download the wiki files using the <i>MediaWiki</i> option on the <i>Download</i> panel of GLI. This download type works in a similar way to a the <i>Web</i> download, but is specially designed for crawling MediaWiki websites.</p>11 description2=QQQQ<p>To build a collection from a MediaWiki website of your choice, you would first download the wiki files using the <i>MediaWiki</i> option on the <i>Download</i> panel of GLI. This download type works in a similar way to a the <i>Web</i> download, but is specially designed for crawling MediaWiki websites.</p> 12 12 13 description3= <p>Once the files are downloaded, copy them into a collection using the <i>Gather</i>, dragging them from the <i>Downloaded Files</i> folder in the Workspace tree on the left-hand side.</p>13 description3=QQQQ<p>Once the files are downloaded, copy them into a collection using the <i>Gather</i>, dragging them from the <i>Downloaded Files</i> folder in the Workspace tree on the left-hand side.</p> 14 14 15 description4= <p>In the <i>Document Plugins</i> section of the <i>Design</i> panel, add <i>MediaWikiPlugin</i>. <i>MediaWikiPlugin</i> has several specific options which control aspects of page presentation, such as whether or not the table of contents, navigation toolbars and search box are shown on each page. Configure these options based on how you want the pages to appear. 
You can see the options used by this collection in its collection configuration file, <tt>collectionConfig.xml</tt>.</p>15 description4=QQQQ<p>In the <i>Document Plugins</i> section of the <i>Design</i> panel, add <i>MediaWikiPlugin</i>. <i>MediaWikiPlugin</i> has several specific options which control aspects of page presentation, such as whether or not the table of contents, navigation toolbars and search box are shown on each page. Configure these options based on how you want the pages to appear. You can see the options used by this collection in its collection configuration file, <tt>collectionConfig.xml</tt>.</p> 16 16