Changeset 13119


Timestamp:
2006-10-16T16:22:51+13:00
Author:
kjdon
Message:

updated some more tutorials for 2.71

File:
1 edited

  • trunk/gsdl-documentation/tutorials/xml-source/tutorial_en.xml

    r13114 r13119  
    396396<NumberedItem>
    397397<Text id="0201">From <Link>http://www.greenstone.org</Link></Text>
    398 <Text id="0202">Most people download the Windows distribution from <Link>http://www.greenstone.org</Link>, which contains the latest version of Greenstone. There are several optional modules that must be downloaded separately (to avoid a single massive download): <b>documented example collections</b>, the <b>Export to CD-ROM</b> package, the <b>Language Pack</b> (Greenstone 2.62 and earlier) and <b>Classic Interface Pack</b> (Greenstone 2.63 and later). There is also the set of <b>sample files</b> used in these exercises. (To reduce the download size the documented example collections are distributed in unbuilt form and need to be built.)</Text>
     398<Text id="0202">Most people download the Windows distribution from <Link>http://www.greenstone.org</Link>, which contains the latest version of Greenstone. There are several optional modules that must be downloaded separately (to avoid a single massive download): <b>documented example collections</b>, the <b>Export to CD-ROM</b> package (Greenstone 2.70 and earlier), the <b>Language Pack</b> (Greenstone 2.62 and earlier) and <b>Classic Interface Pack</b> (Greenstone 2.63 and later). There is also the set of <b>sample files</b> used in these exercises. (To reduce the download size the documented example collections are distributed in unbuilt form and need to be built.)</Text>
    399399<Text id="0203">You need <b>Java</b> to run Greenstone. You might already have it; otherwise download it from <Link>http://java.sun.com</Link>. To work with image collections, you need <b>ImageMagick</b> (from <Link>http://www.imagemagick.org</Link>). </Text>
    400400</NumberedItem>
     
    812812</Title>
    813813<SampleFiles folder="Word_and_PDF"/>
    814 <Version initial="2.60" current="2.70w"/>
     814<Version initial="2.60" current="2.71"/>
    815815<Content>
    816816<Comment>
     
    818818</Comment>
    819819<NumberedItem>
    820 <Text id="0281">Start a new collection called <b>reports</b> (<AutoText key="glidict::Menu.File"/> &rarr; <AutoText key="glidict::Menu.File_New"/>), base it on <AutoText key="glidict::NewCollectionPrompt.NewCollection"/>, and choose Dublin Core as the metadata set.</Text>
     820<Text id="0281">Start a new collection called <b>reports</b> (<AutoText key="glidict::Menu.File"/> &rarr; <AutoText key="glidict::Menu.File_New"/>) and base it on <AutoText key="glidict::NewCollectionPrompt.NewCollection"/>.</Text>
    821821</NumberedItem>
    822822<NumberedItem>
     
    863863</Comment>
    864864<Heading>
    865 <Text id="0295">Collection design; branding a collection with an image</Text>
    866 </Heading>
    867 <NumberedItem>
    868 <Text id="0296">Change to the <AutoText key="glidict::GUI.Design"/> panel, which is split into several sections. The first section <AutoText key="glidict::CDM.GUI.General"/> appears. This allows you to modify the values you provided when defining the collection, if desired. You can also brand the collection using a suitable image.</Text>
    869 </NumberedItem>
    870 <NumberedItem>
    871 <Text id="0297">Click on the <AutoText key="glidict::General.Browse" type="button"/> button associated with <AutoText key="glidict::CDM.General.Icon_Collection"/>, and browse to the image <Path>sample_files &rarr; Word_and_PDF &rarr; wrdpdf.gif</Path> on your computer. When you select this image, Greenstone automatically generates an appropriate URL for the image. <b>Preview</b> the collection: you should see the new image at the top left of the page.</Text>
    872 <Comment>
    873 <Text id="0297a">Information on the <AutoText key="glidict::CDM.GUI.General"/> page does not require a rebuild of the collection to take effect. Just go to the <AutoText key="glidict::GUI.Create"/> panel and click <AutoText key="glidict::CreatePane.Preview_Collection" type="button"/>.</Text>
    874 </Comment>
    875 </NumberedItem>
    876 <NumberedItem>
    877 <Text id="0301">If you are on the web, you can easily make your own Greenstone-style icon by going to</Text>
    878 <Link>http://www.greenstone.org/make-images.html</Link>
    879 <Text id="0302">and following the instructions there.</Text>
    880 </NumberedItem>
    881 <Heading>
    882 <Text id="0303">Document plugins</Text>
    883 </Heading>
    884 <NumberedItem>
    885 <Text id="0304">Back in the Librarian Interface, look at the <AutoText key="glidict::CDM.GUI.Plugins"/> section of the <AutoText key="glidict::GUI.Design"/> panel, by clicking on this in the list to the left. Here you can add, configure or remove plugins to be used in the collection. There is no need to remove any plugins, but it will speed up processing a little. In this case we have only Word, PDF, RTF, and PostScript documents, and can remove the <AutoText text="ZIPPlug"/>, <AutoText text="TEXTPlug"/>, <AutoText text="HTMLPlug"/>, <AutoText text="EMAILPlug"/>, <AutoText text="ImagePlug"/>, <AutoText text="ISISPlug"/> and <AutoText text="NULPlug"/> plugins. To delete a plugin, select it and click <AutoText key="glidict::CDM.PlugInManager.Remove" type="button"/>. <AutoText text="GAPlug"/> is required for any type of source collection and should not be removed. </Text>
    886 </NumberedItem>
    887 <Comment>
    888 <Text id="0304a">The next section is <AutoText key="glidict::CDM.GUI.SearchTypes"/>. In this exercise, we will not make any changes to this section.</Text>
    889 </Comment>
     865<Text id="0303"><AutoText key="glidict::CDM.GUI.Plugins" type="plain"/></Text>
     866</Heading>
     867<NumberedItem>
     868<Text id="0304">In the Librarian Interface, look at the <AutoText key="glidict::CDM.GUI.Plugins"/> section of the <AutoText key="glidict::GUI.Design"/> panel, by clicking on this in the list to the left. Here you can add, configure or remove plugins to be used in the collection. There is no need to remove any plugins, but it will speed up processing a little. In this case we have only Word, PDF, RTF, and PostScript documents, and can remove the <AutoText text="ZIPPlug"/>, <AutoText text="TEXTPlug"/>, <AutoText text="HTMLPlug"/>, <AutoText text="EMAILPlug"/>, <AutoText text="ImagePlug"/>, <AutoText text="ISISPlug"/> and <AutoText text="NULPlug"/> plugins. To delete a plugin, select it and click <AutoText key="glidict::CDM.PlugInManager.Remove" type="button"/>. <AutoText text="GAPlug"/> is required for any type of source collection and should not be removed. </Text>
     869</NumberedItem>
    890870<Heading>
    891871<Text id="0309">Search indexes</Text>
     
    895875</NumberedItem>
    896876<NumberedItem>
    897 <Text id="0310b">Modify the <AutoText key="metadata::ex.Title"/> index to include <AutoText key="metadata::dc.Title"/> by selecting the index in the <AutoText key="glidict::CDM.IndexManager.Indexes"/> box and then selecting <AutoText key="metadata::dc.Title"/> from the <AutoText key="glidict::CDM.IndexManager.Source"/> box. Click <AutoText key="glidict::CDM.IndexManager.MGPP.Replace_Index" type="button"/>. Searching this index will search both dc.Title and ex.Title metadata. If you want to restrict searching to just the manually added dc.Title metadata, deselect <AutoText key="metadata::ex.Title"/> from the <AutoText key="glidict::CDM.IndexManager.Source"/> box and click <AutoText key="glidict::CDM.IndexManager.MGPP.Replace_Index" type="button"/>.</Text>
    898 </NumberedItem>
    899 <NumberedItem>
    900 <Text id="0312">You can add indexes based on any metadata. Add a new index based on <AutoText key="metadata::dc.Creator"/>. Change the <AutoText key="glidict::CDM.IndexManager.Index_Name"/> field to "authors", and select <AutoText key="metadata::dc.Creator"/> in the <AutoText key="glidict::CDM.IndexManager.Source"/> list. You will need to deselect the <AutoText key="metadata::ex.Title"/> and <AutoText key="metadata::dc.Title"/> metadata items. Click <AutoText key="glidict::CDM.IndexManager.Add_Index" type="button"/>.</Text>
    901 </NumberedItem>
    902 <Comment>
    903 <Text id="0313">The next two sections are <AutoText key="glidict::CDM.GUI.Subcollections"/> and <AutoText key="glidict::CDM.GUI.SuperCollection"/>. In this exercise, we will not make any changes to these.</Text>
     877<Text id="0310b">Modify the <AutoText key="metadata::ex.Title"/> index to include <AutoText key="metadata::dc.Title"/> by selecting the index in the <AutoText key="glidict::CDM.IndexManager.Indexes"/> box and clicking <AutoText key="glidict::CDM.IndexManager.Edit_Index" type="button"/>. Select <AutoText key="metadata::dc.Title"/> from the list of metadata, and click <AutoText key="glidict::CDM.IndexManager.MGPP.Replace_Index" type="button"/>. Searching this index will search both dc.Title and ex.Title metadata. If you want to restrict searching to just the manually added dc.Title metadata, edit the index again and deselect <AutoText key="metadata::ex.Title"/> from the list of metadata.</Text>
     878</NumberedItem>
     879<NumberedItem>
     880<Text id="0312">You can add indexes based on any metadata. Add a new index based on <AutoText key="metadata::dc.Creator"/> by clicking <AutoText key="glidict::CDM.IndexManager.New_Index" type="button"/>. Select <AutoText key="metadata::dc.Creator"/> in the list of metadata, and click <AutoText key="glidict::CDM.IndexManager.Add_Index" type="button"/>.</Text>
     881</NumberedItem>
     882<Comment>
     883<Text id="0313">The next section is <AutoText key="glidict::CDM.GUI.Subcollections"/>. In this exercise, we will not make any changes to this.</Text>
    904884</Comment>
    905885<Heading>
     
    917897<Text id="0318b"><AutoText text="AZCompactList"/> is like <AutoText text="AZList"/>, except that values that appear multiple times in the hierarchy are automatically grouped together and a new node, shown as a bookshelf icon, is formed.</Text>
    918898</NumberedItem>
    919 <Comment>
    920 <Text id="0319">The last three sections are <AutoText key="glidict::CDM.GUI.Formats"/>, <AutoText key="glidict::CDM.GUI.Translation"/> and <AutoText key="glidict::CDM.GUI.MetadataSets"/>. In this exercise, we will not make any changes to these.</Text>
    921 </Comment>
    922899<NumberedItem>
    923900<Text id="0320">Switch to the <AutoText key="glidict::GUI.Create"/> panel, and <b>build</b> and <b>preview</b> the collection.</Text>
    924901</NumberedItem>
    925902<NumberedItem>
    926 <Text id="0321">Check that all the facilities work properly. There should be three full-text indexes, called <i>text</i>, <i>titles</i>, and <i>authors</i>. The <AutoText key="coredm::_Global:labelTitle_" type="italics"/> list should display all the documents to which you have assigned <AutoText key="metadata::dc.Title"/> metadata (and only those documents). The <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list should show one bookshelf for each author you have assigned as <AutoText key="metadata::dc.Creator"/>, and clicking on that bookshelf should take you to all the documents they authored.</Text>
     903<Text id="0321">Check that all the facilities work properly. There should be three full-text indexes, called <i>text</i>, <i>dc.Title,ex.Title</i>, and <i>dc.Creator</i>. The <AutoText key="coredm::_Global:labelTitle_" type="italics"/> list should display all the documents to which you have assigned <AutoText key="metadata::dc.Title"/> metadata (and only those documents). The <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list should show one bookshelf for each author you have assigned as <AutoText key="metadata::dc.Creator"/>, and clicking on that bookshelf should take you to all the documents they authored.</Text>
     904</NumberedItem>
     905<Heading>
     906<Text id="0321-1">Renaming the search indexes</Text>
     907</Heading>
     908<NumberedItem>
     909<Text id="">The default display text for the indexes in the drop-down list on the search page contains the content of the index. Now we will change this display text to make it nicer. Go to the <AutoText key="glidict::GUI.Format"/> panel by clicking its tab. This panel is split into several sections, each controlling some aspect of collection presentation.</Text>
     910</NumberedItem>
     911<NumberedItem>
     912<Text id="">Select <AutoText key="glidict::CDM.GUI.SearchMetadata"/> in the left hand list. This pane allows you to modify what text is displayed for the drop-down lists in the search form (indexes, subcollections, levels etc). Set the  <AutoText key="glidict::CDM.SearchMetadataManager.Component_Name"/> for the <AutoText text="dc.Title,Title"/> index to be "titles", and that for the <AutoText text="dc.Creator"/> index to be "creators". Preview the collection by clicking the <AutoText key="glidict::CreatePane.Preview_Collection"/>. The search form should display the new text.</Text>
    927913</NumberedItem>
    928914<Heading>
     
    962948<Text id="0321f">Extracted metadata is unreliable. But it is very cheap! On the other hand, manually assigned metadata is reliable, but expensive. The previous section of this exercise has shown how to aim for the best of both worlds by using extracted metadata but correcting it when it is wrong. While this may not satisfy the professional librarian, it could provide a useful compromise for the music teacher who wants to get their collection together with a minimum of effort.</Text>
    963949</NumberedItem>
     950<Heading>
     951<Text id="0295">Branding a collection with an image</Text>
     952</Heading>
     953<NumberedItem>
     954<Text id="0296">Switch back to the <AutoText key="glidict::GUI.Format"/> panel. The first section <AutoText key="glidict::CDM.GUI.General"/> appears. This allows you to modify the values you provided when defining the collection, if desired. You can also brand the collection using a suitable image.</Text>
     955</NumberedItem>
     956<NumberedItem>
     957<Text id="0297">Click on the <AutoText key="glidict::General.Browse" type="button"/> button associated with <AutoText key="glidict::CDM.General.Icon_Collection"/>, and browse to the image <Path>sample_files &rarr; Word_and_PDF &rarr; wrdpdf.gif</Path> on your computer. When you select this image, Greenstone automatically generates an appropriate URL for the image. <b>Preview</b> the collection: you should see the new image at the top left of the page.</Text>
     958</NumberedItem>
    964959</Content>
    965960</Tutorial>
     
    969964</Title>
    970965<Prerequisite id="word_pdf_collection"/>
     966<Version initial="2.70w" current="2.71"/>
    971967<Content>
    972968<Comment>
     
    974970</Comment>
    975971<NumberedItem>
    976 <Text id="fw-2">Open the <b>reports</b> collection in the Librarian Interface and go to the <AutoText key="glidict::CDM.GUI.Formats"/> section of the <AutoText key="glidict::GUI.Design"/> panel.</Text>
     972<Text id="fw-2">Open the <b>reports</b> collection in the Librarian Interface and go to the <AutoText key="glidict::CDM.GUI.Formats"/> section of the <AutoText key="glidict::GUI.Format"/> panel.</Text>
    977973</NumberedItem>
    978974<Heading>
     
    982978<Text id="fw-3a">In this part of the exercise, we make the format statement simpler without changing the resulting display.</Text>
    983979<Text id="fw-3">Greenstone's default format statement is complex because it is designed to produce something reasonable under almost any conditions, and also because for practical reasons it needs to be backwards compatible with legacy collections. For this collection, we don't need all of the complexity.</Text>
     980<Text id="fw-3a">Make sure that the <AutoText text="VList"/> format statement is selected in the list of formats.</Text>
    984981<Text id="fw-4">The default <AutoText text="VList"/> format statement looks like the following:</Text>
    985982<Format>
     
    1001998{Or}{[dc.Title],[ex.Title],Untitled}[/highlight] {If}{[ex.Source],&lt;br&gt;&lt;i&gt;([ex.Source])&lt;/i&gt;}&lt;/td&gt;<br/>
    1002999</Format>
    1003 <Text id="fw-9a">Click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>.</Text>
    10041000<Text id="fw-10">Preview the collection to make sure the display hasn't changed. You shouldn't notice any difference when looking at search results, classifiers etc. </Text>
    10051001</NumberedItem>
     
    10101006<Text id="fw-11">For collections with documents that undergo a conversion process during importing (e.g. Word, PDF, PowerPoint documents, but not text, HTML documents), the original file is stored in the collection along with the converted version. The default <AutoText text="VList"/> format statement links to both versions:</Text>
    10111007<Text id="fw-12"><Format>[link][icon][/link]</Format> links to the Greenstone HTML version, while <Format>[srclink][srcicon][/srclink]</Format> links to the original.</Text>
    1012 <Text id="fw-13">Choose <AutoText text="SearchVList"/> in <AutoText key="glidict::CDM.GUI.Formats"/> by selecting <AutoText text="Search"/> from the <AutoText key="glidict::CDM.FormatManager.Feature"/> drop down list, and <AutoText text="VList"/> from the <AutoText key="glidict::CDM.FormatManager.Part"/> list. Click <AutoText key="glidict::CDM.FormatManager.Add" type="button"/> to add the <AutoText text="SearchVList"/> format statement into the list of assigned formats. Experiment with removing either of the two links from the format statement. (Remember to click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/> after any changes.)</Text>
     1008<Text id="fw-13">Choose <AutoText text="SearchVList"/> in <AutoText key="glidict::CDM.GUI.Formats"/> by selecting <AutoText text="Search"/> from the <AutoText key="glidict::CDM.FormatManager.Feature"/> drop down list, and <AutoText text="VList"/> from the <AutoText key="glidict::CDM.FormatManager.Part"/> list. Click <AutoText key="glidict::CDM.FormatManager.Add" type="button"/> to add the <AutoText text="SearchVList"/> format statement into the list of assigned formats. Experiment with removing either of the two links from the format statement.</Text>
    10131009<Text id="fw-13a">To see the results of your changes, preview the collection and do a search. You are making changes to <AutoText text="SearchVList"/>, which means the changes will only apply to search results.</Text>
    10141010<Text id="fw-13b">Storing and displaying the original allows users to see the correct format, but requires the user to have the relevant program installed. It also increases the size of the collection. The Greenstone version can be viewed in a browser, but may not look as nice.</Text>
     
    10191015<NumberedItem>
    10201016<Text id="fw-14">Next, we'll customize the format for the <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list. Classifier bookshelves have only a few pieces of metadata to display: <Format>[ex.Title]</Format> and <Format>[numleafdocs]</Format>. Whatever metadata the classifier has been built on, the bookshelf label is always stored as <Format>[ex.Title]</Format>. This is why a Creator is printed out for each bookshelf even though <Format>[dc.Creator]</Format> is not specified in the format statement. <Format>[numleafdocs]</Format> is only defined for bookshelves, so this metadata can be used in an <Format>{If}</Format> statement to make bookshelves and documents display differently in the list.</Text>
    1021 <Text id="fw-15">Make each bookshelf in the Creator classifier show how many entries it contains. In the <AutoText key="glidict::CDM.GUI.Formats"/> section of the <AutoText key="glidict::GUI.Design"/> panel, select the  <AutoText text="CL2 AZCompactList"/> classifier which is based on <AutoText key="metadata::dc.Creator"/> metadata from the <AutoText key="glidict::CDM.FormatManager.Feature"/> drop down list, and <AutoText text="VList"/> from the <AutoText key="glidict::CDM.FormatManager.Part"/> list. Click the <AutoText key="glidict::CDM.FormatManager.Add" type="button"/> button to add this format into the list of assigned formats. Note that it gets added as <AutoText text="CL2VList"/> in this list: its the <AutoText text="VList"/> format for the second (<AutoText text="CL2"/>) classifier.</Text>
    1022 <Text id="fw15a">Append the following text and click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>:</Text>
     1017<Text id="fw-15">Make each bookshelf in the Creator classifier show how many entries it contains. In the <AutoText key="glidict::CDM.GUI.Formats"/> section of the <AutoText key="glidict::GUI.Format"/> panel, select the <AutoText text="CL2 AZCompactList"/> classifier which is based on <AutoText key="metadata::dc.Creator"/> metadata from the <AutoText key="glidict::CDM.FormatManager.Feature"/> drop down list, and <AutoText text="VList"/> from the <AutoText key="glidict::CDM.FormatManager.Part"/> list. Click the <AutoText key="glidict::CDM.FormatManager.Add" type="button"/> button to add this format into the list of assigned formats. Note that it gets added as <AutoText text="CL2VList"/> in this list: it is the <AutoText text="VList"/> format for the second (<AutoText text="CL2"/>) classifier.</Text>
     1018<Text id="fw15a">Append the following text to the bottom of the format statement:</Text>
    10231019<Format>
    10241020{If}{[numleafdocs],&lt;td&gt;&lt;i&gt;([numleafdocs])&lt;/i&gt;&lt;/td&gt;}
    10251021</Format>
    1026 <Text id="fw-16">Click <AutoText key="glidict::CDM.FormatManager.Add" type="button"/>, switch to the <AutoText key="glidict::GUI.Create"/> panel, and click <AutoText key="glidict::CreatePane.Preview_Collection" type="button"/> (no need to rebuild). Click on the <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list and notice that the bookshelves now display how many documents they contain.</Text>
     1022<Text id="fw-16"><b>Preview</b> the collection. Click on the <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list and notice that the bookshelves now display how many documents they contain.</Text>
    10271023<Text id="fw-17">This revised format statement has the effect of specifying in brackets how many items are contained within a bookshelf.  Since only bookshelves define <Format>[numleafdocs]</Format>, only they will display this. By modifying <AutoText text="CL2VList"/> instead of <AutoText text="VList"/>, the change will only apply to the second classifier (Creators).</Text>
    10281024</NumberedItem>
     
    10311027</Heading>
    10321028<NumberedItem>
    1033 <Text id="fw-18">Next we modify the document entries in the Creator classifier to display all authors. Back in <AutoText key="glidict::CDM.GUI.Formats"/>, select the <AutoText text="CL2VList"/> format in the list of assigned formats. After <Format>{If}{[ex.Source],&lt;br&gt;</Format> in the format statement, add <Format>[sibling:dc.Creator]</Format>. Click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>.</Text>
     1029<Text id="fw-18">Next we modify the document entries in the Creator classifier to display all authors. Back in <AutoText key="glidict::CDM.GUI.Formats"/>, select the <AutoText text="CL2VList"/> format in the list of assigned formats. After <Format>{If}{[ex.Source],&lt;br&gt;</Format> in the format statement, add <Format>[sibling:dc.Creator]</Format>.</Text>
    10341030<Text id="fw-19"><Format>[ex.Source]</Format> is not defined for bookshelves, so can also be used to differentiate bookshelves and documents.</Text>
    10351031<Text id="fw-20">The resulting format statement looks like:</Text>
     
    10391035&lt;td valign=top&gt;[highlight]<br/>
    10401036{Or}{[dc.Title],[ex.Title],Untitled}[/highlight]<br/>
    1041 {If}{[ex.Source],&lt;br&gt;[sibling:dc.Creator] <br/>
     1037{If}{[ex.Source],&lt;br&gt;<highlight>[sibling:dc.Creator]</highlight><br/>
    10421038&lt;i&gt;([ex.Source])&lt;/i&gt;}&lt;/td&gt;<br/>
    10431039{If}{[numleafdocs],&lt;td&gt;&lt;i&gt;([numleafdocs])&lt;/i&gt;&lt;/td&gt;}
    10441040</Format>
    1045 <Text id="fw-21">This will display the Greenstone link, the link to the original, then the Title. For bookshelves, it will also display how many documents the bookshelf contains. For documents, it will display all the Authors (Creators), and the source document. <Format>[sibling:dc.Creator]</Format> displays all the Creator metadata for the document, separated by a space (<AutoText text=" " type="quoted"/>). Preview the <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list and make sure that all authors are displayed for documents.</Text>  </NumberedItem>
    1046 <NumberedItem>
    1047 <Text id="fw-22">You can change the separator between the authors. Modify the format statement, and replace <Format>[sibling:dc.Creator]</Format> with <Format>[sibling(All'&lt;br/&gt;'):dc.Creator]</Format>. This will add a new line after each author (<Format>&lt;br/&gt;</Format> specifies a line break in HTML). Don't forget to click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>. Preview the <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list.</Text>
     1041<Text id="fw-21">This will display the Greenstone link, the link to the original, then the Title. For bookshelves, it will also display how many documents the bookshelf contains. For documents, it will display all the Authors (Creators), and the source document. <Format>[sibling:dc.Creator]</Format> displays all the Creator metadata for the document, separated by a space (<AutoText text=" " type="quoted"/>), while <Format>[dc.Creator]</Format> displays only the first author. Preview the <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list and make sure that all authors are displayed for documents.</Text> 
     1042</NumberedItem>
     1043<NumberedItem>
     1044<Text id="fw-22">You can change the separator between the authors. Modify the format statement, and replace <Format>[sibling:dc.Creator]</Format> with <Format>[sibling(All'&lt;br/&gt;'):dc.Creator]</Format>. This will add a new line after each author (<Format>&lt;br/&gt;</Format> specifies a line break in HTML). Preview the <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list.</Text>
    10481045<Text id="fw-23">If you have done exercise <TutorialRef id="enhanced_word"/>, the collection will have both dc.Creator and ex.Creator metadata. To display both, you can use </Text>
    10491046<Format>
    10501047[sibling:dc.Creator] [sibling:ex.Creator]
    10511048</Format>
    1052 <Text id="fw-23a">To display dc.Creator if its present, otherwise display ex.Creator, use</Text>
     1049<Text id="fw-23a">To display dc.Creator if it is present, otherwise display ex.Creator, use</Text>
    10531050<Format>
    10541051{Or}{[sibling:dc.Creator],[sibling:ex.Creator]}
     
    10621059</Title>
    10631060<SampleFiles folder="Word_and_PDF"/>
    1064 <Version initial="2.70" current="2.70w"/>
     1061<Version initial="2.70" current="2.71"/>
    10651062<Content>
    10661063<Text id="ep-2">Greenstone converts PDF files to HTML using third-party software: <AutoText text="pdftohtml.pl" type="italics"/>. This lets users view these documents even if they don't have the PDF software installed. Unfortunately, sometimes the formatting of the resulting HTML files is not so good.</Text>
     
    10691066<Text id="ep-3a">In the Librarian Interface, start a new collection called "PDF collection" and base it on <AutoText key="glidict::NewCollectionPrompt.NewCollection"/>.</Text>
    10701067<Text id="ep-3b">In the <AutoText key="glidict::GUI.Gather"/> panel, drag just the PDF documents from <Path>sample_files &rarr; Word_and_PDF &rarr; Documents</Path> into the new collection. Also drag in the PDF documents from <Path>sample_files &rarr; Word_and_PDF &rarr; difficult_pdf</Path>.</Text>
    1071 <Text id="ep-3c">Go to the <AutoText key="glidict::GUI.Create"/> panel and build the collection. Examine the output from the build process. You will notice that one of the documents could not be processed. The following messages are shown: "The file pdf05-notext.pdf was recognised but could not be processed by any plugin.", and "15 documents were processed and included in the collection. 1 was rejected".</Text>
     1068<Text id="ep-3c">Go to the <AutoText key="glidict::GUI.Create"/> panel and build the collection. Examine the output from the build process. You will notice that one of the documents could not be processed. The following messages are shown: "The file pdf05-notext.pdf was recognised but could not be processed by any plugin.", and "5 documents were processed and included in the collection. 1 was rejected".</Text>
    10721069</NumberedItem>
    10731070<NumberedItem>
     
    10911088<NumberedItem>
    10921089<Text id="ep-6">In the <AutoText key="glidict::CDM.GUI.Plugins"/> section of the <AutoText key="glidict::GUI.Design"/> panel, configure <AutoText text="PDFPlug"/>. Switch on the <AutoText text="use_sections"/> option. </Text>
    1093 <Text id="ep-7"><b>Build</b> and <b>preview</b> the collection. View the text versions of some of the PDF documents. Note that these are now split into a series of pages, and a "go to page" box is provided. The format is still a bit ugly though.</Text>
     1090<Text id="ep-7"><b>Build</b> and <b>preview</b> the collection. View the text versions of some of the PDF documents. Note that these are now split into a series of pages, and a "go to page" box is provided. The format is still a bit ugly though, and pdf05-notext.pdf is still not processed.</Text>
    10941091</NumberedItem>
    10951092<Heading>
     
    10981095<NumberedItem>
    10991096<Text id="ep-12">If conversion to HTML doesn't produce the result you like, PDF documents can be converted to a series of images, one per page or slide. This requires ImageMagick and Ghostscript to be installed.</Text>
    1100 
    11011097</NumberedItem>
    11021098<NumberedItem>
     
    11041100</NumberedItem>
    11051101<NumberedItem>
    1106 <Text id="ep-14"><b>Build</b> the collection and <b>preview</b>. All PDF documents have been processed and divided into sections, but each section displays <AutoText key="perlmodules::BasPlug.dummy_text" type="quoted"/>. For the conversion to images for PDF documents, no text is extracted. </Text>
    1107 </NumberedItem>
    1108 <NumberedItem>
    1109 <Text id="ep-15">In order to view the documents properly, you will need to modify the format statement. In the <AutoText key="glidict::CDM.GUI.Formats"/> section on the <AutoText key="glidict::GUI.Design"/> panel, select the <AutoText text="DocumentText"/> format statement. Replace </Text>
     1102<Text id="ep-14"><b>Build</b> the collection and <b>preview</b>. All PDF documents (including pdf05-notext.pdf) have been processed and divided into sections, but each section displays <AutoText key="perlmodules::BasPlug.dummy_text" type="quoted"/>. For the conversion to images for PDF documents, no text is extracted. </Text>
     1103</NumberedItem>
     1104<NumberedItem>
     1105<Text id="ep-15">In order to view the documents properly, you will need to modify the format statement. In the <AutoText key="glidict::CDM.GUI.Formats"/> section on the <AutoText key="glidict::GUI.Format"/> panel, select the <AutoText text="DocumentText"/> format statement. Replace </Text>
    11101106<Format>
    11111107[Text]
     
    11331129</NumberedItem>
    11341130<NumberedItem>
    1135 <Text id="ep-21">We achieve this by adding two <AutoText text="PDFPlug"/> plugins to the collection, with different options. Currently, the Librarian Interface does not allow you to add the same plugin twice to the collection (with the exception of <AutoText text="UnknownPlug"/>). You will need to edit the collection configuration file by hand.</Text>
    1136 <Text id="ep-21a">Close the collection in the Librarian Interface. Then open <Path>Greenstone &rarr; collect &rarr; pdfcolle &rarr; etc &rarr; collect.cfg</Path> using a text editor, e.g. WordPad. In the list of plugins, add another <AutoText text="PDFPlug"/>, i.e.</Text>
    1137 <Format>
    1138 plugin PDFPlug
    1139 </Format>
    1140 <Text id="ep-22">Don't worry about the options here - we will add these using the Librarian Interface.</Text>
    1141 <Text id="ep-22a">Note that if you ever need to edit a collection's <Path>collect.cfg</Path> file by hand, you must close the collection in the Librarian Interface first, otherwise the next time it saves the file, it will overwrite your changes.</Text>
    1142 </NumberedItem>
    1143 <NumberedItem>
    1144 <Text id="ep-23">Open up the collection again in the Librarian Interface, and go to the  <AutoText key="glidict::GUI.Gather"/> panel. Make a new folder called <AutoText text="notext" type="quoted"/>: right click in the collection panel and select <AutoText key="glidict::CollectionPopupMenu.New_Folder"/> from the menu. Change the <AutoText key="glidict::NewFolderOrFilePrompt.Folder_Name"/> to <AutoText text="notext" type="quoted"/>, and click <AutoText key="glidict::General.OK" type="button"/>.</Text>
     1131<Text id="ep-21">We achieve this by putting the problem files into a separate folder, and adding two <AutoText text="PDFPlug"/> plugins to the collection, with different options.</Text>
     1132</NumberedItem>
     1133<NumberedItem>
     1134<Text id="ep-23">Go to the <AutoText key="glidict::GUI.Gather"/> panel. Make a new folder called <AutoText text="notext" type="quoted"/>: right click in the collection panel and select <AutoText key="glidict::CollectionPopupMenu.New_Folder"/> from the menu. Change the <AutoText key="glidict::NewFolderOrFilePrompt.Folder_Name"/> to <AutoText text="notext" type="quoted"/>, and click <AutoText key="glidict::General.OK" type="button"/>.</Text>
    11451135<Text id="ep-23a">Move the two PDF files that have problems with HTML conversion (<Path>pdf05-notext.pdf</Path> and <Path>pdf06-weirdchars.pdf</Path>) into this folder by drag and drop. We will set up the plugins so that PDF files in this <Path>notext</Path> folder are processed differently to the other PDF files.</Text>
    11461136</NumberedItem>
    11471137<NumberedItem>
    1148 <Text id="ep-24">Switch to the <AutoText key="glidict::CDM.GUI.Plugins"/> section of the <AutoText key="glidict::GUI.Design"/> panel. You will see that there are two PDFPlug plugins in the list. </Text>
    1149 </NumberedItem>
    1150 <NumberedItem>
    1151 <Text id="ep-25">Switch to <AutoText key="glidict::Preferences.Mode.Systems"/> mode, as you will need to use regular expressions in the options (<Menu><AutoText key="glidict::Menu.File"/> &rarr; <AutoText key="glidict::Menu.File_Options"/> &rarr; <AutoText key="glidict::Preferences.Mode"/></Menu>)</Text>
    1152 </NumberedItem>
    1153 <NumberedItem>
    1154 <Text id="ep-26">Configure the two <AutoText text="PDFPlug"/> plugins so that the options look like the following:</Text>
    1155 
     1138<Text id="ep-24">Switch to the <AutoText key="glidict::CDM.GUI.Plugins"/> section of the <AutoText key="glidict::GUI.Design"/> panel. Add a second PDF plugin by selecting <AutoText text="PDFPlug"/> from the <AutoText key="glidict::CDM.PlugInManager.PlugIn"/> drop-down list, and clicking <AutoText key="glidict::CDM.PlugInManager.Add" type="button"/>. This plugin will come after the first PDF plugin, so we configure it to process PDF documents as HTML. Set the <AutoText text="convert_to"/> option to <AutoText text="html"/>, and switch on the <AutoText text="use_sections"/> option. Click <AutoText key="glidict::General.OK" type="button"/>.</Text>
     1139</NumberedItem>
     1140<NumberedItem>
     1141<Text id="ep-25">Now switch to <AutoText key="glidict::Preferences.Mode.Systems"/> mode, as you will need to use regular expressions in the options for the first PDF plugin (<Menu><AutoText key="glidict::Menu.File"/> &rarr; <AutoText key="glidict::Menu.File_Options"/> &rarr; <AutoText key="glidict::Preferences.Mode"/></Menu>). Configure the first PDF plugin, and set the <AutoText text="process_exp"/> option to <AutoText text="'notext.*\.pdf'"/>.</Text>
     1142</NumberedItem>
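The effect of the `process_exp` value in step ep-25 can be checked outside Greenstone with a small sketch. This is only an illustration: it assumes the expression is applied as an unanchored regular expression to each file's path, which is how the tutorial's use of the `notext` folder suggests it behaves, and it uses Python's `re` module as a stand-in for Perl regular expressions. The `import/` prefix and the file `pdf01-hypothetical.pdf` are made up for the example.

```python
import re

# The value given to the first PDFPlug's process_exp option in the tutorial.
PROCESS_EXP = re.compile(r"notext.*\.pdf")


def first_plugin_claims(path: str) -> bool:
    """Return True if the path matches the expression anywhere (unanchored search).

    Assumption: Greenstone tests the expression against the file path in this way.
    """
    return PROCESS_EXP.search(path) is not None


# Hypothetical collection layout: two files moved into the "notext" folder,
# one ordinary PDF left at the top level.
paths = [
    "import/notext/pdf05-notext.pdf",
    "import/notext/pdf06-weirdchars.pdf",
    "import/pdf01-hypothetical.pdf",
]

for path in paths:
    target = "first PDFPlug (images)" if first_plugin_claims(path) else "second PDFPlug (HTML)"
    print(f"{path} -> {target}")
```

Under these assumptions, anything inside the `notext` folder is picked up by the first plugin and converted to images, while every other PDF falls through to the second plugin and is converted to HTML. Note that a file whose own name contains `notext` would also match, wherever it is stored.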
     1143<NumberedItem>
     1144<Text id="ep-26">The two PDF plugins should have options like the following:</Text>
    11561145<Format>
    11571146plugin PDFPlug -convert_to pagedimg_jpg -process_exp "notext.*\.pdf"<br/>
     
    11741163<Text id="ew-a">Enhanced Word document handling</Text>
    11751164</Title>
     1165<Version initial="2.70w" current="2.71"/>
    11761166<Content>
    11771167<Text id="ew-1">The standard way Greenstone processes Word documents is to convert them to HTML format using a third-party program, wvWare. This sometimes doesn't do a very good job of conversion. If you are using Windows, and have Microsoft Word installed, you can take advantage of Windows native scripting to do a better job of conversion. If the original document was hierarchically structured using Word styles, these can be used to structure the resulting HTML. Word document properties can also be extracted as metadata.</Text>
     
    12971287<Text id="0403">Exporting a collection to CD-ROM/DVD</Text>
    12981288</Title>
    1299 <Version initial="2.60" current="2.70w"/>
     1289<Version initial="2.60" current="2.71"/>
    13001290<Content>
    13011291<Comment>
    1302 <Text id="0404">To publish a collection on CD-ROM or DVD, Greenstone's Export to CD-ROM export module must be installed (see <TutorialRef id="install_greenstone"/>).</Text>
     1292<Text id="0404">To publish a collection on CD-ROM or DVD, Greenstone's Export to CD-ROM export module must be installed. This is included with CD-ROM distributions, and all distributions 2.70w and later. It must be installed separately for non-CD-ROM versions of Greenstone, version 2.70 and earlier (see <TutorialRef id="install_greenstone"/>).</Text>
    13031293</Comment>
    13041294<NumberedItem>
     
    13061296</NumberedItem>
    13071297<NumberedItem>
    1308 <Text id="0406">Choose <Menu><AutoText key="glidict::Menu.File"/> &rarr; <AutoText key="glidict::Menu.File_CDimage"/></Menu>. In the resulting popup window, select the collection or collections that you wish to export by ticking their check boxes. You can optionally enter a name for the CD-ROM: this is the name that will appear in the menu when the CDROM is run. If a name is not entered, the default <AutoText text="Greenstone Collections"/> will be used. Click <AutoText key="glidict::WriteCDImagePrompt.Export" type="button"/>.</Text>
     1298<Text id="0406">Choose <Menu><AutoText key="glidict::Menu.File"/> &rarr; <AutoText key="glidict::Menu.File_CDimage"/></Menu>. In the resulting popup window, select the collection or collections that you wish to export by ticking their check boxes. You can optionally enter a name for the CD-ROM: this is the name that will appear in the menu when the CD-ROM is run. If a name is not entered, the default <AutoText text="Greenstone Collections"/> will be used. You can also specify whether or not the resulting CD-ROM will install files onto the host machine when used. Click <AutoText key="glidict::WriteCDImagePrompt.Export" type="button"/> to start the export process.</Text>
    13091299<Text id="0408">The necessary files for export are written to:</Text>
    13101300<Path>Greenstone &rarr; tmp &rarr; exported_xxx</Path>
     
    13221312</Title>
    13231313<SampleFiles folder="tudor"/>
    1324 <Version initial="2.60" current="2.70w"/>
     1314<Version initial="2.60" current="2.71"/>
    13251315<Content>
    13261316<NumberedItem>
    1327 <Text id="0388">Invoke the Greenstone Librarian Interface (from the Windows <i>Start</i> menu) and start a new collection called <b>tudor</b> (use the <AutoText key="glidict::Menu.File"/> menu). Fill out the pop-up dialog with appropriate values and leave <b>Dublin Core</b>, which is selected by default, as the metadata set.</Text>
     1317<Text id="0388">Invoke the Greenstone Librarian Interface (from the Windows <i>Start</i> menu) and start a new collection called <b>tudor</b> (use the <AutoText key="glidict::Menu.File"/> menu), based on the default <AutoText key="glidict::NewCollectionPrompt.NewCollection"/>.</Text>
    13281318</NumberedItem>
    13291319<NumberedItem>
     
    13561346</NumberedItem>
    13571347<NumberedItem>
    1358 <Text id="0393c">Switch to the <AutoText key="glidict::GUI.Create"/> panel and <b>rebuild</b> the collection. Go back to the <AutoText key="glidict::GUI.Enrich"/> panel and look at the extracted metadata for some of the HTML files in <Path>englishhistory.net &rarr; tudor &rarr; monarchs</Path>. The new metadata should new be visible.</Text>
     1348<Text id="0393c">Switch to the <AutoText key="glidict::GUI.Create"/> panel and <b>rebuild</b> the collection. Go back to the <AutoText key="glidict::GUI.Enrich"/> panel and look at the extracted metadata for some of the HTML files in <Path>englishhistory.net &rarr; tudor &rarr; monarchs</Path>. The new metadata should now be visible.</Text>
    13591349</NumberedItem>
    13601350<Heading>
     
    13871377</Content>
    13881378</Tutorial>
     1379<!-- ** here -->
    13891380<Tutorial id="enhanced_html_collection">
    13901381<Title>
     
    15991590<Text id="st-1">Section tagging for HTML documents</Text>
    16001591</Title>
     1592<Version initial="2.70w" current="2.71"/>
    16011593<Content>
    16021594<NumberedItem>
     
    18771869<Text id="is-1">CDS/ISIS collection</Text>
    18781870</Title>
     1871<Version initial="2.70w" current="2.71"/>
    18791872<Content>
    18801873<Comment>
     
    31923185<Text id="gems-1">Editing metadata sets</Text>
    31933186</Title>
     3187<Version initial="2.70w" current="2.71"/>
    31943188<Content>
    31953189<Text id="gems-2">GEMS (Greenstone Editor for Metadata Sets) can be used to modify existing metadata sets or create new ones.</Text>