Changeset 11968


Timestamp:
2006-06-27T16:25:07+12:00
Author:
kjdon
Message:

updated for 2.70w, modifications after hawaii workshop

File:
1 edited

  • trunk/gsdl-documentation/tutorials/xml-source/tutorial_en.xml

    r11897 r11968  
    4444</NumberedItem>
    4545<NumberedItem>
    46 <Text id="0089">The InstallShield Wizard begins to install the UNAIDS pre-packaged collection. Select the English language.</Text>
    47 </NumberedItem>
    48 <NumberedItem>
    49 <Text id="0090">Click the <b>&lt;next&gt;</b> button.</Text>
    50 </NumberedItem>
    51 <NumberedItem>
    52 <Text id="0091">Choose <b>Run from CD-ROM (standard) </b>as the setup type. This is the default and is already selected. Then click <b>&lt;next&gt;</b>.</Text>
    53 </NumberedItem>
    54 <NumberedItem>
    55 <Text id="0092">Click <b>&lt;next&gt; </b>again to install the UNAIDS collection in the default folder, which is <b>C:\Program Files\UNAIDS Library 2.0 [CD-ROM]</b>.</Text>
     46<Text id="0089">The InstallShield Wizard begins to install the UNAIDS pre-packaged collection. Select the English language and click <b>&lt;OK&gt;</b>.</Text>
     47</NumberedItem>
     48<NumberedItem>
     49<Text id="0090">On the welcome screen, click the <b>&lt;Next&gt;</b> button.</Text>
     50</NumberedItem>
     51<NumberedItem>
     52<Text id="0091">Choose <b>Run from CD-ROM (standard)</b> as the setup type. This is the default and is already selected. Then click <b>&lt;Next&gt;</b>.</Text>
     53</NumberedItem>
     54<NumberedItem>
     55<Text id="0092">Click <b>&lt;Next&gt;</b> again to install the UNAIDS collection in the default folder, which is <b>C:\Program Files\UNAIDS Library 2.0 [CD-ROM]</b>.</Text>
    5656<Comment>
    5757<Text id="0093">Installation Wizard copies the required files from CD-ROM to disk</Text>
     
    5959</NumberedItem>
    6060<NumberedItem>
    61 <Text id="0094">Click <b>&lt;OK</b>&gt; to confirm completion of UNAIDS collection (twice).</Text>
     61<Text id="0094">Click <b>&lt;OK&gt;</b> to confirm completion of UNAIDS collection (twice).</Text>
    6262<Comment>
    6363<Text id="0095">InstallShield quits&mdash;the UNAIDS Library is installed.</Text>
     
    373373<Text id="0193">Installing Greenstone</Text>
    374374</Title>
    375 <Version initial="2.60" current="2.70"/>
     375<Version initial="2.60" current="2.70w"/>
    376376<Content>
    377377<Heading>
     
    484484</Title>
    485485<Prerequisite id="install_greenstone"/>
    486 <Version initial="2.60" current="2.70"/>
     486<Version initial="2.60" current="2.70w"/>
    487487<Content>
    488488<Comment>
     
    589589</Title>
    590590<SampleFiles folder="hobbits"/>
    591 <Version initial="2.60" current="2.70"/>
     591<Version initial="2.60" current="2.70w"/>
    592592<Content>
    593593<Comment>
     
    678678</Content>
    679679</Tutorial>
    680 <Tutorial id="large_html_collection">
     680<Tutorial id="simple_image_collection">
    681681<Title>
    682 <Text id="0387">A large collection of HTML files&mdash;Tudor</Text>
     682<Text id="0337">A simple image collection</Text>
    683683</Title>
    684 <SampleFiles folder="tudor"/>
    685 <Version initial="2.60" current="2.70"/>
     684<SampleFiles folder="images"/>
     685<Version initial="2.60" current="2.70w"/>
    686686<Content>
    687687<NumberedItem>
    688 <Text id="0388">Invoke the Greenstone Librarian Interface (from the Windows <i>Start</i> menu) and start a new collection called <b>tudor</b> (use the <AutoText key="glidict::Menu.File"/> menu). Fill out the pop-up dialog with appropriate values and leave <b>Dublin Core</b>, which is selected by default, as the metadata set.</Text>
    689 </NumberedItem>
    690 <NumberedItem>
    691 <Text id="0389">In the <AutoText key="glidict::GUI.Gather"/> panel, open the <Path>tudor</Path> folder in <Path>sample_files</Path>.</Text>
    692 </NumberedItem>
    693 <NumberedItem>
    694 <Text id="0390">Drag <Path>englishhistory.net</Path> from the left-hand side to the right to include it in your <b>tudor</b> collection.</Text>
    695 </NumberedItem>
    696 <NumberedItem>
    697 <Text id="0391">Switch to the <AutoText key="glidict::GUI.Create"/> panel and click <AutoText key="glidict::CreatePane.Build_Collection" type="button"/>.</Text>
    698 </NumberedItem>
    699 <NumberedItem>
    700 <Text id="0392">When building has finished, <b>preview</b> the collection.</Text>
    701 </NumberedItem>
    702 <Heading>
    703 <Text id="0392a">Extracting more metadata from the HTML</Text>
    704 </Heading>
    705 <NumberedItem>
    706 <Text id="0393">The browsing facilities in this collection (<AutoText key="coredm::_Global:labelTitle_" type="italics"/> and <AutoText key="coredm::_Global:labelSource_" type="italics"/>) are based entirely on extracted metadata. Return to the <AutoText key="glidict::GUI.Enrich"/> panel in the Librarian Interface and examine the metadata that has been extracted for some of the files.</Text>
    707 </NumberedItem>
    708 <NumberedItem>
    709 <Text id="0393a">Many HTML documents contain metadata in <Format>&lt;meta&gt;</Format> tags in the <Format>&lt;head&gt;</Format> of the page. Open up the <Path>englishhistory.net &rarr; tudor &rarr; monarchs &rarr; boleyn.html</Path> file by navigating to it in the tree on the left hand side, and double clicking it. This will open it in a web browser. View the HTML source of the page (<Menu>View &rarr; Source</Menu> in Internet Explorer, <Menu>View &rarr; Page Source</Menu> in Mozilla). You will notice that this page has <AutoText text="page_topic,content" type="italics"/> and <AutoText text="author" type="italics"/> metadata.</Text>
    710  </NumberedItem>
    711 <NumberedItem>
    712 <Text id="0393b">By default, <AutoText text="HTMLPlug"/> only looks for Title metadata. Configure the plugin so that it looks for the other metadata too. Switch to the <AutoText key="glidict::GUI.Design"/> panel and select the <AutoText key="glidict::CDM.GUI.Plugins"/> section. Select the <AutoText text="plugin HTMLPlug"/> line and click <AutoText key="glidict::CDM.PlugInManager.Configure" type="button"/>. A popup window appears. Switch on the <AutoText text="metadata_fields"/> option, and set the value to <AutoText text="Title,Author,Page_topic,Content" type="quoted"/>. Click <AutoText key="glidict::General.OK" type="button"/>.</Text>
    713  </NumberedItem>
    714 <NumberedItem>
    715 <Text id="0393c">Switch to the <AutoText key="glidict::GUI.Create"/> panel and <b>rebuild</b> the collection. Go back to the <AutoText key="glidict::GUI.Enrich"/> panel and look at the extracted metadata for some of the HTML files in <Path>englishhistory.net &rarr; tudor &rarr; monarchs</Path>. The new metadata should new be visible.</Text>
    716 </NumberedItem>
    717 <Heading>
    718 <Text id="0393d">Blocking the stray images</Text>
    719 </Heading>
    720 <Comment>
    721 <Text id="0394">You've probably noticed that the collection contains a few stray image files, as well as the HTML documents. This is a mistake. The issue is that many of the HTML documents include images, and although Greenstone attempts to determine which images belong to HTML pages and only considers other images for inclusion in the collection, in this case it hasn't been completely successful. (This is because the web site from which these files were downloaded occasionally departs from the usual convention of hierarchical structuring.)</Text>
    722 </Comment>
    723 <NumberedItem>
    724 <Text id="0395">Switch back to the <AutoText key="glidict::CDM.GUI.Plugins"/> section of the <AutoText key="glidict::GUI.Design"/> panel. Beside <AutoText text="plugin HTMLPlug"/> you will see <AutoText text="-smart_block"/>. This is the option that attempts to identify images in the HTML pages and block them from inclusion&mdash;in this case, it's not smart enough! <b>Configure</b> <AutoText text="plugin HTMLPlug"/> again, scroll down the page to locate the <AutoText text="smart_block"/> option, and switch it off.</Text>
    725 </NumberedItem>
    726 <NumberedItem>
    727 <Text id="0396"><b>Rebuild</b> and <b>preview</b> the collection. The collection is exactly as before except that these stray images are suppressed. What is happening is that plug-ins operate as a pipeline: files are passed to each one in turn until one is found that can process it. By default (i.e. without <AutoText text="smart_block"/>) the HTML plug-in blocks <i>all</i> images, which is appropriate for this collection.</Text>
    728 </NumberedItem>
    729 <Heading>
    730 <Text id="0397">Looking at different views of the files in the <AutoText key="glidict::GUI.Gather"/> and <AutoText key="glidict::GUI.Enrich"/> panels</Text>
    731 </Heading>
    732 <NumberedItem>
    733 <Text id="0398">Switch to the <AutoText key="glidict::GUI.Gather"/> panel and in the right-hand side open <Path>englishhistory.net &rarr; tudor</Path>.</Text>
    734 </NumberedItem>
    735 <NumberedItem>
    736 <Text id="0400">Change the <AutoText key="glidict::Filter.Filter_Tree"/> menu for the right-hand side from <AutoText key="glidict::Filter.All_Files"/> to <AutoText key="glidict::Filter.0"/>. Notice the files displayed above are filtered accordingly, to show only files of this type.</Text>
    737 </NumberedItem>
    738 <NumberedItem>
    739 <Text id="0401">Change the <AutoText key="glidict::Filter.Filter_Tree"/> menu to <AutoText key="glidict::Filter.3"/>. Again, the files shown above alter.</Text>
    740 </NumberedItem>
    741 <NumberedItem>
    742 <Text id="0402">Now return the <AutoText key="glidict::Filter.Filter_Tree"/> setting back to <AutoText key="glidict::Filter.All_Files"/>, otherwise you may get confused later. Remember, if the <AutoText key="glidict::GUI.Gather"/> or <AutoText key="glidict::GUI.Enrich"/> panels do not seem to be showing all your files, this could be the problem.</Text>
     688<Text id="0338">In the Librarian Interface, start a new collection (<Menu><AutoText key="glidict::Menu.File"/> &rarr; <AutoText key="glidict::Menu.File_New"/></Menu>) called <b>backdrop</b>. Fill out the fields with appropriate information. For <AutoText key="glidict::NewCollectionPrompt.Base_Collection"/>, select the item <b>Simple image collection (image-e)</b> from the pull-down menu.</Text>
     689<Comment>
     690<Text id="0340a">When you base a collection on an existing one, it inherits all the settings of the old one. You won't be asked to choose a metadata set because the new collection inherits the ones (if any) used by the seed collection.</Text>
     691</Comment>
     692</NumberedItem>
     693<NumberedItem>
     694<Text id="0341">Copy the images provided in <Path>sample_files &rarr; images</Path> into your newly-formed collection.</Text>
     695</NumberedItem>
     696<NumberedItem>
     697<Text id="0342">Change to the <AutoText key="glidict::GUI.Create"/> panel and <b>build</b> the collection.</Text>
     698</NumberedItem>
     699<NumberedItem>
     700<Text id="0343"><b>Preview</b> the result.</Text>
     701</NumberedItem>
     702<NumberedItem>
     703<Text id="0344">Click on <AutoText key="coredm::_Global:labelBrwse_"/> in the navigation bar to view a list of the photos ordered by filename and presented as a thumbnail accompanied by some basic data about the image. The structure of this collection is the same as <b>Simple image collection (image-e)</b>, but the content is different.</Text>
     704</NumberedItem>
     705<NumberedItem>
     706<Text id="0345">Back in the Librarian Interface, change to the <AutoText key="glidict::GUI.Enrich"/> panel and view the extracted metadata for <Path>Bear.jpg</Path>.</Text>
     707</NumberedItem>
     708<Heading>
     709<Text id="0347">Adding a metadata set to the collection</Text>
     710</Heading>
     711<Comment>
     712<Text id="0346">We now add our own metadata and use it to give users a new way to browse the collection. We use the Dublin Core metadata set.</Text>
     713</Comment>
     714<NumberedItem>
     715<Text id="0348">The collection (image-e) on which <b>backdrop</b> is based uses only extracted metadata. To add another metadata set, go to the <AutoText key="glidict::GUI.Design"/> panel of the Librarian Interface and click <AutoText key="glidict::CDM.GUI.MetadataSets"/> in the list on the left (the last one). Then click  <AutoText key="glidict::CDM.MetadataSetManager.Add" type="button"/> (lower left button).</Text>
     716</NumberedItem>
     717<NumberedItem>
     718<Text id="0349">In the window that pops up, select <AutoText text="dublin.mds"/> and click <AutoText key="glidict::CDM.MetadataSetManager.Chooser.Add" type="button"/>.</Text>
     719</NumberedItem>
     720<NumberedItem>
     721<Text id="0351">Now switch to the <AutoText key="glidict::GUI.Enrich"/> panel by clicking this tab. The metadata for each file now shows the (empty) Dublin Core <AutoText text="dc."/> fields as well as the extracted <AutoText text="ex."/> fields.</Text>
     722</NumberedItem>
     723<Heading>
     724<Text id="0350a">Adding Title and Description metadata</Text>
     725</Heading>
     726<NumberedItem>
     727<Text id="0352">We work with just the first three files (<Path>Bear.jpg</Path>, <Path>Cat.jpg</Path> and <Path>Cheetah.jpg</Path>) to get a flavour of what is possible. First, set each file's <AutoText key="metadata::dc.Title"/> field to be the same as its filename but without the filename extension:</Text>
     728<Text id="0353">Click on <Path>Bear.jpg</Path> so its metadata fields are available, then click on its <AutoText key="metadata::dc.Title"/> field on the right-hand side. Type in <b>Bear</b>.</Text>
     729<Text id="0355">Repeat the process for <Path>Cat.jpg</Path> and <Path>Cheetah.jpg</Path>.</Text>
     730</NumberedItem>
     731<NumberedItem>
     732<Text id="0355a">Add a description for each image as <AutoText key="metadata::dc.Description"/> metadata.</Text>
     733<Text id="0372">What description should you enter? To remind yourself of a file's content, the Librarian Interface lets you open files by double-clicking them. It launches the appropriate application based on the filename extension: Word for .doc files, Acrobat for .pdf files, and so on.</Text>
     734<Text id="0372a">Double-click <Path>Bear.jpg</Path>: on Windows, the image will normally be displayed by Microsoft's Photo Editor (although this depends on how your computer has been set up).</Text>
     735<Text id="0373">Back in the <AutoText key="glidict::GUI.Enrich"/> pane, make sure that <Path>Bear.jpg</Path> is selected in the collection tree on the left hand side. Enter the text <b>Bear in the Rocky Mountains</b> as the value for the <AutoText key="metadata::dc.Description"/> field.</Text>
     736<Text id="0374">Repeat this process for <Path>Cat.jpg</Path> and <Path>Cheetah.jpg</Path>, adding a suitable description for each.</Text>
     737</NumberedItem>
     738<Heading>
     739<Text id="0357">Change Format Features to display new metadata</Text>
     740</Heading>
     741<NumberedItem>
     742<Text id="0356">Now we customize the collection's appearance. Building or previewing the collection at this point won't reveal anything new. That's because we haven't changed the design of the collection to take advantage of the new metadata.</Text>
     743</NumberedItem>
     744<NumberedItem>
     745<Text id="0358">Go to the <AutoText key="glidict::GUI.Design"/> panel and select <AutoText key="glidict::CDM.GUI.Formats"/> from the left-hand list. Leave the feature selection controls at their default values, so that <AutoText key="glidict::CDM.FormatManager.Feature"/> remains blank and <AutoText text="VList" /> is selected as the <AutoText key="glidict::CDM.FormatManager.Part"/>. In the <AutoText key="glidict::CDM.FormatManager.Editor"/>, edit the text as follows:</Text>
     746<BulletList>
     747<Bullet>
     748<Text id="0359">Change <Format>_ImageName_:</Format> to <Format>Title:</Format></Text>
     749</Bullet>
     750<Bullet>
     751<Text id="0359a">Change <Format>[Image]</Format> to <Format>[dc.Title]</Format></Text>
     752</Bullet>
     753<Bullet>
     754<Text id="0359b">After <Format>[dc.Title]&lt;br&gt;</Format> add <Format>Description: [dc.Description]&lt;br&gt;</Format></Text>
     755</Bullet>
     756</BulletList>
     757<Comment>
     758<Text id="0360">Metadata names are case-sensitive in Greenstone: it is important that you capitalize "Title" and "Description" (and don't capitalize "dc").</Text>
     759</Comment>
     760</NumberedItem>
     761<NumberedItem>
     762<Text id="0361a">Next click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>. The new format statement will be displayed in the list of assigned format statements. The first substitution alters the fragment of text that appears to the right of the thumbnail image, the second alters the item of metadata that follows it. The addition displays the description after the Title.</Text>
     763</NumberedItem>
     764<NumberedItem>
     765<Text id="0362a">Go to the <AutoText key="glidict::GUI.Create"/> panel and click <AutoText key="glidict::CreatePane.Build_Collection" type="button"/>. Once it has finished building, <b>preview</b> the collection. When you click on <AutoText key="coredm::_Global:labelBrwse_"/> in the navigation bar the presentation has changed to "Title: Bear" and so on. Each image's description should appear beside the thumbnail, following the title.</Text>
     766</NumberedItem>
     767<Comment>
     768<Text id="0363">After the first three items, the Title and Description become blank because we have only assigned Dublin Core metadata to these first three. To get a full listing, enter all the metadata.</Text>
     769</Comment>
     770<Comment>
     771<Text id="0364">For some design parameters the collection must be rebuilt before the effect of changes can be seen. However, changes to format statements take place immediately and you can see the result straightaway by clicking <b>reload</b> (or <b>refresh</b>) in the web browser. Above, you were asked to build before previewing because you had added metadata.</Text>
     772</Comment>
     773<Heading>
     774<Text id="0365">Changing the size of image thumbnails</Text>
     775</Heading>
     776<NumberedItem>
     777<Text id="0366">Let's change the size of the thumbnail image and make it smaller. Thumbnail images are created by the <AutoText text="ImagePlug"/> plug-in, so we need to access its configuration settings. To do this, switch to the <AutoText key="glidict::GUI.Design"/> panel and select <AutoText key="glidict::CDM.GUI.Plugins"/> from the list on the left. Double-click <AutoText text="plugin ImagePlug"/> to pop up a window that shows its settings. (Alternatively, select <AutoText text="plugin ImagePlug"/> with a single click and then click <AutoText key="glidict::CDM.PlugInManager.Configure" type="button"/> further down the screen.) Currently all options are off, so standard defaults are used. Select <AutoText text="thumbnailsize"/>, set it to <AutoText text="50"/>, and click <AutoText key="glidict::General.OK" type="button"/>.</Text>
     778</NumberedItem>
     779<NumberedItem>
     780<Text id="0367"><b>Build</b> and <b>preview</b> the collection.</Text>
     781</NumberedItem>
     782<NumberedItem>
     783<Text id="0368">Once you have seen the result of the change, return to the <AutoText key="glidict::GUI.Design"/> panel, select the configuration options for <AutoText text="ImagePlug"/>, and switch the <AutoText text="thumbnailsize"/> option off so that the thumbnail reverts to its normal size when the collection is re-built.</Text>
     784</NumberedItem>
     785<Heading>
     786<Text id="0380">Adding a browsing classifier based on Description metadata</Text>
     787</Heading>
     788<NumberedItem>
     789<Text id="0381">Now we'll add a new browsing option based on the descriptions. In the <AutoText key="glidict::GUI.Design"/> panel, select <AutoText key="glidict::CDM.GUI.Classifiers"/> from the left-hand list. Set the menu item for <AutoText key="glidict::CDM.ClassifierManager.Classifier"/> to <AutoText text="AZList" />; then click <AutoText key="glidict::CDM.ClassifierManager.Add" type="button"/>.</Text>
     790</NumberedItem>
     791<NumberedItem>
     792<Text id="0382">A window pops up to control the classifier's options. Set the <AutoText text="metadata"/> option to <AutoText key="metadata::dc.Description"/> and click <AutoText key="glidict::General.OK" type="button"/>.</Text>
     793</NumberedItem>
     794<NumberedItem>
     795<Text id="0382a"><b>Build</b> the collection, and <b>preview</b> it. Choose the new <b>descriptions</b> link that appears in the navigation bar.</Text>
     796</NumberedItem>
     797<Comment>
     798<Text id="0383">Only three items are shown, because only items with the relevant metadata (dc.Description in this case) appear in the list. The original browse list includes all photos in the collection because it is based on <AutoText key="metadata::ex.Image"/>, extracted metadata that reflects an image's filename, which is set for all images in the collection.</Text>
     799</Comment>
     800<Heading>
     801<Text id="0384">Creating a searchable index based on Description metadata</Text>
     802</Heading>
     803<NumberedItem>
     804<Text id="0385">Now we'll add an index so that the collection can be searched by descriptions. Switch to the <AutoText key="glidict::GUI.Design"/> panel and select <AutoText key="glidict::CDM.GUI.Indexes"/> from the left-hand list. Enter the text "descriptions" as the <AutoText key="glidict::CDM.IndexManager.Index_Name"/>, select <AutoText key="metadata::dc.Description"/> from the <AutoText key="glidict::CDM.IndexManager.Source"/> list, and click <AutoText key="glidict::CDM.IndexManager.Add_Index" type="button"/>.</Text>
     805</NumberedItem>
     806<NumberedItem>
     807<Text id="0386">Switch to the <AutoText key="glidict::GUI.Create"/> panel, <b>build</b> the collection, then <b>preview</b> it. There is now a <AutoText key="coredm::_Global:labelSearch_"/> button in the navigation bar. As an example, search for the term "bear" in the <i>descriptions</i> index (which is the only index at this point).</Text>
    743808</NumberedItem>
    744809</Content>
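The format statement edits in the image collection tutorial above can be pictured as simple placeholder substitution. The following is a minimal Python sketch under that assumption, not Greenstone's actual implementation: the helper name render_format and the file Dog.jpg are hypothetical, and only bracketed metadata names are handled. It shows why items without Dublin Core metadata produce blank Title and Description values, and why the capitalisation of metadata names matters.

```python
import re

# Hypothetical helper for illustration only; Greenstone's real format
# statement processing supports far more than this.
def render_format(statement, metadata):
    """Replace each [name] placeholder with its metadata value, or '' if unset."""
    return re.sub(
        r"\[([^\[\]]+)\]",
        lambda match: metadata.get(match.group(1), ""),
        statement,
    )

# The edited text from the tutorial, reduced to the metadata parts.
statement = "Title: [dc.Title]<br>Description: [dc.Description]<br>"

# Metadata assigned by hand in the Enrich panel.
bear = {"dc.Title": "Bear", "dc.Description": "Bear in the Rocky Mountains"}

# Hypothetical image that only has extracted (ex.) metadata.
untagged = {"ex.Image": "Dog.jpg"}

print(render_format(statement, bear))
print(render_format(statement, untagged))

# Lookups are case-sensitive, so "dc.title" finds nothing.
print(repr(render_format("[dc.title]", bear)))
```

The second call leaves both fields empty, which corresponds to the blank entries the tutorial describes after the first three images.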
     
    749814</Title>
    750815<SampleFiles folder="Word_and_PDF"/>
    751 <Version initial="2.60" current="2.70"/>
     816<Version initial="2.60" current="2.70w"/>
    752817<Content>
    753818<Comment>
     
    763828<Text id="0287">Switch to the <AutoText key="glidict::GUI.Create"/> panel, and <b>build</b> and <b>preview</b> the collection.</Text>
    764829</NumberedItem>
    765 <Comment>
    766 <Text id="0287a">Some of the documents don't look very nice in Greenstone. One of them, <Path>pdf05-notext.pdf</Path>, could not be processed using the default configuration. Another, <Path>pdf06-weirdchars.pdf</Path>, was processed but looks very strange. Exercise <TutorialRef id="enhanced_pdf"/> looks at how to configure PDFPlug to handle these files better.</Text>
    767 </Comment>
    768830<Heading>
    769831<Text id="0287b">Viewing the extracted metadata</Text>
     
    785847</Heading>
    786848<NumberedItem>
    787 <Text id="0291a">In the <AutoText key="glidict::GUI.Enrich"/> panel, manually add Dublin Core <AutoText key="metadata::dc.Title"/> metadata to those documents which have incorrect <AutoText key="metadata::ex.Title"/> metadata. Select <Path>word03.doc</Path> and double-click to open it. Copy the title of this document (<AutoText text="Greenstone: A comprehensive open-source digital library software system" type="quoted"/>) and return to the Librarian Interface. Scroll up or down in the metadata table until you can see <AutoText key="metadata::dc.Title"/>. Click in the value box, paste in the metadata and press <b>Enter</b>. </Text>
     849<Text id="0291a">In the <AutoText key="glidict::GUI.Enrich"/> panel, manually add Dublin Core <AutoText key="metadata::dc.Title"/> metadata to those documents which have incorrect <AutoText key="metadata::ex.Title"/> metadata. Select <Path>word03.doc</Path> and double-click to open it. Copy the title of this document (<AutoText text="Greenstone: A comprehensive open-source digital library software system" type="quoted"/>) and return to the Librarian Interface. Scroll up or down in the metadata table until you can see <AutoText key="metadata::dc.Title"/>. Click in the value box and paste in the metadata.</Text>
    788850</NumberedItem>
    789851<NumberedItem>
     
    791853</NumberedItem>
    792854<NumberedItem>
    793 <Text id="0292a">Close the document when you have finished copying metadata from it. External programs opened when viewing documents must be closed before building the collection, otherwise errors can occur.</Text>
    794 </NumberedItem>
    795 <NumberedItem>
    796 <Text id="0293">Next add <AutoText key="metadata::dc.Title"/> and <AutoText key="metadata::dc.Creator"/> metadata for a few of the other documents, including <Path>pdf05-notext.pdf</Path>.</Text>
     855<Text id="0292a">Close the document (in Microsoft Word) when you have finished copying metadata from it. External programs opened when viewing documents must be closed before building the collection, otherwise errors can occur.</Text>
     856</NumberedItem>
     857<NumberedItem>
     858<Text id="0293">Next add <AutoText key="metadata::dc.Title"/> and <AutoText key="metadata::dc.Creator"/> metadata for a few of the other documents.</Text>
    797859</NumberedItem>
    798860<NumberedItem>
     
    910972<Prerequisite id="word_pdf_collection"/>
    911973<Content>
     974<Comment>
     975<Text id="fw-1a">In this exercise, we play around with the format statements in the Word and PDF collection.</Text>
     976</Comment>
    912977<NumberedItem>
    913978<Text id="fw-2">Open the <b>reports</b> collection in the Librarian Interface and go to the <AutoText key="glidict::CDM.GUI.Formats"/> section of the <AutoText key="glidict::GUI.Design"/> panel.</Text>
     
    917982</Heading>
    918983<NumberedItem>
    919 <Text id="fw-3">Greenstone's default format statement is complex because it is designed to produce something reasonable under almost any conditions, and also because for practical reasons it needs to be backwards compatible with legacy collections.</Text>
    920 
     984<Text id="fw-3a">In this part of the exercise, we make the format statement simpler without changing the resulting display.</Text>
     985<Text id="fw-3">Greenstone's default format statement is complex because it is designed to produce something reasonable under almost any conditions, and also because for practical reasons it needs to be backwards compatible with legacy collections. For this collection, we don't need all of the complexity.</Text>
    921986<Text id="fw-4">The default <AutoText text="VList"/> format statement looks like the following:</Text>
    922987<Format>
     
    927992[/highlight]{If}{[ex.Source],&lt;br&gt;&lt;i&gt;([ex.Source])&lt;/i&gt;}&lt;/td&gt;
    928993</Format>
    929 <Text id="fw-5">This format statement is the default used for search results, classifiers, and document table of contents. First we will tidy this up a bit. </Text>
    930 
     994<Text id="fw-5">This format statement is the default used for any vertical list, such as search results, classifiers, and document table of contents.</Text>
    931995<Text id="fw-6"><Format>{Or}{[ex.thumbicon],[ex.srcicon]}</Format> chooses <i>ex.thumbicon</i> metadata if it's there, and otherwise chooses <i>ex.srcicon</i> metadata. If neither is present, nothing is displayed. For this collection there is no <i>ex.thumbicon</i> metadata so the choice is not needed.</Text>
    932 
    933996<Text id="fw-7">Replace <Format>{Or}{[ex.thumbicon],[ex.srcicon]}</Format> with <Format>[ex.srcicon]</Format>.  </Text>
    934 
    935997<Text id="fw-8">There is no <i>dls.Title</i> metadata, so remove that element from <Format>{Or}{[dls.Title],[dc.Title],[ex.Title],Untitled}</Format>.</Text>
    936 
    937998<Text id="fw-9">The resulting format statement looks like the following:</Text>
    938999<Format>
     
    9431004</Format>
    9441005<Text id="fw-9a">Click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>.</Text>
    945 <Text id="fw-10">Preview the collection to make sure the display hasn't changed.</Text>
    946 
    947 </NumberedItem>
    948 <Heading>
    949 <Text id="fw-10a">Linking to Greenstone version or original version</Text>
     1006<Text id="fw-10">Preview the collection to make sure the display hasn't changed. You shouldn't notice any difference when looking at search results, classifiers etc. </Text>
     1007</NumberedItem>
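The simplification steps above rely on how an {Or} choice falls through its options. As a rough Python sketch of that behaviour (an assumption drawn from the description in this exercise, not Greenstone's own code), the hypothetical function or_choice returns the first option that yields a value, treating unbracketed text such as Untitled as a literal. The metadata values in the example are invented for illustration.

```python
# Hypothetical illustration of the {Or} fallback described in the exercise.
def or_choice(options, metadata):
    """Return the first option with a non-empty value, or '' if none has one."""
    for option in options:
        if option.startswith("[") and option.endswith("]"):
            # Bracketed option: look up the named metadata.
            value = metadata.get(option[1:-1], "")
        else:
            # Unbracketed option: literal text, e.g. "Untitled".
            value = option
        if value:
            return value
    return ""

# Invented metadata for a document with no thumbicon or dls.Title values.
doc = {"ex.Title": "word03", "ex.srcicon": "srcicon-placeholder"}

full = ["[dls.Title]", "[dc.Title]", "[ex.Title]", "Untitled"]
trimmed = ["[dc.Title]", "[ex.Title]", "Untitled"]

# Dropping an option that is never set does not change the outcome.
print(or_choice(full, doc) == or_choice(trimmed, doc))
print(or_choice(["[ex.thumbicon]", "[ex.srcicon]"], doc))
print(or_choice(full, {}))
```

Because the removed options never have a value in this collection, the trimmed statement selects the same text as the original, which is why the preview should look unchanged.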
     1008<Heading>
     1009<Text id="fw-10a">Linking to Greenstone version or original version of documents</Text>
    9501010</Heading>
    9511011<NumberedItem>
    9521012<Text id="fw-11">For collections with documents that undergo a conversion process during importing (e.g. Word, PDF, PowerPoint documents, but not text, HTML documents), the original file is stored in the collection along with the converted version. The default <AutoText text="VList"/> format statement links to both versions:</Text>
    953 
    9541013<Text id="fw-12"><Format>[link][icon][/link]</Format> links to the Greenstone HTML version, while <Format>[srclink][srcicon][/srclink]</Format> links to the original.</Text>
    955 
    956 <Text id="fw-13">Choose <AutoText text="SearchVList"/> in <AutoText key="glidict::CDM.GUI.Formats"/> by selecting <AutoText text="Search"/> from the <AutoText key="glidict::CDM.FormatManager.Feature"/> drop down list, and <AutoText text="VList"/> from the <AutoText key="glidict::CDM.FormatManager.Part"/> list. Experiment with removing either of the two links from the format statement. Storing and displaying the original allows users to see the correct format, but requires the user to have the relevant program installed. It also increases the size of the collection. The Greenstone version can be viewed in a browser, but may not look as nice.</Text>
    957 
    958 </NumberedItem>
    959 <Heading>
    960 <Text id="fw-13a">Making bookshelves show how many items they contain</Text>
    961 </Heading>
    962 <NumberedItem>
    963 <Text id="fw-14">Next, we'll customize the format for the <AutoText key="coredm::_labelCreator_" type="italics"/> list. Classifier nodes have only a few pieces of metadata to display: <Format>[ex.Title]</Format> and <Format>[numleafdocs]</Format>. Whatever metadata the classifier has been built on, the node label is always stored as <Format>[ex.Title]</Format>. This is why a Creator is printed out for each bookshelf node even though <i>dc.Creator</i> is not specified in the format statement. <Format>[numleafdocs]</Format> is only defined for bookshelf nodes, so this metadata can be used in an <Format>{If}</Format> statement to make bookshelf nodes and document nodes display differently.</Text>
    964 
    965 </NumberedItem>
    966 <NumberedItem>
    967 <Text id="fw-15">Make each bookshelf node in the Creator classifier show how many entries it contains. In the <AutoText key="glidict::CDM.GUI.Formats"/> section of the <AutoText key="glidict::GUI.Design"/> panel, select the dc.Creator <AutoText text="AZCompactList"/> classifier from the <AutoText key="glidict::CDM.FormatManager.Feature"/> drop down list, and <AutoText text="VList"/> from the <AutoText key="glidict::CDM.FormatManager.Part"/> list.  Append the following: </Text>
     1014<Text id="fw-13">Choose <AutoText text="SearchVList"/> in <AutoText key="glidict::CDM.GUI.Formats"/> by selecting <AutoText text="Search"/> from the <AutoText key="glidict::CDM.FormatManager.Feature"/> drop down list, and <AutoText text="VList"/> from the <AutoText key="glidict::CDM.FormatManager.Part"/> list. Click <AutoText key="glidict::CDM.FormatManager.Add" type="button"/> to add the <AutoText text="SearchVList"/> format statement into the list of assigned formats. Experiment with removing either of the two links from the format statement. (Remember to click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/> after any changes.)</Text>
     1015<Text id="fw-13a">To see the results of your changes, preview the collection and do a search. You are making changes to <AutoText text="SearchVList"/>, which means the changes will only apply to search results.</Text>
     1016<Text id="fw-13b">Storing and displaying the original allows users to see the correct format, but requires the user to have the relevant program installed. It also increases the size of the collection. The Greenstone version can be viewed in a browser, but may not look as nice.</Text>
     1017</NumberedItem>
     1018<Heading>
     1019<Text id="fw-14a">Making bookshelves show how many items they contain</Text>
     1020</Heading>
     1021<NumberedItem>
     1022<Text id="fw-14">Next, we'll customize the format for the <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list. Classifier bookshelves have only a few pieces of metadata to display: <Format>[ex.Title]</Format> and <Format>[numleafdocs]</Format>. Whatever metadata the classifier has been built on, the bookshelf label is always stored as <Format>[ex.Title]</Format>. This is why a Creator is printed out for each bookshelf even though <Format>[dc.Creator]</Format> is not specified in the format statement. <Format>[numleafdocs]</Format> is only defined for bookshelves, so this metadata can be used in an <Format>{If}</Format> statement to make bookshelves and documents display differently in the list.</Text>
     1023<Text id="fw-15">Make each bookshelf in the Creator classifier show how many entries it contains. In the <AutoText key="glidict::CDM.GUI.Formats"/> section of the <AutoText key="glidict::GUI.Design"/> panel, select the <AutoText text="CL2 AZCompactList"/> classifier, which is based on <AutoText key="metadata::dc.Creator"/> metadata, from the <AutoText key="glidict::CDM.FormatManager.Feature"/> drop down list, and <AutoText text="VList"/> from the <AutoText key="glidict::CDM.FormatManager.Part"/> list. Click the <AutoText key="glidict::CDM.FormatManager.Add" type="button"/> button to add this format into the list of assigned formats. Note that it gets added as <AutoText text="CL2VList"/> in this list: it is the <AutoText text="VList"/> format for the second (<AutoText text="CL2"/>) classifier.</Text>
     1024<Text id="fw15a">Append the following text and click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>:</Text>
    9681025<Format>
    9691026{If}{[numleafdocs],&lt;td&gt;&lt;i&gt;([numleafdocs])&lt;/i&gt;&lt;/td&gt;}
    9701027</Format>
    971 <Text id="fw-16">Click <AutoText key="glidict::CDM.FormatManager.Add" type="button"/>, switch to the <AutoText key="glidict::GUI.Create"/> panel, and click <AutoText key="glidict::CreatePane.Preview_Collection" type="button"/> (no need to rebuild). Preview the <AutoText key="coredm::_labelCreator_" type="italics"/> list.</Text>
    972 <Text id="fw-17">This revised format statement has the effect of specifying in brackets how many items are contained within a bookshelf.  Since only bookshelf nodes define <Format>[numleafdocs]</Format>, only these nodes will display this. By modifying <AutoText text="CL2VList"/> instead of <AutoText text="VList"/>, the change will only apply to the second classifier (Creators).</Text>
     1028<Text id="fw-16">Click <AutoText key="glidict::CDM.FormatManager.Add" type="button"/>, switch to the <AutoText key="glidict::GUI.Create"/> panel, and click <AutoText key="glidict::CreatePane.Preview_Collection" type="button"/> (no need to rebuild). Click on the <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list and notice that the bookshelves now display how many documents they contain.</Text>
     1029<Text id="fw-17">This revised format statement has the effect of specifying in brackets how many items are contained within a bookshelf.  Since only bookshelves define <Format>[numleafdocs]</Format>, only they will display this. By modifying <AutoText text="CL2VList"/> instead of <AutoText text="VList"/>, the change will only apply to the second classifier (Creators).</Text>
    9731030</NumberedItem>
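As a sketch of the same idea, an {If} statement can also take a third argument that is used when the test fails, in the same way as the [parent:FileFormat] eq PDF test in the DocumentText example elsewhere in this file. The variant below is an illustrative assumption rather than a step in the exercise: it shows the count for bookshelves and, for documents, falls back to [ex.Source], which is not defined for bookshelves.

```xml
<Format>
{If}{[numleafdocs],&lt;td&gt;&lt;i&gt;([numleafdocs])&lt;/i&gt;&lt;/td&gt;,&lt;td&gt;[ex.Source]&lt;/td&gt;}
</Format>
```

If you try this, remember that format statement changes only need a preview, not a rebuild.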
    9741031<Heading>
     
    9761033</Heading>
    9771034<NumberedItem>
    978 <Text id="fw-18">Next we modify the document nodes in the Creator classifier to display all authors. Back in <AutoText key="glidict::CDM.GUI.Formats"/>, select the <AutoText text="CL2VList"/> format in the list of assigned formats. After <Format>{If}{[ex.Source],&lt;br&gt;</Format> in the format statement, add <Format>[sibling:dc.Creator]</Format>.</Text>
    979 <Text id="fw-19"><Format>[ex.Source]</Format> is not defined for bookshelf nodes, so can also be used to differentiate bookshelves and documents.</Text>
     1035<Text id="fw-18">Next we modify the document entries in the Creator classifier to display all authors. Back in <AutoText key="glidict::CDM.GUI.Formats"/>, select the <AutoText text="CL2VList"/> format in the list of assigned formats. After <Format>{If}{[ex.Source],&lt;br&gt;</Format> in the format statement, add <Format>[sibling:dc.Creator]</Format>. Click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>.</Text>
     1036<Text id="fw-19"><Format>[ex.Source]</Format> is not defined for bookshelves, so it can also be used to differentiate bookshelves and documents.</Text>
    9801037<Text id="fw-20">The resulting format statement looks like:</Text>
    9811038<Format>
     
    9881045{If}{[numleafdocs],&lt;td&gt;&lt;i&gt;([numleafdocs])&lt;/i&gt;&lt;/td&gt;}
    9891046</Format>
    990 <Text id="fw-21">This will display the Greenstone link, the link to the original, then the Title. For bookshelf nodes, it will also display how many documents the bookshelf contains. For document nodes, it will display all the Authors (Creators), and the source document. <Format>[sibling:dc.Creator]</Format> displays all the Creator metadata for the document, separated by a space (<AutoText text=" " type="quoted"/>). Preview the <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list.</Text> 
    991 <Text id="fw-22">Change the separator between the authors. Modify the format statement, and replace <Format>[sibling:dc.Creator]</Format> with <Format>[sibling(All'&lt;br/&gt;'):dc.Creator]</Format>. This will add a new line after each author. Preview the <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list.</Text>
    992 <Text id="fw-23">If you have done exercise <TutorialRef id="enhanced_word"/>, the collection will have both dc.Creator and ex.Creator metadata. To display both, you can use <Format>[sibling:dc.Creator] [sibling:ex.Creator]</Format>, or to display dc.Creator if it is present, and ex.Creator otherwise, use <Format>{Or}{[sibling:dc.Creator],[sibling:ex.Creator]}</Format>.</Text>
     1047<Text id="fw-21">This will display the Greenstone link, the link to the original, then the Title. For bookshelves, it will also display how many documents the bookshelf contains. For documents, it will display all the Authors (Creators), and the source document. <Format>[sibling:dc.Creator]</Format> displays all the Creator metadata for the document, separated by a space (<AutoText text=" " type="quoted"/>). Preview the <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list and make sure that all authors are displayed for documents.</Text>  </NumberedItem>
     1048<NumberedItem>
     1049<Text id="fw-22">You can change the separator between the authors. Modify the format statement, and replace <Format>[sibling:dc.Creator]</Format> with <Format>[sibling(All'&lt;br/&gt;'):dc.Creator]</Format>. This will add a new line after each author (<Format>&lt;br/&gt;</Format> specifies a line break in HTML). Don't forget to click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>. Preview the <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list.</Text>
     1050<Text id="fw-23">If you have done exercise <TutorialRef id="enhanced_word"/>, the collection will have both dc.Creator and ex.Creator metadata. To display both, you can use:</Text>
     1051<Format>
     1052[sibling:dc.Creator] [sibling:ex.Creator]
     1053</Format>
     1054<Text id="fw-23a">To display dc.Creator if it is present, and ex.Creator otherwise, use:</Text>
     1055<Format>
     1056{Or}{[sibling:dc.Creator],[sibling:ex.Creator]}
     1057</Format>
    9931058</NumberedItem>
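The separator form and the {Or} fallback described above can probably be combined. The following is a sketch only: it assumes that the sibling(All'...') form is accepted inside {Or} in the same way as the plain sibling: form, and it has not been checked against a built collection.

```xml
<Format>
{Or}{[sibling(All'&lt;br/&gt;'):dc.Creator],[sibling(All'&lt;br/&gt;'):ex.Creator]}
</Format>
```

With this statement, each author would appear on its own line, taken from dc.Creator where that metadata exists and from ex.Creator otherwise.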
    9941059</Content>
     
    9981063<Text id="ep-1">Enhanced PDF handling</Text>
    9991064</Title>
    1000 <Prerequisite id="word_pdf_collection"/>
    1001 <Version initial="2.70" current="2.70"/>
     1065<SampleFiles folder="Word_and_PDF"/>
     1066<Version initial="2.70" current="2.70w"/>
    10021067<Content>
    10031068<Text id="ep-2">Greenstone converts PDF files to HTML using third-party software: <AutoText text="pdftohtml.pl" type="italics"/>. This lets users view these documents even if they don't have the PDF software installed. Unfortunately, sometimes the formatting of the resulting HTML files is not so good.</Text>
    10041069<Text id="ep-3">This exercise explores some extra options to the PDF plugin which may produce a nicer version for display. Some of these options use the standard pdftohtml program, others use ImageMagick and Ghostscript to convert the file to a series of images. Ghostscript is a program that can convert Postscript and PDF files to other formats. You can download it from <Link>http://www.cs.wisc.edu/~ghost/</Link> (follow the link to the current stable release).</Text>
    10051070<NumberedItem>
    1006 <Text id="ep-3a">In the Librarian Interface, open up the <b>reports</b> collection created in the <TutorialRef id="word_pdf_collection"/> exercise. Rebuild the collection and examine the output. You will notice that one of the documents could not be processed. The following messages are shown: "The file pdf05-notext.pdf was recognised but could not be processed by any plugin.", and "15 documents were processed and included in the collection. 1 was rejected".</Text>
    1007 </NumberedItem>
    1008 <NumberedItem>
    1009 <Text id="ep-4">Preview the collection and view the documents. <Path>pdf05-notext.pdf</Path> does not appear. Note that the other PDF documents appear as one long document, with no sections. </Text>
     1071<Text id="ep-3a">In the Librarian Interface, start a new collection called "PDF collection" and base it on <AutoText key="glidict::NewCollectionPrompt.NewCollection"/>.</Text>
     1072<Text id="ep-3b">In the <AutoText key="glidict::GUI.Gather"/> panel, drag just the PDF documents from <Path>sample_files &rarr; Word_and_PDF &rarr; Documents</Path> into the new collection. Also drag in the PDF documents from <Path>sample_files &rarr; Word_and_PDF &rarr; difficult_pdf</Path>.</Text>
     1073<Text id="ep-3c">Go to the <AutoText key="glidict::GUI.Create"/> panel and build the collection. Examine the output from the build process. You will notice that one of the documents could not be processed. The following messages are shown: "The file pdf05-notext.pdf was recognised but could not be processed by any plugin.", and "15 documents were processed and included in the collection. 1 was rejected".</Text>
     1074</NumberedItem>
     1075<NumberedItem>
     1076<Text id="ep-4">Preview the collection and view the documents. <Path>pdf05-notext.pdf</Path> does not appear as it could not be processed. <Path>pdf06-weirdchars.pdf</Path> was processed but looks very strange. The other PDF documents appear as one long document, with no sections. </Text>
    10101077</NumberedItem>
    10111078<Heading>
     
    10161083</Comment>
    10171084<NumberedItem>
    1018 <Text id="0335">Use the <AutoText key="glidict::Menu.File_Options"/> item on the <AutoText key="glidict::Menu.File"/> menu to switch to <AutoText key="glidict::Preferences.Mode.Expert"/> mode and then build the collection again. The <AutoText key="glidict::GUI.Create"/> panel looks different in <AutoText key="glidict::Preferences.Mode.Expert"/> mode because it gives more options: locate the <AutoText key="glidict::CreatePane.Build_Collection" type="button"/> button, near the bottom of the window, and click it. Now a message appears saying that the file could not be processed, and why. Amongst all the output, we get the following message: "Error: PDF contains no extractable text. Could not convert pdf05notext.pdf to HTML format". pdftohtml.pl to convert a PDF file to HTML if the PDF file has no extractable text.</Text>
     1085<Text id="0335">Use the <AutoText key="glidict::Menu.File_Options"/> item on the <AutoText key="glidict::Menu.File"/> menu to switch to <AutoText key="glidict::Preferences.Mode.Expert"/> mode and then build the collection again. The <AutoText key="glidict::GUI.Create"/> panel looks different in <AutoText key="glidict::Preferences.Mode.Expert"/> mode because it gives more options: locate the <AutoText key="glidict::CreatePane.Build_Collection" type="button"/> button, near the bottom of the window, and click it. Now a message appears saying that the file could not be processed, and why. Amongst all the output, we get the following message: "Error: PDF contains no extractable text. Could not convert pdf05notext.pdf to HTML format". pdftohtml.pl cannot convert a PDF file to HTML if the PDF file has no extractable text.</Text>
    10191086</NumberedItem>
    10201087<NumberedItem>
     
    10391106</NumberedItem>
    10401107<NumberedItem>
    1041 <Text id="ep-14">Build the collection and preview. All PDF documents have been processed and divided into sections, but each section displays <AutoText key="perlmodules::BasPlug.dummy_text" type="quoted"/>. For the conversion to images for PDF documents, no text is extracted. </Text>
     1108<Text id="ep-14"><b>Build</b> the collection and <b>preview</b>. All PDF documents have been processed and divided into sections, but each section displays <AutoText key="perlmodules::BasPlug.dummy_text" type="quoted"/>. For the conversion to images for PDF documents, no text is extracted. </Text>
    10421109</NumberedItem>
    10431110<NumberedItem>
    10441111<Text id="ep-15">In order to view the documents properly, you will need to modify the format statement. In the <AutoText key="glidict::CDM.GUI.Formats"/> section on the <AutoText key="glidict::GUI.Design"/> panel, select the <AutoText text="DocumentText"/> format statement. Replace </Text>
    1045 
    10461112<Format>
    10471113[Text]
     
    10491115<Text id="ep-16">with</Text>
    10501116<Format>
     1117[srcicon]
     1118</Format>
     1119</NumberedItem>
     1120<NumberedItem>
     1121<Text id="ep-18">Preview the collection. Images from the document are now displayed instead of the extracted text. Both <Path>pdf05-notext.pdf</Path> and <Path>pdf06-weirdchars.pdf</Path> display nicely now.</Text>
     1122<Comment>
     1123<Text id="ep-17">In this collection, we only have PDF documents and they have all been converted to images. If we had other document types in the collection, we should use a different format statement, such as:</Text>
     1124<Format>
    10511125{If}{[parent:FileFormat] eq PDF,[srcicon],[Text]}
    10521126</Format>
    1053 
    1054 <Text id="ep-17">Because the other documents in the collection do not use images, we only want to show images for PDF documents. <AutoText text="FileFormat"/> is an extracted metadata item which shows the format of the source document. We use this to test whether the documents are PDF or not.</Text>
    1055 
    1056 </NumberedItem>
    1057 <NumberedItem>
    1058 <Text id="ep-18">Preview the collection from the <AutoText key="glidict::GUI.Create"/> panel. (There is no need to build it). Images from the document are now displayed instead of the extracted text. Both <Path>pdf05-notext.pdf</Path> and <Path>pdf06-weirdchars.pdf</Path> display nicely now. Make sure that the word documents still display properly. </Text>
     1127<Text id="ep-17a"><AutoText text="FileFormat"/> is an extracted metadata item which shows the format of the source document. We can use this to test whether the documents are PDF or not: for PDF documents, display [srcicon], for other documents, display [Text].</Text>
     1128</Comment>
    10591129</NumberedItem>
    10601130<Heading>
     
    10661136<NumberedItem>
    10671137<Text id="ep-21">We achieve this by adding two <AutoText text="PDFPlug"/> plugins to the collection, with different options. Currently, the Librarian Interface does not allow you to add the same plugin twice to the collection (with the exception of <AutoText text="UnknownPlug"/>). You will need to edit the collection configuration file by hand.</Text>
    1068 <Text id="ep-21a">Close the reports collection in the Librarian Interface. Then open <Path>Greenstone &rarr; collect &rarr; reports &rarr; etc &rarr; collect.cfg</Path> using a text editor, e.g. WordPad. In the list of plugins, add another <AutoText text="PDFPlug"/>, i.e.</Text>
     1138<Text id="ep-21a">Close the collection in the Librarian Interface. Then open <Path>Greenstone &rarr; collect &rarr; pdfcolle &rarr; etc &rarr; collect.cfg</Path> using a text editor, e.g. WordPad. In the list of plugins, add another <AutoText text="PDFPlug"/>, i.e.</Text>
    10691139<Format>
    10701140plugin PDFPlug
     
    10751145<NumberedItem>
    10761146<Text id="ep-23">Open up the collection again in the Librarian Interface, and go to the  <AutoText key="glidict::GUI.Gather"/> panel. Make a new folder called <AutoText text="notext" type="quoted"/>: right click in the collection panel and select <AutoText key="glidict::CollectionPopupMenu.New_Folder"/> from the menu. Change the <AutoText key="glidict::NewFolderOrFilePrompt.Folder_Name"/> to <AutoText text="notext" type="quoted"/>, and click <AutoText key="glidict::General.OK" type="button"/>.</Text>
    1077 <Text id="ep-23a">Move the two pdf files that have problems with html (<Path>pdf05-notext.pdf</Path> and <Path>pdf06-weirdchars</Path>.pdf ) into this folder by drag and drop. We will set up the plugins so that PDF files in this <Path>notext</Path> folder are processed differently to the other PDF files.</Text>
     1147<Text id="ep-23a">Move the two PDF files that have problems with HTML conversion (<Path>pdf05-notext.pdf</Path> and <Path>pdf06-weirdchars.pdf</Path>) into this folder by drag and drop. We will set up the plugins so that PDF files in this <Path>notext</Path> folder are processed differently to the other PDF files.</Text>
    10781148</NumberedItem>
    10791149<NumberedItem>
     
    10901160plugin PDFPlug -convert_to html -use_sections
    10911161</Format>
    1092 
    10931162<Text id="ep-27">The <AutoText text="paged_img" type="italics"/> version must come earlier in the list than the <AutoText text="html" type="italics"/> version. The <AutoText text="process_exp"/> for the first <AutoText text="PDFPlug"/> will process any PDF files in the <Path>notext</Path> directory. The second <AutoText text="PDFPlug"/> will process any PDF files that are not processed by the first one.</Text>
    1094 
    10951163<Text id="ep-28">Note that all plugins have the <AutoText text="process_exp"/> option, and this can be used to customize which documents are processed by which plugin. This option is only visible in <AutoText key="glidict::Preferences.Mode.Systems"/> and <AutoText key="glidict::Preferences.Mode.Expert"/> modes.</Text>
    10961164<Text id="ep-29">Change back to <AutoText key="glidict::Preferences.Mode.Librarian"/> mode.</Text>
    10971165</NumberedItem>
    10981166<NumberedItem>
    1099 <Text id="ep-30">Edit the <AutoText text="DocumentText"/> format statement. PDF files processed as HTML will not have images to display, so we need to make sure they get text displayed instead.</Text>
    1100 <Text id="ep-1">Change the first <Format>[srcicon]</Format> element in the following part with <Format>{Or}{[srcicon],[Text]}</Format>, i.e. change</Text>
    1101 <Format>
    1102 {If}{[parent:FileFormat] eq PDF,[srcicon],[Text]}
    1103 </Format>
    1104 <Text id="ep-32">to</Text>
    1105 <Format>
    1106 {If}{[parent:FileFormat] eq PDF, {Or}{[srcicon],[Text]},[Text]}
    1107 </Format>
    1108 </NumberedItem>
    1109 <NumberedItem>
    1110 <Text id="ep-33">Build and preview the collection. All PDF  documents should look relatively nice. Try searching this collection. You will be able to locate the PDFs that were converted to HTML (try e.g. <AutoText text="bibliography" type="quoted"/>), but not the ones that were converted to images (try searching for <AutoText text="banana" type="quoted"/> or <AutoText text="METS" type="quoted"/>).</Text>
     1167<Text id="ep-30">Edit the <AutoText text="DocumentText"/> format statement. PDF files processed as HTML will not have images to display, so we need to make sure they get text displayed instead. Change <Format>[srcicon]</Format> to <Format>{Or}{[srcicon],[Text]}</Format>.</Text>
     1168</NumberedItem>
     1169<NumberedItem>
     1170<Text id="ep-33">Build and preview the collection. All PDF documents should look relatively nice. Try searching this collection. You will be able to search for the PDFs that were converted to HTML (try e.g. <AutoText text="bibliography" type="quoted"/>), but not the ones that were converted to images (try searching for <AutoText text="banana" type="quoted"/> or <AutoText text="METS" type="quoted"/>).</Text>
    11111171</NumberedItem>
    11121172</Content>
     
    11141174<Tutorial id="enhanced_word">
    11151175<Title>
    1116 <Text id="ew-">Enhanced Word document handling</Text>
     1176<Text id="ew-a">Enhanced Word document handling</Text>
    11171177</Title>
    11181178<Content>
     
    12351295</Content>
    12361296</Tutorial>
    1237 <Tutorial id="simple_image_collection">
    1238 <Title>
    1239 <Text id="0337">A simple image collection</Text>
    1240 </Title>
    1241 <SampleFiles folder="images"/>
    1242 <Version initial="2.60" current="2.70"/>
    1243 <Content>
    1244 <NumberedItem>
    1245 <Text id="0338">In the Librarian Interface, start a new collection (<Menu><AutoText key="glidict::Menu.File"/> &rarr; <AutoText key="glidict::Menu.File_New"/></Menu>) called <b>backdrop</b>. Fill out the fields with appropriate information. For <AutoText key="glidict::NewCollectionPrompt.Base_Collection"/>, select the item <b>Simple image collection (image-e)</b> from the pull-down menu.</Text>
    1246 <Comment>
    1247 <Text id="0340a">When you base a collection on an existing one, it inherits all the settings of the old one. You won't be asked to choose a metadata set because the new collection inherits the ones (if any) used by the seed collection.</Text>
    1248 </Comment>
    1249 </NumberedItem>
    1250 <NumberedItem>
    1251 <Text id="0341">Copy the images provided in <Path>sample_files &rarr; images</Path> into your newly-formed collection.</Text>
    1252 </NumberedItem>
    1253 <NumberedItem>
    1254 <Text id="0342">Change to the <AutoText key="glidict::GUI.Create"/> panel and <b>build</b> the collection.</Text>
    1255 </NumberedItem>
    1256 <NumberedItem>
    1257 <Text id="0343"><b>Preview</b> the result.</Text>
    1258 </NumberedItem>
    1259 <NumberedItem>
    1260 <Text id="0344">Click on <AutoText key="coredm::_Global:labelBrwse_"/> in the navigation bar to view a list of the photos ordered by filename and presented as a thumbnail accompanied by some basic data about the image. The structure of this collection is the same as <b>Simple image collection (image-e)</b>, but the content is different.</Text>
    1261 </NumberedItem>
    1262 <NumberedItem>
    1263 <Text id="0345">Back in the Librarian Interface, change to the <AutoText key="glidict::GUI.Enrich"/> panel and view the extracted metadata for <Path>Bear.jpg</Path>.</Text>
    1264 </NumberedItem>
    1265 <Heading>
    1266 <Text id="0347">Adding a metadata set to the collection</Text>
    1267 </Heading>
    1268 <Comment>
    1269 <Text id="0346">We now add our own metadata and use it to give users a new way to browse the collection. We use the Dublin Core metadata set.</Text>
    1270 </Comment>
    1271 <NumberedItem>
    1272 <Text id="0348">The collection (image-e) on which <b>backdrop</b> is based uses only extracted metadata. To add another metadata set, go to the <AutoText key="glidict::GUI.Design"/> panel of the Librarian Interface and click <AutoText key="glidict::CDM.GUI.MetadataSets"/> in the list on the left (the last one). Then click  <AutoText key="glidict::CDM.MetadataSetManager.Add" type="button"/> (lower left button).</Text>
    1273 </NumberedItem>
    1274 <NumberedItem>
    1275 <Text id="0349">In the window that pops up, select <AutoText text="dublin.mds"/> and click <AutoText key="glidict::CDM.MetadataSetManager.Chooser.Add" type="button"/>.</Text>
    1276 </NumberedItem>
    1277 <NumberedItem>
    1278 <Text id="0351">Now switch to the <AutoText key="glidict::GUI.Enrich"/> panel by clicking this tab. The metadata for each file now shows the (empty) Dublin Core <AutoText text="dc."/> fields as well as the extracted <AutoText text="ex."/> fields.</Text>
    1279 </NumberedItem>
    1280 <Heading>
    1281 <Text id="0350a">Adding Title and Description metadata</Text>
    1282 </Heading>
    1283 <NumberedItem>
    1284 <Text id="0352">We work with just the first three files (<Path>Bear.jpg</Path>, <Path>Cat.jpg</Path> and <Path>Cheetah.jpg</Path>) to get a flavour of what is possible. First, set each file's <AutoText key="metadata::dc.Title"/> field to be the same as its filename but without the filename extension:</Text>
    1285 <Text id="0353">Click on <Path>Bear.jpg</Path> so its metadata fields are available, then click on its <AutoText key="metadata::dc.Title"/> field on the right-hand side. Type in <b>Bear</b>, and click <b>Enter</b>.</Text>
    1286 <Text id="0355">Repeat the process for <Path>Cat.jpg</Path> and <Path>Cheetah.jpg</Path>.</Text>
    1287 </NumberedItem>
    1288 <NumberedItem>
    1289 <Text id="0355a">Add a description for each image as <AutoText key="metadata::dc.Description"/> metadata.</Text>
    1290 <Text id="0372">What description should you enter? To remind yourself of a file's content, the Librarian Interface lets you open files by double-clicking them. It launches the appropriate application based on the filename extension, Word for .doc files, Acrobat for .pdf files and so on.</Text>
    1291 <Text id="0372a">Double-click <Path>Bear.jpg</Path>: on Windows, the image will normally be displayed by Microsoft's Photo Editor (although this depends on how your computer has been set up).</Text>
    1292 <Text id="0373">Back in the <AutoText key="glidict::GUI.Enrich"/> pane, make sure that <Path>Bear.jpg</Path> is selected in the collection tree on the left hand side. Enter the text <b>Bear in the Rocky Mountains</b> as the value for the <AutoText key="metadata::dc.Description"/> field and press <b>Enter</b> to have it added.</Text>
    1293 <Text id="0374">Repeat this process for <Path>Cat.jpg</Path> and <Path>Cheetah.jpg</Path>, adding a suitable description for each.</Text>
    1294 </NumberedItem>
    1295 <Heading>
    1296 <Text id="0357">Change Format Features to display new metadata</Text>
    1297 </Heading>
    1298 <NumberedItem>
    1299 <Text id="0356">Now we customize the collection's appearance. Building or previewing the collection at this point won't reveal anything new. That's because we haven't changed the design of the collection to take advantage of the new metadata.</Text>
    1300 </NumberedItem>
    1301 <NumberedItem>
    1302 <Text id="0358">Go to the <AutoText key="glidict::GUI.Design"/> panel and select <AutoText key="glidict::CDM.GUI.Formats"/> from the left-hand list. Leave the feature selection controls at their default values, so that <AutoText key="glidict::CDM.FormatManager.Feature"/> remains blank and <AutoText text="VList" /> is selected as the <AutoText key="glidict::CDM.FormatManager.Part"/>. In the <AutoText key="glidict::CDM.FormatManager.Editor"/>, edit the text as follows:</Text>
    1303 <BulletList>
    1304 <Bullet>
    1305 <Text id="0359">Change <Format>_ImageName_:</Format> to <Format>Title:</Format></Text>
    1306 </Bullet>
    1307 <Bullet>
    1308 <Text id="0359a">Change <Format>[Image]</Format> to <Format>[dc.Title]</Format></Text>
    1309 </Bullet>
    1310 <Bullet>
    1311 <Text id="0359b">After <Format>[dc.Title]&lt;br&gt;</Format> add <Format>Description: [dc.Description]&lt;br&gt;</Format></Text>
    1312 </Bullet>
    1313 </BulletList>
    1314 <Comment>
    1315 <Text id="0360">Metadata names are case-sensitive in Greenstone: it is important that you capitalize "Title" and "Description" (and don't capitalize "dc").</Text>
    1316 </Comment>
    1317 </NumberedItem>
    1318 <NumberedItem>
    1319 <Text id="0361a">Next click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>. The new format statement will be displayed in the list of assigned format statements. The first substitution alters the fragment of text that appears to the right of the thumbnail image, the second alters the item of metadata that follows it. The addition displays the description after the Title.</Text>
    1320 </NumberedItem>
    1321 <NumberedItem>
    1322 <Text id="0362a">Go to the <AutoText key="glidict::GUI.Create"/> panel and click <AutoText key="glidict::CreatePane.Build_Collection" type="button"/>. Once it has finished building, <b>preview</b> the collection. When you click on <AutoText key="coredm::_Global:labelBrwse_"/> in the navigation bar the presentation has changed to "Title: Bear" and so on. Each image's description should appear beside the thumbnail, following the title.</Text>
    1323 </NumberedItem>
    1324 <Comment>
    1325 <Text id="0363">After the first three items, the Title and Description become blank because we have only assigned Dublin Core metadata to these first three. To get a full listing, enter all the metadata.</Text>
    1326 </Comment>
    1327 <Comment>
    1328 <Text id="0364">For some design parameters the collection must be rebuilt before the effect of changes can be seen. However, changes to format statements take place immediately and you can see the result straightaway by clicking <b>reload</b> (or <b>refresh</b>) in the web browser. Above, you were asked to build before previewing because you had added metadata.</Text>
    1329 </Comment>
    1330 <Heading>
    1331 <Text id="0365">Changing the size of image thumbnails</Text>
    1332 </Heading>
    1333 <NumberedItem>
    1334 <Text id="0366">Let's change the size of the thumbnail image and make it smaller. Thumbnail images are created by the <AutoText text="ImagePlug"/> plug-in, so we need to access its configuration settings. To do this, switch to the <AutoText key="glidict::GUI.Design"/> panel and select <AutoText key="glidict::CDM.GUI.Plugins"/> from the list on the left. Double-click <AutoText text="plugin ImagePlug"/> to pop up a window that shows its settings. (Alternatively, select <AutoText text="plugin ImagePlug"/> with a single click and then click <AutoText key="glidict::CDM.PlugInManager.Configure" type="button"/> further down the screen). Currently all options are off, so standard defaults are used. Select <AutoText text="thumbnailsize"/>, set it to <AutoText text="50"/>, and click <AutoText key="glidict::General.OK" type="button"/>.</Text>
    1335 </NumberedItem>
    1336 <NumberedItem>
    1337 <Text id="0367"><b>Build</b> and <b>preview</b> the collection.</Text>
    1338 </NumberedItem>
    1339 <NumberedItem>
    1340 <Text id="0368">Once you have seen the result of the change, return to the <AutoText key="glidict::GUI.Design"/> panel, select the configuration options for <AutoText text="ImagePlug"/>, and switch the <AutoText text="thumbnailsize"/> option off so that the thumbnail reverts to its normal size when the collection is re-built.</Text>
    1341 </NumberedItem>
    1342 <Heading>
    1343 <Text id="0380">Adding a browsing classifier based on Description metadata</Text>
    1344 </Heading>
    1345 <NumberedItem>
    1346 <Text id="0381">Now we'll add a new browsing option based on the descriptions. In the <AutoText key="glidict::GUI.Design"/> panel, select <AutoText key="glidict::CDM.GUI.Classifiers"/> from the left-hand list. Set the menu item for <AutoText key="glidict::CDM.ClassifierManager.Classifier"/> to <AutoText text="AZList" />; then click <AutoText key="glidict::CDM.ClassifierManager.Add" type="button"/>.</Text>
    1347 </NumberedItem>
    1348 <NumberedItem>
    1349 <Text id="0382">A window pops up to control the classifier's options. Set the <AutoText text="metadata"/> option to <AutoText key="metadata::dc.Description"/> and click <AutoText key="glidict::General.OK" type="button"/>.</Text>
    1350 </NumberedItem>
    1351 <NumberedItem>
    1352 <Text id="0382a"><b>Build</b> the collection, and <b>preview</b> it. Choose the new <b>descriptions</b> link that appears in the navigation bar.</Text>
    1353 </NumberedItem>
    1354 <Comment>
    1355 <Text id="0383">Only three items are shown, because only items with the relevant metadata (dc.Description in this case) appear in the list. The original browse list includes all photos in the collection because it is based on <AutoText key="metadata::ex.Image"/>, extracted metadata that reflects an image's filename, which is set for all images in the collection.</Text>
    1356 </Comment>
    1357 <Heading>
    1358 <Text id="0384">Creating a searchable index based on Description metadata</Text>
    1359 </Heading>
    1360 <NumberedItem>
    1361 <Text id="0385">Now we'll add an index so that the collection can be searched by descriptions. Switch to the <AutoText key="glidict::GUI.Design"/> panel and select <AutoText key="glidict::CDM.GUI.Indexes"/> from the left-hand list. Enter the text "descriptions" as the <AutoText key="glidict::CDM.IndexManager.Index_Name"/>, select <AutoText key="metadata::dc.Description"/> from the <AutoText key="glidict::CDM.IndexManager.Source"/> list, and click <AutoText key="glidict::CDM.IndexManager.Add_Index" type="button"/>.</Text>
    1362 </NumberedItem>
    1363 <NumberedItem>
    1364 <Text id="0386">Switch to the <AutoText key="glidict::GUI.Create"/> panel, <b>build</b> the collection, then <b>preview</b> it. There is now a <AutoText key="coredm::_Global:labelSearch_"/> button in the navigation bar. As an example, search for the term "bear" in the <i>descriptions</i> index (which is the only index at this point).</Text>
    1365 </NumberedItem>
    1366 </Content>
    1367 </Tutorial>
    13681297<Tutorial id="export_to_CDROM">
    13691298<Title>
    13701299<Text id="0403">Exporting a collection to CD-ROM/DVD</Text>
    13711300</Title>
    1372 <Prerequisite id="large_html_collection"/>
    1373 <Version initial="2.60" current="2.70"/>
     1301<Version initial="2.60" current="2.70w"/>
    13741302<Content>
    13751303<Comment>
     
    13911319</Content>
    13921320</Tutorial>
    1393 <Tutorial id="downloading_from_internet">
     1321<Tutorial id="large_html_collection">
    13941322<Title>
    1395 <Text id="0411">Downloading files from the web</Text>
     1323<Text id="0387">A large collection of HTML files&mdash;Tudor</Text>
    13961324</Title>
    1397 <Prerequisite id="large_html_collection"/>
    1398 <Version initial="2.60" current="2.70"/>
     1325<SampleFiles folder="tudor"/>
     1326<Version initial="2.60" current="2.70w"/>
    13991327<Content>
    1400 <Comment>
    1401 <Text id="0412">The Greenstone Librarian Interface's Download panel allows you to download individual files, parts of websites, and indeed whole websites, from the web.</Text>
    1402 </Comment>
    1403 <NumberedItem>
    1404 <Text id="0413">Start a new collection called <b>webtudor</b>, and base it on <AutoText key="glidict::NewCollectionPrompt.NewCollection"/>.</Text>
    1405 </NumberedItem>
    1406 <NumberedItem>
    1407 <Text id="0414">In a web browser, visit <Link>http://englishhistory.net</Link>, follow the link to <i>Tudor England</i>, and click &lt;<b>Enter</b>&gt;. You should be at the URL</Text>
    1408 <Link>http://englishhistory.net/tudor/contents.html</Link>
    1409 <Text id="0415">This is where we started the downloading process to obtain the files you have been using for the <b>tudor</b> collection. You could do the same thing by copying this URL from the web browser, pasting it into the <AutoText key="glidict::GUI.Download"/> panel, and clicking the <AutoText key="glidict::Mirroring.Download" type="button"/> button. However, several megabytes will be downloaded, which might strain your network resources&mdash;or your patience! For a faster exercise we focus on a smaller section of the site. </Text>
    1410 </NumberedItem>
    1411 <NumberedItem>
    1412 <Text id="0415a">In the <AutoText key="glidict::GUI.Download"/> panel, enter this URL</Text>
    1413 <Link>http://englishhistory.net/tudor/citizens/</Link>
    1414 <Text id="0417">into the <AutoText key="glidict::Mirroring.Source_URL"/> box. There are several options that govern how the download process proceeds. To copy just the <i>citizens</i> section of the website, select <AutoText key="glidict::Mirroring.Higher_Directories"/>. If you don't do this (or if you miss out the terminating "/"), the downloading process will follow links to other areas of the <i>englishhistory.net</i> website and grab those as well. Set <AutoText key="glidict::Mirroring.Download_Depth"/> to <AutoText key="glidict::Mirroring.Download_Depth.Unlimited"/>&mdash;we want to follow as many links as necessary to download all the pages.</Text>
    1415 </NumberedItem>
    1416 <NumberedItem>
    1417 <Text id="0417a">If your computer is behind a firewall or proxy server, you will need to edit the proxy settings in the Librarian Interface. Open the <AutoText key="glidict::Preferences.Connection"/> tab in <Menu><AutoText key="glidict::Menu.File"/> &rarr; <AutoText key="glidict::Menu.File_Options"/></Menu> and switch on the <AutoText key="glidict::Preferences.Connection.Use_Proxy"/> checkbox. Enter the proxy server address and port number in the <AutoText key="glidict::Preferences.Connection.Proxy_Host"/> and <AutoText key="glidict::Preferences.Connection.Proxy_Port"/> boxes. Click <AutoText key="General.OK" type="button"/>.</Text>
    1418 </NumberedItem>
    1419 <NumberedItem>
    1420 <Text id="0418">Now click <AutoText key="glidict::Mirroring.Download" type="button"/>. If you have set proxy information in <AutoText key="glidict::Menu.File_Options"/>, a popup will ask for your user name and password. Once the download has started, a progress bar appears in the lower half of the panel that reports on how the downloading process is doing.</Text>
    1421 <Comment>
    1422 <Text id="0419">More detailed information can be obtained by clicking <AutoText key="glidict::Mirroring.DownloadJob.Log" type="button"/>. The process can be paused and restarted as needed, or stopped altogether by clicking <AutoText key="glidict::Mirroring.DownloadJob.Close" type="button"/>. Downloading can be a lengthy process involving multiple sites, and so Greenstone allows additional downloads to be queued up. When new URLs are pasted into the <AutoText key="glidict::Mirroring.Source_URL"/> box and <AutoText key="glidict::Mirroring.Download" type="button"/> clicked, a new progress bar is appended to those already present in the lower half of the panel. When the currently active download item completes, the next is started automatically.</Text>
    1423 </Comment>
    1424 </NumberedItem>
    1425 <NumberedItem>
    1426 <Text id="0420">Downloaded files are stored in a top-level folder called <AutoText key="glidict::Tree.DownloadedFiles"/> that appears on the left-hand side of the <AutoText key="glidict::GUI.Gather"/> panel. You may not need all the downloaded files, and you choose which you want by dragging selected files from this folder over into the collection area on the right-hand side, just like we have done before when selecting data from the <Path>sample_files</Path> folder. In this example we will include everything that has been downloaded.</Text>
    1427 <Text id="0421">Select the <Path>englishhistory.net</Path> folder within <AutoText key="glidict::Tree.DownloadedFiles"/> and drag it across into the collection area.</Text>
    1428 </NumberedItem>
    1429 <NumberedItem>
    1430 <Text id="0422">Switch to the <AutoText key="glidict::GUI.Create"/> panel to <b>build</b> and <b>preview</b> the collection. It is smaller than the previous collection because we included only the <i>citizens</i> files. However, these now represent the latest versions of the documents.</Text>
    1431 </NumberedItem>
    1432 </Content>
    1433 </Tutorial>
    1434 <Tutorial id="web_linking">
    1435 <Title>
    1436 <Text id="0423">Pointing to documents on the web</Text>
    1437 </Title>
    1438 <Prerequisite id="downloading_from_internet"/>
    1439 <Version initial="2.60" current="2.70"/>
    1440 <Content>
    1441 <NumberedItem>
    1442 <Text id="0424">Open up your <b>webtudor</b> collection, and in the <AutoText key="glidict::GUI.Gather"/> panel inspect the files you dragged into it. The first folder is <Path>englishhistory.net</Path>, which opens up to reveal <Path>tudor</Path>, and so on. The files represent a complete sweep of the pages (and supporting images) that constitute the <i>Tudor citizens</i> section of the <i>englishhistory.net</i> web site. They were downloaded from the web in a way that preserved the structure of the original site. This allows any page's original URL to be reconstructed from the folder hierarchy.</Text>
    1443 </NumberedItem>
    1444 <NumberedItem>
    1445 <Text id="0425">In the <AutoText key="glidict::GUI.Design"/> panel, select the <AutoText key="glidict::CDM.GUI.Plugins"/> section, then select the <AutoText text="plugin HTMLPlug"/> line and click <AutoText key="glidict::CDM.PlugInManager.Configure" type="button"/>. A popup window appears. Locate the <AutoText text="file_is_url"/> option (about halfway down the first block of items) and switch it on. While you are there, switch off the <AutoText text="smart_block"/> option so that stray images are not processed. Click <AutoText key="glidict::General.OK" type="button"/>.</Text>
    1446 <Text id="0426">Switching on this option for <AutoText text="HTMLPlug"/> means that Greenstone sets an additional piece of metadata for each document called <AutoText text="URL"/>, which gives its original URL.</Text>
    1447 <Text id="0427">It is important that the files gathered in the collection start with the web domain name (<i>englishhistory.net</i> in this case). The conversion process will not work if you dragged over a subfolder, for example the <Path>tudor</Path> folder, because this will set <AutoText text="URL"/> metadata to something like</Text>
    1448 <Indent>
    1449 http://tudor/citizens/...
    1450 </Indent>
    1451 <Text id="0428">rather than</Text>
    1452 <Indent>
    1453 http://englishhistory.net/tudor/citizens/...
    1454 </Indent>
    1455 <Text id="0429">If you have copied over a subfolder previously, delete it and make a fresh copy. Drag the folder in the right-hand side of the <AutoText key="glidict::GUI.Gather"/> panel on to the trash can in the lower right corner. Then obtain a fresh copy of the files by dragging across the <Path>englishhistory.net</Path> folder from the <AutoText key="glidict::Tree.DownloadedFiles"/> folder on the left-hand side.</Text>
    1456 </NumberedItem>
    1457 <NumberedItem>
    1458 <Text id="0430">To make use of the new URL metadata, the icon link must be changed to serve up the original URL rather than the copy stored in the digital library. Go to the <AutoText key="glidict::GUI.Design"/> panel, select the <AutoText key="glidict::CDM.GUI.Formats"/> section and edit the <AutoText text="VList" /> format statement by replacing</Text>
    1459 <Format>[link][icon][/link]</Format>
    1460 <Text id="0431">with</Text>
    1461 <Format>[weblink][webicon][/weblink]</Format>
    1462 <Text id="0432">Click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/> to commit the change.</Text>
    1463 </NumberedItem>
    1464 <NumberedItem>
    1465 <Text id="0433">Switch to the <AutoText key="glidict::GUI.Create"/> panel and <b>build</b> and <b>preview</b> the collection. Note that the document icons have changed. The collection behaves exactly as before, except that when you click a document icon your web browser retrieves the original document from the web (assuming it is still there by the time you do this exercise!). If you are working offline you will be unable to retrieve the document.</Text>
     1328<NumberedItem>
     1329<Text id="0388">Invoke the Greenstone Librarian Interface (from the Windows <i>Start</i> menu) and start a new collection called <b>tudor</b> (use the <AutoText key="glidict::Menu.File"/> menu). Fill out the pop-up dialog with appropriate values and leave <b>Dublin Core</b>, which is selected by default, as the metadata set.</Text>
     1330</NumberedItem>
     1331<NumberedItem>
     1332<Text id="0389">In the <AutoText key="glidict::GUI.Gather"/> panel, open the <Path>tudor</Path> folder in <Path>sample_files</Path>.</Text>
     1333</NumberedItem>
     1334<NumberedItem>
     1335<Text id="0390">Drag <Path>englishhistory.net</Path> from the left-hand side to the right to include it in your <b>tudor</b> collection.</Text>
     1336</NumberedItem>
     1337<NumberedItem>
     1338<Text id="0391">Switch to the <AutoText key="glidict::GUI.Create"/> panel and click <AutoText key="glidict::CreatePane.Build_Collection" type="button"/>.</Text>
     1339</NumberedItem>
     1340<NumberedItem>
     1341<Text id="0392">When building has finished, <b>preview</b> the collection.</Text>
     1342</NumberedItem>
     1343<Heading>
     1344<Text id="0392a">Extracting more metadata from the HTML</Text>
     1345</Heading>
     1346<NumberedItem>
     1347<Text id="0393">The browsing facilities in this collection (<AutoText key="coredm::_Global:labelTitle_" type="italics"/> and <AutoText key="coredm::_Global:labelSource_" type="italics"/>) are based entirely on extracted metadata. Return to the <AutoText key="glidict::GUI.Enrich"/> panel in the Librarian Interface and examine the metadata that has been extracted for some of the files.</Text>
     1348</NumberedItem>
     1349<NumberedItem>
     1350<Text id="0393a">Many HTML documents contain metadata in <Format>&lt;meta&gt;</Format> tags in the <Format>&lt;head&gt;</Format> of the page. Open up the <Path>englishhistory.net &rarr; tudor &rarr; monarchs &rarr; boleyn.html</Path> file by navigating to it in the tree on the left-hand side and double-clicking it. This will open it in a web browser. View the HTML source of the page (<Menu>View &rarr; Source</Menu> in Internet Explorer, <Menu>View &rarr; Page Source</Menu> in Mozilla). You will notice that this page has <AutoText text="page_topic,content" type="italics"/> and <AutoText text="author" type="italics"/> metadata.</Text>
     1351 </NumberedItem>
     1352<NumberedItem>
     1353<Text id="0393b">By default, <AutoText text="HTMLPlug"/> only looks for Title metadata. Configure the plugin so that it looks for the other metadata too. Switch to the <AutoText key="glidict::GUI.Design"/> panel and select the <AutoText key="glidict::CDM.GUI.Plugins"/> section. Select the <AutoText text="plugin HTMLPlug"/> line and click <AutoText key="glidict::CDM.PlugInManager.Configure" type="button"/>. A popup window appears. Switch on the <AutoText text="metadata_fields"/> option, and set the value to</Text>
     1354<Format>
     1355Title,Author,Page_topic,Content
     1356</Format>
     1357<Text id="0393b-1">Make sure that you have copied this exactly, with no spaces. Click <AutoText key="glidict::General.OK" type="button"/>.</Text>
     1358</NumberedItem>
     1359<NumberedItem>
     1360<Text id="0393c">Switch to the <AutoText key="glidict::GUI.Create"/> panel and <b>rebuild</b> the collection. Go back to the <AutoText key="glidict::GUI.Enrich"/> panel and look at the extracted metadata for some of the HTML files in <Path>englishhistory.net &rarr; tudor &rarr; monarchs</Path>. The new metadata should now be visible.</Text>
     1361</NumberedItem>
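The effect of the metadata_fields option can be pictured with a small sketch. This is not Greenstone's own code (HTMLPlug is part of Greenstone itself); it is a hypothetical Python illustration of pulling named meta tags out of a page. The function name, class name and sample page are assumptions made up for the example, and the sketch looks only at meta tags, not at the page's title element.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect <meta name=... content=...> pairs for a chosen set of field names."""

    def __init__(self, wanted_fields):
        super().__init__()
        # Match names case-insensitively, but report them as the user typed them.
        self.wanted = {name.lower(): name for name in wanted_fields}
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in self.wanted and attributes.get("content") is not None:
            self.metadata[self.wanted[name]] = attributes["content"]


def extract_metadata(html_text, metadata_fields):
    """metadata_fields is a comma-separated string, as typed into the option box."""
    parser = MetaExtractor(metadata_fields.split(","))
    parser.feed(html_text)
    return parser.metadata


# Hypothetical page content, invented for this illustration.
sample_page = """<html><head>
<title>Anne Boleyn</title>
<meta name="author" content="Example Author">
<meta name="page_topic" content="Tudor monarchs">
</head><body></body></html>"""

print(extract_metadata(sample_page, "Title,Author,Page_topic,Content"))
```

In this sketch a stray space in the field list (for example "Title, Author") makes the second name " Author", which matches nothing. That is one way to see why the list should be typed with no spaces.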
     1362<Heading>
     1363<Text id="0393d">Blocking the stray images</Text>
     1364</Heading>
     1365<Comment>
     1366<Text id="0394">You've probably noticed that the collection contains a few stray image files, as well as the HTML documents. This is a mistake. The issue is that many of the HTML documents include images, and although Greenstone attempts to determine which images belong to HTML pages and only considers other images for inclusion in the collection, in this case it hasn't been completely successful. (This is because the web site from which these files were downloaded occasionally departs from the usual convention of hierarchical structuring.)</Text>
     1367</Comment>
     1368<NumberedItem>
     1369<Text id="0395">Switch back to the <AutoText key="glidict::CDM.GUI.Plugins"/> section of the <AutoText key="glidict::GUI.Design"/> panel. Beside <AutoText text="plugin HTMLPlug"/> you will see <AutoText text="-smart_block"/>. This is the option that attempts to identify images in the HTML pages and block them from inclusion&mdash;in this case, it's not smart enough! <b>Configure</b> <AutoText text="plugin HTMLPlug"/> again, scroll down the page to locate the <AutoText text="smart_block"/> option, and switch it off.</Text>
     1370</NumberedItem>
     1371<NumberedItem>
     1372<Text id="0396"><b>Rebuild</b> and <b>preview</b> the collection. The collection is exactly as before except that these stray images are suppressed. What is happening is that plug-ins operate as a pipeline: files are passed to each one in turn until one is found that can process it. By default (i.e. without <AutoText text="smart_block"/>) the HTML plug-in blocks <i>all</i> images, which is appropriate for this collection.</Text>
     1373</NumberedItem>
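The pipeline behaviour described above can be sketched in a few lines. This is a simplified Python illustration under stated assumptions, not Greenstone's implementation: the two plug-in functions, the extension lists and the return values are all hypothetical, and the HTML plug-in is modelled in its default mode, where it blocks every image.

```python
IMAGE_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png")


def html_plugin(filename):
    """Hypothetical HTML plug-in in its default mode (no smart_block)."""
    lower = filename.lower()
    if lower.endswith((".html", ".htm")):
        return "processed"
    if lower.endswith(IMAGE_EXTENSIONS):
        # Claim the file without processing it, so no later plug-in sees it.
        return "blocked"
    return None  # not our file type: pass it down the pipeline


def image_plugin(filename):
    """Hypothetical image plug-in that accepts any image it is offered."""
    if filename.lower().endswith(IMAGE_EXTENSIONS):
        return "processed"
    return None


def run_pipeline(filename, plugins):
    """Offer the file to each plug-in in turn until one claims it."""
    for plugin in plugins:
        outcome = plugin(filename)
        if outcome is not None:
            return plugin.__name__, outcome
    return None, "unprocessed"


for name in ("boleyn.html", "crest.gif", "notes.txt"):
    print(name, run_pipeline(name, [html_plugin, image_plugin]))
```

Because the HTML plug-in comes first and claims every image, the image plug-in never receives one, and so no stray images reach the collection.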
     1374<Heading>
     1375<Text id="0397">Looking at different views of the files in the <AutoText key="glidict::GUI.Gather"/> and <AutoText key="glidict::GUI.Enrich"/> panels</Text>
     1376</Heading>
     1377<NumberedItem>
     1378<Text id="0398">Switch to the <AutoText key="glidict::GUI.Gather"/> panel and in the right-hand side open <Path>englishhistory.net &rarr; tudor</Path>.</Text>
     1379</NumberedItem>
     1380<NumberedItem>
     1381<Text id="0400">Change the <AutoText key="glidict::Filter.Filter_Tree"/> menu for the right-hand side from <AutoText key="glidict::Filter.All_Files"/> to <AutoText key="glidict::Filter.0"/>. Notice the files displayed above are filtered accordingly, to show only files of this type.</Text>
     1382</NumberedItem>
     1383<NumberedItem>
     1384<Text id="0401">Change the <AutoText key="glidict::Filter.Filter_Tree"/> menu to <AutoText key="glidict::Filter.3"/>. Again, the files shown above alter.</Text>
     1385</NumberedItem>
     1386<NumberedItem>
     1387<Text id="0402">Now return the <AutoText key="glidict::Filter.Filter_Tree"/> setting back to <AutoText key="glidict::Filter.All_Files"/>, otherwise you may get confused later. Remember, if the <AutoText key="glidict::GUI.Gather"/> or <AutoText key="glidict::GUI.Enrich"/> panels do not seem to be showing all your files, this could be the problem.</Text>
    14661388</NumberedItem>
    14671389</Content>
     
    14721394</Title>
    14731395<Prerequisite id="large_html_collection"/>
    1474 <Version initial="2.60" current="2.70"/>
     1396<Version initial="2.60" current="2.70w"/>
    14751397<Content>
    14761398<Comment>
     
    15831505</Title>
    15841506<Prerequisite id="large_html_collection"/>
    1585 <Version initial="2.60" current="2.70"/>
     1507<Version initial="2.60" current="2.70w"/>
    15861508<Content>
    15871509<NumberedItem>
     
    15961518<Text id="0469">This displays something that looks like this: </Text>
    15971519<Indent>
    1598 <table><tr><td><img width='15' height='20' src="tutorial_files/itext.gif"/></td><td width='408' valign='top'>A discussion of question five from Tudor Quiz: Henry VIII <br/><i>(quizstuff.html)</i></td></tr></table>
     1520<table><tr><td><img width='15' height='20' src="../tutorial_files/itext.gif"/></td><td width='408' valign='top'>A discussion of question five from Tudor Quiz: Henry VIII <br/><i>(quizstuff.html)</i></td></tr></table>
    15991521</Indent>
    16001522<Text id="0472">for a particular document whose <i>Title</i> metadata is <AutoText text="A discussion of question five from Tudor Quiz: Henry VIII"/> and whose <i>Source</i> metadata is <AutoText text="quizstuff.html"/>.</Text>
     
    16151537<Text id="0476"><b>Preview</b> the result (you don't need to build the collection, because changes to format statements take effect immediately). Look at some search results and at the <AutoText key="coredm::_Global:labelTitle_"/> list. They are just the same as before! Under most circumstances this far simpler format statement is entirely equivalent to Greenstone's more complex default. </Text>
    16161538<Comment>
    1617 <Text id="0478">But there's a problem. Beside the bookshelves in the <AutoText key="coredm::_Global:labelSubject_"/> browser, beneath the subject appears a mysterious "()". What is printed on these bookshelf nodes is governed by the same format statement, and though bookshelf nodes of the hierarchy have associated <i>Title</i> metadata&mdash;their title is the name of the metadata value associated with that bookshelf&mdash;they do not have <AutoText key="metadata::ex.Source"/> metadata, so it comes out blank.</Text>
     1539<Text id="0478">But there's a problem. Beside the bookshelves in the <AutoText key="coredm::_Global:labelSubject_"/> browser, beneath the subject appears a mysterious "()". What is printed for these bookshelves is governed by the same format statement, and though bookshelf nodes of the hierarchy have associated <i>Title</i> metadata&mdash;their title is the name of the metadata value associated with that bookshelf&mdash;they do not have <AutoText key="metadata::ex.Source"/> metadata, so it comes out blank.</Text>
    16181540</Comment>
    16191541</NumberedItem>
     
    16491571<NumberedItem>
    16501572<Text id="0490">Now go to the <AutoText key="glidict::GUI.Create"/> panel and click <AutoText key="glidict::CreatePane.Preview_Collection" type="button"/>. Documents in the search results list will be displayed like this:</Text>
    1651 <table><tr><td><img width='15' height='20' src="tutorial_files/itext.gif" /></td><td width='408' valign='top'>A discussion of question five from Tudor Quiz: Henry VIII <br/>
     1573<table><tr><td><img width='15' height='20' src="../tutorial_files/itext.gif" /></td><td width='408' valign='top'>A discussion of question five from Tudor Quiz: Henry VIII <br/>
    16521574Tudor period|Others</td></tr></table>
    16531575<Text id="0493">(The vertical bar appears because this <i>dc.Subject and Keywords</i> metadata is hierarchical metadata. Unfortunately there is no way to get at individual components of the hierarchy. For most metadata, such as title and author, this isn't a problem.)</Text>
     
    16711593</NumberedItem>
    16721594<NumberedItem>
    1673 <Text id="0498">Go to the <AutoText key="glidict::GUI.Create"/> panel, click <AutoText key="glidict::CreatePane.Preview_Collection" type="button"/>, and examine the subject hierarchy again to see the effect of your changes.</Text>
     1595<Text id="0498">Go to the <AutoText key="glidict::GUI.Create"/> panel, click <AutoText key="glidict::CreatePane.Preview_Collection" type="button"/>, and examine the subject hierarchy again to see the effect of your changes. Bookshelves should say <AutoText text="Bookshelf title:"/> and then the title, while documents will display <AutoText text="Title:"/> and the title. Note that the number of documents in the bookshelf is not displayed: we are using <Format>[numleafdocs]</Format> to test what kind of list item we are at, but we are not displaying it.</Text>
    16741596</NumberedItem>
    16751597</Content>
     
    17311653</Content>
    17321654</Tutorial>
     1655<Tutorial id="downloading_from_internet">
     1656<Title>
     1657<Text id="0411">Downloading files from the web</Text>
     1658</Title>
     1659<Version initial="2.60" current="2.70w"/>
     1660<Content>
     1661<Comment>
     1662<Text id="0412">The Greenstone Librarian Interface's Download panel allows you to download individual files, parts of websites, and indeed whole websites, from the web.</Text>
     1663</Comment>
     1664<NumberedItem>
     1665<Text id="0413">Start a new collection called <b>webtudor</b>, and base it on <AutoText key="glidict::NewCollectionPrompt.NewCollection"/>.</Text>
     1666</NumberedItem>
     1667<NumberedItem>
     1668<Text id="0414">In a web browser, visit <Link>http://englishhistory.net</Link>, follow the link to <i>Tudor England</i>, and click &lt;<b>Enter</b>&gt;. You should be at the URL</Text>
     1669<Link>http://englishhistory.net/tudor/contents.html</Link>
     1670<Text id="0415">This is where we started the downloading process to obtain the files you have been using for the <b>tudor</b> collection. You could do the same thing by copying this URL from the web browser, pasting it into the <AutoText key="glidict::GUI.Download"/> panel, and clicking the <AutoText key="glidict::Mirroring.Download" type="button"/> button. However, several megabytes will be downloaded, which might strain your network resources&mdash;or your patience! For a faster exercise we focus on a smaller section of the site. </Text>
     1671</NumberedItem>
     1672<NumberedItem>
     1673<Text id="0415a">In the <AutoText key="glidict::GUI.Download"/> panel, enter this URL</Text>
     1674<Link>http://englishhistory.net/tudor/citizens/</Link>
     1675<Text id="0417">into the <AutoText key="glidict::Mirroring.Source_URL"/> box. There are several options that govern how the download process proceeds. To copy just the <i>citizens</i> section of the website, select <AutoText key="glidict::Mirroring.Higher_Directories"/>. If you don't do this (or if you miss out the terminating "/"), the downloading process will follow links to other areas of the <i>englishhistory.net</i> website and grab those as well. Set <AutoText key="glidict::Mirroring.Download_Depth"/> to <AutoText key="glidict::Mirroring.Download_Depth.Unlimited"/>&mdash;we want to follow as many links as necessary to download all the pages.</Text>
     1676</NumberedItem>
     1677<NumberedItem>
     1678<Text id="0417a">If your computer is behind a firewall or proxy server, you will need to edit the proxy settings in the Librarian Interface. Open the <AutoText key="glidict::Preferences.Connection"/> tab in <Menu><AutoText key="glidict::Menu.File"/> &rarr; <AutoText key="glidict::Menu.File_Options"/></Menu> and switch on the <AutoText key="glidict::Preferences.Connection.Use_Proxy"/> checkbox. Enter the proxy server address and port number in the <AutoText key="glidict::Preferences.Connection.Proxy_Host"/> and <AutoText key="glidict::Preferences.Connection.Proxy_Port"/> boxes. Click <AutoText key="glidict::General.OK" type="button"/>.</Text>
     1679</NumberedItem>
     1680<NumberedItem>
     1681<Text id="0418">Now click <AutoText key="glidict::Mirroring.Download" type="button"/>. If you have set proxy information in <AutoText key="glidict::Menu.File_Options"/>, a popup will ask for your user name and password. Once the download has started, a progress bar appears in the lower half of the panel that reports on how the downloading process is doing.</Text>
     1682<Comment>
     1683<Text id="0419">More detailed information can be obtained by clicking <AutoText key="glidict::Mirroring.DownloadJob.Log" type="button"/>. The process can be paused and restarted as needed, or stopped altogether by clicking <AutoText key="glidict::Mirroring.DownloadJob.Close" type="button"/>. Downloading can be a lengthy process involving multiple sites, and so Greenstone allows additional downloads to be queued up. When new URLs are pasted into the <AutoText key="glidict::Mirroring.Source_URL"/> box and <AutoText key="glidict::Mirroring.Download" type="button"/> clicked, a new progress bar is appended to those already present in the lower half of the panel. When the currently active download item completes, the next is started automatically.</Text>
     1684</Comment>
     1685</NumberedItem>
     1686<NumberedItem>
     1687<Text id="0420">Downloaded files are stored in a top-level folder called <AutoText key="glidict::Tree.DownloadedFiles"/> that appears on the left-hand side of the <AutoText key="glidict::GUI.Gather"/> panel. You may not need all the downloaded files, and you choose which you want by dragging selected files from this folder over into the collection area on the right-hand side, just like we have done before when selecting data from the <Path>sample_files</Path> folder. In this example we will include everything that has been downloaded.</Text>
     1688<Text id="0421">Select the <Path>englishhistory.net</Path> folder within <AutoText key="glidict::Tree.DownloadedFiles"/> and drag it across into the collection area.</Text>
     1689</NumberedItem>
     1690<NumberedItem>
     1691<Text id="0422">Switch to the <AutoText key="glidict::GUI.Create"/> panel to <b>build</b> and <b>preview</b> the collection. It is smaller than the previous collection because we included only the <i>citizens</i> files. However, these now represent the latest versions of the documents.</Text>
     1692</NumberedItem>
     1693</Content>
     1694</Tutorial>
     1695<Tutorial id="web_linking">
     1696<Title>
     1697<Text id="0423">Pointing to documents on the web</Text>
     1698</Title>
     1699<Prerequisite id="downloading_from_internet"/>
     1700<Version initial="2.60" current="2.70w"/>
     1701<Content>
     1702<NumberedItem>
     1703<Text id="0424">Open up your <b>webtudor</b> collection, and in the <AutoText key="glidict::GUI.Gather"/> panel inspect the files you dragged into it. The first folder is <Path>englishhistory.net</Path>, which opens up to reveal <Path>tudor</Path>, and so on. The files represent a complete sweep of the pages (and supporting images) that constitute the <i>Tudor citizens</i> section of the <i>englishhistory.net</i> web site. They were downloaded from the web in a way that preserved the structure of the original site. This allows any page's original URL to be reconstructed from the folder hierarchy.</Text>
     1704</NumberedItem>
     1705<NumberedItem>
     1706<Text id="0425">In the <AutoText key="glidict::GUI.Design"/> panel, select the <AutoText key="glidict::CDM.GUI.Plugins"/> section, then select the <AutoText text="plugin HTMLPlug"/> line and click <AutoText key="glidict::CDM.PlugInManager.Configure" type="button"/>. A popup window appears. Locate the <AutoText text="file_is_url"/> option (about halfway down the first block of items) and switch it on. While you are there, switch off the <AutoText text="smart_block"/> option so that stray images are not processed. Click <AutoText key="glidict::General.OK" type="button"/>.</Text>
     1707<Text id="0426">Switching on this option for <AutoText text="HTMLPlug"/> means that Greenstone sets an additional piece of metadata for each document called <AutoText text="URL"/>, which gives its original URL.</Text>
     1708<Text id="0427">It is important that the files gathered in the collection start with the web domain name (<i>englishhistory.net</i> in this case). The conversion process will not work if you dragged over a subfolder, for example the <Path>tudor</Path> folder, because this will set <AutoText text="URL"/> metadata to something like</Text>
     1709<Indent>
     1710http://tudor/citizens/...
     1711</Indent>
     1712<Text id="0428">rather than</Text>
     1713<Indent>
     1714http://englishhistory.net/tudor/citizens/...
     1715</Indent>
     1716<Text id="0429">If you have copied over a subfolder previously, delete it and make a fresh copy. Drag the folder in the right-hand side of the <AutoText key="glidict::GUI.Gather"/> panel on to the trash can in the lower right corner. Then obtain a fresh copy of the files by dragging across the <Path>englishhistory.net</Path> folder from the <AutoText key="glidict::Tree.DownloadedFiles"/> folder on the left-hand side.</Text>
     1717</NumberedItem>
     1718<NumberedItem>
     1719<Text id="0430">To make use of the new URL metadata, the icon link must be changed to serve up the original URL rather than the copy stored in the digital library. Go to the <AutoText key="glidict::GUI.Design"/> panel, select the <AutoText key="glidict::CDM.GUI.Formats"/> section and edit the <AutoText text="VList" /> format statement by replacing</Text>
     1720<Format>[link][icon][/link]</Format>
     1721<Text id="0431">with</Text>
     1722<Format>[weblink][webicon][/weblink]</Format>
     1723<Text id="0432">Click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/> to commit the change.</Text>
     1724</NumberedItem>
     1725<NumberedItem>
     1726<Text id="0433">Switch to the <AutoText key="glidict::GUI.Create"/> panel and <b>build</b> and <b>preview</b> the collection. Note that the document icons have changed. The collection behaves exactly as before, except that when you click a document icon your web browser retrieves the original document from the web (assuming it is still there by the time you do this exercise!). If you are working offline you will be unable to retrieve the document.</Text>
     1727</NumberedItem>
     1728</Content>
     1729</Tutorial>
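The dependence of the <AutoText text="URL"/> metadata on the gathered folder hierarchy can be sketched in a few lines of Python. This is an illustration only, not Greenstone's own code: the helper name `url_from_gathered_path` and the file name `index.html` are hypothetical, and the sketch assumes the URL is simply `http://` followed by the gathered folder and file names joined with forward slashes.

```python
def url_from_gathered_path(path_parts):
    """Rebuild a URL from the folder/file names as gathered in the collection.

    Hypothetical helper: assumes the first element is taken to be the host name.
    """
    if not path_parts:
        raise ValueError("path_parts must not be empty")
    return "http://" + "/".join(path_parts)


# Gathered starting at the domain folder: the host name is correct.
print(url_from_gathered_path(["englishhistory.net", "tudor", "citizens", "index.html"]))

# Gathered starting at the 'tudor' subfolder: 'tudor' is mistaken for the host.
print(url_from_gathered_path(["tudor", "citizens", "index.html"]))
```

Under this assumption the second call yields a URL beginning `http://tudor/citizens/`, which is why the gathered files must start at the <Path>englishhistory.net</Path> folder.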
    17331730<Tutorial id="bibliography_collection">
    17341731<Title>
     
    17361733</Title>
    17371734<SampleFiles folder="marc"/>
    1738 <Version initial="2.60" current="2.70"/>
     1735<Version initial="2.60" current="2.70w"/>
    17391736<Content>
    17401737<Comment>
     
    19141911&lt;td valign=top&gt;&lt;b&gt;[ex.Photographer^all]&lt;/b&gt;&lt;br/&gt;[ex.Notes^all]&lt;/td&gt;
    19151912</Format>
     1913<Text id="is-11a">Click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>.</Text>
    19161914</NumberedItem>
    19171915<NumberedItem>
     
    19211919<Text id="is-13"><AutoText text="ISISPlug"/> stores a nicely formatted version of the record as the document text, and this is what is displayed when we view a record. Let's tidy it up a little more.</Text>
    19211919<NumberedItem>
    1922 <Text id="is-14">In the <AutoText key="glidict::CDM.GUI.Formats"/> section, remove the <AutoText key="coredm::_document:textDETACH_" type="italics"/> and <AutoText key="coredm::_document:textNOHIGHLIGHT_" type="italics"/> buttons by setting the <AutoText text="DocumentButtons"/> format statement to empty.</Text>
    1923 </NumberedItem>
    1924 <NumberedItem>
    1925 <Text id="is-15">Clear the <AutoText text="DocumentHeading"/> format statement to remove the <AutoText text="Untitled" type="quoted"/> at the top of the document.</Text>
     1920<Text id="is-14">In the <AutoText key="glidict::CDM.GUI.Formats"/> section, remove the <AutoText key="coredm::_document:textDETACH_" type="italics"/> and <AutoText key="coredm::_document:textNOHIGHLIGHT_" type="italics"/> buttons by setting the <AutoText text="DocumentButtons"/> format statement to empty, and clicking <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>.</Text>
     1921</NumberedItem>
     1922<NumberedItem>
     1923<Text id="is-15">Remove the <AutoText text="Untitled" type="quoted"/> at the top of the document by setting the <AutoText text="DocumentHeading"/> format statement to empty and clicking <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>.</Text>
    19261924</NumberedItem>
    19271925<NumberedItem>
     
    19351933}
    19361934</Format>
     1935<Text id="is-16a">Don't forget to click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>.</Text>
     1936
    19371937</NumberedItem>
    19381938<NumberedItem>
     
    19461946</Title>
    19471947<SampleFiles folder="custom"/>
    1948 <Version initial="2.70" current="2.70"/>
     1948<Version initial="2.70" current="2.70w"/>
    19491949<Content>
    19501950<Text id="mf-2">The appearance of all pages produced by Greenstone is governed by macro files, which reside in the folder <Path>Greenstone &rarr; macros</Path>, images, and CSS stylesheets, both of which reside in <Path>Greenstone &rarr; images</Path>. </Text>
     
    22012201</Title>
    22022202<SampleFiles folder="beatles"/>
    2203 <Version initial="2.60" current="2.70"/>
     2203<Version initial="2.60" current="2.70w"/>
    22042204<Content>
    22052205<NumberedItem>
     
    22372237<Prerequisite id="multimedia_collection_explore"/>
    22382238<SampleFiles folder="beatles"/>
    2239 <Version initial="2.60" current="2.70"/>
     2239<Version initial="2.60" current="2.70w"/>
    22402240<Content>
    22412241<Comment>
     
    22432243</Comment>
    22442244<NumberedItem>
    2245 <Text id="0552">Start a new collection (<Menu><AutoText key="glidict::Menu.File"/> &rarr; <AutoText key="glidict::Menu.File_New"/></Menu>) called <b>small_beatles</b>, basing it on the default "New Collection." (Basing it on the existing Advanced Beatles collection would make your life far easier, but we want you to learn how to build it from scratch!) Fill out the fields with appropriate information. Use the Dublin Core metadata set (set by default).</Text>
     2245<Text id="0552">Start a new collection (<Menu><AutoText key="glidict::Menu.File"/> &rarr; <AutoText key="glidict::Menu.File_New"/></Menu>) called <b>small beatles</b>, basing it on the default "New Collection." (Basing it on the existing Advanced Beatles collection would make your life far easier, but we want you to learn how to build it from scratch!) Fill out the fields with appropriate information. Use the Dublin Core metadata set (set by default).</Text>
    22462246</NumberedItem>
    22472247<NumberedItem>
     
    23102310<Text id="0575"><b>Build</b> the collection again and <b>preview</b> it.</Text>
    23112311</NumberedItem>
     2312<Comment>
     2313<Text id="0575a">Note how we assigned dc.Format metadata to all documents in the collection with a minimum of labour. We did this by capitalizing on the folder structure of the original information. Even though we complained earlier about how messy this folder structure is, you can still take advantage of it when assigning metadata.</Text>
     2314</Comment>
    23122315<Heading>
    23132316<Text id="0579">Suppressing dummy text</Text>
    23142317</Heading>
    23152318<NumberedItem>
    2316 <Text id="0580">Alongside the Audio files there is an MP3 icon, which plays the audio when you click it, and also a text document that contains some dummy text. This isn't supposed to be seen, but to suppress it you have to fiddle with a format statement.</Text>
     2319<Text id="0580">Alongside the Audio files there is an MP3 icon, which plays the audio when you click it, and also a text document that contains some dummy text. Image files also have dummy documents. These dummy documents aren't supposed to be seen, but to suppress them you have to fiddle with a format statement. </Text>
    23172320<BulletList>
    23182321<Bullet>
     
    23202323</Bullet>
    23212324<Bullet>
    2322 <Text id="0582">Ensure that <AutoText text="VList" /> is selected, and make the changes that are highlighted below. You need to insert three lines into the first line, and delete the second line.<br/> <br/> Change:</Text>
     2325<Text id="0582">Ensure that <AutoText text="VList" /> is selected, and make the changes that are highlighted below. You need to insert five lines into the first line, and delete the second line. (Note: the changes are also available in a text file; see below.)</Text>
     2326<Text id="0582a">Change:</Text>
    23232327<Format>
    23242328&lt;td valign=top&gt;<highlight>[link][icon][/link]</highlight>&lt;/td&gt;<br/>
     
    23322336&lt;td valign=top&gt;<br/>
    23332337<highlight>{If}{[dc.Format] eq 'Audio', </highlight><br/>
    2334 <highlight>&nbsp;&nbsp;[srclink][srcicon][/srclink], </highlight><br/>
    2335 <highlight>&nbsp;&nbsp;[link][icon][/link]}</highlight>&lt;/td&gt; <br/>
    2336 &lt;td valign=top&gt;[highlight] {Or}{[dls.Title],[dc.Title],[Title],Untitled} [/highlight]{If}{[ex.Source],&lt;br&gt;&lt;i&gt;([ex.Source])&lt;/i&gt;}&lt;/td&gt;
     2338<highlight>[srclink][srcicon][/srclink], </highlight><br/>
     2339<highlight>{If}{[dc.Format] eq 'Images',</highlight><br/>
     2340<highlight>[srclink][thumbicon][/srclink],</highlight><br/>
     2341<highlight>[link][icon][/link]}}</highlight>&lt;/td&gt; <br/>
     2342&lt;td valign=top&gt;[highlight]<br/>
     2343{Or}{[dls.Title],[dc.Title],[Title],Untitled}<br/>
     2344[/highlight]{If}{[ex.Source],&lt;br&gt;&lt;i&gt;([ex.Source])&lt;/i&gt;}&lt;/td&gt;
    23372345</Format>
    23382346</Bullet>
     
    23432351<Text id="0585">To make this easier for you we have prepared a plain text file that contains the new text. In WordPad open the following file:</Text>
    23442352<Path>sample_files &rarr; beatles &rarr; format_tweaks &rarr; audio_tweak.txt</Path>
    2345 <Text id="0586">(Be sure to use WordPad rather than Notepad, because Notepad does not display the line breaks correctly.) Place it in the copy buffer by highlighting the text in WordPad and selecting <Menu>Edit &rarr; Copy</Menu>. Now move back to the Librarian Interface, highlight all the text that makes up the current VList format statement, and use <Menu><AutoText key="glidict::Menu.Edit"/> &rarr; <AutoText key="glidict::Menu.Edit_Paste"/></Menu> to transform the old statement to the new one. Remember to press <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/> when finished.</Text>
     2353<Text id="0586">(Be sure to use WordPad rather than Notepad, because Notepad does not display the line breaks correctly.) Place it in the copy buffer by highlighting the text in WordPad and selecting <Menu>Edit &rarr; Copy</Menu>. Now move back to the Librarian Interface, highlight all the text that makes up the current <AutoText text="VList"/> format statement, and use <Menu><AutoText key="glidict::Menu.Edit"/> &rarr; <AutoText key="glidict::Menu.Edit_Paste"/></Menu> to transform the old statement to the new one. Remember to press <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/> when finished.</Text>
    23462354<Text id="0589"><b>Preview</b> the result. You may need to click the browser's &lt;<b>Reload</b>&gt; button to force it to re-load the page.</Text>
    23472355</NumberedItem>
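The nested <AutoText text="{If}"/> in the revised <AutoText text="VList"/> format statement amounts to a three-way choice on dc.Format. The following Python sketch mirrors that choice for illustration only; the function name `choose_icon_markup` is hypothetical and is not part of Greenstone, which evaluates the format statement itself.

```python
def choose_icon_markup(dc_format):
    """Mirror the nested {If} of the VList format statement (illustrative only)."""
    if dc_format == "Audio":
        # Audio: link straight to the source file with its source icon.
        return "[srclink][srcicon][/srclink]"
    if dc_format == "Images":
        # Images: link to the source file, showing the thumbnail.
        return "[srclink][thumbicon][/srclink]"
    # Anything else: the ordinary document link and icon.
    return "[link][icon][/link]"


for value in ("Audio", "Images", "Text"):
    print(value, "->", choose_icon_markup(value))
```

The second test sits in the "else" branch of the first, which is why the format statement ends with two closing curly brackets.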
     
    23542362&lt;td valign=top&gt;<br/>
    23552363{If}{[dc.Format] eq 'Audio',<br/>
    2356 &nbsp;&nbsp;[srclink][srcicon][/srclink],<br/>
    2357 &nbsp;&nbsp;[link][icon][/link]}&lt;/td&gt; <br/>
    2358 &lt;td valign=top&gt;[highlight] {Or}{[dls.Title],[dc.Title],[Title],Untitled} [/highlight]<highlight>{If}{[ex.Source],&lt;br&gt;&lt;i&gt;([ex.Source])&lt;/i&gt;}</highlight>&lt;/td&gt;</Format>
     2364[srclink][srcicon][/srclink],<br/>
     2365{If}{[dc.Format] eq 'Images',<br/>
     2366[srclink][thumbicon][/srclink],<br/>
     2367[link][icon][/link]}}&lt;/td&gt; <br/>
     2368&lt;td valign=top&gt;[highlight]<br/>
     2369{Or}{[dls.Title],[dc.Title],[Title],Untitled}<br/>
     2370[/highlight]<highlight>{If}{[ex.Source],&lt;br&gt;&lt;i&gt;([ex.Source])&lt;/i&gt;}</highlight>&lt;/td&gt;</Format>
    23592371</Bullet>
    23602372</BulletList>
    2361 <Text id="0595">Don't forget to click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/> after all this work! <b>Preview</b> the result (you don't need to build the collection.)</Text>
     2373<Text id="0595">Don't forget to click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/> after all this work! <b>Preview</b> the result (you don't need to rebuild the collection.)</Text>
    23622374</NumberedItem>
    23632375<Heading>
     
    23892401</Heading>
    23902402<NumberedItem>
    2391 <Text id="0606">Make the bookshelves show how many documents they contain by inserting a line in the <AutoText text="VList"/> format statement in the <AutoText key="glidict::CDM.GUI.Formats"/> section of the <AutoText key="glidict::GUI.Design"/> panel:</Text>
     2403<Text id="0606">Make the bookshelves show how many documents they contain by inserting a line in the <AutoText text="VList"/> format statement in the <AutoText key="glidict::CDM.GUI.Formats"/> section of the <AutoText key="glidict::GUI.Design"/> panel. The added line is shown highlighted below. The complete format statement can be copied from <Path>sample_files &rarr; beatles &rarr; format_tweaks &rarr; show_num_docs.txt</Path>.</Text>
    23922404<Format>
    23932405&lt;td valign=top&gt;<br/>
    23942406{If}{[dc.Format] eq 'Audio',<br/>
    2395 &nbsp;&nbsp;[srclink][srcicon][/srclink],<br/>
    2396 &nbsp;&nbsp;[link][icon][/link]}&lt;/td&gt;<br/>
     2407[srclink][srcicon][/srclink],<br/>
     2408{If}{[dc.Format] eq 'Images',<br/>
     2409[srclink][thumbicon][/srclink],<br/>
     2410[link][icon][/link]}}&lt;/td&gt;<br/>
    23972411<highlight>&lt;td&gt;{If}{[numleafdocs],([numleafdocs])}&lt;/td&gt;</highlight><br/>
    2398 &lt;td valign=top&gt;[highlight] {Or}{[dls.Title],[dc.Title],[Title],Untitled} [/highlight]&lt;/td&gt;</Format>
    2399 <Text id="0607">You will find this text in <Path>format_tweaks &rarr; show_num_docs.txt</Path>, which can be copied and pasted in as before. Don't forget to click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>.</Text>
    2400 <Text id="0609"><b>Preview</b> the result (you don't need to build the collection.)</Text>
    2401 </NumberedItem>
    2402 <NumberedItem>
    2403 <Text id="0610">Now turn to the images. Dummy documents are displayed here too. To suppress these dummy documents, change the <AutoText text="VList" /> format statement in the <AutoText key="glidict::CDM.GUI.Formats"/> section of the <AutoText key="glidict::GUI.Design"/> panel again by adding the two highlighted lines, and the close curly bracket:</Text>
    2404 <Format>&lt;td valign=top&gt;<br/>
    2405 {If}{[dc.Format] eq 'Audio',<br/>
    2406 &nbsp;&nbsp;[srclink][srcicon][/srclink],<br/>
    2407 &nbsp;&nbsp;<highlight>{If}{[dc.Format] eq 'Images',</highlight><br/>
    2408 &nbsp;&nbsp;&nbsp;&nbsp;<highlight>[srclink][thumbicon][/srclink],</highlight><br/>
    2409 &nbsp;&nbsp;&nbsp;&nbsp;[link][icon][/link]}<highlight>}</highlight>&lt;/td&gt;<br/>
    2410 &lt;td&gt;{If}{[numleafdocs],([numleafdocs])}&lt;/td&gt;<br/>
    2411 &lt;td valign=top&gt;[highlight] {Or}{[dls.Title],[dc.Title],[Title],Untitled} [/highlight]&lt;/td&gt;</Format>
    2412 </NumberedItem>
     2412&lt;td valign=top&gt;[highlight]<br/>
     2413{Or}{[dls.Title],[dc.Title],[Title],Untitled}<br/>
     2414[/highlight]&lt;/td&gt;</Format>
     2415<Text id="0607">Don't forget to click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>.</Text>
     2416<Text id="0609"><b>Preview</b> the result (you don't need to build the collection.) Bookshelves in the titles and browse classifiers should show how many documents they contain.</Text>
     2417</NumberedItem>
     2418<Heading>
     2419<Text id="0612a">Adding a Phind phrase browser</Text>
     2420</Heading>
    24132421<NumberedItem>
    24142422<Text id="0612">In the <AutoText key="glidict::CDM.GUI.Classifiers"/> section on the <AutoText key="glidict::GUI.Design"/> panel, add a <AutoText text="Phind"/> classifier. Leave the settings at their defaults: this generates a phrase browsing classifier that sources its phrases from <i>Title</i> and <i>text</i>.</Text>
    2415 </NumberedItem>
    2416 <NumberedItem>
    2417 <Text id="0613">To complete the collection, use the browse button of <AutoText key="glidict::CDM.General.Icon_Collection"/> in the <AutoText key="glidict::CDM.GUI.General"/> section of the <AutoText key="glidict::GUI.Design"/> panel to select the following image:</Text>
    2418 <Path>advbeat_large &rarr; images &rarr; beatlesmm.png</Path>
    2419 <Text id="0616"><b>Build</b> the collection again and <b>preview</b> it.</Text>
    2420 </NumberedItem>
    2421 <Comment>
    2422 <Text id="0617">Note how we assigned dc.Format metadata to all documents in the collection with a minimum of labour. We did this by capitalizing on the folder structure of the original information. Even though we complained earlier about how messy this folder structure is, you can still take advantage of it when assigning metadata.</Text>
    2423 </Comment>
     2423<Text id="0612b"><b>Build</b> the collection again and <b>preview</b> it. Select the new "phrases" option from the navigation bar. Enter a single word in the text box, such as <AutoText text="band" type="quotes"/>. The phrase browser will present you with phrases found in the collection containing the search term. This can provide a useful way of browsing a very large collection. Note that even though it is called a phrase browser, only single terms can be used as the starting point for browsing.</Text>
     2424</NumberedItem>
     2425<Heading>
     2426<Text id="0612a">Branding the collection with an image</Text>
     2427</Heading>
     2428<NumberedItem>
     2429<Text id="0613">To complete the collection, let's give it a new image for the top left corner of the page. Go to the <AutoText key="glidict::CDM.GUI.General"/> section of the <AutoText key="glidict::GUI.Design"/> panel. Use the browse button of <AutoText key="glidict::CDM.General.Icon_Collection"/> to select the following image:</Text>
     2430<Path>sample_files &rarr; beatles &rarr; advbeat_large &rarr; images &rarr; beatlesmm.png</Path>
     2431<Text id="0613a">Preview the collection, and make sure the new image appears.</Text>
     2432</NumberedItem>
    24242433<Heading>
    24252434<Text id="0623">Using <AutoText text="UnknownPlug"/></Text>
     
    24972506</NumberedItem>
    24982507<NumberedItem>
    2499 <Text id="0646">Copy the <Path>images</Path> and <Path>macros</Path> folders located there into your collection's top-level folder. (It's OK to overwrite the existing <Path>images</Path> folder: the image in it is included in the folder being copied.) The <Path>images</Path> folder includes some useful icons, and the <Path>macros</Path> folder defines some macro names that use these images. To see the macro definitions, take a look by using a text editor to open the file <Path>extra.dm</Path> in the <Path>macros</Path> folder.</Text>
     2508<Text id="0645a">Open up another file browser, and locate the small beatles collection in your Greenstone installation:</Text>
     2509<Path>greenstone &rarr; collect &rarr; smallbea</Path>
     2510<Text id="0645b"><AutoText text="smallbea"/> is the folder name generated by Greenstone for this collection. You can determine what the folder name is for a collection by looking at the title bar of the Librarian Interface: the folder name is displayed in brackets after the collection name.</Text>
     2511</NumberedItem>
     2512<NumberedItem>
     2513<Text id="0646">Using the file browser, copy the <Path>images</Path> and <Path>macros</Path> folders from the <Path>advbeat_large</Path> folder into the <Path>smallbea</Path> folder. (It's OK to overwrite the existing <Path>images</Path> folder: the image in it is included in the folder being copied.) The <Path>images</Path> folder includes some useful icons, and the <Path>macros</Path> folder defines some macro names that use these images. To see the macro definitions, take a look by using a text editor to open the file <Path>extra.dm</Path> in the <Path>macros</Path> folder.</Text>
    25002514</NumberedItem>
    25012515<Heading>
     
    25332547</Heading>
    25342548<NumberedItem>
    2535 <Text id="0653">Open your collection's <Path>macros</Path> folder and locate the <Path>extra.dm</Path> file within it. <b>Right-click</b> on it. If prompted, select <b>WordPad</b> as the application to open it with.</Text>
    2536 </NumberedItem>
    2537 <NumberedItem>
    2538 <Text id="0654">The file content is fairly brief, specifying only what needs to be overridden from the default behaviour for this collection. In WordPad, near the top of the file you should see:</Text>
     2549<Text id="0653">Open your collection's <Path>macros</Path> folder and locate the <Path>extra.dm</Path> file within it. <b>Open</b> it in a text editor, e.g. WordPad.</Text>
     2550</NumberedItem>
     2551<NumberedItem>
     2552<Text id="0654">The file content is fairly brief, specifying only what needs to be overridden from the default behaviour for this collection. Near the top of the file you should see:</Text>
    25392553<Format>
    25402554_collectionspecificstyle_ {<br/>
     
    25452559}
    25462560</Format>
    2547 <Text id="0655">Use copy and paste on these lines to make this part of the file look like:</Text>
    2548 <Format>
    2549 # Original statements<br/>
    2550 #_collectionspecificstyle_ {<br/>
    2551 #&lt;style&gt;<br/>
    2552 #body.bgimage \{ background-image: url("_httpcimages_/beat_margin.gif");  \}<br/>
    2553 #\#page \{ margin-left: 120px; \} <br/>
    2554 #&lt;/style&gt;<br/>
    2555 #}<br/>
    2556 <br/>
    2557 _collectionspecificstyle_ {<br/>
    2558 &lt;style&gt;<br/>
    2559 body.bgimage \{ background-image: url("_httpcimages_/tile.jpg");  \}<br/>
    2560 &lt;/style&gt;<br/>
    2561 }
    2562 </Format>
    2563 <Text id="0656">A hash (#) at the start of line signals a comment, and Greenstone ignores the following text. We use this to comment out the original statements and replace them with modified lines. It is useful to retain the original version in case we need to restore the original lines at a later date. These lines relate to the background image used. The new image <Path>tile.jpg</Path> was also in the <Path>images</Path> folder that was copied across previously.</Text>
    2564 </NumberedItem>
    2565 <NumberedItem>
    2566 <Text id="0657">Within <b>WordPad</b>, save <i>extra.dm</i>.</Text>
     2561<Text id="0655">Replace the text <AutoText text="beat_margin.gif" type="quotes"/> with <AutoText text="tile.jpg" type="quotes"/>. Save the file. </Text>
     2562<Text id="0656">This line relates to the background image used. The new image <Path>tile.jpg</Path> was in the <Path>images</Path> folder that was copied across previously.</Text>
    25672563</NumberedItem>
    25682564<NumberedItem>
     
    25702566<Text id="0659">Other features can be altered by editing the macro files&mdash;for example, the headers and footers used on each page, and the highlighting style used for search terms (specify a different colour, use bold etc.).</Text>
    25712567</NumberedItem>
    2572 <NumberedItem>
    2573 <Text id="0660">If you want to you can reverse the most recent change you made by commenting out the new lines added (add #) and uncommenting the original lines (delete # character). Remember to save the file. To undo all the customized changes made, delete the content of the <Path>macros</Path> and <Path>images</Path> folders.</Text>
    2574 </NumberedItem>
    25752568<Heading>
    25762569<Text id="0661">Building a full-size version of the collection</Text>
     
    25832576</Bullet>
    25842577<Bullet>
    2585 <Text id="0664">Start a new collection called <i>advbeat</i> (<Menu><AutoText key="glidict::Menu.File"/> &rarr; <AutoText key="glidict::Menu.File_New"/></Menu>).</Text>
    2586 </Bullet>
    2587 <Bullet>
    2588 <Text id="0665">Base this new collection on <i>small_beatles</i>.</Text>
     2578<Text id="0664">Start a new collection called <i>large beatles</i> (<Menu><AutoText key="glidict::Menu.File"/> &rarr; <AutoText key="glidict::Menu.File_New"/></Menu>).</Text>
     2579</Bullet>
     2580<Bullet>
     2581<Text id="0665">Base this new collection on <i>small beatles</i>.</Text>
    25892582</Bullet>
    25902583<Bullet>
     
    25922585</Bullet>
    25932586<Bullet>
    2594 <Text id="0670"><b>Build</b> the collection and preview the result. (If you want the collection to have an icon, you will have to add it from the <AutoText key="glidict::GUI.Design"/> panel.)</Text>
     2587<Text id="0670"><b>Build</b> the collection and <b>preview</b> the result. (If you want the collection to have an icon, you will have to add it from the <AutoText key="glidict::GUI.Design"/> panel.)</Text>
    25952588</Bullet>
    25962589</BulletList>
     
    26122605</Title>
    26132606<SampleFiles folder="niupepa"/>
    2614 <Version initial="2.60" current="2.70"/>
     2607<Version initial="2.60" current="2.70w"/>
    26152608<Content>
    26162609<Comment>
     
    26332626</NumberedItem>
    26342627<NumberedItem>
    2635 <Text id="0681">Now go to the <AutoText key="glidict::GUI.Create"/> panel, <b>build</b> the collection and <b>preview</b> the result. Search for <AutoText text="waka" type="quoted"/> and view one of the titles listed (all three appear as <AutoText text="Te Whetu o Te Tau" type="italics"/>). Browse by <AutoText key="coredm::_Global:labelTitle_"/> and view one of the <AutoText text="Te Waka o Te Iwi" type="italics"/> newspapers.</Text>
     2628<Text id="0681">Now go to the <AutoText key="glidict::GUI.Create"/> panel, <b>build</b> the collection and <b>preview</b> the result. Search for <AutoText text="waka" type="quoted"/> and view one of the titles listed (all three appear as <AutoText text="Te Whetu o Te Tau" type="italics"/>). Browse by <AutoText key="coredm::_Global:labelTitle_"/> and view one of the <AutoText text="Te Waka o Te Iwi" type="italics"/> newspapers. Note that only the <AutoText text="Te Whetu o Te Tau" type="italics"/> newspapers have text; <AutoText text="Te Waka o Te Iwi" type="italics"/> papers don't.</Text>
    26362629</NumberedItem>
    26372630<Comment>
     
    26512644</NumberedItem>
    26522645<NumberedItem>
    2653 <Text id="0687">In the <AutoText key="glidict::CDM.GUI.Formats"/> section, select the <AutoText key="metadata::ex.Title"/> classifier in the <AutoText key="glidict::CDM.FormatManager.Feature"/> list, and <AutoText text="VList"/> in the <AutoText key="glidict::CDM.FormatManager.Part"/> list. Delete the contents of the <AutoText key="glidict::CDM.FormatManager.Editor"/> box, and add the following:</Text>
     2646<Text id="0687">In the <AutoText key="glidict::CDM.GUI.Formats"/> section, select the <AutoText key="metadata::ex.Title"/> classifier in the <AutoText key="glidict::CDM.FormatManager.Feature"/> list, and <AutoText text="VList"/> in the <AutoText key="glidict::CDM.FormatManager.Part"/> list. Delete the contents of the <AutoText key="glidict::CDM.FormatManager.Editor"/> box, and add the following text. (This format statement can be copied and pasted from the file <Path>sample_files &rarr; niupepa &rarr; formats &rarr; titles_tweak.txt</Path>.)</Text>
    26542647<Format>
    26552648&lt;td valign="top"&gt;[link][icon][/link]&lt;/td&gt;<br/>
     
    26622655</Format>
    26632656<Text id="0687a">Click <AutoText key="glidict::CDM.FormatManager.Add" type="button"/>.</Text>
    2664 <Text id="0687b">(This format statement can be copied and pasted from the file <Path>sample_files &rarr; niupepa &rarr; formats &rarr; titles_tweak.txt</Path>)</Text>
    26652657</NumberedItem>
    26662658<NumberedItem>
     
    26752667</Comment>
    26762668<NumberedItem>
    2677 <Text id="0696">In the <AutoText key="glidict::CDM.GUI.Formats"/> section of the <AutoText key="glidict::GUI.Design"/> panel, select the <AutoText text="DocumentText"/> format statement. The default format string displays the document's plain text, which, if there is none, is set to <AutoText key="perlmodules::BasPlug.dummy_text" type="quoted"/>. Change this to:</Text>
    2678 <Format>
    2679 &lt;center&gt;&lt;table&gt;&lt;tr&gt;<br/> 
    2680 &nbsp;&nbsp;&lt;td valign=top&gt;[srclink][screenicon][/srclink]&lt;/td&gt;<br/> 
    2681 &nbsp;&nbsp;&lt;td valign=top&gt;[Text]&lt;/td&gt;<br/> 
    2682 &lt;/tr&gt;&lt;/table&gt;&lt;/center&gt;
     2669<Text id="0696">In the <AutoText key="glidict::CDM.GUI.Formats"/> section of the <AutoText key="glidict::GUI.Design"/> panel, select the <AutoText text="DocumentText"/> format statement. The default format string displays the document's plain text, which, if there is none, is set to <AutoText key="perlmodules::BasPlug.dummy_text" type="quoted"/>. Change this to the following text. (This format statement can be copied and pasted from the file <Path>sample_files &rarr; niupepa &rarr; formats &rarr; doc_tweak.txt</Path>.)</Text>
     2670<Format>
     2671&lt;table&gt;&lt;tr&gt;<br/> 
     2672&lt;td valign=top&gt;[srclink][screenicon][/srclink]&lt;/td&gt;<br/> 
     2673&lt;td valign=top&gt;[Text]&lt;/td&gt;<br/> 
     2674&lt;/tr&gt;&lt;/table&gt;
    26832675</Format>
    26842676<Text id="0696a">and click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>.</Text>
    2685 <Text id="0697">(This format statement can be copied and pasted from the file <Path>sample_files &rarr; niupepa &rarr; formats &rarr; doc_tweak.txt</Path>)</Text>
    26862677<Comment>
    26872678<Text id="0698">Including <Format>[screenicon]</Format> has the effect of embedding the screen-sized image generated by switching the <AutoText text="screenview"/> option on in <AutoText text="PagedImgPlug"/>. It is hyperlinked to the original image by the construct <Format>[srclink]...[/srclink]</Format>.</Text>
     
    27282719<Text id="0690h-1">In the <AutoText key="glidict::CDM.GUI.Formats"/> section of the <AutoText key="glidict::GUI.Design"/> panel, select <AutoText text="Search"/> in <AutoText key="glidict::CDM.FormatManager.Feature"/>, and <AutoText text="VList"/> in <AutoText key="glidict::CDM.FormatManager.Part"/>. The previous changes modified <AutoText text="VList"/>, so they will apply to all <AutoText text="VList"/>s that don't have specific format statements. These next changes are made to <AutoText text="SearchVList"/> so will only apply to search results.</Text>
    27292720<Text id="0690i">The extracted Title for the current section is specified as <Format>[ex.Title]</Format> while the Title for the parent section is <Format>[parent:ex.Title]</Format>. Since the same <AutoText text="SearchVList"/> format statement is used when searching both whole newspapers and newspaper pages, we need to make sure it works in both cases.</Text>
    2730 <Text id="0690j">Set the format statement to the following:</Text>
     2721<Text id="0690j">Set the format statement to the following text (it can be copied and pasted from the file <Path>sample_files &rarr; niupepa &rarr; formats &rarr; search_tweak.txt</Path>.)</Text>
    27312722<Format>
    27322723&lt;td valign="top"&gt;[link][icon][/link]&lt;/td&gt;<br/>
     
    27402731&lt;/td&gt;
    27412732</Format>
    2742 <Text id="1690j-1">and click <AutoText key="glidict::CDM.FormatManager.Add" type="button"/>.</Text>
    2743 <Text id="0690k">(The format statement can be copied and pasted from the file <Path>sample_files &rarr; niupepa &rarr; formats &rarr; search_tweak.txt</Path>.)</Text>
     2733<Text id="1690j-1">Click <AutoText key="glidict::CDM.FormatManager.Add" type="button"/>.</Text>
    27442734<Text id="0690l"><b>Preview</b> the search results. Items display newspaper title, Volume, Number and Date if available, and pages also display the page number.</Text>
    27452735</NumberedItem>
     
    27552745<SampleFiles folder="niupepa"/>
    27562746<Prerequisite id="scanned_image_collection"/>
    2757 <Version initial="2.70" current="2.70"/>
     2747<Version initial="2.70" current="2.70w"/>
    27582748<Content>
    27592749<Comment>
     
    28432833<Text id="sc31">We can modify the document display to switch between the text version and the screenview and full size versions. We do this using a combination of format statements and macro files.</Text>
    28442834<NumberedItem>
    2845 <Text id="sc32">First, copy the new macro file into the collection. Create a new folder <Path>Greenstone &rarr; collect &rarr; pagedimg &rarr; macros</Path>. Copy <Path>sample_files &rarr; niupepa &rarr; macros &rarr; extra.dm</Path> into this folder.</Text>
     2835<Text id="sc32">First of all we will add a macro file to the collection. In a file browser outside of Greenstone, locate the Paged Image collection in your Greenstone installation: <Path>Greenstone &rarr; collect &rarr; pagedima</Path>. Create a new folder called <Path>macros</Path> in the <Path>pagedima</Path> folder.</Text>
     2836<Text id="sc32a">Also in a file browser, locate the file <Path>sample_files &rarr; niupepa &rarr; macros &rarr; extra.dm</Path>. Copy this file and paste it into the new <Path>macros</Path> folder you just created.</Text>
    28462837</NumberedItem>
    28472838<NumberedItem>
     
    28522843</NumberedItem>
    28532844<NumberedItem>
    2854 <Text id="sc33c">Select the <AutoText text="DocumentHeading"/> format item and set it to the following:</Text>
     2845<Text id="sc33c">Select the <AutoText text="DocumentHeading"/> format item and set it to the following text (which can be copied from <Path>sample_files &rarr; niupepa &rarr; formats &rarr; adv_doc_heading.txt</Path>).</Text>
    28552846<Format>
    28562847&lt;div class="heading_title"&gt;{Or}{[parent(Top):ex.Title],[ex.Title]}&lt;/div&gt;<br/>
     
    28632854</Format>
    28642855<Text id="sc33c-1">Click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>.</Text>
    2865 <Text id="sc33d">This format statement can be copied from <Path>sample_files &rarr; niupepa &rarr; formats &rarr; adv_doc_heading.txt</Path>.</Text>
    28662856<Text id="sc33e"><Format>{Or}{[parent(Top):ex.Title],[ex.Title]}</Format> outputs the newspaper Title metadata. This is only stored at the top level of the document, so if we are at a subsection, we need to get it from the top (<Format>[parent(Top):ex.Title]</Format>). Note that we can't just use <Format>[parent:ex.Title]</Format> as this retrieves the Title from the immediate parent node, which may not be the top node of the document.</Text>
    28672857<Text id="sc33g"><Format>_document:viewpreview_, _document:viewfullsize_, _document:viewtext_</Format> are macros defined in <Path>extra.dm</Path> which output buttons for preview, fullsize and text versions, respectively. We choose which buttons to display based on what metadata and text the document has.</Text>
     
    28702860</NumberedItem>
    28712861<NumberedItem>
    2872 <Text id="sc34a">Select the <AutoText text="DocumentText"/> format statement and set it to:</Text>
     2862<Text id="sc34a">Select the <AutoText text="DocumentText"/> format statement and set it to the following text (which can be copied from <Path>sample_files &rarr; niupepa &rarr; formats &rarr; adv_doc_text.txt</Path>):</Text>
    28732863<Format>
    28742864{If}{_cgiargp_ eq 'fullsize',[srcicon],<br/>
     
    28772867</Format>
    28782868<Text id="sc34a-1">Remember to click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>.</Text>
    2879 <Text id="sc34b">This format statement can be copied from <Path>sample_files &rarr; niupepa &rarr; formats &rarr; adv_doc_text.txt</Path>. It changes the display based on the <AutoText text="p" type="quoted"/> argument (<Format>_cgiargp_</Format>). This is not used normally for document display, so we can use it here to switch between full size image (<Format>[srcicon]</Format>), preview size image (<Format>[screenicon]</Format>) and text (<Format>[Text]</Format>) versions of each page.</Text>
     2869<Text id="sc34b">This format statement changes the display based on the <AutoText text="p" type="quoted"/> argument (<Format>_cgiargp_</Format>). This is not used normally for document display, so we can use it here to switch between full size image (<Format>[srcicon]</Format>), preview size image (<Format>[screenicon]</Format>) and text (<Format>[Text]</Format>) versions of each page.</Text>
    28802870</NumberedItem>
    28812871<NumberedItem>
     
    28892879</Title>
    28902880<SampleFiles folder="oai"/>
    2891 <Version initial="2.60" current="2.70"/>
     2881<Version initial="2.60" current="2.70w"/>
    28922882<Content>
    28932883<Comment>
     
    29922982</Title>
    29932983<Prerequisite id="OAI_collection"/>
    2994 <Version initial="2.60" current="2.70"/>
     2984<Version initial="2.60" current="2.70w"/>
    29952985<Content>
    29962986<Comment>
     
    30433033<Text id="0750">Use METS as Greenstone's Internal Representation</Text>
    30443034</Title>
    3045 <Prerequisite id="large_html_collection"/>
    3046 <Version initial="2.60" current="2.70"/>
     3035<Version initial="2.60" current="2.70w"/>
    30473036<Content>
    30483037<NumberedItem>
    3049 <Text id="0751">In the Greenstone Librarian Interface, open the <b>Tudor</b> collection.</Text>
     3038<Text id="0751">In the Greenstone Librarian Interface, open up one of your existing collections, for example the <b>hobbits</b> collection.</Text>
    30503039</NumberedItem>
    30513040<Comment>
     
    30653054</NumberedItem>
    30663055<NumberedItem>
    3067 <Text id="0759">In your Windows file browser, locate the <Path>archives</Path> folder for the Tudor collection. For each document in the collection, Greenstone has generated two files: <Path>docmets.xml</Path>, the core METS description, and <Path>doctxt.xml</Path>, a supporting file. (Note: unless you are connected to the Internet you will be unable to view <Path>doctxt.xml</Path> in your web browser, because it refers to a remote resource.) Depending on the source documents there may be additional files, such as the images used within a web page. One of METS' many features is the ability to reference information in external XML files. Greenstone uses this to tie the content of the document, which is stored in the external XML file <Path>doctxt.xml</Path>, to its hierarchical structure, which is described in the core METS file <Path>docmets.xml</Path>.</Text>
     3056<Text id="0759">In your Windows file browser, locate the <Path>archives</Path> folder for the collection you are working with. For each document in the collection, Greenstone has generated two files: <Path>docmets.xml</Path>, the core METS description, and <Path>doctxt.xml</Path>, a supporting file. (Note: unless you are connected to the Internet you will be unable to view <Path>doctxt.xml</Path> in your web browser, because it refers to a remote resource.) Depending on the source documents there may be additional files, such as the images used within a web page. One of METS' many features is the ability to reference information in external XML files. Greenstone uses this to tie the content of the document, which is stored in the external XML file <Path>doctxt.xml</Path>, to its hierarchical structure, which is described in the core METS file <Path>docmets.xml</Path>.</Text>
    30683057</NumberedItem>
    30693058</Content>
     
    30743063</Title>
    30753064<SampleFiles folder="dspace"/>
    3076 <Version initial="2.60" current="2.70"/>
     3065<Version initial="2.60" current="2.70w"/>
    30773066<Content>
    30783067<NumberedItem>
     
    31473136{If}{[numleafdocs],([numleafdocs]) [ex.Title],[dc.Title]}
    31483137</Format>
    3149 <Text id="0784">and click <AutoText key="glidict::CDM.FormatManager.Add" type="button"/>. This will display the number of documents for each bookshelf in the authors classifier.</Text>
     3138<Text id="0784">and click <AutoText key="glidict::CDM.FormatManager.Add" type="button"/>. This will display the number of documents for each bookshelf in the <AutoText key="coredm::_Global:labelContributor_" type="italics"/> classifier.</Text>
    31503139</NumberedItem>
    31513140<NumberedItem>
     
    31533142</NumberedItem>
    31543143<Comment>
    3155 <Text id="0787">There are still only 5 documents, but against some of the entries&mdash;for example, <AutoText text="Interview with Bob Dylan" type="quoted"/>&mdash;appears the line <AutoText text="Also available as:" type="quoted"/> followed by icons that link to the alternative representations.</Text>
     3144<Text id="0787">There are still only 5 documents, but against some of the entries appears the line <AutoText text="Also available as:" type="quoted"/> followed by icons that link to the alternative representations.</Text>
    31563145</Comment>
    31573146</Content>
     
    31623151</Title>
    31633152<Prerequisite id="dspace_to_greenstone"/>
    3164 <Version initial="2.60" current="2.70"/>
     3153<Version initial="2.60" current="2.70w"/>
    31653154<Content>
    31663155<Comment>