Changeset 11401


Timestamp:
2006-03-17T15:56:48+13:00 (18 years ago)
Author:
kjdon
Message:

Merged some text elements together where sentences had been split up. Also added some Menu and Path tags, and started fixing up the wrong bits. Now that I have committed the other languages I can commit this.

File:
1 edited

  • trunk/gsdl-documentation/tutorials/xml-source/tutorial_en.xml

    r11383 r11401  
    77<Text id="0001">Greenstone tutorial exercises, Version 1 (May 2005)</Text>
    88</Title>
     9<Comment>
     10<Text id="intro">(If you are working from a Greenstone CD-ROM, sample files for these exercises are in the folder <i>sample_files</i>; otherwise they can be downloaded from <Link url="http://sourceforge.net/project/showfiles.php?group_id=12123&amp;package_id=152231">sourceforge</Link>.)</Text>
     11</Comment>
    912<Tutorial id="unaids_cdrom">
    1013<Title>
     
    2225<Text id="0085">On inserting the <b>UNAIDS CD-ROM</b>, for many computers installation will begin automatically. If not, "auto-run"--a configurable setting under Windows--is disabled on your computer and you need to double-click <i>setup.exe</i> on the CD-ROM.</Text>
    2326<Menu>
    24 <Text id="0086">My Computer --&gt;</Text>
    25 <Text id="0087"> UNAIDS20 --&gt;</Text>
    26 <Text id="0088"> setup.exe</Text>
     27<Text id="0086">My Computer --&gt; UNAIDS20 --&gt; setup.exe</Text>
    2728</Menu>
    2829</NumberedItem>
     
    4546<Text id="0094">Click <b>&lt;OK</b>&gt; to confirm completion of UNAIDS collection (twice).</Text>
    4647<Comment>
    47 <Text id="0095">InstallShield quits</Text>
    48 <Text id="0096">--the UNAIDS Library is installed.</Text>
     48<Text id="0095">InstallShield quits--the UNAIDS Library is installed.</Text>
    4949</Comment>
    5050</NumberedItem>
     
    5858<Text id="0099">Launch the prebuilt library by clicking:</Text>
    5959<Menu>
    60 <Text id="0100">Start --&gt;</Text>
    61 <Text id="0101"> All Programs --&gt;</Text>
    62 <Text id="0102"> UNAIDS Library 2.0 [CD-ROM] --&gt;</Text>
    63 <Text id="0103"> UNAIDS Library 2.0 (Standard Version).</Text>
     60<Text id="0100">Start --&gt; All Programs --&gt; UNAIDS Library 2.0 [CD-ROM] --&gt; UNAIDS Library 2.0 (Standard Version).</Text>
    6461</Menu>
    6562<Comment>
     
    176173<Bullet>
    177174<Question>
    178 <Text id="0139">Considering lower case variants only, how many times does the word "condom" appear in the collection? How many times for "condoms"?</Text>
     175<Text id="0139">Considering lower case variants only, how many times does the word "condom" appear in the collection? <br/>How many times for "condoms"?</Text>
    179176</Question>
    180177<Answer>6789<br/> 5243</Answer>
     
    182179<Bullet>
    183180<Question>
    184 <Text id="0140">If case sensitivity does not matter, how many times does the word "condom" appear in the collection? How many times for "condoms"?</Text>
     181<Text id="0140">If case sensitivity does not matter, how many times does the word "condom" appear in the collection? <br/>How many times for "condoms"?</Text>
    185182</Question>
    186183<Answer>7905<br/> 5571</Answer>
     
    194191<Bullet>
    195192<Question>
    196 <Text id="0142">How many <i>chapters</i> contain some variations of the word "condom"? Does this make it a useful search term?</Text>
     193<Text id="0142">How many <i>chapters</i> contain some variations of the word "condom"? <br/>Does this make it a useful search term?</Text>
    197194</Question>
    198195<Answer>
     
    363360<Text id="0193">Installing Greenstone</Text>
    364361</Title>
    365 <Version initial="2.60" current="2.60"/>
     362<Version initial="2.60" current="2.62"/>
    366363<Content>
    367364<Heading>
     
    370367<Text id="0195">There are various ways of getting Greenstone:</Text>
    371368<NumberedItem>
    372 <Text id="0196">From a UNESCO CD-ROM (version 2.60) (or FAO IMARK CD-ROM, but this is an earlier version 2.51)</Text>
     369<Text id="0196">From a UNESCO CD-ROM (version 2.70) (or FAO IMARK CD-ROM, but this is an earlier version 2.51)</Text>
    373370<Text id="0197">These CD-ROMs contain the <b>Greenstone software</b>, plus <b>documented example collections</b>, four <b>language interfaces</b> (English French Spanish Russian), the <b>Export to CD-ROM</b> package, the <b>ImageMagick</b> graphics package, the <b>Java runtime environment</b>, and an <b>installer</b> that installs all of these.</Text>
    374371</NumberedItem>
     
    376373<Text id="0198">From the IITE Digital Libraries in Education CD-ROM, or a Greenstone workshop CD-ROM</Text>
    377374<Comment>
    378 <Text id="0199">In addition to all the above software, these CD-ROMs contain the <b>Greenstone Language Pack</b>, which gives reader's interfaces in many languages (currently about 40). This has its own installer which you have to invoke separately, after you have installed Greenstone. They also contain a set of <b>sample files</b> to be used for exercises.</Text>
     375<Text id="0199">In addition to all the above software, these CD-ROMs contain the tutorial exercises and a set of <b>sample files</b> to be used for these exercises.</Text>
     376<Text id="0199a">CD-ROMs with Greenstone version 2.62 or earlier also include the <b>Greenstone Language Pack</b>, which gives reader's interfaces in many languages (currently about 40). This has its own installer which you have to invoke separately, after you have installed Greenstone.</Text>
     377<Text id="0199b">CD-ROMs with version 2.70 or later now come with reader's interfaces in all available languages. Textual images have been removed from the interface; they are now done using CSS. The Greenstone Language Pack is no longer needed. Instead, these CD-ROMs come with the <b>Classic Interface Pack</b>, which contains the old text images for use with a backwards compatibility macro file.</Text>
    379378</Comment>
    380379<Comment>
     
    383382</NumberedItem>
    384383<NumberedItem>
    385 <Text id="0201">From http://www.greenstone.org</Text>
    386 <Text id="0202">Most people download the Windows distribution from http://www.greenstone.org, which contains the latest version of the Greenstone. There are several optional modules that must be downloaded separately (to avoid a single massive download): <b>documented example collections</b>, the <b>Export to CD-ROM</b> package, and the <b>Language Pack</b>. There is also the set of <b>sample files</b> used in these exercises. (To reduce the download size the documented example collections are distributed in unbuilt form and need to be built.)</Text>
    387 <Text id="0203">You need <b>Java</b> to run Greenstone. You might already have it; otherwise download it from http://java.sun.com. To work with image collections, you need <b>ImageMagick</b> (from http://www.imagemagick.org).</Text>
    388 </NumberedItem>
    389 <Text id="0204">Most Greenstone CD-ROMs start the installation process as soon as they are inserted into the drive, assuming that the AutoPlay feature is enabled on your computer. If installation does not begin by itself, locate the file <i>setup.exe</i> and double click it to start the installation process. (On the IMARK CD-ROM this file resides in the folder <i>software_tools</i>--&gt;</Text>
    390 <Text id="0205"><i>Greenstone</i>). If you download Greenstone over the web, what you get is the installer--just double-click it.</Text>
     384<Text id="0201">From <Link>http://www.greenstone.org</Link></Text>
     385<Text id="0202">Most people download the Windows distribution from <Link>http://www.greenstone.org</Link>, which contains the latest version of Greenstone. There are several optional modules that must be downloaded separately (to avoid a single massive download): <b>documented example collections</b>, the <b>Export to CD-ROM</b> package, the <b>Language Pack</b> (Greenstone 2.62 and earlier) and <b>Classic Interface Pack</b> (Greenstone 2.63 and later). There is also the set of <b>sample files</b> used in these exercises. (To reduce the download size the documented example collections are distributed in unbuilt form and need to be built.)</Text>
     386<Text id="0203">You need <b>Java</b> to run Greenstone. You might already have it; otherwise download it from <Link>http://java.sun.com</Link>. To work with image collections, you need <b>ImageMagick</b> (from <Link>http://www.imagemagick.org</Link>). </Text>
     387</NumberedItem>
     388<Text id="0204">Most Greenstone CD-ROMs start the installation process as soon as they are inserted into the drive, assuming that the AutoPlay feature is enabled on your computer. If installation does not begin by itself, locate the file <i>setup.exe</i> and double click it to start the installation process. (On the IMARK CD-ROM this file resides in the folder <i>software_tools</i>--&gt;<i>Greenstone</i>). If you download Greenstone over the web, what you get is the installer--just double-click it.</Text>
    391389<Text id="0206"><b>If Greenstone has been installed on your computer before, you should completely remove the old version before installing a new one</b>. (However, you need not remove any pre-packaged collections that you may have installed.) To do this, see below under <i>Updating a Greenstone installation</i>.</Text>
    392390<Text id="0207">Here is what you need to do to install Greenstone. Older versions of the installer follow much the same sequence but use slightly different wording.</Text>
     
    427425<Text id="0219">Installing ImageMagick on a Windows system</Text>
    428426</Heading>
    429 <Text id="0220">Once Greenstone has been installed, you should ensure that ImageMagick is installed on your computer if you wish to build any image collections. If you are installing from a Greenstone CD-ROM, you will be asked whether you want to install ImageMagick: say <b>Yes</b>. If you are not, you will need to download ImageMagick (from <i>http://www.imagemagick.org</i>). To install this program you must have Windows "Administrator" privileges.<FootnoteRef id="foot1"/></Text>
     427<Text id="0220">Once Greenstone has been installed, you should ensure that ImageMagick is installed on your computer if you wish to build any image collections. If you are installing from a Greenstone CD-ROM, you will be asked whether you want to install ImageMagick: say <b>Yes</b>. If you are not, you will need to download ImageMagick (from <Link>http://www.imagemagick.org</Link>). To install this program you must have Windows "Administrator" privileges. (If you do not have Windows Administrator privileges, the ImageMagick installer will give a cryptic error complaining that it failed to set a particular Windows registry value. If this happens you can continue your work with Greenstone, but you will not be able to build collections of images.)</Text>
    430428<Text id="0221"> The remaining steps are straightforward, and, as before, we recommend the default settings. Here is what you need to do.</Text>
    431429<BulletList>
     
    462460</BulletList>
    463461<Footnote id="foot1">
    464 <Text id="0796"> If you do not have Windows Administrator privileges, the ImageMagick installer will give a cryptic error complaining that it failed to set a particular Windows registry value. If this happens you can continue your work with Greenstone, but you will not be able to build collections of images.</Text>
     462<Text id="0796"> </Text>
    465463</Footnote>
    466464</Content>
     
    501499</Heading>
    502500<NumberedItem>
    503 <Text id="0242">The reinstallation procedure is exactly the same as the original installation procedure, described above. If you already have ImageMagick, you do not need to install it again.</Text>
     501<Text id="0242">The reinstallation procedure is exactly the same as the original installation procedure, described in <TutorialRef id="install_greenstone"/>. If you already have ImageMagick, you do not need to install it again.</Text>
    504502</NumberedItem>
    505503<Comment>
     
    519517</NumberedItem>
    520518<Heading>
    521 <Text id="0248">Installing the Greenstone language pack</Text>
     519<Text id="0248">Installing the Greenstone language pack (2.62 and earlier)</Text>
    522520</Heading>
    523521<Comment>
     
    525523</Comment>
    526524<NumberedItem>
    527 <Text id="0250">Locate the Greenstone Language Pack. This may be on the CD-ROM from which you installed Greenstone, or you may have to download it from <i>http://www.greenstone.org</i>.</Text>
    528 </NumberedItem>
    529 <NumberedItem>
    530 <Text id="0251">Double-click the <i>.exe</i> file; this will start the installer. Accept all the defaults</Text>
     525<Text id="0250">Locate the Greenstone Language Pack (glp-x.xx.exe/glp-x.xx-linux.bin/gli-x.xx-macOSx.command). This may be on the CD-ROM from which you installed Greenstone, or you may have to download it from <Link>http://www.greenstone.org</Link>. </Text>
     526</NumberedItem>
     527<NumberedItem>
     528<Text id="0251">Run the executable file (double click it on Windows); this will start the installer. Accept all the defaults</Text>
    531529</NumberedItem>
    532530<NumberedItem>
    533531<Text id="0252">Restart the Greenstone Digital Library and look at the interface language menu again. Now you should see about 40 different languages.</Text>
     532</NumberedItem>
     533<Heading>
     534<Text id="0252z">Enabling other languages (2.63 and later)</Text>
     535</Heading>
     536<Comment>
     537<Text id="0252y">If you have downloaded Greenstone from the web, then all the languages will be enabled by default. However, if you have installed Greenstone from a UNESCO CD-ROM, then only English, French, Spanish and Russian will be enabled.</Text>
     538</Comment>
     539<NumberedItem>
     540<Text id="0252x">To enable a new language, edit the file <Path>greenstone\etc\main.cfg</Path>. Look for the appropriate "Language" line, and uncomment it (i.e. remove the # from the start). Check that the required encoding is also enabled.</Text>
     541<Text id="0252w">For example, suppose that we want to enable Turkish. The Language line for Turkish looks like:</Text>
     542<Format>#Language shortname=tr longname=Turkish default_encoding=windows-1254</Format>
     543<Text id="0252v">To enable it, we remove the #, i.e. make it look like:</Text>
     544<Format>Language shortname=tr longname=Turkish default_encoding=windows-1254</Format>
     545The default encoding for Turkish is windows-1254. So we look for the windows-1254 Encoding line:
     546<Format>Encoding shortname=windows-1254 "longname=Turkish (Windows-1254)" map=win1254.ump</Format>
     547<Text id="0252u">This is already enabled (no # at the start) so we don't need to do anything else.</Text>
     548</NumberedItem>
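The manual edit described in the added step above can also be scripted. The following is a minimal Python sketch, not part of Greenstone: the function names `enable_language` and `encoding_enabled` are hypothetical, and the code assumes that `main.cfg` lines look exactly like the Turkish example in the tutorial text (a leading `#` marks a disabled line).

```python
import re


def enable_language(cfg_text: str, shortname: str) -> str:
    """Return cfg_text with the '#Language shortname=<shortname> ...' line uncommented.

    Assumption: a disabled line starts with '#', optionally followed by
    spaces or tabs, then the word 'Language'.
    """
    pattern = re.compile(
        r"^#[ \t]*(Language[ \t]+shortname=" + re.escape(shortname) + r"(?!\S).*)$",
        re.MULTILINE,
    )
    # Keep the captured line and drop only the leading comment marker.
    return pattern.sub(r"\1", cfg_text)


def encoding_enabled(cfg_text: str, encoding: str) -> bool:
    """Report whether an uncommented 'Encoding shortname=<encoding>' line exists."""
    pattern = re.compile(
        r"^Encoding[ \t]+shortname=" + re.escape(encoding) + r"(?!\S)",
        re.MULTILINE,
    )
    return bool(pattern.search(cfg_text))


if __name__ == "__main__":
    # Sample text copied from the two lines quoted in the tutorial.
    sample = (
        "#Language shortname=tr longname=Turkish default_encoding=windows-1254\n"
        'Encoding shortname=windows-1254 "longname=Turkish (Windows-1254)" map=win1254.ump\n'
    )
    updated = enable_language(sample, "tr")
    print(updated)
    print("encoding enabled:", encoding_enabled(updated, "windows-1254"))
```

The sketch works on text rather than on the file itself, so the result can be inspected before anything is written back to `greenstone\etc\main.cfg`.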
     549<Heading>
     550<Text id="0252a">Installing the Classic Interface Pack (2.63 and later)</Text>
     551</Heading>
     552<Comment>
     553<Text id="0252b">Greenstone now comes with all languages enabled.
     554The generated HTML uses text + CSS rather than images for the navigation bar,
     555home, help, preferences buttons etc. The classic interface pack is not needed if you want to use Greenstone in another language. It is only needed if you want to revert to the old style HTML with text images. This may be useful if you have customized your Greenstone installation, or if you require compatibility with Netscape 4.</Text>
     556</Comment>
     557<NumberedItem>
     558<Text id="0252c">Locate the Classic Interface Pack (gcip-x.xx.zip). This may be on the CD-ROM from which you installed Greenstone, or you may have to download it from <Link>http://www.greenstone.org</Link>. </Text>
     559</NumberedItem>
     560<NumberedItem>
     561<Text id="0252d">The classic interface pack is a zip file containing the old text images, such as classifier buttons. Unzip the zip file into the images directory of your Greenstone installation.</Text>
     562</NumberedItem>
     563<NumberedItem>
     564<Text id="1252e">Enable the use of the old-style macros by editing <Path>greenstone\etc\main.cfg</Path>: replace <i>nav_css.dm</i> with <i>nav_ns4.dm</i> in the <i>macrofiles</i> list.</Text>
     565</NumberedItem>
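The macro file swap described in the added step above amounts to a single token replacement. The following is a rough Python sketch under stated assumptions: `use_classic_macros` is a hypothetical helper, the exact layout of the `macrofiles` list in `main.cfg` is not shown in this changeset, and the sample text is invented for illustration. The sketch therefore replaces the `nav_css.dm` token on any line that is not a `#` comment.

```python
import re


def use_classic_macros(cfg_text: str) -> str:
    """Replace nav_css.dm with nav_ns4.dm on every non-comment line.

    Assumption: comment lines start with '#'; the macrofiles list may
    span several lines, so the replacement is not limited to one line.
    """
    result = []
    for line in cfg_text.splitlines(keepends=True):
        if not line.lstrip().startswith("#"):
            line = re.sub(r"\bnav_css\.dm\b", "nav_ns4.dm", line)
        result.append(line)
    return "".join(result)


if __name__ == "__main__":
    # Invented sample; the real macrofiles list will differ.
    sample = (
        "macrofiles tip.dm style.dm nav_css.dm \\\n"
        "           query.dm\n"
        "# nav_css.dm is the default\n"
    )
    print(use_classic_macros(sample))
```

As the tutorial step says, Greenstone has to be restarted afterwards for the change to take effect.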
     566<NumberedItem>
     567<Text id="0252">Restart the Greenstone Digital Library. It should now be using the old text images.</Text>
    534568</NumberedItem>
    535569</Content>
     
    543577<Content>
    544578<Comment>
    545 <Text id="0254">You will need some HTML files, such as those in the hobbits folder in sample_files. You can download the sample files that are used in these exercises from http://www.greenstone.org.</Text>
     579<Text id="0254">You will need some HTML files, such as those in the hobbits folder in sample_files.</Text>
    546580</Comment>
    547581<NumberedItem>
    548582<Text id="0255">Start the Greenstone Librarian Interface:</Text>
    549583<Menu>
    550 Start--&gt;All Programs--&gt;Greenstone Digital Library Software--&gt;Greenstone Librarian Interface
     584<Text id="0255a">Start--&gt;All Programs--&gt;Greenstone Digital Library Software v2.70--&gt;Greenstone Librarian Interface</Text>
    551585</Menu>
    552586<Comment>
    553 <Text id="0256">After a short pause a startup screen appears, and then after a slightly longer pause the main Greenstone Librarian Interface appears.</Text>
     587<Text id="0256">After a short pause a startup screen appears, and then after a slightly longer pause the main Greenstone Librarian Interface appears. (A command prompt is also opened in the background.)</Text>
    554588</Comment>
    555589</NumberedItem>
    556590<NumberedItem>
    557591<Text id="0257">Start a new collection within the Librarian Interface:</Text>
    558 <Text id="0258">File--&gt;</Text>
    559 <Text id="0259">New</Text>
     592<Text id="0258"><Menu>File--&gt;New</Menu></Text>
    560593</NumberedItem>
    561594<NumberedItem>
     
    570603<NumberedItem>
    571604<Text id="0263">Another window pops up, from which you select the metadata set (or sets) to use. This is discussed in other exercises. For now, select <b>Dublin Core Metadata Element Set Version 1.1 </b>followed by <b>&lt;OK&gt;</b>.</Text>
     605<Comment>
     606<Text id="0263a">If this is the first time you have opened a collection in the Librarian Interface, two popup progress bars will appear, to show progress while loading plugins and classifiers.</Text>
     607</Comment>
    572608</NumberedItem>
    573609<NumberedItem>
     
    575611</NumberedItem>
    576612<NumberedItem>
    577 <Text id="0265">Now drag the <i>hobbits</i> folder from the left-hand side and drop it on the right. The progress bar at the bottom shows some activity. Gradually, duplicates of all the files will appear in the right-hand panel.</Text>
     613<Text id="0265">Now drag the <i>hobbits</i> folder from the left-hand side and drop it on the right. The progress bar at the bottom shows some activity. Gradually, duplicates of all the files will appear in the collection panel.</Text>
    578614<Comment>
    579615<Text id="0266">You can inspect the files that have been copied by double-clicking on the folder in the right-hand side.</Text>
     
    590626</NumberedItem>
    591627<NumberedItem>
    592 <Text id="0270">Click the <b>Preview Collection</b> button to look at the end result. This loads the relevant page into your web browser (starting it up if necessary). Look around the collection and learn about Hobbits!</Text>
     628<Text id="0270">Click the <b>&lt;Preview Collection&gt;</b> button to look at the end result. This loads the relevant page into your web browser (starting it up if necessary). Look around the collection and learn about Hobbits!</Text>
    593629</NumberedItem>
    594630<NumberedItem>
     
    599635</NumberedItem>
    600636<NumberedItem>
    601 <Text id="0273">Use the scroll bar on the extreme right to view the bottom part of the list. There you will see fields starting "ex." that express the extracted metadata: for example <i>ex.Title</i>, based on the text within the HTML Title tags, and <i>ex.Language</i>, the document's language (represented using the ISO standard 2-letter mnemonic) which is set by an algorithm that Greenstone uses to analyse the document's text.</Text>
    602 </NumberedItem>
    603 <NumberedItem>
    604 <Text id="0274">Close the collection by clicking <b>File</b>--&gt;</Text>
    605 <Text id="0275"><b>Close</b></Text>
    606 <Text id="0276">. This automatically saves the collection to disk.</Text>
     637<Text id="0273">Use the scroll bar on the extreme right to view the bottom part of the list. There you will see fields starting "ex." that express the extracted metadata: for example <i>ex.Title</i>, based on the text within the HTML Title tags, and <i>ex.Language</i>, the document's language (represented using the ISO standard 2-letter mnemonic) which Greenstone determines by analysing the document's text.</Text>
     638</NumberedItem>
     639<NumberedItem>
     640<Text id="0274">Close the collection by clicking <Menu>File--&gt;Close</Menu>. This automatically saves the collection to disk.</Text>
    607641</NumberedItem>
    608642<Heading>
     
    610644</Heading>
    611645<NumberedItem>
    612 <Text id="0278">To set up a shortcut to the source files, return to the Gather panel and navigate to the folder in your local file space that contains the files you want to use--in our case, the <i>sample_files</i> folder. Select this folder and then right-click it. Follow the instructions to set up a shortcut. Close all the folders in the file tree and you will see the shortcut to your source files in the left-hand pane of the Gather panel.</Text>
     646<Text id="0278">To set up a shortcut to the source files, in the <b>Gather</b> panel navigate to the folder in your local file space that contains the files you want to use--in our case, the <i>sample_files</i> folder. Select this folder and then right-click it. Follow the instructions to set up a shortcut. Close all the folders in the file tree and you will see the shortcut to your source files in the left-hand pane of the <b>Gather</b> panel.</Text>
    613647</NumberedItem>
    614648</Content>
     
    628662</NumberedItem>
    629663<NumberedItem>
    630 <Text id="0282">Copy the 12 files from <i>sample_files</i>--&gt;</Text>
    631 <Text id="0283"><i>Word_and_PDF</i>--&gt;</Text>
    632 <Text id="0284"><i>Documents</i></Text>
    633 <Text id="0285"> into the collection. You can select multiple files by clicking on the first one and shift-clicking on the last one, and drag them all across together.</Text>
    634 <Text id="0286">(This is the normal technique of multiple selection.)</Text>
     664<Text id="0282">Copy the 12 files from <Path>sample_files--&gt;Word_and_PDF--&gt;Documents</Path> into the collection. You can select multiple files by clicking on the first one and shift-clicking on the last one, and drag them all across together. (This is the normal technique of multiple selection.)</Text>
    635665</NumberedItem>
    636666<NumberedItem>
     
    647677</Heading>
    648678<NumberedItem>
    649 <Text id="0291">In the <b>Enrich</b> panel, manually add Dublin Core <i>dc.Title</i> metadata to one of these documents. Select <i>word03.doc</i> and double-click to open it in Word. Copy the title of this document ("Greenstone: A comprehensive open-source digital library software system") from Word, return to the Librarian Interface, click the <i>dc.Title</i> field, and paste the value into the Value box. Click &lt;<b>Append</b>&gt;.</Text>
    650 </NumberedItem>
    651 <NumberedItem>
    652 <Text id="0292">Now add <i>dc.Creator</i> information for the same document. You can add more than one value for the same field, to accommodate multiple authors--just put in the next value and click &lt;<b>Append</b>&gt;.</Text>
     679<Text id="0291">In the <b>Enrich</b> panel, manually add Dublin Core <i>dc.Title</i> metadata to one of these documents. Select <i>word03.doc</i> and double-click to open it. Copy the title of this document ("Greenstone: A comprehensive open-source digital library software system") and return to the Librarian Interface. Scroll up or down in the metadata table until you can see <b>dc.Title</b>. Click in the value box, paste in the metadata and press <b>Enter</b>. </Text>
     680</NumberedItem>
     681<NumberedItem>
     682<Text id="0292">Now add <i>dc.Creator</i> information for the same document. You can add more than one value for the same field: when you press <b>Enter</b> in a metadata value field, a new empty field of the same type will be generated.</Text>
     683</NumberedItem>
     684<NumberedItem>
     685<Text id="0292a">Close the document when you have finished copying metadata from it. External programs opened when viewing documents must be closed before building the collection, otherwise errors can occur.</Text>
    653686</NumberedItem>
    654687<NumberedItem>
     
    662695</Heading>
    663696<NumberedItem>
    664 <Text id="0296">Change to the <b>Design</b> panel, which is split into several sections. The first section <b>General Options</b> appears. This allows you to modify the values you provided when defining the collection, if desired. You can also brand the collection using a suitable image.</Text>
    665 </NumberedItem>
    666 <NumberedItem>
    667 <Text id="0297">Click on the &lt;<b>Browse</b>&gt; button associated with "URL to about page icon", and browse to the image</Text>
    668 <Text id="0298"><i>sample_files</i>--&gt;</Text>
    669 <Text id="0299"><i>Word_and_PDF</i>--&gt;</Text>
    670 <Text id="0300"><i>wrdpdf.gif</i> on your computer. When you select this image, Greenstone automatically generates an appropriate URL for the image.</Text>
     697<Text id="0296">Change to the <b>Design</b> panel, which is split into several sections. The first section <b>General</b> appears. This allows you to modify the values you provided when defining the collection, if desired. You can also brand the collection using a suitable image.</Text>
     698</NumberedItem>
     699<NumberedItem>
     700<Text id="0297">Click on the &lt;<b>Browse</b>&gt; button associated with <b>URL to about page icon</b>, and browse to the image <Path>sample_files--&gt;Word_and_PDF--&gt;wrdpdf.gif</Path> on your computer. When you select this image, Greenstone automatically generates an appropriate URL for the image. <b>Preview</b> the collection.</Text>
    671701</NumberedItem>
    672702<NumberedItem>
     
    679709</Heading>
    680710<NumberedItem>
    681 <Text id="0304">Now look at the <b>Document Plugins</b> section, by clicking on this in the list to the left. Here you can add, configure or remove plugins to be used in the collection. There is no need to remove any plugins, but it will speed up processing a little. In this case we have only Word, PDF, RTF, and PostScript documents, and can remove the ZIPPlug, TEXTPlug, HTMLPlug, EMAILPlug ImagePlug and NULPlug plugins. To delete a plugin, select it and click &lt;<b>Remove Plugin</b>&gt;. GAPlug is required for any type of source collection and should not be removed.</Text>
     711<Text id="0304">Now look at the <b>Document Plugins</b> section, by clicking on this in the list to the left. Here you can add, configure or remove plugins to be used in the collection. There is no need to remove any plugins, but it will speed up processing a little. In this case we have only Word, PDF, RTF, and PostScript documents, and can remove the ZIPPlug, TEXTPlug, HTMLPlug, EMAILPlug, ImagePlug, NULPlug and ISISPlug plugins. To delete a plugin, select it and click &lt;<b>Remove Plugin</b>&gt;. GAPlug is required for any type of source collection and should not be removed. </Text>
    682712</NumberedItem>
    683713<Heading>
     
    685715</Heading>
    686716<NumberedItem>
    687 <Text id="0306">Go to the <b>Search Types</b> section. This specifies what kind of search interface and what search indexes will be provided for the collection. Let's add a form search option. Click &lt;<b>Enable Advanced Searches</b>&gt;; this includes "form search" in the collection.</Text>
    688 </NumberedItem>
    689 <NumberedItem>
    690 <Text id="0307">To include "plain search" as well, pull down the <b>Search Types</b> menu and select <b>plain</b>; then click &lt;<b>Add Search Type</b>&gt;.</Text>
    691 </NumberedItem>
    692 <NumberedItem>
    693 <Text id="0308">To set plain search as the default type, click on <b>plain</b>, then click &lt;<b>Move Up</b>&gt;. The first one in the list will be the default.</Text>
     717<Text id="0306">Go to the <b>Search Types</b> section. This specifies what kind of search interface and what search indexes will be provided for the collection. Let's add a form search option. Click &lt;<b>Enable Advanced Searches</b>&gt;; this allows form searching to be added to the collection.</Text>
     718</NumberedItem>
     719<NumberedItem>
     720<Text id="0307">To include "form search" as well as the default "plain search", pull down the <b>Search Types</b> menu and select <b>form</b>; then click &lt;<b>Add Search Type</b>&gt;.</Text>
     721<Text id="0308">Plain search will be the default search type as it is first in the list.</Text>
    694722</NumberedItem>
    695723<Heading>
     
    718746</NumberedItem>
    719747<NumberedItem>
    720 <Text id="0317">A popup window <b>Configuring Arguments</b> appears. Select <i>dc.Title</i> from the <b>metadata</b> drop-down list, and check the <b>button name</b> checkbox to provide a button name; call it "Title". Now click &lt;<b>OK</b>&gt;.</Text>
     748<Text id="0317">A popup window <b>Configuring Arguments</b> appears. Select <i>dc.Title</i> from the <b>metadata</b> drop-down list and click &lt;<b>OK</b>&gt;.</Text>
    721749</NumberedItem>
    722750<NumberedItem>
     
    753781<Text id="0326">Now preview the collection. The titles and filenames lists show only one of the documents. When you click the "text" icon to look at the text extracted from that document, it's garbage. During the building process this message appeared: "One document was processed and included in the collection; one was rejected."</Text>
    754782</NumberedItem>
    755 <NumberedItem>
    756 <Text id="0327">These problems can be overcome by an option to PDFPlug. Greenstone can convert PDF files into a series of images with a corresponding file that details how they are composed into the complete document (called an <i>item</i> file). For this part of the exercise, ImageMagick needs to be installed (see exercise 3 on Installing Greenstone).</Text>
     783<Heading>
     784<Text id="0333">Modes in the Librarian Interface</Text>
     785</Heading>
     786<Comment>
     787<Text id="0334">The Librarian Interface can operate in different modes. So far, you have been using the default mode, called "Librarian." </Text>
     788</Comment>
     789<NumberedItem>
     790<Text id="0335">Use the <i>Preferences</i> item on the <i>File</i> menu to switch to <i>Expert</i> mode and then build the collection again. The <b>Create</b> panel looks different in Expert mode because it gives more options: locate the <b>Build Collection</b> button, near the bottom of the window, and click it. Now a message appears saying that the file could not be processed, and why.</Text>
     791</NumberedItem>
     792<NumberedItem>
     793<Text id="0336">We recommend that you switch back to <i>Librarian</i> mode for subsequent exercises, to avoid confusion.</Text>
     794</NumberedItem>
     795<Heading>
     796<Text id="0336a">Improved PDF Conversion with Ghostscript</Text>
     797</Heading>
     798<Comment>
     799<Text id="0336b">If you have Ghostscript installed, then you can use a new method of handling these difficult PDF documents. Ghostscript is a program that can convert Postscript and PDF files to other formats. You can download it from <Link>http://www.cs.wisc.edu/~ghost/</Link> (follow the link to the current stable release).</Text>
     800</Comment>
     801<NumberedItem>
     802<Text id="0327">Greenstone can convert PDF files into a series of images with a corresponding file that details how they are composed into the complete document (called an <i>item</i> file). For this part of the exercise, ImageMagick also needs to be installed (see <TutorialRef id="install_greenstone"/>).</Text>
    757803</NumberedItem>
    758804<NumberedItem>
     
    763809</NumberedItem>
    764810<NumberedItem>
    765 <Text id="0330">In order to view the documents properly we need to modify a format statement. In the <b>Format Features</b> section on the <b>Design</b> panel, select the <b>DocumentText</b> format statement. Replace:<Format>[Text]</Format></Text>
    766 <Text id="0331">with</Text>
    767 <Format>[srcicon]</Format>
    768 </NumberedItem>
    769 <NumberedItem>
    770 <Text id="0332"><b>Preview</b> the collection from the <b>Create</b> panel. (There is no need to build it). Images from the documents are now displayed instead of the extracted text. Both <i>No extractable text.pdf</i> and <i>Weird characters.pdf</i> display nicely now.</Text>
    771 </NumberedItem>
    772 <Heading>
    773 <Text id="0333">Modes in the Librarian Interface</Text>
    774 </Heading>
    775 <Comment>
    776 <Text id="0334">The Librarian Interface can operate in different modes. So far, you have been using the default mode, called "Librarian."</Text>
    777 </Comment>
    778 <NumberedItem>
    779 <Text id="0335">Use the <i>Preferences</i> item on the <i>File</i> menu to switch to <i>Expert</i> mode and then build the collection again. The <b>Create</b> panel looks different in Expert mode because it gives more options: locate the <b>Build Collection</b> button, near the bottom of the window, and click it. Now a message appears saying that the file could not be processed, and why.</Text>
    780 </NumberedItem>
    781 <NumberedItem>
    782 <Text id="0336">We recommend that you switch back to <i>Librarian</i> mode for subsequent exercises, to avoid confusion.</Text>
     811<Text id="0330">In order to view the documents properly we need to modify a format statement. In the <b>Format Features</b> section on the <b>Design</b> panel, select the <b>DocumentText</b> format statement. Replace <Format>[Text]</Format> with <Format>[srcicon]</Format> and click <b>Replace Format</b>.</Text>
     812</NumberedItem>
     813<NumberedItem>
     814<Text id="0332"><b>Preview</b> the collection from the <b>Create</b> panel. (There is no need to build it). Images from the documents are now displayed instead of the extracted text. Both <i>No extractable text.pdf</i> and <i>Weird characters.pdf</i> display nicely now. </Text>
    783815</NumberedItem>
    784816</Content>
     
    792824<Content>
    793825<NumberedItem>
    794 <Text id="0338">Start a new collection (File--&gt;</Text>
    795 <Text id="0339">New) called <b>backdrop</b>. Fill out the fields with appropriate information. For <b>Base this collection on</b>, select the item <b>Simple image collection (image-e)</b> from the pull-down menu.</Text>
     826<Text id="0338">Start a new collection (<Menu>File--&gt;New</Menu>) called <b>backdrop</b>. Fill out the fields with appropriate information. For <b>Base this collection on</b>, select the item <b>Simple image collection (image-e)</b> from the pull-down menu.</Text>
    796827<Comment>
    797828<Text id="0340">Greenstone does not ask you to choose a metadata set because the new collection inherits whatever is used by the base collection.</Text>
     
    835866</NumberedItem>
    836867<NumberedItem>
    837 <Text id="0353">Click on <b><i>Ascent.jpg </i></b>so its metadata fields are available, then click on its <b>dc.Title </b>field on the right-hand side. Click on the <b>Value </b>text box, enter <b>Ascent</b>, and click <b>&lt;Append&gt;</b>.</Text>
     868<Text id="0353">Click on <b><i>Ascent.jpg </i></b>so its metadata fields are available, then click on its <b>dc.Title </b>field on the right-hand side. Type in <b>Ascent</b>, and click <b>Enter</b>.</Text>
    838869</NumberedItem>
    839870<Comment>
     
    895926</Comment>
    896927<NumberedItem>
    897 <Text id="0373">Back in the Librarian Interface enter the text <b>Moon rising over mountain landscape </b>as the <b>dc.Description </b>field's value and click <b>&lt;Append&gt; </b>to have it added.</Text>
     928<Text id="0373">Back in the Librarian Interface enter the text <b>Moon rising over mountain landscape </b>as the <b>dc.Description </b>field's value and click <b>Enter</b> to have it added.</Text>
    898929</NumberedItem>
    899930<NumberedItem>
     
    9781009</Heading>
    9791010<NumberedItem>
    980 <Text id="0398">Switch to the <b>Gather</b> panel and in the right-hand side open <i>englishhistory.net </i>--&gt;</Text>
    981 <Text id="0399"> <i>tudor</i>.</Text>
     1011<Text id="0398">Switch to the <b>Gather</b> panel and in the right-hand side open <Path>englishhistory.net --&gt; tudor</Path>.</Text>
    9821012</NumberedItem>
    9831013<NumberedItem>
     
    10071037</NumberedItem>
    10081038<NumberedItem>
    1009 <Text id="0406">Choose <b>File</b>--&gt;</Text>
    1010 <Text id="0407"><b>Write CD/DVD image</b>, and in the popup window select the <b>tudor</b> collection as the collection to export. You can optionally name the CD-ROM; otherwise the default "collections" is used. Do so now, entering "Tudor collection" in the field for <b>CD/DVD name</b>; then click <b>&lt;Write CD/DVD image&gt;</b>.</Text>
     1039<Text id="0406">Choose <Menu>File--&gt;Write CD/DVD image</Menu>, and in the popup window select the <b>tudor</b> collection as the collection to export. You can optionally name the CD-ROM; otherwise the default "collections" is used. Do so now, entering "Tudor collection" in the field for <b>CD/DVD name</b>; then click <b>&lt;Write CD/DVD image&gt;</b>.</Text>
    10111040<Text id="0408">The necessary files for export are written to:</Text>
    1012 <Path>C:\Program Files\Greenstone\tmp\exported_Tudorcollection</Path>
     1041<Text id="0408a"><Path>C:\Program Files\Greenstone\tmp\exported_Tudorcollection</Path></Text>
    10131042<Text id="0409">You need to use your own computer's software to write these on to CD-ROM. On Windows XP this ability is built into the operating system: assuming you have a CD-ROM or DVD writer, insert a blank disk into the drive and drag the contents of <i>exported_Tudorcollection</i> into the folder that represents the disk.</Text>
    10141043<Comment>
     
    10321061</NumberedItem>
    10331062<NumberedItem>
    1034 <Text id="0414">In a web browser, visit <i>http://englishhistory.net</i>, follow the link to <i>Tudor England</i>, and click &lt;<b>enter</b>&gt;. You should be at the URL</Text>
     1063<Text id="0414">In a web browser, visit <Link>http://englishhistory.net</Link>, follow the link to <i>Tudor England</i>, and click &lt;<b>enter</b>&gt;. You should be at the URL</Text>
    10351064<Link>http://englishhistory.net/tudor/contents.html</Link>
    10361065<Text id="0415">This is where we started the downloading process to obtain the files you have been using for the <b>tudor</b> collection.</Text>
     
    10991128<Content>
    11001129<Comment>
    1101 <Text id="0435">We return to the Tudor collection and add metadata that expresses a subject hierarchy. Then we build a classifier that exploits it by allowing</Text>
    1102 <Text id="0436"> readers to browse the documents about Monarchs, Relatives, Citizens, and Others separately.</Text>
     1130<Text id="0435">We return to the Tudor collection and add metadata that expresses a subject hierarchy. Then we build a classifier that exploits it by allowing readers to browse the documents about Monarchs, Relatives, Citizens, and Others separately.</Text>
    11031131</Comment>
    11041132<Heading>
     
    11061134</Heading>
    11071135<NumberedItem>
    1108 <Text id="0438">Open up your <b>tudor</b> collection (the original version, not the <b>webtudor</b> version), switch to the <b>Enrich </b>panel and select the <i>monarchs</i> folder (a subfolder of <i>tudor</i>). Set its <b>dc.Subject and Keywords</b> metadata to <b>Tudor period|Monarchs</b>. (For brevity, we refer to this metadata element in future simply as <b>dc.Subject</b>.) The vertical bar ("|") is a hierarchy marker. Selecting a <i>folder</i> and using the <b>Append</b> button to set its metadata has the effect of setting this metadata value for all files contained in this folder, its subfolders, and so on. A popup alerts you to this fact.</Text>
     1136<Text id="0438">Open up your <b>tudor</b> collection (the original version, not the <b>webtudor</b> version), switch to the <b>Enrich </b>panel and select the <i>monarchs</i> folder (a subfolder of <i>tudor</i>). Set its <b>dc.Subject and Keywords</b> metadata to <b>Tudor period|Monarchs</b>. (For brevity, we refer to this metadata element in future simply as <b>dc.Subject</b>.) The vertical bar ("|") is a hierarchy marker. Selecting a <i>folder</i> and adding metadata has the effect of setting this metadata value for all files contained in this folder, its subfolders, and so on. A popup alerts you to this fact.</Text>
    11091137</NumberedItem>
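The hierarchy marker described above can be illustrated with a short sketch. This is not Greenstone's own code; it only shows, under the assumption that each metadata value is a plain string with "|" separating levels, how such values nest into a tree. The function name `build_hierarchy` is hypothetical.

```python
def build_hierarchy(values, marker="|"):
    """Nest marker-separated metadata values into a dict-of-dicts tree.

    Hypothetical helper for illustration only; it is not part of Greenstone.
    """
    tree = {}
    for value in values:
        node = tree
        # Each segment between markers is one level of the hierarchy.
        for part in value.split(marker):
            node = node.setdefault(part.strip(), {})
    return tree


if __name__ == "__main__":
    subjects = [
        "Tudor period|Monarchs",
        "Tudor period|Relatives",
        "Tudor period|Citizens",
        "Tudor period|Others",
    ]
    print(build_hierarchy(subjects))
```

All four values share the top-level node "Tudor period", which is why a hierarchical classifier can later offer Monarchs, Relatives, Citizens, and Others as separate branches beneath it.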
    11101138<NumberedItem>
     
    11811209</Heading>
    11821210<NumberedItem>
    1183 <Text id="0463">Switch to the <b>Create</b> panel and view the options that are displayed in the top portion of the screen. Select <b>maxdocs</b> and set its numeric counter to <b>3</b>. Now <b>build</b>. In fact, you will find that the collection now contains 5 documents (not 3 as you specified: for technical reasons the number you give to <b>maxdocs</b> is an approximate value.)</Text>
     1211<Text id="0463">Switch to the <b>Create</b> panel and view the options that are displayed in the top portion of the screen. Select <b>maxdocs</b> and set its numeric counter to <b>3</b>. Now <b>build</b>.</Text>
    11841212</NumberedItem>
    11851213<NumberedItem>
     
    12101238[/highlight]{If}{[ex.Source],&lt;br&gt;&lt;i&gt;([ex.Source])&lt;/i&gt;}&lt;/td&gt;
    12111239</Format>
    1212 <Text id="0469">This displays</Text>
    1213 <Text id="0470">something that looks like this:</Text>
     1240<Text id="0469">This displays something that looks like this: </Text>
    12141241<Indent>
    12151242<table><tr><td><img width='15' height='20' src="tutorial_images/itext.gif"/></td><td width='408' valign='top'>A discussion of question five from Tudor Quiz: Henry VIII <br/><i>(quizstuff.html)</i></td></tr></table>
     
    12281255&lt;i&gt;([ex.Source])&lt;/i&gt;<br/>
    12291256&lt;/td&gt;
    1230 </Format>
    1231 <Text id="0476"><b>Preview</b> the result (you don't need to build the collection, because changes to format statements take effect immediately). Look at some search results and at</Text>
    1232 <Text id="0477">the <i>titles a-z</i> list. They are just the same as before! Under most circumstances this far simpler format statement is entirely equivalent to Greenstone's more complex default.</Text>
    1233 <Comment>
    1234 <Text id="0478">But there's a problem. Beside the bookshelves in the hierarchy browser, beneath the subject appears a mysterious "()".</Text>
    1235 <Text id="0479">What is printed on these bookshelf nodes is governed by the same format statement, and though bookshelf nodes of the hierarchy have associated</Text>
    1236 <Text id="0480">Title metadata--their title is the name of the metadata value associated with that bookshelf--they do not have</Text>
    1237 <Text id="0481">ex.Source metadata, so it comes out blank.</Text>
     1257</Format>
     1258<Text id="0475a">Remember to click <b>&lt;Replace Format&gt;</b>.</Text>
     1259<Text id="0476"><b>Preview</b> the result (you don't need to build the collection, because changes to format statements take effect immediately). Look at some search results and at the <i>titles a-z</i> list. They are just the same as before! Under most circumstances this far simpler format statement is entirely equivalent to Greenstone's more complex default. </Text>
     1260<Comment>
     1261<Text id="0478">But there's a problem. Beside the bookshelves in the hierarchy browser, beneath the subject appears a mysterious "()". What is printed on these bookshelf nodes is governed by the same format statement, and though bookshelf nodes of the hierarchy have associated <i>Title</i> metadata--their title is the name of the metadata value associated with that bookshelf--they do not have <i>ex.Source</i> metadata, so it comes out blank.</Text>
    12381262</Comment>
    12391263</NumberedItem>
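The stray "()" on bookshelf nodes can be reproduced with a toy model of format-statement expansion. The sketch below is a deliberately simplified assumption about how `[name]` substitution and a single-level `{If}{[name],text}` might behave; it is not Greenstone's parser and does not attempt to cover the full format-statement syntax. The function name `expand` and the sample metadata are hypothetical.

```python
import re


def expand(fmt, metadata):
    """Expand a toy format statement against a metadata dict.

    Simplified illustration only, not Greenstone's implementation.
    """
    # {If}{[name],text}: keep 'text' only when metadata 'name' is non-empty.
    def conditional(match):
        name, text = match.group(1), match.group(2)
        return text if metadata.get(name) else ""

    fmt = re.sub(r"\{If\}\{\[([^\]]+)\],([^}]*)\}", conditional, fmt)
    # [name]: substitute the metadata value, or nothing if it is missing.
    return re.sub(r"\[([^\]]+)\]", lambda m: metadata.get(m.group(1), ""), fmt)


if __name__ == "__main__":
    document = {"Title": "Henry VIII", "ex.Source": "quizstuff.html"}
    bookshelf = {"Title": "Monarchs"}  # bookshelf nodes have no ex.Source

    plain = "[Title] ([ex.Source])"
    guarded = "[Title]{If}{[ex.Source], ([ex.Source])}"

    print(expand(plain, bookshelf))    # empty parentheses remain
    print(expand(guarded, bookshelf))  # parentheses are suppressed
    print(expand(guarded, document))
```

In this model the unguarded statement leaves empty parentheses whenever the metadata is missing, while wrapping the parenthesised part in a conditional removes them.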
     
    12971321</Heading>
    12981322<Comment>
    1299 <Text id="0500">The appearance of all pages produced by Greenstone is governed by macro files, which reside in the folder</Text>
    1300 <Text id="0501">C:\Program Files\Greenstone\macros. The</Text>
    1301 <Text id="0502">garish example collection is a version of the</Text>
    1302 <Text id="0503">demo collection with bizarre layout and coloring. Now we apply the same bizarre layout and coloring to the</Text>
    1303 <Text id="0504">tudor collection.</Text>
     1323<Text id="0500">The appearance of all pages produced by Greenstone is governed by macro files, which reside in the folder <Path>C:\Program Files\Greenstone\macros</Path>. The garish example collection is a version of the demo collection with bizarre layout and coloring. Now we apply the same bizarre layout and coloring to the tudor collection.</Text>
    13041324</Comment>
    13051325<NumberedItem>
     
    13131333</NumberedItem>
    13141334<Comment>
    1315 <Text id="0508">A small but important enhancement to Greenstone has been made since the garish collection was written. Instead of using the</Text>
    1316 <Text id="0509">[c=garish] macro argument to restrict the macros to apply to a certain collection, you can now put collection-specific macros in the</Text>
    1317 <Text id="0510">macros directory of the collection, in a file called</Text>
    1318 <Text id="0511">extra.dm. In fact, this is what you have just done.</Text>
     1335<Text id="0508">A small but important enhancement to Greenstone has been made since the garish collection was written. Instead of using the [c=garish] macro argument to restrict the macros to apply to a certain collection, you can now put collection-specific macros in the macros directory of the collection, in a file called extra.dm. In fact, this is what you have just done. </Text>
    13191336</Comment>
    13201337<Heading>
     
    13371354</NumberedItem>
    13381355<Comment>
    1339 <Text id="0518">To learn how more about macros, read</Text>
    1340 <Text id="0519">Customizing the Greenstone User Interface, an illustrated guide to customizing the user interface, by Allison Zhang of the Washington Research Library Consortium, available at http://www.wrlc.org/dcpc/UserInterface/interface.htm.</Text>
     1356<Text id="0518">To learn more about macros, read <i>Customizing the Greenstone User Interface</i>, an illustrated guide to customizing the user interface, by Allison Zhang of the Washington Research Library Consortium, available at <Link>http://www.wrlc.org/dcpc/UserInterface/interface.htm</Link>.</Text>
    13411357</Comment>
    13421358</Content>
     
    13861402<Text id="0532">In the <b>Design</b> panel select <b>Search Types</b> from the left-hand list and activate the <b>Enable Advanced Searches </b>options.</Text>
    13871403</NumberedItem>
     1404<NumberedItem>
     1405<Text id="0532a">Add form searching to the collection by selecting <b>form</b> in the <b>Search Types</b> menu and clicking &lt;<b>Add Search Type</b>&gt;. Remove plain searching by selecting <b>plain</b> in the <b>Currently Assigned Search Types</b> list, and clicking &lt;<b>Remove Search Type</b>&gt;.</Text>
    13881406<NumberedItem>
    13891407<Text id="0533"><b>Build</b> the collection once again, and <b>preview</b> the results. Notice that the collection's home page no longer includes a query box. (This is because the search form is too big to fit here nicely.) To search, you have to click <b>search</b> in the navigation bar. Note that the <i>Preferences </i>page has changed to control the advanced searching options.</Text>
     
    14561474</Comment>
    14571475<NumberedItem>
    1458 <Text id="0552">Start a new collection (<i>File</i>--&gt;</Text>
    1459 <Text id="0553"><i>New</i>) called <b>small_beatles</b>, basing it on the default "New Collection." (Basing it on the existing Advanced Beatles collection would make your life far easier, but we want you to learn how to build it from scratch!) Fill out the fields with appropriate information. Use the Dublin Core metadata set (set by default).</Text>
     1476<Text id="0552">Start a new collection (<Menu>File--&gt;New</Menu>) called <b>small_beatles</b>, basing it on the default "New Collection." (Basing it on the existing Advanced Beatles collection would make your life far easier, but we want you to learn how to build it from scratch!) Fill out the fields with appropriate information. Use the Dublin Core metadata set (set by default).</Text>
    14601477</NumberedItem>
    14611478<NumberedItem>
     
    15661583<Text id="0585">To make this easier for you we have prepared a plain text file that contains the new text. In WordPad open the following file:</Text>
    15671584<Path>sample_files--&gt;beatles--&gt;format_tweaks--&gt;audio_tweak.txt</Path>
    1568 <Text id="0586">(Be sure to use WordPad rather than Notepad, because Notepad does not display the line breaks correctly.) Place it in the copy buffer by highlighting the text in WordPad and selecting Edit--&gt;</Text>
    1569 <Text id="0587">Copy. Now move back to the Librarian Interface, highlight all the text that makes up the current VList format statement, and use Edit--&gt;</Text>
    1570 <Text id="0588">Paste to transform the old statement to the new one. Remember to press &lt;<b>Replace Format</b>&gt; when finished.</Text>
    1571 <b>
    1572 <Text id="0589">Preview</Text>
    1573 </b>
    1574 <Text id="0590"> the result. If</Text>
    1575 <Text id="0591">you are using the Greenstone Local Library server, change to the <b>Create </b>panel and click &lt;<b>Preview Collection</b>&gt;, which causes the local library server to rescan the format statements. You do not need to build the collection again because format statements are only used by the runtime system.</Text>
     1585<Text id="0586">(Be sure to use WordPad rather than Notepad, because Notepad does not display the line breaks correctly.) Place it in the copy buffer by highlighting the text in WordPad and selecting <Menu>Edit--&gt;Copy</Menu>. Now move back to the Librarian Interface, highlight all the text that makes up the current VList format statement, and use <Menu>Edit--&gt;Paste</Menu> to transform the old statement to the new one. Remember to press &lt;<b>Replace Format</b>&gt; when finished.</Text>
     1586<Text id="0589"><b>Preview</b> the result. If you are using the Greenstone Local Library server, change to the <b>Create </b>panel and click &lt;<b>Preview Collection</b>&gt;, which causes the local library server to rescan the format statements. You do not need to build the collection again because format statements are only used by the runtime system.</Text>
    15761587<Text id="0592">However, you may need to click the browser's &lt;<b>Reload</b>&gt; button to force it to re-load the page.</Text>
    15771588</NumberedItem>
     
    16521663</NumberedItem>
    16531664<NumberedItem>
    1654 <Text id="0613">To complete the collection, use the browse button of <b>URL to 'about page' icon</b> in the <b>General</b> section of the <b>Design</b> panel to select the following image: <i>advbeatles_large</i>--&gt;</Text>
    1655 <Text id="0614"><i>images</i>--&gt;</Text>
    1656 <Text id="0615"><i>flick4.gif.</i></Text>
     1665<Text id="0613">To complete the collection, use the browse button of <b>URL to 'about page' icon</b> in the <b>General</b> section of the <b>Design</b> panel to select the following image:</Text>
     1666<Path>advbeatles_large--&gt;images--&gt;flick4.gif.</Path>
    16571667<Text id="0616"><b>Build</b> the collection again and <b>preview</b> it.</Text>
    16581668</NumberedItem>
     
    16611671</Comment>
    16621672<Comment>
    1663 <Text id="0618">In the next exercise we incorporate the MIDI files. Greenstone has no MIDI plugin (yet). But that doesn't mean you can't use MIDI files! We also clean up the</Text>
    1664 <Text id="0619">titles a-z</Text>
    1665 <Text id="0620"><b> </b>browser.<b></b></Text>
     1673<Text id="0618">In the next exercise we incorporate the MIDI files. Greenstone has no MIDI plugin (yet). But that doesn't mean you can't use MIDI files! We also clean up the <i>titles a-z</i> browser.</Text>
    16661674</Comment>
    16671675<Comment>
     
    16751683</Heading>
    16761684<NumberedItem>
    1677 <Text id="0624">To switch modes, click <i>File</i>--&gt;</Text>
    1678 <Text id="0625"><i>Preferences</i>--&gt;</Text>
    1679 <Text id="0626"><i>Mode </i>and change to <b>Library Systems Specialist</b>. Note from the description that appears that you need to be able to formulate regular expressions to use this mode fully. That is what we do below.</Text>
     1685<Text id="0624">To switch modes, click <Menu>File--&gt;Preferences--&gt;Mode</Menu> and change to <b>Library Systems Specialist</b>. Note from the description that appears that you need to be able to formulate regular expressions to use this mode fully. That is what we do below.</Text>
    16801686</NumberedItem>
    16811687<NumberedItem>
     
    17221728</Comment>
    17231729<Comment>
    1724 <Text id="0640">One powerful use of regular expressions in the exercise was to clean up the</Text>
    1725 <Text id="0641">titles a-z</Text>
    1726 <Text id="0642"> browser. Perhaps the best way of doing this would be to have proper title metadata. The metadata extracted from HTML files is messy and inconsistent, and this was reflected in the original titles a-z browser. Defining proper title metadata would be simple but rather laborious. Instead, we have opted to use regular expressions in the AZCompactList classifier to clean up the title metadata. This is difficult to understand, and a bit fiddly to do, but if you can cope with its idiosyncrasies it provides a quick way to clean up the extracted metadata and avoid having to enter a large amount of metadata.</Text>
     1730<Text id="0640">One powerful use of regular expressions in the exercise was to clean up the <i>titles a-z</i> browser. Perhaps the best way of doing this would be to have proper title metadata. The metadata extracted from HTML files is messy and inconsistent, and this was reflected in the original titles a-z browser. Defining proper title metadata would be simple but rather laborious. Instead, we have opted to use regular expressions in the <i>AZCompactList</i> classifier to clean up the title metadata. This is difficult to understand, and a bit fiddly to do, but if you can cope with its idiosyncrasies it provides a quick way to clean up the extracted metadata and avoid having to enter a large amount of metadata.</Text>
    17271731</Comment>
    17281732<Heading>
     
    17641768</NumberedItem>
    17651769<NumberedItem>
    1766 <Text id="0649">The complete statement is in the file <i>format_tweaks</i>--&gt;</Text>
    1767 <Text id="0650"><i>multi_icons.txt</i>.</Text>
     1770<Text id="0649">The complete statement is in the file <Path>format_tweaks--&gt;multi_icons.txt</Path>.</Text>
    17681771</NumberedItem>
    17691772<NumberedItem>
     
    17781781<NumberedItem>
    17791782<Text id="0654">The file content is fairly brief, specifying only what needs to be overridden from the default behaviour for this collection. In WordPad, near the top of the file you should see:</Text>
    1780 <Format>_httpiconchalk_ {_httpcimages_/beat_margin.gif}<br/>
    1781 _widthchalk_ {1800}<br/>
    1782 _heightchalk_ {68}.</Format>
    1783 <Text id="0655">Use copy and paste on these three lines to make this part of the file look like:</Text>
    1784 <Format># Original statements<br/>
    1785 #_httpiconchalk_ {_httpcimages_/beat_margin.gif}<br/>
    1786 #_widthchalk_ {1800}<br/>
    1787 #_heightchalk_ {68}<br/>
    1788 _httpiconchalk_ {_httpcimages_/tile.jpg}<br/>
    1789 _widthchalk_ {22}<br/>
    1790 _heightchalk_ {22}</Format>
    1791 <Text id="0656">A hash (#) at the start of line signals a comment, and Greenstone ignores the following text. We use this to comment out the original three statements and replace them with modified lines. It is useful to retain the original version in case we need to restore the original lines at a later date. These three lines relate to the background image used. The new image <i>tile.jpg</i> was also in the <i>images</i> folder that was copied across previously.</Text>
     1783<Format>
     1784_collectionspecificstyle_ {<br/>
     1785&lt;style&gt;<br/>
     1786body.bgimage \{ background-image: url("_httpcimages_/beat_margin.gif");  \}<br/>
     1787\#page \{ margin-left: 120px; \}<br/>
     1788&lt;/style&gt;<br/>
     1789}
     1790</Format>
     1791<Text id="0655">Use copy and paste on these lines to make this part of the file look like:</Text>
     1792<Format>
     1793# Original statements<br/>
     1794#_collectionspecificstyle_ {<br/>
     1795#&lt;style&gt;<br/>
     1796#body.bgimage \{ background-image: url("_httpcimages_/beat_margin.gif");  \}<br/>
     1797#\#page \{ margin-left: 120px; \} <br/>
     1798#&lt;/style&gt;<br/>
     1799#}<br/>
     1800<br/>
     1801_collectionspecificstyle_ {<br/>
     1802&lt;style&gt;<br/>
     1803body.bgimage \{ background-image: url("_httpcimages_/tile.jpg");  \}<br/>
     1804&lt;/style&gt;<br/>
     1805}
     1806</Format>
     1807<Text id="0656">A hash (#) at the start of a line signals a comment, and Greenstone ignores the following text. We use this to comment out the original statements and replace them with modified lines. It is useful to retain the original version in case we need to restore the original lines at a later date. These lines relate to the background image used. The new image <i>tile.jpg</i> was also in the <i>images</i> folder that was copied across previously.</Text>
    17921808</NumberedItem>
    17931809<NumberedItem>
     
    18171833</Bullet>
    18181834<Bullet>
    1819 <Text id="0666">Copy the content of <i>sample_files</i>--&gt;</Text>
    1820 <Text id="0667"><i>beatles</i>--&gt;</Text>
    1821 <Text id="0668"><i>advbeat_large</i>--&gt;</Text>
    1822 <Text id="0669"><i>import</i> into this newly formed collection. Since there are considerably more files in this set of documents the copy will take longer.</Text>
     1835<Text id="0666">Copy the content of <Path>sample_files--&gt;beatles--&gt;advbeat_large--&gt;import</Path> into this newly formed collection. Since there are considerably more files in this set of documents the copy will take longer.</Text>
    18231836<Text id="0670"><b>Build</b> the collection and preview the result. (If you want the collection to have an icon, you will have to add it from the <b>Design</b> panel.)</Text>
    18241837</Bullet>
     
    18801893</NumberedItem>
    18811894<NumberedItem>
    1882 <Text id="0687"><b>Modify</b></Text>
    1883 <Text id="0688"> the format statement for <b>VList</b>. Find the part of the default statement that says</Text>
     1895<Text id="0687"><b>Modify</b> the format statement for <b>VList</b>. Find the part of the default statement that says</Text>
    18841896<Format>{If}{[ex.Source],&lt;br&gt;&lt;i&gt;([ex.Source])&lt;/i&gt;}</Format>
    18851897<Text id="0689">and change it to</Text>
     
    19962008<Format>&lt;h3&gt;[Subject]&lt;/h3&gt;</Format>
    19972009<Comment>
    1998 <Text id="0723">The document heading appears above the detach and no highlighting buttons when you get to a document in the collection. By default DocumentHeading displays the document's ex.Title metadata. In this particular set of OAI exported records, titles are filenames of JPEG images, and the filenames are particularly uninformative (for example, 01dla14). You can see them in the <b>Enrich</b> panel if you select an image in sample_small--&gt;</Text>
    1999 <Text id="0724">oai--&gt;</Text>
    2000 <Text id="0725"> JCDLPICS--&gt;</Text>
    2001 <Text id="0726">srcdocs and check its filename and ex.Title metadata. The above format statement displays ex.Subject metadata instead.</Text>
    2002 </Comment>
    2003 </NumberedItem>
    2004 <NumberedItem>
    2005 <Text id="0727">Finally, you will have noticed that where the document itself should appear, you see only <i>This document has no text</i>. To rectify this, select <b>DocumentText</b> in the <b>Choose Feature</b> pull-down list and use the following as its format statement (which is currently blank) (this text is in</Text>
    2006 <Text id="0728"><i>doctxt_tweak.txt</i></Text>
    2007 <Text id="0729"> in the <i>format_tweaks</i> folder mentioned earlier):</Text>
     2010<Text id="0723">The document heading appears above the detach and no highlighting buttons when you get to a document in the collection. By default DocumentHeading displays the document's ex.Title metadata. In this particular set of OAI exported records, titles are filenames of JPEG images, and the filenames are particularly uninformative (for example, 01dla14). You can see them in the <b>Enrich</b> panel if you select an image in <Path>sample_small--&gt;oai--&gt;JCDLPICS--&gt;srcdocs</Path> and check its filename and <i>ex.Title</i> metadata. The above format statement displays <i>ex.Subject</i> metadata instead.</Text>
     2011</Comment>
     2012</NumberedItem>
     2013<NumberedItem>
     2014<Text id="0727">Finally, you will have noticed that where the document itself should appear, you see only <i>This document has no text</i>. To rectify this, select <b>DocumentText</b> in the <b>Choose Feature</b> pull-down list and use the following as its format statement (which is currently blank) (this text is in <i>doctxt_tweak.txt</i> in the <i>format_tweaks</i> folder mentioned earlier):</Text>
    20082015<Format>&lt;center&gt;&lt;table width=_pagewidth_ border=1&gt;<br/>
    20092016&lt;tr&gt;&lt;td colspan=2 align=center&gt;<br/>
     
    20482055</NumberedItem>
    20492056<NumberedItem>
    2050 <Text id="0739">Open a DOS window to access the command-line prompt. This facility should be located somewhere within your Start--&gt;</Text>
    2051 <Text id="0740">Programs menu, but details vary between different Windows systems. If you cannot locate it, select Start--&gt;</Text>
    2052 <Text id="0741">Run and enter <i>cmd</i> in the popup window that appears.</Text>
     2057<Text id="0739">Open a DOS window to access the command-line prompt. This facility should be located somewhere within your <Menu>Start--&gt;Programs</Menu> menu, but details vary between different Windows systems. If you cannot locate it, select <Menu>Start--&gt;Run</Menu> and enter <i>cmd</i> in the popup window that appears.</Text>
    20532058</NumberedItem>
    20542059<NumberedItem>
     
    20692074</NumberedItem>
    20702075<NumberedItem>
    2071 <Text id="0747">Run <i>perl -S importfrom.pl oaiservi</i></Text>
     2076<Text id="0747">Run:</Text>
     2077 <Command>perl -S importfrom.pl oaiservi</Command>
    20722078<Comment>
    20732079<Text id="0748">Greenstone will immediately set to work and generate a stream of diagnostic output. The importfrom.pl program connects to the OAI data provider specified in the collection configuration file (it does this for each "acquire" line in the file) and exports all the records on that site.</Text>
     
    20932099</Comment>
    20942100<NumberedItem>
    2095 <Text id="0753">Click <i>File</i>--&gt;</Text>
    2096 <Text id="0754"><i>Preferences</i>--&gt;</Text>
    2097 <Text id="0755"><i>Mode </i>and change to <i>Expert</i> mode.</Text>
     2101<Text id="0753">Click <Menu>File--&gt;Preferences--&gt;Mode</Menu> and change to <i>Expert</i> mode.</Text>
    20982102</NumberedItem>
    20992103<NumberedItem>
     
    21012105</NumberedItem>
    21022106<NumberedItem>
    2103 <Text id="0757">Now change to the <b>Create </b>panel, locate the options for the import process and set <i>-saveas </i>to <i>METS</i>. Import options are not available unless you are in <i>Expert</i> mode.</Text>
     2107<Text id="0757">Now change to the <b>Create </b>panel, locate the options for the import process and set <i>saveas </i>to <i>METS</i>. Import options are not available unless you are in <i>Expert</i> mode.</Text>
    21042108</NumberedItem>
    21052109<NumberedItem>
     
    21192123<Content>
    21202124<NumberedItem>
    2121 <Text id="0761">If you have just done the previous exercise the Greenstone Librarian Interface will already be in <i>Expert</i> mode. Otherwise, change to <i>Library System Specialist</i> (or <i>Expert</i>) mode (using File--&gt;</Text>
    2122 <Text id="0762">Preferences), because you will need to change the order of plug-ins in the <b>Design</b> panel.</Text>
     2125<Text id="0761">First, change to <i>Library System Specialist</i> (or <i>Expert</i>) mode (using <Menu>File--&gt;Preferences</Menu>), because you will need to change the order of plug-ins in the <b>Design</b> panel.</Text>
    21232126</NumberedItem>
    21242127<NumberedItem>
     
    22242227</Heading>
    22252228<NumberedItem>
    2226 <Text id="0791">Perform the first four steps of the "Downloading over OAI" exercise: open a command window, change directory to where Greenstone is installed, run <i>setup.bat</i>, and change directory once again, this time into <i>collect\stoned</i>, the collection you built in the last exercise.</Text>
     2229<Text id="0791">Perform the first four steps of the <TutorialRef id="OAI_downloading"/> exercise: open a command window, change directory to where Greenstone is installed, run <i>setup.bat</i>, and change directory once again, this time into <i>collect\stoned</i>, the collection you built in the last exercise.</Text>
    22272230</NumberedItem>
    22282231<NumberedItem>