Index: /documentation/trunk/tutorials/xml-source/tutorial_en.xml
===================================================================
--- /documentation/trunk/tutorials/xml-source/tutorial_en.xml (revision 37337)
+++ /documentation/trunk/tutorials/xml-source/tutorial_en.xml (revision 37338)
@@ -933,5 +933,5 @@
-In the Librarian Interface, look at the section of the panel, by clicking on this in the list to the left. Here you can add, configure or remove plugins to be used in the collection. There is no need to remove any plugins, but it will speed up processing a little. In this case we have only Word, PDF, RTF, and PostScript documents, and can remove the , , , , , , , and plugins. To delete a plugin, select it and click . is required for any type of source collection and should not be removed.
+In the Librarian Interface, look at the section of the panel, by clicking on this in the list to the left. Here you can add, configure or remove plugins to be used in the collection. There is no need to remove any plugins, but it will speed up processing a little. In this case we have only Word, PDF, RTF, and PostScript documents, and can remove the , , , , , , , , and plugins. To delete a plugin, select it and click . is required for any type of source collection and should not be removed.
@@ -1090,12 +1090,15 @@
For collections with documents that undergo a conversion process during importing (e.g. Word, PDF, PowerPoint documents, but not text, HTML documents), the original file is stored in the collection along with the converted version. The default format statement links to both versions, but the format statement for links only to the converted version of the original file. That is, this format statement:
-<gsf:link type="document">
- <gsf:icon type="document"/>
- </gsf:link>
+<td>
+ <gsf:link type="document">
+ <gsf:icon type="document"/>
+ </gsf:link>
+</td>
links to the Greenstone HTML version, while
-<gsf:link type="source">
- <gsf:metadata name="srcicon"/>
- </gsf:link>
-
+<td>
+ <gsf:link type="source">
+ <gsf:metadata name="srcicon"/>
+ </gsf:link>
+</td>
links to the original.
Choose in . Experiment with removing and restoring either of the two links from the format statement, previewing the effect of each change.
@@ -1682,5 +1685,5 @@
-Next we'll customize the format statement to highlight the query terms in a PDF file when it is opened from the search result list. This requires Acrobat Reader 7.0 version or higher, and currently only works on a Microsoft Windows platform.
+Next we'll customize the format statement to highlight the query terms in a PDF file when it is opened from the search result list. This requires Acrobat Reader version 7.0 or higher, and currently only works on Microsoft Windows and Linux systems.
@@ -1701,6 +1704,6 @@
</td>
<td valign="top">
@@ -2029,4 +2032,57 @@
Note: When Greenstone encounters a file that matches the provided associate_ext value (pdf in our case), it sets the metadata value for that document to be the macro _iconXXX_, where XXX is whatever the filename extension is (so in our case). As long as there is an existing macro defined for that combination of the word icon and the filename extension, then a suitable icon will be displayed when the document appears in a VList. For pdf the displayed icon will be .
+
+
+Go to Format Features → search and you will see:
+
+ <gsf:template match="documentNode">
+ <td valign="top">
+ <gsf:link type="document">
+ <Tab n="3"/><gsf:icon type="document"/>
+ </gsf:link>
+ </td>
+ <td>
+ <gsf:link type="document">
+ <xsl:call-template name="choose-title"/>
+ </gsf:link>
+ </td>
+ </gsf:template>
+
+With the above, each search result displays only a link to the Greenstone-generated HTML version of the original source document, followed by the title of the document.
+Change the above to:
+
+ <gsf:template match="documentNode">
+ <td valign="top">
+ <gsf:link type="document">
+ <Tab n="3"/><gsf:icon type="document"/>
+ </gsf:link>
+ </td>
+
+
+ <td valign="top">
+ <gsf:link type="source">
+ <gsf:choose-metadata>
+ <gsf:metadata name="thumbicon"/>
+ <gsf:metadata name="srcicon"/>
+ </gsf:choose-metadata>
+ </gsf:link>
+ </td>
+ <td valign="top">
+ <gsf:metadata name="equivDocLink"/>
+ <gsf:metadata name="equivDocIcon"/>
+ <gsf:metadata name="/equivDocLink"/>
+ </td>
+
+
+ <td>
+ <gsf:link type="document">
+ <xsl:call-template name="choose-title"/>
+ </gsf:link>
+ </td>
+ </gsf:template>
+
+Now each search result shows, after the link to Greenstone's HTML version of the document, a link to the source document (the doc file) and a link to its equivalent document (the equivalent PDF file in our example).
+
+
@@ -2095,6 +2151,5 @@
By default, only looks for Title metadata. Configure the plugin so that it looks for the other metadata too. Switch to the panel and select the section. Select the line and click . A popup window appears. Switch on the option, and set the value to
-Title,Author,Page_topic,Content
-
+Title,Author,Page_topic,Content
Click .
@@ -2156,4 +2211,5 @@
Now switch to the panel, build the collection, and preview it. Choose the new link that appears in the navigation bar, and click the bookshelves to navigate around the four-entry hierarchy that you have created.
+
Partitioning the full-text index based on metadata values
@@ -4028,5 +4085,5 @@
Refresh in the web browser to preview the new list. As a consequence of using the option of the classifier, bookshelf icons appear when titles are browsed. This revised format statement has the effect of specifying in brackets how many items are contained within a bookshelf for classifier nodes. It works by exploiting the fact that only bookshelf icons define [numleafdocs] metadata. For document nodes, Title is not displayed. Instead, Volume, Number and Date information are displayed.
-You may notice that the browser shows the volume numbers in inverse order. To correct this, inIn the Pane, under , configure the classifier. Tick and set the metadata to ex.Volume|ex.Number. Rebuilding now will ensure the ex.Volume Number of each newspaper are listed in numeric order. This has the effect of also sorting the ex.Number value for each ex.Volume.
+You may notice that the browser shows the volume numbers in inverse order. To correct this, in the Pane, under , configure the classifier. Tick and set the metadata to ex.Volume|ex.Number. Rebuilding now will ensure the ex.Volume Number of each newspaper are listed in numeric order. This has the effect of also sorting the ex.Number value for each ex.Volume.
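To see why numeric sorting on ex.Volume|ex.Number matters, the following minimal Python sketch contrasts plain string ordering with numeric ordering on a (volume, number) pair. The sample values and the function name are hypothetical. The sketch only illustrates the ordering idea and is not how Greenstone's classifier is implemented.

```python
# Hypothetical (ex.Volume, ex.Number) values, held as strings like metadata.
issues = [("10", "2"), ("2", "11"), ("2", "3"), ("10", "1")]


def numeric_key(issue):
    """Sort key: compare Volume first, then Number, both as integers."""
    volume, number = issue
    return (int(volume), int(number))


# Plain string comparison: "10" sorts before "2", and "11" before "3".
string_order = sorted(issues)

# Numeric comparison: volumes ascend, and numbers ascend within each volume.
numeric_order = sorted(issues, key=numeric_key)

print(string_order)
print(numeric_order)
```

Sorting on the pair orders the volumes first and then the numbers within each volume, which mirrors the effect described for ex.Volume|ex.Number.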
@@ -4909,7 +4966,8 @@
Open a DOS prompt on Windows or a terminal on Mac/Linux and experiment to see what it takes to convert your Greenstone installation's web/sites/localsite/collect/DjVuColl/superhero.djvu file. You may have to invoke djvutxt using its full filepath, in which case on Windows the command would look like:
-C:\PATH\TO\YOUR\djvutxt C:\PATH\TO\YOUR\GS\web\sites\localsite\collect\DjVuColl\superhero.djvu C:\PATH\TO\YOUR\GS\superhero.txt
+C:\PATH\TO\YOUR\djvutxt C:\PATH\TO\YOUR\GS\web\sites\localsite\collect\DjVuColl\import\superhero.djvu C:\PATH\TO\YOUR\GS\superhero.txt
while on Unix systems the command would look like:
-/PATH/TO/YOUR/djvutxt /PATH/TO/YOUR/GS/web/sites/localsite/collect/DjVuColl/superhero.djvu /PATH/TO/YOUR/GS/superhero.txt
+/PATH/TO/YOUR/djvutxt /PATH/TO/YOUR/GS/web/sites/localsite/collect/DjVuColl/import/superhero.djvu /PATH/TO/YOUR/GS/superhero.txt
+If you compiled djvulibre from source, djvutxt will be in /PATH/TO/YOUR/djvulibre/bin/djvutxt.
Once you have the command working, inspect the output file. You should see mostly legible text in it. Only when you've been able to successfully complete this step should you proceed to the next steps.
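The same check can be scripted. Below is a minimal Python sketch, assuming djvutxt takes an input file followed by an output file, as in the commands above. The helper names and the "mostly legible" threshold are hypothetical choices for illustration, and all paths are passed in by you on the command line.

```python
import subprocess
import sys
from pathlib import Path


def build_command(djvutxt, gs_home, out_file):
    """Assemble the djvutxt call: executable, input .djvu, output .txt."""
    source = (Path(gs_home) / "web" / "sites" / "localsite" / "collect"
              / "DjVuColl" / "import" / "superhero.djvu")
    return [str(djvutxt), str(source), str(out_file)]


def mostly_legible(text, threshold=0.9):
    """Rough check (hypothetical threshold): share of printable characters."""
    if not text:
        return False
    ok = sum(ch.isprintable() or ch.isspace() for ch in text)
    return ok / len(text) >= threshold


if __name__ == "__main__" and len(sys.argv) == 4:
    # Usage: python check_djvutxt.py DJVUTXT GS_HOME OUT_FILE
    cmd = build_command(sys.argv[1], sys.argv[2], sys.argv[3])
    subprocess.run(cmd, check=True)  # raises if djvutxt exits with an error
    text = Path(sys.argv[3]).read_text(encoding="utf-8", errors="replace")
    print("legible" if mostly_legible(text) else "inspect the output manually")
```

A script like this does not replace opening the output file yourself; it only flags output that is obviously empty or mostly unprintable.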
@@ -4938,5 +4996,5 @@
Greenstone doesn't have an icon for DjVu documents, since it doesn't know about the format. If you Google for the djvu icon, you'd probably find the Wikipedia page for it.
-Save one of their DjVu icon images. Then open the image in Windows Paint or GIMP or another image editor, and use the application's scaling feature to scale the image's height or the width (whichever is greater) to anywhere between 26 and 32 pixels. Save the scaled image as a GIF file with the name "idjvu.gif", storing it in your Greenstone installation's web/interfaces/default/images folder.
+Save one of their DjVu icon images. Then open the image in Windows Paint or GIMP or another image editor, and use the application's scaling feature to scale the image's height or the width (whichever is greater) to anywhere between 26 and 32 pixels. Save the scaled image as a GIF file with the name "idjvu.gif", storing it in your Greenstone installation's web/interfaces/default/images folder. You can also use free online image resizing websites to carry out this step.
Greenstone knows nothing about the icondjvu macro we defined as the value for UnknownConverterPlugin's srcicon field, so we have to teach Greenstone about this new macro. Use a text editor to open your Greenstone 3's web/sites/localsite/siteConfig.xml file.
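For the icon-scaling step described earlier, the target dimensions can be worked out before opening an image editor. This is a minimal Python sketch of the arithmetic only. The function name is hypothetical, and the default target of 32 is simply the upper end of the 26 to 32 pixel range given in the text.

```python
def scaled_size(width, height, target=32):
    """Scale so the larger side equals `target`, keeping the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    factor = target / max(width, height)
    # Round to whole pixels; never let the smaller side collapse to 0.
    return (max(1, round(width * factor)), max(1, round(height * factor)))


# A hypothetical 128 x 96 source icon.
print(scaled_size(128, 96))
```

Enter the resulting width and height in your image editor's scaling dialog, then save the result as idjvu.gif as described.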
@@ -4948,7 +5006,36 @@
The above has now associated the icon image we want appearing for the djvu document with the macro we defined for the srcicon field in UnknownConverterPlugin's configuration.
-Restart GLI, which will restart the Greenstone server, reloading the siteConfig.xml you have just edited. Rebuild the DjVu Collection again and preview it. This time, when you browse and search the collection, you should see the djvu icon appearing in place of the unknown icon for your DjVu document.
-
-Having designed your collection to handle DjVu documents, you can now add any other documents, including more DjVu documents. Greenstone should now be able to index the text content of DjVu documents in the collection to make them searchable, in all instances where text can be successfully extracted from them by djvutxt.
+Restart GLI, which will restart the Greenstone server, reloading the siteConfig.xml you have just edited. Rebuild the DjVu Collection again and preview it. This time, when you browse the collection, you should see the djvu icon appearing in place of the unknown icon for your DjVu document.
+
+Having designed your collection to handle DjVu documents, you can now add any other documents, including more DjVu documents. Greenstone should now be able to index the text content of DjVu documents in the collection to make them searchable, in all instances where text can be successfully extracted from them by djvutxt.
+Make the search format statement look like the following, then try searching:
+
+ <gsf:template match="documentNode">
+ <td valign="top">
+ <gsf:link type="document">
+ <gsf:icon type="document"/>
+ </gsf:link>
+ </td>
+ <td valign="top">
+ <gsf:link type="source">
+ <gsf:choose-metadata>
+ <gsf:metadata name="thumbicon"/>
+ <gsf:metadata name="srcicon"/>
+ </gsf:choose-metadata>
+ </gsf:link>
+ </td>
+ <td>
+ <gsf:link type="document">
+ <xsl:call-template name="choose-title"/>
+ </gsf:link>
+ <gsf:switch>
+ <gsf:metadata name="equivDocLink"/>
+ <gsf:when test="exists">
+ Also available as: <gsf:metadata name="equivDocLink"/><gsf:metadata name="equivDocIcon"/><gsf:metadata name="/equivDocLink"/>
+ </gsf:when>
+ </gsf:switch>
+ </td>
+ </gsf:template>
+
@@ -5815,5 +5902,5 @@
-Return to the folder (in Greenstone3 → web → interfaces → default → style → themes). Open in a web browser. Scroll down so that the Datepicker calendar is completely visible on your screen, and take a screen shot. (On Windows, this is done by pressing the print screen - - button.)
+Return to the folder (in Greenstone3 → web → interfaces → default → style → themes). Open in a web browser. Scroll down so that the Datepicker calendar is completely visible on your screen. Take a screenshot, either by using your browser's screenshot feature (first selecting the outline of the Datepicker image) or by using your PC's own screen capture facility. (On Windows, you can do this by pressing the print screen button.)