Changeset 13119 for trunk/gsdl-documentation
- Timestamp: 2006-10-16T16:22:51+13:00
- File: 1 edited
trunk/gsdl-documentation/tutorials/xml-source/tutorial_en.xml
r13114 r13119
396 396 <NumberedItem>
397 397 <Text id="0201">From <Link>http://www.greenstone.org</Link></Text>
398 <Text id="0202">Most people download the Windows distribution from <Link>http://www.greenstone.org</Link>, which contains the latest version of Greenstone. There are several optional modules that must be downloaded separately (to avoid a single massive download): <b>documented example collections</b>, the <b>Export to CD-ROM</b> package, the <b>Language Pack</b> (Greenstone 2.62 and earlier) and <b>Classic Interface Pack</b> (Greenstone 2.63 and later). There is also the set of <b>sample files</b> used in these exercises. (To reduce the download size the documented example collections are distributed in unbuilt form and need to be built.)</Text>
398 <Text id="0202">Most people download the Windows distribution from <Link>http://www.greenstone.org</Link>, which contains the latest version of Greenstone. There are several optional modules that must be downloaded separately (to avoid a single massive download): <b>documented example collections</b>, the <b>Export to CD-ROM</b> package (Greenstone 2.70 and earlier), the <b>Language Pack</b> (Greenstone 2.62 and earlier) and <b>Classic Interface Pack</b> (Greenstone 2.63 and later). There is also the set of <b>sample files</b> used in these exercises. (To reduce the download size the documented example collections are distributed in unbuilt form and need to be built.)</Text>
399 399 <Text id="0203">You need <b>Java</b> to run Greenstone. You might already have it; otherwise download it from <Link>http://java.sun.com</Link>. To work with image collections, you need <b>ImageMagick</b> (from <Link>http://www.imagemagick.org</Link>). </Text>
400 400 </NumberedItem>
… …
812 812 </Title>
813 813 <SampleFiles folder="Word_and_PDF"/>
814 <Version initial="2.60" current="2.70w"/>
814 <Version initial="2.60" current="2.71"/>
815 815 <Content>
816 816 <Comment>
… …
818 818 </Comment>
819 819 <NumberedItem>
820 <Text id="0281">Start a new collection called <b>reports</b> (<AutoText key="glidict::Menu.File"/> → <AutoText key="glidict::Menu.File_New"/>), base it on <AutoText key="glidict::NewCollectionPrompt.NewCollection"/>, and choose Dublin Core as the metadata set.</Text>
820 <Text id="0281">Start a new collection called <b>reports</b> (<AutoText key="glidict::Menu.File"/> → <AutoText key="glidict::Menu.File_New"/>) and base it on <AutoText key="glidict::NewCollectionPrompt.NewCollection"/>.</Text>
821 821 </NumberedItem>
822 822 <NumberedItem>
… …
863 863 </Comment>
864 864 <Heading>
865 <Text id="0295">Collection design; branding a collection with an image</Text>
866 </Heading>
867 <NumberedItem>
868 <Text id="0296">Change to the <AutoText key="glidict::GUI.Design"/> panel, which is split into several sections. The first section <AutoText key="glidict::CDM.GUI.General"/> appears. This allows you to modify the values you provided when defining the collection, if desired. You can also brand the collection using a suitable image.</Text>
869 </NumberedItem>
870 <NumberedItem>
871 <Text id="0297">Click on the <AutoText key="glidict::General.Browse" type="button"/> button associated with <AutoText key="glidict::CDM.General.Icon_Collection"/>, and browse to the image <Path>sample_files → Word_and_PDF → wrdpdf.gif</Path> on your computer. When you select this image, Greenstone automatically generates an appropriate URL for the image. <b>Preview</b> the collection: you should see the new image at the top left of the page.</Text>
872 <Comment>
873 <Text id="0297a">Information on the <AutoText key="glidict::CDM.GUI.General"/> page does not require a rebuild of the collection to take effect. Just go to the <AutoText key="glidict::GUI.Create"/> panel and click <AutoText key="glidict::CreatePane.Preview_Collection" type="button"/>.</Text>
874 </Comment>
875 </NumberedItem>
876 <NumberedItem>
877 <Text id="0301">If you are on the web, you can easily make your own Greenstone-style icon by going to</Text>
878 <Link>http://www.greenstone.org/make-images.html</Link>
879 <Text id="0302">and following the instructions there.</Text>
880 </NumberedItem>
881 <Heading>
882 <Text id="0303">Document plugins</Text>
883 </Heading>
884 <NumberedItem>
885 <Text id="0304">Back in the Librarian Interface, look at the <AutoText key="glidict::CDM.GUI.Plugins"/> section of the <AutoText key="glidict::GUI.Design"/> panel, by clicking on this in the list to the left. Here you can add, configure or remove plugins to be used in the collection. There is no need to remove any plugins, but it will speed up processing a little. In this case we have only Word, PDF, RTF, and PostScript documents, and can remove the <AutoText text="ZIPPlug"/>, <AutoText text="TEXTPlug"/>, <AutoText text="HTMLPlug"/>, <AutoText text="EMAILPlug"/>, <AutoText text="ImagePlug"/>, <AutoText text="ISISPlug"/> and <AutoText text="NULPlug"/> plugins. To delete a plugin, select it and click <AutoText key="glidict::CDM.PlugInManager.Remove" type="button"/>. <AutoText text="GAPlug"/> is required for any type of source collection and should not be removed. </Text>
886 </NumberedItem>
887 <Comment>
888 <Text id="0304a">The next section is <AutoText key="glidict::CDM.GUI.SearchTypes"/>. In this exercise, we will not make any changes to this section.</Text>
889 </Comment>
865 <Text id="0303"><AutoText key="glidict::CDM.GUI.Plugins" type="plain"/></Text>
866 </Heading>
867 <NumberedItem>
868 <Text id="0304">In the Librarian Interface, look at the <AutoText key="glidict::CDM.GUI.Plugins"/> section of the <AutoText key="glidict::GUI.Design"/> panel, by clicking on this in the list to the left. Here you can add, configure or remove plugins to be used in the collection. There is no need to remove any plugins, but it will speed up processing a little. In this case we have only Word, PDF, RTF, and PostScript documents, and can remove the <AutoText text="ZIPPlug"/>, <AutoText text="TEXTPlug"/>, <AutoText text="HTMLPlug"/>, <AutoText text="EMAILPlug"/>, <AutoText text="ImagePlug"/>, <AutoText text="ISISPlug"/> and <AutoText text="NULPlug"/> plugins. To delete a plugin, select it and click <AutoText key="glidict::CDM.PlugInManager.Remove" type="button"/>. <AutoText text="GAPlug"/> is required for any type of source collection and should not be removed. </Text>
869 </NumberedItem>
890 870 <Heading>
891 871 <Text id="0309">Search indexes</Text>
… …
895 875 </NumberedItem>
896 876 <NumberedItem>
897 <Text id="0310b">Modify the <AutoText key="metadata::ex.Title"/> index to include <AutoText key="metadata::dc.Title"/> by selecting the index in the <AutoText key="glidict::CDM.IndexManager.Indexes"/> box and then selecting <AutoText key="metadata::dc.Title"/> from the <AutoText key="glidict::CDM.IndexManager.Source"/> box. Click <AutoText key="glidict::CDM.IndexManager.MGPP.Replace_Index" type="button"/>. Searching this index will search both dc.Title and ex.Title metadata. If you want to restrict searching to just the manually added dc.Title metadata, deselect <AutoText key="metadata::ex.Title"/> from the <AutoText key="glidict::CDM.IndexManager.Source"/> box and click <AutoText key="glidict::CDM.IndexManager.MGPP.Replace_Index" type="button"/>.</Text>
898 </NumberedItem>
899 <NumberedItem>
900 <Text id="0312">You can add indexes based on any metadata. Add a new index based on <AutoText key="metadata::dc.Creator"/>. Change the <AutoText key="glidict::CDM.IndexManager.Index_Name"/> field to "authors", and select <AutoText key="metadata::dc.Creator"/> in the <AutoText key="glidict::CDM.IndexManager.Source"/> list. You will need to deselect the <AutoText key="metadata::ex.Title"/> and <AutoText key="metadata::dc.Title"/> metadata items. Click <AutoText key="glidict::CDM.IndexManager.Add_Index" type="button"/>.</Text>
901 </NumberedItem>
902 <Comment>
903 <Text id="0313">The next two sections are <AutoText key="glidict::CDM.GUI.Subcollections"/> and <AutoText key="glidict::CDM.GUI.SuperCollection"/>. In this exercise, we will not make any changes to these.</Text>
877 <Text id="0310b">Modify the <AutoText key="metadata::ex.Title"/> index to include <AutoText key="metadata::dc.Title"/> by selecting the index in the <AutoText key="glidict::CDM.IndexManager.Indexes"/> box and clicking <AutoText key="glidict::CDM.IndexManager.Edit_Index" type="button"/>. Select <AutoText key="metadata::dc.Title"/> from the list of metadata, and click <AutoText key="glidict::CDM.IndexManager.MGPP.Replace_Index" type="button"/>. Searching this index will search both dc.Title and ex.Title metadata. If you want to restrict searching to just the manually added dc.Title metadata, edit the index again and deselect <AutoText key="metadata::ex.Title"/> from the list of metadata.</Text>
878 </NumberedItem>
879 <NumberedItem>
880 <Text id="0312">You can add indexes based on any metadata. Add a new index based on <AutoText key="metadata::dc.Creator"/> by clicking <AutoText key="glidict::CDM.IndexManager.New_Index" type="button"/>. Select <AutoText key="metadata::dc.Creator"/> in the list of metadata, and click <AutoText key="glidict::CDM.IndexManager.Add_Index" type="button"/>.</Text>
881 </NumberedItem>
882 <Comment>
883 <Text id="0313">The next section is <AutoText key="glidict::CDM.GUI.Subcollections"/>. In this exercise, we will not make any changes to this.</Text>
904 884 </Comment>
905 885 <Heading>
… …
917 897 <Text id="0318b"><AutoText text="AZCompactList"/> is like <AutoText text="AZList"/>, except that values that appear multiple times in the hierarchy are automatically grouped together and a new node, shown as a bookshelf icon, is formed.</Text>
918 898 </NumberedItem>
919 <Comment>
920 <Text id="0319">The last three sections are <AutoText key="glidict::CDM.GUI.Formats"/>, <AutoText key="glidict::CDM.GUI.Translation"/> and <AutoText key="glidict::CDM.GUI.MetadataSets"/>. In this exercise, we will not make any changes to these.</Text>
921 </Comment>
922 899 <NumberedItem>
923 900 <Text id="0320">Switch to the <AutoText key="glidict::GUI.Create"/> panel, and <b>build</b> and <b>preview</b> the collection.</Text>
924 901 </NumberedItem>
925 902 <NumberedItem>
926 <Text id="0321">Check that all the facilities work properly. There should be three full-text indexes, called <i>text</i>, <i>titles</i>, and <i>authors</i>. The <AutoText key="coredm::_Global:labelTitle_" type="italics"/> list should display all the documents to which you have assigned <AutoText key="metadata::dc.Title"/> metadata (and only those documents). The <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list should show one bookshelf for each author you have assigned as <AutoText key="metadata::dc.Creator"/>, and clicking on that bookshelf should take you to all the documents they authored.</Text>
903 <Text id="0321">Check that all the facilities work properly. There should be three full-text indexes, called <i>text</i>, <i>dc.Title,ex.Title</i>, and <i>dc.Creator</i>. The <AutoText key="coredm::_Global:labelTitle_" type="italics"/> list should display all the documents to which you have assigned <AutoText key="metadata::dc.Title"/> metadata (and only those documents). The <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list should show one bookshelf for each author you have assigned as <AutoText key="metadata::dc.Creator"/>, and clicking on that bookshelf should take you to all the documents they authored.</Text>
904 </NumberedItem>
905 <Heading>
906 <Text id="0321-1">Renaming the search indexes</Text>
907 </Heading>
908 <NumberedItem>
909 <Text id="">The default display text for the indexes in the drop-down list on the search page contains the content of the index. Now we will change this display text to make it nicer. Go to the <AutoText key="glidict::GUI.Format"/> panel by clicking its tab. This panel is split into several sections, each controlling some aspect of collection presentation.</Text>
910 </NumberedItem>
911 <NumberedItem>
912 <Text id="">Select <AutoText key="glidict::CDM.GUI.SearchMetadata"/> in the left hand list. This pane allows you to modify what text is displayed for the drop-down lists in the search form (indexes, subcollections, levels etc). Set the <AutoText key="glidict::CDM.SearchMetadataManager.Component_Name"/> for the <AutoText text="dc.Title,Title"/> index to be "titles", and that for the <AutoText text="dc.Creator"/> index to be "creators". Preview the collection by clicking the <AutoText key="glidict::CreatePane.Preview_Collection"/>. The search form should display the new text.</Text>
927 913 </NumberedItem>
928 914 <Heading>
… …
962 948 <Text id="0321f">Extracted metadata is unreliable. But it is very cheap! On the other hand, manually assigned metadata is reliable, but expensive. The previous section of this exercise has shown how to aim for the best of both worlds by using extracted metadata but correcting it when it is wrong. While this may not satisfy the professional librarian, it could provide a useful compromise for the music teacher who wants to get their collection together with a minimum of effort.</Text>
963 949 </NumberedItem>
950 <Heading>
951 <Text id="0295">Branding a collection with an image</Text>
952 </Heading>
953 <NumberedItem>
954 <Text id="0296">Switch back to the <AutoText key="glidict::GUI.Format"/> panel. The first section <AutoText key="glidict::CDM.GUI.General"/> appears. This allows you to modify the values you provided when defining the collection, if desired. You can also brand the collection using a suitable image.</Text>
955 </NumberedItem>
956 <NumberedItem>
957 <Text id="0297">Click on the <AutoText key="glidict::General.Browse" type="button"/> button associated with <AutoText key="glidict::CDM.General.Icon_Collection"/>, and browse to the image <Path>sample_files → Word_and_PDF → wrdpdf.gif</Path> on your computer. When you select this image, Greenstone automatically generates an appropriate URL for the image. <b>Preview</b> the collection: you should see the new image at the top left of the page.</Text>
958 </NumberedItem>
964 959 </Content>
965 960 </Tutorial>
… …
969 964 </Title>
970 965 <Prerequisite id="word_pdf_collection"/>
966 <Version initial="2.70w" current="2.71"/>
971 967 <Content>
972 968 <Comment>
… …
974 970 </Comment>
975 971 <NumberedItem>
976 <Text id="fw-2">Open the <b>reports</b> collection in the Librarian Interface and go to the <AutoText key="glidict::CDM.GUI.Formats"/> section of the <AutoText key="glidict::GUI.Design"/> panel.</Text>
972 <Text id="fw-2">Open the <b>reports</b> collection in the Librarian Interface and go to the <AutoText key="glidict::CDM.GUI.Formats"/> section of the <AutoText key="glidict::GUI.Format"/> panel.</Text>
977 973 </NumberedItem>
978 974 <Heading>
… …
982 978 <Text id="fw-3a">In this part of the exercise, we make the format statement simpler without changing the resulting display.</Text>
983 979 <Text id="fw-3">Greenstone's default format statement is complex because it is designed to produce something reasonable under almost any conditions, and also because for practical reasons it needs to be backwards compatible with legacy collections. For this collection, we don't need all of the complexity.</Text>
980 <Text id="fw-3a">Make sure that the <AutoText text="VList"/> format statement is selected in the list of formats.</Text>
984 981 <Text id="fw-4">The default <AutoText text="VList"/> format statement looks like the following:</Text>
985 982 <Format>
… …
1001 998 {Or}{[dc.Title],[ex.Title],Untitled}[/highlight] {If}{[ex.Source],&lt;br&gt;&lt;i&gt;([ex.Source])&lt;/i&gt;}&lt;/td&gt;<br/>
1002 999 </Format>
1003 <Text id="fw-9a">Click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>.</Text>
1004 1000 <Text id="fw-10">Preview the collection to make sure the display hasn't changed. You shouldn't notice any difference when looking at search results, classifiers etc. </Text>
1005 1001 </NumberedItem>
… …
1010 1006 <Text id="fw-11">For collections with documents that undergo a conversion process during importing (e.g. Word, PDF, PowerPoint documents, but not text, HTML documents), the original file is stored in the collection along with the converted version. The default <AutoText text="VList"/> format statement links to both versions:</Text>
1011 1007 <Text id="fw-12"><Format>[link][icon][/link]</Format> links to the Greenstone HTML version, while <Format>[srclink][srcicon][/srclink]</Format> links to the original.</Text>
1012 <Text id="fw-13">Choose <AutoText text="SearchVList"/> in <AutoText key="glidict::CDM.GUI.Formats"/> by selecting <AutoText text="Search"/> from the <AutoText key="glidict::CDM.FormatManager.Feature"/> drop down list, and <AutoText text="VList"/> from the <AutoText key="glidict::CDM.FormatManager.Part"/> list. Click <AutoText key="glidict::CDM.FormatManager.Add" type="button"/> to add the <AutoText text="SearchVList"/> format statement into the list of assigned formats. Experiment with removing either of the two links from the format statement. (Remember to click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/> after any changes.)</Text>
1008 <Text id="fw-13">Choose <AutoText text="SearchVList"/> in <AutoText key="glidict::CDM.GUI.Formats"/> by selecting <AutoText text="Search"/> from the <AutoText key="glidict::CDM.FormatManager.Feature"/> drop down list, and <AutoText text="VList"/> from the <AutoText key="glidict::CDM.FormatManager.Part"/> list. Click <AutoText key="glidict::CDM.FormatManager.Add" type="button"/> to add the <AutoText text="SearchVList"/> format statement into the list of assigned formats. Experiment with removing either of the two links from the format statement.</Text>
1013 1009 <Text id="fw-13a">To see the results of your changes, preview the collection and do a search. You are making changes to <AutoText text="SearchVList"/>, which means the changes will only apply to search results.</Text>
1014 1010 <Text id="fw-13b">Storing and displaying the original allows users to see the correct format, but requires the user to have the relevant program installed. It also increases the size of the collection. The Greenstone version can be viewed in a browser, but may not look as nice.</Text>
… …
1019 1015 <NumberedItem>
1020 1016 <Text id="fw-14">Next, we'll customize the format for the <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list. Classifier bookshelves have only a few pieces of metadata to display: <Format>[ex.Title]</Format> and <Format>[numleafdocs]</Format>. Whatever metadata the classifier has been built on, the bookshelf label is always stored as <Format>[ex.Title]</Format>. This is why a Creator is printed out for each bookshelf even though <Format>[dc.Creator]</Format> is not specified in the format statement. <Format>[numleafdocs]</Format> is only defined for bookshelves, so this metadata can be used in an <Format>{If}</Format> statement to make bookshelves and documents display differently in the list.</Text>
1021 <Text id="fw-15">Make each bookshelf in the Creator classifier show how many entries it contains. In the <AutoText key="glidict::CDM.GUI.Formats"/> section of the <AutoText key="glidict::GUI.Design"/> panel, select the <AutoText text="CL2 AZCompactList"/> classifier which is based on <AutoText key="metadata::dc.Creator"/> metadata from the <AutoText key="glidict::CDM.FormatManager.Feature"/> drop down list, and <AutoText text="VList"/> from the <AutoText key="glidict::CDM.FormatManager.Part"/> list. Click the <AutoText key="glidict::CDM.FormatManager.Add" type="button"/> button to add this format into the list of assigned formats. Note that it gets added as <AutoText text="CL2VList"/> in this list: its the <AutoText text="VList"/> format for the second (<AutoText text="CL2"/>) classifier.</Text>
1022 <Text id="fw15a">Append the following text and click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>:</Text>
1017 <Text id="fw-15">Make each bookshelf in the Creator classifier show how many entries it contains. In the <AutoText key="glidict::CDM.GUI.Formats"/> section of the <AutoText key="glidict::GUI.Format"/> panel, select the <AutoText text="CL2 AZCompactList"/> classifier which is based on <AutoText key="metadata::dc.Creator"/> metadata from the <AutoText key="glidict::CDM.FormatManager.Feature"/> drop down list, and <AutoText text="VList"/> from the <AutoText key="glidict::CDM.FormatManager.Part"/> list. Click the <AutoText key="glidict::CDM.FormatManager.Add" type="button"/> button to add this format into the list of assigned formats. Note that it gets added as <AutoText text="CL2VList"/> in this list: it is the <AutoText text="VList"/> format for the second (<AutoText text="CL2"/>) classifier.</Text>
1018 <Text id="fw15a">Append the following text to the bottom of the format statement:</Text>
1023 1019 <Format>
1024 1020 {If}{[numleafdocs],&lt;td&gt;&lt;i&gt;([numleafdocs])&lt;/i&gt;&lt;/td&gt;}
1025 1021 </Format>
1026 <Text id="fw-16">Click <AutoText key="glidict::CDM.FormatManager.Add" type="button"/>, switch to the <AutoText key="glidict::GUI.Create"/> panel, and click <AutoText key="glidict::CreatePane.Preview_Collection" type="button"/> (no need to rebuild). Click on the <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list and notice that the bookshelves now display how many documents they contain.</Text>
1022 <Text id="fw-16"><b>Preview</b> the collection. Click on the <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list and notice that the bookshelves now display how many documents they contain.</Text>
1027 1023 <Text id="fw-17">This revised format statement has the effect of specifying in brackets how many items are contained within a bookshelf. Since only bookshelves define <Format>[numleafdocs]</Format>, only they will display this. By modifying <AutoText text="CL2VList"/> instead of <AutoText text="VList"/>, the change will only apply to the second classifier (Creators).</Text>
1028 1024 </NumberedItem>
… …
1031 1027 </Heading>
1032 1028 <NumberedItem>
1033 <Text id="fw-18">Next we modify the document entries in the Creator classifier to display all authors. Back in <AutoText key="glidict::CDM.GUI.Formats"/>, select the <AutoText text="CL2VList"/> format in the list of assigned formats. After <Format>{If}{[ex.Source],&lt;br&gt;</Format> in the format statement, add <Format>[sibling:dc.Creator]</Format>. Click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>.</Text>
1029 <Text id="fw-18">Next we modify the document entries in the Creator classifier to display all authors. Back in <AutoText key="glidict::CDM.GUI.Formats"/>, select the <AutoText text="CL2VList"/> format in the list of assigned formats. After <Format>{If}{[ex.Source],&lt;br&gt;</Format> in the format statement, add <Format>[sibling:dc.Creator]</Format>.</Text>
1034 1030 <Text id="fw-19"><Format>[ex.Source]</Format> is not defined for bookshelves, so can also be used to differentiate bookshelves and documents.</Text>
1035 1031 <Text id="fw-20">The resulting format statement looks like:</Text>
… …
1039 1035 &lt;td valign=top&gt;[highlight]<br/>
1040 1036 {Or}{[dc.Title],[ex.Title],Untitled}[/highlight]<br/>
1041 {If}{[ex.Source],&lt;br&gt;[sibling:dc.Creator]<br/>
1037 {If}{[ex.Source],&lt;br&gt;<highlight>[sibling:dc.Creator]</highlight><br/>
1042 1038 &lt;i&gt;([ex.Source])&lt;/i&gt;}&lt;/td&gt;<br/>
1043 1039 {If}{[numleafdocs],&lt;td&gt;&lt;i&gt;([numleafdocs])&lt;/i&gt;&lt;/td&gt;}
1044 1040 </Format>
1045 <Text id="fw-21">This will display the Greenstone link, the link to the original, then the Title. For bookshelves, it will also display how many documents the bookshelf contains. For documents, it will display all the Authors (Creators), and the source document. <Format>[sibling:dc.Creator]</Format> displays all the Creator metadata for the document, separated by a space (<AutoText text=" " type="quoted"/>). Preview the <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list and make sure that all authors are displayed for documents.</Text> </NumberedItem>
1046 <NumberedItem>
1047 <Text id="fw-22">You can change the separator between the authors. Modify the format statement, and replace <Format>[sibling:dc.Creator]</Format> with <Format>[sibling(All'&lt;br/&gt;'):dc.Creator]</Format>. This will add a new line after each author (<Format>&lt;br/&gt;</Format> specifies a line break in HTML). Don't forget to click <AutoText key="glidict::CDM.FormatManager.Replace" type="button"/>. Preview the <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list.</Text>
1041 <Text id="fw-21">This will display the Greenstone link, the link to the original, then the Title. For bookshelves, it will also display how many documents the bookshelf contains. For documents, it will display all the Authors (Creators), and the source document. <Format>[sibling:dc.Creator]</Format> displays all the Creator metadata for the document, separated by a space (<AutoText text=" " type="quoted"/>), while <Format>[dc.Creator]</Format> displays only the first author. Preview the <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list and make sure that all authors are displayed for documents.</Text>
1042 </NumberedItem>
1043 <NumberedItem>
1044 <Text id="fw-22">You can change the separator between the authors. Modify the format statement, and replace <Format>[sibling:dc.Creator]</Format> with <Format>[sibling(All'&lt;br/&gt;'):dc.Creator]</Format>. This will add a new line after each author (<Format>&lt;br/&gt;</Format> specifies a line break in HTML). Preview the <AutoText key="coredm::_Global:labelCreator_" type="italics"/> list.</Text>
1048 1045 <Text id="fw-23">If you have done exercise <TutorialRef id="enhanced_word"/>, the collection will have both dc.Creator and ex.Creator metadata. To display both, you can use </Text>
1049 1046 <Format>
1050 1047 [sibling:dc.Creator] [sibling:ex.Creator]
1051 1048 </Format>
1052 <Text id="fw-23a">To display dc.Creator if its present, otherwise display ex.Creator, use</Text>
1049 <Text id="fw-23a">To display dc.Creator if it is present, otherwise display ex.Creator, use</Text>
1053 1050 <Format>
1054 1051 {Or}{[sibling:dc.Creator],[sibling:ex.Creator]}
… …
1062 1059 </Title>
1063 1060 <SampleFiles folder="Word_and_PDF"/>
1064 <Version initial="2.70" current="2.70w"/>
1061 <Version initial="2.70" current="2.71"/>
1065 1062 <Content>
1066 1063 <Text id="ep-2">Greenstone converts PDF files to HTML using third-party software: <AutoText text="pdftohtml.pl" type="italics"/>. This lets users view these documents even if they don't have the PDF software installed. Unfortunately, sometimes the formatting of the resulting HTML files is not so good.</Text>
… …
1069 1066 <Text id="ep-3a">In the Librarian Interface, start a new collection called "PDF collection" and base it on <AutoText key="glidict::NewCollectionPrompt.NewCollection"/>.</Text>
1070 1067 <Text id="ep-3b">In the <AutoText key="glidict::GUI.Gather"/> panel, drag just the PDF documents from <Path>sample_files → Word_and_PDF → Documents</Path> into the new collection. Also drag in the PDF documents from <Path>sample_files → Word_and_PDF → difficult_pdf</Path>.</Text>
1071 <Text id="ep-3c">Go to the <AutoText key="glidict::GUI.Create"/> panel and build the collection. Examine the output from the build process. You will notice that one of the documents could not be processed. The following messages are shown: "The file pdf05-notext.pdf was recognised but could not be processed by any plugin.", and "15 documents were processed and included in the collection. 1 was rejected".</Text>
1068 <Text id="ep-3c">Go to the <AutoText key="glidict::GUI.Create"/> panel and build the collection. Examine the output from the build process. You will notice that one of the documents could not be processed. The following messages are shown: "The file pdf05-notext.pdf was recognised but could not be processed by any plugin.", and "5 documents were processed and included in the collection. 1 was rejected".</Text>
1072 1069 </NumberedItem>
1073 1070 <NumberedItem>
… …
1091 1088 <NumberedItem>
1092 1089 <Text id="ep-6">In the <AutoText key="glidict::CDM.GUI.Plugins"/> section of the <AutoText key="glidict::GUI.Design"/> panel, configure <AutoText text="PDFPlug"/>. Switch on the <AutoText text="use_sections"/> option. </Text>
1093 <Text id="ep-7"><b>Build</b> and <b>preview</b> the collection. View the text versions of some of the PDF documents. Note that these are now split into a series of pages, and a "go to page" box is provided. The format is still a bit ugly though.</Text>
1090 <Text id="ep-7"><b>Build</b> and <b>preview</b> the collection. View the text versions of some of the PDF documents. Note that these are now split into a series of pages, and a "go to page" box is provided. The format is still a bit ugly though, and pdf05-notext.pdf is still not processed.</Text>
1094 1091 </NumberedItem>
1095 1092 <Heading>
… …
1098 1095 <NumberedItem>
1099 1096 <Text id="ep-12">If conversion to HTML doesn't produce the result you like, PDF documents can be converted to a series of images, one per page or slide. This requires ImageMagick and Ghostscript to be installed.</Text>
1100
1101 1097 </NumberedItem>
1102 1098 <NumberedItem>
… …
1104 1100 </NumberedItem>
1105 1101 <NumberedItem>
1106 <Text id="ep-14"><b>Build</b> the collection and <b>preview</b>. All PDF documents have been processed and divided into sections, but each section displays <AutoText key="perlmodules::BasPlug.dummy_text" type="quoted"/>. For the conversion to images for PDF documents, no text is extracted. </Text>
1107 </NumberedItem>
1108 <NumberedItem>
1109 <Text id="ep-15">In order to view the documents properly, you will need to modify the format statement. In the <AutoText key="glidict::CDM.GUI.Formats"/> section on the <AutoText key="glidict::GUI.Design"/> panel, select the <AutoText text="DocumentText"/> format statement. Replace </Text>
1102 <Text id="ep-14"><b>Build</b> the collection and <b>preview</b>. All PDF documents (including pdf05-notext.pdf) have been processed and divided into sections, but each section displays <AutoText key="perlmodules::BasPlug.dummy_text" type="quoted"/>. For the conversion to images for PDF documents, no text is extracted. </Text>
1103 </NumberedItem>
1104 <NumberedItem>
1105 <Text id="ep-15">In order to view the documents properly, you will need to modify the format statement. In the <AutoText key="glidict::CDM.GUI.Formats"/> section on the <AutoText key="glidict::GUI.Format"/> panel, select the <AutoText text="DocumentText"/> format statement. Replace </Text>
1110 1106 <Format>
1111 1107 [Text]
… …
1133 1129 </NumberedItem>
1134 1130 <NumberedItem>
1135 <Text id="ep-21">We achieve this by adding two <AutoText text="PDFPlug"/> plugins to the collection, with different options. Currently, the Librarian Interface does not allow you to add the same plugin twice to the collection (with the exception of <AutoText text="UnknownPlug"/>). You will need to edit the collection configuration file by hand.</Text>
1136 <Text id="ep-21a">Close the collection in the Librarian Interface. Then open <Path>Greenstone → collect → pdfcolle → etc → collect.cfg</Path> using a text editor, e.g. WordPad. In the list of plugins, add another <AutoText text="PDFPlug"/>, i.e.</Text>
1137 <Format>
1138 plugin PDFPlug
1139 </Format>
1140 <Text id="ep-22">Don't worry about the options here - we will add these using the Librarian Interface.</Text>
1141 <Text id="ep-22a">Note that if you ever need to edit a collection's <Path>collect.cfg</Path> file by hand, you must close the collection in the Librarian Interface first, otherwise the next time it saves the file, it will overwrite your changes.</Text>
1142 </NumberedItem>
1143 <NumberedItem>
1144 <Text id="ep-23">Open up the collection again in the Librarian Interface, and go to the <AutoText key="glidict::GUI.Gather"/> panel. Make a new folder called <AutoText text="notext" type="quoted"/>: right click in the collection panel and select <AutoText key="glidict::CollectionPopupMenu.New_Folder"/> from the menu. Change the <AutoText key="glidict::NewFolderOrFilePrompt.Folder_Name"/> to <AutoText text="notext" type="quoted"/>, and click <AutoText key="glidict::General.OK" type="button"/>.</Text>
1131 <Text id="ep-21">We achieve this by putting the problem files into a separate folder, and adding two <AutoText text="PDFPlug"/> plugins to the collection, with different options.</Text>
1132 </NumberedItem>
1133 <NumberedItem>
1134 <Text id="ep-23">Go to the <AutoText key="glidict::GUI.Gather"/> panel. Make a new folder called <AutoText text="notext" type="quoted"/>: right click in the collection panel and select <AutoText key="glidict::CollectionPopupMenu.New_Folder"/> from the menu. Change the <AutoText key="glidict::NewFolderOrFilePrompt.Folder_Name"/> to <AutoText text="notext" type="quoted"/>, and click <AutoText key="glidict::General.OK" type="button"/>.</Text>
1145 1135 <Text id="ep-23a">Move the two pdf files that have problems with html (<Path>pdf05-notext.pdf</Path> and <Path>pdf06-weirdchars</Path>.pdf) into this folder by drag and drop. We will set up the plugins so that PDF files in this <Path>notext</Path> folder are processed differently to the other PDF files.</Text>
1146 1136 </NumberedItem>
1147 1137 <NumberedItem>
1148 <Text id="ep-24">Switch to the <AutoText key="glidict::CDM.GUI.Plugins"/> section of the <AutoText key="glidict::GUI.Design"/> panel. You will see that there are two PDFPlug plugins in the list. </Text>
1149 </NumberedItem>
1150 <NumberedItem>
1151 <Text id="ep-25">Switch to <AutoText key="glidict::Preferences.Mode.Systems"/> mode, as you will need to use regular expressions in the options (<Menu><AutoText key="glidict::Menu.File"/> → <AutoText key="glidict::Menu.File_Options"/> → <AutoText key="glidict::Preferences.Mode"/></Menu>)</Text>
1152 </NumberedItem>
1153 <NumberedItem>
1154 <Text id="ep-26">Configure the two <AutoText text="PDFPlug"/> plugins so that the options look like the following:</Text>
1155
1138 <Text id="ep-24">Switch to the <AutoText key="glidict::CDM.GUI.Plugins"/> section of the <AutoText key="glidict::GUI.Design"/> panel. Add a second PDF plugin by selecting <AutoText text="PDFPlug"/> from the <AutoText key="glidict::CDM.PlugInManager.PlugIn"/> drop-down list, and clicking <AutoText key="glidict::CDM.PlugInManager.Add" type="button"/>. This plugin will come after the first PDF plugin, so we configure it to process PDF documents as HTML. Set the <AutoText text="convert_to"/> option to <AutoText text="html"/>, and switch on the <AutoText text="use_sections"/> option. Click <AutoText key="glidict::General.OK" type="button"/>.</Text>
1139 </NumberedItem>
1140 <NumberedItem>
1141 <Text id="ep-25">Now switch to <AutoText key="glidict::Preferences.Mode.Systems"/> mode, as you will need to use regular expressions in the options for the first PDFplugin (<Menu><AutoText key="glidict::Menu.File"/> → <AutoText key="glidict::Menu.File_Options"/> → <AutoText key="glidict::Preferences.Mode"/></Menu>). Configure the first PDF plugin, and set the <AutoText text="process_exp"/> option to <AutoText text="'notext.*\.pdf'"/>.</Text>
1142 </NumberedItem>
1143 <NumberedItem>
1144 <Text id="ep-26">The two PDF plugins should have options like the following:</Text>
1156 1145 <Format>
1157 1146 plugin PDFPlug -convert_to pagedimg_jpg -process_exp "notext.*\.pdf"<br/>
… …
1174 1163 <Text id="ew-a">Enhanced Word document handling</Text>
1175 1164 </Title>
1165 <Version initial="2.70w" current="2.71"/>
1176 1166 <Content>
1177 1167 <Text id="ew-1">The standard way Greenstone processes Word documents is to convert them to HTML format using a third-party program, wvWare. This sometimes doesn't do a very good job of conversion. If you are using Windows, and have Microsoft Word installed, you can take advantage of Windows native scripting to do a better job of conversion. If the original document was hierarchically structured using Word styles, these can be used to structure the resulting HTML. Word document properties can also be extracted as metadata.</Text>
… …
1297 1287 <Text id="0403">Exporting a collection to CD-ROM/DVD</Text>
1298 1288 </Title>
1299 <Version initial="2.60" current="2.70w"/>
1289 <Version initial="2.60" current="2.71"/>
1300 1290 <Content>
1301 1291 <Comment>
1302 <Text id="0404">To publish a collection on CD-ROM or DVD, Greenstone's Export to CD-ROM export module must be installed (see <TutorialRef id="install_greenstone"/>).</Text>
1292 <Text id="0404">To publish a collection on CD-ROM or DVD, Greenstone's Export to CD-ROM export module must be installed. This is included with CD-ROM distributions, and all distributions 2.70w and later. 
It must be installed separately for non-CD-ROM versions of Greenstone, version 2.70 and earlier (see <TutorialRef id="install_greenstone"/>).</Text> 1303 1293 </Comment> 1304 1294 <NumberedItem> … … 1306 1296 </NumberedItem> 1307 1297 <NumberedItem> 1308 <Text id="0406">Choose <Menu><AutoText key="glidict::Menu.File"/> → <AutoText key="glidict::Menu.File_CDimage"/></Menu>. In the resulting popup window, select the collection or collections that you wish to export by ticking their check boxes. You can optionally enter a name for the CD-ROM: this is the name that will appear in the menu when the CDROM is run. If a name is not entered, the default <AutoText text="Greenstone Collections"/> will be used. Click <AutoText key="glidict::WriteCDImagePrompt.Export" type="button"/>.</Text>1298 <Text id="0406">Choose <Menu><AutoText key="glidict::Menu.File"/> → <AutoText key="glidict::Menu.File_CDimage"/></Menu>. In the resulting popup window, select the collection or collections that you wish to export by ticking their check boxes. You can optionally enter a name for the CD-ROM: this is the name that will appear in the menu when the CDROM is run. If a name is not entered, the default <AutoText text="Greenstone Collections"/> will be used. You can also specify whether the resulting CD-ROM will install files onto the host machine when used or not. 
Click <AutoText key="glidict::WriteCDImagePrompt.Export" type="button"/> to start the export process.</Text> 1309 1299 <Text id="0408">The necessary files for export are written to:</Text> 1310 1300 <Path>Greenstone → tmp → exported_xxx</Path> … … 1322 1312 </Title> 1323 1313 <SampleFiles folder="tudor"/> 1324 <Version initial="2.60" current="2.7 0w"/>1314 <Version initial="2.60" current="2.71"/> 1325 1315 <Content> 1326 1316 <NumberedItem> 1327 <Text id="0388">Invoke the Greenstone Librarian Interface (from the Windows <i>Start</i> menu) and start a new collection called <b>tudor</b> (use the <AutoText key="glidict::Menu.File"/> menu) . Fill out the pop-up dialog with appropriate values and leave <b>Dublin Core</b>, which is selected by default, as the metadata set.</Text>1317 <Text id="0388">Invoke the Greenstone Librarian Interface (from the Windows <i>Start</i> menu) and start a new collection called <b>tudor</b> (use the <AutoText key="glidict::Menu.File"/> menu), based on the default <AutoText key="glidict::NewCollectionPrompt.NewCollection"/>.</Text> 1328 1318 </NumberedItem> 1329 1319 <NumberedItem> … … 1356 1346 </NumberedItem> 1357 1347 <NumberedItem> 1358 <Text id="0393c">Switch to the <AutoText key="glidict::GUI.Create"/> panel and <b>rebuild</b> the collection. Go back to the <AutoText key="glidict::GUI.Enrich"/> panel and look at the extracted metadata for some of the HTML files in <Path>englishhistory.net → tudor → monarchs</Path>. The new metadata should n ew be visible.</Text>1348 <Text id="0393c">Switch to the <AutoText key="glidict::GUI.Create"/> panel and <b>rebuild</b> the collection. Go back to the <AutoText key="glidict::GUI.Enrich"/> panel and look at the extracted metadata for some of the HTML files in <Path>englishhistory.net → tudor → monarchs</Path>. 
The new metadata should now be visible.</Text> 1359 1349 </NumberedItem> 1360 1350 <Heading> … … 1387 1377 </Content> 1388 1378 </Tutorial> 1379 <!-- ** here --> 1389 1380 <Tutorial id="enhanced_html_collection"> 1390 1381 <Title> … … 1599 1590 <Text id="st-1">Section tagging for HTML documents</Text> 1600 1591 </Title> 1592 <Version initial="2.70w" current="2.71"/> 1601 1593 <Content> 1602 1594 <NumberedItem> … … 1877 1869 <Text id="is-1">CDS/ISIS collection</Text> 1878 1870 </Title> 1871 <Version initial="2.70w" current="2.71"/> 1879 1872 <Content> 1880 1873 <Comment> … … 3192 3185 <Text id="gems-1">Editing metadata sets</Text> 3193 3186 </Title> 3187 <Version initial="2.70w" current="2.71"/> 3194 3188 <Content> 3195 3189 <Text id="gems-2">GEMS (Greenstone Editor for Metadata Sets) can be used to modify existing metadata sets or create new ones.</Text>