Changeset 27906
- Timestamp: 2013-07-18T21:59:30+12:00
- File: 1 edited
documentation/trunk/tutorials/xml-source/tutorial_en.xml
r27896 r27906
  609 609 <Content>
  610 610 <NumberedItem>
- 611 <Text id="0338aa">Using your Windows file browser, locate the folder <MajorVersion number="2"><Path>sample_files → images → image2</Path>. Copy this entire folder into your <Path>Greenstone → collect</Path> folder.</MajorVersion><MajorVersion number="3"><Path>sample_files → images → image3</Path>. Copy this entire folder into your <Path>Greenstone → web → sites → localsite → collect</Path> folder.</MajorVersion></Text>
+ 611 <Text id="0338aa">Using your Windows file browser, locate the folder <Path>sample_files → images → image-e</Path>. Copy this entire folder into your <MajorVersion number="2"><Path>Greenstone → collect</Path> folder.</MajorVersion><MajorVersion number="3"><Path>Greenstone → web → sites → localsite → collect</Path> folder.</MajorVersion></Text>
  612 612 </NumberedItem>
  613 613 <NumberedItem>
  … …
  1160 1160 <Text id="pdfbox-ext-11">Now that you've installed the PDFBox extension, this will be available as an option in the plugin's configuration dialog. To turn on the PDFBox extension, go to the <AutoText key="glidict::GUI.Design"/> panel, select <AutoText key="glidict::CDM.GUI.Plugins"/> from the left, and on the right double click the <AutoText text="PDFPlugin"/> (alternatively, select this plugin and click the <AutoText key="glidict::CDM.PlugInManager.Configure" type="button"/> below) to open the dialog to configure this plugin. In the <AutoText key="glidict::CDM.PlugInManager.Configure"/> dialog, scroll down to the section <AutoText text="AutoLoadConverters"/> and select the checkbox next to the <AutoText text="pdfbox_conversion"/> option. Click <AutoText key="glidict::General.OK"/> to close the dialog, switch to the <AutoText key="glidict::GUI.Create"/> panel and rebuild your collection. This time, PDF files will be processed by PDFBox which will extract their text.</Text>
  1161 1161 <Text id="pdfbox-ext-12">Try this feature out on a collection of recent PDF files, by configuring its PDFPlugin with the <AutoText text="pdfbox_conversion"/> option turned on.</Text>
- 1162 <Text id="pdfbox-ext-12">You can also experiment by configuring the PDFPlugin used in the <b>Reports</b> collection, although that one contains old PDF versions which the default settings of <AutoText text="PDFPlugin"/> can already process successfully. If you do decide to test out the PDFBox extension with the <b>Reports</b> collection, then rebuild it and preview it. However, once you've inspected the results, you may wish to go back to the <AutoText key="glidict::GUI.Design"/> panel and turn off <AutoText text="pdfbox_conversion"/> and rebuild the collection once more, so that it's back to its original state and ready for future tutorials.</Text>
  1163 1162 </NumberedItem>
  1164 1163 </Content>
  … …
  1203 1202 <MajorVersion number="3">Note that these are now split into a series of pages, and two means of jumping between various pages is provided: on the left, individual pages are listed vertically by page number and clicking the "plus" box next to a page will expand its contents, while on the right there's a box with a horizontal scroller which can be used to scroll to the page you wish to view.
  1204 1203 </MajorVersion>
- 1205 The format is still a bit ugly though, and pdf05-notext.pdf is still not processed.</Text>
+ 1204 <MajorVersion number="2">The format is still a bit ugly though, and</MajorVersion><MajorVersion number="3">Note that</MajorVersion> pdf05-notext.pdf is still not processed.</Text>
  1206 1205 </NumberedItem>
  1207 1206 <Heading>
  … …
  1805 1804 </Comment>
  1806 1805 <NumberedItem>
- 1807 <Text id="0463">Switch to the <AutoText key="glidict::GUI.Create"/> panel and view the options that are displayed in the top portion of the screen. Select <AutoText text="maxdocs"/> and set its numeric counter to <AutoText text="3"/>. Now <b>build</b>.</Text>
+ 1806 <Text id="0463">Switch to the <AutoText key="glidict::GUI.Create"/> panel, choose <AutoText text="Import Options"/> and view the options that are displayed in the top portion of the screen. Select <AutoText text="maxdocs"/> and set its numeric counter to <AutoText text="3"/>. Now <b>build</b>.</Text>
  1808 1807 </NumberedItem>
  1809 1808 <NumberedItem>
  … …
  1842 1841 </MajorVersion>
  1843 1842 <MajorVersion number="3">
- 1844 <Text id="0473b">This format appears in the search results list, in the <AutoText key="gs3::metadata_names::Title.buttonname" /> list, and also when you get down to individual documents in the <AutoText key="gs3::metadata_names::Subjects.buttonname" /> hierarchy. This is Greenstone's default format statement used in the <AutoText text="browse"/> and <AutoText text="search"/> format features.</Text>
+ 1843 <Text id="0473b">This format appears in the <AutoText key="gs3::metadata_names::Title.buttonname" /> list and also when you get down to individual documents in the <AutoText key="gs3::metadata_names::Subjects.buttonname" /> hierarchy. This is Greenstone's default format statement used in the <AutoText text="browse"/> format features.</Text>
  1845 1844 </MajorVersion>
  1846 1845 </NumberedItem>
  … …
  1866 1865 <Tab n="1"/></td><br/>
  1867 1866 <Tab n="1"/><td valign="top"><br/>
- 1868 <Tab n="2"/><gsf:metadata name="Title"/><br/>
- 1869 <Tab n="2"/><br/><br/>
- 1870 <Tab n="2"/><i>(<gsf:metadata name="Source"/>)</i><br/>
+ 1867 <Tab n="2"/><gsf:link type="document"><br/>
+ 1868 <Tab n="3"/><gsf:metadata name="Title"/><br/>
+ 1869 <Tab n="3"/><br/><br/>
+ 1870 <Tab n="3"/><i>(<gsf:metadata name="Source"/>)</i><br/>
+ 1871 <Tab n="2"/></gsf:link><br/>
  1871 1872 <Tab n="1"/></td><br/>
  1872 1873 </gsf:template>
  1873 1874 </MajorVersion>
  1874 1875 </Format>
- 1875 <MajorVersion number="3">
- 1876 <Text id="0475-3a">Replace the <AutoText text="search"/> format feature with the above format statement too.</Text>
- 1877 </MajorVersion>
- 1878 <Text id="0476"><b>Preview</b> the result (you don't need to build the collection, because changes to format statements take effect immediately). Look at some search results and at the <MajorVersion number="2"><AutoText key="coredm::_Global:labelTitle_"/></MajorVersion><MajorVersion number="3"><AutoText key="gs3::metadata_names::Title.buttonname" /></MajorVersion> list. They are just the same as before! Under most circumstances this far simpler format statement is entirely equivalent to Greenstone's more complex default.</Text>
+ 1876 <Text id="0476"><b>Preview</b> the result (you don't need to build the collection, because changes to format statements take effect immediately). Look <MajorVersion number="2">at some search results and </MajorVersion>at the <MajorVersion number="2"><AutoText key="coredm::_Global:labelTitle_"/></MajorVersion><MajorVersion number="3"><AutoText key="gs3::metadata_names::Title.buttonname" /></MajorVersion> list. <MajorVersion number="2">They are</MajorVersion><MajorVersion number="3">It is</MajorVersion> just the same as before! Under most circumstances this far simpler format statement is entirely equivalent to Greenstone's more complex default.</Text>
  1879 1877 <MajorVersion number="3">
  1880 1878 <Text id="0476-3">We can also reduce the <AutoText text="VList classifierNode"/> template of the <AutoText text="browse"/> format feature further, also without changing the display. Replace it with:</Text>
  … …
  1949 1947 </NumberedItem>
  1950 1948 <NumberedItem>
- 1951 <Text id="0486"><b>Preview</b> the <MajorVersion number="2"><AutoText key="coredm::_Global:labelSubject_"/></MajorVersion><MajorVersion number="3"><AutoText key="gs3::metadata_names::Subjects.buttonname" /></MajorVersion> list in the collection. <MajorVersion number="2">First, the offending "()" has disappeared from the bookshelves. Second, when</MajorVersion><MajorVersion number="3">When</MajorVersion> you get down to a list of documents in the subject hierarchy, the filename does not appear beside the title, because <AutoText key="metadata::ex.Source"/> is not specified in the format statement and this format statement applies to all nodes in the <i>subject</i> classifier. Note that the search results and titles lists have not changed: they still display the filename underneath the title.</Text>
+ 1949 <Text id="0486"><b>Preview</b> the <MajorVersion number="2"><AutoText key="coredm::_Global:labelSubject_"/></MajorVersion><MajorVersion number="3"><AutoText key="gs3::metadata_names::Subjects.buttonname" /></MajorVersion> list in the collection. <MajorVersion number="2">First, the offending "()" has disappeared from the bookshelves. Second, when</MajorVersion><MajorVersion number="3">When</MajorVersion> you get down to a list of documents in the subject hierarchy, the filename does not appear beside the title, because <AutoText key="metadata::ex.Source"/> is not specified in the format statement and this format statement applies to all nodes in the <i>subject</i> classifier. <MajorVersion number="2">Note that the search results and titles lists have not changed: they still display the filename underneath the title.</MajorVersion><MajorVersion number="3">Note that the titles list has not changed: it still displays the filename underneath the title.</MajorVersion></Text>
  1952 1950 </NumberedItem>
  1953 1951 <NumberedItem>
  … …
  1956 1954 </MajorVersion>
  1957 1955 <MajorVersion number="3">
- 1958 <Text id="0487-3">Select the <AutoText text="search"/> format feature once more for some further editing. Replace the line:</Text>
- 1959 </MajorVersion>
- 1960 <MajorVersion number="2">
+ 1956 <Text id="0487-3">Select the <AutoText text="search"/> format feature for some editing. </Text>
+ 1957 </MajorVersion>
+ 1958 <MajorVersion number="2">
+ 1959 <Text id="0487-3-a">Replace the line:</Text>
  1961 1960 <Format>
  1962 1961 <td>[link][icon][/link]</td><br/>
  … …
  1967 1966 </MajorVersion>
  1968 1967 <MajorVersion number="3">
- 1969 <Format>
- 1970 <i>(<gsf:metadata name="Source"/>)</i><br/>
- 1971 </Format>
- 1972 <Text id="ep-16">with</Text>
- 1973 <Format>
- 1974 <gsf:metadata name="dc.Subject"/><br/>
+ 1968 <Text id="0487-3-b">After the final <Format></gsf:link></Format>, add the line:</Text>
+ 1969 <Format>
+ 1970 <br /><gsf:metadata name="dc.Subject"/><br/>
  1975 1971 </Format>
  1976 1972 </MajorVersion>
  … …
  4682 4678 <Text id="indexers-1">Building and searching with different indexers</Text>
  4683 4679 </Title>
- 4684 <SampleFiles folder="demo"/>
+ 4680 <MajorVersion number="2"><SampleFiles folder="demo"/></MajorVersion>
  4685 4681 <Version initial="2.70w" current="2.86|3.05"/>
  4686 4682 <Content>
  … …
  4697 4693 <NumberedItem>
  4698 4694 <Text id="indexers-8">In the <AutoText key="glidict::GUI.Gather"/> panel, click <AutoText key="glidict::Tree.World"/> and click <MajorVersion number="2"><b>Greenstone demo (demo)</b></MajorVersion><MajorVersion number="3"><Path>localsite → Demo Collection (lucene-jdbm-demo)</Path></MajorVersion>, it will show the documents in the <b>Greenstone demo</b> collection. Drag all 11 folders in the demo folder into the new collection.</Text>
- 4699 <Comment>
- 4700 <Text id="demo-collection">If you haven't installed the <MajorVersion number="2"><b>Greenstone demo (demo)</b></MajorVersion><MajorVersion number="3"><b>Demo Collection (lucene-jdbm-demo)</b></MajorVersion> collection yet, you can download the <Path>demo.zip</Path> file from the link above, unzip it and put it into the <Path>collect</Path> folder in your Greenstone installation.</Text>
- 4701 </Comment>
+ 4695 <MajorVersion number="2">
+ 4696 <Comment>
+ 4697 <Text id="demo-collection">If you haven't installed the <b>Greenstone demo (demo)</b> collection yet, you can download the <Path>demo.zip</Path> file from the link above, unzip it and put it into the <Path>collect</Path> folder in your Greenstone installation.</Text>
+ 4698 </Comment>
+ 4699 </MajorVersion>
  4702 4700 </NumberedItem>
  4703 4701 <NumberedItem>
  … …
  4721 4719 </Heading>
  4722 4720 <NumberedItem>
- 4723 <Text id="indexers-15">Lucene provides single letter and multiple letter wildcards and range searching. The query syntax could be quite complicated (for more information please see <Link>http://lucene.apache.org/java/docs/queryparsersyntax.html</Link> . Here we will learn how to use the wildcards while constructing queries.</Text>
+ 4721 <Text id="indexers-15">Lucene provides single letter and multiple letter wildcards and range searching. The query syntax could be quite complicated (for more information please see <Link>http://lucene.apache.org/java/docs/queryparsersyntax.html</Link>). Here we will learn how to use the wildcards while constructing queries.</Text>
  4724 4722 </NumberedItem>
  4725 4723 <NumberedItem>
  … …
  4779 4777 <NumberedItem>
  4780 4778 <Text id="indexers-26-3">MGPP supports stemming, casefolding and accentfolding. By default, searching in collections built with MGPP indexer is set to <AutoText text="whole word must match"/> and <AutoText text="upper/lower case must match"/>. So searching <i>econom</i> will return 0 documents. Searching for <i>fao</i> will return 0 documents, whereas searching for <i>FAO</i> will return 89 word counts and 11 matched documents.</Text>
- 4781 <Text id="indexers-26a-3">Go to the <AutoText text="advanced search"/> page by clicking the <AutoText text="advanced search"/> button at the top right corner. You can see that <b>stem</b> is off, which means the <b>word endings</b> option is set to <AutoText text="whole word must match"/>. And <b>case</b> (folding) is off too, which means the <b>case difference</b> option is set to <AutoText text="upper/lower case must match"/>.</Text>
+ 4779 <Text id="indexers-26a-3">Go to the <AutoText text="text search"/> page by clicking the <AutoText text="text search"/> button at the top right corner. You can see that <b>stem</b> is off, which means the <b>word endings</b> option is set to <AutoText text="whole word must match"/>. And <b>case</b> (folding) is off too, which means the <b>case difference</b> option is set to <AutoText text="upper/lower case must match"/>.</Text>
  4782 4780 </NumberedItem>
  4783 4781 <NumberedItem>