Timestamp:
2012-08-30T17:55:24+12:00
Author:
ak19
Message:

Updated indexers tutorial for GS3, now that Advanced searching and hash-c hotkey (used in the tutorial) are working for 64 bit linux machines too.

File:
1 edited

  • documentation/trunk/tutorials/xml-source/tutorial_en.xml

    r26131 r26142  
    43814381</Heading>
    43824382<NumberedItem>
    4383 <Text id="indexers-7">Start a new collection (<Menu><AutoText key="glidict::Menu.File"/> &rarr; <AutoText key="glidict::Menu.File_New"/></Menu>) called <b>Demo Lucene</b> and base it on the <b>Greenstone demo (demo)</b> collection, fill out its fields appropriately.</Text>
    4384 </NumberedItem>
    4385 <NumberedItem>
    4386 <Text id="indexers-8">In the <AutoText key="glidict::GUI.Gather"/> panel, click <AutoText key="glidict::Tree.World"/> and click <b>Greenstone demo (demo)</b>, it will show the documents in the <b>Greenstone demo</b> collection. Drag all 11 folders underneath <Path>Greenstone demo (demo)</Path> into the new collection.</Text>
     4383<Text id="indexers-7">Start a new collection (<Menu><AutoText key="glidict::Menu.File"/> &rarr; <AutoText key="glidict::Menu.File_New"/></Menu>) called <b>Demo Lucene</b> and base it on the <MajorVersion number="2"><b>Greenstone demo (demo)</b></MajorVersion><MajorVersion number="3"><b>Demo Collection (lucene-jdbm-demo)</b></MajorVersion> collection, fill out its fields appropriately.</Text>
     4384</NumberedItem>
     4385<NumberedItem>
     4386<Text id="indexers-8">In the <AutoText key="glidict::GUI.Gather"/> panel, click <AutoText key="glidict::Tree.World"/> and click <MajorVersion number="2"><b>Greenstone demo (demo)</b></MajorVersion><MajorVersion number="3"><b>Demo Collection (lucene-jdbm-demo)</b></MajorVersion>, it will show the documents in the <b>Greenstone demo</b> collection. Drag all 11 folders underneath <Path>Greenstone demo (demo)</Path> into the new collection.</Text>
    43874387<Comment>
    43884388<Text id="demo-collection">If you haven't installed the <b>Greenstone demo (demo)</b> collection yet, you can download the <Path>demo.zip</Path> file from the link above, unzip it and put it into the <Path>collect</Path> folder in your Greenstone installation.</Text>
     
    43904390</NumberedItem>
    43914391<NumberedItem>
     4392<MajorVersion number="2">
    43924393<Text id="indexers-9">Go to the <AutoText key="glidict::GUI.Enrich"/> panel, look at the metadata that is associated with each directory. Go to the <AutoText key="glidict::CDM.GUI.Indexes"/> section in the <AutoText key="glidict::GUI.Design"/> panel. The <b>MGPP indexer</b> is in use because the <b>Greenstone Demo</b> collection, which this collection is based on, uses the <b>MGPP indexer</b>.</Text>
    4393 </NumberedItem>
     4394</MajorVersion>
     4395<MajorVersion number="3">
     4396<Text id="indexers-9-3">Go to the <AutoText key="glidict::GUI.Enrich"/> panel, look at the metadata that is associated with each directory. Go to the <AutoText key="glidict::CDM.GUI.Indexes"/> section in the <AutoText key="glidict::GUI.Design"/> panel. The <b>Lucene indexer</b> is already in use because the <b>Demo Collection (lucene-jdbm-demo)</b> collection, which this collection is based on, uses the <b>Lucene indexer</b>.</Text>
     4397</MajorVersion>
     4398</NumberedItem>
     4399<MajorVersion number="2">
    43944400<NumberedItem>
    43954401<Text id="indexers-11">Click the <AutoText key="glidict::CDM.BuildTypeManager.Change"/> button at the right top corner of the panel. A new window will pop up for selecting the Indexers. After selecting an indexer, a brief description will appear in the box below. Select Lucene and click <AutoText key="glidict::General.OK"/>. Please note that the <AutoText key="glidict::CDM.IndexManager.Indexes"/> section may have changed accordingly.</Text>
    43964402</NumberedItem>
     4403</MajorVersion>
    43974404<NumberedItem>
    43984405<Text id="indexers-13"><b>Build</b> and <b>preview</b> the collection.</Text>
     
    44174424</Heading>
    44184425<NumberedItem>
    4419 <Text id="indexers-20">Start a new collection called <b>Greenstone Demo MGPP</b> and also base it on the <b>Greenstone demo (demo)</b>.</Text>
     4426<Text id="indexers-20">Start a new collection called <b>Greenstone Demo MGPP</b> and also base it on the <MajorVersion number="2"><b>Greenstone demo (demo)</b></MajorVersion><MajorVersion number="3"><b>Demo Collection (lucene-jdbm-demo)</b></MajorVersion>.</Text>
    44204427</NumberedItem>
    44214428<NumberedItem>
     
    44234430</NumberedItem>
    44244431<NumberedItem>
     4432<MajorVersion number="2">
    44254433<Text id="indexers-22">In the <AutoText key="glidict::CDM.GUI.Indexes"/> section of the <AutoText key="glidict::GUI.Design"/> panel, you will notice that the active indexer is <b>MGPP</b>, since this is the default. (If not, you'd click the <AutoText key="glidict::CDM.BuildTypeManager.Change"/> button, select <b>MGPP</b> and click <AutoText key="glidict::General.OK"/>, in which case the <AutoText key="glidict::CDM.IndexManager.Indexes"/> section and its options may change accordingly.)</Text>
    4426 </NumberedItem>
    4427 <NumberedItem>
    4428 <Text id="indexers-23">There are three options at the bottom of the panel &mdash; <AutoText key="glidict::CDM.IndexingManager.Stem"/>, <AutoText key="glidict::CDM.IndexingManager.Casefold"/> and <AutoText key="glidict::CDM.IndexingManager.Accent_fold"/>. Notice that all three are enabled. Once an option is enabled, it will also appear in the collection's <AutoText key="coredm::_Global:linktextPREFERENCES_"/> page and can be turned on or off from there.</Text>
    4429 </NumberedItem>
    4430 <NumberedItem>
    4431 <Text id="indexers-23a">In the <AutoText key="glidict::CDM.LevelManager.Level_Title"/> section, also select <AutoText key="glidict::CDM.LevelManager.Section"/>, if it isn't already.</Text>
     4434</MajorVersion>
     4435<MajorVersion number="3">
     4436<Text id="indexers-22-3">In the <AutoText key="glidict::CDM.GUI.Indexes"/> section of the <AutoText key="glidict::GUI.Design"/> panel, you will notice that the active indexer is <b>Lucene</b>. Click the <AutoText key="glidict::CDM.BuildTypeManager.Change"/> button at the right top corner of the panel. A new window will pop up for selecting the Indexers. After selecting an indexer, a brief description will appear in the box below. Select MGPP and click <AutoText key="glidict::General.OK"/>. Please note that the <AutoText key="glidict::CDM.IndexManager.Indexes"/> section may have changed accordingly.</Text>
     4437</MajorVersion>
     4438</NumberedItem>
     4439<NumberedItem>
     4440<Text id="indexers-23">There are three options at the bottom of the panel &mdash; <AutoText key="glidict::CDM.IndexingManager.Stem"/>, <AutoText key="glidict::CDM.IndexingManager.Casefold"/> and <AutoText key="glidict::CDM.IndexingManager.Accent_fold"/>. Notice that all three are enabled. <MajorVersion number="2">Once an option is enabled, it will also appear in the collection's <AutoText key="coredm::_Global:linktextPREFERENCES_"/> page and can be turned on or off from there.</MajorVersion></Text>
     4441</NumberedItem>
     4442<NumberedItem>
     4443<Text id="indexers-23a">In the <AutoText key="glidict::CDM.LevelManager.Level_Title"/> section, also select <AutoText key="glidict::CDM.LevelManager.Section"/>, if it isn't already<MajorVersion number="3">, but make document the default</MajorVersion>.</Text>
    44324444</NumberedItem>
    44334445<NumberedItem>
     
    44374449<Text id="indexers-25">Search with MGPP</Text>
    44384450</Heading>
     4451<MajorVersion number="2">
    44394452<NumberedItem>
    44404453<Text id="indexers-26">MGPP supports stemming, casefolding and accentfolding. By default, searching in collections built with MGPP indexer is set to <AutoText key="coredm::_preferences:textnostem_"/> and <AutoText key="coredm::_preferences:textignorecase_"/>. So searching <i>econom</i> will return 0 documents. Searching for <i>fao</i> and <i>FAO</i> return the same result &mdash; 85 word counts and 11 matched documents.</Text>
     
    44504463<Text id="indexers-28a">Go back to the <AutoText key="coredm::_Global:linktextPREFERENCES_"/> page and change the <AutoText key="coredm::_preferences:textcasediffs_"/> option back to <AutoText key="coredm::_preferences:textignorecase_"/> to avoid confusion later on. Click <AutoText key="coredm::_preferences:textsetprefs_"/> button.</Text>
    44514464</NumberedItem>
     4465</MajorVersion>
     4466<MajorVersion number="3">
     4467<NumberedItem>
     4468<Text id="indexers-26-3">MGPP supports stemming, casefolding and accentfolding. By default, searching in collections built with MGPP indexer is set to <AutoText key="coredm::_preferences:textnostem_"/> and <AutoText key="coredm::_preferences:textmatchcase_"/>. So searching <i>econom</i> will return 0 documents. Searching for <i>fao</i> will return 0 documents whereas searching for <i>FAO</i> will return &mdash; 89 word counts and 11 matched documents.</Text>
     4469<Text id="indexers-26a-3">Go to the <AutoText text="advanced search form"/> page by clicking the <AutoText text="advanced search form"/> button at the top right corner. You can see that <b>stem</b> is off, which means <AutoText key="coredm::_preferences:textwordends_"/> option is set to <AutoText key="coredm::_preferences:textnostem_"/>. And <b>case</b> (folding) is off too, which means the <AutoText key="coredm::_preferences:textcasediffs_"/> option is set to <AutoText key="coredm::_preferences:textmatchcase_"/>.</Text>
     4470</NumberedItem>
     4471<NumberedItem>
     4472<Text id="indexers-27-3">Sometimes we may want to ignore word endings while searching so as to match different variations of the term. Change the <AutoText text="stem"/> option from <AutoText text="off"/> to <AutoText text="on"/>. This will change the search settings from the default, which is that the <AutoText key="coredm::_preferences:textnostem_"/> to <AutoText key="coredm::_preferences:textstem_"/>. Now try searching for <i>econom</i> again, 9 documents are found.</Text>
     4473<Text id="indexers-27a-3">Please note that word endings are determined according to the third-party stemming tables incorporated in Greenstone, not by the user. Thus the searches may not do precisely what is expected, especially when cultural variations or dialects are concerned. Besides, not all languages support stemming, only English and French have stemming at the moment.</Text>
     4474<Text id="indexers-27b-3">Change the <AutoText text="stem"/> option back to <AutoText text="off"/> (to <AutoText key="coredm::_preferences:textnostem_"/>) to avoid confusion later on.</Text>
     4475</NumberedItem>
     4476<NumberedItem>
     4477<Text id="indexers-28-3">Sometimes we may want to search for the exact term, that is, differentiate the upper cases from lower cases. In the <AutoText text="advanced search form"/> page, the default settings already insist that upper/lower case must match (case stemming is off). If you want to ignore case when searching, switch <AutoText text="case"/> folding to <AutoText text="on"/> (<AutoText key="coredm::_preferences:textignorecase_"/>). Now try searching for <i>fao</i> and <i>FAO</i> respectively this time. Notice the search results are the same for both this time.</Text>
     4478</NumberedItem>
     4479</MajorVersion>
    44524480<Heading>
    44534481<Text id="indexers-29">Use search mode hotkeys with query term</Text>
    44544482</Heading>
    44554483<Comment>
    4456 <Text id="mgpp-1">MGPP has several hotkeys for setting the search modes for a query term. These hotkeys explicitly set the <AutoText key="coredm::_preferences:textwordends_"/> option and the <AutoText key="coredm::_preferences:textcasediffs_"/> option for the query being constructed.</Text>
    4457 </Comment>
    4458 <NumberedItem>
    4459 <Text id="mgpp-2"><b>#s</b> and <b>#u</b> are hotkeys for the <AutoText key="coredm::_preferences:textwordends_"/> option. Appending <b>#s</b> to a query term will specifically enable the <AutoText key="coredm::_preferences:textstem_"/> function. For example, try search for <i>econom#s</i>, 9 documents are found, which is the same as in step 17. Remember that we have set it back to <AutoText key="coredm::_preferences:textnostem_"/>. This means using hotkeys will override the current preference settings.</Text>
     4484<Text id="mgpp-1">MGPP has several hotkeys for setting the search modes for a query term. These hotkeys explicitly set the <AutoText key="coredm::_preferences:textwordends_"/> option and the <AutoText key="coredm::_preferences:textcasediffs_"/> option for the query being constructed. <MajorVersion number="3">Use them in the plain <AutoText text="text search"/> or <AutoText text="form search"/>.</MajorVersion></Text>
     4485</Comment>
     4486<NumberedItem>
     4487<Text id="mgpp-2"><b>#s</b> and <b>#u</b> are hotkeys for the <AutoText key="coredm::_preferences:textwordends_"/> option. Appending <b>#s</b> to a query term will specifically enable the <AutoText key="coredm::_preferences:textstem_"/> function. For example, <MajorVersion number="3">click on the <AutoText text="Form search"/> button and </MajorVersion>try searching for <i>econom#s</i>. 9 documents are found, which is the same as in step 17.<MajorVersion number="2"> Remember that we have set it back to <AutoText key="coredm::_preferences:textnostem_"/>. This means using hotkeys will override the current preference settings.</MajorVersion></Text>
    44604488</NumberedItem>
    44614489<NumberedItem>
    44624490<Text id="mgpp-3">Appending <b>#u</b> to a query term will explicitly set the current search to <AutoText key="coredm::_preferences:textnostem_"/>. </Text>
    4463 <Text id="mgpp-4">Note that using hotkeys will only affect that query term. That is, hotkeys are used per term. For example, if a query expression contains more than one term, some terms can have hotkeys and others not, and the hotkeys can be different for different terms. This provides a fine-grained control of the query, whereas changing settings in the <AutoText key="coredm::_Global:linktextPREFERENCES_"/> page will affect the query as a whole.</Text>
    4464 </NumberedItem>
    4465 <NumberedItem>
    4466 <Text id="mgpp-5">Hotkeys <b>#i</b> and <b>#c</b> control the case sensitivity. Appending <b>#i</b> to a query term will explicitly set the search to <AutoText key="coredm::_preferences:textignorecase_"/> (i.e. case insensitive).</Text>
    4467 </NumberedItem>
    4468 <NumberedItem>
    4469 <Text id="mgpp-6">In contrast, appending <b>#c</b> will specifically turn off the casefolding, that is, <AutoText key="coredm::_preferences:textmatchcase_"/>. For example, search for <i>fao#c</i> returns 0 documents.</Text>
     4491<Text id="mgpp-4">Note that using hotkeys will only affect that query term. That is, hotkeys are used per term. For example, if a query expression contains more than one term, some terms can have hotkeys and others not, and the hotkeys can be different for different terms. This provides a fine-grained control of the query, whereas <MajorVersion number="2">changing settings in the <AutoText key="coredm::_Global:linktextPREFERENCES_"/> page will affect the query as a whole</MajorVersion><MajorVersion number="3">changing the controls for a search field in the <AutoText text="advanced search form"/> page will apply to all the query terms in that field</MajorVersion>.</Text>
     4492</NumberedItem>
     4493<NumberedItem>
     4494<Text id="mgpp-5">Hotkeys <b>#i</b> and <b>#c</b> control the case sensitivity. Appending <b>#i</b> to a query term will explicitly set the search to <AutoText key="coredm::_preferences:textignorecase_"/> (i.e. case insensitive).<MajorVersion number="3"> For example, search for <i>fao#i</i> returns 11 documents.</MajorVersion></Text>
     4495</NumberedItem>
     4496<NumberedItem>
     4497<Text id="mgpp-6">In contrast, appending <b>#c</b> will specifically turn off the casefolding, that is, <AutoText key="coredm::_preferences:textmatchcase_"/>.<MajorVersion number="2"> For example, search for <i>fao#c</i> returns 0 documents.</MajorVersion></Text>
    44704498</NumberedItem>
    44714499<NumberedItem>