Changeset 24535

Timestamp:
31.08.2011 21:15:56
Author:
ak19
Message:

GS v2.85 changes to tutorials from Word and PDF upto and including the Web Download tutorials.

Files:
1 modified

  • documentation/trunk/tutorials/xml-source/tutorial_en.xml

    r24515 r24535  
     579  579 <NumberedItem>
     580  580 <Text id="0275a">Hyperlinks in a Greenstone collection work like this: If the link is to a document that is also in the collection, clicking it takes you to that document in the collection. If the link is to a document that is <i>not</i> in the collection, clicking it takes you to that document on the web.</Text>
     581      <Text id="0257b">Go back to the web browser and click the <Path>titles</Path>  link near the top of the page. Open the file <Path>boleyn.html</Path> and look for the link to <Path>Katharine of Aragon</Path> (in the 5th paragraph of the <Path>Biography</Path> section). This links to a document inside the collection--<Path>aragon.html</Path>. View this document by clicking the link. For an external link, click <Path>letters written by Anne</Path> (in the <Path>Primary Sources</Path> section). This takes you out on to the web. If you want a warning message to be displayed first, you can open <Path>Greenstone &rarr; etc &rarr; main.cfg</Path> file and uncomment the line <Format>cgiarg shortname=el argdefault=prompt</Format>. Note, that if you are already browsing a collection, then you will need to go back to the home page and re-enter the collection to see this take effect (due to caching of the el argument).</Text>
          581 <Text id="0257b">Go back to the web browser and click the <Path>titles</Path>  link near the top of the page. Open the file <Path>boleyn.html</Path> and look for the link to <Path>Katharine of Aragon</Path> (in the 5th paragraph of the <Path>Biography</Path> section). This links to a document inside the collection--<Path>aragon.html</Path>. View this document by clicking the link. For an external link, click <Path>letters written by Anne</Path> (in the <Path>Primary Sources</Path> section). This takes you out on to the web. If you want a warning message to be displayed first, you can open <Path>Greenstone &rarr; etc &rarr; main.cfg</Path> file and uncomment the line <Format>cgiarg shortname=el argdefault=prompt</Format> (remove the # at the start of a line to uncomment it). Note, that if you are already browsing a collection, then you will need to go back to the home page and re-enter the collection to see this take effect (due to caching of the el argument).</Text>
     582  582 </NumberedItem>
     583  583 <Heading>
     
     604  604 </NumberedItem>
     605  605 <NumberedItem>
     606      <Text id="0341">Copy the images provided in <Path>sample_files &rarr; images</Path> into your newly-formed collection.</Text>
          606 <Text id="0341">Copy the images (avoid the README.TXT file) provided in <Path>sample_files &rarr; images</Path> into your newly-formed collection.</Text>
     607  607 </NumberedItem>
     608  608 <NumberedItem>
     
     700  700 </Heading>
     701  701 <NumberedItem>
     702      <Text id="0385a">Now we'll add an index so that the collection can be searched by descriptions. Switch to the <AutoText key="glidict::GUI.Design"/> panel and select <AutoText key="glidict::CDM.GUI.Indexes"/> from the left-hand list. Click the <AutoText key="glidict::CDM.IndexManager.New_Index" type="button"/> button. Select <AutoText key="metadata::dc.Description"/> from the list of metadata to include in the index and click <AutoText key="glidict::CDM.IndexManager.Add_Index" type="button"/>. Leave <AutoText key="glidict::CDM.IndexManager.Level"/> at its default, "document".</Text>
          702 <Text id="0385a">Now we'll add an index so that the collection can be searched by descriptions. Switch to the <AutoText key="glidict::GUI.Design"/> panel and select <AutoText key="glidict::CDM.GUI.Indexes"/> from the left-hand list. Click the <AutoText key="glidict::CDM.IndexManager.New_Index" type="button"/> button. Select <AutoText key="metadata::dc.Description"/> from the list of metadata to include in the index and click <AutoText key="glidict::CDM.IndexManager.Add_Index" type="button"/>. Leave <AutoText key="glidict::CDM.LevelManager.Level_Title"/> at its default, "document".</Text>
     703  703 </NumberedItem>
     704  704 <NumberedItem>
     
     781  781 </NumberedItem>
     782  782 <NumberedItem>
     783      <Text id="0310b">By default the titles index (<b><AutoText key="metadata::dc.Title" type="plain"/>,<AutoText key="metadata::ex.Title" type="plain"/></b>) includes <AutoText key="metadata::dc.Title"/> and <AutoText key="metadata::ex.Title"/>. Searching this index will search both <AutoText key="metadata::dc.Title"/> and <AutoText key="metadata::ex.Title"/> metadata. If you wanted to restrict searching to just the manually added <AutoText key="metadata::dc.Title"/> metadata, edit this index and deselect <AutoText key="metadata::ex.Title"/> from the list of metadata.</Text>
          783 <Text id="0310b">By default the titles index (<b><AutoText key="metadata::dc.Title" type="plain"/>,<AutoText key="metadata::ex.dc.Title" type="plain"/>,<AutoText key="metadata::ex.Title" type="plain"/></b>) includes <AutoText key="metadata::dc.Title"/>, <AutoText key="metadata::ex.dc.Title"/> and <AutoText key="metadata::ex.Title"/>. Searching this index will search <AutoText key="metadata::dc.Title"/>, <AutoText key="metadata::ex.dc.Title"/> and <AutoText key="metadata::ex.Title"/> metadata. If you wanted to restrict searching to just the manually added <AutoText key="metadata::dc.Title"/> metadata, edit this index and deselect <AutoText key="metadata::ex.dc.Title"/> and <AutoText key="metadata::ex.Title"/> from the list of metadata.</Text>
     784  784 </NumberedItem>
     785  785 <NumberedItem>
     
     843  843 </Format>
     844  844 <Text id="fw-5">This format statement is the default used for any vertical list, such as search results, classifiers, and document table of contents.</Text>
     845      <Text id="fw-6"><Format>{Or}{[ex.thumbicon],[ex.srcicon]}</Format> chooses <i>ex.thumbicon</i> metadata if its there, otherwise chooses <i>ex.srcicon</i> metadata. If neither are present, nothing is displayed. For this collection there is no <i>ex.thumbicon</i> metadata so the choice is not needed.</Text>
          845 <Text id="fw-6"><Format>{Or}{[ex.thumbicon],[ex.srcicon]}</Format> chooses <i>ex.thumbicon</i> metadata if it's there, otherwise chooses <i>ex.srcicon</i> metadata. If neither are present, nothing is displayed. For this collection there is no <i>ex.thumbicon</i> metadata so the choice is not needed.</Text>
     846  846 <Text id="fw-7">Replace <Format>{Or}{[ex.thumbicon],[ex.srcicon]}</Format> (highlighted above) with <Format>[ex.srcicon]</Format>.  </Text>
     847  847 <Text id="fw-8">There is no <i>exp.Title</i> metadata, so remove that element from <Format>{Or}{[dc.Title],[exp.Title],[ex.Title],Untitled}</Format>.</Text>
     
     946  946 <Text id="ep-3a">In the Librarian Interface, start a new collection called "PDF collection" and base it on <AutoText key="glidict::NewCollectionPrompt.NewCollection"/>.</Text>
     947  947 <Text id="ep-3b">In the <AutoText key="glidict::GUI.Gather"/> panel, drag just the PDF documents from <Path>sample_files &rarr; Word_and_PDF &rarr; Documents</Path> into the new collection. Also drag in the PDF documents from <Path>sample_files &rarr; Word_and_PDF &rarr; difficult_pdf</Path>.</Text>
     948      <Text id="ep-3c">Go to the <AutoText key="glidict::GUI.Create"/> panel and build the collection. Examine the output from the build process. You will notice that one of the documents could not be processed. The following messages are shown: "The file pdf05-notext.pdf was recognised but could not be processed by any plugin.", and "3 were processed and included in the collection. 1 was rejected".</Text>
          948 <Text id="ep-3c">Go to the <AutoText key="glidict::GUI.Create"/> panel and build the collection. Examine the output from the build process. You will notice that one of the documents could not be processed. The following messages are shown: "The file pdf05-notext.pdf was recognised but could not be processed by any plugin.", and "3 documents were processed and included in the collection. 1 was rejected".</Text>
     949  949 </NumberedItem>
     950  950 <NumberedItem>
     
     958  958 </Comment>
     959  959 <NumberedItem>
     960      <Text id="0335">Use the <AutoText key="glidict::Menu.File_Options"/> item on the <AutoText key="glidict::Menu.File"/> menu to switch to <AutoText key="glidict::Preferences.Mode.Expert"/> mode and then build the collection again. The <AutoText key="glidict::GUI.Create"/> panel looks different in <AutoText key="glidict::Preferences.Mode.Expert"/> mode because it gives more options: locate the <AutoText key="glidict::CreatePane.Build_Collection" type="button"/> button, near the bottom of the window, and click it. Now a message appears saying that the file could not be processed, and why. Amongst all the output, we get the following message: "Error: PDF contains no extractable text. Could not convert pdf05-notext.pdf to HTML format". pdftohtml.pl cannot convert a PDF file to HTML if the PDF file has no extractable text.</Text>
          960 <Text id="0335">Use the <AutoText key="glidict::Menu.File_Options"/> item on the <AutoText key="glidict::Menu.File"/> menu, <AutoText key="glidict::Preferences.Mode"/> tab, to switch to <AutoText key="glidict::Preferences.Mode.Expert"/> mode and then build the collection again. The <AutoText key="glidict::GUI.Create"/> panel looks different in <AutoText key="glidict::Preferences.Mode.Expert"/> mode because it gives more options: locate the <AutoText key="glidict::CreatePane.Build_Collection" type="button"/> button, near the bottom of the window, and click it. Now a message appears saying that the file could not be processed, and why. Amongst all the output, we get the following message: "Error: PDF contains no extractable text. Could not convert pdf05-notext.pdf to HTML format". pdftohtml.pl cannot convert a PDF file to HTML if the PDF file has no extractable text.</Text>
     961  961 </NumberedItem>
     962  962 <NumberedItem>
     
    1087 1087 <Text id="ew-5"><b>Build</b> the collection. You will notice that the Microsoft Word program is started up for each Word document&mdash;the document is saved as HTML from Word itself, to get a better conversion. <b>Preview</b> the collection. In the <AutoText key="coredm::_Global:labelTitle_"/> list, notice that <Path>word03.doc</Path> and <Path>word06.doc</Path> now have a book icon, rather than a page icon. These now appear with hierarchical structure.</Text>
    1088 1088 <Text id="ew-6">The default behaviour for <AutoText text="WordPlugin"/> with <AutoText text="windows_scripting"/> is to section the document based on <AutoText text="Heading 1" type="quoted"/>, <AutoText text="Heading 2" type="quoted"/>, <AutoText text="Heading 3" type="quoted"/> styles. If you open up the <Path>word03.doc</Path> or <Path>word06.doc</Path> documents in Word, you will see that the sections use these Heading styles.</Text>
    1089      <Text id="ew-7">Note, to view style information in Word 2003, you can select <Menu>Format &rarr; Styles and Formatting</Menu> from the menu, and a side bar will appear on the right hand side. Click on a section heading and the formatting information will be displayed in this side bar.</Text>
         1089 <Text id="ew-7">Note, to view style information in Word 2003, you can select <Menu>Format &rarr; Styles and Formatting</Menu> from the menu, and a side bar will appear on the right hand side. (In Word 2007 and later, find the <Menu>Change Styles</Menu> button on the far right of the menu ribbon. Click on the tiny <Menu>Expand</Menu> icon to its bottom right to display the styles side bar.) Click on a section heading and the formatting information will be displayed in this side bar.</Text>
    1090 1090 </NumberedItem>
    1091 1091 <NumberedItem>
     
    1172 1172 </NumberedItem>
    1173 1173 <NumberedItem>
    1174      <Text id="ew-27">In the <AutoText key="glidict::GUI.Enrich"/> panel, look at the metadata that has been extracted for <Path>word05.doc</Path> and <Path>word06.doc</Path>. Now open the documents in Word and look at what properties have been set (<Menu>File &rarr; Properties</Menu> for Word 2003). They have Title, Author, Subject, and Keywords properties. <AutoText text="WordPlugin"/> can be configured to look for these properties and extract them.</Text>
         1174 <Text id="ew-27">In the <AutoText key="glidict::GUI.Enrich"/> panel, look at the metadata that has been extracted for <Path>word05.doc</Path> and <Path>word06.doc</Path>. Now open the documents in Word and look at what properties have been set (<Menu>File &rarr; Properties</Menu> for Word 2003. In Word 2007+, click the Word Icon on the top left, then choose <Menu>Prepare &rarr; Properties</Menu>). They have Title, Author, Subject, and Keywords properties. <AutoText text="WordPlugin"/> can be configured to look for these properties and extract them.</Text>
    1175 1175 </NumberedItem>
    1176 1176 <NumberedItem>
     
    1558 1558 </NumberedItem>
    1559 1559 <NumberedItem>
    1560      <Text id="0418">Now click <AutoText key="glidict::Mirroring.Download" type="button"/>. If you have set proxy information in <AutoText key="glidict::Menu.File_Options"/>, a popup will ask for your user name and password. Once the download has started, a progress bar appears in the lower half of the panel that reports on how the downloading process is doing.</Text>
         1560 <Text id="0418">Now click <AutoText key="glidict::Mirroring.Download" type="button"/>. If you have set proxy information in <AutoText key="glidict::Menu.File_Options"/>, a popup will ask for your user name and password. If you're on Windows Vista or later, Windows may show a popup message asking whether you wish to block or unblock the download. In such a case, choose to unblock. Once the download has started, a progress bar appears in the lower half of the panel that reports on how the downloading process is doing.</Text>
    1561 1561 <Comment>
    1562 1562 <Text id="0419">More detailed information can be obtained by clicking <AutoText key="glidict::Mirroring.DownloadJob.Log" type="button"/>. The process can be paused and restarted as needed, or stopped altogether by clicking <AutoText key="glidict::Mirroring.DownloadJob.Close" type="button"/>. Downloading can be a lengthy process involving multiple sites, and so Greenstone allows additional downloads to be queued up. When new URLs are pasted into the <AutoText text="url"/> box and <AutoText key="glidict::Mirroring.Download" type="button"/> clicked, a new progress bar is appended to those already present in the lower half of the panel. When the currently active download item completes, the next is started automatically.</Text>