Changeset 27116

Timestamp:
25.03.2013 08:59:24
Author:
jlwhisler
Message:

Changes to the tutorials from "large HTML" through "downloading from the internet". Added |3.05 to the current attribute of each Version element. Added AutoText keys for GS3 that correspond to GS2's coredm keys, so that tutorials for GS3 can be generated properly from a Greenstone 3 installation. Changed the demo file path to match GS3, and updated the menu path for viewing source in Firefox.
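
The version switching described in the message can be sketched as follows. This is a minimal illustration, not the actual Greenstone tutorial generator: the helper names `autotext_keys` and `applies_to` are hypothetical, and the reading of `current="2.85|3.05"` as a pipe-separated list with one entry per major release is an assumption drawn from the diff below. Only the element and attribute names (`Text`, `MajorVersion`, `AutoText`, `number`, `key`) come from the changed file.

```python
import xml.etree.ElementTree as ET

# A simplified copy of the new Text id="0444" element (prose shortened,
# no HTML entities so that it parses without a DTD).
SNIPPET = (
    '<Text id="0444">Choose the new '
    '<MajorVersion number="2"><AutoText key="coredm::_Global:labelSubject_"/></MajorVersion>'
    '<MajorVersion number="3"><AutoText key="gs3::metadata_names::Subjects.buttonname"/></MajorVersion>'
    ' link.</Text>'
)


def autotext_keys(text_xml, major):
    """Collect AutoText keys that apply to one major version (hypothetical helper)."""
    keys = []

    def walk(elem):
        # Skip any MajorVersion block written for a different major release.
        if elem.tag == "MajorVersion" and elem.get("number") != major:
            return
        if elem.tag == "AutoText" and elem.get("key") is not None:
            keys.append(elem.get("key"))
        for child in elem:
            walk(child)

    walk(ET.fromstring(text_xml))
    return keys


def applies_to(current, major):
    """Assumed semantics: 'current' lists versions separated by '|'."""
    return any(v.split(".")[0] == major for v in current.split("|"))


if __name__ == "__main__":
    print(autotext_keys(SNIPPET, "2"))
    print(autotext_keys(SNIPPET, "3"))
    print(applies_to("2.85|3.05", "3"))
```

Under these assumptions, a tutorial whose Version element reads `current="2.85"` would be skipped for a Greenstone 3 build, while one reading `current="2.85|3.05"` would be included with only the `gs3::` keys resolved.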

Files:
1 modified

  • documentation/trunk/tutorials/xml-source/tutorial_en.xml

    r27115 r27116  
    16481648</Title> 
    16491649<SampleFiles folder="tudor"/> 
    1650 <Version initial="2.60" current="2.85"/> 
     1650<Version initial="2.60" current="2.85|3.05"/> 
    16511651<Content> 
    16521652<Comment> 
     
    16721672</Heading> 
    16731673<NumberedItem> 
    1674 <Text id="0393">The browsing facilities in this collection (<AutoText key="coredm::_Global:labelTitle_" type="italics"/> and <AutoText key="coredm::_Global:labelSource_" type="italics"/>) are based entirely on extracted metadata. Switch to the <AutoText key="glidict::GUI.Enrich"/> panel in the Librarian Interface and examine the metadata that has been extracted for some of the files.</Text> 
    1675 </NumberedItem> 
    1676 <NumberedItem> 
    1677 <Text id="0393a">Many HTML documents contain metadata in <Format>&lt;meta&gt;</Format> tags in the <Format>&lt;head&gt;</Format> of the page. Open up the <Path>englishhistory.net &rarr; tudor &rarr; monarchs &rarr; boleyn.html</Path> file by navigating to it in the tree on the left hand side, and double clicking it. This will open it in a web browser. View the HTML source of the page (<Menu>View &rarr; Source</Menu> in Internet Explorer, <Menu>View &rarr; Page Source</Menu> in Mozilla). You will notice that this page has <AutoText text="page_topic, content" type="italics"/> and <AutoText text="author" type="italics"/> metadata.</Text> 
     1674<Text id="0393">The browsing facilities in this collection <MajorVersion number="2">(<AutoText key="coredm::_Global:labelTitle_" type="italics"/> and <AutoText key="coredm::_Global:labelSource_" type="italics"/>)</MajorVersion><MajorVersion number="3">(<AutoText key="gs3::metadata_names::Title.buttonname" /> 
     1675 and <AutoText key="gs3::metadata_names::Source.buttonname" />)</MajorVersion> are based entirely on extracted metadata. Switch to the <AutoText key="glidict::GUI.Enrich"/> panel in the Librarian Interface and examine the metadata that has been extracted for some of the files.</Text> 
     1676</NumberedItem> 
     1677<NumberedItem> 
     1678<Text id="0393a">Many HTML documents contain metadata in <Format>&lt;meta&gt;</Format> tags in the <Format>&lt;head&gt;</Format> of the page. Open up the <Path>englishhistory.net &rarr; tudor &rarr; monarchs &rarr; boleyn.html</Path> file by navigating to it in the tree on the left hand side, and double clicking it. This will open it in a web browser. View the HTML source of the page (<Menu>View &rarr; Source</Menu> in Internet Explorer, <Menu>Tools &rarr; Web Developer &rarr; Page Source</Menu> in Mozilla). You will notice that this page has <AutoText text="page_topic, content" type="italics"/> and <AutoText text="author" type="italics"/> metadata.</Text> 
    16781679 </NumberedItem> 
    16791680<NumberedItem> 
     
    17091710</Title> 
    17101711<Prerequisite id="large_html_collection"/> 
    1711 <Version initial="2.60" current="2.85"/> 
     1712<Version initial="2.60" current="2.85|3.05"/> 
    17121713<Content> 
    17131714<Comment> 
     
    17391740</NumberedItem> 
    17401741<NumberedItem> 
    1741 <Text id="0444">Now switch to the <AutoText key="glidict::GUI.Create"/> panel, <b>build</b> the collection, and <b>preview</b> it. Choose the new <AutoText key="coredm::_Global:labelSubject_"/> link that appears in the navigation bar, and click the bookshelves to navigate around the four-entry hierarchy that you have created.</Text> 
     1742<Text id="0444">Now switch to the <AutoText key="glidict::GUI.Create"/> panel, <b>build</b> the collection, and <b>preview</b> it. Choose the new <MajorVersion number="2"><AutoText key="coredm::_Global:labelSubject_"/></MajorVersion><MajorVersion number="3"><AutoText key="gs3::metadata_names::Subjects.buttonname" /></MajorVersion> link that appears in the navigation bar, and click the bookshelves to navigate around the four-entry hierarchy that you have created.</Text> 
    17421743</NumberedItem> 
    17431744<Heading> 
     
    17541755</NumberedItem> 
    17551756<NumberedItem> 
    1756 <Text id="0460"><b>Build</b> the collection again, <b>preview</b> it, and try out the new <AutoText key="coredm::_Global:labelPhrase_"/> option in the navigation bar. An interesting PHIND search term for this collection is <AutoText text="king" type="quoted"/>. Note that even though it is called a phrase browser, only single terms can be used as the starting point for browsing.</Text> 
     1757<Text id="0460"><b>Build</b> the collection again, <b>preview</b> it, and try out the new <MajorVersion number="2"><AutoText key="coredm::_Global:labelPhrase_"/></MajorVersion><MajorVersion number="3"><AutoText key="gs3::PhindPhraseBrowse::PhindApplet.name" /></MajorVersion> option in the navigation bar. An interesting PHIND search term for this collection is <AutoText text="king" type="quoted"/>. Note that even though it is called a phrase browser, only single terms can be used as the starting point for browsing.</Text> 
    17571758</NumberedItem> 
    17581759<Heading> 
     
    18031804</NumberedItem> 
    18041805<NumberedItem> 
    1805 <Text id="0464">Preview the newly rebuilt collection's <AutoText key="coredm::_Global:labelTitle_"/> page. Previously this listed more than a dozen pages per letter of the alphabet, but now there are just three&mdash;the first three files encountered by the building process.</Text> 
     1806<Text id="0464">Preview the newly rebuilt collection's <MajorVersion number="2"><AutoText key="coredm::_Global:labelTitle_"/></MajorVersion><MajorVersion number="3"><AutoText key="gs3::metadata_names::Title.buttonname" /></MajorVersion> page. Previously this listed more than a dozen pages per letter of the alphabet, but now there are just three&mdash;the first three files encountered by the building process.</Text> 
    18061807</NumberedItem> 
    18071808<NumberedItem> 
     
    18151816</Title> 
    18161817<Prerequisite id="large_html_collection"/> 
    1817 <Version initial="2.60" current="2.85"/> 
     1818<Version initial="2.60" current="2.85|3.05"/> 
    18181819<Content> 
    18191820<NumberedItem> 
     
    18331834</Indent> 
    18341835<Text id="0472">for a particular document whose <i>Title</i> metadata is <AutoText text="A discussion of question five from Tudor Quiz: Henry VIII"/> and whose <i>Source</i> metadata is <AutoText text="quizstuff.html"/>.</Text> 
    1835 <Text id="0473">This format appears in the search results list, in the <AutoText key="coredm::_Global:labelTitle_"/> list, and also when you get down to individual documents in the <AutoText key="coredm::_Global:labelSubject_"/> hierarchy. This is Greenstone's default format statement<MajorVersion number="3"> used in the <AutoText text="browse"/> and <AutoText text="search"/> format features.</MajorVersion>.</Text> 
     1836<MajorVersion number="2"> 
     1837<Text id="0473a">This format appears in the search results list, in the <AutoText key="coredm::_Global:labelTitle_"/> list, and also when you get down to individual documents in the <AutoText key="coredm::_Global:labelSubject_"/> hierarchy. This is Greenstone's default format statement.</Text> 
     1838</MajorVersion> 
     1839<MajorVersion number="3"> 
     1840<Text id="0473b">This format appears in the search results list, in the <AutoText key="gs3::metadata_names::Title.buttonname" /> list, and also when you get down to individual documents in the <AutoText key="gs3::metadata_names::Subjects.buttonname" /> hierarchy. This is Greenstone's default format statement used in the <AutoText text="browse"/> and <AutoText text="search"/> format features.</Text> 
     1841</MajorVersion> 
    18361842</NumberedItem> 
    18371843<Comment> 
     
    18661872<Text id="0475-3a">Replace the <AutoText text="search"/> format feature with the above format statement too.</Text> 
    18671873</MajorVersion> 
    1868 <Text id="0476"><b>Preview</b> the result (you don't need to build the collection, because changes to format statements take effect immediately). Look at some search results and at the <AutoText key="coredm::_Global:labelTitle_"/> list. They are just the same as before! Under most circumstances this far simpler format statement is entirely equivalent to Greenstone's more complex default.</Text> 
     1874<Text id="0476"><b>Preview</b> the result (you don't need to build the collection, because changes to format statements take effect immediately). Look at some search results and at the <MajorVersion number="2"><AutoText key="coredm::_Global:labelTitle_"/></MajorVersion><MajorVersion number="3"><AutoText key="gs3::metadata_names::Title.buttonname" /></MajorVersion> list. They are just the same as before! Under most circumstances this far simpler format statement is entirely equivalent to Greenstone's more complex default.</Text> 
    18691875<MajorVersion number="3"> 
    18701876<Text id="0476-3">We can also reduce the <AutoText text="VList classifierNode"/> template of the <AutoText text="browse"/> format feature further, also without changing the display. Replace it with:</Text> 
     
    19391945</NumberedItem> 
    19401946<NumberedItem> 
    1941 <Text id="0486"><b>Preview</b> the <AutoText key="coredm::_Global:labelSubject_"/> list in the collection. <MajorVersion number="2">First, the offending "()" has disappeared from the bookshelves. Second, when</MajorVersion><MajorVersion number="3">When</MajorVersion> you get down to a list of documents in the subject hierarchy, the filename does not appear beside the title, because <AutoText key="metadata::ex.Source"/> is not specified in the format statement and this format statement applies to all nodes in the <i>subject</i> classifier. Note that the search results and titles lists have not changed: they still display the filename underneath the title.</Text> 
     1947<Text id="0486"><b>Preview</b> the <MajorVersion number="2"><AutoText key="coredm::_Global:labelSubject_"/></MajorVersion><MajorVersion number="3"><AutoText key="gs3::metadata_names::Subjects.buttonname" /></MajorVersion> list in the collection. <MajorVersion number="2">First, the offending "()" has disappeared from the bookshelves. Second, when</MajorVersion><MajorVersion number="3">When</MajorVersion> you get down to a list of documents in the subject hierarchy, the filename does not appear beside the title, because <AutoText key="metadata::ex.Source"/> is not specified in the format statement and this format statement applies to all nodes in the <i>subject</i> classifier. Note that the search results and titles lists have not changed: they still display the filename underneath the title.</Text> 
    19421948</NumberedItem> 
    19431949<NumberedItem> 
     
    19781984</NumberedItem> 
    19791985<NumberedItem> 
    1980 <Text id="0494">Finally, let's return to the <AutoText key="coredm::_Global:labelSubject_" type="italics"/> hierarchy and learn how to do different things to the bookshelves and to the documents themselves. <MajorVersion number="2">In the <AutoText key="glidict::CDM.FormatManager.Feature"/> menu, re-select the item</MajorVersion><MajorVersion number="3">Reselect the format feature for</MajorVersion></Text> 
     1986<Text id="0494">Finally, let's return to the <MajorVersion number="2"><AutoText key="coredm::_Global:labelSubject_"/></MajorVersion><MajorVersion number="3"><AutoText key="gs3::metadata_names::Subjects.buttonname" /></MajorVersion> hierarchy and learn how to do different things to the bookshelves and to the documents themselves. <MajorVersion number="2">In the <AutoText key="glidict::CDM.FormatManager.Feature"/> menu, re-select the item</MajorVersion><MajorVersion number="3">Reselect the format feature for</MajorVersion></Text> 
    19811987<Indent> 
    19821988CL2<MajorVersion number="2">:</MajorVersion> Hierarchy -metadata <AutoText key="metadata::dc.Subject" type="plain"/> 
     
    20192025<Text id="st-1">Section tagging for HTML documents</Text> 
    20202026</Title> 
    2021 <Version initial="2.70w" current="2.85"/> 
     2027<Version initial="2.70w" current="2.85|3.05"/> 
    20222028<Content> 
    20232029<NumberedItem> 
     
    20252031</NumberedItem> 
    20262032<NumberedItem> 
    2027 <Text id="st-2">Using a text editor (e.g. WordPad) open up one of the HTML files from the demo collection: <Path>Greenstone &rarr; collect &rarr; demo &rarr; import &rarr; fb33fe &rarr;fb33fe.htm</Path>. You will see some HTML comments which contain section information for Greenstone. They look like:</Text> 
     2033<Text id="st-2">Using a text editor (e.g. WordPad) open up one of the HTML files from the demo collection:  
     2034<MajorVersion number="2"> 
     2035<Path>Greenstone &rarr; collect &rarr; demo &rarr; import &rarr; fb33fe &rarr; fb33fe.htm</Path> 
     2036</MajorVersion> 
     2037<MajorVersion number="3"> 
     2038<Path>Greenstone3 &rarr; web &rarr; sites &rarr; localsite &rarr; collect &rarr; lucene-jdbm-demo &rarr; import &rarr; fb33fe &rarr; fb33fe.htm</Path> 
     2039</MajorVersion> 
     2040. You will see some HTML comments which contain section information for Greenstone. They look like:</Text> 
    20282041<Format> 
    20292042&lt;!--<br/> 
     
    20432056--&gt; 
    20442057</Format> 
    2045 <Text id="st-3">When Greenstone encounters a <Format>&lt;Section&gt;</Format> tag in one of these comments, it will start a new subsection of the document. This will be closed when a <Format>&lt;/Section&gt;</Format> tag is encountered. Metadata can also be added for each section&mdash;in this case, <AutoText text="Title"/> metadata has been added for each section. In the browser, find the <AutoText text="Farming snails 1"/> document in the demo collection (through the <AutoText key="coredm::_Global:labelTitle_" type="italics"/> browser). Look at its table of contents and compare it to the <Format>&lt;Section&gt;</Format> tags in the HTML document.</Text> 
     2058<Text id="st-3">When Greenstone encounters a <Format>&lt;Section&gt;</Format> tag in one of these comments, it will start a new subsection of the document. This will be closed when a <Format>&lt;/Section&gt;</Format> tag is encountered. Metadata can also be added for each section&mdash;in this case, <AutoText text="Title"/> metadata has been added for each section. In the browser, find the <AutoText text="Farming snails 1"/> document in the demo collection (through the <MajorVersion number="2"><AutoText key="coredm::_Global:labelTitle_" type="italics"/></MajorVersion><MajorVersion number="3"><AutoText key="gs3::metadata_names::Title.buttonname" type="italics"/></MajorVersion> browser). Look at its table of contents and compare it to the <Format>&lt;Section&gt;</Format> tags in the HTML document.</Text> 
    20462059</NumberedItem> 
    20472060<NumberedItem> 
     
    20752088<Text id="0411">Downloading files from the web</Text> 
    20762089</Title> 
    2077 <Version initial="2.60" current="2.85"/> 
     2090<Version initial="2.60" current="2.85|3.05"/> 
    20782091<Content> 
    20792092<Comment>