Changeset 37641 for documentation
- Timestamp: 2023-04-09T20:03:02+12:00 (13 months ago)
- Files: 1 edited
documentation/trunk/tutorials/xml-source/tutorial_en.xml
r37599 r37641
4862 4862   <MajorVersion number="3">gs3-setup.bat</MajorVersion>
4863 4863   </Command>
4864      - <Text id="0742">to set up the ability to run Greenstone command-line programs. On Linux/Mac, you would run <Command>source <MajorVersion number="2"> setup.bash</MajorVersion><MajorVersion number="3">gs3-setup.sh</MajorVersion></Command>.</Text>
     4864 + <Text id="0742">to set up the ability to run Greenstone command-line programs. On Linux/Mac, you would run <Command>source <MajorVersion number="2">./setup.bash</MajorVersion><MajorVersion number="3">./gs3-setup.sh</MajorVersion></Command>.</Text>
4865 4865   </NumberedItem>
4866 4866   <NumberedItem>
…    …
4936 4936   </NumberedItem>
4937 4937   <NumberedItem><Text id="ucp-11">Visit the <Link url="http://www.djvu.org/resources/djvu_digital_vs_super_hero_pdf.php">'DjVu-Digital vs. "Super Hero" PDF' page</Link>. The page compares a PDF sample document to its equivalent DjVu version and provides download links for both.</Text>
4938      - <Text id="ucp-11a">Download their <Link url="https://trac.greenstone.org/export/37353/documentation/trunk/tutorial_sample_files/unknownconverter/superhero.djvu">sample DjVu document</Link> (originally <Link url="http://www.djvu.org/docs/superhero.djvu?djvuopts&zoom=page">here</Link>) into your <i>DjVu Collection</i>'s <b>import</b> folder at <MajorVersion number="2"><Path>Greenstone → collect → djvucoll → import</Path> </MajorVersion><MajorVersion number="3"><Path>Greenstone → web → sites → localsite → collect → djvucoll → import</Path></MajorVersion>. </Text>
     4938 + <Text id="ucp-11a">Download their <Link url="https://trac.greenstone.org/export/37353/documentation/trunk/tutorial_sample_files/unknownconverter/superhero.djvu">sample DjVu document</Link> (originally <Link url="http://www.djvu.org/docs/superhero.djvu?djvuopts&zoom=page">here</Link>) into your <i>DjVu Collection</i>'s <b>import</b> folder at <MajorVersion number="2"><Path>Greenstone → collect → djvucoll → import</Path> </MajorVersion><MajorVersion number="3"><Path>Greenstone → web → sites → localsite → collect → djvucoll → import</Path></MajorVersion>. If you're offline, you can also get this file from <Path>sample_files → djvu → superhero.djvu</Path>.</Text>
4939 4939   </NumberedItem>
4940 4940   <NumberedItem><Text id="ucp-12">Back in GLI, in the <b>Collection</b> view of the <AutoText key="glidict::GUI.Gather"/> pane, right click and select <AutoText key="glidict::CollectionPopupMenu.Refresh"/>. You should now see your new document "superhero.djvu" ready to be built.</Text>
…    …
4998 4998   <NumberedItem><Text id="ucp-39">Greenstone doesn't have an icon for DjVu documents, since it doesn't know about the format. If you Google for the djvu icon, you'd probably find the <Link url="https://en.wikipedia.org/wiki/DjVu">Wikipedia page for it</Link>.</Text>
4999 4999   <Text id="ucp-40">Save one of their DjVu icon images. Then open the image in Windows Paint or GIMP or another image editor, and use the application's scaling feature to scale the image's height or the width (whichever is greater) to anywhere between 26 and 32 pixels. Save the scaled image as a GIF file with the name "<Format>idjvu.gif</Format>", storing it in your Greenstone installation's <Format>web/interfaces/default/images</Format> folder. You can also use free online image resizing websites to carry out this step.</Text>
     5000 + <Text id="ucp-40a">If you're working offline, you can get a resized and ready copy of the idjvu.gif file from <Path>sample_files → djvu → idjvu.gif</Path>. Put it in your Greenstone 3 installation's <Format>web/interfaces/default/images</Format> folder.</Text>
5000 5001   </NumberedItem>
5001 5002   <NumberedItem><Text id="ucp-41">Greenstone knows nothing about the <Format>icondjvu</Format> macro we defined as the value for UnknownConverterPlugin's <Format>srcicon</Format> field, so we have to teach Greenstone about this new macro. Use a text editor to open your Greenstone 3's <Format>web/sites/localsite/siteConfig.xml</Format> file.</Text>
…    …
5010 5011   </NumberedItem>
5011 5012   <NumberedItem><Text id="ucp-47">Having designed your collection to handle DjVu documents, you can now add any other documents, including more DjVu documents. Greenstone should now be able to index the text content of DjVu documents in the collection to make them searchable, in all instances where text can be successfully extracted from them by <Format>djvutxt</Format>.</Text>
5012      - <Text id="ucp-47a">Make the search format statement look like below , then try searching:</Text>
     5013 + <Text id="ucp-47a">Make the search format statement look like below (you can copy it from <Path>sample_files → djvu → formats → format_tweaks.txt</Path>), then try searching:</Text>
5013 5014   <Format>
5014 5015   <gsf:template match="documentNode"><br/>
…    …
5696 5697   </Format>
5697 5698   <Text id="ic-10b">As per the above manifest file, the operation to be performed by an incremental build is a <Delete> operation on two documents. For the delete operation, the documents are not indicated by the <Filename> XML element, but by the <OID> element which specifies the object identifier. We need to use the OID here because we're telling Greenstone precisely what the identifiers of the documents are that we wish to have removed from our collection. The identifiers of every built document in a Greenstone collection are specified in the Identifier field of the document's <i>doc.xml</i> file located in the collection's <Format>archives</Format> folder. The <i>doc.xml</i> file is the Greenstone-specific XML format in which Greenstone stores documents already imported.</Text>
5698      - <Text id="ic-10c">For instance, to find the identifier of the <i>b18ase.htm</i> document in your built collection, open up <Format><MajorVersion number="3">web\sites\localsite\</MajorVersion>collect\incremen\archives\b18ase -b.dir\doc.xml</Format> in a text editor. Then scroll down, looking for a piece of Greenstone extracted metadata labelled <i>Identifier</i>, which is the OID for this document:</Text>
     5699 + <Text id="ic-10c">For instance, to find the identifier of the <i>b18ase.htm</i> document in your built collection, open up <Format><MajorVersion number="3">web\sites\localsite\</MajorVersion>collect\incremen\archives\b18ase.dir\doc.xml</Format> in a text editor. Then scroll down, looking for a piece of Greenstone extracted metadata labelled <i>Identifier</i>, which is the OID for this document:</Text>
5699 5700   <!--<Format><Metadata name="Identifier">b18ase-b18ase_htm</Metadata></Format>-->
5700 5701   <Format><Metadata name="Identifier">b18ase</Metadata></Format>
…    …
5714 5715   <Format>perl -S incremental-buildcol.pl -activate <MajorVersion number="3">-site localsite</MajorVersion> incremen</Format>
5715 5716   <Text id="ic-12d">If you were to scroll through the buildcol output in the terminal this time, you would see the following:</Text>
5716      - <Format>GreenstoneXMLPlugin: processing fb33fe -f.dir\doc.xml<br />
5717      - GreenstoneXMLPlugin: processing b18ase -b.dir\doc.xml
     5717 + <Format>GreenstoneXMLPlugin: processing fb33fe.dir\doc.xml<br />
     5718 + GreenstoneXMLPlugin: processing b18ase.dir\doc.xml
5718 5719   </Format>
5719 5720   <Text id="ic-12e">Only these 2 files were actually processed by <Format>buildcol</Format>, and that's because the manifest specified they were being deleted.</Text>
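Reviewer's sketch for the ic-10c step: instead of scrolling through doc.xml in a text editor, the Identifier could be read with a few lines of Python. This is only an illustration, not part of the changeset or of Greenstone. The one detail taken from the tutorial text is the `<Metadata name="Identifier">` element. The helper name `find_identifier` and the surrounding `Archive`/`Section`/`Description` elements in the sample are assumptions made up for the example, and a real doc.xml may be laid out differently.

```python
import xml.etree.ElementTree as ET


def find_identifier(doc_xml_text):
    """Return the text of the first <Metadata name="Identifier"> element, or None.

    Hypothetical helper for illustration; it is not a Greenstone API.
    """
    root = ET.fromstring(doc_xml_text)
    # iter() walks the whole tree, so the nesting depth of Metadata does not matter.
    for elem in root.iter("Metadata"):
        if elem.get("name") == "Identifier":
            return (elem.text or "").strip()
    return None


# Assumed, heavily simplified stand-in for an archives doc.xml file.
# Only the Metadata line mirrors what the tutorial shows.
SAMPLE = """<Archive>
  <Section>
    <Description>
      <Metadata name="Identifier">b18ase</Metadata>
    </Description>
  </Section>
</Archive>"""

print(find_identifier(SAMPLE))
```

To try it on a real collection, read the contents of the doc.xml file named in ic-10c and pass that string to `find_identifier`. The value it returns would be the one to place in the manifest's `<OID>` element.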