Changeset 33034
- Timestamp: 2019-04-24T20:44:11+12:00 (5 years ago)
- File: 1 edited
documentation/trunk/tutorials/xml-source/tutorial_en.xml
r32994 r33034 443 443 <Text id="0218">To invoke the Greenstone Reader's Interface, go to the <i>Greenstone-2.87</i> item under <i>All Programs</i> on the Windows <i>Start</i> menu and select <i>Greenstone Server</i>, once the server window is displayed click <<b>Enter Library</b>>.</Text> 444 444 <Text id="0219">To invoke the Greenstone Librarian Interface, go to the same item and select <i>Librarian Interface (GLI)</i>.</Text> 445 <Text id="0220">On a Mac, one of the ways you can launch GLI is to click on the green <Format>gli</Format> application icon located in your Greenstone installation folder. On both linux and mac, you can also use a terminal to go into the Greenstone installation folder and then run the command <Format>./gli/gli.sh</Format> from there.</Text> 445 446 </Content> 446 447 </Tutorial> … … 536 537 <Text id="0256"> After a short pause a startup screen appears, and then after a slightly longer pause the main Greenstone Librarian Interface appears. (A command prompt is also opened in the background.)</Text> 537 538 </Comment> 539 <Text id="0256a">On a Mac, one of the ways you can launch GLI is to click on the green <Format>gli</Format> application icon located in your Greenstone installation folder. (On MacOS version 10.14.4, also known as Mojave, when you first run GLI in this manner, you may see a warning that future MacOS versions won't support this application. 
Dismiss the warning and proceed.)</Text> 540 <Text id="0256b">On linux as well as mac, you can run GLI by using a terminal to go into the Greenstone installation folder and then running the command <Format>./gli/gli.sh</Format> from there.</Text> 538 541 </NumberedItem> 539 542 <Heading> … … 4713 4716 <MajorVersion number="3">http://<hostname:portnumber>/greenstone3/oaiserver</MajorVersion> 4714 4717 </Format> 4718 <MajorVersion number="3"><Text id="gli-oai-5">(If you set up your Greenstone 3 server to operate over https, then adjust the above URL to have <Format>https</Format> as prefix and to contain the associated https port number instead.)</Text></MajorVersion> 4715 4719 <Text id="gli-oai-5">Make sure that you can generally access this URL from your browser.</Text> 4716 4720 <Text id="gli-oai-5a"><AutoText text="Visit the library home page"/>, as this will load the greenstone collections, so that any associated files like images or pdf documents become accessible for download. (Without visiting the library home page, the collections would not be loaded and the images from the Simple Images collection, that we will be downloading below alongside the oai files, will not be available for download.)</Text> 4717 4721 </NumberedItem> 4718 4722 <NumberedItem> 4719 <Text id="gli-oai-6">If the server is not running on <Format>localhost</Format> and your computer is behind a firewall or proxy server, you may need to edit the proxy settings in the Librarian Interface. Click the <AutoText key="glidict::Mirroring.Preferences" type="button"/> button. Switch on the <AutoText key="glidict::Preferences.Connection.Use_Proxy"/> checkbox. Enter the proxy server address and port number in the <AutoText key="glidict::Preferences.Connection.HTTP_Proxy_Host"/> and <AutoText key="glidict::Preferences.Connection.Proxy_Port"/> boxes. 
Click <AutoText key="glidict::General.OK" type="button"/> to get back to the <AutoText key="glidict::DOWNLOAD.MODE.OAIDownload"/> section of the <AutoText key="glidict::GUI.Download"/> panel. </Text>4723 <Text id="gli-oai-6">If the server is not running on <Format>localhost</Format> and your computer is behind a firewall or proxy server, you may need to edit the proxy settings in the Librarian Interface. Click the <AutoText key="glidict::Mirroring.Preferences" type="button"/> button. Switch on the <AutoText key="glidict::Preferences.Connection.Use_Proxy"/> checkbox. Enter the proxy server address and port number in the <AutoText key="glidict::Preferences.Connection.HTTP_Proxy_Host"/> and <AutoText key="glidict::Preferences.Connection.Proxy_Port"/> boxes. <MajorVersion number="3">Further, if you set up your Greenstone to run over <Format>https</Format> (or more generally, if you will be downloading from <Format>https</Format> URLs), tick the box labelled "No certificate checking for HTTPS downloads".</MajorVersion> Click <AutoText key="glidict::General.OK" type="button"/> to get back to the <AutoText key="glidict::DOWNLOAD.MODE.OAIDownload"/> section of the <AutoText key="glidict::GUI.Download"/> panel. </Text> 4720 4724 </NumberedItem> 4721 4725 <NumberedItem> … … 4746 4750 <NumberedItem> 4747 4751 <Text id="oai-11a">Start up the Greenstone server application.</Text> 4752 <Text id="oai-11b"><AutoText text="Visit the library home page"/> to load the greenstone collections including any associated files, as explained above.</Text> 4748 4753 </NumberedItem> 4749 4754 <NumberedItem> … … 4763 4768 <Text id="0742">to set up the ability to run Greenstone command-line programs. 
On Linux/Mac, you would run <Command>source <MajorVersion number="2">setup.bash</MajorVersion><MajorVersion number="3">gs3-setup.sh</MajorVersion></Command>.</Text> 4764 4769 </NumberedItem> 4770 <NumberedItem> 4771 <Text id="0739b">If you <MajorVersion number="3">set up your Greenstone to run over <Format>https</Format> or </MajorVersion> intend to use the <i>command line</i> to download from any URLs that begin with <Format>https</Format> instead of <Format>http</Format>, then you will further need to edit your <MajorVersion number="2">Greenstone 2 installation's <Format>bin/linux/wgetrc</Format></MajorVersion><MajorVersion number="3">Greenstone 3 installation's <Format>gs2build/bin/linux/wgetrc</Format></MajorVersion> file as follows. Open the file in a text editor and change the line that says:</Text> 4772 <Format>#check_certificate = off</Format> 4773 <Text id="0739b">to:</Text> 4774 <Format>check_certificate = off</Format> 4775 <Text id="0739b">Removing the hash sign at the start of this line changes it from being a mere comment to activating the line. Save the edited file and close it. The effect of this step will be that downloading from <Format>https</Format> URLs will now succeed even when download commands are run from the command line.</Text> 4776 </NumberedItem> 4765 4777 <Comment> 4766 4778 <Text id="0743">GLI uses a perl script, <Format>downloadfrom.pl</Format>, to do the downloading. This can be run on the command line, outside of GLI.</Text> … … 4781 4793 <Text id="0747">The OAI records will be downloaded into the folder where the downloadfrom.pl script is run from. To change this, use the <Format>-cache_dir <i>full-path-to-folder</i></Format> option and set its value to the full path of the destination folder you choose. 
(If you wanted to download the documents along with the records, then you would additionally pass in the <Format>-get_doc</Format> flag to the above command as well as the <Format>-get_doc_exts</Format> flag followed by a comma-separated list of file extensions like "jpg,pdf".)</Text> 4782 4794 <Format> 4783 <MajorVersion number="2">perl -S downloadfrom.pl -download_mode OAI -url http://<hostname:portnumber>/greenstone/cgi-bin/oaiserver.cgi -set backdrop -max_records 15 -get_doc -get_doc_exts "jpg,pdf" -cache_dir " type-full-path-to-a-download-folder"</MajorVersion>4784 <MajorVersion number="3">perl -S downloadfrom.pl -download_mode OAI -url http://<hostname:portnumber>/greenstone3/oaiserver -set backdrop -max_records 15 -get_doc -get_doc_exts "jpg,pdf" -cache_dir " type-full-path-to-a-download-folder"</MajorVersion>4795 <MajorVersion number="2">perl -S downloadfrom.pl -download_mode OAI -url http://<hostname:portnumber>/greenstone/cgi-bin/oaiserver.cgi -set backdrop -max_records 15 -get_doc -get_doc_exts "jpg,pdf" -cache_dir "<type-full-path-to-a-download-folder>"</MajorVersion> 4796 <MajorVersion number="3">perl -S downloadfrom.pl -download_mode OAI -url http://<hostname:portnumber>/greenstone3/oaiserver -set backdrop -max_records 15 -get_doc -get_doc_exts "jpg,pdf" -cache_dir "<type-full-path-to-a-download-folder>"</MajorVersion> 4785 4797 </Format> 4786 4798 </NumberedItem> … … 4840 4852 <Text id="ucp-18">We're in luck, because among the DjVu related tools that <Link url="http://djvu.sourceforge.net">DjVuLibre</Link> provides is one called "<Format>djvutxt</Format>" that can perform the text extraction for us. DjVuLibre is available for Windows, Mac and Linux:</Text> 4841 4853 <BulletList> 4842 <Bullet><Text id="ucp-19b">Some <Link url="https://unix.stackexchange.com/questions/25256/why-isnt-there-a-djvu2text">Linux machines may even come pre-installed with DjVuLibre</Link>. 
If not, you can use a package manager to install it for you, or compile it up easily from <Link url="https://sourceforge.net/projects/djvu/files/DjVuLibre/">source</Link> in the usual Unix manner.</Text></Bullet>4843 4854 <Bullet><Text id="ucp-19">DjVuLibre provides binary installers for <Link url="https://sourceforge.net/projects/djvu/files/DjVuLibre_Windows/">Windows</Link> and <Link url="https://sourceforge.net/projects/djvu/files/DjVuLibre_MacOS/">Mac</Link>. Grab the one for your operating system and install it somewhere sensible: somewhere you have permissions to install and run it from. On Windows, running the installer in the regular manner requires you to have admin permissions. If you don't have admin rights, you can run the installer as follows (instructions taken from <Link url="https://superuser.com/questions/171917/force-a-program-to-run-without-administrator-privileges-or-uac">this superuser exchange</Link>) to install DjVuLibre in a non-admin location. Use a text editor to create a file called <Format>nonadmin.bat</Format> (beware the file doesn't end up with an additional <Format>.txt</Format> extension when saving it). Copy and paste, or carefully type, the following text into the file, then save and close it:</Text> 4844 4855 <Format>cmd /min /C "set __COMPAT_LAYER=RUNASINVOKER && start "" %1"</Format> 4845 4856 <Text id="ucp-19a">Next, drag and drop the DjVuLibre setup executable onto the new <Format>nonadmin.bat</Format> file to run setup in a way that bypasses the admin privileges usually required for a successful installation. When installing, you'll now finally be allowed to choose a custom install directory, instead of the installer choosing an off-limits admin location like <Format>C:\Program Files (x86)</Format> for you. 
So make sure to choose a location in your User area as install directory.</Text> 4846 4857 <Text id="ucp-19c">Upon successful installation, you're given the option to launch DjVuLibre's <i>DjView</i> tool, which will open the DjVuLibre manual (in djvu format). In the left pane of DjView, you can see a listing of the various tools DjVuLibre is comprised of, and read up on them. You can also read about <i>djvutxt</i> or the other DjVu tools that DjVuLibre provides in their <Link url="http://djvu.sourceforge.net/doc/index.html">documentation page</Link>, but for this tutorial, we'll just be using their <Format>djvutxt</Format> tool.</Text></Bullet> 4858 <Bullet><Text id="ucp-19b">Some <Link url="https://unix.stackexchange.com/questions/25256/why-isnt-there-a-djvu2text">Linux machines may even come pre-installed with DjVuLibre</Link>. If not, you can use a package manager to install it for you, or compile it up easily from <Link url="https://sourceforge.net/projects/djvu/files/DjVuLibre/">source</Link> in the usual Unix manner, as explained below.</Text></Bullet> 4859 <Bullet><Text id="ucp-19d">If you're on a Unix (Linux or Mac) system where you don't have the permissions needed to install DjVuLibre, then you can compile it up from source code as follows: download the <Link url="https://sourceforge.net/projects/djvu/files/DjVuLibre/">source tarball</Link> and untar this in a user location. Open a terminal and change directory into the untarred DjVuLibre source folder. 
Then run the following three commands in sequence, adjusting the <Format>prefix</Format> flag to start with the full path to your Greenstone installation:</Text> 4860 <Format> 4861 ./configure --prefix=/PATH/TO/YOUR/GS/djvulibre<br /> 4862 make<br /> 4863 make install 4864 </Format> 4865 <Text id="ucp-19e">If compiling was successful, djvulibre binaries would have been generated inside your Greenstone installation's new <Format>djvulibre/bin</Format> folder, at <Format>/PATH/TO/YOUR/GS/djvulibre/bin</Format>. For this tutorial, the most important of these djvulibre binaries is <Format>djvutxt</Format>, which will now be located at <Format>/PATH/TO/YOUR/GS/djvulibre/bin/djvutxt</Format> on your Unix system.</Text> 4866 </Bullet> 4847 4867 </BulletList> 4848 4868 </NumberedItem> … … 4852 4872 <Text id="ucp-22">Open a DOS prompt on Windows or a terminal on Mac/Linux and experiment to see what it takes to convert your Greenstone installation's <Format>web/sites/localsite/collect/DjVuColl/superhero.djvu</Format> file.</Text> 4853 4873 <Text id="ucp-22a">You may have to invoke <Format>djvutxt</Format> using its full filepath, in which case on Windows the command would look like:</Text> 4854 <Format>C:\PATH\TO\YOUR\djvutxt C:\PATH\TO\ GS\web\sites\localsite\collect\DjVuColl\superhero.djvu C:\PATH\TO\YOUR\GS\superhero.txt</Format>4874 <Format>C:\PATH\TO\YOUR\djvutxt C:\PATH\TO\YOUR\GS\web\sites\localsite\collect\DjVuColl\superhero.djvu C:\PATH\TO\YOUR\GS\superhero.txt</Format> 4855 4875 <Text id="ucp-22b">while on Unix systems the command would look like:</Text> 4856 <Format>/PATH/TO/YOUR/djvutxt /PATH/TO/ GS/web/sites/localsite/collect/DjVuColl/superhero.djvu /PATH/TO/YOUR/GS/superhero.txt</Format>4876 <Format>/PATH/TO/YOUR/djvutxt /PATH/TO/YOUR/GS/web/sites/localsite/collect/DjVuColl/superhero.djvu /PATH/TO/YOUR/GS/superhero.txt</Format> 4857 4877 <Text id="ucp-23">Once you have the command working, inspect the output file. You should see mostly legible text in it. 
Only when you've been able to successfully complete this step should you proceed to the next steps.</Text> 4858 4878 </NumberedItem> … … 4916 4936 </NumberedItem> 4917 4937 <NumberedItem> 4918 <Text id="0757">Now change to the <AutoText key="glidict::GUI.Create"/> panel, locate the options for the import process and set <AutoText text="saveas"/> to <AutoText text="GreenstoneMETS"/>. Import options are not available unless you are in <AutoText key="glidict::Preferences.Mode.Expert"/> mode.</Text>4938 <Text id="0757">Now change to the <AutoText key="glidict::GUI.Create"/> panel, locate the options for the import process to the left (labelled <AutoText text="Import Options"/>) and set <AutoText text="saveas"/> to <AutoText text="GreenstoneMETS"/>. Import options are not available unless you are in <AutoText key="glidict::Preferences.Mode.Expert"/> mode.</Text> 4919 4939 </NumberedItem> 4920 4940 <NumberedItem> … … 4922 4942 </NumberedItem> 4923 4943 <NumberedItem> 4924 <Text id="0759">In your file browser, locate the <Path>archives</Path> folder for the collection you are working with (in <Path>Greenstone<MajorVersion number="3">3 → web → sites → localsite</MajorVersion> → collect → <collname> → archives</Path>). For each document in the collection, Greenstone has generated two files : <Path>docmets.xml</Path>, the core METS description, and <Path>doctxt.xml</Path>, a supporting file. (Note: unless you are connected to the Internet you may be unable to view <Path>doctxt.xml</Path> in your web browser, because it refers to a remote resource.) Depending on the source documents there may be additional files, such as the images used within a web page. One of METS' many features is the ability to reference information in external XML files. 
Greenstone uses this to tie the content of the document, which is stored in the external XML file <Path>doctxt.xml</Path>, to its hierarchical structure, which is described in the core METS file <Path>docmets.xml</Path>.</Text>4944 <Text id="0759">In your file browser, locate the <Path>archives</Path> folder for the collection you are working with (in <Path>Greenstone<MajorVersion number="3">3 → web → sites → localsite</MajorVersion> → collect → <collname> → archives</Path>). For each document in the collection, Greenstone has generated two files<!--within each document's "HASH"-prefixed directory-->: <Path>docmets.xml</Path>, the core METS description, and <Path>doctxt.xml</Path>, a supporting file. (Note: unless you are connected to the Internet you may be unable to view <Path>doctxt.xml</Path> in your web browser, because it refers to a remote resource.) Depending on the source documents there may be additional files, such as the images used within a web page. One of METS' many features is the ability to reference information in external XML files. Greenstone uses this to tie the content of the document, which is stored in the external XML file <Path>doctxt.xml</Path>, to its hierarchical structure, which is described in the core METS file <Path>docmets.xml</Path>.</Text> 4925 4945 </NumberedItem> 4926 4946 </Content> … … 5120 5140 <Text id="gems-4">Start the Greenstone Editor for Metadata Sets (GEMS)</Text> 5121 5141 <Text id="gems-5"><MajorVersion number="2"><Menu>Start → All Programs → Greenstone-2.87 → Metadata Set Editor (GEMS)</Menu></MajorVersion><MajorVersion number="3"><Menu>Start → All Programs → Greenstone-3.09 → Greenstone Editor for Metadata Sets (GEMS)</Menu></MajorVersion></Text> 5122 <Text id="gems-5a">(If you're on Linux, use a terminal to run the <Command>gli/gems.sh</Command> start-up script.) </Text>5142 <Text id="gems-5a">(If you're on a Unix system, use a terminal to run the <Command>gli/gems.sh</Command> start-up script. 
On a Mac, you can also double-click on the green <Format>gems</Format> application icon that's located in your Greenstone installation folder in order to launch GEMS.) </Text> 5123 5143 </NumberedItem> 5124 5144 <NumberedItem> … … 5178 5198 </MajorVersion> 5179 5199 <MajorVersion number="3"> 5180 <Text id="indexers-9-3">Go to the <AutoText key="glidict::GUI.Enrich"/> panel, look at the metadata that is associated with each directory. Go to the <AutoText key="glidict::CDM.GUI.Indexes"/> section in the <AutoText key="glidict::GUI.Design"/> panel. The <b>Lucene indexer</b> is already in usebecause the <b>Demo Collection (lucene-jdbm-demo)</b> collection, which this collection is based on, uses the <b>Lucene indexer</b>.</Text>5200 <Text id="indexers-9-3">Go to the <AutoText key="glidict::GUI.Enrich"/> panel, look at the metadata that is associated with each directory. Go to the <AutoText key="glidict::CDM.GUI.Indexes"/> section in the <AutoText key="glidict::GUI.Design"/> panel. Look at the top right area of the panel, where you will see that the <b>Lucene indexer</b> is already in use. This is because the <b>Demo Collection (lucene-jdbm-demo)</b> collection, which this collection is based on, uses the <b>Lucene indexer</b>.</Text> 5181 5201 </MajorVersion> 5182 5202 </NumberedItem>
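The wgetrc step added in this changeset (changing "#check_certificate = off" to "check_certificate = off") can also be done programmatically. The following is only a minimal sketch, not part of the changeset: the function name enable_check_certificate_off and the sample text are hypothetical, and it assumes the commented-out line appears in the file as quoted in the tutorial text.

```python
import re

# Hypothetical helper (not part of Greenstone): uncomments the
# "#check_certificate = off" line in the text of a wgetrc file.
# Assumes the line looks as quoted in the tutorial; all other lines
# are returned unchanged.
_COMMENTED = re.compile(
    r"^[ \t]*#[ \t]*(check_certificate[ \t]*=[ \t]*off)[ \t]*$",
    re.MULTILINE,
)


def enable_check_certificate_off(wgetrc_text: str) -> str:
    """Return wgetrc_text with the check_certificate line activated."""
    return _COMMENTED.sub(r"\1", wgetrc_text)


if __name__ == "__main__":
    # Illustrative sample text, not the real contents of a wgetrc file.
    sample = "# sample settings\n#check_certificate = off\ntries = 3\n"
    print(enable_check_certificate_off(sample))
```

To apply it to a real installation, you would read the wgetrc file named in the tutorial text into a string, pass it through the function, and write the result back, after making a backup copy of the original file.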