Changeset 25962 for documentation/trunk


Timestamp:
2012-07-17T15:39:13+12:00 (12 years ago)
Author:
ak19
Message:

Added XML markup to the 3 OAI related tutorials which were finished and tested at the start of this year.

File:
1 edited

  • documentation/trunk/tutorials/xml-source/tutorial_en.xml

    r25955 r25962  
    31643164</Content>
    31653165</Tutorial>
     3166<Tutorial id="setting_up_GS_OAI_server">
     3167<Title>
     3168<Text id="oaiserver-0">Setting up your Greenstone OAI Server</Text>
     3169</Title>
     3170<Prerequisite id="simple_image_collection"/>
     3171<Version initial="2.85" current="2.85"/>
     3172<Content>
     3173<Comment>
     3174<Text id="oaiserver-1">Greenstone 2 collections are not enabled for OAI out of the box. To make a collection available to serve up over OAI, some minor adjustments need to be made first.</Text>
     3175<Text id="oaiserver-2">This tutorial will look at how to make an existing collection available over OAI and how to get it validated against the Open Archives validator.</Text>
     3176</Comment>
     3177<NumberedItem>
     3178<Text id="oaiserver-3">Use a text editor to open the file etc/oai.cfg located in your Greenstone installation folder. The oai.cfg configuration file contains properties that control the behaviour and features of your Greenstone OAI server.</Text>
     3179<Text id="oaiserver-4">The basic properties to edit in order to get your collection served by the inbuilt OAI server are <AutoText text="repositoryName" type="italics"/>, <AutoText text="repositoryID" type="italics"/> and <AutoText text="oaicollection" type="italics"/>. Look up these properties in the file.</Text>
     3180<Text id="oaiserver-5">For <AutoText text="repositoryName" type="italics"/> and <AutoText text="repositoryID" type="italics"/>, type in some values that make sense for your digital library. For example:</Text>
     3181<Format>repositoryName "Greenstone"<br />
     3182repositoryID "greenstone"</Format>
     3183</NumberedItem>
     3184<NumberedItem>
     3185<Text id="oaiserver-6">For this tutorial, we'll make the backdrop collection created in the simple image tutorial available over OAI. Therefore, add this collection's name to the end of the <AutoText text="oaicollection" type="italics"/> property:</Text>
     3186<Format>oaicollection demo documented-examples/oai-e backdrop</Format>
     3187<Text id="oaiserver-7">If you have a great many documents and do not want the OAI server to return all of them in one go, you could set the <AutoText text="resumeafter" type="italics"/> property in the oai.cfg file to a value lower than the default of 250. For example:</Text>
     3188<Format>resumeafter 50</Format>
     3189</NumberedItem>
     3190<NumberedItem>
     3191<Text id="oaiserver-8">If you're on Windows, it's best to use the Apache web server. So if you're using the Local Library Server, stop the web server by exiting the little white dialog (the Greenstone Server Interface). Use a file browser to go into your Greenstone installation directory and rename the <AutoText text="server.exe" type="italics"/> there to <AutoText text="server.not" type="italics"/> to disable it. Now re-launch the Greenstone Server from the <AutoText text="Start"/> menu, so that this time the included Apache web server will be used instead, launching its own little white dialog.</Text>
     3192</NumberedItem>
     3193<NumberedItem>
     3194<Text id="oaiserver-9">You are now ready to visit your oaiserver home page to check that it's all looking good. Start up the Greenstone Server by going to Windows <Path>Start &rarr; All Programs &rarr; Greenstone 2.85 &rarr; Greenstone Server</Path>.</Text>
     3195<Text id="oaiserver-10">Press the <AutoText text="Enter Library"/> button and you will end up on your Digital Library home page as usual. Adjust the URL so that instead of the <AutoText text="library.cgi" type="italics"/> suffix, it says <AutoText text="oaiserver.cgi" type="italics"/>.</Text>
     3196<Text id="oaiserver-11">The page that loads now will contain an error message (<AutoText text="badVerb" type="italics"/>) saying that you've provided an illegal OAI verb. This is because the OAI specification requires you to provide more instruction in the URL as to what you want. The specification defines verbs and possible arguments to them.</Text>
     3197<Text id="oaiserver-12">A basic verb is <AutoText text="Identify" type="italics"/>, which requests the OAI server to return some information about the OAI repository it's serving. Adjust the URL once more by suffixing <AutoText text="?verb=Identify" type="italics"/> so that your URL now looks like:</Text>
     3198<Format>http://&lt;domain&gt;/greenstone/cgi-bin/oaiserver.cgi?verb=Identify</Format>
     3199<Text id="oaiserver-13">Visiting this page now gives some information about your Greenstone OAI repository.</Text>
     3200</NumberedItem>
     3201<NumberedItem>
     3202<Text id="oaiserver-14">Although the data transmitted over OAI is in the form of XML, Greenstone uses a stylesheet to transform that XML response into the user-friendly, structured web page you see when you perform the <AutoText text="Identify"/> request (thereby visiting the <AutoText text="verb=Identify" type="italics"/> response page). This allows <AutoText text="Identify" type="italics"/> and other verbs in the OAI specification to be shown in the main Greenstone OAI Server pages as link buttons. You can see these in the main Greenstone <AutoText text="oaiserver.cgi" type="italics"/> (or <AutoText text="oaiserver.cgi?verb=Identify" type="italics"/>) page, as a row of links starting with "Identify" at the top and near the bottom of the page.</Text>
     3203<Text id="oaiserver-15">Clicking on the links will execute that verb as a request and return the response from your Greenstone OAI server as a structured web page. Try clicking on all the links.</Text>
     3204</NumberedItem>
     3205<NumberedItem>
     3206<Text id="oaiserver-16">OAI defines a concept called a <AutoText text="Set"/>. In Greenstone, the OAI Set concept is mapped to a Greenstone collection. The link to the <AutoText text="ListSets" type="italics"/> verb will therefore request the Greenstone OAI server to list all the collections that have been enabled for OAI.</Text>
     3207<Text id="oaiserver-17">Click on the <b>ListSets</b> button link and have a look.</Text>
     3208<Text id="oaiserver-18">The response page for the <AutoText text="ListSets" type="italics"/> verb will show you that your backdrop collection is one of the collections available over OAI in your Greenstone repository.</Text>
     3209</NumberedItem>
     3210<NumberedItem>
     3211<Text id="oaiserver-19">You will see a couple of buttons next to each collection (or <AutoText text="Set"/>) listed here. The first is <b>Identifiers</b> and the second <b>Records</b>. Click on the <b>Identifiers</b> button for the backdrop Set. This will list the IDs of all the documents contained in your OAI collection. The IDs look similar to Greenstone's internal document IDs, but with an additional prefix (<Format>oai:&lt;repositoryID&gt;:setname</Format>, where <AutoText text="repositoryID" type="italics"/> was set by you in the oai.cfg configuration file).</Text>
     3212</NumberedItem>
     3213<NumberedItem>
     3214<Text id="oaiserver-20">Click the browser Back button to get back to the ListSets page and press the <b>Records</b> button located next to the backdrop collection created in the <b>A Simple image collection</b> tutorial.</Text>
     3215<Text id="oaiserver-21">As you would have specified some Dublin Core (dc) metadata for some of the images in the backdrop collection, the page that loads will display this information for each document in the collection (Set).</Text>
     3216<Text id="oaiserver-22">Greenstone's OAI at present supports 3 metadata formats, as is explained in the comments in the oai.cfg file. Of these three, the OAI standard for Dublin Core, <AutoText text="oai_dc" type="italics"/>, is the one pertinent to this tutorial. If your collection specifies metadata for a different metadata set format, you can use the oai.cfg file to tell Greenstone how to map the metadata fields of your chosen metadata set format into the Dublin Core metadata set supported by the Greenstone OAI server (or one of the other metadata sets it supports).</Text>
     3217<Text id="oaiserver-23">Look in the oai.cfg file again and scroll down to the section on <AutoText text="oaimapping" type="italics"/>, which explains and provides examples of how to specify such mappings from your metadata format to one that Greenstone's OAI server uses. For instance, the <b>demo</b> collection comes enabled for OAI upon installation, and specifies some mappings from its <AutoText text="DLS" type="italics"/> metadata format to <AutoText text="OAI DC" type="italics"/>. Its <AutoText key="metadata::dls.Title"/> metadata is mapped using the following line in the oai.cfg configuration file:</Text>
     3218<Format>oaimapping dls.Title oai_dc.title</Format>
     3219<Text id="oaiserver-24">Because the backdrop collection uses DC metadata already, no mapping is required.</Text>
     3220</NumberedItem>
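         <NumberedItem>
         <Text id="oaiserver-25">Each link button simply issues an OAI request in the URL, so the same requests can also be typed into the browser's address field. As a rough sketch, assuming the same &lt;domain&gt; and cgi-bin path as in the <AutoText text="Identify" type="italics"/> example earlier, and assuming the collection is exposed under the set name backdrop, the requests behind the <b>ListSets</b>, <b>Identifiers</b> and <b>Records</b> buttons would look something like:</Text>
         <Format>http://&lt;domain&gt;/greenstone/cgi-bin/oaiserver.cgi?verb=ListSets<br />
         http://&lt;domain&gt;/greenstone/cgi-bin/oaiserver.cgi?verb=ListIdentifiers&amp;metadataPrefix=oai_dc&amp;set=backdrop<br />
         http://&lt;domain&gt;/greenstone/cgi-bin/oaiserver.cgi?verb=ListRecords&amp;metadataPrefix=oai_dc&amp;set=backdrop</Format>
         <Text id="oaiserver-26">The argument names (verb, metadataPrefix and set) come from the OAI specification. The exact URLs that your own server generates may differ, so treat these as illustrative only.</Text>
         </NumberedItem>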
     3221</Content>
     3222</Tutorial>
     3223<Tutorial id="connecting_GLI_to_OAI_server">
     3224<Title>
     3225<Text id="gli-oai-0">Connecting to an OAI server from GLI</Text>
     3226</Title>
     3227<Prerequisite id="simple_image_collection"/>
     3228<Version initial="2.85" current="2.85"/>
     3229<Comment>
     3230<Text id="gli-oai-1">GLI can act as an OAI client application: it can connect to a remote OAI server and retrieve metadata, and even download documents. In the previous tutorial, we set up Greenstone's OAI server and made the backdrop collection available over OAI. In this tutorial we will use GLI to connect to that OAI server, download the OAI metadata for the collection built in <b>A Simple image collection</b> and also download its documents.</Text>
     3231</Comment>
     3232<Content>
     3233<NumberedItem>
     3234<Text id="gli-oai-2">Launch GLI. This should launch the Greenstone server as well, if this is not already running, so that the OAI server is also up and running.</Text>
     3235</NumberedItem>
     3236<NumberedItem>
     3237<Text id="gli-oai-3">In GLI, go to the <AutoText key="glidict::GUI.Download"/> panel. To the left, choose <AutoText key="glidict::DOWNLOAD.MODE.OAIDownload"/> as the <AutoText text="Download Setting"/>.</Text>
     3238</NumberedItem>
     3239<NumberedItem>
     3240<Text id="gli-oai-4">On the right, set the Source URL field to contain the URL to your Greenstone OAI server. It would be of the form</Text>
     3241<Format>http://&lt;hostname:portnumber&gt;/greenstone/cgi-bin/oaiserver.cgi</Format>
     3242<Text id="gli-oai-4a">Make sure that you can generally access this URL from your browser.</Text>
     3243</NumberedItem>
     3244<NumberedItem>
     3245<Text id="gli-oai-5">If at this stage you press the <AutoText key="glidict::Download.ServerInformation"/> button (in the central row of buttons), a dialog will pop up with basic details about the OAI server. At the end, it will display the names of the sets available at the OAI Server. In our example, <AutoText text="backdrop" type="italics"/> would be listed as one of the setNames.</Text>
     3246</NumberedItem>
     3247<NumberedItem>
     3248<Text id="gli-oai-6">Tick the <AutoText key="perlmodules::OAIDownload.metadata_prefix_disp"/> checkbox as well as the <AutoText key="perlmodules::OAIDownload.set_disp"/> checkbox. For the latter, type backdrop for the set name. Then tick <AutoText key="perlmodules::OAIDownload.get_doc_disp"/> and <AutoText key="perlmodules::OAIDownload.get_doc_exts_disp"/>, and add jpg to the comma-separated list of values for the latter so that it becomes</Text>
     3249<Format>jpg,doc,pdf,ppt</Format>
     3250<Text id="gli-oai-7">Next, tick <AutoText key="perlmodules::OAIDownload.max_records_disp"/> and set it to 10. There will be 9 images in the collection, so we don't really need to set the Max records value, but this is a helpful feature that you can use when downloading from an OAI server.</Text>
     3251</NumberedItem>
     3252<NumberedItem>
     3253<Text id="gli-oai-8">Finally, press the <AutoText key="glidict::Mirroring.Download"/> button that's located beside the <AutoText key="glidict::Download.ServerInformation"/> button. GLI will start downloading oai metadata. Moreover, because we have ticked the <AutoText key="perlmodules::OAIDownload.get_doc_disp"/> checkbox, it will also be retrieving actual documents, but not more than 10, because of the limit of 10 that we've placed on the number of records to download.</Text>
     3254</NumberedItem>
     3255<NumberedItem>
     3256<Text id="gli-oai-9">After a while, it will have finished downloading. Change to the <AutoText key="glidict::GUI.Gather"/> panel, and on the left-hand side, open up the <AutoText key="glidict::Tree.DownloadedFiles"/> folder. This is where Greenstone stores files you downloaded using the <AutoText key="glidict::GUI.Download"/> panel. In this case, it will contain a folder in which the oai metadata files and images that you've just downloaded from your own Greenstone OAI server are stored.</Text>
     3257</NumberedItem>
     3258<NumberedItem>
     3259<Text id="gli-oai-10">You can now drag and drop these downloaded files into a new Greenstone collection. Because there are <Format>*.oai</Format> files among them, GLI will offer to add the <AutoText text="OAIPlugin"/>. Accept, and go to the <AutoText key="glidict::CDM.GUI.Plugins"/> section of the <AutoText key="glidict::GUI.Design"/> panel. There, you will find <AutoText text="OAIPlugin"/> at the end of your plugin list. Select it and press the <AutoText key="glidict::CDM.Move.Move_Up"/> button so that it is listed above the <AutoText text="EmbeddedMetadataPlugin"/>. Because <AutoText text="OAIPlugin"/> appears earlier in the plugin pipeline, it processes the metadata in the oai files, rather than letting the more general <AutoText text="EmbeddedMetadataPlugin"/> process their contents.</Text>
     3260</NumberedItem>
     3261<NumberedItem>
     3262<Text id="gli-oai-11">Move on to the <AutoText key="glidict::GUI.Create"/> panel and press the build button. During this stage, the <AutoText text="OAIPlugin"/> will extract the metadata in the oai files and attach it to the associated jpg files. You can see this once the collection has been built by switching to the <AutoText key="glidict::GUI.Enrich"/> panel. If you click on an oai file, you will find that no metadata is set for such files. If you then click on a jpg file and scroll down, there will be metadata names that start with <Format>ex.dc</Format>. This refers to Greenstone-extracted Dublin Core metadata. <AutoText key="metadata::ex.dc.Description"/> and <AutoText key="metadata::ex.dc.Title"/> will be set to the values you had assigned the images in the tutorial <b>A Simple Image Collection</b>. Greenstone will have added additional <Format>ex.dc</Format> metadata in the form of <AutoText key="metadata::ex.dc.Identifier"/>, which is the source URL for this image.</Text>
     3263</NumberedItem>
     3264<NumberedItem>
     3265<Text id="gli-oai-12">If you wish, you can now set up this collection in a manner similar to how the <b>backdrop</b> collection was set up in <b>A Simple Image Collection</b>. Don't forget to copy any specific format statements, then rebuild it and <b>Preview</b> the collection.</Text>
     3266</NumberedItem>
     3267</Content>
     3268</Tutorial>
     3269<Tutorial id="connecting_GLI_to_OAI_server">
     3270<Title>
     3271<Text id="gs-oai-0">Connecting to the Greenstone OAI server from the outside world</Text>
     3272</Title>
     3273<Prerequisite id="setting_up_GS_OAI_server"/>
     3274<Version initial="2.85" current="2.85"/>
     3275<Comment>
     3276<Text id="gs-oai-1">For this exercise, you need to be on a networked computer and your host computer needs to be visible to the outside world.
     3277(That is, when you provide the full name of your computer, someone else in the world should be able to find that computer by typing its URL into their browser's address field.)</Text>
     3278<Text id="gs-oai-2">In this tutorial, we use an external OAI client to access our up-and-running Greenstone OAI server. It's not just any OAI client either, but an OAI Server validator.</Text>
     3279</Comment>
     3280<Content>
     3281<NumberedItem>
     3282<Text id="gs-oai-3">You will want to be running the included Apache web server. So if you're on Windows and using the Local Library Server, quit it and rename the <AutoText text="server.exe" type="italics"/> application in your Greenstone installation folder to <AutoText text="server.not" type="italics"/>. Then use the <AutoText text="Start" type="italics"/> menu shortcut to the Greenstone Server once more, which will now launch the Apache web server.</Text>
     3283</NumberedItem>
     3284<NumberedItem>
     3285<Text id="gs-oai-4">For this exercise, we will visit the <b>Open Archives Validator</b>, for which your OAI server needs to provide a valid email address. In a text editor, open up your Greenstone installation's etc/oai.cfg file and set the value of the <AutoText text="maintainer" type="italics"/> field to your email address.</Text>
     3286<Text id="gs-oai-5">Note that by default, your Greenstone installation will make the <b>demo</b> collection available over OAI. This collection has been set up with a dummy (and invalid) email address for the <AutoText text="creator" type="italics"/> and <AutoText text="maintainer" type="italics"/> fields in the collection's collect.cfg file. You will need to open up collect/demo/etc/collect.cfg and clear the email values for the <AutoText text="creator" type="italics"/> and <AutoText text="maintainer" type="italics"/> properties (or else set these to a valid email address). Otherwise the Open Archives validator will resort to sending the initial validation results to the <b>demo</b> collection's default dummy email address. Alternatively, you can simply remove the <b>demo</b> collection from the oai.cfg file's oaicollection property, so that the <b>demo</b> collection is no longer available over OAI.</Text>
     3287<Text id="gs-oai-6">Note also that, if you wish to specify contact emails at a collection level, you will need to edit your Greenstone installation's <Format>collect/&lt;collection-name&gt;/etc/collect.cfg</Format> file for those collections and set the <AutoText text="creator" type="italics"/> and <AutoText text="maintainer" type="italics"/> fields to the desired email address.</Text>
     3288</NumberedItem>
     3289<NumberedItem>
     3290<Text id="gs-oai-7">If your collection contains document items for which you have not assigned any (Dublin Core, <b>dc</b>) metadata, the OAI validation can fail because it is dependent on having Metadata Formats listed even on a per record (per document) basis. And if your document has no <b>dc</b> metadata assigned, Greenstone won't know what OAI-supported metadata format is used by that document in order to list it.</Text>
     3291<Text id="gs-oai-8">In practice, this means that you either have to assign one or more <Format>dc.*</Format> metadata to each document in your OAI collection, or you will have to set up an oaimapping in the oai.cfg file to map existing metadata of whichever format to <Format>dc.*</Format> metadata.</Text>
     3292<Text id="gs-oai-9">For instance, if you created an image collection without assigning any metadata and are happy to use the Title or Source metadata that Greenstone extracted for each image (<AutoText key="metadata::ex.Title"/>, <AutoText key="metadata::ex.SourceFile"/>) as the image document's "title", you could map either of these metadata to <AutoText key="metadata::dc.Title"/> in the file oai.cfg. To do so, you'd open up oai.cfg in an editor, go down to the section specifying the oaimapping properties and add a new line:</Text>
     3293<Format>oaimapping Title oai_dc.title</Format>
     3294<Text id="gs-oai-9a">(Or: <Format>oaimapping SourceFile oai_dc.title</Format>).</Text>
     3295<Text id="gs-oai-10">This step is not necessary for the <b>backdrop</b> collection, since each image in the collection was assigned some <Format>dc.*</Format> metadata.</Text>
     3296</NumberedItem>
     3297<NumberedItem>
     3298<Text id="gs-oai-11">If you are working with legacy collections (built before Greenstone version 2.85) you may have to rebuild them if you plan to make them available over OAI and compliant with the Open Archives validator. Rebuilding old collections will recalculate the <AutoText text="earliest datestamp"/> for the repository. This calculation is different from Greenstone 2.85 onwards.</Text>
     3299</NumberedItem>
     3300<NumberedItem>
     3301<Text id="gs-oai-12">Next you will need to set up your Greenstone server to be accessible from outside, so that external OAI clients can access it.</Text>
     3302<Text id="gs-oai-13">Go to the <Path>File &rarr; Settings</Path> menu of your Greenstone server interface dialog and check the <AutoText text="Allow External Connections"/> option and also check the <AutoText text="Get local IP and resolve to a name"/> option (or the <AutoText text="Get local IP"/> option) as its address resolution method.</Text>
     3303</NumberedItem>
     3304<NumberedItem>
     3305<Text id="gs-oai-14">Press the button in the Greenstone Server Interface dialog that says <AutoText text="Enter Library"/> (or it may say <AutoText text="Restart Library"/>). Your Digital Library home page will open up in a browser tab. Adjust this URL to have a suffix of <Format>oaiserver.cgi</Format> in place of the terminating <Format>library.cgi</Format>, then copy the resulting URL and visit <Link>http://www.openarchives.org/Register/ValidateSite</Link>.</Text>
     3306</NumberedItem>
     3307<NumberedItem>
     3308<Text id="gs-oai-15">The Open Archives Validator page will request the URL to your Greenstone OAI server. Paste the URL you have in your copy buffer into the field provided for this, and press the <b>Validate baseURL</b> button to start running the tests. You will be told to check your email to continue the remaining tests and get the validation report.</Text>
     3309<Text id="gs-oai-16">If the validator does not recognise the URL, make sure you have given the full domain of your host machine rather than just the host name. Alternatively, visit the <AutoText text="oaiserver.cgi?verb=Identify" type="italics"/> page again and check that it works. If it doesn't, your machine may not be set up to be accessible to outside networks. Check your proxy settings, make sure you've set up port forwarding and that your firewall is not interfering.</Text>
     3310</NumberedItem>
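         <NumberedItem>
         <Text id="gs-oai-17">If the validation report complains about the metadata formats of individual records, one way to check a record by hand is to ask the server which formats it lists for that record. As a sketch, assuming the standard OAI <AutoText text="ListMetadataFormats" type="italics"/> verb and an identifier copied from the <b>Identifiers</b> listing for your collection (shown here as the placeholder &lt;identifier&gt;), the request would look something like:</Text>
         <Format>http://&lt;hostname:portnumber&gt;/greenstone/cgi-bin/oaiserver.cgi?verb=ListMetadataFormats&amp;identifier=&lt;identifier&gt;</Format>
         <Text id="gs-oai-18">A document that has <b>dc</b> metadata assigned, or that is covered by a suitable oaimapping, should list <AutoText text="oai_dc" type="italics"/> among its formats. If no format is listed for a document, revisit the metadata assignment or oaimapping steps above.</Text>
         </NumberedItem>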
     3311</Content>
     3312</Tutorial>
    31663313<Tutorial id="METS_export">
    31673314<Title>