Changeset 10880
- Timestamp:
- 2005-11-11T10:12:44+13:00 (18 years ago)
- Location:
- trunk/gsdl3/docs/manual
- Files:
-
- 2 edited
trunk/gsdl3/docs/manual/manual.tex
r10863 r10880 21 21 \author{Katherine Don, George Buchanan and Ian H. Witten \\[1ex] 22 22 Department of Computer Science \\ 23 University of Waikato \\ Hamilton, New Zealand \\ 24 \{kjdon, grbuchan, ihw\}@cs.waikato.ac.nz} 23 University of Waikato \\ Hamilton, New Zealand \\ } 25 24 26 25 \date{} … … 36 35 reimplementation of the \gs\ digital library software. The current 37 36 version (\gsii) enjoys considerable success and is being widely used. 38 \gsiii \ will capitali se on this success, and in addition it will37 \gsiii \ will capitalize on this success, and in addition it will 39 38 \begin{bulletedlist} 40 39 \item improve flexibility, modularity, and extensibility … … 44 43 self-documentation 45 44 \item make full use of existing XML-related standards and software 46 \item provide improved internationali sation, particularly in terms of sort order,45 \item provide improved internationalization, particularly in terms of sort order, 47 46 information browsing, etc. 48 47 \item include new features that facilitate additional ``content management'' … … 58 57 A description of the general design and architecture of \gsiii\ is covered by the document {\em The design of Greenstone3: An agent based dynamic digital library} (design-2002.ps, in the docs/manual directory). 59 58 60 This documentation consists of several parts. Section~\ref{sec:install} is for administrators, and covers \gsiii\ installation, how to access the library, and some administration issues. Section~\ref{sec:user} is for users of the software, and looks at using the sample collections, creating new collections, and how to make small customi sations to the interface. The remaining sections are aimed towards the \gs\ developer. Section~\ref{sec:develop-runtime} describes the run-time system, including the structure of the software, and the message format, while Section~\ref{sec:develop-build} describes the collection building process. 
Section~\ref{sec:new-features} describes how to add new features to \gs, such as how to add new services, new page types, new plugins for different document formats. Section~\ref{sec:distributed} describes how to make \gs\ run in a distributed fashion, using SOAP as an example communications protocol. Finally, there are several appendices, including how to install \gs\ from CVS, some notes on Tomcat and SOAP, and a comparison of \gsii\ and \gsiii\ format statements.59 This documentation consists of several parts. Section~\ref{sec:install} is for administrators, and covers \gsiii\ installation, how to access the library, and some administration issues. Section~\ref{sec:user} is for users of the software, and looks at using the sample collections, creating new collections, and how to make small customizations to the interface. The remaining sections are aimed towards the \gs\ developer. Section~\ref{sec:develop-runtime} describes the run-time system, including the structure of the software, and the message format, while Section~\ref{sec:develop-build} describes the collection building process. Section~\ref{sec:new-features} describes how to add new features to \gs, such as how to add new services, new page types, new plugins for different document formats. Section~\ref{sec:distributed} describes how to make \gs\ run in a distributed fashion, using SOAP as an example communications protocol. Finally, there are several appendices, including how to install \gs\ from CVS, some notes on Tomcat and SOAP, and a comparison of \gsii\ and \gsiii\ format statements. 61 60 \newpage 62 61 \tableofcontents … … 125 124 & Imported source packages from other systems e.g. 
MG, MGPP \\ 126 125 greenstone3/extensions 127 & Extensions to greenstone 3 core functionality, e g, Vishnu visualizer, Alerting service \\126 & Extensions to greenstone 3 core functionality, e.g., Vishnu visualizer, Alerting service \\ 128 127 greenstone3/lib 129 128 & Shared library files\\ … … 141 140 & some Perl and/or shell building scripts\\ 142 141 greenstone3/packages 143 & External packages that may be installed as part of greenstone, e.g. Tomcat, My sql\\142 & External packages that may be installed as part of greenstone, e.g. Tomcat, MySQL \\ 144 143 greenstone3/docs 145 144 & Documentation\\ … … 150 149 & The web.xml file lives here (servlet configuration information for Tomcat)\\ 151 150 greenstone3/web/WEB-INF/classes 152 & Individual class files needed by the servlet go in here, also properties files for java resource bundles - used to handle all the language specific text. This direc otry is on the servlet classpath\\151 & Individual class files needed by the servlet go in here, also properties files for java resource bundles - used to handle all the language specific text. This directory is on the servlet classpath\\ 153 152 greenstone3/web/WEB-INF/lib 154 153 & jar files needed by the servlets go here \\ … … 187 186 One \gsiii\ installation can have many sites and interfaces, and these can be paired in different combinations. One instantiation of a servlet uses one site and one interface, so every specified pairing results in a new servlet instance. For example, a single site might be served with two different interfaces. This provides different modes of access to the same content. e.g. HTML vs WML, or perhaps providing a completely different look and feel for different audiences. Alternatively, a standard interface may be used with many different sites---providing a consistent mode of access to a lot of different content. 188 187 189 Collections live in the \gst{collect} directory of a site. 
Any collections that are found in this directory when the servlet is initiali sed will be loaded up and presented to the user. Collections require valid configuration files, but apart from this, nothing needs to be done to the site to use new collections. Collections added while Tomcat is running will not be noticed automatically. Either the server needs to be restarted, or a configuration request may be sent to the library, triggering a (re)load of the collection (this is described in Section~\ref{sec:runtime-config}).188 Collections live in the \gst{collect} directory of a site. Any collections that are found in this directory when the servlet is initialized will be loaded up and presented to the user. Collections require valid configuration files, but apart from this, nothing needs to be done to the site to use new collections. Collections added while Tomcat is running will not be noticed automatically. Either the server needs to be restarted, or a configuration request may be sent to the library, triggering a (re)load of the collection (this is described in Section~\ref{sec:runtime-config}). 190 189 191 190 There are two sites that come with the distribution: \gst{localsite}, and \gst{gateway}. \gst{localsite} has several demo collections, while \gst{gateway} has none. \gst{gateway} specifies that a SOAP connection should be made to \gst{localsite}. Getting this to work involves setting up a soap server for localsite: see Section~\ref{sec:distributed} for details. … … 197 196 198 197 The file \gst{\gsdlhome/web/WEB-INF/web.xml} contains the configuration information for Tomcat. It tells Tomcat what servlets to load, what initial parameters to pass them, and what web names map to the servlets. 199 There are four servlets specified in web.xml (these correspond to the four servlet links in the welcome page for \gsiii): one is a test servlet that just prints ``hello greenstone'' to a web page. This is useful if you are having trouble getting Tomcat set up. 
The other three are the \gs\ library servlets described in Section~\ref{sec:browser-access}, \gst{library}, \gst{classic} and \gst{gateway}. Each servlet must specify which site and which interface to use. Having multiple servlets provides a way of serving different sites, or the same site with a different style of presentation. Site\_name and interface\_name are just two examples of initiali sation parameters used by the library servlets. The full list is shown in Table~\ref{tab:serv-init}.198 There are four servlets specified in web.xml (these correspond to the four servlet links in the welcome page for \gsiii): one is a test servlet that just prints ``hello greenstone'' to a web page. This is useful if you are having trouble getting Tomcat set up. The other three are the \gs\ library servlets described in Section~\ref{sec:browser-access}, \gst{library}, \gst{classic} and \gst{gateway}. Each servlet must specify which site and which interface to use. Having multiple servlets provides a way of serving different sites, or the same site with a different style of presentation. Site\_name and interface\_name are just two examples of initialization parameters used by the library servlets. The full list is shown in Table~\ref{tab:serv-init}. 200 199 201 200 For more details about Tomcat see Appendix~\ref{app:tomcat}. 202 201 203 202 \begin{table} 204 \caption{\gs\ servlet initiali sation parameters}203 \caption{\gs\ servlet initialization parameters} 205 204 \label{tab:serv-init} 206 205 {\footnotesize … … 223 222 224 223 Initial \gsiii\ system configuration is determined by a set of configuration files, all expressed in XML. Each site has a configuration file that binds parameters for the site, \gst{siteConfig.xml}. Each interface has a configuration file, \gst{interfaceConfig.xml}, that specifies Actions for the interface. Collections also have several configuration files; these are discussed in Section~\ref{sec:collconfig}. 
225 The configuration files are read in when the system is initiali sed, and their contents are cached in memory. This means that changes made to these files once the system is running will not take immediate effect. Tomcat needs to be restarted for changes to the interface configuration file to take effect. However, changes to the site configuration file can be incorporated sending a system command to the library. There are a series of system commands that can be sent to the library to induce reconfiguration of different modules, including reloading the whole site. This removes the need to restart the system to reflect these changes. These commands are described in Section~\ref{sec:runtime-config}.224 The configuration files are read in when the system is initialized, and their contents are cached in memory. This means that changes made to these files once the system is running will not take immediate effect. Tomcat needs to be restarted for changes to the interface configuration file to take effect. However, changes to the site configuration file can be incorporated sending a system command to the library. There are a series of system commands that can be sent to the library to induce reconfiguration of different modules, including reloading the whole site. This removes the need to restart the system to reflect these changes. These commands are described in Section~\ref{sec:runtime-config}. 226 225 227 226 \subsubsection{Site configuration file}\label{sec:siteconfig} … … 322 321 323 322 324 \subsection{Run-time re-initiali sation}\label{sec:runtime-config}323 \subsection{Run-time re-initialization}\label{sec:runtime-config} 325 324 326 325 When Tomcat is started up, the site and interface configuration files are read in, and actions/services/collections loaded as necessary. The configuration is then static unless Tomcat is restarted, or re-configuration commands issued. 
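The caching behaviour described above (configuration is read once, held in memory, and only refreshed by a restart or an explicit reconfiguration command) can be illustrated with a small language-neutral sketch. This is not Greenstone code: the class name `ConfigCache`, the `loader` callback and the file contents are all hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, Dict


class ConfigCache:
    """Hypothetical cache: reads a configuration once, rereads only on reload()."""

    def __init__(self, loader: Callable[[str], str]) -> None:
        self._loader = loader          # function that reads a named config file
        self._cache: Dict[str, str] = {}

    def get(self, name: str) -> str:
        # The first access loads the file; later accesses return the cached copy.
        if name not in self._cache:
            self._cache[name] = self._loader(name)
        return self._cache[name]

    def reload(self, name: str) -> None:
        # Stands in for an explicit reconfiguration command sent to the library.
        self._cache[name] = self._loader(name)


# A dict stands in for the files on disk (contents are placeholders).
files = {"siteConfig.xml": "contents v1"}
cache = ConfigCache(lambda name: files[name])

first = cache.get("siteConfig.xml")
files["siteConfig.xml"] = "contents v2"   # edit the file while the system is running
stale = cache.get("siteConfig.xml")       # the edit is not seen yet
cache.reload("siteConfig.xml")            # explicit reconfiguration
fresh = cache.get("siteConfig.xml")       # now the edit is visible
```

The point of the sketch is only that an edit made while the system is running stays invisible until something triggers a reload.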
… … 358 357 Browsing involves navigating pre-defined hierarchies of documents, following links of interest to find documents. The hierarchies may be constructed on different metadata fields, for example, alphabetical lists of Titles, or a hierarchy of Subject classifications. Clicking on a bookshelf icon takes you to a lower level in the hierarchy, while clicking on a book or page icon takes you to a document. 359 358 360 In the standard interface that comes with \gsiii\ \footnote{of course, this is all customi sable}, collections in a digital library are presented in the following manner. The 'home' page of the library shows a list of all the public collections in that library. Clicking on a collection link takes you to the home page for the collection, which we call the collection's 'about' page. The standard page banner looks something like that shown in Figure~\ref{fig:page-banner}.359 In the standard interface that comes with \gsiii\ \footnote{of course, this is all customizable}, collections in a digital library are presented in the following manner. The 'home' page of the library shows a list of all the public collections in that library. Clicking on a collection link takes you to the home page for the collection, which we call the collection's 'about' page. The standard page banner looks something like that shown in Figure~\ref{fig:page-banner}. 361 360 362 361 \begin{figure}[h] … … 387 386 Building native \gsiii\ collections is done using the \gst{gs3-build.sh/bat} script, with the \gst{collectionConfig.xml} file controlling how the building is done. There are a number of considerations in building a collection: what documents appear in the collection, how they are indexed for searching, which classifications are used for browsing, etc. 388 387 389 Firstly, the documents that comprise the collection should be placed in the import subdirectory. At present, only documents in this directory will appear in the collection. 
Documents can be organi sed into sub folders inside the import directory.388 Firstly, the documents that comprise the collection should be placed in the import subdirectory. At present, only documents in this directory will appear in the collection. Documents can be organized into sub folders inside the import directory. 390 389 [TODO: describe the kinds of documents that can be added, something about METS files?] 391 390 … … 449 448 The collectionConfig.xml file controls all of these options for collection building, and the format is described in Section~\ref{sec:collconfig}. 450 449 451 To build a collection, place the source documents and optional metadata.xml file(s) in the import directory, place the \gst{collectionConfig.xml} file in the etc directory, and execute \gst{gs3build.sh/bat sitename collectionname}. The process will run, placing the new indexes in the \gst{building} subdirectory of the collection's directory. You must have mysqlrunning before you start building---running \gst{ant start} will start up the MySQL server as well as tomcat.450 To build a collection, place the source documents and optional metadata.xml file(s) in the import directory, place the \gst{collectionConfig.xml} file in the etc directory, and execute \gst{gs3build.sh/bat sitename collectionname}. The process will run, placing the new indexes in the \gst{building} subdirectory of the collection's directory. You must have MySQL running before you start building---running \gst{ant start} will start up the MySQL server as well as tomcat. 452 451 453 452 Once the build process is complete, the building directory should be renamed to index (after deleting or renaming the existing index directory, if any), and Tomcat prompted to reload the collection---either by restarting the server, or by sending an activate collection command to the library servlet. … … 457 456 The Greenstone Librarian Interface (GLI) can be used to create \gsii\ style collections for \gsiii.
It can be started under Windows by selecting Greenstone Librarian Interface from the Greenstone 3 Digital Library menu in the Program Files section of the Start menu. On Linux, run \gst{./gli4gs3.sh} from the \gst{greenstone3/gli} directory. 458 457 459 Currently, the GLI works almost exactly the same as for \gsii\footnote{Eventually the GLI will be modified to use native \gsiii\ config files and collection building}. Collection configuration is done in a \gsii\ manner. The main difference is that \gsiii\ has different sites and interfaces and servlets, whereas \gsii\ has a single collect directory, and a single runtime cgi program.458 Currently, the GLI works almost exactly the same as for \gsii\footnote{Eventually the GLI will be modified to use native \gsiii\ configuration files and collection building}. Collection configuration is done in a \gsii\ manner. The main difference is that \gsiii\ has different sites and interfaces and servlets, whereas \gsii\ has a single collect directory, and a single runtime cgi program. 460 459 461 460 The GLI for \gsiii\ has a couple of new configuration parameters: site and servlet. It operates within a single site---you can edit, delete, create new collections within this site. A servlet is also specified for that site---this is used when previewing a collection. While you are working in one site, you cannot edit collections from another site. However, you can base a collection on one from another site. To change the working site and/or servlet, go to Preferences-$>$Connection in the File menu. By default, the GLI will use site \gst{localsite}, and servlet \gst{library}. 462 461 463 Collection building using the GLI will use the \gsii\ Perl scripts and plugins. At the conclusion of the \gsii\ build process, a conversion script will be run to create the \gsiii\ configuration files. This means that format statements are no longer 'live'---changing these will require changes to the \gsiii\ config files. 
You can either rebuild the collection through the GLI (may take a while), or run the conversion script directly (see following section).462 Collection building using the GLI will use the \gsii\ Perl scripts and plugins. At the conclusion of the \gsii\ build process, a conversion script will be run to create the \gsiii\ configuration files. This means that format statements are no longer 'live'---changing these will require changes to the \gsiii\ configuration files. You can either rebuild the collection through the GLI (may take a while), or run the conversion script directly (see following section). 464 463 465 464 Detailed instructions about using the GLI can be found in Sections 3.1 and 3.2 of the Greenstone 2 User's Guide (\gst{GS2-User-en.pdf}). This can be found in your \gsii\ installation, or in the greenstone3/docs/manual directory if you have installed \gsiii\ from a distribution. … … 483 482 The script attempts to create \gsiii\ format statements from the old \gsii\ ones. The conversion may not always work properly, so if the collection looks a bit strange under \gsiii\ , you should check the format statements. Format statements are described in Section~\ref{sec:formatstmt}. 484 483 485 Once again, to have the collection recogni sed by the library servlet, you can either restart Tomcat, or load it dynamically.484 Once again, to have the collection recognized by the library servlet, you can either restart Tomcat, or load it dynamically. 486 485 487 486 \subsection{Collection configuration files}\label{sec:collconfig} … … 498 497 \subsubsection{collectionInit.xml} 499 498 500 This optional file is only used for non-standard, customi sed collections. It specifies the class name of the non-standard collection class. The only syntax so far is the class name:499 This optional file is only used for non-standard, customized collections. It specifies the class name of the non-standard collection class.
The only syntax so far is the class name: 501 500 502 501 \begin{gsc}\begin{verbatim} … … 504 503 \end{verbatim}\end{gsc} 505 504 506 Section~\ref{sec:new-coll-types} describes an example collection where this file is used. Depending on the type of collection that this is used for, one or both of the other config files may not be needed.505 Section~\ref{sec:new-coll-types} describes an example collection where this file is used. Depending on the type of collection that this is used for, one or both of the other configuration files may not be needed. 507 506 508 507 \subsubsection{collectionConfig.xml} … … 578 577 The \gst{<metadataList>} element specifies some collection metadata, such as creator. The \gst{<displayItemList>} specifies some language dependent information that is used for collection display, such as collection name and short description. These displayItem elements can be specified in different languages. 579 578 580 The \gst{<search>} element specifies what indexes should be built, and provides some display and formatting information for each one. Search has an attribute, \gst{type}, which specifies which indexer to be used for indexing. Currently, \gst{mg} and \gst{mgpp}[??] are available. If type is not specified, mg is used. Multiple search elements may be specified, if more than one indexer is to be used. (Note, this is not yet recogni sed by the run-time system.)579 The \gst{<search>} element specifies what indexes should be built, and provides some display and formatting information for each one. Search has an attribute, \gst{type}, which specifies which indexer to be used for indexing. Currently, \gst{mg} and \gst{mgpp}[??] are available. If type is not specified, mg is used. Multiple search elements may be specified, if more than one indexer is to be used. (Note, this is not yet recognized by the run-time system.) 581 580 582 581 Search indexes appear as individual \gst{<index>} elements within the \gst{<search>} element. 
Some choices for the index are made using attributes of the element itself, and some through child elements. … … 597 596 </index> 598 597 \end{verbatim}\end{gsc} 599 ...in this case the \gst{<field>} tag refers to the ``title'' metadata item, found in the Dublin Core namespace. The mgsearch engine would be used on this index.598 ...in this case the \gst{<field>} tag refers to the ``title'' metadata item, found in the Dublin Core namespace. The MG search engine would be used on this index. 600 599 601 600 Alternatively, to index the full document texts by section: … … 865 864 This will display the dls.Title metadata if available, otherwise it will use the dc.Title metadata if available, otherwise it will use the Title metadata. If there are no values for any of these metadata elements, then nothing will be displayed. 866 865 867 The \gst{<gsf:switch>} element allows different formatting depending on the value of a specified metadata element. For example, the following switch statement could be used to display a different icon for each document in a list depending on which organi sation it came from.866 The \gst{<gsf:switch>} element allows different formatting depending on the value of a specified metadata element. For example, the following switch statement could be used to display a different icon for each document in a list depending on which organization it came from. 868 867 869 868 \begin{gsc} … … 958 957 A particular collection can override the properties for any service. For example, if a collection uses the GS2MGSearch service rack (look in the buildConfig.xml file for a list of service racks used), and the collection builder wants to change the text associated with this service, they can put a GS2MGSearch.properties file in the resources directory of the collection. 959 958 This will be used in preference to one in the default resources directory. 
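The override rule described above (a properties file in the collection's resources directory is used in preference to the one in the default resources directory) can be sketched as follows. This is an illustration in Python, not Greenstone's Java implementation; the function name, the dictionaries standing in for directories, and the property key are hypothetical. The sketch assumes the override works per file, since the text does not say whether missing keys fall back to the default file.

```python
from typing import Dict, Optional

Properties = Dict[str, str]


def find_properties(
    filename: str,
    collection_resources: Dict[str, Properties],
    default_resources: Dict[str, Properties],
) -> Optional[Properties]:
    """Return the collection's copy of a properties file if it has one,
    otherwise the copy from the default resources directory (or None)."""
    if filename in collection_resources:
        return collection_resources[filename]
    return default_resources.get(filename)


# Dicts stand in for the two resources directories; the key is made up.
default_dir = {"GS2MGSearch.properties": {"example.key": "default text"}}
collection_dir = {"GS2MGSearch.properties": {"example.key": "collection text"}}

overridden = find_properties("GS2MGSearch.properties", collection_dir, default_dir)
fallback = find_properties("GS2MGSearch.properties", {}, default_dir)
```

With the collection file present, its text wins; with an empty collection directory, the default file is returned.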
960 Note that while changes in the default properties files seem to require a tomcat restart to take effect, changes in the collec iton specific properties files take effect immediately.961 962 \subsection{Customi sing the interface}\label{sec:interface-customise}963 964 Format statements in the collection configuration files provide a way to change small parts of the collection display. For large scale customi sations to a collection, or ones that apply to a site as a whole, a second mechanism is available. The interface is defined by a set of XSLT files that transform the page data into HTML. Any of these files can be overridden to provide specialised display, on a site or collection basis.959 Note that while changes in the default properties files seem to require a tomcat restart to take effect, changes in the collection specific properties files take effect immediately. 960 961 \subsection{Customizing the interface}\label{sec:interface-customise} 962 963 Format statements in the collection configuration files provide a way to change small parts of the collection display. For large scale customizations to a collection, or ones that apply to a site as a whole, a second mechanism is available. The interface is defined by a set of XSLT files that transform the page data into HTML. Any of these files can be overridden to provide specialized display, on a site or collection basis. 965 964 966 965 The first section looks at customizing the existing interface, while the second section looks at defining a whole new interface. The last section describes how to add a new language translation of an interface. … … 970 969 Most of an interface is defined by XSLT files, which are stored in \gst{\$GSDL3HOME/\-web/\-interfaces/\-interface-name/\-transform}. These can be changed and the changes will take effect straight away. 
If changes only apply to certain collections or sites, not everything that uses the interface, you can override some of the files by putting new ones in a different place. XSLT files are looked for in the following order: collection, site, interface, default interface. (This currently only applies to sites, and therefore collections, that reside in the same \gs\ installation as the interface.) 971 970 972 Sites and collections can have a transform directory, which is where customi sed XSLT files should go. Any XSLT files in here will be used in preference to the interface files when using this collection. For example, if you want to have a completely different layout for the about page of a collection, you can put a new \gst{about.xsl} file into the collection's \gst{transform} directory, and this will be used instead. This is what we do for the Gutenberg sample collection.973 974 This also applies to files that are included from other XSLT files. For example the query.xsl for the query pages includes a file called querytools.xsl. To have a particular site show a different query interface either of these files may need to be modified. Creating a new version of either of these and putting it in the site transform directory will work. Either the new query.xsl will include the default querytools, or the default query.xsl will include the new querytools.xsl. The xsl:include directives are preprocessed by the java code and full paths added based on availability of the files, so that the correct one is used.971 Sites and collections can have a transform directory, which is where customized XSLT files should go. Any XSLT files in here will be used in preference to the interface files when using this collection. For example, if you want to have a completely different layout for the about page of a collection, you can put a new \gst{about.xsl} file into the collection's \gst{transform} directory, and this will be used instead.
This is what we do for the Gutenberg sample collection. 972 973 This also applies to files that are included from other XSLT files. For example the query.xsl for the query pages includes a file called querytools.xsl. To have a particular site show a different query interface either of these files may need to be modified. Creating a new version of either of these and putting it in the site transform directory will work. Either the new query.xsl will include the default querytools, or the default query.xsl will include the new querytools.xsl. The xsl:include directives are preprocessed by the Java code and full paths added based on availability of the files, so that the correct one is used. 975 974 976 975 Note that you cannot include a file with the same name as the including file. For example query.xsl cannot include query.xsl (it is tempting to do this if you just want to change one template for a particular file and then include the default, but you can't). … … 992 991 Keys will be looked up in the properties file closest to the specified language. For example, if language \gst{fr\_CA} was specified (French language, country Canada), and the default locale was \gst{en\_GB}, Java would look at properties files in the following order, until it found the key: \gst{XXX\_fr\_CA.properties}, \gst{XXX\_fr.properties}, \gst{XXX\_en\_GB.properties}, then \gst{XXX\_en.properties}, and finally the default \gst{XXX.properties}. 993 992 994 These new files are available straight away---to use the new language, add e.g. \gst{l=fr} to the arguments in the URL. To get \gs\ to add it in to the list of languages on the preferences page, an entry needs to be added into the languages list in the \gst{interfaceConfig.xml} file (see Section~\ref{sec:interfaceconfig}). Modification of this file requires a restart of the Tomcat server for the changes to be recogni sed.993 These new files are available straight away---to use the new language, add e.g.
\gst{l=fr} to the arguments in the URL. To get \gs\ to add it in to the list of languages on the preferences page, an entry needs to be added into the languages list in the \gst{interfaceConfig.xml} file (see Section~\ref{sec:interfaceconfig}). Modification of this file requires a restart of the Tomcat server for the changes to be recognized.
  995 994
  996 995 \newpage
… …
  1072 1071 Messages inside the system (``internal'' messages) all follow the same basic format: message elements contain multiple request elements, or multiple response elements. Messaging is all synchronous. The same number of responses as requests will be returned. Currently all requests are independent, so any requests can be combined into the same message, and they will be answered separately, with their responses being sent back in a single message.
  1073 1072
- 1074 When a page request (external request) comes in to the Receptionist, it looks at the action attribute and passes the request to the appropriate Action module. The Action will fire one or more internal requests to the MessageRouter, based on the arguments. The data is gathered into a response, which is returned to the Receptionist. The page that the receptionist returns contains the original request, the response from the action and other info as needed (depends on the type of Receptionist). The data may be transformed in some way --- for the \gs\ servlet we transform using XSLT to generate html pages.
+ 1073 When a page request (external request) comes in to the Receptionist, it looks at the action attribute and passes the request to the appropriate Action module. The Action will fire one or more internal requests to the MessageRouter, based on the arguments. The data is gathered into a response, which is returned to the Receptionist. The page that the receptionist returns contains the original request, the response from the action and other info as needed (depends on the type of Receptionist). The data may be transformed in some way --- for the \gs\ servlet we transform using XSLT to generate HTML pages.
  1075 1074
  1076 1075 Actions send internal style messages to the MessageRouter. Some can be answered by it, others are passed on to collections, and maybe on to services. Internal requests are for simple actions, such as search, retrieve metadata, retrieve document text
… …
  1217 1216 A service description also contains some display information---this includes the name of the service, and the text for the submit button.
  1218 1217
- 1219 Here is a sample describe request to the FieldQuery service of collection mgppdemo, along with its response. The parameters in this example include their display information. Figure~\ref{fig:query-display} shows an example html search form that may be generated from this describe response.
+ 1218 Here is a sample describe request to the FieldQuery service of collection mgppdemo, along with its response. The parameters in this example include their display information. Figure~\ref{fig:query-display} shows an example HTML search form that may be generated from this describe response.
  1220 1219
  1221 1220 \begin{quote}\begin{gsc}\begin{verbatim}
… …
  1300 1299 \end{figure}
  1301 1300
- 1302 A describe request to an applet type service returns the applet html element: this will be embedded into a web page to run the applet.
+ 1301 A describe request to an applet type service returns the applet HTML element: this will be embedded into a web page to run the applet.
  1303 1302 \begin{quote}\begin{gsc}\begin{verbatim}
  1304 1303 <request type='describe' to='mgppdemo/PhindApplet'/>
… …
  1329 1328 \end{verbatim}\end{gsc}\end{quote}
  1330 1329
- 1331 Note that the library parameter has been left blank. This is because library refers to the current servlet that is running and the name is not necessarily known in advance. So either the applet action or the Receptionist must fill in this parameter before displaying the html.
+ 1330 Note that the library parameter has been left blank. This is because library refers to the current servlet that is running and the name is not necessarily known in advance. So either the applet action or the Receptionist must fill in this parameter before displaying the HTML.
  1332 1331
  1333 1332 \subsection{'system'-type messages}\label{sec:system}
… …
  1603 1602 \end{verbatim}\end{gsc}\end{quote}
  1604 1603
- 1605 One or more parameters specifying metadata may be included in a request. Also, a metadata value of \gst{all} will retrieve all the metadata for each document.
+ 1604 One or more parameters specifying metadata may be included in a request. Also, a metadata value of \gst{all} will retrieve all the metadata for each document.
  1606 1605
  1607 1606 Any browse-type service must also implement a metadata retrieval service to provide metadata for the nodes in the classification hierarchy. The name of it is the browse service name plus \gst{MetadataRetrieve}. For example, the ClassifierBrowse service described in the previous section should also have a ClassifierBrowseMetadataRetrieve service. The request and response format is exactly the same as for the DocumentMetadataRetrieve service, except that \gst{<documentNode>} elements are replaced by \gst{<classifierNode>} elements (and the corresponding list element is also changed).
… …
  1785 1784
  1786 1785 A 'page' is some XML or HTML (or other?) data returned in response to an
- 1787 external 'page'-type request. These requests originate from outside \gs\ , for example from a servlet, or java application, and are received by the Receptionist. As described below in Section~\ref{sec:page-requests}, the requests are XML representations of \gs\ URLs. One of the arguments is action (a). This tells the Receptionist which Action module to pass the request to.
+ 1786 external 'page'-type request. These requests originate from outside \gs\ , for example from a servlet, or Java application, and are received by the Receptionist. As described below in Section~\ref{sec:page-requests}, the requests are XML representations of \gs\ URLs. One of the arguments is action (a). This tells the Receptionist which Action module to pass the request to.
  1788 1787
  1789 1788 Action modules decode the rest of the arguments to determine what requests need to be made to the system. One or more internal requests may be made to the MessageRouter. A request for format information from the Collection/Service may also be made. The resulting data is gathered together into a single XML response, \gst{<page>}, and returned to the Receptionist.
… …
  1844 1843 & & but no processing of the results is done \\
  1845 1844 & & currently only used in process actions \\
- 1846 o & output type & XML, html, WML \\
+ 1845 o & output type & XML, HTML, WML \\
  1847 1846 l & language & en, fr, zh ...\\
  1848 1847 d & document id & HASHxxx \\
… …
  1898 1897
  1899 1898 \subsubsection{Collection specific formatting}\label{sec:collformat}
- 1900 get format info, transform gsf->xsl. transfrom xml->html
- 1901
- 1902 config params are passed in to the transformation
+ 1899 get format info, transform gsf->xsl. transform xml->html
+ 1900
+ 1901 configuration params are passed in to the transformation
  1903 1902 \subsubsection{CGI arguments}
  1904 1903
… …
  1906 1905 \subsubsection{Page action}\label{sec:pageaction}
  1907 1906
- 1908 PageAction is responsible for displaying kinds of information pages, such as the home page of the library, or the home page of a collection, or the help and preferenecs pages. These pages are not associated with specific services like the other page types. In general, the data comes from describe requests to various modules.
+ 1907 PageAction is responsible for displaying kinds of information pages, such as the home page of the library, or the home page of a collection, or the help and preferences pages. These pages are not associated with specific services like the other page types. In general, the data comes from describe requests to various modules.
  1909 1908 The different pages are requested using the subaction argument. For the 'home' page, a 'describe' request is sent to the MessageRouter---this returns a list of all the collections, services, serviceClusters and sites known about. For each collection, its metadata is retrieved via a 'describe' request. This metadata is added into the previous result, which is then added into the page. For the 'about' page, a \gst{describe} request is sent to the module that the about page is about: this may be a collection or a service cluster. This returns a list of metadata
  1910 1909 and a list of services.
… …
  1952 1951 \subsubsection{XML Document action}\label{sec:xmldocumentaction}
  1953 1952
- 1954 XMLDOcumentAction is a little different to the standard DocumentAction. It operates in two modes, \gst{text} and \gst{toc}. In \gst{text} mode, it will retrieve the content of the current document node using a DocumentContentRetrieve request. In \gst{toc} mode, it retrieves the entire table of contents for the document using a DocumentStructureRetrieve request. Either mode may also retrieve metadata for the current section or each section in the table of contents.
+ 1953 XMLDocumentAction is a little different to the standard DocumentAction. It operates in two modes, \gst{text} and \gst{toc}. In \gst{text} mode, it will retrieve the content of the current document node using a DocumentContentRetrieve request. In \gst{toc} mode, it retrieves the entire table of contents for the document using a DocumentStructureRetrieve request. Either mode may also retrieve metadata for the current section or each section in the table of contents.
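The two modes of XMLDocumentAction described above can be sketched as the internal requests it might send. This is an illustration only: the collection name \gst{mycoll}, the node IDs, the \gst{<documentNodeList>} wrapper, and the \gst{structure} parameter with value \gst{entire} are assumptions made for the example. Only the service names, the \gst{process} request type and the \gst{<documentNode>} element come from the text.

\begin{quote}\begin{gsc}\begin{verbatim}
<!-- text mode: fetch the content of the current node -->
<!-- (collection name and nodeID are hypothetical) -->
<request type='process' to='mycoll/DocumentContentRetrieve'>
  <documentNodeList>
    <documentNode nodeID='HASHxxx.1'/>
  </documentNodeList>
</request>

<!-- toc mode: fetch the whole table of contents -->
<!-- (the structure parameter and its value are assumed) -->
<request type='process' to='mycoll/DocumentStructureRetrieve'>
  <paramList>
    <param name='structure' value='entire'/>
  </paramList>
  <documentNodeList>
    <documentNode nodeID='HASHxxx'/>
  </documentNodeList>
</request>
\end{verbatim}\end{gsc}\end{quote}

In either mode, a metadata request for the same nodes could follow.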
  1955 1954
  1956 1955 \subsubsection{GS2Browse action}\label{sec:browseaction}
… …
  2013 2012 GSXSLT & some manipulation functions for \gs\ XSLT\\
  2014 2013 GlobalProperties & Holds the global properties (from global.properties) \\
- 2015 MacroResolver & Used with replace elements in collection config files, replaces a macro or string with another string, metadata or text from a dictionary\\
+ 2014 MacroResolver & Used with replace elements in collection configuration files, replaces a macro or string with another string, metadata or text from a dictionary\\
  2016 2015 GS2MacroResolver & MacroResolver for GS2 collections, that uses the GDBM database\\
  2017 2016 Misc & miscellaneous functions\\
… …
  2040 2039 \subsection{Creating new services}\label{sec:new-services}
  2041 2040
- 2042 *inherit from ServiceRack - abstract base class. this handles the main process method, determines the service name and request type. if request type is describe, and to is empty, it returns a list of services (short\_service\_info) which is initialised in the configure method. a describe request to a particular service results in getServiceDescription being called, which must be supplied by the subclass.
+ 2041 *inherit from ServiceRack - abstract base class. this handles the main process method, determines the service name and request type. if request type is describe, and to is empty, it returns a list of services (short\_service\_info) which is initialized in the configure method. a describe request to a particular service results in getServiceDescription being called, which must be supplied by the subclass.
  2043 2042 other request types (process) get sent to processXXX methods, where XXX is the service name.
  2044 2043
… …
  2062 2061
  2063 2062 Java GUI Interface: There are couple of alternatives. Depending on what you want to display in the GUI, you could talk to either a Receptionist or a MessageRouter. The library classes can be set up and compiled into the GUI program.
- 2064 Talking to a Receptionist will give you access to pages of XML. It is likely that the standard Receptionist class would be used - this doesn't transform the data to HTML. Queries such as ``give me the home page of a collection'' and ``do the following search'' can be issued. All teh data needed for the result view is returned. Queries are quite simple, but are limited to what kinds of Actions are available in the library.
- 2065 Talking to a MessageRouter requires a bit more effort on the part of the GUI program, but results in greater flexibility. The kinds of queries that can be issued are individual units of action, such as ``describe yourself'', ``search'', ``retrieve the content for this document''. More than one request may need to be made for a particular feature of the GUI. However you can ask for any combination of data available in the system, you are not relying on Actions. What you will implemenet though, may be a lot like the Action code in terms of request sequences.
- 2066
- 2067 Interfaces in other programming languages: Because the communication is all XML based, other interfaces can talk to the Java library if a communication protocol is set up. This could be done using SOAP for example. LIke for Java GUI interfaces, the program could talk to a Receptionist or to a MessageRouter.
- 2068 e.g. java interface. where you can interface to. MR vs Receptionist. diff receptionists. egs, handheld - using servlet, transforming recpt, but new set of XSLT java program other program - talk to recpt but just get back XML data for pages. java gui - just talk to MR, do all processing itself.
+ 2063 Talking to a Receptionist will give you access to pages of XML. It is likely that the standard Receptionist class would be used - this doesn't transform the data to HTML. Queries such as ``give me the home page of a collection'' and ``do the following search'' can be issued. All the data needed for the result view is returned. Queries are quite simple, but are limited to what kinds of Actions are available in the library.
+ 2064 Talking to a MessageRouter requires a bit more effort on the part of the GUI program, but results in greater flexibility. The kinds of queries that can be issued are individual units of action, such as ``describe yourself'', ``search'', ``retrieve the content for this document''. More than one request may need to be made for a particular feature of the GUI. However you can ask for any combination of data available in the system, you are not relying on Actions. What you will implement though, may be a lot like the Action code in terms of request sequences.
+ 2065
+ 2066 Interfaces in other programming languages: Because the communication is all XML based, other interfaces can talk to the Java library if a communication protocol is set up. This could be done using SOAP for example. Like for Java GUI interfaces, the program could talk to a Receptionist or to a MessageRouter.
+ 2067 e.g. Java interface. where you can interface to. MR vs Receptionist. different receptionists. e.g., handheld - using servlet, transforming recpt, but new set of XSLT Java program other program - talk to recpt but just get back XML data for pages. Java gui - just talk to MR, do all processing itself.
  2069 2068
  2070 2069 Remote interfaces: remote interfaces can be set up in the same way as above, using a communication protocol between the interface, and the library program.
… …
  2077 2076 \subsection{New types of collections}\label{sec:new-coll-types}
  2078 2077
- 2079 There are two types of standard \gs\ collections: collections built with the \gsiii\ building system, and collections that are imported from \gsii\ . There are many options to collection building but it is conceivable that these options don't meet the needs of all collection builders. \gsiii\ has an ability to use any type of collection you can come up with, assuming some java code is provided.
- 2080
- 2081 There are four levels of customisation that may be needed with new collections: service, collection, interface XSLT, and action levels. We will use the example collections that come with \gs\ to describe these different levels.
+ 2078 There are two types of standard \gs\ collections: collections built with the \gsiii\ building system, and collections that are imported from \gsii\ . There are many options to collection building but it is conceivable that these options don't meet the needs of all collection builders. \gsiii\ has an ability to use any type of collection you can come up with, assuming some Java code is provided.
+ 2079
+ 2080 There are four levels of customization that may be needed with new collections: service, collection, interface XSLT, and action levels. We will use the example collections that come with \gs\ to describe these different levels.
  2082 2081
  2083 2082 Firstly, new service classes need to be written to provide the functionality to search/browse/whatever the collection. If the services have similar interfaces and functionality to the standard services, this may be all that is needed. For example, the \gsii\ MGPP collections were the first to be served in \gsiii\ . When we came to do \gsii\ MG collections, all we had to do was write some new service classes that interacted with MG instead of MGPP. Because these collections used the same type of services, this was all we had to do. The format of the configuration files was similar, they just specified MG serviceRack classes rather than MGPP ones.
… …
  2145 2144
  2146 2145 The classic interface was created to be used by this site (and is now a standard part of Greenstone).
- 2147 In many cases, creating a new interface just requires the new images and XSLT to be added to the new directory(see Sections~\ref{sec:sites-and-ints} and \ref{sec:interface-customise}). This classic interface required a bit more customisation.
+ 2146 In many cases, creating a new interface just requires the new images and XSLT to be added to the new directory(see Sections~\ref{sec:sites-and-ints} and \ref{sec:interface-customise}). This classic interface required a bit more customization.
  2148 2147
  2149 2148 The standard \gsiii\ navigation bar lists all the services available for the collection. In \gsii\ , the navigation bar provides the search option, and the different classifiers. This is not service specific, but hard coded to the search and classifiers. The XSLT that produces the navigation bar needed to be altered to produce this. But also, a new Receptionist was needed.
  2150 2149 The standard receptionist (DefaultReceptionist) gathers a little bit of extra information for each page of XML before transforming it: this is the list of services for the collection and their display information, allowing the services to be listed along the navigation bar. This is information that is needed by every page (except for the library home page) and therefore is obtained by the receptionist instead of by each action. The nzdl interface needed a bit more information than this: for the ClassifierBrowse service, if there was one, the list of classifiers and their display elements must be obtained. So a new Receptionist (NZDLReceptionist) was written that inherited from DefaultReceptionist, and added this new info into the page.
  2151 2150
- 2152 One of the servlet initialisation parameters is the receptionist class: this was added to the servlet definition in the web.xml file so that the LibraryServlet would load up the right receptionist class.
+ 2151 One of the servlet initialization parameters is the receptionist class: this was added to the servlet definition in the web.xml file so that the LibraryServlet would load up the right receptionist class.
  2153 2152
  2154 2153
… …
  2249 2248
  2250 2249
- 2251 Grenstone sets up Tomcat to run on port 8080 by default. To change this, you can edit the tomcat.port property in build.properties. If you do this before installing Greenstone, then running 'ant install' will use the new port number. If you want to change it later on, shutdown tomcat, run 'ant reconfigure-server-settings', then when you restart tomcat it will use the new port.
+ 2250 Greenstone sets up Tomcat to run on port 8080 by default. To change this, you can edit the tomcat.port property in build.properties. If you do this before installing Greenstone, then running 'ant install' will use the new port number. If you want to change it later on, shutdown tomcat, run 'ant reconfigure-server-settings', then when you restart tomcat it will use the new port.
  2252 2251
  2253 2252 Note: Tomcat must be shutdown and restarted any time you make changes in the following for those changes to take effect:
… …
  2259 2258 \item any classes or jar files used by the servlets
  2260 2259 \end{bulletedlist}
- 2261 \noindent Note: stdin and stdout for the servlets (on linux) both go to\\
+ 2260 \noindent Note: stdin and stdout for the servlets (on Linux) both go to\\
  2262 2261 \gst{\gsdlhome/packages/tomcat/logs/catalina.out}
  2263 2262
… …
  2273 2272 \end{gsc}\end{quote}
  2274 2273
- 2275 By default, Tomcat allows directory listings. To disable this, change the 'listings' paramter to false in the default servlet definition, in Tomcat's web.xml file (\gst{\$GSDL3HOME/packages/tomcat/conf/web.xml}):
+ 2274 By default, Tomcat allows directory listings. To disable this, change the 'listings' parameter to false in the default servlet definition, in Tomcat's web.xml file (\gst{\$GSDL3HOME/packages/tomcat/conf/web.xml}):
  2276 2275
  2277 2276 We have set the greenstone context to be reloadable. This means that if a class or resource file in web/WEB-INF/lib or web/WEB-INF/classes changes, the servlet will be reloaded. This is useful for development, but should be turned off for production mode (set the reloadable attribute to false).
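As a rough sketch of the production setting mentioned above, the context definition might look like the following. The \gst{reloadable} attribute is Tomcat's own; the \gst{path} and \gst{docBase} values shown here are placeholders assumed for illustration, and the real values should be taken from the existing context definition.

\begin{quote}\begin{gsc}\begin{verbatim}
<!-- path and docBase are placeholder values -->
<Context path="/gsdl3" docBase="/path/to/gsdl3/web"
         reloadable="false"/>
\end{verbatim}\end{gsc}\end{quote}

With \gst{reloadable} set to false, Tomcat no longer monitors web/WEB-INF/classes and web/WEB-INF/lib for changes.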
… …
  2300 2299 \subsection{Running Tomcat behind a proxy}
  2301 2300
- 2302 Almost everything works fine when Tomcat is running behind a proxy. The only time this causes trouble is if the servlet itself needs to make external http connections. We do this in the infomine demo collection for example. One of the service classes sends http requests to the infomine database at riverside. Since this is going through the proxy, a username and password is needed. It is not sufficient to prompt the user for a password because they are unlikely to have a password for the particular proxy that Tomcat is using. What we have done at present is to put a proxy element in the siteConfig.xml file. Here you have to enter a suitable username and password for the proxy server. Unfortunately these are entered in plain text. And the file is viewable via the servlet. So we need a better solution.
+ 2301 Almost everything works fine when Tomcat is running behind a proxy. The only time this causes trouble is if the servlet itself needs to make external HTTP connections. We do this in the infomine demo collection for example. One of the service classes sends HTTP requests to the infomine database at riverside. Since this is going through the proxy, a username and password is needed. It is not sufficient to prompt the user for a password because they are unlikely to have a password for the particular proxy that Tomcat is using. What we have done at present is to put a proxy element in the siteConfig.xml file. Here you have to enter a suitable username and password for the proxy server. Unfortunately these are entered in plain text. And the file is viewable via the servlet. So we need a better solution.
  2303 2302
  2304 2303 \newpage
  2305 2304 \section{SOAP}\label{app:soap}
  2306 2305
- 2307 Grenstone uses the Apache Axis SOAP implementation for distributed communications. Axis runs as a servlet inside Tomcat, and SOAP web services can be deployed by this Axis servlet. The Greenstone installation process sets up Axis for Tomcat, and predeploys the localsite web service.
+ 2306 Greenstone uses the Apache Axis SOAP implementation for distributed communications. Axis runs as a servlet inside Tomcat, and SOAP web services can be deployed by this Axis servlet. The Greenstone installation process sets up Axis for Tomcat, and predeploys the localsite web service.
  2308 2307
  2309 2308 To deploy a SOAP service for other sites, run \gst{ant soap-deploy-site}
… …
  2398 2397 \end{verbatim}\end{gsc}
  2399 2398
- 2400 These two examples show how to deal with Greenstone 2's external link macros. The first one is for a 'relative' external link. In this case, the links are like URL's but they actually refer to Greenstone internal documents. So the Greensotne 3 link is to the document, but with parameter s0.ext signifying that the d argument will need translating before retrieving the content.
- 2401 The second example is a truly external link. This is translated into a html type page action, where the url is presented as a frame along with the collection header in a separate frame.
+ 2399 These two examples show how to deal with Greenstone 2's external link macros. The first one is for a 'relative' external link. In this case, the links are like URL's but they actually refer to Greenstone internal documents. So the Greenstone 3 link is to the document, but with parameter s0.ext signifying that the d argument will need translating before retrieving the content.
+ 2400 The second example is a truly external link. This is translated into a HTML type page action, where the URL is presented as a frame along with the collection header in a separate frame.
  2402 2401
  2403 2402 Sometimes we need to add in macros to be resolved in a second step: