Changeset 6422
- Timestamp:
- 2004-01-09T18:06:22+13:00
- File:
-
- 1 edited
trunk/gsdl3/docs/manual/manual.tex
r6343 r6422 55 55 56 56 This documentation consists of several parts. Section~\ref{sec:install} covers greenstone installation, how to access the library, and some administration issues. Section~\ref{sec:user} looks at using the sample collections, creating new collections, and how to make small customisations to the interface. The remaining sections are aimed towards the Greenstone developer. Section~\ref{sec:develop-runtime} describes the run-time system, including the structure of the software, and the message format, while Section~\ref{sec:develop-build} describes the collection building process. Section~\ref{sec:new-features} describes how to add new features to Greenstone, such as how to add new services, new page types, new plugins for different document formats. Section~\ref{sec:distributed} describes how to make Greenstone run in a distributed fashion, using SOAP as an example communications protocol. Finally, there are several appendices, including how to install Greenstone from CVS, and a comparison of greenstone 2 and greenstone 3 format statements. 57 57 \newpage 58 58 \section{Greenstone installation and administration}\label{sec:install} 59 59 60 60 This section covers where to get Greenstone 3 from, how to install it and how to run it. The standard method of running Greenstone is as a Java servlet. We provide the Tomcat servlet container to serve the servlet :-). Standard web servers may be able to be configured to provide servlet support, and thereby remove the need to use Tomcat. Please see your web server documentation for this. This documentation assumes that you are using Tomcat. To access Greenstone, tomcat must be started up, and then it can be accessed via a web browser. 61 61 62 63 \subsection{Get and install Greenstone} 64 65 Greenstone is available from www.... There are currently two distributions: a self-installing tar for Linux, and a Windows executable. 66 62 67 Greenstone is also available through CVS (Concurrent Versioning System). 
This provides the absolute latest development version, and is not guaranteed to be stable. Appendix~\ref{app:cvs} describes how to download and install Greenstone from CVS. 63 68 64 \subsection{Get and install Greenstone}65 66 Greenstone is available from www.... There are currently two distributions: a self-installing tar for Linux, and a Windows executable.67 68 69 \subsubsection{Linux} 69 70 Download the file gsdl3-0.01-unix.sh. Then run it in a shell (./gsdl3-0.01-unix.sh). It will prompt you for where to install greenstone to, the name of your computer, what port to run tomcat on... Once Greenstone has been installed, you can start the library by running ./gsdl3.sh, and opening up a browser pointing to localhost:8080/gsdl3 (or different computer name and port). 70 ** add more once installer finished ** 71 72 Download the latest version of the self-installing tar file, gsdl3-x.xx-unix.sh, and run it in a shell (./gsdl3-x.xx-unix.sh). It will prompt you for where to install greenstone to, the name of your computer, what port to run tomcat on... Once Greenstone has been installed, you can start the library by running ./gsdl3.sh, and opening up a browser pointing to localhost:8080/gsdl3 (or different computer name and port). 71 73 72 74 \subsubsection{Windows} 73 74 Download the gsdl3-0.01-win32.exe file and double click it to start the installation. You will be prompted for ... Once Greenstone is installed, you can access the library by selecting Greenstone 3 Digital Library in the Start menu. 75 76 \subsubsection{Other notes} 77 78 To run Greenstone we are starting up the Tomcat server, and a mysql database server. 79 80 Once Greenstone has been installed, you can run tomcat, and access it in a browser at\gst{http://localhost:8080/gsdl3}---this gets you to a welcome page. From here, you can select to run the test servlet, the standard library servlet, and the remote servlet. 
81 82 \noindent Note: Tomcat must be shutdown and restarted any time you make changes in the following for those changes to take effect:\\ 75 ** add more once installer finished ** 76 77 Download the latest Windows executable, gsdl3-x.xx-win32.exe, and double click it to start the installation. You will be prompted for ... Once Greenstone is installed, you can access the library by selecting Greenstone 3 Digital Library in the Start menu. 78 79 \subsubsection{Accessing the library in a browser} 80 81 Once you have started up the library (see the previous sections for OS dependent instructions), you can access it in a browser at http://localhost:8080/gsdl3 (or http://your-computer-name/your-chosen-port/gsdl3). This gets you to a welcome page, with three links: one to run a test servlet (this allows you to check that tomcat is running properly), one to run the standard library servlet using localsite, and one to run a library servlet using the site soapsite. This site uses a SOAP connection to communicate with localsite, and demonstrates the library working in a distributed fashion. See Section~\ref{sec:distributed} for details about how to run Greenstone distributedly. 82 83 \subsection{How the library works} 84 85 The standard library program is a Java servlet. 86 87 Other types of interfaces can be used, such as Java GUI programs. See Section~\ref{sec:new-interfaces} for details about how to make these. 88 89 \subsubsection{Restarting the library} 90 91 The library program (actually tomcat) can be restarted by ... (** put a mechanism in each install program **). 
92 93 94 Tomcat must be shutdown and restarted any time you make changes in the following for those changes to take effect:\\ 83 95 \begin{bulletedlist} 84 96 \begin{gsc} … … 88 100 \item any classes or jar files used by the servlets 89 101 \end{bulletedlist} 90 \noindent Note: std in and stdoutfor the servlets both go to\\102 \noindent Note: stdout and stderr for the servlets both go to\\ 91 103 \gst{\gsdlhome/comms/jakarta/tomcat/logs/catalina.out} 92 104 … … 103 115 \caption{The Greenstone directory structure} 104 116 \label{tab:dirs} 105 \center{\footnotesize117 {\footnotesize 106 118 \begin{tabular}{l p{8cm}} 119 \hline 120 \bf directory & \bf description \\ 107 121 \hline 108 122 gsdl3 … … 175 189 where they live, whats the difference, what each contains.\\ 176 190 177 There are two Greenstone {\em sites} that come with the checkout: localsite, and soapsite. localsite has three collections, while soapsite has none. Each site has a configuration file which specifies the site name, site-wide services if any, and a list of remote sites to connect to. 178 localsite does not connect to any other sites. soapsite specifies a SOAP connection to localsite. 179 180 Talk here about matching site with an interface. 181 182 The file \gst{\gsdlhome/web/WEB-INF/web.xml} contains the setup information for Tomcat---tells it what servlets to load, what initial parameters to pass them, and what web names map to the servlets. 183 There are three servlets specified in web.xml: one is a test servlet that just prints ``hello greenstone'' to a web page. This is useful if you are having trouble getting Tomcat set up. The other two are Greenstone library servlets, {\em library}, which serves localsite, and {\em library1} which serves soapsite. 191 A site is comprised of a set of collections and possibly services. An interface is a set of images along with a set of xslt files used for translating xml output from the library into an appropriate form---html for the servlet case. 
192 One Greenstone installation can have many sites and interfaces. One instantiation of a servlet uses one site and one interface. Sites and interfaces can be matched up in different ways. For example, a single site might be served with two different interfaces. This provides different modes of access to the same content, e.g. HTML vs WML, or a completely different look and feel for different audiences. A standard interface may be used with many different sites, which provides a consistent mode of access to a lot of different content. 193 194 Collections live in the collect directory of a site. Any collections that are found in this directory when the servlet is initialised will be loaded up and presented to the user. Collections require valid configuration files, but apart from this, nothing needs to be done to the site to use new collections. Collections added while Tomcat is running will not be picked up: you can either restart the server, or send a configuration request to the servlet. Both are described in Section~\ref{sec:runtime-config}. 195 196 There are two Greenstone sites that come with the distribution: localsite, and soapsite. localsite has several demo collections, while soapsite has none. soapsite specifies that a SOAP connection should be made to localsite. Getting this to work involves setting up a SOAP server for localsite: see Section~\ref{sec:distributed} for details. 197 198 Each site and interface has a configuration file which specifies parameters for the site or interface; these are described in Section~\ref{sec:config}. 199 200 The file \gst{\gsdlhome/web/WEB-INF/web.xml} contains the setup information for Tomcat. It tells Tomcat what servlets to load, what initial parameters to pass them, and what web names map to the servlets. 201 There are three servlets specified in web.xml (these correspond to the three links in the welcome page for Greenstone): one is a test servlet that just prints ``hello greenstone'' to a web page. 
This is useful if you are having trouble getting Tomcat set up. The other two are Greenstone library servlets, {\em library}, which serves localsite, and {\em library1} which serves soapsite. Both of these servlets use the standard interface (called {\em default}). 184 202 185 203 \begin{table} 186 204 \caption{Greenstone servlet initialisation parameters} 187 205 \label{tab:serv-init} 206 {\footnotesize 188 207 \begin{tabular}{llp{5cm}} 208 \hline 189 209 \bf name & \bf sample value & \bf description \\ 190 210 \hline 191 gsdl3 home & /research/kjdon/gsdl3 & the base directory of the gsdl3 installation \\ 192 site name & localsite &the site to use \\ 193 interface name & default &the interface to use\\ 194 library name & library & the name of the library program\\ 195 default lang & en & the default language for the interface\\ 196 receptionist & NZDLReceptionist & (optional) specifies an alternative Receptionist to use\\ 197 messagerouter & NewMessageRouter & (optional) specifies an alternative MessageRouter to use\\ 198 \hline 199 \end{tabular} 211 gsdl3\_home & /research/kjdon/gsdl3 & the base directory of the gsdl3 installation \\ 212 site\_name & localsite & the name of the site to use \\ 213 interface\_name & default & the name of the interface to use\\ 214 library\_name & library & the web name of the servlet \\ 215 default\_lang & en & the default language for the interface\\ 216 receptionist\_class & NZDLReceptionist & (optional) specifies an alternative Receptionist to use\\ 217 messagerouter\_class & NewMessageRouter & (optional) specifies an alternative MessageRouter to use\\ 218 \hline 219 \end{tabular}} 220 \end{table} 221 202 The initialisation parameters used by the library servlets are shown in Table~\ref{tab:serv-init}. The most important parameters are sitename and interface name. Each servlet running uses one site and one interface. You can run multiple servlets, all using different combinations of site and interface. 
203 204 \subsection{Configuring a greenstone installation} 222 The initialisation parameters used by the library servlets are shown in Table~\ref{tab:serv-init}. This is where you define what site and interface each servlet uses. Any number of servlets can be specified here. See Appendix~\ref{app:tomcat} for more details about Tomcat. 223 224 225 \subsection{Configuring a greenstone installation}\label{sec:config} 205 226 206 227 Initial Greenstone3 system configuration is determined by a set of configuration files, all expressed in XML. Each site has a configuration file that binds parameters for the site, \gst{siteConfig.xml}. Each interface has a configuration file, \gst{interfaceConfig.xml}, that specifies Actions for the interface. Collections also have several configuration files; these are discussed in Section~\ref{sec:collconfig}. 207 The configuration files are read in when the system is initialised, and their contents are cached in memory. This means that changes made to these files once the system is running will not take immediate effect. Tomcat needs to be restarted for changes to the interface configuration file to take effect. However, changes to the site configuration file can be incorporated sending a CGI-type command to the library. CGI command can be sent to the library are made to the interface configuration file, tomcat needs to be restarted.There are a series of CGI-type commands that can be sent to the library to induce reconfiguration of different modules, including reloading the whole site. This removes the need to shutdown and restart the system to reflect these changes. These commands are described in Section~\ref{sec:runtime-config}.228 The configuration files are read in when the system is initialised, and their contents are cached in memory. This means that changes made to these files once the system is running will not take immediate effect. Tomcat needs to be restarted for changes to the interface configuration file to take effect. 
However, changes to the site configuration file can be incorporated by sending a CGI-type command to the library. There is a series of CGI-type commands that can be sent to the library to induce reconfiguration of different modules, including reloading the whole site. This removes the need to shut down and restart the system to reflect these changes. These commands are described in Section~\ref{sec:runtime-config}. 208 229 209 230 \subsubsection{Site configuration file}\label{sec:siteconfig} … … 214 235 collections directory. 215 236 216 The HTTP address is used for retrieving resources from a site outside the XML protocol. Because a site is HTTP accessible , any files (e.g. images) belonging to that site or to its collections can be specified in the HTML of a page by a URL. This avoids having to retrieve these files from a remote site via the XML protocol\footnote{Currently, sites live inside the Tomcat gsdl3 root context, and therefore all their content is accessible over HTTP via the Tomcat address. We need to see if parts can be restricted. Also, if we use a different protocol, then resources from remote sites may need to come through the XML. Also, if we are running locally without using Tomcat, we may want to get them via file:// rather than http://.}. 237 The HTTP address is used for retrieving resources from a site outside the XML protocol. Because a site is HTTP accessible through Tomcat, any files (e.g. images) belonging to that site or to its collections can be specified in the HTML of a page by a URL. This avoids having to retrieve these files from a remote site via the XML protocol\footnote{Currently, sites live inside the Tomcat gsdl3 root context, and therefore all their content is accessible over HTTP via the Tomcat address. We need to see if parts can be restricted. Also, if we use a different protocol, then resources from remote sites may need to come through the XML. 
Also, if we are running locally without using Tomcat, we may want to get them via file:// rather than http://.}. 217 238 218 239 Figure~\ref{fig:siteconfig} shows two example site configuration files. The first example is for a rudimentary site with no site-wide services, … … 241 262 <metadata name="Title">Collection builder</metadata> 242 263 <metadata name="Description">Builds collections in a 243 264 gsdl2-style manner</metadata> 244 265 </metadataList> 245 266 <serviceRackList> … … 270 291 <subaction name='home' xslt='home.xsl'/> 271 292 <subaction name='about' xslt='about.xsl'/> 293 <subaction name='help' xslt='help.xsl'/> 294 <subaction name='pref' xslt='pref.xsl'/> 272 295 </action> 273 296 <action name='q' class='QueryAction' xslt='basicquery.xsl'/> 274 <action name='b' class=' BrowseAction' xslt='classifier.xsl'/>297 <action name='b' class='GS2BrowseAction' xslt='classifier.xsl'/> 275 298 <action name='a' class='AppletAction' xslt='applet.xsl'/> 276 299 <action name='d' class='DocumentAction' xslt='document.xsl'/> 300 <action name='xd' class='XMLDocumentAction'> 301 <subaction name='toc' xslt='document-toc.xsl'/> 302 <subaction name='text' xslt='document-content.xsl'/> 303 </action> 277 304 <action name='pr' class='ProcessAction' xslt='process.xsl'/> 278 305 <action name='s' class='SystemAction' xslt='system.xsl'/> … … 280 307 </interfaceConfig> 281 308 \end{verbatim}\end{gsc} 282 \caption{ A sampleinterface configuration file}309 \caption{Default interface configuration file} 283 310 \label{fig:ifaceconfig} 284 311 \end{figure} … … 301 328 \caption{Example run-time configuration arguments.} 302 329 \label{tab:run-time config} 330 {\footnotesize 303 331 \begin{tabular}{lp{8cm}} 332 \hline 304 333 \gst{a=s\&sa=c} & reconfigures the whole site, reads in siteConfig.xml, reloads all the collections. Just part of this can be specified with another argument \gst{ss} (system subset). 
The valid values are \gst{collectionList}, \gst{siteList}, \gst{serviceList}, \gst{clusterList}. \\ 305 334 \gst{a=s\&sa=c\&sc=XXX} & reconfigures the XXX collection or cluster. \gst{ss} can also be used here, valid values are \gst{metadataList} and \gst{serviceList}. \\ 306 335 \gst{a=s\&sa=a} & (re)activate a specific module. Modules are specified using two arguments, \gst{st} (system module type) and \gst{sn} (system module name). Valid types are \gst{collection}, \gst{cluster} \gst{site}.\\ 307 336 \gst{a=s\&sa=d} & deactivate a module. \gst{st} and \gst{sn} can be used here too. Valid types are \gst{collection}, \gst{cluster}, \gst{site}, \gst{service}. Modules are removed from the current configuration, but will reappear if Tomcat is restarted.\\ 308 \gst{a=s\&sa=d\&sc=XXX} & deactivate a module belonging to the XXX collection or cluster. \gst{st} and \gst{sn} can be used here too. Valid types are \gst{service}. \\\end{tabular} 337 \gst{a=s\&sa=d\&sc=XXX} & deactivate a module belonging to the XXX collection or cluster. \gst{st} and \gst{sn} can be used here too. Valid types are \gst{service}. 
\\ 338 \hline 339 \end{tabular}} 309 340 \end{table} 310 341 \newpage 311 342 \section{Using Greenstone 3}\label{sec:user} 312 343 … … 367 398 \subsection{Collection configuration files}\label{sec:collconfig} 368 399 369 Each collection has two configuration files, \gst{collectionConfig.xml} and \gst{buildConfig.xml},that give metadata, display and other information for the 400 Each collection has two, or possibly three, configuration files, \gst{collectionConfig.xml} and \gst{buildConfig.xml}, and optionally \gst{collectionInit.xml}, that give metadata, display and other information for the 370 401 collection.\footnote{\gst{siteConfig.xml} and \gst{interfaceConfig.xml} are new for Greenstone3, while \gst{collectionConfig.xml} and \gst{buildConfig.xml} replace \gst{collect.cfg} and \gst{build.cfg} in 371 402 Greenstone2.} The first includes user-defined presentation metadata for the collection, … … 374 405 the build-time process and includes any metadata that can be determined 375 406 automatically. It also includes configuration information for any ServiceRacks needed by the collection. 407 408 \subsubsection{collectionInit.xml} 409 410 This optional file specifies a new collection class if the standard one is not to be used. The only syntax so far is the class name: 411 412 \begin{gsc}\begin{verbatim} 413 <collectionInit class="XMLCollection"/> 414 \end{verbatim}\end{gsc} 415 416 Section~\ref{sec:new-coll-types} describes an example collection where this file is used. Depending on the type of collection that this is used for, one or both of the other config files may not be needed. 417 418 \subsubsection{collectionConfig.xml} 376 419 377 420 The collection configuration file is where the collection designer (e.g. a librarian) decides what form the collection should take. This includes the collection metadata such as title and description, and also includes what indexes and browsing structures should be built. 
The format of \gst{collectionConfig.xml} is still under consideration. However, Figure~\ref{fig:collconfig} shows the parts of it that have been defined so far. (Since collection building at this stage is still done using Greenstone2 Perl scripts and the old \gst{collect.cfg} file, we have only defined the format for the parts of \gst{collectionConfig.xml} that are used by the runtime-system.) … … 418 461 <classifier name="CL4"> 419 462 <format> 420 463 <gsf:template match="documentNode"> 421 464 <br /><gsf:link><gsf:metadata name='Keyword' /> 422 465 </gsf:link></gsf:template> … … 436 479 The \gst{<display>} element contains optional formatting information for the display of documents. Templates that can be specified here include \gst{documentHeading}, \gst{DocumentContent}, and other information that could be specified (in a yet to be decided format) are things such as whether or not to display the cover image, table of contents etc. 437 480 438 \subsection{ Building configuration file}\label{sec:buildconfig}481 \subsection{buildConfig.xml}\label{sec:buildconfig} 439 482 440 483 The file \gst{buildConfig.xml} is produced by the collection building process, and contains metadata and other information about the collection that can … … 491 534 </buildConfig> 492 535 \end{verbatim}\end{gsc} 493 \caption{Sample buildConfig.xml file }536 \caption{Sample buildConfig.xml file (mgppdemo collection)} 494 537 \label{fig:buildconfig} 495 538 \end{figure} … … 529 572 \caption{Format elements for GSF format language} 530 573 \label{tab:gsf-format} 531 \begin{tabular}{ll} 574 {\footnotesize 575 \begin{tabular}{p{6.5cm}p{6.5cm}} 576 \hline 532 577 \bf Element & \bf Description \\ 533 \gst{<gsf:text/>} & The document's text\\ 578 \hline 579 \gst{<gsf:text/>} & The document's text\\ 534 580 \gst{<gsf:link>...</gsf:link>} & The HTML link to the document itself \\ 535 \gst{<gsf:link type='document'>...</gsf:link>} & Same as above\\ 536 \gst{<gsf:link 
type='classifier'>...</gsf:link>} & A link to a classification node (use in classifierNode templates)\\ 537 \gst{<gsf:link type='source'>...</gsf:link>} & The HTML link to the original file---set for documents that have been converted from e.g. Word, PDF, PS \\ 581 \gst{<gsf:link type='document'>... 582 </gsf:link>} & Same as above\\ 583 \gst{<gsf:link type='classifier'>... 584 </gsf:link>} & A link to a classification node (use in classifierNode templates)\\ 585 \gst{<gsf:link type='source'>... 586 </gsf:link>} & The HTML link to the original file---set for documents that have been converted from e.g. Word, PDF, PS \\ 538 587 \gst{<gsf:icon/>} & An appropriate icon\\ 539 588 \gst{<gsf:icon type='document'/>} & same as above\\ … … 549 598 </gsf:choose-metadata>} 550 599 & A choice of metadata. Will select the first existing one. the metadata elements can have the select, separator and multiple attributes like normal.\\ 551 \gst{<gsf:switch preprocess='preprocess-type'> 552 <gsf:metadata name='Title'/><gsf:when test='test-type' test-value='xxx'>.....</gsf:when><gsf:when test='test-type' test-value='xxx'>...</gsf:when><gsf:otherwise>...</gsf:otherwise></gsf:switch>} & switch on the value of a particular metadata - the metadata is specified in gsf:metadata, has the same attributes as normal.\\ 553 \end{tabular} 600 \gst{<gsf:switch preprocess= 601 'preprocess-type'> 602 <gsf:metadata name='Title'/> 603 <gsf:when test='test-type' 604 test-value='xxx'>...</gsf:when> 605 <gsf:when test='test-type' 606 test-value='yyy'>...</gsf:when> 607 <gsf:otherwise>...</gsf:otherwise> 608 </gsf:switch>} & switch on the value of a particular metadata - the metadata is specified in gsf:metadata, has the same attributes as normal.\\ 609 \hline 610 \end{tabular}} 554 611 \end{table} 555 612 … … 562 619 To get the previous metadata, the format statement would have the following in it: 563 620 564 \gst{<gsf:metadata name='Title' select='ancestors' separator='; '/>; <gsf:metadata 
name='Title'/>} 621 \begin{gsc} 622 \begin{verbatim} 623 <gsf:metadata name='Title' select='ancestors' separator='; '/>; 624 <gsf:metadata name='Title'/> 625 \end{verbatim} 626 \end{gsc} 565 627 566 628 \begin{table} 567 629 \caption{Select types for metadata format elements} 568 630 \label{tab:gsf-select-types} 631 {\footnotesize 569 632 \begin{tabular}{ll} 570 633 \hline … … 579 642 descendents & All the descendent sections\\ 580 643 \hline 581 \end{tabular} 644 \end{tabular}} 582 645 \end{table} 583 646 … … 585 648 \begin{gsc} 586 649 \begin{verbatim} 587 <gsf:choose-metadata><gsf:option name='dc.Title'/><gsf:option name='dls.Title'/><gsf:option name='Title'/></gsf:choose-metadata> 650 <gsf:choose-metadata> 651 <gsf:option name='dc.Title'/> 652 <gsf:option name='dls.Title'/> 653 <gsf:option name='Title'/> 654 </gsf:choose-metadata> 588 655 \end{verbatim} 589 656 \end{gsc} … … 596 663 \begin{verbatim} 597 664 <gsf:switch metadata='Organization' preprocess='toLower;stripSpace'> 598 <gsf:when test='equals' test-value='bostid'><!-- output BOSTID image --></gsf:when> 599 <gsf:when test='equals' test-value='worldbank'><!-- output world bank image --></gsf:when> 665 <gsf:when test='equals' test-value='bostid'> 666 <!-- output BOSTID image --></gsf:when> 667 <gsf:when test='equals' test-value='worldbank'> 668 <!-- output world bank image --></gsf:when> 600 669 <gsf:otherwise><!-- output default image--></gsf:otherwise> 601 670 </gsf:switch> … … 620 689 <search> 621 690 <format> <!--Put here templates related to searching and 622 623 691 the query page. The common one is the documentNode 692 template --> 624 693 <gsf:template match='documentNode'>...</gsf:template> 625 694 </format> … … 628 697 <classifier name='xx'> 629 698 <format><!-- put here templates related to formating a 630 631 699 particular classifier page. 
Common ones are documentNode 700 and classifierNode templates--> 632 701 <gsf:template match='documentNode'>...</gsf:template> 633 702 <gsf:template match='classifierNode'>...</gsf:template> 634 703 <gsf:template match='classifierNode' mode='horizontal'>... 635 704 </gsf:template> 636 705 </format> 637 706 </classifier> … … 640 709 <display> 641 710 <format><!-- here goes any formatting relating to the display 642 643 711 of the documents. These are generally named templates, 712 and format options --> 644 713 <gsf:template name='documentContent'>...</gsf:template> 645 714 <gsf:option name='TOC' value='true'/> … … 668 737 \caption{Formatting options} 669 738 \label{tab:format_options} 670 \center{\footnotesize739 {\footnotesize 671 740 \begin{tabular}{llp{5cm}} 672 741 \hline … … 684 753 685 754 686 \subsection{Customising the interface} 755 \subsection{Customising the interface}\label{sec:interface-customise} 687 756 688 757 The interface can be customised in several ways. … … 717 786 To use a new interface, the tomcat web.xml must be edited: either change the interface that a current version of the servlet is using, or add another servlet instantiation to the file (see Section~\ref{sec:sites-and-ints} or Appendix~\ref{app:tomcat}). The Tomcat server must be restarted for this to take effect. 
718 787 719 788 \newpage 720 789 \section{Developing Greenstone 3: Run-time system}\label{sec:develop-runtime} 721 790 … … 835 904 836 905 \begin{table} 837 \center{\footnotesize906 {\footnotesize 838 907 \begin{tabular}{lll} 839 908 \hline … … 1175 1244 \caption{Status codes currently used in Greenstone 3} 1176 1245 \label{tab:status codes} 1246 {\footnotesize 1177 1247 \begin{tabular}{llp{8cm}} 1248 \hline 1178 1249 \bf code name & \bf code & \bf meaning \\ 1179 1250 & \bf value & \\ 1251 \hline 1180 1252 SUCCESS & 1 & the request was accepted, and the process was completed \\ 1181 1253 ACCEPTED & 2 & the request was accepted, and the process has been started, but it is not completed yet \\ … … 1185 1257 HALTED & 12 & the process has stopped \\ 1186 1258 INFO & 20 & just an info message that doesn't imply anything \\ 1187 \end{tabular} 1259 \hline 1260 \end{tabular}} 1188 1261 \end{table} 1189 1262 … … 1694 1767 \caption{Configure CGI arguments} 1695 1768 \label{tab:system-cgi} 1769 {\footnotesize 1696 1770 \begin{tabular}{ll} 1697 1771 \hline 1698 1772 \bf arg & \bf description\\ 1773 \hline 1699 1774 a=s & system action\\ 1700 1775 sa=c$|$a$|$d & type of system request: c (configure), a (add/activate), \\ … … 1709 1784 st=collection& \\ 1710 1785 \hline 1711 \end{tabular} 1786 \end{tabular}} 1712 1787 \end{table} 1713 1788 … … 1717 1792 \caption{The utility classes in org.greenstone.gsdl3.util} 1718 1793 \label{tab:utils} 1719 \center{\footnotesize1794 {\footnotesize 1720 1795 \begin{tabular}{lp{3.75in}} 1721 1796 \hline 1722 1797 \bf Utility class & \bf Description\\ 1798 \hline 1723 1799 ConfigVars & holds the servlet startup variables, including library name, site name, interface name, default language\\ 1724 1800 Dictionary & wrapper around a Resource Bundle, providing strings with parameter\\ … … 1736 1812 XSLTUtil & contains static methods to be called from within XSLT \\ 1737 1813 \hline 1738 \end{tabular} 1739 } 1814 \end{tabular}} 1740 1815 \end{table} 
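
The configure arguments in Table~\ref{tab:system-cgi} are passed to a library servlet as a query string. As a sketch, assuming the default address \gst{http://localhost:8080/gsdl3} and the servlet web name \gst{library} used elsewhere in this manual (the collection name \gst{demo} is hypothetical), the requests might look like this:

\begin{gsc}\begin{verbatim}
reconfigure the whole site:
  http://localhost:8080/gsdl3/library?a=s&sa=c

reconfigure only the (hypothetical) demo collection:
  http://localhost:8080/gsdl3/library?a=s&sa=c&sc=demo

deactivate the demo collection:
  http://localhost:8080/gsdl3/library?a=s&sa=d&st=collection&sn=demo
\end{verbatim}\end{gsc}

The exact address depends on the computer name, port, and servlet name chosen at installation time.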
1741 1816 1817 \newpage 1742 1818 \section{Collection building architecture}\label{sec:develop-build} 1743 1819 **** GEORGE **** … … 1746 1822 modules API\\ 1747 1823 1824 \newpage 1748 1825 \section{Developing Greenstone 3: Adding new features}\label{sec:new-features} 1749 1826 1750 \subsection{Creating new services} 1827 \subsection{Creating new services}\label{sec:new-services} 1751 1828 1752 1829 *inherit from ServiceRack - abstract base class. this handles the main process method, determines the service name and request type. if request type is describe, and to is empty, it returns a list of services (short\_service\_info) which is initialised in the configure method. a describe request to a particular service results in getServiceDescription being called, which must be supplied by the subclass. … … 1760 1837 1761 1838 * should a metadata retrieval service advertise what metadata is available?? 1762 \subsection{creating new actions/pages} 1763 1764 \subsection{new interfaces} 1839 \subsection{creating new actions/pages}\label{sec:new-pages} 1840 1841 \subsection{new interfaces}\label{sec:new-interfaces} 1765 1842 e.g. java interface. where you can interface to. MR vs Receptionist. diff receptionists. egs, handheld - using servlet, transforming recpt, but new set of XSLT java program other program - talk to recpt but just get back XML data for pages. java gui - just talk to MR, do all processing itself. 
1766 1843 1767 \subsection{Adding new classifiers} 1844 \subsection{Adding new classifiers}\label{sec:new-classifiers} 1768 1845 *** GEORGE *** 1769 \subsection{Adding new plugins} 1846 \subsection{Adding new plugins}\label{sec:new-plugins} 1770 1847 *** GEORGE *** 1771 1848 1772 \subsection{Documented examples} 1773 1774 talk about the sample collections briefly - but most documentation is in the description 1775 \subsubsection{The NZDL web interface} 1776 1777 We have created a second interface that can be seen at \gst{http://www.greenstone.org/greenstone3/nzdl}. There are some small differences between this and the standard greenstone interface. 1778 We created a new interface---called nzdl, put into the web/interfaces directory. It has a set of images and transform files like the standard interface. And most of the XSLT files have been overridden. 1779 1780 * Along the navigation bar, it has search and classifiers. The standard interface has each service along there. We needed to modify the navigation bar XSLT code, but also we added a new receptionist. 1781 interface found at www... 1782 what did we have to do to get this interface? 1783 classifiers displayed instead of services, query services all have same button, hard coded query page. 1784 assumptions made, classes modified - new Receptionist, new XSLT. 1785 1786 1787 1849 \subsection{New types of collections}\label{sec:new-coll-types} 1850 1851 There are two types of standard Greenstone collections: collections built with the Greenstone 3 building system, and collections that are imported from Greenstone 2. There are many options for collection building, but these options may not meet the needs of all collection builders. Greenstone 3 can use any type of collection you can come up with, provided some Java code is supplied. 1852 1853 1854 There are four levels of customisation that may be needed with new collections: service, collection, interface XSLT, and action levels. 
We will use the example collections that come with Greenstone to describe these different levels. 1855 1856 Firstly, new service classes need to be written to provide the functionality to search, browse, or otherwise access the collection. If the services have similar interfaces and functionality to the standard services, this may be all that is needed. For example, the Greenstone 2 MGPP collections were the first to be served in Greenstone 3. When we came to do Greenstone 2 MG collections, all we had to do was write some new service classes that interacted with MG instead of MGPP. Because these collections used the same type of services, this was all we had to do. The format of the configuration files was similar; they just specified MG serviceRack classes rather than MGPP ones. 1857 1858 The nzmaps collection used the same level of customisation, just implementing new services and fitting all the extra display elements into the standard query/display framework using JavaScript. 1859 1860 The gberg collection, however, was done quite differently to the standard collections. New services were provided to search the database (built with Lucene) and to provide the documents and parts of documents (using XSLT to transform the raw XML files). The collectionConfig file had some extra information in it: a list of the documents in the collection along with their Titles. Because the standard collection class has no notion of document lists, a new class was created (org.greenstone.gsdl3.collection.XMLCollection). This class is basically the same as a standard collection class except that it looks for the documentList in the collectionConfig file and stores it in memory. 1861 1862 To tell Greenstone to load up a different type of collection class, we use another configuration file: etc/collectionInit.xml. This specifies the name of the collection class to use. 1863 Currently, this is all that is specified in that file, but you may want to add other information, such as parameters for the class.
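As an illustration only, the following Java sketch shows one way the class name could be read from collectionInit.xml and turned into a fully qualified class name. It is not the actual Greenstone loading code: the class CollectionInitSketch, the method collectionClassName, and the fallback name Collection are all hypothetical, and we assume that an unqualified name refers to a class in the org.greenstone.gsdl3.collection package (the package that XMLCollection lives in).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Greenstone: resolves the collection class
// named in the contents of a collectionInit.xml file.
final class CollectionInitSketch {
    // Package that the manual gives for XMLCollection.
    static final String COLLECTION_PACKAGE = "org.greenstone.gsdl3.collection.";
    // Assumed fallback when no class attribute is present.
    static final String DEFAULT_CLASS = "Collection";

    // Returns the fully qualified class name for the given collectionInit XML.
    static String collectionClassName(String collectionInitXml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc =
                builder.parse(new InputSource(new StringReader(collectionInitXml)));
            // getAttribute returns an empty string when the attribute is missing.
            String name = doc.getDocumentElement().getAttribute("class");
            if (name.isEmpty()) {
                name = DEFAULT_CLASS;
            }
            return COLLECTION_PACKAGE + name;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse collectionInit XML", e);
        }
    }

    public static void main(String[] args) {
        // Prints org.greenstone.gsdl3.collection.XMLCollection
        System.out.println(
            collectionClassName("<collectionInit class=\"XMLCollection\"/>"));
    }
}
```

A real loader would go on to instantiate the class by name (for example with Class.forName) and could read any extra parameters from the same element.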
1864 1865 \gst{<collectionInit class="XMLCollection"/>} 1866 1867 The display for the collection is also quite different. The home page for the collection displays the list of documents. To achieve this, the describe response from the collection had to include the list, and a new XSLT was written for the collection that displayed this. Collection XSLT should be put in the transform directory of the collection\footnote{These are currently only used when running Greenstone in a non-distributed fashion, but proper support will be added at some stage}. 1868 1869 Document display is significantly different to standard Greenstone. There are two modes of display: table of contents mode, and content mode. Clicking on a document link from the collection home page takes the user to the table of contents for the document. Clicking on one of the sections in the table of contents takes them to a display of that section. To facilitate this, not only did we need new XSLT files, we also needed a new action. XMLDocumentAction was created, which uses two subactions, toc and text, for the different modes of display. 1870 1871 The Receptionist was told about this new action by the addition of the following to the interfaceConfig.xml file: 1872 1873 \begin{gsc}\begin{verbatim} 1874 <action name='xd' class='XMLDocumentAction'> 1875 <subaction name='toc' xslt='document-toc.xsl'/> 1876 <subaction name='text' xslt='document-content.xsl'/> 1877 </action> 1878 \end{verbatim}\end{gsc} 1879 1880 XSLT files are linked to subactions rather than the action as a whole. The collection supplies the two XSLT files written appropriately for the data it contains. 1881 1882 All links to the documents have to be changed to use the xd action rather than the standard d action. These include the links from the home page, and the links from query results. 1883 1884 Querying of the collection is almost the same as usual.
The query service provides a list of parameters, does the query and then sends back a list of document identifiers. The standard query action was fine for this collection. The change occurs in the way that the results are displayed: this is accomplished using a format statement supplied in the collectionConfig file inside the search node. 1885 1886 \begin{gsc}\begin{verbatim} 1887 <search> 1888 <format> 1889 <gsf:template match="documentNode"> 1890 <xsl:param name="collName"/> 1891 <xsl:param name="serviceName"/> 1892 <td> 1893 <b><a href="{$library_name}?a=xd&amp;sa=text&amp;c={$collName}& 1894 amp;d={@nodeID}&amp;p.a=q&amp;p.s={$serviceName}"> 1895 <xsl:choose> 1896 <xsl:when test="metadataList/metadata[@name='Title']"> 1897 <gsf:metadata name="Title"/> 1898 </xsl:when> 1899 <xsl:otherwise>(section)</xsl:otherwise> 1900 </xsl:choose> 1901 </a> 1902 </b> from <b><a href="{$library_name}?a=xd&amp;sa=toc&amp; 1903 c={$collName}&amp;d={@nodeID}.rt&amp;p.a=q&amp;p.s={$serviceName}"> 1904 <gsf:metadata name="Title" select="root"/></a></b> 1905 </td> 1906 </gsf:template> 1907 </format> 1908 </search> 1909 \end{verbatim}\end{gsc} 1910 1911 Instead of displaying an icon and the Title, it displays the Title of the section and the Title of the document. Both of these are linked to the document: the section title to the content of that section, the document title to the table of contents for the document. Because these require non-standard arguments to the library, these parts of the template are written in XSLT, not the Greenstone format language. As shown here, it is perfectly feasible to write a format statement that includes XSLT mixed in with Greenstone format elements. 1912 1913 The document display uses CSS files to format the output; these are kept in the collection and specified in the collection's XSLT files. The documents also specify DTD files. Due to the way we read in the XML files, Tomcat sometimes has trouble locating the DTDs.
One option is to make all the links absolute links to files in the collection folder; the other option is to put the DTDs in Greenstone's DTD folder, gsdl3/resources/dtd. 1914 1915 \subsection{The NZDL mirror site} 1916 1917 The library seen at \gst{http://www.greenstone.org/greenstone3/nzdl} is like a mirror to \gst{http://www.nzdl.org}: it aims to present the same collections in the same way, but using Greenstone 3 instead of Greenstone 2. It uses a new site and a new interface. The web.xml file had a new servlet entry in it to specify the combination of nzdl site and interface. 1918 1919 The site was created by making a directory called nzdl in the sites folder. A siteConfig file was created. Because it is running on Linux, we were able to link to all the collections in the old Greenstone installation. The convert\_coll\_from\_gs2.pl script was run over all the collections to produce the new XML configuration files. 1920 1921 A new interface, also called nzdl, was created in the interfaces directory. 1922 In many cases, creating a new interface just requires the new images and XSLT to be added to the new directory (see Sections~\ref{sec:sites-and-ints} and \ref{sec:interface-customise}). This setup also required a bit more customisation. 1923 1924 The standard Greenstone navigation bar lists all the services available for the collection. In Greenstone 2, the navigation bar provided the search option, and the different classifiers. This is not service specific, but hard coded to the search and classifiers. The XSLT that produced the navigation bar needed to be altered to produce this. A new Receptionist was also needed. 1925 The standard receptionist (DefaultReceptionist) gathers a little bit of extra info for each page of XML before transforming it: this is the list of services for the collection and their display information, allowing the services to be listed along the navigation bar.
This is information that is needed by every page (except for the library home page) and therefore is obtained by the receptionist instead of by each action. The nzdl interface needed a bit more information than this: for the ClassifierBrowse service, if there was one, the list of classifiers and their display elements had to be obtained. So a new Receptionist was written that inherited from DefaultReceptionist, and added this new info into the page. 1926 1927 One of the servlet initialisation parameters is the receptionist class: this was added to the servlet definition in the web.xml file so that the LibraryServlet would load up the right receptionist class. 1928 1929 1930 \newpage 1788 1931 \section{Distributed Greenstone}\label{sec:distributed} 1789 1932 1790 \begin{figure}[t] 1933 Greenstone is designed to run in a distributed fashion. One Greenstone installation can talk to several sites on different computers. This requires some sort of communication protocol. Any protocol can be used; however, we have only implemented a simple SOAP protocol. 1934 1935 more explanation.. 1936 1937 \begin{figure}[h] 1791 1938 \centering 1792 1939 \includegraphics[width=4in]{remote} %5.8 … … 1795 1942 \end{figure} 1796 1943 1797 Greenstone is designed to run in a distributed fashion. One greenstone installation can talk to several sites on different computers. This requires some sort of communication protocol. Any protocol can be used, however we have only implemented a simple SOAP protocol. 1798 1799 more explanation.. 1800 1801 1944 We have used Apache SOAP for Java. This is run as a servlet in Tomcat. 1802 1945 If you have obtained Greenstone through CVS, you will need to install SOAP separately, as described in Appendix~\ref{app:soap-cvs}. Debugging SOAP is described in Appendix~\ref{app:soap-debug}. 1803 1946 1947 \subsection{Serving a site using soap} 1948 what do we have to do?? resource file format, deploy the service etc.
1804 1949 1805 1950 \appendix 1806 1951 1952 \newpage 1953 \section{Using Greenstone 3 from CVS}\label{app:cvs} 1954 1955 *** need to make sure building stuff is in here *** 1956 1957 Greenstone 3 is also available via CVS. You can download the latest version of the code. This is not guaranteed to be stable; in fact, it is likely to be unstable. The advantage of using CVS is that you can update the code and get the latest fixes. What's in CVS is quite different to what comes in a release. The code needs to be compiled, and some files need editing... 1958 1959 To check out the Greenstone code, use: 1960 1961 \begin{quote}\begin{gsc}\begin{verbatim} 1962 cvs -d :pserver:cvs\[email protected]:2402/usr/local/ 1963 global-cvs/gsdl-src co gsdl3 1964 \end{verbatim}\end{gsc}\end{quote} 1965 1966 If you need it, the password for anonymous CVS access is \gst{anonymous}. Note that some older versions of CVS have trouble accessing this repository due to the port number being present. We are using version 1.11.1p1. 1967 1968 The software needs to be compiled and installed. The installation procedure has been semi-automated. The following sections describe installation under Linux and Windows. 1969 1970 \subsection{Linux install} 1971 1972 An install.bash script is provided to compile and install Greenstone 3. What you need to do is: 1973 1974 \begin{quote}\begin{gsc} 1975 cd gsdl3\\ 1976 source setup.bash\\ 1977 install.bash\\ 1978 source setup.bash\\ 1979 \end{gsc}\end{quote} 1980 1981 Note: Mozilla doesn't seem to like localhost. If you are using it, you should edit the siteConfig files (web/sites/<sitename>/siteConfig.xml) to have your computer name instead of localhost. 1982 1983 Note: \gst{source setup.bash} needs to be run once in each xterm window before doing a make or running Tomcat. setup.bash sets the environment variables \gst{CLASSPATH, PATH, JAVA\_HOME} etc.
1984 1985 To shutdown or startup Tomcat, the commands are: 1986 \begin{quote}\begin{gsc} 1987 \gsdlhome/comms/jakarta/tomcat/bin/shutdown.sh\\ 1988 \gsdlhome/comms/jakarta/tomcat/bin/startup.sh\\ 1989 \end{gsc}\end{quote} 1990 1991 You shouldn't run install.bash twice. 1992 To update your installation, you can run update.bash - this updates your code from CVS, and re-makes all the java stuff. 1993 1994 \subsection{Windows install} 1995 \newpage 1807 1996 \section{Tomcat}\label{app:tomcat} 1808 1997 … … 1862 2051 Almost everything works fine when tomcat is running behind a proxy. The only time this causes trouble is if the servlet itself needs to make external http connections. We do this in the infomine demo collection for example. One of the service classes sends http requests to the infomine database at riverside. Since this is going through the proxy, a username and password is needed. It is not sufficient to prompt the user for a password because they are unlikely to have a password for the particular proxy that tomcat is using. What we have done at present is to put a proxy element in the siteConfig.xml file. Here you have to enter a suitable username and password for the proxy server. Unfortunately these are entered in plain text. And the file is viewable via the servlet. So we need a better solution. 1863 2052 2053 \newpage 1864 2054 \section{SOAP}\label{app:soap} 1865 2055 \subsection{Setting up SOAP from CVS}\label{app:soap-cvs} … … 1920 2110 1921 2111 1922 \section{Using Greenstone 3 from CVS}\label{app:cvs} 1923 1924 *** need to make sure building stuff is in here *** 1925 1926 Greenstone 3 is also available via CVS. You can download the latest version of the code. This is not guaranteed to be stable, in fact it is likely to be unstable. The advantage of using CVS is that you can update the code and get the latest fixes. Whats in CVS is quite different to what comes in a release. The code needs to be compiled, and some files need editing... 
1927 1928 To check out the greenstone code, use: 1929 1930 \begin{quote}\begin{gsc}\begin{verbatim} 1931 cvs -d :pserver:cvs\[email protected]:2402/usr/local/ 1932 global-cvs/gsdl-src co gsdl3 1933 \end{verbatim}\end{gsc}\end{quote} 1934 1935 If you need it, the password for anonymous CVS access is \gst{anonymous}. Note that some older versions of CVS have trouble accessing this repository due to the port number being present. We are using version 1.11.1p1. 1936 1937 The software needs to be compiled and installed. The installation procedure has been semi-automated. The following sections describe installation under Linux and windows. 1938 1939 \subsection{Linux install} 1940 1941 An install.sh script is provided to compile and install Greenstone3. What you need to do is: 1942 1943 \begin{quote}\begin{gsc} 1944 cd gsdl3\\ 1945 source setup.bash\\ 1946 install.bash\\ 1947 source setup.bash\\ 1948 \end{gsc}\end{quote} 1949 1950 Note: if you are using mozilla it doesn't seem to like localhost - you should edit the siteConfig files (web/sites/<sitename>/siteConfig.xml) to have your computer name instead of localhost. 1951 1952 Note: \gst{source setup.bash} needs to be done once in any xterm window before doing a make or running Tomcat. setup.bash sets the environment variables \gst{CLASSPATH, PATH, JAVA\_HOME} etc. 1953 1954 To shutdown or startup Tomcat, the commands are: 1955 \begin{quote}\begin{gsc} 1956 \gsdlhome/comms/jakarta/tomcat/bin/shutdown.sh\\ 1957 \gsdlhome/comms/jakarta/tomcat/bin/startup.sh\\ 1958 \end{gsc}\end{quote} 1959 1960 You shouldn't run install.bash twice. 1961 To update your installation, you can run update.bash - this updates your code from CVS, and re-makes all the java stuff. 
1962 1963 \subsection{Windows install} 1964 2112 \newpage 1965 2113 \section{Format statements: Greenstone 2 vs Greenstone 3}\label{app:format} 1966 2114 The following table shows the Greenstone 2 format elements, and their equivalents in Greenstone 3 1967 2115 \begin{table} 2116 \caption{Greenstone 3 equivalents of Greenstone 2 format statements} 2117 {\footnotesize 1968 2118 \begin{tabular}{ll} 2119 \hline 1969 2120 \bf Greenstone 2 & \bf Greenstone 3 \\ 2121 \hline 1970 2122 \gst{[Text]} & \gst{<gsf:text/>} \\ 1971 2123 \gst{[num]} & \gst{<gsf:metadata name='docnum'/>}\\ … … 1999 2151 & \gst{ <td><gsf:metadata name='Subject'/></td>}\\ 2000 2152 & \gst{ </gsf:when></gsf:switch>}\\ 2001 \end{tabular} 2002 2153 \hline 2154 \end{tabular}} 2155 \end{table} 2003 2156 \end{document}