Changeset 6422


Timestamp:
2004-01-09T18:06:22+13:00
Author:
kjdon
Message:

more changes

File:
1 edited

  • trunk/gsdl3/docs/manual/manual.tex

    r6343 r6422  
    5555
    5656This documentation consists of several parts. Section~\ref{sec:install} covers Greenstone installation, how to access the library, and some administration issues. Section~\ref{sec:user} looks at using the sample collections, creating new collections, and how to make small customisations to the interface. The remaining sections are aimed at the Greenstone developer. Section~\ref{sec:develop-runtime} describes the run-time system, including the structure of the software and the message format, while Section~\ref{sec:develop-build} describes the collection building process. Section~\ref{sec:new-features} describes how to add new features to Greenstone, such as new services, new page types, and new plugins for different document formats. Section~\ref{sec:distributed} describes how to make Greenstone run in a distributed fashion, using SOAP as an example communications protocol. Finally, there are several appendices, including how to install Greenstone from CVS, and a comparison of Greenstone 2 and Greenstone 3 format statements.
    57 
     57\newpage
    5858\section{Greenstone installation and administration}\label{sec:install}
    5959
    6060This section covers where to get Greenstone 3 from, how to install it and how to run it. The standard method of running Greenstone is as a Java servlet. We provide the Tomcat servlet container to serve the servlet. Standard web servers may be able to be configured to provide servlet support, removing the need to use Tomcat; please see your web server documentation for this. This documentation assumes that you are using Tomcat. To access Greenstone, Tomcat must be started, and the library can then be accessed via a web browser.
    6161
     62
     63\subsection{Get and install Greenstone}
     64
     65Greenstone is available from www.... There are currently two distributions: a self-installing tar for Linux, and a Windows executable.
     66
    6267Greenstone is also available through CVS (Concurrent Versions System). This provides the latest development version, which is not guaranteed to be stable. Appendix~\ref{app:cvs} describes how to download and install Greenstone from CVS.
    6368
    64 \subsection{Get and install Greenstone}
    65 
    66 Greenstone is available from www.... There are currently two distributions: a self-installing tar for Linux, and a Windows executable.
    67 
    6869\subsubsection{Linux}
    69 
    70 Download the file gsdl3-0.01-unix.sh. Then run it in a shell (./gsdl3-0.01-unix.sh). It will prompt you for where to install greenstone to, the name of your computer, what port to run tomcat on... Once Greenstone has been installed, you can start the library  by running ./gsdl3.sh, and opening up a browser pointing to localhost:8080/gsdl3 (or different computer name and port).
     70** add more once installer finished **
     71
     72Download the latest version of the self-installing tar file, gsdl3-x.xx-unix.sh, and run it in a shell (./gsdl3-x.xx-unix.sh). It will prompt you for where to install Greenstone, the name of your computer, what port to run Tomcat on... Once Greenstone has been installed, you can start the library by running ./gsdl3.sh and pointing a browser at localhost:8080/gsdl3 (or the computer name and port you chose during installation).
    7173
    7274\subsubsection{Windows}
    73 
    74 Download the gsdl3-0.01-win32.exe file and double click it to start the installation. You will be prompted for ... Once Greenstone is installed, you can access the library by selecting Greenstone 3 Digital Library in the Start menu.
    75 
    76 \subsubsection{Other notes}
    77 
    78 To run Greenstone we are starting up the Tomcat server, and a mysql database server.
    79 
    80 Once Greenstone has been installed, you can run tomcat, and access it in a browser at\gst{http://localhost:8080/gsdl3}---this gets you to a welcome page. From here, you can select to run the test servlet, the standard library servlet, and the remote servlet.
    81 
    82 \noindent Note: Tomcat must be shutdown and restarted any time you make changes in the following for those changes to take effect:\\
     75** add more once installer finished **
     76
     77Download the latest Windows executable, gsdl3-x.xx-win32.exe, and double click it to start the installation. You will be prompted for ... Once Greenstone is installed, you can access the library by selecting Greenstone 3 Digital Library in the Start menu.
     78
     79\subsubsection{Accessing the library in a browser}
     80
     81Once you have started up the library (see the previous sections for OS-dependent instructions), you can access it in a browser at http://localhost:8080/gsdl3 (or http://your-computer-name:your-chosen-port/gsdl3). This gets you to a welcome page with three links: one to run a test servlet (this allows you to check that Tomcat is running properly), one to run the standard library servlet using the site localsite, and one to run a library servlet using the site soapsite. The soapsite site uses a SOAP connection to communicate with localsite, and demonstrates the library working in a distributed fashion. See Section~\ref{sec:distributed} for details about how to run Greenstone in a distributed fashion.
     82
     83\subsection{How the library works}
     84
     85The standard library program is a Java servlet.
     86
     87Other types of interfaces can be used, such as Java GUI programs. See Section~\ref{sec:new-interfaces} for details about how to make these.
     88
     89\subsubsection{Restarting the library}
     90
     91The library program (actually tomcat) can be restarted by ... (** put a mechanism in each install program **).
     92
     93
     94Tomcat must be shut down and restarted whenever you change any of the following, for those changes to take effect:\\
    8395\begin{bulletedlist}
    8496\begin{gsc}
     
    88100\item any classes or jar files used by the servlets
    89101\end{bulletedlist}
    90 \noindent Note: stdin and stdout for the servlets both go to\\
     102\noindent Note: stdout and stderr for the servlets both go to\\
    91103\gst{\gsdlhome/comms/jakarta/tomcat/logs/catalina.out}
    92104
     
    103115\caption{The Greenstone directory structure}
    104116\label{tab:dirs}
    105 \center{\footnotesize
     117{\footnotesize
    106118\begin{tabular}{l p{8cm}}
     119\hline
     120\bf directory & \bf description \\
    107121\hline
    108122gsdl3
     
    175189where they live, what's the difference, what each contains.\\
    176190
    177 There are two Greenstone {\em sites} that come with the checkout: localsite, and soapsite. localsite has three collections, while soapsite has none. Each site has a configuration file which specifies the site name, site-wide services if any, and a list of remote sites to connect to.
    178 localsite does not connect to any other sites. soapsite specifies a SOAP connection to localsite.
    179 
    180 Talk here about matching site with an interface.
    181 
    182 The file \gst{\gsdlhome/web/WEB-INF/web.xml} contains the setup information for Tomcat---tells it what servlets to load, what initial parameters to pass them, and what web names map to the servlets.
    183 There are three servlets specified in web.xml: one is a test servlet that just prints ``hello greenstone'' to a web page. This is useful if you are having trouble getting Tomcat set up. The other two are Greenstone library servlets, {\em library}, which serves localsite, and {\em library1} which serves soapsite.
     191A site comprises a set of collections and possibly services. An interface is a set of images along with a set of XSLT files used for translating XML output from the library into an appropriate form (HTML in the servlet case).
     192One Greenstone installation can have many sites and interfaces. One instantiation of a servlet uses one site and one interface. Sites and interfaces can be matched up in different ways. For example, a single site might be served with two different interfaces. This provides different modes of access to the same content, e.g. HTML vs WML, or a completely different look and feel for different audiences. Conversely, a standard interface may be used with many different sites, providing a consistent mode of access to a lot of different content.
     193
     194Collections live in the collect directory of a site. Any collections that are found in this directory when the servlet is initialised will be loaded and presented to the user. Collections require valid configuration files, but apart from this, nothing needs to be done to the site to use new collections. Collections added while Tomcat is running will not be picked up automatically: you can either restart the server or send a configuration request to the servlet. Configuration requests are described in Section~\ref{sec:runtime-config}.
     195
     196There are two Greenstone sites that come with the distribution: localsite and soapsite. localsite has several demo collections, while soapsite has none. soapsite specifies that a SOAP connection should be made to localsite. Getting this to work involves setting up a SOAP server for localsite: see Section~\ref{sec:distributed} for details.
     197
     198Each site and interface has a configuration file which specifies parameters for the site or interface---these are described in Section~\ref{sec:config}.
     199
     200The file \gst{\gsdlhome/web/WEB-INF/web.xml} contains the setup information for Tomcat. It tells Tomcat what servlets to load, what initial parameters to pass them, and what web names map to the servlets.
     201There are three servlets specified in web.xml (these correspond to the three links in the Greenstone welcome page): one is a test servlet that just prints ``hello greenstone'' to a web page. This is useful if you are having trouble getting Tomcat set up. The other two are Greenstone library servlets, {\em library}, which serves localsite, and {\em library1}, which serves soapsite. Both of these servlets use the standard interface (called {\em default}).
    184202
    185203\begin{table}
    186204\caption{Greenstone servlet initialisation parameters}
    187205\label{tab:serv-init}
     206{\footnotesize
    188207\begin{tabular}{llp{5cm}}
     208\hline
    189209\bf name & \bf sample value & \bf description \\
    190210\hline
    191 gsdl3home & /research/kjdon/gsdl3 & the base directory of the gsdl3 installation \\
    192 sitename & localsite & the site to use \\
    193 interfacename & default & the interface to use\\
    194 libraryname & library & the name of the library program \\
    195 defaultlang & en & the default language for the interface\\
    196 receptionist & NZDLReceptionist & (optional) specifies an alternative Receptionist to use\\
    197 messagerouter & NewMessageRouter & (optional) specifies an alternative MessageRouter to use\\
    198 \hline
    199 \end{tabular}
     211gsdl3\_home & /research/kjdon/gsdl3 & the base directory of the gsdl3 installation \\
     212site\_name & localsite & the name of the site to use \\
     213interface\_name & default & the name of the interface to use\\
     214library\_name & library & the web name of the servlet \\
     215default\_lang & en & the default language for the interface\\
     216receptionist\_class & NZDLReceptionist & (optional) specifies an alternative Receptionist to use\\
     217messagerouter\_class & NewMessageRouter & (optional) specifies an alternative MessageRouter to use\\
     218\hline
     219\end{tabular}}
    200220\end{table}
    201221
    202 The initialisation parameters used by the library servlets are shown in Table~\ref{tab:serv-init}. The most important parameters are sitename and interface name. Each servlet running uses one site and one interface. You can run multiple servlets, all using different combinations of site and interface.
    203 
    204 \subsection{Configuring a greenstone installation}
     222The initialisation parameters used by the library servlets are shown in Table~\ref{tab:serv-init}. This is where you define what site and interface each servlet uses. Any number of servlets can be specified here. See Appendix~\ref{app:tomcat} for more details about Tomcat.
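
As an illustration, the following is a minimal sketch of what one library servlet entry in web.xml might look like, using the parameter names and sample values from Table~\ref{tab:serv-init}. The element names follow the standard servlet deployment descriptor format. The servlet class name and the URL pattern are assumptions made for this example, so check them against the web.xml shipped with your installation.

\begin{gsc}\begin{verbatim}
<!-- sketch only: servlet-class and url-pattern are assumed -->
<servlet>
  <servlet-name>library</servlet-name>
  <servlet-class>org.greenstone.gsdl3.LibraryServlet</servlet-class>
  <init-param>
    <param-name>gsdl3_home</param-name>
    <param-value>/research/kjdon/gsdl3</param-value>
  </init-param>
  <init-param>
    <param-name>site_name</param-name>
    <param-value>localsite</param-value>
  </init-param>
  <init-param>
    <param-name>interface_name</param-name>
    <param-value>default</param-value>
  </init-param>
  <init-param>
    <param-name>library_name</param-name>
    <param-value>library</param-value>
  </init-param>
  <init-param>
    <param-name>default_lang</param-name>
    <param-value>en</param-value>
  </init-param>
</servlet>

<servlet-mapping>
  <servlet-name>library</servlet-name>
  <url-pattern>/library</url-pattern>
</servlet-mapping>
\end{verbatim}\end{gsc}

To serve the same site through a second interface, one would add another servlet entry with a different servlet name and library\_name, the same site\_name, and a different interface\_name.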
     223
     224
     225\subsection{Configuring a Greenstone installation}\label{sec:config}
    205226
    206227Initial Greenstone3 system configuration is determined by a set of configuration files, all expressed in XML. Each site has a configuration file that binds parameters for the site, \gst{siteConfig.xml}. Each interface has a configuration file, \gst{interfaceConfig.xml}, that specifies Actions for the interface. Collections also have several configuration files; these are discussed in Section~\ref{sec:collconfig}.
    207 The configuration files are read in when the system is initialised, and their contents are cached in memory. This means that changes made to these files once the system is running will not take immediate effect. Tomcat needs to be restarted for changes to the interface configuration file to take effect. However, changes to the site configuration file can be incorporated sending a CGI-type command to the library. CGI command can be sent to the library are made to the interface configuration file, tomcat needs to be restarted. There are a series of CGI-type commands that can be sent to the library to induce reconfiguration of different modules, including reloading the whole site. This removes the need to shutdown and restart the system to reflect these changes. These commands are described in Section~\ref{sec:runtime-config}.
     228The configuration files are read in when the system is initialised, and their contents are cached in memory. This means that changes made to these files once the system is running will not take immediate effect. Tomcat needs to be restarted for changes to the interface configuration file to take effect. However, changes to the site configuration file can be incorporated by sending a CGI-type command to the library. There is a series of CGI-type commands that can be sent to the library to induce reconfiguration of different modules, including reloading the whole site. This removes the need to shut down and restart the system to reflect these changes. These commands are described in Section~\ref{sec:runtime-config}.
    208229
    209230\subsubsection{Site configuration file}\label{sec:siteconfig}
     
    214235collections directory.
    215236
    216 The HTTP address is used for retrieving resources from a site outside the XML protocol. Because a site is HTTP accessible, any files (e.g. images) belonging to that site or to its collections can be specified in the HTML of a page by a URL. This avoids having to retrieve these files from a remote site via the XML protocol\footnote{Currently, sites live inside the Tomcat gsdl3 root context, and therefore all their content is accessible over HTTP via the Tomcat address. We need to see if parts can be restricted. Also, if we use a different protocol, then resources from remote sites may need to come through the XML. Also, if we are running locally without using Tomcat, we may want to get them via file:// rather than http://.}.
     237The HTTP address is used for retrieving resources from a site outside the XML protocol. Because a site is HTTP accessible through Tomcat, any files (e.g. images) belonging to that site or to its collections can be specified in the HTML of a page by a URL. This avoids having to retrieve these files from a remote site via the XML protocol\footnote{Currently, sites live inside the Tomcat gsdl3 root context, and therefore all their content is accessible over HTTP via the Tomcat address. We need to see if parts can be restricted. Also, if we use a different protocol, then resources from remote sites may need to come through the XML. Also, if we are running locally without using Tomcat, we may want to get them via file:// rather than http://.}.
    217238 
    218239Figure~\ref{fig:siteconfig} shows two example site configuration files. The first example is for a rudimentary site with no site-wide services,
     
    241262        <metadata name="Title">Collection builder</metadata>
    242263        <metadata name="Description">Builds collections in a
    243                 gsdl2-style manner</metadata>
     264           gsdl2-style manner</metadata>
    244265      </metadataList>
    245266      <serviceRackList>
     
    270291      <subaction name='home' xslt='home.xsl'/>
    271292      <subaction name='about' xslt='about.xsl'/>
     293      <subaction name='help' xslt='help.xsl'/>
     294      <subaction name='pref' xslt='pref.xsl'/>
    272295    </action>
    273296    <action name='q' class='QueryAction' xslt='basicquery.xsl'/>
    274     <action name='b' class='BrowseAction' xslt='classifier.xsl'/>
     297    <action name='b' class='GS2BrowseAction' xslt='classifier.xsl'/>
    275298    <action name='a' class='AppletAction' xslt='applet.xsl'/>
    276299    <action name='d' class='DocumentAction' xslt='document.xsl'/>
     300    <action name='xd' class='XMLDocumentAction'>
     301      <subaction name='toc' xslt='document-toc.xsl'/>
     302      <subaction name='text' xslt='document-content.xsl'/>
     303    </action>
    277304    <action name='pr' class='ProcessAction' xslt='process.xsl'/>
    278305    <action name='s' class='SystemAction' xslt='system.xsl'/>
     
    280307</interfaceConfig>
    281308\end{verbatim}\end{gsc}
    282 \caption{A sample interface configuration file}
     309\caption{Default interface configuration file}
    283310\label{fig:ifaceconfig}
    284311\end{figure}
     
    301328\caption{Example run-time configuration arguments.}
    302329\label{tab:run-time config}
     330{\footnotesize
    303331\begin{tabular}{lp{8cm}}
     332\hline
    304333\gst{a=s\&sa=c} & reconfigures the whole site, reads in siteConfig.xml, reloads all the collections. Just part of this can be specified with another argument \gst{ss} (system subset). The valid values are \gst{collectionList}, \gst{siteList}, \gst{serviceList}, \gst{clusterList}. \\
    305334\gst{a=s\&sa=c\&sc=XXX} & reconfigures the XXX collection or cluster. \gst{ss} can also be used here, valid values are \gst{metadataList} and \gst{serviceList}. \\
    306335\gst{a=s\&sa=a} & (re)activate a specific module. Modules are specified using two arguments, \gst{st} (system module type) and \gst{sn} (system module name). Valid types are \gst{collection}, \gst{cluster}, \gst{site}.\\
    307336\gst{a=s\&sa=d} & deactivate a module. \gst{st} and \gst{sn} can be used here too. Valid types are \gst{collection}, \gst{cluster}, \gst{site}, \gst{service}. Modules are removed from the current configuration, but will reappear if Tomcat is restarted.\\
    308 \gst{a=s\&sa=d\&sc=XXX} & deactivate a module belonging to the XXX collection or cluster. \gst{st} and \gst{sn} can be used here too. Valid types are \gst{service}. \\\end{tabular}
     337\gst{a=s\&sa=d\&sc=XXX} & deactivate a module belonging to the XXX collection or cluster. \gst{st} and \gst{sn} can be used here too. Valid types are \gst{service}. \\
     338\hline
     339\end{tabular}}
    309340\end{table}
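
As a sketch of how the arguments in Table~\ref{tab:run-time config} might be used, the URLs below assume the default installation address (localhost:8080/gsdl3) and a servlet with the web name library. The collection name mgppdemo is used purely as an example. Adjust all three to match your own setup.

\begin{gsc}\begin{verbatim}
reconfigure the whole site:
  http://localhost:8080/gsdl3/library?a=s&sa=c
reload only the collection list:
  http://localhost:8080/gsdl3/library?a=s&sa=c&ss=collectionList
reconfigure the mgppdemo collection:
  http://localhost:8080/gsdl3/library?a=s&sa=c&sc=mgppdemo
activate the mgppdemo collection:
  http://localhost:8080/gsdl3/library?a=s&sa=a&st=collection&sn=mgppdemo
deactivate the mgppdemo collection:
  http://localhost:8080/gsdl3/library?a=s&sa=d&st=collection&sn=mgppdemo
\end{verbatim}\end{gsc}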
    310 
     341\newpage
    311342\section{Using Greenstone 3}\label{sec:user}
    312343
     
    367398\subsection{Collection configuration files}\label{sec:collconfig}
    368399
    369 Each collection has two configuration files, \gst{collectionConfig.xml} and \gst{buildConfig.xml}, that give metadata, display and other information for the
     400Each collection has two, or possibly three, configuration files: \gst{collectionConfig.xml}, \gst{buildConfig.xml}, and optionally \gst{collectionInit.xml}. These give metadata, display and other information for the
    370401collection.\footnote{\gst{siteConfig.xml} and \gst{interfaceConfig.xml} are new for Greenstone3, while \gst{collectionConfig.xml} and \gst{buildConfig.xml} replace \gst{collect.cfg} and \gst{build.cfg} in
    371402Greenstone2.}  The first includes user-defined presentation metadata for the collection,
     
    374405the build-time process and includes any metadata that can be determined
    375406automatically. It also includes configuration information for any ServiceRacks needed by the collection.
     407
     408\subsubsection{collectionInit.xml}
     409
     410This optional file specifies a new collection class if the standard one is not to be used. The only syntax so far is the class name:
     411
     412\begin{gsc}\begin{verbatim}
     413<collectionInit class="XMLCollection"/>
     414\end{verbatim}\end{gsc}
     415
     416Section~\ref{sec:new-coll-types} describes an example collection where this file is used. Depending on the type of collection that this is used for, one or both of the other config files may not be needed.
     417
     418\subsubsection{collectionConfig.xml}
    376419
    377420The collection configuration file is where the collection designer (e.g. a librarian) decides what form the collection should take. This includes the collection metadata such as title and description, and also includes what indexes and browsing structures should be built. The format of \gst{collectionConfig.xml} is still under consideration. However, Figure~\ref{fig:collconfig} shows the parts of it that have been defined so far. (Since collection building at this stage is still done using Greenstone2 Perl scripts and the old \gst{collect.cfg} file, we have only defined the format for the parts of \gst{collectionConfig.xml} that are used by the runtime-system.)
     
    418461    <classifier name="CL4">
    419462      <format>
    420     <gsf:template match="documentNode">
     463        <gsf:template match="documentNode">
    421464          <br /><gsf:link><gsf:metadata name='Keyword' />
    422465            </gsf:link></gsf:template>
     
    436479The \gst{<display>} element contains optional formatting information for the display of documents. Templates that can be specified here include \gst{documentHeading} and \gst{DocumentContent}. Other information that could be specified (in a yet-to-be-decided format) includes whether or not to display the cover image, table of contents, etc.
    437480
    438 \subsection{Building configuration file}\label{sec:buildconfig}
     481\subsection{buildConfig.xml}\label{sec:buildconfig}
    439482
    440483The file \gst{buildConfig.xml} is produced by the collection building process, and contains  metadata and other information about the collection that can
     
    491534</buildConfig>
    492535\end{verbatim}\end{gsc}
    493 \caption{Sample buildConfig.xml file}
     536\caption{Sample buildConfig.xml file (mgppdemo collection)}
    494537\label{fig:buildconfig}
    495538\end{figure}
     
    529572\caption{Format elements for GSF format language}
    530573\label{tab:gsf-format}
    531 \begin{tabular}{ll}
     574{\footnotesize
     575\begin{tabular}{p{6.5cm}p{6.5cm}}
     576\hline
    532577\bf Element       & \bf Description \\
    533 \gst{<gsf:text/>} & The document's  text\\
     578\hline
     579\gst{<gsf:text/>} & The document's text\\
    534580\gst{<gsf:link>...</gsf:link>} & The HTML link to the document itself \\
    535 \gst{<gsf:link type='document'>...</gsf:link>} & Same as above\\
    536 \gst{<gsf:link type='classifier'>...</gsf:link>} & A link to a classification node (use in classifierNode templates)\\
    537 \gst{<gsf:link type='source'>...</gsf:link>} & The HTML link to the original file---set for documents that have been converted from e.g. Word, PDF, PS \\
     581\gst{<gsf:link type='document'>...
     582</gsf:link>} & Same as above\\
     583\gst{<gsf:link type='classifier'>...
     584</gsf:link>} & A link to a classification node (use in classifierNode templates)\\
     585\gst{<gsf:link type='source'>...
     586</gsf:link>} & The HTML link to the original file---set for documents that have been converted from e.g. Word, PDF, PS \\
    538587\gst{<gsf:icon/>}  & An appropriate icon\\
    539588\gst{<gsf:icon type='document'/>} & same as above\\
     
    549598</gsf:choose-metadata>}
    550599 & A choice of metadata. Will select the first existing one. The metadata elements can have the select, separator and multiple attributes like normal.\\
    551 \gst{<gsf:switch preprocess='preprocess-type'>
    552 <gsf:metadata name='Title'/><gsf:when test='test-type' test-value='xxx'>.....</gsf:when><gsf:when test='test-type' test-value='xxx'>...</gsf:when><gsf:otherwise>...</gsf:otherwise></gsf:switch>} & switch on the value of a particular metadata - the metadata is specified in gsf:metadata, has the same attributes as normal.\\
    553 \end{tabular}
     600\gst{<gsf:switch preprocess=
     601'preprocess-type'>
     602<gsf:metadata name='Title'/>
     603<gsf:when test='test-type'
     604test-value='xxx'>...</gsf:when>
     605<gsf:when test='test-type'
     606test-value='yyy'>...</gsf:when>
     607<gsf:otherwise>...</gsf:otherwise>
     608</gsf:switch>} & Switch on the value of a particular piece of metadata. The metadata is specified in gsf:metadata, which has the same attributes as normal.\\
     609\hline
     610\end{tabular}}
    554611\end{table}
    555612
     
    562619To get the previous metadata, the format statement would have the following in it:
    563620
    564 \gst{<gsf:metadata name='Title' select='ancestors' separator='; '/>; <gsf:metadata name='Title'/>}
     621\begin{gsc}
     622\begin{verbatim}
     623<gsf:metadata name='Title' select='ancestors' separator='; '/>;
     624    <gsf:metadata name='Title'/>
     625\end{verbatim}
     626\end{gsc}
    565627
    566628\begin{table}
    567629\caption{Select types for metadata format elements}
    568630\label{tab:gsf-select-types}
     631{\footnotesize
    569632\begin{tabular}{ll}
    570633\hline
     
    579642descendents & All the descendent sections\\
    580643\hline
    581 \end{tabular}
     644\end{tabular}}
    582645\end{table}
    583646
     
    585648\begin{gsc}
    586649\begin{verbatim}
    587 <gsf:choose-metadata><gsf:option name='dc.Title'/><gsf:option name='dls.Title'/><gsf:option name='Title'/></gsf:choose-metadata>
     650<gsf:choose-metadata>
     651  <gsf:option name='dc.Title'/>
     652  <gsf:option name='dls.Title'/>
     653  <gsf:option name='Title'/>
     654</gsf:choose-metadata>
    588655\end{verbatim}
    589656\end{gsc}
     
    596663\begin{verbatim}
    597664<gsf:switch metadata='Organization' preprocess='toLower;stripSpace'>
    598   <gsf:when test='equals' test-value='bostid'><!-- output BOSTID image --></gsf:when>
    599   <gsf:when test='equals' test-value='worldbank'><!-- output world bank image --></gsf:when>
     665  <gsf:when test='equals' test-value='bostid'>
     666     <!-- output BOSTID image --></gsf:when>
     667  <gsf:when test='equals' test-value='worldbank'>
     668     <!-- output world bank image --></gsf:when>
    600669  <gsf:otherwise><!-- output default image--></gsf:otherwise>
    601670</gsf:switch>
     
    620689  <search>
    621690    <format> <!--Put here templates related to searching and
    622     the query page. The common one is the documentNode
    623     template -->
     691        the query page. The common one is the documentNode
     692        template -->
    624693      <gsf:template match='documentNode'>...</gsf:template>
    625694    </format>
     
    628697    <classifier name='xx'>
    629698      <format><!-- put here templates related to formatting a
    630     particular classifier page. Common ones are documentNode
    631     and classifierNode templates-->
     699        particular classifier page. Common ones are documentNode
     700        and classifierNode templates-->
    632701        <gsf:template match='documentNode'>...</gsf:template>
    633702        <gsf:template match='classifierNode'>...</gsf:template>
    634703        <gsf:template match='classifierNode' mode='horizontal'>...
    635       </gsf:template>
     704          </gsf:template>
    636705      </format>
    637706    </classifier>
     
    640709  <display>
    641710    <format><!-- here goes any formatting relating to the display
    642     of the documents. These are generally named templates,
    643     and format options -->
     711        of the documents. These are generally named templates,
     712        and format options -->
    644713      <gsf:template name='documentContent'>...</gsf:template>
    645714      <gsf:option name='TOC' value='true'/>
     
    668737\caption{Formatting options}
    669738\label{tab:format_options}
    670 \center{\footnotesize
     739{\footnotesize
    671740\begin{tabular}{llp{5cm}}
    672741\hline
     
    684753
    685754
    686 \subsection{Customising the interface}
     755\subsection{Customising the interface}\label{sec:interface-customise}
    687756
    688757The interface can be customised in several ways.
     
    717786To use a new interface, the tomcat web.xml must be edited: either change the interface that a current version of the servlet is using, or add another servlet instantiation to the file (see Section~\ref{sec:sites-and-ints} or Appendix~\ref{app:tomcat}). The Tomcat server must be restarted for this to take effect.
    718787
    719 
     788\newpage
    720789\section{Developing Greenstone 3: Run-time system}\label{sec:develop-runtime}
    721790
     
    835904
    836905\begin{table}
    837 \center{\footnotesize
     906{\footnotesize
    838907\begin{tabular}{lll}
    839908\hline
     
    11751244\caption{Status codes currently used in Greenstone 3}
    11761245\label{tab:status codes}
     1246{\footnotesize
    11771247\begin{tabular}{llp{8cm}}
     1248\hline
    11781249\bf code name & \bf code  & \bf meaning \\
    11791250& \bf value & \\
     1251\hline
    11801252SUCCESS &  1 & the request was accepted, and the process was  completed \\
    11811253ACCEPTED & 2 & the request was accepted, and the process has been started, but it is not completed yet \\
     
    11851257HALTED & 12 & the process has stopped  \\
    11861258INFO & 20 & just an info message that doesn't imply anything \\
    1187 \end{tabular}
     1259\hline
     1260\end{tabular}}
    11881261\end{table}
    11891262
     
    16941767\caption{Configure CGI arguments}
    16951768\label{tab:system-cgi}
     1769{\footnotesize
    16961770\begin{tabular}{ll}
    16971771\hline
    16981772\bf arg & \bf description\\
     1773\hline
    16991774a=s & system action\\
    17001775sa=c$|$a$|$d & type of system request: c (configure), a (add/activate), \\
     
    17091784st=collection& \\
    17101785\hline
    1711 \end{tabular}
     1786\end{tabular}}
    17121787\end{table}
    17131788
     
    17171792\caption{The utility classes in org.greenstone.gsdl3.util}
    17181793\label{tab:utils}
    1719 \center{\footnotesize
     1794{\footnotesize
    17201795\begin{tabular}{lp{3.75in}}
    17211796\hline
    17221797\bf Utility class & \bf Description\\
     1798\hline
    17231799ConfigVars & holds the servlet startup variables, including library name, site name, interface name, default language\\
    17241800Dictionary & wrapper around a Resource Bundle, providing strings with parameter\\
     
    17361812XSLTUtil & contains static methods to be called from within XSLT \\
    17371813\hline
    1738 \end{tabular}
    1739 }
     1814\end{tabular}}
    17401815\end{table}
    17411816
     1817\newpage
    17421818\section{Collection building architecture}\label{sec:develop-build}
    17431819**** GEORGE ****
     
    17461822modules API\\
    17471823
     1824\newpage
    17481825\section{Developing Greenstone 3: Adding new features}\label{sec:new-features}
    17491826
    1750 \subsection{Creating new services}
     1827\subsection{Creating new services}\label{sec:new-services}
    17511828
    17521829* Inherit from ServiceRack, the abstract base class. This handles the main process method, and determines the service name and request type. If the request type is describe, and to is empty, it returns a list of services (short\_service\_info), which is initialised in the configure method. A describe request to a particular service results in getServiceDescription being called, which must be supplied by the subclass.
     
    17601837
    17611838* should a metadata retrieval service advertise what metadata is available??
    1762 \subsection{creating new actions/pages}
    1763 
    1764 \subsection{new interfaces}
     1839\subsection{creating new actions/pages}\label{sec:new-pages}
     1840
     1841\subsection{new interfaces}\label{sec:new-interfaces}
    17651842e.g. java interface. where you can interface to. MR vs Receptionist. diff receptionists. egs, handheld - using servlet, transforming recpt, but new set of XSLT java program other program - talk to recpt but just get back XML data for pages. java gui - just talk to MR, do all processing itself.
    17661843
    1767 \subsection{Adding new classifiers}
     1844\subsection{Adding new classifiers}\label{sec:new-classifiers}
    17681845*** GEORGE ***
    1769 \subsection{Adding new plugins}
     1846\subsection{Adding new plugins}\label{sec:new-plugins}
    17701847*** GEORGE ***
    17711848
    1772 \subsection{Documented examples}
    1773 
    1774 talk about the sample collections briefly - but most documentation is in the description
    1775 \subsubsection{The NZDL web interface}
    1776 
    1777 We have created a second interface that can be seen at \gst{http://www.greenstone.org/greenstone3/nzdl}. There are some small differences between this and the standard greenstone interface.
    1778 We created a new interface---called nzdl, put into the web/interfaces directory. It has a set of images and transform files like the standard interface. And most of the XSLT files have been overridden.
    1779 
    1780 * Along the navigation bar, it has search and classifiers. The standard interface has each service along there. We needed to modify the navigation bar XSLT code, but also we added a new receptionist.
    1781 interface found at www...
    1782 what did we have to do to get this interface?
    1783 classifiers displayed instead of services, query services all have same button, hard coded query page.
    1784 assumptions made, classes modified - new Receptionist, new XSLT.
    1785 
    1786 
    1787 
     1849\subsection{New types of collections}\label{sec:new-coll-types}
     1850
     1851There are two types of standard Greenstone collections: collections built with the Greenstone 3 building system, and collections that are imported from Greenstone 2. There are many options for collection building, but it is conceivable that these options don't meet the needs of all collection builders. Greenstone 3 can use any type of collection you can come up with, provided that some Java code is supplied.
     1852
     1853
     1854There are four levels of customisation that may be needed with new collections: service, collection, interface XSLT, and action levels. We will use the example collections that come with Greenstone to describe these different levels.
     1855
     1856Firstly, new service classes need to be written to provide the functionality to search, browse, or otherwise access the collection. If the services have similar interfaces and functionality to the standard services, this may be all that is needed. For example, the Greenstone 2 MGPP collections were the first to be served in Greenstone 3. When we came to do Greenstone 2 MG collections, all we had to do was write some new service classes that interacted with MG instead of MGPP, because these collections used the same type of services. The format of the configuration files was similar; they just specified MG ServiceRack classes rather than MGPP ones.
     1857
     1858The nzmaps collection used the same level of customisation, just implementing new services and fitting all the extra display elements into the standard query/display framework using javascript.
     1859
     1860The gberg collection, however, was done quite differently to the standard collections. New services were provided to search the database (built with Lucene) and to provide the documents and parts of documents (using XSLT to transform the raw XML files). The collectionConfig file had some extra information in it: a list of the documents in the collection along with their Titles. Because the standard collection class has no notion of document lists, a new class was created (org.greenstone.gsdl3.collection.XMLCollection). This class is basically the same as a standard collection class except that it looks for and stores in memory the documentList from the collectionConfig file.
     1861
     1862To tell Greenstone to load up a different type of collection class, we use another configuration file: etc/collectionInit.xml. This  specifies the name of the collection class to use.
     1863Currently, this is all that is specified in that file, but you may want to add parameters for the class etc.
     1864
     1865\gst{<collectionInit class="XMLCollection"/>}
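The way such a file could be read is sketched below in Java. This is only an illustration, not the actual Greenstone loading code: the class name \gst{CollectionInitSketch}, the method \gst{collectionClassName}, and the fallback name \gst{Collection} for the standard collection class are all assumptions made for the example. Only the \gst{collectionInit} element, its \gst{class} attribute, and the package \gst{org.greenstone.gsdl3.collection} come from the text above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class CollectionInitSketch {

    // Assumption: the name of the standard collection class, used as a fallback.
    static final String DEFAULT_CLASS = "Collection";

    // Package taken from the XMLCollection example in the text.
    static final String PACKAGE_PREFIX = "org.greenstone.gsdl3.collection.";

    /** Returns the fully qualified collection class named by a collectionInit document. */
    static String collectionClassName(String collectionInitXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(collectionInitXml)));
            Element root = doc.getDocumentElement();
            // getAttribute returns the empty string when the attribute is absent
            String name = root.getAttribute("class");
            if (name.isEmpty()) {
                name = DEFAULT_CLASS;
            }
            return PACKAGE_PREFIX + name;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse collectionInit", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<collectionInit class=\"XMLCollection\"/>";
        // A real loader would go on to instantiate this class by reflection.
        System.out.println(collectionClassName(xml));
    }
}
```

Keeping the class name in a configuration file means that a new collection type can be introduced without touching the code that loads collections.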
     1866
     1867The display for the collection is also quite different. The home page for the collection displays the list of documents. To achieve this, the describe response from the collection had to include the list, and a new XSLT was written for the collection that displayed this. Collection XSLT should be put in the transform directory of the collection\footnote{Collection XSLT files are currently only used when running Greenstone in a non-distributed fashion, but support for the distributed case will be added properly at some stage}.
     1868
     1869Document display is significantly different to standard Greenstone. There are two modes of display: table of contents mode, and content mode. Clicking on a document link from the collection home page takes the user to the table of contents for that document. Clicking on one of the sections in the table of contents takes them to a display of that section. To facilitate this, we needed not only new XSLT files but also a new action. XMLDocumentAction was created, which uses two subactions, toc and text, for the different modes of display.
     1870
     1871The Receptionist was told about this new action by the addition of the following to the interfaceConfig.xml file:
     1872
     1873\begin{gsc}\begin{verbatim}
     1874<action name='xd' class='XMLDocumentAction'>
     1875  <subaction name='toc' xslt='document-toc.xsl'/>
     1876  <subaction name='text' xslt='document-content.xsl'/>
     1877</action>
     1878\end{verbatim}\end{gsc}
     1879
     1880XSLT files are linked to subactions rather than the action as a whole. The collection supplies the two XSLT files written appropriately for the data it contains.
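
The mapping from subaction to XSLT file can be pictured with the following Java sketch, which reads an \gst{action} element like the one above and builds a lookup table. It is an illustration only: the class \gst{SubactionXsltSketch} and the method \gst{xsltBySubaction} are hypothetical names, and the real Receptionist may store this information quite differently.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class SubactionXsltSketch {

    /** Maps each subaction name of an action element to its XSLT file name. */
    static Map<String, String> xsltBySubaction(String actionXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(actionXml)));
            NodeList subactions =
                    doc.getDocumentElement().getElementsByTagName("subaction");
            // LinkedHashMap keeps the order in which the subactions are listed
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < subactions.getLength(); i++) {
                Element sub = (Element) subactions.item(i);
                result.put(sub.getAttribute("name"), sub.getAttribute("xslt"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse action element", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<action name='xd' class='XMLDocumentAction'>"
                + "<subaction name='toc' xslt='document-toc.xsl'/>"
                + "<subaction name='text' xslt='document-content.xsl'/>"
                + "</action>";
        // Look up the stylesheet for the table of contents mode
        System.out.println(xsltBySubaction(xml).get("toc"));
    }
}
```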
     1881
     1882All links that link to the documents have to be changed to use the xd action rather than the standard d action. These include the links from the home page, and the links from query results.
     1883
     1884Querying of the collection is almost the same as usual. The query service provides a list of parameters, does the query and then sends back a list of document identifiers. The standard query action was fine for this collection. The change occurs in the way that the results are displayed---this is accomplished using a format statement supplied in the collectionConfig file inside the search node.
     1885
     1886\begin{gsc}\begin{verbatim}
     1887<search>
     1888  <format>
     1889    <gsf:template match="documentNode">
     1890      <xsl:param name="collName"/>
     1891      <xsl:param name="serviceName"/>
     1892      <td>
     1893        <b><a href="{$library_name}?a=xd&amp;sa=text&amp;c={$collName}&
     1894            amp;d={@nodeID}&amp;p.a=q&amp;p.s={$serviceName}">
     1895             <xsl:choose>
     1896               <xsl:when test="metadataList/metadata[@name='Title']">
     1897                 <gsf:metadata name="Title"/>
     1898               </xsl:when>
     1899               <xsl:otherwise>(section)</xsl:otherwise>
     1900             </xsl:choose>
     1901           </a>
     1902         </b> from <b><a href="{$library_name}?a=xd&amp;sa=toc&amp;
     1903           c={$collName}&amp;d={@nodeID}.rt&amp;p.a=q&amp;p.s={$serviceName}">
     1904         <gsf:metadata name="Title" select="root"/></a></b>
     1905      </td>
     1906    </gsf:template>
     1907  </format>
     1908</search>
     1909\end{verbatim}\end{gsc}
     1910
     1911Instead of displaying an icon and the Title, it displays the Title of the section and the Title of the document. Both of these are linked to the document: the section title to the content of that section, the document title to the table of contents for the document. Because these require non-standard arguments to the library, these parts of the template are written in XSLT, not the Greenstone format language. As shown here, it is perfectly feasible to write a format statement that mixes XSLT with Greenstone format elements.
     1912
     1913The document display uses CSS to format the output. The stylesheets are kept in the collection and specified in the collection's XSLT files. The documents also specify DTD files. Due to the way we read in the XML files, Tomcat sometimes has trouble locating the DTDs. One option is to make all the links absolute links to files in the collection folder; the other option is to put them in Greenstone's DTD folder, gsdl3/resources/dtd.
     1914
     1915\subsection{The NZDL mirror site}
     1916
     1917The library seen at \gst{http://www.greenstone.org/greenstone3/nzdl} is like a mirror to \gst{http://www.nzdl.org}---it aims to present the same collections, in the same way but using Greenstone 3 instead of Greenstone 2. It uses a new site and a new interface. The web.xml file had a new servlet entry in it to specify the combination of nzdl site and interface.
     1918
     1919The site was created by making a directory called nzdl in the sites folder. A siteConfig file was created. Because it is running on Linux, we were able to link to all the collections in the old Greenstone installation. The convert\_coll\_from\_gs2.pl script was run over all the collections to produce the new XML configuration files.
     1920
     1921A new interface, also called nzdl, was created in the interfaces directory.
     1922In many cases, creating a new interface just requires the new images and XSLT to be added to the new directory (see Sections~\ref{sec:sites-and-ints} and \ref{sec:interface-customise}). This setup also required a bit more customisation.
     1923
     1924The standard Greenstone navigation bar lists all the services available for the collection. In Greenstone 2, the navigation bar provided the search option, and the different classifiers. This is not service specific, but hard coded to the search and classifiers. The XSLT that produced the navigation bar needed to be altered to produce this. But also, a new Receptionist was needed.
     1925The standard receptionist (DefaultReceptionist) gathers a little bit of extra info for each page of XML before transforming it: this is the list of services for the collection and their display information, allowing the services to be listed along the navigation bar. This is information that is needed by every page (except for the library home page) and therefore is obtained by the receptionist instead of by each action. The nzdl interface needed a bit more information than this: for the ClassifierBrowse service, if there was one, the list of classifiers and their display elements must be obtained. So a new Receptionist was written that inherited from DefaultReceptionist, and added this new info into the page.
     1926
     1927One of the servlet initialisation parameters is the receptionist class: this was added to the servlet definition in the web.xml file so that the LibraryServlet would load up the right receptionist class.
     1928
     1929
     1930\newpage
    17881931\section{Distributed Greenstone}\label{sec:distributed}
    17891932
    1790 \begin{figure}[t]
     1933Greenstone is designed to run in a distributed fashion. One Greenstone installation can talk to several sites on different computers. This requires some sort of communication protocol. Any protocol can be used; however, we have only implemented a simple SOAP protocol.
     1934
     1935more explanation..
     1936
     1937\begin{figure}[h]
    17911938  \centering
    17921939  \includegraphics[width=4in]{remote} %5.8
     
    17951942\end{figure}
    17961943
    1797 Greenstone is designed to run in a distributed fashion. One greenstone installation can talk to several sites on different computers. This requires some sort of communication protocol. Any protocol can be used, however we have only implemented a simple SOAP protocol.
    1798 
    1799 more explanation..
    1800 
    18011944We have used Apache SOAP for Java. This is run as a servlet in Tomcat.
    18021945If you have obtained Greenstone through CVS, you will need to install SOAP separately, as described in Appendix~\ref{app:soap-cvs}. Debugging SOAP is described in Appendix~\ref{app:soap-debug}.
    18031946
     1947\subsection{Serving a site using soap}
     1948what do we have to do?? resource file format, deploy the service etc.
    18041949
    18051950\appendix
    1806    
     1951
     1952\newpage
     1953\section{Using Greenstone 3 from CVS}\label{app:cvs}
     1954
     1955*** need to make sure building stuff is in here ***
     1956
     1957Greenstone 3 is also available via CVS. You can download the latest version of the code. This is not guaranteed to be stable; in fact, it is likely to be unstable. The advantage of using CVS is that you can update the code and get the latest fixes. What's in CVS is quite different to what comes in a release. The code needs to be compiled, and some files need editing...
     1958
     1959To check out the greenstone code, use:
     1960
     1961\begin{quote}\begin{gsc}\begin{verbatim}
     1962cvs -d :pserver:cvs\[email protected]:2402/usr/local/
     1963           global-cvs/gsdl-src co gsdl3
     1964\end{verbatim}\end{gsc}\end{quote}
     1965
     1966If you need it, the password for anonymous CVS access is \gst{anonymous}. Note that some older versions of CVS have trouble accessing this repository due to the port number being present. We are using version 1.11.1p1.
     1967
     1968The software needs to be compiled and installed. The installation procedure has been semi-automated. The following sections describe installation under Linux and windows.
     1969
     1970\subsection{Linux install}
     1971
     1972An install.bash script is provided to compile and install Greenstone 3. What you need to do is:
     1973
     1974\begin{quote}\begin{gsc}
     1975cd gsdl3\\
     1976source setup.bash\\
     1977install.bash\\
     1978source setup.bash\\
     1979\end{gsc}\end{quote}
     1980
     1981Note: if you are using mozilla it doesn't seem to like localhost - you should edit the siteConfig files (web/sites/<sitename>/siteConfig.xml) to have your computer name instead of localhost.
     1982
     1983Note: \gst{source setup.bash} needs to be done once in any xterm window before doing a make or running Tomcat. setup.bash sets the environment variables \gst{CLASSPATH, PATH, JAVA\_HOME} etc.
     1984
     1985To shutdown or startup Tomcat, the commands are:
     1986\begin{quote}\begin{gsc}
     1987\gsdlhome/comms/jakarta/tomcat/bin/shutdown.sh\\
     1988\gsdlhome/comms/jakarta/tomcat/bin/startup.sh\\
     1989\end{gsc}\end{quote}
     1990
     1991You shouldn't run install.bash twice.
     1992To update your installation, you can run update.bash. This updates your code from CVS and recompiles the Java code.
     1993
     1994\subsection{Windows install}
     1995\newpage
    18071996\section{Tomcat}\label{app:tomcat}
    18081997
     
    18622051Almost everything works fine when tomcat is running behind a proxy. The only time this causes trouble is if the servlet itself needs to make external http connections. We do this in the infomine demo collection for example. One of the service classes sends http requests to the infomine database at riverside. Since this is going through the proxy, a username and password is needed. It is not sufficient to prompt the user for a password because they are unlikely to have a password for the particular proxy that tomcat is using. What we have done at present is to put a proxy element in the siteConfig.xml file. Here you have to enter a suitable username and password for the proxy server. Unfortunately these are entered in plain text. And the file is viewable via the servlet. So we need a better solution.
    18632052
     2053\newpage
    18642054\section{SOAP}\label{app:soap}
    18652055\subsection{Setting up SOAP from CVS}\label{app:soap-cvs}
     
    19202110
    19212111
    1922 \section{Using Greenstone 3 from CVS}\label{app:cvs}
    1923 
    1924 *** need to make sure building stuff is in here ***
    1925 
    1926 Greenstone 3 is also available via CVS. You can download the latest version of the code. This is not guaranteed to be stable, in fact it is likely to be unstable. The advantage of using CVS is that you can update the code and get the latest fixes. Whats in CVS is quite different to what comes in a release. The code needs to be compiled, and some files need editing...
    1927 
    1928 To check out the greenstone code, use:
    1929 
    1930 \begin{quote}\begin{gsc}\begin{verbatim}
    1931 cvs -d :pserver:cvs\[email protected]:2402/usr/local/
    1932            global-cvs/gsdl-src co gsdl3
    1933 \end{verbatim}\end{gsc}\end{quote}
    1934 
    1935 If you need it, the password for anonymous CVS access is \gst{anonymous}. Note that some older versions of CVS have trouble accessing this repository due to the port number being present. We are using version 1.11.1p1.
    1936 
    1937 The software needs to be compiled and installed. The installation procedure has been semi-automated. The following sections describe installation under Linux and windows.
    1938 
    1939 \subsection{Linux install}
    1940 
    1941 An install.sh script is provided to compile and install Greenstone3. What you need to do is:
    1942 
    1943 \begin{quote}\begin{gsc}
    1944 cd gsdl3\\
    1945 source setup.bash\\
    1946 install.bash\\
    1947 source setup.bash\\
    1948 \end{gsc}\end{quote}
    1949 
    1950 Note: if you are using mozilla it doesn't seem to like localhost - you should edit the siteConfig files (web/sites/<sitename>/siteConfig.xml) to have your computer name instead of localhost.
    1951 
    1952 Note: \gst{source setup.bash} needs to be done once in any xterm window before doing a make or running Tomcat. setup.bash sets the environment variables \gst{CLASSPATH, PATH, JAVA\_HOME} etc.
    1953 
    1954 To shutdown or startup Tomcat, the commands are:
    1955 \begin{quote}\begin{gsc}
    1956 \gsdlhome/comms/jakarta/tomcat/bin/shutdown.sh\\
    1957 \gsdlhome/comms/jakarta/tomcat/bin/startup.sh\\
    1958 \end{gsc}\end{quote}
    1959 
    1960 You shouldn't run install.bash twice.
    1961 To update your installation, you can run update.bash - this updates your code from CVS, and re-makes all the java stuff.
    1962 
    1963 \subsection{Windows install}
    1964 
     2112\newpage
    19652113\section{Format statements: Greenstone 2 vs Greenstone 3}\label{app:format}
    19662114The following table shows the Greenstone 2 format elements, and their equivalents in Greenstone 3.
    1967 
     2115\begin{table}
     2116\caption{Greenstone 3 equivalents of Greenstone 2 format statements}
     2117{\footnotesize
    19682118\begin{tabular}{ll}
     2119\hline
    19692120\bf Greenstone 2        & \bf Greenstone 3 \\
     2121\hline
    19702122\gst{[Text]} & \gst{<gsf:text/>} \\
    19712123\gst{[num]} & \gst{<gsf:metadata name='docnum'/>}\\
     
    19992151& \gst{     <td><gsf:metadata name='Subject'/></td>}\\
    20002152& \gst{  </gsf:when></gsf:switch>}\\
    2001 \end{tabular}
    2002 
     2153\hline
     2154\end{tabular}}
     2155\end{table}
    20032156\end{document}