Changeset 6908
- Timestamp:
- 2004-03-04T10:29:44+13:00
- Location:
- trunk/gsdl3/docs/manual
- Files:
-
- 2 edited
trunk/gsdl3/docs/manual/manual.tex
r6904 r6908 10 10 \newcommand{\gsdlhome}{\$GSDL3HOME} 11 11 12 \newcommand{\gsii}{Greenstone 2} 13 \newcommand{\gsiii}{Greenstone 3} 14 \newcommand{\gs}{Greenstone} 15 12 16 \begin{document} 13 17 14 \title{ Greenstone 3: A modular digital library.}18 \title{\gsiii\ : A modular digital library.} 15 19 16 20 % if you work on this manual, add your name here … … 30 34 \noindent 31 35 Greenstone Digital Library Version 3 is a complete redesign and 32 reimplementation of the Greenstonedigital library software. The current33 version ( Greenstone2) enjoys considerable success and is being widely used.34 Greenstone3will capitalise on this success, and in addition it will36 reimplementation of the \gs\ digital library software. The current 37 version (\gsii) enjoys considerable success and is being widely used. 38 \gsiii \ will capitalise on this success, and in addition it will 35 39 \begin{bulletedlist} 36 40 \item improve flexibility, modularity, and extensibility 37 \item lower the bar for ``getting into'' the Greenstonecode with a view to41 \item lower the bar for ``getting into'' the \gs\ code with a view to 38 42 understanding and extending it 39 43 \item use XML where possible internally to improve the amount of … … 49 53 easier inclusion of existing Java code (such as for text mining). 50 54 \end{bulletedlist} 51 Parts of Greenstonewill remain in other languages (e.g. MG, MGPP); JNI (Java55 Parts of \gs\ will remain in other languages (e.g. MG, MGPP); JNI (Java 52 56 Native Interface) will be used to communicate with these. 53 57 54 A description of the general design and architecture of Greenstone3is covered by the document {\em The design of Greenstone3: An agent based dynamic digital library} (design-2002.ps, in the gsdl3/docs/manual directory).55 56 This documentation consists of several parts. Section~\ref{sec:install} covers greenstone installation, how to access the library, and some administration issues. 
Section~\ref{sec:user} looks at using the sample collections, creating new collections, and how to make small customisations to the interface. The remaining sections are aimed towards the Greenstone developer. Section~\ref{sec:develop-runtime} describes the run-time system, including the structure of the software, and the message format, while Section~\ref{sec:develop-build} describes the collection building process. Section~\ref{sec:new-features} describes how to add new features to Greenstone, such as how to add new services, new page types, new plugins for different document formats. Section~\ref{sec:distributed} describes how to make Greenstone run in a distributed fashion, using SOAP as an example communications protocol. Finally, there are several appendices, including how to install Greenstone from CVS, and a comparison of Greenstone2 and Greenstone3format statements.58 A description of the general design and architecture of \gsiii\ is covered by the document {\em The design of Greenstone3: An agent based dynamic digital library} (design-2002.ps, in the gsdl3/docs/manual directory). 59 60 This documentation consists of several parts. Section~\ref{sec:install} is for administrators, and covers \gsiii\ installation, how to access the library, and some administration issues. Section~\ref{sec:user} is for users of the software, and looks at using the sample collections, creating new collections, and how to make small customisations to the interface. The remaining sections are aimed towards the \gs\ developer. Section~\ref{sec:develop-runtime} describes the run-time system, including the structure of the software, and the message format, while Section~\ref{sec:develop-build} describes the collection building process. Section~\ref{sec:new-features} describes how to add new features to \gs\ , such as how to add new services, new page types, new plugins for different document formats. 
Section~\ref{sec:distributed} describes how to make \gs\ run in a distributed fashion, using SOAP as an example communications protocol. Finally, there are several appendices, including how to install \gs\ from CVS, some notes on Tomcat and SOAP, and a comparison of \gsii\ and \gsiii\ format statements. 57 61 \newpage 58 \section{ Greenstoneinstallation and administration}\label{sec:install}59 60 This section covers where to get Greenstone 3 from, how to install it and how to run it. The standard method of running Greenstone is as a Java servlet. We provide the Tomcat servlet container to serve the servlet :-). Standard web servers may be able to be configured to provide servlet support, and thereby remove the need to use Tomcat. Please see your web server documentation for this. This documentation assumes that you are using Tomcat. To access Greenstone, Tomcat must be started up, and then it can be accessed via a web browser.61 62 63 \subsection{Get and install Greenstone}64 65 Greenstone is available from \gst{http://www.greenstone.org/greensone3}. There are currently two distributions: a self-installing tar for Linux, and a Windows executable.66 67 Greenstone is also available through CVS (Concurrent Versioning System). This provides the absolute latest development version, and is not guaranteed to be stable. Appendix~\ref{app:cvs} describes how to download and install Greenstonefrom CVS.62 \section{\gs\ installation and administration}\label{sec:install} 63 64 This section covers where to get \gsiii\ from, how to install it and how to run it. The standard method of running \gsiii\ is as a Java servlet. We provide the Tomcat servlet container to serve the servlet :-). Standard web servers may be able to be configured to provide servlet support, and thereby remove the need to use Tomcat. Please see your web server documentation for this. This documentation assumes that you are using Tomcat. 
To access \gsiii\ , Tomcat must be started up, and then it can be accessed via a web browser. 65 66 67 \subsection{Get and install \gs\ } 68 69 \gsiii\ is available from \gst{http://www.greenstone.org/greenstone3}. There are currently two distributions: a self-installing tar for Linux, and a Windows executable. 70 71 \gsiii\ is also available through CVS (Concurrent Versioning System). This provides the latest development version, and is not guaranteed to be stable. Appendix~\ref{app:cvs} describes how to download and install \gsiii\ from CVS. 68 72 69 73 \subsubsection{Linux} 70 74 71 Download the latest version of the self-installing tar file, gsdl3-x.xx-unix.sh, and run it in a shell (./gsdl3-x.xx-unix.sh). Greenstone will be installed into a directory called gsdl3 inside the current directory. The install script will prompt you for the name of your computer and what port to run Tomcat on (the defaults being localhost and 8080). Once Greenstone has been installed, you can start the library by running ./gsdl3/gs3-launch.sh, and opening up a browser pointing to localhost:8080/gsdl3 (or different computer name and port).75 Download the latest version of the self-installing tar file, \gst{gsdl3-x.xx-unix.sh}, and run it in a shell (\gst{./gsdl3-x.xx-unix.sh}). \gsiii\ will be installed into a directory called \gst{gsdl3} inside the current directory. The install script will prompt you for the name of your computer and what port to run Tomcat on (the defaults being \gst{localhost} and \gst{8080}). Once \gsiii\ has been installed, you can start the library by running \gst{./gsdl3/gs3-launch.sh}, and opening up a browser pointing to \gst{http://localhost:8080/gsdl3} (substituting your chosen name and port if necessary). 72 76 73 77 \subsubsection{Windows} 74 78 75 Download the latest Windows executable, gsdl3-x.xx-win32.exe, and double click it to start the installation. 
You will be prompted for your computer name and port number to run Tomcat on (defaults are localhost and 8080). Once Greenstone is installed, you can access the library by selecting Greenstone 3 Digital Library in the Start menu. 79 Download the latest Windows executable, \gst{gsdl3-x.xx-win32.exe}, and double click it to start the installation. You will be prompted for your computer name and the port number to run Tomcat on (defaults are \gst{localhost} and \gst{8080}). Once \gsiii\ is installed, you can access the library by selecting \gst{Greenstone Digital Library 3} in the Start menu. 76 80 77 81 \subsubsection{Accessing the library in a browser} 78 82 79 Once you have started up the library (see the previous sections for OS dependent instructions), you can access it in a browser at http://localhost:8080/gsdl3 (or http://your-computer-name:your-chosen-port/gsdl3). This gets you to a welcome page, with three links: one to run a test servlet (this allows you to check that Tomcat is running properly), one to run the standard library servlet using the site \gst{localsite}, and one to run a library servlet using the site \gst{soapsite}. This site uses a SOAP connection to communicate with localsite, and demonstrates the library working in a distributed fashion. See Section~\ref{sec:distributed} for details about how to run Greenstone distributedly. 83 Once you have started up the library (see the previous sections for OS dependent instructions), you can access it in a browser at \gst{http://localhost:8080/gsdl3} (or \gst{http://your-computer-name:your-chosen-port/gsdl3}). This gets you to a welcome page, with three links: one to run a test servlet (this allows you to check that Tomcat is running properly), one to run the standard library servlet using the site \gst{localsite}, and one to run a library servlet using the site \gst{soapsite}. This site uses a SOAP connection to communicate with localsite, and demonstrates the library working in a distributed fashion. 
The SOAP connection is not enabled by default: see Section~\ref{sec:distributed} for details about how to run \gsiii\ distributedly. 80 84 81 85 \subsection{How the library works} 82 86 83 The standard library program is a Java servlet. 87 The standard library program is a Java servlet. We use the Tomcat servlet container to present the servlets over the web. Tomcat takes CGI-style URLs and passes the arguments to the servlet, which processes these and returns a page of HTML. As far as an end-user is concerned, a servlet is a Java version of a CGI program. The interaction is similar: access is via a web browser, using arguments in a URL. 84 88 85 89 Other types of interfaces can be used, such as Java GUI programs. See Section~\ref{sec:new-interfaces} for details about how to make these. … … 87 91 \subsubsection{Restarting the library} 88 92 89 The library program (actually Tomcat) can be restarted by ... (** put a mechanism in each install program **). 90 91 92 Tomcat must be shutdown and restarted any time you make changes in the following for those changes to take effect:\\ 93 The library program (actually Tomcat) can be restarted in Windows by closing the window, and restarting it from the Start menu. In Linux, you need to go to the gsdl3 directory, and run \gst{gsdl3/gs3-launch.sh -shutdown}, then \gst{gsdl3/gs3-launch.sh}. 94 95 96 Tomcat must be restarted any time you make changes in the following for those changes to take effect:\\ 93 97 \begin{bulletedlist} 94 98 \begin{gsc} … … 104 108 \subsection{Directory structure} 105 109 106 Table~\ref{tab:dirs} shows the file hierarchy for Greenstone3. 110 Table~\ref{tab:dirs} shows the file hierarchy for \gsiii\ . 107 111 The first part shows the common stuff which can be shared between 108 Greenstone users---the source, libraries etc. 112 \gs\ users---the source, libraries etc. 
Under Linux, these can be installed into appropriate system directories. The second part shows 109 113 stuff used by one person/group---their sites and interface setup (see Section~\ref{sec:sites-and-ints}). 110 etc. There can be several sites/interfaces per installation. 114 etc. There can be several sites/interfaces per installation. All the files inside the gsdl3/web directory comprise the gsdl3 context for Tomcat, and are accessible via Tomcat. 111 115 112 116 \begin{table} 113 \caption{The Greenstonedirectory structure}117 \caption{The \gs\ directory structure} 114 118 \label{tab:dirs} 115 119 {\footnotesize … … 139 143 & soap service description files \\ 140 144 gsdl3/resources/dtd 141 & Greenstonehas trouble loading DTD files sometimes. They can go here\\145 & \gsiii\ has trouble loading DTD files sometimes. They can go here\\ 142 146 gsdl3/bin 143 147 & executable stuff lives here\\ … … 184 188 \subsection{Sites and interfaces}\label{sec:sites-and-ints} 185 189 186 local gs stuff (sites and interfaces) vs installed stuff (code)\\187 where they live, whats the difference, what each contains. \\190 [local gs stuff (sites and interfaces) vs installed stuff (code)\\ 191 where they live, whats the difference, what each contains.]\\ 188 192 189 193 A site is comprised of a set of collections and possibly some site-wide services. An interface (in this web-based servlet context) is a set of images along with a set of xslt files used for translating xml output from the library into an appropriate form---html in general. 190 194 191 One greenstone installation can have many sites and interfaces. One instantiation of a servlet uses one site and one interface. Sites and interfaces can be matched up in different ways. For example, a single site might be served with two different interfaces. This provides different modes of access to the same content. eg HTML vs WML, or perhaps providing a completely different look and feel for different audiences. 
A standard interface may be used with many different sites---providing a consistent mode of access to a lot of different content. 192 193 Collections live in the collect directory of a site. Any collections that are found in this directory when the servlet is initialised will be loaded up and presented to the user. Collections require valid configuration files, but apart from this, nothing needs to be done to the site to use new collections. Collections added while Tomcat is running will not be noticed automatically. Either the server needs to be restarted, or a configuration request may be sent to the library, triggering a (re)load of the collection (this is described in Section~\ref{sec:runtime-config}). 194 195 There are two Greenstone sites that come with the distribution: localsite, and soapsite. localsite has several demo collections, while soapsite has none. soapsite specifies that a soap connection should be made to localsite. Getting this to work involves setting up a soap server for localsite: see Section~\ref{sec:distributed} for details. 195 One \gsiii\ installation can have many sites and interfaces, and these can be paired in different combinations. One instantiation of a servlet uses one site and one interface, so every specified pairing results in a new servlet instance. For example, a single site might be served with two different interfaces. This provides different modes of access to the same content. eg HTML vs WML, or perhaps providing a completely different look and feel for different audiences. Alternatively, a standard interface may be used with many different sites---providing a consistent mode of access to a lot of different content. 196 197 Collections live in the \gst{collect} directory of a site. Any collections that are found in this directory when the servlet is initialised will be loaded up and presented to the user. 
Collections require valid configuration files, but apart from this, nothing needs to be done to the site to use new collections. Collections added while Tomcat is running will not be noticed automatically. Either the server needs to be restarted, or a configuration request may be sent to the library, triggering a (re)load of the collection (this is described in Section~\ref{sec:runtime-config}). 198 199 There are two sites that come with the distribution: \gst{localsite}, and \gst{soapsite}. \gst{localsite} has several demo collections, while \gst{soapsite} has none. \gst{soapsite} specifies that a soap connection should be made to \gst{localsite}. Getting this to work involves setting up a soap server for localsite: see Section~\ref{sec:distributed} for details. 196 200 197 201 Each site and interface has a configuration file which specifies parameters for the site or interface---these are described in Section~\ref{sec:config}. 198 202 199 203 The file \gst{\gsdlhome/web/WEB-INF/web.xml} contains the setup information for Tomcat. It tells Tomcat what servlets to load, what initial parameters to pass them, and what web names map to the servlets. 200 There are three servlets specified in web.xml (these correspond to the three links in the welcome page for greenstone): one is a test servlet that just prints ``hello greenstone'' to a web page. This is useful if you are having trouble getting Tomcat set up. The other two are Greenstonelibrary servlets, {\em library}, which serves localsite, and {\em library1} which serves soapsite. Both of these servlets use the standard interface (called {\em default}).204 There are three servlets specified in web.xml (these correspond to the three links in the welcome page for \gsiii\ ): one is a test servlet that just prints ``hello greenstone'' to a web page. This is useful if you are having trouble getting Tomcat set up. 
The other two are \gs\ library servlets, {\em library}, which serves localsite, and {\em library1} which serves soapsite. Both of these servlets use the standard interface (called {\em default}). 201 205 202 206 \begin{table} 203 \caption{ Greenstoneservlet initialisation parameters}207 \caption{\gs\ servlet initialisation parameters} 204 208 \label{tab:serv-init} 205 209 {\footnotesize … … 210 214 gsdl3\_home & /research/kjdon/gsdl3 & the base directory of the gsdl3 installation \\ 211 215 site\_name & localsite & the name of the site to use \\ 212 interface\_name & default & the name o rthe interface to use\\216 interface\_name & default & the name of the interface to use\\ 213 217 library\_name & library & the web name of the servlet \\ 214 218 default\_lang & en & the default language for the interface\\ 215 219 receptionist\_class & NZDLReceptionist & (optional) specifies an alternative Receptionist to use\\ 216 220 messagerouter\_class & NewMessageRouter & (optional) specifies an alternative MessageRouter to use\\ 221 params\_class & NZDLParams & (optional) specifies an alternative GSParams class to use \\ 217 222 \hline 218 223 \end{tabular}} … … 222 227 223 228 224 \subsection{Configuring a greenstoneinstallation}\label{sec:config}225 226 Initial Greenstone3system configuration is determined by a set of configuration files, all expressed in XML. Each site has a configuration file that binds parameters for the site, \gst{siteConfig.xml}. Each interface has a configuration file, \gst{interfaceConfig.xml}, that specifies Actions for the interface. Collections also have several configuration files; these are discussed in Section~\ref{sec:collconfig}.227 The configuration files are read in when the system is initialised, and their contents are cached in memory. This means that changes made to these files once the system is running will not take immediate effect. Tomcat needs to be restarted for changes to the interface configuration file to take effect. 
However, changes to the site configuration file can be incorporated sending a CGI-type command to the library. There are a series of commands that can be sent to the library to induce reconfiguration of different modules, including reloading the whole site. This removes the need to shutdown andrestart the system to reflect these changes. These commands are described in Section~\ref{sec:runtime-config}.229 \subsection{Configuring a \gs\ installation}\label{sec:config} 230 231 Initial \gsiii\ system configuration is determined by a set of configuration files, all expressed in XML. Each site has a configuration file that binds parameters for the site, \gst{siteConfig.xml}. Each interface has a configuration file, \gst{interfaceConfig.xml}, that specifies Actions for the interface. Collections also have several configuration files; these are discussed in Section~\ref{sec:collconfig}. 232 The configuration files are read in when the system is initialised, and their contents are cached in memory. This means that changes made to these files once the system is running will not take immediate effect. Tomcat needs to be restarted for changes to the interface configuration file to take effect. However, changes to the site configuration file can be incorporated sending a CGI-type command to the library. There are a series of commands that can be sent to the library to induce reconfiguration of different modules, including reloading the whole site. This removes the need to restart the system to reflect these changes. These commands are described in Section~\ref{sec:runtime-config}. 228 233 229 234 \subsubsection{Site configuration file}\label{sec:siteconfig} … … 238 243 Figure~\ref{fig:siteconfig} shows two example site configuration files. The first example is for a rudimentary site with no site-wide services, 239 244 which does not connect to any external sites. The second example is for a site with one site-wide service cluster - a collection building cluster. 
It also connects to the first site using SOAP. 240 These two sites are running on the same machine. For site \gst{gsdl1} to talk to site \gst{localsite}, a SOAP server must be run for \gst{localsite}. The address of the SOAP server, in this case, is \gst{http://localhost:8080/soap/servlet/rpcrouter}. 245 These two sites happen to be running on the same machine, which is why they can use \gst{localhost} in the address. For site \gst{gsdl1} to talk to site \gst{localsite}, a SOAP server must be run for \gst{localsite}. The address of the SOAP server, in this case, is \gst{http://localhost:8080/soap/servlet/rpcrouter}. 241 246 242 247 … … 281 286 \subsubsection{Interface configuration file}\label{sec:interfaceconfig} 282 287 283 The interface configuration file \gst{interfaceConfig.xml} lists all the actions that the interface knows about at the start (other ones can be loaded dynamically). It specifies what short name each action maps to (this is used in library urls for the a (action) parameter) e.g. QueryAction should use a=q. If the interface uses XSLT, it specifies what XSLT file should be used for each action and possibly each subaction. This makes it easy for developers to implement and use different actions and/or XSLT files without recompilation. The server must be restarted, however. 284 285 It also lists all the languages that the interface text files have been translated into. These have a name attribute, which is the ISO code for the language, and a displayElement which gives the language name in that language (note the non-English characters have been specified in UTF-8 codes). This language list is used on the Preferences page to allow the user to change the interface language. Details on how to add a new language to a Greenstone library are shown in Section~\ref{sec:interface-customise}. 288 The interface configuration file \gst{interfaceConfig.xml} lists all the actions that the interface knows about at the start (other ones can be loaded dynamically). 
Actions create the web pages for the library: there is generally one Action per type of page. For example, a query action produces the pages for searching, while a document action displays the documents. The configuration file specifies what short name each action maps to (this is used in library urls for the a (action) parameter) e.g. QueryAction should use a=q. If the interface uses XSLT, it specifies what XSLT file should be used for each action and possibly each subaction. This makes it easy for developers to implement and use different actions and/or XSLT files without recompilation. The server must be restarted, however. 289 290 It also lists all the languages that the interface text files have been translated into. These have a \gst{name} attribute, which is the ISO code for the language, and a \gst{displayElement} which gives the language name in that language (note that this file should be encoded in UTF-8). This language list is used on the Preferences page to allow the user to change the interface language. Details on how to add a new language to a \gsiii\ library are shown in Section~\ref{sec:interface-customise}. 286 291 287 292 \begin{figure} … … 326 331 \subsection{Run-time re-initialisation}\label{sec:runtime-config} 327 332 328 should this section go in here, cos its kind of adminy, or go into the user stuff, cos you need to do it after building a collection??? 333 [**should this section go in here, cos its kind of adminy, or go into the user stuff, cos you need to do it after building a collection???**] 329 334 330 335 When Tomcat is started up, the site and interface configuration files are read in, and actions/services/collections loaded as necessary. The configuration is then static unless Tomcat is restarted, or re-configuration commands issued. 331 336 332 There are several CGI-typecommands that can be issued to Tomcat to avoid having to restart the server. These can reload the entire site, or just individual collections. 
Unfortunately at present there are no commands to reconfigure the interface, so if the interface configuration file has changed, Tomcat must be restarted for those changes to take effect. Similarly, if the java classes are modified, Tomcat must be restarted then too.333 334 Currently, the runtime configuration commands can only be accessed by typing in CGI-arguments into the URL,there is no nice web form yet to do this.335 336 The CGI arguments are entered after the \gst{library?} part of the URL. There are three types of commands: configure, activate, deactivate\footnote{There is no security for these commands yet in Greenstone, so the deactivate/delete command is disabled}. These are specified by \gst{a=s\&sa=c}, \gst{a=s\&sa=a}, and \gst{a=s\&sa=d}, respectively (\gst{a} is action, \gst{sa} is subaction). By default, the requests are sent to the MessageRouter, but they can be sent to a collection/cluster by the addition of \gst{sc=xxx}, where \gst{xxx} is the name of the collection or cluster. Table~\ref{tab:run-time config} describes the commands and arguments in a bit more detail.337 There are several commands that can be issued to Tomcat to avoid having to restart the server. These can reload the entire site, or just individual collections. Unfortunately at present there are no commands to reconfigure the interface, so if the interface configuration file has changed, Tomcat must be restarted for those changes to take effect. Similarly, if the java classes are modified, Tomcat must be restarted then too. 338 339 Currently, the runtime configuration commands can only be accessed by typing arguments into the URL; there is no nice web form yet to do this. 340 341 The arguments are entered after the \gst{library?} part of the URL. There are three types of commands: configure, activate, deactivate\footnote{There is no security for these commands yet in \gs\ , so the deactivate/delete command is disabled}. 
These are specified by \gst{a=s\&sa=c}, \gst{a=s\&sa=a}, and \gst{a=s\&sa=d}, respectively (\gst{a} is action, \gst{sa} is subaction). By default, the requests are sent to the MessageRouter, but they can be sent to a collection/cluster by the addition of \gst{sc=xxx}, where \gst{xxx} is the name of the collection or cluster. Table~\ref{tab:run-time config} describes the commands and arguments in a bit more detail. 337 342 338 343 \begin{table} … … 351 356 \end{table} 352 357 \newpage 353 \section{Using Greenstone 3}\label{sec:user} 354 355 Once you have greenstone 3 installed, you can access the sample collections. The installation comes with some example collections, and Section~\ref{sec:usecolls} describes these collections and how to use them. Section~\ref{sec:buildcol} describes how to build your own collections. 356 357 \subsection{Using a collection}\label{sec:usecolls} 358 \section{Using \gsiii\ }\label{sec:user} 359 360 Once \gsiii\ is installed, the sample collections can be accessed. The installation comes with several example collections, and Section~\ref{sec:usecolls} describes these collections and how to use them. Section~\ref{sec:buildcol} describes how to build new collections. 361 362 \subsection{Using a collection}\label{sec:usecolls} 363 [TODO: expand this section] 358 364 359 365 A collection typically consists of a set of documents, which could be text, html, word, PDF, images, bibliographic records etc, along with some access methods, or ``services''. Typical access methods include searching or browsing for document identifiers, and retrieval of content or metadata for those identifiers. … … 362 368 Browsing involves navigating pre-defined hierarchies of documents, following links of interest to find documents. The hierarchies may be constructed on different metadata fields, for example, alphabetical lists of Titles, or a hierarchy of Subject classifications. 
Clicking on a bookshelf icon takes you to a lower level in the hierarchy, while clicking on a book or page icon takes you to a document. 363 369 364 In the standard interface that comes with Greenstone3\footnote{of course, this is all customisable}, collections in a digital library are presented in the following manner. The 'home' page of the library shows a list of all the public collections in that library. Clicking on a collection link takes you to the home page for the collection, which we call the 'about' page. The standard page banner looks something like that shown in Figure~\ref{fig:page-banner}.370 In the standard interface that comes with \gsiii\ \footnote{of course, this is all customisable}, collections in a digital library are presented in the following manner. The 'home' page of the library shows a list of all the public collections in that library. Clicking on a collection link takes you to the home page for the collection, which we call the 'about' page. The standard page banner looks something like that shown in Figure~\ref{fig:page-banner}. 365 371 366 372 \begin{figure}[h] … … 371 377 \end{figure} 372 378 373 The image at the top left is a link to the collection's home page. The top right has buttons to link to the library home page, help pages and preference pages. All the available services are arrayed along a navigation bar, along the bottom of the banner. Clicking on a name accesses that service. Search type services generally provide a form to fill in, with parameters including what field or granularity to index, and the query itself. Clicking the 374 The results of a search 379 The image at the top left is a link to the collection's home page. The top right has buttons to link to the library home page, help pages and preference pages. All the available services are arrayed along a navigation bar, along the bottom of the banner. Clicking on a name accesses that service. 
380 381 Search type services generally provide a form to fill in, with parameters including what field or granularity to index, and the query itself. Clicking the search button carries out the search, and a list of matching documents will be displayed. Clicking on the icons in the results list takes you to the document itself. 382 375 383 Once you are looking at a document, clicking the open book icon at the top of the document, underneath the navigation bar, will take you back to the service page that you accessed the document from. 376 384 377 describe the colls that the sample installation comes with\\ 385 [TODO: describe the colls that the sample installation comes with\\ 378 386 brief description of what a collection is.\\ 379 387 how to get around the collection, services etc. \\ 380 388 querying vs browsing \\ 381 use the demo colls that come with greenstone - one gs2 coll, one gs3 coll, tei coll??\\ 389 use the demo colls that come with \gsiii\ - one gs2 coll, one gs3 coll, tei coll??\\] 382 390 383 391 \subsection{Building a collection}\label{sec:buildcol} 384 392 385 There are two ways to get a new collection into Greenstone 3. The first is to build it using the greenstone 3 building process. The second way is to import a greenstone 2 collection. 386 387 Collections live in the collect directory of a site. As described in Section~\ref{sec:sites-and-ints}, there can be several sites per greenstone installation. The collect directory is at \$GSDL3HOME/web/sites/site-name/collect, where site-name is the name of the site you want your new collection to belong to. 388 389 The following two sections describe how to create a collection from scratch, and how to import a greenstone 2 collection. Once a collection has been built, the library server needs to be notified that there is a new collection. This can be accomplished in two ways\footnote{eventually there will also probably be automatic polling for new collections}. 
If you are the library administrator, you can restart Tomcat. The library servlet will then be created afresh, and will discover the new collection when it scans the collect directory for the collection list. Alternatively, there is a CGI command to reload a collection which can also load a new one. Use the CGI arguments \gst{a=s\&sa=a\&st=collection\&sn=collname}---this tells the library program to reload the collname collection.393 There are two ways to get a new collection into \gsiii\ . The first is to build it using the \gsiii\ building process. The second way is to import a \gsii\ collection. 394 395 Collections live in the collect directory of a site. As described in Section~\ref{sec:sites-and-ints}, there can be several sites per \gsiii\ installation. The collect directory is at \$GSDL3HOME/web/sites/site-name/collect, where site-name is the name of the site you want your new collection to belong to. 396 397 The following two sections describe how to create a collection from scratch, and how to import a \gsii\ collection. Once a collection has been built, the library server needs to be notified that there is a new collection. This can be accomplished in two ways\footnote{eventually there will also probably be automatic polling for new collections}. If you are the library administrator, you can restart Tomcat. The library servlet will then be created afresh, and will discover the new collection when it scans the collect directory for the collection list. Alternatively, there is a CGI command to reload a collection which can also load a new one. Use the CGI arguments \gst{a=s\&sa=a\&st=collection\&sn=collname}---this tells the library program to reload the collname collection. 390 398 391 399 392 400 \subsubsection{Creating a collection from scratch} 393 401 394 Building Greenstone 3collections is done using the \gst{gs3-build.sh} script, with the \gst{collectionConfig.xml} file controlling how the building is done. 
There are a number of considerations in building a collection: including what documents appear in the collection, how they are indexed for searching, which classifications are used for browsing, etc.402 Building \gsiii\ collections is done using the \gst{gs3-build.sh} script, with the \gst{collectionConfig.xml} file controlling how the building is done. There are a number of considerations in building a collection: including what documents appear in the collection, how they are indexed for searching, which classifications are used for browsing, etc. 395 403 396 404 Firstly, the documents that comprise the collection should be placed in the import subdirectory. At present, only documents in this directory will appear in the collection. 397 405 [TODO: describe the kinds of documents that can be added, something about METS files?] 398 406 399 Metadata for documents can be added using metadata.xml files. These files have already been used in Greenstone 2, and the format is the same in Greenstone 3. A metadata.xml file has a root element of \gst{<DirectoryMetadata>}. This encloses a series of \gst{<FileSet>} items. Neither of these tags has any attributes. Each \gst{<FileSet>} item includes two parts: firstly, one or more \gst{<FileName>} tags, each of which encloses a regular expression to identify the files which are to be assigned the metadata. Only files in the same directory as the metadata.xml, or in one of its child directories, file will be selected. The filename tag encloses the regular expression as text, eg:407 Metadata for documents can be added using metadata.xml files. These files have already been used in \gsii\ , and the format is the same in \gsiii\ . A metadata.xml file has a root element of \gst{<DirectoryMetadata>}. This encloses a series of \gst{<FileSet>} items. Neither of these tags has any attributes. 
Each \gst{<FileSet>} item includes two parts: firstly, one or more \gst{<FileName>} tags, each of which encloses a regular expression to identify the files which are to be assigned the metadata. Only files in the same directory as the metadata.xml file, or in one of its child directories, will be selected. The filename tag encloses the regular expression as text, e.g.: 400 408 401 409 \begin{gsc}\begin{verbatim} … … 425 433 \end{verbatim}\end{gsc} 426 434 427 Here, only one file pattern is found in the file set. However, the \gst{Description} tag contains a number of separate metadata items. Note that the \gst{Title} metadata does not have the accumulate metadata. This means that when the title is assigned to a document, its existing \gst{Title} information will be lost. 428 429 The basic means of finding documents in Greenstone is search. Options for building the search indexes include which indexer to use, what granularity to use for the indexes (e.g. whether to index documents as a whole, or sections of documents), what content the index should have (the whole text of the document or one or many metadata fields). 430 431 Indexes can alter which search engine to use for that index, the level at which the index should be built (e.g. document, section or paragraph) and the text over which it should be built (e.g. the document text, titles alone, author names, etc.). Section-level indexes allow a reader to recall part of a document (for instance, a chapter) rather than the entire document. However, Greenstone 3 must be able to identify the internal structure of the document to achieve this. The degree to which structure can be found varies from file format to file format. 435 Here, only one file pattern is found in the file set. However, the \gst{Description} tag contains a number of separate metadata items. Note that the \gst{Title} metadata does not have the mode=accumulate attribute. 
This means that when the title is assigned to a document, its existing \gst{Title} information will be lost. 436 437 The basic means of finding documents in \gs\ is search. Options for building the search indexes include which indexer to use, what granularity to use for the indexes (e.g. whether to index documents as a whole, or sections of documents), what content the index should have (the whole text of the document or one or more metadata fields). Section-level indexes allow a reader to recall part of a document (for instance, a chapter) rather than the entire document. However, \gsiii\ must be able to identify the internal structure of the document to achieve this. The degree to which structure can be found varies from file format to file format. 432 438 433 439 The collectionConfig.xml file controls all of these options for collection building, and the format is described in Section~\ref{sec:collconfig}. 434 440 435 Wherever possible, the Greenstone 3 will import and use options from a Greenstone 2 \gst{collect.cfg} file. However, it is strongly recommended that a proper \gst{collectionConfig.xml} file is used wherever possible. 441 If a collectionConfig.xml file is not found, the \gsiii\ build process will import and use options, wherever possible, from a \gsii\ \gst{collect.cfg} file. However, it is strongly recommended that a proper \gst{collectionConfig.xml} file is used wherever possible. [NOTE: I think we should require a proper config file for gs3 building--kjdon] 436 442 437 443 To build a collection, execute \gst{gs3build.sh sitename collectionname}. The process will run, placing the new indexes in the \gst{building} subdirectory of the collection's directory. You must have MySQL running before you start building---running \gst{gs3-launch.sh} will start up the MySQL server as well as Tomcat. 
438 444 439 Once the build process is complete, the building directory should be renamed to index (after deleting the existing index directory, if any), and Tomcat prompted to reload the collection---either by restarting the server, or by sending an activate collection command to the library servlet. 445 Once the build process is complete, the building directory should be renamed to index (after deleting or renaming the existing index directory, if any), and Tomcat prompted to reload the collection---either by restarting the server, or by sending an activate collection command to the library servlet. 440 446 441 447 [TODO: need to describe namespaces somewhere? ] 442 448 443 \subsubsection{Importing a greenstone 2 collection} 444 445 Greenstone 3 can also serve Greenstone 2 collections. If you have a Greenstone 2 collection\footnote{For information about the Greenstone 2 software, and how to build collections using it, visit \gst{www.greenstone.org}}, you can copy it into the collect directory of the site you are using. Or make a link to it from the collect directory if your OS supports that. 446 The Greenstone 3 run time system requires different configuration files for a collection, so you need to run a conversion script. All this does is create the new collectionConfig.xml and buildConfig.xml from the old collect.cfg and build.cfg files. It does not change the collection in any way, so it can still be used by Greenstone 2 software. 447 448 The conversion script is \gst{convert\_coll\_from\_gs2.pl}. To run it, make sure you have sourced setup.bash (or run setup in Windows) in your top-level gsdl directory of the greenstone 2 installation. Then you need to specify the path to the collect directory, and the collection name as parameters to the conversion script. For example, 449 \subsubsection{Importing a \gsii\ collection} 450 451 \gsiii\ can also serve \gsii\ collections. 
If you have a \gsii\ collection\footnote{For information about the \gsii\ software, and how to build collections using it, visit \gst{www.greenstone.org}}, you can copy it into the collect directory of the site you are using. Or make a link to it from the collect directory if your OS supports that. 452 The \gsiii\ run time system requires different configuration files for a collection, so you need to run a conversion script. All this does is create the new collectionConfig.xml and buildConfig.xml from the old collect.cfg and build.cfg files. It does not change the collection in any way, so it can still be used by \gsii\ software. 453 454 The conversion script is \gst{convert\_coll\_from\_gs2.pl}. To run it, make sure you have sourced setup.bash (or run setup in Windows) in your top-level gsdl directory of the \gsii\ installation. Then you need to specify the path to the collect directory, and the collection name as parameters to the conversion script. For example, 449 455 450 456 \gst{convert\_coll\_from\_gs2.pl -collectdir \$GSDL3HOME/web/\-sites/\-localsite/\-collect demo} 451 457 452 The script attempts to create gs3 format statements from the old greenstone 2 ones. The conversion may not always work properly, so if the collection looks a bit strange under Greenstone 3, you should check the format statements. Format statements are described in Section~\ref{sec:formatstmt}.458 The script attempts to create gs3 format statements from the old \gsii\ ones. The conversion may not always work properly, so if the collection looks a bit strange under \gsiii\ , you should check the format statements. Format statements are described in Section~\ref{sec:formatstmt}. 453 459 454 460 Once again, to have the collection recognised by the library servlet, you can either restart Tomcat, or load it dynamically. 
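As a rough illustration of loading a collection dynamically, the reload arguments given in Section~\ref{sec:buildcol} can be appended to the address of the library servlet. The host, port and servlet path below are assumptions about a local test installation and are not taken from this manual; only the argument string comes from the text, with \gst{demo} standing in for the collection name.

\begin{gsc}\begin{verbatim}
http://localhost:8080/gsdl3/library?a=s&sa=a&st=collection&sn=demo
\end{verbatim}\end{gsc}

Requesting an address of this form should have the same effect as restarting Tomcat, but only for the named collection.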
… … 458 464 Each collection has two, or possibly three, configuration files, \gst{collectionConfig.xml} and \gst{buildConfig.xml}, and optionally \gst{collectionInit.xml} that give metadata, display and other information for the 459 465 collection.\footnote{\gst{collectionConfig.xml} and \gst{buildConfig.xml} replace \gst{collect.cfg} and \gst{build.cfg} in 460 Greenstone2.} The first includes user-defined presentation metadata for the collection,466 \gsii.} The first includes user-defined presentation metadata for the collection, 461 467 such as its name and the {\em About this collection} text; gives formatting information for the collection display; and also gives 462 468 instructions on how the collection is to be built. The second is produced by … … 464 470 automatically. It also includes configuration information for any ServiceRacks needed by the collection. 465 471 472 All the configuration files should be encoded using UTF-8. 473 466 474 \subsubsection{collectionInit.xml} 467 475 468 This optional file specifies a new collection class if the standrad one is not to be used.The only syntax so far is the class name:476 This optional file is only used for non-standard, customised collections. It specifies the class name of the non-standard collection class. The only syntax so far is the class name: 469 477 470 478 \begin{gsc}\begin{verbatim} … … 479 487 480 488 Display elements for a collection or metadata for a document can be entered in any language---use lang='en' attributes to metadata elements to specify which language they are in. 481 482 configuration files need to be encoded in utf-8.483 489 484 490 \begin{figure} … … 490 496 </metadataList> 491 497 <displayItemList> 492 <displayItem name="smallicon" lang="en">mgppdemosm.gif</displayItem> 493 <displayItem name="description" lang="fr">C'est une collection pour 494 demonstration du logiciel Greenstone. 
Elle contient une petite 495 partie du projet de bibliotheques humanitaires et de developpement 496 (11 livres).</displayItem> 498 <displayItem name="name" lang="en">Greenstone 3 demo</displayItem> 499 <displayItem name="icon" lang="en">gs3demo.gif</displayItem> 500 <displayItem name="smallicon" lang="en">gs3demosm.gif</displayItem> 501 <displayItem name="description" lang="fr">Il s'agit d'une collection 502 de démonstration pour le logiciel Greenstone. Elle contient 503 seulement un petit échantillon des Bibliothèques humanitaires 504 pour le Développement (11 documents).</displayItem> 497 505 <displayItem name="description" lang="en">This is a demonstration 498 506 collection for the Greenstone digital library software. It contains 499 a small subset (11 books) of the Humanity Development Library. It is 500 built with mgpp.</displayItem> 501 <displayItem name="name" lang="en">greenstone mgpp demo</displayItem> 502 <displayItem name="icon" lang="en">mgppdemo.gif</displayItem> 507 a small subset (11 books) of the Humanity Development Library. 
It 508 is built with mg using Greenstone 3 native building.</displayItem> 503 509 </displayItemList> 504 <search type='mgpp'> 505 <index name="idx"/> 510 <search type='mg'> 511 <index name="i1"> 512 <field>text</field> 513 <level>document</level> 514 <displayItem name='name' lang="en">entire documents</displayItem> 515 <displayItem name='name' lang="fr">documents entiers</displayItem> 516 <displayItem name='name' lang="es">documentos enteros</displayItem> 517 </index> 518 <index name="i2"> 519 <field>text</field> 520 <level>section</level> 521 <displayItem name='name' lang="en">chapters</displayItem> 522 <displayItem name='name' lang="fr">chapitres</displayItem> 523 <displayItem name='name' lang="es">capítulos</displayItem> 524 </index> 506 525 <format> 507 526 <gsf:template match="documentNode"> 508 527 <td valign='top'><gsf:link><gsf:icon/></gsf:link></td> 509 <td><gsf:metadata name='Title' select='ancestors' 510 separator=': '/>: <gsf:link><gsf:metadata name='Title' /> 511 </gsf:link></td> 528 <td><gsf:metadata name='Title' /></td> 512 529 </gsf:template> 513 530 </format> … … 528 545 </collectionConfig> 529 546 \end{verbatim}\end{gsc} 530 \caption{Sample collectionConfig.xml file (mgppdemo collection)} 547 [TODO: add in building instructions for the classifiers] 548 \caption{Sample collectionConfig.xml file (gs3demo collection)} 531 549 \label{fig:collconfig} 532 550 \end{figure} 533 [TODO: add in building istructions for the config file]534 551 535 552 The \gst{<metadataList>} element specifies some collection metadata, such as creator. The \gst{<displayItemList>} specifies some language dependent information that is used for collection display, such as collection name and short description. These displayItem elements can be specified in different languages. If languages other than English are used, the configuration file should be encoded in UTF-8. … … 539 556 Search indexes appear as individual \gst{<index>} elements within the \gst{<search>} element. 
Some choices for the index are made using attributes of the element itself, and some through child elements. 540 557 541 Each index must have a unique name, which is used to identify it within Greenstone. The name is given as an attribute of the \gst{<index>} element. 558 Each index must have a unique name, which is used to identify it within \gsiii. The name is given as an attribute of the \gst{<index>} element. 542 559 543 560 The other choices are described using child elements of \gst{<index>}. The \gst{<level>} tag indicates the index level and the \gst{<field>} tag the text to be used. The \gst{<level>} tag can contain one of document, section or paragraph, while the \gst{<field>} tag can contain ``text'' or the name of a metadata field. If the \gst{<level>} tag is omitted, the default setting is to index by document, and if the \gst{<field>} tag is omitted, the default setting is to index the document text. 544 561 545 562 Example index specifications include: 563 564 [NOTE: I think we shouldn't have default level and field and that it must be specified--kjdon] 546 565 547 566 To index only the title of each separate document in the collection: … … 550 569 <level>document</level> 551 570 <field>dc:title</field> 552 <displayItem name='name' lang="en">entire documents</displayItem> 553 <displayItem name='name' lang="fr">documents entiers</displayItem> 554 <displayItem name='name' lang="es">documentos enteros</displayItem> 555 571 </index> 556 572 \end{verbatim}\end{gsc} … … 559 575 Alternatively, to index the full document texts by section: 560 576 \begin{gsc}\begin{verbatim} 561 <index name="stx" type=''mgpp''> 577 <index name="stx"> 562 578 <level>section</level> 563 <displayItem name='name' lang="en">entire documents</displayItem> 564 <displayItem name='name' lang="fr">documents entiers</displayItem> 565 <displayItem name='name' lang="es">documentos enteros</displayItem> 566 579 </index> 567 580 \end{verbatim}\end{gsc} 568 581 ...or... 
569 582 \begin{gsc}\begin{verbatim} 570 <index name="stx" type=''mg''>583 <index name="stx"> 571 584 <level>section</level> 572 585 <field>text</field> 573 <displayItem name='name' lang="en">entire documents</displayItem>574 <displayItem name='name' lang="fr">documents entiers</displayItem>575 <displayItem name='name' lang="es">documentos enteros</displayItem>576 586 </index> 577 587 \end{verbatim}\end{gsc} 578 ...in the first example, the \gst{<field>} tag is not explicitly defined, and would default to 'text', whereas it is explicitly set to 'text' in the second example. Note the different indexer selected for these two indexes. As they are of the same name, they should not appear in the same \gst{collectionConfig.xml} file. 579 580 The \gst{<search>} and \gst{<browse>} elements give some formatting information about the indexes and classifiers. \gst{<displayItem>} elements are used to provide titles for the indexes or classifiers, while \gst{<format>} elements provide formatting instructions, typically for a document or classifier node in a list of results. 581 582 of the \gst{collectionConfig.xml} file, and classifications as individual \gst{<classifier>} elements within the \gst{<browse>} element. In each case, some choices are made using attributes of the element itself, and some through child elements. 583 Moving onto \gst{<classifier>} items, the format is broadly similar to \gst{<index>} items, but with a couple of different choices. Firstly, each classifier should have a ``name'' and ``type'' attribute as with \gst{<index>} tags. In the case of \gst{<classifier>} items the ``type'' attribute identifies the type of classifier it is. At present, this should either be ``Hierarchy'' or ``AZList''. 588 ...in the first example, the \gst{<field>} tag is not explicitly defined, and would default to 'text', whereas it is explicitly set to 'text' in the second example. As they are of the same name, they should not appear in the same \gst{collectionConfig.xml} file. 
589 590 Moving on to \gst{<classifier>} items, the format is broadly similar to \gst{<index>} items, but with a couple of different choices. Firstly, each classifier should have ``name'' and ``type'' attributes. In the case of \gst{<classifier>} items the ``type'' attribute identifies the type of classifier it is. At present, this should be either ``Hierarchy'' or ``AZList''. 584 591 585 592 The remaining choices for the classifier should follow as child elements of the \gst{<classifier>} element. The \gst{<file>} element should contain the name of the file that describes the classifier as its ``URL'' attribute. The format of this file will be described later; it will vary from classifier type to classifier type. The \gst{<field>} element identifies the name of the field to index. More than one \gst{<field>} element may appear if two or more metadata fields are to be used with the classifier. Finally, the \gst{<sort>} item identifies another metadata field by which the items within one classifier node are to be ordered. Unlike the \gst{<index>} element, the \gst{<classifier>} element does not have default, assumed values for its children. … … 654 661 \subsection{Formatting the collection}\label{sec:formatstmt} 655 662 656 format statements. and displayItem stuff. advanced collection design.\\ 657 658 Part of collection design involves deciding how the collection should look. Greenstone has a default 'look' for a collection, so this is optional. However, the default may not suit the purposes of some collections, so many parts to the look of a collection can be determined by the collection designer. 659 660 In standard greenstone, the library is served to a web browser by a servlet, and the html is generated using XSLT. XSLT templates are used to format all the parts of the pages. Some commonly overwritten templates are those for formatting lists: search results list, classifier browsing hierarchies, and for parts of the document display. 
663 Part of collection design involves deciding how the collection should look. \gsiii\ has a default 'look' for a collection, so this is optional. However, the default may not suit the purposes of some collections, so many parts to the look of a collection can be determined by the collection designer. 664 665 In standard \gsiii\ , the library is served to a web browser by a servlet, and the html is generated using XSLT. XSLT templates are used to format all the parts of the pages. Some commonly overwritten templates are those for formatting lists: search results list, classifier browsing hierarchies, and for parts of the document display. 661 666 662 667 Real XSLT templates for formatting search results or classifier lists are quite complicated, and not at all easy for a new user to write. For example, the following is a sample template for formatting a classifier list, to show Keyword metadata as a link to the document. … … 681 686 \end{bulletedlist} 682 687 683 Since XSLT is written in XML, we can use XSLT to transform XML into XSLT. Greenstone provides a simplified set of formatting commands, written in XML, which will be transformed into proper XSLT. Table~\ref{tab:gsf-format} shows the set of 'gsf' (greenstone format) elements. If you have come from a Greenstone 2 background, Appendix~\ref{app:format} shows Greenstone 2 format elements and their equivalents in Greenstone 3.688 Since XSLT is written in XML, we can use XSLT to transform XML into XSLT. \gsiii\ provides a simplified set of formatting commands, written in XML, which will be transformed into proper XSLT. Table~\ref{tab:gsf-format} shows the set of 'gsf' (Greenstone Format) elements. If you have come from a \gsii\ background, Appendix~\ref{app:format} shows \gsii\ format elements and their equivalents in \gsiii\ . 
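A minimal sketch of a format statement written with elements from Table~\ref{tab:gsf-format} is given here. It assumes a collection whose documents carry Title and Keyword metadata; the choice of fields is purely illustrative, and the layout follows the sample template in Figure~\ref{fig:collconfig}.

\begin{gsc}\begin{verbatim}
<format>
  <gsf:template match="documentNode">
    <!-- icon that links to the document -->
    <td valign='top'><gsf:link><gsf:icon/></gsf:link></td>
    <!-- show Title if present, otherwise fall back to Keyword -->
    <td>
      <gsf:choose-metadata>
        <gsf:metadata name='Title'/>
        <gsf:metadata name='Keyword'/>
      </gsf:choose-metadata>
    </td>
  </gsf:template>
</format>
\end{verbatim}\end{gsc}

Each \gst{gsf} element is expanded into the corresponding XSLT when the page is generated, so the collection designer does not need to write the full template by hand.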
684 689 685 690 \begin{table} … … 692 697 \hline 693 698 \gst{<gsf:text/>} & The document's text\\ 699 \hline 694 700 \gst{<gsf:link>...</gsf:link>} & The HTML link to the document itself \\ 695 701 \gst{<gsf:link type='document'>... … … 699 705 \gst{<gsf:link type='source'>... 700 706 </gsf:link>} & The HTML link to the original file---set for documents that have been converted from e.g. Word, PDF, PS \\ 707 \hline 701 708 \gst{<gsf:icon/>} & An appropriate icon\\ 702 709 \gst{<gsf:icon type='document'/>} & same as above\\ 703 710 \gst{<gsf:icon type='classifier'/>} & bookshelf icon for classification nodes\\ 704 711 \gst{<gsf:icon type='source'/>} & An appropriate icon for the original file e.g. Word, PDF icon\\ 712 \hline 705 713 \gst{<gsf:metadata name='Title'/>} & The value of a metadata element for the current document or section, in this case, Title\\ 706 714 \gst{<gsf:metadata name='Title' select='select-type' [separator='y' multiple='true']/>} & A more extended selection of metadata values. The select field can be one of those shown in Table~\ref{tab:gsf-select-types}. There are two optional attributes: separator gives a String that will be used to separate the fields, default is ``, ``, and if multiple is set to true, looks for multiple values at each section.\\ 707 715 \hline 708 716 \gst{<gsf:choose-metadata> 709 717 <gsf:metadata name='metaA'/> … … 712 720 </gsf:choose-metadata>} 713 721 & A choice of metadata. Will select the first existing one. the metadata elements can have the select, separator and multiple attributes like normal.\\ 722 \hline 714 723 \gst{<gsf:switch preprocess= 715 724 'preprocess-type'> … … 882 891 The interface language can be changed by going to the preferences page, and choosing a language from the list. The list lists (:-)) all languages in which the interface has been defined so far. 883 892 884 It is easy to add a new interface language to greenstone. 
Language-specific text strings are separated out from the rest of the system to allow for easy incorporation of new languages. These text strings are contained in Java resource bundle properties files. These are plain text files consisting of key-value pairs, located in resources/java. Each interface has one named interface\_name.properties (where `name' is the interface name). Each service class has one with the same name as the class (e.g. GS2Search.properties). To add another language all of the base .properties files must be translated. The translated files keep the same names, but with a language extension added. For example, a French version of interface\_default.properties would be named interface\_default\_fr.properties. 893 It is easy to add a new interface language to \gs\ . Language-specific text strings are separated out from the rest of the system to allow for easy incorporation of new languages. These text strings are contained in Java resource bundle properties files. These are plain text files consisting of key-value pairs, located in resources/java. Each interface has one named interface\_name.properties (where `name' is the interface name). Each service class has one with the same name as the class (e.g. GS2Search.properties). To add another language all of the base .properties files must be translated. The translated files keep the same names, but with a language extension added. For example, a French version of interface\_default.properties would be named interface\_default\_fr.properties. 885 894 886 895 Keys will be looked up in the properties file closest to the specified language. For example, if language fr\_CA was specified (French language, country Canada), and the default locale was en\_GB, Java would look at properties files in the following order, until it found the key: XXX\_fr\_CA.properties, XXX\_fr.properties, XXX\_en\_GB.properties, then XXX\_en.properties, and finally the default XXX.properties. 
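As a small sketch of what a translated pair of files might look like, the fragment below shows a base properties file and its French counterpart. The key \gst{home.title} and its values are hypothetical and are used only to show the key-value layout; the real keys are whatever the base file already defines.

\begin{gsc}\begin{verbatim}
# resources/java/interface_default.properties (base file)
home.title=Home

# resources/java/interface_default_fr.properties (French translation)
home.title=Accueil
\end{verbatim}\end{gsc}

Only the values are translated; the keys must stay identical in every language version so that the lookup described above can find them.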
887 896 888 You can tell Greenstone about a new language by adding it to the languageList in the interfaceConfig.xml file. This will add it to the list of languages on the preferences page. Modification of this file requires a restart of the Tomcat server for the changes to be recognised. 897 You can tell \gs\ about a new language by adding it to the languageList in the interfaceConfig.xml file. This will add it to the list of languages on the preferences page. Modification of this file requires a restart of the Tomcat server for the changes to be recognised. 889 898 890 899 891 900 \subsubsection{Modifying an existing interface} 892 901 893 Most of an interface is defined by XSLT files, which are stored in \$GSDL3HOME/\-web/\-interfaces/\-interface-name/\-transform. These can be changed and the changes will take effect straight away. If changes only apply to certain collections or sites, not everything that uses the interface, you can override some of the files by putting new ones in a different place. XSLT files are looked for in the following order: collection, site, interface, default interface. (This currently only applies to sites, and therefore collections, that reside in the same greenstone installation as the interface.) This also applies to files that are included from other XSLT files. For example, the query.xsl for the query pages includes a file called querytools.xsl. To have a particular site show a different query interface either of these files may need to be modified. Creating a new version of either of these and putting it in the site transform directory will work. Either the new query.xsl will include the default querytools, or the default query.xsl will include the new querytools.xsl. 
The xsl:include directives are preprocessed by the Java code and full paths added based on availability of the files, so that the correct one is used. 902 Most of an interface is defined by XSLT files, which are stored in \$GSDL3HOME/\-web/\-interfaces/\-interface-name/\-transform. These can be changed and the changes will take effect straight away. If changes only apply to certain collections or sites, not everything that uses the interface, you can override some of the files by putting new ones in a different place. XSLT files are looked for in the following order: collection, site, interface, default interface. (This currently only applies to sites, and therefore collections, that reside in the same \gs\ installation as the interface.) This also applies to files that are included from other XSLT files. For example, the query.xsl for the query pages includes a file called querytools.xsl. To have a particular site show a different query interface either of these files may need to be modified. Creating a new version of either of these and putting it in the site transform directory will work. Either the new query.xsl will include the default querytools, or the default query.xsl will include the new querytools.xsl. The xsl:include directives are preprocessed by the Java code and full paths added based on availability of the files, so that the correct one is used. 894 903 895 904 Note that you cannot include a file with the same name as the including file. For example, query.xsl cannot include query.xsl (it is tempting to do this if you just want to change one template for a particular file and then include the default, but you can't). … … 904 913 905 914 \newpage 906 \section{Developing Greenstone 3: Run-time system}\label{sec:develop-runtime} 907 915 \section{Developing \gsiii\ : Run-time system}\label{sec:develop-runtime} 916 917 [TODO: rewrite this!!] 908 918 runtime object structure diagram. 
describe the modules.\\ 909 919 class hierarchy,\\ … … 918 928 \subsection{Overview of modules??} 919 929 920 A Greenstone3'library' system consists of many components: MessageRouter, Receptionist, Actions, Collections, ServiceRacks etc. Figure~\ref{fig:local} shows how they fit together in a stand-alone system.930 A \gsiii\ 'library' system consists of many components: MessageRouter, Receptionist, Actions, Collections, ServiceRacks etc. Figure~\ref{fig:local} shows how they fit together in a stand-alone system. 921 931 922 932 \begin{figure}[t] … … 945 955 946 956 We use the Tomcat web server, which operates either stand-alone in a test mode 947 or in conjunction with the Apache web server. The GreenstoneLibraryServlet957 or in conjunction with the Apache web server. The \gs\ LibraryServlet 948 958 class is loaded by Tomcat and the servlet's \gst{init()} method is called. Each time a 949 959 \gst{get/put/post} (etc.) is used, a new thread is started and … … 975 985 \subsection{Message passing} 976 986 977 Action in Greenstone 3 is originated by a request coming in from the outside. In the standard web-based greenstone, this comes from a servlet into the receptionist. This external type request is a request for a page of data, and contains a representation of the CGI style arguments. A page of XML is returned, which can be in HTML format or other depending on the output parameter to the request. Messages inside the system all follow the same basic format: message elements contain multiple request elements, or multiple response elements. Messaging is all synchronous. The same number of responses as requests will be returned.978 979 When a page request comes in to the Receptionist, it looks at the action attribute to determine which action to send it to. The response is returned from the action.The page that the receptionist returns contains the original request, the response from the action and other info as needed (depends on the type of Receptionist). 
The data may be transformed in some way --- for the servlet greenstone we transform using XSLT to generate HTML pages which get returned to the servlet. 987 Action in \gsiii\ is originated by a request coming in from the outside. In the standard web-based \gs\ , this comes from a servlet into the receptionist. This external type request is a request for a page of data, and contains a representation of the CGI style arguments. A page of XML is returned, which can be in HTML format or other depending on the output parameter to the request. Messages inside the system all follow the same basic format: message elements contain multiple request elements, or multiple response elements. Messaging is all synchronous. The same number of responses as requests will be returned. 988 989 When a page request comes in to the Receptionist, it looks at the action attribute to determine which action to send it to. The response is returned from the action. The page that the receptionist returns contains the original request, the response from the action and other info as needed (depends on the type of Receptionist). The data may be transformed in some way --- for the servlet \gs\ we transform using XSLT to generate HTML pages which get returned to the servlet. 980 990 981 991 Actions send internal style messages to the MessageRouter. Some can be answered by it, others are passed on to collections, and maybe on to services. Internal requests are for simple actions, such as search, retrieve metadata, retrieve document text … … 989 999 990 1000 request: 991 These are the special 'external'-style messages. Requests originate from outside Greenstone, for example from a servlet, or Java application. They are requests for a 'page' of data---for example, the home page for a site; the query page for a collection; the text of a document. They contain, in XML, a list of arguments specifying what type of page is required. 
If the external context is a servlet, the arguments represent the 'CGI' arguments in a Greenstone URL. The two main arguments are \gst{a} (action) and \gst{sa} (subaction). All other arguments are encoded as parameters. 1001 These are the special 'external'-style messages. Requests originate from outside \gs\ , for example from a servlet, or Java application. They are requests for a 'page' of data---for example, the home page for a site; the query page for a collection; the text of a document. They contain, in XML, a list of arguments specifying what type of page is required. If the external context is a servlet, the arguments represent the 'CGI' arguments in a \gs\ URL. The two main arguments are \gst{a} (action) and \gst{sa} (subaction). All other arguments are encoded as parameters. 992 1002 993 1003 Here are some examples of requests\footnote{In a servlet context, these correspond to the URLs \gst{a=p\&sa=about\&c=demo\&l=fr}, and \gst{a=q\&l=en\&s=TextQuery\&c=demo\&rt=r\&ca=0\&st=1\&m=10\&q=snail}.}: … … 1043 1053 \hline 1044 1054 \end{tabular}} 1045 \caption{Generic arguments that can appear in a Greenstone URL} 1055 \caption{Generic arguments that can appear in a \gs\ URL} 1046 1056 \label{tab:args} 1047 1057 \end{table} … … 1359 1369 1360 1370 \begin{table} 1361 \caption{Status codes currently used in Greenstone 3} 1371 \caption{Status codes currently used in \gsiii\ } 1362 1372 \label{tab:status codes} 1363 1373 {\footnotesize … … 1757 1767 * talk general first: get data, get format info, transform gsf->xsl. transform xml->html 1758 1768 1759 * state saving. the XSLT files assume that arguments are saved somehow. This needs to be implemented outside Greenstone proper - we do this in the servlet, using something or other. 1760 1761 URL-style requests are received by the Receptionist. Based on the arguments, a page of data must be returned to the servlet. As described in Section~\ref{sec:page-requests}, the requests are XML representations of Greenstone URLs. 
One of the arguments is action (a). This tells the Receptionist which Action module to pass the request to. Action modules decode the rest of the CGI-arguments to determine what requests need to be made to the system. 1769 * state saving. the XSLT files assume that arguments are saved somehow. This needs to be implemented outside \gs\ proper - we do this in the servlet, using something or other. 1770 1771 URL-style requests are received by the Receptionist. Based on the arguments, a page of data must be returned to the servlet. As described in Section~\ref{sec:page-requests}, the requests are XML representations of \gs\ URLs. One of the arguments is action (a). This tells the Receptionist which Action module to pass the request to. Action modules decode the rest of the CGI-arguments to determine what requests need to be made to the system. 1762 1772 System requests are received by the MessageRouter, which answers them one by one, either itself or by passing them on to the appropriate module. 1763 1773 … … 1788 1798 \subsubsection{Receptionists}\label{sec:recepts} 1789 1799 1790 The receptionist is the controlling module for the page generation part of greenstone. It has the job of loading up all the actions, and it knows about the message router it and the actions are supposed to talk to. It routes messages received to the appropriate action (page-type messages) or directly to the message router (all other types). Receptionists also do other things, for example, adding to the page received back from the action any information that is common to all pages. 1791 1792 There are different ways of providing an interface to greenstone, from web based CGI style (using servlets) to Java GUI applications. These different interfaces require slightly different responses from a receptionist, so we provide several standard types of receptionist. 1800 The receptionist is the controlling module for the page generation part of \gs\ . 
It has the job of loading up all the actions, and it knows about the message router it and the actions are supposed to talk to. It routes messages received to the appropriate action (page-type messages) or directly to the message router (all other types). Receptionists also do other things, for example, adding to the page received back from the action any information that is common to all pages. 1801 1802 There are different ways of providing an interface to \gs\ , from web based CGI style (using servlets) to Java GUI applications. These different interfaces require slightly different responses from a receptionist, so we provide several standard types of receptionist. 1793 1803 1794 1804 Receptionist: This is the most basic receptionist. The page it returns consists of the original request, and the response from the action it was sent to. Methods preProcessRequest and postProcessPage are called on the request and page, respectively, but in this basic receptionist, they don't do anything. … … 1798 1808 WebReceptionist: The WebReceptionist extends TransformingReceptionist. It doesn't do much else except some argument conversion. To keep the URLs short, parameters from the services are given shortnames, and these are used in the web pages. 1799 1809 1800 DefaultReceptionist: This extends WebReceptionist, and is the default one for greenstone 3 servlets. Due to the page design, some extra information is needed for each page: some metadata about the current collection. The receptionist sends a describe request to the collection to get this, and appends it to the page before transformation using XSLT. 1810 DefaultReceptionist: This extends WebReceptionist, and is the default one for \gsiii\ servlets. Due to the page design, some extra information is needed for each page: some metadata about the current collection. The receptionist sends a describe request to the collection to get this, and appends it to the page before transformation using XSLT. 
 1801 1811 1802 1812 NZDLReceptionist: (do we want to talk about this?) This is an example of a custom receptionist. For a look-alike nzdl.org system, even more information is needed for each page, namely the list of classifiers available from the ClassifierBrowse service. … … 1857 1867 indicates a request to the service itself. The extra arguments (not a, sa, sn, c) are simply copied into the 1858 1868 request as parameters. The response is in a form suitable for the applet, placed inside 1859 \gst{<appletData>} in a standard Greenstone message. AppletAction returns the 1869 \gst{<appletData>} in a standard \gs\ message. AppletAction returns the 1860 1870 contents of appletData to the browser, i.e. to the applet itself. 1861 1871 … … 1906 1916 1907 1917 \subsubsection{Some class info - where should this go??} 1908 \begin{table} 1918 \begin{table}[h] 1909 1919 \caption{The utility classes in org.greenstone.gsdl3.util} 1910 1920 \label{tab:utils} … … 1917 1927 Dictionary & wrapper around a Resource Bundle, providing strings with parameter\\ 1918 1928 GSCGI & class to map between short name CGI arguments and long name request parameters \\ 1919 GSFile & class to create all Greenstone file paths e.g. used to locate configuration files, XSLT files and collection data. \\ 1929 GSFile & class to create all \gs\ file paths e.g. used to locate configuration files, XSLT files and collection data. \\ 1920 1930 GSHTML & provides convenience methods for dealing with HTML, e.g. making strings HTML safe\\ 1921 1931 GSPath & used to create, examine and modify message address paths\\ 1922 1932 GSStatus & some static codes for status messages\\ 1923 GSXML & lots of methods for extracting information out of Greenstone XML, and creating some common types of elements. 
Also has static Strings for element and attribute names used by Greenstone.\\ 1924 GSXSLT & some manipulation functions for Greenstone XSLT\\ 1933 GSXML & lots of methods for extracting information out of \gs\ XML, and creating some common types of elements. Also has static Strings for element and attribute names used by \gs\ .\\ 1934 GSXSLT & some manipulation functions for \gs\ XSLT\\ 1925 1935 Misc & miscellaneous functions\\ 1926 OID & class to handle Greenstone(2) OIDs\\ 1936 OID & class to handle \gs\ (2) OIDs\\ 1927 1937 XMLConverter & provides methods to create new Documents, parse Strings or Files into Documents, and convert Nodes to Strings\\ 1928 1938 XMLTransformer & methods to transform XML using XSLT \\ … … 1940 1950 1941 1951 \newpage 1942 \section{Developing Greenstone 3: Adding new features}\label{sec:new-features} 1952 \section{Developing \gsiii\ : Adding new features}\label{sec:new-features} 1943 1953 1944 1954 \subsection{Creating new services}\label{sec:new-services} … … 1969 1979 \subsection{New types of collections}\label{sec:new-coll-types} 1970 1980 1971 There are two types of standard Greenstone collections: collections built with the Greenstone 3 building system, and collections that are imported from Greenstone 2. There are many options to collection building but it is conceivable that these options don't meet the needs of all collection builders. Greenstone 3 has an ability to use any type of collection you can come up with, assuming some java code is provided. 1972 1973 1974 There are four levels of customisation that may be needed with new collections: service, collection, interface XSLT, and action levels. We will use the example collections that come with Greenstone to describe these different levels. 1975 1976 Firstly, new service classes need to be written to provide the functionality to search/browse/whatever the collection. If the services have similar interfaces and functionality to the standard services, this may be all that is needed. 
For example, the Greenstone 2 MGPP collections were the first to be served in Greenstone 3. When we came to do Greenstone 2 MG collections, all we had to do was write some new service classes that interacted with MG instead of MGPP. Because these collections used the same type of services, this was all we had to do. The format of the configuration files was similar, they just specified MG serviceRack classes rather than MGPP ones. 1981 There are two types of standard \gs\ collections: collections built with the \gsiii\ building system, and collections that are imported from \gsii\ . There are many options to collection building but it is conceivable that these options don't meet the needs of all collection builders. \gsiii\ has an ability to use any type of collection you can come up with, assuming some java code is provided. 1982 1983 1984 There are four levels of customisation that may be needed with new collections: service, collection, interface XSLT, and action levels. We will use the example collections that come with \gs\ to describe these different levels. 1985 1986 Firstly, new service classes need to be written to provide the functionality to search/browse/whatever the collection. If the services have similar interfaces and functionality to the standard services, this may be all that is needed. For example, the \gsii\ MGPP collections were the first to be served in \gsiii\ . When we came to do \gsii\ MG collections, all we had to do was write some new service classes that interacted with MG instead of MGPP. Because these collections used the same type of services, this was all we had to do. The format of the configuration files was similar, they just specified MG serviceRack classes rather than MGPP ones. 1977 1987 1978 1988 The nzmaps collection used the same level of customisation, just implementing new services and fitting all the extra display elements into the standard query/display framework using javascript. 
… … 1980 1990 The gberg collection, however, was done quite differently to the standard collections. New services were provided to search the database (built with Lucene) and to provide the documents and parts of documents (using XSLT to transform the raw XML files). The collectionConfig file had some extra information in it: a list of the documents in the collection along with their Titles. Because the standard collection class has no notion of document lists, a new class was created (org.greenstone.gsdl3.collection.XMLCollection). This class is basically the same as a standard collection class except that it looks for and stores in memory the documentList from the collectionConfig file. 1981 1991 1982 To tell Greenstone to load up a different type of collection class, we use another configuration file: etc/collectionInit.xml. This specifies the name of the collection class to use. 1992 To tell \gs\ to load up a different type of collection class, we use another configuration file: etc/collectionInit.xml. This specifies the name of the collection class to use. 1983 1993 Currently, this is all that is specified in that file, but you may want to add parameters for the class etc. 1984 1994 1985 1995 \gst{<collectionInit class="XMLCollection"/>} 1986 1996 1987 The display for the collection is also quite different. The home page for the collection displays the list of documents. To achieve this, the describe response from the collection had to include the list, and a new XSLT was written for the collection that displayed this. Collection XSLT should be put in the transform directory of the collection\footnote{These are currently only used when running greenstone in a non-distributed fashion, but it will be added in properly at some stage}. 1988 1989 Document display is significantly different to standard greenstone. There are two modes of display: table of contents mode, and content mode. 
Clicking on a document link from the collection home page takes the user to the table of contents for the collection. Clicking on one of the sections in the table of contents takes them to a display of that section. To facilitate this, not only did we need new XSLT files, we also needed a new action. XMLDocumentAction was created, which used two subactions, toc and text, for the different modes of display. 1997 The display for the collection is also quite different. The home page for the collection displays the list of documents. To achieve this, the describe response from the collection had to include the list, and a new XSLT was written for the collection that displayed this. Collection XSLT should be put in the transform directory of the collection\footnote{These are currently only used when running \gs\ in a non-distributed fashion, but it will be added in properly at some stage}. 1998 1999 Document display is significantly different to standard \gs\ . There are two modes of display: table of contents mode, and content mode. Clicking on a document link from the collection home page takes the user to the table of contents for the collection. Clicking on one of the sections in the table of contents takes them to a display of that section. To facilitate this, not only did we need new XSLT files, we also needed a new action. XMLDocumentAction was created, which used two subactions, toc and text, for the different modes of display. 1990 2000 1991 2001 The Receptionist was told about this new action by the addition of the following to the interfaceConfig.xml file: … … 2029 2039 \end{verbatim}\end{gsc} 2030 2040 2031 Instead of displaying an icon and the Title, it displays the Title of the section and the title of the document. Both of these are linked to the document: the section title to the content of that section, the document title to the table of contents for the document. 
Because these require non-standard arguments to the library, these parts of the template are written in XSLT, not greenstone format language. As is shown here, it is perfectly feasible to write a format statement that includes XSLT mixed in with greenstone format elements. 2032 2033 The document display uses CSS to format the output---these are kept in the collection and specified in the collection's XSLT files. The documents also specify DTD files. Due to the way we read in the XML files, Tomcat sometimes has trouble locating the DTDs. One option is to make all the links absolute links to files in the collection folder; the other option is to put them in Greenstone's DTD folder gsdl3/resources/dtd. 2041 Instead of displaying an icon and the Title, it displays the Title of the section and the title of the document. Both of these are linked to the document: the section title to the content of that section, the document title to the table of contents for the document. Because these require non-standard arguments to the library, these parts of the template are written in XSLT, not \gs\ format language. As is shown here, it is perfectly feasible to write a format statement that includes XSLT mixed in with \gs\ format elements. 2042 2043 The document display uses CSS to format the output---these are kept in the collection and specified in the collection's XSLT files. The documents also specify DTD files. Due to the way we read in the XML files, Tomcat sometimes has trouble locating the DTDs. One option is to make all the links absolute links to files in the collection folder; the other option is to put them in \gs\ 's DTD folder gsdl3/resources/dtd. 2034 2044 2035 2045 \subsection{The NZDL mirror site} 2036 2046 2037 The library seen at \gst{http://www.greenstone.org/greenstone3/nzdl} is like a mirror to \gst{http://www.nzdl.org}---it aims to present the same collections, in the same way but using Greenstone 3 instead of Greenstone 2. It uses a new site and a new interface. 
The web.xml file had a new servlet entry in it to specify the combination of nzdl site and interface. 2038 2039 The site was created by making a directory called nzdl in the sites folder. A siteConfig file was created. Because it's running on Linux, we were able to link to all the collections in the old greenstone installation. The convert\_coll\_from\_gs2.pl script was run over all the collections to produce the new XML configuration files. 2047 The library seen at \gst{http://www.greenstone.org/greenstone3/nzdl} is like a mirror to \gst{http://www.nzdl.org}---it aims to present the same collections, in the same way but using \gsiii\ instead of \gsii\ . It uses a new site and a new interface. The web.xml file had a new servlet entry in it to specify the combination of nzdl site and interface. 2048 2049 The site was created by making a directory called nzdl in the sites folder. A siteConfig file was created. Because it's running on Linux, we were able to link to all the collections in the old \gs\ installation. The convert\_coll\_from\_gs2.pl script was run over all the collections to produce the new XML configuration files. 2040 2050 2041 2051 A new interface, also called nzdl, was created in the interfaces directory. 2042 2052 In many cases, creating a new interface just requires the new images and XSLT to be added to the new directory (see Sections~\ref{sec:sites-and-ints} and \ref{sec:interface-customise}). This setup also required a bit more customisation. 2043 2053 2044 The standard Greenstone navigation bar lists all the services available for the collection. In Greenstone 2, the navigation bar provided the search option, and the different classifiers. This is not service specific, but hard coded to the search and classifiers. The XSLT that produced the navigation bar needed to be altered to produce this. But also, a new Receptionist was needed. 2054 The standard \gs\ navigation bar lists all the services available for the collection. 
In \gsii\ , the navigation bar provided the search option, and the different classifiers. This is not service specific, but hard coded to the search and classifiers. The XSLT that produced the navigation bar needed to be altered to produce this. But also, a new Receptionist was needed. 2045 2055 The standard receptionist (DefaultReceptionist) gathers a little bit of extra info for each page of XML before transforming it: this is the list of services for the collection and their display information, allowing the services to be listed along the navigation bar. This is information that is needed by every page (except for the library home page) and therefore is obtained by the receptionist instead of by each action. The nzdl interface needed a bit more information than this: for the ClassifierBrowse service, if there was one, the list of classifiers and their display elements must be obtained. So a new Receptionist was written that inherited from DefaultReceptionist, and added this new info into the page. 2046 2056 … … 2049 2059 2050 2060 \newpage 2051 \section{Distributed Greenstone}\label{sec:distributed} 2052 2053 Greenstone is designed to run in a distributed fashion. One greenstone installation can talk to several sites on different computers. This requires some sort of communication protocol. Any protocol can be used; however, we have only implemented a simple SOAP protocol. 2061 \section{Distributed \gs\ }\label{sec:distributed} 2062 2063 \gs\ is designed to run in a distributed fashion. One \gs\ installation can talk to several sites on different computers. This requires some sort of communication protocol. Any protocol can be used; however, we have only implemented a simple SOAP protocol. 2054 2064 2055 2065 more explanation.. … … 2063 2073 2064 2074 We have used Apache SOAP for Java. This is run as a servlet in Tomcat. 2065 If you have obtained Greenstone through CVS, you will need to install soap separately, described in Appendix~\ref{app:soap-cvs}. 
Debugging soap is described in Appendix~\ref{app:soap-debug}. 2075 If you have obtained \gs\ through CVS, you will need to install soap separately, described in Appendix~\ref{app:soap-cvs}. Debugging soap is described in Appendix~\ref{app:soap-debug}. 2066 2076 2067 2077 \subsection{Serving a site using soap} … … 2071 2081 2072 2082 \newpage 2073 \section{Using Greenstone 3 from CVS}\label{app:cvs} 2083 \section{Using \gsiii\ from CVS}\label{app:cvs} 2074 2084 2075 2085 *** need to make sure building stuff is in here *** 2076 2086 2077 Greenstone 3 is also available via CVS. You can download the latest version of the code. This is not guaranteed to be stable, in fact it is likely to be unstable. The advantage of using CVS is that you can update the code and get the latest fixes. 2087 \gsiii\ is also available via CVS. You can download the latest version of the code. This is not guaranteed to be stable, in fact it is likely to be unstable. The advantage of using CVS is that you can update the code and get the latest fixes. 2078 2088 2079 2089 Note that you will need the Java 2 SDK, version 1.4.0 or higher. 2080 2090 2081 To check out the greenstone code, use: 2091 To check out the \gs\ code, use: 2082 2092 2083 2093 \begin{quote}\begin{gsc}\begin{verbatim} … … 2092 2102 \subsection{Linux install} 2093 2103 2094 An install.sh script is provided to compile and install Greenstone3. What you need to do is: 2104 An install.sh script is provided to compile and install \gsiii\ . What you need to do is: 2095 2105 2096 2106 \begin{quote}\begin{gsc} … … 2134 2144 Run gs3-finalise.bat\\ 2135 2145 2136 To run Greenstone, run gs3-launch.bat. This will start the Tomcat server in a new DOS window (stop it by closing the window), and open a browser window showing the Greenstone 3 homepage. 2146 To run \gs\ , run gs3-launch.bat. This will start the Tomcat server in a new DOS window (stop it by closing the window), and open a browser window showing the \gsiii\ homepage. 
 2147 2148 \subsection{Creating a distribution} 2149 2150 The installation scripts have been set up in such a way that it is easy to create different distribution types (for Linux). To create a standard binary distribution, carry out the following steps: 2151 2152 \begin{gsc}\begin{verbatim} 2153 cvs co gsdl3 2154 cd gsdl3 2155 source gs3-setup.sh 2156 ./gs3-prepare.sh 2157 ./gs3-configure.sh 2158 ./gs3-compile.sh 2159 ./gs3-for-distribution.sh 2160 2161 mv Header ../ 2162 cd ../ 2163 tar czvf gsdl3.tgz gsdl3/ 2164 cat Header gsdl3.tgz > gsdl3-x.xx-unix.sh 2165 \end{verbatim}\end{gsc} 2166 2167 Note that gs3-for-distribution.sh removes some files that are not needed for the distribution, including all the CVS directories. Once you have run this, you will no longer be able to update your gsdl3 code via cvs. 2168 2169 To create a source distribution, you can do: 2170 \begin{gsc}\begin{verbatim} 2171 cvs co gsdl3 2172 cd gsdl3 2173 source gs3-setup.sh 2174 ./gs3-prepare.sh 2175 <delete unnecessary files> 2176 cd ../ 2177 tar czvf gsdl3-x.xx-src.tgz gsdl3/ 2178 \end{verbatim}\end{gsc} 2179 2180 Some of the gs3-for-distribution script will need to be run (at the stage of delete unnecessary files), and there need to be instructions on what to do when someone downloads the source distro. 2181 2182 I think it would be: 2183 \begin{gsc}\begin{verbatim} 2184 tar xzvf gsdl3-x.xx-src.tgz 2185 cd gsdl3 2186 source gs3-setup.sh 2187 ./gs3-configure.sh 2188 ./gs3-compile.sh 2189 ./gs3-finalise.sh 2190 \end{verbatim}\end{gsc} 2191 2137 2192 2138 2193 \newpage 2139 2194 \section{Tomcat}\label{app:tomcat} 2140 2195 2141 Tomcat is a servlet container. It is used to serve a Greenstone site using a servlet. 2142 2143 The file \gst{\gsdlhome/comms/jakarta/tomcat/conf/server.xml} is the Tomcat configuration file. The installation process adds a context for Greenstone3 servlets (\gst{\gsdlhome/web})---this tells Tomcat where to find the web.xml file, and what URL (\gst{/gsdl3}) to give it. 
Anything inside the context directory is accessible via Tomcat\footnote{can we use .htaccess files to restrict access??}. For example, the index.html file that lives in \gst{\gsdlhome/web} can be accessed through the URL \gst{localhost:8080/gsdl3/index.html}. The demo collection's images can be accessed through \\ 2196 Tomcat is a servlet container. It is used to serve a \gs\ site using a servlet. 2197 2198 The file \gst{\gsdlhome/comms/jakarta/tomcat/conf/server.xml} is the Tomcat configuration file. The installation process adds a context for \gsiii\ servlets (\gst{\gsdlhome/web})---this tells Tomcat where to find the web.xml file, and what URL (\gst{/gsdl3}) to give it. Anything inside the context directory is accessible via Tomcat\footnote{can we use .htaccess files to restrict access??}. For example, the index.html file that lives in \gst{\gsdlhome/web} can be accessed through the URL \gst{localhost:8080/gsdl3/index.html}. The demo collection's images can be accessed through \\ 2144 2199 \gst{localhost:8080/gsdl3/sites/localsite/collect/demo/images/}. 2145 2200 … … 2188 2243 \end{gsc}\end{quote} 2189 2244 2190 In our example, the greenstone 3 servlet can be accessed at \gst{http://www.greenstone.org/greenstone3/library}, instead of at \gst{http://puka.cs.waikato.ac.nz:8080/gsdl3/library}, which is not publicly accessible. 2245 In our example, the \gsiii\ servlet can be accessed at \gst{http://www.greenstone.org/greenstone3/library}, instead of at \gst{http://puka.cs.waikato.ac.nz:8080/gsdl3/library}, which is not publicly accessible. 
 2191 2246 2192 2247 \subsection{Running Tomcat behind a proxy} … … 2199 2254 \subsection{Setting up SOAP from CVS}\label{app:soap-cvs} 2200 2255 2201 If you have obtained greenstone through CVS, you will need to install the SOAP stuff by running: 2256 If you have obtained \gs\ through CVS, you will need to install the SOAP stuff by running: 2202 2257 2203 2258 \begin{quote}\begin{gsc} … … 2243 2298 \end{quote} 2244 2299 2245 8070 is the port that TcpTunnelGui listens on, and 8080 is the port that it sends the messages on to---the port that Tomcat is using. You need to modify Greenstone to talk to port 8070 when it wants to talk to Tomcat, so that the messages go through TcpTunnelGui. This is specified in the \gst{<site>} element of the soapsite site configuration file (\gst{\gsdlhome/web/sites/soapsite/siteConfig.xml}). 2300 8070 is the port that TcpTunnelGui listens on, and 8080 is the port that it sends the messages on to---the port that Tomcat is using. You need to modify \gs\ to talk to port 8070 when it wants to talk to Tomcat, so that the messages go through TcpTunnelGui. This is specified in the \gst{<site>} element of the soapsite site configuration file (\gst{\gsdlhome/web/sites/soapsite/siteConfig.xml}). 
 2246 2301 \begin{quote}\begin{gsc}\begin{verbatim} 2247 2302 <site name="org.greenstone.localsite" … … 2255 2310 2256 2311 \newpage 2257 \section{Format statements: Greenstone 2 vs Greenstone 3}\label{app:format} 2258 The following table shows the Greenstone 2 format elements, and their equivalents in Greenstone 3 2259 \begin{table} 2260 \caption{Greenstone 3 equivalents of Greenstone 2 format statements} 2312 \section{Format statements: \gsii\ vs \gsiii\ }\label{app:format} 2313 The following table shows the \gsii\ format elements, and their equivalents in \gsiii\ 2314 \begin{table}[h] 2315 \caption{\gsiii\ equivalents of \gsii\ format statements} 2261 2316 {\footnotesize 2262 2317 \begin{tabular}{ll} 2263 2318 \hline 2264 \bf Greenstone 2 & \bf Greenstone 3\\ 2319 \bf \gsii\ & \bf \gsiii\ \\ 2265 2320 \hline 2266 2321 \gst{[Text]} & \gst{<gsf:text/>} \\