Changeset 7826
- Timestamp:
- 2004-07-29T13:32:35+12:00
- Location:
- trunk/gsdl3/docs/manual
- Files:
-
- 2 edited
trunk/gsdl3/docs/manual/manual.tex
r7635 r7826 462 462 463 463 Once the build process is complete, the building directory should be renamed to index (after deleting or renaming the existing index directory, if any), and Tomcat prompted to reload the collection---either by restarting the server, or by sending an activate collection command to the library servlet. 464 465 Summary: 464 466 465 467 [TODO: need to describe namespaces somewhere? ] … … 1011 1013 \subsection{Overview of modules??} 1012 1014 1013 A \gsiii\ 'library' system consists of many components: MessageRouter, Receptionist, Actions, Collections, ServiceRacks etc. Figure~\ref{fig:local} shows how they fit together in a stand-alone system. There is a one-to-one correspondence between modules and Java classes, with the exception of services: for coding and/or run-time efficiency reasons, several Service modules may be grouped together into one ServiceRack class. 1015 A \gsiii\ 'library' system consists of many components: MessageRouter, Receptionist, Actions, Collections, ServiceRacks etc. Figure~\ref{fig:local} shows how they fit together in a stand-alone system. The top left part is concerned with displaying the data, while the bottom right part is the collection data serving part. The two sides communicate through the MessageRouter. There is a one-to-one correspondence between modules and Java classes, with the exception of services: for coding and/or run-time efficiency reasons, several Service modules may be grouped together into one ServiceRack class. 1014 1016 1015 1017 \begin{figure}[t] 1016 1018 \centering 1017 \includegraphics[width=4in]{local} %5.8 1019 \includegraphics[width=4in]{newlocal} %5.8 1018 1020 \caption{A simple stand-alone site.} 1019 1021 \label{fig:local} … … 1023 1025 {\em MessageRouter}: this is the central module for a site. It controls the site, loading up all the collections, clusters, communicators needed. All messages pass through the MessageRouter.
Communication between remote sites is always done between MessageRouters, one for each site. 1024 1026 1025 {\em Collection and ServiceCluster}: these are very similar. They both provide some metadata about the collection/cluster, and a list of services. The services are provided by ServiceRack objects that the collection/cluster loads up. A Collection is a specific type of ServiceCluster. A ServiceCluster groups services that are related conceptually, e.g. all the building services may be part of a cluster. What is part of a cluster is specified by the site configuration file. A Collection's services are grouped by the fact that they all operate on some common data---the documents in the collection. 1027 {\em Collection and ServiceCluster}: these are very similar, and group a set of services into a conceptual group. They both provide some metadata about the collection/cluster, and a list of services. The services are provided by ServiceRack objects that the collection/cluster loads up. A Collection is a specific type of ServiceCluster. A ServiceCluster groups services that are related conceptually, e.g. all the building services may be part of a cluster. What is part of a cluster is specified by the site configuration file. A Collection's services are grouped by the fact that they all operate on some common data---the documents in the collection. 1026 1028 Functionally Collection and ServiceCluster are very similar, but conceptually, and to the user, they are quite different. 1027 1029 1028 {\em Service}: these provide the core functionality of the system e.g. searching, retrieving documents, building collections etc. One or more may be grouped into a single Java class (ServiceRack) for code reuse, or to avoid instantiating the same objects several times. For example, MGPP searching services all need to have the index loaded into memory. Services provide the core functionality for the system, e.g.
searching, retrieving documents, building collections etc.1030 {\em Service}: these provide the core functionality of the system e.g. searching, retrieving documents, building collections etc. One or more may be grouped into a single Java class (ServiceRack) for code reuse, or to avoid instantiating the same objects several times. For example, MGPP searching services all need to have the index loaded into memory. 1029 1031 1030 1032 {\em Communicator/Server}: these facilitate communication between remote modules. For example, if you want MR1 to talk to MR2, you need a Communicator-Server pair. The Server sits on top of MR2, and MR1 talks to the Communicator. Each communication type needs a new pair. So far we have only been using SOAP, so we have a SOAPCommunicator and a SOAPServer. 1031 1033 1032 {\em Receptionist}: this is the point of contact for the 'front end'. Its core functionality involves routing requests to the Actions, but it may do more than that. For example, a Receptionist may: modify the request in some way before sending it to the appropriate Action; add some data to the page responses that is common to all pages; transform the response into another form using XSLT for example. There is a hierarchy of different Receptionist types, which is described in Section~\ref{sec:recepts}.1034 {\em Receptionist}: this is the point of contact for the 'front end'. Its core functionality involves routing requests to the Actions, but it may do more than that. For example, a Receptionist may: modify the request in some way before sending it to the appropriate Action; add some data to the page responses that is common to all pages; transform the response into another form using XSLT. There is a hierarchy of different Receptionist types, which is described in Section~\ref{sec:recepts}. 1033 1035 1034 1036 {\em Actions}: these do the job of creating the 'pages'. 
There is a different action for each type of page, for example PageAction handles semi-static pages, QueryAction handles queries, DocumentAction displays documents. They know a little bit about specific service types. Based on the 'CGI' arguments passed in to them, they construct requests for the system, and put together the responses into data for the page. This data is returned to the Receptionist, which may transform it to HTML. The various actions are described in more detail in Section~\ref{sec:pagegen}. … … 1052 1054 If the Receptionist is a TransformingReceptionist, a mapping between shortnames and XSLT file names is also created. 1053 1055 1054 The MessageRouter reads in its site configuration file \gst{siteConfig.xml} (see Section~\ref{sec:siteconfig}). It creates a module map that maps names to objects. This is used for routing the messages. It also keeps small chunks of XML---serviceList, collectionList, clusterList and siteList. These are what get returned in response to a describe request (see Section~\ref{sec:describe}.). 1056 The MessageRouter reads in its site configuration file \gst{siteConfig.xml} (see Section~\ref{sec:siteconfig}). It creates a module map that maps names to objects. This is used for routing the messages. It also keeps small chunks of XML---serviceList, collectionList, clusterList and siteList. These are part of what get returned in response to a describe request (see Section~\ref{sec:describe}.). 1057 1055 1058 Each ServiceRack specified in the configuration file is created, then queried for its list of services. Each service name is added to the map, pointing to the ServiceRack object. Each service is also added to the serviceList. After this stage, ServiceRacks are transparent to the system, and each service is treated as a separate module. 1059 1056 1060 ServiceClusters are created and passed the \gst{<serviceCluster>} element for configuration. They are added to the map as is, with the cluster name as a key. 
A serviceCluster is also added to the serviceClusterList. 1057 For each site specified, the MessageRouter creates an appropriate type of Communicator object. Then it tries to get the site description. If the server for the remote site is up and running, this should be successful. The site will be added to the mapping with its site name as a key. The site's collections, services and clusters will also be added into the static xml lists. If the server for the remote site is not running, the site will not be included in the siteList or module map. To try again to access the site, either Tomcat must be restarted, or a run-time reconfigure-sites commands must be sent (see Section~\ref{sec:runtime-config}). 1058 1059 The MessageRouter also looks inside the site's \gst{collect} directory, and loads up a Collection object for each valid collection found. 1060 1061 1062 For each site specified, the MessageRouter creates an appropriate type of Communicator object. Then it tries to get the site description. If the server for the remote site is up and running, this should be successful. The site will be added to the mapping with its site name as a key. The site's collections, services and clusters will also be added into the static xml lists. If the server for the remote site is not running, the site will not be included in the siteList or module map. To try again to access the site, either Tomcat must be restarted, or a run-time reconfigure-site command must be sent (see Section~\ref{sec:runtime-config}). 1063 1064 The MessageRouter also looks inside the site's \gst{collect} directory, and loads up a Collection object for each valid collection found. If a \gst{collectionInit.xml} file is present, a subclass of Collection may be used. 
1061 1065 The Collection object reads its \gst{buildConfig.xml} and \gst{collectionConfig.xml} 1062 1066 files, determines the metadata, and loads ServiceRack classes based on the … … 1067 1071 1068 1072 There are two types of messages used by the system: external and internal messages. All messages have an enclosing \gst{<message>} element, which contains either one or more requests, or one or more responses. In the following descriptions, the message element is not shown, but is assumed to be present. 1069 Action in \gsiii\ is originated by a request coming in from the outside. In the standard web-based \gs \ , this comes from a servlet into the receptionist. This ``external'' type request is a request for a page of data, and contains a representation of the CGI style arguments. A page of XML is returned, which can be in HTML format or other depending on the output parameter tothe request.1070 1071 Messages inside the system (``internal'' messages) all follow the same basic format: message elements contain multiple request elements, or multiple response elements. Messaging is all synchronous. The same number of responses as requests will be returned. Currently all requests are ind ividual, so any requests can be combined into the same message, and they will be answered separately, with their responses being sent back in a single message.1072 1073 When a page request comes in to the Receptionist, it looks at the action attribute to determine which action to send it to. The response is returned from the action. The page that the receptionist returns contains the original request, the response from the action and other info as needed (depends on the type of Receptionist). The data may be transformed in some way --- for the servlet \gs\ we transform using XSLT to generate html pages which get returned to the servlet.1073 Action in \gsiii\ is originated by a request coming in from the outside. 
In the standard web-based \gs, this comes from a servlet and is passed into the Receptionist. This ``external'' type request is a request for a page of data, and contains a representation of the CGI style arguments. A page of XML is returned, which can be in HTML format or other depending on the output parameter of the request. 1074 1075 Messages inside the system (``internal'' messages) all follow the same basic format: message elements contain multiple request elements, or multiple response elements. Messaging is all synchronous. The same number of responses as requests will be returned. Currently all requests are independent, so any requests can be combined into the same message, and they will be answered separately, with their responses being sent back in a single message. 1076 1077 When a page request (external request) comes in to the Receptionist, it looks at the action attribute and passes the request to the appropriate Action module. The Action will fire one or more internal requests to the MessageRouter, based on the arguments. The data is gathered into a response, which is returned to the Receptionist. The page that the receptionist returns contains the original request, the response from the action and other info as needed (depends on the type of Receptionist). The data may be transformed in some way --- for the \gs\ servlet we transform using XSLT to generate html pages. 1074 1078 1075 1079 Actions send internal style messages to the MessageRouter. Some can be answered by it, others are passed on to collections, and maybe on to services. Internal requests are for simple actions, such as search, retrieve metadata, retrieve document text 1076 There are different request types: describe, process, system... 1077 1078 The message formats for each request type, and the response formats for each module are described in the following section. 
1079 1080 \subsection{an attempt at an API: message formats} 1081 1082 \subsubsection{external$->$action}\label{sec:page-requests} 1083 1084 request: 1085 These are the special 'external'-style messages. Requests originate from outside \gs\ , for example from a servlet, or java application. They are requests for a 'page' of data---for example, the home page for a site; the query page for a collection; the text of a document. They contain, in XML, a list of arguments specifying what type of page is required. If the external context is a servlet, the arguments represent the 'CGI' arguments in a \gs\ URL. The two main arguments are \gst{a} (action) and \gst{sa} (subaction). All other arguments are encoded as parameters. 1086 1087 Here are some examples of requests\footnote{In a servlet context, these correspond to the arguments \gst{a=p\&sa=about\&c=demo\&l=fr}, and \gst{a=q\&l=en\&s=TextQuery\&c=demo\&rt=r\&ca=0\&st=1\&m=10\&q=snail}.}: 1088 1089 \begin{quote}\begin{gsc}\begin{verbatim} 1090 <request type='page' action='p' subaction='about' 1091 lang='fr' output='html'> 1092 <paramList> 1093 <param name='c' value='demo'/> 1094 </paramList> 1095 </request> 1096 \end{verbatim}\end{gsc}\end{quote} 1097 1098 \begin{quote}\begin{gsc}\begin{verbatim} 1099 <request type='page' action='q' lang='en' output='html'> 1100 <paramList> 1101 <param name='s' value='TextQuery'/> 1102 <param name='c' value='demo'/> 1103 <param name='rt' value='r'/> 1104 <!-- the rest are the service specific params --> 1105 <param name='ca' value='0'/> <!-- casefold --> 1106 <param name='st' value='1'/> <!-- stem --> 1107 <param name='m' value='10'/> <!-- maxdocs --> 1108 <param name='q' value='snail'/> <!-- query string --> 1109 </paramList> 1110 </request> 1111 \end{verbatim}\end{gsc}\end{quote} 1112 1113 The Receptionist routes the message to the appropriate Action (determined by looking up its shortname$->$Action object map). 
The actions determine what information is needed from the server and retrieves it, making one or more internal requests to the MessageRouter. This information is gathered together into a single response, and returned to the Receptionist. The Receptionist may process the result further, depending on what type of Receptionist is it. 1114 1115 1116 \begin{table} 1117 {\footnotesize 1118 \begin{tabular}{lll} 1119 \hline 1120 \bf Argument & \bf Meaning &\bf Typical values \\ 1121 \hline 1122 a & action & a (applet), q (query), b (browse), p (page), pr (process) \\ 1123 & & s (system)\\ 1124 sa & subaction & home, about (page action)\\ 1125 c & collection or & demo, build \\ 1126 & service cluster \\ 1127 s & service name & TextQuery, ImportCollection \\ 1128 rt & request type & d (display), r (request), s (status) \\ 1129 ro & response only & 0 or 1 - if set to one, the request is carried out \\ 1130 & & but no processing of the results is done \\ 1131 & & currently only used in process actions \\ 1132 o & output type & XML, html, WML \\ 1133 l & language & en, fr, zh ...\\ 1134 d & document id & HASHxxx \\ 1135 r & resource id & ???\\ 1136 pid & process handle & an integer identifying a particular process request \\ 1137 \hline 1138 \end{tabular}} 1139 \caption{Generic arguments that can appear in a \gs\ URL} 1140 \label{tab:args} 1141 \end{table} 1080 There are different internal request types: describe, process, system, format, status. Process requests do the actual work of the system, while the other types get auxiliary information. The format of the requests and responses for each internal request type are described in the following sections. External style requests, and their page responses are described in the Section about page generation (Section~\ref{sec:pagegen}). 
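The message envelope described above (one enclosing \gst{<message>} element holding several independent requests, answered by the same number of responses) can be sketched as follows. This is a minimal illustration in Python rather than the system's own Java code; the helper names are hypothetical, and addressing a collection with \gst{to='demo'} is an assumption based on the \gst{to} attribute seen in the examples.

```python
import xml.etree.ElementTree as ET


def build_message(requests):
    """Wrap a list of request attribute dicts in a single <message> element."""
    message = ET.Element("message")
    for attrs in requests:
        ET.SubElement(message, "request", attrs)
    return message


def responses_match(request_message, response_message):
    """Check the rule from the text: one response comes back per request."""
    return (len(request_message.findall("request"))
            == len(response_message.findall("response")))


# Two independent describe requests combined into one message.
# ASSUMPTION: to='' addresses the MessageRouter, to='demo' a collection.
msg = build_message([
    {"lang": "en", "type": "describe", "to": ""},
    {"lang": "en", "type": "describe", "to": "demo"},
])
print(ET.tostring(msg, encoding="unicode"))
```

Because the requests are independent, a caller could batch unrelated requests this way and pair each response with its request by position.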
1142 1081 1143 1082 \subsection{'describe'-type messages}\label{sec:describe} … … 1176 1115 1177 1116 It is possible to ask just for a specific part of the information provided by a 1178 describe request, rather than the whole thing. 1179 messages get the \gst{collectionList} and the \gst{siteList} respectively: 1117 describe request, rather than the whole thing. For example, these two 1118 messages get the \gst{collectionList} and the \gst{siteList} respectively: 1180 1119 \begin{quote}\begin{gsc}\begin{verbatim} 1181 1120 <request lang='en' type='describe' to=''> … … 1192 1131 \end{verbatim}\end{gsc}\end{quote} 1193 1132 1194 When a collection or service cluster is asked to describe itself, what is returned is a list of metadata, some display elements, and a list of services. For example, here is such 1195 a message, along with a sample response. 1133 Subset options for the MessageRouter include \gst{collectionList}, \gst{serviceClusterList}, \gst{serviceList}, \gst{siteList}. 1134 1135 When a collection or service cluster is asked to describe itself, what is returned is a list of metadata, some display elements, and a list of services. For example, here is such a message, along with a sample response. 1196 1136 1197 1137 \begin{quote}\begin{gsc}\begin{verbatim} … … 1231 1171 \end{verbatim}\end{gsc}\end{quote} 1232 1172 1233 The subset parameter can also be used in a describe request to a collection, to retrieve just the \gst{metadataList} or \gst{serviceList}.1173 Subset options for a collection or serviceCluster include \gst{metadataList}, \gst{serviceList}, and \gst{displayItemList}. 1234 1174 1235 1175 This collection provides many typical services. Notice how this response lists the services available, while the collection configuration file for this collection (Figure~\ref{fig:collconfig}) described serviceRacks. Once the service racks have been configured, they become transparent in the system, and only services are referred to. 
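The subset options listed above can be checked before a describe request is sent. The sketch below is illustrative Python, not part of the system; the function and table names are hypothetical. How the subset is actually encoded in the request is elided in this excerpt, so passing it as an ordinary \gst{<param>} named \gst{subset} is an assumption.

```python
import xml.etree.ElementTree as ET

# Subset names as listed in the text, keyed by the kind of module addressed.
SUBSET_OPTIONS = {
    "messageRouter": {"collectionList", "serviceClusterList",
                      "serviceList", "siteList"},
    "collection": {"metadataList", "serviceList", "displayItemList"},
}


def describe_request(to, module_kind, subset=None, lang="en"):
    """Build a describe request, optionally restricted to one subset."""
    if subset is not None and subset not in SUBSET_OPTIONS[module_kind]:
        raise ValueError(f"{subset!r} is not a subset option for {module_kind}")
    request = ET.Element("request", {"lang": lang, "type": "describe", "to": to})
    if subset is not None:
        # ASSUMPTION: the subset travels as a <param name='subset' .../>.
        param_list = ET.SubElement(request, "paramList")
        ET.SubElement(param_list, "param", {"name": "subset", "value": subset})
    return request


site_list_request = describe_request("", "messageRouter", "siteList")
print(ET.tostring(site_list_request, encoding="unicode"))
```

The \gst{collection} entry is meant to cover service clusters as well, since the text gives them the same subset options.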
… … 1237 1177 1238 1178 A \gst{describe} request sent to a service returns a list of parameters that 1239 the service accepts, some display information, (and in future may describe the content type for the request and response). 1240 1241 Parameters can be in the following formats: 1179 the service accepts and some display information (and in future may describe the content type for the request and response). Subset options for the request include \gst{paramList} and \gst{displayItemList}. 1180 1181 Parameters can be in the following formats: 1242 1182 \begin{quote}\begin{gsc}\begin{verbatim} 1243 1183 <param name='xxx' type='integer|boolean|string|invisible' default='yyy'/> … … 1280 1220 A service description also contains some display information---this includes the name of the service, and the text for the submit button. 1281 1221 1282 Here is a sample describe request to the FieldQuery service of collection mgppdemo, along with its response. The parameters in this example include their display information. Figure~\ref{fig:query-display} gives an example html search form that may be generated from this describe response. 1222 Here is a sample describe request to the FieldQuery service of collection mgppdemo, along with its response. The parameters in this example include their display information. Figure~\ref{fig:query-display} shows an example html search form that may be generated from this describe response. 1283 1223 1284 1224 \begin{quote}\begin{gsc}\begin{verbatim} … … 1394 1334 Note that the library parameter has been left blank. This is because library refers to the current servlet that is running and the name is not necessarily known in advance. So either the applet action or the Receptionist must fill in this parameter before displaying the html.
1395 1335 1396 \subsubsection{'system'-type messages}\label{sec:system} 1397 1398 ``System'' requests are used to tell a MessageRouter, Collection or ServiceCluster to update its cached information and activate or deactivate other modules. For example, the MessageRouter has a set of Collection modules that it can talk to. It also holds some XML information about those collections---this is returned when a request for a collection list comes in. If a collection is deleted or modified, or a new one created, this information may need to change, and the list of available modules may also change. Currently they are initiated by particular CGI parameters (see Section~\ref{sec:runtime-config}). 1336 \subsection{'system'-type messages}\label{sec:system} 1337 1338 ``System'' requests are used to tell a MessageRouter, Collection or ServiceCluster to update its cached information and activate or deactivate other modules. For example, the MessageRouter has a set of Collection modules that it can talk to. It also holds some XML information about those collections---this is returned when a request for a collection list comes in. If a collection is deleted or modified, or a new one created, this information may need to change, and the list of available modules may also change. Currently these requests are initiated by particular CGI requests (see Section~\ref{sec:runtime-config}). 1399 1339 1400 1340 The basic format of a system request is as follows: … … 1417 1357 The third request is to activate collection demo. This could be a new collection, or a reactivation of an old one. If a collection module already exists, it will be deleted, and a new one loaded. The final request deactivates the site site1---this removes the site from the siteList and module map, and also removes any of that site's collections/services from the static lists.
1418 1358 1419 1420 A response just contains a status message, for example: 1421 \begin{quote}\begin{gsc}\begin{verbatim} 1422 <response from=""> 1423 <status>collectionList reconfigured successfully</status> 1424 </response> 1425 \end{verbatim}\end{gsc}\end{quote} 1426 1427 At some stage, an error or status code should be included. 1359 A response just contains a status message\footnote{TODO: add in error/status codes}, for example: 1360 \begin{quote}\begin{gsc}\begin{verbatim} 1361 <status>MessageRouter reconfigured successfully</status> 1362 <status>Error on reconfiguring collectionList</status> 1363 <status>collection:demo activated</status> 1364 <status>site:site1 deactivated</status> 1365 \end{verbatim}\end{gsc}\end{quote} 1428 1366 1429 1367 System requests are mainly answered by the MessageRouter. However, Collections and ServiceClusters will respond to a subset of these requests. … … 1445 1383 \end{verbatim}\end{gsc}\end{quote} 1446 1384 1447 The actual format statements are described in Section~\ref{sec:formatstmt}. They are templates written directly in XSLT, or in GSF , which stands for Greenstone Format, andis a simple XML representation of the more complicated XSLT templates.1448 GSF 1385 The actual format statements are described in Section~\ref{sec:formatstmt}. They are templates written directly in XSLT, or in GSF (GreenStone Format) which is a simple XML representation of the more complicated XSLT templates. 1386 GSF-style format statements need to be converted to proper XSLT. This is currently done by the Receptionist (but may be moved to an ActionHelper): the format XML is transformed to XSLT using XSLT with the config\_format.xsl stylesheet. 1449 1387 1450 1388 \subsection{'status'-type messages}\label{sec:status} 1451 1389 1452 These are only used with process-type services, which are those where a request is sent to start some type of process (see Section~\ref{sec:process}). 
The initial response states whether the process has successfully started, and whether it's still continuing. If the process is not finished, status requests can be sent repeatedly to the service to poll the status, using the pid to identify the process. Status codes are used to identify the state of a process. The values used at the moment are listed in Table~\ref{tab:status codes}\footnote{A more standard set of codes should probably be used, for example, the HTTP codes}. 1390 These are only used with process-type services, which are those where a request is sent to start some type of process (see Section~\ref{sec:process}). An initial 'process' request to a 'process' service generates a response which states whether the process has successfully started, and whether it's still continuing. If the process is not finished, status requests can be sent repeatedly to the service to poll the status, using the pid to identify the process. Status codes are used to identify the state of a process. The values used at the moment are listed in Table~\ref{tab:status codes}\footnote{A more standard set of codes should probably be used, for example, the HTTP codes}. 1453 1391 1454 1392 \begin{table} … … 1506 1444 \end{verbatim}\end{gsc}\end{quote} 1507 1445 1508 \subsubsection{process messages} 1446 \subsection{'process'-type messages} 1509 1447 1510 1448 Process requests and responses provide the major functionality of the system---these are the ones that do the actual work. The format depends on the service they are for, so I'll describe these by service. … … 1760 1698 \end{verbatim}\end{gsc}\end{quote} 1761 1699 1762 The \gst{code} attribute in the response specifies whether the command has been successfully started, whether it's still going, etc (see Table~\ref{tab:status codes} for a list of currently used codes). The pid attribute specifies a process id number that can be used when querying the status of this process.
The content of the status element is (currently) just the output from the process so far. Status messages, which are described in Section~\ref{sec:status}, are used to find out how the process is going, and whether it has finished or not.1700 The \gst{code} attribute in the response specifies whether the command has been successfully stated, whether its still going, etc (see Table~\ref{tab:status codes} for a list of currently used codes). The pid attribute specifies a process id number that can be used when querying the status of this process. The content of the status element is (currently) just the output from the process so far. Status messages, which were described in Section~\ref{sec:status}, are used to find out how the process is going, and whether it has finished or not. 1763 1701 1764 1702 \subsubsection{'applet'-type services} … … 1812 1750 1813 1751 Enrich services typically take some text of documents (inside \gst{<nodeContent>} tags) and returns the text marked up in some way. One example of this is the GatePOSTag service: this identifies Dates, Locations, People and Organizations in the text, and annotates the text with the labels. In the following example, the request is for Location and Dates to be identified. 1814 *** TODO **** 1752 1815 1753 \begin{quote}\begin{gsc}\begin{verbatim} 1816 1754 <request lang="en" to="GatePOSTag" type="process"> … … 1837 1775 FOOD AND AGRICULTURE ORGANIZATION OF THE UNITED NATIONS 1838 1776 <annotation type="Location">Rome</annotation> 1839 1777 <annotation type="Date">1986</annotation> 1840 1778 P-69 1841 1779 ISBN 92-5-102397-2 … … 1847 1785 \end{verbatim}\end{gsc}\end{quote} 1848 1786 1849 \subsection{Page generation}\label{sec:pagegen} **** REDO ******** 1850 1851 * talk general first: get data, get format info, transform gsf->xsl. transfrom xml->html 1852 1853 * state saving. the XSLT files assume that arguments are saved somehow. 
This needs to be implemented outside \gs\ proper - we do this in the servlet, using something or other. 1854 1855 URL-style requests are received by the Receptionist. Based on the arguments, a page of data must be returned to the servlet. As described in Section~\ref{sec:page-requests}, the requests are XML representations of \gs\ URLs. One of the arguments is action (a). This tells the Receptionist which Action module to pass the request to. Action modules decode the rest of the CGI-arguments to determine what requests need to be made to the system. 1856 System requests are received by the MessageRouter, which answers them one by one, either itself or by passing them on to the appropriate module. 1857 1858 Once the data needed from the system has been accumulated, it is put into a 'page' of XML. The page is transformed to its output form, currently HTML, via XSLT transformations, and returned to the user. 1787 \subsection{Page generation}\label{sec:pagegen} 1788 1789 A 'page' is some XML or HTML (or other?) data returned in response to an 1790 external 'page'-type request. These requests originate from outside \gs\ , for example from a servlet, or java application, and are received by the Receptionist. As described below in Section~\ref{sec:page-requests}, the requests are XML representations of \gs\ URLs. One of the arguments is action (a). This tells the Receptionist which Action module to pass the request to. 1791 1792 Action modules decode the rest of the arguments to determine what requests need to be made to the system. One or more internal requests may be made to the MessageRouter. A request for format information from the Collection/Service may also be made. The resulting data is gathered together into a single XML response, \gst{<page>}, and returned to the Receptionist. 1793 1794 The page format is described in Section~\ref{sec:page-format}. The XML may be returned as is, or may be modified by the Receptionist. 
The various Receptionists are described in Section~\ref{sec:recepts}. The default receptionist used by a servlet transforms the XML into HTML using XSL stylesheets. Section~\ref{sec:collformat} looks at collection specific formatting, in particular for HTML output. 1795 Sections~\ref{sec:pageaction} to \ref{sec:systemaction} look at the various actions and what kind of data they gather. 1796 1797 \subsubsection{'page'-type requests and their arguments}\label{sec:page-requests} 1798 1799 These are requests for a 'page' of data---for example, the home page for a site; the query page for a collection; the text of a document. They contain, in XML, a list of arguments specifying what type of page is required. If the external context is a servlet, the arguments represent the 'CGI' arguments in a \gs\ URL. The two main arguments are \gst{a} (action) and \gst{sa} (subaction). All other arguments are encoded as parameters. 1800 1801 Here are some examples of requests\footnote{In a servlet context, these correspond to the arguments \gst{a=p\&sa=about\&c=demo\&l=fr}, and \gst{a=q\&l=en\&s=TextQuery\&c=demo\&rt=r\&ca=0\&st=1\&m=10\&q=snail}.}: 1802 1803 \begin{quote}\begin{gsc}\begin{verbatim} 1804 <request type='page' action='p' subaction='about' 1805 lang='fr' output='html'> 1806 <paramList> 1807 <param name='c' value='demo'/> 1808 </paramList> 1809 </request> 1810 \end{verbatim}\end{gsc}\end{quote} 1811 1812 \begin{quote}\begin{gsc}\begin{verbatim} 1813 <request type='page' action='q' lang='en' output='html'> 1814 <paramList> 1815 <param name='s' value='TextQuery'/> 1816 <param name='c' value='demo'/> 1817 <param name='rt' value='r'/> 1818 <!-- the rest are the service specific params --> 1819 <param name='ca' value='0'/> <!-- casefold --> 1820 <param name='st' value='1'/> <!-- stem --> 1821 <param name='m' value='10'/> <!-- maxdocs --> 1822 <param name='q' value='snail'/> <!-- query string --> 1823 </paramList> 1824 </request> 1825 \end{verbatim}\end{gsc}\end{quote} 1826 
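The correspondence between URL arguments and the page requests shown above can be sketched as a small conversion routine. This is illustrative Python, not the servlet's actual code; the function name is hypothetical. The mapping of \gst{l} to \gst{lang} and \gst{o} to \gst{output} is inferred from the two examples and the argument table, and defaulting \gst{output} to \gst{html} when no \gst{o} argument is given is an assumption.

```python
from urllib.parse import parse_qsl
import xml.etree.ElementTree as ET

# Arguments that become attributes of <request>; all others become <param>s.
# ASSUMPTION: l -> lang and o -> output, inferred from the examples above.
ATTRIBUTE_ARGS = {"a": "action", "sa": "subaction", "l": "lang", "o": "output"}


def page_request(query_string, default_output="html"):
    """Turn 'a=p&sa=about&c=demo&l=fr' into a <request type='page'> element."""
    request = ET.Element("request", {"type": "page"})
    param_list = ET.SubElement(request, "paramList")
    for name, value in parse_qsl(query_string, keep_blank_values=True):
        if name in ATTRIBUTE_ARGS:
            request.set(ATTRIBUTE_ARGS[name], value)
        else:
            ET.SubElement(param_list, "param", {"name": name, "value": value})
    if request.get("output") is None:
        # ASSUMPTION: html is used when the URL carries no o argument.
        request.set("output", default_output)
    return request


about = page_request("a=p&sa=about&c=demo&l=fr")
query = page_request("a=q&l=en&s=TextQuery&c=demo&rt=r&ca=0&st=1&m=10&q=snail")
print(ET.tostring(about, encoding="unicode"))
```

Applied to the two argument strings from the footnote, this yields requests with the same attributes and the same parameter order as the two examples.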
1827 There are some standard arguments used in Greenstone, and they are described in Table~\ref{tab:args}. These are used by Receptionists and Actions. The GSParams class specifies all the general basic arguments, and whether they should be saved or not (Some arguments need to be saved during a session, and this needs to be implemented outside \gs\ proper --- currently we do this in the servlet, using servlet session handling). The servlet has an init parameter \gst{params\_class} which specifies which params class to use: GSParams can be subclassed if necessary. The Receptionist and Actions must not have conflicting argument names. 1828 1829 Other arguments are used dynamically and come from the Services. Service arguments must always be saved during a session. Services may be created by different people, and may reside on a different site. There is no guarantee that there is no conflict with argument names between services and actions. Therefore service parameters are namespaced when they are put on the page, whereas interface (receptionist and action) parameters have no namespace. The default namespace is s1 (service1) --- any parameters that are for the service will be prefixed by this. For example, the case parameter for a search will be put in the page as s1.case, and the resulting argument in a search URL will be s1.case. When actions are deciding which parameters need to be sent in a request to a service, they can use the namespace information. 1830 1831 If there are two or more services combined on a page with a single submit button, they will use namespaces s1, s2, s3 etc as needed. The s (service) parameter will end up with a list of services. For example, \gst{s=TextQuery,MusicQuery,} and the order of these determines the mapping order of the namespaces, i.e. s1 will map to TextQuery, s2 to MusicQuery. 
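The namespace mapping described above can be sketched as follows. This is only an illustration under stated assumptions: the class \gst{ServiceNamespaceSketch} and the method \gst{groupByService} are hypothetical names, and stripping the namespace prefix before the parameters are handed to a service is an assumption rather than documented behaviour.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; this class is not part of the Greenstone code base.
final class ServiceNamespaceSketch {

    /**
     * Groups namespaced arguments (s1.case, s2.case, ...) by the service they
     * belong to, using the order of the services listed in the s argument.
     */
    static Map<String, Map<String, String>> groupByService(Map<String, String> args) {
        Map<String, Map<String, String>> byService = new LinkedHashMap<>();
        String services = args.getOrDefault("s", "");
        int index = 0;
        for (String service : services.split(",")) {
            if (service.isEmpty()) {
                continue; // tolerate a trailing comma in the service list
            }
            index++;
            String prefix = "s" + index + "."; // s1. for the first service, s2. for the second
            Map<String, String> params = new LinkedHashMap<>();
            for (Map.Entry<String, String> e : args.entrySet()) {
                if (e.getKey().startsWith(prefix)) {
                    // Assumption: the prefix is removed before sending to the service.
                    params.put(e.getKey().substring(prefix.length()), e.getValue());
                }
            }
            byService.put(service, params);
        }
        return byService;
    }

    public static void main(String[] unused) {
        Map<String, String> args = new LinkedHashMap<>();
        args.put("a", "q");
        args.put("s", "TextQuery,MusicQuery,");
        args.put("s1.case", "1");
        args.put("s2.case", "0");
        System.out.println(groupByService(args));
    }
}
```

Interface arguments such as \gst{a} and \gst{c} carry no namespace, so they are left out of every service's parameter list.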
1832 1833 \begin{table} 1834 {\footnotesize 1835 \begin{tabular}{lll} 1836 \hline 1837 \bf Argument & \bf Meaning &\bf Typical values \\ 1838 \hline 1839 a & action & a (applet), q (query), b (browse), p (page), pr (process) \\ 1840 & & s (system)\\ 1841 sa & subaction & home, about (page action)\\ 1842 c & collection or & demo, build \\ 1843 & service cluster \\ 1844 s & service name & TextQuery, ImportCollection \\ 1845 rt & request type & d (display), r (request), s (status) \\ 1846 ro & response only & 0 or 1 - if set to one, the request is carried out \\ 1847 & & but no processing of the results is done \\ 1848 & & currently only used in process actions \\ 1849 o & output type & XML, html, WML \\ 1850 l & language & en, fr, zh ...\\ 1851 d & document id & HASHxxx \\ 1852 r & resource id & ???\\ 1853 pid & process handle & an integer identifying a particular process request \\ 1854 \hline 1855 \end{tabular}} 1856 \caption{Generic arguments that can appear in a \gs\ URL} 1857 \label{tab:args} 1858 \end{table} 1859 1860 \subsubsection{page format}\label{sec:page-format} 1859 1861 1860 1862 The basic page format is: 1861 1863 \begin{quote}\begin{gsc}\begin{verbatim} 1862 <page >1864 <page lang='en'> 1863 1865 <pageRequest/> 1864 1866 <pageResponse/> … … 1896 1898 NZDLReceptionist: (do we want to talk about this?) This is an example of a custom receptionist. For a look-alike nzdl.org system, even more information is needed for each page, namely the list of classifiers available from the ClassifierBrowse service. 1897 1899 1898 By default, the LibraryServlet uses DefaultReceptionist. However, there is an init-param called receptionist which can be set to make the servlet use a different one. 1899 1900 By default, the LibraryServlet uses DefaultReceptionist. However, there is a servlet init-param called \gst{receptionist} which can be set to make the servlet use a different one. 
1901 1902 \subsubsection{Collection specific formatting}\label{sec:collformat} 1903 get format info, transform gsf->xsl. transform xml->html 1904 1905 config params are passed in to the transformation 1900 1906 \subsubsection{CGI arguments} 1901 1907 1902 The arguments used by the page come from several sources. Receptionist uses a couple, actions use some and services. the receptionist and actions are treated as a whole so must not have conflicting arguments. GSParams class specifies all the general basic arguments, and whether they should be saved or not. servlet has an init parameter params\_class, that specifies which params class to use - if subclass it. actions or receptionist may specify some new ones 1903 1904 services may be created by different people, may be on a different site. cant guarantee no conflict with action params, or even with other services. 1905 so service params are namespaced when they are put on the page. interface (recept and action) params will have no namespace) the default namespace is s1 (service1) - any parameters that are for the service will be prefixed by this. e.g. the case parameter for a search will be put in the page as s1.case. 1906 The actions must now look for all the s1 parameters to send to the service. 1907 1908 if there are two or more services combined on a page with a single submit button, they will use s1, s2, s3 etc as needed. the s parameter (service) will end up with a list e.g. s=TextQuery,MusicQuery, and the order of these determines the mapping order of the namespaces, ie s1 will be TextQuery, s2 MusicQuery. 1909 1910 also talk about saving arguments - save ones that GSParams says to save, and any service ones should always save. 1911 1912 \subsubsection{Page action} 1913 * kind of info pages. other actions are associated with specific services. 1914 * uses describe requests to modules 1915 Depending on the subaction argument, different pages can be generated. 
For the 'home' page, a 'describe' request is sent to the MessageRouter---this returns a list of all the collections, services, serviceClusters and sites known about. For each collection, its metadata is retrieved via a 'describe' request. This metadata is added into the previous result, which is then added into the page. The page is 1916 transformed using \gst{home.xsl}. For the 'about' page, a \gst{describe} request is sent to the module that the about page is about: this may be a collection or a service cluster. This returns a list of metadata 1917 and a list of services, and the result is transformed using \gst{about.xsl}. 1918 1919 1920 \subsubsection{Query action} 1908 1909 \subsubsection{Page action}\label{sec:pageaction} 1910 1911 PageAction is responsible for displaying general information pages, such as the home page of the library, the home page of a collection, or the help and preferences pages. Unlike the other page types, these pages are not associated with specific services. In general, the data comes from describe requests to various modules. 1912 The different pages are requested using the subaction argument. For the 'home' page, a 'describe' request is sent to the MessageRouter---this returns a list of all the collections, services, serviceClusters and sites known about. For each collection, its metadata is retrieved via a 'describe' request. This metadata is added into the previous result, which is then added into the page. For the 'about' page, a \gst{describe} request is sent to the module that the about page is about: this may be a collection or a service cluster. This returns a list of metadata 1913 and a list of services. 1914 1915 1916 \subsubsection{Query action}\label{sec:queryaction} 1921 1917 1922 1918 The basic URL is \gst{a=q\&s=TextQuery\&c=demo\&rt=d/r}. … … 1925 1921 displayed, but should be cached. The description includes a list of the parameters available for the query, such as case/stem, the maximum number of documents to return, etc. 
If the request type (rt) parameter is set to d for display, the action only needs to display the form, and this is the only request to the service. Otherwise, the submit button has been pressed, and a query request to the TextQuery service is sent. This has all the parameters from the URL put into the parameter list. A list of document identifiers 1926 1922 is returned. A followup query is sent to the MetadataRetrieve service of the collection: the content includes the list of 1927 documents, with a request for some of their metadata. Which metadata to retrieve is determined by looking through the XSLT that will be used to transform the page (Formatter object??). The service description and query result are combined into a page of XML, which is 1928 transformed using \gst{basicquery.xsl} to produce the html page. 1929 1930 \subsubsection{Applet action} 1923 documents, with a request for some of their metadata. Which metadata to retrieve is determined by looking through the XSLT that will be used to transform the page. The service description and query result are combined into a page of XML, which is returned to the Receptionist. 1924 1925 \subsubsection{Applet action}\label{sec:appletaction} 1931 1926 1932 1927 There are two types of request to the applet action: \gst{a=a \& rt=d\/} and … … 1935 1930 into the page, and the servlet returns the HTML. 1936 1931 1937 The value \gst{rt=r} signals a request from the applet. The result is returned 1938 directly to the applet code, in XML. The other parameters are sent to the 1939 service untransformed, and the result is passed directly back to the applet. 1940 Applet action can therefore work with any applet whose service understands the 1941 messages. 1942 1943 Here are two examples of requests generated by the Applet action, along with their corresponding responses. 
1944 1945 The first request corresponds to the URL arguments \gst{a=a \& 1946 rt=d \& sn=Phind \& c=mgppdemo\/}, which translate to ``display the Phind 1947 applet for the mgppdemo collection''. 1948 1949 1950 The second request corresponds to the arguments \gst{a=a \& rt=r \& sn=Phind \& c=mgppdemo \& pc=1 \& pptext=health \& pfe=0 \& ple=10 \& pfd=0 \& pld=10 \& pfl=0 \& pll=10}---this 1951 indicates a request to the service itself. The extra arguments (not a, sa, sn, c) are simply copied into the 1952 request as parameters. The response is in a form suitable for the applet, placed inside 1953 \gst{<appletData>} in a standard \gs\ message. AppletAction returns the 1954 contents of appletData to the browser, i.e. to the applet itself. 1955 1932 The value \gst{rt=r} signals a request from the applet. A process request containing all the parameters is sent to the applet service. The result contains an appletData element, which contains a single element - this element is returned 1933 directly to the applet, in XML. No transformation is done. 1934 Because the AppletAction doesn't know or care anything about the applet data, it can work with any applet-service pair. 1956 1935 1957 1936 Note that the applet HTML may need to know the name of the \gst{library} … … 1962 1941 <PARAM NAME='library' VALUE=''/> 1963 1942 \end{verbatim}\end{gsc}\end{quote} 1964 When the Applet action encounters this parameter it inserts the name of the1943 When the AppletAction encounters this parameter it inserts the name of the 1965 1944 current library servlet as its value. 1966 1945 1967 \subsubsection{Document action} 1968 1969 DocumentAction sends a query to the DocumentRetrieve service of the collection requesting the text of the specified document. At this stage no additional information is obtained, but in future stuff like Title and 1970 table of contents would be needed to make the display nicer. 
1971 1972 1973 \subsubsection{System action}\label{sec:system-action} 1946 \subsubsection{Document action}\label{sec:documentaction} 1947 1948 DocumentAction is responsible for displaying a document to the user. The display might involve some metadata and/or text for a document or part of a document. For hierarchical documents, a table of contents may be shown, while for paged documents (those with a single linear list of sections), next and previous page buttons may be shown. These different display types require different information about the document. Depending on the arguments, DocumentAction will send requests to several services: DocumentMetadataRetrieve, DocumentStructureRetrieve and DocumentContentRetrieve. 1949 1950 A basic display, for example, Title and text, involves a metadata request to get the Title, and a content request to get the text. Hierarchical table of contents display requires a structure request. If the entire contents is to be displayed, the parameter \gst{structure=entire} would be sent in the request. Otherwise, parameters \gst{structure=ancestors}, \gst{structure=children} and possibly \gst{structure=siblings} may be used, depending on the position of the current node in the document. These return a hierarchical structure of nodes, containing ancestor nodes, child nodes and sibling nodes, respectively. 1951 For paged display, the structure is not actually needed. A structure request is still sent, but this time it requests some information, rather than the structure itself. The information requested includes the number of siblings and the current position of the current node, or the number of children (if the current node is the root of the document). 1952 1953 Metadata and content may be requested for the current node, or for any nodes in the structure. The metadata and content are added into the appropriate nodes in the structure hierarchy, and this is returned as the page data. 
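For paged display, the sibling information is enough to decide which navigation buttons to show. The following is a minimal sketch of that decision, not the actual DocumentAction or stylesheet logic. The class \gst{PagedNavigationSketch} is hypothetical, and it assumes that positions are counted from 1, that the section count includes the current section, and that the root of a paged document leads on to its first child.

```java
// Illustrative sketch only; this class is not part of the Greenstone code base.
final class PagedNavigationSketch {

    /** A previous-page button makes sense for every section after the first. */
    static boolean hasPrevious(int position) {
        return position > 1; // assumption: positions start at 1
    }

    /** A next-page button makes sense for every section before the last. */
    static boolean hasNext(int position, int sectionCount) {
        return position < sectionCount; // assumption: the count includes this section
    }

    /** At the document root, the next page is the first child, if there is one. */
    static boolean rootHasNext(int childCount) {
        return childCount > 0;
    }

    public static void main(String[] args) {
        int sectionCount = 3;
        for (int position = 1; position <= sectionCount; position++) {
            System.out.println("section " + position
                    + " previous=" + hasPrevious(position)
                    + " next=" + hasNext(position, sectionCount));
        }
    }
}
```

Because only two or three integers are needed, asking the structure service for this information is cheaper than retrieving the whole list of sibling nodes.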
1954 1955 \subsubsection{XML Document action}\label{sec:xmldocumentaction} 1956 1957 XMLDocumentAction is a little different to the standard DocumentAction. It operates in two modes, \gst{text} and \gst{toc}. In \gst{text} mode, it will retrieve the content of the current document node using a DocumentContentRetrieve request. In \gst{toc} mode, it retrieves the entire table of contents for the document using a DocumentStructureRetrieve request. Either mode may also retrieve metadata for the current section or each section in the table of contents. 1958 1959 \subsubsection{GS2Browse action}\label{sec:browseaction} 1960 1961 GS2BrowseAction is for displaying Greenstone 2 style classifiers. 1962 \subsubsection{System action}\label{sec:systemaction} 1974 1963 1975 1964 SystemAction allows for manual reconfiguration of various components at run-time. There is no interactive web-page displaying the options; it merely turns a set of CGI arguments into an XML system request. The response from a system request is a message which is displayed to the user. … … 1999 1988 2000 1989 2001 \subsubsection{Some class info - where should this go??} 1990 \subsection{Other code information} 1991 1992 Greenstone has a set of Utility classes, which are briefly described in Table~\ref{tab:utils}. 
1993 2002 1994 \begin{table}[h] 2003 1995 \caption{The utility classes in org.greenstone.gsdl3.util} … … 2008 2000 \bf Utility class & \bf Description\\ 2009 2001 \hline 2010 ConfigVars & holds the servlet startup variables, including library name, site name, interface name, default language\\2011 Dictionary & wrapper around a Resource Bundle, providing strings with parameter\\2012 GS CGI & class to map between short name CGI arguments and long name request parameters\\2002 Dictionary & wrapper around a Resource Bundle, providing strings with parameters\\ 2003 GSConstants & holds some constants used for servlet arguments and configuration variables\\ 2004 GSEntityResolver & an EntityResolver which can be used to find resources such as DTDs\\ 2013 2005 GSFile & class to create all \gs\ file paths e.g. used to locate configuration files, XSLT files and collection data. \\ 2014 2006 GSHTML & provides convenience methods for dealing with HTML, e.g. making strings HTML safe\\ 2007 GSParams & contains names and default values for interface parameters\\ 2008 NZDLParams & a subclass of GSParams which holds default service parameters too, necessary for the classic style interface.\\ 2015 2009 GSPath & used to create, examine and modify message address paths\\ 2010 GSSQL & contains static strings for all the SQL table/field names\\ 2016 2011 GSStatus & some static codes for status messages\\ 2017 2012 GSXML & lots of methods for extracting information out of \gs\ XML, and creating some common types of elements. 
Also has static Strings for element and attribute names used by \gs\ .\\ … … 2019 2014 Misc & miscellaneous functions\\ 2020 2015 OID & class to handle \gs\ (2) OIDs\\ 2016 GS3OID & subclass of OID to handle \gsiii\ OIDs\\ 2017 SQLQuery & contains a connection to a SQL database, along with some methods for accessing the data, such as converting MG numbers to and from Greenstone OIDs.\\ 2021 2018 XMLConverter & provides methods to create new Documents, parse Strings or Files into Documents, and convert Nodes to Strings\\ 2022 2019 XMLTransformer & methods to transform XML using XSLT \\ … … 2054 2051 2055 2052 \subsection{new interfaces}\label{sec:new-interfaces} 2053 2054 It is easy to create new interfaces to \gsiii. Here we are talking about interfaces other than those displayed in a typical browser. 2055 2056 Handheld devices: Use the standard servlet setup, but with a different set of XSLT files to format the pages for small screens, or use WML. 2057 2058 Java GUI Interface: There are a couple of alternatives. Depending on what you want to display in the GUI, you could talk to either a Receptionist or a MessageRouter. The library classes can be set up and compiled into the GUI program. 2059 Talking to a Receptionist will give you access to pages of XML. It is likely that the standard Receptionist class would be used - this doesn't transform the data to HTML. Queries such as ``give me the home page of a collection'' and ``do the following search'' can be issued. All the data needed for the result view is returned. Queries are quite simple, but are limited to the kinds of Actions that are available in the library. 2060 Talking to a MessageRouter requires a bit more effort on the part of the GUI program, but results in greater flexibility. The kinds of queries that can be issued are individual units of action, such as ``describe yourself'', ``search'', ``retrieve the content for this document''. 
More than one request may need to be made for a particular feature of the GUI. However, you can ask for any combination of data available in the system; you are not relying on Actions. What you implement, though, may be a lot like the Action code in terms of request sequences. 2061 2062 Interfaces in other programming languages: Because the communication is all XML based, other interfaces can talk to the Java library if a communication protocol is set up. This could be done using SOAP for example. As with Java GUI interfaces, the program could talk to a Receptionist or to a MessageRouter. 2056 2063 e.g. java interface. where you can interface to. MR vs Receptionist. diff receptionists. egs, handheld - using servlet, transforming recpt, but new set of XSLT java program other program - talk to recpt but just get back XML data for pages. java gui - just talk to MR, do all processing itself. 2064 2065 Remote interfaces: remote interfaces can be set up in the same way as above, using a communication protocol between the interface and the library program. 2057 2066 2058 2067 \subsection{Adding new classifiers}\label{sec:new-classifiers} … … 2065 2074 There are two types of standard \gs\ collections: collections built with the \gsiii\ building system, and collections that are imported from \gsii\ . There are many options to collection building but it is conceivable that these options don't meet the needs of all collection builders. \gsiii\ has an ability to use any type of collection you can come up with, assuming some Java code is provided. 2066 2075 2067 2068 2076 There are four levels of customisation that may be needed with new collections: service, collection, interface XSLT, and action levels. We will use the example collections that come with \gs\ to describe these different levels. 2069 2077 2070 2078 Firstly, new service classes need to be written to provide the functionality to search/browse/whatever the collection. 
If the services have similar interfaces and functionality to the standard services, this may be all that is needed. For example, the \gsii\ MGPP collections were the first to be served in \gsiii\ . When we came to do \gsii\ MG collections, all we had to do was write some new service classes that interacted with MG instead of MGPP. Because these collections used the same type of services, this was all we had to do. The format of the configuration files was similar, they just specified MG serviceRack classes rather than MGPP ones. 2071 2079 2072 The nzmaps collection used the same level of customisation, just implementing new services and fitting all the extra display elements into the standard query/display framework using javascript. 2073 2074 The gberg collection, however, was done quite differently to the standard collections. New services were provided to search the database (built with Lucene) and to provide the documents and parts of documents (using XSLT to transform the raw XML files). The collectionConfig file had some extra information in it: a list of the documents in the collection along with their Titles. Because the standard collection class has no notion of document lists, a new class was created (org.greenstone.gsdl3.collection.XMLCollection). This class is basically the same as a standard collection class except that it looks for and stores in memory the documentList from the collectionConfig file. 2080 The XML Sample Texts (gberg) collection, however, was done quite differently to the standard collections. New services were provided to search the database (built with Lucene) and to provide the documents and parts of documents (using XSLT to transform the raw XML files). The collectionConfig file had some extra information in it: a list of the documents in the collection along with their Titles. Because the standard collection class has no notion of document lists, a new class was created (org.greenstone.gsdl3.collection.XMLCollection). 
This class is basically the same as a standard collection class except that it looks for and stores in memory the documentList from the collectionConfig file. 2075 2081 2076 2082 To tell \gs\ to load up a different type of collection class, we use another configuration file: etc/collectionInit.xml. This specifies the name of the collection class to use. … … 2083 2089 Document display is significantly different to standard \gs\ . There are two modes of display: table of contents mode, and content mode. Clicking on a document link from the collection home page takes the user to the table of contents for the collection. Clicking on one of the sections in the table of contents takes them to a display of that section. To facilitate this, not only do we need new XSLT files , we also needed a new action. XMLDocumentAction was created, that used two subactions, toc and text, for the different modes of display. 2084 2090 2085 The Receptionist was told about this new action by the addition of the following to the interfaceConfig.xml file:2091 The Receptionist was told about this new action by the addition of the following element to the interfaceConfig.xml file: 2086 2092 2087 2093 \begin{gsc}\begin{verbatim} … … 2125 2131 Instead of displaying an icon and the Title, it displays the Title of the section and the title of the document. Both of these are linked to the document: the section title to the content of that section, the document title to the table of contents for the document. Because these require non-standard arguments to the library, these parts of the template are written in XSLT not \gs\ format language. As is shown here it is perfectly feasible to write a format statement that includes XSLT mixed in with \gs\ format elements. 2126 2132 2127 The document display uses CSS to format the output---these are kept in the collection and specified in the collections XSLT files. The documents also specify DTD files. 
Due to the way we read in the XML files, Tomcat sometimes has trouble locating the DTDs. One option is to ma yall the links absolute links to files in the collection folder, the other option is to put them in \gs\ 's DTD folder gsdl3/resources/dtd.2128 2129 \subsection{The NZDL mirror site}2130 2131 The library seen at \gst{http://www.greenstone.org/greenstone3/nzdl} is like a mirror to \gst{http://www.nzdl.org}---it aims to present the same collections, in the same way but using \gsiii\ instead of \gsii\ . It uses a new site and a new interface. The web.xml file had a new servlet entry in it to specify the combination of nzdl site andinterface.2132 2133 The site was created by making a directory called nzdl in the sites folder. A siteConfig file was created. Because it s running on Linux, we were able to link to all the collections in the old \gs\ installation. The convert\_coll\_from\_gs2.pl script was run over all the collections to produce the new XML configuration files.2134 2135 A new interface, also called nzdl, was created in the interfaces directory.2136 In many cases, creating a new interface just requires the new images and XSLT to be added to the new directory(see Sections~\ref{sec:sites-and-ints} and \ref{sec:interface-customise}). This setup alsorequired a bit more customisation.2137 2138 The standard \gs \ navigation bar lists all the services available for the collection. In \gsii\ , the navigation bar provided the search option, and the different classifiers. This is not service specific, but hard coded to the search and classifiers. The XSLT that producedthe navigation bar needed to be altered to produce this. But also, a new Receptionist was needed.2139 The standard receptionist (DefaultReceptionist) gathers a little bit of extra info for each page of XML before transforming it: this is the list of services for the collection and their display information, allowing the services to be listed along the navigation bar. 
This is information that is needed by every page (except for the library home page) and therefore is obtained by the receptionist instead of by each action. The nzdl interface needed a bit more information than this: for the ClassifierBrowse service, if there was one, the list of classifiers and their display elements must be obtained. So a new Receptionistwas written that inherited from DefaultReceptionist, and added this new info into the page.2133 The document display uses CSS to format the output---these are kept in the collection and specified in the collections XSLT files. The documents also specify DTD files. Due to the way we read in the XML files, Tomcat sometimes has trouble locating the DTDs. One option is to make all the links absolute links to files in the collection folder, the other option is to put them in \gs\ 's DTD folder gsdl3/resources/dtd. 2134 2135 \subsection{The Classic Interface} 2136 2137 The library seen at \gst{http://www.greenstone.org/greenstone3/nzdl} is like a mirror to \gst{http://www.nzdl.org}---it aims to present the same collections, in the same way but using \gsiii\ instead of \gsii\ . It uses a new site (nzdl) with the classic interface. The web.xml file had a new servlet entry in it to specify the combination of nzdl site and classic interface. 2138 2139 The site was created by making a directory called nzdl in the sites folder. A siteConfig file was created. Because it is running on Linux, we were able to link to all the collections in the old \gs\ installation. The convert\_coll\_from\_gs2.pl script was run over all the collections to produce the new XML configuration files. 2140 2141 The classic interface was created to be used by this site (and is now a standard part of Greenstone). 2142 In many cases, creating a new interface just requires the new images and XSLT to be added to the new directory(see Sections~\ref{sec:sites-and-ints} and \ref{sec:interface-customise}). 
This classic interface required a bit more customisation. 2143 2144 The standard \gsiii\ navigation bar lists all the services available for the collection. In \gsii\ , the navigation bar provides the search option, and the different classifiers. This is not service specific, but hard coded to the search and classifiers. The XSLT that produces the navigation bar needed to be altered to produce this. But also, a new Receptionist was needed. 2145 The standard receptionist (DefaultReceptionist) gathers a little bit of extra information for each page of XML before transforming it: this is the list of services for the collection and their display information, allowing the services to be listed along the navigation bar. This is information that is needed by every page (except for the library home page) and therefore is obtained by the receptionist instead of by each action. The nzdl interface needed a bit more information than this: for the ClassifierBrowse service, if there was one, the list of classifiers and their display elements must be obtained. So a new Receptionist (NZDLReceptionist) was written that inherited from DefaultReceptionist, and added this new info into the page. 2140 2146 2141 2147 One of the servlet initialisation parameters is the receptionist class: this was added to the servlet definition in the web.xml file so that the LibraryServlet would load up the right receptionist class. … … 2166 2172 Sitename is the name of the site's directory, eg localsite. The siteuri is the identifier that will be used for the SOAP resource, eg org.greenstone.localsite. It should be a unique name amongst all the SOAP services that you want to connect to. 2167 2173 2168 The script makes sure that the SOAP servlet is deployed on Tomcat, and thendeploys the service for the site specified. A resource file (\gst{sitename.xml}) is created which is used to specify the service. 
It can be found in \gst{gsdl3/resources/soap}, and is generated from \gst{site.xml.in}.2174 The script deploys the service for the site specified. A resource file (\gst{sitename.xml}) is created which is used to specify the service. It can be found in \gst{gsdl3/resources/soap}, and is generated from \gst{site.xml.in}. 2169 2175 2170 2176 To get siteA to talk to siteB, you need to deploy a SOAP server on siteB, then add a \gst{<site>} element to the \gst{<siteList>} of siteA's \gst{siteConfig.xml} file (in \gst{gsdl3/web/sites/siteA/siteConfig.xml}). … … 2185 2191 \section{Using \gsiii\ from CVS}\label{app:cvs} 2186 2192 2187 *** need to make sure building stuff is in here *** 2193 [TODO: need to make sure building stuff is in here] 2188 2194 2189 2195 \gsiii\ is also available via CVS. You can download the latest version of the code. This is not guaranteed to be stable, in fact it is likely to be unstable. The advantage of using CVS is that you can update the code and get the latest fixes. … … 2195 2201 \begin{quote}\begin{gsc}\begin{verbatim} 2196 2202 cvs -d :pserver:cvs\[email protected]:2402/usr/local/ 2197 global-cvs/gsdl-src co gsdl32203 global-cvs/gsdl-src co -P gsdl3 2198 2204 \end{verbatim}\end{gsc}\end{quote} 2199 2205 2200 2206 If you need it, the password for anonymous CVS access is \gst{anonymous}. Note that some older versions of CVS have trouble accessing this repository due to the port number being present. We are using version 1.11.1p1. 2201 2207 2202 The software needs to be compiled and installed. The installation procedure has been semi-automated. The following sections describe installation under Linux and windows. 2208 The software needs to be compiled and installed. The installation procedure has been semi-automated. The following sections describe installation under Linux and windows. The most up to date instructions may be found in the README.txt file in the top level gsdl3 directory. 
2203 2209 2204 2210 \subsection{Linux install} 2205 2211 2206 An install.sh script is provided to compile and install \gsiii\ .What you need to do is: 2212 What you need to do is: 2207 2213 2208 2214 \begin{quote}\begin{gsc} … … 2210 2216 source gs3-setup.sh\\ 2211 2217 ./gs3-prepare.sh\\ 2212 ./gs3-configure.sh \\ 2213 ./gs3-compile.sh \\ 2218 ./configure \\ 2219 make \\ 2220 make install \\ 2221 \[make docs\] \\ 2214 2222 ./gs3-finalise.sh\\ 2223 source gs3-setup.sh \\ 2215 2224 \end{gsc}\end{quote} 2216 2225 … … 2225 2234 \subsection{Windows install} 2226 2235 2236 [TODO: check that these are correct] 2227 2237 Make sure that the following environment variables are set: JAVA\_HOME (where the Java 2 SDK is installed); PATH (should include the CVS program, and \%JAVA\_HOME\%$\backslash$bin). The following commands should be run in a DOS prompt. 2228 2238 … … 2247 2257 2248 2258 To run \gs\ , run gs3-launch.bat. This will start the Tomcat server in a new DOS window (stop it by closing the window), and open a browser window showing the \gsiii\ homepage. 2249 2250 \subsection{Creating a distribution} 2251 2252 The installation scripts have been set up in such a way that it is easy to create different distribution types (for linux). To create a standard binary distribution, carry out the following steps: 2253 2254 \begin{gsc}\begin{verbatim} 2255 cvs co gsdl3 2256 cd gsdl3 2257 source gs3-setup.sh 2258 ./gs3-prepare.sh 2259 ./gs3-configure.sh 2260 ./gs3-compile.sh 2261 ./gs3-for-distribution.sh 2262 2263 mv Header ../ 2264 cd ../ 2265 tar czvf gsdl3.tgz gsdl3/ 2266 cat Header gsdl3.tgz > gsdl3-x.xx-unix.sh 2267 \end{verbatim}\end{gsc} 2268 2269 Note that gs3-for-distribution.sh removes some files that are not needed for the distribution, including all the CVS directories. 
Once you have run this, you will no longer be able to update your gsdl3 code via cvs.2270 2271 To create a source distribution, you can do:2272 \begin{gsc}\begin{verbatim}2273 cvs co gsdl32274 cd gsdl32275 source gs3-setup.sh2276 ./gs3-prepare.sh2277 <delete unnecessary files>2278 cd ../2279 tar czvf gsdl3-x.xx-src.tgz gsdl3/2280 \end{verbatim}\end{gsc}2281 2282 Some of the gs3-for-distribution script will need to be run (at the stage of delete unnecessary files), and there needs to be instructions on what to do when someone downloads the source distro.2283 2284 I think it would be:2285 \begin{gsc}\begin{verbatim}2286 tar xzvf gsdl3-x.xx-src.tgz2287 cd gsdl32288 source gs3-setup.sh2289 ./gs3-configure.sh2290 ./gs3-compile.sh2291 ./gs3-finalise.sh2292 \end{verbatim}\end{gsc}2293 2294 2259 2295 2260 \newpage