Changeset 5435


Timestamp:
2003-09-03T14:03:13+12:00
Author:
kjdon
Message:

* empty log message *

File:
1 edited

  • trunk/gsdl3/docs/manual/manual.tex

    r4892 r5435  
    5454A description of the general design and architecture of Greenstone3 is covered by the document {\em The design of Greenstone3: An agent based dynamic digital library} (design-2002.ps, in the gsdl3/docs/manual directory).
    5555
     56NOTES: structure: make the classes and messages separate. have a class hierarchy and a module hierarchy/picture - keep the two separate. schemas/subschemas??
     57user vs developer - make a clearer distinction
     58are we going to publish an API. what is it? what do we want to provide?
    5659\section{System modules}\label{sec:modules}
    5760
     
    447450
    448451\subsection{'describe'-type messages}\label{sec:describe}
    449 **** REDO (the responses may now contain display information which is not shown here) ****
    450 This is the first of the standard internal messages.
    451 The most basic message is ``describe-yourself'', which can be sent to any module in the system. The module responds with a semi-predefined piece of XML, making these requests very efficient. The response is predefined apart from any language-specific text strings, which are put together as each request comes in.
     452
     453The most basic of the internal standard requests is ``describe-yourself'', which can be sent to any module in the system. The module responds with a semi-predefined piece of XML, making these requests very efficient. The response is predefined apart from any language-specific text strings, which are put together as each request comes in, based on the language attribute of the request.
    452454\begin{quote}\begin{gsc}\begin{verbatim}
    453455<request lang='en' type='describe' to=''/>
    454456\end{verbatim}\end{gsc}\end{quote}
    455 If the \gst{to} field is empty, it is answered by the MessageRouter.
     457If the \gst{to} field is empty, a request is answered by the MessageRouter.
    456458An example response from a MessageRouter might look like this:
    457459\begin{quote}\begin{gsc}\begin{verbatim}
    458460<response lang='en' type='describe'>
    459   <serviceList>
    460     <service name='CrossCollectionSearch' type='query' />
    461   </serviceList>
     461  <serviceList/>
    462462  <siteList>
    463463    <site name='org.greenstone.gsdl1'
     
    465465            type='soap' />
    466466  </siteList>
     467  <serviceClusterList>
     468    <serviceCluster name="build" />
     469  </serviceClusterList>
    467470  <collectionList>
    468471    <collection name='org.greenstone.gsdl1/
     
    474477</response>
    475478\end{verbatim}\end{gsc}\end{quote}
    476 This MessageRouter has one site-wide service, a cross-collection searching service. It
     479This MessageRouter has no individual site-wide services (an empty \gst{<serviceList>}), but has a service cluster called build (which provides collection importing and building functionality). It
    477480communicates with one site, \gst{org.greenstone.gsdl1}.  It is aware of four
    478481collections.  One of these, \gst{myfiles}, belongs to it; the other three are
     
    496499</request>
    497500\end{verbatim}\end{gsc}\end{quote}
    498 When a collection or service cluster is asked to describe itself, what is returned is all of the
    499 collection specific metadata and a list of services.  For example, here is such
     501
     502When a collection or service cluster is asked to describe itself, what is returned is a list of metadata, some display elements, and a list of services.  For example, here is such
    500503a message, along with a sample response.
    501504
    502505\begin{quote}\begin{gsc}\begin{verbatim}
    503 <request lang='en' type='describe' to='demo'/>
    504 
    505 <response lang='en' type='describe' from='demo' >
    506   <collection name='demo'>
     506<request lang='en' type='describe' to='mgppdemo'/>
     507
     508<response from="mgppdemo" type="describe">
     509  <collection name="mgppdemo">
     510    <displayItem lang="en" name="name">greenstone mgpp demo
     511    </displayItem>
     512    <displayItem lang="en" name="description">This is a
     513      demonstration collection for the Greenstone digital
     514      library software. It contains a small subset (11 books)
     515      of the Humanity Development Library. It is built with
     516      mgpp.</displayItem>
     517    <displayItem lang="en" name="icon">mgppdemo.gif</displayItem>
    507518    <serviceList>
    508       <service name='TextQuery' type='query' />
    509       <service name='DocumentContentRetrieve' type='retrieve' />
    510       <service name='DocumentMetadataRetrieve' type='retrieve' />
     519      <service name="DocumentStructureRetrieve" type="retrieve" />
     520      <service name="DocumentMetadataRetrieve" type="retrieve" />
     521      <service name="DocumentContentRetrieve" type="retrieve" />
     522      <service name="ClassifierBrowse" type="browse" />
     523      <service name="ClassifierBrowseMetadataRetrieve"
     524            type="retrieve" />
     525      <service name="TextQuery" type="query" />
     526      <service name="FieldQuery" type="query" />
     527      <service name="AdvancedFieldQuery" type="query" />
     528      <service name="PhindApplet" type="applet" />
    511529    </serviceList>
    512530    <metadataList>
    513       <metadata name='numDocs'>321</metadata>
    514       <metadata name='numSections'>5532</metadata>
    515       <metadata name='colName' lang='en'>The demo collection</metadata>
    516       <metadata name='colDescription' lang='en'>This is a demo collection.
    517       </metadata>
     531      <metadata name="creator">[email protected]</metadata>
     532      <metadata name="maintainer">[email protected]</metadata>
     533      <metadata name="numDocs">11</metadata> 
     534      <metadata name="buildType">mgpp</metadata>
     535      <metadata name="httpPath">http://kanuka:8090/gsdl3/sites/
     536                                localsite/collect/mgppdemo</metadata>
    518537    </metadataList>
    519538  </collection>
     
    521540\end{verbatim}\end{gsc}\end{quote}
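As an illustration of how a client might consume a collection-level describe response, here is a minimal Python sketch. It assumes the response has already been received as a string; the function name `summarise_describe` and the abbreviated response text are hypothetical and are not part of Greenstone3.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the sample describe response for mgppdemo (assumed input).
DESCRIBE_RESPONSE = """
<response from="mgppdemo" type="describe">
  <collection name="mgppdemo">
    <displayItem lang="en" name="name">greenstone mgpp demo</displayItem>
    <serviceList>
      <service name="DocumentContentRetrieve" type="retrieve" />
      <service name="TextQuery" type="query" />
      <service name="FieldQuery" type="query" />
    </serviceList>
    <metadataList>
      <metadata name="numDocs">11</metadata>
      <metadata name="buildType">mgpp</metadata>
    </metadataList>
  </collection>
</response>
"""


def summarise_describe(xml_text):
    """Collect display items, metadata and services (grouped by type)."""
    root = ET.fromstring(xml_text.strip())
    collection = root.find("collection")

    # Group service names by their type attribute (query, retrieve, ...).
    services = {}
    for service in collection.findall("serviceList/service"):
        services.setdefault(service.get("type"), []).append(service.get("name"))

    metadata = {
        m.get("name"): (m.text or "").strip()
        for m in collection.findall("metadataList/metadata")
    }
    display = {
        d.get("name"): (d.text or "").strip()
        for d in collection.findall("displayItem")
    }
    return {
        "name": collection.get("name"),
        "services": services,
        "metadata": metadata,
        "display": display,
    }


summary = summarise_describe(DESCRIBE_RESPONSE)
```

Grouping services by type mirrors how an interface would typically decide which search or browse forms to offer for a collection.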
    522541
    523 The subset parameter can also be used in a describe request to a collection, to retrieve just the metadataList or serviceList.
     542This collection provides many typical services...
     543
     544The subset parameter can also be used in a describe request to a collection, to retrieve just the \gst{metadataList} or \gst{serviceList}.
    524545
    525546A \gst{describe} request sent to a service returns a list of parameters that
     
    563584The type attribute is used to determine how to display the parameters on a web page or interface. For example, a string parameter may result in   a text entry box, a boolean an on/off button, enum\_single/enum\_multi a drop-down menu, where one or many items, respectively, can be selected.
    564585A multi-type parameter indicates that two or more parameters are associated, and should be displayed appropriately. For example, in a field query, the text box and field list should be associated. The occurs attribute specifies how many times the parameter should be displayed on the page.
    565 Parameters also come with display information: all the text strings needed to present them to teh user. These include the name of the parameter and the display values for any options.
    566 
    567 A service description also contains a display element - this contains all the language dependent text strings - put together on the fly. These strings are name of the service, what to use for the submit button, and text strings for all the parameters: name, what each value is called, etc.
    568 
    569 Here is a sample describe request to the FieldQuery service of collection mgppdemo, along with its response. Figure~\ref{fig:query-display} gives an example html search form that may be generated from this describe response.
     586Parameters also come with display information: all the text strings needed to present them to the user. These include the name of the parameter and the display values for any options. These are included in the above parameter descriptions in the form of \gst{<displayItem>} elements.
     587
     588A service description also contains some display information---this includes the name of the service, and the text  for the submit button.
     589
     590Here is a sample describe request to the FieldQuery service of collection mgppdemo, along with its response. The parameters in this example include their display information. Figure~\ref{fig:query-display} gives an example html search form that may be generated from this describe response.
    570591
    571592\begin{quote}\begin{gsc}\begin{verbatim}
     
    574595<response from="mgppdemo/FieldQuery" type="describe">
    575596  <service name="FieldQuery" type="query">
     597    <displayItem name="name">Form Query</displayItem>
     598    <displayItem name="submit">Search</displayItem>
    576599    <paramList>
    577       <param default="Section" name="level" type="enum_single">
    578         <option name="Document" />
    579         <option name="Section" />
     600      <param default="Document" name="level" type="enum_single">
     601        <displayItem name="name">Granularity to search at</displayItem>
     602        <option name="Document">
     603          <displayItem name="name">Document</displayItem>
     604        </option>
     605        <option name="Section">
     606          <displayItem name="name">Section</displayItem>
     607        </option>
    580608      </param>
    581       <param default="1" name="case" type="boolean" />
    582       <param default="1" name="stem" type="boolean" />
    583       <param default="10" name="maxDocs" type="integer" />
     609      <param default="1" name="case" type="boolean">
     610        <displayItem name="name">Turn casefolding </displayItem>
     611        <option name="0">
     612          <displayItem name="name">off</displayItem>
     613        </option>
     614        <option name="1">
     615          <displayItem name="name">on</displayItem>
     616        </option>
     617      </param>
     618      <param default="1" name="stem" type="boolean">
     619        <displayItem name="name">Turn stemming </displayItem>
     620        <option name="0">
     621          <displayItem name="name">off</displayItem>
     622        </option>
     623        <option name="1">
     624          <displayItem name="name">on</displayItem>
     625        </option>
     626      </param>
     627      <param default="10" name="maxDocs" type="integer">
     628        <displayItem name="name">Maximum documents to return
     629        </displayItem>
     630      </param>
    584631      <param name="simpleField" occurs="4" type="multi">
    585         <param name="fqv" type="string" />
    586         <param default="" name="fqf" type="enum_single">
    587           <option name="ZZ" /><option name="TX" />
    588           <option name="SU" /><option name="TI" />
     632        <displayItem name="name"></displayItem>
     633        <param name="fqv" type="string">
     634          <displayItem name="name">Word or phrase </displayItem>
     635        </param>
     636        <param default="ZZ" name="fqf" type="enum_single">
     637          <displayItem name="name">in field</displayItem>
     638          <option name="ZZ">
     639            <displayItem name="name">All fields</displayItem>
     640          </option>   
     641          <option name="TX">
     642            <displayItem name="name">TextOnly</displayItem>
     643          </option>
     644          <option name="SU">
     645            <displayItem name="name">Subject</displayItem>
     646          </option>
     647          <option name="TI">
     648            <displayItem name="name">Title</displayItem>
     649          </option>
    589650        </param>
    590651      </param>
    591652    </paramList>
    592     <display>
    593       <name>Form Query</name>
    594       <submit>Search</submit>
    595       <param name="level">
    596         <name>Granularity to search at</name>
    597         <option name="Document">Document</option>
    598         <option name="Section">Section</option>
    599       </param>
    600       <param name="case">
    601         <name>Turn casefolding </name>
    602         <option name="0">off</option>
    603         <option name="1">on</option>
    604       </param>
    605       <param name="stem">
    606         <name>Turn stemming </name>
    607         <option name="0">off</option>
    608         <option name="1">on</optin>
    609       </param>
    610       <param name="maxDocs">
    611         <name>Maximum documents to return</name>
    612       </param>
    613       <param name="fqv">
    614         <name>Search for </name>
    615       </param>
    616       <param name="fqf">
    617         <name>in field</name>
    618         <option name="ZZ">All fields</option>
    619         <option name="TX">TextOnly</option>
    620         <option name="SU">Subject</option>
    621         <option name="TI">Title</option>
    622      </param>
    623     </display>
    624653  </service>
    625654</response>
     
    657686      The Phind java applet.
    658687    </applet>
     688    <displayItem name="name">Browse phrase hierarchies</displayItem>
    659689  </service>
    660690</response>
    661691\end{verbatim}\end{gsc}\end{quote}
    662692
    663 Note that the library parameter has been left blank. This is because library refers to the current servlet that is running and the name is not necessarily known in advance. So either the applet action or the receptionist must fill in this parameter before displaying the html.
     693Note that the library parameter has been left blank. This is because library refers to the current servlet that is running and the name is not necessarily known in advance. So either the applet action or the Receptionist must fill in this parameter before displaying the html.
    664694
    665695\subsection{'system'-type messages}\label{sec:system}
    666696
    667 ``System'' requests are used to tell a MessageRouter, Collection or ServiceCluster to update its cached information and activate or deactivate other modules. For example, the MessageRouter has a set of Collection modules that it can talk to. It also holds some XML information about those collections---this is returned when a request for a collection list comes in. If a collection is deleted or modified, or a new one created, this information may need to change, and the list of available modules may also change. Currenlty they are initiated by particular cgi parameters (see Section~\ref{sec:runtime-config}).
     697``System'' requests are used to tell a MessageRouter, Collection or ServiceCluster to update its cached information and activate or deactivate other modules. For example, the MessageRouter has a set of Collection modules that it can talk to. It also holds some XML information about those collections---this is returned when a request for a collection list comes in. If a collection is deleted or modified, or a new one created, this information may need to change, and the list of available modules may also change. Currently they are initiated by particular cgi parameters (see Section~\ref{sec:runtime-config}).
    668698
    669699The basic format of a system request is as follows:
     
    859889One or more parameters specifying metadata may be included in a request. Also, a value of \gst{all} will retrieve all the metadata for each document.
    860890
    861 Any browse-type service must also implement a metadata retrieval service to provide metadata for the nodes in the classification hierarchy. The name of it is the brose service name plus \gst{MetadataRetrieve}. For example, the ClassifierBrowse service described in the previous section should also have a ClassifierBrowseMetadataRetrieve service. The request and response format is exactly the same as for the DocumentMetadataREtrieve service, except that \gst{<documentNode>} elements are replaced by \gst{<classifierNode>} elements (and the corresponding list element is also changed).
     891Any browse-type service must also implement a metadata retrieval service to provide metadata for the nodes in the classification hierarchy. The name of it is the browse service name plus \gst{MetadataRetrieve}. For example, the ClassifierBrowse service described in the previous section should also have a ClassifierBrowseMetadataRetrieve service. The request and response format is exactly the same as for the DocumentMetadataRetrieve service, except that \gst{<documentNode>} elements are replaced by \gst{<classifierNode>} elements (and the corresponding list element is also changed).
    862892
    863893Give me the text (content) of this document:
     
    873903  <documentNodeList>
    874904    <documentNode nodeID="HASHac0a04dd14571c60d7fbfd.4.2">
    875       <nodeContent>&lt;Section&gt;
    876  
    877 &lt;/B&gt;&lt;P ALIGN=&quot;JUSTIFY&quot;&gt;&lt;/P&gt;
    878 &lt;P ALIGN=&quot;JUSTIFY&quot;&gt;190. When the plants in your second pen have
    879 grown big enough to provide food and shelter, you can put in the snails.&lt;/P&gt;
    880 
     905      <nodeContent>&lt;Section&gt;
     906       &lt;/B&gt;&lt;P ALIGN=&quot;JUSTIFY&quot;&gt;&lt;/P&gt;
     907       &lt;P ALIGN=&quot;JUSTIFY&quot;&gt;190. When the plants in
     908       your second pen have grown big enough to provide food and
     909       shelter, you can put in the snails.&lt;/P&gt;
    881910      </nodeContent>
    882911    </documentNode>
     
    885914\end{verbatim}\end{gsc}\end{quote}
    886915
    887 The content of a node is returned in a \gst{<nodeContent>} element.
     916The content of a node is returned in a \gst{<nodeContent>} element. In this case it is escaped HTML.
    888917
    889918Give me the ancestors and children of the specified node, along with the number of siblings it has:
     
    921950\end{verbatim}\end{gsc}\end{quote}
    922951
    923 Structure is returned inside a nodeStructure element, while structural info is returned in a nodeStructureInfo element. Possible values for strcuture parameters are as for browse services: \gst{ancestors}, \gst{parent}, \gst{siblings}, \gst{children}, \gst{descendents}. Possible values for info parameters are \gst{numSiblings}, \gst{siblingPosition}, \gst{numChildren}.
     952Structure is returned inside a \gst{<nodeStructure>} element, while structural info is returned in a \gst{<nodeStructureInfo>} element. Possible values for structure parameters are as for browse services: \gst{ancestors}, \gst{parent}, \gst{siblings}, \gst{children}, \gst{descendents}. Possible values for info parameters are \gst{numSiblings}, \gst{siblingPosition}, \gst{numChildren}.
    924953
    925954\subsubsection{'process'-type services}\label{sec:process}
    926 Process requests are not a request for data---they are a request for some action to be carried out, for example, create a new collection, or import a collection. The response is a status or an error message. The import and build commands may take a long time to complete, so a response is sent back after a successful start to the command. The status may be polled by the requester to see how the process is going.
     955Requests to process-type services are not requests for data---they request some action to be carried out, for example, create a new collection, or import a collection. The response is a status or an error message. The import and build commands may take a long time to complete, so a response is sent back after a successful start to the command. The status may be polled by the requester to see how the process is going.
    927956
    928957Process requests generally contain just a parameter list. Like for any service, the parameters used by a process-type service can be obtained by a describe request to that service.
     
    10011030\subsubsection{'enrich'-type services}
    10021031
     1032Enrich services typically take the text of some documents (inside \gst{<nodeContent>} tags) and return the text marked up in some way. One example of this is the GatePOSTag service: this identifies Dates, Locations, People and Organizations in the text, and annotates the text with the labels. In the following example, the request is for Locations and Dates to be identified.
    10031033*** TODO ****
     1034\begin{quote}\begin{gsc}\begin{verbatim}
     1035<request lang="en" to="GatePOSTag" type="process">
     1036  <paramList>
     1037    <param name="annotationType" value="Date,Location" />
     1038  </paramList>
     1039  <documentNodeList>
     1040    <documentNode nodeID="HASHac0a04dd14571c60d7fbfd">
     1041      <nodeContent>
     1042        FOOD AND AGRICULTURE ORGANIZATION OF THE UNITED NATIONS
     1043        Rome 1986
     1044        P-69
     1045        ISBN 92-5-102397-2
     1046        FAO 1986
     1047      </nodeContent>
     1048    </documentNode>
     1049  </documentNodeList>
     1050</request>
     1051
     1052<response from="GatePOSTag" type="process">
     1053  <documentNodeList>
     1054    <documentNode nodeID="HASHac0a04dd14571c60d7fbfd">
     1055      <nodeContent>
     1056        FOOD AND AGRICULTURE ORGANIZATION OF THE UNITED NATIONS
     1057        <annotation type="Location">Rome</annotation>
     1058        <annotation type="Date">1986</annotation>
     1059        P-69
     1060        ISBN 92-5-102397-2
     1061        FAO <annotation type="Date">1986</annotation>
     1062      </nodeContent>
     1063    </documentNode>
     1064  </documentNodeList>
     1065</response>
     1066\end{verbatim}\end{gsc}\end{quote}
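To show how the annotated text might be consumed, here is a minimal Python sketch that pulls the annotations out of an enrich response like the one above. The function name `extract_annotations` is hypothetical, and the sketch assumes the annotations arrive as real XML child elements of the nodeContent element, as in the example.

```python
import xml.etree.ElementTree as ET

# The sample GatePOSTag response from the example (assumed input).
ENRICH_RESPONSE = """<response from="GatePOSTag" type="process">
  <documentNodeList>
    <documentNode nodeID="HASHac0a04dd14571c60d7fbfd">
      <nodeContent>
        FOOD AND AGRICULTURE ORGANIZATION OF THE UNITED NATIONS
        <annotation type="Location">Rome</annotation>
        <annotation type="Date">1986</annotation>
        P-69
        ISBN 92-5-102397-2
        FAO <annotation type="Date">1986</annotation>
      </nodeContent>
    </documentNode>
  </documentNodeList>
</response>"""


def extract_annotations(xml_text):
    """Map each nodeID to its (annotation type, annotated text) pairs."""
    root = ET.fromstring(xml_text)
    result = {}
    for node in root.iter("documentNode"):
        # iter() walks the subtree in document order, so pairs keep text order.
        pairs = [(a.get("type"), a.text or "") for a in node.iter("annotation")]
        result[node.get("nodeID")] = pairs
    return result


annotations = extract_annotations(ENRICH_RESPONSE)
```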
    10041067
    10051068\subsection{'status'-type messages}\label{sec:status}
     
    10781141\section{Page generation}\label{sec:pagegen} **** REDO ********
    10791142
    1080 \subsection{Receptionists}\label{sec:recepts}
    1081 
    1082 The receptionist is the controlling module for the page generation part of greenstone. It has the job of loading up all the actions, and it knows about the message router it and the actions are supposed to talk to. It routes messages received to the appropriate action (page-type messages) or directly to the message router (all other types). Receptionists also do other things, for example, adding to the page received back from the action any information that is common to all pages.
    1083 
    1084 There are different ways of providing an interface to greenstone, from web based cgi style (using servlets) to Java GUI applications. These different interfaces require slightly different responses from a receptionist, so we provide several standard types of receptionist.
    1085 
    1086 Receptionist: This is the most basic receptionist. The page it returns consists of the original request, and the response from the action it was sent to. Methods preProcessRequest, and postProcessPage are called on the request and page, respectively, but in this basic receptionist, they dont do anything.
    1087 
    1088 TransformingReceptionist: This extends Receptionist, and overwrites postProcessPage to transform the page using xslt. An xslt is listed for each action in the receptionists config file, and this is used to transform the page. First, some display information, and config information is added to the page. Then it is transformed using the specified xslt for the action, and returned.
    1089 
    1090 WebReceptionist: The WebReceptionist extends TransformingREceptionist. It doesn't do much else except some argument conversion. To keep the url's short, parameters from the services are given shortnames, and these are used in the web pages.
    1091 
    1092 DefaultReceptionist: This extends WebReceptionist, and is the default one for greenstone 3 servlets. Due to the page design, some extra information is needed for each page: some metadata about the current collection. THe receptionist sends a describe request to teh collection to get this, and appends it to teh page before transformation using xslt.
    1093 
    1094 NZDLReceptionist: (do we want to talk about this?) This is an example of a custom receptionist. For a look-alike nzdl.org system, even more information is needed for each page, namely the list of classifiers available from teh ClassifierBrowse service.
    1095 
    1096 By default, the LibraryServlet uses DefaultReceptionist. However, there is an init-param called receptionist which can be set to make the servlet use a different one.
    1097 
    1098 
    10991143* talk general first: get data, get format info, transform gsf->xsl. transform xml->html
    11001144
     
    11301174***TODO*** describe a bit more?? currently only can get this locally
    11311175
     1176\subsection{Receptionists}\label{sec:recepts}
     1177
     1178The receptionist is the controlling module for the page generation part of Greenstone. It has the job of loading up all the actions, and it knows which message router it and the actions are supposed to talk to. It routes each message it receives to the appropriate action (page-type messages) or directly to the message router (all other types). Receptionists also do other things, for example, adding to the page received back from the action any information that is common to all pages.
     1179
     1180There are different ways of providing an interface to Greenstone, from web-based cgi style (using servlets) to Java GUI applications. These different interfaces require slightly different responses from a receptionist, so we provide several standard types of receptionist.
     1181
     1182Receptionist: This is the most basic receptionist. The page it returns consists of the original request, and the response from the action it was sent to. The methods preProcessRequest and postProcessPage are called on the request and page, respectively, but in this basic receptionist they don't do anything.
     1183
     1184TransformingReceptionist: This extends Receptionist, and overrides postProcessPage to transform the page using xslt. An xslt is listed for each action in the receptionist's config file, and this is used to transform the page. First, some display information and config information is added to the page. Then it is transformed using the specified xslt for the action, and returned.
     1185
     1186WebReceptionist: This extends TransformingReceptionist. It doesn't do much else except some argument conversion. To keep the URLs short, parameters from the services are given short names, and these are used in the web pages.
     1187
     1188DefaultReceptionist: This extends WebReceptionist, and is the default one for Greenstone3 servlets. Due to the page design, some extra information is needed for each page: some metadata about the current collection. The receptionist sends a describe request to the collection to get this, and appends it to the page before transformation using xslt.
     1189
     1190NZDLReceptionist: (do we want to talk about this?) This is an example of a custom receptionist. For a look-alike nzdl.org system, even more information is needed for each page, namely the list of classifiers available from the ClassifierBrowse service.
     1191
     1192By default, the LibraryServlet uses DefaultReceptionist. However, there is an init-param called receptionist which can be set to make the servlet use a different one.
     1193
     1194\subsection{cgi args}
     1195
     1196The args used by a page come from several sources: the receptionist uses a couple, the actions use some, and the services use others. The receptionist and actions are treated as a whole, so their args must not conflict. The GSParams class specifies all the general basic args, and whether each should be saved or not. The servlet has an init parameter, params\_class, that specifies which params class to use; set it if you subclass GSParams. Actions or the receptionist may also specify some new args.
     1197
     1198Services may be created by different people, and may be on a different site, so there is no guarantee that their params will not conflict with action params, or even with those of other services.
     1199Service params are therefore namespaced when they are put on the page, while interface (receptionist and action) params have no namespace. The default namespace is s1 (service1): any params that are for the service are prefixed by this, e.g. the case param for a search is put in the page as s1.case.
     1200The actions must then look for all the s1 params to send to the service.
     1201
     1202If two or more services are combined on a page with a single submit button, they use s1, s2, s3 etc. as needed. The s param (service) ends up with a list, e.g. s=TextQuery,MusicQuery, and the order of these determines the mapping of the namespaces, i.e. s1 will be TextQuery and s2 MusicQuery.
     1203
     1204Also talk about saving args: save the ones that GSParams says to save, and always save any service ones.
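The namespacing scheme described in these notes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GSParams implementation: the function name `split_service_params`, the `a` interface arg and the `tempo` param of MusicQuery are all hypothetical.

```python
def split_service_params(args):
    """Separate interface args from namespaced (s1., s2., ...) service args.

    args maps cgi arg names to values. The order of names in the s arg
    decides which service each namespace refers to: s1 is the first, etc.
    """
    services = [name for name in args.get("s", "").split(",") if name]
    per_service = {name: {} for name in services}
    interface = {}

    for key, value in args.items():
        prefix, sep, rest = key.partition(".")
        # A namespaced arg looks like s<number>.<param>, e.g. s1.case
        if sep and prefix.startswith("s") and prefix[1:].isdigit():
            index = int(prefix[1:]) - 1
            if 0 <= index < len(services):
                per_service[services[index]][rest] = value
                continue
        # Anything else belongs to the receptionist or the action.
        interface[key] = value

    return interface, per_service


# Hypothetical page args for two services combined on one page.
interface_args, service_args = split_service_params({
    "a": "q",
    "s": "TextQuery,MusicQuery",
    "s1.case": "1",
    "s1.query": "snail",
    "s2.tempo": "fast",
})
```

Here `service_args["TextQuery"]` holds the s1 params and `service_args["MusicQuery"]` the s2 params, with the prefixes removed, ready to be sent to each service.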
    11321205\subsection{Internationalization}
    11331206
     
    13731446\label{tab:dirs}
    13741447\center{\footnotesize
    1375 \begin{tabular}{l p{7cm}}
     1448\begin{tabular}{l p{8cm}}
    13761449\hline
    13771450gsdl3
     
    13951468gsdl3/src/java/org/greenstone/gsdl3/action
    13961469  & Action classes used by the Receptionist---do the work of displaying the pages\\
    1397 gsdl3/src/java/org/greenstone/gsdl3/classes
    1398   & On compilation, the Java classes get put here---they can then be combined into a single jar file, and copied to the java lib directory \\
    13991470gsdl3/src/java/org/greenstone/gdbm
    14001471  & Java wrapper for gdbm---uses j-gdbm, a jni gdbm wrapper\\
     
    14631534\newcommand{\gshome}{\$GSDLHOME}
    14641535
    1465 Currently, Greenstone3 is only available through CVS. The installation procedure has been semi-automated. Note, these instructions are for installation on linux. If you want to use Greenstone3 on Windows, download it using CVS, then follow the instructions in \gst{http://www.cs.waikato.ac.nz/~mdewsnip/GSDL3Windows.html}.
     1536Currently, Greenstone3 is only available through CVS. The installation procedure has been semi-automated. Note, these instructions are for installation on linux. If you want to use Greenstone3 on Windows, download it using CVS, then follow the instructions in \gst{http://www.cs.waikato.ac.nz/\~mdewsnip/GSDL3Windows.html}.
    14661537
    14671538\subsubsection{Get the source}
     
    15461617It is possible to run several servlets at once, with different combinations of sites and/or interfaces.
    15471618
    1548 The file \gst{\gsdlhome/comms/jakarta/tomcat/conf/server.xml} is the Tomcat configuration file. The installation process adds a context for Greenstone3 servlets (\gst{\gsdlhome/web})---this tells Tomcat where to find the web.xml file, and what URL (\gst{/gsdl3}) to give it. Anything inside the context directory is accessible via Tomcat\footnote{can we use .htaccess files to restrict access??}. For example, the index.html file that lives in \gst{\gsdlhome/web} can be accessed through the URL \gst{localhost:8080/gsdl3/index.html}. The demo collection's images can be accessed through \gst{localhost:8080/gsdl3/sites/localsite/collect/demo/images/}~.
     1619The file \gst{\gsdlhome/comms/jakarta/tomcat/conf/server.xml} is the Tomcat configuration file. The installation process adds a context for Greenstone3 servlets (\gst{\gsdlhome/web})---this tells Tomcat where to find the web.xml file, and what URL (\gst{/gsdl3}) to give it. Anything inside the context directory is accessible via Tomcat\footnote{can we use .htaccess files to restrict access??}. For example, the index.html file that lives in \gst{\gsdlhome/web} can be accessed through the URL \gst{localhost:8080/gsdl3/index.html}. The demo collection's images can be accessed through \\
     1620\gst{localhost:8080/gsdl3/sites/localsite/collect/demo/images/}~.
    15491621
    15501622
    15511623Tomcat runs by default on port 8080---this can be changed in server.xml. The siteConfig files also need changing if Tomcat's port is changed: \gst{<httpAddress>} for the site, and \gst{<address>} for a remote site both use this.
     1624
    15521625
    15531626
     
    15801653On startup, the servlet loads in its collections and services. If the site or collection configuration files are changed, these changes will not take effect until the site/collection is reloaded. This can be done through the reconfiguration messages (see Section~\ref{sec:runtime-config}), or by restarting Tomcat.
    15811654
     1655Symlinks:
     1656
     1657By default, Tomcat doesn't follow symlinks (although the symlink to lib seems to work). To make it follow symlinks, e.g.\ to have the collect directory of a site somewhere else, you need to add the following to Tomcat's server.xml \\
     1658(\$GSDL3HOME/comms/jakarta/tomcat/conf/server.xml):
     1659\gst{<Resources allowLinking='true'/>}
     1660This needs to go inside the gsdl3 context, i.e.
     1661
     1662\begin{quote}\begin{gsc}
     1663<Context path="/gsdl3" docBase="\$GSDL3HOME/web" debug="1" \\
     1664reloadable="true">\\
     1665   <Resources allowLinking='true'/>\\
     1666</Context>\\
     1667\end{gsc}\end{quote}
     1668By default, Tomcat allows directory listings for everything in the docBase directory. For example, you can enter localhost:8080/gsdl3/sites and it will give you a list of all the sites. To turn this off, you need to edit Tomcat's default web.xml file (\$GSDL3HOME/comms/jakarta/tomcat/conf/web.xml):
     1669
     1670In the default servlet definition, change the 'listings' param to false.
     1671
     1672
     1673Running Tomcat with Apache:
     1674Apache can easily be set up to proxy Tomcat.
     1675
     1676For example, in the configuration for www.mysite.org:
     1677\begin{quote}\begin{gsc}
     1678<VirtualHost a.b.c.d>\\
     1679ServerName www.mysite.org\\
     1680...\\
     1681ProxyPass /greenstone3 http://puka.cs.waikato.ac.nz:8080/gsdl3\\
     1682ProxyPassReverse /greenstone3 http://puka.cs.waikato.ac.nz:8080/gsdl3\\
     1683</VirtualHost>\\
     1684\end{gsc}\end{quote}
     1685Tomcat can now be accessed at www.mysite.org/greenstone3 instead of at puka.cs.waikato.ac.nz:8080/gsdl3.
     1686
     1687If Tomcat is running behind a proxy and you need to make external connections, for example to access the infomine database, you need to fill in the proxy element in the siteConfig.xml file. Unfortunately, the password is added in plain text, but access can be restricted so that only the server administrator can see it.
    15821688\subsubsection{Using SOAP to talk to a remote site}
    15831689
     
    16281734\section{Greenstone Customization}
    16291735
     1736This section covers the dynamic customization, which takes effect either immediately or after a Tomcat restart.
    16301737\subsection{How to define a new interface}
    16311738
    1632 Most of an interface is defined by XSLT files, which are stored in web/interfaces/interface-name/transform.
     1739Most of an interface is defined by XSLT files, which are stored in web/interfaces/interface-name/transform. A new interface needs a directory in web/interfaces. Inside, it needs images and transform directories and an interfaceConfig.xml file. Any XSLT file may be overridden for a new interface by putting the replacement in the new interface's transform directory. If the appropriate XSLT file is not there, the default one will be used, so only the files that need changing have to be overridden.
     1740XSLT files are looked for in the following order: collection, site, interface, default interface. This also applies to included XSLT files (although this doesn't work for collections or sites on remote computers). The xsl:include directives are preprocessed by the Java code, and full paths are added based on the availability of the files, so that the correct one is used.
     1741You cannot include a template with the same name as the includer.
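The lookup order described above can be illustrated with a minimal Java sketch. It is not the actual Greenstone3 code: the class name \gst{StylesheetLocator}, the method \gst{locate} and the way the search directories are passed in are assumptions made for illustration only. The caller supplies the candidate directories in the order collection, site, interface, default interface, and the first directory that contains the requested file wins.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of Greenstone3: resolves a stylesheet name
// against an ordered list of directories and returns the first match.
final class StylesheetLocator {

    private StylesheetLocator() { }

    // searchOrder is assumed to be: collection, site, interface,
    // default interface (most specific first).
    static Optional<Path> locate(String xsltName, List<Path> searchOrder) {
        for (Path dir : searchOrder) {
            Path candidate = dir.resolve(xsltName);
            if (Files.isRegularFile(candidate)) {
                // The most specific existing file overrides the others.
                return Optional.of(candidate);
            }
        }
        // Not found at any level.
        return Optional.empty();
    }
}
```

With this shape, overriding a single stylesheet only requires placing a file of the same name in a more specific directory; everything else falls through to the default interface.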
    16331742\subsection{Adding a new language}
    16341743
     
    16381747\section{Greenstone Development}
    16391748
     1749This section covers the customization that requires recompilation.
    16401750Here are some random notes for developers who want to modify the source code.
    16411751\subsection{Greenstone utility classes}
     
    16711781\subsection{Creating new services}
    16721782
    1673 *inherit from service rack
     1783*inherit from ServiceRack, the abstract base class. This handles the main process method, and determines the service name and request type. If the request type is describe and to is empty, it returns a list of services (short\_service\_info), which is initialised in the configure method. A describe request to a particular service results in getServiceDescription being called, which must be supplied by the subclass.
     1784Other request types (process) get sent to processXXX methods, where XXX is the service name.
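The dispatch described above can be sketched in Java as follows. This is a simplified illustration, not the real ServiceRack: the class names \gst{ServiceRackSketch} and \gst{TextQueryRack} are hypothetical, requests are plain strings rather than XML elements, and the use of reflection to find the processXXX method is an assumption about how such a lookup could be done.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for ServiceRack (strings instead of XML).
abstract class ServiceRackSketch {

    // Filled in by configure(); plays the role of short_service_info.
    protected final List<String> shortServiceInfo = new ArrayList<>();

    // Subclasses register their services here.
    abstract void configure();

    // Subclasses must describe each individual service.
    abstract String getServiceDescription(String service);

    // Main entry point: 'to' is the service name, 'type' the request type.
    final String process(String to, String type, String body) {
        if ("describe".equals(type)) {
            if (to.isEmpty()) {
                // Describe request to the rack itself: list the services.
                return String.join(",", shortServiceInfo);
            }
            return getServiceDescription(to);
        }
        try {
            // Other request types go to processXXX, where XXX is the service name.
            Method handler = getClass().getDeclaredMethod("process" + to, String.class);
            handler.setAccessible(true);
            return (String) handler.invoke(this, body);
        } catch (ReflectiveOperationException e) {
            return "error: no handler for service " + to;
        }
    }
}

// Example subclass providing a single, hypothetical TextQuery service.
class TextQueryRack extends ServiceRackSketch {

    @Override
    void configure() {
        shortServiceInfo.add("TextQuery");
    }

    @Override
    String getServiceDescription(String service) {
        return "description of " + service;
    }

    // Called for process requests addressed to TextQuery.
    String processTextQuery(String body) {
        return "results for " + body;
    }
}
```

In this sketch a subclass only supplies configure, getServiceDescription and one processXXX method per service; the base class takes care of routing.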
    16741785
    16751786* what methods are expected
     
    18201931                   A new instance is to be created with the args constructor arguments (if any). All constructor methods
    18211932                   are qualified for method selection.
    1822                    Example: <xsl:variable name="myType"
    1823                           select="my-package:extclass.new()">
     1933                   Example: \gst{<xsl:variable name="myType"
     1934                          select="my-package:extclass.new()">}
    18241935
    18251936                   To invoke an instance method on a specified instance:
     
    18301941                   are qualified methods. If a matching method is found, object will be used to identify the object instance
    18311942                   and args will be passed to the invoked method.
    1832                    Example: <xsl:variable name="new-pop"
    1833                         select="my-package:valueOf(\$myType, string(@population))">
     1943                   Example: \gst{<xsl:variable name="new-pop"
     1944                        select="my-package:valueOf(\$myType, string(@population))">}
    18341945
    18351946                   To invoke a static method:
     
    18411952                   the name methodName are qualified methods. If a matching method is found, args will be passed to the
    18421953                   invoked static method.
    1843                    Example: <xsl:variable name="new-pop"
    1844                         select="my-package:extclass.printit(string(@population))">
     1954                   Example: \gst{<xsl:variable name="new-pop"
     1955                        select="my-package:extclass.printit(string(@population))">}
    18451956
    18461957
     
    18591970                   class name of the class whose constructor is to be called. A new instance is to be created with the
    18601971                   args constructor arguments (if any). All constructor methods are qualified for method selection.
    1861                    Example: <xsl:variable name="myHash"
    1862                           select="java:java.util.Hashtable.new()">
     1972                   Example: \gst{<xsl:variable name="myHash"
     1973                          select="java:java.util.Hashtable.new()">}
    18631974
    18641975                   To invoke an instance method on a specified instance:
     
    18691980                   are qualified methods. If a matching method is found, object will be used to identify the object instance
    18701981                   and args will be passed to the invoked method.
    1871                    Example: <xsl:variable name="new-pop"
    1872                         select="java:put(\$myHash, string(@region), \$newpop)">
     1982                   Example: \gst{<xsl:variable name="new-pop"
     1983                        select="java:put(\$myHash, string(@region), \$newpop)">}
    18731984
    18741985                   To invoke a static method:
     
    18791990                   args arguments. Only static methods with the name methodName are qualified methods. If a matching
    18801991                   method is found, args will be passed to the invoked static method.
    1881                    Example: <xsl:variable name="new-pop"
    1882                         select="java:java.lang.Integer.valueOf(string(@population))">
     1992                   Example: \gst{<xsl:variable name="new-pop"
     1993                        select="java:java.lang.Integer.valueOf(string(@population))">}
    18831994
    18841995