Changeset 5435
- Timestamp:
- 2003-09-03T14:03:13+12:00
- File:
-
- 1 edited
trunk/gsdl3/docs/manual/manual.tex
r4892 r5435 54 54 A description of the general design and architecture of Greenstone3 is covered by the document {\em The design of Greenstone3: An agent based dynamic digital library} (design-2002.ps, in the gsdl3/docs/manual directory). 55 55 56 NOTES: structure: make the classes and messages separate. have a class hierarchy and a module hierarchy/picture - keep the two separate. schemas/subschemas?? 57 user vs developer - make a clearer distinction 58 are we going to publish an API. what is it? what do we want to provide? 56 59 \section{System modules}\label{sec:modules} 57 60 … … 447 450 448 451 \subsection{'describe'-type messages}\label{sec:describe} 449 **** REDO (the responses may now contain display information which is not shown here) **** 450 This is the first of the standard internal messages. 451 The most basic message is ``describe-yourself'', which can be sent to any module in the system. The module responds with a semi-predefined piece of XML, making these requests very efficient. The response is predefined apart from any language-specific text strings, which are put together as each request comes in. 452 453 The most basic of the internal standard requests is ``describe-yourself'', which can be sent to any module in the system. The module responds with a semi-predefined piece of XML, making these requests very efficient. The response is predefined apart from any language-specific text strings, which are put together as each request comes in, based on the language attribute of the request. 452 454 \begin{quote}\begin{gsc}\begin{verbatim} 453 455 <request lang='en' type='describe' to=''/> 454 456 \end{verbatim}\end{gsc}\end{quote} 455 If the \gst{to} field is empty, it is answered by the MessageRouter.457 If the \gst{to} field is empty, a request is answered by the MessageRouter. 
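As a minimal sketch of the request format just described, the following Java builds the describe request as a DOM element. The class and method names (DescribeRequestSketch, describeRequest) are hypothetical and not part of Greenstone3; only the element name and the lang, type and \gst{to} attributes come from the example above. How the request is handed to a MessageRouter is not shown here.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Illustration only: hypothetical helper, not a Greenstone3 class. */
class DescribeRequestSketch {

    /**
     * Builds <request lang='...' type='describe' to='...'/>.
     * An empty 'to' means the request is answered by the MessageRouter.
     */
    public static Element describeRequest(String lang, String to) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element request = doc.createElement("request");
            request.setAttribute("lang", lang);
            request.setAttribute("type", "describe");
            request.setAttribute("to", to);
            doc.appendChild(request);
            return request;
        } catch (ParserConfigurationException e) {
            // No usable XML parser in this JVM; rethrow unchecked.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Empty 'to': addressed to the MessageRouter itself.
        Element toRouter = describeRequest("en", "");
        // Non-empty 'to': addressed to a named module (name is an example).
        Element toModule = describeRequest("en", "mgppdemo");
        System.out.println("to='" + toRouter.getAttribute("to") + "'");
        System.out.println("to='" + toModule.getAttribute("to") + "'");
    }
}
```

The language attribute is set on every request because, as noted above, the language-specific text strings in the response are assembled from it.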
456 458 An example response from a MessageRouter might look like this: 457 459 \begin{quote}\begin{gsc}\begin{verbatim} 458 460 <response lang='en' type='describe'> 459 <serviceList> 460 <service name='CrossCollectionSearch' type='query' /> 461 </serviceList> 461 <serviceList/> 462 462 <siteList> 463 463 <site name='org.greenstone.gsdl1' … … 465 465 type='soap' /> 466 466 </siteList> 467 <serviceClusterList> 468 <serviceCluster name="build" /> 469 </serviceClusterList> 467 470 <collectionList> 468 471 <collection name='org.greenstone.gsdl1/ … … 474 477 </response> 475 478 \end{verbatim}\end{gsc}\end{quote} 476 This MessageRouter has one site-wide service, a cross-collection searching service.It479 This MessageRouter has no individual site-wide services (an empty \gst{<serviceList>}), but has a service cluster called build (which provides collection importing and building functionality). It 477 480 communicates with one site, \gst{org.greenstone.gsdl1}. It is aware of four 478 481 collections. One of these, \gst{myfiles}, belongs to it; the other three are … … 496 499 </request> 497 500 \end{verbatim}\end{gsc}\end{quote} 498 When a collection or service cluster is asked to describe itself, what is returned is all of the 499 collection specific metadata anda list of services. For example, here is such501 502 When a collection or service cluster is asked to describe itself, what is returned is a list of metadata, some display elements, and a list of services. For example, here is such 500 503 a message, along with a sample response. 
501 504 502 505 \begin{quote}\begin{gsc}\begin{verbatim} 503 <request lang='en' type='describe' to='demo'/> 504 505 <response lang='en' type='describe' from='demo' > 506 <collection name='demo'> 506 <request lang='en' type='describe' to='mgppdemo'/> 507 508 <response from="mgppdemo" type="describe"> 509 <collection name="mgppdemo"> 510 <displayItem lang="en" name="name">greenstone mgpp demo 511 </displayItem> 512 <displayItem lang="en" name="description">This is a 513 demonstration collection for the Greenstone digital 514 library software. It contains a small subset (11 books) 515 of the Humanity Development Library. It is built with 516 mgpp.</displayItem> 517 <displayItem lang="en" name="icon">mgppdemo.gif</displayItem> 507 518 <serviceList> 508 <service name='TextQuery' type='query' /> 509 <service name='DocumentContentRetrieve' type='retrieve' /> 510 <service name='DocumentMetadataRetrieve' type='retrieve' /> 519 <service name="DocumentStructureRetrieve" type="retrieve" /> 520 <service name="DocumentMetadataRetrieve" type="retrieve" /> 521 <service name="DocumentContentRetrieve" type="retrieve" /> 522 <service name="ClassifierBrowse" type="browse" /> 523 <service name="ClassifierBrowseMetadataRetrieve" 524 type="retrieve" /> 525 <service name="TextQuery" type="query" /> 526 <service name="FieldQuery" type="query" /> 527 <service name="AdvancedFieldQuery" type="query" /> 528 <service name="PhindApplet" type="applet" /> 511 529 </serviceList> 512 530 <metadataList> 513 <metadata name='numDocs'>321</metadata> 514 <metadata name='numSections'>5532</metadata> 515 <metadata name='colName' lang='en'>The demo collection</metadata> 516 <metadata name='colDescription' lang='en'>This is a demo collection. 
517 </metadata> 531 <metadata name="creator">[email protected]</metadata> 532 <metadata name="maintainer">[email protected]</metadata> 533 <metadata name="numDocs">11</metadata> 534 <metadata name="buildType">mgpp</metadata> 535 <metadata name="httpPath">http://kanuka:8090/gsdl3/sites/ 536 localsite/collect/mgppdemo</metadata> 518 537 </metadataList> 519 538 </collection> … … 521 540 \end{verbatim}\end{gsc}\end{quote} 522 541 523 The subset parameter can also be used in a describe request to a collection, to retrieve just the metadataList or serviceList. 542 This collection provides many typical services... 543 544 The subset parameter can also be used in a describe request to a collection, to retrieve just the \gst{metadataList} or \gst{serviceList}. 524 545 525 546 A \gst{describe} request sent to a service returns a list of parameters that … … 563 584 The type attribute is used to determine how to display the parameters on a web page or interface. For example, a string parameter may result in a text entry box, a boolean an on/off button, enum\_single/enum\_multi a drop-down menu, where one or many items, respectively, can be selected. 564 585 A multi-type parameter indicates that two or more parameters are associated, and should be displayed appropriately. For example, in a field query, the text box and field list should be associated. The occurs attribute specifies how many times the parameter should be displayed on the page. 565 Parameters also come with display information: all the text strings needed to present them to t eh user. These include the name of the parameter and the display values for any options.566 567 A service description also contains a display element - this contains all the language dependent text strings - put together on the fly. 
These strings are name of the service, what to use for the submit button, and text strings for all the parameters: name, what each value is called, etc.568 569 Here is a sample describe request to the FieldQuery service of collection mgppdemo, along with its response. Figure~\ref{fig:query-display} gives an example html search form that may be generated from this describe response.586 Parameters also come with display information: all the text strings needed to present them to the user. These include the name of the parameter and the display values for any options. These are included in the above parameter descriptions in the form of \gst{<displayItem>} elements. 587 588 A service description also contains some display information---this includes the name of the service, and the text for the submit button. 589 590 Here is a sample describe request to the FieldQuery service of collection mgppdemo, along with its response. The parameters in this example include their display information. Figure~\ref{fig:query-display} gives an example html search form that may be generated from this describe response. 
570 591 571 592 \begin{quote}\begin{gsc}\begin{verbatim} … … 574 595 <response from="mgppdemo/FieldQuery" type="describe"> 575 596 <service name="FieldQuery" type="query"> 597 <displayItem name="name">Form Query</displayItem> 598 <displayItem name="submit">Search</displayItem> 576 599 <paramList> 577 <param default="Section" name="level" type="enum_single"> 578 <option name="Document" /> 579 <option name="Section" /> 600 <param default="Document" name="level" type="enum_single"> 601 <displayItem name="name">Granularity to search at</displayItem> 602 <option name="Document"> 603 <displayItem name="name">Document</displayItem> 604 </option> 605 <option name="Section"> 606 <displayItem name="name">Section</displayItem> 607 </option> 580 608 </param> 581 <param default="1" name="case" type="boolean" /> 582 <param default="1" name="stem" type="boolean" /> 583 <param default="10" name="maxDocs" type="integer" /> 609 <param default="1" name="case" type="boolean"> 610 <displayItem name="name">Turn casefolding </displayItem> 611 <option name="0"> 612 <displayItem name="name">off</displayItem> 613 </option> 614 <option name="1"> 615 <displayItem name="name">on</displayItem> 616 </option> 617 </param> 618 <param default="1" name="stem" type="boolean"> 619 <displayItem name="name">Turn stemming </displayItem> 620 <option name="0"> 621 <displayItem name="name">off</displayItem> 622 </option> 623 <option name="1"> 624 <displayItem name="name">on</displayItem> 625 </option> 626 </param> 627 <param default="10" name="maxDocs" type="integer"> 628 <displayItem name="name">Maximum documents to return 629 </displayItem> 630 </param> 584 631 <param name="simpleField" occurs="4" type="multi"> 585 <param name="fqv" type="string" /> 586 <param default="" name="fqf" type="enum_single"> 587 <option name="ZZ" /><option name="TX" /> 588 <option name="SU" /><option name="TI" /> 632 <displayItem name="name"></displayItem> 633 <param name="fqv" type="string"> 634 <displayItem name="name">Word or 
phrase </displayItem> 635 </param> 636 <param default="ZZ" name="fqf" type="enum_single"> 637 <displayItem name="name">in field</displayItem> 638 <option name="ZZ"> 639 <displayItem name="name">All fields</displayItem> 640 </option> 641 <option name="TX"> 642 <displayItem name="name">TextOnly</displayItem> 643 </option> 644 <option name="SU"> 645 <displayItem name="name">Subject</displayItem> 646 </option> 647 <option name="TI"> 648 <displayItem name="name">Title</displayItem> 649 </option> 589 650 </param> 590 651 </param> 591 652 </paramList> 592 <display>593 <name>Form Query</name>594 <submit>Search</submit>595 <param name="level">596 <name>Granularity to search at</name>597 <option name="Document">Document</option>598 <option name="Section">Section</option>599 </param>600 <param name="case">601 <name>Turn casefolding </name>602 <option name="0">off</option>603 <option name="1">on</option>604 </param>605 <param name="stem">606 <name>Turn stemming </name>607 <option name="0">off</option>608 <option name="1">on</optin>609 </param>610 <param name="maxDocs">611 <name>Maximum documents to return</name>612 </param>613 <param name="fqv">614 <name>Search for </name>615 </param>616 <param name="fqf">617 <name>in field</name>618 <option name="ZZ">All fields</option>619 <option name="TX">TextOnly</option>620 <option name="SU">Subject</option>621 <option name="TI">Title</option>622 </param>623 </display>624 653 </service> 625 654 </response> … … 657 686 The Phind java applet. 658 687 </applet> 688 <displayItem name="name">Browse phrase hierarchies</displayItem> 659 689 </service> 660 690 </response> 661 691 \end{verbatim}\end{gsc}\end{quote} 662 692 663 Note that the library parameter has been left blank. This is because library refers to the current servlet that is running and the name is not necessarily known in advance. 
So either the applet action or the receptionist must fill in this parameter before displaying the html.693 Note that the library parameter has been left blank. This is because library refers to the current servlet that is running and the name is not necessarily known in advance. So either the applet action or the Receptionist must fill in this parameter before displaying the html. 664 694 665 695 \subsection{'system'-type messages}\label{sec:system} 666 696 667 ``System'' requests are used to tell a MessageRouter, Collection or ServiceCluster to update its cached information and activate or deactivate other modules. For example, the MessageRouter has a set of Collection modules that it can talk to. It also holds some XML information about those collections---this is returned when a request for a collection list comes in. If a collection is deleted or modified, or a new one created, this information may need to change, and the list of available modules may also change. Curren lty they are initiated by particular cgi parameters (see Section~\ref{sec:runtime-config}).697 ``System'' requests are used to tell a MessageRouter, Collection or ServiceCluster to update its cached information and activate or deactivate other modules. For example, the MessageRouter has a set of Collection modules that it can talk to. It also holds some XML information about those collections---this is returned when a request for a collection list comes in. If a collection is deleted or modified, or a new one created, this information may need to change, and the list of available modules may also change. Currently they are initiated by particular cgi parameters (see Section~\ref{sec:runtime-config}). 668 698 669 699 The basic format of a system request is as follows: … … 859 889 One or more parameters specifying metadata may be included in a request. Also, a value of \gst{all} will retrieve all the metadata for each document. 
860 890 861 Any browse-type service must also implement a metadata retrieval service to provide metadata for the nodes in the classification hierarchy. The name of it is the bro se service name plus \gst{MetadataRetrieve}. For example, the ClassifierBrowse service described in the previous section should also have a ClassifierBrowseMetadataRetrieve service. The request and response format is exactly the same as for the DocumentMetadataREtrieve service, except that \gst{<documentNode>} elements are replaced by \gst{<classifierNode>} elements (and the corresponding list element is also changed).891 Any browse-type service must also implement a metadata retrieval service to provide metadata for the nodes in the classification hierarchy. The name of it is the browse service name plus \gst{MetadataRetrieve}. For example, the ClassifierBrowse service described in the previous section should also have a ClassifierBrowseMetadataRetrieve service. The request and response format is exactly the same as for the DocumentMetadataRetrieve service, except that \gst{<documentNode>} elements are replaced by \gst{<classifierNode>} elements (and the corresponding list element is also changed). 862 892 863 893 Give me the text (content) of this document: … … 873 903 <documentNodeList> 874 904 <documentNode nodeID="HASHac0a04dd14571c60d7fbfd.4.2"> 875 <nodeContent><Section> 876 877 </B><P ALIGN="JUSTIFY"></P> 878 <P ALIGN="JUSTIFY">190. When the plants in your second pen have 879 grown big enough to provide food and shelter, you can put in the snails.</P> 880 905 <nodeContent><Section> 906 </B><P ALIGN="JUSTIFY"></P> 907 <P ALIGN="JUSTIFY">190. When the plants in 908 your second pen have grown big enough to provide food and 909 shelter, you can put in the snails.</P> 881 910 </nodeContent> 882 911 </documentNode> … … 885 914 \end{verbatim}\end{gsc}\end{quote} 886 915 887 The content of a node is returned in a \gst{<nodeContent>} element. 
916 The content of a node is returned in a \gst{<nodeContent>} element. In this case it is escaped HTML. 888 917 889 918 Give me the ancestors and children of the specified node, along with the number of siblings it has: … … 921 950 \end{verbatim}\end{gsc}\end{quote} 922 951 923 Structure is returned inside a nodeStructure element, while structural info is returned in a nodeStructureInfoelement. Possible values for strcuture parameters are as for browse services: \gst{ancestors}, \gst{parent}, \gst{siblings}, \gst{children}, \gst{descendents}. Possible values for info parameters are \gst{numSiblings}, \gst{siblingPosition}, \gst{numChildren}. 952 Structure is returned inside a \gst{<nodeStructure>} element, while structural info is returned in a \gst{<nodeStructureInfo>} element. Possible values for structure parameters are as for browse services: \gst{ancestors}, \gst{parent}, \gst{siblings}, \gst{children}, \gst{descendents}. Possible values for info parameters are \gst{numSiblings}, \gst{siblingPosition}, \gst{numChildren}. 924 953 925 954 \subsubsection{'process'-type services}\label{sec:process} 926 Process requests are not a request for data---they are a request forsome action to be carried out, for example, create a new collection, or import a collection. The response is a status or an error message. The import and build commands may take a long time to complete, so a response is sent back after a successful start to the command. The status may be polled by the requester to see how the process is going. 955 Requests to process-type services are not requests for data---they request some action to be carried out, for example, create a new collection, or import a collection. The response is a status or an error message. The import and build commands may take a long time to complete, so a response is sent back after a successful start to the command. The status may be polled by the requester to see how the process is going.
927 956 928 957 Process requests generally contain just a parameter list. Like for any service, the parameters used by a process-type service can be obtained by a describe request to that service. … … 1001 1030 \subsubsection{'enrich'-type services} 1002 1031 1032 Enrich services typically take some text of documents (inside \gst{<nodeContent>} tags) and returns the text marked up in some way. One example of this is the GatePOSTag service: this identifies Dates, Locations, People and Organizations in the text, and annotates the text with the labels. In the following example, the request is for Location and Dates to be identified. 1003 1033 *** TODO **** 1034 \begin{quote}\begin{gsc}\begin{verbatim} 1035 <request lang="en" to="GatePOSTag" type="process"> 1036 <paramList> 1037 <param name="annotationType" value="Date,Location" /> 1038 </paramList> 1039 <documentNodeList> 1040 <documentNode nodeID="HASHac0a04dd14571c60d7fbfd"> 1041 <nodeContent> 1042 FOOD AND AGRICULTURE ORGANIZATION OF THE UNITED NATIONS 1043 Rome 1986 1044 P-69 1045 ISBN 92-5-102397-2 1046 FAO 1986 1047 </nodeContent> 1048 </documentNode> 1049 </documentNodeList> 1050 </request> 1051 1052 <response from="GatePOSTag" type="process"> 1053 <documentNodeList> 1054 <documentNode nodeID="HASHac0a04dd14571c60d7fbfd"> 1055 <nodeContent> 1056 FOOD AND AGRICULTURE ORGANIZATION OF THE UNITED NATIONS 1057 <annotation type="Location">Rome</annotation> 1058 <annotation type="Date">1986</annotation> 1059 P-69 1060 ISBN 92-5-102397-2 1061 FAO <annotation type="Date">1986</annotation> 1062 </nodeContent> 1063 </documentNode> 1064 </documentNodeList> 1065 </response> 1066 \end{verbatim}\end{gsc}\end{quote} 1004 1067 1005 1068 \subsection{'status'-type messages}\label{sec:status} … … 1078 1141 \section{Page generation}\label{sec:pagegen} **** REDO ******** 1079 1142 1080 \subsection{Receptionists}\label{sec:recepts}1081 1082 The receptionist is the controlling module for the page generation part of greenstone. 
It has the job of loading up all the actions, and it knows about the message router it and the actions are supposed to talk to. It routes messages received to the appropriate action (page-type messages) or directly to the message router (all other types). Receptionists also do other things, for example, adding to the page received back from the action any information that is common to all pages.1083 1084 There are different ways of providing an interface to greenstone, from web based cgi style (using servlets) to Java GUI applications. These different interfaces require slightly different responses from a receptionist, so we provide several standard types of receptionist.1085 1086 Receptionist: This is the most basic receptionist. The page it returns consists of the original request, and the response from the action it was sent to. Methods preProcessRequest, and postProcessPage are called on the request and page, respectively, but in this basic receptionist, they dont do anything.1087 1088 TransformingReceptionist: This extends Receptionist, and overwrites postProcessPage to transform the page using xslt. An xslt is listed for each action in the receptionists config file, and this is used to transform the page. First, some display information, and config information is added to the page. Then it is transformed using the specified xslt for the action, and returned.1089 1090 WebReceptionist: The WebReceptionist extends TransformingREceptionist. It doesn't do much else except some argument conversion. To keep the url's short, parameters from the services are given shortnames, and these are used in the web pages.1091 1092 DefaultReceptionist: This extends WebReceptionist, and is the default one for greenstone 3 servlets. Due to the page design, some extra information is needed for each page: some metadata about the current collection. 
THe receptionist sends a describe request to teh collection to get this, and appends it to teh page before transformation using xslt.1093 1094 NZDLReceptionist: (do we want to talk about this?) This is an example of a custom receptionist. For a look-alike nzdl.org system, even more information is needed for each page, namely the list of classifiers available from teh ClassifierBrowse service.1095 1096 By default, the LibraryServlet uses DefaultReceptionist. However, there is an init-param called receptionist which can be set to make the servlet use a different one.1097 1098 1099 1143 * talk general first: get data, get format info, transform gsf->xsl. transfrom xml->html 1100 1144 … … 1130 1174 ***TODO*** describe a bit more?? currently only can get this locally 1131 1175 1176 \subsection{Receptionists}\label{sec:recepts} 1177 1178 The receptionist is the controlling module for the page generation part of greenstone. It has the job of loading up all the actions, and it knows about the message router it and the actions are supposed to talk to. It routes messages received to the appropriate action (page-type messages) or directly to the message router (all other types). Receptionists also do other things, for example, adding to the page received back from the action any information that is common to all pages. 1179 1180 There are different ways of providing an interface to greenstone, from web based cgi style (using servlets) to Java GUI applications. These different interfaces require slightly different responses from a receptionist, so we provide several standard types of receptionist. 1181 1182 Receptionist: This is the most basic receptionist. The page it returns consists of the original request, and the response from the action it was sent to. Methods preProcessRequest, and postProcessPage are called on the request and page, respectively, but in this basic receptionist, they dont do anything. 
1183 1184 TransformingReceptionist: This extends Receptionist, and overrides postProcessPage to transform the page using XSLT. An XSLT file is listed for each action in the receptionist's config file, and this is used to transform the page. First, some display information and config information is added to the page. Then it is transformed using the specified XSLT for the action, and returned. 1185 1186 WebReceptionist: The WebReceptionist extends TransformingReceptionist. It doesn't do much else except some argument conversion. To keep the URLs short, parameters from the services are given short names, and these are used in the web pages. 1187 1188 DefaultReceptionist: This extends WebReceptionist, and is the default one for Greenstone3 servlets. Due to the page design, some extra information is needed for each page: some metadata about the current collection. The receptionist sends a describe request to the collection to get this, and appends it to the page before transformation using XSLT. 1189 1190 NZDLReceptionist: (do we want to talk about this?) This is an example of a custom receptionist. For a look-alike nzdl.org system, even more information is needed for each page, namely the list of classifiers available from the ClassifierBrowse service. 1191 1192 By default, the LibraryServlet uses DefaultReceptionist. However, there is an init-param called receptionist which can be set to make the servlet use a different one. 1193 1194 \subsection{cgi args} 1195 1196 The args used by the page come from several sources. The receptionist uses a couple, and actions and services use some. The receptionist and actions are treated as a whole, so they must not have conflicting args. The GSParams class specifies all the general basic args, and whether they should be saved or not. The servlet has an init parameter params\_class that specifies which params class to use if GSParams has been subclassed.
Actions or the receptionist may specify some new ones. 1197 1198 Services may be created by different people, and may be on a different site, so we can't guarantee that their params won't conflict with action params, or even with those of other services. 1199 So service params are namespaced when they are put on the page. (Interface (receptionist and action) params have no namespace.) The default namespace is s1 (service1): any params that are for the service will be prefixed by this. E.g. the case param for a search will be put in the page as s1.case. 1200 The actions must now look for all the s1 params to send to the service. 1201 1202 If there are two or more services combined on a page with a single submit button, they will use s1, s2, s3 etc. as needed. The s param (service) will end up with a list, e.g. s=TextQuery,MusicQuery, and the order of these determines the mapping order of the namespaces, i.e. s1 will be TextQuery, s2 MusicQuery. 1203 1204 Also talk about saving args: save the ones that GSParams says to save, and any service ones should always be saved. 1132 1205 \subsection{Internationalization} 1133 1206 … … 1373 1446 \label{tab:dirs} 1374 1447 \center{\footnotesize 1375 \begin{tabular}{l p{ 7cm}} 1448 \begin{tabular}{l p{8cm}} 1376 1449 \hline 1377 1450 gsdl3 … … 1395 1468 gsdl3/src/java/org/greenstone/gsdl3/action 1396 1469 & Action classes used by the Receptionist---do the work of displaying the pages\\ 1397 gsdl3/src/java/org/greenstone/gsdl3/classes 1398 & On compilation, the Java classes get put here---they can then be combined into a single jar file, and copied to the java lib directory \\ 1399 1470 gsdl3/src/java/org/greenstone/gdbm 1400 1471 & Java wrapper for gdbm---uses j-gdbm, a jni gdbm wrapper\\ … … 1463 1534 \newcommand{\gshome}{\$GSDLHOME} 1464 1535 1465 Currently, Greenstone3 is only available through CVS. The installation procedure has been semi-automated. Note, these instructions are for installation on linux.
If you want to use Greenstone3 on Windows, download it using CVS, then follow the instructions in \gst{http://www.cs.waikato.ac.nz/ ~mdewsnip/GSDL3Windows.html}.1536 Currently, Greenstone3 is only available through CVS. The installation procedure has been semi-automated. Note, these instructions are for installation on linux. If you want to use Greenstone3 on Windows, download it using CVS, then follow the instructions in \gst{http://www.cs.waikato.ac.nz/\~mdewsnip/GSDL3Windows.html}. 1466 1537 1467 1538 \subsubsection{Get the source} … … 1546 1617 It is possible to run several servlets at once, with different combinations of sites and/or interfaces. 1547 1618 1548 The file \gst{\gsdlhome/comms/jakarta/tomcat/conf/server.xml} is the Tomcat configuration file. The installation process adds a context for Greenstone3 servlets (\gst{\gsdlhome/web})---this tells Tomcat where to find the web.xml file, and what URL (\gst{/gsdl3}) to give it. Anything inside the context directory is accessible via Tomcat\footnote{can we use .htaccess files to restrict access??}. For example, the index.html file that lives in \gst{\gsdlhome/web} can be accessed through the URL \gst{localhost:8080/gsdl3/index.html}. The demo collection's images can be accessed through \gst{localhost:8080/gsdl3/sites/localsite/collect/demo/images/}~. 1619 The file \gst{\gsdlhome/comms/jakarta/tomcat/conf/server.xml} is the Tomcat configuration file. The installation process adds a context for Greenstone3 servlets (\gst{\gsdlhome/web})---this tells Tomcat where to find the web.xml file, and what URL (\gst{/gsdl3}) to give it. Anything inside the context directory is accessible via Tomcat\footnote{can we use .htaccess files to restrict access??}. For example, the index.html file that lives in \gst{\gsdlhome/web} can be accessed through the URL \gst{localhost:8080/gsdl3/index.html}. The demo collection's images can be accessed through \\ 1620 \gst{localhost:8080/gsdl3/sites/localsite/collect/demo/images/}~. 
1549 1621 1550 1622 1551 1623 Tomcat runs by default on port 8080---this can be changed in server.xml. The siteConfig files also need changing if Tomcat's port is changed: \gst{<httpAddress>} for the site, and \gst{<address>} for a remote site both use this. 1624 1552 1625 1553 1626 … … 1580 1653 On startup, the servlet loads in its collections and services. If the site or collection configuration files are changed, these changes will not take effect until the site/collection is reloaded. This can be done through the reconfiguration messages (see Section~\ref{sec:runtime-config}), or by restarting Tomcat. 1581 1654 1655 Symlinks: 1656 1657 Tomcat by default doesn't follow symlinks (although the symlink to lib seems to work). To make it follow symlinks, e.g. to have the collect directory of a site somewhere else, you need to add the following to Tomcat's server.xml \\ 1658 (\$GSDL3HOME/comms/jakarta/tomcat/conf/server.xml): 1659 \gst{<Resources allowLinking='true'/>} 1660 This needs to go inside the gsdl3 context, i.e. 1661 1662 \begin{quote}\begin{gsc} 1663 <Context path="/gsdl3" docBase="\$GSDL3HOME/web" debug="1" \\ 1664 reloadable="true">\\ 1665 <Resources allowLinking='true'/>\\ 1666 </Context>\\ 1667 \end{gsc}\end{quote} 1668 By default, Tomcat allows directory listings for everything in the docBase directory. For example, you can enter localhost:8080/gsdl3/sites and it will give you a list of all the sites. To turn this off, you need to edit Tomcat's default web.xml file (\$GSDL3HOME/comms/jakarta/tomcat/conf/web.xml): 1669 1670 In the default servlet definition, change the 'listings' param to false. 1671 1672 1673 Running Tomcat with Apache.
1674 Apache can easily be set up to proxy Tomcat, e.g. 1675 1676 in the www.mysite.org virtual host: 1677 \begin{quote}\begin{gsc} 1678 <VirtualHost a.b.c.d>\\ 1679 ServerName www.mysite.org\\ 1680 ...\\ 1681 ProxyPass /greenstone3 http://puka.cs.waikato.ac.nz:8080/gsdl3\\ 1682 ProxyPassReverse /greenstone3 http://puka.cs.waikato.ac.nz:8080/gsdl3\\ 1683 </VirtualHost>\\ 1684 \end{gsc}\end{quote} 1685 Tomcat can now be accessed at www.mysite.org/greenstone3 instead of at puka.cs.waikato.ac.nz:8080/gsdl3. 1686 1687 If Tomcat is running behind a proxy, and you want to access things like the infomine database where you need to make external connections, you need to fill in the proxy element in the siteConfig.xml file. Unfortunately the password is added in plain text, but it can be arranged so that only the server admin can see it. 1582 1688 \subsubsection{Using SOAP to talk to a remote site} 1583 1689 … … 1628 1734 \section{Greenstone Customization} 1629 1735 1736 This is the dynamic stuff, which takes effect immediately or through a Tomcat restart. 1630 1737 \subsection{How to define a new interface} 1631 1738 1632 Most of an interface is defined by XSLT files, which are stored in web/interfaces/interface-name/transform. 1739 Most of an interface is defined by XSLT files, which are stored in web/interfaces/interface-name/transform. A new interface needs a directory in web/interfaces. Inside, it needs images and transform directories, and an interfaceConfig.xml file. Any XSLT file may be overridden for a new interface by putting the replacement in the new interface's transform directory. If the appropriate XSLT file is not there, the default one will be used, so only a few XSLT files need to be overridden as required. 1740 XSLT files are looked for in order: collection, site, interface, default interface. This also applies to included XSLT files. (This doesn't work for collections/sites on remote computers.)
the xsl:include directives are preprocessed by the java code and full paths added based on availability of teh files, so that the correct one is used. 1741 you cannot include a template with teh same name as teh includer. 1633 1742 \subsection{Adding a new language} 1634 1743 … … 1638 1747 \section{Greenstone Development} 1639 1748 1749 this is the customization that requires recompilation. 1640 1750 Here are some random notes for developers who want to modify the source code. 1641 1751 \subsection{Greenstone utility classes} … … 1671 1781 \subsection{Creating new services} 1672 1782 1673 *inherit from service rack 1783 *inherit from ServiceRack - abstract base class. this handles the main process method, determines hte service name and request type. if request type is describe, and to is empty, it returns a list of services (short\_service\_info) which is initialised in the configure method. a describe request to a particular service results in getServiceDescription being called, which must be supplied by the subclass. 1784 other request types (process) get sent to processXXX methods, where XXX is the service name. 1674 1785 1675 1786 * what methods are expected … … 1820 1931 A new instance is to be created with the args constructor arguments (if any). All constructor methods 1821 1932 are qualified for method selection. 1822 Example: <xsl:variable name="myType"1823 select="my-package:extclass.new()"> 1933 Example: \gst{<xsl:variable name="myType" 1934 select="my-package:extclass.new()">} 1824 1935 1825 1936 To invoke an instance method on a specified instance: … … 1830 1941 are qualified methods. If a matching method is found, object will be used to identify the object instance 1831 1942 and args will be passed to the invoked method. 
1832 Example: <xsl:variable name="new-pop"1833 select="my-package:valueOf(\$myType, string(@population))"> 1943 Example: \gst{<xsl:variable name="new-pop" 1944 select="my-package:valueOf(\$myType, string(@population))">} 1834 1945 1835 1946 To invoke a static method: … … 1841 1952 the name methodName are qualified methods. If a matching method is found, args will be passed to the 1842 1953 invoked static method. 1843 Example: <xsl:variable name="new-pop"1844 select="my-package:extclass.printit(string(@population))"> 1954 Example: \gst{<xsl:variable name="new-pop" 1955 select="my-package:extclass.printit(string(@population))">} 1845 1956 1846 1957 … … 1859 1970 class name of the class whose constructor is to be called. A new instance is to be created with the 1860 1971 args constructor arguments (if any). All constructor methods are qualified for method selection. 1861 Example: <xsl:variable name="myHash"1862 select="java:java.util.Hashtable.new()"> 1972 Example: \gst{<xsl:variable name="myHash" 1973 select="java:java.util.Hashtable.new()">} 1863 1974 1864 1975 To invoke an instance method on a specified instance: … … 1869 1980 are qualified methods. If a matching method is found, object will be used to identify the object instance 1870 1981 and args will be passed to the invoked method. 1871 Example: <xsl:variable name="new-pop"1872 select="java:put(\$myHash, string(@region), \$newpop)"> 1982 Example: \gst{<xsl:variable name="new-pop" 1983 select="java:put(\$myHash, string(@region), \$newpop)">} 1873 1984 1874 1985 To invoke a static method: … … 1879 1990 args arguments. Only static methods with the name methodName are qualified methods. If a matching 1880 1991 method is found, args will be passed to the invoked static method. 
1881 Example: <xsl:variable name="new-pop"1882 select="java:java.lang.Integer.valueOf(string(@population))"> 1992 Example: \gst{<xsl:variable name="new-pop" 1993 select="java:java.lang.Integer.valueOf(string(@population))">} 1883 1994 1884 1995