********************************************************************************************************************
CONTENTS
********************************************************************************************************************
A. To install your GS3 Fedora extension from fedora3-gs3ext.tar.gz
B. How to manually build fedora collections
C. The customisations that have been made for the extension
D. How to manually set up Fedora 3.6.1 with Fedora GSearch 2.5 from scratch
    Installing Fedora
    Installing Fedora GSearch
    Debugging

Version:
    Greenstone 3.05
    Fedora 3.6.1
    Fedora GSearch 2.5

Web services pages once installed:
    http://localhost:8383/fedora/services
    http://localhost:8383/fedoragsearch/services

-- ak19, 2012

********************************************************************************************************************
A. To install your Fedora extension for Greenstone 3 from the fedora3-gs3ext.tar.gz distribution file
********************************************************************************************************************
(Tested to work on Linux machines.)

0. You will need Greenstone 3.05 or a fresh checkout from SVN, already installed.

1. Download the Fedora Extension for Greenstone 3:
    http://trac.greenstone.org/browser/gs3-extensions/fedora/fedora3-gs3ext.tar.gz

2. Save into your Greenstone3/ext folder

3. Untar and decompress it:
    $ tar -xvzf fedora3-gs3ext.tar.gz

4. This will create a fedora3 folder in your Greenstone3/ext folder.

5. 
Open Greenstone3/build.properties and find the section:

    ##These need to be uncommented if using Fedora and Fedora GSearch with Greenstone's tomcat
    #fedora.home=${basedir}/ext/fedora3
    #fedora.maxpermsize=-XX:MaxPermSize=128m
    #fedora.password=pounamu

Uncomment all these properties and change the value of the fedora password if necessary:

    ##These need to be uncommented if using Fedora and Fedora GSearch with Greenstone's tomcat
    fedora.home=${basedir}/ext/fedora3
    fedora.maxpermsize=-XX:MaxPermSize=128m
    fedora.password=pounamu

6. Use a terminal to go into this folder:
    $ cd Greenstone3/ext/fedora3

7. If you type `ant` in the terminal, it will print the Usage message, which lists the available commands and the order in which to execute them for proper installation. It's a 5 step process (explained in detail further below):
    $ ant -Dgsdl3.home= start-fedora-install
    $ ant start
    $ ant -Dgsdl3src.home= continue-fedora-install
    $ ant stop
    $ ant -Dgsdl3src.home= finish-fedora-install

That's it. Now you can start tomcat again and visit http://localhost:8383/fedora and http://localhost:8383/fedoragsearch/rest to check that the pages display.

8. You can start creating Greenstone Fedora collections with FLI. Run FLI from the GS3 toplevel folder by typing:
    $ ./gli/fli.sh

DETAILS ON STEP 7:

a. From your Greenstone3/ext/fedora3 folder, run:
    $ ant -Dgsdl3.home= start-fedora-install
This step prepares the fedora and gsearch war files for deployment (by customising template files) and moves them into Greenstone 3's tomcat.

b. Start tomcat by going into your GS3 toplevel folder and running:
    $ ant start
This step gets Greenstone's tomcat to deploy the fedora and fedoragsearch war files. You may want to check that your Greenstone library is working by visiting http://localhost:8383/greenstone3 in your browser.

c. Go back into Greenstone3/ext/fedora3 and run:
    $ ant -Dgsdl3src.home= continue-fedora-install
This target configures fedora-gsearch now that it has been deployed.
It runs a fedora-gsearch ant build file to customise properties files that are internal to fedora-gsearch.

d. Stop tomcat by going back into your GS3 toplevel folder and running:
    $ ant stop
You may want to check that the java instances running tomcat have indeed stopped:
    $ ps aux | grep tomcat
You will need to kill the process if any tomcat is still running at this stage (this can happen if you accidentally start tomcat several times in succession without stopping it).
    $ kill -9

e. Now that tomcat has stopped, go back into Greenstone3/ext/fedora3 and run:
    $ ant -Dgsdl3src.home= finish-fedora-install
This will clean up the extension's war files in packages/tomcat/webapps now that they have been deployed. (Do not perform this final step while tomcat is still running, else your fedora extension webapps will become undeployed.)

When installation is completed, there will be a fedora3 folder in Greenstone3/ext, which you should not delete, since the Fedora digital objects will be stored here and the FedoraGSearch index will be created here.

Further, the Fedora extension installation process will have made tomcat deploy several webapps in packages/tomcat/webapps. The resulting folders are:
    - fedora
    - fedora-demo
    - fedoragsearch
    - fop
    - imagemanip
    - saxon
(Note that the installation process will have removed the Fedora extension war files it put into packages/tomcat/webapps during installation.)

********************************************************************************************************************
B. Manual building of fedora collections
********************************************************************************************************************
You can use FLI to create, build and preview Fedora collections using Greenstone. The same can be done manually by calling the g2f perl scripts from the command line.

1. Fedora needs to be running, so ensure the Greenstone tomcat is running:
    ant start

2. Set up the GS3 environment:
    source gs3-setup.bash

3. 
Create a new collection with mkcol.pl. Call it "fedora1", for example:
    mkcol.pl -collectdir //web/sites/localsite/collect fedora1

4. Run import and build to ingest the new collection
    - First put the documents you want into the import directory of your new collection.
    - g2f-import.pl -hostname localhost -port 8383 -password pounamu -removeold -collectdir //web/sites/localsite/collect fedora1
      (the password may be optional at this stage, but include it for convenience)
    - g2f-buildcol.pl -hostname localhost -port 8383 -password pounamu -removeold -collectdir //web/sites/localsite/collect fedora1

5. If building the GS3 demo collection as a fedora collection:
    - use FLI to transfer dls to dc metadata upon Gathering the documents.
    - then before building, turn on the description_tags in the HTMLPlugin.

IMPORTANT NOTE: Fedora GSearch for some reason doesn't like the images in the default GS3 demo collection and is unable to index the documents because of them. If using the default GS3 demo collection, all the images -- both png and jpg alike -- need to first be resaved as their respective file types.
Use ImageMagick to resave the many pngs, by going into each document's subfolder of the import folder and running:
    mogrify -format png *.png
Also open each folder's jpg document cover image in GIMP and resave it under the same name (at 100% quality).

6. If you ran FLI, exit it. It should stop the Greenstone server. With fedora installed, it's always good to check that the java process that launched tomcat has indeed stopped:
    ps aux | grep "tomcat"
(Since things will fail if multiple instances of this same tomcat are running, kill any java processes that are referring to tomcat.)

7. Create an index folder in the new fedora collection folder. Create a buildConfig.xml file in it containing the following:

8. Restart the GS3 server.
    ant start

9. Visit the collection from the Greenstone collections page

10. 
Deleting a Fedora GS3 collection requires the collection's documents and the collection file to be purged from the Fedora repository and removed from the Fedora GSearch index. (In the case of a normal GS3 collection, just the GS3 collection's directory will be deleted.)
There's now a script to delete a Greenstone Fedora collection, which will take care of these additional steps if you are manually managing your collections. (FLI calls this script when a collection is deleted from FLI.)

Deleting a Fedora GS3 collection is accomplished with the following 2 steps:
    - Run the g2f-deletecol.pl script over the collection to be deleted. Assuming the collection is called fedora1, you'd run:
        g2f-deletecol.pl -hostname localhost -port 8383 -password pounamu -collectdir //web/sites/localsite/collect fedora1
    - manually delete the Greenstone collection directory from the filesystem

To carry out by hand what the g2f-deletecol.pl script does,
    - first remove the pids from the GSearch index:
        //packages/tomcat/webapps/fedoragsearch/client>//packages/tomcat/webapps/fedoragsearch/client/runRESTClient.sh localhost:8383 updateIndex deletePID greenstone:fedora1-HASH010313b14474bc72b296b15f
      It will ask for the fedoragsearch username and password, which by default are fedoraAdmin and pounamu, respectively.
    - then purge the necessary documents (pids) from the fedora repository:
        //ext/fedora3/client/bin>./fedora-purge.sh localhost:8383 fedoraAdmin pounamu greenstone:fedora1-HASHe14e36cba08bd41c663237 http "purging"
    - You can check it's all been deleted by visiting http://localhost:8383/fedora/search and searching for: greenstone:* or greenstone:*
      Visit http://localhost:8383/fedoragsearch/rest?operation=browseIndex and browse the PID field
      You can also visit http://localhost:8383/fedoragsearch/rest?operation=gfindObjects
      then you can search for a query term by prefixing the index field to it, e.g ds.fulltext:computers

********************************************************************************************************************
C. The customisations that have been made for the extension
********************************************************************************************************************
This section is of use when Fedora or GSearch is updated or if you want to update the fedora3-gs3ext.tar.gz distribution file with further customisations.

The first version of the Fedora extension for Greenstone uses Fedora 3.6.1 and the GSearch 2.5 (which goes with Fedora 3.6.1).

Some template files were added to Fedora and Fedora Gsearch's distribution files in order to customise these for installing them within a Greenstone 3 installation. These files are committed to SVN (without directory structure) at
    http://trac.greenstone.org/browser/gs3-extensions/fedora/fedora-files
and
    http://trac.greenstone.org/browser/gs3-extensions/fedoragsearch-files
They can be edited there if and when necessary, but they would then need to be included in the extension's distribution file fedora3-gs3ext.tar.gz to update its own existing copies. That's because these template files are included in the Fedora Extension for GS3 (fedora3-gs3ext.tar.gz). The extension's build.xml modifies these template files when installing the Fedora Extension into Greenstone 3.
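As a rough illustration of what modifying a template file means here, the sketch below expands a ".in" template with sed. It is not the extension's build.xml logic: expand_template is a hypothetical helper, the variable names TOMCAT_PORT and FEDORA_PASSWORD are assumptions, and only three of the placeholder strings used by the extension are handled.

```shell
#!/bin/sh
# Hypothetical helper, not part of the extension: writes a copy of a ".in"
# template without the suffix, replacing three of the placeholder strings.
# FEDORA_HOME, TOMCAT_PORT and FEDORA_PASSWORD are assumed to be set by the caller.
# Values containing "|", "&" or backslashes would need escaping for sed.
expand_template() {
    template="$1"
    case "$template" in
        *.in) ;;
        *) echo "not a .in template: $template" >&2; return 1 ;;
    esac
    output="${template%.in}"    # strip the trailing ".in"
    sed -e "s|@FEDORA_HOME@|${FEDORA_HOME}|g" \
        -e "s|@tomcatport@|${TOMCAT_PORT}|g" \
        -e "s|@fedorapassw@|${FEDORA_PASSWORD}|g" \
        "$template" > "$output"
}
```

With the three variables set, calling expand_template on server/config/fedora.fcfg.in would write server/config/fedora.fcfg next to it. The "|" delimiter is used so that paths containing "/" need no escaping.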
The template files contain placeholder strings that get updated when ant targets are run over the build.xml file.

1. The customisation files for Fedora 3.6.1 are:
    - ./install/install.properties.in
    - ./server/config/spring/akubra-llstore.xml.in
    - ./server/config/fedora.fcfg.in
    - ./server/config/fedora-users.xml.in

All these files get converted to filenames without the ".in" suffix upon installation, and placeholder strings in these template files get replaced. The replacements are on the strings
    - @GSDL3SRCHOME@
    - @FEDORA_HOME@ (set to gsdl3srchome/ext/fedora3)
    - @tomcatserver@
    - @tomcatport@
    - @tomcatshutdownport@
    - @fedorapassw@
    - @indexwritelocktimeout@
Except for the first, which needs to be set when running the Fedora extension's ant targets, all the remaining ones can be specified in Greenstone 3's toplevel build.properties.

2. The customisation files for Fedora GSearch 2.5 are located in the "adjust_war_files" subfolder of the fedora3-gs3ext.tar.gz file. This custom folder contains the official unpacked fedoragsearch folder, but with the following customisation files:
    - adjust_war_files/fedoragsearch/WEB-INF/web.xml
    - adjust_war_files/fedoragsearch/FgsConfig/fgsconfig-basic.properties.in
    - adjust_war_files/fedoragsearch/FgsConfig/FgsConfigIndexTemplate/Lucene/foxmlToLucene.xslt
    - adjust_war_files/fedoragsearch/FgsConfig/FgsConfigIndexTemplate/Lucene/foxmlToLuceneGenerated.xslt
    - adjust_war_files/fedoragsearch/FgsConfig/FgsConfigIndexTemplate/Lucene/index.properties
    - adjust_war_files/fedoragsearch/client/runRESTClient.sh
    - adjust_war_files/fedoragsearch/client/runSOAPClient.sh
    - adjust_war_files/fedoragsearch/client/runSOAPClient.bat

All these files will get copied into the same locations within /packages/tomcat/webapps/fedoragsearch.
The file fgsconfig-basic.properties.in will get copied as fgsconfig-basic.properties but with the previously-listed placeholder strings replaced.

3. 
There's also the fedora.xml.in inside the "adjust_war_files" folder of the unpacked extension. This template file will be copied over as /packages/tomcat/conf/Catalina/localhost/fedora.xml during the extension installation process, also with placeholder strings replaced.

DETAILS ON STEP 2 (Fedora GSearch customisation files):

a. The changes to web.xml are that the authorisation filters are commented out:

b. The 3 client scripts runRESTClient.bat, runRESTClient.sh and runSOAPClient.sh have been modified to be runnable from any directory, as they will be called by GS3's g2f perl scripts.
    - The bash files have been made executable on extraction of the GS3 Fedora extension and the following has been added to the top of the files:
        # Need to run this script from its own directory instead of whichever directory it may be called from
        thisdir="`dirname \"$0\"`"
        thisdir="`cd \"$thisdir\" && pwd`"
        cd "$thisdir"
    - Additions to the bat scripts are at the top and bottom:
        @echo off
        ::pushd "%CD%"
        set startdir=%CD%
        CD /D "%~dp0"
        ...
        :: popd
        cd "%startdir%"
        set startdir=

c. The changes to fedoragsearch/FgsConfig/FgsConfigIndexTemplate/Lucene's foxmlToLucene.xslt and foxmlToLuceneGenerated.xslt are identical. The following xslt has been added:
    - Add the following namespaces to the namespace declarations at the top:
        xmlns:ex="http://www.greenstone.org/namespace/fake/ex"
        xmlns:dls="http://www.greenstone.org/namespace/fake/dls"
    - Add custom indexing for EX and DLS datastreams below the comment on datastreams:
    - Near the end of the XSLT files, just after the index for the "foxml.all.text" field, allow just the full text of the documents to be indexed by adding an index for the ds.fulltext field:

Further datastreams and metadata sets can be indexed by adding similar namespace declarations and xml elements to these 2 XSLT files.
If the changes are not meant to apply to all Greenstone users, individual Greenstone users can add fields for indexing by making such changes to just the file fedoragsearch/FgsConfig/FgsConfigIndexTemplate/Lucene/foxmlToLuceneGenerated.xslt

d. By default, Lucene's lock obtain timeout is set to 0 in its properties file, which is workable only for the smallest documents. The lock timeout can be customised in build.properties and by default is set to 10000 (or 1000), which can be adjusted before starting the Fedora extension installation process. The installation process will write out this property into GSearch's FgsConfig/FgsConfigIndexTemplate/Lucene/index.properties:
    fgsindex.defaultWriteLockTimeout = 1000
When fedoragsearch then gets deployed, this value will be propagated into:
    /packages/tomcat/webapps/fedoragsearch/WEB-INF/classes/fgsconfigFinal/index/FgsIndex/index.properties

********************************************************************************************************************
D. To manually set up Fedora with Fedora GSearch 2.5 from scratch
********************************************************************************************************************
When installing Fedora, you can choose to install it outside Greenstone, and also instruct it to use its own tomcat. However, in the following instructions, Fedora's installation location (FEDORA_HOME) is still taken to be /GS3/ext/fedora3, since it's assumed Fedora is to be installed to use Greenstone's tomcat.

The Fedora GSearch war file will need to be unpacked in the tomcat/webapps folder of whichever tomcat Fedora is using. In the following, this is still Greenstone's tomcat.

In whichever location Fedora is installed, that's where Fedora will store its digital objects and their datastreams, and that's where Fedora GSearch will create its GSearch index.

********************************
INSTALLING FEDORA
********************************
1. 
Set FEDORA_HOME=//ext/fedora3 in .profile (emacs ~/.profile)
    source ~/.profile

2. Run fedora installer:
    java -jar fcrepo-installer-3.6.1.jar
Install by specifying to use the existing tomcat (existingTomcat), and provide the Greenstone3 tomcat listen and shutdown ports (usually 8383 and 8305). Then set the existingTomcat to the Greenstone3 one: //packages/tomcat
Turn on messaging, as this is necessary for Fedora GSearch.

    - Installation options:
        custom
        FEDORA_HOME
        pwd: pounamu
        host: def
        server context: fedora
        default false for user authentication for API-A
        false for SSL availability
        existingTomcat
        path to tomcat: /packages/tomcat
        tomcat listen port: 8383
        tomcat shutdown port: 8305
        derby: included
        upstream HTTP authentication: (default) false
        FeSL AuthZ: (default) false
        XACML policy enforcement enabled: false
        Low Level Storage: (default) akubra-fs
        Resource Index: true
        Enable Messaging: true
        Messaging Provider URI: [default is vm:(broker:(tcp://localhost:61616))]
        Deploy local services and demos: (default) true

    - The Fedora installation options used are to be found (after installation) in fedora3/install/install.properties:
        #Install Options
        #Fri Sep 21 15:29:29 NZST 2012
        ri.enabled=true
        messaging.enabled=false
        apia.auth.required=false
        database.jdbcDriverClass=org.apache.derby.jdbc.EmbeddedDriver
        upstream.auth.enabled=false
        ssl.available=false
        database.jdbcURL=jdbc\:derby\://ext/fedora3/derby/fedora3;create\=true
        database.password=fedoraAdmin
        database.username=fedoraAdmin
        fesl.authz.enabled=false
        tomcat.shutdown.port=8305
        deploy.local.services=true
        xacml.enabled=false
        tomcat.http.port=8383
        fedora.serverHost=
        database=included
        database.driver=included
        fedora.serverContext=fedora
        llstore.type=akubra-fs
        tomcat.home=//packages/tomcat
        fedora.home=//ext/fedora3
        install.type=custom
        servlet.engine=existingTomcat
        fedora.admin.pass=pounamu

    - If trying to turn a custom installation into one for distribution, you would need to modify the following 4 files by inserting placeholder strings
      where applicable for the tomcatserver name, tomcat listen port and shutdown port, fedorapassword, GSDL3SRCHOME and FEDORA_HOME:
        fedora3/install/install.properties
        fedora3/server/config/spring/akubra-llstore.xml
        fedora3/server/config/fedora.fcfg
        fedora3/server/config/fedora-users.xml
      Then you would rename these files with the suffix .in

3. Just to confirm there are no differences between the server.xml that the Fedora installer has generated and the server.xml of GS3's tomcat, run a diff:
    diff -w //packages/tomcat/conf/server.xml fedora3/install/server.xml

4. Copy the fedora war files from fedora3/install into packages/tomcat/webapps:
    fedora.war, fedora-demo.war, fop.war, imagemanip.war, saxon.war

5. Copy the jar files xalan.jar, serializer.jar (and xsltc.jar) from Greenstone3's web/WEB-INF/lib into Greenstone3's packages/tomcat/lib so that fedora has access to the xalan version of the TransformerFactoryImpl class.
There's also a xalan.jar in //packages/tomcat/webapps/fop/WEB-INF/lib
So this can be copied into //packages/tomcat/lib/. instead of the Greenstone version.

6. Create the file //packages/tomcat/conf/Catalina/localhost/fedora.xml containing:

7. Before running the Greenstone server, make a copy of the fedora3 folder, because once you start up tomcat and visit the fedora home page, it will create a lot of files customised to the location of the current installation.

8. Visit http://localhost:8383/greenstone3 and http://localhost:8383/fedora to confirm both work.

********************************
INSTALLING FEDORA GSEARCH
********************************
1. Download Fedora GSearch 2.5

2. Stop tomcat. Copy the fedoragsearch.war file into //packages/tomcat/webapps
If Fedora was not installed with messaging turned on, then turn it on now in //ext/fedora3/server/config/fedora.fcfg:
    Fedora's Java Messaging Service (JMS) Module
    ...

3. 
Update the following properties in //packages/tomcat/webapps/fedoragsearch/FgsConfig/fgsconfig-basic.properties
    gsearchBase=http://:8383
    gsearchUser=fedoraAdmin
    gsearchPass=
    local.FEDORA_HOME=//ext/fedora3
    finalConfigPath=//packages/tomcat/webapps/fedoragsearch/WEB-INF/classes
    fedoraBase=http://:8383
    fedoraPass=

4. Add the edited foxmlToLucene.xslt & foxmlToLuceneGenerated.xslt files to //packages/tomcat/webapps/fedoragsearch/FgsConfig/FgsConfigIndexTemplate/Lucene

5. Add the edited runRESTClient.sh (and runSOAPClient.sh, runRESTClient.bat) to //packages/tomcat/webapps/fedoragsearch/client
And give them execute permissions, unless they already have them.

6. In a text editor, open up //packages/tomcat/webapps/fedoragsearch/WEB-INF/web.xml and comment out the authentication filters (one or two of these is probably all that is necessary):
It may be that only the following needs to be commented out:
    EnforceAuthnFilter
    org.fcrepo.server.security.servletfilters.FilterEnforceAuthn

7. Start tomcat to have it deploy fedoragsearch. This will unpack fedoragsearch.

8. With tomcat running, configure fedoragsearch, which will adjust various config files. You will need to go into the deployed fedoragsearch webapps folder's FgsConfig folder first to be able to run the configuration target:
    /packages/tomcat/webapps/fedoragsearch/FgsConfig>ant -f fgsconfig-basic.xml

9. With tomcat still running, run updateIndex once to create the empty index for the first time (by running the runRESTClient.sh script with the command: "host:port updateIndex createEmpty [indexName]")
    .//packages/tomcat/webapps/fedoragsearch/client/runRESTClient.sh localhost:8383 updateIndex createEmpty FgsIndex

10. Check fedoragsearch has been properly installed by visiting:
    http://localhost:8383/fedoragsearch/rest

11. If you wish to remove the various war files from GS3's tomcat webapps folder, you will need to stop tomcat before deleting the war files, because deleting them while tomcat is running will undeploy those webapps.
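The warning about deleting war files while tomcat is running can be guarded against with a small check built on the same "ps aux | grep tomcat" test used earlier in this document. This is only a sketch: tomcat_is_running and remove_wars_if_stopped are hypothetical helper names, not scripts shipped with the extension, and the check simply looks for the word tomcat in the process listing.

```shell
#!/bin/sh
# Reads a process listing on stdin and succeeds if any line mentions tomcat,
# ignoring lines that belong to a grep process itself.
tomcat_is_running() {
    grep -v 'grep' | grep -q 'tomcat'
}

# Hypothetical wrapper: deletes the given war files only when no tomcat
# process shows up in the process listing.
remove_wars_if_stopped() {
    if ps aux | tomcat_is_running; then
        echo "tomcat still appears to be running; stop it before deleting war files" >&2
        return 1
    fi
    rm -f -- "$@"
}
```

From the webapps folder of GS3's tomcat, it could be called as: remove_wars_if_stopped fedora.war fedoragsearch.war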
********************************
DEBUGGING
********************************
Logs to consult when debugging:

1. Check //packages/tomcat/logs/catalina.out

2. Check fedora3/server/logs/fedora.log for Fedora error logging.

3. FedoraGSearch's logging of debug statements (these log files are huge and can reach 500 MB in a day if rebuilding the demo collection, so you may want to switch this off when not debugging by setting the log level to INFO):
    - //packages/tomcat/webapps/fedoragsearch/WEB-INF/classes/log4j.xml is already set to output DEBUG statements and higher logging levels
    - So open fedora3/server/logs/fedoragsearch.daily.log to look at debug messages in case fedoragsearch doesn't work as expected.

********************************************************************************************************************
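A quick first pass over the logs listed in the DEBUGGING section can be scripted. The sketch below is a hypothetical helper, not part of the extension; it assumes problems are logged with the level names ERROR (log4j) or SEVERE (tomcat's default logging), which may not cover every message of interest.

```shell
#!/bin/sh
# Hypothetical helper: for each log file that exists, prints its path and the
# number of lines containing ERROR or SEVERE. Missing files are skipped.
count_log_errors() {
    for log in "$@"; do
        [ -f "$log" ] || continue
        printf '%s: %s\n' "$log" "$(grep -c -E 'ERROR|SEVERE' "$log")"
    done
}
```

For example, from the Greenstone3/ext folder: count_log_errors fedora3/server/logs/fedora.log fedora3/server/logs/fedoragsearch.daily.log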