source: gs3-extensions/fedora/README@ 26380

Last change on this file since 26380 was 26380, checked in by ak19, 11 years ago

Adding README document for installing and setting up the Fedora extension (with Fedora GSearch) for Greenstone 3.

********************************************************************************************************************
 CONTENTS
********************************************************************************************************************

A. To install your GS3 Fedora extension from fedora3-gs3ext.tar.gz

B. How to manually build fedora collections

C. The customisations that have been made for the extension

D. How to manually set up Fedora 3.6.1 with Fedora GSearch 2.5 from scratch
    Installing Fedora
    Installing Fedora GSearch
    Debugging


Version:
Greenstone 3.05
Fedora 3.6.1
Fedora GSearch 2.5

Web services pages once installed:
http://localhost:8383/fedora/services
http://localhost:8383/fedoragsearch/services

-- ak19, 2012



********************************************************************************************************************
 A. To install your Fedora extension for Greenstone 3 from the fedora3-gs3ext.tar.gz distribution file
********************************************************************************************************************

(Tested to work on Linux machines.)

0. You will need Greenstone 3.05 or a fresh checkout from SVN, already installed.

1. Download the Fedora Extension for Greenstone 3: http://trac.greenstone.org/browser/gs3-extensions/fedora/fedora3-gs3ext.tar.gz

2. Save into your Greenstone3/ext folder

3. Untar and decompress it:
    $ tar -xvzf fedora3-gs3ext.tar.gz

4. This will create a fedora3 folder in your Greenstone3/ext folder.

5. Open Greenstone3/build.properties and find the section:

    ##These need to be uncommented if using Fedora and Fedora GSearch with Greenstone's tomcat
    #fedora.home=${basedir}/ext/fedora3
    #fedora.maxpermsize=-XX:MaxPermSize=128m
    #fedora.password=pounamu

Uncomment all these properties and change the value for the fedora password if necessary:

    ##These need to be uncommented if using Fedora and Fedora GSearch with Greenstone's tomcat
    fedora.home=${basedir}/ext/fedora3
    fedora.maxpermsize=-XX:MaxPermSize=128m
    fedora.password=pounamu


6. Use a terminal to go into this folder:
    $ cd Greenstone3/ext/fedora3

7. Typing `ant` in the terminal prints the Usage message, which lists the available commands and the order in which to execute them for a proper installation. It's a 5-step process (explained in detail further below):

    $ ant -Dgsdl3.home=<type-full-path-to-your-gs3> start-fedora-install

    $ ant start

    $ ant -Dgsdl3src.home=<type-full-path-to-your-gs3> continue-fedora-install

    $ ant stop

    $ ant -Dgsdl3src.home=<type-full-path-to-your-gs3> finish-fedora-install


That's it.
Now you can start tomcat again and visit http://localhost:8383/fedora and http://localhost:8383/fedoragsearch/rest to check that the pages display.

8. You can start creating Greenstone Fedora collections with FLI. Run FLI from the GS3 toplevel folder by typing:
    $ ./gli/fli.sh
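
Step 5 above can also be scripted. The following is a minimal sketch, not something shipped with the extension: the function name uncomment_fedora_props is hypothetical, and it assumes the three fedora.* properties sit on their own lines in build.properties, commented out with a '#'. Back up build.properties before trying it.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Fedora extension): uncomment the three
# fedora.* properties in a Greenstone 3 build.properties file.
# Assumption: each property is on its own line, optionally indented, and is
# disabled by a '#' in front of it. A '#' placed directly after
# "fedora.home=" is removed as well.
uncomment_fedora_props() {
    props_file="$1"
    tmp_file="${props_file}.tmp.$$"
    sed -E \
        -e 's/^([[:space:]]*)#(fedora\.(home|maxpermsize|password)=)/\1\2/' \
        -e 's/^([[:space:]]*fedora\.home=)#/\1/' \
        "$props_file" > "$tmp_file" && mv "$tmp_file" "$props_file"
}
```

Usage (the path is a placeholder): uncomment_fedora_props /full/path/to/Greenstone3/build.properties. The fedora password still has to be changed by hand if the default is not wanted.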



DETAILS ON STEP 7:
a. From your Greenstone3/ext/fedora3 folder, run:
    $ ant -Dgsdl3.home=<type-full-path-to-your-gs3> start-fedora-install

This step prepares the fedora and gsearch war files for deployment (by customising template files) and moves them into Greenstone 3's tomcat.

b. Start tomcat by going into your GS3 toplevel folder and running:
    $ ant start

This step gets Greenstone's tomcat to deploy the fedora and fedoragsearch war files.
You may want to check your Greenstone library is working by visiting http://localhost:8383/greenstone3 in your browser

c. Go back into Greenstone3/ext/fedora3 and run:
    $ ant -Dgsdl3src.home=<type-full-path-to-your-gs3> continue-fedora-install

This target configures fedora-gsearch now that it has been deployed. It runs a fedora-gsearch ant build file to customise properties files that are internal to fedora-gsearch.

d. Stop tomcat by going back into your GS3 toplevel folder and running:
    $ ant stop

You may want to check java instances running tomcat have indeed stopped:
    $ ps aux | grep tomcat

You will need to kill the process if any tomcat is still running at this stage (this can happen if you accidentally start tomcat several times in succession without stopping it).
    $ kill -9 <process-id>

e. Now that tomcat has stopped, go back into Greenstone3/ext/fedora3 and run:
    $ ant -Dgsdl3src.home=<type-full-path-to-your-gs3> finish-fedora-install

This will clean up the extension's war files in packages/tomcat/webapps now that they have already been deployed. (Do not perform this final step while tomcat is still running, or your fedora extension webapps will become undeployed.)
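
Because the final step must not run while tomcat is up, the manual `ps aux | grep tomcat` check from step d can be wrapped in a small filter. This is only a sketch: the function name tomcat_pids is hypothetical, and it assumes `ps aux` style output where the process ID is the second column.

```shell
#!/bin/sh
# Hypothetical helper: read `ps aux`-style lines on stdin and print the PID
# (second column) of every line that mentions tomcat, ignoring the line for
# the grep command itself.
tomcat_pids() {
    grep -i 'tomcat' | grep -v 'grep' | awk '{ print $2 }'
}
```

Usage: run `ps aux | tomcat_pids` and continue with finish-fedora-install only if nothing is printed.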



When installation is completed, there will be a fedora3 folder in Greenstone3/ext, which you should not delete, since the Fedora digital objects will be stored here, and the FedoraGSearch index will be created here.

Further, the Fedora extension installation process will have made tomcat deploy several webapps in packages/tomcat/webapps. The resulting folders are:
- fedora
- fedora-demo
- fedoragsearch
- fop
- imagemanip
- saxon

(Note that the installation process will have removed the Fedora extension war files it put into packages/tomcat/webapps during installation.)
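
A quick way to confirm that all six webapp folders listed above were deployed is sketched below. The function name missing_fedora_webapps is hypothetical; the folder names are the ones listed above, and the webapps path is passed in as an argument.

```shell
#!/bin/sh
# Hypothetical helper: print the name of each expected webapp folder that is
# missing from the given tomcat webapps directory. Prints nothing if all six
# folders exist.
missing_fedora_webapps() {
    webapps_dir="$1"
    for app in fedora fedora-demo fedoragsearch fop imagemanip saxon; do
        [ -d "$webapps_dir/$app" ] || echo "$app"
    done
}
```

Usage (the path is a placeholder): missing_fedora_webapps /full/path/to/Greenstone3/packages/tomcat/webapps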



********************************************************************************************************************
 B. Manual building of fedora collections
********************************************************************************************************************

You can use FLI to create, build and preview Fedora collections using Greenstone.

The same can be done manually by calling the g2f perl scripts from the command line.

1. Fedora needs to be running, so ensure the Greenstone tomcat is running:
    ant start

2. Set up the GS3 environment:
    source gs3-setup.bash

3. Create a new collection with mkcol.pl. Call it "fedora1" for example
    mkcol.pl -collectdir /<GS3>/web/sites/localsite/collect fedora1

4. Run import and build to ingest the new collection

- First put the documents you want into the import directory of your new collection.

- g2f-import.pl -hostname localhost -port 8383 -password pounamu -removeold -collectdir /<GS3>/web/sites/localsite/collect fedora1
(password may be optional at this stage, but include it for convenience)

- g2f-buildcol.pl -hostname localhost -port 8383 -password pounamu -removeold -collectdir /<GS3>/web/sites/localsite/collect fedora1


5. If building the GS3 demo collection as a fedora collection:
- use FLI to transfer dls to dc metadata upon Gathering the documents.
- then before building, turn on the description_tags in the HTMLPlugin.

6. If you ran FLI, exit it. It should stop the Greenstone server.

With fedora installed, it's always good to check that the java process that launched tomcat has indeed stopped:
    ps aux | grep "tomcat"

(Since things will fail if multiple instances of this same tomcat are running, kill any java processes that are referring to tomcat.)

7. Create an index folder in the new fedora collection folder. Create a buildConfig.xml file in it containing the following:

<buildConfig>
  <metadataList/>
  <serviceRackList>
    <serviceRack name="FedoraServiceProxy" />
  </serviceRackList>
</buildConfig>


8. Restart the GS3 server.
    ant start

9. Visit the collection from the Greenstone collections page
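
Step 7 above (creating the index folder and its buildConfig.xml) can be scripted as follows. This is a sketch under the assumption that the collection folder path is passed as the only argument; the function name write_fedora_build_config is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: create <collection>/index/buildConfig.xml containing
# the FedoraServiceProxy service rack, as described in step 7.
write_fedora_build_config() {
    collection_dir="$1"
    mkdir -p "$collection_dir/index" || return 1
    cat > "$collection_dir/index/buildConfig.xml" <<'EOF'
<buildConfig>
  <metadataList/>
  <serviceRackList>
    <serviceRack name="FedoraServiceProxy" />
  </serviceRackList>
</buildConfig>
EOF
}
```

Usage for the example collection from step 3: write_fedora_build_config /<GS3>/web/sites/localsite/collect/fedora1 (with /<GS3> replaced by the real path). Note that it overwrites any existing buildConfig.xml in that index folder.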



********************************************************************************************************************
 C. The customisations that have been made for the extension
********************************************************************************************************************

This section is of use when Fedora or GSearch is updated or if you want to update the fedora3-gs3ext.tar.gz distribution file with further customisations.

The first version of the Fedora extension for Greenstone uses Fedora 3.6.1 and the GSearch 2.5 (which goes with Fedora 3.6.1).

Some template files were added to Fedora and Fedora Gsearch's distribution files in order to customise these for installing them within a Greenstone 3 installation. These files are committed to SVN (without directory structure) at http://trac.greenstone.org/browser/gs3-extensions/fedora/fedora-files and http://trac.greenstone.org/browser/gs3-extensions/fedoragsearch-files

They can be edited there if and when necessary, but they would then need to be included in the extension's distribution file fedora3-gs3ext.tar.gz to update its own existing copies.

That's because these template files are included in the Fedora Extension for GS3 (fedora3-gs3ext.tar.gz). The extension's build.xml modifies these template files when installing the Fedora Extension into Greenstone 3. The template files contain placeholder strings that get updated when ant targets are run over the build.xml file.


1. The customisation files for Fedora 3.6.1 are:

- ./install/install.properties.in
- ./server/config/spring/akubra-llstore.xml.in
- ./server/config/fedora.fcfg.in
- ./server/config/fedora-users.xml.in

All these files get converted to filenames without the ".in" suffix upon installation, and placeholder strings in these template files get replaced. The replacements are on the strings
- @GSDL3SRCHOME@,
- @FEDORA_HOME@ (set to gsdl3srchome/ext/fedora3),
- @tomcatserver@,
- @tomcatport@,
- @tomcatshutdownport@,
- @fedorapassw@.
Except for the first, which needs to be set when running the Fedora extension's ant targets, all the remaining ones can be specified


2. The customisation files for Fedora GSearch 2.5 are located in the "adjust_war_files" subfolder of the fedora3-gs3ext.tar.gz file. This custom folder contains the official unpacked fedoragsearch folder, but with the following customisation files:

- adjust_war_files/fedoragsearch/WEB-INF/web.xml
- adjust_war_files/fedoragsearch/FgsConfig/fgsconfig-basic.properties.in
- adjust_war_files/fedoragsearch/FgsConfig/FgsConfigIndexTemplate/Lucene/foxmlToLucene.xslt
- adjust_war_files/fedoragsearch/FgsConfig/FgsConfigIndexTemplate/Lucene/foxmlToLuceneGenerated.xslt
- adjust_war_files/fedoragsearch/client/runRESTClient.sh
- adjust_war_files/fedoragsearch/client/runSOAPClient.sh
- adjust_war_files/fedoragsearch/client/runSOAPClient.bat

All these files will get copied into the same locations within <GS3>/packages/tomcat/webapps/fedoragsearch. The file fgsconfig-basic.properties.in will get copied as fgsconfig-basic.properties but with the previously-listed placeholder strings replaced.


3. There's also the fedora.xml.in inside the "adjust_war_files" folder of the unpacked extension. This template file will be copied over as <GS3>/packages/tomcat/conf/Catalina/localhost/fedora.xml during the extension installation process, also with placeholder strings replaced.
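
To picture what happens to a ".in" template file, here is a minimal sed sketch of the placeholder replacement. It is an illustration only: the real work is done by ant targets in the extension's build.xml, the function name fill_template is hypothetical, and only three of the six placeholders are handled.

```shell
#!/bin/sh
# Hypothetical illustration of the template handling: write <name> from
# <name>.in with some placeholder strings replaced. The extension itself does
# this through ant, not through this function.
# Assumption: the values contain no '|', '&' or backslash characters, since
# those are special in a sed replacement using '|' as the delimiter.
fill_template() {
    template="$1"          # path ending in .in
    gs3_home="$2"          # value for @GSDL3SRCHOME@
    fedora_password="$3"   # value for @fedorapassw@
    target="${template%.in}"
    sed -e "s|@GSDL3SRCHOME@|$gs3_home|g" \
        -e "s|@FEDORA_HOME@|$gs3_home/ext/fedora3|g" \
        -e "s|@fedorapassw@|$fedora_password|g" \
        "$template" > "$target"
}
```

Usage (placeholder values): fill_template ./install/install.properties.in /full/path/to/Greenstone3 pounamu. The template is left untouched, and placeholders the sketch does not know about (such as @tomcatport@) remain in the output.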


DETAILS ON STEP 2 (Fedora GSearch customisation files):

a. The changes to web.xml are that the authorisation filters are commented out:

<!-- <filter-mapping>
        <filter-name>EnforceAuthnFilter</filter-name>
        <servlet-name>AxisServlet</servlet-name>
    </filter-mapping>
    <filter-mapping>
        <filter-name>EnforceAuthnFilter</filter-name>
        <servlet-name>GenericSearchREST</servlet-name>
    </filter-mapping>
    <filter-mapping>
        <filter-name>EnforceAuthnFilter</filter-name>
        <url-pattern>/index.html</url-pattern>
    </filter-mapping>
    <filter-mapping>
        <filter-name>EnforceAuthnFilter</filter-name>
        <url-pattern>/rest</url-pattern>
    </filter-mapping>
-->


b. The 3 client scripts runRESTClient.bat, runRESTClient.sh and runSOAPClient.sh have been modified to be runnable from any directory, as they will be called by GS3's g2f perl scripts.

- The bash files have been made executable on extraction of the GS3 Fedora extension and the following has been added to the top of the files:

    # Need to run this script from its own directory instead of whichever directory it may be called from
    thisdir="`dirname \"$0\"`"
    thisdir="`cd \"$thisdir\" && pwd`"
    cd "$thisdir"


- Additions to the bat scripts are at the top and bottom:

    @echo off

    ::pushd "%CD%"
    set startdir=%CD%
    CD /D "%~dp0"

    ...

    :: popd
    cd "%startdir%"
    set startdir=


c. The changes to fedoragsearch/FgsConfig/FgsConfigIndexTemplate/Lucene's foxmlToLucene.xslt and foxmlToLuceneGenerated.xslt are identical. The following xslt has been added:

- Add the following namespaces to the namespace declarations at the top:
    xmlns:ex="http://www.greenstone.org/namespace/fake/ex"
    xmlns:dls="http://www.greenstone.org/namespace/fake/dls"

- Add custom indexing for EX and DLS datastreams below the comment on datastreams:

    <!-- a datastream is fetched, if its mimetype
         can be handled, the text becomes the value of the field.
         This is the version using PDFBox,
         below is the new version using Apache Tika. -->

    <xsl:for-each select="foxml:datastream[starts-with(@ID,'EX')]/foxml:datastreamVersion[last()]/foxml:xmlContent/ex:ex/ex:metadata">
        <IndexField index="TOKENIZED" store="YES" termVector="YES">
            <xsl:attribute name="IFname">
                <xsl:value-of select="concat('ex.', @name)"/>
            </xsl:attribute>
            <xsl:value-of select="text()"/>
        </IndexField>
    </xsl:for-each>

    <xsl:for-each select="foxml:datastream[starts-with(@ID,'DLS')]/foxml:datastreamVersion[last()]/foxml:xmlContent/dls:dls/dls:metadata">
        <IndexField index="TOKENIZED" store="YES" termVector="YES">
            <xsl:attribute name="IFname">
                <xsl:value-of select="concat('dls.', @name)"/>
            </xsl:attribute>
            <xsl:value-of select="text()"/>
        </IndexField>
    </xsl:for-each>

- Near the end of the XSLT files, just after the index for the "foxml.all.text" field, allow just the full text of the documents to be indexed by adding an index for the ds.fulltext field:

    <IndexField IFname="ds.fulltext" index="TOKENIZED" store="YES" termVector="YES">
        <xsl:for-each select="//foxml:datastream[@CONTROL_GROUP='M' or @CONTROL_GROUP='E' or @CONTROL_GROUP='R']">
            <xsl:value-of select="exts:getDatastreamText($PID, $REPOSITORYNAME, @ID, $FEDORASOAP, $FEDORAUSER, $FEDORAPASS, $TRUSTSTOREPATH, $TRUSTSTOREPASS)"/>
            <xsl:text> </xsl:text>
        </xsl:for-each>
    </IndexField>


Further datastreams and metadata sets can be indexed by adding similar namespace declarations and xml elements to these 2 XSLT files.
If the changes are not meant to apply for all Greenstone users, individual Greenstone users can add fields for indexing by making such changes to just the file fedoragsearch/FgsConfig/FgsConfigIndexTemplate/Lucene/foxmlToLuceneGenerated.xslt



********************************************************************************************************************
 D. To manually set up Fedora with Fedora GSearch 2.5 from scratch
********************************************************************************************************************

When installing Fedora, you can choose to install it outside Greenstone, and also instruct it to use its own tomcat. However, in the following, /GS3/ext/fedora3 is still assumed to be FEDORA_HOME and hence Fedora's installation location and it's assumed to be installed to use Greenstone's tomcat.

The Fedora GSearch war file will need to be unpacked in the tomcat/webapps folder of whichever tomcat Fedora is using. In the following, this is still Greenstone's tomcat.

Wherever Fedora is installed, that's where the digital objects and their datastreams will be stored, and that's where Fedora GSearch will create its GSearch index.


********************************
INSTALLING FEDORA
********************************
1. Set FEDORA_HOME=/<GS3>/ext/fedora3 in .profile (emacs ~/.profile)
source ~/.profile

2. Run fedora installer:
    java -jar fcrepo-installer-3.6.1.jar

Install by specifying to use the existing tomcat (existingTomcat), and provide the Greenstone3 tomcat listen and shutdown ports (usually 8383 and 8305).
Then set the existingTomcat to the Greenstone3 one: /<GS3>/packages/tomcat
Turn on messaging, as this is necessary for Fedora GSearch.

- Installation options:

custom
FEDORA_HOME
pwd: pounamu
host: <full-tomcat.server-host-name>
def server context: fedora
default false for user authentication for API-A
false for SSL availability
existingTomcat
path to tomcat: </full-path-to-GS3>/packages/tomcat
tomcat listen port: 8383
tomcat shutdown port: 8305
derby: included
upstream HTTP authentication: (default) false
FeSL AuthZ: (default) false
XACML policy enforcement enabled: false
Low Level Storage: (default) akubra-fs
Resource Index: true
Enable Messaging: true
Messaging Provider URI: [default is vm:(broker:(tcp://localhost:61616))]
Deploy local services and demos: (default) true


- The Fedora installation options used are to be found (after installation) in fedora3/install/install.properties:

#Install Options
#Fri Sep 21 15:29:29 NZST 2012
ri.enabled=true
messaging.enabled=false
apia.auth.required=false
database.jdbcDriverClass=org.apache.derby.jdbc.EmbeddedDriver
upstream.auth.enabled=false
ssl.available=false
database.jdbcURL=jdbc\:derby\:/<GS3>/ext/fedora3/derby/fedora3;create\=true
database.password=fedoraAdmin
database.username=fedoraAdmin
fesl.authz.enabled=false
tomcat.shutdown.port=8305
deploy.local.services=true
xacml.enabled=false
tomcat.http.port=8383
fedora.serverHost=<full-tomcat.server-host-name>
database=included
database.driver=included
fedora.serverContext=fedora
llstore.type=akubra-fs
tomcat.home=/<GS3>/packages/tomcat
fedora.home=/<GS3>/ext/fedora3
install.type=custom
servlet.engine=existingTomcat
fedora.admin.pass=pounamu


- If trying to turn a custom installation into one for distribution, you would need to modify the following 4 files by inserting placeholder strings where applicable for the tomcatserver name, tomcat listen port and shutdown port, fedorapassword, GSDL3SRCHOME and FEDORA_HOME:

fedora3/install/install.properties
fedora3/server/config/spring/akubra-llstore.xml
fedora3/server/config/fedora.fcfg
fedora3/server/config/fedora-users.xml

Then you would rename these files with the suffix .in


3. Just to confirm there are no differences between the server.xml fedora has generated for fedora, and the server.xml of GS3's tomcat, run a diff:

    diff -w /<GS3>/packages/tomcat/conf/server.xml /<GS3>/ext/fedora3/install/server.xml

4. Copy the fedora war files from fedora3/install into packages/tomcat/webapps
fedora.war, fedora-demo.war, fop.war, imagemanip.war, saxon.war

5. Copy the jar files xalan.jar, serializer.jar (and xsltc.jar) from Greenstone3's web/WEB-INF/lib into Greenstone3's packages/tomcat/lib so that fedora has access to the xalan version of the TransformerFactoryImpl class.

There's also a xalan.jar in /<GS3>/packages/tomcat/webapps/fop/WEB-INF/lib
So this can be copied into /<GS3>/packages/tomcat/lib/. instead of the Greenstone version.

6. Create the file /<GS3>/packages/tomcat/conf/Catalina/localhost/fedora.xml
containing:

<?xml version="1.0" encoding="UTF-8"?>
<Context>
  <Parameter name="fedora.home" value="/<GS3>/ext/fedora3" />
</Context>


7. Before running the Greenstone server, make a copy of the fedora3 folder, because once you start up tomcat and visit the fedora home page, it will create a lot of files customised to the location of the current installation.

8. Visit http://localhost:8383/greenstone3 and
    http://localhost:8383/fedora

to confirm both work.
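
Step 6 above can be scripted so that the real installation path is filled in for /<GS3>. This is a sketch, assuming the tomcat folder and the Fedora home are passed as arguments; the function name write_fedora_context is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write conf/Catalina/localhost/fedora.xml under the
# given tomcat folder, pointing fedora.home at the given Fedora home.
# Assumption: the Fedora home path contains no characters that need escaping
# in an XML attribute value (such as '&', '<' or '"').
write_fedora_context() {
    tomcat_dir="$1"     # e.g. /full/path/to/Greenstone3/packages/tomcat
    fedora_home="$2"    # e.g. /full/path/to/Greenstone3/ext/fedora3
    context_dir="$tomcat_dir/conf/Catalina/localhost"
    mkdir -p "$context_dir" || return 1
    cat > "$context_dir/fedora.xml" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<Context>
  <Parameter name="fedora.home" value="$fedora_home" />
</Context>
EOF
}
```

Usage (placeholder paths): write_fedora_context /full/path/to/Greenstone3/packages/tomcat /full/path/to/Greenstone3/ext/fedora3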


********************************
INSTALLING FEDORA GSEARCH
********************************
1. Download Fedora GSearch 2.5

2. Stop tomcat. Copy the fedoragsearch.war file into /<GS3>/packages/tomcat/webapps

If Fedora was not installed with messaging turned on, then turn it on now in /<GS3>/ext/fedora3/server/config/fedora.fcfg:

    <module role="org.fcrepo.server.messaging.Messaging" class="org.fcrepo.server.messaging.MessagingModule">
        <comment>Fedora's Java Messaging Service (JMS) Module</comment>
        <param name="enabled" value="true"/>
        ...

3. Update the following properties in /<GS3>/packages/tomcat/webapps/fedoragsearch/FgsConfig/fgsconfig-basic.properties

gsearchBase=http://<tomcat.server>:8383
gsearchUser=fedoraAdmin
gsearchPass=<fedora.password>
local.FEDORA_HOME=/<GS3>/ext/fedora3
finalConfigPath=/<GS3>/packages/tomcat/webapps/fedoragsearch/WEB-INF/classes

fedoraBase=http://<tomcat.server>:8383
fedoraPass=<fedora.password>

4. Add the edited foxmlToLucene.xslt & foxmlToLuceneGenerated.xslt files to /<GS3>/packages/tomcat/webapps/fedoragsearch/FgsConfig/FgsConfigIndexTemplate/Lucene

5. Add the edited runRESTClient.sh (and runSOAPClient.sh, runRESTClient.bat) to /<GS3>/packages/tomcat/webapps/fedoragsearch/client
And give them execute permissions, unless they already have it.

6. In a text editor, open up /<GS3>/packages/tomcat/webapps/fedoragsearch/WEB-INF/web.xml
and comment out the authentication filters (one or two of these is probably all that is necessary):

<!--
    <filter-mapping>
        <filter-name>EnforceAuthnFilter</filter-name>
        <servlet-name>AxisServlet</servlet-name>
    </filter-mapping>
    <filter-mapping>
        <filter-name>EnforceAuthnFilter</filter-name>
        <servlet-name>GenericSearchREST</servlet-name>
    </filter-mapping>
    <filter-mapping>
        <filter-name>EnforceAuthnFilter</filter-name>
        <url-pattern>/index.html</url-pattern>
    </filter-mapping>
    <filter-mapping>
        <filter-name>EnforceAuthnFilter</filter-name>
        <url-pattern>/rest</url-pattern>
    </filter-mapping>
-->

It may instead be the filter definition itself that really needs to be commented out:
    <filter>
        <filter-name>EnforceAuthnFilter</filter-name>
        <filter-class>org.fcrepo.server.security.servletfilters.FilterEnforceAuthn</filter-class>
    </filter>

7. Start tomcat to have it deploy fedoragsearch.
This will unpack fedoragsearch.

8. With tomcat running, configure fedoragsearch, which will adjust various config files. You will need to go into the deployed fedoragsearch webapps folder's FgsConfig folder first to be able to run the configuration target:

    $ cd /<GS3>/packages/tomcat/webapps/fedoragsearch/FgsConfig
    $ ant -f fgsconfig-basic.xml

9. Run tomcat and check fedoragsearch has been properly installed by visiting:
http://localhost:8383/fedoragsearch/rest

10. If you wish to remove the various war files from GS3's tomcat webapps folder, you will need to stop tomcat first before deleting the war files, because doing so when tomcat is running will undeploy those webapps.
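
The property edits in step 3 can be made with a small helper instead of a text editor. This is a sketch only: the function name set_property is hypothetical, and it assumes simple "key=value" lines with no '|', '&' or backslash in the key or value.

```shell
#!/bin/sh
# Hypothetical helper: set key=value in a Java .properties file. An existing
# line starting with "key=" is replaced; otherwise the pair is appended.
# Assumptions: keys and values contain no '|', '&' or backslash, and the file
# ends with a newline. Dots in the key are treated as regex wildcards, which
# is loose but harmless for the property names used here.
set_property() {
    file="$1"
    key="$2"
    value="$3"
    if grep -q "^${key}=" "$file"; then
        tmp="${file}.tmp.$$"
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp" && mv "$tmp" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}
```

Usage (placeholder path and values): set_property /full/path/to/fgsconfig-basic.properties gsearchBase http://localhost:8383, repeated for each property listed in step 3.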


********************************
DEBUGGING
********************************

Logs to consult when debugging:

1. Check /<GS3>/packages/tomcat/logs/catalina.out

2. /<GS3>/ext/fedora3/server/logs/fedora.log for Fedora error logging.

3. To turn on FedoraGSearch's logging of debug statements (Huge files, can reach 500 MB in a day. Do this only when debugging):
- /<GS3>/packages/tomcat/webapps/fedoragsearch/WEB-INF/classes/log4j.xml is already set to output DEBUG statements and higher logging levels
- So open /<GS3>/ext/fedora3/server/logs/fedoragsearch.daily.log to look at debug messages in case fedoragsearch doesn't work as expected.
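
To look at the ends of the three logs above in one go, something like the following sketch can be used. The function name tail_logs and the choice of 20 lines are arbitrary assumptions, not part of Greenstone or Fedora.

```shell
#!/bin/sh
# Hypothetical helper: for each log file named on the command line, print a
# header and its last 20 lines, or a "(missing)" note if the file is absent.
tail_logs() {
    for log in "$@"; do
        if [ -f "$log" ]; then
            echo "==> $log"
            tail -n 20 "$log"
        else
            echo "(missing) $log"
        fi
    done
}
```

Usage (with /<GS3> replaced by the real path): tail_logs /<GS3>/packages/tomcat/logs/catalina.out /<GS3>/ext/fedora3/server/logs/fedora.log /<GS3>/ext/fedora3/server/logs/fedoragsearch.daily.log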


********************************************************************************************************************