source: gs2-extensions/apache-jena/trunk/src/README-FUSEKI3.txt@ 38208

Last change on this file since 38208 was 38208, checked in by davidb, 8 months ago

Extra details. Perhaps these should be in the main README.txt

!!!!!!!!!!!!!!
! DEPRECATED !
!!!!!!!!!!!!!!

These notes were written for when Fuseki3 was run as its own
standalone server on a separate port to the Greenstone3 installation.

***********

#----
# Installation/Compiling
#----

Installation of this extension follows the usual pattern:

    ./CASCADE-MAKE.sh

Test out the triplestore server with:

    source ./setup.bash
    gs-triplestore-server3.DEPRECATED

A successful run will produce output similar to the following:

    [2023-01-13 03:48:52] Server INFO Running in read-only mode for /greenstone
    [2023-01-13 03:48:52] Server INFO Apache Jena Fuseki 3.17.0
    [2023-01-13 03:48:53] Config INFO FUSEKI_HOME=/mnt/disks/atea-scratch-encrypted/davidb/research/code-managed/intermuse/greenstone3-svn/gs2build/ext/apache-jena/packages/apache-jena-fuseki-3.17.0
    [2023-01-13 03:48:53] Config INFO FUSEKI_BASE=/mnt/disks/atea-scratch-encrypted/davidb/research/code-managed/intermuse/greenstone3-svn/gs2build/ext/apache-jena/run
    [2023-01-13 03:48:53] Config INFO Shiro file: file:///mnt/disks/atea-scratch-encrypted/davidb/research/code-managed/intermuse/greenstone3-svn/gs2build/ext/apache-jena/run/shiro.ini
    [2023-01-13 03:48:53] Config INFO Template file: templates/config-tdb-dir
    [2023-01-13 03:48:54] Server INFO Database: TDB1 dataset: location=etc/tdb-triple-store3
    [2023-01-13 03:48:54] Server INFO Path = /greenstone
    [2023-01-13 03:48:54] Server INFO System
    [2023-01-13 03:48:54] Server INFO Memory: 4.0 GiB
    [2023-01-13 03:48:54] Server INFO Java: 11.0.16.1
    [2023-01-13 03:48:54] Server INFO OS: Linux 5.11.0-1029-gcp amd64
    [2023-01-13 03:48:54] Server INFO PID: 1025186
    [2023-01-13 03:48:54] Server INFO Started 2023/01/13 03:48:54 UTC on port 4040

Currently there is no content in the triplestore. To add some, you need to
adjust and then build a collection.

Note: to stop the triplestore server, press ^C.

#----
# Building collections with Linked Open Data
#----

Assuming you are running the gs-triplestore-server3 ...

To include Linked Data triples for the documents' metadata in a collection,
add the following to its collectionConfig.xml file, and rebuild:

    <search type="jenaTDB" orthogonal="true"/>
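
Before rebuilding, it can be handy to confirm that the element really is
present in the file. The following is a minimal sketch only: the function
name has_jena_search is hypothetical and not part of the extension.

```shell
# Hypothetical helper (not part of the extension): succeeds if the given
# collectionConfig.xml contains a <search ... type="jenaTDB" ...> element.
#   $1 = path to a collectionConfig.xml file
has_jena_search() {
    grep -q '<search[^>]*type="jenaTDB"' "$1"
}
```

Example use, assuming the collection's configuration file lives in the
collection's etc directory:

    has_jena_search $GSDL3HOME/sites/<yoursite>/collect/<yourcollection>/etc/collectionConfig.xml && echo "jenaTDB search enabled"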

#----
# Confirming Linked Open Data has been ingested
#----

For interactive experiments with the triplestore, you can then visit
the home page:

    http://localhost:4040/

and from there explore the interactive interface Fuseki provides to
the Jena TDB store.

For production use, it is common to operate Greenstone3 behind a
reverse-proxy web server, such as Apache2. Assuming you are running
Greenstone3 as:

    http://mydomain.org/greenstone3/library

then add to the Apache2 configuration file:

    ProxyPass /greenstone3-lod3/ http://localhost:4040/
    ProxyPassReverse /greenstone3-lod3/ http://localhost:4040/
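
If the public prefix or the Fuseki port differs on your system, the same
pair of directives can be generated rather than typed by hand. This is a
sketch only: the function name fuseki_proxy_conf is hypothetical, and the
directives rely on Apache's mod_proxy and mod_proxy_http modules being
enabled.

```shell
# Hypothetical helper (not part of the extension): print the ProxyPass
# pair for a public URL prefix and a local Fuseki port.
#   $1 = public prefix, e.g. /greenstone3-lod3/
#   $2 = local port Fuseki listens on, e.g. 4040
fuseki_proxy_conf() {
    printf 'ProxyPass %s http://localhost:%s/\n' "$1" "$2"
    printf 'ProxyPassReverse %s http://localhost:%s/\n' "$1" "$2"
}

# Reproduces the two lines given above.
fuseki_proxy_conf /greenstone3-lod3/ 4040
```

Keeping the trailing slash on both the prefix and the target URL, as in
the lines above, keeps the path mapping consistent.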

You can now visit the interactive interface at:

    http://mydomain.org/greenstone3-lod3/

A useful page for testing your triplestore is:

    https://mydomain.org/greenstone3-lod3/dataset.html

Make sure the 'dataset' drop-down menu is set to:

    /greenstone

In normal use of Greenstone3 with the apache-jena extension, this will
be the only item in the drop-down menu.

By default, the following SPARQL query is loaded:

    SELECT ?subject ?predicate ?object
    WHERE {
      ?subject ?predicate ?object
    }
    LIMIT 25

Press the 'play' button to run the query.

You will then get output similar to the following:

    1  subject:    <http://127.0.0.1:4343/greenstone3/library/collection/programmes-and-performers/document/D0272>
       predicate:  <http://purl.org/dc/elements/1.1/Relation.isPartOf>
       object:     <http://127.0.0.1:4343/greenstone3/library/collection/programmes-and-performers>

    2  subject:    <http://127.0.0.1:4343/greenstone3/library/collection/programmes-and-performers/document/D0272>
       predicate:  <http://greenstone.org/gsdlextracted#gsdlsourcefilename>
       object:     "import/HMS-Catalogue-SMALL.csv"

#----
# Adding sparql.xsl to a collection
#----

If needed:

    mkdir $GSDL3HOME/sites/<yoursite>/collect/<yourcollection>/transform
    mkdir $GSDL3HOME/sites/<yoursite>/collect/<yourcollection>/transform/pages

Then:

    /bin/cp transform/pages/sparql.xsl $GSDL3HOME/sites/<yoursite>/collect/<yourcollection>/transform/pages/.

If using a reverse-proxy web server:

    emacs packages/apache-jena-fuseki-3.17.0/webapp/xml-to-html-links.xsl

Then change:

    <a href="?query={$query}&amp;output=xml&amp;stylesheet=%2Fxml-to-html-links.xsl">
=>
    <a href="?query={$query}&amp;output=xml&amp;stylesheet=/greenstone3-lod3/%2Fxml-to-html-links.xsl">

Also make sure you have set the reverse-proxy web server settings needed for Greenstone3 in:

    emacs $GSDL3SRCHOME/build.properties

####!!!

For /fuseki3 running within the Greenstone3 Tomcat server, do the above but use:

    stylesheet=/fuseki3/%2F ...

And, after accessing the /fuseki3 URL to unbundle the WAR file:

    cp packages/apache-jena-fuseki-3.17.0/webapp/xml-to-html-links.xsl $GSDL3SRCHOME/packages/tomcat/webapps/fuseki3/.

#----
# SPARQL queries
#----

To run a SPARQL query directly, you would use:

    http://localhost:4040/greenstone/query?query=select+*+where+%7B+%3Fs+%3Fo+%3Fp+%7D+limit+100&output=text&stylesheet=

For a proxied install you would (continuing the example) use:

    http://mydomain.org/greenstone3-lod/greenstone/query?query=select+*+where+%7B+%3Fs+%3Fo+%3Fp+%7D+limit+100&output=text&stylesheet=
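
Hand-encoding the query string is error-prone. As a minimal sketch,
assuming curl is installed, the same kind of request can be issued from
the shell with curl doing the percent-encoding. The function name
run_sparql and the CURL override variable are illustrative and not part
of the extension; the endpoint and the 'output' parameter are taken from
the URLs above.

```shell
# Hypothetical helper (not part of the extension): send a SPARQL query to
# a Fuseki query endpoint as a GET request, letting curl percent-encode
# the arguments.
#   $1 = query endpoint URL
#   $2 = SPARQL query text
# CURL may be overridden (e.g. CURL=echo) to inspect the command that
# would be run without contacting the server.
run_sparql() {
    ${CURL:-curl} -G \
        --data-urlencode "query=$2" \
        --data-urlencode "output=text" \
        "$1"
}
```

Example use, for the direct and the proxied case respectively:

    run_sparql http://localhost:4040/greenstone/query 'select * where { ?s ?o ?p } limit 100'
    run_sparql http://mydomain.org/greenstone3-lod/greenstone/query 'select * where { ?s ?o ?p } limit 100'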

Factoring in these details, it is possible to set up a transform/pages/sparql.xsl page in Greenstone3 that is fully operational,
even in the reverse-proxy situation.

One key step in sparql.xsl is to update the form input argument for 'stylesheet' to specify the proxied prefix URL:

    stylesheet=/greenstone3-lod/xml-to-html-links.xsl
    stylesheet=%2Fgreenstone3-lod%2Fxml-to-html-links.xsl
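
The two forms above are the same path, first as plain text and then with
each '/' percent-encoded as %2F. A small sketch of that conversion
follows; the function name encode_slashes is hypothetical, and it encodes
only the slash character, not other reserved characters.

```shell
# Hypothetical helper (not part of the extension): percent-encode every
# '/' in a path as %2F. Other characters are passed through unchanged.
encode_slashes() {
    printf '%s\n' "$1" | sed 's|/|%2F|g'
}

encode_slashes /greenstone3-lod/xml-to-html-links.xsl
# prints: %2Fgreenstone3-lod%2Fxml-to-html-links.xsl
```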


When asking for the SPARQL XML result to be transformed to html-links, the hyperlinks in the generated page
also need some tweaking. The Greenstone3 sparql.xsl page helps plumb the static files used back to the
/greenstone3-lod/ URL, but 'xml-to-html-links.xsl' itself sets another 'stylesheet=' argument that
needs to be updated for the reverse-proxy setup:

    <a href="?query={$query}&amp;output=xml&amp;stylesheet=%2Fxml-to-html-links.xsl">
=>
    <a href="?query={$query}&amp;output=xml&amp;stylesheet=%2Fgreenstone3-lod%2Fxml-to-html-links.xsl">

You might need to reload the xml-to-html-links.xsl file in the browser if it was cached before the change.
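
The change to xml-to-html-links.xsl can also be applied non-interactively.
The following is a sketch under stated assumptions: the function name
patch_stylesheet_link is hypothetical, the prefix is passed without
slashes, and a copy of the unmodified file is kept alongside with a
.orig suffix.

```shell
# Hypothetical helper (not part of the extension): rewrite the
# 'stylesheet=' argument in xml-to-html-links.xsl so that it carries the
# reverse-proxy prefix.
#   $1 = path to xml-to-html-links.xsl
#   $2 = proxy prefix without slashes, e.g. greenstone3-lod
patch_stylesheet_link() {
    file=$1
    prefix=$2
    # Keep the unmodified file as <file>.orig, then write the edited copy.
    cp "$file" "$file.orig"
    sed "s|stylesheet=%2Fxml-to-html-links\.xsl|stylesheet=%2F${prefix}%2Fxml-to-html-links.xsl|g" \
        "$file.orig" > "$file"
}
```

Example use:

    patch_stylesheet_link packages/apache-jena-fuseki-3.17.0/webapp/xml-to-html-links.xsl greenstone3-lod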


    % fgrep -rl xml-to-html etc/
    etc/pages/sparql.html
    etc/pages/sparql.tpl
    etc/pages/xml-to-html-links.xsl


#----
# Note about Jena code:
#----

  Jena is a pure Java code base, and the relevant compiled jar files
  are bundled with this extension, so no actual compilation is needed.
  The main task CASCADE-MAKE.sh performs is to untar the 'binary' files
  and set them up in the appropriate top-level 'lib' folder (for jar
  files) etc. so Greenstone can find them.

  In case it is ever needed, the companion source code is in the 'src'
  directory (taken from the Apache Jena website, matched to the binary
  version). For source code, there was only a zip version of the file
  to download.

  Running the command-line client programs, such as 's-put',
  requires Ruby to be installed. The source code for Ruby is
  provided in the 'packages' directory, and is compiled and installed
  as part of running ./CASCADE-MAKE.sh
