source: gs2-extensions/gstika/trunk/java/GSTikaCLI.sh@ 34174

Last change on this file since 34174 was 34174, checked in by ak19, 4 years ago
  1. Created GSTikaCLI.java based on TikaCLI.java from the Apache tika-app jar, modified to convert a doc to HTML and extract images in one step, to correct the image file names so they match what the HTML refers to, and to correct the HTML img src attributes so they no longer carry an embedded: prefix that breaks the image linking.
  2. Attempting to add this as a proper GS2 extension. It worked in the gs2build/ext folder (though some paths have now changed to reflect the gs2-extensions folder structure for an ext). I've not tested whether this works properly as an ext. Hope it will all still work once I make a tarball and zip out of this.
  • Property svn:executable set to *
File size: 1.4 KB
#!/bin/bash


##################################################################################
# BEWARE TO ECHO ALL MSGS IN THIS SCRIPT TO STDERR BY ADDING >&2 AT END OF ECHO #
# ELSE IT WILL GO INTO THE OUTPUT DOC PRODUCED BY Tika!!!                        #
##################################################################################

# To run GSTikaCLI.java against tika-app-*.jar:

#GS3/gs2build/ext/tika>java -cp "`pwd`/tika-app-1.24.1.jar:build" org.greenstone.tika.GSTikaCLI --html-with-imgs /PATH/TO/inputfile.docx > outputfile.html

# GEXT_GSTIKA is expected to be set by the Greenstone environment setup script
if [ -z "$GEXT_GSTIKA" ]; then
    echo "@@@ Source the Greenstone environment setup script first" >&2
    exit 1
fi

tika_app_jar="$GEXT_GSTIKA/lib/tika-app-1.24.1.jar"
builddir="$GEXT_GSTIKA/build"

# Refuse to run until the GSTikaCLI class has been compiled into the build dir
if [ ! -f "$builddir/org/greenstone/tika/GSTikaCLI.class" ]; then
    echo "@@@@ First compile up org.greenstone.tika.GSTikaCLI by running $GSDLHOME/ext/tika/makeGSTikaCLI.sh" >&2
    exit 2
fi


# These echo messages *TO STDERR* are useful in GLI:
# can see when Tika's STDERR messages per processed document start and end
echo "--------------- TIKA RUN OUTPUT -------------------" >&2

# Run it at last - output sent to std out, which is what GSTikaCLI does (so too TikaCLI)
#java -cp "$GSDLHOME/ext/tika/tika-app-1.24.1.jar:build" org.greenstone.tika.GSTikaCLI $*
# "$@" (rather than $*) keeps arguments containing spaces intact
java -cp "$tika_app_jar:$builddir" org.greenstone.tika.GSTikaCLI "$@"

echo "--------------- END TIKA -------------------" >&2
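
The warning at the top of the script matters because callers are expected to redirect the script's standard output into the converted document. The following is a minimal sketch of that separation, not part of the repository: fake_converter is a hypothetical stand-in for GSTikaCLI.sh (no Java or Tika involved), and the file names out.html and run.log are illustrative assumptions.

```shell
#!/bin/bash
# Hypothetical stand-in for GSTikaCLI.sh: banners go to stderr,
# only the converted document goes to stdout.
fake_converter() {
    echo "--------------- TIKA RUN OUTPUT -------------------" >&2
    echo "<html><body>converted</body></html>"
    echo "--------------- END TIKA -------------------" >&2
}

workdir=$(mktemp -d)

# Capture stdout as the output document and stderr as a separate log
fake_converter > "$workdir/out.html" 2> "$workdir/run.log"

html=$(cat "$workdir/out.html")
log_lines=$(wc -l < "$workdir/run.log" | tr -d ' ')

echo "document: $html"
echo "log lines: $log_lines"

rm -rf "$workdir"
```

If either banner were echoed without >&2, it would end up inside out.html alongside the HTML. With the real script, the equivalent call would presumably look like GSTikaCLI.sh --html-with-imgs /PATH/TO/inputfile.docx > outputfile.html 2> tika.log, run after sourcing the Greenstone environment setup script; the tika.log name is an assumption.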