Changeset 32024 for gs3-extensions/gs-icecite/GS-Icecite-README
- Timestamp: 2017-10-05T20:02:22+13:00
- File: 1 edited
gs3-extensions/gs-icecite/GS-Icecite-README
r32023  r32024
     1          - IceCite for Greenstone was built on the research net linux machine.
             1  + IceCite for Greenstone was built 19 July 2017 on the research net linux machine. The version checked out from git and compiled successfully on 5 Oct 2017 produced strange sequences of alphanumeric interspersed with what could be the regular contents when run over the 24.pdf test file in step 4c. So we've since committed the version compiled on 19 July instead.
     2       2
     3       3    4 LICENSE INFO
     …       …
   116     117
   117     118    The solution was to:
   118          - a. Obtain bouncycastle (encryption?) jar files from https://www.bouncycastle.org/latest_releases.html
           119  + a. Create a new folder inside the "icecite" checked out folder called "gs-installed-jars".
   119     120
   120          - Download both jar files listed under the "Provider" column for row "JDK 1.5 - JDK 1.8" (not sure that both are necessary) and put them in icecite/pdf-cli folder (for example)
           121  + b. Obtain bouncycastle (encryption?) jar files from https://www.bouncycastle.org/latest_releases.html
           122  +
           123  + Download both jar files listed under the "Provider" column for row "JDK 1.5 - JDK 1.8" (not sure that both are necessary) and put them in icecite/gs-installed-jars folder
   121     124
   122     125    b. Then see https://stackoverflow.com/questions/15930782/call-java-jar-myfile-jar-with-additional-classpath-option
     …       …
   124     127
   125     128
   126          - Therefore, to convert PDF docs to text now that we have the bouncycastle jar files, we now run icecite's PDF-CLI as follows:
           129  + Therefore, to convert PDF docs to text now that we have the bouncycastle jar files, we now run icecite's PDF-CLI as in the following example:
   127     130
   128          - ~/icecite/pdf-cli$ java -classpath '.:/home/greenstone/icecite/pdf-cli/*:target/pdf-cli-0.0.1-SNAPSHOT-jar-with-dependencies.jar' cli.PdfParserCommandLine --format txt --feature words ~/Desktop/24.pdf ~/Desktop/24converted.txt
           131  + java -classpath '.:/home/greenstone/icecite/gs-installed-jars/*:/home/greenstone/icecite/pdf-cli/target/pdf-cli-0.0.1-SNAPSHOT-jar-with-dependencies.jar' cli.PdfParserCommandLine --format txt --feature words ~/Desktop/24.pdf ~/Desktop/24converted.txt
   129     132
   130     133
           134  + Since we provide the absolute path to the jar nested within pdf-cli, we no longer need to cd into pdf-cli first to run the jar executable.
   131     135
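The new command works from any directory because every classpath entry except "." is absolute. A minimal sketch of how that classpath could be assembled from a single install location follows. The variable ICECITE_HOME and the function build_classpath are hypothetical names used only for illustration; they do not appear in the README. The java call is left commented out so the sketch only prints the classpath it would use.

```shell
#!/bin/sh
# Hypothetical: ICECITE_HOME is an illustrative variable, not part of the README.
# It defaults to the checkout location used in the changeset's example command.
ICECITE_HOME="${ICECITE_HOME:-/home/greenstone/icecite}"

# Hypothetical helper: prints the classpath for a given icecite checkout.
# Entries: the current directory, every jar in gs-installed-jars (the JVM
# itself expands the trailing /*), and the pdf-cli jar-with-dependencies.
build_classpath() {
    printf '%s' ".:$1/gs-installed-jars/*:$1/pdf-cli/target/pdf-cli-0.0.1-SNAPSHOT-jar-with-dependencies.jar"
}

CP="$(build_classpath "$ICECITE_HOME")"
echo "$CP"

# Uncomment to run the conversion from the changeset's example:
# java -classpath "$CP" cli.PdfParserCommandLine --format txt --feature words ~/Desktop/24.pdf ~/Desktop/24converted.txt
```

The classpath stays quoted throughout so that the shell passes the literal `*` to java rather than expanding it into a space-separated file list, which java would not accept as a classpath.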