Timestamp: 2017-10-05T20:02:22+13:00
Author: ak19
Message:

IceCite for Greenstone was built on 19 July 2017 on the research net linux machine. The version checked out from git and compiled successfully on 5 Oct 2017 produced strange sequences of alphanumeric characters, interspersed with what could be the regular contents, when run over the 24.pdf test file in step 4c. Committing the version compiled on 19 July instead, as it works.

File: 1 edited

  • gs3-extensions/gs-icecite/GS-Icecite-README

--- gs3-extensions/gs-icecite/GS-Icecite-README (r32023)
+++ gs3-extensions/gs-icecite/GS-Icecite-README (r32024)
@@ -1,3 +1,4 @@
-IceCite for Greenstone was built on the research net linux machine.
+IceCite for Greenstone was built 19 July 2017 on the research net linux machine. The version checked out from git and compiled successfully on 5 Oct 2017 produced strange sequences of alphanumeric interspersed with what could be the regular contents when run over the 24.pdf test file in step 4c. So we've since committed the version compiled on 19 July instead.
+
 
 LICENSE INFO
@@ -116,7 +117,9 @@
 
 The solution was to:
-a. Obtain bouncycastle (encryption?) jar files from https://www.bouncycastle.org/latest_releases.html
+a. Create a new folder inside the "icecite" checked out folder called "gs-installed-jars".
 
-Download both jar files listed under the "Provider" column for row "JDK 1.5 - JDK 1.8" (not sure that both are necessary) and put them in icecite/pdf-cli folder (for example)
+b. Obtain bouncycastle (encryption?) jar files from https://www.bouncycastle.org/latest_releases.html
+
+Download both jar files listed under the "Provider" column for row "JDK 1.5 - JDK 1.8" (not sure that both are necessary) and put them in icecite/gs-installed-jars folder
 
 b. Then see https://stackoverflow.com/questions/15930782/call-java-jar-myfile-jar-with-additional-classpath-option
@@ -124,8 +127,9 @@
 
 
-Therefore, to convert PDF docs to text now that we have the bouncycastle jar files, we now run icecite's PDF-CLI as follows:
+Therefore, to convert PDF docs to text now that we have the bouncycastle jar files, we now run icecite's PDF-CLI as in the following example:
 
-    ~/icecite/pdf-cli$ java -classpath '.:/home/greenstone/icecite/pdf-cli/*:target/pdf-cli-0.0.1-SNAPSHOT-jar-with-dependencies.jar' cli.PdfParserCommandLine --format txt --feature words ~/Desktop/24.pdf ~/Desktop/24converted.txt
+    java -classpath '.:/home/greenstone/icecite/gs-installed-jars/*:/home/greenstone/icecite/pdf-cli/target/pdf-cli-0.0.1-SNAPSHOT-jar-with-dependencies.jar' cli.PdfParserCommandLine --format txt --feature words ~/Desktop/24.pdf ~/Desktop/24converted.txt
 
 
+Since we provide the absolute path to the jar nested within pdf-cli, we no longer need to cd into pdf-cli first to run the jar executable.
 
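
The revised README steps could be wrapped in a small shell helper so that the classpath is assembled in one place. This is only a sketch: the function names ensure_jar_dir, build_classpath and convert_pdf are hypothetical and not part of IceCite or Greenstone, while the folder name, jar name, main class and options are copied from the README example. It assumes the bouncycastle jars have already been downloaded into gs-installed-jars by hand.

```shell
#!/bin/sh
# Sketch only: all function names here are illustrative assumptions.

# Step a: create the folder that holds the downloaded bouncycastle jars.
ensure_jar_dir() {
    mkdir -p "$1/gs-installed-jars"
}

# Print the classpath from the README example for a given icecite checkout.
# The '*' stays literal so that java itself expands it to every jar in the folder.
build_classpath() {
    icecite_home=$1
    printf '%s' ".:${icecite_home}/gs-installed-jars/*:${icecite_home}/pdf-cli/target/pdf-cli-0.0.1-SNAPSHOT-jar-with-dependencies.jar"
}

# Convert one PDF to text. Because every path in the classpath is absolute,
# this can be called from any directory.
convert_pdf() {
    java -classpath "$(build_classpath "$1")" cli.PdfParserCommandLine \
        --format txt --feature words "$2" "$3"
}

# Usage, with the paths from the README example:
# convert_pdf /home/greenstone/icecite ~/Desktop/24.pdf ~/Desktop/24converted.txt
```

The command substitution is kept inside double quotes so that the shell does not try to expand the wildcard before java sees it, which matches the single-quoted classpath in the README.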