Timestamp:
2021-09-15T11:58:11+12:00 (3 years ago)
Author:
anupama
Message:

Committing Dr Bainbridge's improvements to the Tika-preconfigured UnknownConverterPlugin:

1. Introducing the OS-agnostic %%GSDLHOME variable into the model collConfig.xml file, which UnknownConverterPlugin.pm will replace with $GSDLHOME or %GSDLHOME% as needed. The Perl file now also handles GSDL3HOME and GSDL3SRCHOME similarly.
2. tika-app-1.24.1.jar is now renamed to just tika-app.jar so that UnknownConverterPlugin's exec_cmd works on Windows too, where there is no file globbing or wildcard expansion to resolve tika-app*.jar as there is on Linux. The README in the gs2build/ext/tika folder has been updated to mention the version number of the tika-app jar file we're using.

Location:
main/trunk/greenstone2/perllib
Files:
2 edited

  • main/trunk/greenstone2/perllib/plugins/UnknownConverterPlugin.pm

    r32305 r35401  
    272 272     }
    273 273
        274     # Allow the user to use %%GSDL(3|3SRC)HOME and replace them here with the
        275     # OS-specific $GSDL(3|3SRC)HOME or %GSDL(3|3SRC)HOME%
        276     $cmd =~ s@%%GSDLHOME@\"$ENV{'GSDLHOME'}\"@g;
        277     $cmd =~ s@%%GSDL3HOME@\"$ENV{'GSDL3HOME'}\"@g;
        278     $cmd =~ s@%%GSDL3SRCHOME@\"$ENV{'GSDL3SRCHOME'}\"@g;
        279
    274 280     # Some debugging
    275 281     if ($self->{'verbosity'} > 2) {
  • main/trunk/greenstone2/perllib/strings.properties

    r35163 r35401  
    1345 1345 UnknownConverterPlugin.desc:If you have a custom conversion tool installed that you're able to run from the command line to convert from an unsupported document format to text, HTML or a series of images in jpg, png or gif form, then provide that command to this Plugin. It will then run the command for you, capturing the output for indexing by Greenstone, making the documents (if converted to text or HTML) searchable. Set the -process_extension option to the suffix of files to be converted. Set the -convert_to option to the output format that the conversion command will generate, which will determine the output file's suffix. Set the -exec_cmd option to the command to be run.
    1346 1346
    1347      UnknownConverterPlugin.exec_cmd:Command line command string to execute that will do the conversion. Quoted elements need to have the quotes escaped with a backslash to preserve them. Use %%%%INPUT_FILE and %%%%OUTPUT as place holders in the command for input filename, and output filename, respectively. Greenstone will replace these with the correct values when calling the command. If -convert_to is a pagedimg type, Greenstone sets %%%%OUTPUT to be a directory to contain the expected files and will create an item file collating the parts of the document.
         1347 UnknownConverterPlugin.exec_cmd:Command line command string to execute that will do the conversion. Quoted elements need to have the quotes escaped with a backslash to preserve them. Use %%%%INPUT_FILE and %%%%OUTPUT as place holders in the command for input filename, and output filename, respectively. (You can optionally use %%%%GSDLHOME, %%%%GSDL3HOME, %%%%GSDL3SRCHOME in place of the similarly named environment variables, to set the exec_cmd value to a command that will function across operating-systems.) Greenstone will replace all these placeholder variables with the correct values when calling the command. If -convert_to is a pagedimg type, Greenstone sets %%%%OUTPUT to be a directory to contain the expected files and will create an item file collating the parts of the document.
    1348 1348
    1349 1349 UnknownConverterPlugin.output_file_or_dir_name: Full pathname of the output file or of the directory (of output files) that get generated by the conversion
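
The three substitutions added to UnknownConverterPlugin.pm can be tried out in isolation with a small standalone script. This is only a sketch, not the plugin's code. The helper name expand_gsdl_placeholders, the hash passed in place of %ENV, the install path, and the sample command line (including where tika-app.jar sits under ext/tika) are all assumptions made for illustration. Unlike the changeset, the sketch skips variables that are unset instead of substituting an empty value.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of UnknownConverterPlugin.pm): expands the
# %%GSDLHOME, %%GSDL3HOME and %%GSDL3SRCHOME placeholders in a command string,
# wrapping each substituted value in double quotes as the changeset does.
sub expand_gsdl_placeholders {
    my ($cmd, $env) = @_;
    foreach my $name ('GSDLHOME', 'GSDL3HOME', 'GSDL3SRCHOME') {
        # Assumption: leave the placeholder alone when the variable is unset.
        next unless defined $env->{$name};
        $cmd =~ s/%%\Q$name\E/"$env->{$name}"/g;
    }
    return $cmd;
}

# Made-up environment and command, for illustration only.
my %fake_env = (GSDLHOME => '/opt/greenstone/gs2build');
my $cmd = 'java -jar %%GSDLHOME/ext/tika/tika-app.jar --text %%INPUT_FILE';

print expand_gsdl_placeholders($cmd, \%fake_env), "\n";
# prints: java -jar "/opt/greenstone/gs2build"/ext/tika/tika-app.jar --text %%INPUT_FILE
```

Note that the closing quote lands directly after the substituted directory, so the rest of the path follows the quoted part. %%INPUT_FILE is untouched here because the plugin replaces it separately.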