Changeset 31787 for main/trunk


Timestamp:
2017-07-11T18:28:41+12:00
Author:
ak19
Message:

Fixing up a couple of strings for the UnknownConverterPlugin. BUG: The new plugin has issues on Windows when using a Win poppler binary's pdftohtml. The conversion succeeds, but the files seem to remain in the tmp area when HTMLPlugin processes them. Not sure if this is to do with Windows, the new exec_cmd being tested, or the spaces in the filename of the imported file.

File:
1 edited

  • main/trunk/greenstone2/perllib/strings.properties

    --- main/trunk/greenstone2/perllib/strings.properties (r31762)
    +++ main/trunk/greenstone2/perllib/strings.properties (r31787)
    @@ -1271,7 +1271,7 @@
     TextPlugin.title_sub:Substitution expression to modify string stored as Title. Used by, for example, PostScriptPlugin to remove "Page 1" etc from text used as the title.
     
    -UnknownConverterPlugin.desc:If you have a custom conversion tool installed that you're able to run from the command line to convert from an unsupported document format to text, HTML or a series of images in jpg, png or gif form, then provide that command to this Plugin. It will then run the command for you, capturing the output for indexing by Greenstone, making any documents that aren't converted to images searchable. Set the process_extension to the suffix of files to be converted. Set convert_to to be the output format that the conversion command will generate, which will determine the output file's suffix. Use %INPUT_FILE and %OUTPUT as place holders in the command, which Greenstone will replace. It will pass in the full path to each file that matches the process_extension suffix in turn as %INPUT_FILE. $OUTPUT will be replaced with a path in the temporary folder of the output file with suffix determined by the value of convert_to. If convert_to is a pagedimg type, Greenstone sets %OUTPUT to be a directory to contain the expected files and will create an item file collating the parts of the document.
    -
    -UnknownConverterPlugin.exec_cmd:Command line command string to execute that will do the conversion. Quoted elements need to have the quotes escaped with a backslash to preserve them.
    +UnknownConverterPlugin.desc:If you have a custom conversion tool installed that you're able to run from the command line to convert from an unsupported document format to text, HTML or a series of images in jpg, png or gif form, then provide that command to this Plugin. It will then run the command for you, capturing the output for indexing by Greenstone, making any documents that aren't converted to images searchable. Set the process_extension to the suffix of files to be converted. Set convert_to to be the output format that the conversion command will generate, which will determine the output file's suffix. Use %INPUT_FILE and %OUTPUT as place holders in the command, which Greenstone will replace. It will pass in the full path to each file that matches the process_extension suffix in turn as %INPUT_FILE. %OUTPUT will be replaced with a path in the temporary folder of the output file with suffix determined by the value of convert_to. If convert_to is a pagedimg type, Greenstone sets %OUTPUT to be a directory to contain the expected files and will create an item file collating the parts of the document.
    +
    +UnknownConverterPlugin.exec_cmd:Command line command string to execute that will do the conversion. Quoted elements need to have the quotes escaped with a backslash to preserve them. Use %INPUT_FILE and %OUTPUT as place holders in the command.
     
     UnknownConverterPlugin.output_file_or_dir_name: Full pathname of the output file or of the directory (of output files) that get generated by the conversion
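
The placeholder handling that UnknownConverterPlugin.desc describes can be sketched roughly as below. This is an illustrative Python sketch only, not Greenstone's Perl implementation. The helper name build_command, the tool name myconverter, the example paths, the choice to name the output after the input file's base name, and the assumption that paged-image types start with "pagedimg" are all hypothetical.

```python
import os
import re


def build_command(exec_cmd, input_file, convert_to, tmp_dir):
    """Fill in %INPUT_FILE and %OUTPUT in a command template (hypothetical helper)."""
    # Assumption: the output is named after the input file's base name.
    base = os.path.splitext(os.path.basename(input_file))[0]
    if convert_to.startswith("pagedimg"):
        # Assumption: paged-image types share a "pagedimg" prefix.
        # In that case %OUTPUT is a directory that will hold the page images.
        output = os.path.join(tmp_dir, base)
    else:
        # Otherwise %OUTPUT is a single file whose suffix comes from convert_to.
        output = os.path.join(tmp_dir, base + "." + convert_to)
    values = {"%INPUT_FILE": input_file, "%OUTPUT": output}
    # Substitute both placeholders in one pass so an inserted value is never rescanned.
    command = re.sub(r"%INPUT_FILE|%OUTPUT", lambda m: values[m.group(0)], exec_cmd)
    return command, output


if __name__ == "__main__":
    # The quotes in the template keep a path containing spaces as one argument.
    template = 'myconverter "%INPUT_FILE" "%OUTPUT"'
    command, output = build_command(template, "/docs/my file.xyz", "html", "/tmp/gs")
    print(command)
```

Quoting the placeholders in the template matters for input files whose names contain spaces, which is one of the suspected causes named in the commit message.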