Timestamp:
2018-06-11T17:54:08+12:00
Author:
ak19
Message:

Follow-up to the recent commit's pdfbox modifications. The new class has been renamed from GS_PDFToImagesAndText.java to org/greenstone/pdfbox/PDFBoxToImagesAndText.java and now lives in a Greenstone (GS) package. The class file is no longer included in pdfbox-app.jar; it is only compiled against that jar. Added the files related to Apache License 2.0. PDFBoxConverter.pm now refers to the renamed Java class under its new org.greenstone.pdfbox package name. Updated the Readme with instructions for compiling the new Java file within its new folder/package structure, and with information about the Apache license. There is also a new java/build subfolder containing the precompiled class file (in its Java package structure) for the new class. The pdfbox tarball/zip files have been updated to match: they now contain the new build folder with the custom class, the modified PDFBoxConverter.pm, and the modified pdfbox-app.jar (without the custom class).

File:
1 edited

  • gs2-extensions/pdf-box/trunk/java/perllib/plugins/PDFBoxConverter.pm

    --- r32193
    +++ r32197
    @@ -131,7 +131,10 @@
         $self->{'pdfbox_launch_cmd'} = $launch_cmd;
         #$self->{'pdfbox_img_launch_cmd'} = "java -cp \"$pbajar\" org.apache.pdfbox.tools.PDFToImage"; # pdfbox 2.09 cmd for converting each PDF page to an image (gif, jpg, png)
    -    # Now: use this cmd to launch our new custom PDFBox class (GS_PDFToImagesAndText.java) to convert each PDF page into an image (gif, jpg, png)
    +    # Now: use this cmd to launch our new custom PDFBox class (PDFBoxToImagesAndText.java) to convert each PDF page into an image (gif, jpg, png)
         # AND its extracted text. An item file is still generated, but this time referring to txtfiles too, not just the images. Result: searchable paged output.
    -    $self->{'pdfbox_img_launch_cmd'} = "java -cp \"$pbajar\" org.apache.pdfbox.tools.GS_PDFToImagesAndText";
    +    # Our new custom class PDFBoxToImagesAndText.java lives in the new build folder, so add that to the classpath for the launch cmd
    +    my $pdfbox_build = &FileUtils::filenameConcatenate($gextpb_home,"build");
    +    my $classpath = &util::pathname_cat($pbajar,$pdfbox_build);
    +    $self->{'pdfbox_img_launch_cmd'} = "java -cp \"$classpath\" org.greenstone.pdfbox.PDFBoxToImagesAndText";
         }
         else {
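
The classpath change in this changeset can be illustrated with a minimal standalone sketch. It uses only core Perl modules rather than Greenstone's own helpers, so it is an approximation and not the plugin's actual code: `build_classpath` and `build_launch_cmd` are hypothetical names, the `ext/pdf-box` home directory and the `lib/java` location of the jar are assumed for illustration, and joining entries with the platform's path separator is only presumed to match what `util::pathname_cat` does.

```perl
use strict;
use warnings;
use Config;       # core module; provides $Config{path_sep} (":" on Unix, ";" on Windows)
use File::Spec;   # core module; portable path construction

# Hypothetical stand-in for the extension home directory ($gextpb_home in the changeset).
my $ext_home = File::Spec->catdir('ext', 'pdf-box');

# Assumed layout: the PDFBox jar under lib/java, the compiled custom class under build/.
my $pbajar       = File::Spec->catfile($ext_home, 'lib', 'java', 'pdfbox-app.jar');
my $pdfbox_build = File::Spec->catdir($ext_home, 'build');

# Join classpath entries with the platform's path separator.
# Assumption: this mirrors the role of util::pathname_cat in the changeset.
sub build_classpath {
    my @entries = @_;
    return join($Config{path_sep}, @entries);
}

# Assemble the java launch command for a given classpath and main class.
sub build_launch_cmd {
    my ($classpath, $main_class) = @_;
    return qq{java -cp "$classpath" $main_class};
}

my $classpath  = build_classpath($pbajar, $pdfbox_build);
my $launch_cmd = build_launch_cmd($classpath, 'org.greenstone.pdfbox.PDFBoxToImagesAndText');
print "$launch_cmd\n";
```

Both entries are needed on the classpath because the custom class is no longer inside pdfbox-app.jar: the build folder supplies org.greenstone.pdfbox.PDFBoxToImagesAndText, while the jar supplies the PDFBox classes it was compiled against.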