Ticket #649 (new defect)

Opened 9 years ago

Last modified 22 months ago

new version of pdftohtml

Reported by: kjdon Owned by: nobody
Priority: moderate Milestone: Collection building wishlist
Component: Collection Building Severity: major
Keywords: Cc:

Description

There is an experimental 0.40 version. Try this, or 0.39, in Greenstone and upgrade if it is better. Does it support PDF 1.6?

Change History

Changed 9 years ago by kjdon

  • milestone changed from 3.05 Release to 2.84 Release

Changed 9 years ago by kjdon

Or try PDFBox??

Changed 8 years ago by mdewsnip

Notes from Richard Managh at DL Consulting:

Currently Greenstone has pdftohtml 0.34, which was apparently released in 2002, is based on xpdf version 2.02, and (apparently) only supports PDF version 1.4.

PDF version 1.4 came out in May 2001, 1.5 in July 2003, 1.6 in January 2005, and 1.7 in November 2006.

Apparently the latest (experimental) version, pdftohtml 0.40, does NOT support PDF version 1.6, because it is based on xpdf 3.01 and, according to xpdf's changelog, support for PDF 1.6 and 1.7 was added in xpdf 3.02.

So even if we use pdftohtml 0.40 it will be five and a half years out of date.

The most heroic course of action would be to hack xpdf 3.02 into pdftohtml 0.40 and test it on various PDFs, including 1.6 and 1.7 ones.

The most practical course of action might be to test 0.40 on a range of different PDFs and see how well it copes.

Or to use some other more up to date tool that can handle PDF 1.6 and 1.7.
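
For the testing options above, it would help to know which PDF version each test file declares, so that the 1.6 and 1.7 files can be checked separately. The following is only a minimal sketch, not part of Greenstone: the function names `pdf_header_version` and `group_by_version` are made up for illustration, and it assumes the version in the `%PDF-x.y` file header is good enough. From PDF 1.4 onwards a document can override the header with a `/Version` entry in its catalog, which this sketch does not read.

```python
import re
from pathlib import Path

# A PDF file normally starts with a header such as "%PDF-1.6".
# We search the first 1024 bytes to tolerate leading junk.
_HEADER = re.compile(rb"%PDF-(\d)\.(\d)")


def pdf_header_version(data: bytes):
    """Return the (major, minor) version declared in the header, or None."""
    match = _HEADER.search(data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def group_by_version(directory):
    """Map each header version to the PDF paths under `directory` declaring it.

    Files with no recognisable header are grouped under the key None.
    Only names ending in lowercase ".pdf" are matched on case-sensitive
    file systems.
    """
    groups = {}
    for path in sorted(Path(directory).rglob("*.pdf")):
        with open(path, "rb") as f:
            version = pdf_header_version(f.read(1024))
        groups.setdefault(version, []).append(path)
    return groups
```

Running `group_by_version` over a directory of sample documents would show whether the test set actually contains any 1.6 or 1.7 files before the converter is run on them.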

Changed 8 years ago by kjdon

  • milestone changed from 2.84 Release to Collection building wishlist

We have added PDFBox to Greenstone. Maybe look at this at a later stage, but it may be better to look at PDFBox and make it produce HTML?

