source: trunk/gsdl/packages/pdftohtml/readme-gs.txt@ 3147

Last change on this file since 3147 was 3147, checked in by jrm21, 22 years ago

Updated to mention new maintained version at sourceforge.

This is pdftohtml, which was based at:
 http://www.ra.informatik.uni-stuttgart.de/~gosho/pdftohtml/

It has recently been picked up again, and is currently based at:
 http://pdftohtml.sourceforge.net/

The version is based on version 0.22, with some code included from
version 0.31. It has been modified for Greenstone use, particularly
the file xpdf/HtmlOutputDev.cc, in an attempt to get text and images
in roughly the right place without using javascript or multiple pages.

Known problems:
 tables with text.
 multi-column pages.
 some image types don't get extracted.

John McPherson.
02 May 2001.