This is pdftohtml, which is based at:
 http://www.ra.informatik.uni-stuttgart.de/~gosho/pdftohtml/

The version is based on version 0.22, with some code included from
version 0.31. It has been modified for Greenstone use, particularly
the file xpdf/HtmlOutputDev.cc, in an attempt to get text and images
in roughly the right place without using javascript or multiple pages.

Known problems:
 tables with text.
 multi-column pages.
 some image types don't get extracted.

John McPherson.
02 May 2001.