source: main/trunk/greenstone2/bin/linux/xpdf-tools/doc/pdfimages.1@ 32205

Last change on this file since 32205 was 32205, checked in by ak19, 6 years ago

1. First set of commits to do with implementing the new 'paged_html' output option of PDFPlugin that uses xpdftools' new pdftohtml. So far tested only on Linux (64 bit), but things work there, so I'm optimistically committing the changes. 2. Committing the pre-built Linux binaries of XPDFtools for both 32 and 64 bit, built by the XPDF group. 3. To use the correct bitness variant of xpdftools, setup.bash now exports the BITNESS env var, consulted by gsConvert.pl. 4. All the perl code changes to do with using xpdf tools' pdftohtml to generate paged_html and feed it in the desired form into GS(3): gsConvert.pl, PDFPlugin.pm and its parent ConvertBinaryPFile.pm have been modified to make it all work. xpdftools' pdftohtml generates a folder containing an html file and a screenshot for each page in a PDF (as well as an index.html linking to each page's html). However, we want a single html file that contains each individual 'page' html's content in a div, and we need to do some further HTML style, attribute and structure modifications to massage the xpdftool output into what we want for GS. In order to parse and manipulate the HTML 'DOM' to do this, we're using the Mojo::DOM package that Dr Bainbridge found and compiled up. Mojo::DOM is therefore also committed in this revision. Some further changes and some display fixes are required, but I need to check with the others about that.

.\" Copyright 1998-2017 Glyph & Cog, LLC
.TH pdfimages 1 "10 Aug 2017"
.SH NAME
pdfimages \- Portable Document Format (PDF) image extractor
(version 4.00)
.SH SYNOPSIS
.B pdfimages
[options]
.I PDF-file image-root
.SH DESCRIPTION
.B Pdfimages
saves images from a Portable Document Format (PDF) file as Portable
Pixmap (PPM), Portable Graymap (PGM), Portable Bitmap (PBM), or JPEG
files.
.PP
Pdfimages reads the PDF file,
.IR PDF-file ,
scans one or more pages,
and writes one PPM, PGM, PBM, or JPEG file for each image,
.IR image-root - nnnn . xxx ,
where
.I nnnn
is the image number and
.I xxx
is the image type (ppm, pgm, pbm, or jpg).
.PP
NB: pdfimages extracts the raw image data from the PDF file, without
performing any additional transforms.  Any rotation, clipping,
color inversion, etc. done by the PDF content stream is ignored.
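.PP
The naming scheme above can be checked from a script.
The following is a minimal sketch in Python, not part of pdfimages:
the helper name parse_image_name is hypothetical, and the sketch
assumes that nnnn is always a run of decimal digits.
.PP
.nf
```python
import re

# Image types listed in the DESCRIPTION section.
IMAGE_TYPES = ("ppm", "pgm", "pbm", "jpg")


def parse_image_name(file_name, image_root):
    """Return (image_number, image_type), or None if the name does not match.

    Hypothetical helper; assumes nnnn is written as decimal digits.
    """
    # Pattern: <image-root>-<digits>.<type>, with the root matched literally.
    pattern = re.escape(image_root) + "-([0-9]+)[.](" + "|".join(IMAGE_TYPES) + ")"
    match = re.fullmatch(pattern, file_name)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)
```
.fi
.PP
A caller might use this to filter a directory listing down to the files
written for one image root.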
.SH CONFIGURATION FILE
Pdfimages reads a configuration file at startup.  It first tries to
find the user's private config file, ~/.xpdfrc.  If that doesn't
exist, it looks for a system-wide config file, typically
/usr/local/etc/xpdfrc (but this location can be changed when pdfimages
is built).  See the
.BR xpdfrc (5)
man page for details.
.SH OPTIONS
Many of the following options can be set with configuration file
commands.  These are listed in square brackets with the description of
the corresponding command line option.
.TP
.BI \-f " number"
Specifies the first page to scan.
.TP
.BI \-l " number"
Specifies the last page to scan.
.TP
.B \-j
Normally, all images are written as PBM (for monochrome images), PGM
(for grayscale images), or PPM (for color images) files.  With this
option, images in DCT format are saved as JPEG files.  All non-DCT
images are saved in PBM/PGM/PPM format as usual.  (Inline images are
always saved in PBM/PGM/PPM format.)
.TP
.B \-raw
Write all images in PDF-native formats.  Most of the formats are not
standard image formats, so this option is primarily useful as input to
a tool that generates PDF files.  (Inline images are always saved in
PBM/PGM/PPM format.)
.TP
.B \-list
Write a one-line summary to stdout for each image.  The summary
provides the image file name, the page number, the image width and
height, the horizontal and vertical resolution (DPI) as drawn, the
color space type, and the number of bits per component (BPC).
.TP
.BI \-opw " password"
Specify the owner password for the PDF file.  Providing this will
bypass all security restrictions.
.TP
.BI \-upw " password"
Specify the user password for the PDF file.
.TP
.B \-q
Don't print any messages or errors.
.RB "[config file: " errQuiet ]
.TP
.B \-v
Print copyright and version information.
.TP
.B \-h
Print usage information.
.RB ( \-help
and
.B \-\-help
are equivalent.)
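.PP
As an illustration of how the options above combine with the arguments
from the SYNOPSIS, here is a minimal sketch in Python that assembles an
argument list.
The function name build_pdfimages_command and its parameter names are
hypothetical, and the sketch assumes the executable is installed under
the name pdfimages.
.PP
.nf
```python
def build_pdfimages_command(pdf_file, image_root, first_page=None,
                            last_page=None, jpeg=False, raw=False,
                            list_images=False, owner_password=None,
                            user_password=None, quiet=False):
    """Build an argument list for pdfimages (hypothetical helper)."""
    # Assumption: the executable is reachable as "pdfimages".
    command = ["pdfimages"]
    if first_page is not None:
        command += ["-f", str(first_page)]
    if last_page is not None:
        command += ["-l", str(last_page)]
    if jpeg:
        command.append("-j")
    if raw:
        command.append("-raw")
    if list_images:
        command.append("-list")
    if owner_password is not None:
        command += ["-opw", owner_password]
    if user_password is not None:
        command += ["-upw", user_password]
    if quiet:
        command.append("-q")
    # Options come first, then the PDF file and the image root.
    command += [pdf_file, image_root]
    return command
```
.fi
.PP
Returning a list rather than a single string lets the caller hand the
result to a process launcher without any shell quoting.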
.SH EXIT CODES
The Xpdf tools use the following exit codes:
.TP
0
No error.
.TP
1
Error opening a PDF file.
.TP
2
Error opening an output file.
.TP
3
Error related to PDF permissions.
.TP
99
Other error.
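.PP
A calling script can translate these exit codes into messages.
The following is a minimal sketch in Python: the names EXIT_CODES,
describe_exit_code and run_pdfimages are hypothetical, the fallback text
for unlisted codes is an assumption, and run_pdfimages assumes an
executable named pdfimages is on the PATH.
.PP
.nf
```python
import subprocess

# Exit codes as listed in the EXIT CODES section.
EXIT_CODES = {
    0: "No error.",
    1: "Error opening a PDF file.",
    2: "Error opening an output file.",
    3: "Error related to PDF permissions.",
    99: "Other error.",
}


def describe_exit_code(code):
    """Map an exit code to its documented meaning."""
    # The fallback text for unlisted codes is an assumption of this sketch.
    return EXIT_CODES.get(code, "Undocumented exit code " + str(code))


def run_pdfimages(arguments):
    """Run pdfimages and return (exit_code, description).

    Assumes an executable named pdfimages is on the PATH.
    """
    result = subprocess.run(["pdfimages"] + list(arguments))
    return result.returncode, describe_exit_code(result.returncode)
```
.fi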
.SH AUTHOR
The pdfimages software and documentation are copyright 1998-2017 Glyph
& Cog, LLC.
.SH "SEE ALSO"
.BR xpdf (1),
.BR pdftops (1),
.BR pdftotext (1),
.BR pdftohtml (1),
.BR pdfinfo (1),
.BR pdffonts (1),
.BR pdfdetach (1),
.BR pdftoppm (1),
.BR pdftopng (1),
.BR xpdfrc (5)
.br
.B http://www.xpdfreader.com/