source: trunk/gsdl/perllib/plugins/PDFPlug.pm@ 2785

Last change on this file since 2785 was 2785, checked in by sjboddie, 23 years ago

The build process now creates a summary of how many files were included,
which were rejected, etc. A link to a page containing this summary is
provided from the final page of the collector (once the collection is built
successfully) and from the default "about this collection" text for
collections built by the collector.

Also did a little bit of tidying in a couple of places.

  • Property svn:keywords set to Author Date Id Revision
File size: 2.0 KB
###########################################################################
#
# PDFPlug.pm -- reasonably with-it pdf plugin
# A component of the Greenstone digital library software
# from the New Zealand Digital Library Project at the
# University of Waikato, New Zealand.
#
# Copyright (C) 1999-2001 New Zealand Digital Library Project
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
#
###########################################################################

package PDFPlug;

use ConvertToPlug;

sub BEGIN {
    @ISA = ('ConvertToPlug');
}

sub new {
    my $class = shift (@_);

    # following title_sub removes "Page 1" added by pdftohtml, and a leading
    # "1", which is often the page number at the top of the page. Bad Luck
    # if your document title actually starts with "1 " - is there a better way?

    my $self = new ConvertToPlug ($class, @_, "-title_sub", '^(Page\s+\d+)?(\s*1\s+)?');

    return bless $self, $class;
}



sub get_default_process_exp {
    my $self = shift (@_);

    return q^(?i)\.pdf$^;
}

# so we don't inherit HTMLPlug's block exp...
sub get_default_block_exp {
    return "";
}


# do plugin specific processing of doc_obj for HTML type
sub process {
    my $self = shift (@_);

    my $outhandle = $self->{'outhandle'};
    print $outhandle "PDFPlug: passing $_[3] on to $self->{'convert_to'}Plug\n"
        if $self->{'verbosity'} > 1;

    return ConvertToPlug::process_type($self,"pdf",@_);
}

1;
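
The two regular expressions in this file can be tried out in isolation. The following is a minimal standalone sketch, not part of Greenstone: the helpers clean_title and is_pdf are hypothetical names invented for illustration, and it assumes the -title_sub pattern is applied as a plain substitution that deletes whatever it matches (how ConvertToPlug actually applies it is not shown in this file).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Patterns copied verbatim from PDFPlug.pm.
my $title_sub   = '^(Page\s+\d+)?(\s*1\s+)?';
my $process_exp = q^(?i)\.pdf$^;

# Hypothetical helper (assumption): delete the text matched by the
# title_sub pattern from the start of a candidate title.
sub clean_title {
    my ($title) = @_;
    $title =~ s/$title_sub//;
    return $title;
}

# Hypothetical helper: 1 if the default process expression accepts
# the file name, 0 otherwise.
sub is_pdf {
    my ($filename) = @_;
    return $filename =~ /$process_exp/ ? 1 : 0;
}

# Show the effect on a few sample titles, including the "Bad Luck" case
# where a genuine title begins with "1 ".
for my $t ('Page 1 Annual Report', '1 Introduction',
           '1 in 10 Households', 'Annual Report') {
    printf "%-24s => '%s'\n", "'$t'", clean_title($t);
}

# Show which file names the plugin would pick up by default.
for my $f ('report.pdf', 'REPORT.PDF', 'report.pdf.bak') {
    printf "%-16s %s\n", $f, is_pdf($f) ? 'processed' : 'skipped';
}
```

Under these assumptions, '1 in 10 Households' loses its leading "1 ", which is the case the comment in new() warns about. 'Page 1 Annual Report' keeps the space that followed the page marker, because the second group only consumes whitespace when it is followed by a literal "1". The (?i) flag makes the extension test case-insensitive, so REPORT.PDF is accepted while report.pdf.bak is not.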