source: main/trunk/greenstone2/perllib/cpan/Mojo/Content/Single.pm@ 32205

Last change on this file since 32205 was 32205, checked in by ak19, 6 years ago

First set of commits to do with implementing the new 'paged_html' output option of PDFPlugin, which uses xpdftools' new pdftohtml. So far tested only on Linux (64 bit), but things work there, so I'm optimistically committing the changes.

1. Committing the pre-built Linux binaries of XPDFtools for both 32 and 64 bit, built by the XPDF group.

2. To use the correct bitness variant of xpdftools, setup.bash now exports the BITNESS env var, consulted by gsConvert.pl.

3. All the perl code changes to do with using xpdftools' pdftohtml to generate paged_html and feed it in the desired form into GS(3): gsConvert.pl, PDFPlugin.pm and its parent ConvertBinaryFile.pm have been modified to make it all work.

xpdftools' pdftohtml generates a folder containing an html file and a screenshot for each page in a PDF (as well as an index.html linking to each page's html). However, we want a single html file that contains each individual 'page' html's content in a div, and we need to do some further HTML style, attribute and structure modifications to massage the xpdftools output into what we want for GS. To parse and manipulate the HTML 'DOM' for this, we're using the Mojo::DOM package that Dr Bainbridge found and has compiled up. Mojo::DOM is therefore also committed in this revision. Some further changes and some display fixes are required, but I need to check with the others about that.
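The page-merging step described in the commit message could look roughly like the sketch below. It is not the code committed to PDFPlugin.pm. The helper name `merge_pages`, the `pageN` ids and the `pdf-page` class are hypothetical, and the per-page HTML is passed in as strings rather than read from the folder that pdftohtml produces. Only documented Mojo::DOM calls are used (`new`, `at`, `content`, `append_content`, `to_string`).

```perl
use strict;
use warnings;
use Mojo::DOM;

# Hypothetical helper (not part of the commit): wrap the <body> content of
# each per-page HTML document in its own <div> inside one combined document.
sub merge_pages {
  my @pages = @_;

  my $combined = Mojo::DOM->new('<html><head></head><body></body></html>');
  my $body     = $combined->at('body');

  my $page_num = 0;
  for my $html (@pages) {
    $page_num++;

    # Skip any page that has no <body> element
    my $page_body = Mojo::DOM->new($html)->at('body');
    next unless $page_body;

    # The id and class names are assumptions made for illustration
    $body->append_content(
      qq{<div id="page$page_num" class="pdf-page">}
        . $page_body->content
        . '</div>');
  }

  return $combined->to_string;
}

# Usage with two stand-in pages
print merge_pages(
  '<html><body><p>Page one</p></body></html>',
  '<html><body><p>Page two</p></body></html>',
), "\n";
```

Any further style or attribute changes would be applied to each `$page_body` before it is appended. Which changes Greenstone needs is not specified here.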

File size: 4.0 KB
package Mojo::Content::Single;
use Mojo::Base 'Mojo::Content';

use Mojo::Asset::Memory;
use Mojo::Content::MultiPart;

has asset => sub { Mojo::Asset::Memory->new(auto_upgrade => 1) };
has auto_upgrade => 1;

sub body_contains { shift->asset->contains(shift) >= 0 }

sub body_size {
  my $self = shift;
  return ($self->headers->content_length || 0) if $self->is_dynamic;
  return $self->{body_size} //= $self->asset->size;
}

sub clone {
  my $self = shift;
  return undef unless my $clone = $self->SUPER::clone();
  return $clone->asset($self->asset);
}

sub get_body_chunk {
  my ($self, $offset) = @_;
  return $self->generate_body_chunk($offset) if $self->is_dynamic;
  return $self->asset->get_chunk($offset);
}

sub new {
  my $self = shift->SUPER::new(@_);
  $self->{read}
    = $self->on(read => sub { $_[0]->asset($_[0]->asset->add_chunk($_[1])) });
  return $self;
}

sub parse {
  my $self = shift;

  # Parse headers
  $self->_parse_until_body(@_);

  # Parse body
  return $self->SUPER::parse
    unless $self->auto_upgrade && defined $self->boundary;

  # Content needs to be upgraded to multipart
  $self->unsubscribe(read => $self->{read});
  my $multi = Mojo::Content::MultiPart->new(%$self);
  $self->emit(upgrade => $multi);
  return $multi->parse;
}

1;

=encoding utf8

=head1 NAME

Mojo::Content::Single - HTTP content

=head1 SYNOPSIS

  use Mojo::Content::Single;

  my $single = Mojo::Content::Single->new;
  $single->parse("Content-Length: 12\x0d\x0a\x0d\x0aHello World!");
  say $single->headers->content_length;

=head1 DESCRIPTION

L<Mojo::Content::Single> is a container for HTTP content, based on
L<RFC 7230|http://tools.ietf.org/html/rfc7230> and
L<RFC 7231|http://tools.ietf.org/html/rfc7231>.

=head1 EVENTS

L<Mojo::Content::Single> inherits all events from L<Mojo::Content> and can emit
the following new ones.

=head2 upgrade

  $single->on(upgrade => sub {
    my ($single, $multi) = @_;
    ...
  });

Emitted when content gets upgraded to a L<Mojo::Content::MultiPart> object.

  $single->on(upgrade => sub {
    my ($single, $multi) = @_;
    return unless $multi->headers->content_type =~ /multipart\/([^;]+)/i;
    say "Multipart: $1";
  });

=head1 ATTRIBUTES

L<Mojo::Content::Single> inherits all attributes from L<Mojo::Content> and
implements the following new ones.

=head2 asset

  my $asset = $single->asset;
  $single = $single->asset(Mojo::Asset::Memory->new);

The actual content, defaults to a L<Mojo::Asset::Memory> object with
L<Mojo::Asset::Memory/"auto_upgrade"> enabled.

=head2 auto_upgrade

  my $bool = $single->auto_upgrade;
  $single = $single->auto_upgrade($bool);

Try to detect multipart content and automatically upgrade to a
L<Mojo::Content::MultiPart> object, defaults to a true value.

=head1 METHODS

L<Mojo::Content::Single> inherits all methods from L<Mojo::Content> and
implements the following new ones.

=head2 body_contains

  my $bool = $single->body_contains('1234567');

Check if content contains a specific string.

=head2 body_size

  my $size = $single->body_size;

Content size in bytes.

=head2 clone

  my $clone = $single->clone;

Return a new L<Mojo::Content::Single> object cloned from this content if
possible, otherwise return C<undef>.

=head2 get_body_chunk

  my $bytes = $single->get_body_chunk(0);

Get a chunk of content starting from a specific position. Note that it might
not be possible to get the same chunk twice if content was generated
dynamically.

=head2 new

  my $single = Mojo::Content::Single->new;
  my $single = Mojo::Content::Single->new(asset => Mojo::Asset::File->new);
  my $single = Mojo::Content::Single->new({asset => Mojo::Asset::File->new});

Construct a new L<Mojo::Content::Single> object and subscribe to L</"read">
event with default content parser.

=head2 parse

  $single = $single->parse("Content-Length: 12\x0d\x0a\x0d\x0aHello World!");
  my $multi
    = $single->parse("Content-Type: multipart/form-data\x0d\x0a\x0d\x0a");

Parse content chunk and upgrade to L<Mojo::Content::MultiPart> object if
necessary.

=head1 SEE ALSO

L<Mojolicious>, L<Mojolicious::Guides>, L<https://mojolicious.org>.

=cut