1 | #############################################################################
|
---|
2 | # Pod/Parser.pm -- package which defines a base class for parsing POD docs.
|
---|
3 | #
|
---|
4 | # Copyright (C) 1996-2000 by Bradford Appleton. All rights reserved.
|
---|
5 | # This file is part of "PodParser". PodParser is free software;
|
---|
6 | # you can redistribute it and/or modify it under the same terms
|
---|
7 | # as Perl itself.
|
---|
8 | #############################################################################
|
---|
9 |
|
---|
10 | package Pod::Parser;
|
---|
11 |
|
---|
12 | use vars qw($VERSION);
|
---|
13 | $VERSION = 1.32; ## Current version of this package
|
---|
14 | require 5.005; ## requires this Perl version or later
|
---|
15 |
|
---|
16 | #############################################################################
|
---|
17 |
|
---|
18 | =head1 NAME
|
---|
19 |
|
---|
20 | Pod::Parser - base class for creating POD filters and translators
|
---|
21 |
|
---|
22 | =head1 SYNOPSIS
|
---|
23 |
|
---|
24 | use Pod::Parser;
|
---|
25 |
|
---|
26 | package MyParser;
|
---|
27 | @ISA = qw(Pod::Parser);
|
---|
28 |
|
---|
29 | sub command {
|
---|
30 | my ($parser, $command, $paragraph, $line_num) = @_;
|
---|
31 | ## Interpret the command and its text; sample actions might be:
|
---|
32 | if ($command eq 'head1') { ... }
|
---|
33 | elsif ($command eq 'head2') { ... }
|
---|
34 | ## ... other commands and their actions
|
---|
35 | my $out_fh = $parser->output_handle();
|
---|
36 | my $expansion = $parser->interpolate($paragraph, $line_num);
|
---|
37 | print $out_fh $expansion;
|
---|
38 | }
|
---|
39 |
|
---|
40 | sub verbatim {
|
---|
41 | my ($parser, $paragraph, $line_num) = @_;
|
---|
42 | ## Format verbatim paragraph; sample actions might be:
|
---|
43 | my $out_fh = $parser->output_handle();
|
---|
44 | print $out_fh $paragraph;
|
---|
45 | }
|
---|
46 |
|
---|
47 | sub textblock {
|
---|
48 | my ($parser, $paragraph, $line_num) = @_;
|
---|
49 | ## Translate/Format this block of text; sample actions might be:
|
---|
50 | my $out_fh = $parser->output_handle();
|
---|
51 | my $expansion = $parser->interpolate($paragraph, $line_num);
|
---|
52 | print $out_fh $expansion;
|
---|
53 | }
|
---|
54 |
|
---|
55 | sub interior_sequence {
|
---|
56 | my ($parser, $seq_command, $seq_argument) = @_;
|
---|
57 | ## Expand an interior sequence; sample actions might be:
|
---|
58 | return "*$seq_argument*" if ($seq_command eq 'B');
|
---|
59 | return "`$seq_argument'" if ($seq_command eq 'C');
|
---|
60 | return "_${seq_argument}_'" if ($seq_command eq 'I');
|
---|
61 | ## ... other sequence commands and their resulting text
|
---|
62 | }
|
---|
63 |
|
---|
64 | package main;
|
---|
65 |
|
---|
66 | ## Create a parser object and have it parse file whose name was
|
---|
67 | ## given on the command-line (use STDIN if no files were given).
|
---|
68 | $parser = new MyParser();
|
---|
69 | $parser->parse_from_filehandle(\*STDIN) if (@ARGV == 0);
|
---|
70 | for (@ARGV) { $parser->parse_from_file($_); }
|
---|
71 |
|
---|
72 | =head1 REQUIRES
|
---|
73 |
|
---|
74 | perl5.005, Pod::InputObjects, Exporter, Symbol, Carp
|
---|
75 |
|
---|
76 | =head1 EXPORTS
|
---|
77 |
|
---|
78 | Nothing.
|
---|
79 |
|
---|
80 | =head1 DESCRIPTION
|
---|
81 |
|
---|
82 | B<Pod::Parser> is a base class for creating POD filters and translators.
|
---|
83 | It handles most of the effort involved with parsing the POD sections
|
---|
84 | from an input stream, leaving subclasses free to be concerned only with
|
---|
85 | performing the actual translation of text.
|
---|
86 |
|
---|
87 | B<Pod::Parser> parses PODs, and makes method calls to handle the various
|
---|
88 | components of the POD. Subclasses of B<Pod::Parser> override these methods
|
---|
89 | to translate the POD into whatever output format they desire.
|
---|
90 |
|
---|
91 | =head1 QUICK OVERVIEW
|
---|
92 |
|
---|
93 | To create a POD filter for translating POD documentation into some other
|
---|
94 | format, you create a subclass of B<Pod::Parser> which typically overrides
|
---|
95 | just the base class implementation for the following methods:
|
---|
96 |
|
---|
97 | =over 2
|
---|
98 |
|
---|
99 | =item *
|
---|
100 |
|
---|
101 | B<command()>
|
---|
102 |
|
---|
103 | =item *
|
---|
104 |
|
---|
105 | B<verbatim()>
|
---|
106 |
|
---|
107 | =item *
|
---|
108 |
|
---|
109 | B<textblock()>
|
---|
110 |
|
---|
111 | =item *
|
---|
112 |
|
---|
113 | B<interior_sequence()>
|
---|
114 |
|
---|
115 | =back
|
---|
116 |
|
---|
117 | You may also want to override the B<begin_input()> and B<end_input()>
|
---|
118 | methods for your subclass (to perform any needed per-file and/or
|
---|
119 | per-document initialization or cleanup).
|
---|
120 |
|
---|
121 | If you need to perform any preprocesssing of input before it is parsed
|
---|
122 | you may want to override one or more of B<preprocess_line()> and/or
|
---|
123 | B<preprocess_paragraph()>.
|
---|
124 |
|
---|
125 | Sometimes it may be necessary to make more than one pass over the input
|
---|
126 | files. If this is the case you have several options. You can make the
|
---|
127 | first pass using B<Pod::Parser> and override your methods to store the
|
---|
128 | intermediate results in memory somewhere for the B<end_pod()> method to
|
---|
129 | process. You could use B<Pod::Parser> for several passes with an
|
---|
130 | appropriate state variable to control the operation for each pass. If
|
---|
131 | your input source can't be reset to start at the beginning, you can
|
---|
132 | store it in some other structure as a string or an array and have that
|
---|
133 | structure implement a B<getline()> method (which is all that
|
---|
134 | B<parse_from_filehandle()> uses to read input).
|
---|
135 |
|
---|
136 | Feel free to add any member data fields you need to keep track of things
|
---|
137 | like current font, indentation, horizontal or vertical position, or
|
---|
138 | whatever else you like. Be sure to read L<"PRIVATE METHODS AND DATA">
|
---|
139 | to avoid name collisions.
|
---|
140 |
|
---|
141 | For the most part, the B<Pod::Parser> base class should be able to
|
---|
142 | do most of the input parsing for you and leave you free to worry about
|
---|
143 | how to intepret the commands and translate the result.
|
---|
144 |
|
---|
145 | Note that all we have described here in this quick overview is the
|
---|
146 | simplest most straightforward use of B<Pod::Parser> to do stream-based
|
---|
147 | parsing. It is also possible to use the B<Pod::Parser::parse_text> function
|
---|
148 | to do more sophisticated tree-based parsing. See L<"TREE-BASED PARSING">.
|
---|
149 |
|
---|
150 | =head1 PARSING OPTIONS
|
---|
151 |
|
---|
152 | A I<parse-option> is simply a named option of B<Pod::Parser> with a
|
---|
153 | value that corresponds to a certain specified behavior. These various
|
---|
154 | behaviors of B<Pod::Parser> may be enabled/disabled by setting
|
---|
155 | or unsetting one or more I<parse-options> using the B<parseopts()> method.
|
---|
156 | The set of currently accepted parse-options is as follows:
|
---|
157 |
|
---|
158 | =over 3
|
---|
159 |
|
---|
160 | =item B<-want_nonPODs> (default: unset)
|
---|
161 |
|
---|
162 | Normally (by default) B<Pod::Parser> will only provide access to
|
---|
163 | the POD sections of the input. Input paragraphs that are not part
|
---|
164 | of the POD-format documentation are not made available to the caller
|
---|
165 | (not even using B<preprocess_paragraph()>). Setting this option to a
|
---|
166 | non-empty, non-zero value will allow B<preprocess_paragraph()> to see
|
---|
167 | non-POD sections of the input as well as POD sections. The B<cutting()>
|
---|
168 | method can be used to determine if the corresponding paragraph is a POD
|
---|
169 | paragraph, or some other input paragraph.
|
---|
170 |
|
---|
171 | =item B<-process_cut_cmd> (default: unset)
|
---|
172 |
|
---|
173 | Normally (by default) B<Pod::Parser> handles the C<=cut> POD directive
|
---|
174 | by itself and does not pass it on to the caller for processing. Setting
|
---|
175 | this option to a non-empty, non-zero value will cause B<Pod::Parser> to
|
---|
176 | pass the C<=cut> directive to the caller just like any other POD command
|
---|
177 | (and hence it may be processed by the B<command()> method).
|
---|
178 |
|
---|
179 | B<Pod::Parser> will still interpret the C<=cut> directive to mean that
|
---|
180 | "cutting mode" has been (re)entered, but the caller will get a chance
|
---|
181 | to capture the actual C<=cut> paragraph itself for whatever purpose
|
---|
182 | it desires.
|
---|
183 |
|
---|
184 | =item B<-warnings> (default: unset)
|
---|
185 |
|
---|
186 | Normally (by default) B<Pod::Parser> recognizes a bare minimum of
|
---|
187 | pod syntax errors and warnings and issues diagnostic messages
|
---|
188 | for errors, but not for warnings. (Use B<Pod::Checker> to do more
|
---|
189 | thorough checking of POD syntax.) Setting this option to a non-empty,
|
---|
190 | non-zero value will cause B<Pod::Parser> to issue diagnostics for
|
---|
191 | the few warnings it recognizes as well as the errors.
|
---|
192 |
|
---|
193 | =back
|
---|
194 |
|
---|
195 | Please see L<"parseopts()"> for a complete description of the interface
|
---|
196 | for the setting and unsetting of parse-options.
|
---|
197 |
|
---|
198 | =cut
|
---|
199 |
|
---|
200 | #############################################################################
|
---|
201 |
|
---|
202 | use vars qw(@ISA);
|
---|
203 | use strict;
|
---|
204 | #use diagnostics;
|
---|
205 | use Pod::InputObjects;
|
---|
206 | use Carp;
|
---|
207 | use Exporter;
|
---|
208 | BEGIN {
|
---|
209 | if ($] < 5.6) {
|
---|
210 | require Symbol;
|
---|
211 | import Symbol;
|
---|
212 | }
|
---|
213 | }
|
---|
214 | @ISA = qw(Exporter);
|
---|
215 |
|
---|
216 | ## These "variables" are used as local "glob aliases" for performance
|
---|
217 | use vars qw(%myData %myOpts @input_stack);
|
---|
218 |
|
---|
219 | #############################################################################
|
---|
220 |
|
---|
221 | =head1 RECOMMENDED SUBROUTINE/METHOD OVERRIDES
|
---|
222 |
|
---|
223 | B<Pod::Parser> provides several methods which most subclasses will probably
|
---|
224 | want to override. These methods are as follows:
|
---|
225 |
|
---|
226 | =cut
|
---|
227 |
|
---|
228 | ##---------------------------------------------------------------------------
|
---|
229 |
|
---|
230 | =head1 B<command()>
|
---|
231 |
|
---|
232 | $parser->command($cmd,$text,$line_num,$pod_para);
|
---|
233 |
|
---|
234 | This method should be overridden by subclasses to take the appropriate
|
---|
235 | action when a POD command paragraph (denoted by a line beginning with
|
---|
236 | "=") is encountered. When such a POD directive is seen in the input,
|
---|
237 | this method is called and is passed:
|
---|
238 |
|
---|
239 | =over 3
|
---|
240 |
|
---|
241 | =item C<$cmd>
|
---|
242 |
|
---|
243 | the name of the command for this POD paragraph
|
---|
244 |
|
---|
245 | =item C<$text>
|
---|
246 |
|
---|
247 | the paragraph text for the given POD paragraph command.
|
---|
248 |
|
---|
249 | =item C<$line_num>
|
---|
250 |
|
---|
251 | the line-number of the beginning of the paragraph
|
---|
252 |
|
---|
253 | =item C<$pod_para>
|
---|
254 |
|
---|
255 | a reference to a C<Pod::Paragraph> object which contains further
|
---|
256 | information about the paragraph command (see L<Pod::InputObjects>
|
---|
257 | for details).
|
---|
258 |
|
---|
259 | =back
|
---|
260 |
|
---|
261 | B<Note> that this method I<is> called for C<=pod> paragraphs.
|
---|
262 |
|
---|
263 | The base class implementation of this method simply treats the raw POD
|
---|
264 | command as normal block of paragraph text (invoking the B<textblock()>
|
---|
265 | method with the command paragraph).
|
---|
266 |
|
---|
267 | =cut
|
---|
268 |
|
---|
269 | sub command {
|
---|
270 | my ($self, $cmd, $text, $line_num, $pod_para) = @_;
|
---|
271 | ## Just treat this like a textblock
|
---|
272 | $self->textblock($pod_para->raw_text(), $line_num, $pod_para);
|
---|
273 | }
|
---|
274 |
|
---|
275 | ##---------------------------------------------------------------------------
|
---|
276 |
|
---|
277 | =head1 B<verbatim()>
|
---|
278 |
|
---|
279 | $parser->verbatim($text,$line_num,$pod_para);
|
---|
280 |
|
---|
281 | This method may be overridden by subclasses to take the appropriate
|
---|
282 | action when a block of verbatim text is encountered. It is passed the
|
---|
283 | following parameters:
|
---|
284 |
|
---|
285 | =over 3
|
---|
286 |
|
---|
287 | =item C<$text>
|
---|
288 |
|
---|
289 | the block of text for the verbatim paragraph
|
---|
290 |
|
---|
291 | =item C<$line_num>
|
---|
292 |
|
---|
293 | the line-number of the beginning of the paragraph
|
---|
294 |
|
---|
295 | =item C<$pod_para>
|
---|
296 |
|
---|
297 | a reference to a C<Pod::Paragraph> object which contains further
|
---|
298 | information about the paragraph (see L<Pod::InputObjects>
|
---|
299 | for details).
|
---|
300 |
|
---|
301 | =back
|
---|
302 |
|
---|
303 | The base class implementation of this method simply prints the textblock
|
---|
304 | (unmodified) to the output filehandle.
|
---|
305 |
|
---|
306 | =cut
|
---|
307 |
|
---|
308 | sub verbatim {
|
---|
309 | my ($self, $text, $line_num, $pod_para) = @_;
|
---|
310 | my $out_fh = $self->{_OUTPUT};
|
---|
311 | print $out_fh $text;
|
---|
312 | }
|
---|
313 |
|
---|
314 | ##---------------------------------------------------------------------------
|
---|
315 |
|
---|
316 | =head1 B<textblock()>
|
---|
317 |
|
---|
318 | $parser->textblock($text,$line_num,$pod_para);
|
---|
319 |
|
---|
320 | This method may be overridden by subclasses to take the appropriate
|
---|
321 | action when a normal block of POD text is encountered (although the base
|
---|
322 | class method will usually do what you want). It is passed the following
|
---|
323 | parameters:
|
---|
324 |
|
---|
325 | =over 3
|
---|
326 |
|
---|
327 | =item C<$text>
|
---|
328 |
|
---|
329 | the block of text for the a POD paragraph
|
---|
330 |
|
---|
331 | =item C<$line_num>
|
---|
332 |
|
---|
333 | the line-number of the beginning of the paragraph
|
---|
334 |
|
---|
335 | =item C<$pod_para>
|
---|
336 |
|
---|
337 | a reference to a C<Pod::Paragraph> object which contains further
|
---|
338 | information about the paragraph (see L<Pod::InputObjects>
|
---|
339 | for details).
|
---|
340 |
|
---|
341 | =back
|
---|
342 |
|
---|
343 | In order to process interior sequences, subclasses implementations of
|
---|
344 | this method will probably want to invoke either B<interpolate()> or
|
---|
345 | B<parse_text()>, passing it the text block C<$text>, and the corresponding
|
---|
346 | line number in C<$line_num>, and then perform any desired processing upon
|
---|
347 | the returned result.
|
---|
348 |
|
---|
349 | The base class implementation of this method simply prints the text block
|
---|
350 | as it occurred in the input stream).
|
---|
351 |
|
---|
352 | =cut
|
---|
353 |
|
---|
354 | sub textblock {
|
---|
355 | my ($self, $text, $line_num, $pod_para) = @_;
|
---|
356 | my $out_fh = $self->{_OUTPUT};
|
---|
357 | print $out_fh $self->interpolate($text, $line_num);
|
---|
358 | }
|
---|
359 |
|
---|
360 | ##---------------------------------------------------------------------------
|
---|
361 |
|
---|
362 | =head1 B<interior_sequence()>
|
---|
363 |
|
---|
364 | $parser->interior_sequence($seq_cmd,$seq_arg,$pod_seq);
|
---|
365 |
|
---|
366 | This method should be overridden by subclasses to take the appropriate
|
---|
367 | action when an interior sequence is encountered. An interior sequence is
|
---|
368 | an embedded command within a block of text which appears as a command
|
---|
369 | name (usually a single uppercase character) followed immediately by a
|
---|
370 | string of text which is enclosed in angle brackets. This method is
|
---|
371 | passed the sequence command C<$seq_cmd> and the corresponding text
|
---|
372 | C<$seq_arg>. It is invoked by the B<interpolate()> method for each interior
|
---|
373 | sequence that occurs in the string that it is passed. It should return
|
---|
374 | the desired text string to be used in place of the interior sequence.
|
---|
375 | The C<$pod_seq> argument is a reference to a C<Pod::InteriorSequence>
|
---|
376 | object which contains further information about the interior sequence.
|
---|
377 | Please see L<Pod::InputObjects> for details if you need to access this
|
---|
378 | additional information.
|
---|
379 |
|
---|
380 | Subclass implementations of this method may wish to invoke the
|
---|
381 | B<nested()> method of C<$pod_seq> to see if it is nested inside
|
---|
382 | some other interior-sequence (and if so, which kind).
|
---|
383 |
|
---|
384 | The base class implementation of the B<interior_sequence()> method
|
---|
385 | simply returns the raw text of the interior sequence (as it occurred
|
---|
386 | in the input) to the caller.
|
---|
387 |
|
---|
388 | =cut
|
---|
389 |
|
---|
390 | sub interior_sequence {
|
---|
391 | my ($self, $seq_cmd, $seq_arg, $pod_seq) = @_;
|
---|
392 | ## Just return the raw text of the interior sequence
|
---|
393 | return $pod_seq->raw_text();
|
---|
394 | }
|
---|
395 |
|
---|
396 | #############################################################################
|
---|
397 |
|
---|
398 | =head1 OPTIONAL SUBROUTINE/METHOD OVERRIDES
|
---|
399 |
|
---|
400 | B<Pod::Parser> provides several methods which subclasses may want to override
|
---|
401 | to perform any special pre/post-processing. These methods do I<not> have to
|
---|
402 | be overridden, but it may be useful for subclasses to take advantage of them.
|
---|
403 |
|
---|
404 | =cut
|
---|
405 |
|
---|
406 | ##---------------------------------------------------------------------------
|
---|
407 |
|
---|
408 | =head1 B<new()>
|
---|
409 |
|
---|
410 | my $parser = Pod::Parser->new();
|
---|
411 |
|
---|
412 | This is the constructor for B<Pod::Parser> and its subclasses. You
|
---|
413 | I<do not> need to override this method! It is capable of constructing
|
---|
414 | subclass objects as well as base class objects, provided you use
|
---|
415 | any of the following constructor invocation styles:
|
---|
416 |
|
---|
417 | my $parser1 = MyParser->new();
|
---|
418 | my $parser2 = new MyParser();
|
---|
419 | my $parser3 = $parser2->new();
|
---|
420 |
|
---|
421 | where C<MyParser> is some subclass of B<Pod::Parser>.
|
---|
422 |
|
---|
423 | Using the syntax C<MyParser::new()> to invoke the constructor is I<not>
|
---|
424 | recommended, but if you insist on being able to do this, then the
|
---|
425 | subclass I<will> need to override the B<new()> constructor method. If
|
---|
426 | you do override the constructor, you I<must> be sure to invoke the
|
---|
427 | B<initialize()> method of the newly blessed object.
|
---|
428 |
|
---|
429 | Using any of the above invocations, the first argument to the
|
---|
430 | constructor is always the corresponding package name (or object
|
---|
431 | reference). No other arguments are required, but if desired, an
|
---|
432 | associative array (or hash-table) my be passed to the B<new()>
|
---|
433 | constructor, as in:
|
---|
434 |
|
---|
435 | my $parser1 = MyParser->new( MYDATA => $value1, MOREDATA => $value2 );
|
---|
436 | my $parser2 = new MyParser( -myflag => 1 );
|
---|
437 |
|
---|
438 | All arguments passed to the B<new()> constructor will be treated as
|
---|
439 | key/value pairs in a hash-table. The newly constructed object will be
|
---|
440 | initialized by copying the contents of the given hash-table (which may
|
---|
441 | have been empty). The B<new()> constructor for this class and all of its
|
---|
442 | subclasses returns a blessed reference to the initialized object (hash-table).
|
---|
443 |
|
---|
444 | =cut
|
---|
445 |
|
---|
446 | sub new {
|
---|
447 | ## Determine if we were called via an object-ref or a classname
|
---|
448 | my $this = shift;
|
---|
449 | my $class = ref($this) || $this;
|
---|
450 | ## Any remaining arguments are treated as initial values for the
|
---|
451 | ## hash that is used to represent this object.
|
---|
452 | my %params = @_;
|
---|
453 | my $self = { %params };
|
---|
454 | ## Bless ourselves into the desired class and perform any initialization
|
---|
455 | bless $self, $class;
|
---|
456 | $self->initialize();
|
---|
457 | return $self;
|
---|
458 | }
|
---|
459 |
|
---|
460 | ##---------------------------------------------------------------------------
|
---|
461 |
|
---|
462 | =head1 B<initialize()>
|
---|
463 |
|
---|
464 | $parser->initialize();
|
---|
465 |
|
---|
466 | This method performs any necessary object initialization. It takes no
|
---|
467 | arguments (other than the object instance of course, which is typically
|
---|
468 | copied to a local variable named C<$self>). If subclasses override this
|
---|
469 | method then they I<must> be sure to invoke C<$self-E<gt>SUPER::initialize()>.
|
---|
470 |
|
---|
471 | =cut
|
---|
472 |
|
---|
473 | sub initialize {
|
---|
474 | #my $self = shift;
|
---|
475 | #return;
|
---|
476 | }
|
---|
477 |
|
---|
478 | ##---------------------------------------------------------------------------
|
---|
479 |
|
---|
480 | =head1 B<begin_pod()>
|
---|
481 |
|
---|
482 | $parser->begin_pod();
|
---|
483 |
|
---|
484 | This method is invoked at the beginning of processing for each POD
|
---|
485 | document that is encountered in the input. Subclasses should override
|
---|
486 | this method to perform any per-document initialization.
|
---|
487 |
|
---|
488 | =cut
|
---|
489 |
|
---|
490 | sub begin_pod {
|
---|
491 | #my $self = shift;
|
---|
492 | #return;
|
---|
493 | }
|
---|
494 |
|
---|
495 | ##---------------------------------------------------------------------------
|
---|
496 |
|
---|
497 | =head1 B<begin_input()>
|
---|
498 |
|
---|
499 | $parser->begin_input();
|
---|
500 |
|
---|
501 | This method is invoked by B<parse_from_filehandle()> immediately I<before>
|
---|
502 | processing input from a filehandle. The base class implementation does
|
---|
503 | nothing, however, subclasses may override it to perform any per-file
|
---|
504 | initializations.
|
---|
505 |
|
---|
506 | Note that if multiple files are parsed for a single POD document
|
---|
507 | (perhaps the result of some future C<=include> directive) this method
|
---|
508 | is invoked for every file that is parsed. If you wish to perform certain
|
---|
509 | initializations once per document, then you should use B<begin_pod()>.
|
---|
510 |
|
---|
511 | =cut
|
---|
512 |
|
---|
513 | sub begin_input {
|
---|
514 | #my $self = shift;
|
---|
515 | #return;
|
---|
516 | }
|
---|
517 |
|
---|
518 | ##---------------------------------------------------------------------------
|
---|
519 |
|
---|
520 | =head1 B<end_input()>
|
---|
521 |
|
---|
522 | $parser->end_input();
|
---|
523 |
|
---|
524 | This method is invoked by B<parse_from_filehandle()> immediately I<after>
|
---|
525 | processing input from a filehandle. The base class implementation does
|
---|
526 | nothing, however, subclasses may override it to perform any per-file
|
---|
527 | cleanup actions.
|
---|
528 |
|
---|
529 | Please note that if multiple files are parsed for a single POD document
|
---|
530 | (perhaps the result of some kind of C<=include> directive) this method
|
---|
531 | is invoked for every file that is parsed. If you wish to perform certain
|
---|
532 | cleanup actions once per document, then you should use B<end_pod()>.
|
---|
533 |
|
---|
534 | =cut
|
---|
535 |
|
---|
536 | sub end_input {
|
---|
537 | #my $self = shift;
|
---|
538 | #return;
|
---|
539 | }
|
---|
540 |
|
---|
541 | ##---------------------------------------------------------------------------
|
---|
542 |
|
---|
543 | =head1 B<end_pod()>
|
---|
544 |
|
---|
545 | $parser->end_pod();
|
---|
546 |
|
---|
547 | This method is invoked at the end of processing for each POD document
|
---|
548 | that is encountered in the input. Subclasses should override this method
|
---|
549 | to perform any per-document finalization.
|
---|
550 |
|
---|
551 | =cut
|
---|
552 |
|
---|
553 | sub end_pod {
|
---|
554 | #my $self = shift;
|
---|
555 | #return;
|
---|
556 | }
|
---|
557 |
|
---|
558 | ##---------------------------------------------------------------------------
|
---|
559 |
|
---|
560 | =head1 B<preprocess_line()>
|
---|
561 |
|
---|
562 | $textline = $parser->preprocess_line($text, $line_num);
|
---|
563 |
|
---|
564 | This method should be overridden by subclasses that wish to perform
|
---|
565 | any kind of preprocessing for each I<line> of input (I<before> it has
|
---|
566 | been determined whether or not it is part of a POD paragraph). The
|
---|
567 | parameter C<$text> is the input line; and the parameter C<$line_num> is
|
---|
568 | the line number of the corresponding text line.
|
---|
569 |
|
---|
570 | The value returned should correspond to the new text to use in its
|
---|
571 | place. If the empty string or an undefined value is returned then no
|
---|
572 | further processing will be performed for this line.
|
---|
573 |
|
---|
574 | Please note that the B<preprocess_line()> method is invoked I<before>
|
---|
575 | the B<preprocess_paragraph()> method. After all (possibly preprocessed)
|
---|
576 | lines in a paragraph have been assembled together and it has been
|
---|
577 | determined that the paragraph is part of the POD documentation from one
|
---|
578 | of the selected sections, then B<preprocess_paragraph()> is invoked.
|
---|
579 |
|
---|
580 | The base class implementation of this method returns the given text.
|
---|
581 |
|
---|
582 | =cut
|
---|
583 |
|
---|
584 | sub preprocess_line {
|
---|
585 | my ($self, $text, $line_num) = @_;
|
---|
586 | return $text;
|
---|
587 | }
|
---|
588 |
|
---|
589 | ##---------------------------------------------------------------------------
|
---|
590 |
|
---|
591 | =head1 B<preprocess_paragraph()>
|
---|
592 |
|
---|
593 | $textblock = $parser->preprocess_paragraph($text, $line_num);
|
---|
594 |
|
---|
595 | This method should be overridden by subclasses that wish to perform any
|
---|
596 | kind of preprocessing for each block (paragraph) of POD documentation
|
---|
597 | that appears in the input stream. The parameter C<$text> is the POD
|
---|
598 | paragraph from the input file; and the parameter C<$line_num> is the
|
---|
599 | line number for the beginning of the corresponding paragraph.
|
---|
600 |
|
---|
601 | The value returned should correspond to the new text to use in its
|
---|
602 | place If the empty string is returned or an undefined value is
|
---|
603 | returned, then the given C<$text> is ignored (not processed).
|
---|
604 |
|
---|
605 | This method is invoked after gathering up all the lines in a paragraph
|
---|
606 | and after determining the cutting state of the paragraph,
|
---|
607 | but before trying to further parse or interpret them. After
|
---|
608 | B<preprocess_paragraph()> returns, the current cutting state (which
|
---|
609 | is returned by C<$self-E<gt>cutting()>) is examined. If it evaluates
|
---|
610 | to true then input text (including the given C<$text>) is cut (not
|
---|
611 | processed) until the next POD directive is encountered.
|
---|
612 |
|
---|
613 | Please note that the B<preprocess_line()> method is invoked I<before>
|
---|
614 | the B<preprocess_paragraph()> method. After all (possibly preprocessed)
|
---|
615 | lines in a paragraph have been assembled together and either it has been
|
---|
616 | determined that the paragraph is part of the POD documentation from one
|
---|
617 | of the selected sections or the C<-want_nonPODs> option is true,
|
---|
618 | then B<preprocess_paragraph()> is invoked.
|
---|
619 |
|
---|
620 | The base class implementation of this method returns the given text.
|
---|
621 |
|
---|
622 | =cut
|
---|
623 |
|
---|
624 | sub preprocess_paragraph {
|
---|
625 | my ($self, $text, $line_num) = @_;
|
---|
626 | return $text;
|
---|
627 | }
|
---|
628 |
|
---|
629 | #############################################################################
|
---|
630 |
|
---|
631 | =head1 METHODS FOR PARSING AND PROCESSING
|
---|
632 |
|
---|
633 | B<Pod::Parser> provides several methods to process input text. These
|
---|
634 | methods typically won't need to be overridden (and in some cases they
|
---|
635 | can't be overridden), but subclasses may want to invoke them to exploit
|
---|
636 | their functionality.
|
---|
637 |
|
---|
638 | =cut
|
---|
639 |
|
---|
640 | ##---------------------------------------------------------------------------
|
---|
641 |
|
---|
642 | =head1 B<parse_text()>
|
---|
643 |
|
---|
644 | $ptree1 = $parser->parse_text($text, $line_num);
|
---|
645 | $ptree2 = $parser->parse_text({%opts}, $text, $line_num);
|
---|
646 | $ptree3 = $parser->parse_text(\%opts, $text, $line_num);
|
---|
647 |
|
---|
648 | This method is useful if you need to perform your own interpolation
|
---|
649 | of interior sequences and can't rely upon B<interpolate> to expand
|
---|
650 | them in simple bottom-up order.
|
---|
651 |
|
---|
652 | The parameter C<$text> is a string or block of text to be parsed
|
---|
653 | for interior sequences; and the parameter C<$line_num> is the
|
---|
654 | line number curresponding to the beginning of C<$text>.
|
---|
655 |
|
---|
656 | B<parse_text()> will parse the given text into a parse-tree of "nodes."
|
---|
657 | and interior-sequences. Each "node" in the parse tree is either a
|
---|
658 | text-string, or a B<Pod::InteriorSequence>. The result returned is a
|
---|
659 | parse-tree of type B<Pod::ParseTree>. Please see L<Pod::InputObjects>
|
---|
660 | for more information about B<Pod::InteriorSequence> and B<Pod::ParseTree>.
|
---|
661 |
|
---|
662 | If desired, an optional hash-ref may be specified as the first argument
|
---|
663 | to customize certain aspects of the parse-tree that is created and
|
---|
664 | returned. The set of recognized option keywords are:
|
---|
665 |
|
---|
666 | =over 3
|
---|
667 |
|
---|
668 | =item B<-expand_seq> =E<gt> I<code-ref>|I<method-name>
|
---|
669 |
|
---|
670 | Normally, the parse-tree returned by B<parse_text()> will contain an
|
---|
671 | unexpanded C<Pod::InteriorSequence> object for each interior-sequence
|
---|
672 | encountered. Specifying B<-expand_seq> tells B<parse_text()> to "expand"
|
---|
673 | every interior-sequence it sees by invoking the referenced function
|
---|
674 | (or named method of the parser object) and using the return value as the
|
---|
675 | expanded result.
|
---|
676 |
|
---|
677 | If a subroutine reference was given, it is invoked as:
|
---|
678 |
|
---|
679 | &$code_ref( $parser, $sequence )
|
---|
680 |
|
---|
681 | and if a method-name was given, it is invoked as:
|
---|
682 |
|
---|
683 | $parser->method_name( $sequence )
|
---|
684 |
|
---|
685 | where C<$parser> is a reference to the parser object, and C<$sequence>
|
---|
686 | is a reference to the interior-sequence object.
|
---|
687 | [I<NOTE>: If the B<interior_sequence()> method is specified, then it is
|
---|
688 | invoked according to the interface specified in L<"interior_sequence()">].
|
---|
689 |
|
---|
690 | =item B<-expand_text> =E<gt> I<code-ref>|I<method-name>
|
---|
691 |
|
---|
692 | Normally, the parse-tree returned by B<parse_text()> will contain a
|
---|
693 | text-string for each contiguous sequence of characters outside of an
|
---|
694 | interior-sequence. Specifying B<-expand_text> tells B<parse_text()> to
|
---|
695 | "preprocess" every such text-string it sees by invoking the referenced
|
---|
696 | function (or named method of the parser object) and using the return value
|
---|
697 | as the preprocessed (or "expanded") result. [Note that if the result is
|
---|
698 | an interior-sequence, then it will I<not> be expanded as specified by the
|
---|
699 | B<-expand_seq> option; Any such recursive expansion needs to be handled by
|
---|
700 | the specified callback routine.]
|
---|
701 |
|
---|
702 | If a subroutine reference was given, it is invoked as:
|
---|
703 |
|
---|
704 | &$code_ref( $parser, $text, $ptree_node )
|
---|
705 |
|
---|
706 | and if a method-name was given, it is invoked as:
|
---|
707 |
|
---|
708 | $parser->method_name( $text, $ptree_node )
|
---|
709 |
|
---|
710 | where C<$parser> is a reference to the parser object, C<$text> is the
|
---|
711 | text-string encountered, and C<$ptree_node> is a reference to the current
|
---|
712 | node in the parse-tree (usually an interior-sequence object or else the
|
---|
713 | top-level node of the parse-tree).
|
---|
714 |
|
---|
715 | =item B<-expand_ptree> =E<gt> I<code-ref>|I<method-name>
|
---|
716 |
|
---|
717 | Rather than returning a C<Pod::ParseTree>, pass the parse-tree as an
|
---|
718 | argument to the referenced subroutine (or named method of the parser
|
---|
719 | object) and return the result instead of the parse-tree object.
|
---|
720 |
|
---|
721 | If a subroutine reference was given, it is invoked as:
|
---|
722 |
|
---|
723 | &$code_ref( $parser, $ptree )
|
---|
724 |
|
---|
725 | and if a method-name was given, it is invoked as:
|
---|
726 |
|
---|
727 | $parser->method_name( $ptree )
|
---|
728 |
|
---|
729 | where C<$parser> is a reference to the parser object, and C<$ptree>
|
---|
730 | is a reference to the parse-tree object.
|
---|
731 |
|
---|
732 | =back
|
---|
733 |
|
---|
734 | =cut
|
---|
735 |
|
---|
736 | sub parse_text {
|
---|
737 | my $self = shift;
|
---|
738 | local $_ = '';
|
---|
739 |
|
---|
740 | ## Get options and set any defaults
|
---|
741 | my %opts = (ref $_[0]) ? %{ shift() } : ();
|
---|
742 | my $expand_seq = $opts{'-expand_seq'} || undef;
|
---|
743 | my $expand_text = $opts{'-expand_text'} || undef;
|
---|
744 | my $expand_ptree = $opts{'-expand_ptree'} || undef;
|
---|
745 |
|
---|
746 | my $text = shift;
|
---|
747 | my $line = shift;
|
---|
748 | my $file = $self->input_file();
|
---|
749 | my $cmd = "";
|
---|
750 |
|
---|
751 | ## Convert method calls into closures, for our convenience
|
---|
752 | my $xseq_sub = $expand_seq;
|
---|
753 | my $xtext_sub = $expand_text;
|
---|
754 | my $xptree_sub = $expand_ptree;
|
---|
755 | if (defined $expand_seq and $expand_seq eq 'interior_sequence') {
|
---|
756 | ## If 'interior_sequence' is the method to use, we have to pass
|
---|
757 | ## more than just the sequence object, we also need to pass the
|
---|
758 | ## sequence name and text.
|
---|
759 | $xseq_sub = sub {
|
---|
760 | my ($self, $iseq) = @_;
|
---|
761 | my $args = join("", $iseq->parse_tree->children);
|
---|
762 | return $self->interior_sequence($iseq->name, $args, $iseq);
|
---|
763 | };
|
---|
764 | }
|
---|
765 | ref $xseq_sub or $xseq_sub = sub { shift()->$expand_seq(@_) };
|
---|
766 | ref $xtext_sub or $xtext_sub = sub { shift()->$expand_text(@_) };
|
---|
767 | ref $xptree_sub or $xptree_sub = sub { shift()->$expand_ptree(@_) };
|
---|
768 |
|
---|
769 | ## Keep track of the "current" interior sequence, and maintain a stack
|
---|
770 | ## of "in progress" sequences.
|
---|
771 | ##
|
---|
772 | ## NOTE that we push our own "accumulator" at the very beginning of the
|
---|
773 | ## stack. It's really a parse-tree, not a sequence; but it implements
|
---|
774 | ## the methods we need so we can use it to gather-up all the sequences
|
---|
775 | ## and strings we parse. Thus, by the end of our parsing, it should be
|
---|
776 | ## the only thing left on our stack and all we have to do is return it!
|
---|
777 | ##
|
---|
778 | my $seq = Pod::ParseTree->new();
|
---|
779 | my @seq_stack = ($seq);
|
---|
780 | my ($ldelim, $rdelim) = ('', '');
|
---|
781 |
|
---|
782 | ## Iterate over all sequence starts text (NOTE: split with
|
---|
783 | ## capturing parens keeps the delimiters)
|
---|
784 | $_ = $text;
|
---|
785 | my @tokens = split /([A-Z]<(?:<+\s)?)/;
|
---|
786 | while ( @tokens ) {
|
---|
787 | $_ = shift @tokens;
|
---|
788 | ## Look for the beginning of a sequence
|
---|
789 | if ( /^([A-Z])(<(?:<+\s)?)$/ ) {
|
---|
790 | ## Push a new sequence onto the stack of those "in-progress"
|
---|
791 | my $ldelim_orig;
|
---|
792 | ($cmd, $ldelim_orig) = ($1, $2);
|
---|
793 | ($ldelim = $ldelim_orig) =~ s/\s+$//;
|
---|
794 | ($rdelim = $ldelim) =~ tr/</>/;
|
---|
795 | $seq = Pod::InteriorSequence->new(
|
---|
796 | -name => $cmd,
|
---|
797 | -ldelim => $ldelim_orig, -rdelim => $rdelim,
|
---|
798 | -file => $file, -line => $line
|
---|
799 | );
|
---|
800 | (@seq_stack > 1) and $seq->nested($seq_stack[-1]);
|
---|
801 | push @seq_stack, $seq;
|
---|
802 | }
|
---|
803 | ## Look for sequence ending
|
---|
804 | elsif ( @seq_stack > 1 ) {
|
---|
805 | ## Make sure we match the right kind of closing delimiter
|
---|
806 | my ($seq_end, $post_seq) = ("", "");
|
---|
807 | if ( ($ldelim eq '<' and /\A(.*?)(>)/s)
|
---|
808 | or /\A(.*?)(\s+$rdelim)/s )
|
---|
809 | {
|
---|
810 | ## Found end-of-sequence, capture the interior and the
|
---|
811 | ## closing the delimiter, and put the rest back on the
|
---|
812 | ## token-list
|
---|
813 | $post_seq = substr($_, length($1) + length($2));
|
---|
814 | ($_, $seq_end) = ($1, $2);
|
---|
815 | (length $post_seq) and unshift @tokens, $post_seq;
|
---|
816 | }
|
---|
817 | if (length) {
|
---|
818 | ## In the middle of a sequence, append this text to it, and
|
---|
819 | ## dont forget to "expand" it if that's what the caller wanted
|
---|
820 | $seq->append($expand_text ? &$xtext_sub($self,$_,$seq) : $_);
|
---|
821 | $_ .= $seq_end;
|
---|
822 | }
|
---|
823 | if (length $seq_end) {
|
---|
824 | ## End of current sequence, record terminating delimiter
|
---|
825 | $seq->rdelim($seq_end);
|
---|
826 | ## Pop it off the stack of "in progress" sequences
|
---|
827 | pop @seq_stack;
|
---|
828 | ## Append result to its parent in current parse tree
|
---|
829 | $seq_stack[-1]->append($expand_seq ? &$xseq_sub($self,$seq)
|
---|
830 | : $seq);
|
---|
831 | ## Remember the current cmd-name and left-delimiter
|
---|
832 | if(@seq_stack > 1) {
|
---|
833 | $cmd = $seq_stack[-1]->name;
|
---|
834 | $ldelim = $seq_stack[-1]->ldelim;
|
---|
835 | $rdelim = $seq_stack[-1]->rdelim;
|
---|
836 | } else {
|
---|
837 | $cmd = $ldelim = $rdelim = '';
|
---|
838 | }
|
---|
839 | }
|
---|
840 | }
|
---|
841 | elsif (length) {
|
---|
842 | ## In the middle of a sequence, append this text to it, and
|
---|
843 | ## dont forget to "expand" it if that's what the caller wanted
|
---|
844 | $seq->append($expand_text ? &$xtext_sub($self,$_,$seq) : $_);
|
---|
845 | }
|
---|
846 | ## Keep track of line count
|
---|
847 | $line += tr/\n//;
|
---|
848 | ## Remember the "current" sequence
|
---|
849 | $seq = $seq_stack[-1];
|
---|
850 | }
|
---|
851 |
|
---|
852 | ## Handle unterminated sequences
|
---|
853 | my $errorsub = (@seq_stack > 1) ? $self->errorsub() : undef;
|
---|
854 | while (@seq_stack > 1) {
|
---|
855 | ($cmd, $file, $line) = ($seq->name, $seq->file_line);
|
---|
856 | $ldelim = $seq->ldelim;
|
---|
857 | ($rdelim = $ldelim) =~ tr/</>/;
|
---|
858 | $rdelim =~ s/^(\S+)(\s*)$/$2$1/;
|
---|
859 | pop @seq_stack;
|
---|
860 | my $errmsg = "*** ERROR: unterminated ${cmd}${ldelim}...${rdelim}".
|
---|
861 | " at line $line in file $file\n";
|
---|
862 | (ref $errorsub) and &{$errorsub}($errmsg)
|
---|
863 | or (defined $errorsub) and $self->$errorsub($errmsg)
|
---|
864 | or warn($errmsg);
|
---|
865 | $seq_stack[-1]->append($expand_seq ? &$xseq_sub($self,$seq) : $seq);
|
---|
866 | $seq = $seq_stack[-1];
|
---|
867 | }
|
---|
868 |
|
---|
869 | ## Return the resulting parse-tree
|
---|
870 | my $ptree = (pop @seq_stack)->parse_tree;
|
---|
871 | return $expand_ptree ? &$xptree_sub($self, $ptree) : $ptree;
|
---|
872 | }
|
---|
873 |
|
---|
874 | ##---------------------------------------------------------------------------
|
---|
875 |
|
---|
876 | =head1 B<interpolate()>
|
---|
877 |
|
---|
878 | $textblock = $parser->interpolate($text, $line_num);
|
---|
879 |
|
---|
880 | This method translates all text (including any embedded interior sequences)
|
---|
881 | in the given text string C<$text> and returns the interpolated result. The
|
---|
882 | parameter C<$line_num> is the line number corresponding to the beginning
|
---|
883 | of C<$text>.
|
---|
884 |
|
---|
885 | B<interpolate()> merely invokes a private method to recursively expand
|
---|
886 | nested interior sequences in bottom-up order (innermost sequences are
|
---|
887 | expanded first). If there is a need to expand nested sequences in
|
---|
888 | some alternate order, use B<parse_text> instead.
|
---|
889 |
|
---|
890 | =cut
|
---|
891 |
|
---|
892 | sub interpolate {
|
---|
893 | my($self, $text, $line_num) = @_;
|
---|
894 | my %parse_opts = ( -expand_seq => 'interior_sequence' );
|
---|
895 | my $ptree = $self->parse_text( \%parse_opts, $text, $line_num );
|
---|
896 | return join "", $ptree->children();
|
---|
897 | }
|
---|
898 |
|
---|
899 | ##---------------------------------------------------------------------------
|
---|
900 |
|
---|
901 | =begin __PRIVATE__
|
---|
902 |
|
---|
903 | =head1 B<parse_paragraph()>
|
---|
904 |
|
---|
905 | $parser->parse_paragraph($text, $line_num);
|
---|
906 |
|
---|
907 | This method takes the text of a POD paragraph to be processed, along
|
---|
908 | with its corresponding line number, and invokes the appropriate method
|
---|
909 | (one of B<command()>, B<verbatim()>, or B<textblock()>).
|
---|
910 |
|
---|
911 | For performance reasons, this method is invoked directly without any
|
---|
912 | dynamic lookup; Hence subclasses may I<not> override it!
|
---|
913 |
|
---|
914 | =end __PRIVATE__
|
---|
915 |
|
---|
916 | =cut
|
---|
917 |
|
---|
918 | sub parse_paragraph {
|
---|
919 | my ($self, $text, $line_num) = @_;
|
---|
920 | local *myData = $self; ## alias to avoid deref-ing overhead
|
---|
921 | local *myOpts = ($myData{_PARSEOPTS} ||= {}); ## get parse-options
|
---|
922 | local $_;
|
---|
923 |
|
---|
924 | ## See if we want to preprocess nonPOD paragraphs as well as POD ones.
|
---|
925 | my $wantNonPods = $myOpts{'-want_nonPODs'};
|
---|
926 |
|
---|
927 | ## Update cutting status
|
---|
928 | $myData{_CUTTING} = 0 if $text =~ /^={1,2}\S/;
|
---|
929 |
|
---|
930 | ## Perform any desired preprocessing if we wanted it this early
|
---|
931 | $wantNonPods and $text = $self->preprocess_paragraph($text, $line_num);
|
---|
932 |
|
---|
933 | ## Ignore up until next POD directive if we are cutting
|
---|
934 | return if $myData{_CUTTING};
|
---|
935 |
|
---|
936 | ## Now we know this is block of text in a POD section!
|
---|
937 |
|
---|
938 | ##-----------------------------------------------------------------
|
---|
939 | ## This is a hook (hack ;-) for Pod::Select to do its thing without
|
---|
940 | ## having to override methods, but also without Pod::Parser assuming
|
---|
941 | ## $self is an instance of Pod::Select (if the _SELECTED_SECTIONS
|
---|
942 | ## field exists then we assume there is an is_selected() method for
|
---|
943 | ## us to invoke (calling $self->can('is_selected') could verify this
|
---|
944 | ## but that is more overhead than I want to incur)
|
---|
945 | ##-----------------------------------------------------------------
|
---|
946 |
|
---|
947 | ## Ignore this block if it isnt in one of the selected sections
|
---|
948 | if (exists $myData{_SELECTED_SECTIONS}) {
|
---|
949 | $self->is_selected($text) or return ($myData{_CUTTING} = 1);
|
---|
950 | }
|
---|
951 |
|
---|
952 | ## If we havent already, perform any desired preprocessing and
|
---|
953 | ## then re-check the "cutting" state
|
---|
954 | unless ($wantNonPods) {
|
---|
955 | $text = $self->preprocess_paragraph($text, $line_num);
|
---|
956 | return 1 unless ((defined $text) and (length $text));
|
---|
957 | return 1 if ($myData{_CUTTING});
|
---|
958 | }
|
---|
959 |
|
---|
960 | ## Look for one of the three types of paragraphs
|
---|
961 | my ($pfx, $cmd, $arg, $sep) = ('', '', '', '');
|
---|
962 | my $pod_para = undef;
|
---|
963 | if ($text =~ /^(={1,2})(?=\S)/) {
|
---|
964 | ## Looks like a command paragraph. Capture the command prefix used
|
---|
965 | ## ("=" or "=="), as well as the command-name, its paragraph text,
|
---|
966 | ## and whatever sequence of characters was used to separate them
|
---|
967 | $pfx = $1;
|
---|
968 | $_ = substr($text, length $pfx);
|
---|
969 | ($cmd, $sep, $text) = split /(\s+)/, $_, 2;
|
---|
970 | ## If this is a "cut" directive then we dont need to do anything
|
---|
971 | ## except return to "cutting" mode.
|
---|
972 | if ($cmd eq 'cut') {
|
---|
973 | $myData{_CUTTING} = 1;
|
---|
974 | return unless $myOpts{'-process_cut_cmd'};
|
---|
975 | }
|
---|
976 | }
|
---|
977 | ## Save the attributes indicating how the command was specified.
|
---|
978 | $pod_para = new Pod::Paragraph(
|
---|
979 | -name => $cmd,
|
---|
980 | -text => $text,
|
---|
981 | -prefix => $pfx,
|
---|
982 | -separator => $sep,
|
---|
983 | -file => $myData{_INFILE},
|
---|
984 | -line => $line_num
|
---|
985 | );
|
---|
986 | # ## Invoke appropriate callbacks
|
---|
987 | # if (exists $myData{_CALLBACKS}) {
|
---|
988 | # ## Look through the callback list, invoke callbacks,
|
---|
989 | # ## then see if we need to do the default actions
|
---|
990 | # ## (invoke_callbacks will return true if we do).
|
---|
991 | # return 1 unless $self->invoke_callbacks($cmd, $text, $line_num, $pod_para);
|
---|
992 | # }
|
---|
993 | if (length $cmd) {
|
---|
994 | ## A command paragraph
|
---|
995 | $self->command($cmd, $text, $line_num, $pod_para);
|
---|
996 | }
|
---|
997 | elsif ($text =~ /^\s+/) {
|
---|
998 | ## Indented text - must be a verbatim paragraph
|
---|
999 | $self->verbatim($text, $line_num, $pod_para);
|
---|
1000 | }
|
---|
1001 | else {
|
---|
1002 | ## Looks like an ordinary block of text
|
---|
1003 | $self->textblock($text, $line_num, $pod_para);
|
---|
1004 | }
|
---|
1005 | return 1;
|
---|
1006 | }
|
---|
1007 |
|
---|
1008 | ##---------------------------------------------------------------------------
|
---|
1009 |
|
---|
1010 | =head1 B<parse_from_filehandle()>
|
---|
1011 |
|
---|
1012 | $parser->parse_from_filehandle($in_fh,$out_fh);
|
---|
1013 |
|
---|
1014 | This method takes an input filehandle (which is assumed to already be
|
---|
1015 | opened for reading) and reads the entire input stream looking for blocks
|
---|
1016 | (paragraphs) of POD documentation to be processed. If no first argument
|
---|
1017 | is given the default input filehandle C<STDIN> is used.
|
---|
1018 |
|
---|
1019 | The C<$in_fh> parameter may be any object that provides a B<getline()>
|
---|
1020 | method to retrieve a single line of input text (hence, an appropriate
|
---|
1021 | wrapper object could be used to parse PODs from a single string or an
|
---|
1022 | array of strings).
|
---|
1023 |
|
---|
1024 | Using C<$in_fh-E<gt>getline()>, input is read line-by-line and assembled
|
---|
1025 | into paragraphs or "blocks" (which are separated by lines containing
|
---|
1026 | nothing but whitespace). For each block of POD documentation
|
---|
1027 | encountered it will invoke a method to parse the given paragraph.
|
---|
1028 |
|
---|
1029 | If a second argument is given then it should correspond to a filehandle where
|
---|
1030 | output should be sent (otherwise the default output filehandle is
|
---|
1031 | C<STDOUT> if no output filehandle is currently in use).
|
---|
1032 |
|
---|
1033 | B<NOTE:> For performance reasons, this method caches the input stream at
|
---|
1034 | the top of the stack in a local variable. Any attempts by clients to
|
---|
1035 | change the stack contents during processing when in the midst executing
|
---|
1036 | of this method I<will not affect> the input stream used by the current
|
---|
1037 | invocation of this method.
|
---|
1038 |
|
---|
1039 | This method does I<not> usually need to be overridden by subclasses.
|
---|
1040 |
|
---|
1041 | =cut
|
---|
1042 |
|
---|
1043 | sub parse_from_filehandle {
|
---|
1044 | my $self = shift;
|
---|
1045 | my %opts = (ref $_[0] eq 'HASH') ? %{ shift() } : ();
|
---|
1046 | my ($in_fh, $out_fh) = @_;
|
---|
1047 | $in_fh = \*STDIN unless ($in_fh);
|
---|
1048 | local *myData = $self; ## alias to avoid deref-ing overhead
|
---|
1049 | local *myOpts = ($myData{_PARSEOPTS} ||= {}); ## get parse-options
|
---|
1050 | local $_;
|
---|
1051 |
|
---|
1052 | ## Put this stream at the top of the stack and do beginning-of-input
|
---|
1053 | ## processing. NOTE that $in_fh might be reset during this process.
|
---|
1054 | my $topstream = $self->_push_input_stream($in_fh, $out_fh);
|
---|
1055 | (exists $opts{-cutting}) and $self->cutting( $opts{-cutting} );
|
---|
1056 |
|
---|
1057 | ## Initialize line/paragraph
|
---|
1058 | my ($textline, $paragraph) = ('', '');
|
---|
1059 | my ($nlines, $plines) = (0, 0);
|
---|
1060 |
|
---|
1061 | ## Use <$fh> instead of $fh->getline where possible (for speed)
|
---|
1062 | $_ = ref $in_fh;
|
---|
1063 | my $tied_fh = (/^(?:GLOB|FileHandle|IO::\w+)$/ or tied $in_fh);
|
---|
1064 |
|
---|
1065 | ## Read paragraphs line-by-line
|
---|
1066 | while (defined ($textline = $tied_fh ? <$in_fh> : $in_fh->getline)) {
|
---|
1067 | $textline = $self->preprocess_line($textline, ++$nlines);
|
---|
1068 | next unless ((defined $textline) && (length $textline));
|
---|
1069 |
|
---|
1070 | if ((! length $paragraph) && ($textline =~ /^==/)) {
|
---|
1071 | ## '==' denotes a one-line command paragraph
|
---|
1072 | $paragraph = $textline;
|
---|
1073 | $plines = 1;
|
---|
1074 | $textline = '';
|
---|
1075 | } else {
|
---|
1076 | ## Append this line to the current paragraph
|
---|
1077 | $paragraph .= $textline;
|
---|
1078 | ++$plines;
|
---|
1079 | }
|
---|
1080 |
|
---|
1081 | ## See if this line is blank and ends the current paragraph.
|
---|
1082 | ## If it isnt, then keep iterating until it is.
|
---|
1083 | next unless (($textline =~ /^([^\S\r\n]*)[\r\n]*$/)
|
---|
1084 | && (length $paragraph));
|
---|
1085 |
|
---|
1086 | ## Issue a warning about any non-empty blank lines
|
---|
1087 | if (length($1) > 0 and $myOpts{'-warnings'} and ! $myData{_CUTTING}) {
|
---|
1088 | my $errorsub = $self->errorsub();
|
---|
1089 | my $file = $self->input_file();
|
---|
1090 | my $errmsg = "*** WARNING: line containing nothing but whitespace".
|
---|
1091 | " in paragraph at line $nlines in file $file\n";
|
---|
1092 | (ref $errorsub) and &{$errorsub}($errmsg)
|
---|
1093 | or (defined $errorsub) and $self->$errorsub($errmsg)
|
---|
1094 | or warn($errmsg);
|
---|
1095 | }
|
---|
1096 |
|
---|
1097 | ## Now process the paragraph
|
---|
1098 | parse_paragraph($self, $paragraph, ($nlines - $plines) + 1);
|
---|
1099 | $paragraph = '';
|
---|
1100 | $plines = 0;
|
---|
1101 | }
|
---|
1102 | ## Dont forget about the last paragraph in the file
|
---|
1103 | if (length $paragraph) {
|
---|
1104 | parse_paragraph($self, $paragraph, ($nlines - $plines) + 1)
|
---|
1105 | }
|
---|
1106 |
|
---|
1107 | ## Now pop the input stream off the top of the input stack.
|
---|
1108 | $self->_pop_input_stream();
|
---|
1109 | }
|
---|
1110 |
|
---|
1111 | ##---------------------------------------------------------------------------
|
---|
1112 |
|
---|
1113 | =head1 B<parse_from_file()>
|
---|
1114 |
|
---|
1115 | $parser->parse_from_file($filename,$outfile);
|
---|
1116 |
|
---|
1117 | This method takes a filename and does the following:
|
---|
1118 |
|
---|
1119 | =over 2
|
---|
1120 |
|
---|
1121 | =item *
|
---|
1122 |
|
---|
1123 | opens the input and output files for reading
|
---|
1124 | (creating the appropriate filehandles)
|
---|
1125 |
|
---|
1126 | =item *
|
---|
1127 |
|
---|
1128 | invokes the B<parse_from_filehandle()> method passing it the
|
---|
1129 | corresponding input and output filehandles.
|
---|
1130 |
|
---|
1131 | =item *
|
---|
1132 |
|
---|
1133 | closes the input and output files.
|
---|
1134 |
|
---|
1135 | =back
|
---|
1136 |
|
---|
1137 | If the special input filename "-" or "<&STDIN" is given then the STDIN
|
---|
1138 | filehandle is used for input (and no open or close is performed). If no
|
---|
1139 | input filename is specified then "-" is implied.
|
---|
1140 |
|
---|
1141 | If a second argument is given then it should be the name of the desired
|
---|
1142 | output file. If the special output filename "-" or ">&STDOUT" is given
|
---|
1143 | then the STDOUT filehandle is used for output (and no open or close is
|
---|
1144 | performed). If the special output filename ">&STDERR" is given then the
|
---|
1145 | STDERR filehandle is used for output (and no open or close is
|
---|
1146 | performed). If no output filehandle is currently in use and no output
|
---|
1147 | filename is specified, then "-" is implied.
|
---|
1148 | Alternatively, an L<IO::String> object is also accepted as an output
|
---|
1149 | file handle.
|
---|
1150 |
|
---|
1151 | This method does I<not> usually need to be overridden by subclasses.
|
---|
1152 |
|
---|
1153 | =cut
|
---|
1154 |
|
---|
1155 | sub parse_from_file {
|
---|
1156 | my $self = shift;
|
---|
1157 | my %opts = (ref $_[0] eq 'HASH') ? %{ shift() } : ();
|
---|
1158 | my ($infile, $outfile) = @_;
|
---|
1159 | my ($in_fh, $out_fh) = (gensym(), gensym()) if ($] < 5.006);
|
---|
1160 | my ($close_input, $close_output) = (0, 0);
|
---|
1161 | local *myData = $self;
|
---|
1162 | local *_;
|
---|
1163 |
|
---|
1164 | ## Is $infile a filename or a (possibly implied) filehandle
|
---|
1165 | if (defined $infile && ref $infile) {
|
---|
1166 | if (ref($infile) =~ /^(SCALAR|ARRAY|HASH|CODE|REF)$/) {
|
---|
1167 | croak "Input from $1 reference not supported!\n";
|
---|
1168 | }
|
---|
1169 | ## Must be a filehandle-ref (or else assume its a ref to an object
|
---|
1170 | ## that supports the common IO read operations).
|
---|
1171 | $myData{_INFILE} = ${$infile};
|
---|
1172 | $in_fh = $infile;
|
---|
1173 | }
|
---|
1174 | elsif (!defined($infile) || !length($infile) || ($infile eq '-')
|
---|
1175 | || ($infile =~ /^<&(?:STDIN|0)$/i))
|
---|
1176 | {
|
---|
1177 | ## Not a filename, just a string implying STDIN
|
---|
1178 | $infile ||= '-';
|
---|
1179 | $myData{_INFILE} = "<standard input>";
|
---|
1180 | $in_fh = \*STDIN;
|
---|
1181 | }
|
---|
1182 | else {
|
---|
1183 | ## We have a filename, open it for reading
|
---|
1184 | $myData{_INFILE} = $infile;
|
---|
1185 | open($in_fh, "< $infile") or
|
---|
1186 | croak "Can't open $infile for reading: $!\n";
|
---|
1187 | $close_input = 1;
|
---|
1188 | }
|
---|
1189 |
|
---|
1190 | ## NOTE: we need to be *very* careful when "defaulting" the output
|
---|
1191 | ## file. We only want to use a default if this is the beginning of
|
---|
1192 | ## the entire document (but *not* if this is an included file). We
|
---|
1193 | ## determine this by seeing if the input stream stack has been set-up
|
---|
1194 | ## already
|
---|
1195 |
|
---|
1196 | ## Is $outfile a filename, a (possibly implied) filehandle, maybe a ref?
|
---|
1197 | if (ref $outfile) {
|
---|
1198 | ## we need to check for ref() first, as other checks involve reading
|
---|
1199 | if (ref($outfile) =~ /^(ARRAY|HASH|CODE)$/) {
|
---|
1200 | croak "Output to $1 reference not supported!\n";
|
---|
1201 | }
|
---|
1202 | elsif (ref($outfile) eq 'SCALAR') {
|
---|
1203 | # # NOTE: IO::String isn't a part of the perl distribution,
|
---|
1204 | # # so probably we shouldn't support this case...
|
---|
1205 | # require IO::String;
|
---|
1206 | # $myData{_OUTFILE} = "$outfile";
|
---|
1207 | # $out_fh = IO::String->new($outfile);
|
---|
1208 | croak "Output to SCALAR reference not supported!\n";
|
---|
1209 | }
|
---|
1210 | else {
|
---|
1211 | ## Must be a filehandle-ref (or else assume its a ref to an
|
---|
1212 | ## object that supports the common IO write operations).
|
---|
1213 | $myData{_OUTFILE} = ${$outfile};
|
---|
1214 | $out_fh = $outfile;
|
---|
1215 | }
|
---|
1216 | }
|
---|
1217 | elsif (!defined($outfile) || !length($outfile) || ($outfile eq '-')
|
---|
1218 | || ($outfile =~ /^>&?(?:STDOUT|1)$/i))
|
---|
1219 | {
|
---|
1220 | if (defined $myData{_TOP_STREAM}) {
|
---|
1221 | $out_fh = $myData{_OUTPUT};
|
---|
1222 | }
|
---|
1223 | else {
|
---|
1224 | ## Not a filename, just a string implying STDOUT
|
---|
1225 | $outfile ||= '-';
|
---|
1226 | $myData{_OUTFILE} = "<standard output>";
|
---|
1227 | $out_fh = \*STDOUT;
|
---|
1228 | }
|
---|
1229 | }
|
---|
1230 | elsif ($outfile =~ /^>&(STDERR|2)$/i) {
|
---|
1231 | ## Not a filename, just a string implying STDERR
|
---|
1232 | $myData{_OUTFILE} = "<standard error>";
|
---|
1233 | $out_fh = \*STDERR;
|
---|
1234 | }
|
---|
1235 | else {
|
---|
1236 | ## We have a filename, open it for writing
|
---|
1237 | $myData{_OUTFILE} = $outfile;
|
---|
1238 | (-d $outfile) and croak "$outfile is a directory, not POD input!\n";
|
---|
1239 | open($out_fh, "> $outfile") or
|
---|
1240 | croak "Can't open $outfile for writing: $!\n";
|
---|
1241 | $close_output = 1;
|
---|
1242 | }
|
---|
1243 |
|
---|
1244 | ## Whew! That was a lot of work to set up reasonably/robust behavior
|
---|
1245 | ## in the case of a non-filename for reading and writing. Now we just
|
---|
1246 | ## have to parse the input and close the handles when we're finished.
|
---|
1247 | $self->parse_from_filehandle(\%opts, $in_fh, $out_fh);
|
---|
1248 |
|
---|
1249 | $close_input and
|
---|
1250 | close($in_fh) || croak "Can't close $infile after reading: $!\n";
|
---|
1251 | $close_output and
|
---|
1252 | close($out_fh) || croak "Can't close $outfile after writing: $!\n";
|
---|
1253 | }
|
---|
1254 |
|
---|
1255 | #############################################################################
|
---|
1256 |
|
---|
1257 | =head1 ACCESSOR METHODS
|
---|
1258 |
|
---|
1259 | Clients of B<Pod::Parser> should use the following methods to access
|
---|
1260 | instance data fields:
|
---|
1261 |
|
---|
1262 | =cut
|
---|
1263 |
|
---|
1264 | ##---------------------------------------------------------------------------
|
---|
1265 |
|
---|
1266 | =head1 B<errorsub()>
|
---|
1267 |
|
---|
1268 | $parser->errorsub("method_name");
|
---|
1269 | $parser->errorsub(\&warn_user);
|
---|
1270 | $parser->errorsub(sub { print STDERR, @_ });
|
---|
1271 |
|
---|
1272 | Specifies the method or subroutine to use when printing error messages
|
---|
1273 | about POD syntax. The supplied method/subroutine I<must> return TRUE upon
|
---|
1274 | successful printing of the message. If C<undef> is given, then the B<warn>
|
---|
1275 | builtin is used to issue error messages (this is the default behavior).
|
---|
1276 |
|
---|
1277 | my $errorsub = $parser->errorsub()
|
---|
1278 | my $errmsg = "This is an error message!\n"
|
---|
1279 | (ref $errorsub) and &{$errorsub}($errmsg)
|
---|
1280 | or (defined $errorsub) and $parser->$errorsub($errmsg)
|
---|
1281 | or warn($errmsg);
|
---|
1282 |
|
---|
1283 | Returns a method name, or else a reference to the user-supplied subroutine
|
---|
1284 | used to print error messages. Returns C<undef> if the B<warn> builtin
|
---|
1285 | is used to issue error messages (this is the default behavior).
|
---|
1286 |
|
---|
1287 | =cut
|
---|
1288 |
|
---|
1289 | sub errorsub {
|
---|
1290 | return (@_ > 1) ? ($_[0]->{_ERRORSUB} = $_[1]) : $_[0]->{_ERRORSUB};
|
---|
1291 | }
|
---|
1292 |
|
---|
1293 | ##---------------------------------------------------------------------------
|
---|
1294 |
|
---|
1295 | =head1 B<cutting()>
|
---|
1296 |
|
---|
1297 | $boolean = $parser->cutting();
|
---|
1298 |
|
---|
1299 | Returns the current C<cutting> state: a boolean-valued scalar which
|
---|
1300 | evaluates to true if text from the input file is currently being "cut"
|
---|
1301 | (meaning it is I<not> considered part of the POD document).
|
---|
1302 |
|
---|
1303 | $parser->cutting($boolean);
|
---|
1304 |
|
---|
1305 | Sets the current C<cutting> state to the given value and returns the
|
---|
1306 | result.
|
---|
1307 |
|
---|
1308 | =cut
|
---|
1309 |
|
---|
1310 | sub cutting {
|
---|
1311 | return (@_ > 1) ? ($_[0]->{_CUTTING} = $_[1]) : $_[0]->{_CUTTING};
|
---|
1312 | }
|
---|
1313 |
|
---|
1314 | ##---------------------------------------------------------------------------
|
---|
1315 |
|
---|
1316 | ##---------------------------------------------------------------------------
|
---|
1317 |
|
---|
1318 | =head1 B<parseopts()>
|
---|
1319 |
|
---|
1320 | When invoked with no additional arguments, B<parseopts> returns a hashtable
|
---|
1321 | of all the current parsing options.
|
---|
1322 |
|
---|
1323 | ## See if we are parsing non-POD sections as well as POD ones
|
---|
1324 | my %opts = $parser->parseopts();
|
---|
1325 | $opts{'-want_nonPODs}' and print "-want_nonPODs\n";
|
---|
1326 |
|
---|
1327 | When invoked using a single string, B<parseopts> treats the string as the
|
---|
1328 | name of a parse-option and returns its corresponding value if it exists
|
---|
1329 | (returns C<undef> if it doesn't).
|
---|
1330 |
|
---|
1331 | ## Did we ask to see '=cut' paragraphs?
|
---|
1332 | my $want_cut = $parser->parseopts('-process_cut_cmd');
|
---|
1333 | $want_cut and print "-process_cut_cmd\n";
|
---|
1334 |
|
---|
1335 | When invoked with multiple arguments, B<parseopts> treats them as
|
---|
1336 | key/value pairs and the specified parse-option names are set to the
|
---|
1337 | given values. Any unspecified parse-options are unaffected.
|
---|
1338 |
|
---|
1339 | ## Set them back to the default
|
---|
1340 | $parser->parseopts(-warnings => 0);
|
---|
1341 |
|
---|
1342 | When passed a single hash-ref, B<parseopts> uses that hash to completely
|
---|
1343 | reset the existing parse-options, all previous parse-option values
|
---|
1344 | are lost.
|
---|
1345 |
|
---|
1346 | ## Reset all options to default
|
---|
1347 | $parser->parseopts( { } );
|
---|
1348 |
|
---|
1349 | See L<"PARSING OPTIONS"> for more information on the name and meaning of each
|
---|
1350 | parse-option currently recognized.
|
---|
1351 |
|
---|
1352 | =cut
|
---|
1353 |
|
---|
1354 | sub parseopts {
|
---|
1355 | local *myData = shift;
|
---|
1356 | local *myOpts = ($myData{_PARSEOPTS} ||= {});
|
---|
1357 | return %myOpts if (@_ == 0);
|
---|
1358 | if (@_ == 1) {
|
---|
1359 | local $_ = shift;
|
---|
1360 | return ref($_) ? $myData{_PARSEOPTS} = $_ : $myOpts{$_};
|
---|
1361 | }
|
---|
1362 | my @newOpts = (%myOpts, @_);
|
---|
1363 | $myData{_PARSEOPTS} = { @newOpts };
|
---|
1364 | }
|
---|
1365 |
|
---|
1366 | ##---------------------------------------------------------------------------
|
---|
1367 |
|
---|
1368 | =head1 B<output_file()>
|
---|
1369 |
|
---|
1370 | $fname = $parser->output_file();
|
---|
1371 |
|
---|
1372 | Returns the name of the output file being written.
|
---|
1373 |
|
---|
1374 | =cut
|
---|
1375 |
|
---|
1376 | sub output_file {
|
---|
1377 | return $_[0]->{_OUTFILE};
|
---|
1378 | }
|
---|
1379 |
|
---|
1380 | ##---------------------------------------------------------------------------
|
---|
1381 |
|
---|
1382 | =head1 B<output_handle()>
|
---|
1383 |
|
---|
1384 | $fhandle = $parser->output_handle();
|
---|
1385 |
|
---|
1386 | Returns the output filehandle object.
|
---|
1387 |
|
---|
1388 | =cut
|
---|
1389 |
|
---|
1390 | sub output_handle {
|
---|
1391 | return $_[0]->{_OUTPUT};
|
---|
1392 | }
|
---|
1393 |
|
---|
1394 | ##---------------------------------------------------------------------------
|
---|
1395 |
|
---|
1396 | =head1 B<input_file()>
|
---|
1397 |
|
---|
1398 | $fname = $parser->input_file();
|
---|
1399 |
|
---|
1400 | Returns the name of the input file being read.
|
---|
1401 |
|
---|
1402 | =cut
|
---|
1403 |
|
---|
1404 | sub input_file {
|
---|
1405 | return $_[0]->{_INFILE};
|
---|
1406 | }
|
---|
1407 |
|
---|
1408 | ##---------------------------------------------------------------------------
|
---|
1409 |
|
---|
1410 | =head1 B<input_handle()>
|
---|
1411 |
|
---|
1412 | $fhandle = $parser->input_handle();
|
---|
1413 |
|
---|
1414 | Returns the current input filehandle object.
|
---|
1415 |
|
---|
1416 | =cut
|
---|
1417 |
|
---|
1418 | sub input_handle {
|
---|
1419 | return $_[0]->{_INPUT};
|
---|
1420 | }
|
---|
1421 |
|
---|
1422 | ##---------------------------------------------------------------------------
|
---|
1423 |
|
---|
1424 | =begin __PRIVATE__
|
---|
1425 |
|
---|
1426 | =head1 B<input_streams()>
|
---|
1427 |
|
---|
1428 | $listref = $parser->input_streams();
|
---|
1429 |
|
---|
1430 | Returns a reference to an array which corresponds to the stack of all
|
---|
1431 | the input streams that are currently in the middle of being parsed.
|
---|
1432 |
|
---|
1433 | While parsing an input stream, it is possible to invoke
|
---|
1434 | B<parse_from_file()> or B<parse_from_filehandle()> to parse a new input
|
---|
1435 | stream and then return to parsing the previous input stream. Each input
|
---|
1436 | stream to be parsed is pushed onto the end of this input stack
|
---|
1437 | before any of its input is read. The input stream that is currently
|
---|
1438 | being parsed is always at the end (or top) of the input stack. When an
|
---|
1439 | input stream has been exhausted, it is popped off the end of the
|
---|
1440 | input stack.
|
---|
1441 |
|
---|
1442 | Each element on this input stack is a reference to C<Pod::InputSource>
|
---|
1443 | object. Please see L<Pod::InputObjects> for more details.
|
---|
1444 |
|
---|
1445 | This method might be invoked when printing diagnostic messages, for example,
|
---|
1446 | to obtain the name and line number of the all input files that are currently
|
---|
1447 | being processed.
|
---|
1448 |
|
---|
1449 | =end __PRIVATE__
|
---|
1450 |
|
---|
1451 | =cut
|
---|
1452 |
|
---|
1453 | sub input_streams {
|
---|
1454 | return $_[0]->{_INPUT_STREAMS};
|
---|
1455 | }
|
---|
1456 |
|
---|
1457 | ##---------------------------------------------------------------------------
|
---|
1458 |
|
---|
1459 | =begin __PRIVATE__
|
---|
1460 |
|
---|
1461 | =head1 B<top_stream()>
|
---|
1462 |
|
---|
1463 | $hashref = $parser->top_stream();
|
---|
1464 |
|
---|
1465 | Returns a reference to the hash-table that represents the element
|
---|
1466 | that is currently at the top (end) of the input stream stack
|
---|
1467 | (see L<"input_streams()">). The return value will be the C<undef>
|
---|
1468 | if the input stack is empty.
|
---|
1469 |
|
---|
1470 | This method might be used when printing diagnostic messages, for example,
|
---|
1471 | to obtain the name and line number of the current input file.
|
---|
1472 |
|
---|
1473 | =end __PRIVATE__
|
---|
1474 |
|
---|
1475 | =cut
|
---|
1476 |
|
---|
1477 | sub top_stream {
|
---|
1478 | return $_[0]->{_TOP_STREAM} || undef;
|
---|
1479 | }
|
---|
1480 |
|
---|
1481 | #############################################################################
|
---|
1482 |
|
---|
1483 | =head1 PRIVATE METHODS AND DATA
|
---|
1484 |
|
---|
1485 | B<Pod::Parser> makes use of several internal methods and data fields
|
---|
1486 | which clients should not need to see or use. For the sake of avoiding
|
---|
1487 | name collisions for client data and methods, these methods and fields
|
---|
1488 | are briefly discussed here. Determined hackers may obtain further
|
---|
1489 | information about them by reading the B<Pod::Parser> source code.
|
---|
1490 |
|
---|
1491 | Private data fields are stored in the hash-object whose reference is
|
---|
1492 | returned by the B<new()> constructor for this class. The names of all
|
---|
1493 | private methods and data-fields used by B<Pod::Parser> begin with a
|
---|
1494 | prefix of "_" and match the regular expression C</^_\w+$/>.
|
---|
1495 |
|
---|
1496 | =cut
|
---|
1497 |
|
---|
1498 | ##---------------------------------------------------------------------------
|
---|
1499 |
|
---|
1500 | =begin _PRIVATE_
|
---|
1501 |
|
---|
1502 | =head1 B<_push_input_stream()>
|
---|
1503 |
|
---|
1504 | $hashref = $parser->_push_input_stream($in_fh,$out_fh);
|
---|
1505 |
|
---|
1506 | This method will push the given input stream on the input stack and
|
---|
1507 | perform any necessary beginning-of-document or beginning-of-file
|
---|
1508 | processing. The argument C<$in_fh> is the input stream filehandle to
|
---|
1509 | push, and C<$out_fh> is the corresponding output filehandle to use (if
|
---|
1510 | it is not given or is undefined, then the current output stream is used,
|
---|
1511 | which defaults to standard output if it doesnt exist yet).
|
---|
1512 |
|
---|
1513 | The value returned will be reference to the hash-table that represents
|
---|
1514 | the new top of the input stream stack. I<Please Note> that it is
|
---|
1515 | possible for this method to use default values for the input and output
|
---|
1516 | file handles. If this happens, you will need to look at the C<INPUT>
|
---|
1517 | and C<OUTPUT> instance data members to determine their new values.
|
---|
1518 |
|
---|
1519 | =end _PRIVATE_
|
---|
1520 |
|
---|
1521 | =cut
|
---|
1522 |
|
---|
1523 | sub _push_input_stream {
|
---|
1524 | my ($self, $in_fh, $out_fh) = @_;
|
---|
1525 | local *myData = $self;
|
---|
1526 |
|
---|
1527 | ## Initialize stuff for the entire document if this is *not*
|
---|
1528 | ## an included file.
|
---|
1529 | ##
|
---|
1530 | ## NOTE: we need to be *very* careful when "defaulting" the output
|
---|
1531 | ## filehandle. We only want to use a default value if this is the
|
---|
1532 | ## beginning of the entire document (but *not* if this is an included
|
---|
1533 | ## file).
|
---|
1534 | unless (defined $myData{_TOP_STREAM}) {
|
---|
1535 | $out_fh = \*STDOUT unless (defined $out_fh);
|
---|
1536 | $myData{_CUTTING} = 1; ## current "cutting" state
|
---|
1537 | $myData{_INPUT_STREAMS} = []; ## stack of all input streams
|
---|
1538 | }
|
---|
1539 |
|
---|
1540 | ## Initialize input indicators
|
---|
1541 | $myData{_OUTFILE} = '(unknown)' unless (defined $myData{_OUTFILE});
|
---|
1542 | $myData{_OUTPUT} = $out_fh if (defined $out_fh);
|
---|
1543 | $in_fh = \*STDIN unless (defined $in_fh);
|
---|
1544 | $myData{_INFILE} = '(unknown)' unless (defined $myData{_INFILE});
|
---|
1545 | $myData{_INPUT} = $in_fh;
|
---|
1546 | my $input_top = $myData{_TOP_STREAM}
|
---|
1547 | = new Pod::InputSource(
|
---|
1548 | -name => $myData{_INFILE},
|
---|
1549 | -handle => $in_fh,
|
---|
1550 | -was_cutting => $myData{_CUTTING}
|
---|
1551 | );
|
---|
1552 | local *input_stack = $myData{_INPUT_STREAMS};
|
---|
1553 | push(@input_stack, $input_top);
|
---|
1554 |
|
---|
1555 | ## Perform beginning-of-document and/or beginning-of-input processing
|
---|
1556 | $self->begin_pod() if (@input_stack == 1);
|
---|
1557 | $self->begin_input();
|
---|
1558 |
|
---|
1559 | return $input_top;
|
---|
1560 | }
|
---|
1561 |
|
---|
1562 | ##---------------------------------------------------------------------------
|
---|
1563 |
|
---|
1564 | =begin _PRIVATE_
|
---|
1565 |
|
---|
1566 | =head1 B<_pop_input_stream()>
|
---|
1567 |
|
---|
1568 | $hashref = $parser->_pop_input_stream();
|
---|
1569 |
|
---|
1570 | This takes no arguments. It will perform any necessary end-of-file or
|
---|
1571 | end-of-document processing and then pop the current input stream from
|
---|
1572 | the top of the input stack.
|
---|
1573 |
|
---|
1574 | The value returned will be reference to the hash-table that represents
|
---|
1575 | the new top of the input stream stack.
|
---|
1576 |
|
---|
1577 | =end _PRIVATE_
|
---|
1578 |
|
---|
1579 | =cut
|
---|
1580 |
|
---|
1581 | sub _pop_input_stream {
|
---|
1582 | my ($self) = @_;
|
---|
1583 | local *myData = $self;
|
---|
1584 | local *input_stack = $myData{_INPUT_STREAMS};
|
---|
1585 |
|
---|
1586 | ## Perform end-of-input and/or end-of-document processing
|
---|
1587 | $self->end_input() if (@input_stack > 0);
|
---|
1588 | $self->end_pod() if (@input_stack == 1);
|
---|
1589 |
|
---|
1590 | ## Restore cutting state to whatever it was before we started
|
---|
1591 | ## parsing this file.
|
---|
1592 | my $old_top = pop(@input_stack);
|
---|
1593 | $myData{_CUTTING} = $old_top->was_cutting();
|
---|
1594 |
|
---|
1595 | ## Dont forget to reset the input indicators
|
---|
1596 | my $input_top = undef;
|
---|
1597 | if (@input_stack > 0) {
|
---|
1598 | $input_top = $myData{_TOP_STREAM} = $input_stack[-1];
|
---|
1599 | $myData{_INFILE} = $input_top->name();
|
---|
1600 | $myData{_INPUT} = $input_top->handle();
|
---|
1601 | } else {
|
---|
1602 | delete $myData{_TOP_STREAM};
|
---|
1603 | delete $myData{_INPUT_STREAMS};
|
---|
1604 | }
|
---|
1605 |
|
---|
1606 | return $input_top;
|
---|
1607 | }
|
---|
1608 |
|
---|
1609 | #############################################################################
|
---|
1610 |
|
---|
1611 | =head1 TREE-BASED PARSING
|
---|
1612 |
|
---|
1613 | If straightforward stream-based parsing wont meet your needs (as is
|
---|
1614 | likely the case for tasks such as translating PODs into structured
|
---|
1615 | markup languages like HTML and XML) then you may need to take the
|
---|
1616 | tree-based approach. Rather than doing everything in one pass and
|
---|
1617 | calling the B<interpolate()> method to expand sequences into text, it
|
---|
1618 | may be desirable to instead create a parse-tree using the B<parse_text()>
|
---|
1619 | method to return a tree-like structure which may contain an ordered
|
---|
1620 | list of children (each of which may be a text-string, or a similar
|
---|
1621 | tree-like structure).
|
---|
1622 |
|
---|
1623 | Pay special attention to L<"METHODS FOR PARSING AND PROCESSING"> and
|
---|
1624 | to the objects described in L<Pod::InputObjects>. The former describes
|
---|
1625 | the gory details and parameters for how to customize and extend the
|
---|
1626 | parsing behavior of B<Pod::Parser>. B<Pod::InputObjects> provides
|
---|
1627 | several objects that may all be used interchangeably as parse-trees. The
|
---|
1628 | most obvious one is the B<Pod::ParseTree> object. It defines the basic
|
---|
1629 | interface and functionality that all things trying to be a POD parse-tree
|
---|
1630 | should do. A B<Pod::ParseTree> is defined such that each "node" may be a
|
---|
1631 | text-string, or a reference to another parse-tree. Each B<Pod::Paragraph>
|
---|
1632 | object and each B<Pod::InteriorSequence> object also supports the basic
|
---|
1633 | parse-tree interface.
|
---|
1634 |
|
---|
1635 | The B<parse_text()> method takes a given paragraph of text, and
|
---|
1636 | returns a parse-tree that contains one or more children, each of which
|
---|
1637 | may be a text-string, or an InteriorSequence object. There are also
|
---|
1638 | callback-options that may be passed to B<parse_text()> to customize
|
---|
1639 | the way it expands or transforms interior-sequences, as well as the
|
---|
1640 | returned result. These callbacks can be used to create a parse-tree
|
---|
1641 | with custom-made objects (which may or may not support the parse-tree
|
---|
1642 | interface, depending on how you choose to do it).
|
---|
1643 |
|
---|
1644 | If you wish to turn an entire POD document into a parse-tree, that process
|
---|
1645 | is fairly straightforward. The B<parse_text()> method is the key to doing
|
---|
1646 | this successfully. Every paragraph-callback (i.e. the polymorphic methods
|
---|
1647 | for B<command()>, B<verbatim()>, and B<textblock()> paragraphs) takes
|
---|
1648 | a B<Pod::Paragraph> object as an argument. Each paragraph object has a
|
---|
1649 | B<parse_tree()> method that can be used to get or set a corresponding
|
---|
1650 | parse-tree. So for each of those paragraph-callback methods, simply call
|
---|
1651 | B<parse_text()> with the options you desire, and then use the returned
|
---|
1652 | parse-tree to assign to the given paragraph object.
|
---|
1653 |
|
---|
1654 | That gives you a parse-tree for each paragraph - so now all you need is
|
---|
1655 | an ordered list of paragraphs. You can maintain that yourself as a data
|
---|
1656 | element in the object/hash. The most straightforward way would be simply
|
---|
1657 | to use an array-ref, with the desired set of custom "options" for each
|
---|
1658 | invocation of B<parse_text>. Let's assume the desired option-set is
|
---|
1659 | given by the hash C<%options>. Then we might do something like the
|
---|
1660 | following:
|
---|
1661 |
|
---|
1662 | package MyPodParserTree;
|
---|
1663 |
|
---|
1664 | @ISA = qw( Pod::Parser );
|
---|
1665 |
|
---|
1666 | ...
|
---|
1667 |
|
---|
1668 | sub begin_pod {
|
---|
1669 | my $self = shift;
|
---|
1670 | $self->{'-paragraphs'} = []; ## initialize paragraph list
|
---|
1671 | }
|
---|
1672 |
|
---|
1673 | sub command {
|
---|
1674 | my ($parser, $command, $paragraph, $line_num, $pod_para) = @_;
|
---|
1675 | my $ptree = $parser->parse_text({%options}, $paragraph, ...);
|
---|
1676 | $pod_para->parse_tree( $ptree );
|
---|
1677 | push @{ $self->{'-paragraphs'} }, $pod_para;
|
---|
1678 | }
|
---|
1679 |
|
---|
1680 | sub verbatim {
|
---|
1681 | my ($parser, $paragraph, $line_num, $pod_para) = @_;
|
---|
1682 | push @{ $self->{'-paragraphs'} }, $pod_para;
|
---|
1683 | }
|
---|
1684 |
|
---|
1685 | sub textblock {
|
---|
1686 | my ($parser, $paragraph, $line_num, $pod_para) = @_;
|
---|
1687 | my $ptree = $parser->parse_text({%options}, $paragraph, ...);
|
---|
1688 | $pod_para->parse_tree( $ptree );
|
---|
1689 | push @{ $self->{'-paragraphs'} }, $pod_para;
|
---|
1690 | }
|
---|
1691 |
|
---|
1692 | ...
|
---|
1693 |
|
---|
1694 | package main;
|
---|
1695 | ...
|
---|
1696 | my $parser = new MyPodParserTree(...);
|
---|
1697 | $parser->parse_from_file(...);
|
---|
1698 | my $paragraphs_ref = $parser->{'-paragraphs'};
|
---|
1699 |
|
---|
1700 | Of course, in this module-author's humble opinion, I'd be more inclined to
|
---|
1701 | use the existing B<Pod::ParseTree> object than a simple array. That way
|
---|
1702 | everything in it, paragraphs and sequences, all respond to the same core
|
---|
1703 | interface for all parse-tree nodes. The result would look something like:
|
---|
1704 |
|
---|
1705 | package MyPodParserTree2;
|
---|
1706 |
|
---|
1707 | ...
|
---|
1708 |
|
---|
1709 | sub begin_pod {
|
---|
1710 | my $self = shift;
|
---|
1711 | $self->{'-ptree'} = new Pod::ParseTree; ## initialize parse-tree
|
---|
1712 | }
|
---|
1713 |
|
---|
1714 | sub parse_tree {
|
---|
1715 | ## convenience method to get/set the parse-tree for the entire POD
|
---|
1716 | (@_ > 1) and $_[0]->{'-ptree'} = $_[1];
|
---|
1717 | return $_[0]->{'-ptree'};
|
---|
1718 | }
|
---|
1719 |
|
---|
1720 | sub command {
|
---|
1721 | my ($parser, $command, $paragraph, $line_num, $pod_para) = @_;
|
---|
1722 | my $ptree = $parser->parse_text({<<options>>}, $paragraph, ...);
|
---|
1723 | $pod_para->parse_tree( $ptree );
|
---|
1724 | $parser->parse_tree()->append( $pod_para );
|
---|
1725 | }
|
---|
1726 |
|
---|
1727 | sub verbatim {
|
---|
1728 | my ($parser, $paragraph, $line_num, $pod_para) = @_;
|
---|
1729 | $parser->parse_tree()->append( $pod_para );
|
---|
1730 | }
|
---|
1731 |
|
---|
1732 | sub textblock {
|
---|
1733 | my ($parser, $paragraph, $line_num, $pod_para) = @_;
|
---|
1734 | my $ptree = $parser->parse_text({<<options>>}, $paragraph, ...);
|
---|
1735 | $pod_para->parse_tree( $ptree );
|
---|
1736 | $parser->parse_tree()->append( $pod_para );
|
---|
1737 | }
|
---|
1738 |
|
---|
1739 | ...
|
---|
1740 |
|
---|
1741 | package main;
|
---|
1742 | ...
|
---|
1743 | my $parser = new MyPodParserTree2(...);
|
---|
1744 | $parser->parse_from_file(...);
|
---|
1745 | my $ptree = $parser->parse_tree;
|
---|
1746 | ...
|
---|
1747 |
|
---|
1748 | Now you have the entire POD document as one great big parse-tree. You
|
---|
1749 | can even use the B<-expand_seq> option to B<parse_text> to insert
|
---|
1750 | whole different kinds of objects. Just don't expect B<Pod::Parser>
|
---|
1751 | to know what to do with them after that. That will need to be in your
|
---|
1752 | code. Or, alternatively, you can insert any object you like so long as
|
---|
1753 | it conforms to the B<Pod::ParseTree> interface.
|
---|
1754 |
|
---|
1755 | One could use this to create subclasses of B<Pod::Paragraphs> and
|
---|
1756 | B<Pod::InteriorSequences> for specific commands (or to create your own
|
---|
1757 | custom node-types in the parse-tree) and add some kind of B<emit()>
|
---|
1758 | method to each custom node/subclass object in the tree. Then all you'd
|
---|
1759 | need to do is recursively walk the tree in the desired order, processing
|
---|
1760 | the children (most likely from left to right) by formatting them if
|
---|
1761 | they are text-strings, or by calling their B<emit()> method if they
|
---|
1762 | are objects/references.
|
---|
1763 |
|
---|
1764 | =head1 SEE ALSO
|
---|
1765 |
|
---|
1766 | L<Pod::InputObjects>, L<Pod::Select>
|
---|
1767 |
|
---|
1768 | B<Pod::InputObjects> defines POD input objects corresponding to
|
---|
1769 | command paragraphs, parse-trees, and interior-sequences.
|
---|
1770 |
|
---|
1771 | B<Pod::Select> is a subclass of B<Pod::Parser> which provides the ability
|
---|
1772 | to selectively include and/or exclude sections of a POD document from being
|
---|
1773 | translated based upon the current heading, subheading, subsubheading, etc.
|
---|
1774 |
|
---|
1775 | =for __PRIVATE__
|
---|
1776 | B<Pod::Callbacks> is a subclass of B<Pod::Parser> which gives its users
|
---|
1777 | the ability the employ I<callback functions> instead of, or in addition
|
---|
1778 | to, overriding methods of the base class.
|
---|
1779 |
|
---|
1780 | =for __PRIVATE__
|
---|
1781 | B<Pod::Select> and B<Pod::Callbacks> do not override any
|
---|
1782 | methods nor do they define any new methods with the same name. Because
|
---|
1783 | of this, they may I<both> be used (in combination) as a base class of
|
---|
1784 | the same subclass in order to combine their functionality without
|
---|
1785 | causing any namespace clashes due to multiple inheritance.
|
---|
1786 |
|
---|
1787 | =head1 AUTHOR
|
---|
1788 |
|
---|
1789 | Please report bugs using L<http://rt.cpan.org>.
|
---|
1790 |
|
---|
1791 | Brad Appleton E<lt>[email protected]<gt>
|
---|
1792 |
|
---|
1793 | Based on code for B<Pod::Text> written by
|
---|
1794 | Tom Christiansen E<lt>[email protected]<gt>
|
---|
1795 |
|
---|
1796 | =cut
|
---|
1797 |
|
---|
1798 | 1;
|
---|
1799 | # vim: ts=4 sw=4 et
|
---|