perldebguts.pod@ 14489

Last change on this file since 14489 was 14489, checked in by oranfry, 17 years ago
upgrading to perl 5.8
File size: 31.2 KB

1	=head1 NAME
2
3	perldebguts - Guts of Perl debugging
4
5	=head1 DESCRIPTION
6
7	This is not the perldebug(1) manpage, which tells you how to use
8	the debugger. This manpage describes low-level details concerning
9	the debugger's internals, which range from difficult to impossible
10	to understand for anyone who isn't incredibly intimate with Perl's guts.
11	Caveat lector.
12
13	=head1 Debugger Internals
14
15	Perl has special debugging hooks at compile-time and run-time used
16	to create debugging environments. These hooks are not to be confused
17	with the I<perl -Dxxx> command described in L<perlrun>, which is
18	usable only if a special Perl is built per the instructions in the
19	F<INSTALL> podpage in the Perl source tree.
20
21	For example, whenever you call Perl's built-in C<caller> function
22	from the package C<DB>, the arguments that the corresponding stack
23	frame was called with are copied to the C<@DB::args> array. These
24	mechanisms are enabled by calling Perl with the B<-d> switch.
25	Specifically, the following additional features are enabled
26	(cf. L<perlvar/$^P>):
27
28	=over 4
29
30	=item *
31
32	Perl inserts the contents of C<$ENV{PERL5DB}> (or C<BEGIN {require
33	'perl5db.pl'}> if not present) before the first line of your program.
34
35	=item *
36
37	Each array C<@{"_<$filename"}> holds the lines of $filename for a
38	file compiled by Perl. The same is also true for C<eval>ed strings
39	that contain subroutines, or which are currently being executed.
40	The $filename for C<eval>ed strings looks like C<(eval 34)>.
41	Code assertions in regexes look like C<(re_eval 19)>.
42
43	Values in this array are magical in numeric context: they compare
44	equal to zero only if the line is not breakable.
45
46	=item *
47
48	Each hash C<%{"_<$filename"}> contains breakpoints and actions keyed
49	by line number. Individual entries (as opposed to the whole hash)
50	are settable. Perl only cares about Boolean true here, although
51	the values used by F<perl5db.pl> have the form
52	C<"$break_condition\0$action">.
53
54	The same holds for evaluated strings that contain subroutines, or
55	which are currently being executed. The $filename for C<eval>ed strings
56	looks like C<(eval 34)> or C<(re_eval 19)>.
57
58	=item *
59
60	Each scalar C<${"_<$filename"}> contains C<"_<$filename">. This is
61	also the case for evaluated strings that contain subroutines, or
62	which are currently being executed. The $filename for C<eval>ed
63	strings looks like C<(eval 34)> or C<(re_eval 19)>.
64
65	=item *
66
67	After each C<require>d file is compiled, but before it is executed,
68	C<DB::postponed(*{"_<$filename"})> is called if the subroutine
69	C<DB::postponed> exists. Here, the $filename is the expanded name of
70	the C<require>d file, as found in the values of %INC.
71
72	=item *
73
74	After each subroutine C<subname> is compiled, the existence of
75	C<$DB::postponed{subname}> is checked. If this key exists,
76	C<DB::postponed(subname)> is called if the C<DB::postponed> subroutine
77	also exists.
78
79	=item *
80
81	A hash C<%DB::sub> is maintained, whose keys are subroutine names
82	and whose values have the form C<filename:startline-endline>.
83	C<filename> has the form C<(eval 34)> for subroutines defined inside
84	C<eval>s, or C<(re_eval 19)> for those within regex code assertions.
85
86	=item *
87
88	When the execution of your program reaches a point that can hold a
89	breakpoint, the C<DB::DB()> subroutine is called if any of the variables
90	C<$DB::trace>, C<$DB::single>, or C<$DB::signal> is true. These variables
91	are not C<local>izable. This feature is disabled when executing
92	inside C<DB::DB()>, including functions called from it
93	unless C<< $^D & (1<<30) >> is true.
94
95	=item *
96
97	When execution of the program reaches a subroutine call, a call to
98	C<&DB::sub>(I<args>) is made instead, with C<$DB::sub> holding the
99	name of the called subroutine. (This doesn't happen if the subroutine
100	was compiled in the C<DB> package.)
101
102	=back
103
104	Note that if C<&DB::sub> needs external data for it to work, no
105	subroutine call is possible without it. As an example, the standard
106	debugger's C<&DB::sub> depends on the C<$DB::deep> variable
107	(it defines how many levels of recursion deep into the debugger you can go
108	before a mandatory break). If C<$DB::deep> is not defined, subroutine
109	calls are not possible, even though C<&DB::sub> exists.
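
The C<@DB::args> behaviour mentioned at the start of this section can be observed with a short script. This is only a sketch: C<caller_args> and C<greet> are hypothetical names, and it assumes that C<caller>, when called in list context with an argument from code compiled in package C<DB>, fills C<@DB::args> even when perl was not started with B<-d>.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the arguments its caller was invoked with.
sub caller_args {
    package DB;               # caller() fills @DB::args only when called from package DB
    @DB::args = ();           # start clean so stale values cannot leak through
    my @frame = caller(1);    # frame 1 is the call to the sub that called us
    return @DB::args;
}

# Hypothetical sub whose own arguments we want to recover.
sub greet {
    return join ', ', caller_args();
}

print greet('hello', 'world'), "\n";
```

The C<package DB;> statement sits inside the sub body, so the sub itself is still named C<main::caller_args>; only the code that calls C<caller> is compiled in package C<DB>.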
110
111	=head2 Writing Your Own Debugger
112
113	=head3 Environment Variables
114
115	The C<PERL5DB> environment variable can be used to define a debugger.
116	For example, the minimal "working" debugger (it actually doesn't do anything)
117	consists of one line:
118
119	    sub DB::DB {}
120
121	It can easily be defined like this:
122
123	    $ PERL5DB="sub DB::DB {}" perl -d your-script
124
125	Another brief debugger, slightly more useful, can be created
126	with only the line:
127
128	    sub DB::DB {print ++$i; scalar <STDIN>}
129
130	This debugger prints a number which increments for each statement
131	encountered and waits for you to hit a newline before continuing
132	to the next statement.
133
134	The following debugger is actually useful:
135
136	    {
137	        package DB;
138	        sub DB {}
139	        sub sub {print ++$i, " $sub\n"; &$sub}
140	    }
141
142	It prints the sequence number of each subroutine call and the name of the
143	called subroutine. Note that C<&DB::sub> is being compiled into the
144	package C<DB> through the use of the C<package> directive.
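
As a further illustration, a one-line debugger supplied through C<PERL5DB> can be driven from a small script. The sketch below assumes a Unix-like system (it uses the list form of a piped C<open>), and the traced script and the body of C<DB::DB> are made up for the example. The debugger prints the file and line of each statement that is about to run.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a tiny script to trace (its contents are purely illustrative).
my ($fh, $script) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} <<'END_SCRIPT';
my $x = 1;
my $y = $x + 1;
print "$y\n";
END_SCRIPT
close $fh or die "cannot close $script: $!";

# One-line debugger: report where each statement is about to execute.
local $ENV{PERL5DB} =
    'sub DB::DB { my ($pkg, $file, $line) = caller; print "$file:$line\n" }';

# Run the script under -d; the list form of open avoids shell quoting.
open(my $out, '-|', $^X, '-d', $script) or die "cannot run $^X: $!";
my @trace = <$out>;
close $out;

print @trace;
```

Because C<PERL5DB> replaces the default C<perl5db.pl>, no rc file or C<PERLDB_OPTS> processing takes place in the child process.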
145
146	When it starts, the debugger reads your rc file (F<./.perldb> or
147	F<~/.perldb> under Unix), which can set important options.
148	(A subroutine (C<&afterinit>) can be defined here as well; it is executed
149	after the debugger completes its own initialization.)
150
151	After the rc file is read, the debugger reads the PERLDB_OPTS
152	environment variable and uses it to set debugger options. The
153	contents of this variable are treated as if they were the argument
154	of an C<o ...> debugger command (q.v. in L<perldebug/Options>).
155
156	=head3 Debugger internal variables

157	In addition to the file and subroutine-related variables mentioned above,
158	the debugger also maintains various magical internal variables.
159
160	=over 4
161
162	=item *
163
164	C<@DB::dbline> is an alias for C<@{"::_<current_file"}>, which
165	holds the lines of the currently-selected file (compiled by Perl), either
166	explicitly chosen with the debugger's C<f> command, or implicitly by flow
167	of execution.
168
169	Values in this array are magical in numeric context: they compare
170	equal to zero only if the line is not breakable.
171
172	=item *
173
174	C<%DB::dbline> is an alias for C<%{"::_<current_file"}>, which
175	contains breakpoints and actions keyed by line number in
176	the currently-selected file, either explicitly chosen with the
177	debugger's C<f> command, or implicitly by flow of execution.
178
179	As previously noted, individual entries (as opposed to the whole hash)
180	are settable. Perl only cares about Boolean true here, although
181	the values used by F<perl5db.pl> have the form
182	C<"$break_condition\0$action">.
183
184	=back
185
186	=head3 Debugger customization functions
187
188	Some functions are provided to simplify customization.
189
190	=over 4
191
192	=item *
193
194	C<DB::parse_options(string)> parses debugger options; see
195	L<perldebug/Options> for a description of the options it
196	recognizes.
197
198	=item *
199
200	C<DB::dump_trace(skip[,count])> skips the specified number of frames
201	and returns a list containing information about the calling frames (all
202	of them, if C<count> is missing). Each entry is a reference to a hash
203	with keys C<context> (either C<.>, C<$>, or C<@>), C<sub> (subroutine
204	name, or info about C<eval>), C<args> (C<undef> or a reference to
205	an array), C<file>, and C<line>.
206
207	=item *
208
209	C<DB::print_trace(FH, skip[, count[, short]])> prints
210	formatted info about caller frames. The last two functions may be
211	convenient as arguments to C<< < >>, C<< << >> commands.
212
213	=back
214
215	Note that any variables and functions that are not documented in
216	this manpage (or in L<perldebug>) are considered for internal
217	use only, and as such are subject to change without notice.
218
219	=head1 Frame Listing Output Examples
220
221	The C<frame> option can be used to control the output of frame
222	information. For example, contrast this expression trace:
223
224	$ perl -de 42
225	Stack dump during die enabled outside of evals.
226
227	Loading DB routines from perl5db.pl patch level 0.94
228	Emacs support available.
229
230	Enter h or `h h' for help.
231
232	main::(-e:1): 0
233	DB<1> sub foo { 14 }
234
235	DB<2> sub bar { 3 }
236
237	DB<3> t print foo() * bar()
238	main::((eval 172):3): print foo() * bar();
239	main::foo((eval 168):2):
240	main::bar((eval 170):2):
241	42
242
243	with this one, once the C<o>ption C<frame=2> has been set:
244
245	DB<4> o f=2
246	frame = '2'
247	DB<5> t print foo() * bar()
248	3: foo() * bar()
249	entering main::foo
250	2: sub foo { 14 };
251	exited main::foo
252	entering main::bar
253	2: sub bar { 3 };
254	exited main::bar
255	42
256
257	By way of demonstration, we present below a laborious listing
258	resulting from setting your C<PERLDB_OPTS> environment variable to
259	the value C<f=n N>, and running I<perl -d -V> from the command line.
260	Examples using various values of C<n> are shown to give you a feel
261	for the difference between settings. Long though it may be, this
262	is not a complete listing, but only excerpts.
263
264	=over 4
265
266	=item 1
267
268	entering main::BEGIN
269	entering Config::BEGIN
270	Package lib/Exporter.pm.
271	Package lib/Carp.pm.
272	Package lib/Config.pm.
273	entering Config::TIEHASH
274	entering Exporter::import
275	entering Exporter::export
276	entering Config::myconfig
277	entering Config::FETCH
278	entering Config::FETCH
279	entering Config::FETCH
280	entering Config::FETCH
281
282	=item 2
283
284	entering main::BEGIN
285	entering Config::BEGIN
286	Package lib/Exporter.pm.
287	Package lib/Carp.pm.
288	exited Config::BEGIN
289	Package lib/Config.pm.
290	entering Config::TIEHASH
291	exited Config::TIEHASH
292	entering Exporter::import
293	entering Exporter::export
294	exited Exporter::export
295	exited Exporter::import
296	exited main::BEGIN
297	entering Config::myconfig
298	entering Config::FETCH
299	exited Config::FETCH
300	entering Config::FETCH
301	exited Config::FETCH
302	entering Config::FETCH
303
304	=item 4
305
306	in $=main::BEGIN() from /dev/null:0
307	in $=Config::BEGIN() from lib/Config.pm:2
308	Package lib/Exporter.pm.
309	Package lib/Carp.pm.
310	Package lib/Config.pm.
311	in $=Config::TIEHASH('Config') from lib/Config.pm:644
312	in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
313	in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from li
314	in @=Config::myconfig() from /dev/null:0
315	in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
316	in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
317	in $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
318	in $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574
319	in $=Config::FETCH(ref(Config), 'osname') from lib/Config.pm:574
320	in $=Config::FETCH(ref(Config), 'osvers') from lib/Config.pm:574
321
322	=item 6
323
324	in $=main::BEGIN() from /dev/null:0
325	in $=Config::BEGIN() from lib/Config.pm:2
326	Package lib/Exporter.pm.
327	Package lib/Carp.pm.
328	out $=Config::BEGIN() from lib/Config.pm:0
329	Package lib/Config.pm.
330	in $=Config::TIEHASH('Config') from lib/Config.pm:644
331	out $=Config::TIEHASH('Config') from lib/Config.pm:644
332	in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
333	in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/
334	out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/
335	out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
336	out $=main::BEGIN() from /dev/null:0
337	in @=Config::myconfig() from /dev/null:0
338	in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
339	out $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
340	in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
341	out $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
342	in $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
343	out $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
344	in $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574
345
346	=item 14
347
348	in $=main::BEGIN() from /dev/null:0
349	in $=Config::BEGIN() from lib/Config.pm:2
350	Package lib/Exporter.pm.
351	Package lib/Carp.pm.
352	out $=Config::BEGIN() from lib/Config.pm:0
353	Package lib/Config.pm.
354	in $=Config::TIEHASH('Config') from lib/Config.pm:644
355	out $=Config::TIEHASH('Config') from lib/Config.pm:644
356	in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
357	in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E
358	out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E
359	out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
360	out $=main::BEGIN() from /dev/null:0
361	in @=Config::myconfig() from /dev/null:0
362	in $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574
363	out $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574
364	in $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574
365	out $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574
366
367	=item 30
368
369	in $=CODE(0x15eca4)() from /dev/null:0
370	in $=CODE(0x182528)() from lib/Config.pm:2
371	Package lib/Exporter.pm.
372	out $=CODE(0x182528)() from lib/Config.pm:0
373	scalar context return from CODE(0x182528): undef
374	Package lib/Config.pm.
375	in $=Config::TIEHASH('Config') from lib/Config.pm:628
376	out $=Config::TIEHASH('Config') from lib/Config.pm:628
377	scalar context return from Config::TIEHASH: empty hash
378	in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
379	in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171
380	out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171
381	scalar context return from Exporter::export: ''
382	out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
383	scalar context return from Exporter::import: ''
384
385	=back
386
387	In all cases shown above, the line indentation shows the call tree.
388	The bits of C<frame> are named here by their values (2, 4, 8, 16). If
389	bit 2 is set, a line is printed on exit from a subroutine as well. If
390	bit 4 is set, the arguments are printed along with the caller info. If
391	bit 8 is set, the arguments are printed even if they are tied or
392	references. If bit 16 is set, the return value is printed, too.
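
The way a C<frame> value combines these bits can be sketched in a few lines of Perl. The helper C<describe_frame> and the wording of the labels are made up for this illustration; only the bit values 2, 4, 8 and 16 come from the paragraph above.

```perl
use strict;
use warnings;

# Bit values and meanings as described in the paragraph above.
my @frame_bits = (
    [  2, 'print a line on subroutine exit' ],
    [  4, 'print arguments and caller info' ],
    [  8, 'print arguments even if tied or references' ],
    [ 16, 'print the return value' ],
);

# Hypothetical helper: list the behaviours selected by a frame value.
sub describe_frame {
    my ($frame) = @_;
    return map { $_->[1] } grep { $frame & $_->[0] } @frame_bits;
}

for my $frame (2, 6, 14, 30) {
    printf "frame=%-2d %s\n", $frame, join('; ', describe_frame($frame));
}
```

The values 6, 14 and 30 used in the listings are therefore 2+4, 2+4+8 and 2+4+8+16 respectively.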
393
394	When a package is compiled, a line like this
395
396	Package lib/Carp.pm.
397
398	is printed with proper indentation.
399
400	=head1 Debugging regular expressions
401
402	There are two ways to enable debugging output for regular expressions.
403
404	If your perl is compiled with C<-DDEBUGGING>, you may use the
405	B<-Dr> flag on the command line.
406
407	Otherwise, one can C<use re 'debug'>, which has effects at
408	compile time and run time. It is not lexically scoped.
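
A minimal way to try the second method is shown below, using the pattern and target string from the examples in this section. The debugging output goes to STDERR, and its exact layout differs between perl releases, so it may not match the listings in this document. In perl releases later than the one documented here, the pragma became lexically scoped.

```perl
use strict;
use warnings;
use re 'debug';    # regex debugging output is written to STDERR

# Same pattern and target string as in the listings in this section.
my $matched = 'abcdefg__gh__' =~ /[bc]d(ef*g)+h[ij]k$/ ? 1 : 0;
print "matched: $matched\n";
```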
409
410	=head2 Compile-time output
411
412	The debugging output at compile time looks like this:
413
414	Compiling REx `[bc]d(ef*g)+h[ij]k$'
415	size 45 Got 364 bytes for offset annotations.
416	first at 1
417	rarest char g at 0
418	rarest char d at 0
419	1: ANYOF[bc](12)
420	12: EXACT <d>(14)
421	14: CURLYX[0] {1,32767}(28)
422	16: OPEN1(18)
423	18: EXACT <e>(20)
424	20: STAR(23)
425	21: EXACT <f>(0)
426	23: EXACT <g>(25)
427	25: CLOSE1(27)
428	27: WHILEM[1/1](0)
429	28: NOTHING(29)
430	29: EXACT <h>(31)
431	31: ANYOF[ij](42)
432	42: EXACT <k>(44)
433	44: EOL(45)
434	45: END(0)
435	anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating)
436	stclass `ANYOF[bc]' minlen 7
437	Offsets: [45]
438	1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
439	0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
440	11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
441	0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]
442	Omitting $` $& $' support.
443
444	The first line shows the pre-compiled form of the regex. The second
445	shows the size of the compiled form (in arbitrary units, usually
446	4-byte words) and the total number of bytes allocated for the
447	offset/length table, usually 4+C<size>*8. The next line shows the
448	label I<id> of the first node that does a match.
449
450	The
451
452	anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating)
453	stclass `ANYOF[bc]' minlen 7
454
455	line (split into two lines above) contains optimizer
456	information. In the example shown, the optimizer found that the match
457	should contain a substring C<de> at offset 1, plus substring C<gh>
458	at some offset between 3 and infinity. Moreover, when checking for
459	these substrings (to abandon impossible matches quickly), Perl will check
460	for the substring C<gh> before checking for the substring C<de>. The
461	optimizer may also use the knowledge that the match starts (at the
462	C<first> I<id>) with a character class, and no string
463	shorter than 7 characters can possibly match.
464
465	The fields of interest which may appear in this line are
466
467	=over 4
468
469	=item C<anchored> I<STRING> C<at> I<POS>
470
471	=item C<floating> I<STRING> C<at> I<POS1..POS2>
472
473	See above.
474
475	=item C<matching floating/anchored>
476
477	Which substring to check first.
478
479	=item C<minlen>
480
481	The minimal length of the match.
482
483	=item C<stclass> I<TYPE>
484
485	Type of first matching node.
486
487	=item C<noscan>
488
489	Don't scan for the found substrings.
490
491	=item C<isall>
492
493	Means that the optimizer information is all that the regular
494	expression contains, and thus one does not need to enter the regex engine at
495	all.
496
497	=item C<GPOS>
498
499	Set if the pattern contains C<\G>.
500
501	=item C<plus>
502
503	Set if the pattern starts with a repeated char (as in C<x+y>).
504
505	=item C<implicit>
506
507	Set if the pattern starts with C<.*>.
508
509	=item C<with eval>
510
511	Set if the pattern contains eval-groups, such as C<(?{ code })> and
512	C<(??{ code })>.
513
514	=item C<anchored(TYPE)>
515
516	Set if the pattern may match only at a handful of places, with C<TYPE>
517	being C<BOL>, C<MBOL>, or C<GPOS>. See the table below.
518
519	=back
520
521	If a substring is known to match at end-of-line only, it may be
522	followed by C<$>, as in C<floating `k'$>.
523
524	The optimizer-specific information is used to avoid entering the (slow) regex
525	engine on strings that definitely will not match. If the C<isall> flag
526	is set, a call to the regex engine may be avoided even when the optimizer
527	found an appropriate place for the match.
528
529	Above the optimizer section is the list of I<nodes> of the compiled
530	form of the regex. Each line has format
531
532	C< >I<id>: I<TYPE> I<OPTIONAL-INFO> (I<next-id>)
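
For readers who want to post-process such a dump, the node-line format can be taken apart with a regular expression. This is only a sketch based on the sample lines shown earlier: C<parse_node> is a hypothetical helper, and the pattern has been checked against those samples only, not against every node type.

```perl
use strict;
use warnings;

# Sample node lines copied from the compile-time dump above.
my @lines = (
    '1: ANYOF[bc](12)',
    '12: EXACT <d>(14)',
    '14: CURLYX[0] {1,32767}(28)',
    '45: END(0)',
);

# Hypothetical parser: split one node line into id, type, info and next-id.
sub parse_node {
    my ($line) = @_;
    my ($id, $type, $info, $next) =
        $line =~ /^\s*(\d+):\s+([A-Z0-9]+)\s*(.*?)\s*\((\d+)\)\s*$/
        or return;
    return { id => $id, type => $type, info => $info, next => $next };
}

for my $line (@lines) {
    my $node = parse_node($line) or next;
    printf "%3d %-7s %-14s -> %d\n", @{$node}{qw(id type info next)};
}
```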
533
534	=head2 Types of nodes
535
536	Here are the possible types, with short descriptions:
537
538	# TYPE arg-description [num-args] [longjump-len] DESCRIPTION
539
540	# Exit points
541	END no End of program.
542	SUCCEED no Return from a subroutine, basically.
543
544	# Anchors:
545	BOL no Match "" at beginning of line.
546	MBOL no Same, assuming multiline.
547	SBOL no Same, assuming singleline.
548	EOS no Match "" at end of string.
549	EOL no Match "" at end of line.
550	MEOL no Same, assuming multiline.
551	SEOL no Same, assuming singleline.
552	BOUND no Match "" at any word boundary
553	BOUNDL no Match "" at any word boundary
554	NBOUND no Match "" at any word non-boundary
555	NBOUNDL no Match "" at any word non-boundary
556	GPOS no Matches where last m//g left off.
557
558	# [Special] alternatives
559	ANY no Match any one character (except newline).
560	SANY no Match any one character.
561	ANYOF sv Match character in (or not in) this class.
562	ALNUM no Match any alphanumeric character
563	ALNUML no Match any alphanumeric char in locale
564	NALNUM no Match any non-alphanumeric character
565	NALNUML no Match any non-alphanumeric char in locale
566	SPACE no Match any whitespace character
567	SPACEL no Match any whitespace char in locale
568	NSPACE no Match any non-whitespace character
569	NSPACEL no Match any non-whitespace char in locale
570	DIGIT no Match any numeric character
571	NDIGIT no Match any non-numeric character
572
573	# BRANCH The set of branches constituting a single choice are hooked
574	# together with their "next" pointers, since precedence prevents
575	# anything being concatenated to any individual branch. The
576	# "next" pointer of the last BRANCH in a choice points to the
577	# thing following the whole choice. This is also where the
578	# final "next" pointer of each individual branch points; each
579	# branch starts with the operand node of a BRANCH node.
580	#
581	BRANCH node Match this alternative, or the next...
582
583	# BACK Normal "next" pointers all implicitly point forward; BACK
584	# exists to make loop structures possible.
585	# not used
586	BACK no Match "", "next" ptr points backward.
587
588	# Literals
589	EXACT sv Match this string (preceded by length).
590	EXACTF sv Match this string, folded (prec. by length).
591	EXACTFL sv Match this string, folded in locale (w/len).
592
593	# Do nothing
594	NOTHING no Match empty string.
595	# A variant of above which delimits a group, thus stops optimizations
596	TAIL no Match empty string. Can jump here from outside.
597
598	# STAR,PLUS '?', and complex '*' and '+', are implemented as circular
599	# BRANCH structures using BACK. Simple cases (one character
600	# per match) are implemented with STAR and PLUS for speed
601	# and to minimize recursive plunges.
602	#
603	STAR node Match this (simple) thing 0 or more times.
604	PLUS node Match this (simple) thing 1 or more times.
605
606	CURLY sv 2 Match this simple thing {n,m} times.
607	CURLYN no 2 Match next-after-this simple thing
608	# {n,m} times, set parens.
609	CURLYM no 2 Match this medium-complex thing {n,m} times.
610	CURLYX sv 2 Match this complex thing {n,m} times.
611
612	# This terminator creates a loop structure for CURLYX
613	WHILEM no Do curly processing and see if rest matches.
614
615	# OPEN,CLOSE,GROUPP ...are numbered at compile time.
616	OPEN num 1 Mark this point in input as start of #n.
617	CLOSE num 1 Analogous to OPEN.
618
619	REF num 1 Match some already matched string
620	REFF num 1 Match already matched string, folded
621	REFFL num 1 Match already matched string, folded in loc.
622
623	# grouping assertions
624	IFMATCH off 1 2 Succeeds if the following matches.
625	UNLESSM off 1 2 Fails if the following matches.
626	SUSPEND off 1 1 "Independent" sub-regex.
627	IFTHEN off 1 1 Switch, should be preceded by switcher.
628	GROUPP num 1 Whether the group matched.
629
630	# Support for long regex
631	LONGJMP off 1 1 Jump far away.
632	BRANCHJ off 1 1 BRANCH with long offset.
633
634	# The heavy worker
635	EVAL evl 1 Execute some Perl code.
636
637	# Modifiers
638	MINMOD no Next operator is not greedy.
639	LOGICAL no Next opcode should set the flag only.
640
641	# This is not used yet
642	RENUM off 1 1 Group with independently numbered parens.
643
644	# This is not really a node, but an optimized away piece of a "long" node.
645	# To simplify debugging output, we mark it as if it were a node
646	OPTIMIZED off Placeholder for dump.
647
648	=for unprinted-credits
649	Next section M-J. Dominus ([email protected]) 20010421
650
651	Following the optimizer information is a dump of the offset/length
652	table, here split across several lines:
653
654	Offsets: [45]
655	1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
656	0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
657	11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
658	0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]
659
660	The first line here indicates that the offset/length table contains 45
661	entries. Each entry is a pair of integers, denoted by C<offset[length]>.
662	Entries are numbered starting with 1, so entry #1 here is C<1[4]> and
663	entry #12 is C<5[1]>. C<1[4]> indicates that the node labeled C<1:>
664	(the C<1: ANYOF[bc]>) begins at character position 1 in the
665	pre-compiled form of the regex, and has a length of 4 characters.
666	C<5[1]> in position 12
667	indicates that the node labeled C<12:>
668	(the C<< 12: EXACT <d> >>) begins at character position 5 in the
669	pre-compiled form of the regex, and has a length of 1 character.
670	C<12[1]> in position 14
671	indicates that the node labeled C<14:>
672	(the C<< 14: CURLYX[0] {1,32767} >>) begins at character position 12 in the
673	pre-compiled form of the regex, and has a length of 1 character---that
674	is, it corresponds to the C<+> symbol in the precompiled regex.
675
676	C<0[0]> items indicate that there is no corresponding node.
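
The mapping from table entries back to the source of the regex can be checked mechanically. The following sketch parses the dump shown above and extracts, for each node, the piece of the pre-compiled regex it came from. The variable names are illustrative, and equating the number of entries with C<size> in the C<4+size*8> rule is an assumption that happens to hold for this example.

```perl
use strict;
use warnings;

my $regex = '[bc]d(ef*g)+h[ij]k$';
my $dump  = join ' ',
    '1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]',
    '0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]',
    '11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]',
    '0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]';

# Collect [offset, length] pairs; entry N (counting from 1) describes node N.
my @entries;
push @entries, [$1, $2] while $dump =~ /(\d+)\[(\d+)\]/g;

my %source;    # node id => text of the regex that produced the node
for my $node (1 .. @entries) {
    my ($offset, $length) = @{ $entries[$node - 1] };
    next unless $offset;    # 0[0] means there is no corresponding node
    $source{$node} = substr $regex, $offset - 1, $length;
}

printf "entries: %d, table bytes by the 4+size*8 rule: %d\n",
    scalar @entries, 4 + @entries * 8;
printf "%2d: '%s'\n", $_, $source{$_} for sort { $a <=> $b } keys %source;
```

With 45 entries the rule gives 364 bytes, which agrees with the C<Got 364 bytes for offset annotations> line in the compile-time output.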
677
678	=head2 Run-time output
679
680	First of all, when doing a match, one may get no run-time output even
681	if debugging is enabled. This means that the regex engine was never
682	entered and that all of the job was therefore done by the optimizer.
683
684	If the regex engine was entered, the output may look like this:
685
686	Matching `[bc]d(ef*g)+h[ij]k$' against `abcdefg__gh__'
687	Setting an EVAL scope, savestack=3
688	2 <ab> <cdefg__gh_> | 1: ANYOF
689	3 <abc> <defg__gh_> | 11: EXACT <d>
690	4 <abcd> <efg__gh_> | 13: CURLYX {1,32767}
691	4 <abcd> <efg__gh_> | 26: WHILEM
692	0 out of 1..32767 cc=effff31c
693	4 <abcd> <efg__gh_> | 15: OPEN1
694	4 <abcd> <efg__gh_> | 17: EXACT <e>
695	5 <abcde> <fg__gh_> | 19: STAR
696	EXACT <f> can match 1 times out of 32767...
697	Setting an EVAL scope, savestack=3
698	6 <bcdef> <g__gh__> | 22: EXACT <g>
699	7 <bcdefg> <__gh__> | 24: CLOSE1
700	7 <bcdefg> <__gh__> | 26: WHILEM
701	1 out of 1..32767 cc=effff31c
702	Setting an EVAL scope, savestack=12
703	7 <bcdefg> <__gh__> | 15: OPEN1
704	7 <bcdefg> <__gh__> | 17: EXACT <e>
705	restoring \1 to 4(4)..7
706	failed, try continuation...
707	7 <bcdefg> <__gh__> | 27: NOTHING
708	7 <bcdefg> <__gh__> | 28: EXACT <h>
709	failed...
710	failed...
711
712	The most significant information in the output is about the particular I<node>
713	of the compiled regex that is currently being tested against the target string.
714	The format of these lines is
715
716	C< >I<STRING-OFFSET> <I<PRE-STRING>> <I<POST-STRING>> |I<ID>: I<TYPE>
717
718	The I<TYPE> info is indented with respect to the backtracking level.
719	Other incidental information appears interspersed within.
720
721	=head1 Debugging Perl memory usage
722
723	Perl is a profligate wastrel when it comes to memory use. There
724	is a saying that to estimate memory usage of Perl, assume a reasonable
725	algorithm for memory allocation, multiply that estimate by 10, and
726	while you still may miss the mark, at least you won't be quite so
727	astonished. This is not absolutely true, but may provide a good
728	grasp of what happens.
729
730	Assume that an integer cannot take less than 20 bytes of memory, a
731	float cannot take less than 24 bytes, a string cannot take less
732	than 32 bytes (all these examples assume 32-bit architectures; the
733	results are quite a bit worse on 64-bit architectures). If a variable
734	is accessed in two of three different ways (which require an integer,
735	a float, or a string), the memory footprint may increase yet another
736	20 bytes. A sloppy malloc(3) implementation can inflate these
737	numbers dramatically.
738
739	On the opposite end of the scale, a declaration like
740
741	sub foo;
742
743	may take up to 500 bytes of memory, depending on which release of Perl
744	you're running.
745
746	Anecdotal estimates of source-to-compiled code bloat suggest an
747	eightfold increase. This means that the compiled form of reasonable
748	(normally commented, properly indented etc.) code will take
749	about eight times more space in memory than the code took
750	on disk.
751
752	The B<-DL> command-line switch is obsolete since circa Perl 5.6.0
753	(it was available only if Perl was built with C<-DDEBUGGING>).
754	The switch was used to track Perl's memory allocations and possible
755	memory leaks. These days the use of malloc debugging tools like
756	F<Purify> or F<valgrind> is suggested instead.
757
758	One way to find out how much memory is being used by Perl data
759	structures is to install the Devel::Size module from CPAN: it gives
760	you the minimum number of bytes required to store a particular data
761	structure. Please be mindful of the difference between C<size()>
762	and C<total_size()>.
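
The difference between the two functions is easiest to see on a nested structure. The sketch below is hedged in two ways: C<Devel::Size> is a CPAN module that may not be installed (the code checks for it), and the sample data is made up. C<size> reports on the structure itself without following references, while C<total_size> follows them.

```perl
use strict;
use warnings;

# Illustrative data: a hash holding a string and a reference to an array.
my %data = (name => 'example', list => [1 .. 100]);
my ($shallow, $deep);

if (eval { require Devel::Size; 1 }) {
    $shallow = Devel::Size::size(\%data);          # does not follow references
    $deep    = Devel::Size::total_size(\%data);    # follows references
    print "size: $shallow bytes, total_size: $deep bytes\n";
}
else {
    print "Devel::Size is not installed\n";
}
```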
763
764	If Perl has been compiled using Perl's malloc, you can analyze Perl
765	memory usage by setting C<$ENV{PERL_DEBUG_MSTATS}>.
766
767	=head2 Using C<$ENV{PERL_DEBUG_MSTATS}>
768
769	If your perl is using Perl's malloc() and was compiled with the
770	necessary switches (this is the default), then it will print memory
771	usage statistics after compiling your code when C<< $ENV{PERL_DEBUG_MSTATS}
772	> 1 >>, and before termination of the program when C<<
773	$ENV{PERL_DEBUG_MSTATS} >= 1 >>. The report format is similar to
774	the following example:
775
776	$ PERL_DEBUG_MSTATS=2 perl -e "require Carp"
777	Memory allocation statistics after compilation: (buckets 4(4)..8188(8192)
778	14216 free: 130 117 28 7 9 0 2 2 1 0 0
779	437 61 36 0 5
780	60924 used: 125 137 161 55 7 8 6 16 2 0 1
781	74 109 304 84 20
782	Total sbrk(): 77824/21:119. Odd ends: pad+heads+chain+tail: 0+636+0+2048.
783	Memory allocation statistics after execution: (buckets 4(4)..8188(8192)
784	30888 free: 245 78 85 13 6 2 1 3 2 0 1
785	315 162 39 42 11
786	175816 used: 265 176 1112 111 26 22 11 27 2 1 1
787	196 178 1066 798 39
788	Total sbrk(): 215040/47:145. Odd ends: pad+heads+chain+tail: 0+2192+0+6144.
789
790	It is possible to ask for such a statistic at arbitrary points in
791	your execution using the mstat() function out of the standard
792	Devel::Peek module.
793
794	Here is some explanation of that format:
795
796	=over 4
797
798	=item C<buckets SMALLEST(APPROX)..GREATEST(APPROX)>
799
800	Perl's malloc() uses bucketed allocations. Every request is rounded
801	up to the closest bucket size available, and a bucket is taken from
802	the pool of buckets of that size.
803
804	The line above describes the limits of buckets currently in use.
805	Each bucket has two sizes: memory footprint and the maximal size
806	of user data that can fit into this bucket. In the above example,
807	the smallest bucket has size 4. The biggest bucket has a usable
808	size of 8188, and its memory footprint is 8192.
809
810	In a Perl built for debugging, some buckets may have negative usable
811	size. This means that these buckets cannot (and will not) be used.
812	For larger buckets, the memory footprint may be one page greater
813	than a power of 2. If so, the corresponding power of two is
814	printed in the C<APPROX> field above.
815
816	=item Free/Used
817
818	The 1 or 2 rows of numbers following that correspond to the number
819	of buckets of each size between C<SMALLEST> and C<GREATEST>. In
820	the first row, the sizes (memory footprints) of buckets are powers
821	of two--or possibly one page greater. In the second row, if present,
822	the memory footprints of the buckets are between the memory footprints
823	of two buckets "above".
824
825	For example, suppose that in the previous example the memory footprints
826	were
827
828	free: 8 16 32 64 128 256 512 1024 2048 4096 8192
829	4 12 24 48 80
830
831	With non-C<DEBUGGING> perl, the buckets starting from C<128> have
832	a 4-byte overhead, and thus an 8192-long bucket may take up to
833	8188-byte allocations.
834
835	=item C<Total sbrk(): SBRKed/SBRKs:CONTINUOUS>
836
837	The first two fields give the total amount of memory perl sbrk(2)ed
838	(ess-broken? :-) and number of sbrk(2)s used. The third number is
839	what perl thinks about continuity of returned chunks. So long as
840	this number is positive, malloc() will assume that it is probable
841	that sbrk(2) will provide continuous memory.
842
843	Memory allocated by external libraries is not counted.
844
845	=item C<pad: 0>
846
847	The amount of sbrk(2)ed memory needed to keep buckets aligned.
848
849	=item C<heads: 2192>
850
851	Although memory overhead of bigger buckets is kept inside the bucket, for
852	smaller buckets, it is kept in separate areas. This field gives the
853	total size of these areas.
854
855	=item C<chain: 0>
856
857	malloc() may want to subdivide a bigger bucket into smaller buckets.
858	If only a part of the deceased bucket is left unsubdivided, the rest
859	is kept as an element of a linked list. This field gives the total
860	size of these chunks.
861
862	=item C<tail: 6144>
863
864	To minimize the number of sbrk(2)s, malloc() asks for more memory. This
865	field gives the size of the yet unused part, which is sbrk(2)ed, but
866	never touched.
867
868	=back
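
The C<free> and C<used> totals in the sample report can be reproduced from the bucket counts. The sketch below is an inference from the sample figures rather than a description of how perl computes them: it assumes the footprints listed under C<Free/Used>, a 4-byte overhead for buckets of footprint 128 and larger, and no overhead for smaller ones. The helper names are hypothetical.

```perl
use strict;
use warnings;

# Memory footprints for the two rows of counts, as listed under Free/Used.
my @footprints = (
    [ 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192 ],
    [ 4, 12, 24, 48, 80 ],
);

# Assumption: buckets of footprint 128 and up lose 4 bytes to overhead.
sub usable_size {
    my ($footprint) = @_;
    return $footprint >= 128 ? $footprint - 4 : $footprint;
}

# Sum count * usable size over both rows of a free/used entry.
sub total_bytes {
    my @rows  = @_;
    my $total = 0;
    for my $r (0 .. $#rows) {
        my @counts = split ' ', $rows[$r];
        for my $i (0 .. $#counts) {
            $total += $counts[$i] * usable_size($footprints[$r][$i]);
        }
    }
    return $total;
}

# Counts copied from the "after compilation" part of the sample report.
print 'free: ', total_bytes('130 117 28 7 9 0 2 2 1 0 0', '437 61 36 0 5'), "\n";
print 'used: ', total_bytes('125 137 161 55 7 8 6 16 2 0 1', '74 109 304 84 20'), "\n";
```

Under these assumptions the sums come to 14216 and 60924, the figures printed at the start of the C<free> and C<used> lines in the sample report.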
869
870	=head1 SEE ALSO
871
872	L<perldebug>,
873	L<perlguts>,
874	L<perlrun>,
875	L<re>,
876	and
877	L<Devel::DProf>.
