=head1 NAME
X<reference> X<pointer> X<data structure> X<structure> X<struct>

perlref - Perl references and nested data structures

=head1 NOTE

This is complete documentation about all aspects of references.
For a shorter, tutorial introduction to just the essential features,
see L<perlreftut>.

=head1 DESCRIPTION

Before release 5 of Perl it was difficult to represent complex data
structures, because all references had to be symbolic--and even then
it was difficult to refer to a variable instead of a symbol table entry.
Perl now not only makes it easier to use symbolic references to variables,
but also lets you have "hard" references to any piece of data or code.
Any scalar may hold a hard reference.  Because arrays and hashes contain
scalars, you can now easily build arrays of arrays, arrays of hashes,
hashes of arrays, arrays of hashes of functions, and so on.

Hard references are smart--they keep track of reference counts for you,
automatically freeing the thing referred to when its reference count goes
to zero.  (Reference counts for values in self-referential or
cyclic data structures may not go to zero without a little help; see
L<perlobj/"Two-Phased Garbage Collection"> for a detailed explanation.)
If that thing happens to be an object, the object is destructed.  See
L<perlobj> for more about objects.  (In a sense, everything in Perl is an
object, but we usually reserve the word for references to objects that
have been officially "blessed" into a class package.)

Symbolic references are names of variables or other objects, just as a
symbolic link in a Unix filesystem contains merely the name of a file.
The C<*glob> notation is something of a symbolic reference.  (Symbolic
references are sometimes called "soft references", but please don't call
them that; references are confusing enough without useless synonyms.)
X<reference, symbolic> X<reference, soft>
X<symbolic reference> X<soft reference>

In contrast, hard references are more like hard links in a Unix file
system: They are used to access an underlying object without concern for
what its (other) name is.  When the word "reference" is used without an
adjective, as in the following paragraph, it is usually talking about a
hard reference.
X<reference, hard> X<hard reference>

References are easy to use in Perl.  There is just one overriding
principle: Perl does no implicit referencing or dereferencing.  When a
scalar is holding a reference, it always behaves as a simple scalar.  It
doesn't magically start being an array or hash or subroutine; you have to
tell it explicitly to do so, by dereferencing it.

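A minimal sketch of that principle follows.  The C<@colors> array and the
other names are illustrative assumptions, not part of the text above.  The
scalar holding the reference stays a scalar until it is dereferenced
explicitly.

```perl
use strict;
use warnings;

my @colors = ('red', 'green', 'blue');
my $ref    = \@colors;          # an ordinary scalar that holds a reference

# Used as a plain scalar, $ref does not turn into the array by itself;
# ref() merely reports what kind of thing it points to.
my $looks_like = ref $ref;

# An explicit dereference is required to reach the array.
my $count = scalar @$ref;
my $first = $ref->[0];

print "type=$looks_like count=$count first=$first\n";
```

Both C<@$ref> and C<< $ref->[0] >> are explicit dereferences; nothing
happens to C<$ref> unless one of them is written out.
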
=head2 Making References
X<reference, creation> X<referencing>

References can be created in several ways.

=over 4

=item 1.
X<\> X<backslash>

By using the backslash operator on a variable, subroutine, or value.
(This works much like the C<&> (address-of) operator in C.)
This typically creates I<another> reference to a variable, because
there's already a reference to the variable in the symbol table.  But
the symbol table reference might go away, and you'll still have the
reference that the backslash returned.  Here are some examples:

    $scalarref = \$foo;
    $arrayref  = \@ARGV;
    $hashref   = \%ENV;
    $coderef   = \&handler;
    $globref   = \*foo;

It isn't possible to create a true reference to an IO handle (filehandle
or dirhandle) using the backslash operator.  The most you can get is a
reference to a typeglob, which is actually a complete symbol table entry.
But see the explanation of the C<*foo{THING}> syntax below.  However,
you can still use typeglobs and globrefs as though they were IO handles.

=item 2.
X<array, anonymous> X<[> X<[]> X<square bracket>
X<bracket, square> X<arrayref> X<array reference> X<reference, array>

A reference to an anonymous array can be created using square
brackets:

    $arrayref = [1, 2, ['a', 'b', 'c']];

Here we've created a reference to an anonymous array of three elements
whose final element is itself a reference to another anonymous array of three
elements.  (The multidimensional syntax described later can be used to
access this.  For example, after the above, C<< $arrayref->[2][1] >> would have
the value "b".)

Taking a reference to an enumerated list is not the same
as using square brackets--instead it's the same as creating
a list of references!

    @list = (\$a, \@b, \%c);
    @list = \($a, @b, %c);      # same thing!

As a special case, C<\(@foo)> returns a list of references to the contents
of C<@foo>, not a reference to C<@foo> itself.  Likewise for C<%foo>,
except that the key references are to copies (since the keys are just
strings rather than full-fledged scalars).

=item 3.
X<hash, anonymous> X<{> X<{}> X<curly bracket>
X<bracket, curly> X<brace> X<hashref> X<hash reference> X<reference, hash>

A reference to an anonymous hash can be created using curly
brackets:

    $hashref = {
        'Adam'  => 'Eve',
        'Clyde' => 'Bonnie',
    };

Anonymous hash and array composers like these can be intermixed freely to
produce as complicated a structure as you want.  The multidimensional
syntax described below works for these too.  The values above are
literals, but variables and expressions would work just as well, because
assignment operators in Perl (even within local() or my()) are executable
statements, not compile-time declarations.

Because curly brackets (braces) are used for several other things
including BLOCKs, you may occasionally have to disambiguate braces at the
beginning of a statement by putting a C<+> or a C<return> in front so
that Perl realizes the opening brace isn't starting a BLOCK.  The economy and
mnemonic value of using curlies is deemed worth this occasional extra
hassle.

For example, if you wanted a function to make a new hash and return a
reference to it, you have these options:

    sub hashem {        { @_ } }   # silently wrong
    sub hashem {       +{ @_ } }   # ok
    sub hashem { return { @_ } }   # ok

On the other hand, if you want the other meaning, you can do this:

    sub showem {        { @_ } }   # ambiguous (currently ok, but may change)
    sub showem {       {; @_ } }   # ok
    sub showem { { return @_ } }   # ok

The leading C<+{> and C<{;> always serve to disambiguate
the expression to mean either the HASH reference, or the BLOCK.

=item 4.
X<subroutine, anonymous> X<subroutine, reference> X<reference, subroutine>
X<scope, lexical> X<closure> X<lexical> X<lexical scope>

A reference to an anonymous subroutine can be created by using
C<sub> without a subname:

    $coderef = sub { print "Boink!\n" };

Note the semicolon.  Except for the code
inside not being immediately executed, a C<sub {}> is not so much a
declaration as it is an operator, like C<do{}> or C<eval{}>.  (However, no
matter how many times you execute that particular line (unless you're in an
C<eval("...")>), $coderef will still have a reference to the I<same>
anonymous subroutine.)

Anonymous subroutines act as closures with respect to my() variables,
that is, variables lexically visible within the current scope.  Closure
is a notion out of the Lisp world that says if you define an anonymous
function in a particular lexical context, it pretends to run in that
context even when it's called outside the context.

In human terms, it's a funny way of passing arguments to a subroutine when
you define it as well as when you call it.  It's useful for setting up
little bits of code to run later, such as callbacks.  You can even
do object-oriented stuff with it, though Perl already provides a different
mechanism to do that--see L<perlobj>.

You might also think of closure as a way to write a subroutine
template without using eval().  Here's a small example of how
closures work:

    sub newprint {
        my $x = shift;
        return sub { my $y = shift; print "$x, $y!\n"; };
    }
    $h = newprint("Howdy");
    $g = newprint("Greetings");

    # Time passes...

    &$h("world");
    &$g("earthlings");

This prints

    Howdy, world!
    Greetings, earthlings!

Note particularly that $x continues to refer to the value passed
into newprint() I<despite> "my $x" having gone out of scope by the
time the anonymous subroutine runs.  That's what a closure is all
about.

This applies only to lexical variables, by the way.  Dynamic variables
continue to work as they have always worked.  Closure is not something
that most Perl programmers need trouble themselves about to begin with.

=item 5.
X<constructor> X<new>

References are often returned by special subroutines called constructors.
Perl objects are just references to a special type of object that happens to know
which package it's associated with.  Constructors are just special
subroutines that know how to create that association.  They do so by
starting with an ordinary reference, and it remains an ordinary reference
even while it's also being an object.  Constructors are often
named new() and called indirectly:

    $objref = new Doggie (Tail => 'short', Ears => 'long');

But don't have to be:

    $objref   = Doggie->new(Tail => 'short', Ears => 'long');

    use Term::Cap;
    $terminal = Term::Cap->Tgetent( { OSPEED => 9600 });

    use Tk;
    $main    = MainWindow->new();
    $menubar = $main->Frame(-relief      => "raised",
                            -borderwidth => 2);

=item 6.
X<autovivification>

References of the appropriate type can spring into existence if you
dereference them in a context that assumes they exist.  Because we haven't
talked about dereferencing yet, we can't show you any examples yet.

=item 7.
X<*foo{THING}> X<*>

A reference can be created by using a special syntax, lovingly known as
the *foo{THING} syntax.  *foo{THING} returns a reference to the THING
slot in *foo (which is the symbol table entry which holds everything
known as foo).

    $scalarref = *foo{SCALAR};
    $arrayref  = *ARGV{ARRAY};
    $hashref   = *ENV{HASH};
    $coderef   = *handler{CODE};
    $ioref     = *STDIN{IO};
    $globref   = *foo{GLOB};
    $formatref = *foo{FORMAT};

All of these are self-explanatory except for C<*foo{IO}>.  It returns
the IO handle, used for file handles (L<perlfunc/open>), sockets
(L<perlfunc/socket> and L<perlfunc/socketpair>), and directory
handles (L<perlfunc/opendir>).  For compatibility with previous
versions of Perl, C<*foo{FILEHANDLE}> is a synonym for C<*foo{IO}>, though it
is deprecated as of 5.8.0.  If deprecation warnings are in effect, it will warn
of its use.

C<*foo{THING}> returns undef if that particular THING hasn't been used yet,
except in the case of scalars.  C<*foo{SCALAR}> returns a reference to an
anonymous scalar if $foo hasn't been used yet.  This might change in a
future release.

C<*foo{IO}> is an alternative to the C<*HANDLE> mechanism given in
L<perldata/"Typeglobs and Filehandles"> for passing filehandles
into or out of subroutines, or storing into larger data structures.
Its disadvantage is that it won't create a new filehandle for you.
Its advantage is that you have less risk of clobbering more than
you want to with a typeglob assignment.  (It still conflates file
and directory handles, though.)  However, if you assign the incoming
value to a scalar instead of a typeglob as we do in the examples
below, there's no risk of that happening.

    splutter(*STDOUT);          # pass the whole glob
    splutter(*STDOUT{IO});      # pass both file and dir handles

    sub splutter {
        my $fh = shift;
        print $fh "her um well a hmmm\n";
    }

    $rec = get_rec(*STDIN);     # pass the whole glob
    $rec = get_rec(*STDIN{IO}); # pass both file and dir handles

    sub get_rec {
        my $fh = shift;
        return scalar <$fh>;
    }

=back

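To make the distinction drawn in item 2 above concrete, the sketch below
contrasts C<\@foo>, C<[@foo]> and C<\(@foo)> on one small array.  The
variable names are illustrative assumptions only.

```perl
use strict;
use warnings;

my @foo = (1, 2, 3);

my $same  = \@foo;      # reference to @foo itself
my $copy  = [@foo];     # reference to a new anonymous array holding copies
my @elems = \(@foo);    # list of references, one per element of @foo

${ $elems[0] } = 10;    # writes through to $foo[0]
push @foo, 4;           # visible through $same, not through $copy

print "same:  @$same\n";
print "copy:  @$copy\n";
print "elems: ", scalar(@elems), " references of type ", ref($elems[0]), "\n";
```

Only C<$copy> is insulated from later changes to C<@foo>; the other two
forms keep pointing at the original storage.
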
=head2 Using References
X<reference, use> X<dereferencing> X<dereference>

That's it for creating references.  By now you're probably dying to
know how to use references to get back to your long-lost data.  There
are several basic methods.

=over 4

=item 1.

Anywhere you'd put an identifier (or chain of identifiers) as part
of a variable or subroutine name, you can replace the identifier with
a simple scalar variable containing a reference of the correct type:

    $bar = $$scalarref;
    push(@$arrayref, $filename);
    $$arrayref[0] = "January";
    $$hashref{"KEY"} = "VALUE";
    &$coderef(1,2,3);
    print $globref "output\n";

It's important to understand that we are specifically I<not> dereferencing
C<$arrayref[0]> or C<$hashref{"KEY"}> there.  The dereference of the
scalar variable happens I<before> it does any key lookups.  Anything more
complicated than a simple scalar variable must use methods 2 or 3 below.
However, a "simple scalar" includes an identifier that itself uses method
1 recursively.  Therefore, the following prints "howdy".

    $refrefref = \\\"howdy";
    print $$$$refrefref;

=item 2.
X<${}> X<@{}> X<%{}>

Anywhere you'd put an identifier (or chain of identifiers) as part of a
variable or subroutine name, you can replace the identifier with a
BLOCK returning a reference of the correct type.  In other words, the
previous examples could be written like this:

    $bar = ${$scalarref};
    push(@{$arrayref}, $filename);
    ${$arrayref}[0] = "January";
    ${$hashref}{"KEY"} = "VALUE";
    &{$coderef}(1,2,3);
    $globref->print("output\n");  # iff IO::Handle is loaded

Admittedly, it's a little silly to use the curlies in this case, but
the BLOCK can contain any arbitrary expression, in particular,
subscripted expressions:

    &{ $dispatch{$index} }(1,2,3);      # call correct routine

Because of being able to omit the curlies for the simple case of C<$$x>,
people often make the mistake of viewing the dereferencing symbols as
proper operators, and wonder about their precedence.  If they were,
though, you could use parentheses instead of braces.  That's not the case.
Consider the difference below; case 0 is a short-hand version of case 1,
I<not> case 2:

    $$hashref{"KEY"}   = "VALUE";       # CASE 0
    ${$hashref}{"KEY"} = "VALUE";       # CASE 1
    ${$hashref{"KEY"}} = "VALUE";       # CASE 2
    ${$hashref->{"KEY"}} = "VALUE";     # CASE 3

Case 2 is also deceptive in that you're accessing a variable
called %hashref, not dereferencing through $hashref to the hash
it's presumably referencing.  That would be case 3.

=item 3.
X<autovivification> X<< -> >> X<arrow>

Subroutine calls and lookups of individual array elements arise often
enough that it gets cumbersome to use method 2.  As a form of
syntactic sugar, the examples for method 2 may be written:

    $arrayref->[0] = "January";   # Array element
    $hashref->{"KEY"} = "VALUE";  # Hash element
    $coderef->(1,2,3);            # Subroutine call

The left side of the arrow can be any expression returning a reference,
including a previous dereference.  Note that C<$array[$x]> is I<not> the
same thing as C<< $array->[$x] >> here:

    $array[$x]->{"foo"}->[0] = "January";

This is one of the cases we mentioned earlier in which references could
spring into existence when in an lvalue context.  Before this
statement, C<$array[$x]> may have been undefined.  If so, it's
automatically defined with a hash reference so that we can look up
C<{"foo"}> in it.  Likewise C<< $array[$x]->{"foo"} >> will automatically get
defined with an array reference so that we can look up C<[0]> in it.
This process is called I<autovivification>.

One more thing here.  The arrow is optional I<between> bracket
subscripts, so you can shrink the above down to

    $array[$x]{"foo"}[0] = "January";

Which, in the degenerate case of using only ordinary arrays, gives you
multidimensional arrays just like C's:

    $score[$x][$y][$z] += 42;

Well, okay, not entirely like C's arrays, actually.  C doesn't know how
to grow its arrays on demand.  Perl does.

=item 4.
X<encapsulation>

If a reference happens to be a reference to an object, then there are
probably methods to access the things referred to, and you should probably
stick to those methods unless you're in the class package that defines the
object's methods.  In other words, be nice, and don't violate the object's
encapsulation without a very good reason.  Perl does not enforce
encapsulation.  We are not totalitarians here.  We do expect some basic
civility though.

=back

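The three notations from methods 1 to 3 can be checked against each other,
and autovivification observed directly, with a short sketch like the one
below.  The data values and variable names are illustrative assumptions.

```perl
use strict;
use warnings;

my $arrayref = ['January', 'February'];
my $hashref  = { KEY => 'VALUE' };
my $coderef  = sub { return join '-', @_ };

# Methods 1, 2 and 3 all reach the same element, value, or subroutine.
my @same_elem = ($$arrayref[0],    ${$arrayref}[0],     $arrayref->[0]);
my @same_val  = ($$hashref{KEY},   ${$hashref}{KEY},    $hashref->{KEY});
my @same_call = (&$coderef(1, 2),  &{$coderef}(1, 2),   $coderef->(1, 2));

# Autovivification: @array starts out empty, yet the nested assignment
# works because the intermediate hash and array references are created
# on demand.
my @array;
$array[2]{foo}[0] = 'January';
my $outer_type = ref $array[2];         # type of the vivified slot
my $inner_type = ref $array[2]{foo};    # type one level further down

print "elements: @same_elem\n";
print "values:   @same_val\n";
print "calls:    @same_call\n";
print "vivified: $outer_type containing $inner_type\n";
```

Note that C<@array> also grew to three elements along the way, with the
first two left undefined.
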
Using a string or number as a reference produces a symbolic reference,
as explained above.  Using a reference as a number produces an
integer representing its storage location in memory.  The only
useful thing to be done with this is to compare two references
numerically to see whether they refer to the same location.
X<reference, numeric context>

    if ($ref1 == $ref2) {  # cheap numeric compare of references
        print "refs 1 and 2 refer to the same thing\n";
    }

Using a reference as a string produces both its referent's type,
including any package blessing as described in L<perlobj>, and
the numeric address expressed in hex.  The ref() operator returns
just the type of thing the reference is pointing to, without the
address.  See L<perlfunc/ref> for details and examples of its use.
X<reference, string context>

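The numeric, string, and ref() views of a reference can be compared side by
side.  This is a small sketch; the C<Doggie> class name is borrowed from the
constructor example earlier, and no such package needs to exist for
bless() to work.

```perl
use strict;
use warnings;

my @data = (1, 2, 3);
my $ref1 = \@data;
my $ref2 = \@data;          # second reference to the same array
my $ref3 = [1, 2, 3];       # equal contents, different storage

# Numeric comparison tests identity of location, not equality of contents.
my $same = ($ref1 == $ref2) ? 1 : 0;
my $diff = ($ref1 == $ref3) ? 1 : 0;

# String context gives type plus address; ref() gives the type alone.
my $text = "$ref1";         # something like ARRAY(0x55d0c8a3b2f0)
my $type = ref $ref1;

# A blessed reference reports its package in both views.
my $obj      = bless {}, 'Doggie';
my $obj_text = "$obj";      # something like Doggie=HASH(0x...)
my $obj_type = ref $obj;

print "$text is of type $type\n";
print "$obj_text is of type $obj_type\n";
```

The exact addresses differ from run to run, so only the shape of the
string should be relied on.
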
The bless() operator may be used to associate the object a reference
points to with a package functioning as an object class.  See L<perlobj>.

A typeglob may be dereferenced the same way a reference can, because
the dereference syntax always indicates the type of reference desired.
So C<${*foo}> and C<${\$foo}> both indicate the same scalar variable.

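A brief sketch of that equivalence, assuming package variables C<$foo> and
C<@foo> declared with C<our> purely for illustration:

```perl
use strict;
use warnings;

our $foo = 'scalar slot';
our @foo = (1, 2, 3);

# The sigil in front of the braces selects which slot of *foo is used.
my $via_glob = ${*foo};         # scalar slot, reached through the typeglob
my $via_ref  = ${\$foo};        # same scalar, reached through a hard reference
my $count    = scalar @{*foo};  # array slot of the same typeglob

# Assigning through the typeglob dereference changes $foo itself.
${*foo} = 'changed';

print "$via_glob / $via_ref / $count / $foo\n";
```

Because the typeglob only exists for package variables, the same trick
does not apply to lexicals declared with my().
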
Here's a trick for interpolating a subroutine call into a string:

    print "My sub returned @{[mysub(1,2,3)]} that time.\n";

The way it works is that when the C<@{...}> is seen in the double-quoted
string, it's evaluated as a block.  The block creates a reference to an
anonymous array containing the results of the call to C<mysub(1,2,3)>.  So
the whole block returns a reference to an array, which is then
dereferenced by C<@{...}> and stuck into the double-quoted string.  This
chicanery is also useful for arbitrary expressions:

    print "That yields @{[$n + 5]} widgets\n";

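One consequence worth seeing in action is that the expression inside
C<@{[ ... ]}> is evaluated in list context.  The sketch below uses a
hypothetical C<mysub> that doubles its arguments, and also shows the
related C<${\ ... }> form for a single scalar value.

```perl
use strict;
use warnings;

# Hypothetical stand-in for mysub(): returns a list, one item per argument.
sub mysub { return map { $_ * 2 } @_ }

my @items = qw(a b c);
my $n     = 5;

# List context: every returned element is interpolated, joined by $"
# (a single space by default).
my $list_line = "My sub returned @{[ mysub(1, 2, 3) ]} that time.";

# Arbitrary expression.
my $sum_line = "That yields @{[ $n + 5 ]} widgets";

# Scalar variant: a reference to one scalar value, dereferenced with ${ }.
my $count_line = "There are ${\ scalar(@items)} items";

print "$list_line\n$sum_line\n$count_line\n";
```

The explicit scalar() in the last line matters, because the backslash
operator would otherwise see C<@items> itself rather than its count.
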
=head2 Symbolic references
X<reference, symbolic> X<reference, soft>
X<symbolic reference> X<soft reference>

We said that references spring into existence as necessary if they are
undefined, but we didn't say what happens if a value used as a
reference is already defined, but I<isn't> a hard reference.  If you
use it as a reference, it'll be treated as a symbolic
reference.  That is, the value of the scalar is taken to be the I<name>
of a variable, rather than a direct link to a (possibly) anonymous
value.

People frequently expect it to work like this.  So it does.

    $name = "foo";
    $$name = 1;                 # Sets $foo
    ${$name} = 2;               # Sets $foo
    ${$name x 2} = 3;           # Sets $foofoo
    $name->[0] = 4;             # Sets $foo[0]
    @$name = ();                # Clears @foo
    &$name();                   # Calls &foo() (as in Perl 4)
    $pack = "THAT";
    ${"${pack}::$name"} = 5;    # Sets $THAT::foo without eval

This is powerful, and slightly dangerous, in that it's possible
to intend (with the utmost sincerity) to use a hard reference, and
accidentally use a symbolic reference instead.  To protect against
that, you can say

    use strict 'refs';

and then only hard references will be allowed for the rest of the enclosing
block.  An inner block may countermand that with

    no strict 'refs';

Only package variables (globals, even if localized) are visible to
symbolic references.  Lexical variables (declared with my()) aren't in
a symbol table, and thus are invisible to this mechanism.  For example:

    local $value = 10;
    $ref = "value";
    {
        my $value = 20;
        print $$ref;
    }

This will still print 10, not 20.  Remember that local() affects package
variables, which are all "global" to the package.

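The effect of C<use strict 'refs'> can be observed by trapping the error
with eval.  The following is a sketch under the assumption that it runs in
package C<main>; the variable names are illustrative.

```perl
use strict;
use warnings;

our $value = 10;          # package variable, lives in the symbol table
my  $name  = 'value';     # a plain string, not a hard reference

# With strict refs relaxed, the string is treated as a variable name.
my $seen;
{
    no strict 'refs';
    my $value = 20;       # lexical; invisible to symbolic references
    $seen = ${$name};     # reads the package variable, not the lexical
}

# Under strict refs the same dereference is a run-time error.
my $ok  = eval { my $oops = ${$name}; 1 };
my $err = $ok ? '' : $@;

print "symbolic lookup saw $seen\n";
print "strict refs said: $err";
```

Keeping C<no strict 'refs'> confined to the smallest possible block, as
here, limits the risk of an accidental symbolic reference elsewhere.
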
=head2 Not-so-symbolic references

A new feature contributing to readability in perl version 5.001 is that the
brackets around a symbolic reference behave more like quotes, just as they
always have within a string.  That is,

    $push = "pop on ";
    print "${push}over";

has always meant to print "pop on over", even though push is
a reserved word.  This has been generalized to work the same outside
of quotes, so that

    print ${push} . "over";

and even

    print ${ push } . "over";

will have the same effect.  (This would have been a syntax error in
Perl 5.000, though Perl 4 allowed it in the spaceless form.)  This
construct is I<not> considered to be a symbolic reference when you're
using strict refs:

    use strict 'refs';
    ${ bareword };      # Okay, means $bareword.
    ${ "bareword" };    # Error, symbolic reference.

Similarly, because of all the subscripting that is done using single
words, we've applied the same rule to any bareword that is used for
subscripting a hash.  So now, instead of writing

    $array{ "aaa" }{ "bbb" }{ "ccc" }

you can write just

    $array{ aaa }{ bbb }{ ccc }

and not worry about whether the subscripts are reserved words.  In the
rare event that you do wish to do something like

    $array{ shift }

you can force interpretation as a reserved word by adding anything that
makes it more than a bareword:

    $array{ shift() }
    $array{ +shift }
    $array{ shift @_ }

The C<use warnings> pragma or the B<-w> switch will warn you if it
interprets a reserved word as a string.
But it will no longer warn you about using lowercase words, because the
string is effectively quoted.

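The difference between a bareword subscript and a forced function call is
easy to miss, so here is a small sketch.  The C<%table> hash and the three
helper subroutines are hypothetical names chosen for the illustration.

```perl
use strict;
use warnings;

my %table = (
    shift => 'looked up the literal key "shift"',
    first => 'looked up the first argument',
);

sub by_bareword { return $table{shift} }        # key is the string "shift"
sub by_call     { return $table{ shift() } }    # key is the first argument
sub by_plus     { return $table{ +shift } }     # unary plus also forces the call

my $bare = by_bareword('first');
my $call = by_call('first');
my $plus = by_plus('first');

print "bareword: $bare\n";
print "call:     $call\n";
print "plus:     $plus\n";
```

All three subroutines receive the same argument; only the last two
actually consume it.
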
=head2 Pseudo-hashes: Using an array as a hash
X<pseudo-hash> X<pseudo hash> X<pseudohash>

B<WARNING>:  This section describes an experimental feature.  Details may
change without notice in future versions.

B<NOTE>: The current user-visible implementation of pseudo-hashes
(the weird use of the first array element) is deprecated starting from
Perl 5.8.0 and will be removed in Perl 5.10.0, and the feature will be
implemented differently.  Not only is the current interface rather ugly,
but the current implementation slows down normal array and hash use quite
noticeably.  The 'fields' pragma interface will remain available.

Beginning with release 5.005 of Perl, you may use an array reference
in some contexts that would normally require a hash reference.  This
allows you to access array elements using symbolic names, as if they
were fields in a structure.

For this to work, the array must contain extra information.  The first
element of the array has to be a hash reference that maps field names
to array indices.  Here is an example:

    $struct = [{foo => 1, bar => 2}, "FOO", "BAR"];

    $struct->{foo};  # same as $struct->[1], i.e. "FOO"
    $struct->{bar};  # same as $struct->[2], i.e. "BAR"

    keys %$struct;   # will return ("foo", "bar") in some order
    values %$struct; # will return ("FOO", "BAR") in the same order

    while (my($k,$v) = each %$struct) {
        print "$k => $v\n";
    }

Perl will raise an exception if you try to access nonexistent fields.
To avoid inconsistencies, always use the fields::phash() function
provided by the C<fields> pragma.

    use fields;
    $pseudohash = fields::phash(foo => "FOO", bar => "BAR");

For better performance, Perl can also do the translation from field
names to array indices at compile time for typed object references.
See L<fields>.

There are two ways to check for the existence of a key in a
pseudo-hash.  The first is to use exists().  This checks to see if the
given field has ever been set.  It acts this way to match the behavior
of a regular hash.  For instance:

    use fields;
    $phash = fields::phash([qw(foo bar pants)], ['FOO']);
    $phash->{pants} = undef;

    print exists $phash->{foo};    # true, 'foo' was set in the declaration
    print exists $phash->{bar};    # false, 'bar' has not been used.
    print exists $phash->{pants};  # true, your 'pants' have been touched

The second is to use exists() on the hash reference sitting in the
first array element.  This checks to see if the given key is a valid
field in the pseudo-hash.

    print exists $phash->[0]{bar};    # true, 'bar' is a valid field
    print exists $phash->[0]{shoes};  # false, 'shoes' can't be used

delete() on a pseudo-hash element only deletes the value corresponding
to the key, not the key itself.  To delete the key, you'll have to
explicitly delete it from the first hash element.

    print delete $phash->{foo};     # prints $phash->[1], "FOO"
    print exists $phash->{foo};     # false
    print exists $phash->[0]{foo};  # true, key still exists
    print delete $phash->[0]{foo};  # now key is gone
    print $phash->{foo};            # runtime exception

| 637 | =head2 Function Templates
|
---|
| 638 | X<scope, lexical> X<closure> X<lexical> X<lexical scope>
|
---|
| 639 | X<subroutine, nested> X<sub, nested> X<subroutine, local> X<sub, local>
|
---|
| 640 |
|
---|
| 641 | As explained above, an anonymous function with access to the lexical
|
---|
| 642 | variables visible when that function was compiled creates a closure. It
|
---|
| 643 | retains access to those variables even though it doesn't get run until
|
---|
| 644 | later, such as in a signal handler or a Tk callback.
|
---|
| 645 |
|
---|
| 646 | Using a closure as a function template allows us to generate many functions
|
---|
| 647 | that act similarly. Suppose you wanted functions named after the colors
|
---|
| 648 | that generated HTML font changes for the various colors:
|
---|
| 649 |
|
---|
| 650 | print "Be ", red("careful"), " with that ", green("light");
|
---|
| 651 |
|
---|
| 652 | The red() and green() functions would be similar. To create these,
|
---|
| 653 | we'll assign a closure to a typeglob of the name of the function we're
|
---|
| 654 | trying to build.
|
---|
| 655 |
|
---|
| 656 | @colors = qw(red blue green yellow orange purple violet);
|
---|
| 657 | for my $name (@colors) {
|
---|
| 658 | no strict 'refs'; # allow symbol table manipulation
|
---|
| 659 | *$name = *{uc $name} = sub { "<FONT COLOR='$name'>@_</FONT>" };
|
---|
| 660 | }
|
---|
| 661 |
|
---|
| 662 | Now all those different functions appear to exist independently. You can
|
---|
| 663 | call red(), RED(), blue(), BLUE(), green(), etc. This technique saves on
|
---|
| 664 | both compile time and memory use, and is less error-prone as well, since
|
---|
| 665 | syntax checks happen at compile time. It's critical that any variables in
|
---|
| 666 | the anonymous subroutine be lexicals in order to create a proper closure.
|
---|
| 667 | That's the reason for the C<my> on the loop iteration variable.
|
---|
| 668 |
|
---|
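As a minimal sketch of the same template idea, the generated closures can also be kept in a hash instead of the symbol table, which avoids C<no strict 'refs'>. The names C<%tag_for>, C<$careful>, and C<$light> are hypothetical and chosen only for this illustration; this is a variation, not the technique shown above.

```perl
use strict;
use warnings;

# Each pass through the loop creates a fresh lexical $name,
# and each anonymous sub captures its own copy of it.
my %tag_for;
for my $name (qw(red blue green)) {
    $tag_for{$name} = sub { "<FONT COLOR='$name'>@_</FONT>" };
}

my $careful = $tag_for{red}->("careful");
my $light   = $tag_for{green}->("light");

print "Be $careful with that $light\n";
```

The trade-off is that callers must write C<< $tag_for{red}->(...) >> rather than C<red(...)>, so this form suits cases where the functions do not need to look like ordinary named subroutines.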
| 669 | This is one of the few places where giving a prototype to a closure makes
|
---|
| 670 | much sense. If you wanted to impose scalar context on the arguments of
|
---|
| 671 | these functions (probably not a wise idea for this particular example),
|
---|
| 672 | you could have written it this way instead:
|
---|
| 673 |
|
---|
| 674 | *$name = sub ($) { "<FONT COLOR='$name'>$_[0]</FONT>" };
|
---|
| 675 |
|
---|
| 676 | However, since prototype checking happens at compile time, the assignment
|
---|
| 677 | above happens too late to be of much use. You could address this by
|
---|
| 678 | putting the whole loop of assignments within a BEGIN block, forcing it
|
---|
| 679 | to occur during compilation.
|
---|
| 680 |
|
---|
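The BEGIN-block approach described above might look like the following sketch. The variables C<@words> and C<$html> are hypothetical; the point is only that the prototyped closures are installed before the later call is compiled, so the C<($)> prototype is honored there.

```perl
use strict;
use warnings;

BEGIN {
    # Runs during compilation, so the prototypes are known
    # by the time the calls further down are parsed.
    for my $name (qw(red blue green)) {
        no strict 'refs';    # allow symbol table manipulation
        *$name = sub ($) { "<FONT COLOR='$name'>$_[0]</FONT>" };
    }
}

my @words = ("stop", "now");

# The ($) prototype imposes scalar context on the argument,
# so @words is passed as its element count.
my $html = red(@words);
print "$html\n";    # prints <FONT COLOR='red'>2</FONT>
```

This also illustrates why the text calls scalar context "probably not a wise idea" for this example: an array argument silently turns into a number.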
| 681 | Access to lexicals that change over time--like those in the C<for> loop
|
---|
| 682 | above--only works with closures, not general subroutines. In the general
|
---|
| 683 | case, then, named subroutines do not nest properly, although anonymous
|
---|
| 684 | ones do. This is because named subroutines are created (and capture any
|
---|
| 685 | outer lexicals) only once at compile time, whereas anonymous subroutines
|
---|
| 686 | get to capture each time you execute the 'sub' operator. If you are
|
---|
| 687 | accustomed to using nested subroutines in other programming languages with
|
---|
| 688 | their own private variables, you'll have to work at it a bit in Perl. The
|
---|
| 689 | intuitive coding of this type of thing incurs mysterious warnings about
|
---|
| 690 | "will not stay shared". For example, this won't work:
|
---|
| 691 |
|
---|
| 692 | sub outer {
|
---|
| 693 | my $x = $_[0] + 35;
|
---|
| 694 | sub inner { return $x * 19 } # WRONG
|
---|
| 695 | return $x + inner();
|
---|
| 696 | }
|
---|
| 697 |
|
---|
| 698 | A work-around is the following:
|
---|
| 699 |
|
---|
| 700 | sub outer {
|
---|
| 701 | my $x = $_[0] + 35;
|
---|
| 702 | local *inner = sub { return $x * 19 };
|
---|
| 703 | return $x + inner();
|
---|
| 704 | }
|
---|
| 705 |
|
---|
| 706 | Now inner() can only be called from within outer(), because of the
|
---|
| 707 | temporary assignment of the closure (anonymous subroutine). But when
|
---|
| 708 | it is called, it has normal access to the lexical variable $x from the scope
|
---|
| 709 | of outer().
|
---|
| 710 |
|
---|
| 711 | This has the interesting effect of creating a function local to another
|
---|
| 712 | function, something not normally supported in Perl.
|
---|
| 713 |
|
---|
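If the inner function does not need a name in the symbol table, a related sketch keeps the closure in a lexical scalar instead of a localized typeglob. This is an alternative to the work-around above, not a restatement of it; C<$inner>, C<$first>, and C<$second> are hypothetical names.

```perl
use strict;
use warnings;

sub outer {
    my $x = $_[0] + 35;
    # A new closure is built on every call to outer(),
    # so it always sees the current $x.
    my $inner = sub { return $x * 19 };
    return $x + $inner->();
}

my $first  = outer(1);    # $x is 36: 36 + 36*19
my $second = outer(2);    # $x is 37: 37 + 37*19

print "$first $second\n";    # prints 720 740
```

Because the closure is created anew each time C<outer()> runs, the second call sees its own C<$x> rather than a stale value from the first call, which is exactly what the named C<sub inner> version fails to do.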
| 714 | =head1 WARNING
|
---|
| 715 | X<reference, string context> X<reference, use as hash key>
|
---|
| 716 |
|
---|
| 717 | You may not (usefully) use a reference as the key to a hash. It will be
|
---|
| 718 | converted into a string:
|
---|
| 719 |
|
---|
| 720 | $x{ \$a } = $a;
|
---|
| 721 |
|
---|
| 722 | If you try to dereference the key, it won't do a hard dereference, and
|
---|
| 723 | you won't accomplish what you're attempting. You might want to do something
|
---|
| 724 | more like
|
---|
| 725 |
|
---|
| 726 | $r = \@a;
|
---|
| 727 | $x{ $r } = $r;
|
---|
| 728 |
|
---|
| 729 | And then at least you can use the values(), which will be
|
---|
| 730 | real refs, instead of the keys(), which won't.
|
---|
| 731 |
|
---|
| 732 | The standard Tie::RefHash module provides a convenient workaround to this.
|
---|
| 733 |
|
---|
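The difference between an ordinary hash and one tied to Tie::RefHash can be seen in a short sketch. The variable names here (C<%plain>, C<%by_ref>, and so on) are hypothetical and serve only as illustration.

```perl
use strict;
use warnings;
use Tie::RefHash;

my @a = (1, 2, 3);
my $r = \@a;

# Ordinary hash: the key is stringified, e.g. "ARRAY(0x...)".
my %plain;
$plain{$r} = "value";
my ($plain_key) = keys %plain;

# Tied hash: keys() hands back the original reference.
tie my %by_ref, 'Tie::RefHash';
$by_ref{$r} = "value";
my ($ref_key) = keys %by_ref;

my $plain_kind = ref($plain_key) ? "reference" : "string";
my $tied_kind  = ref($ref_key)   ? "reference" : "string";

print "plain key is a $plain_kind\n";    # prints: plain key is a string
print "tied key is a $tied_kind\n";      # prints: tied key is a reference
print "elements: @$ref_key\n";           # prints: elements: 1 2 3
```

Only the key retrieved from the tied hash can be dereferenced to reach C<@a> again; the key from the ordinary hash is just text.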
| 734 | =head1 SEE ALSO
|
---|
| 735 |
|
---|
| 736 | Besides the obvious documents, source code can be instructive.
|
---|
| 737 | Some pathological examples of the use of references can be found
|
---|
| 738 | in the F<t/op/ref.t> regression test in the Perl source directory.
|
---|
| 739 |
|
---|
| 740 | See also L<perldsc> and L<perllol> for how to use references to create
|
---|
| 741 | complex data structures, and L<perltoot>, L<perlobj>, and L<perlbot>
|
---|
| 742 | for how to use them to create objects.
|
---|