=head1 NAME

perlthrtut - tutorial on threads in Perl

=head1 DESCRIPTION

B<NOTE>: this tutorial describes the new Perl threading flavour
introduced in Perl 5.6.0 called interpreter threads, or B<ithreads>
for short. In this model each thread runs in its own Perl interpreter,
and any data sharing between threads must be explicit.

There is another older Perl threading flavour called the 5.005 model,
unsurprisingly for 5.005 versions of Perl. The old model is known to
have problems, is deprecated, and will probably be removed around release
5.10. You are strongly encouraged to migrate any existing 5.005
threads code to the new model as soon as possible.

You can see which (or neither) threading flavour you have by
running C<perl -V> and looking at the C<Platform> section.
If you have C<useithreads=define> you have ithreads; if you
have C<use5005threads=define> you have 5.005 threads.
If you have neither, you don't have any thread support built in.
If you have both, you are in trouble.

The user-level interface to the 5.005 threads was via the L<Thread>
class, while ithreads uses the L<threads> class. Note the change in case.

=head1 Status

The ithreads code has been available since Perl 5.6.0, and is considered
stable. The user-level interface to ithreads (the L<threads> classes)
appeared in the 5.8.0 release, and is currently considered stable,
although, as with all new features, it should be treated with caution.

=head1 What Is A Thread Anyway?

A thread is a flow of control through a program with a single
execution point.

Sounds an awful lot like a process, doesn't it? Well, it should.
Threads are one of the pieces of a process. Every process has at least
one thread and, up until now, every process running Perl had only one
thread. With 5.8, though, you can create extra threads. We're going
to show you how, when, and why.

=head1 Threaded Program Models

There are three basic ways that you can structure a threaded
program. Which model you choose depends on what you need your program
to do. For many non-trivial threaded programs you'll need to choose
different models for different pieces of your program.

=head2 Boss/Worker

The boss/worker model usually has one "boss" thread and one or more
"worker" threads. The boss thread gathers or generates tasks that need
to be done, then parcels those tasks out to the appropriate worker
thread.

This model is common in GUI and server programs, where a main thread
waits for some event and then passes that event to the appropriate
worker threads for processing. Once the event has been passed on, the
boss thread goes back to waiting for another event.

The boss thread does relatively little work. While tasks aren't
necessarily performed faster than with any other method, this model
tends to have the best user-response times.

=head2 Work Crew

In the work crew model, several threads are created that do
essentially the same thing to different pieces of data. It closely
mirrors classical parallel processing and vector processors, where a
large array of processors do the exact same thing to many pieces of
data.

This model is particularly useful if the system running the program
will distribute multiple threads across different processors. It can
also be useful in ray tracing or rendering engines, where the
individual threads can pass on interim results to give the user visual
feedback.

=head2 Pipeline

The pipeline model divides up a task into a series of steps, and
passes the results of one step on to the thread processing the
next. Each thread does one thing to each piece of data and passes the
results to the next thread in line.

This model makes the most sense if you have multiple processors so two
or more threads will be executing in parallel, though it can often
make sense in other contexts as well. It tends to keep the individual
tasks small and simple, as well as allowing some parts of the pipeline
to block (on I/O or system calls, for example) while other parts keep
going. If you're running different parts of the pipeline on different
processors you may also take advantage of the caches on each
processor.

This model is also handy for a form of recursive programming where,
rather than having a subroutine call itself, it instead creates
another thread. Prime and Fibonacci generators both map well to this
form of the pipeline model. (A version of a prime number generator is
presented later on.)

=head1 What kind of threads are Perl threads?

If you have experience with other thread implementations, you might
find that things aren't quite what you expect. It's very important to
remember when dealing with Perl threads that Perl Threads Are Not X
Threads, for all values of X. They aren't POSIX threads, or
DecThreads, or Java's Green threads, or Win32 threads. There are
similarities, and the broad concepts are the same, but if you start
looking for implementation details you're going to be either
disappointed or confused. Possibly both.

This is not to say that Perl threads are completely different from
everything that's ever come before--they're not. Perl's threading
model owes a lot to other thread models, especially POSIX. Just as
Perl is not C, though, Perl threads are not POSIX threads. So if you
find yourself looking for mutexes, or thread priorities, it's time to
step back a bit and think about what you want to do and how Perl can
do it.

However, it is important to remember that Perl threads cannot magically
do things unless your operating system's threads allow it. So if your
system blocks the entire process on sleep(), Perl usually will as well.

Perl Threads Are Different.

=head1 Thread-Safe Modules

The addition of threads has changed Perl's internals
substantially. There are implications for people who write
modules with XS code or external libraries. However, since Perl data is
not shared among threads by default, Perl modules stand a high chance of
being thread-safe or can be made thread-safe easily. Modules that are not
tagged as thread-safe should be tested or code reviewed before being used
in production code.

Not all modules that you might use are thread-safe, and you should
always assume a module is unsafe unless the documentation says
otherwise. This includes modules that are distributed as part of the
core. Threads are a new feature, and even some of the standard
modules aren't thread-safe.

Even if a module is thread-safe, it doesn't mean that the module is optimized
to work well with threads. A module could possibly be rewritten to utilize
the new features in threaded Perl to increase performance in a threaded
environment.

If you're using a module that's not thread-safe for some reason, you
can protect yourself by using it from one, and only one, thread.
If you need multiple threads to access such a module, you can use semaphores and
lots of programming discipline to control access to it. Semaphores
are covered in L</"Basic semaphores">.

See also L</"Thread-Safety of System Libraries">.

=head1 Thread Basics

The core L<threads> module provides the basic functions you need to write
threaded programs. In the following sections we'll cover the basics,
showing you what you need to do to create a threaded program. After
that, we'll go over some of the features of the L<threads> module that
make threaded programming easier.

=head2 Basic Thread Support

Thread support is a Perl compile-time option - it's something that's
turned on or off when Perl is built at your site, rather than when
your programs are compiled. If your Perl wasn't compiled with thread
support enabled, then any attempt to use threads will fail.

Your programs can use the Config module to check whether threads are
enabled. If your program can't run without them, you can say something
like:

    use Config;
    $Config{useithreads} or die "Recompile Perl with threads to run this program.";

A possibly-threaded program using a possibly-threaded module might
have code like this:

    use Config;
    use MyMod;

    BEGIN {
        if ($Config{useithreads}) {
            # We have threads
            require MyMod_threaded;
            MyMod_threaded->import;
        } else {
            require MyMod_unthreaded;
            MyMod_unthreaded->import;
        }
    }

Since code that runs both with and without threads is usually pretty
messy, it's best to isolate the thread-specific code in its own
module. In our example above, that's what MyMod_threaded is, and it's
only imported if we're running on a threaded Perl.

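As a rough illustration of the same check made at run time rather than in a BEGIN block, the sketch below loads L<threads> only when C<$Config{useithreads}> is set and otherwise falls back to running in the main program. The variable name C<$have_threads> and the fallback message are made up for this example.

```perl
use strict;
use warnings;
use Config;

# Decide at run time whether this perl was built with ithreads.
my $have_threads = $Config{useithreads} ? 1 : 0;

if ($have_threads) {
    # Load the module only when it can actually work.
    require threads;
    my $thr = threads->create(sub { return "ran in thread " . threads->tid });
    my $message = $thr->join;
    print "$message\n";
} else {
    print "no thread support; running everything in the main program\n";
}
```

Loading the module with C<require> keeps the script usable on an unthreaded Perl, at the cost of deferring any failure to run time.
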
=head2 A Note about the Examples

Although thread support is considered to be stable, there are still a number
of quirks that may startle you when you try out any of the examples below.
In a real situation, care should be taken that all threads are finished
executing before the program exits. That care has B<not> been taken in these
examples in the interest of simplicity. Running these examples "as is" will
produce error messages, usually caused by the fact that there are still
threads running when the program exits. You should not be alarmed by this.
Future versions of Perl may fix this problem.

=head2 Creating Threads

The L<threads> package provides the tools you need to create new
threads. Like any other module, you need to tell Perl that you want to use
it; C<use threads> imports all the pieces you need to create basic
threads.

The simplest, most straightforward way to create a thread is with new():

    use threads;

    my $thr = threads->new(\&sub1);

    sub sub1 {
        print "In the thread\n";
    }

The new() method takes a reference to a subroutine and creates a new
thread, which starts executing in the referenced subroutine. Control
then passes both to the subroutine and the caller.

If you need to, your program can pass parameters to the subroutine as
part of the thread startup. Just include the list of parameters as
part of the C<< threads->new >> call, like this:

    use threads;

    my $Param3 = "foo";
    my @ParamList = ("Param 1", "Param 2", $Param3);
    my $thr1 = threads->new(\&sub1, "Param 1", "Param 2", $Param3);
    my $thr2 = threads->new(\&sub1, @ParamList);
    my $thr3 = threads->new(\&sub1, qw(Param1 Param2 Param3));

    sub sub1 {
        my @InboundParameters = @_;
        print "In the thread\n";
        print "got parameters >", join("<>", @InboundParameters), "<\n";
    }

The last example illustrates another feature of threads. You can spawn
off several threads using the same subroutine. Each thread executes
the same subroutine, but in a separate thread with a separate
environment and potentially separate arguments.

C<create()> is a synonym for C<new()>.

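The following sketch starts several threads from one subroutine with different arguments and collects one result from each. The C<worker> subroutine and the C<task-N> labels are invented for illustration, and join(), used here to collect the results, is described in the next section.

```perl
use strict;
use warnings;
use threads;

# Hypothetical worker: reports which arguments it received.
sub worker {
    my ($id, @args) = @_;
    return "worker $id got: " . join(", ", @args);
}

# Start three threads running the same subroutine with different arguments.
my @threads = map { threads->create(\&worker, $_, "task-$_") } 1 .. 3;

# join() waits for each thread in turn and returns what worker() returned.
my @reports = map { $_->join } @threads;
print "$_\n" for @reports;
```

Joining in creation order keeps the reports in a predictable order even though the threads themselves may finish in any order.
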
=head2 Waiting For A Thread To Exit

Since threads are also subroutines, they can return values. To wait
for a thread to exit and extract any values it might return, you can
use the join() method:

    use threads;

    # The parentheses create the thread in list context, so that
    # the subroutine can return a list.
    my ($thr) = threads->new(\&sub1);

    my @ReturnData = $thr->join;
    print "Thread returned @ReturnData\n";

    sub sub1 { return "Fifty-six", "foo", 2; }

In the example above, the join() method returns as soon as the thread
ends. In addition to waiting for a thread to finish and gathering up
any values that the thread might have returned, join() also performs
any OS cleanup necessary for the thread. That cleanup might be
important, especially for long-running programs that spawn lots of
threads. If you don't want the return values and don't want to wait
for the thread to finish, you should call the detach() method
instead, as described next.

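How much data join() hands back depends on the context in which the thread was created. The sketch below contrasts list and scalar context, assuming a reasonably recent L<threads> module; the explicit C<context> option in particular is not part of the 5.8.0 interface. The subroutine name C<three_values> is made up for the example.

```perl
use strict;
use warnings;
use threads;

sub three_values { return ("Fifty-six", "foo", 2) }

# List context at creation: join() returns the whole list.
my ($list_thr) = threads->create(\&three_values);
my @all = $list_thr->join;

# Scalar context at creation: the subroutine runs in scalar context,
# so only the last value of the returned list survives.
my $scalar_thr = threads->create(\&three_values);
my $last = $scalar_thr->join;

# The context can also be requested explicitly (newer threads versions).
my $explicit_thr = threads->create({ context => 'list' }, \&three_values);
my @explicit = $explicit_thr->join;

print "list: @all\n";
print "scalar: $last\n";
print "explicit: @explicit\n";
```
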
=head2 Ignoring A Thread

join() does three things: it waits for a thread to exit, cleans up
after it, and returns any data the thread may have produced. But what
if you're not interested in the thread's return values, and you don't
really care when the thread finishes? All you want is for the thread
to get cleaned up after it's done.

In this case, you use the detach() method. Once a thread is detached,
it'll run until it's finished, then Perl will clean up after it
automatically.

    use threads;

    my $thr = threads->new(\&sub1); # Spawn the thread

    $thr->detach; # Now we officially don't care any more

    sub sub1 {
        my $a = 0;
        while (1) {
            $a++;
            print "\$a is $a\n";
            sleep 1;
        }
    }

Once a thread is detached, it may not be joined, and any return data
that it might have produced (if it was done and waiting for a join) is
lost.

=head1 Threads And Data

Now that we've covered the basics of threads, it's time for our next
topic: data. Threading introduces a couple of complications to data
access that non-threaded programs never need to worry about.

=head2 Shared And Unshared Data

The biggest difference between Perl ithreads and the old 5.005 style
threading, or for that matter, most other threading systems out there,
is that by default, no data is shared. When a new Perl thread is created,
all the data associated with the current thread is copied to the new
thread, and is subsequently private to that new thread!
This is similar in feel to what happens when a UNIX process forks,
except that in this case, the data is just copied to a different part of
memory within the same process rather than a real fork taking place.

To make use of threading, however, one usually wants the threads to share
at least some data between themselves. This is done with the
L<threads::shared> module and the C<:shared> attribute:

    use threads;
    use threads::shared;

    my $foo : shared = 1;
    my $bar = 1;
    threads->new(sub { $foo++; $bar++ })->join;

    print "$foo\n";  # prints 2 since $foo is shared
    print "$bar\n";  # prints 1 since $bar is not shared

In the case of a shared array, all the array's elements are shared, and for
a shared hash, all the keys and values are shared. This places
restrictions on what may be assigned to shared array and hash elements: only
simple values or references to shared variables are allowed - this is
so that a private variable can't accidentally become shared. A bad
assignment will cause the thread to die. For example:

    use threads;
    use threads::shared;

    my $var = 1;
    my $svar : shared = 2;
    my %hash : shared;

    # ... create some threads ...

    $hash{a} = 1;       # all threads see exists($hash{a}) and $hash{a} == 1
    $hash{a} = $var;    # okay - copy-by-value: same effect as previous
    $hash{a} = $svar;   # okay - copy-by-value: same effect as previous
    $hash{a} = \$svar;  # okay - a reference to a shared variable
    $hash{a} = \$var;   # This will die
    delete $hash{a};    # okay - all threads will see !exists($hash{a})

Note that a shared variable guarantees that if two or more threads try to
modify it at the same time, the internal state of the variable will not
become corrupted. However, there are no guarantees beyond this, as
explained in the next section.

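The restriction on shared containers can be seen in a short sketch. It traps the fatal assignment with C<eval> and then stores a nested structure using shared_clone(), which is an assumption about your installation: it is provided by newer releases of L<threads::shared> and was not available in 5.8.0. The hash name C<%status> and its keys are invented for the example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %status :shared;
my $private = 1;

# Storing a reference to an unshared variable dies; eval traps the error.
my $error = '';
eval { $status{bad} = \$private; 1 } or $error = $@;

# shared_clone() returns a shared copy of a nested structure,
# which may then be stored in the shared hash.
$status{config} = shared_clone({ retries => 3, hosts => ['a', 'b'] });

# Another thread sees and modifies the same hash.
threads->create(sub { $status{done} = 1 })->join;

print "error was: $error";
print "done=$status{done} retries=$status{config}{retries}\n";
```
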
=head2 Thread Pitfalls: Races

While threads bring a new set of useful tools, they also bring a
number of pitfalls. One pitfall is the race condition:

    use threads;
    use threads::shared;

    my $a : shared = 1;
    my $thr1 = threads->new(\&sub1);
    my $thr2 = threads->new(\&sub2);

    $thr1->join;
    $thr2->join;
    print "$a\n";

    sub sub1 { my $foo = $a; $a = $foo + 1; }
    sub sub2 { my $bar = $a; $a = $bar + 1; }

What do you think $a will be? The answer, unfortunately, is "it
depends." Both sub1() and sub2() access the global variable $a, once
to read and once to write. Depending on factors ranging from your
thread implementation's scheduling algorithm to the phase of the moon,
$a can be 2 or 3.

Race conditions are caused by unsynchronized access to shared
data. Without explicit synchronization, there's no way to be sure that
nothing has happened to the shared data between the time you access it
and the time you update it. Even this simple code fragment has the
possibility of error:

    use threads;
    use threads::shared;

    my $a : shared = 2;
    my $b : shared;
    my $c : shared;
    my $thr1 = threads->create(sub { $b = $a; $a = $b + 1; });
    my $thr2 = threads->create(sub { $c = $a; $a = $c + 1; });
    $thr1->join;
    $thr2->join;

Two threads both access $a. Each thread can potentially be interrupted
at any point, or be executed in any order. At the end, $a could be 3
or 4, and both $b and $c could be 2 or 3.

Even C<$a += 5> or C<$a++> are not guaranteed to be atomic.

Whenever your program accesses data or resources that can be accessed
by other threads, you must take steps to coordinate access or risk
data inconsistency and race conditions. Note that Perl will protect its
internals from your race conditions, but it won't protect you from you.

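One way to remove the race from a read-then-write sequence is to hold a lock around both steps. The sketch below uses lock(), which is covered in the next section. The counter name, the four threads and the iteration count are arbitrary choices for the demonstration.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $counter :shared = 0;
my $per_thread = 1000;    # arbitrary iteration count for the demonstration

sub safe_increment {
    for (1 .. $per_thread) {
        lock($counter);          # held until the end of this loop iteration
        my $seen = $counter;     # read ...
        $counter = $seen + 1;    # ... and write while still holding the lock
    }
}

my @thr = map { threads->create(\&safe_increment) } 1 .. 4;
$_->join for @thr;
print "counter=$counter\n";      # prints counter=4000
```

Without the lock() call the final value could be anything up to 4000, because updates from different threads could overwrite each other.
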
=head1 Synchronization and control

Perl provides a number of mechanisms to coordinate the interactions
between threads and their data, to avoid race conditions and the like.
Some of these are designed to resemble the common techniques used in thread
libraries such as C<pthreads>; others are Perl-specific. Often, the
standard techniques are clumsy and difficult to get right (such as
condition waits). Where possible, it is usually easier to use Perlish
techniques such as queues, which remove some of the hard work involved.

=head2 Controlling access: lock()

The lock() function takes a shared variable and puts a lock on it.
No other thread may lock the variable until the variable is unlocked
by the thread holding the lock. Unlocking happens automatically
when the locking thread exits the block that contains the call to the
lock() function. Using lock() is straightforward: this example has
several threads doing some calculations in parallel, and occasionally
updating a running total:

    use threads;
    use threads::shared;

    my $total : shared = 0;

    sub calc {
        for (;;) {
            my $result;
            # (... do some calculations and set $result ...)
            {
                lock($total); # block until we obtain the lock
                $total += $result;
            } # lock implicitly released at end of scope
            last if $result == 0;
        }
    }

    my $thr1 = threads->new(\&calc);
    my $thr2 = threads->new(\&calc);
    my $thr3 = threads->new(\&calc);
    $thr1->join;
    $thr2->join;
    $thr3->join;
    print "total=$total\n";

lock() blocks the thread until the variable being locked is
available. When lock() returns, your thread can be sure that no other
thread can lock that variable until the block containing the
lock exits.

It's important to note that locks don't prevent access to the variable
in question, only lock attempts. This is in keeping with Perl's
longstanding tradition of courteous programming, and the advisory file
locking that flock() gives you.

You may lock arrays and hashes as well as scalars. Locking an array,
though, will not block subsequent locks on array elements, just lock
attempts on the array itself.

Locks are recursive, which means it's okay for a thread to
lock a variable more than once. The lock will last until the outermost
lock() on the variable goes out of scope. For example:

    my $x : shared;
    doit();

    sub doit {
        {
            {
                lock($x); # wait for lock
                lock($x); # NOOP - we already have the lock
                {
                    lock($x); # NOOP
                    {
                        lock($x); # NOOP
                        lockit_some_more();
                    }
                }
            } # *** implicit unlock here ***
        }
    }

    sub lockit_some_more {
        lock($x); # NOOP
    } # nothing happens here

Note that there is no unlock() function - the only way to unlock a
variable is to allow the lock to go out of scope.

A lock can either be used to guard the data contained within the variable
being locked, or it can be used to guard something else, like a section
of code. In this latter case, the variable in question does not hold any
useful data, and exists only for the purpose of being locked. In this
respect, the variable behaves like the mutexes and basic semaphores of
traditional thread libraries.

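The second use, a variable that exists only to be locked, might look like the sketch below. The names C<$log_guard>, C<@log> and C<record> are hypothetical. The guarded section appends two related entries that must stay adjacent in the log.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $log_guard :shared;    # holds no data; exists only to be locked
my @log :shared;

sub record {
    my ($id) = @_;
    for my $n (1 .. 3) {
        lock($log_guard);             # enter the guarded section
        push @log, "begin $id.$n";
        push @log, "end $id.$n";      # always directly follows its "begin"
    }                                 # lock released at end of each iteration
}

my @thr = map { threads->create(\&record, $_) } 1 .. 3;
$_->join for @thr;

# Check that every "begin" entry is immediately followed by its "end".
my $paired = 1;
for (my $i = 0; $i < @log; $i += 2) {
    (my $expected = $log[$i]) =~ s/^begin/end/;
    $paired = 0 unless $log[$i + 1] eq $expected;
}
print "entries=", scalar(@log), " paired=$paired\n";
```

Because the lock is advisory, this only works if every piece of code that touches C<@log> agrees to lock C<$log_guard> first.
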
=head2 A Thread Pitfall: Deadlocks

Locks are a handy tool to synchronize access to data, and using them
properly is the key to safe shared data. Unfortunately, locks aren't
without their dangers, especially when multiple locks are involved.
Consider the following code:

    use threads;
    use threads::shared;

    my $a : shared = 4;
    my $b : shared = "foo";
    my $thr1 = threads->new(sub {
        lock($a);
        sleep 20;
        lock($b);
    });
    my $thr2 = threads->new(sub {
        lock($b);
        sleep 20;
        lock($a);
    });

This program will probably hang until you kill it. The only way it
won't hang is if one of the two threads acquires both locks
first. A guaranteed-to-hang version is more complicated, but the
principle is the same.

The first thread will grab a lock on $a, then, after a pause during which
the second thread has probably had time to do some work, try to grab a
lock on $b. Meanwhile, the second thread grabs a lock on $b, then later
tries to grab a lock on $a. The second lock attempt for both threads will
block, each waiting for the other to release its lock.

This condition is called a deadlock, and it occurs whenever two or
more threads are trying to get locks on resources that the others
own. Each thread will block, waiting for the other to release a lock
on a resource. That never happens, though, since the thread with the
resource is itself waiting for a lock to be released.

There are a number of ways to handle this sort of problem. The best
way is to always have all threads acquire locks in the exact same
order. If, for example, you lock variables $a, $b, and $c, always lock
$a before $b, and $b before $c. It's also best to hold on to locks for
as short a period of time as possible to minimize the risks of deadlock.

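A minimal sketch of the lock-ordering rule follows. Both threads go through one helper that always locks the first variable before the second, so neither can end up holding one lock while waiting for the other. The variables are named C<$x> and C<$y> here (rather than $a and $b) only to stay clear of the sort variables under C<use strict>. The helper name and the C<$moves> counter are invented for the example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $x :shared = 4;
my $y :shared = "foo";
my $moves :shared = 0;

# Every caller takes the locks in the same order: $x first, then $y.
sub update_both {
    lock($x);
    lock($y);
    $moves++;    # safe: only one thread at a time can hold the lock on $x
}                # both locks are released when the subroutine returns

my $thr1 = threads->create(sub { update_both() for 1 .. 100 });
my $thr2 = threads->create(sub { update_both() for 1 .. 100 });
$thr1->join;
$thr2->join;
print "moves=$moves\n";    # prints moves=200
```

Funnelling all multi-lock operations through one helper is a simple way to make the ordering hard to get wrong.
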
The other synchronization primitives described below can suffer from
similar problems.

=head2 Queues: Passing Data Around

A queue is a special thread-safe object that lets you put data in one
end and take it out the other without having to worry about
synchronization issues. They're pretty straightforward, and look like
this:

    use threads;
    use Thread::Queue;

    my $DataQueue = Thread::Queue->new;
    my $thr = threads->new(sub {
        # defined() lets false values such as 0 pass through the loop
        while (defined(my $DataElement = $DataQueue->dequeue)) {
            print "Popped $DataElement off the queue\n";
        }
    });

    $DataQueue->enqueue(12);
    $DataQueue->enqueue("A", "B", "C");
    sleep 10;
    $DataQueue->enqueue(undef);
    $thr->join;

You create the queue with C<< Thread::Queue->new >>. Then you can
add lists of scalars onto the end with enqueue(), and pop scalars off
the front of it with dequeue(). A queue has no fixed size, and can grow
as needed to hold everything pushed on to it.

If a queue is empty, dequeue() blocks until another thread enqueues
something. This makes queues ideal for event loops and other
communications between threads.

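Queues combine naturally with the boss/worker model described earlier. In the sketch below the main thread enqueues numbers and three workers square them. The worker count, the squaring task and the use of one C<undef> sentinel per worker to signal shutdown are all choices made for this example rather than anything the module requires.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue;

my $tasks = Thread::Queue->new;
my @results :shared;

# Each worker squares numbers until it dequeues the undef sentinel.
sub worker {
    while (defined(my $n = $tasks->dequeue)) {
        lock(@results);
        push @results, $n * $n;
    }
}

my @workers = map { threads->create(\&worker) } 1 .. 3;

$tasks->enqueue(0 .. 9);                # 0 is a valid work item here
$tasks->enqueue(undef) for @workers;    # one sentinel per worker
$_->join for @workers;

# Results arrive in no particular order, so sort before printing.
my @sorted = sort { $a <=> $b } @results;
print "@sorted\n";
```

Each worker consumes exactly one sentinel before leaving its loop, so enqueuing one per worker is enough to shut them all down.
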
=head2 Semaphores: Synchronizing Data Access

Semaphores are a kind of generic locking mechanism. In their most basic
form, they behave very much like lockable scalars, except that they
can't hold data, and that they must be explicitly unlocked. In their
advanced form, they act like a kind of counter, and can allow multiple
threads to have the 'lock' at any one time.

=head2 Basic semaphores

Semaphores have two methods, down() and up(): down() decrements the resource
count, while up() increments it. Calls to down() will block if the
semaphore's current count would decrement below zero. This program
gives a quick demonstration:

    use threads;
    use threads::shared;
    use Thread::Semaphore;

    my $semaphore = Thread::Semaphore->new;
    my $GlobalVariable : shared = 0;

    my $thr1 = threads->new(\&sample_sub, 1);
    my $thr2 = threads->new(\&sample_sub, 2);
    my $thr3 = threads->new(\&sample_sub, 3);

    sub sample_sub {
        my $SubNumber = shift @_;
        my $TryCount = 10;
        my $LocalCopy;
        sleep 1;
        while ($TryCount--) {
            $semaphore->down;
            $LocalCopy = $GlobalVariable;
            print "$TryCount tries left for sub $SubNumber (\$GlobalVariable is $GlobalVariable)\n";
            sleep 2;
            $LocalCopy++;
            $GlobalVariable = $LocalCopy;
            $semaphore->up;
        }
    }

    $thr1->join;
    $thr2->join;
    $thr3->join;

The three invocations of the subroutine all operate in sync. The
semaphore, though, makes sure that only one thread is accessing the
global variable at once.

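A shorter version of the same idea, without the sleeps, makes the result easy to check. In this sketch the semaphore starts with its default count of one and so behaves as a simple mutex. The names C<$sem>, C<$total> and C<add_ten> are made up for the example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Semaphore;

my $sem = Thread::Semaphore->new;    # count starts at 1
my $total :shared = 0;

sub add_ten {
    for (1 .. 10) {
        $sem->down;          # acquire: blocks while the count is 0
        my $copy = $total;
        $copy++;
        $total = $copy;
        $sem->up;            # release explicitly; leaving scope does not
    }
}

my @thr = map { threads->create(\&add_ten) } 1 .. 3;
$_->join for @thr;
print "total=$total\n";      # prints total=30
```

Unlike lock(), a semaphore is not released automatically, so any code path that returns between down() and up() would leave the other threads blocked.
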
651 | =head2 Advanced Semaphores
|
---|
652 |
|
---|
653 | By default, semaphores behave like locks, letting only one thread
|
---|
654 | down() them at a time. However, there are other uses for semaphores.
|
---|
655 |
|
---|
656 | Each semaphore has a counter attached to it. By default, semaphores are
|
---|
657 | created with the counter set to one, down() decrements the counter by
|
---|
658 | one, and up() increments by one. However, we can override any or all
|
---|
659 | of these defaults simply by passing in different values:
|
---|
660 |
|
---|
661 |     use threads;
|
---|
662 |     use Thread::Semaphore;
|
---|
663 |     my $semaphore = Thread::Semaphore->new(5);
|
---|
664 |                     # Creates a semaphore with the counter set to five
|
---|
665 |
|
---|
666 |     my $thr1 = threads->new(\&sub1);
|
---|
667 |     my $thr2 = threads->new(\&sub1);
|
---|
668 |
|
---|
669 |     sub sub1 {
|
---|
670 |         $semaphore->down(5); # Decrements the counter by five
|
---|
671 |         # Do stuff here
|
---|
672 |         $semaphore->up(5); # Increments the counter by five
|
---|
673 |     }
|
---|
674 |
|
---|
675 |     $thr1->detach;
|
---|
676 |     $thr2->detach;
|
---|
677 |
|
---|
678 | If down() attempts to decrement the counter below zero, it blocks until
|
---|
679 | the counter is large enough. Note that while a semaphore can be created
|
---|
680 | with a starting count of zero, any up() or down() always changes the
|
---|
681 | counter by at least one, and so $semaphore->down(0) is the same as
|
---|
682 | $semaphore->down(1).
|
---|
683 |
|
---|
684 | The question, of course, is why would you do something like this? Why
|
---|
685 | create a semaphore with a starting count that's not one, or why
|
---|
686 | decrement/increment it by more than one? The answer is resource
|
---|
687 | availability. Many resources that you want to manage access for can be
|
---|
688 | safely used by more than one thread at once.
|
---|
689 |
|
---|
690 | For example, let's take a GUI-driven program. It has a semaphore that
|
---|
691 | it uses to synchronize access to the display, so only one thread is
|
---|
692 | ever drawing at once. Handy, but of course you don't want any thread
|
---|
693 | to start drawing until things are properly set up. In this case, you
|
---|
694 | can create a semaphore with a counter set to zero, and up it when
|
---|
695 | things are ready for drawing.
|
---|
696 |
|
---|
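The gate idea above can be sketched as follows. This is only an illustration under stated assumptions: the names C<$display_ready> and C<@log> are hypothetical, and the two C<push> calls stand in for real setup and drawing code.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Semaphore;

# Start at zero: nobody may draw until setup has finished.
my $display_ready = Thread::Semaphore->new(0);
my @log :shared;                      # records the order of events

my $drawer = threads->create(sub {
    $display_ready->down;             # blocks until the main thread calls up()
    { lock(@log); push @log, 'draw'; }
    $display_ready->up;               # let the next drawing thread in
});

{ lock(@log); push @log, 'setup'; }   # stand-in for real initialisation
$display_ready->up;                   # open the gate
$drawer->join;

print join(', ', @log), "\n";         # prints "setup, draw"
```

Because the drawing thread cannot get past down() before the main thread calls up(), the setup step is always recorded first.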
697 | Semaphores with counters greater than one are also useful for
|
---|
698 | establishing quotas. Say, for example, that you have a number of
|
---|
699 | threads that can do I/O at once. You don't want all the threads
|
---|
700 | reading or writing at once though, since that can potentially swamp
|
---|
701 | your I/O channels, or deplete your process' quota of filehandles. You
|
---|
702 | can use a semaphore initialized to the number of concurrent I/O
|
---|
703 | requests (or open files) that you want at any one time, and have your
|
---|
704 | threads quietly block and unblock themselves.
|
---|
705 |
|
---|
706 | Larger increments or decrements are handy in those cases where a
|
---|
707 | thread needs to check out or return a number of resources at once.
|
---|
708 |
|
---|
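A minimal sketch of the quota idea, assuming a made-up limit of two concurrent I/O operations. C<$MAX_IO>, C<do_io()> and the counters are hypothetical names, and the short select() call merely stands in for real I/O.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Semaphore;

my $MAX_IO  = 2;                                 # hypothetical quota
my $io_slot = Thread::Semaphore->new($MAX_IO);
my $active :shared = 0;                          # workers currently doing "I/O"
my $peak   :shared = 0;                          # highest value $active reached

sub do_io {
    $io_slot->down;                      # take a slot, or block until one is free
    {
        lock($active);                   # $peak is only touched under this lock
        $active++;
        $peak = $active if $active > $peak;
    }
    select(undef, undef, undef, 0.1);    # stand-in for real I/O
    {
        lock($active);
        $active--;
    }
    $io_slot->up;                        # hand the slot back
}

my @workers = map { threads->create(\&do_io) } 1 .. 6;
$_->join for @workers;

print "peak concurrent workers: $peak (quota $MAX_IO)\n";
```

Six threads are started, but the reported peak never exceeds the quota, since at most C<$MAX_IO> threads can be between down() and up() at any moment.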
709 | =head2 cond_wait() and cond_signal()
|
---|
710 |
|
---|
711 | These two functions can be used in conjunction with locks to notify
|
---|
712 | co-operating threads that a resource has become available. They are
|
---|
713 | very similar in use to the functions found in C<pthreads>. However,
|
---|
714 | for most purposes, queues are simpler to use and more intuitive. See
|
---|
715 | L<threads::shared> for more details.
|
---|
716 |
|
---|
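As a rough sketch of how the two functions fit together with lock(), here is a single producer handing items to a single consumer. The variable names are illustrative only, and a real program would more likely use Thread::Queue, as noted above.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my @pending :shared;       # work handed from producer to consumer
my $done    :shared = 0;   # set once the producer has finished

# List context at creation time, so that join() returns a list.
my ($consumer) = threads->create(sub {
    my @seen;
    while (1) {
        lock(@pending);
        # Re-check the condition in a loop: a wake-up does not
        # guarantee that there is something to do.
        cond_wait(@pending) until @pending || $done;
        last if !@pending && $done;
        push @seen, shift @pending;
    }
    return @seen;
});

for my $item (1 .. 3) {
    lock(@pending);
    push @pending, $item;
    cond_signal(@pending);   # wake the consumer if it is waiting
}
{
    lock(@pending);
    $done = 1;
    cond_signal(@pending);
}

my @got = $consumer->join;
print "consumer received: @got\n";   # prints "consumer received: 1 2 3"
```

cond_wait() releases the lock while it waits and reacquires it before returning, which is why both sides can lock the same variable without deadlocking.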
717 | =head2 Giving up control
|
---|
718 |
|
---|
719 | There are times when you may find it useful to have a thread
|
---|
720 | explicitly give up the CPU to another thread. For example, you may be
|
---|
721 | doing something processor-intensive and want to make sure that the
|
---|
722 | user-interface thread gets called frequently.
|
---|
724 |
|
---|
725 | Perl's threading package provides the yield() function that does
|
---|
726 | this. yield() is pretty straightforward, and works like this:
|
---|
727 |
|
---|
728 |     use threads;
|
---|
729 |
|
---|
730 |     sub loop {
|
---|
731 |         my $thread = shift;
|
---|
732 |         my $foo = 50;
|
---|
733 |         while($foo--) { print "in thread $thread\n" }
|
---|
734 |         threads->yield;
|
---|
735 |         $foo = 50;
|
---|
736 |         while($foo--) { print "in thread $thread\n" }
|
---|
737 |     }
|
---|
738 |
|
---|
739 |     my $thread1 = threads->new(\&loop, 'first');
|
---|
740 |     my $thread2 = threads->new(\&loop, 'second');
|
---|
741 |     my $thread3 = threads->new(\&loop, 'third');
|
---|
742 |
|
---|
743 | It is important to remember that yield() is only a hint to give up the CPU;
|
---|
744 | what actually happens depends on your hardware, OS and threading libraries.
|
---|
745 | B<On many operating systems, yield() is a no-op.> Therefore you should
|
---|
746 | not build the scheduling of your threads around yield() calls. It might
|
---|
747 | work on your platform, but there is no guarantee that it will work on
|
---|
748 | another.
|
---|
749 |
|
---|
750 | =head1 General Thread Utility Routines
|
---|
751 |
|
---|
752 | We've covered the workhorse parts of Perl's threading package, and
|
---|
753 | with these tools you should be well on your way to writing threaded
|
---|
754 | code and packages. There are a few useful little pieces that didn't
|
---|
755 | really fit in anyplace else.
|
---|
756 |
|
---|
757 | =head2 What Thread Am I In?
|
---|
758 |
|
---|
759 | The C<< threads->self >> class method provides your program with a way to
|
---|
760 | get an object representing the thread it's currently in. You can use this
|
---|
761 | object in the same way as the ones returned from thread creation.
|
---|
762 |
|
---|
763 | =head2 Thread IDs
|
---|
764 |
|
---|
765 | tid() is a thread object method that returns the thread ID of the
|
---|
766 | thread the object represents. Thread IDs are integers, with the main
|
---|
767 | thread in a program being 0. Currently Perl assigns a unique tid to
|
---|
768 | every thread ever created in your program, assigning the first thread
|
---|
769 | to be created a tid of 1, and increasing the tid by 1 for each new
|
---|
770 | thread that's created.
|
---|
771 |
|
---|
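A small sketch of C<< threads->self >> and tid() used together. The helper name C<whoami()> is made up for the example.

```perl
use strict;
use warnings;
use threads;

# Returns the ID of whichever thread calls it.
sub whoami {
    my $self = threads->self;    # object for the thread we are running in
    return $self->tid;
}

my $main_tid = whoami();                     # called from the main thread
my $kid      = threads->create(\&whoami);
my $seen_tid = $kid->tid;                    # the parent's view of the child's ID
my $kid_tid  = $kid->join;                   # the child's own view, from whoami()

print "main thread tid:  $main_tid\n";
print "child thread tid: $kid_tid (parent saw $seen_tid)\n";
```

The ID the child reports about itself matches the one the parent reads from the object returned at creation, and the main thread reports 0.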
772 | =head2 Are These Threads The Same?
|
---|
773 |
|
---|
774 | The equal() method takes two thread objects and returns true
|
---|
775 | if the objects represent the same thread, and false if they don't.
|
---|
776 |
|
---|
777 | Thread objects also have an overloaded == comparison so that you can do
|
---|
778 | comparison on them as you would with normal objects.
|
---|
779 |
|
---|
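The following sketch compares thread objects both ways. It assumes nothing beyond the C<threads> module; the variable names are illustrative.

```perl
use strict;
use warnings;
use threads;

my $first  = threads->create(sub { return 'one' });
my $second = threads->create(sub { return 'two' });

# Look the first thread up again through threads->list. The overloaded
# == compares which thread an object stands for, so it does not matter
# whether list() hands back the very same Perl object.
my ($again) = grep { $_ == $first } threads->list;

my $same      = $first->equal($again) ? 'yes' : 'no';
my $different = $first == $second     ? 'yes' : 'no';

print "first and again are the same thread:  $same\n";        # yes
print "first and second are the same thread: $different\n";   # no

$_->join for $first, $second;
```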
780 | =head2 What Threads Are Running?
|
---|
781 |
|
---|
782 | C<< threads->list >> returns a list of thread objects, one for each thread
|
---|
783 | that's currently running and not detached. Handy for a number of things,
|
---|
784 | including cleaning up at the end of your program:
|
---|
785 |
|
---|
786 |     # Loop through all the threads
|
---|
787 |     foreach my $thr (threads->list) {
|
---|
788 |         # Don't join the main thread or ourselves
|
---|
789 |         if ($thr->tid && !threads::equal($thr, threads->self)) {
|
---|
790 |             $thr->join;
|
---|
791 |         }
|
---|
792 |     }
|
---|
793 |
|
---|
794 | If some threads have not finished running when the main Perl thread
|
---|
795 | ends, Perl will warn you about it and die, since it is impossible for Perl
|
---|
796 | to clean itself up while other threads are running.
|
---|
797 |
|
---|
798 | =head1 A Complete Example
|
---|
799 |
|
---|
800 | Confused yet? It's time for an example program to show some of the
|
---|
801 | things we've covered. This program finds prime numbers using threads.
|
---|
802 |
|
---|
803 |      1  #!/usr/bin/perl -w
|
---|
804 |      2  # prime-pthread, courtesy of Tom Christiansen
|
---|
805 |      3
|
---|
806 |      4  use strict;
|
---|
807 |      5
|
---|
808 |      6  use threads;
|
---|
809 |      7  use Thread::Queue;
|
---|
810 |      8
|
---|
811 |      9  my $stream = new Thread::Queue;
|
---|
812 |     10  my $kid = new threads(\&check_num, $stream, 2);
|
---|
813 |     11
|
---|
814 |     12  for my $i ( 3 .. 1000 ) {
|
---|
815 |     13      $stream->enqueue($i);
|
---|
816 |     14  }
|
---|
817 |     15
|
---|
818 |     16  $stream->enqueue(undef);
|
---|
819 |     17  $kid->join;
|
---|
820 |     18
|
---|
821 |     19  sub check_num {
|
---|
822 |     20      my ($upstream, $cur_prime) = @_;
|
---|
823 |     21      my $kid;
|
---|
824 |     22      my $downstream = new Thread::Queue;
|
---|
825 |     23      while (my $num = $upstream->dequeue) {
|
---|
826 |     24          next unless $num % $cur_prime;
|
---|
827 |     25          if ($kid) {
|
---|
828 |     26              $downstream->enqueue($num);
|
---|
829 |     27          } else {
|
---|
830 |     28              print "Found prime $num\n";
|
---|
831 |     29              $kid = new threads(\&check_num, $downstream, $num);
|
---|
832 |     30          }
|
---|
833 |     31      }
|
---|
834 |     32      $downstream->enqueue(undef) if $kid;
|
---|
835 |     33      $kid->join if $kid;
|
---|
836 |     34  }
|
---|
837 |
|
---|
838 | This program uses the pipeline model to generate prime numbers. Each
|
---|
839 | thread in the pipeline has an input queue that feeds numbers to be
|
---|
840 | checked, a prime number that it's responsible for, and an output queue
|
---|
841 | into which it funnels numbers that have failed the check. If the thread
|
---|
842 | has a number that's failed its check and there's no child thread, then
|
---|
843 | the thread must have found a new prime number. In that case, a new
|
---|
844 | child thread is created for that prime and stuck on the end of the
|
---|
845 | pipeline.
|
---|
846 |
|
---|
847 | This probably sounds a bit more confusing than it really is, so let's
|
---|
848 | go through this program piece by piece and see what it does. (For
|
---|
849 | those of you who might be trying to remember exactly what a prime
|
---|
850 | number is, it's a number that's only evenly divisible by itself and 1.)
|
---|
851 |
|
---|
852 | The bulk of the work is done by the check_num() subroutine, which
|
---|
853 | takes a reference to its input queue and a prime number that it's
|
---|
854 | responsible for. After pulling in the input queue and the prime that
|
---|
855 | the subroutine's checking (line 20), we create a new queue (line 22)
|
---|
856 | and reserve a scalar for the thread that we're likely to create later
|
---|
857 | (line 21).
|
---|
858 |
|
---|
859 | The while loop from line 23 to line 31 grabs a scalar off the input
|
---|
860 | queue and checks against the prime this thread is responsible
|
---|
861 | for. Line 24 checks to see if there's a remainder when we modulo the
|
---|
862 | number to be checked against our prime. If there is one, the number
|
---|
863 | must not be evenly divisible by our prime, so we need to either pass
|
---|
864 | it on to the next thread if we've created one (line 26) or create a
|
---|
865 | new thread if we haven't.
|
---|
866 |
|
---|
867 | The new thread creation is line 29. We pass on to it a reference to
|
---|
868 | the queue we've created, and the prime number we've found.
|
---|
869 |
|
---|
870 | Finally, once the loop terminates (because we got a 0 or undef in the
|
---|
871 | queue, which serves as a note to die), we pass on the notice to our
|
---|
872 | child and wait for it to exit if we've created a child (lines 32 and
|
---|
873 | 33).
|
---|
874 |
|
---|
875 | Meanwhile, back in the main thread, we create a queue (line 9) and the
|
---|
876 | initial child thread (line 10), and pre-seed it with the first prime:
|
---|
877 | 2. Then we queue all the numbers from 3 to 1000 for checking (lines
|
---|
878 | 12-14), then queue a die notice (line 16) and wait for the first child
|
---|
879 | thread to terminate (line 17). Because a child won't die until its
|
---|
880 | child has died, we know that we're done once we return from the join.
|
---|
881 |
|
---|
882 | That's how it works. It's pretty simple; as with many Perl programs,
|
---|
883 | the explanation is much longer than the program.
|
---|
884 |
|
---|
885 | =head1 Different implementations of threads
|
---|
886 |
|
---|
887 | Some background on thread implementations from the operating system
|
---|
888 | viewpoint. There are three basic categories of threads: user-mode threads,
|
---|
889 | kernel threads, and multiprocessor kernel threads.
|
---|
890 |
|
---|
891 | User-mode threads are threads that live entirely within a program and
|
---|
892 | its libraries. In this model, the OS knows nothing about threads. As
|
---|
893 | far as it's concerned, your process is just a process.
|
---|
894 |
|
---|
895 | This is the easiest way to implement threads, and the way most OSes
|
---|
896 | start. The big disadvantage is that, since the OS knows nothing about
|
---|
897 | threads, if one thread blocks they all do. Typical blocking activities
|
---|
898 | include most system calls, most I/O, and things like sleep().
|
---|
899 |
|
---|
900 | Kernel threads are the next step in thread evolution. The OS knows
|
---|
901 | about kernel threads, and makes allowances for them. The main
|
---|
902 | difference between a kernel thread and a user-mode thread is
|
---|
903 | blocking. With kernel threads, things that block a single thread don't
|
---|
904 | block other threads. This is not the case with user-mode threads,
|
---|
905 | where the kernel blocks at the process level and not the thread level.
|
---|
906 |
|
---|
907 | This is a big step forward, and can give a threaded program quite a
|
---|
908 | performance boost over non-threaded programs. Threads that block
|
---|
909 | performing I/O, for example, won't block threads that are doing other
|
---|
910 | things. Each process still has only one thread running at once,
|
---|
911 | though, regardless of how many CPUs a system might have.
|
---|
912 |
|
---|
913 | Since kernel threading can interrupt a thread at any time, it will
|
---|
914 | uncover some of the implicit locking assumptions you may make in your
|
---|
915 | program. For example, something as simple as C<$a = $a + 2> can behave
|
---|
916 | unpredictably with kernel threads if $a is visible to other
|
---|
917 | threads, as another thread may have changed $a between the time it
|
---|
918 | was fetched on the right hand side and the time the new value is
|
---|
919 | stored.
|
---|
920 |
|
---|
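In ithreads terms, the usual cure is to hold a lock across the whole fetch-and-store. The sketch below assumes a shared counter named C<$total> (a hypothetical name) that four threads each bump by two a thousand times.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $total :shared = 0;

sub add_two {
    for (1 .. 1000) {
        lock($total);          # held until the end of this loop iteration
        $total = $total + 2;   # fetch, add and store can no longer be interleaved
    }
}

my @workers = map { threads->create(\&add_two) } 1 .. 4;
$_->join for @workers;

print "total is $total\n";     # 4 threads * 1000 iterations * 2 = 8000
```

Without the lock() call, two threads could both fetch the same old value and one of the updates would be lost, so the final total could come out lower.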
921 | Multiprocessor kernel threads are the final step in thread
|
---|
922 | support. With multiprocessor kernel threads on a machine with multiple
|
---|
923 | CPUs, the OS may schedule two or more threads to run simultaneously on
|
---|
924 | different CPUs.
|
---|
925 |
|
---|
926 | This can give a serious performance boost to your threaded program,
|
---|
927 | since more than one thread will be executing at the same time. As a
|
---|
928 | tradeoff, though, any of those nagging synchronization issues that
|
---|
929 | might not have shown with basic kernel threads will appear with a
|
---|
930 | vengeance.
|
---|
931 |
|
---|
932 | In addition to the different levels of OS involvement in threads,
|
---|
933 | different OSes (and different thread implementations for a particular
|
---|
934 | OS) allocate CPU cycles to threads in different ways.
|
---|
935 |
|
---|
936 | Cooperative multitasking systems have running threads give up control
|
---|
937 | if one of two things happens. If a thread calls a yield function, it
|
---|
938 | gives up control. It also gives up control if the thread does
|
---|
939 | something that would cause it to block, such as perform I/O. In a
|
---|
940 | cooperative multitasking implementation, one thread can starve all the
|
---|
941 | others for CPU time if it so chooses.
|
---|
942 |
|
---|
943 | Preemptive multitasking systems interrupt threads at regular intervals
|
---|
944 | while the system decides which thread should run next. In a preemptive
|
---|
945 | multitasking system, one thread usually won't monopolize the CPU.
|
---|
946 |
|
---|
947 | On some systems, there can be cooperative and preemptive threads
|
---|
948 | running simultaneously. (Threads running with realtime priorities
|
---|
949 | often behave cooperatively, for example, while threads running at
|
---|
950 | normal priorities behave preemptively.)
|
---|
951 |
|
---|
952 | Most modern operating systems support preemptive multitasking.
|
---|
953 |
|
---|
954 | =head1 Performance considerations
|
---|
955 |
|
---|
956 | The main thing to bear in mind when comparing ithreads to other threading
|
---|
957 | models is the fact that for each new thread created, a complete copy of
|
---|
958 | all the variables and data of the parent thread has to be taken. Thus
|
---|
959 | thread creation can be quite expensive, both in terms of memory usage and
|
---|
960 | time spent in creation. The ideal way to reduce these costs is to have a
|
---|
961 | relatively small number of long-lived threads, all created fairly early
|
---|
962 | on - before the base thread has accumulated too much data. Of course, this
|
---|
963 | may not always be possible, so compromises have to be made. However, after
|
---|
964 | a thread has been created, its performance and extra memory usage should
|
---|
965 | be little different from those of ordinary code.
|
---|
966 |
|
---|
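One way to follow this advice is a small pool of workers fed through queues, sketched here. The pool size and the squaring "job" are placeholders chosen for the example, not part of the threads API.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $POOL_SIZE = 3;                        # hypothetical pool size
my $jobs      = Thread::Queue->new;
my $results   = Thread::Queue->new;

# Start the pool first, while the main thread still holds little data,
# so each new interpreter has less to copy.
my @pool = map {
    threads->create(sub {
        while (defined(my $n = $jobs->dequeue)) {
            $results->enqueue($n * $n);   # stand-in for real work
        }
    });
} 1 .. $POOL_SIZE;

$jobs->enqueue($_)    for 1 .. 10;
$jobs->enqueue(undef) for @pool;          # one stop marker per worker
$_->join              for @pool;

my $sum = 0;
while (defined(my $r = $results->dequeue_nb)) { $sum += $r }
print "sum of squares: $sum\n";           # prints "sum of squares: 385"
```

The three threads are created once and reused for all ten jobs, so the cost of copying the parent's data is paid three times rather than ten.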
967 | Also note that under the current implementation, shared variables
|
---|
968 | use a little more memory and are a little slower than ordinary variables.
|
---|
969 |
|
---|
970 | =head1 Process-scope Changes
|
---|
971 |
|
---|
972 | Note that while threads themselves are separate execution threads and
|
---|
973 | Perl data is thread-private unless explicitly shared, the threads can
|
---|
974 | affect process-scope state, affecting all the threads.
|
---|
975 |
|
---|
976 | The most common example of this is changing the current working
|
---|
977 | directory using chdir(). One thread calls chdir(), and the working
|
---|
978 | directory of all the threads changes.
|
---|
979 |
|
---|
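The effect can be observed with a short sketch like the one below. It assumes a typical Unix build of Perl, where chdir() acts on the whole process; other platforms may behave differently. It also assumes the program is not already running in the root directory.

```perl
use strict;
use warnings;
use threads;
use Cwd qw(getcwd);
use File::Spec;

my $before = getcwd();
my $target = File::Spec->rootdir;     # assumed to differ from $before

# Only the child thread calls chdir() ...
threads->create(sub { chdir $target or die "chdir: $!" })->join;

# ... yet the main thread sees the new directory as well.
my $after = getcwd();
print "main thread started in $before\n";
print "main thread is now in  $after\n";
```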
980 | An even more drastic example of a process-scope change is chroot():
|
---|
981 | the root directory of all the threads changes, and no thread can
|
---|
982 | undo it (as opposed to chdir()).
|
---|
983 |
|
---|
984 | Further examples of process-scope changes include umask() and
|
---|
985 | changing uids/gids.
|
---|
986 |
|
---|
987 | Thinking of mixing fork() and threads? Please lie down and wait
|
---|
988 | until the feeling passes. Be aware that the semantics of fork() vary
|
---|
989 | between platforms. For example, some UNIX systems copy all the current
|
---|
990 | threads into the child process, while others only copy the thread that
|
---|
991 | called fork(). You have been warned!
|
---|
992 |
|
---|
993 | Similarly, mixing signals and threads should not be attempted.
|
---|
994 | Implementations are platform-dependent, and even the POSIX
|
---|
995 | semantics may not be what you expect (and Perl doesn't even
|
---|
996 | give you the full POSIX API).
|
---|
997 |
|
---|
998 | =head1 Thread-Safety of System Libraries
|
---|
999 |
|
---|
1000 | Whether various library calls are thread-safe is outside the control
|
---|
1001 | of Perl. Calls often suffering from not being thread-safe include:
|
---|
1002 | localtime(), gmtime(), get{gr,host,net,proto,serv,pw}*(), readdir(),
|
---|
1003 | rand(), and srand() -- in general, calls that depend on some global
|
---|
1004 | external state.
|
---|
1005 |
|
---|
1006 | If the system Perl is compiled on has thread-safe variants of such
|
---|
1007 | calls, they will be used. Beyond that, Perl is at the mercy of
|
---|
1008 | the thread-safety or -unsafety of the calls. Please consult your
|
---|
1009 | C library call documentation.
|
---|
1010 |
|
---|
1011 | On some platforms the thread-safe library interfaces may fail if the
|
---|
1012 | result buffer is too small (for example the user group databases may
|
---|
1013 | be rather large, and the reentrant interfaces may have to carry around
|
---|
1014 | a full snapshot of those databases). Perl will start with a small
|
---|
1015 | buffer, but keep retrying and growing the result buffer
|
---|
1016 | until the result fits. If this limitless growing sounds bad for
|
---|
1017 | security or memory consumption reasons you can recompile Perl with
|
---|
1018 | PERL_REENTRANT_MAXSIZE defined to the maximum number of bytes you will
|
---|
1019 | allow.
|
---|
1020 |
|
---|
1021 | =head1 Conclusion
|
---|
1022 |
|
---|
1023 | A complete thread tutorial could fill a book (and has, many times),
|
---|
1024 | but with what we've covered in this introduction, you should be well
|
---|
1025 | on your way to becoming a threaded Perl expert.
|
---|
1026 |
|
---|
1027 | =head1 Bibliography
|
---|
1028 |
|
---|
1029 | Here's a short bibliography courtesy of Jürgen Christoffel:
|
---|
1030 |
|
---|
1031 | =head2 Introductory Texts
|
---|
1032 |
|
---|
1033 | Birrell, Andrew D. An Introduction to Programming with
|
---|
1034 | Threads. Digital Equipment Corporation, 1989, DEC-SRC Research Report
|
---|
1035 | #35 online as
|
---|
1036 | http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/src-rr-035.html
|
---|
1037 | (highly recommended)
|
---|
1038 |
|
---|
1039 | Robbins, Kay A., and Steven Robbins. Practical Unix Programming: A
|
---|
1040 | Guide to Concurrency, Communication, and
|
---|
1041 | Multithreading. Prentice-Hall, 1996.
|
---|
1042 |
|
---|
1043 | Lewis, Bill, and Daniel J. Berg. Multithreaded Programming with
|
---|
1044 | Pthreads. Prentice Hall, 1997, ISBN 0-13-443698-9 (a well-written
|
---|
1045 | introduction to threads).
|
---|
1046 |
|
---|
1047 | Nelson, Greg (editor). Systems Programming with Modula-3. Prentice
|
---|
1048 | Hall, 1991, ISBN 0-13-590464-1.
|
---|
1049 |
|
---|
1050 | Nichols, Bradford, Dick Buttlar, and Jacqueline Proulx Farrell.
|
---|
1051 | Pthreads Programming. O'Reilly & Associates, 1996, ISBN 1-56592-115-1
|
---|
1052 | (covers POSIX threads).
|
---|
1053 |
|
---|
1054 | =head2 OS-Related References
|
---|
1055 |
|
---|
1056 | Boykin, Joseph, David Kirschen, Alan Langerman, and Susan
|
---|
1057 | LoVerso. Programming under Mach. Addison-Wesley, 1994, ISBN
|
---|
1058 | 0-201-52739-1.
|
---|
1059 |
|
---|
1060 | Tanenbaum, Andrew S. Distributed Operating Systems. Prentice Hall,
|
---|
1061 | 1995, ISBN 0-13-219908-4 (great textbook).
|
---|
1062 |
|
---|
1063 | Silberschatz, Abraham, and Peter B. Galvin. Operating System Concepts,
|
---|
1064 | 4th ed. Addison-Wesley, 1995, ISBN 0-201-59292-4.
|
---|
1065 |
|
---|
1066 | =head2 Other References
|
---|
1067 |
|
---|
1068 | Arnold, Ken and James Gosling. The Java Programming Language, 2nd
|
---|
1069 | ed. Addison-Wesley, 1998, ISBN 0-201-31006-6.
|
---|
1070 |
|
---|
1071 | comp.programming.threads FAQ,
|
---|
1072 | L<http://www.serpentine.com/~bos/threads-faq/>
|
---|
1073 |
|
---|
1074 | Le Sergent, T. and B. Berthomieu. "Incremental MultiThreaded Garbage
|
---|
1075 | Collection on Virtually Shared Memory Architectures" in Memory
|
---|
1076 | Management: Proc. of the International Workshop IWMM 92, St. Malo,
|
---|
1077 | France, September 1992, Yves Bekkers and Jacques Cohen, eds. Springer,
|
---|
1078 | 1992, ISBN 3-540-55940-X (real-life thread applications).
|
---|
1079 |
|
---|
1080 | Artur Bergman, "Where Wizards Fear To Tread", June 11, 2002,
|
---|
1081 | L<http://www.perl.com/pub/a/2002/06/11/threads.html>
|
---|
1082 |
|
---|
1083 | =head1 Acknowledgements
|
---|
1084 |
|
---|
1085 | Thanks (in no particular order) to Chaim Frenkel, Steve Fink, Gurusamy
|
---|
1086 | Sarathy, Ilya Zakharevich, Benjamin Sugars, Jürgen Christoffel, Joshua
|
---|
1087 | Pritikin, and Alan Burlison, for their help in reality-checking and
|
---|
1088 | polishing this article. Big thanks to Tom Christiansen for his rewrite
|
---|
1089 | of the prime number generator.
|
---|
1090 |
|
---|
1091 | =head1 AUTHOR
|
---|
1092 |
|
---|
1093 | Dan Sugalski E<lt>[email protected]E<gt>
|
---|
1094 |
|
---|
1095 | Slightly modified by Arthur Bergman to fit the new thread model/module.
|
---|
1096 |
|
---|
1097 | Reworked slightly by Jörg Walter E<lt>[email protected]E<gt> to be more concise
|
---|
1098 | about thread-safety of perl code.
|
---|
1099 |
|
---|
1100 | Rearranged slightly by Elizabeth Mattijsen E<lt>[email protected]E<gt> to put
|
---|
1101 | less emphasis on yield().
|
---|
1102 |
|
---|
1103 | =head1 Copyrights
|
---|
1104 |
|
---|
1105 | The original version of this article appeared in The Perl
|
---|
1106 | Journal #10, and is copyright 1998 The Perl Journal. It appears courtesy
|
---|
1107 | of Jon Orwant and The Perl Journal. This document may be distributed
|
---|
1108 | under the same terms as Perl itself.
|
---|
1109 |
|
---|
1110 | For more information please see L<threads> and L<threads::shared>.
|
---|