=head1 NAME

perlstyle - Perl style guide

=head1 DESCRIPTION

Each programmer will, of course, have his or her own preferences
regarding formatting, but there are some general guidelines that will
make your programs easier to read, understand, and maintain.

The most important thing is to run your programs under the B<-w>
flag at all times. You may turn it off explicitly for particular
portions of code via the C<no warnings> pragma or the C<$^W> variable
if you must. You should also always run under C<use strict> or know the
reason why not. The C<use sigtrap> and even C<use diagnostics> pragmas
may also prove useful.
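
As a minimal sketch of that advice (the C<join_fields()> helper and its
data are hypothetical, invented for illustration), the following enables
both pragmas for the whole file and turns off a single warning category
inside one block only:

```perl
use strict;
use warnings;   # lexically scoped counterpart of the -w flag

# Hypothetical helper: joins fields, treating undef as an empty string.
sub join_fields {
    my @fields = @_;

    # Silence one warning category, and only until the end of this block.
    no warnings 'uninitialized';
    return join ',', @fields;
}

print join_fields('a', undef, 'c'), "\n";
```

Outside C<join_fields()> an undefined value in a string operation would
still produce a warning, which is usually what you want.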

Regarding aesthetics of code layout, about the only thing Larry
cares strongly about is that the closing curly bracket of
a multi-line BLOCK should line up with the keyword that started the construct.
Beyond that, he has other preferences that aren't so strong:

=over 4

=item *

4-column indent.

=item *

Opening curly on same line as keyword, if possible, otherwise line up.

=item *

Space before the opening curly of a multi-line BLOCK.

=item *

One-line BLOCK may be put on one line, including curlies.

=item *

No space before the semicolon.

=item *

Semicolon omitted in "short" one-line BLOCK.

=item *

Space around most operators.

=item *

Space around a "complex" subscript (inside brackets).

=item *

Blank lines between chunks that do different things.

=item *

Uncuddled elses.

=item *

No space between function name and its opening parenthesis.

=item *

Space after each comma.

=item *

Long lines broken after an operator (except C<and> and C<or>).

=item *

Space after last parenthesis matching on current line.

=item *

Line up corresponding items vertically.

=item *

Omit redundant punctuation as long as clarity doesn't suffer.

=back

Larry has his reasons for each of these things, but he doesn't claim that
everyone else's mind works the same as his does.
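
Several of the layout preferences listed above might look like this in
practice. This is only a sketch; the C<%size_of> data and all variable
names are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical data, present only to give the layout something to format.
my %size_of = (
    apple  => 3,
    banana => 12,
    cherry => 7,
);

my @report;
for my $name (sort keys %size_of) {
    my $size = $size_of{$name};

    if ($size > 10) {
        push @report, "$name: large";
    }
    else {
        push @report, "$name: small";
    }
}

# A "short" one-line BLOCK: curlies on one line, semicolon omitted.
my @big = grep { $size_of{$_} > 5 } sort keys %size_of;

print "$_\n" for @report;
```

Note the 4-column indent, the closing curly lined up with its keyword,
the uncuddled C<else>, and the vertically aligned C<< => >> operators.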

Here are some other more substantive style issues to think about:

=over 4

=item *

Just because you I<CAN> do something a particular way doesn't mean that
you I<SHOULD> do it that way. Perl is designed to give you several
ways to do anything, so consider picking the most readable one. For
instance

    open(my $fh, '<', $foo) || die "Can't open $foo: $!";

is better than

    die "Can't open $foo: $!" unless open(my $fh, '<', $foo);

because the second way hides the main point of the statement in a
modifier. On the other hand

    print "Starting analysis\n" if $verbose;

is better than

    $verbose && print "Starting analysis\n";

because the main point isn't whether the user typed B<-v> or not.

Similarly, just because an operator lets you assume default arguments
doesn't mean that you have to make use of the defaults. The defaults
are there for lazy systems programmers writing one-shot programs. If
you want your program to be readable, consider supplying the argument.

Along the same lines, just because you I<CAN> omit parentheses in many
places doesn't mean that you ought to:

    return print reverse sort num values %array;
    return print(reverse(sort num (values(%array))));

When in doubt, parenthesize. At the very least it will let some poor
schmuck bounce on the % key in B<vi>.

Even if you aren't in doubt, consider the mental welfare of the person
who has to maintain the code after you, and who will probably put
parentheses in the wrong place.

=item *

Don't go through silly contortions to exit a loop at the top or the
bottom, when Perl provides the C<last> operator so you can exit in
the middle. Just "outdent" it a little to make it more visible:

    LINE:
        for (;;) {
            statements;
          last LINE if $foo;
            next LINE if /^#/;
            statements;
        }

=item *

Don't be afraid to use loop labels--they're there to enhance
readability as well as to allow multilevel loop breaks. See the
previous example.

=item *

Avoid using C<grep()> (or C<map()>) or `backticks` in a void context, that is,
when you just throw away their return values. Those functions all
have return values, so use them. Otherwise use a C<foreach()> loop or
the C<system()> function instead.
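
As a sketch of the distinction (the C<@words> list is hypothetical), a
C<foreach> loop carries the side effects, while C<map()> and C<grep()>
appear only where their return values are kept:

```perl
use strict;
use warnings;

my @words = qw(apple Banana cherry);    # hypothetical input

# Side effects only: a foreach loop says so directly.
my @seen;
foreach my $word (@words) {
    push @seen, lc $word;
}

# Return values wanted: this is what map() and grep() are for.
my @lengths = map  { length($_) }     @words;
my @long    = grep { length($_) > 5 } @words;

print "@seen\n@lengths\n@long\n";
```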

=item *

For portability, when using features that may not be implemented on
every machine, test the construct in an eval to see if it fails. If
you know in which version or patchlevel a particular feature was
implemented, you can test C<$]> (C<$PERL_VERSION> in C<English>) to see if it
will be there. The C<Config> module will also let you interrogate values
determined by the B<Configure> program when Perl was installed.
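
A minimal sketch of the three checks, assuming C<symlink()> as the
feature being probed and Perl 5.10 as the version of interest (both
chosen only for illustration):

```perl
use strict;
use warnings;
use Config;

# Probe for a feature by trying it inside eval.  symlink() is just an
# example of a call that is not implemented on every platform.
my $has_symlink = eval { symlink('', ''); 1 } ? 1 : 0;

# Compare $] numerically; 5.010 is its value for Perl 5.10.0.
my $has_5_10 = $] >= 5.010 ? 1 : 0;

# %Config holds the answers that Configure recorded at build time.
my $osname = $Config{osname};

print "symlink: $has_symlink, 5.10+: $has_5_10, osname: $osname\n";
```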

=item *

Choose mnemonic identifiers. If you can't remember what mnemonic means,
you've got a problem.

=item *

While short identifiers like C<$gotit> are probably ok, use underscores to
separate words in longer identifiers. It is generally easier to read
C<$var_names_like_this> than C<$VarNamesLikeThis>, especially for
non-native speakers of English. It's also a simple rule that works
consistently with C<VAR_NAMES_LIKE_THIS>.

Package names are sometimes an exception to this rule. Perl informally
reserves lowercase module names for "pragma" modules like C<integer> and
C<strict>. Other modules should begin with a capital letter and use mixed
case, but probably without underscores due to limitations in primitive
file systems' representations of module names as files that must fit into a
few sparse bytes.

=item *

You may find it helpful to use letter case to indicate the scope
or nature of a variable. For example:

    $ALL_CAPS_HERE   constants only (beware clashes with perl vars!)
    $Some_Caps_Here  package-wide global/static
    $no_caps_here    function scope my() or local() variables

Function and method names seem to work best as all lowercase.
E.g., C<$obj-E<gt>as_string()>.

You can use a leading underscore to indicate that a variable or
function should not be used outside the package that defined it.
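
One way these naming conventions could fit together is sketched below.
The C<My::Counter> package and everything in it are hypothetical:

```perl
package My::Counter;    # hypothetical package: mixed case, no underscores

use strict;
use warnings;

our $MAX_COUNT   = 3;   # all caps: treated as a constant
our $Total_Calls = 0;   # some caps: package-wide global

# Leading underscore: not meant to be called from outside this package.
sub _clamp {
    my ($value) = @_;
    return $value > $MAX_COUNT ? $MAX_COUNT : $value;
}

# Public function name: all lowercase, words separated by underscores.
sub next_count {
    my ($current_count) = @_;    # no caps: lexical, function scope
    $Total_Calls++;
    return _clamp($current_count + 1);
}

package main;

print My::Counter::next_count(1), "\n";
print My::Counter::next_count(3), "\n";
```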

=item *

If you have a really hairy regular expression, use the C</x> modifier and
put in some whitespace to make it look a little less like line noise.
Don't use slash as a delimiter when your regexp has slashes or backslashes.
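
For illustration, here is a sketch that applies both suggestions to a
hypothetical path; the C<m{...}> delimiters are one choice among many:

```perl
use strict;
use warnings;

# Hypothetical input: a Unix-style path with a version number in it.
my $path = '/usr/local/lib/perl5/5.36.0/strict.pm';

# /x permits whitespace and comments inside the pattern, and the m{...}
# delimiters mean the slashes in the path need no escaping.
my ($major, $minor, $file) = $path =~ m{
    /perl5/
    (\d+) \. (\d+) \. \d+    # version: major.minor.patch
    /
    ( [^/]+ )                # file name: everything after the last slash
    \z
}x;

print "$major.$minor $file\n";
```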

=item *

Use the C<and> and C<or> operators to avoid having to parenthesize
list operators so much, and to reduce the incidence of punctuation
operators like C<&&> and C<||>. Call your subroutines as if they were
functions or list operators to avoid excessive ampersands and parentheses.
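
A small sketch of both points, assuming a hypothetical C<first_defined()>
subroutine that is declared before it is called so that it can be used
without parentheses:

```perl
use strict;
use warnings;

# Hypothetical subroutine, called below like a built-in list operator.
sub first_defined {
    for my $value (@_) {
        return $value if defined $value;
    }
    return;
}

my @queue = (undef, 'job-7', 'job-9');    # hypothetical data

# "or" binds more loosely than both the list operator and the
# assignment, so no parentheses are needed around the arguments.
my $job = first_defined @queue
    or die "queue is empty\n";

print "next: $job\n";
```

With C<||> in place of C<or>, the test would apply to C<@queue> alone
rather than to the result of the call.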

=item *

Use here documents instead of repeated C<print()> statements.
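
A brief sketch, with a hypothetical program name and option list:

```perl
use strict;
use warnings;

my $program = 'frobnicate';    # hypothetical program name

# One here document instead of four separate print() calls.
my $usage = <<"END_USAGE";
Usage: $program [-v] [-o FILE] INPUT

    -v        print progress messages
    -o FILE   write output to FILE
END_USAGE

print $usage;
```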

=item *

Line up corresponding things vertically, especially if it'd be too long
to fit on one line anyway.

    $IDX = $ST_MTIME;
    $IDX = $ST_ATIME       if $opt_u;
    $IDX = $ST_CTIME       if $opt_c;
    $IDX = $ST_SIZE        if $opt_s;

    mkdir $tmpdir, 0700 or die "can't mkdir $tmpdir: $!";
    chdir($tmpdir)      or die "can't chdir $tmpdir: $!";
    mkdir 'tmp',   0777 or die "can't mkdir $tmpdir/tmp: $!";

=item *

Always check the return codes of system calls. Good error messages should
go to C<STDERR>, include which program caused the problem, what the failed
system call and arguments were, and (VERY IMPORTANT) should contain the
standard system error message for what went wrong. Here's a simple but
sufficient example:

    opendir(my $dh, $dir) or die "can't opendir $dir: $!";
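
A slightly longer sketch that spells out each part of such a message by
hand. The directory name is hypothetical and is assumed not to exist:

```perl
use strict;
use warnings;
use File::Basename qw(basename);

my $dir = '/no/such/directory';    # hypothetical path, assumed missing

# warn() writes to STDERR.  The message names the program ($0), the
# failed call with its argument, and the system error text from $!.
my $message;
if (opendir(my $dh, $dir)) {
    closedir($dh);
}
else {
    $message = basename($0) . ": can't opendir $dir: $!\n";
    warn $message;
}
```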

=item *

Line up your transliterations when it makes sense:

    tr [abc]
       [xyz];

=item *

Think about reusability. Why waste brainpower on a one-shot when you
might want to do something like it again? Consider generalizing your
code. Consider writing a module or object class. Consider making your
code run cleanly with C<use strict> and C<use warnings> (or B<-w>) in
effect. Consider giving away your code. Consider changing your whole
world view. Consider... oh, never mind.

=item *

Try to document your code and use Pod formatting in a consistent way. Here
are commonly expected conventions:

=over 4

=item *

Use C<CE<lt>E<gt>> for function, variable and module names (and more
generally anything that can be considered part of code, like filehandles
or specific values). Note that function names are considered more readable
with parentheses after their name, that is C<function()>.

=item *

Use C<BE<lt>E<gt>> for command names like B<cat> or B<grep>.

=item *

Use C<FE<lt>E<gt>> or C<CE<lt>E<gt>> for file names. C<FE<lt>E<gt>> should
be the only Pod code for file names, but as most Pod formatters render it
as italic, Unix and Windows paths with their slashes and backslashes may
be less readable, and better rendered with C<CE<lt>E<gt>>.

=back

=item *

Be consistent.

=item *

Be nice.

=back