=head1 NAME

JSON::XS - JSON serialising/deserialising, done correctly and fast

=encoding utf-8

JSON::XS - correct and fast JSON serialiser/deserialiser
(http://fleur.hio.jp/perldoc/mix/lib/JSON/XS.html)

=head1 SYNOPSIS

 use JSON::XS;

 # exported functions, they croak on error
 # and expect/generate UTF-8

 $utf8_encoded_json_text = encode_json $perl_hash_or_arrayref;
 $perl_hash_or_arrayref  = decode_json $utf8_encoded_json_text;

 # OO-interface

 $coder = JSON::XS->new->ascii->pretty->allow_nonref;
 $pretty_printed_unencoded = $coder->encode ($perl_scalar);
 $perl_scalar = $coder->decode ($unicode_json_text);

 # Note that JSON version 2.0 and above will automatically use JSON::XS
 # if available, at virtually no speed overhead either, so you should
 # be able to just:

 use JSON;

 # and do the same things, except that you have a pure-perl fallback now.

=head1 DESCRIPTION

This module converts Perl data structures to JSON and vice versa. Its
primary goal is to be I<correct> and its secondary goal is to be
I<fast>. To reach the latter goal it was written in C.

Beginning with version 2.0 of the JSON module, when both JSON and
JSON::XS are installed, JSON will use JSON::XS as its backend (this can
be overridden) with no overhead due to emulation (by inheriting
constructor and methods). If JSON::XS is not available, it will fall back
to the compatible JSON::PP module as backend, so using JSON instead of
JSON::XS gives you a portable JSON API that can be fast when you need it
to be and doesn't require a C compiler when that is a problem.

As this is the n-th-something JSON module on CPAN, what was the reason
to write yet another JSON module? While it seems there are many JSON
modules, none of them correctly handles all corner cases, and in most
cases their maintainers are unresponsive, gone missing, or not listening
to bug reports for other reasons.

See MAPPING, below, on how JSON::XS maps perl values to JSON values and
vice versa.

=head2 FEATURES

=over 4

=item * correct Unicode handling

This module knows how to handle Unicode, documents how and when it does
so, and even documents what "correct" means.

=item * round-trip integrity

When you serialise a perl data structure using only data types supported
by JSON, the deserialised data structure is identical on the Perl level
(e.g. the string "2.0" doesn't suddenly become "2" just because it looks
like a number). There I<are> minor exceptions to this, read the MAPPING
section below to learn about those.

=item * strict checking of JSON correctness

There is no guessing, no generating of illegal JSON texts by default,
and only JSON is accepted as input by default (the latter is a security
feature).

=item * fast

Compared to other JSON modules and other serialisers such as Storable,
this module usually compares favourably in terms of speed, too.

=item * simple to use

This module has both a simple functional interface and an
object-oriented interface.

=item * reasonably versatile output formats

You can choose between the most compact guaranteed-single-line format
possible (nice for simple line-based protocols), a pure-ASCII format
(for when your transport is not 8-bit clean, still supports the whole
Unicode range), or a pretty-printed format (for when you want to read that
stuff). Or you can combine those features in whatever way you like.

=back

=cut

package JSON::XS;

use common::sense;

our $VERSION = '2.25';
our @ISA = qw(Exporter);

our @EXPORT = qw(encode_json decode_json to_json from_json);

sub to_json($) {
   require Carp;
   Carp::croak ("JSON::XS::to_json has been renamed to encode_json, either downgrade to pre-2.0 versions of JSON::XS or rename the call");
}

sub from_json($) {
   require Carp;
   Carp::croak ("JSON::XS::from_json has been renamed to decode_json, either downgrade to pre-2.0 versions of JSON::XS or rename the call");
}

use Exporter;
use XSLoader;

=head1 FUNCTIONAL INTERFACE

The following convenience functions are provided by this module. They are
exported by default:

=over 4

=item $json_text = encode_json $perl_scalar

Converts the given Perl data structure to a UTF-8 encoded, binary string
(that is, the string contains octets only). Croaks on error.

This function call is functionally identical to:

   $json_text = JSON::XS->new->utf8->encode ($perl_scalar)

Except that it is faster.

=item $perl_scalar = decode_json $json_text

The opposite of C<encode_json>: expects a UTF-8 (binary) string and tries
to parse that as a UTF-8 encoded JSON text, returning the resulting
reference. Croaks on error.

This function call is functionally identical to:

   $perl_scalar = JSON::XS->new->utf8->decode ($json_text)

Except that it is faster.

=item $is_boolean = JSON::XS::is_bool $scalar

Returns true if the passed scalar represents either JSON::XS::true or
JSON::XS::false, two constants that act like C<1> and C<0>, respectively,
and are used to represent JSON C<true> and C<false> values in Perl.

See MAPPING, below, for more information on how JSON values are mapped to
Perl.

=back


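As a minimal sketch of the functional interface, a round trip with error
handling might look like the following. The sample data and variable names
are illustrative assumptions, not part of the module.

```perl
use strict;
use warnings;
use JSON::XS;

# Hypothetical sample data; any hashref or arrayref works.
my $data = { name => "caf\x{e9}", tags => ["a", "b"] };

# encode_json returns UTF-8 encoded octets (U+00E9 becomes two bytes).
my $octets = encode_json $data;

# decode_json expects such octets and rebuilds the Perl structure.
my $copy = decode_json $octets;

# Both functions croak on error, so guard untrusted input with eval.
my $result = eval { decode_json '{"unterminated": ' };
my $error  = $@;

# is_bool distinguishes decoded JSON booleans from ordinary numbers.
my $values  = decode_json '[true, false, 1]';
my @is_bool = map { JSON::XS::is_bool ($_) ? 1 : 0 } @$values;
```

Wrapping the call in C<eval> is one common way to turn the croak into a
value you can inspect. Whether that is appropriate depends on how your
application reports errors.
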
=head1 A FEW NOTES ON UNICODE AND PERL

Since this often leads to confusion, here are a few very clear words on
how Unicode works in Perl, modulo bugs.

=over 4

=item 1. Perl strings can store characters with ordinal values > 255.

This enables you to store Unicode characters as single characters in a
Perl string - very natural.

=item 2. Perl does I<not> associate an encoding with your strings.

... until you force it to, e.g. when matching it against a regex, or
printing the scalar to a file, in which case Perl either interprets your
string as locale-encoded text, octets/binary, or as Unicode, depending
on various settings. In no case is an encoding stored together with your
data, it is I<use> that decides encoding, not any magical meta data.

=item 3. The internal utf-8 flag has no meaning with regard to the
encoding of your string.

Just ignore that flag unless you debug a Perl bug, a module written in
XS or want to dive into the internals of perl. Otherwise it will only
confuse you, as, despite the name, it says nothing about how your string
is encoded. You can have Unicode strings with that flag set, with that
flag clear, and you can have binary data with that flag set and that flag
clear. Other possibilities exist, too.

If you didn't know about that flag, all the better, pretend it doesn't
exist.

=item 4. A "Unicode String" is simply a string where each character can be
validly interpreted as a Unicode code point.

If you have UTF-8 encoded data, it is no longer a Unicode string, but a
Unicode string encoded in UTF-8, giving you a binary string.

=item 5. A string containing "high" (> 255) character values is I<not> a UTF-8 string.

It's a fact. Learn to live with it.

=back

I hope this helps :)


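To make points 4 and 5 concrete, here is a small sketch using the core
Encode module. The euro sign is an arbitrary example character and the
variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# A Unicode string: a single character with an ordinal value > 255.
my $unicode = "\x{20ac}";

# Encoding it as UTF-8 yields a binary string of three octets.
my $octets = encode ("UTF-8", $unicode);

# Decoding the octets gives back the original one-character string.
my $round_trip = decode ("UTF-8", $octets);

printf "characters: %d, octets: %d\n", length $unicode, length $octets;
```

The first string is what this document calls a Unicode string, the second
is that string encoded in UTF-8, that is, binary data.
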
=head1 OBJECT-ORIENTED INTERFACE

The object oriented interface lets you configure your own encoding or
decoding style, within the limits of supported formats.

=over 4

=item $json = new JSON::XS

Creates a new JSON::XS object that can be used to de/encode JSON
strings. All boolean flags described below are by default I<disabled>.

The mutators for flags all return the JSON object again and thus calls can
be chained:

   my $json = JSON::XS->new->utf8->space_after->encode ({a => [1,2]})
   => {"a": [1, 2]}

=item $json = $json->ascii ([$enable])

=item $enabled = $json->get_ascii

If C<$enable> is true (or missing), then the C<encode> method will not
generate characters outside the code range C<0..127> (which is ASCII). Any
Unicode characters outside that range will be escaped using either a
single \uXXXX (BMP characters) or a double \uHHHH\uLLLL escape sequence,
as per RFC4627. The resulting encoded JSON text can be treated as a native
Unicode string, an ascii-encoded, latin1-encoded or UTF-8 encoded string,
or any other superset of ASCII.

If C<$enable> is false, then the C<encode> method will not escape Unicode
characters unless required by the JSON syntax or other flags. This results
in a faster and more compact format.

See also the section I<ENCODING/CODESET FLAG NOTES> later in this
document.

The main use for this flag is to produce JSON texts that can be
transmitted over a 7-bit channel, as the encoded JSON texts will not
contain any 8 bit characters.

   JSON::XS->new->ascii (1)->encode ([chr 0x10401])
   => ["\ud801\udc01"]

=item $json = $json->latin1 ([$enable])

=item $enabled = $json->get_latin1

If C<$enable> is true (or missing), then the C<encode> method will encode
the resulting JSON text as latin1 (or iso-8859-1), escaping any characters
outside the code range C<0..255>. The resulting string can be treated as a
latin1-encoded JSON text or a native Unicode string. The C<decode> method
will not be affected in any way by this flag, as C<decode> by default
expects Unicode, which is a strict superset of latin1.

If C<$enable> is false, then the C<encode> method will not escape Unicode
characters unless required by the JSON syntax or other flags.

See also the section I<ENCODING/CODESET FLAG NOTES> later in this
document.

The main use for this flag is efficiently encoding binary data as JSON
text, as most octets will not be escaped, resulting in a smaller encoded
size. The disadvantage is that the resulting JSON text is encoded
in latin1 (and must correctly be treated as such when storing and
transferring), a rare encoding for JSON. It is therefore most useful when
you want to store data structures known to contain binary data efficiently
in files or databases, not when talking to other JSON encoders/decoders.

   JSON::XS->new->latin1->encode (["\x{89}\x{abc}"])
   => ["\x{89}\\u0abc"]    # (perl syntax, U+abc escaped, U+89 not)

=item $json = $json->utf8 ([$enable])

=item $enabled = $json->get_utf8

If C<$enable> is true (or missing), then the C<encode> method will encode
the JSON result into UTF-8, as required by many protocols, while the
C<decode> method expects to be handed a UTF-8-encoded string. Please
note that UTF-8-encoded strings do not contain any characters outside the
range C<0..255>, they are thus useful for bytewise/binary I/O. In future
versions, enabling this option might enable autodetection of the UTF-16
and UTF-32 encoding families, as described in RFC4627.

If C<$enable> is false, then the C<encode> method will return the JSON
string as a (non-encoded) Unicode string, while C<decode> thus expects a
Unicode string. Any decoding or encoding (e.g. to UTF-8 or UTF-16) needs
to be done yourself, e.g. using the Encode module.

See also the section I<ENCODING/CODESET FLAG NOTES> later in this
document.

Example, output UTF-16BE-encoded JSON:

   use Encode;
   $jsontext = encode "UTF-16BE", JSON::XS->new->encode ($object);

Example, decode UTF-32LE-encoded JSON:

   use Encode;
   $object = JSON::XS->new->decode (decode "UTF-32LE", $jsontext);

=item $json = $json->pretty ([$enable])

This enables (or disables) all of the C<indent>, C<space_before> and
C<space_after> (and in the future possibly more) flags in one call to
generate the most readable (or most compact) form possible.

Example, pretty-print some simple structure:

   my $json = JSON::XS->new->pretty(1)->encode ({a => [1,2]})
   =>
   {
      "a" : [
         1,
         2
      ]
   }

=item $json = $json->indent ([$enable])

=item $enabled = $json->get_indent

If C<$enable> is true (or missing), then the C<encode> method will use a
multiline format as output, putting every array member or object/hash
key-value pair into its own line, indenting them properly.

If C<$enable> is false, no newlines or indenting will be produced, and the
resulting JSON text is guaranteed not to contain any newlines.

This setting has no effect when decoding JSON texts.

=item $json = $json->space_before ([$enable])

=item $enabled = $json->get_space_before

If C<$enable> is true (or missing), then the C<encode> method will add an
extra optional space before the C<:> separating keys from values in JSON
objects.

If C<$enable> is false, then the C<encode> method will not add any extra
space at those places.

This setting has no effect when decoding JSON texts. You will also
most likely combine this setting with C<space_after>.

Example, space_before enabled, space_after and indent disabled:

   {"key" :"value"}

=item $json = $json->space_after ([$enable])

=item $enabled = $json->get_space_after

If C<$enable> is true (or missing), then the C<encode> method will add an
extra optional space after the C<:> separating keys from values in JSON
objects and extra whitespace after the C<,> separating key-value pairs and
array members.

If C<$enable> is false, then the C<encode> method will not add any extra
space at those places.

This setting has no effect when decoding JSON texts.

Example, space_before and indent disabled, space_after enabled:

   {"key": "value"}

=item $json = $json->relaxed ([$enable])

=item $enabled = $json->get_relaxed

If C<$enable> is true (or missing), then C<decode> will accept some
extensions to normal JSON syntax (see below). C<encode> will not be
affected in any way. I<Be aware that this option makes you accept invalid
JSON texts as if they were valid!> I suggest using this option only to
parse application-specific files written by humans (configuration files,
resource files etc.).

If C<$enable> is false (the default), then C<decode> will only accept
valid JSON texts.

Currently accepted extensions are:

=over 4

=item * list items can have an end-comma

JSON I<separates> array elements and key-value pairs with commas. This
can be annoying if you write JSON texts manually and want to be able to
quickly append elements, so this extension accepts a comma at the end of
such items, not just between them:

   [
      1,
      2, <- this comma not normally allowed
   ]
   {
      "k1": "v1",
      "k2": "v2", <- this comma not normally allowed
   }

=item * shell-style '#'-comments

Whenever JSON allows whitespace, shell-style comments are additionally
allowed. They are terminated by the first carriage-return or line-feed
character, after which more white-space and comments are allowed.

   [
      1, # this comment not allowed in JSON
         # neither this one...
   ]

=back

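The two extensions can be tried out together. The following is only a
sketch: the configuration text and its keys are made up for illustration,
and it simply contrasts a default decoder with a relaxed one.

```perl
use strict;
use warnings;
use JSON::XS;

# Hypothetical hand-written configuration using both extensions.
my $config_text = <<'JSON';
{
   "name": "demo",   # shell-style comment
   "ports": [
      8080,
      8081,          # a trailing comma precedes the bracket
   ],
}
JSON

# The default (strict) decoder rejects this text and croaks.
my $strict_result = eval { JSON::XS->new->decode ($config_text) };
my $strict_error  = $@;

# The relaxed decoder accepts it.
my $relaxed_result = JSON::XS->new->relaxed->decode ($config_text);
```
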
=item $json = $json->canonical ([$enable])

=item $enabled = $json->get_canonical

If C<$enable> is true (or missing), then the C<encode> method will output
JSON objects by sorting their keys. This adds a comparatively high
overhead.

If C<$enable> is false, then the C<encode> method will output key-value
pairs in the order Perl stores them (which will likely change between runs
of the same script).

This option is useful if you want the same data structure to be encoded as
the same JSON text (given the same overall settings). If it is disabled,
the same hash might be encoded differently even if it contains the same
data, as key-value pairs have no inherent ordering in Perl.

This setting has no effect when decoding JSON texts.

This setting currently has no effect on tied hashes.

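A short sketch of the effect, using a made-up three-key hash: with
C<canonical> enabled the keys come out sorted, so the text is stable across
runs.

```perl
use strict;
use warnings;
use JSON::XS;

# Hypothetical data; Perl's own hash order is unspecified.
my %settings = (b => 2, a => 1, c => 3);

# With canonical enabled, object keys are emitted in sorted order.
my $text = JSON::XS->new->canonical->encode (\%settings);

print "$text\n";   # {"a":1,"b":2,"c":3}
```

This is handy when the JSON text is compared, hashed or stored under
version control, at the cost of the sorting overhead mentioned above.
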
=item $json = $json->allow_nonref ([$enable])

=item $enabled = $json->get_allow_nonref

If C<$enable> is true (or missing), then the C<encode> method can convert a
non-reference into its corresponding string, number or null JSON value,
which is an extension to RFC4627. Likewise, C<decode> will accept those JSON
values instead of croaking.

If C<$enable> is false, then the C<encode> method will croak if it isn't
passed an arrayref or hashref, as JSON texts must either be an object
or array. Likewise, C<decode> will croak if given something that is not a
JSON object or array.

Example, encode a Perl scalar as JSON value with enabled C<allow_nonref>,
resulting in an invalid JSON text:

   JSON::XS->new->allow_nonref->encode ("Hello, World!")
   => "Hello, World!"

=item $json = $json->allow_unknown ([$enable])

=item $enabled = $json->get_allow_unknown

If C<$enable> is true (or missing), then C<encode> will I<not> throw an
exception when it encounters values it cannot represent in JSON (for
example, filehandles) but instead will encode a JSON C<null> value. Note
that blessed objects are not included here and are handled separately by
C<allow_blessed>.

If C<$enable> is false (the default), then C<encode> will throw an
exception when it encounters anything it cannot encode as JSON.

This option does not affect C<decode> in any way, and it is recommended to
leave it off unless you know your communications partner.

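As a rough illustration, the sketch below uses a code reference as the
unrepresentable value (an assumption made for brevity; the text above
mentions filehandles as another example).

```perl
use strict;
use warnings;
use JSON::XS;

# A code reference has no JSON representation.
my @values = (1, sub { 42 });

# By default, encode croaks on such a value.
my $strict_text  = eval { JSON::XS->new->encode (\@values) };
my $strict_error = $@;

# With allow_unknown, the value is encoded as null instead.
my $lenient_text = JSON::XS->new->allow_unknown->encode (\@values);

print "$lenient_text\n";   # [1,null]
```
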
=item $json = $json->allow_blessed ([$enable])

=item $enabled = $json->get_allow_blessed

If C<$enable> is true (or missing), then the C<encode> method will not
barf when it encounters a blessed reference. Instead, the value of the
B<convert_blessed> option will decide whether C<null> (C<convert_blessed>
disabled or no C<TO_JSON> method found) or a representation of the
object (C<convert_blessed> enabled and C<TO_JSON> method found) is being
encoded. Has no effect on C<decode>.

If C<$enable> is false (the default), then C<encode> will throw an
exception when it encounters a blessed object.

=item $json = $json->convert_blessed ([$enable])

=item $enabled = $json->get_convert_blessed

If C<$enable> is true (or missing), then C<encode>, upon encountering a
blessed object, will check for the availability of the C<TO_JSON> method
on the object's class. If found, it will be called in scalar context
and the resulting scalar will be encoded instead of the object. If no
C<TO_JSON> method is found, the value of C<allow_blessed> will decide what
to do.

The C<TO_JSON> method may safely call die if it wants. If C<TO_JSON>
returns other blessed objects, those will be handled in the same
way. C<TO_JSON> must take care of not causing an endless recursion cycle
(== crash) in this case. The name of C<TO_JSON> was chosen because other
methods called by the Perl core (== not by the user of the object) are
usually in upper case letters and to avoid collisions with any C<to_json>
function or method.

This setting does not yet influence C<decode> in any way, but in the
future, global hooks might get installed that influence C<decode> and are
enabled by this setting.

If C<$enable> is false, then the C<allow_blessed> setting will decide what
to do when a blessed object is found.

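The interplay of C<allow_blessed> and C<convert_blessed> can be sketched as
follows. The C<Point> class is hypothetical and exists only for this
illustration; C<canonical> is enabled merely to make the output
predictable.

```perl
use strict;
use warnings;
use JSON::XS;

# Hypothetical class with a TO_JSON method.
package Point;

sub new     { my ($class, %args) = @_; bless {%args}, $class }
sub TO_JSON { my ($self) = @_; return { x => $self->{x}, y => $self->{y} } }

package main;

my $point = Point->new (x => 1, y => 2);

# Neither flag set: encode croaks on the blessed object.
my $rejected = eval { JSON::XS->new->encode ([$point]) };

# allow_blessed only: the object is encoded as null.
my $as_null = JSON::XS->new->allow_blessed->encode ([$point]);

# convert_blessed: TO_JSON is called and its result is encoded.
my $converted = JSON::XS->new->canonical->convert_blessed->encode ([$point]);

print "$as_null $converted\n";   # [null] [{"x":1,"y":2}]
```
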
=item $json = $json->filter_json_object ([$coderef->($hashref)])

When C<$coderef> is specified, it will be called from C<decode> each
time it decodes a JSON object. The only argument is a reference to the
newly-created hash. If the code reference returns a single scalar (which
need not be a reference), this value (i.e. a copy of that scalar to avoid
aliasing) is inserted into the deserialised data structure. If it returns
an empty list (NOTE: I<not> C<undef>, which is a valid scalar), the
original deserialised hash will be inserted. This setting can slow down
decoding considerably.

When C<$coderef> is omitted or undefined, any existing callback will
be removed and C<decode> will not change the deserialised hash in any
way.

Example, convert all JSON objects into the integer 5:

   my $js = JSON::XS->new->filter_json_object (sub { 5 });
   # returns [5]
   $js->decode ('[{}]');
   # throw an exception because allow_nonref is not enabled
   # so a lone 5 is not allowed.
   $js->decode ('{"a":1, "b":2}');

=item $json = $json->filter_json_single_key_object ($key [=> $coderef->($value)])

Works somewhat like C<filter_json_object>, but is only called for
JSON objects having a single key named C<$key>.

This C<$coderef> is called before the one specified via
C<filter_json_object>, if any. It gets passed the single value in the JSON
object. If it returns a single value, it will be inserted into the data
structure. If it returns nothing (not even C<undef> but the empty list),
the callback from C<filter_json_object> will be called next, as if no
single-key callback were specified.

If C<$coderef> is omitted or undefined, the corresponding callback will be
disabled. There can only ever be one callback for a given key.

As this callback gets called less often than the C<filter_json_object>
one, decoding speed will not usually suffer as much. Therefore, single-key
objects make excellent targets to serialise Perl objects into, especially
as single-key JSON objects are as close to the type-tagged value concept
as JSON gets (it's basically an ID/VALUE tuple). Of course, JSON does not
support this in any way, so you need to make sure your data never looks
like a serialised Perl hash.

Typical names for the single object key are C<__class_whatever__>, or
C<$__dollars_are_rarely_used__$> or C<}ugly_brace_placement>, or even
things like C<__class_md5sum(classname)__>, to reduce the risk of clashing
with real hashes.

Example, decode JSON objects of the form C<< { "__widget__" => <id> } >>
into the corresponding C<< $WIDGET{<id>} >> object:

   # return whatever is in $WIDGET{5}:
   JSON::XS
      ->new
      ->filter_json_single_key_object (__widget__ => sub {
            $WIDGET{ $_[0] }
         })
      ->decode ('{"__widget__": 5}')

   # this can be used with a TO_JSON method in some "widget" class
   # for serialisation to json:
   sub WidgetBase::TO_JSON {
      my ($self) = @_;

      unless ($self->{id}) {
         $self->{id} = ..get..some..id..;
         $WIDGET{$self->{id}} = $self;
      }

      { __widget__ => $self->{id} }
   }

=item $json = $json->shrink ([$enable])

=item $enabled = $json->get_shrink

Perl usually over-allocates memory a bit when allocating space for
strings. This flag optionally resizes strings generated by either
C<encode> or C<decode> to their minimum size possible. This can save
memory when your JSON texts are either very long or you have many
short strings. It will also try to downgrade any strings to octet-form
if possible: perl stores strings internally either in an encoding called
UTF-X or in octet-form. The latter cannot store everything but uses less
space in general (and some buggy Perl or C code might even rely on that
internal representation being used).

The actual definition of what shrink does might change in future versions,
but it will always try to save space at the expense of time.

If C<$enable> is true (or missing), the string returned by C<encode> will
be shrunk-to-fit, while all strings generated by C<decode> will also be
shrunk-to-fit.

If C<$enable> is false, then the normal perl allocation algorithms are used.
If you work with your data, then this is likely to be faster.

In the future, this setting might control other things, such as converting
strings that look like integers or floats into integers or floats
internally (there is no difference on the Perl level), saving space.

=item $json = $json->max_depth ([$maximum_nesting_depth])

=item $max_depth = $json->get_max_depth

Sets the maximum nesting level (default C<512>) accepted while encoding
or decoding. If a higher nesting level is detected in JSON text or a Perl
data structure, then the encoder and decoder will stop and croak at that
point.

Nesting level is defined by the number of hash- or arrayrefs that the
encoder needs to traverse to reach a given point or the number of C<{> or
C<[> characters without their matching closing bracket crossed to reach a
given character in a string.

Setting the maximum depth to one disallows any nesting, which ensures
that the object is only a single hash/object or array.

If no argument is given, the highest possible setting will be used, which
is rarely useful.

Note that nesting is implemented by recursion in C. The default value has
been chosen to be as large as typical operating systems allow without
crashing.

See SECURITY CONSIDERATIONS, below, for more info on why this is useful.

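A minimal sketch of the depth limit, assuming a limit of one as described
above (variable names are illustrative):

```perl
use strict;
use warnings;
use JSON::XS;

# Allow only a single, non-nested array or object.
my $flat_only = JSON::XS->new->max_depth (1);

# A flat array is within the limit.
my $flat = $flat_only->decode ('[1,2,3]');

# One level of nesting exceeds it, so decode croaks.
my $nested      = eval { $flat_only->decode ('[[1]]') };
my $depth_error = $@;
```
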
654 | =item $json = $json->max_size ([$maximum_string_size])
|
---|
655 |
|
---|
656 | =item $max_size = $json->get_max_size
|
---|
657 |
|
---|
658 | Set the maximum length a JSON text may have (in bytes) where decoding is
|
---|
659 | being attempted. The default is C<0>, meaning no limit. When C<decode>
|
---|
660 | is called on a string that is longer then this many bytes, it will not
|
---|
661 | attempt to decode the string but throw an exception. This setting has no
|
---|
662 | effect on C<encode> (yet).
|
---|
663 |
|
---|
664 | If no argument is given, the limit check will be deactivated (same as when
|
---|
665 | C<0> is specified).
|
---|
666 |
|
---|
667 | See SECURITY CONSIDERATIONS, below, for more info on why this is useful.
|
---|
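A small sketch of the size check, assuming an arbitrary limit of 16 bytes picked only for this example. A short text is decoded normally, while a longer one is rejected before any parsing takes place:

```perl
use strict;
use warnings;
use JSON::XS;

# refuse to decode anything longer than 16 bytes (an arbitrary example limit)
my $json = JSON::XS->new->max_size (16);

# 7 bytes, so this is accepted
my $small = $json->decode ('[1,2,3]');

# build a JSON array that is far longer than 16 bytes
my $long_text = '[' . join (',', 1 .. 100) . ']';

# decode croaks without parsing; eval traps the exception
my $big   = eval { $json->decode ($long_text) };
my $error = $@;

print "limit: ", $json->get_max_size, " bytes\n";
print "rejected ", length ($long_text), " bytes: $error" if $error;
```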
668 |
|
---|
669 | =item $json_text = $json->encode ($perl_scalar)
|
---|
670 |
|
---|
671 | Converts the given Perl data structure (a simple scalar or a reference
|
---|
672 | to a hash or array) to its JSON representation. Simple scalars will be
|
---|
673 | converted into JSON string or number sequences, while references to arrays
|
---|
674 | become JSON arrays and references to hashes become JSON objects. Undefined
|
---|
675 | Perl values (e.g. C<undef>) become JSON C<null> values. JSON C<true> and
|
---|
676 | C<false> are only generated for the special values described under MAPPING.
|
---|
677 |
|
---|
678 | =item $perl_scalar = $json->decode ($json_text)
|
---|
679 |
|
---|
680 | The opposite of C<encode>: expects a JSON text and tries to parse it,
|
---|
681 | returning the resulting simple scalar or reference. Croaks on error.
|
---|
682 |
|
---|
683 | JSON numbers and strings become simple Perl scalars. JSON arrays become
|
---|
684 | Perl arrayrefs and JSON objects become Perl hashrefs. C<true> and C<false> become
|
---|
685 | values that act like C<1> and C<0> (see MAPPING), and C<null> becomes C<undef>.
|
---|
686 |
|
---|
687 | =item ($perl_scalar, $characters) = $json->decode_prefix ($json_text)
|
---|
688 |
|
---|
689 | This works like the C<decode> method, but instead of raising an exception
|
---|
690 | when there is trailing garbage after the first JSON object, it will
|
---|
691 | silently stop parsing there and return the number of characters consumed
|
---|
692 | so far.
|
---|
693 |
|
---|
694 | This is useful if your JSON texts are not delimited by an outer protocol
|
---|
695 | (which is not the brightest thing to do in the first place) and you need
|
---|
696 | to know where the JSON text ends.
|
---|
697 |
|
---|
698 | JSON::XS->new->decode_prefix ("[1] the tail")
|
---|
699 | => ([1], 3)
|
---|
700 |
|
---|
701 | =back
|
---|
702 |
|
---|
703 |
|
---|
704 | =head1 INCREMENTAL PARSING
|
---|
705 |
|
---|
706 | In some cases, there is the need for incremental parsing of JSON
|
---|
707 | texts. While this module always has to keep both JSON text and resulting
|
---|
708 | Perl data structure in memory at one time, it does allow you to parse a
|
---|
709 | JSON stream incrementally. It does so by accumulating text until it has
|
---|
710 | a full JSON object, which it then can decode. This process is similar to
|
---|
711 | using C<decode_prefix> to see if a full JSON object is available, but
|
---|
712 | is much more efficient (and can be implemented with a minimum of method
|
---|
713 | calls).
|
---|
714 |
|
---|
715 | JSON::XS will only attempt to parse the JSON text once it is sure it
|
---|
716 | has enough text to get a decisive result, using a very simple but
|
---|
717 | truly incremental parser. This means that it sometimes won't stop as
|
---|
718 | early as the full parser, for example, it doesn't detect parenthesis
|
---|
719 | mismatches. The only thing it guarantees is that it starts decoding as
|
---|
720 | soon as a syntactically valid JSON text has been seen. This means you need
|
---|
721 | to set resource limits (e.g. C<max_size>) to ensure the parser will stop
|
---|
722 | parsing in the presence of syntax errors.
|
---|
723 |
|
---|
724 | The following methods implement this incremental parser.
|
---|
725 |
|
---|
726 | =over 4
|
---|
727 |
|
---|
728 | =item [void, scalar or list context] = $json->incr_parse ([$string])
|
---|
729 |
|
---|
730 | This is the central parsing function. It can both append new text and
|
---|
731 | extract objects from the stream accumulated so far (both of these
|
---|
732 | functions are optional).
|
---|
733 |
|
---|
734 | If C<$string> is given, then this string is appended to the already
|
---|
735 | existing JSON fragment stored in the C<$json> object.
|
---|
736 |
|
---|
737 | After that, if the function is called in void context, it will simply
|
---|
738 | return without doing anything further. This can be used to add more text
|
---|
739 | in as many chunks as you want.
|
---|
740 |
|
---|
741 | If the method is called in scalar context, then it will try to extract
|
---|
742 | exactly I<one> JSON object. If that is successful, it will return this
|
---|
743 | object, otherwise it will return C<undef>. If there is a parse error,
|
---|
744 | this method will croak just as C<decode> would do (one can then use
|
---|
745 | C<incr_skip> to skip the erroneous part). This is the most common way of
|
---|
746 | using the method.
|
---|
747 |
|
---|
748 | And finally, in list context, it will try to extract as many objects
|
---|
749 | from the stream as it can find and return them, or the empty list
|
---|
750 | otherwise. For this to work, there must be no separators between the JSON
|
---|
751 | objects or arrays, instead they must be concatenated back-to-back. If
|
---|
752 | an error occurs, an exception will be raised as in the scalar context
|
---|
753 | case. Note that in this case, any previously-parsed JSON texts will be
|
---|
754 | lost.
|
---|
755 |
|
---|
756 | =item $lvalue_string = $json->incr_text
|
---|
757 |
|
---|
758 | This method returns the currently stored JSON fragment as an lvalue, that
|
---|
759 | is, you can manipulate it. This I<only> works when a preceding call to
|
---|
760 | C<incr_parse> in I<scalar context> successfully returned an object. Under
|
---|
761 | all other circumstances you must not call this function (I mean it.
|
---|
762 | Although in simple tests it might actually work, it I<will> fail under
|
---|
763 | real world conditions). As a special exception, you can also call this
|
---|
764 | method before having parsed anything.
|
---|
765 |
|
---|
766 | This function is useful in two cases: a) finding the trailing text after a
|
---|
767 | JSON object or b) parsing multiple JSON objects separated by non-JSON text
|
---|
768 | (such as commas).
|
---|
769 |
|
---|
770 | =item $json->incr_skip
|
---|
771 |
|
---|
772 | This will reset the state of the incremental parser and will remove
|
---|
773 | the parsed text from the input buffer so far. This is useful after
|
---|
774 | C<incr_parse> died, in which case the input buffer and incremental parser
|
---|
775 | state is left unchanged, to skip the text parsed so far and to reset the
|
---|
776 | parse state.
|
---|
777 |
|
---|
778 | The difference to C<incr_reset> is that only text until the parse error
|
---|
779 | occurred is removed.
|
---|
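A minimal sketch of this recovery path, assuming a buffer whose first array is malformed (the input text is made up for this example). The failed C<incr_parse> leaves the buffer untouched, C<incr_skip> drops the text scanned so far, and parsing can then resume with whatever follows:

```perl
use strict;
use warnings;
use JSON::XS;

my $json = JSON::XS->new;

# void context: only appends to the buffer, nothing is parsed yet
$json->incr_parse ('[1,2,oops] [3]');

# scalar context: tries to extract one array and croaks on "oops"
my $first = eval { $json->incr_parse };

if ($@) {
   warn "skipping malformed text: $@";
   $json->incr_skip;   # drop the text scanned up to the error
}

# parsing resumes with the remaining buffer, which holds the second array
my $next = $json->incr_parse;
```

Whether skipping is the right reaction depends on your protocol; often it is safer to close the connection instead.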
780 |
|
---|
781 | =item $json->incr_reset
|
---|
782 |
|
---|
783 | This completely resets the incremental parser, that is, after this call,
|
---|
784 | it will be as if the parser had never parsed anything.
|
---|
785 |
|
---|
786 | This is useful if you want to repeatedly parse JSON objects and want to
|
---|
787 | ignore any trailing data, which means you have to reset the parser after
|
---|
788 | each successful decode.
|
---|
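The following sketch shows that pattern under the assumption that every chunk carries exactly one complete JSON object followed by data we do not care about. The chunk contents are invented for this example:

```perl
use strict;
use warnings;
use JSON::XS;

my $json = JSON::XS->new;
my @ids;

# each (made-up) chunk holds one JSON object followed by unwanted trailing data
for my $chunk ('{"id":1} trailing junk', '{"id":2} more junk') {
   # scalar context: extract exactly one object
   my $obj = $json->incr_parse ($chunk);
   push @ids, $obj->{id};

   # discard the leftover text so it cannot pollute the next chunk
   $json->incr_reset;
}

print "ids: @ids\n";
```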
789 |
|
---|
790 | =back
|
---|
791 |
|
---|
792 | =head2 LIMITATIONS
|
---|
793 |
|
---|
794 | All options that affect decoding are supported, except
|
---|
795 | C<allow_nonref>. The reason for this is that it cannot be made to
|
---|
796 | work sensibly: JSON objects and arrays are self-delimited, i.e. you can concatenate
|
---|
797 | them back to back and still decode them perfectly. This does not hold true
|
---|
798 | for JSON numbers, however.
|
---|
799 |
|
---|
800 | For example, is the string C<1> a single JSON number, or is it simply the
|
---|
801 | start of C<12>? Or is C<12> a single JSON number, or the concatenation
|
---|
802 | of C<1> and C<2>? In neither case can you tell, and this is why JSON::XS
|
---|
803 | takes the conservative route and disallows this case.
|
---|
804 |
|
---|
805 | =head2 EXAMPLES
|
---|
806 |
|
---|
807 | Some examples will make all this clearer. First, a simple example that
|
---|
808 | works similarly to C<decode_prefix>: We want to decode the JSON object at
|
---|
809 | the start of a string and identify the portion after the JSON object:
|
---|
810 |
|
---|
811 | my $text = "[1,2,3] hello";
|
---|
812 |
|
---|
813 | my $json = new JSON::XS;
|
---|
814 |
|
---|
815 | my $obj = $json->incr_parse ($text)
|
---|
816 | or die "expected JSON object or array at beginning of string";
|
---|
817 |
|
---|
818 | my $tail = $json->incr_text;
|
---|
819 | # $tail now contains " hello"
|
---|
820 |
|
---|
821 | Easy, isn't it?
|
---|
822 |
|
---|
823 | Now for a more complicated example: Imagine a hypothetical protocol where
|
---|
824 | you read some requests from a TCP stream, and each request is a JSON
|
---|
825 | array, without any separation between them (in fact, it is often useful to
|
---|
826 | use newlines as "separators", as these get interpreted as whitespace at
|
---|
827 | the start of the JSON text, which makes it possible to test said protocol
|
---|
828 | with C<telnet>...).
|
---|
829 |
|
---|
830 | Here is how you'd do it (it is trivial to write this in an event-based
|
---|
831 | manner):
|
---|
832 |
|
---|
833 | my $json = new JSON::XS;
|
---|
834 |
|
---|
835 | # read some data from the socket
|
---|
836 | while (sysread $socket, my $buf, 4096) {
|
---|
837 |
|
---|
838 | # split and decode as many requests as possible
|
---|
839 | for my $request ($json->incr_parse ($buf)) {
|
---|
840 | # act on the $request
|
---|
841 | }
|
---|
842 | }
|
---|
843 |
|
---|
844 | Another complicated example: Assume you have a string with JSON objects
|
---|
845 | or arrays, all separated by (optional) comma characters (e.g. C<[1],[2],
|
---|
846 | [3]>). To parse them, we have to skip the commas between the JSON texts,
|
---|
847 | and here is where the lvalue-ness of C<incr_text> comes in useful:
|
---|
848 |
|
---|
849 | my $text = "[1],[2], [3]";
|
---|
850 | my $json = new JSON::XS;
|
---|
851 |
|
---|
852 | # void context, so no parsing done
|
---|
853 | $json->incr_parse ($text);
|
---|
854 |
|
---|
855 | # now extract as many objects as possible. note the
|
---|
856 | # use of scalar context so incr_text can be called.
|
---|
857 | while (my $obj = $json->incr_parse) {
|
---|
858 | # do something with $obj
|
---|
859 |
|
---|
860 | # now skip the optional comma
|
---|
861 | $json->incr_text =~ s/^ \s* , //x;
|
---|
862 | }
|
---|
863 |
|
---|
864 | Now let's go for a very complex example: Assume that you have a gigantic
|
---|
865 | JSON array-of-objects, many gigabytes in size, and you want to parse it,
|
---|
866 | but you cannot load it into memory fully (this has actually happened in
|
---|
867 | the real world :).
|
---|
868 |
|
---|
869 | Well, you lost, you have to implement your own JSON parser. But JSON::XS
|
---|
870 | can still help you: You implement a (very simple) array parser and let
|
---|
871 | JSON decode the array elements, which are all full JSON objects on their
|
---|
872 | own (this wouldn't work if the array elements could be JSON numbers, for
|
---|
873 | example):
|
---|
874 |
|
---|
875 | my $json = new JSON::XS;
|
---|
876 |
|
---|
877 | # open the monster
|
---|
878 | open my $fh, "<", "bigfile.json"
|
---|
879 | or die "bigfile: $!";
|
---|
880 |
|
---|
881 | # first parse the initial "["
|
---|
882 | for (;;) {
|
---|
883 | sysread $fh, my $buf, 65536
|
---|
884 | or die "read error: $!";
|
---|
885 | $json->incr_parse ($buf); # void context, so no parsing
|
---|
886 |
|
---|
887 | # Exit the loop once we found and removed(!) the initial "[".
|
---|
888 | # In essence, we are (ab-)using the $json object as a simple scalar
|
---|
889 | # we append data to.
|
---|
890 | last if $json->incr_text =~ s/^ \s* \[ //x;
|
---|
891 | }
|
---|
892 |
|
---|
893 | # now we have skipped the initial "[", so continue
|
---|
894 | # parsing all the elements.
|
---|
895 | for (;;) {
|
---|
896 | # in this loop we read data until we got a single JSON object
|
---|
897 | for (;;) {
|
---|
898 | if (my $obj = $json->incr_parse) {
|
---|
899 | # do something with $obj
|
---|
900 | last;
|
---|
901 | }
|
---|
902 |
|
---|
903 | # add more data
|
---|
904 | sysread $fh, my $buf, 65536
|
---|
905 | or die "read error: $!";
|
---|
906 | $json->incr_parse ($buf); # void context, so no parsing
|
---|
907 | }
|
---|
908 |
|
---|
909 | # in this loop we read data until we either found and parsed the
|
---|
910 | # separating "," between elements, or the final "]"
|
---|
911 | for (;;) {
|
---|
912 | # first skip whitespace
|
---|
913 | $json->incr_text =~ s/^\s*//;
|
---|
914 |
|
---|
915 | # if we find "]", we are done
|
---|
916 | if ($json->incr_text =~ s/^\]//) {
|
---|
917 | print "finished.\n";
|
---|
918 | exit;
|
---|
919 | }
|
---|
920 |
|
---|
921 | # if we find ",", we can continue with the next element
|
---|
922 | if ($json->incr_text =~ s/^,//) {
|
---|
923 | last;
|
---|
924 | }
|
---|
925 |
|
---|
926 | # if we find anything else, we have a parse error!
|
---|
927 | if (length $json->incr_text) {
|
---|
928 | die "parse error near ", $json->incr_text;
|
---|
929 | }
|
---|
930 |
|
---|
931 | # else add more data
|
---|
932 | sysread $fh, my $buf, 65536
|
---|
933 | or die "read error: $!";
|
---|
934 | $json->incr_parse ($buf); # void context, so no parsing
|
---|
935 | }
|
---|
| } # end of the outer for (;;) loop over all elements
|
---|
936 |
|
---|
937 | This is a complex example, but most of the complexity comes from the fact
|
---|
938 | that we are trying to be correct (bear with me if I am wrong, I never ran
|
---|
939 | the above example :).
|
---|
940 |
|
---|
941 |
|
---|
942 |
|
---|
943 | =head1 MAPPING
|
---|
944 |
|
---|
945 | This section describes how JSON::XS maps Perl values to JSON values and
|
---|
946 | vice versa. These mappings are designed to "do the right thing" in most
|
---|
947 | circumstances automatically, preserving round-tripping characteristics
|
---|
948 | (what you put in comes out as something equivalent).
|
---|
949 |
|
---|
950 | For the more enlightened: note that in the following descriptions,
|
---|
951 | lowercase I<perl> refers to the Perl interpreter, while uppercase I<Perl>
|
---|
952 | refers to the abstract Perl language itself.
|
---|
953 |
|
---|
954 |
|
---|
955 | =head2 JSON -> PERL
|
---|
956 |
|
---|
957 | =over 4
|
---|
958 |
|
---|
959 | =item object
|
---|
960 |
|
---|
961 | A JSON object becomes a reference to a hash in Perl. No ordering of object
|
---|
962 | keys is preserved (JSON does not preserve object key ordering itself).
|
---|
963 |
|
---|
964 | =item array
|
---|
965 |
|
---|
966 | A JSON array becomes a reference to an array in Perl.
|
---|
967 |
|
---|
968 | =item string
|
---|
969 |
|
---|
970 | A JSON string becomes a string scalar in Perl - Unicode codepoints in JSON
|
---|
971 | are represented by the same codepoints in the Perl string, so no manual
|
---|
972 | decoding is necessary.
|
---|
973 |
|
---|
974 | =item number
|
---|
975 |
|
---|
976 | A JSON number becomes either an integer, numeric (floating point) or
|
---|
977 | string scalar in perl, depending on its range and any fractional parts. On
|
---|
978 | the Perl level, there is no difference between those as Perl handles all
|
---|
979 | the conversion details, but an integer may take slightly less memory and
|
---|
980 | might represent more values exactly than floating point numbers.
|
---|
981 |
|
---|
982 | If the number consists of digits only, JSON::XS will try to represent
|
---|
983 | it as an integer value. If that fails, it will try to represent it as
|
---|
984 | a numeric (floating point) value if that is possible without loss of
|
---|
985 | precision. Otherwise it will preserve the number as a string value (in
|
---|
986 | which case you lose roundtripping ability, as the JSON number will be
|
---|
987 | re-encoded to a JSON string).
|
---|
988 |
|
---|
989 | Numbers containing a fractional or exponential part will always be
|
---|
990 | represented as numeric (floating point) values, possibly at a loss of
|
---|
991 | precision (in which case you might lose perfect roundtripping ability, but
|
---|
992 | the JSON number will still be re-encoded as a JSON number).
|
---|
993 |
|
---|
994 | =item true, false
|
---|
995 |
|
---|
996 | These JSON atoms become C<JSON::XS::true> and C<JSON::XS::false>,
|
---|
997 | respectively. They are overloaded to act almost exactly like the numbers
|
---|
998 | C<1> and C<0>. You can check whether a scalar is a JSON boolean by using
|
---|
999 | the C<JSON::XS::is_bool> function.
|
---|
1000 |
|
---|
1001 | =item null
|
---|
1002 |
|
---|
1003 | A JSON null atom becomes C<undef> in Perl.
|
---|
1004 |
|
---|
1005 | =back
|
---|
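To make the JSON to Perl mapping concrete, here is a short sketch that decodes one object containing a string, an integer, a fraction, a boolean and a null. The key names and values are invented for this example:

```perl
use strict;
use warnings;
use JSON::XS;

# single quotes keep the \u escape intact so that the JSON decoder sees it
my $data = decode_json '{"name":"caf\u00e9","count":3,"ratio":0.5,"ok":true,"missing":null}';

print "ok is a JSON boolean\n"      if JSON::XS::is_bool ($data->{ok});
print "ok acts like a true value\n" if $data->{ok};
print "missing is undef\n"          unless defined $data->{missing};

# numbers can be used in arithmetic directly
print "count + 1 = ", $data->{count} + 1, "\n";

# the \u00e9 escape became a single Perl character
print "name has ", length ($data->{name}), " characters\n";
```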
1006 |
|
---|
1007 |
|
---|
1008 | =head2 PERL -> JSON
|
---|
1009 |
|
---|
1010 | The mapping from Perl to JSON is slightly more difficult, as Perl is a
|
---|
1011 | truly typeless language, so we can only guess which JSON type is meant by
|
---|
1012 | a Perl value.
|
---|
1013 |
|
---|
1014 | =over 4
|
---|
1015 |
|
---|
1016 | =item hash references
|
---|
1017 |
|
---|
1018 | Perl hash references become JSON objects. As there is no inherent ordering
|
---|
1019 | in hash keys (or JSON objects), they will usually be encoded in a
|
---|
1020 | pseudo-random order that can change between runs of the same program but
|
---|
1021 | stays generally the same within a single run of a program. JSON::XS can
|
---|
1022 | optionally sort the hash keys (determined by the I<canonical> flag), so
|
---|
1023 | the same datastructure will serialise to the same JSON text (given same
|
---|
1024 | settings and version of JSON::XS), but this incurs a runtime overhead
|
---|
1025 | and is only rarely useful, e.g. when you want to compare some JSON text
|
---|
1026 | against another for equality.
|
---|
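As a small sketch of the difference, the following encodes the same hash with and without the C<canonical> flag. The hash contents are arbitrary sample data:

```perl
use strict;
use warnings;
use JSON::XS;

my $data = { b => 1, a => 2, c => 3 };

# without canonical, the key order may differ from run to run
my $unsorted = JSON::XS->new->encode ($data);

# with canonical, keys are sorted, so equal data yields equal text
my $sorted = JSON::XS->new->canonical->encode ($data);

print "$sorted\n";   # {"a":2,"b":1,"c":3}
```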
1027 |
|
---|
1028 | =item array references
|
---|
1029 |
|
---|
1030 | Perl array references become JSON arrays.
|
---|
1031 |
|
---|
1032 | =item other references
|
---|
1033 |
|
---|
1034 | Other unblessed references are generally not allowed and will cause an
|
---|
1035 | exception to be thrown, except for references to the integers C<0> and
|
---|
1036 | C<1>, which get turned into C<false> and C<true> atoms in JSON. You can
|
---|
1037 | also use C<JSON::XS::false> and C<JSON::XS::true> to improve readability.
|
---|
1038 |
|
---|
1039 | encode_json [\0, JSON::XS::true] # yields [false,true]
|
---|
1040 |
|
---|
1041 | =item JSON::XS::true, JSON::XS::false
|
---|
1042 |
|
---|
1043 | These special values become JSON true and JSON false values,
|
---|
1044 | respectively. You can also use C<\1> and C<\0> directly if you want.
|
---|
1045 |
|
---|
1046 | =item blessed objects
|
---|
1047 |
|
---|
1048 | Blessed objects are not directly representable in JSON. See the
|
---|
1049 | C<allow_blessed> and C<convert_blessed> methods for various options on
|
---|
1050 | how to deal with this: basically, you can choose between throwing an
|
---|
1051 | exception, encoding the reference as if it weren't blessed, or providing
|
---|
1052 | your own serialiser method.
|
---|
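A minimal sketch of the "own serialiser method" route, assuming a hypothetical C<Point> class that exists only for this example. With C<convert_blessed> enabled, C<encode> calls the object's C<TO_JSON> method and encodes whatever that returns. With default settings the same object makes C<encode> croak:

```perl
use strict;
use warnings;
use JSON::XS;

# Point is a hypothetical class used only for this example
{
   package Point;

   sub new { my ($class, %args) = @_; return bless { %args }, $class }

   # called by encode when convert_blessed is enabled
   sub TO_JSON { my ($self) = @_; return { x => $self->{x}, y => $self->{y} } }
}

my $point = Point->new (x => 1, y => 2);

# default settings: blessed objects make encode croak
my $rejected = !defined eval { JSON::XS->new->encode ([$point]) };

# convert_blessed: the return value of TO_JSON is encoded instead
my $text = JSON::XS->new->canonical->convert_blessed->encode ([$point]);

print "$text\n";   # [{"x":1,"y":2}]
```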
1053 |
|
---|
1054 | =item simple scalars
|
---|
1055 |
|
---|
1056 | Simple Perl scalars (any scalar that is not a reference) are the most
|
---|
1057 | difficult objects to encode: JSON::XS will encode undefined scalars as
|
---|
1058 | JSON C<null> values, scalars that have last been used in a string context
|
---|
1059 | before encoding as JSON strings, and anything else as a number value:
|
---|
1060 |
|
---|
1061 | # dump as number
|
---|
1062 | encode_json [2] # yields [2]
|
---|
1063 | encode_json [-3.0e17] # yields [-3e+17]
|
---|
1064 | my $value = 5; encode_json [$value] # yields [5]
|
---|
1065 |
|
---|
1066 | # used as string, so dump as string
|
---|
1067 | print $value;
|
---|
1068 | encode_json [$value] # yields ["5"]
|
---|
1069 |
|
---|
1070 | # undef becomes null
|
---|
1071 | encode_json [undef] # yields [null]
|
---|
1072 |
|
---|
1073 | You can force the type to be a JSON string by stringifying it:
|
---|
1074 |
|
---|
1075 | my $x = 3.1; # some variable containing a number
|
---|
1076 | "$x"; # stringified
|
---|
1077 | $x .= ""; # another, more awkward way to stringify
|
---|
1078 | print $x; # perl does it for you, too, quite often
|
---|
1079 |
|
---|
1080 | You can force the type to be a JSON number by numifying it:
|
---|
1081 |
|
---|
1082 | my $x = "3"; # some variable containing a string
|
---|
1083 | $x += 0; # numify it, ensuring it will be dumped as a number
|
---|
1084 | $x *= 1; # same thing, the choice is yours.
|
---|
1085 |
|
---|
1086 | You can not currently force the type in other, less obscure, ways. Tell me
|
---|
1087 | if you need this capability (but don't forget to explain why it's needed
|
---|
1088 | :).
|
---|
1089 |
|
---|
1090 | =back
|
---|
1091 |
|
---|
1092 |
|
---|
1093 | =head1 ENCODING/CODESET FLAG NOTES
|
---|
1094 |
|
---|
1095 | The interested reader might have seen a number of flags that signify
|
---|
1096 | encodings or codesets - C<utf8>, C<latin1> and C<ascii>. There seems to be
|
---|
1097 | some confusion on what these do, so here is a short comparison:
|
---|
1098 |
|
---|
1099 | C<utf8> controls whether the JSON text created by C<encode> (and expected
|
---|
1100 | by C<decode>) is UTF-8 encoded or not, while C<latin1> and C<ascii> only
|
---|
1101 | control whether C<encode> escapes character values outside their respective
|
---|
1102 | codeset range. None of these flags conflicts with the others, although
|
---|
1103 | some combinations make less sense than others.
|
---|
1104 |
|
---|
1105 | Care has been taken to make all flags symmetrical with respect to
|
---|
1106 | C<encode> and C<decode>, that is, texts encoded with any combination of
|
---|
1107 | these flag values will be correctly decoded when the same flags are used
|
---|
1108 | - in general, if you use different flag settings while encoding vs. when
|
---|
1109 | decoding you likely have a bug somewhere.
|
---|
1110 |
|
---|
1111 | Below comes a verbose discussion of these flags. Note that a "codeset" is
|
---|
1112 | simply an abstract set of character-codepoint pairs, while an encoding
|
---|
1113 | takes those codepoint numbers and I<encodes> them, in our case into
|
---|
1114 | octets. Unicode is (among other things) a codeset, UTF-8 is an encoding,
|
---|
1115 | and ISO-8859-1 (= latin 1) and ASCII are both codesets I<and> encodings at
|
---|
1116 | the same time, which can be confusing.
|
---|
1117 |
|
---|
1118 | =over 4
|
---|
1119 |
|
---|
1120 | =item C<utf8> flag disabled
|
---|
1121 |
|
---|
1122 | When C<utf8> is disabled (the default), then C<encode>/C<decode> generate
|
---|
1123 | and expect Unicode strings, that is, characters with high ordinal Unicode
|
---|
1124 | values (> 255) will be encoded as such characters, and likewise such
|
---|
1125 | characters are decoded as-is, no changes to them will be done, except
|
---|
1126 | "(re-)interpreting" them as Unicode codepoints or Unicode characters,
|
---|
1127 | respectively (to Perl, these are the same thing in strings unless you do
|
---|
1128 | funny/weird/dumb stuff).
|
---|
1129 |
|
---|
1130 | This is useful when you want to do the encoding yourself (e.g. when you
|
---|
1131 | want to have UTF-16 encoded JSON texts) or when some other layer does
|
---|
1132 | the encoding for you (for example, when printing to a terminal using a
|
---|
1133 | filehandle that transparently encodes to UTF-8 you certainly do NOT want
|
---|
1134 | to UTF-8 encode your data first and have Perl encode it another time).
|
---|
1135 |
|
---|
1136 | =item C<utf8> flag enabled
|
---|
1137 |
|
---|
1138 | If the C<utf8>-flag is enabled, C<encode>/C<decode> will encode all
|
---|
1139 | characters using the corresponding UTF-8 multi-byte sequence, and will
|
---|
1140 | expect your input strings to be encoded as UTF-8, that is, no "character"
|
---|
1141 | of the input string must have any value > 255, as UTF-8 does not allow
|
---|
1142 | that.
|
---|
1143 |
|
---|
1144 | The C<utf8> flag therefore switches between two modes: disabled means you
|
---|
1145 | will get a Unicode string in Perl, enabled means you get a UTF-8 encoded
|
---|
1146 | octet/binary string in Perl.
|
---|
1147 |
|
---|
1148 | =item C<latin1> or C<ascii> flags enabled
|
---|
1149 |
|
---|
1150 | With C<latin1> (or C<ascii>) enabled, C<encode> will escape characters
|
---|
1151 | with ordinal values > 255 (> 127 with C<ascii>) and encode the remaining
|
---|
1152 | characters as specified by the C<utf8> flag.
|
---|
1153 |
|
---|
1154 | If C<utf8> is disabled, then the result is also correctly encoded in those
|
---|
1155 | character sets (as both are proper subsets of Unicode, meaning that a
|
---|
1156 | Unicode string with all character values < 256 is the same thing as an
|
---|
1157 | ISO-8859-1 string, and a Unicode string with all character values < 128 is
|
---|
1158 | the same thing as an ASCII string in Perl).
|
---|
1159 |
|
---|
1160 | If C<utf8> is enabled, you still get a correct UTF-8-encoded string,
|
---|
1161 | regardless of these flags, just some more characters will be escaped using
|
---|
1162 | C<\uXXXX> than before.
|
---|
1163 |
|
---|
1164 | Note that ISO-8859-1-I<encoded> strings are not compatible with UTF-8
|
---|
1165 | encoding, while ASCII-encoded strings are. That is because the ISO-8859-1
|
---|
1166 | encoding is NOT a subset of UTF-8 (despite the ISO-8859-1 I<codeset> being
|
---|
1167 | a subset of Unicode), while ASCII is.
|
---|
1168 |
|
---|
1169 | Surprisingly, C<decode> will ignore these flags and so treat all input
|
---|
1170 | values as governed by the C<utf8> flag. If it is disabled, this allows you
|
---|
1171 | to decode ISO-8859-1- and ASCII-encoded strings, as both are strict subsets of
|
---|
1172 | Unicode. If it is enabled, you can correctly decode UTF-8 encoded strings.
|
---|
1173 |
|
---|
1174 | So neither C<latin1> nor C<ascii> is incompatible with the C<utf8> flag -
|
---|
1175 | they only govern when the JSON output engine escapes a character or not.
|
---|
1176 |
|
---|
1177 | The main use for C<latin1> is to relatively efficiently store binary data
|
---|
1178 | as JSON, at the expense of breaking compatibility with most JSON decoders.
|
---|
1179 |
|
---|
1180 | The main use for C<ascii> is to force the output to not contain characters
|
---|
1181 | with values > 127, which means you can interpret the resulting string
|
---|
1182 | as UTF-8, ISO-8859-1, ASCII, KOI8-R or just about any character set and
|
---|
1183 | 8-bit-encoding, and still get the same data structure back. This is useful
|
---|
1184 | when your channel for JSON transfer is not 8-bit clean or the encoding
|
---|
1185 | might be mangled in between (e.g. in mail), and works because ASCII is a
|
---|
1186 | proper subset of most 8-bit and multibyte encodings in use in the world.
|
---|
1187 |
|
---|
1188 | =back
|
---|
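The flag combinations discussed above can be compared side by side. This sketch encodes one string that contains a character below 256 (U+00E9) and one above (U+20AC) with each flag and prints the length of the result. The sample string is an arbitrary choice for this example:

```perl
use strict;
use warnings;
use JSON::XS;

# one character below 256 (U+00E9) and one above (U+20AC)
my $data = ["\x{e9}\x{20ac}"];

my $chars  = JSON::XS->new->encode ($data);           # Unicode string
my $octets = JSON::XS->new->utf8->encode ($data);     # UTF-8 encoded octets
my $ascii  = JSON::XS->new->ascii->encode ($data);    # everything > 127 escaped
my $latin1 = JSON::XS->new->latin1->encode ($data);   # only > 255 escaped

for ([plain => $chars], [utf8 => $octets], [ascii => $ascii], [latin1 => $latin1]) {
   my ($flag, $text) = @$_;
   printf "%-6s %2d characters\n", $flag, length $text;
}

print "$ascii\n";   # ["\u00e9\u20ac"]

# decoding with the same flag that was used for encoding restores the string
my $same = JSON::XS->new->utf8->decode ($octets)->[0] eq $data->[0];
print "round trip ok\n" if $same;
```

The plain result is 6 characters long, the UTF-8 result 9 octets, the C<latin1> result 11 characters and the C<ascii> result 16 characters, which shows how each flag trades size for a narrower character range.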
1189 |
|
---|
1190 |
|
---|
1191 | =head2 JSON and ECMAscript
|
---|
1192 |
|
---|
1193 | JSON syntax is based on how literals are represented in javascript (the
|
---|
1194 | not-standardised predecessor of ECMAscript) which is presumably why it is
|
---|
1195 | called "JavaScript Object Notation".
|
---|
1196 |
|
---|
1197 | However, JSON is not a subset (and also not a superset of course) of
|
---|
1198 | ECMAscript (the standard) or javascript (whatever browsers actually
|
---|
1199 | implement).
|
---|
1200 |
|
---|
1201 | If you want to use javascript's C<eval> function to "parse" JSON, you
|
---|
1202 | might run into parse errors for valid JSON texts, or the resulting data
|
---|
1203 | structure might not be queryable:
|
---|
1204 |
|
---|
1205 | One of the problems is that U+2028 and U+2029 are valid characters inside
|
---|
1206 | JSON strings, but are not allowed in ECMAscript string literals, so the
|
---|
1207 | following Perl fragment will not output something that can be guaranteed
|
---|
1208 | to be parsable by javascript's C<eval>:
|
---|
1209 |
|
---|
1210 | use JSON::XS;
|
---|
1211 |
|
---|
1212 | print encode_json [chr 0x2028];
|
---|
1213 |
|
---|
1214 | The right fix for this is to use a proper JSON parser in your javascript
|
---|
1215 | programs, and not rely on C<eval> (see for example Douglas Crockford's
|
---|
1216 | F<json2.js> parser).
|
---|
1217 |
|
---|
1218 | If this is not an option, you can, as a stop-gap measure, simply encode to
|
---|
1219 | ASCII-only JSON:
|
---|
1220 |
|
---|
1221 | use JSON::XS;
|
---|
1222 |
|
---|
1223 | print JSON::XS->new->ascii->encode ([chr 0x2028]);
|
---|
1224 |
|
---|
1225 | Note that this will enlarge the resulting JSON text quite a bit if you
|
---|
1226 | have many non-ASCII characters. You might be tempted to run some regexes
|
---|
1227 | to only escape U+2028 and U+2029, e.g.:
|
---|
1228 |
|
---|
1229 | # DO NOT USE THIS!
|
---|
1230 | my $json = JSON::XS->new->utf8->encode ([chr 0x2028]);
|
---|
1231 | $json =~ s/\xe2\x80\xa8/\\u2028/g; # escape U+2028
|
---|
1232 | $json =~ s/\xe2\x80\xa9/\\u2029/g; # escape U+2029
|
---|
1233 | print $json;
|
---|
1234 |
|
---|
1235 | Note that I<this is a bad idea>: the above only works for U+2028 and
|
---|
1236 | U+2029 and thus only for fully ECMAscript-compliant parsers. Many existing
|
---|
1237 | javascript implementations, however, have issues with other characters as
|
---|
1238 | well - using C<eval> naively simply I<will> cause problems.
|
---|
1239 |
|
---|
1240 | Another problem is that some javascript implementations reserve
|
---|
1241 | some property names for their own purposes (which probably makes
|
---|
1242 | them non-ECMAscript-compliant). For example, Iceweasel reserves the
|
---|
1243 | C<__proto__> property name for its own purposes.
|
---|
1244 |
|
---|
1245 | If that is a problem, you could try to filter the resulting JSON
|
---|
1246 | output for these property strings, e.g.:
|
---|
1247 |
|
---|
1248 | $json =~ s/"__proto__"\s*:/"__proto__renamed":/g;
|
---|
1249 |
|
---|
1250 | This works because C<__proto__> is not valid outside of strings, so every
|
---|
1251 | occurrence of C<"__proto__"\s*:> must be a string used as property name.
|
---|
1252 |
|
---|
1253 | If you know of other incompatibilities, please let me know.
|
---|
1254 |
|
---|
1255 |
|
---|
1256 | =head2 JSON and YAML
|
---|
1257 |
|
---|
1258 | You often hear that JSON is a subset of YAML. This is, however, a mass
|
---|
1259 | hysteria(*) and very far from the truth (as of the time of this writing),
|
---|
1260 | so let me state it clearly: I<in general, there is no way to configure
|
---|
1261 | JSON::XS to output a data structure as valid YAML> that works in all
|
---|
1262 | cases.
|
---|
1263 |
|
---|
1264 | If you really must use JSON::XS to generate YAML, you should use this
|
---|
1265 | algorithm (subject to change in future versions):
|
---|
1266 |
|
---|
1267 | my $to_yaml = JSON::XS->new->utf8->space_after (1);
|
---|
1268 | my $yaml = $to_yaml->encode ($ref) . "\n";
|
---|
1269 |
|
---|
1270 | This will I<usually> generate JSON texts that also parse as valid
|
---|
1271 | YAML. Please note that YAML has hardcoded limits on (simple) object key
|
---|
1272 | lengths that JSON doesn't have and also has different and incompatible
|
---|
1273 | unicode handling, so you should make sure that your hash keys are
|
---|
1274 | noticeably shorter than the 1024 "stream characters" YAML allows and that
|
---|
1275 | you do not have characters with codepoint values outside the Unicode BMP
|
---|
1276 | (Basic Multilingual Plane). YAML also does not allow C<\/> sequences in
|
---|
1277 | strings (which JSON::XS does not I<currently> generate, but other JSON
|
---|
1278 | generators might).
|
---|
1279 |
|
---|
There might be other incompatibilities that I am not aware of (or the YAML
specification has been changed yet again - that happens quite often). In
general you should not try to generate YAML with a JSON generator or vice
versa, or try to parse JSON with a YAML parser or vice versa: chances are
high that you will run into severe interoperability problems when you
least expect it.

=over 4

=item (*)

I have been pressured multiple times by Brian Ingerson (one of the
authors of the YAML specification) to remove this paragraph, despite him
acknowledging that the actual incompatibilities exist. As I was personally
bitten by this "JSON is YAML" lie, I refused and said I will continue to
educate people about these issues, so others do not run into the same
problem again and again. After this, Brian called me a (quote)I<complete
and worthless idiot>(unquote).

In my opinion, instead of pressuring and insulting people who actually
clarify issues with YAML and the wrong statements of some of its
proponents, I would kindly suggest reading the JSON spec (which is
neither difficult nor long), finally making YAML compatible with it, and
educating users about the changes, instead of spreading lies about the
real compatibility for many I<years> and trying to silence people who
point out that it isn't true.

=back


=head2 SPEED

It seems that JSON::XS is surprisingly fast, as shown in the following
tables. They have been generated with the help of the C<eg/bench> program
in the JSON::XS distribution, to make it easy to compare on your own
system.

First comes a comparison between various modules using
a very short single-line JSON string (also available at
L<http://dist.schmorp.de/misc/json/short.json>).

   {"method": "handleMessage", "params": ["user1",
   "we were just talking"], "id": null, "array":[1,11,234,-5,1e5,1e7,
   true, false]}

It shows the number of encodes/decodes per second (JSON::XS uses
the functional interface, while JSON::XS/2 uses the OO interface
with pretty-printing and hashkey sorting enabled, JSON::XS/3 enables
shrink). Higher is better:

   module     |     encode |     decode |
   -----------|------------|------------|
   JSON 1.x   |   4990.842 |   4088.813 |
   JSON::DWIW |  51653.990 |  71575.154 |
   JSON::PC   |  65948.176 |  74631.744 |
   JSON::PP   |   8931.652 |   3817.168 |
   JSON::Syck |  24877.248 |  27776.848 |
   JSON::XS   | 388361.481 | 227951.304 |
   JSON::XS/2 | 227951.304 | 218453.333 |
   JSON::XS/3 | 338250.323 | 218453.333 |
   Storable   |  16500.016 | 135300.129 |
   -----------+------------+------------+

That is, JSON::XS is about five times faster than JSON::DWIW on encoding,
about three times faster on decoding, and over forty times faster
than JSON, even with pretty-printing and key sorting. It also compares
favourably to Storable for small amounts of data.

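For a quick comparison of the three JSON::XS configurations on your own
machine, something along these lines should work. This is only a rough
sketch built on the core C<Benchmark> module, not the C<eg/bench> program
itself, and the exact coder settings (for instance enabling C<utf8> on the
OO coders) are assumptions.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use JSON::XS;

# the short test string from above
my $text = '{"method": "handleMessage", "params": ["user1", '
         . '"we were just talking"], "id": null, "array":[1,11,234,-5,1e5,1e7, '
         . 'true, false]}';
my $data = decode_json $text;

# JSON::XS/2: OO interface with pretty-printing and hash key sorting
my $xs2 = JSON::XS->new->utf8->pretty->canonical;
# JSON::XS/3: OO interface with shrink enabled
my $xs3 = JSON::XS->new->utf8->shrink;

# run each variant for about one CPU second and print a comparison chart
my $rows = cmpthese (-1, {
   "JSON::XS"   => sub { encode_json $data },
   "JSON::XS/2" => sub { $xs2->encode ($data) },
   "JSON::XS/3" => sub { $xs3->encode ($data) },
});
```

The absolute rates depend heavily on the machine and the perl build, so
only the ratios between the rows are worth comparing.
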
Using a longer test string (roughly 18KB, generated from the Yahoo! Local
search API, available at L<http://dist.schmorp.de/misc/json/long.json>):

   module     |     encode |     decode |
   -----------|------------|------------|
   JSON 1.x   |     55.260 |     34.971 |
   JSON::DWIW |    825.228 |   1082.513 |
   JSON::PC   |   3571.444 |   2394.829 |
   JSON::PP   |    210.987 |     32.574 |
   JSON::Syck |    552.551 |    787.544 |
   JSON::XS   |   5780.463 |   4854.519 |
   JSON::XS/2 |   3869.998 |   4798.975 |
   JSON::XS/3 |   5862.880 |   4798.975 |
   Storable   |   4445.002 |   5235.027 |
   -----------+------------+------------+

Again, JSON::XS leads by far (except for Storable, which unsurprisingly
decodes faster).

On large strings containing lots of high Unicode characters, some modules
(such as JSON::PC) seem to decode faster than JSON::XS, but the result
will be broken due to missing (or wrong) Unicode handling. Others refuse
to decode or encode properly, so it was impossible to prepare a fair
comparison table for that case.


=head1 SECURITY CONSIDERATIONS

When you are using JSON in a protocol, talking to untrusted, potentially
hostile creatures requires relatively few measures.

First of all, your JSON decoder should be secure, that is, should not have
any buffer overflows. Obviously, this module should ensure that and I am
trying hard to make that true, but you never know.

Second, you need to avoid resource-starving attacks. That means you should
limit the size of JSON texts you accept, or make sure that when your
resources run out, that's just fine (e.g. by using a separate process that
can crash safely). The size of a JSON text in octets or characters is
usually a good indication of the size of the resources required to decode
it into a Perl structure. While JSON::XS can check the size of the JSON
text, it might be too late when you already have it in memory, so you
might want to check the size before you accept the string.

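One way to apply both checks is to bound the read itself and additionally
set C<max_size> on the coder as a second line of defence. The sketch below
is only an illustration: C<read_json_bounded> and the 64KB limit are
made-up names and values, and a real network reader would need to loop
until the peer is done instead of issuing a single C<read>.

```perl
use strict;
use warnings;
use JSON::XS;

# Assumption: an arbitrary limit, pick one that suits your protocol.
my $MAX_OCTETS = 64 * 1024;

# max_size makes decode croak on longer texts without trying to parse them
my $coder = JSON::XS->new->utf8->max_size ($MAX_OCTETS);

# Hypothetical helper: never pulls more than $MAX_OCTETS + 1 octets
# into memory, so oversized input is rejected before it is fully read.
sub read_json_bounded {
   my ($fh) = @_;

   # ask for one octet more than allowed to detect oversized input
   my $got = read $fh, my $buf, $MAX_OCTETS + 1;
   die "read failed: $!\n" unless defined $got;
   die "JSON text exceeds $MAX_OCTETS octets\n" if $got > $MAX_OCTETS;

   $coder->decode ($buf)
}

# Usage, with an in-memory filehandle standing in for a socket or file.
my $request = '{"id":1,"params":[1,2,3]}';
open my $in, "<", \$request or die "open: $!";
my $msg = read_json_bounded ($in);
print "id is $msg->{id}\n";
```
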
Third, JSON::XS recurses using the C stack when decoding objects and
arrays. The C stack is a limited resource: for instance, on my amd64
machine with 8MB of stack size I can decode around 180k nested arrays but
only 14k nested JSON objects (due to perl itself recursing deeply on croak
to free the temporary). If that is exceeded, the program crashes. To be
conservative, the default nesting limit is set to 512. If your process
has a smaller stack, you should adjust this setting accordingly with the
C<max_depth> method.

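As a small illustration of the C<max_depth> setting, the following sketch
lowers the limit and shows that a deeply nested text is rejected with an
exception instead of being decoded. The limit of 16 is an arbitrary value
chosen for the example.

```perl
use strict;
use warnings;
use JSON::XS;

# Assumption: 16 levels are plenty for this (hypothetical) protocol.
my $coder = JSON::XS->new->utf8->max_depth (16);

my $shallow = ("[" x 4)    . ("]" x 4);      # 4 nested arrays
my $deep    = ("[" x 1000) . ("]" x 1000);   # 1000 nested arrays

# within the limit: decodes normally
my $ok = eval { $coder->decode ($shallow) };

# beyond the limit: decode croaks, so eval returns undef and sets $@
my $bad = eval { $coder->decode ($deep) };
print "deep text rejected: $@" unless defined $bad;
```
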
Something else could bomb you, too, that I forgot to think of. In that
case, you get to keep the pieces. I am always open to hints, though...

Also keep in mind that JSON::XS might leak contents of your Perl data
structures in its error messages, so when you serialise sensitive
information you might want to make sure that exceptions thrown by JSON::XS
will not end up in front of untrusted eyes.

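A common way to keep such messages private is to catch the exception, log
the details where only you can see them, and hand a generic error to the
peer. The following is a minimal sketch under that assumption:
C<safe_encode> is a made-up name, and C<warn> merely stands in for whatever
private logging facility you use.

```perl
use strict;
use warnings;
use JSON::XS;

my $coder = JSON::XS->new->utf8;

# Hypothetical wrapper: never lets a JSON::XS error message escape.
sub safe_encode {
   my ($data) = @_;

   my $json = eval { $coder->encode ($data) };
   unless (defined $json) {
      # the detailed message may mention parts of $data, keep it internal
      warn "JSON encoding failed: $@";
      die "internal error\n";
   }

   $json
}

# Usage: a blessed object is not serialisable with this coder, so
# encode croaks, but the caller only ever sees the generic message.
my $reply = eval { safe_encode ({ session => bless ({}, "Secret") }) };
print "sent to peer: $@" unless defined $reply;
```
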
If you are using JSON::XS to return packets for consumption
by JavaScript scripts in a browser you should have a look at
L<http://jpsykes.com/47/practical-csrf-and-json-security> to see whether
you are vulnerable to some common attack vectors (which really are browser
design bugs, but it is still you who will have to deal with it, as major
browser developers care only for features, not about getting security
right).


=head1 THREADS

This module is I<not> guaranteed to be thread safe and there are no
plans to change this until Perl gets thread support (as opposed to the
so-called "threads", which are simply slow and bloated process
simulations - use fork, it's I<much> faster, cheaper, better).

(It might actually work, but you have been warned).


=head1 BUGS

While the goal of this module is to be correct, that unfortunately does
not mean it's bug-free, only that I think its design is bug-free. If you
keep reporting bugs they will be fixed swiftly, though.

Please refrain from using rt.cpan.org or any other bug reporting
service. I put the contact address into my modules for a reason.

=cut

# the two boolean singletons: blessed references to the scalars 1 and 0
our $true  = do { bless \(my $dummy = 1), "JSON::XS::Boolean" };
our $false = do { bless \(my $dummy = 0), "JSON::XS::Boolean" };

sub true()  { $true  }
sub false() { $false }

# true if the argument is a JSON::XS::Boolean object
sub is_bool($) {
   UNIVERSAL::isa $_[0], "JSON::XS::Boolean"
#      or UNIVERSAL::isa $_[0], "JSON::Literal"
}

XSLoader::load "JSON::XS", $VERSION;

package JSON::XS::Boolean;

# booleans act like their numeric value; ++ and -- replace the variable
# with a plain number instead of modifying the shared object
use overload
   "0+"     => sub { ${$_[0]} },
   "++"     => sub { $_[0] = ${$_[0]} + 1 },
   "--"     => sub { $_[0] = ${$_[0]} - 1 },
   fallback => 1;

1;

=head1 SEE ALSO

The F<json_xs> command line utility for quick experiments.

=head1 AUTHOR

 Marc Lehmann <[email protected]>
 http://home.schmorp.de/

=cut
