=head1 NAME

Locale::Script - ISO codes for script identification (ISO 15924)

=head1 SYNOPSIS

    use Locale::Script;
    use Locale::Constants;

    $script  = code2script('ph');                    # 'Phoenician'
    $code    = script2code('Tibetan');               # 'bo'
    $code3   = script2code('Tibetan',
                           LOCALE_CODE_ALPHA_3);     # 'bod'
    $codeN   = script2code('Tibetan',
                           LOCALE_CODE_NUMERIC);     # 330

    @codes   = all_script_codes();
    @scripts = all_script_names();

=head1 DESCRIPTION

The C<Locale::Script> module provides access to the ISO
codes for identifying scripts, as defined in ISO 15924.
For example, Egyptian hieroglyphs are denoted by the two-letter
code 'eg', the three-letter code 'egy', and the numeric code 050.

You can either access the codes via the conversion routines
(described below), or with the two functions which return lists
of all script codes or all script names.

There are three different code sets you can use for identifying
scripts:

=over 4

=item B<alpha-2>

Two-letter codes, such as 'bo' for Tibetan.
This code set is identified with the symbol C<LOCALE_CODE_ALPHA_2>.

=item B<alpha-3>

Three-letter codes, such as 'ell' for Greek.
This code set is identified with the symbol C<LOCALE_CODE_ALPHA_3>.

=item B<numeric>

Numeric codes, such as 410 for Hiragana.
This code set is identified with the symbol C<LOCALE_CODE_NUMERIC>.

=back

All of the routines take an optional additional argument
which specifies the code set to use.
If not specified, it defaults to the two-letter codes.
This is partly for backwards compatibility (previous versions
of Locale modules only supported the alpha-2 codes), and
partly because they are the most widely used codes.

The alpha-2 and alpha-3 codes are not case-dependent,
so you can use 'BO', 'Bo', 'bO' or 'bo' for Tibetan.
When a code is returned by one of the functions in
this module, it will always be lower-case.

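The effect of the optional code set argument, and of case-insensitive
matching, can be sketched as follows. This is a minimal illustration
which assumes the version of C<Locale::Script> described here is
installed; the variable names are chosen for the example, and the values
printed depend on the module's data.

```perl

    use strict;
    use warnings;
    use Locale::Script;
    use Locale::Constants;

    # With no CODESET argument the default (alpha-2) code set is used.
    my $alpha2  = script2code('Tibetan');
    my $alpha3  = script2code('Tibetan', LOCALE_CODE_ALPHA_3);
    my $numeric = script2code('Tibetan', LOCALE_CODE_NUMERIC);

    for my $pair ([ 'alpha-2', $alpha2 ], [ 'alpha-3', $alpha3 ],
                  [ 'numeric', $numeric ]) {
        my ($set, $code) = @$pair;
        printf "%-8s %s\n", $set, defined $code ? $code : '(no code)';
    }

    # Codes are matched without regard to case, so these two lookups
    # should name the same script.
    my $lower = code2script('bo');
    my $upper = code2script('BO');
    print "Both spellings name: $lower\n"
        if defined $lower && defined $upper && $lower eq $upper;

```

Testing each result with C<defined> keeps the sketch safe if a script
has no code in one of the code sets.
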
=head2 SPECIAL CODES

The standard defines various special codes.

=over 4

=item *

The standard reserves codes in the ranges B<qa> - B<qt>,
B<qaa> - B<qat>, and B<900> - B<919>, for private use.

=item *

B<zx>, B<zxx>, and B<997>, are the codes for unwritten languages.

=item *

B<zy>, B<zyy>, and B<998>, are the codes for an undetermined script.

=item *

B<zz>, B<zzz>, and B<999>, are the codes for an uncoded script.

=back

The private codes are not recognised by Locale::Script,
but the others are.

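One way to see how the special codes behave is to look each one up and
test the result for C<undef>. The following is only a sketch, assuming
the module version described here; it checks the three special alpha-2
codes listed above and one code from the private-use range.

```perl

    use strict;
    use warnings;
    use Locale::Script;

    # 'zx', 'zy' and 'zz' are the special alpha-2 codes listed above;
    # 'qa' is taken from the private-use range.
    for my $code (qw(zx zy zz qa)) {
        my $name = code2script($code);
        if (defined $name) {
            print "$code is recognised as '$name'\n";
        }
        else {
            print "$code is not recognised\n";
        }
    }

```
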
=head1 CONVERSION ROUTINES

There are three conversion routines: C<code2script()>, C<script2code()>,
and C<script_code2code()>.

=over 4

=item code2script( CODE, [ CODESET ] )

This function takes a script code and returns a string
which contains the name of the script identified.
If the code is not a valid script code, as defined by ISO 15924,
then C<undef> will be returned. For example:

    $script = code2script('cy');   # Cyrillic

=item script2code( STRING, [ CODESET ] )

This function takes a script name and returns the corresponding
script code, if one exists.
If the argument could not be identified as a script name,
then C<undef> will be returned. For example:

    $code = script2code('Gothic', LOCALE_CODE_ALPHA_3);
    # $code will now be 'gth'

The case of the script name is not important.
See the section L</KNOWN BUGS AND LIMITATIONS> below.

=item script_code2code( CODE, CODESET, CODESET )

This function takes a script code from one code set,
and returns the corresponding code from another code set.

    # note: a comma, not '=>', which would quote the constant's name
    $alpha2 = script_code2code('jwi',
                 LOCALE_CODE_ALPHA_3, LOCALE_CODE_ALPHA_2);
    # $alpha2 will now be 'jw' (Javanese)

If the code passed is not a valid script code in
the first code set, or if there isn't a code for the
corresponding script in the second code set,
then C<undef> will be returned.

=back

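Because each conversion routine can return C<undef>, chained conversions
are easiest to read when every step is checked. The sketch below shows
one way to do that; C<describe_alpha3()> is a hypothetical helper written
for this example and is not part of the module, and C<'xxx'> is simply a
string assumed not to be an assigned code.

```perl

    use strict;
    use warnings;
    use Locale::Script;
    use Locale::Constants;

    # Hypothetical helper: convert an alpha-3 code to alpha-2 and
    # report the script name, returning undef if any step fails.
    sub describe_alpha3 {
        my ($alpha3) = @_;

        my $alpha2 = script_code2code($alpha3,
                         LOCALE_CODE_ALPHA_3, LOCALE_CODE_ALPHA_2);
        return unless defined $alpha2;

        my $name = code2script($alpha2);
        return unless defined $name;

        return "$alpha3 -> $alpha2 ($name)";
    }

    for my $code (qw(jwi xxx)) {
        my $text = describe_alpha3($code);
        print defined $text ? "$text\n" : "'$code' could not be converted\n";
    }

```
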
=head1 QUERY ROUTINES

There are two functions which can be used to obtain a list of all codes,
or all script names:

=over 4

=item C<all_script_codes ( [ CODESET ] )>

Returns a list of all script codes in the specified code set
(the two-letter codes by default).
The codes are guaranteed to be all lower-case,
but are not returned in any particular order.

=item C<all_script_names ( [ CODESET ] )>

Returns a list of all script names for which there is a corresponding
script code in the specified code set.
The names are capitalised, and not returned in any particular order.

=back

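The query routines combine naturally with the conversion routines. As a
sketch, assuming the module version described here, the following builds
a table from alpha-3 code to script name and prints it in sorted order;
the hash name C<%name_for> is chosen for the example.

```perl

    use strict;
    use warnings;
    use Locale::Script;
    use Locale::Constants;

    # Map every alpha-3 code to its script name.
    my %name_for;
    for my $code (all_script_codes(LOCALE_CODE_ALPHA_3)) {
        $name_for{$code} = code2script($code, LOCALE_CODE_ALPHA_3);
    }

    # The module promises no particular order, so sort before printing.
    for my $code (sort keys %name_for) {
        print "$code\t$name_for{$code}\n";
    }

```
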
=head1 EXAMPLES

The following example illustrates use of the C<code2script()> function.
The user is prompted for a script code, and then told the corresponding
script name:

    use Locale::Script;
    use Locale::Constants;

    $| = 1;   # turn off buffering

    print "Enter script code: ";
    chomp($code = <STDIN>);   # strip the trailing newline
    $script = code2script($code, LOCALE_CODE_ALPHA_2);
    if (defined $script)
    {
        print "$code = $script\n";
    }
    else
    {
        print "'$code' is not a valid script code!\n";
    }

=head1 KNOWN BUGS AND LIMITATIONS

=over 4

=item *

When using C<script2code()>, the script name must currently appear
exactly as it does in the source of the module. For example,

    script2code('Egyptian hieroglyphs')

will return B<eg>, as expected. But the following will all return C<undef>:

    script2code('hieroglyphs')
    script2code('Egyptian Hieroglypics')

If there's need for it, a future version could have variants
for script names.

=item *

In the current implementation, all data is read in when the
module is loaded, and then held in memory.
A lazy implementation would be more memory friendly.

=back

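Until the module supports name variants, a caller can work around the
first limitation with an alias table of its own. The following is a
sketch only: C<%canonical_name> and C<script2code_with_aliases()> are
hypothetical names invented for this example, not part of the module,
and the single alias shown is taken from the example above.

```perl

    use strict;
    use warnings;
    use Locale::Script;

    # Hypothetical alias table: alternative names (lower-cased) mapped
    # to the name that script2code() is documented to accept.
    my %canonical_name = (
        'hieroglyphs' => 'Egyptian hieroglyphs',
    );

    # Hypothetical wrapper: try the name as given, then fall back to
    # the alias table.
    sub script2code_with_aliases {
        my ($name, @codeset) = @_;
        return unless defined $name;

        my $code = script2code($name, @codeset);
        return $code if defined $code;

        my $canonical = $canonical_name{ lc $name };
        return unless defined $canonical;
        return script2code($canonical, @codeset);
    }

    my $code = script2code_with_aliases('Hieroglyphs');
    print defined $code ? "code: $code\n" : "no code found\n";

```

Keeping the aliases in the calling program leaves the module's own data
untouched, at the cost of maintaining the table by hand.
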
=head1 SEE ALSO

=over 4

=item Locale::Language

ISO two letter codes for identification of language (ISO 639).

=item Locale::Currency

ISO three letter codes for identification of currencies
and funds (ISO 4217).

=item Locale::Country

ISO codes for identification of countries (ISO 3166).

=item ISO 15924

The ISO standard which defines these codes.

=item http://www.evertype.com/standards/iso15924/

Home page for ISO 15924.

=back

=head1 AUTHOR

Neil Bowers E<lt>[email protected]E<gt>

=head1 COPYRIGHT

Copyright (c) 2002-2004 Neil Bowers.

This module is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.

=cut