FILE(1)                 BSD General Commands Manual                FILE(1)

NAME
     file -- determine file type

SYNOPSIS
     file [-bchikLnNprsvz] [--mime-type] [--mime-encoding]
          [-f namefile] [-F separator] [-m magicfiles] file
     file -C [-m magicfile]
     file [--help]

DESCRIPTION
     This manual page documents version 5.03 of the file command.

     file tests each argument in an attempt to classify it.
     There are three sets of tests, performed in this order:
     filesystem tests, magic tests, and language tests. The
     first test that succeeds causes the file type to be
     printed.

     The type printed will usually contain one of the words
     text (the file contains only printing characters and a few
     common control characters and is probably safe to read on
     an ASCII terminal), executable (the file contains the
     result of compiling a program in a form understandable to
     some UNIX kernel or another), or data meaning anything
     else (data is usually `binary' or non-printable).
     Exceptions are well-known file formats (core files, tar
     archives) that are known to contain binary data. When
     modifying magic files or the program itself, make sure to
     preserve these keywords. Users depend on knowing that all
     the readable files in a directory have the word `text'
     printed. Don't do as Berkeley did and change `shell
     commands text' to `shell script'.

     The filesystem tests are based on examining the return
     from a stat(2) system call. The program checks to see if
     the file is empty, or if it's some sort of special file.
     Any known file types appropriate to the system you are
     running on (sockets, symbolic links, or named pipes
     (FIFOs) on those systems that implement them) are intuited
     if they are defined in the system header file
     <sys/stat.h>.
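
     The filesystem tests can be approximated from the shell with
     the POSIX test operators, which consult the same kind of
     stat(2) information. This is only a rough sketch: the function
     name classify_fs and the labels it prints are hypothetical and
     do not reproduce the wording file itself uses.

```shell
#!/bin/sh
# classify_fs PATH: hypothetical imitation of file's filesystem tests.
# The labels are illustrative, not file's real output.
classify_fs() {
    if   [ -L "$1" ]; then echo "symbolic link"   # -L inspects the link itself
    elif [ ! -e "$1" ]; then echo "cannot open"
    elif [ -d "$1" ]; then echo "directory"
    elif [ -p "$1" ]; then echo "fifo (named pipe)"
    elif [ -S "$1" ]; then echo "socket"
    elif [ -b "$1" ]; then echo "block special"
    elif [ -c "$1" ]; then echo "character special"
    elif [ ! -s "$1" ]; then echo "empty"
    else echo "regular file: continue with magic tests"
    fi
}
```

     Only a path that falls through to the last branch would need
     its contents read, which is where the magic tests take over.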

     The magic tests are used to check for files with data in
     particular fixed formats. The canonical example of this
     is a binary executable (compiled program) a.out file,
     whose format is defined in <elf.h>, <a.out.h> and possibly
     <exec.h> in the standard include directory. These files
     have a `magic number' stored in a particular place near
     the beginning of the file that tells the UNIX operating
     system that the file is a binary executable, and which of
     several types thereof. The concept of a `magic' has been
     applied by extension to data files. Any file with some
     invariant identifier at a small fixed offset into the file
     can usually be described in this way. The information
     identifying these files is read from the compiled magic
     file c:/progra~1/file/share/misc/magic.mgc, or the files
     in the directory c:/progra~1/file/share/misc/magic if the
     compiled file does not exist. In addition, if
     $HOME/.magic.mgc or $HOME/.magic exists, it will be used
     in preference to the system magic files.
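
     A magic test amounts to comparing a few bytes at a fixed
     offset against a known value. The sketch below checks for the
     ELF magic number (0x7f followed by the letters E, L, F at
     offset 0) using od(1). The helper names magic_hex and is_elf
     are made up for this illustration; file itself reads its
     patterns from the magic files rather than hard-coding them.

```shell
#!/bin/sh
# magic_hex FILE [N]: print the first N bytes (default 4) of FILE as
# lowercase hex digits with no separators.
magic_hex() {
    od -A n -t x1 -N "${2:-4}" "$1" | tr -d ' \n'
}

# is_elf FILE: succeed if FILE begins with the ELF magic number.
is_elf() {
    [ "$(magic_hex "$1" 4)" = "7f454c46" ]
}
```

     For example, `is_elf /bin/sh && echo ELF' reports whether the
     shell binary on the current system is an ELF file.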

     If a file does not match any of the entries in the magic
     file, it is examined to see if it seems to be a text file.
     ASCII, ISO-8859-x, non-ISO 8-bit extended-ASCII character
     sets (such as those used on Macintosh and IBM PC systems),
     UTF-8-encoded Unicode, UTF-16-encoded Unicode, and EBCDIC
     character sets can be distinguished by the different
     ranges and sequences of bytes that constitute printable
     text in each set. If a file passes any of these tests,
     its character set is reported. ASCII, ISO-8859-x, UTF-8,
     and extended-ASCII files are identified as `text' because
     they will be mostly readable on nearly any terminal;
     UTF-16 and EBCDIC are only `character data' because, while
     they contain text, it is text that will require
     translation before it can be read. In addition, file will
     attempt to determine other characteristics of text-type
     files. If the lines of a file are terminated by CR, CRLF,
     or NEL, instead of the Unix-standard LF, this will be
     reported. Files that contain embedded escape sequences or
     overstriking will also be identified.
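
     The line terminator check can be imitated by counting
     carriage returns and line feeds. This is a simplified sketch,
     not the algorithm file uses: the helper name line_endings is
     hypothetical, NEL is not handled, and a file with mixed
     terminators is simply reported as CRLF.

```shell
#!/bin/sh
# line_endings FILE: guess the line terminator style of a text file
# by counting CR and LF bytes. Hypothetical helper for illustration.
line_endings() {
    cr=$(tr -cd '\r' < "$1" | wc -c | tr -d ' ')
    lf=$(tr -cd '\n' < "$1" | wc -c | tr -d ' ')
    if   [ "$cr" -eq 0 ] && [ "$lf" -eq 0 ]; then echo "none"
    elif [ "$cr" -eq 0 ]; then echo "LF"
    elif [ "$lf" -eq 0 ]; then echo "CR"
    else echo "CRLF"   # both present; mixed files also land here
    fi
}
```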

     Once file has determined the character set used in a
     text-type file, it will attempt to determine in what language
     the file is written. The language tests look for
     particular strings (cf. <names.h>) that can appear anywhere in
     the first few blocks of a file. For example, the keyword
     .br indicates that the file is most likely a troff(1)
     input file, just as the keyword struct indicates a C
     program. These tests are less reliable than the previous two
     groups, so they are performed last. The language test
     routines also test for some miscellany (such as tar(1)
     archives).

     Any file that cannot be identified as having been written
     in any of the character sets listed above is simply said
     to be `data'.

OPTIONS
     -b, --brief
             Do not prepend filenames to output lines (brief
             mode).

     -c, --checking-printout
             Cause a checking printout of the parsed form of
             the magic file. This is usually used in
             conjunction with the -m flag to debug a new magic file
             before installing it.

     -C, --compile
             Write a magic.mgc output file that contains a
             pre-parsed version of the magic file or directory.

     -e, --exclude testname
             Exclude the test named in testname from the list
             of tests made to determine the file type. Valid
             test names are:

             apptype
                EMX application type (only on EMX).

             text
                Various types of text files (this test will try
                to guess the text encoding, irrespective of the
                setting of the `encoding' option).

             encoding
                Different text encodings for soft magic tests.

             tokens
                Looks for known tokens inside text files.

             cdf
                Prints details of Compound Document Files.

             compress
                Checks for, and looks inside, compressed files.

             elf
                Prints ELF file details.

             soft
                Consults magic files.

             tar
                Examines tar files.

     -f, --files-from namefile
             Read the names of the files to be examined from
             namefile (one per line) before the argument list.
             Either namefile or at least one filename argument
             must be present; to test the standard input, use
             `-' as a filename argument.

     -F, --separator separator
             Use the specified string as the separator between
             the filename and the file result returned.
             Defaults to `:'.

     -h, --no-dereference
             Causes symlinks not to be followed (on systems
             that support symbolic links). This is the
             default if the environment variable
             POSIXLY_CORRECT is not defined.

     -i, --mime
             Causes the file command to output MIME type
             strings rather than the more traditional human
             readable ones. Thus it may say `text/plain;
             charset=us-ascii' rather than `ASCII text'. In
             order for this option to work, file changes the
             way it handles files recognized by the command
             itself (such as many of the text file types,
             directories, etc.), and makes use of an alternative
             `magic' file. (See the FILES section, below.)

     --mime-type, --mime-encoding
             Like -i, but print only the specified element(s).

     -k, --keep-going
             Don't stop at the first match, keep going.
             Subsequent matches will have the string `\012- '
             prepended. (If you want a newline, see the `-r'
             option.)

     -L, --dereference
             Causes symlinks to be followed, as with the
             like-named option in ls(1) (on systems that
             support symbolic links). This is the default if the
             environment variable POSIXLY_CORRECT is defined.

     -m, --magic-file list
             Specify an alternate list of files and directories
             containing magic. This can be a single item, or a
             colon-separated list. If a compiled magic file is
             found alongside a file or directory, it will be
             used instead.

     -n, --no-buffer
             Force stdout to be flushed after checking each
             file. This is only useful if checking a list of
             files. It is intended to be used by programs that
             want filetype output from a pipe.

     -N, --no-pad
             Don't pad filenames so that they align in the
             output.

     -p, --preserve-date
             On systems that support utime(2) or utimes(2),
             attempt to preserve the access time of files
             analyzed, to pretend that file never read them.

     -r, --raw
             Don't translate unprintable characters to \ooo.
             Normally file translates unprintable characters to
             their octal representation.

     -s, --special-files
             Normally, file only attempts to read and determine
             the type of argument files which stat(2) reports
             are ordinary files. This prevents problems,
             because reading special files may have peculiar
             consequences. Specifying the -s option causes
             file to also read argument files which are block
             or character special files. This is useful for
             determining the filesystem types of the data in
             raw disk partitions, which are block special
             files. This option also causes file to disregard
             the file size as reported by stat(2) since on some
             systems it reports a zero size for raw disk
             partitions.

     -v, --version
             Print the version of the program and exit.

     -z, --uncompress
             Try to look inside compressed files.

     -0, --print0
             Output a null character `\0' after the end of the
             filename. This makes the output easy to split with
             cut(1). It does not affect the separator, which is
             still printed.

     --help  Print a help message and exit.
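
     Several of the options above are commonly combined in
     scripts. The following sketch assumes that a file command is
     installed; the temporary file names are invented for the
     example, and the descriptions printed depend on the installed
     magic database, so no particular output is promised here.

```shell
#!/bin/sh
# Assumes the file command is available on PATH.
tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/a.txt"
printf 'world\n' > "$tmp/b.txt"

# -f: read the names of the files to examine from a list, one per line.
printf '%s\n' "$tmp/a.txt" "$tmp/b.txt" > "$tmp/names"
file -f "$tmp/names"

# -F: use ' =>' instead of ':' between the filename and the result.
file -F ' =>' "$tmp/a.txt"

# -b: print only the result, without the filename.
file -b "$tmp/a.txt"

# -b with --mime-type: a bare MIME type, convenient for scripts.
mime=$(file -b --mime-type "$tmp/a.txt")
echo "$mime"

rm -rf "$tmp"
```

     Capturing the brief MIME type in a variable, as in the last
     step, avoids having to parse the filename and separator out of
     the default output.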

FILES
     c:/progra~1/file/share/misc/magic.mgc  Default compiled
                                            list of magic.
     c:/progra~1/file/share/misc/magic      Directory containing
                                            default magic files.

ENVIRONMENT
     The environment variable MAGIC can be used to set the
     default magic file name. If that variable is set, then
     file will not attempt to open $HOME/.magic. file adds
     `.mgc' to the value of this variable as appropriate. The
     environment variable POSIXLY_CORRECT controls (on systems
     that support symbolic links) whether file will attempt to
     follow symlinks or not. If set, then file follows symlinks,
     otherwise it does not. This is also controlled by the -L
     and -h options.

SEE ALSO
     magic(5), strings(1), od(1), hexdump(1), file(1posix)

STANDARDS CONFORMANCE
     This program is believed to exceed the System V Interface
     Definition of FILE(CMD), as near as one can determine from
     the vague language contained therein. Its behavior is
     mostly compatible with the System V program of the same
     name. This version knows more magic, however, so it will
     produce different (albeit more accurate) output in many
     cases.

     The one significant difference between this version and
     System V is that this version treats any white space as a
     delimiter, so that spaces in pattern strings must be
     escaped. For example,

           >10     string  language impress        (imPRESS data)

     in an existing magic file would have to be changed to

           >10     string  language\ impress       (imPRESS data)

     In addition, in this version, if a pattern string contains
     a backslash, it must be escaped. For example

           0       string  \begindata      Andrew Toolkit document

     in an existing magic file would have to be changed to

           0       string  \\begindata     Andrew Toolkit document
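
     The white space rule can be tried out with a private magic
     file passed through -m. The sketch below assumes that a file
     command is installed; the magic entry, the description `demo
     greeting file' and the file names are all invented for the
     illustration.

```shell
#!/bin/sh
# Assumes the file command is available on PATH.
tmp=$(mktemp -d)

# Fields are separated by white space, so the space inside the
# pattern "hello world" is escaped with a backslash. The quoted
# 'EOF' stops the shell from interpreting the backslash itself.
cat > "$tmp/demo.magic" <<'EOF'
0 string hello\ world demo greeting file
EOF

printf 'hello world\n' > "$tmp/sample"

# -m selects the private magic file; -b suppresses the filename.
result=$(file -b -m "$tmp/demo.magic" "$tmp/sample")
echo "$result"

rm -rf "$tmp"
```

     Without the backslash, the pattern would end at `hello' and the
     word `world' would become part of the description instead.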

     SunOS releases 3.2 and later from Sun Microsystems include
     a file command derived from the System V one, but with
     some extensions. My version differs from Sun's only in
     minor ways. It includes the extension of the `&'
     operator, used as, for example,

           >16     long&0x7fffffff >0              not stripped

MAGIC DIRECTORY
     The magic file entries have been collected from various
     sources, mainly USENET, and contributed by various
     authors. Christos Zoulas (address below) will collect
     additional or corrected magic file entries. A
     consolidation of magic file entries will be distributed
     periodically.

     The order of entries in the magic file is significant.
     Depending on what system you are using, the order that
     they are put together may be incorrect. If your old file
     command uses a magic file, keep the old magic file around
     for comparison purposes (rename it to
     c:/progra~1/file/share/misc/magic.orig).

EXAMPLES
           $ file file.c file /dev/{wd0a,hda}
           file.c:    C program text
           file:      ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV),
                      dynamically linked (uses shared libs), stripped
           /dev/wd0a: block special (0/0)
           /dev/hda:  block special (3/0)

           $ file -s /dev/wd0{b,d}
           /dev/wd0b: data
           /dev/wd0d: x86 boot sector

           $ file -s /dev/hda{,1,2,3,4,5,6,7,8,9,10}
           /dev/hda:   x86 boot sector
           /dev/hda1:  Linux/i386 ext2 filesystem
           /dev/hda2:  x86 boot sector
           /dev/hda3:  x86 boot sector, extended partition table
           /dev/hda4:  Linux/i386 ext2 filesystem
           /dev/hda5:  Linux/i386 swap file
           /dev/hda6:  Linux/i386 swap file
           /dev/hda7:  Linux/i386 swap file
           /dev/hda8:  Linux/i386 swap file
           /dev/hda9:  empty
           /dev/hda10: empty

           $ file -i file.c file /dev/{wd0a,hda}
           file.c:    text/x-c
           file:      application/x-executable
           /dev/hda:  application/x-not-regular-file
           /dev/wd0a: application/x-not-regular-file

HISTORY
     There has been a file command in every UNIX since at least
     Research Version 4 (man page dated November, 1973). The
     System V version introduced one significant change:
     the external list of magic types. This slowed the program
     down slightly but made it a lot more flexible.

     This program, based on the System V version, was written
     by Ian Darwin <[email protected]> without looking at
     anybody else's source code.

     John Gilmore revised the code extensively, making it
     better than the first version. Geoff Collyer found several
     inadequacies and provided some magic file entries.
     Contribution of the `&' operator by Rob McMahon,
     cudcv@warwick.ac.uk, 1989.

     Guy Harris, [email protected], made many changes from 1993 to
     the present.

     Primary development and maintenance from 1990 to the
     present by Christos Zoulas ([email protected]).

     Altered by Chris Lowth, [email protected], 2000: Handle the
     -i option to output mime type strings, using an
     alternative magic file and internal logic.

     Altered by Eric Fischer ([email protected]), July, 2000, to
     identify character codes and attempt to identify the
     languages of non-ASCII files.

     Altered by Reuben Thomas ([email protected]), 2007 to 2008, to
     improve MIME support and merge MIME and non-MIME magic,
     support directories as well as files of magic, apply many
     bug fixes and improve the build system.

     The list of contributors to the `magic' directory (magic
     files) is too long to include here. You know who you are;
     thank you. Many contributors are listed in the source
     files.

LEGAL NOTICE
     Copyright (c) Ian F. Darwin, Toronto, Canada, 1986-1999.
     Covered by the standard Berkeley Software Distribution
     copyright; see the file LEGAL.NOTICE in the source
     distribution.

     The files tar.h and is_tar.c were written by John Gilmore
     from his public-domain tar(1) program, and are not covered
     by the above license.

BUGS
     There must be a better way to automate the construction of
     the Magic file from all the glop in Magdir. What is it?

     file uses several algorithms that favor speed over
     accuracy, thus it can be misled about the contents of text
     files.

     The support for text files (primarily for programming
     languages) is simplistic, inefficient and requires
     recompilation to update.

     The list of keywords in ascmagic probably belongs in the
     Magic file. This could be done by using some keyword like
     `*' for the offset value.

     Complain about conflicts in the magic file entries. Make
     a rule that the magic entries sort based on file offset
     rather than position within the magic file?

     The program should provide a way to give an estimate of
     `how good' a guess is. We end up removing guesses (e.g.
     `From ' as first 5 chars of file) because they are not as
     good as other guesses (e.g. `Newsgroups:' versus
     `Return-Path:'). Still, if the others don't pan out, it
     should be possible to use the first guess.

     This manual page, and particularly this section, is too
     long.

RETURN CODE
     file returns 0 on success, and non-zero on error.

AVAILABILITY
     You can obtain the original author's latest version by
     anonymous FTP on ftp.astron.com in the directory
     /pub/file/file-X.YZ.tar.gz

BSD                          October 9, 2008                         BSD