source: main/trunk/binaries/windows/bin/GNUfile/man/cat1/file.1.txt@ 31442

Last change on this file since 31442 was 31442, checked in by ak19, 7 years ago

Adding the GNUFile windows port of linux file utility to detect bitness of an executable. Link to license included and added a GS_README text file with some basic explanations.

FILE(1)               BSD General Commands Manual              FILE(1)

NAME
     file -- determine file type

SYNOPSIS
     file [-bchikLnNprsvz] [--mime-type] [--mime-encoding]
          [-f namefile] [-F separator] [-m magicfiles] file
     file -C [-m magicfile]
     file [--help]

DESCRIPTION
     This manual page documents version 5.03 of the file com-
     mand.

     file tests each argument in an attempt to classify it.
     There are three sets of tests, performed in this order:
     filesystem tests, magic tests, and language tests. The
     first test that succeeds causes the file type to be
     printed.

     The type printed will usually contain one of the words
     text (the file contains only printing characters and a few
     common control characters and is probably safe to read on
     an ASCII terminal), executable (the file contains the
     result of compiling a program in a form understandable to
     some UNIX kernel or another), or data meaning anything
     else (data is usually `binary' or non-printable). Excep-
     tions are well-known file formats (core files, tar ar-
     chives) that are known to contain binary data. When modi-
     fying magic files or the program itself, make sure to
     preserve these keywords. Users depend on knowing that all
     the readable files in a directory have the word `text'
     printed. Don't do as Berkeley did and change `shell
     commands text' to `shell script'.

     The filesystem tests are based on examining the return
     from a stat(2) system call. The program checks to see if
     the file is empty, or if it's some sort of special file.
     Any known file types appropriate to the system you are
     running on (sockets, symbolic links, or named pipes
     (FIFOs) on those systems that implement them) are intuited
     if they are defined in the system header file
     <sys/stat.h>.

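     The filesystem tests described above can be illustrated
     with a minimal Python sketch. It is not the program's own
     code: the function name filesystem_test and the returned
     strings are assumptions made for illustration, and only
     the standard os and stat modules are used.

```python
import os
import stat


def filesystem_test(path):
    """Classify a path from lstat() alone.

    Returns a description, or None for a non-empty regular file,
    which would then go on to the magic and language tests.
    """
    info = os.lstat(path)  # lstat: do not follow symbolic links
    mode = info.st_mode
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISFIFO(mode):
        return "fifo (named pipe)"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISBLK(mode):
        return "block special"
    if stat.S_ISCHR(mode):
        return "character special"
    if stat.S_ISREG(mode) and info.st_size == 0:
        return "empty"
    return None
```

     The sketch never opens the file; everything it reports
     comes from the stat information.
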
     The magic tests are used to check for files with data in
     particular fixed formats. The canonical example of this
     is a binary executable (compiled program) a.out file,
     whose format is defined in <elf.h>, <a.out.h> and possibly
     <exec.h> in the standard include directory. These files
     have a `magic number' stored in a particular place near
     the beginning of the file that tells the UNIX operating
     system that the file is a binary executable, and which of
     several types thereof. The concept of a `magic' has been
     applied by extension to data files. Any file with some
     invariant identifier at a small fixed offset into the file
     can usually be described in this way. The information
     identifying these files is read from the compiled magic
     file c:/progra~1/file/share/misc/magic.mgc, or the files
     in the directory c:/progra~1/file/share/misc/magic if the
     compiled file does not exist. In addition, if
     $HOME/.magic.mgc or $HOME/.magic exists, it will be used
     in preference to the system magic files.

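     The idea of an invariant identifier at a small fixed
     offset can be sketched as follows. The table MAGIC_TABLE
     and the function magic_test are hypothetical and far
     simpler than the real magic file format; the three byte
     signatures shown (ELF, gzip, ustar) are well-known values.

```python
# Hypothetical table: (offset, magic bytes, description).
# Order matters: the first matching entry wins.
MAGIC_TABLE = [
    (0, b"\x7fELF", "ELF"),
    (0, b"\x1f\x8b", "gzip compressed data"),
    (257, b"ustar", "POSIX tar archive"),
]


def magic_test(data):
    """Return the description of the first entry whose bytes match."""
    for offset, magic, description in MAGIC_TABLE:
        if data[offset:offset + len(magic)] == magic:
            return description
    return None  # no entry matched; try the text tests next
```
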
     If a file does not match any of the entries in the magic
     file, it is examined to see if it seems to be a text file.
     ASCII, ISO-8859-x, non-ISO 8-bit extended-ASCII character
     sets (such as those used on Macintosh and IBM PC systems),
     UTF-8-encoded Unicode, UTF-16-encoded Unicode, and EBCDIC
     character sets can be distinguished by the different
     ranges and sequences of bytes that constitute printable
     text in each set. If a file passes any of these tests,
     its character set is reported. ASCII, ISO-8859-x, UTF-8,
     and extended-ASCII files are identified as `text' because
     they will be mostly readable on nearly any terminal;
     UTF-16 and EBCDIC are only `character data' because, while
     they contain text, it is text that will require transla-
     tion before it can be read. In addition, file will
     attempt to determine other characteristics of text-type
     files. If the lines of a file are terminated by CR, CRLF,
     or NEL, instead of the Unix-standard LF, this will be
     reported. Files that contain embedded escape sequences or
     overstriking will also be identified.

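     A rough sketch of the character set and line terminator
     checks described above follows. It is an illustration
     only: the real program uses per-character-set byte range
     tables, whereas this sketch relies on Python's UTF-8
     decoder and assumes a single-byte encoding when looking
     for NEL (byte 0x85). The function names are hypothetical.

```python
def guess_charset(data):
    """Very coarse guess: ASCII, UTF-8, or some unknown 8-bit set."""
    if all(byte < 0x80 for byte in data):
        return "ASCII"
    try:
        data.decode("utf-8")
        return "UTF-8"
    except UnicodeDecodeError:
        return "unknown 8-bit"


def line_terminators(data):
    """Return the set of line terminators found in the bytes."""
    found = set()
    i = 0
    while i < len(data):
        byte = data[i]
        if byte == 0x0D:
            if i + 1 < len(data) and data[i + 1] == 0x0A:
                found.add("CRLF")
                i += 1  # skip the LF that belongs to this CRLF
            else:
                found.add("CR")
        elif byte == 0x0A:
            found.add("LF")
        elif byte == 0x85:
            found.add("NEL")
        i += 1
    return found
```
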
     Once file has determined the character set used in a text-
     type file, it will attempt to determine in what language
     the file is written. The language tests look for particu-
     lar strings (cf. <names.h>) that can appear anywhere in
     the first few blocks of a file. For example, the keyword
     .br indicates that the file is most likely a troff(1)
     input file, just as the keyword struct indicates a C pro-
     gram. These tests are less reliable than the previous two
     groups, so they are performed last. The language test
     routines also test for some miscellany (such as tar(1) ar-
     chives).

     Any file that cannot be identified as having been written
     in any of the character sets listed above is simply said
     to be `data'.

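     The keyword search used by the language tests can be
     sketched as below. The table TOKENS, the constant BLOCK
     and the function language_test are assumptions for
     illustration; the actual keyword list lives in names.h
     and is considerably longer.

```python
# Hypothetical keyword table in the spirit of names.h.
TOKENS = [
    (".br", "troff input"),
    ("struct", "C program"),
]
BLOCK = 4096  # assumed size of "the first few blocks"


def language_test(text):
    """Look for known keywords among the words of the first block."""
    words = text[:BLOCK].split()
    for token, language in TOKENS:
        if token in words:
            return language + " text"
    return None
```

     Because a single keyword decides the result, a prose file
     that happens to contain the word struct would be
     misclassified, which is why these tests run last.
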
OPTIONS
     -b, --brief
             Do not prepend filenames to output lines (brief
             mode).

     -c, --checking-printout
             Cause a checking printout of the parsed form of
             the magic file. This is usually used in conjunc-
             tion with the -m flag to debug a new magic file
             before installing it.

     -C, --compile
             Write a magic.mgc output file that contains a pre-
             parsed version of the magic file or directory.

     -e, --exclude testname
             Exclude the test named in testname from the list
             of tests made to determine the file type. Valid
             test names are:

             apptype
                EMX application type (only on EMX).

             text
                Various types of text files (this test will try
                to guess the text encoding, irrespective of the
                setting of the `encoding' option).

             encoding
                Different text encodings for soft magic tests.

             tokens
                Looks for known tokens inside text files.

             cdf
                Prints details of Compound Document Files.

             compress
                Checks for, and looks inside, compressed files.

             elf
                Prints ELF file details.

             soft
                Consults magic files.

             tar
                Examines tar files.

     -f, --files-from namefile
             Read the names of the files to be examined from
             namefile (one per line) before the argument list.
             Either namefile or at least one filename argument
             must be present; to test the standard input, use
             `-' as a filename argument.

     -F, --separator separator
             Use the specified string as the separator between
             the filename and the file result returned.
             Defaults to `:'.

     -h, --no-dereference
             This option causes symlinks not to be followed (on
             systems that support symbolic links). This is the
             default if the environment variable
             POSIXLY_CORRECT is not defined.

     -i, --mime
             Causes the file command to output MIME type
             strings rather than the more traditional human-
             readable ones. Thus it may say `text/plain;
             charset=us-ascii' rather than `ASCII text'. In
             order for this option to work, file changes the
             way it handles files recognized by the command
             itself (such as many of the text file types,
             directories, etc.), and makes use of an alternative
             `magic' file. (See the FILES section, below.)

     --mime-type, --mime-encoding
             Like -i, but print only the specified element(s).

     -k, --keep-going
             Don't stop at the first match, keep going.
             Subsequent matches will have the string `\012- '
             prepended. (If you want a newline, see the `-r'
             option.)

     -L, --dereference
             This option causes symlinks to be followed, as
             with the like-named option in ls(1) (on systems
             that support symbolic links). This is the default
             if the environment variable POSIXLY_CORRECT is
             defined.

     -m, --magic-file list
             Specify an alternate list of files and directories
             containing magic. This can be a single item, or a
             colon-separated list. If a compiled magic file is
             found alongside a file or directory, it will be
             used instead.

     -n, --no-buffer
             Force stdout to be flushed after checking each
             file. This is only useful if checking a list of
             files. It is intended to be used by programs that
             want filetype output from a pipe.

     -N, --no-pad
             Don't pad filenames so that they align in the
             output.

     -p, --preserve-date
             On systems that support utime(2) or utimes(2),
             attempt to preserve the access time of files
             analyzed, to pretend that file never read them.

     -r, --raw
             Don't translate unprintable characters to \ooo.
             Normally file translates unprintable characters to
             their octal representation.

     -s, --special-files
             Normally, file only attempts to read and determine
             the type of argument files which stat(2) reports
             are ordinary files. This prevents problems,
             because reading special files may have peculiar
             consequences. Specifying the -s option causes
             file to also read argument files which are block
             or character special files. This is useful for
             determining the filesystem types of the data in
             raw disk partitions, which are block special
             files. This option also causes file to disregard
             the file size as reported by stat(2) since on some
             systems it reports a zero size for raw disk
             partitions.

     -v, --version
             Print the version of the program and exit.

     -z, --uncompress
             Try to look inside compressed files.

     -0, --print0
             Output a null character `\0' after the end of the
             filename. This makes the output easy to process
             with cut(1). This does not affect the separator,
             which is still printed.

     --help  Print a help message and exit.

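     The translation that the -r option switches off, turning
     unprintable characters into \ooo octal escapes, can be
     sketched as follows. The function name escape_unprintable
     is hypothetical, and treating bytes 0x20 through 0x7E as
     the printable range is an assumption of this sketch.

```python
def escape_unprintable(data):
    """Replace each unprintable byte with a three-digit octal escape."""
    out = []
    for byte in data:
        if 0x20 <= byte < 0x7F:
            out.append(chr(byte))
        else:
            out.append("\\%03o" % byte)
    return "".join(out)
```

     Under this scheme a newline becomes the four characters
     \012, which is the form that appears in the `\012- '
     string prepended by -k.
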
FILES
     c:/progra~1/file/share/misc/magic.mgc  Default compiled
                                            list of magic.
     c:/progra~1/file/share/misc/magic      Directory containing
                                            default magic files.

ENVIRONMENT
     The environment variable MAGIC can be used to set the
     default magic file name. If that variable is set, then
     file will not attempt to open $HOME/.magic. file adds
     `.mgc' to the value of this variable as appropriate. The
     environment variable POSIXLY_CORRECT controls (on systems
     that support symbolic links) whether file will attempt to
     follow symlinks or not. If set, then file follows
     symlinks; otherwise it does not. This is also controlled
     by the -L and -h options.

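     The interaction between POSIXLY_CORRECT and the -L and -h
     options can be summarized in a small sketch. The function
     follow_symlinks is hypothetical, and it accepts at most
     one of the two options; how the program resolves both
     being given at once is not stated here.

```python
def follow_symlinks(environ, option=None):
    """Decide whether symlinks are followed.

    environ: mapping of environment variables.
    option:  "-L", "-h", or None when neither was given.
    """
    if option == "-L":
        return True
    if option == "-h":
        return False
    # No explicit option: the environment variable decides.
    return "POSIXLY_CORRECT" in environ
```
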
SEE ALSO
     magic(5), strings(1), od(1), hexdump(1), file(1posix)

STANDARDS CONFORMANCE
     This program is believed to exceed the System V Interface
     Definition of FILE(CMD), as near as one can determine from
     the vague language contained therein. Its behavior is
     mostly compatible with the System V program of the same
     name. This version knows more magic, however, so it will
     produce different (albeit more accurate) output in many
     cases.

     The one significant difference between this version and
     System V is that this version treats any white space as a
     delimiter, so that spaces in pattern strings must be
     escaped. For example,

           >10     string  language impress        (imPRESS data)

     in an existing magic file would have to be changed to

           >10     string  language\ impress       (imPRESS data)

     In addition, in this version, if a pattern string contains
     a backslash, it must be escaped. For example

           0       string          \begindata      Andrew Toolkit document

     in an existing magic file would have to be changed to

           0       string          \\begindata     Andrew Toolkit document

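     The two escaping rules above can be applied mechanically
     when converting an old magic file. The sketch below is a
     hypothetical helper, not part of the distribution; it
     handles the space character only, although the text says
     that any white space acts as a delimiter.

```python
def escape_magic_string(pattern):
    """Escape backslashes, then spaces, in a magic pattern string."""
    # Backslashes first, so the ones added for spaces are not doubled.
    escaped = pattern.replace("\\", "\\\\")
    return escaped.replace(" ", "\\ ")
```

     Applied to the two examples, it turns language impress
     into language\ impress and \begindata into \\begindata.
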
     SunOS releases 3.2 and later from Sun Microsystems include
     a file command derived from the System V one, but with
     some extensions. My version differs from Sun's only in
     minor ways. It includes the extension of the `&' opera-
     tor, used as, for example,

           >16     long&0x7fffffff >0              not stripped

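     What the `&' entry above tests can be sketched in Python:
     read a 4-byte value at offset 16, mask it with 0x7fffffff,
     and check that the result is greater than zero. The
     function masked_long_test is hypothetical, and the
     little-endian default for byteorder is an assumption; a
     plain long in a magic file uses the byte order of the host.

```python
import struct


def masked_long_test(data, offset=16, mask=0x7FFFFFFF, byteorder="<"):
    """Return True if (long at offset) & mask is greater than zero."""
    # "i" is a signed 4-byte integer; byteorder "<" means little-endian.
    (value,) = struct.unpack_from(byteorder + "i", data, offset)
    return (value & mask) > 0
```

     The mask clears the top bit, so a value in which only that
     bit is set does not satisfy the test.
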
MAGIC DIRECTORY
     The magic file entries have been collected from various
     sources, mainly USENET, and contributed by various
     authors. Christos Zoulas (address below) will collect
     additional or corrected magic file entries. A consolida-
     tion of magic file entries will be distributed periodi-
     cally.

     The order of entries in the magic file is significant.
     Depending on what system you are using, the order that
     they are put together may be incorrect. If your old file
     command uses a magic file, keep the old magic file around
     for comparison purposes (rename it to
     c:/progra~1/file/share/misc/magic.orig).

EXAMPLES
           $ file file.c file /dev/{wd0a,hda}
           file.c:    C program text
           file:      ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV),
                      dynamically linked (uses shared libs), stripped
           /dev/wd0a: block special (0/0)
           /dev/hda:  block special (3/0)

           $ file -s /dev/wd0{b,d}
           /dev/wd0b: data
           /dev/wd0d: x86 boot sector

           $ file -s /dev/hda{,1,2,3,4,5,6,7,8,9,10}
           /dev/hda:   x86 boot sector
           /dev/hda1:  Linux/i386 ext2 filesystem
           /dev/hda2:  x86 boot sector
           /dev/hda3:  x86 boot sector, extended partition table
           /dev/hda4:  Linux/i386 ext2 filesystem
           /dev/hda5:  Linux/i386 swap file
           /dev/hda6:  Linux/i386 swap file
           /dev/hda7:  Linux/i386 swap file
           /dev/hda8:  Linux/i386 swap file
           /dev/hda9:  empty
           /dev/hda10: empty

           $ file -i file.c file /dev/{wd0a,hda}
           file.c:    text/x-c
           file:      application/x-executable
           /dev/hda:  application/x-not-regular-file
           /dev/wd0a: application/x-not-regular-file

HISTORY
     There has been a file command in every UNIX since at least
     Research Version 4 (man page dated November, 1973). The
     System V version introduced one significant change: the
     external list of magic types. This slowed the program
     down slightly but made it a lot more flexible.

     This program, based on the System V version, was written
     by Ian Darwin <[email protected]> without looking at
     anybody else's source code.

     John Gilmore revised the code extensively, making it
     better than the first version. Geoff Collyer found several
     inadequacies and provided some magic file entries.
     Contribution of the `&' operator by Rob McMahon,
     cudcv@warwick.ac.uk, 1989.

     Guy Harris, [email protected], made many changes from 1993 to
     the present.

     Primary development and maintenance from 1990 to the
     present by Christos Zoulas ([email protected]).

     Altered by Chris Lowth, [email protected], 2000: Handle the
     -i option to output mime type strings, using an alterna-
     tive magic file and internal logic.

     Altered by Eric Fischer ([email protected]), July, 2000, to
     identify character codes and attempt to identify the lan-
     guages of non-ASCII files.

     Altered by Reuben Thomas ([email protected]), 2007 to 2008, to
     improve MIME support and merge MIME and non-MIME magic,
     support directories as well as files of magic, apply many
     bug fixes and improve the build system.

     The list of contributors to the `magic' directory (magic
     files) is too long to include here. You know who you are;
     thank you. Many contributors are listed in the source
     files.

LEGAL NOTICE
     Copyright (c) Ian F. Darwin, Toronto, Canada, 1986-1999.
     Covered by the standard Berkeley Software Distribution
     copyright; see the file LEGAL.NOTICE in the source distri-
     bution.

     The files tar.h and is_tar.c were written by John Gilmore
     from his public-domain tar(1) program, and are not covered
     by the above license.

BUGS
     There must be a better way to automate the construction of
     the Magic file from all the glop in Magdir. What is it?

     file uses several algorithms that favor speed over accu-
     racy, thus it can be misled about the contents of text
     files.

     The support for text files (primarily for programming lan-
     guages) is simplistic, inefficient and requires recompila-
     tion to update.

     The list of keywords in ascmagic probably belongs in the
     Magic file. This could be done by using some keyword like
     `*' for the offset value.

     Complain about conflicts in the magic file entries. Make
     a rule that the magic entries sort based on file offset
     rather than position within the magic file?

     The program should provide a way to give an estimate of
     `how good' a guess is. We end up removing guesses (e.g.
     `From ' as first 5 chars of file) because they are not as
     good as other guesses (e.g. `Newsgroups:' versus
     `Return-Path:'). Still, if the others don't pan out, it
     should be possible to use the first guess.

     This manual page, and particularly this section, is too
     long.

RETURN CODE
     file returns 0 on success, and non-zero on error.

AVAILABILITY
     You can obtain the original author's latest version by
     anonymous FTP on ftp.astron.com in the directory
     /pub/file/file-X.YZ.tar.gz

BSD                         October 9, 2008                        BSD