source: gsdl/trunk/trunk/mg/src/text/mgstat.1@ 16583

Last change on this file since 16583 was 16583, checked in by davidb, 16 years ago

Undoing change committed in r16582

  • Property svn:executable set to *
  • Property svn:keywords set to Author Date Id Revision
File size: 2.8 KB
.\"------------------------------------------------------------
.\" Id - set Rv,revision, and Dt, Date using rcs-Id tag.
.de Id
.ds Rv \\$3
.ds Dt \\$4
..
.Id $Id: mgstat.1 16583 2008-07-29 10:20:36Z davidb $
.\"------------------------------------------------------------
.TH mgstat 1 \*(Dt CITRI
.SH NAME
mgstat \- print out statistics about a document collection
.SH SYNOPSIS
.B mgstat
[
.B \-h
]
[
.B \-E
]
[
.BI \-d " directory"
]
.BI \-f " name"
.SH DESCRIPTION
.B mgstat
prints out various statistics about an existing
.BR mg (1)
document collection. Depending on the size of the collection, sizes
will be printed in either kilobytes or megabytes.
.SH OPTIONS
Options may appear in any order.
.TP "\w'\fB\-d\fP \fIdirectory\fP'u+2n"
.B \-h
This displays a usage line on
.IR stdout .
.TP
.B \-E
This option forces sizes to be printed in bytes rather than kilobytes
or megabytes.
.TP
.BI \-d " directory"
This specifies the directory where the document collection can be found.
.TP
.BI \-f " name"
This specifies the base name of the document collection.
.SH ENVIRONMENT
.TP "\w'\fBMGDATA\fP'u+2n"
.SB MGDATA
If this environment variable exists, then its value is used as the
default directory where the
.BR mg (1)
collection files are located. If this variable does not exist, then the
directory \*(lq\fB.\fP\*(rq is used by default. The command line
option
.BI \-d " directory"
overrides the directory in
.BR MGDATA .
.SH FILES
.TP 20
.B *.text
Compressed documents.
.TP
.B *.invf
Inverted file.
.TP
.B *.text.idx.wgt
Interleaved index into the compressed documents and document weights.
.TP
.B *.weight.approx
Approximate document weights.
.TP
.B *.invf.dict.blocked
Compressed stemmed dictionary and index into the inverted file merged
into an inverted file.
.TP
.B *.text.dict.fast
Fast loading compression dictionary.
.TP
.B *.text.dict
Compressed compression dictionary.
.TP
.B *.invf.dict
Compressed stemmed dictionary.
.TP
.B *.invf.idx
The index into the inverted file.
.TP
.B *.text.stats
Statistics about the text.
.TP
.B *.text.dict.aux
Auxiliary compression dictionary.
.TP
.B *.text.idx
Index into the compressed documents.
.TP
.B *.weight
The exact weights file.
.TP
.B *.invf.chunk
Maps stemmed terms from occurrence order to lexical order.
.TP
.B *.invf.chunk.trans
Describes where the source text is broken up into chunks for the
inversion pass.
.TP
.B *.invf.dict.hash
A perfect hash function for the terms in the stemmed dictionary.
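.SH EXAMPLES
.\" Illustrative sketch only: the directory /data/mg and the collection
.\" name alice are hypothetical placeholders, not taken from the
.\" distribution. Only options documented above are used.
The following invocations are a sketch of typical use. The directory
.I /data/mg
and the collection name
.I alice
are hypothetical and should be replaced with local values.
.PP
Print statistics for the collection
.I alice
stored in
.IR /data/mg :
.PP
.nf
.B mgstat \-d /data/mg \-f alice
.fi
.PP
Print the same statistics with sizes in bytes, taking the directory from
.SB MGDATA
instead of
.B \-d
(Bourne shell syntax is assumed):
.PP
.nf
.B MGDATA=/data/mg mgstat \-E \-f alice
.fi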
.SH "SEE ALSO"
.na
.BR mg (1),
.BR mg_compression_dict (1),
.BR mg_fast_comp_dict (1),
.BR mg_get (1),
.BR mg_invf_dict (1),
.BR mg_invf_dump (1),
.BR mg_invf_rebuild (1),
.BR mg_passes (1),
.BR mg_perf_hash_build (1),
.BR mg_text_estimate (1),
.BR mg_weights_build (1),
.BR mgbilevel (1),
.BR mgbuild (1),
.BR mgdictlist (1),
.BR mgfelics (1),
.BR mgquery (1),
.BR mgtic (1),
.BR mgticbuild (1),
.BR mgticdump (1),
.BR mgticprune (1),
.BR mgticstat (1).