source: trunk/indexers/mg/src/text/mg_invf_dump.1@ 3745

Last change on this file since 3745 was 3745, checked in by mdewsnip, 21 years ago

Addition of MG package for search and retrieval

.\"------------------------------------------------------------
.\" Id - set Rv,revision, and Dt, Date using rcs-Id tag.
.de Id
.ds Rv \\$3
.ds Dt \\$4
..
.Id $Id: mg_invf_dump.1 3745 2003-02-20 21:20:24Z mdewsnip $
.\"------------------------------------------------------------
.TH mg_invf_dump 1 \*(Dt CITRI
.SH NAME
mg_invf_dump \- Dump out an inverted file in ASCII
.SH SYNOPSIS
.B mg_invf_dump
[
.B \-h
]
[
.B \-b
]
[
.B \-w
]
[
.B \-t
]
[
.BI \-d " directory"
]
.BI \-f " name"
.SH DESCRIPTION
This program dumps out an inverted file produced by the
.BR mg (1)
system as ASCII numbers. It can be used in conjunction
with
.BR mgdictlist (1)
to write simple shell programs that work with the inverted files. The
output from the program looks something like this:
.LP
.RS
.ft 3
.nf
6 337
1
 1 1
1
 5 1
1
 2 1
1
 1 1
4
 1 3
 2 7
 3 2
 5 2
1
 1 1
 . . .
.fi
.ft
.RE
The first number (6) is the number of documents in the collection. It
is followed by the number of different terms in the collection (337). For
each term, the number of different documents it occurs in comes first, and
the numbers of those documents are listed next. If
.B \-w
is specified, then the number of times the term occurred in each
document is printed alongside the document number, as in the example above.
If
.B \-t
is specified, the term itself is also displayed in quotes.
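.LP
As a minimal sketch (not part of the mg distribution), the Python
function below reads output laid out like the sample above, assuming
\fB\-w\fP was given and \fB\-b\fP and \fB\-t\fP were not. The name
\fIparse_invf_dump\fP is hypothetical. The sketch treats the input as a
stream of whitespace-separated integers and stops quietly if the input
ends before every term has been read.
.RS
.nf
```python
def parse_invf_dump(text):
    """Parse ASCII dump text; assumes -w was given and -b, -t were not."""
    tokens = [int(tok) for tok in text.split()]
    # The first two numbers are the document count and the term count.
    num_docs, num_terms = tokens[0], tokens[1]
    pos = 2
    postings = []
    while pos < len(tokens) and len(postings) < num_terms:
        doc_count = tokens[pos]
        pos += 1
        if pos + 2 * doc_count > len(tokens):
            break  # input ended in the middle of a term entry
        entries = []
        for _ in range(doc_count):
            # Each entry is a document number followed by a count.
            entries.append((tokens[pos], tokens[pos + 1]))
            pos += 2
        postings.append(entries)
    return num_docs, num_terms, postings


# A shortened copy of the sample output, without the trailing dots.
sample = """6 337
1
 1 1
4
 1 3
 2 7
 3 2
 5 2
"""
num_docs, num_terms, postings = parse_invf_dump(sample)
print(num_docs, num_terms, postings)
```
.fi
.RE
Each element of the returned list holds the (document number, count)
pairs for one term, in the order in which the terms appear in the dump.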
.SH OPTIONS
Options may appear in any order.
.TP "\w'\fB\-d\fP \fIdirectory\fP'u+2n"
.B \-h
This displays a usage line on
.IR stderr .
.TP
.B \-b
This option will cause the output from the program to be in fixed-size
binary numbers, rather than in ASCII.
.TP
.B \-w
If the inverted file is a level-2 or level-3 inverted file, this causes
the word counts per document to be output.
.TP
.B \-t
This option causes each term to be displayed along with the number of
documents it appears in. The term is printed in quotes,
as in "the".
.TP
.BI \-d " directory"
This specifies the directory where the document collection can be found.
.TP
.BI \-f " name"
This specifies the base name of the document collection.
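.LP
The options above can be combined into a command line as in the
following Python sketch. The helper \fIbuild_command\fP, the collection
name and the directory are hypothetical. The helper only assembles the
argument list and does not run the program.
.RS
.nf
```python
def build_command(name, directory=None, word_counts=False, terms=False):
    """Return an argument list for mg_invf_dump (hypothetical helper)."""
    argv = ["mg_invf_dump"]
    if word_counts:
        argv.append("-w")  # print word counts per document
    if terms:
        argv.append("-t")  # print each term in quotes
    if directory is not None:
        argv += ["-d", directory]
    # -f is not bracketed in the synopsis, so it is always passed.
    argv += ["-f", name]
    return argv


# "mycollection" and "/data/mg" are placeholder values.
print(build_command("mycollection", directory="/data/mg", word_counts=True))
```
.fi
.RE
The resulting list could be handed to a process launcher such as the
standard \fIsubprocess\fP module, since options may appear in any order.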
.SH ENVIRONMENT
.TP "\w'\fBMGDATA\fP'u+2n"
.SB MGDATA
If this environment variable exists, then its value is used as the
default directory where the
.BR mg (1)
collection files are located. If this variable does not exist, then the
directory \*(lq\fB.\fP\*(rq is used by default. The command line
option
.BI \-d " directory"
overrides the directory in
.BR MGDATA .
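.LP
The precedence described above can be sketched in Python as follows.
This is an illustration of the rule, not the code used by the program,
and the function name \fIcollection_dir\fP and its parameters are
hypothetical.
.RS
.nf
```python
import os


def collection_dir(d_option=None, environ=None):
    """Pick the collection directory: -d first, then MGDATA, then "."."""
    env = os.environ if environ is None else environ
    if d_option is not None:
        return d_option  # the -d option overrides MGDATA
    # Fall back to the current directory when MGDATA does not exist.
    return env.get("MGDATA", ".")


print(collection_dir(environ={"MGDATA": "/data/mg"}))
```
.fi
.RE
Passing a dictionary as \fIenviron\fP makes the rule easy to check
without changing the real environment.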
.SH FILES
.TP 20
.B *.invf.dict
The compressed stemmed dictionary.
.TP
.B *.invf.ORG
The original inverted file saved by
.BR mg_invf_rebuild .
.TP
.B *.invf
The inverted file.
.SH "SEE ALSO"
.na
.BR mg (1),
.BR mg_compression_dict (1),
.BR mg_fast_comp_dict (1),
.BR mg_get (1),
.BR mg_invf_dict (1),
.BR mg_invf_rebuild (1),
.BR mg_passes (1),
.BR mg_perf_hash_build (1),
.BR mg_text_estimate (1),
.BR mg_weights_build (1),
.BR mgbilevel (1),
.BR mgbuild (1),
.BR mgdictlist (1),
.BR mgfelics (1),
.BR mgquery (1),
.BR mgstat (1),
.BR mgtic (1),
.BR mgticbuild (1),
.BR mgticdump (1),
.BR mgticprune (1),
.BR mgticstat (1).