.\"------------------------------------------------------------
.\" Id - set Rv,revision, and Dt, Date using rcs-Id tag.
.de Id
.ds Rv \\$3
.ds Dt \\$4
..
.Id $Id: mg_invf_rebuild.1 16583 2008-07-29 10:20:36Z davidb $
.\"------------------------------------------------------------
.TH mg_invf_rebuild 1 \*(Dt CITRI
.SH NAME
mg_invf_rebuild \- Rebuild an mg inverted file with or without skips.
.SH SYNOPSIS
.B mg_invf_rebuild
.RB [ \-h ]
.if n .ti +9n
[
.BR \-0 " |"
.B \-1
[
.BI \-k " num"
] |
.B \-2
[
.BI \-s " num"
]
[
.BI \-m " num"
]
]
.if n .ti +9n
[
.BI \-d " directory"
]
.BI \-f " name"
.SH DESCRIPTION
.B mg_invf_rebuild
builds a new inverted file, with or without skipping, from an old
inverted file. It uses
.I *.invf.ORG
and
.I *.invf.idx.ORG
as the
source from which it builds
.I *.invf
and
.IR *.invf.idx .
If
.I *.invf.ORG
or
.I *.invf.idx.ORG
does not exist, the program renames
.I *.invf
or
.I *.invf.idx
to
.I *.invf.ORG
or
.I *.invf.idx.ORG
as appropriate. The old inverted file may itself contain skipping, so
the
.I *.ORG
files can be deleted
after the new inverted file is built.
.SH OPTIONS
Options may appear in any order.
.TP "\w'\fB\-m\fP \fInum\fP'u+2n"
.B \-h
This displays a usage line on
.IR stderr .
.TP
.B \-0
This generates a non-skipped inverted file. This option is normally
only needed if the
.I *.ORG
files have been deleted.
.TP
.B \-1
This generates a skipped inverted file. The
.BI \-k " num"
argument specifies the number of pointers hopped over with each skip.
.TP
.B \-2
This option generates a skipped inverted file. The skipped inverted
file is built so that it is `optimal' for ranking using a specific
number of accumulators. Each term in the inverted file has a
different skip length. The arguments
.BR \-s " and " \-m
control the sizes of the skips.
.TP
.BI \-k " num"
This specifies the number of pointers that should be hopped over with
each skip. This option is only valid if
.B \-1
is specified.
.TP
.BI \-m " num"
This specifies the intended number of accumulators that will be used
when ranking queries are done on the collection.
.TP
.BI \-s " num"
This specifies the minimum size for skips. If the calculation of the
optimal skip size results in a number smaller than
.IR num ,
the skip size is set to
.IR num .
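.SH EXAMPLES
The invocations below are illustrative sketches assembled from the
synopsis above; they are not taken from the
.B mg
distribution. The collection name
.I alice
and the directory
.I /data/mg
are hypothetical, the numeric values are arbitrary, and the exact form
of
.I name
expected by
.B \-f
is an assumption.
.PP
Rebuild without skips:
.PP
.RS
.nf
mg_invf_rebuild \-0 \-d /data/mg \-f alice
.fi
.RE
.PP
Rebuild with a fixed skip that hops over 32 pointers:
.PP
.RS
.nf
mg_invf_rebuild \-1 \-k 32 \-d /data/mg \-f alice
.fi
.RE
.PP
Rebuild with per-term skips intended for 1000 accumulators, with a
minimum skip size of 4:
.PP
.RS
.nf
mg_invf_rebuild \-2 \-s 4 \-m 1000 \-d /data/mg \-f alice
.fi
.RE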
.SH ENVIRONMENT
.TP "\w'\fBMGDATA\fP'u+2n"
.SB MGDATA
If this environment variable exists, then its value is used as the
default directory where the
.BR mg (1)
collection files are. If this variable does not exist, then the
directory \*(lq\fB.\fP\*(rq is used by default. The command line
option
.BI \-d " directory"
overrides the directory in
.BR MGDATA .
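.PP
As a sketch of the precedence described above, assuming a Bourne-style
shell (the paths and the collection name
.I alice
are hypothetical): the first command below looks for the collection
files in
.IR /data/mg ,
while the second looks in
.I /tmp/mg
because
.B \-d
overrides
.BR MGDATA .
.PP
.RS
.nf
MGDATA=/data/mg mg_invf_rebuild \-0 \-f alice
MGDATA=/data/mg mg_invf_rebuild \-0 \-d /tmp/mg \-f alice
.fi
.RE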
.SH FILES
.TP 20
.B *.invf
Inverted file.
.TP
.B *.invf.ORG
Original inverted file.
.TP
.B *.invf.idx
The index into the inverted file.
.TP
.B *.invf.idx.ORG
The original index into the inverted file.
.TP
.B *.invf.dict.build
Compressed stemmed dictionary.
.SH "SEE ALSO"
.na
.BR mg (1),
.BR mg_compression_dict (1),
.BR mg_fast_comp_dict (1),
.BR mg_get (1),
.BR mg_invf_dict (1),
.BR mg_invf_dump (1),
.BR mg_passes (1),
.BR mg_perf_hash_build (1),
.BR mg_text_estimate (1),
.BR mg_weights_build (1),
.BR mgbilevel (1),
.BR mgbuild (1),
.BR mgdictlist (1),
.BR mgfelics (1),
.BR mgquery (1),
.BR mgstat (1),
.BR mgtic (1),
.BR mgticbuild (1),
.BR mgticdump (1),
.BR mgticprune (1),
.BR mgticstat (1).