source: gsdl/trunk/trunk/mg/src/scripts/mgmerge.1@ 16583

Last change on this file since 16583 was 16583, checked in by davidb, 16 years ago

Undoing change commited in r16582

  • Property svn:executable set to *
  • Property svn:keywords set to Author Date Id Revision
File size: 4.2 KB
.\"------------------------------------------------------------
.\" Id - set Rv,revision, and Dt, Date using rcs-Id tag.
.de Id
.ds Rv \\$3
.ds Dt \\$4
..
.Id $Id: mgmerge.1 16583 2008-07-29 10:20:36Z davidb $
.\"------------------------------------------------------------
.TH mgmerge 1 \*(Dt CITRI
.SH NAME
mgmerge \- update an mg system database with new documents
.SH SYNOPSIS
.B mgmerge
[
.B \-c
]
[
.BI \-g " get"
]
[
.BI \-s " source"
]
[
.B \-S
]
[
.B \-w
]
.I collection-name
.SH DESCRIPTION
.B mgmerge
is a
.B csh
script that executes all the appropriate programs in the correct order
to merge new documents into an existing
.BR mg (1)
system database, saving the need for a complete
database rebuild with
.BR mgbuild (1).
It does this by building a second
.BR mg (1)
system database from the new documents and then merging this database
with the old one.
This program makes use of the
.BR mg_get_merge (1)
script to obtain the text of the collection.
.B mgmerge
cannot edit or delete documents already in a database; only new documents
can be added.
.SH OPTIONS
Options can occur in any order, but the collection name must be last.
.TP "\w'\fIcollection-name\fP'u+2n"
.B \-c
This specifies whether the
.I get
program is \*(lqcomplex\*(rq. If a
.I get
program is \*(lqcomplex\*(rq, then it requires initialisation and
cleanup with the
.B \-i
and
.B \-c
options.
.TP
.BI \-g " get"
This specifies the program to use for getting the source text for the
build. If no
.B \-g
option is given, the default program
.BR mg_get_merge (1)
is used.
.TP
.BI \-s " source"
The
.B mgmerge
program consists of two parts. The first part initializes variables
to default values. The second part uses these variables to control
how the
.BR mg (1)
database is built. This option specifies a program to execute between
the first and second parts. The details of what the variables are, and
how they may be changed, are in comments in the
.B mgmerge
program.
If this option is used, the program should ideally be the same one that
was called by
.BR mgbuild (1)
to build the database, since some parameters need to be consistent between
.B mgbuild
and
.BR mgmerge .
.TP
.B \-S
This option will cause a slow merge to be performed on the inverted files,
where each inverted file entry is decoded and recoded.
The default is a fast merge. Accumulated fast merges slowly degrade
compression performance on the resulting inverted file, so
a periodic slow merge is recommended.
.TP
.B \-w
Adding new documents can affect the weights of the existing ones.
By default the weights for documents already in the collection are not
recomputed since the change in their values is usually small.
This option forces the weights to be recomputed.
Periodic use of this option, as with the
.B \-S
option, is recommended; otherwise query rankings may become inaccurate.
.TP
.I collection-name
This is the collection name, as required by the
.BR mg_get_merge (1)
program. It serves both as a
.I case
statement selector, and as the name of a subdirectory that holds the
indexing files.
.SH ENVIRONMENT
.TP "\w'\fBMGDATA\fP'u+2n"
.SB MGDATA
If this environment variable exists, then its value is used as the
default directory where the
.BR mg (1)
collection files are. If this variable does not exist, then the
directory \*(lq\fB.\fP\*(rq is used by default. The command line
option
.BI \-d " directory"
overrides the directory in
.BR MGDATA .
Note that a temporary directory under the
.B MGDATA
directory is used to perform the merge.
The default name for this directory is
.BR MERGE .
.SH FILES
.TP 20
.B *.invf
Inverted file.
.TP
.B *.invf.dict
Compressed stemmed dictionary.
.TP
.B *.invf.dict.blocked
The `on-disk' stemmed dictionary.
.TP
.B *.invf.idx
The index into the inverted file.
.TP
.B *.weight
The exact weights file.
.TP
.B *.text
Compressed documents.
.TP
.B *.text.stats
Text statistics.
.TP
.B *.text.dict
Compressed compression dictionary.
.TP
.B *.text.idx
Index into the compressed documents.
.TP
.B *.text.idx.wgt
Interleaved index into the compressed documents and document weights.
.TP
.B *.weight.approx
Approximate document weights.
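.SH EXAMPLES
The following invocations are illustrative sketches based only on the
options described above. The collection name
.I mycoll,
the program name
.I my_get
and the directory
.I /data/mg
are hypothetical placeholders, not names supplied with
.BR mg (1).
.PP
Merge new documents into an existing collection using the default
.BR mg_get_merge (1)
program and a fast merge:
.RS
.nf
mgmerge mycoll
.fi
.RE
.PP
Perform the periodic maintenance merge suggested under
.B \-S
and
.BR \-w ,
recoding the inverted file and recomputing the document weights:
.RS
.nf
mgmerge \-S \-w mycoll
.fi
.RE
.PP
Use a different, \*(lqcomplex\*(rq
.I get
program to obtain the source text:
.RS
.nf
mgmerge \-c \-g my_get mycoll
.fi
.RE
.PP
Assuming a
.B csh
login shell, point
.SB MGDATA
at the directory holding the collection files before merging:
.RS
.nf
setenv MGDATA /data/mg
mgmerge mycoll
.fi
.RE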
.SH "SEE ALSO"
.na
.BR mg (1),
.BR mgbuild (1),
.BR mg_get_merge (1),
.BR mg_get (1),
.BR mg_invf_merge (1),
.BR mg_text_merge (1),
.BR mg_query (1),
.BR mg_weights_build (1).