.\"------------------------------------------------------------
.\" Id - set Rv,revision, and Dt, Date using rcs-Id tag.
.de Id
.ds Rv \\$3
.ds Dt \\$4
..
.Id $Id: mgmerge.1 3745 2003-02-20 21:20:24Z mdewsnip $
.\"------------------------------------------------------------
.TH mgmerge 1 \*(Dt CITRI
.SH NAME
mgmerge \- update an mg system database with new documents
.SH SYNOPSIS
.B mgmerge
[
.B \-c
]
[
.BI \-g " get"
]
[
.BI \-s " source"
]
[
.B \-S
]
[
.B \-w
]
.I collection-name
.SH DESCRIPTION
.B mgmerge
is a
.B csh
script that executes all the appropriate programs in the correct order
to merge new documents into an existing
.BR mg (1)
system database, saving the need for a complete database rebuild with
.BR mgbuild (1).
It does this by building a second
.BR mg (1)
system database from the new documents and then merging this database
with the old one.  This program makes use of the
.BR mg_get_merge (1)
script to obtain the text of the collection.
.B mgmerge
cannot edit or delete documents already in a database; only new
documents can be added.
.SH OPTIONS
Options can occur in any order, but the collection name must be last.
.TP "\w'\fIcollection-name\fP'u+2n"
.B \-c
This specifies whether the
.I get
program is \*(lqcomplex\*(rq.  If a
.I get
program is \*(lqcomplex\*(rq, then it requires initialisation and cleanup
with the
.B \-i
and
.B \-c
options.
.TP
.BI \-g " get"
This specifies the program to use for getting the source text for the
build.  If no
.B \-g
option is given, the default program
.BR mg_get_merge (1)
is used.
.TP
.BI \-s " source"
The
.B mgmerge
program consists of two parts.  The first part initializes variables to
default values.  The second part uses these variables to control how the
.BR mg (1)
database is built.  This option specifies a program to execute between
the first and second parts.  The details of what the variables are, and
how they may be changed, are in comments in the
.B mgmerge
program.
If this option is used, the program should ideally be the same one that
was called by
.BR mgbuild (1)
to build the database, since some parameters need to be consistent
between
.B mgbuild
and
.BR mgmerge .
.TP
.B \-S
This option causes a slow merge to be performed on the inverted files,
in which each inverted file entry is decoded and recoded.  The default
is a fast merge.  Accumulated fast merges slowly degrade compression
performance on the resulting inverted file, so a periodic slow merge is
recommended.
.TP
.B \-w
Adding new documents can affect the weights of the documents already in
the collection.  By default these weights are not recomputed, since the
change in their values is usually small.  This option forces the weights
to be recomputed.  Periodic use of this option, as for the
.B \-S
option, is recommended; otherwise query rankings may become inaccurate.
.TP
.I collection-name
This is the collection name, as required by the
.BR mg_get_merge (1)
program.  It serves both as a
.I case
statement selector, and as the name of a subdirectory that holds the
indexing files.
.SH ENVIRONMENT
.TP "\w'\fBMGDATA\fP'u+2n"
.SB MGDATA
If this environment variable exists, then its value is used as the
default directory where the
.BR mg (1)
collection files are.  If this variable does not exist, then the
directory \*(lq\fB.\fP\*(rq is used by default.  The command line option
.BI \-d " directory"
overrides the directory in
.BR MGDATA .
Note that a temporary directory under the
.B MGDATA
directory is used to perform the merge.  The default name for this
directory is
.BR MERGE .
.SH FILES
.TP 20
.B *.invf
Inverted file.
.TP
.B *.invf.dict
Compressed stemmed dictionary.
.TP
.B *.invf.dict.blocked
The `on-disk' stemmed dictionary.
.TP
.B *.invf.idx
The index into the inverted file.
.TP
.B *.weight
The exact weights file.
.TP
.B *.text
Compressed documents.
.TP
.B *.text.stats
Text statistics.
.TP
.B *.text.dict
Compressed compression dictionary.
.TP
.B *.text.idx
Index into the compressed documents.
.TP
.B *.text.idx.wgt
Interleaved index into the compressed documents and document weights.
.TP
.B *.weight.approx
Approximate document weights.
.SH "SEE ALSO"
.na
.BR mg (1),
.BR mgbuild (1),
.BR mg_get_merge (1),
.BR mg_get (1),
.BR mg_invf_merge (1),
.BR mg_text_merge (1),
.BR mg_query (1),
.BR mg_weights_build (1).
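.SH EXAMPLES
The periodic use of
.B \-S
and
.B \-w
recommended under OPTIONS could be scheduled by a small wrapper.  The
following is only a sketch in Bourne shell: the helper names
.I merge_flags
and
.IR merge_command ,
the interval of 10, and the collection name
.I mycollection
are hypothetical and are not part of the
.BR mg (1)
distribution.  It prints the command line rather than running it.
.PP
.nf
```shell
# Hypothetical helper: choose mgmerge flags for the Nth incremental merge.
# The interval of 10 is an assumption, not a value taken from mg.
SLOW_EVERY=10

merge_flags() {
    count=$1
    if [ $((count % SLOW_EVERY)) -eq 0 ]; then
        # Every SLOW_EVERY-th merge: slow merge and recomputed weights.
        echo "-S -w"
    fi
    # Otherwise print nothing and rely on the default fast merge.
}

# Print the command that would be run for a given merge number.
# Options come first; the collection name must be last.
merge_command() {
    count=$1
    collection=$2
    flags=$(merge_flags "$count")
    if [ -n "$flags" ]; then
        echo "mgmerge $flags $collection"
    else
        echo "mgmerge $collection"
    fi
}

merge_command 3 mycollection
merge_command 10 mycollection
```
.fi
.PP
With these assumptions, merge number 3 yields a plain
.B mgmerge
invocation and merge number 10 adds
.B \-S \-w
before the collection name.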