source: trunk/indexers/mg/src/scripts/mgbuild.1@ 3745

Last change on this file since 3745 was 3745, checked in by mdewsnip, 21 years ago

Addition of MG package for search and retrieval

.\"------------------------------------------------------------
.\" Id - set Rv,revision, and Dt, Date using rcs-Id tag.
.de Id
.ds Rv \\$3
.ds Dt \\$4
..
.Id $Id: mgbuild.1 3745 2003-02-20 21:20:24Z mdewsnip $
.\"------------------------------------------------------------
.TH mgbuild 1 \*(Dt CITRI
.SH NAME
mgbuild \- build an mg system database
.SH SYNOPSIS
.B mgbuild
[
.B \-c
]
[
.BI \-g " get"
]
[
.BI \-s " source"
]
[
.BI \-d " mgdata dir"
]
.I collection-name
.SH DESCRIPTION
.B mgbuild
is a
.B csh
script that executes all the appropriate programs in the correct order
to completely build an
.BR mg (1)
system database ready for queries to be made
by
.BR mgquery (1).
This program makes use of the
.BR mg_get (1)
script to obtain the text of the collection.
.SH OPTIONS
Options can occur in any order, but the collection name must be last.
.TP "\w'\fIcollection-name\fP'u+2n"
.BI \-c
This specifies whether the
.I get
program is \*(lqcomplex\*(rq. If a
.I get
program is \*(lqcomplex\*(rq, then it requires initialisation and
cleanup with the
.B \-i
and
.B \-c
options.
.TP
.BI \-g " get"
This specifies the program to use for getting the source text for the
build. If no
.B \-g
option is given, the default program
.BR mg_get (1)
is used.
.TP
.BI \-s " source"
The
.B mgbuild
program consists of two parts. The first part initializes variables
to default values. The second part uses these variables to control
how the
.BR mg (1)
database is built. This option specifies a program to execute between
the first and second parts. The details of what the variables are, and
how they may be changed, are in comments in the
.B mgbuild
program.
.TP
.BI \-d " mgdata dir"
Setting this option allows one to override the MGDATA environment variable.
.TP
.I collection-name
This is the collection name, as required by the
.BR mg_get (1)
program. It serves both as a
.I case
statement selector, and as the name of a subdirectory that holds the
indexing files.
.SH ENVIRONMENT
.TP "\w'\fBMGDATA\fP'u+2n"
.SB MGDATA
If this environment variable exists, then its value is used as the
default directory where the
.BR mg (1)
collection files are. If this variable does not exist, then the
directory \*(lq\fB.\fP\*(rq is used by default. The command line
option
.BI \-d " directory"
overrides the directory in
.BR MGDATA .
.SH FILES
.TP 20
.B *.invf
Inverted file.
.TP
.B *.invf.chunk
Inverted file chunk descriptor file. When the inverted file is
created, it is written in chunks that use no more than a set amount of
memory. This file describes those chunks.
.TP
.B *.invf.chunk.trans
Word-occurrence-order to lexical-order translation file. The
.B *.invf.chunk
file is written in word-occurrence order but is required by
.B \-N2
to be in lexical order.
.TP
.B *.invf.dict
Compressed stemmed dictionary.
.TP
.B *.invf.dict.blocked
The `on-disk' stemmed dictionary.
.TP
.B *.invf.dict.hash
Data for an order-preserving perfect hash function.
.TP
.B *.invf.idx
The index into the inverted file.
.TP
.B *.weight
The exact weights file.
.TP
.B *.text
Compressed documents.
.TP
.B *.text.stats
Text statistics.
.TP
.B *.text.dict
Compressed compression dictionary.
.TP
.B *.text.idx
Index into the compressed documents.
.TP
.B *.text.idx.wgt
Interleaved index into the compressed documents and document weights.
.TP
.B *.weight.approx
Approximate document weights.
.SH "SEE ALSO"
.na
.BR mg (1),
.BR mg_compression_dict (1),
.BR mg_fast_comp_dict (1),
.BR mg_get (1),
.BR mg_invf_dict (1),
.BR mg_invf_dump (1),
.BR mg_invf_rebuild (1),
.BR mg_passes (1),
.BR mg_perf_hash_build (1),
.BR mg_text_estimate (1),
.BR mg_weights_build (1),
.BR mgbilevel (1),
.BR mgdictlist (1),
.BR mgfelics (1),
.BR mgquery (1),
.BR mgstat (1),
.BR mgtic (1),
.BR mgticbuild (1),
.BR mgticdump (1),
.BR mgticprune (1),
.BR mgticstat (1).
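
The ENVIRONMENT section of the manual page describes a three-step precedence for locating the collection files: a directory given with -d wins, otherwise the value of MGDATA is used if the variable exists, otherwise the current directory. The sketch below restates that rule in POSIX sh so it can be checked in isolation. It is an illustration only and is not taken from the mgbuild csh script; the function name resolve_mgdata and the convention of passing the -d value as its first argument are assumptions made for this example.

```shell
#!/bin/sh
# Illustrative sketch only: restates the documented precedence
# (-d, then MGDATA, then "."). Not the code used by mgbuild itself.
# resolve_mgdata is a hypothetical helper name.
resolve_mgdata() {
    # $1: directory supplied with -d, or empty when -d was not given
    if [ -n "$1" ]; then
        # A -d argument overrides the environment.
        printf '%s\n' "$1"
    elif [ -n "${MGDATA+set}" ]; then
        # MGDATA exists in the environment, so use its value as-is.
        printf '%s\n' "$MGDATA"
    else
        # Neither was supplied: fall back to the current directory.
        printf '%s\n' "."
    fi
}
```

Under this rule, an invocation such as "mgbuild -d /data/mg alice" would place the index files under /data/mg regardless of MGDATA, while "mgbuild alice" would use MGDATA when it is set and "." otherwise. The path /data/mg and the collection name alice are placeholders, not values from the manual page.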