.\"------------------------------------------------------------
.\" Id - set Rv,revision, and Dt, Date using rcs-Id tag.
.de Id
.ds Rv \\$3
.ds Dt \\$4
..
.Id $Id: mgintro++.1 3745 2003-02-20 21:20:24Z mdewsnip $
.\"------------------------------------------------------------
.ds r \&\s-1MG\s0
.if n .ds - \%--
.if t .ds - \(em
.\"------------------------------------------------------------
.am SS
.LP
..
.\"------------------------------------------------------------
.TH MGINTRO++ 1 \*(Dt CITRI
.\"--------------------------------------------------------------
.SH NAME
mgintro++ \- extended introduction to the MG system
.\"-------------------------------------------------------------
.SH DESCRIPTION
This manual assumes the reader has already read
.BR mgintro (1).
.\"-------------------------------------------------------------
.SS Creating Different Databases
To build a database other than one of the predefined ones,
such as "alice", "davinci", "mailfiles" and "allfiles",
the user has two choices.
Either way, the input to the build must be text in which
each document is terminated by a control-B.
The text can be supplied as one or more files in that form,
or produced by a "get" command (typically a script or a C program).
.\"-------------------------------------------------------------
.SS Using Input Files for mgbuild
If you don't want to write a "get" script and just want to use
one or more text files as input, you must first put a control-B
at the end of each document in those files.
For a simple example, take any text files, say "test1.txt" and
"test2.txt", and use
.BR vi (1)
to insert the control-Bs by typing control-V followed by control-B.
Next, create a file of "set" statements
in the following form:
.IP
\fBset pipe = 0 # do not use pipe - use file instead\fP
.br
\fBset input_files = 'test1.txt test2.txt'\fP
.LP
Call this file "build_options".
Now issue the command:
.IP
.B mgbuild -s build_options test
.LP
This should build a database called "test" in the $MGDATA directory,
based on the source data in "test1.txt" and "test2.txt".
The build_options file is simply sourced by
.BR mgbuild (1)
after it has set up its variables,
so any setting made in the build_options file
overrides the standard setting.
See
.BR mgbuild (1)
for more information.
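.PP
The preparation described above can also be scripted instead of done by hand in
.BR vi (1).
The following is only a sketch: the document file names (doc1.txt, doc2.txt),
the temporary directory and the ctrl_b helper are hypothetical, and the
mgbuild call is left as a comment so the script runs without MG installed.
It writes a control-B after each document and creates a build_options file
using the two settings shown above.
.PP
.nf
```sh
#!/bin/sh
# Sketch: build an input file whose documents each end in a control-B
# (byte value 2), then write the options file for mgbuild.
# All file names here are hypothetical.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Two tiny stand-in documents.
echo "first document" > doc1.txt
echo "second document" > doc2.txt

# Emit a single control-B byte.
ctrl_b() { awk 'BEGIN { printf "%c", 2 }'; }

# Append a control-B after every document.
for f in doc1.txt doc2.txt; do
    cat "$f"
    ctrl_b
done > test1.txt

# Options file that mgbuild sources (settings taken from the text above).
cat > build_options <<EOF
set pipe = 0
set input_files = 'test1.txt'
EOF

# Count the terminators; there should be one per document.
count=$(tr -cd "$(ctrl_b)" < test1.txt | wc -c | tr -d ' ')
echo "documents: $count"

# With MG installed, the build would then be started with:
#   mgbuild -s build_options test
```
.fi
.PP
Counting the control-Bs before building is a cheap check that the number
of terminators matches the number of documents you expect.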
.\"-------------------------------------------------------------
.SS Writing A Get Program
Instead of using files as input, it is often more convenient to
write a "get" program. This program is called by
.BR mgbuild (1)
to get the text data with control-Bs as document terminators.
It should accept three options:
.br
(i) -init; (ii) -text; (iii) -cleanup.
.br
.BR mgbuild (1)
calls the get program with -init first and with -cleanup at the end.
In between, it calls the get program with -text whenever it wants
the text, and the get program should then write the text to stdout.
.br
See
.BR mg_get (1)
for an example.
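.PP
As a rough illustration of the three options, here is a minimal "get"
program written as a shell function.
It is a sketch, not the program shipped with MG: the function name, the
two sample documents and the assumption that the option arrives as the
last argument are all hypothetical, and the exact arguments passed by
.BR mgbuild (1)
may differ (see
.BR mg_get (1)).
.PP
.nf
```sh
#!/bin/sh
# Hypothetical "get" program, written as a function so it can be tried out
# directly. Assumption: the option is the last argument on the command line.
get() {
    opt=
    for arg in "$@"; do opt=$arg; done   # keep only the last argument

    case "$opt" in
    -init)
        : # nothing to prepare in this sketch
        ;;
    -text)
        # Write each document to stdout, followed by a control-B (byte 2).
        for doc in "first document" "second document"; do
            echo "$doc"
            awk 'BEGIN { printf "%c", 2 }'
        done
        ;;
    -cleanup)
        : # nothing to remove in this sketch
        ;;
    *)
        echo "usage: get [-init | -text | -cleanup]" >&2
        return 1
        ;;
    esac
}

# The calling sequence described above: init, then text, then cleanup.
get -init
get -text | wc -c
get -cleanup
```
.fi
.PP
A real get program would typically use -init to create any temporary
files it needs and -cleanup to remove them again.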
.\"-------------------------------------------------------------
.SS Regular Builds
The MG system provides a static database;
there are no update commands.
To keep a database reasonably up to date,
it can be rebuilt automatically on a regular basis by
.BR cron (1).
A crontab file can be created using:
.IP
.B crontab -e
.LP
A crontab file contains lines of the form:
.nf
.IP
\fBminute hour day-of-month month day-of-week shell-command\fP
.LP
.fi
See
.BR crontab (1)
for more information.
An example crontab entry is:
.nf
.IP
\fB15 02 * * * mgbuild allfiles >$MGDATA/allfiles/allfiles.log 2>&1\fP
.LP
.fi
This rebuilds the mg database for "allfiles", your mail in
the folders, every morning at 2:15am.
.\"
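.PP
To make the field layout of a crontab line concrete, the sketch below
splits the example entry into its five time fields and the shell command.
The variable names are hypothetical, and the script only inspects the
entry; it does not install it or run mgbuild.
.PP
.nf
```sh
#!/bin/sh
# Sketch: split the example crontab entry into its six fields.
entry='15 02 * * * mgbuild allfiles >$MGDATA/allfiles/allfiles.log 2>&1'

set -f          # keep the * fields literal instead of expanding them
set -- $entry   # split the entry on whitespace
set +f

minute=$1 hour=$2 day_of_month=$3 month=$4 day_of_week=$5
shift 5
cmd=$*          # everything after the fifth field is the shell command

echo "time:    $hour:$minute"
echo "days:    day-of-month=$day_of_month month=$month day-of-week=$day_of_week"
echo "command: $cmd"
```
.fi
.PP
A "*" in a field means "every", so this entry runs at the given time
every day of every month.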
.\"-------------------------------------------------------------
.SS Command Structure
There are 22 commands that make up the mg system. However,
a user may need to be aware of only a few:
.BR mgbuild (1),
.BR mgquery (1),
and perhaps
.BR mg_get (1).
Many of the commands are called by
.BR mgbuild (1).
The commands can be broken up into a hierarchy.
.PP
.nf
--------------------------------------
MG--+--image compression
    |    |
    |    +--mgbilevel
    |    |
    |    +--mgfelics
    |    |
    |    +--mgtic
    |    |
    |    +--mgticbuild
    |    |
    |    +--mgticdump
    |    |
    |    +--mgticprune
    |    |
    |    +--mgticstat
    |
    +--text
         |
         +--compression
         |    |
         |    +--mg_passes -T1
         |    |
         |    +--mg_passes -T2
         |    |
         |    +--mg_compression_dict
         |    |
         |    +--mg_fast_comp_dict
         |
         +--indexing
         |    |
         |    +--mg_passes -N1
         |    |
         |    +--mg_passes -N2
         |    |
         |    +--mg_perf_hash_build
         |    |
         |    +--mg_invf_dict
         |    |
         |    +--mg_invf_rebuild
         |
         +--weights
         |    |
         |    +--mg_weights_build
         |
         +--query
         |    |
         |    +--mgquery
         |
         +--tools
              |
              +--mg_invf_dump
              |
              +--mg_text_estimate
              |
              +--mgdictlist
              |
              +--mgstat
--------------------------------------
.fi
.PP
.BR mgbuild (1)
calls the following commands:
.RS
.BR mg_passes (1),
.BR mg_compression_dict (1)
.br
.BR mg_perf_hash_build (1),
.BR mg_invf_dict (1),
.BR mg_invf_rebuild (1)
.br
.BR mg_weights_build (1)
.RE
.\"--------------------------------------------
.SH SEE ALSO
.BR mgintro (1),
.BR mgbuild (1),
.BR mg_get (1)
.br
"Guide To The \*r System", in Appendix A of the book:
.PP
.RS
.nf
Ian H. Witten, Alistair Moffat, and Timothy C. Bell
.I "Managing Gigabytes: Compressing and Indexing Documents and Images"
Van Nostrand Reinhold
1994
xiv + 429 pages
US$54.95
ISBN 0-442-01863-0
Library of Congress catalog number TA1637 .W58 1994.
.fi
.RE