1 | .\"------------------------------------------------------------
|
---|
2 | .\" Id - set Rv,revision, and Dt, Date using rcs-Id tag.
|
---|
3 | .de Id
|
---|
4 | .ds Rv \\$3
|
---|
5 | .ds Dt \\$4
|
---|
6 | ..
|
---|
7 | .Id $Id: mgquery.1 3745 2003-02-20 21:20:24Z mdewsnip $
|
---|
8 | .\"------------------------------------------------------------
|
---|
9 | .TH mgquery 1 \*(Dt CITRI
|
---|
10 | .SH NAME
|
---|
11 | mgquery \- query program for the mg system
|
---|
12 | .SH SYNOPSIS
|
---|
13 | .B mgquery
|
---|
14 | [
|
---|
15 | .B \-h
|
---|
16 | ]
|
---|
17 | [
|
---|
18 | .B \-D
|
---|
19 | ]
|
---|
20 | [
|
---|
21 | .BI \-f " name"
|
---|
22 | ]
|
---|
23 | [
|
---|
24 | .BI \-d " directory"
|
---|
25 | ]
|
---|
26 | .if n .ti +9n
|
---|
27 | [
|
---|
28 | .I collection-name
|
---|
29 | ]
|
---|
30 | .SH DESCRIPTION
|
---|
31 | .B mgquery
|
---|
32 | enables users to make Boolean or ranked queries on a database
|
---|
33 | generated by the
|
---|
34 | .BR mg (1)
|
---|
35 | system. It accepts queries from
|
---|
36 | .I stdin
|
---|
37 | and sends the retrieved documents to
|
---|
38 | .IR stdout .
|
---|
39 | Information on the resource usage of
|
---|
40 | .B mgquery
|
---|
41 | as it processes queries can be obtained interactively.
|
---|
42 | .SH OPTIONS
|
---|
43 | Options may appear in any order, but the
|
---|
44 | .IR collection-name ,
|
---|
45 | if specified, must be last.
|
---|
46 | .TP "\w'\fB\-d\fP \fIdirectory\fP'u+2n"
|
---|
47 | .B \-h
|
---|
48 | This displays a usage line on
|
---|
49 | .IR stderr .
|
---|
50 | .TP
|
---|
51 | .B \-D
|
---|
52 | This option causes the entire text to be decompressed and sent to
|
---|
53 | .IR stdout .
|
---|
54 | .TP
|
---|
55 | .BI \-f " name"
|
---|
56 | This specifies the base name of the document collection that will be
|
---|
57 | used. If a collection with the specified base
|
---|
58 | .I name
|
---|
59 | does not exist, an error message will be displayed and
|
---|
60 | .B mgquery
|
---|
61 | will exit.
|
---|
62 | .TP
|
---|
63 | .BI \-d " directory"
|
---|
64 | This specifies the directory where the document collection can be found.
|
---|
65 | .SH USAGE
|
---|
66 | Prior to processing the command line arguments, the
|
---|
67 | .B mgquery
|
---|
68 | program attempts to read in a startup script called
|
---|
69 | .IR ./.mgrc .
|
---|
70 | If that fails, it attempts to read in the file
|
---|
71 | .IR $HOME/.mgrc .
|
---|
72 | The startup file can only contain commands\(emno queries are
|
---|
73 | permitted in the
|
---|
74 | .I .mgrc
|
---|
75 | file. Lines starting with \*(lq\fB#\fP\*(rq in the file are comments.
|
---|
76 | The most common use for the
|
---|
77 | .I .mgrc
|
---|
78 | file is to personalise the initial values of the predefined parameters
|
---|
79 | with
|
---|
80 | .B .set
|
---|
81 | commands.
|
---|
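.LP
As an illustration only, a startup file along the following lines would
select ranked queries, limit the number of documents shown, and turn on
timing statistics. The values are arbitrary examples rather than
recommended settings.
.IP
.nf
# example .mgrc (illustrative values)
\&.set query ranked
\&.set maxdocs 10
\&.set timestats on
.fi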
82 | .LP
|
---|
83 | The input to
|
---|
84 | .B mgquery
|
---|
85 | consists of a series of input lines. The backslash
|
---|
86 | character
|
---|
87 | .RB (\*(lq \e \*(rq)
|
---|
88 | is used at the end of lines to indicate
|
---|
89 | that input continues on the next line.
|
---|
90 | .LP
|
---|
91 | Input lines on which the first character is a dot
|
---|
92 | .RB (\*(lq . \*(rq)
|
---|
93 | are commands to the
|
---|
94 | .B mgquery
|
---|
95 | program. Input lines that do not start with a dot are queries.
|
---|
96 | .LP
|
---|
97 | A query consists of two parts. One part is a Boolean or ranked query
|
---|
98 | that identifies documents. The second part is a post-processing
|
---|
99 | pattern matching operation. Any text between the first speech mark
|
---|
100 | (\*(lq) and the last speech mark (\*(rq) is considered to be a
|
---|
101 | post-processing pattern.
|
---|
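.LP
As a sketch, with the default Boolean query type, an input line such as
the following would select the documents that contain both terms and
then treat the quoted text as the post-processing pattern. The terms
and the pattern are made-up examples, and the speech marks are assumed
to be typed as ordinary ASCII double quotes.
.IP
.nf
gold & silver "and.*the"
.fi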
102 | .SH COMMANDS
|
---|
103 | The
|
---|
104 | .B mgquery
|
---|
105 | program can accept the following commands.
|
---|
106 | .TP 17
|
---|
107 | .B .help
|
---|
108 | Display several pages of help text.
|
---|
109 | .TP
|
---|
110 | .B .quit
|
---|
111 | Quit the program.
|
---|
112 | .TP
|
---|
113 | .B .warranty
|
---|
114 | Display the
|
---|
115 | .BR mg (1)
|
---|
116 | warranty.
|
---|
117 | .TP
|
---|
118 | .B .conditions
|
---|
119 | Display the conditions of use and distribution of
|
---|
120 | .BR mg (1).
|
---|
121 | .TP
|
---|
122 | .BI ".set " "name value"
|
---|
123 | Set the parameter
|
---|
124 | .I name
|
---|
125 | to the specified
|
---|
126 | .IR value .
|
---|
127 | If the parameter is a Boolean
|
---|
128 | .I value
|
---|
129 | and the
|
---|
130 | .I value
|
---|
131 | is omitted, the parameter will be inverted (i.e., if it was
|
---|
132 | .IR true ,
|
---|
133 | then it will change to
|
---|
134 | .IR false ;
|
---|
135 | if it was
|
---|
136 | .IR false ,
|
---|
137 | then it will change to
|
---|
138 | .IR true ).
|
---|
139 | .TP
|
---|
140 | .BI ".unset " name
|
---|
141 | Delete the parameter
|
---|
142 | .I name
|
---|
143 | from the currently-defined parameters.
|
---|
144 | .TP
|
---|
145 | .B .reset
|
---|
146 | Reset the parameters to the state that they had after the processing
|
---|
147 | of the
|
---|
148 | .B mgquery
|
---|
149 | command line.
|
---|
150 | .TP
|
---|
151 | .B .display
|
---|
152 | Display the values of all the currently-defined parameters.
|
---|
153 | .TP
|
---|
154 | .B .push
|
---|
155 | Push the currently-defined parameters onto a stack.
|
---|
156 | .TP
|
---|
157 | .B .pop
|
---|
158 | Pop a set of parameters off the stack, replacing the currently-defined
|
---|
159 | ones.
|
---|
160 | .TP
|
---|
161 | .BI ".output " arg
|
---|
162 | This is used to specify where to send the text of the documents. Once
|
---|
163 | the
|
---|
164 | .B .output
|
---|
165 | command is specified, all subsequent output will be sent to the place
|
---|
166 | specified by
|
---|
167 | .IR arg .
|
---|
168 | If
|
---|
169 | .I arg
|
---|
170 | is not specified, subsequent output will be directed to
|
---|
171 | .IR stdout .
|
---|
172 | .I Arg
|
---|
173 | may be any of the following.
|
---|
174 | .RS
|
---|
175 | .TP 13
|
---|
176 | .BI "> " filename
|
---|
177 | Send output to the specified file.
|
---|
178 | .TP
|
---|
179 | .BI ">> " filename
|
---|
180 | Append output to the specified file.
|
---|
181 | .TP
|
---|
182 | .BI "| " command
|
---|
183 | Pipe the output to
|
---|
184 | .IR command ,
|
---|
185 | which is executed by
|
---|
186 | .IR sh .
|
---|
187 | .RE
|
---|
188 | .TP
|
---|
189 | .BI ".input " arg
|
---|
190 | This is used to specify where input (queries and commands) comes
|
---|
191 | from. Once the
|
---|
192 | .B .input
|
---|
193 | command is specified, all subsequent input will come from the place
|
---|
194 | specified by
|
---|
195 | .IR arg .
|
---|
196 | If
|
---|
197 | .I arg
|
---|
198 | is not specified, subsequent input will come from
|
---|
199 | .IR stdin .
|
---|
200 | .RS
|
---|
201 | .TP 13
|
---|
202 | .BI "< " filename
|
---|
203 | Get input from the specified file.
|
---|
204 | .TP
|
---|
205 | .BI "| " command
|
---|
206 | The input comes from the standard output of
|
---|
207 | .IR command ,
|
---|
208 | which is executed by
|
---|
209 | .IR sh .
|
---|
210 | .RE
|
---|
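.LP
The commands can be combined within a session. The following sketch
saves the current parameters, writes the document numbers matching one
Boolean query to a file, and then restores both the output destination
and the parameters. The file name and the query terms are hypothetical.
.IP
.nf
\&.push
\&.set mode docnums
\&.output > matches.txt
gold & silver
\&.output
\&.pop
.fi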
211 | .SH PARAMETERS
|
---|
212 | The following parameters are predefined and have special
|
---|
213 | significance. Each parameter will be followed by its default
|
---|
214 | value. Parameters are initialised before the
|
---|
215 | .I .mgrc
|
---|
216 | file is read or the command line arguments are processed.
|
---|
217 | .TP 17
|
---|
218 | .BI accumulator_method " `array'"
|
---|
219 | This parameter is used during ranking, and specifies how the
|
---|
220 | weight for each document should be accumulated. The following
|
---|
221 | methods are available:
|
---|
222 | .IR array ,
|
---|
223 | .IR splay_tree ,
|
---|
224 | .IR hash_table ,
|
---|
225 | and
|
---|
226 | .IR list .
|
---|
227 | .TP
|
---|
228 | .BI briefstats " `off'"
|
---|
229 | This is a Boolean parameter that determines whether the
|
---|
230 | totals for disk, memory and time usage statistics will be
|
---|
231 | displayed at the end of each query.
|
---|
232 | .IR Note :
|
---|
233 | this takes precedence over the parameters
|
---|
234 | .BR diskstats ,
|
---|
235 | .BR memstats " and " timestats .
|
---|
236 | This parameter may take the values
|
---|
237 | .IR yes ", " no ", "
|
---|
238 | .IR true ", " false ", "
|
---|
239 | .IR on " or " off .
|
---|
240 | .TP
|
---|
241 | .BI buffer " `1048576'"
|
---|
242 | When the documents are being read in, they are read into a
|
---|
243 | buffer of this size and then displayed from this buffer. If
|
---|
244 | the documents are larger than this buffer, the buffer is
|
---|
245 | expanded automatically. Having a large buffer gives a very
|
---|
246 | slight performance improvement, because it allows the order of
|
---|
247 | disk operations to be optimised. The buffer size is measured
|
---|
248 | in bytes.
|
---|
249 | .TP
|
---|
250 | .BI diskstats " `off'"
|
---|
251 | This is a Boolean parameter that determines whether the disk
|
---|
252 | usage statistics for the preceding query will be displayed
|
---|
253 | after each query. This parameter may take the values
|
---|
254 | .IR yes ", " no ", "
|
---|
255 | .IR true ", " false ", "
|
---|
256 | .IR on " or " off .
|
---|
257 | .TP
|
---|
258 | .BI doc_sepstr " `---------------------------------- %n\en'"
|
---|
259 | This specifies the string that will be used to separate
|
---|
260 | documents when they are displayed for `Boolean' or `docnums'
|
---|
261 | queries. The standard C escape character sequences
|
---|
262 | may be used to place special characters in the
|
---|
263 | string. For example, a newline would be `\en'. To include a `%',
|
---|
264 | use the sequence `%%'. To include the
|
---|
265 | .BR mg (1)
|
---|
266 | document number, use the sequence `%n'. The following escape character
|
---|
267 | sequences are available:
|
---|
268 | .nf
|
---|
269 | .ta 1.7iL
|
---|
270 | .B Sequence	Meaning
|
---|
271 | `\e\e'	backslash
|
---|
272 | `\eb'	backspace
|
---|
273 | `\ef'	formfeed
|
---|
274 | `\en'	newline
|
---|
275 | `\er'	carriage return
|
---|
276 | `\et'	tab
|
---|
277 | `\e"'	speech marks
|
---|
278 | `\e''	quote mark
|
---|
279 | `\ex\fIhh\fP'	ASCII code in hexadecimal
|
---|
280 | `\ennn'	ASCII code in octal
|
---|
281 | .fi
|
---|
282 | .TP
|
---|
283 | .BI expert " `false'"
|
---|
284 | If this is
|
---|
285 | .IR true ,
|
---|
286 | then much of the dialogue output is suppressed. This parameter may
|
---|
287 | take the values
|
---|
288 | .IR yes ", " no ", "
|
---|
289 | .IR true ", " false ", "
|
---|
290 | .IR on " or " off .
|
---|
291 | .TP
|
---|
292 | .BI hash_tbl_size " `1000'"
|
---|
293 | One of the options during ranking queries is to use a hash
|
---|
294 | table to accumulate the weights for each document. The hash
|
---|
295 | table is a simple chained type. This parameter specifies the
|
---|
296 | size of the hash table and may take any value between 8 and
|
---|
297 | 268435456 (2^28).
|
---|
298 | .TP
|
---|
299 | .BI heads_length " `50'"
|
---|
300 | When the mode is
|
---|
301 | .BR heads ,
|
---|
302 | this specifies the number of characters that will be output for each
|
---|
303 | document.
|
---|
304 | .TP
|
---|
305 | .BI maxdocs " `all'"
|
---|
306 | The maximum number of documents to display in response to a
|
---|
307 | query. This parameter may take on a numeric value between 1
|
---|
308 | and 4294967295 (2^32 - 1) or the word
|
---|
309 | .IR all .
|
---|
310 | .TP
|
---|
311 | .BI maxparas " `1000'"
|
---|
312 | The maximum number of paragraphs to identify during a ranked
|
---|
313 | query with paragraph indexing. After the paragraphs have been
|
---|
314 | identified, the paragraphs are converted into documents, and
|
---|
315 | because some of the paragraphs may refer to the same documents
|
---|
316 | the final number of answers may be less than
|
---|
317 | .BR maxparas .
|
---|
318 | The
|
---|
319 | .B maxdocs
|
---|
320 | parameter will then be applied. This parameter may take on a numeric
|
---|
321 | value between 1 and 4294967295 (2^32 - 1).
|
---|
322 | .TP
|
---|
323 | .BI max_accumulators " `50000'"
|
---|
324 | This parameter limits the number of different paragraph and
|
---|
325 | document numbers to be accumulated during ranked queries when
|
---|
326 | the parameter
|
---|
327 | .B accumulator_method
|
---|
328 | is set to
|
---|
329 | .IR splay_tree ,
|
---|
330 | .IR hash_table ,
|
---|
331 | or
|
---|
332 | .IR list .
|
---|
333 | This parameter may take any value between 8 and 268435456 (2^28).
|
---|
334 | .TP
|
---|
335 | .BI max_terms " `all'"
|
---|
336 | This parameter limits the number of terms that will actually
|
---|
337 | be used during a ranked query. If more terms than the number
|
---|
338 | specified by
|
---|
339 | .B max_terms
|
---|
340 | are entered, then the extra terms will be discarded. If
|
---|
341 | .B sorted_terms
|
---|
342 | is on, then the limiting will be done after the terms have been
|
---|
343 | sorted. This parameter may take any value between 1 and 4294967295
|
---|
344 | (2^32 - 1), or the word
|
---|
345 | .IR all .
|
---|
346 | .TP
|
---|
347 | .BI memstats " `off'"
|
---|
348 | This is a Boolean parameter that determines whether the memory
|
---|
349 | usage statistics for the preceding query will be displayed
|
---|
350 | after each query. This parameter may take the values
|
---|
351 | .IR yes ", " no ", "
|
---|
352 | .IR true ", " false ", "
|
---|
353 | .IR on " or " off .
|
---|
354 | .TP
|
---|
355 | .BI mgdir " `.'"
|
---|
356 | This is set to the directory where the
|
---|
357 | .BR mg (1)
|
---|
358 | data files may be found. If
|
---|
359 | the environment variable
|
---|
360 | .B MGDATA
|
---|
361 | exists, then this is instead initialised to the value of
|
---|
362 | .BR MGDATA .
|
---|
363 | The value of this parameter may be changed, either in the
|
---|
364 | .I .mgrc
|
---|
365 | file with a
|
---|
366 | .BI ".set mgdir " directory
|
---|
367 | command, or from the command line using the
|
---|
368 | .BI \-d " directory"
|
---|
369 | option. Once the \*(lq\fB>\fP\*(rq prompt appears, changing this
|
---|
370 | parameter will have no effect.
|
---|
371 | .TP
|
---|
372 | .BI mgname " `bible'"
|
---|
373 | This is set to the name of the
|
---|
374 | .BR mg (1)
|
---|
375 | collection that is to be used for the session. The value of this
|
---|
376 | parameter may be changed, either in the
|
---|
377 | .I .mgrc
|
---|
378 | file with a
|
---|
379 | .BI ".set mgname " name
|
---|
380 | command, or from the command line using the
|
---|
381 | .BI \-f " name"
|
---|
382 | option. Once the \*(lq\fB>\fP\*(rq prompt appears, changing this
|
---|
383 | parameter will have no effect.
|
---|
384 | .TP
|
---|
385 | .BI mode " `text'"
|
---|
386 | This specifies how documents should be displayed when they
|
---|
387 | are retrieved. It may take six different values:
|
---|
388 | .IR text ,
|
---|
389 | .IR hilite ,
|
---|
390 | .IR docnums ,
|
---|
391 | .IR heads ,
|
---|
392 | .IR silent ,
|
---|
393 | or
|
---|
394 | .IR count .
|
---|
395 | .I text
|
---|
396 | displays the contents of the document.
|
---|
397 | .I hilite
|
---|
398 | displays the contents of the document and highlights any of the
|
---|
399 | stemmed query terms.
|
---|
400 | .I docnums
|
---|
401 | displays only the document numbers.
|
---|
402 | .I heads
|
---|
403 | is used to print out the head of each document.
|
---|
404 | .I silent
|
---|
405 | retrieves all the documents but displays nothing except how many
|
---|
406 | documents were retrieved. This mode is intended to be used in timing
|
---|
407 | experiments.
|
---|
408 | .I count
|
---|
409 | does the minimum
|
---|
410 | amount of work required to determine how many documents would
|
---|
411 | be retrieved, but does not retrieve them.
|
---|
412 | .TP
|
---|
413 | .BI optimise_type " `1'"
|
---|
414 | There are three types of Boolean query optimisation (parse tree
|
---|
415 | rearrangement). Type 0 leaves the parse tree unaltered. Type 1 optimises
|
---|
416 | for AND of terms and AND of OR of terms. Type 2 converts the tree
|
---|
417 | into DNF (an experiment :-).
|
---|
418 | .TP
|
---|
419 | .BI pager " `more'"
|
---|
420 | This is the name of the program that will be used to display
|
---|
421 | the help and the retrieved documents. If the environment
|
---|
422 | variable
|
---|
423 | .B PAGER
|
---|
424 | is defined, then
|
---|
425 | .B pager
|
---|
426 | takes on that value.
|
---|
427 | .TP
|
---|
428 | .BI hilite_style " `bold'"
|
---|
429 | This specifies the type of highlighting method.
|
---|
430 | It may take one of two different values:
|
---|
431 | .IR bold ,
|
---|
432 | or
|
---|
433 | .IR underline .
|
---|
434 | .TP
|
---|
435 | .BI para_sepstr " `\en######## PARAGRAPH %n ########\en'"
|
---|
436 | This specifies the string that will be used to separate paragraphs.
|
---|
437 | The standard C escape character sequences may be used to place special
|
---|
438 | characters in the string. For example, a newline would be written
|
---|
439 | as `\en'. To include a `%', use the sequence `%%'. To include the
|
---|
440 | paragraph number within the document, use the sequence `%n'.
|
---|
441 | .TP
|
---|
442 | .BI para_start " `***** Weight = %w *****\en'"
|
---|
443 | This specifies the string that will be used at the head of paragraphs
|
---|
444 | for a paragraph-level index following a ranked query. The standard
|
---|
445 | C-language escape character sequences may be used to place special
|
---|
446 | characters in the string. For example, a newline would be written as
|
---|
447 | `\en'. To include a `%', use the sequence `%%'. To include the
|
---|
448 | paragraph weight, use the sequence `%w'.
|
---|
449 | .TP
|
---|
450 | .BI qfreq " `true'"
|
---|
451 | This determines whether ranked queries will take into account the
|
---|
452 | number of times each query term is specified. When this is
|
---|
453 | .IR true ,
|
---|
454 | the number of times a term appears in the query is used in the
|
---|
455 | ranking. When this is
|
---|
456 | .IR false ,
|
---|
457 | all query terms are assumed to occur only once. This parameter may
|
---|
458 | take the values
|
---|
459 | .IR yes ", " no ", "
|
---|
460 | .IR true ", " false ", "
|
---|
461 | .IR on " or " off .
|
---|
462 | .TP
|
---|
463 | .BI query " `Boolean'"
|
---|
464 | This specifies the type of query that will be entered.
|
---|
465 | It can take four different values:
|
---|
466 | .IR Boolean ,
|
---|
467 | .IR ranked ,
|
---|
468 | .IR docnums " or " approx-ranked .
|
---|
469 | .I Boolean
|
---|
470 | is for Boolean queries.
|
---|
471 | The
|
---|
472 | .BR yacc (1)
|
---|
473 | grammar for Boolean queries is as follows.
|
---|
474 | .IP
|
---|
475 | .nf
|
---|
476 | query : or;
|
---|
477 | .IP
|
---|
478 | or : or '|' and
|
---|
479 | | and ;
|
---|
480 | .IP
|
---|
481 | and : and '&' not
|
---|
482 | | and not
|
---|
483 | | not ;
|
---|
484 | .IP
|
---|
485 | not : term
|
---|
486 | | '!' not ;
|
---|
487 | .IP
|
---|
488 | term : TERM
|
---|
489 | | '(' or ')' ;
|
---|
490 | .fi
|
---|
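.IP
Read against the grammar above,
.B !
binds most tightly, then AND (written as
.B &
or simply by placing terms side by side), then OR (written as
.BR | ).
In the sketch below, whose terms are hypothetical, the first two
queries are equivalent, the third groups as
gold | (silver & !lead), and the parentheses in the fourth override
that grouping.
.IP
.nf
gold & silver
gold silver
gold | silver & !lead
(gold | silver) & !lead
.fi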
491 | .IP
|
---|
492 | .IR ranked " and " approx-ranked
|
---|
493 | are for queries ranked by the cosine measure.
|
---|
494 | .I approx-ranked
|
---|
495 | uses only the low-precision document lengths, and therefore only
|
---|
496 | produces an approximation to full cosine ranking.
|
---|
497 | .IP
|
---|
498 | .nf
|
---|
499 | query : TERM
|
---|
500 | | query TERM ;
|
---|
501 | .fi
|
---|
502 | .IP
|
---|
503 | .I docnums
|
---|
504 | allows the entry of document numbers. Multiple numbers separated by
|
---|
505 | spaces may be specified, or ranges separated by hyphens.
|
---|
506 | .IP
|
---|
507 | .nf
|
---|
508 | query : range
|
---|
509 | | query range ;
|
---|
510 | .IP
|
---|
511 | range : num
|
---|
512 | | num '-' num ;
|
---|
513 | .fi
|
---|
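.IP
For instance, assuming the numbers fall within the collection, a
.I docnums
query of the following form asks for documents 3 and 7 and for the
range 10 to 15. The numbers are arbitrary.
.IP
.nf
3 7 10-15
.fi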
514 | .TP
|
---|
515 | .BI ranked_doc_sepstr " `-------------------------------- %n %w\en'"
|
---|
516 | This specifies the string that will be used to separate documents when
|
---|
517 | they are displayed for `ranked' or `approx-ranked' queries. The
|
---|
518 | standard C escape character sequences may be used to place special
|
---|
519 | characters in the string. For example, a newline would be written as
|
---|
520 | `\en'. To include a `%', use the sequence `%%'. To include the
|
---|
521 | .BR mg (1)
|
---|
522 | document number, use the sequence `%n'. To include the document
|
---|
523 | weight, use the sequence `%w'.
|
---|
524 | .TP
|
---|
525 | .BI sizestats " `false'"
|
---|
526 | If this is
|
---|
527 | .IR true ,
|
---|
528 | then various numbers are output at the end of each query indicating
|
---|
529 | what went on during the query. This parameter may take the values
|
---|
530 | .IR yes ", " no ", "
|
---|
531 | .IR true ", " false ", "
|
---|
532 | .IR on " or " off .
|
---|
533 | .TP
|
---|
534 | .BI skip_dump " `skips.%d'"
|
---|
535 | If this parameter is set, then a file will be produced in the current
|
---|
536 | directory during ranked queries on skipped inverted files when
|
---|
537 | .B accumulator_method
|
---|
538 | is set to
|
---|
539 | .IR splay_tree ,
|
---|
540 | .IR hash_table ,
|
---|
541 | or
|
---|
542 | .IR list .
|
---|
543 | The name of the file is the value of this parameter. A `%d' in the
|
---|
544 | file name will be replaced with the process id of
|
---|
545 | .BR mgquery .
|
---|
546 | This file will contain information about the usage of skips during the
|
---|
547 | query processing. This option is expensive; use
|
---|
548 | .B .unset skip_dump
|
---|
549 | to obtain optimal performance.
|
---|
550 | .TP
|
---|
551 | .BI sorted_terms " `on'"
|
---|
552 | This specifies whether or not the terms should be sorted into
|
---|
553 | increasing order of occurrence in documents so that the least-often occurring
|
---|
554 | terms are processed first when ranked queries are being done. When
|
---|
555 | this is
|
---|
556 | .IR true ,
|
---|
557 | the terms are sorted. When this is
|
---|
558 | .IR false ,
|
---|
559 | the terms are not sorted, and are instead processed in order of
|
---|
560 | occurrence. This parameter may take the values
|
---|
561 | .IR yes ", " no ", "
|
---|
562 | .IR true ", " false ", "
|
---|
563 | .IR on " or " off .
|
---|
564 | .TP
|
---|
565 | .BI stop_at_max_accum " `on'"
|
---|
566 | This specifies what should happen when the maximum number of
|
---|
567 | accumulators set by
|
---|
568 | .B max_accumulators
|
---|
569 | is reached. When this is
|
---|
570 | .IR true ,
|
---|
571 | the processing of terms is stopped at the completion of the current
|
---|
572 | term. When this is
|
---|
573 | .IR false ,
|
---|
574 | processing continues but no new accumulators are created. This
|
---|
575 | parameter may take the values
|
---|
576 | .IR yes ", " no ", "
|
---|
577 | .IR true ", " false ", "
|
---|
578 | .IR on " or " off .
|
---|
579 | .TP
|
---|
580 | .BI terminator " `'"
|
---|
581 | This specifies the string that will be output after the last document
|
---|
582 | from the previous query has been output. The standard C escape
|
---|
583 | character sequences may be used to place special characters in the
|
---|
584 | string. For example, a newline would be written as `\en'. To include
|
---|
585 | a `%', use the sequence `%%'.
|
---|
586 | .TP
|
---|
587 | .BI timestats " `false'"
|
---|
588 | If this is
|
---|
589 | .IR true ,
|
---|
590 | then the time to process a query is displayed in both real time and
|
---|
591 | CPU time. This parameter may take the values
|
---|
592 | .IR yes ", " no ", "
|
---|
593 | .IR true ", " false ", "
|
---|
594 | .IR on " or " off .
|
---|
595 | .TP
|
---|
596 | .BI verbatim " `off'"
|
---|
597 | This is a Boolean parameter that determines whether the program
|
---|
598 | should attempt to do a regular-expression match on the retrieved
|
---|
599 | text. If verbatim is
|
---|
600 | .I on
|
---|
601 | and a post-processing string is specified with the query, then the
|
---|
602 | post-processing string will be searched for in the documents just
|
---|
603 | before they are displayed. If the string is found, the document will
|
---|
604 | be displayed; if not, the document will not be displayed. If verbatim
|
---|
605 | is
|
---|
606 | .IR off ,
|
---|
607 | the post-processing string will be considered a regular expression
|
---|
608 | as in
|
---|
609 | .BR egrep (1)
|
---|
610 | or
|
---|
611 | .BR vi (1).
|
---|
612 | E.g., if verbatim is
|
---|
613 | .IR on ,
|
---|
614 | \*(lq\fBand.*the\fP\*(rq will look for the 8-character sequence
|
---|
615 | \*(lq\fBand.*the\fP\*(rq. If verbatim is
|
---|
616 | .IR off ,
|
---|
617 | \*(lq\fBand.*the\fP\*(rq will look for the sequence
|
---|
618 | \*(lq\fBand\fP\*(rq followed somewhere later in the document by the
|
---|
619 | sequence \*(lq\fBthe\fP\*(rq. This parameter may take the values
|
---|
620 | .IR yes ", " no ", "
|
---|
621 | .IR true ", " false ", "
|
---|
622 | .IR on " or " off .
|
---|
623 | .SH ENVIRONMENT
|
---|
624 | .TP "\w'\fBMGDATA\fP'u+2n"
|
---|
625 | .SB MGDATA
|
---|
626 | If this environment variable exists, then its value is used as the
|
---|
627 | default directory where the
|
---|
628 | .BR mg (1)
|
---|
629 | collection files are. If this variable does not exist, then the
|
---|
630 | directory \*(lq\fB.\fP\*(rq is used by default. The command line
|
---|
631 | option
|
---|
632 | .BI \-d " directory"
|
---|
633 | overrides the directory in
|
---|
634 | .BR MGDATA .
|
---|
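.IP
For example, in a Bourne-style shell, the first line below takes the
collection directory from
.BR MGDATA ,
while the second overrides it with
.BR \-d .
The directory names are placeholders, and
.I bible
is simply the default collection name.
.IP
.nf
MGDATA=/data/mg mgquery bible
MGDATA=/data/mg mgquery \-d /tmp/mg bible
.fi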
635 | .SH FILES
|
---|
636 | .TP 20
|
---|
637 | .I .mgrc
|
---|
638 | .B mgquery
|
---|
639 | startup file
|
---|
640 | .TP
|
---|
641 | .B help.mg
|
---|
642 | Help file for
|
---|
643 | .BR mgquery .
|
---|
644 | The contents of this file are displayed with the
|
---|
645 | .B .help
|
---|
646 | command.
|
---|
647 | .TP
|
---|
648 | .B *.invf
|
---|
649 | Inverted file.
|
---|
650 | .TP
|
---|
651 | .B *.invf.dict
|
---|
652 | The `on-disk' stemmed dictionary.
|
---|
653 | .TP
|
---|
654 | .B *.text
|
---|
655 | Compressed documents.
|
---|
656 | .TP
|
---|
657 | .B *.text.dict
|
---|
658 | Compression dictionary.
|
---|
659 | .TP
|
---|
660 | .B *.text.idx
|
---|
661 | Index into the compressed documents.
|
---|
662 | .TP
|
---|
663 | .B *.text.idx.wgt
|
---|
664 | Interleaved index into the compressed documents and document weights.
|
---|
665 | .TP
|
---|
666 | .B *.weight.approx
|
---|
667 | Approximate document weights.
|
---|
668 | .SH "SEE ALSO"
|
---|
669 | .na
|
---|
670 | .BR egrep (1),
|
---|
671 | .BR mg (1),
|
---|
672 | .BR mg_compression_dict (1),
|
---|
673 | .BR mg_fast_comp_dict (1),
|
---|
674 | .BR mg_get (1),
|
---|
675 | .BR mg_invf_dict (1),
|
---|
676 | .BR mg_invf_dump (1),
|
---|
677 | .BR mg_invf_rebuild (1),
|
---|
678 | .BR mg_passes (1),
|
---|
679 | .BR mg_perf_hash_build (1),
|
---|
680 | .BR mg_text_estimate (1),
|
---|
681 | .BR mg_weights_build (1),
|
---|
682 | .BR mgbilevel (1),
|
---|
683 | .BR mgbuild (1),
|
---|
684 | .BR mgdictlist (1),
|
---|
685 | .BR mgfelics (1),
|
---|
686 | .BR mgstat (1),
|
---|
687 | .BR mgtic (1),
|
---|
688 | .BR mgticbuild (1),
|
---|
689 | .BR mgticdump (1),
|
---|
690 | .BR mgticprune (1),
|
---|
691 | .BR mgticstat (1),
|
---|
692 | .BR vi (1),
|
---|
693 | .BR yacc (1).
|
---|