source: trunk/gsdl/packages/yaz/doc/yaz-7.html@ 1343

Last change on this file since 1343 was 1343, checked in by johnmcp, 24 years ago

Added the YAZ toolkit source to the packages directory (for z39.50 stuff)

1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
2<HTML>
3<HEAD>
4 <META NAME="GENERATOR" CONTENT="SGML-Tools 1.0.9">
5 <TITLE>YAZ User's Guide and Reference: Making an IR Interface for Your Database with YAZ</TITLE>
6 <LINK HREF="yaz-8.html" REL=next>
7 <LINK HREF="yaz-6.html" REL=previous>
8 <LINK HREF="yaz.html#toc7" REL=contents>
9</HEAD>
10<BODY>
11<A HREF="yaz-8.html">Next</A>
12<A HREF="yaz-6.html">Previous</A>
13<A HREF="yaz.html#toc7">Contents</A>
14<HR>
15<H2><A NAME="server"></A> <A NAME="s7">7. Making an IR Interface for Your Database with YAZ</A></H2>
16
17<H2><A NAME="ss7.1">7.1 Introduction</A>
18</H2>
19
20<P><I>NOTE: If you aren't into documentation, a good way to learn how the
21backend interface works is to look at the backend.h file. Then,
22look at the small dummy-server in server/ztest.c. Finally, you can
23have a look at the seshigh.c file, which is where most of the
24logic of the frontend server is located. The backend.h file also
25makes a good reference, once you've chewed your way through
26the prose of this file.</I>
27<P>If you have a database system that you would like to make available by
28means of Z39.50/SR, <B>YAZ</B> offers you two options. First, you
29can use the APIs provided by the <B>ASN</B>, <B>ODR</B>, and <B>COMSTACK</B>
30modules to
31create and decode PDUs, and exchange them with a client. Using this
32low-level interface gives you access to all fields and options of the
33protocol, and you can construct your server as close to your existing
34database as you like. It is also a fairly involved process, requiring
35you to set up an event-handling mechanism, protocol state machine,
36etc. Second, to simplify server implementation, we provide a compact
37and simple, but reasonably full-featured server frontend that will
38handle most of the protocol mechanics, while leaving you to
39concentrate on your database interface.
40<P><I>NOTE: The backend interface was designed in anticipation of a specific
41integration task, while still attempting to achieve some degree of
42generality. We realise fully that there are points where the
43interface can be improved significantly. If you have specific
44functions or parameters that you think could be useful, send us a
45mail (or better, sign on to the mailing list referred to in the
46toplevel README file). We will try to fit good suggestions into future
47releases, to the extent that it can be done without requiring
48too many structural changes in existing applications.</I>
49<P>
50<H2><A NAME="ss7.2">7.2 The Database Frontend</A>
51</H2>
52
53<P>We refer to this software as a generic database frontend. Your
54database system is the <I>backend database</I>, and the interface between
55the two is called the <I>backend API</I>. The backend API consists of a
56small number of function prototypes and structure definitions. You are
57required to provide the <B>main()</B> routine for the server (which can be
58quite simple), as well as functions to match each of the prototypes.
59The interface functions that you write can use any mechanism you like
60to communicate with your database system: You might link the whole
61thing together with your database application and access it by
62function calls; you might use IPC to talk to a database server
63somewhere; or you might link with third-party software that handles
64the communication for you (like a commercial database client library).
65At any rate, the functions will perform the tasks of:
66<P>
67<UL>
68<LI>Initialization.</LI>
69<LI>Searching.</LI>
70<LI>Fetching records.</LI>
71<LI>Scanning the database index (if you wish to implement SCAN).</LI>
72</UL>
73<P>(More functions will be added in time to support as much of
74Z39.50-1995 as possible.)
75<P>Because the model where pipes or sockets are used to access the backend
76database is a fairly common one, we have added a mechanism that allows this
77communication to take place asynchronously. In this mode, the frontend
78server doesn't have to block while the backend database is processing
79a request, but can wait for additional PDUs from the client.
80<P>
81<H2><A NAME="ss7.3">7.3 The Backend API</A>
82</H2>
83
84<P>The header files that you need to use the interface are in the
85include/ directory. They are called <CODE>statserv.h</CODE> and <CODE>backend.h</CODE>. They
86will include other files from the <CODE>include</CODE> directory, so you'll
87probably want to use the -I option of your compiler to tell it where
88to find the files. When you run <CODE>make</CODE> in the toplevel <B>YAZ</B> directory,
89everything you need to create your server is put in the lib/libyaz.a
90library. If you want OSI as well, you'll also need to link in the
91<CODE>libmosi.a</CODE> library from the xtimosi distribution (see the mosi.txt
92file), as well as the <CODE>lib/librfc.a</CODE> library (to provide OSI transport
93over RFC1006/TCP).
94<P>
95<H2><A NAME="ss7.4">7.4 Your main() Routine</A>
96</H2>
97
98<P>As mentioned, your <B>main()</B> routine can be quite brief. If you want to
99initialize global parameters, or read global configuration tables,
100this is the place to do it. At the end of the routine, you should call
101the function
102<P>
103<BLOCKQUOTE><CODE>
104<PRE>
105int statserv_main(int argc, char **argv);
106</PRE>
107</CODE></BLOCKQUOTE>
108<P><B>statserv_main</B> will establish listening sockets according to the
109parameters given. When connection requests are received, the event
110handler will typically <B>fork()</B> to handle the new request. If you do use
111global variables, you should be aware, then, that these cannot be
112shared between associations, unless you explicitly disallow forking by
113command line parameters (we advise against this for any purposes
114except debugging, as a crash or hang in the server process will affect
115all users currently signed on to the server).
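<P>The effect of forking on global variables can be seen in a small
stand-alone sketch. It does not use <B>YAZ</B> at all; the names
<CODE>session_count</CODE> and <CODE>count_seen_by_parent</CODE> are invented for
the illustration. A child process changes a global, and the parent then
checks its own copy.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* A global of the kind a backend might be tempted to keep. */
static int session_count = 0;

/* Fork a child that increments the global, wait for it to finish,
 * and return the value the parent sees afterwards. */
int count_seen_by_parent(void)
{
    int status;
    pid_t pid = fork();

    assert(pid >= 0);          /* fork() failed if negative */
    if (pid == 0)
    {
        session_count++;       /* changes only the child's copy */
        _exit(0);
    }
    waitpid(pid, &status, 0);
    return session_count;      /* the parent's copy is unchanged */
}
```

<P>Each forked association gets its own copy of the data, so state that
must be visible across associations has to live outside the server
process, for instance in the backend database itself.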
116<P>The server provides a mechanism for controlling some of its behavior
117without using command-line options. The function
118<P>
119<BLOCKQUOTE><CODE>
120<PRE>
121statserv_options_block *statserv_getcontrol(void);
122</PRE>
123</CODE></BLOCKQUOTE>
124<P>will return a pointer to a <CODE>struct statserv_options_block</CODE> describing
125the current default settings of the server. The structure contains
126these elements:
127<P>
128<DL>
129<DT><B>int dynamic</B><DD><P>A boolean value, which determines whether the server
130will fork on each incoming request (TRUE), or not (FALSE). Default is
131TRUE.
132<DT><B>int loglevel</B><DD><P>Set this by ORing the constants defined in include/log.h.
133<DT><B>char logfile[ODR_MAXNAME+1]</B><DD><P>File for diagnostic output
134(&quot;&quot;: stderr).
135<DT><B>char apdufile[ODR_MAXNAME+1]</B><DD><P>Name of file for logging incoming and
136outgoing APDUs (&quot;&quot;: don't log APDUs, &quot;-&quot;: <CODE>stderr</CODE>).
137<DT><B>char default_listen[1024]</B><DD><P>Same form as the command-line
138specification of listener address. &quot;&quot;: no default listener address.
139Default is to listen at &quot;tcp:@:9999&quot;. You can only
140specify one default listener address in this fashion.
141<DT><B>enum oid_proto default_proto;</B><DD><P>Either <CODE>PROTO_SR</CODE> or <CODE>PROTO_Z3950</CODE>.
142Default is <CODE>PROTO_Z3950</CODE>.
143<DT><B>int idle_timeout;</B><DD><P>Maximum session idle time, in minutes. Zero indicates
144no (infinite) timeout. Default is 120 minutes.
145<DT><B>int maxrecordsize;</B><DD><P>Maximum permissible record (message) size. Default
146is 1 MB. This amount of memory will only be allocated if a client requests a
147very large number of records in one operation (or a big record). Set it
148to a lower number
149if you are worried about resource consumption on your host system.
150<DT><B>char configname[ODR_MAXNAME+1]</B><DD><P>Passed to the backend when a
151new connection is received.
152<DT><B>char setuid[ODR_MAXNAME+1]</B><DD><P>Set user id to the user specified,
153after binding the listener addresses.
154</DL>
155<P>The pointer returned by <CODE>statserv_getcontrol</CODE> points to a static area.
156You are allowed to change the contents of the structure, but the
157changes will not take effect before you call
158<P>
159<BLOCKQUOTE><CODE>
160<PRE>
161void statserv_setcontrol(statserv_options_block *block);
162</PRE>
163</CODE></BLOCKQUOTE>
164<P>Note that you should generally update this structure <I>before</I> calling
165<CODE>statserv_main()</CODE>.
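<P>The get, modify, set sequence described above might look roughly like
the following sketch. So that it compiles on its own, it declares an
abridged stand-in for <CODE>statserv_options_block</CODE> and stub versions of
the three <CODE>statserv_</CODE> functions; in a real server all of these come
from <CODE>statserv.h</CODE> and the library. The function name
<CODE>run_server</CODE>, the listener address and the chosen limits are
illustrative assumptions, and <CODE>maxrecordsize</CODE> is assumed to be
counted in bytes.

```c
#include <assert.h>
#include <string.h>

/* Abridged stand-in for the structure in statserv.h (assumption:
 * only the fields used below are shown). */
typedef struct statserv_options_block
{
    int dynamic;
    char default_listen[1024];
    int idle_timeout;            /* minutes */
    int maxrecordsize;           /* assumed to be bytes */
} statserv_options_block;

/* Stubs that mimic the documented behavior: getcontrol hands out a
 * static copy, and changes take effect only after setcontrol. */
static statserv_options_block active = { 1, "tcp:@:9999", 120, 1024 * 1024 };
static statserv_options_block scratch;

statserv_options_block *statserv_getcontrol(void)
{
    scratch = active;
    return &scratch;
}

void statserv_setcontrol(statserv_options_block *block)
{
    active = *block;
}

int statserv_main(int argc, char **argv)
{
    (void)argc;
    (void)argv;
    return 0;                    /* the real function runs the server */
}

/* What a main() routine would do: adjust the defaults, then start. */
int run_server(int argc, char **argv)
{
    statserv_options_block *opts = statserv_getcontrol();

    assert(opts != NULL);
    opts->idle_timeout = 30;                     /* minutes */
    opts->maxrecordsize = 256 * 1024;            /* lower the ceiling */
    strcpy(opts->default_listen, "tcp:@:2100");  /* arbitrary port */
    statserv_setcontrol(opts);                   /* make it take effect */

    return statserv_main(argc, argv);
}
```

<P>A real <B>main()</B> would simply end with
<CODE>return run_server(argc, argv);</CODE> or contain the same statements
directly.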
166<P>
167<H2><A NAME="ss7.5">7.5 The Backend Functions</A>
168</H2>
169
170<P>For each service of the protocol, the backend interface declares one or
171two functions. You are required to provide implementations of the
172functions representing the services that you wish to implement.
173<P>
174<BLOCKQUOTE><CODE>
175<PRE>
176bend_initresult *bend_init(bend_initrequest *r);
177</PRE>
178</CODE></BLOCKQUOTE>
179<P>This function is called once for each new connection request, after
180a new process has been forked, and an initRequest has been received
181from the client. The parameter and result structures are defined as
182<P>
183<BLOCKQUOTE><CODE>
184<PRE>
185typedef struct bend_initrequest
186{
187 char *configname;
188} bend_initrequest;
189
190typedef struct bend_initresult
191{
192 int errcode; /* 0==OK */
193 char *errstring; /* system error string or NULL */
194 void *handle; /* private handle to the backend module */
195} bend_initresult;
196</PRE>
197</CODE></BLOCKQUOTE>
198<P>The <CODE>configname</CODE> of <CODE>bend_initrequest</CODE> is currently always set to
199&quot;default-config&quot;. We haven't had use for putting anything special in
200the initrequest yet, but something might go there if the need arises
201(account/password info would be obvious).
202<P>In general, the server frontend expects that the <CODE>bend_*result</CODE>
203pointer that you return is valid at least until the next call to a
204<CODE>bend_*</CODE> function. This applies to all of the functions described
205herein. The parameter structure passed to you in the call belongs to
206the server frontend, and you should not make assumptions about its
207contents after the current function call has completed. In other
208words, if you want to retain any of the contents of a request
209structure, you should copy them.
210<P>The <CODE>errcode</CODE> should be zero if the initialization of the backend went
211well. Any other value will be interpreted as an error. The
212<CODE>errstring</CODE> isn't used in the current version, but one option
213would be to stick it
214in the initResponse as a VisibleString. The <CODE>handle</CODE> is the most
215important parameter. It should be set to some value that uniquely
216identifies the current session to the backend implementation. It is
217used by the frontend server in any future calls to a backend function.
218The typical use is to set it to point to a dynamically allocated state
219structure that is private to your backend module.
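<P>As a rough sketch of these rules, a <CODE>bend_init</CODE> implementation
could allocate a private state structure, copy what it wants to keep
from the request, and return a static result. The two interface
structures are repeated here only so that the example compiles on its
own. The <CODE>session_state</CODE> type and its fields are hypothetical, and
it is an assumption that <CODE>bend_close</CODE> (listed in the synopsis) is
the place to release the state.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* As defined in backend.h; repeated to keep the sketch self-contained. */
typedef struct bend_initrequest
{
    char *configname;
} bend_initrequest;

typedef struct bend_initresult
{
    int errcode;       /* 0==OK */
    char *errstring;   /* system error string or NULL */
    void *handle;      /* private handle to the backend module */
} bend_initresult;

/* Hypothetical per-session state, private to this backend. */
typedef struct session_state
{
    char *configname;  /* our own copy; the request belongs to the frontend */
    int num_sets;      /* illustrative bookkeeping */
} session_state;

bend_initresult *bend_init(bend_initrequest *r)
{
    /* Static, so the pointer stays valid until the next bend_* call. */
    static bend_initresult result;
    session_state *s;
    size_t n;

    assert(r != NULL && r->configname != NULL);
    result.errcode = 0;
    result.errstring = NULL;
    result.handle = NULL;

    s = malloc(sizeof(*s));
    if (s == NULL)
    {
        result.errcode = 1;      /* any nonzero value signals failure */
        return &result;
    }
    n = strlen(r->configname) + 1;
    s->configname = malloc(n);
    if (s->configname == NULL)
    {
        free(s);
        result.errcode = 1;
        return &result;
    }
    memcpy(s->configname, r->configname, n);   /* copy, do not alias */
    s->num_sets = 0;

    result.handle = s;           /* passed back on every later call */
    return &result;
}

/* Assumed to be called when the session ends. */
void bend_close(void *handle)
{
    session_state *s = handle;

    if (s != NULL)
    {
        free(s->configname);
        free(s);
    }
}
```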
220<P>
221<BLOCKQUOTE><CODE>
222<PRE>
223bend_searchresult *bend_search(void *handle, bend_searchrequest *r,
224 int *fd);
225bend_searchresult *bend_searchresponse(void *handle);
226
227typedef struct bend_searchrequest
228{
229 char *setname; /* name to give to this set */
230 int replace_set; /* replace set, if it already exists */
231 int num_bases; /* number of databases in list */
232 char **basenames; /* databases to search */
233 Z_Query *query; /* query structure */
234} bend_searchrequest;
235
236typedef struct bend_searchresult
237{
238 int hits; /* number of hits */
239 int errcode; /* 0==OK */
240 char *errstring; /* system error string or NULL */
241} bend_searchresult;
242</PRE>
243</CODE></BLOCKQUOTE>
244<P>The first thing to notice about the search request interface (as well
245as all of the following requests), is that it consists of two separate
246functions. The idea is to provide a simple facility for
247asynchronous communication with the backend server. When a
248searchrequest comes in, the server frontend will fill out the
249<CODE>bend_searchrequest</CODE> structure, and call the <CODE>bend_search</CODE>
250function. The <CODE>fd</CODE>
251argument will point to an integer variable. If you are able to do
252asynchronous I/O with your database server, you should set *<CODE>fd</CODE> to the
253file descriptor you use for the communication, and return a null
254pointer. The server frontend will then <CODE>select()</CODE> on the *<CODE>fd</CODE>,
255and will call
256<CODE>bend_searchresponse</CODE> when it sees that data is available. If you don't
257support asynchronous I/O, you should return a pointer to the
258<CODE>bend_searchresult</CODE> immediately, and leave *<CODE>fd</CODE> untouched. This
259construction is common to all of the <CODE>bend_</CODE> functions (except
260<CODE>bend_init</CODE>). Note that you can choose to support this facility in none,
261any, or all of the <CODE>bend_</CODE> functions, and you can respond
262differently on each request at run-time. The server frontend will
263adapt accordingly.
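<P>The two calling patterns can be sketched as follows. This is only an
outline under stated assumptions: the <CODE>session_state</CODE> structure,
its <CODE>db_fd</CODE> and <CODE>pending_hits</CODE> fields, and the idea that a
negative descriptor means &quot;no asynchronous link&quot; are all invented
for the illustration, and no real I/O is performed. <CODE>Z_Query</CODE> is
left as an opaque stand-in for the type in <CODE>proto.h</CODE>.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Z_Query Z_Query;   /* opaque stand-in for the proto.h type */

typedef struct bend_searchrequest
{
    char *setname;      /* name to give to this set */
    int replace_set;    /* replace set, if it already exists */
    int num_bases;      /* number of databases in list */
    char **basenames;   /* databases to search */
    Z_Query *query;     /* query structure */
} bend_searchrequest;

typedef struct bend_searchresult
{
    int hits;           /* number of hits */
    int errcode;        /* 0==OK */
    char *errstring;    /* system error string or NULL */
} bend_searchresult;

/* Hypothetical session state reached through the handle. */
typedef struct session_state
{
    int db_fd;          /* link to the database server, or -1 if none */
    int pending_hits;   /* stands in for a reply that arrives later */
} session_state;

static bend_searchresult search_result;   /* valid until the next bend_* call */

bend_searchresult *bend_search(void *handle, bend_searchrequest *r, int *fd)
{
    session_state *s = handle;

    assert(s != NULL && r != NULL && fd != NULL);
    search_result.hits = 0;
    search_result.errcode = 0;
    search_result.errstring = NULL;

    if (s->db_fd >= 0)
    {
        /* Asynchronous path: the query would be written to db_fd here.
         * Hand the descriptor to the frontend and return a null pointer. */
        *fd = s->db_fd;
        return NULL;
    }
    /* Synchronous path: answer at once and leave *fd untouched. */
    search_result.hits = s->pending_hits;
    return &search_result;
}

bend_searchresult *bend_searchresponse(void *handle)
{
    session_state *s = handle;

    assert(s != NULL);
    /* Called after select() reports data on the descriptor; the reply
     * would be read from db_fd here. */
    search_result.hits = s->pending_hits;
    search_result.errcode = 0;
    search_result.errstring = NULL;
    return &search_result;
}
```

<P>Because the choice is made per call, the same backend can answer small
local queries synchronously and defer only the ones that go over the
wire.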
264<P>The <CODE>bend_searchrequest</CODE> is a fairly close approximation of a protocol
265searchRequest PDU. The <CODE>setname</CODE> is the resultSetName from the protocol. You
266are required to establish a mapping between the set name and whatever
267your backend database likes to use. Similarly, the <CODE>replace_set</CODE> is a
268boolean value corresponding to the resultSetIndicator field in the
269protocol. <CODE>num_bases</CODE> is the length of <CODE>basenames</CODE>, an array of character
270pointers to the database names provided by the client. The <CODE>query</CODE> is the
271full query structure as defined in the protocol ASN.1 specification.
272It can be either of the possible query types, and it's up to you to
273determine if you can handle the provided query type. Rather than
274reproduce the C interface here, we'll refer you to the structure
275definitions in the file <CODE>include/proto.h</CODE>. If you want to look at the
276attributeSetId OID of the RPN query, you can either match it against
277your own internal tables, or you can use the <CODE>oid_getentbyoid</CODE> function
278provided by <B>YAZ</B>.
279<P>The result structure contains a number of hits, and an
280<CODE>errcode/errstring</CODE> pair. If an error occurs during the search, or if
281you're unhappy with the request, you should set the errcode to a value
282from the BIB-1 diagnostic set. The value will then be returned to the
283user in a nonsurrogate diagnostic record in the response. The
284<CODE>errstring</CODE>, if provided, will go in the addinfo field. Look at the
285protocol definition for the defined error codes, and the suggested
286uses of the addinfo field.
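<P>As one possible illustration of the error convention, the helper below
rejects a search that names a database the backend does not serve. It
uses BIB-1 diagnostic 109 (&quot;Database unavailable&quot;) and puts the
offending name in <CODE>errstring</CODE>. The helper name
<CODE>check_bases</CODE>, the list of known databases and the buffer size
are assumptions made for the sketch; a <CODE>bend_search</CODE>
implementation would call something like this before running the query.

```c
#include <assert.h>
#include <string.h>

typedef struct bend_searchresult
{
    int hits;           /* number of hits */
    int errcode;        /* 0==OK */
    char *errstring;    /* system error string or NULL */
} bend_searchresult;

/* BIB-1 diagnostic 109: "Database unavailable". */
#define BIB1_DATABASE_UNAVAILABLE 109

/* Hypothetical list of the databases this backend serves. */
static const char *known_bases[] = { "Default", "books" };
#define NUM_KNOWN (sizeof(known_bases) / sizeof(known_bases[0]))

static int base_is_known(const char *name)
{
    size_t i;

    for (i = 0; i < NUM_KNOWN; i++)
        if (strcmp(name, known_bases[i]) == 0)
            return 1;
    return 0;
}

/* Returns 1 if every requested database is known. Otherwise fills in
 * the diagnostic in *res and returns 0. */
int check_bases(int num_bases, char **basenames, bend_searchresult *res)
{
    /* Own buffer: the request structure belongs to the frontend. */
    static char addinfo[128];
    int i;

    assert(res != NULL);
    res->hits = 0;
    res->errcode = 0;
    res->errstring = NULL;

    for (i = 0; i < num_bases; i++)
    {
        if (!base_is_known(basenames[i]))
        {
            strncpy(addinfo, basenames[i], sizeof(addinfo) - 1);
            addinfo[sizeof(addinfo) - 1] = '\0';
            res->errcode = BIB1_DATABASE_UNAVAILABLE;
            res->errstring = addinfo;   /* ends up in the addinfo field */
            return 0;
        }
    }
    return 1;
}
```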
287<P>
288<BLOCKQUOTE><CODE>
289<PRE>
290bend_fetchresult *bend_fetch(void *handle, bend_fetchrequest *r,
291 int *fd);
292bend_fetchresult *bend_fetchresponse(void *handle);
293
294typedef struct bend_fetchrequest
295{
296 char *setname; /* set name */
297 int number; /* record number */
298 oid_value format;
299} bend_fetchrequest;
300
301typedef struct bend_fetchresult
302{
303 char *basename; /* name of database that provided record */
304 int len; /* length of record */
305 char *record; /* record */
306 int last_in_set; /* is it? */
307 oid_value format;
308 int errcode; /* 0==success */
309 char *errstring; /* system error string or NULL */
310} bend_fetchresult;
311</PRE>
312</CODE></BLOCKQUOTE>
313<P><I>NOTE: The <CODE>bend_fetchresponse()</CODE> function is not yet supported
314in this version of the software. Your implementation of <CODE>bend_fetch()</CODE>
315should always return a pointer to a <CODE>bend_fetchresult</CODE>.</I>
316<P>The frontend server calls <CODE>bend_fetch</CODE> when it needs database records to
317fulfill a searchRequest or a presentRequest. The <CODE>setname</CODE> is simply the
318name of the result set that holds the reference to the desired record.
319The <CODE>number</CODE> is the offset into the set (with 1 being the first record
320in the set). The <CODE>format</CODE> field is the record format requested by the
321client (See section
322<A HREF="yaz-3.html#oid">Object Identifiers</A>). The value
323<CODE>VAL_NONE</CODE> indicates that the client did not request a specific format.
324The <CODE>stream</CODE> argument is an <B>ODR</B> stream which should be used for
325allocating space for structured data records. The stream will be reset when
326all records have been assembled, and the response package has been transmitted.
327For unstructured data, the backend is responsible for maintaining a static
328or dynamic buffer for the record between calls.
329<P>In the result structure, the <CODE>basename</CODE> is the name of the
330database that holds the
331record. <CODE>len</CODE> is the length of the record returned, in bytes, and
332<CODE>record</CODE>
333is a pointer to the record. <CODE>last_in_set</CODE> should be nonzero only if the
334record returned is the last one in the given result set. <CODE>errcode</CODE> and
335<CODE>errstring</CODE>, if given, will currently be interpreted as a global error
336pertaining to the set, and will be returned in a
337nonSurrogateDiagnostic.
338<P><I>NOTE: This is silly. Add a flag to say which is which.</I>
339<P>If the <CODE>len</CODE> field has the value -1, then <CODE>record</CODE> is assumed to point
340to a constructed data type. The <CODE>format</CODE> field will be used to determine
341which encoder should be used to serialize the data.
342<P><I>NOTE: If your backend generates structured records, it should use
343<CODE>odr_malloc()</CODE> on the provided stream for allocating data: This allows
344the frontend server to keep track of the record sizes.</I>
345<P>The <CODE>format</CODE> field is mapped to an object identifier in the direct
346reference of the resulting EXTERNAL representation of the record.
347<P><I>NOTE: The current version of <B>YAZ</B> only supports the direct reference
348mode.</I>
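<P>A minimal, synchronous <CODE>bend_fetch</CODE> for unstructured records
might look like the sketch below. The three in-memory records, the
database name and the <CODE>oid_value</CODE> enumeration are stand-ins so
that the example compiles without the <B>YAZ</B> headers; the real
<CODE>VAL_*</CODE> constants come from the OID module and have other numeric
values. The diagnostics used are BIB-1 13 (&quot;Present request
out-of-range&quot;) and 238 (&quot;Record not available in requested
syntax&quot;).

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the YAZ type; the numeric values are placeholders. */
typedef enum oid_value { VAL_NONE, VAL_SUTRS, VAL_USMARC } oid_value;

typedef struct bend_fetchrequest
{
    char *setname;      /* set name */
    int number;         /* record number */
    oid_value format;
} bend_fetchrequest;

typedef struct bend_fetchresult
{
    char *basename;     /* name of database that provided record */
    int len;            /* length of record */
    char *record;       /* record */
    int last_in_set;    /* is it? */
    oid_value format;
    int errcode;        /* 0==success */
    char *errstring;    /* system error string or NULL */
} bend_fetchresult;

/* Hypothetical result set: three plain-text records held in memory. */
static char *records[] = { "first record", "second record", "third record" };
#define NUM_RECORDS 3

bend_fetchresult *bend_fetch(void *handle, bend_fetchrequest *r, int *fd)
{
    static bend_fetchresult result;   /* valid until the next bend_* call */

    (void)handle;                     /* no session state in this sketch */
    (void)fd;                         /* synchronous: *fd is left untouched */
    assert(r != NULL);

    result.basename = "Default";
    result.len = 0;
    result.record = NULL;
    result.last_in_set = 0;
    result.format = VAL_SUTRS;        /* we only have plain text */
    result.errcode = 0;
    result.errstring = NULL;

    if (r->format != VAL_NONE && r->format != VAL_SUTRS)
    {
        result.errcode = 238;         /* not available in requested syntax */
        return &result;
    }
    if (r->number < 1 || r->number > NUM_RECORDS)
    {
        result.errcode = 13;          /* present request out of range */
        return &result;
    }
    result.record = records[r->number - 1];   /* number is 1-based */
    result.len = (int)strlen(result.record);
    result.last_in_set = (r->number == NUM_RECORDS);
    return &result;
}
```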
349<P>
350<BLOCKQUOTE><CODE>
351<PRE>
352bend_deleteresult *bend_delete(void *handle, bend_deleterequest *r,
353 int *fd);
354bend_deleteresult *bend_deleteresponse(void *handle);
355
356typedef struct bend_deleterequest
357{
358 char *setname;
359} bend_deleterequest;
360
361typedef struct bend_deleteresult
362{
363 int errcode; /* 0==success */
364 char *errstring; /* system error string or NULL */
365} bend_deleteresult;
366</PRE>
367</CODE></BLOCKQUOTE>
368<P><I>NOTE: The &quot;delete&quot; function is not yet supported
369in this version of the software.</I>
370<P><I>NOTE: The delete set function definition is rather primitive, mostly
371because we
372have had no practical need for it as of yet. If someone wants
373to provide a full delete service, we'd be happy to add the
374extra parameters that are required. Are there clients out there
375that will actually delete sets they no longer need?</I>
376<P>
377<BLOCKQUOTE><CODE>
378<PRE>
379bend_scanresult *bend_scan(void *handle, bend_scanrequest *r,
380 int *fd);
381bend_scanresult *bend_scanresponse(void *handle);
382
383typedef struct bend_scanrequest
384{
385 int num_bases; /* number of elements in databaselist */
386 char **basenames; /* databases to search */
387 Z_AttributesPlusTerm *term;
388 int term_position; /* desired index of term in result list */
389 int num_entries; /* number of entries requested */
390} bend_scanrequest;
391
392typedef struct bend_scanresult
393{
394 int num_entries;
395 struct scan_entry
396 {
397 char *term;
398 int occurrences;
399 } *entries;
400 int term_position;
401 enum
402 {
403 BEND_SCAN_SUCCESS,
404 BEND_SCAN_PARTIAL
405 } status;
406 int errcode;
407 char *errstring;
408} bend_scanresult;
409</PRE>
410</CODE></BLOCKQUOTE>
411<P><I>NOTE: The <CODE>bend_scanresponse()</CODE> function is not yet supported
412in this version of the software. Your implementation of <CODE>bend_scan()</CODE>
413should always return a pointer to a <CODE>bend_scanresult</CODE>.</I>
414<P>
415<H2><A NAME="ss7.6">7.6 Application Invocation</A>
416</H2>
417
418<P>The finished application has the following
419invocation syntax (by way of <CODE>statserv_main()</CODE>):
420<P>
421<BLOCKQUOTE><CODE>
422
423<PRE>
424appname [-szSu -a apdufile -l logfile -v loglevel]
425[listener ...]
426</PRE>
427
428</CODE></BLOCKQUOTE>
429<P>The options are
430<P>
431<DL>
432<DT><B>-a</B><DD><P>APDU file. Specify a file for dumping PDUs (for diagnostic purposes).
433The special name &quot;-&quot; sends output to <CODE>stderr</CODE>.
434<P>
435<DT><B>-S</B><DD><P>Don't fork on connection requests. This is good for debugging, but
436not recommended for real operation: Although the server is
437asynchronous and non-blocking, it can be nice to keep a software
438malfunction (okay then, a crash) from affecting all current users.
439<P>
440<DT><B>-s</B><DD><P>Use the SR protocol.
441<P>
442<DT><B>-z</B><DD><P>Use the Z39.50 protocol (default). These two options complement
443each other. You can use both multiple times on the same command
444line, between listener-specifications (see below). This way, you
445can set up the server to listen for connections in both protocols
446concurrently, on different local ports.
447<P>
448<DT><B>-l</B><DD><P>The logfile.
449<P>
450<DT><B>-v</B><DD><P>The log level. Use a comma-separated list of members of the set
451{fatal,debug,warn,log,all,none}.
452<DT><B>-u</B><DD><P>Set user ID. Sets the real UID of the server process to that of the
453given user. It's useful if you aren't comfortable with having the
454server run as root, but you need to start it as such to bind a
455privileged port.
456<P>
457<DT><B>-w</B><DD><P>Working directory.
458<DT><B>-i</B><DD><P>Use this when running from the <CODE>inetd</CODE> server.
459<P>
460<DT><B>-t</B><DD><P>Idle session timeout, in minutes.
461<P>
462<DT><B>-k</B><DD><P>Maximum record size/message size, in kilobytes.
463<P>
464</DL>
465<P>A listener specification consists of a transport mode followed by a
466colon (:) followed by a listener address. The transport mode is
467either <CODE>osi</CODE> or <CODE>tcp</CODE>.
468<P>For TCP, an address has the form
469<P>
470<BLOCKQUOTE><CODE>
471<PRE>
472hostname | IP-number [: portnumber]
473</PRE>
474</CODE></BLOCKQUOTE>
475<P>The port number defaults to 210 (standard Z39.50 port).
476<P>For osi, the address form is
477<P>
478<BLOCKQUOTE><CODE>
479<PRE>
480[t-selector /] hostname | IP-number [: portnumber]
481</PRE>
482</CODE></BLOCKQUOTE>
483<P>The transport selector is given as a string of hex digits (with an even
484number of digits). The default port number is 102 (RFC1006 port).
485<P>Examples
486<P>
487<BLOCKQUOTE><CODE>
488<PRE>
489tcp:dranet.dra.com
490
491osi:0402/dbserver.osiworld.com:3000
492</PRE>
493</CODE></BLOCKQUOTE>
494<P>In both cases, the special hostname &quot;@&quot; is mapped to
495the address INADDR_ANY, which causes the server to listen on any local
496interface. To start the server listening on the registered ports for
497Z39.50 and SR over OSI/RFC1006, and to drop root privileges once the
498ports are bound, execute the server like this (from a root shell):
499<P>
500<BLOCKQUOTE><CODE>
501<PRE>
502my-server -u daemon tcp:@ -s osi:@
503</PRE>
504</CODE></BLOCKQUOTE>
505<P>You can replace <CODE>daemon</CODE> with another user, e.g. your own account, or
506a dedicated IR server account. <CODE>my-server</CODE> should be the name of your
507server application. You can test the procedure with the <CODE>ztest</CODE>
508application.
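<P>The listener syntax described above can be sketched as a small parser.
This is not the code used by <CODE>statserv_main()</CODE>; the
<CODE>listener_spec</CODE> structure, the function name
<CODE>parse_listener</CODE> and the buffer sizes are assumptions made for
illustration. It applies the default ports given in the text (210 for
<CODE>tcp</CODE>, 102 for <CODE>osi</CODE>) and requires a transport selector to
be an even number of hex digits.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for the parts of a listener specification. */
typedef struct listener_spec
{
    char mode[4];     /* "tcp" or "osi" */
    char tsel[64];    /* hex transport selector, "" if absent */
    char host[256];   /* hostname, IP number, or "@" */
    int port;
} listener_spec;

/* Returns 0 on success, -1 if the specification is malformed. */
int parse_listener(const char *spec, listener_spec *out)
{
    const char *addr, *slash, *colon;
    size_t n, i;

    assert(spec != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    if (strncmp(spec, "tcp:", 4) == 0)
    {
        strcpy(out->mode, "tcp");
        out->port = 210;              /* standard Z39.50 port */
    }
    else if (strncmp(spec, "osi:", 4) == 0)
    {
        strcpy(out->mode, "osi");
        out->port = 102;              /* RFC1006 port */
    }
    else
        return -1;
    addr = spec + 4;

    slash = strchr(addr, '/');
    if (slash != NULL)
    {
        if (out->mode[0] != 'o')      /* only osi takes a t-selector */
            return -1;
        n = (size_t)(slash - addr);
        if (n == 0 || n % 2 != 0 || n >= sizeof(out->tsel))
            return -1;
        for (i = 0; i < n; i++)
            if (!isxdigit((unsigned char)addr[i]))
                return -1;
        memcpy(out->tsel, addr, n);
        addr = slash + 1;
    }

    colon = strchr(addr, ':');
    n = colon ? (size_t)(colon - addr) : strlen(addr);
    if (n == 0 || n >= sizeof(out->host))
        return -1;
    memcpy(out->host, addr, n);

    if (colon != NULL)
    {
        char *end;
        long p = strtol(colon + 1, &end, 10);

        if (end == colon + 1 || *end != '\0' || p < 1 || p > 65535)
            return -1;
        out->port = (int)p;
    }
    return 0;
}
```

<P>The sketch leaves the special hostname &quot;@&quot; as it is; mapping it
to INADDR_ANY would happen when the socket is bound.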
509<P>
510<H2><A NAME="ss7.7">7.7 Summary and Synopsis</A>
511</H2>
512
513<P>
514<BLOCKQUOTE><CODE>
515<PRE>
516#include &lt;backend.h>
517
518bend_initresult *bend_init(bend_initrequest *r);
519
520bend_searchresult *bend_search(void *handle, bend_searchrequest *r,
521 int *fd);
522
523bend_searchresult *bend_searchresponse(void *handle);
524
525bend_fetchresult *bend_fetch(void *handle, bend_fetchrequest *r,
526 int *fd);
527
528bend_fetchresult *bend_fetchresponse(void *handle);
529
530bend_scanresult *bend_scan(void *handle, bend_scanrequest *r, int *fd);
531
532bend_scanresult *bend_scanresponse(void *handle);
533
534bend_deleteresult *bend_delete(void *handle, bend_deleterequest *r,
535 int *fd);
536
537bend_deleteresult *bend_deleteresponse(void *handle);
538
539void bend_close(void *handle);
540</PRE>
541</CODE></BLOCKQUOTE>
542<P>
543<HR>
544<A HREF="yaz-8.html">Next</A>
545<A HREF="yaz-6.html">Previous</A>
546<A HREF="yaz.html#toc7">Contents</A>
547</BODY>
548</HTML>