<!doctype linuxdoc system>
<article>
<title>Specifying and Using Application (Database) Profiles
<author>Index Data, <tt/[email protected]/
<date>$Revision: 1343 $
<abstract>
YAZ includes a subsystem to manage complex database records, driven
by a set of configuration tables that reflect a given profile.
Multiple database profiles can coexist in the same server, or even
the same database. The record management system is responsible for
associating a given record with a specific profile, and processing it
accordingly. This document describes the various file formats for data
and configuration files which are used by the module.
</abstract>

<toc>

<sect>Warnings

<p>
<itemize>
<item>The subsystem described herein is under development. Not
everything may work exactly as described, and details of the interface
may change as the module matures.

<item>The exact workings of the subsystem may depend on the
application in which it is used. This document focuses on the use of
the module in the <bf/Zebra/ information server, which is distributed by
Index Data as an independent package.
</itemize>

<sect>Introduction

<p>
The retrieval facilities of Z39.50 are extremely flexible and powerful.
They allow any level of structuring of database records. They allow
controlled re-use of attribute sets (for searching) and tag sets (for
retrieval) between application profiles; they allow precise selection
of the desired sub-elements of a database record; they allow different
variants of a given data element to be represented and selected in a
structured way; and finally they allow any type and amount of data to
be represented and exchanged in a single database record.

These powerful retrieval facilities are a recent addition to the
protocol, and along with the flexible searching facilities, they make
the protocol an extremely capable tool for precise, structured
access to information systems. The retrieval facilities add new
levels of flexibility and control to the protocol, which add to its
value outside of its traditional domain of the library systems world.

The new facilities, however, also add new complexity to the protocol,
which is already troubled by a too-steep learning curve. We have seen
many good projects severely hindered or even thwarted by the sheer
complexity of implementing the Z39.50 protocol.

At the same time, we feel that the most complex and powerful
facilities of the protocol (Explain, structured retrieval, etc.) are
also what the protocol needs to become more widespread, and to fulfill
what we perceive to be its most noble potential: to provide
everybody with standardised, well-structured access to the
information resources of the world.

The purpose of <bf/YAZ/, then, and of this module as well, is to
<it/simplify/ the use of the protocol for programmers and
administrators, by providing simple APIs and configuration systems to
access the functionality of the protocol. The <bf/Retrieval/ module
deals specifically with the advanced retrieval functions which were
added to the protocol with version 3, or Z39.50-1995.

<sect>Overview

<sect1>External Data (record) Representation

<p>
The <bf/Retrieval/ module will eventually support a wide range of
input formats, ranging from MARC data to USENET news archives. This
section introduces what we think of as the <it/canonical/ format - the
one that gives the most general access to the various elements of the
retrieval functionality.

The basic model presented by the Z39.50 retrieval system is that of a
recursively defined tree structure, containing a list of tagged elements,
which may in turn contain either data or more lists of tagged elements, and
so forth.

We elect to represent this structuring externally by using an
&dquot;SGML-like&dquot; syntax. The <it/internal/ representation will
be discussed later.

Consider a record describing an information resource (such a record is
sometimes known as a <it/locator record/). It might contain a field
describing the distributor of the information resource, which might in
turn be partitioned into various fields providing details about the
distributor, like this:

<tscreen><verb>
<Distributor>
  <Name> USGS/WRD &etago;Name>
  <Organization> USGS/WRD &etago;Organization>
  <Street-Address>
    U.S. GEOLOGICAL SURVEY, 505 MARQUETTE, NW
  &etago;Street-Address>
  <City> ALBUQUERQUE &etago;City>
  <State> NM &etago;State>
  <Zip-Code> 87102 &etago;Zip-Code>
  <Country> USA &etago;Country>
  <Telephone> (505) 766-5560 &etago;Telephone>
&etago;Distributor>
</verb></tscreen>

This is how data that the retrieval module reads from an input file
might look.

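To make the recursive tree model concrete, the following Python sketch reads a record in the SGML-like syntax into nested (tag, children) pairs. It is only an illustration under our own assumptions: the function name <tt/parse_record/ and the tuple layout are hypothetical, and the sketch says nothing about how the retrieval module represents records internally.

```python
import re

# An opening tag, a closing tag, or a run of character data.
TOKEN = re.compile(r"<(/?)([^<>\s]+)>|([^<]+)")

def parse_record(text):
    """Parse an SGML-like record into a list of (tag, children) tuples.

    Each child is either a stripped data string or another (tag, children) tuple.
    """
    root = ("", [])
    stack = [root]
    for closing, name, data in TOKEN.findall(text):
        if data:
            if data.strip():                     # ignore whitespace-only data
                stack[-1][1].append(data.strip())
        elif closing:
            if len(stack) == 1 or stack[-1][0] != name:
                raise ValueError(f"unexpected close tag: {name}")
            stack.pop()
        else:
            node = (name, [])
            stack[-1][1].append(node)            # attach to the enclosing element
            stack.append(node)
    if len(stack) != 1:
        raise ValueError(f"unclosed tag: {stack[-1][0]}")
    return root[1]

record = """
<Distributor>
  <Name> USGS/WRD </Name>
  <City> ALBUQUERQUE </City>
</Distributor>
"""
tree = parse_record(record)
```

Here <tt/tree/ holds a single Distributor element whose children are the Name and City elements, each carrying one data string.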
However, depending on the database profile that is being used, the
data will probably not look like this when it is transmitted from the
server to the client. Typically, the client will prefer to
receive the data in a more rigid syntax, such as USMARC or GRS-1. To
save transmission time and avoid ambiguities of language, the
individual tags or field names, above, might be translated into
numbers which are known by both the client and the server (by
referring to a tag set).

The retrieval module supports various types of conversions that might
be carried out by the server based on requests from the client. To do
this, it needs a set of configuration files to describe the
application profile that the given record adheres to.

<it>
CAUTION: Because the tables described below serve the dual purpose of
representing an external application profile and an internal database
profile, the terminology and structuring used will sometimes be
somewhat different from those suggested in the Z39.50-1995 standard.
</it>

<sect1>The Abstract Syntax

<p>
The abstract syntax definition (also known as an abstract record
structure, or ARS) is the focal point of the
application profile description. For a given profile, it may state any
or all of the following:

<itemize>
<item>The object identifier of the database schema associated with the
profile, so that it can be referred to by the client.

<item>The attribute set (which can possibly be a compound of multiple
sets) which applies in the profile. This is used when indexing and
searching the records belonging to the given profile.

<item>The tag set (again, this can consist of several different sets).
This is used when reading the records from a file, to recognize the
different tags, and when transmitting the record to the client, to
map the tags to their numerical representation, if they are
known.

<item>The variant set which is used in the profile. This provides a
vocabulary for specifying the <it/forms/ of data that appear inside
the records.

<item>Element set names, which are a shorthand way for the client to
ask for a subset of the data elements contained in a record. Element
set names, in the retrieval module, are mapped to <it/element
specifications/, which contain information equivalent to the
<it/Espec-1/ syntax of Z39.50.

<item>Map tables, which may specify mappings to <it/other/ database
profiles, if desired.

<item>Possibly, a set of rules describing the mapping of elements to a
MARC representation.

<item>A list of element descriptions (this is the actual ARS of the
profile), which lists the ways in which the various tags can be used
and organized hierarchically.
</itemize>

Several of the entries above simply refer to other files, which describe the
given objects.

<sect>The Configuration Files

<p>
This section describes the syntax and use of the various tables which
are used by the retrieval module.

The number of different file types may appear daunting at first, but
each type corresponds fairly clearly to a single aspect of the Z39.50
retrieval facilities. Further, the average database administrator
who is simply reusing an existing profile, for which tables already
exist, shouldn't have to worry too much about these tables.

<sect1>The Abstract Syntax (.abs) Files

<p>
The name of this file type is slightly misleading, since, apart from
the actual abstract syntax of the profile, it also includes most of
the other definitions that go into a database profile.

When a record in the canonical, SGML-like format is read from a file
or from the database, the first tag of the record should reference the
profile that governs the layout of the record. If the first tag of the
record is <tt>&lt;gils&gt;</tt>, the system will look for the profile
definition in the file <tt/gils.abs/. Profile definitions are cached,
so they only have to be read once during the lifespan of the current
process.

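The lookup rule described above can be sketched in a few lines of Python. This is a hedged illustration, not the module's implementation: the names <tt/profile_filename/ and <tt/load_profile/ are hypothetical, the sketch takes the tag name verbatim (whether the real lookup folds case is not stated here), and the cache merely mimics the "read once per process" behaviour.

```python
import re
from functools import lru_cache
from pathlib import Path

FIRST_TAG = re.compile(r"<([^<>/\s]+)>")

def profile_filename(record_text):
    """Return the .abs file name implied by the first tag of a record."""
    match = FIRST_TAG.search(record_text)
    if match is None:
        raise ValueError("record contains no opening tag")
    return match.group(1) + ".abs"

@lru_cache(maxsize=None)
def load_profile(filename, directory="."):
    """Read a profile file; repeated calls with the same arguments hit the cache."""
    return Path(directory, filename).read_text()
```

A record beginning with the tag gils therefore maps to <tt/gils.abs/, and a second request for the same file is served from memory rather than from disk.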
The file may contain the following directives:

<descrip>
<tag>name <it/symbolic-name/</tag> This provides a shorthand name or
description for the profile. Mostly useful for diagnostic purposes.

<tag>reference <it/OID-name/</tag> The reference name of the OID for
the profile. The reference names can be found in the <bf/util/
module of <bf/YAZ/.

<tag>attset <it/filename/</tag> The attribute set that is used for
indexing and searching records belonging to this profile.

<tag>tagset <it/filename/</tag> The tag set (if any) that describes
the fields of the records.

<tag>varset <it/filename/</tag> The variant set used in the profile.

<tag>maptab <it/filename/</tag> (repeatable) This points to a
conversion table that might be used if the client asks for the record
in a different schema from the native one.

<tag>marc <it/filename/</tag> Points to a file containing parameters
for representing the record contents in the ISO2709 syntax. Read the
description of the MARC representation facility below.

<tag>esetname <it/name filename/</tag> (repeatable) Associates the
given element set name with an element selection file. If an at sign
(@) is given in place of the filename, this corresponds to a null
mapping for the given element set name.

<tag>elm <it/path name attribute/</tag> (repeatable) Adds an element
to the abstract record syntax of the schema. The <it/path/ follows the
syntax which is suggested by the Z39.50 document - that is, a sequence
of tags separated by slashes (/). Each tag is given as a
comma-separated pair of tag type and tag value, surrounded by
parentheses. The <it/name/ is the name of the element, and the
<it/attribute/ specifies what attribute to use when indexing the
element. A ! in
place of the attribute name is equivalent to specifying an attribute
name identical to the element name. A - in place of the attribute name
specifies that no indexing is to take place for the given element.
</descrip>

<it>
NOTE: The mechanism for controlling indexing is inadequate for
complex databases, and will probably be moved into a separate
configuration table eventually.
</it>

The following is an excerpt from the abstract syntax file for the GILS
profile.

<tscreen><verb>
name gils
reference GILS-schema
attset gils.att
tagset gils.tag
varset var1.var

maptab gils-usmarc.map

# Element set names

esetname VARIANT gils-variant.est  # for WAIS-compliance
esetname B gils-b.est
esetname G gils-g.est
esetname W gils-b.est
esetname F @

elm (1,10)               rank                      -
elm (1,12)               url                       -
elm (1,14)               localControlNumber        Local-number
elm (1,16)               dateOfLastModification    Date/time-last-modified
elm (2,1)                Title                     !
elm (4,1)                controlIdentifier         Identifier-standard
elm (2,6)                abstract                  Abstract
elm (4,51)               purpose                   !
elm (4,52)               originator                -
elm (4,53)               accessConstraints         !
elm (4,54)               useConstraints            !
elm (4,70)               availability              -
elm (4,70)/(4,90)        distributor               -
elm (4,70)/(4,90)/(2,7)  distributorName           !
elm (4,70)/(4,90)/(2,10) distributorOrganization   !
elm (4,70)/(4,90)/(4,2)  distributorStreetAddress  !
elm (4,70)/(4,90)/(4,3)  distributorCity           !
</verb></tscreen>

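As a rough illustration of how an elm directive breaks down, the Python sketch below splits the path into (type, value) pairs and resolves the ! and - shorthands. The function names are hypothetical, and the sketch assumes that the three fields are separated by whitespace and that tag values contain neither slashes nor parentheses.

```python
import re

# One tag of a path: "(type,value)" with an integer type.
TAG = re.compile(r"\((\d+),([^()]+)\)")

def parse_path(path):
    """Split a path such as '(4,70)/(4,90)/(2,7)' into (type, value) pairs."""
    pairs = []
    for part in path.split("/"):
        match = TAG.fullmatch(part)
        if match is None:
            raise ValueError(f"malformed tag in path: {part!r}")
        pairs.append((int(match.group(1)), match.group(2)))
    return pairs

def parse_elm(line):
    """Parse one 'elm path name attribute' directive into a dict."""
    keyword, path, name, attribute = line.split()
    if keyword != "elm":
        raise ValueError("not an elm directive")
    if attribute == "!":
        attribute = name    # '!': attribute name is identical to the element name
    elif attribute == "-":
        attribute = None    # '-': the element is not indexed
    return {"path": parse_path(path), "name": name, "attribute": attribute}
```

Tag values are kept as strings here, since a path is not necessarily limited to numeric tags. A tag that lacks its closing parenthesis is rejected with an error rather than silently accepted.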
<sect1>The Attribute Set (.att) Files

<p>
This file type describes the <bf/Use/ elements of an attribute set.
It contains the following directives.

<descrip>

<tag>name <it/symbolic-name/</tag> This provides a shorthand name or
description for the attribute set. Mostly useful for diagnostic purposes.

<tag>reference <it/OID-name/</tag> The reference name of the OID for
the attribute set. The reference names can be found in the <bf/util/
module of <bf/YAZ/.

<tag>ordinal <it/integer/</tag> This value will be used to represent the
attribute set in the index. Care should be taken that each attribute
set has a unique ordinal value.

<tag>include <it/filename/</tag> This directive, which can be
repeated, is used to include another attribute set as a part of the
current one. This is used when a new attribute set is defined as an
extension to another set. For instance, many new attribute sets are
defined as extensions to the <bf/bib-1/ set. This is an important
feature of the retrieval system of Z39.50, because it ensures the
highest possible level of interoperability: those access points of
your database which are derived from the external set (say, bib-1)
can be used even by clients who are unaware of the new set.

<tag>att <it/att-value att-name [local-value]/</tag> This
repeatable directive
introduces a new attribute to the set. The attribute value is stored
in the index (unless a <it/local-value/ is given, in which case this
is stored instead). The name is used to refer to the attribute from the
<it/abstract syntax/.
</descrip>

This is an excerpt from the GILS attribute set definition. Notice how
the file describing the <it/bib-1/ attribute set is referenced.

<tscreen><verb>
name gils
reference GILS-attset
include bib1.att
ordinal 2

att 2001 distributorName
att 2002 indexTermsControlled
att 2003 purpose
att 2004 accessConstraints
att 2005 useConstraints
</verb></tscreen>

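The sketch below shows one way the directives of an attribute set file could be collected, including the rule that an optional local-value replaces the attribute value in the index. It is written under several assumptions of ours: the function name <tt/parse_attset/ is hypothetical, the local-value is taken to be an integer, # is taken to start a comment as in the .abs excerpt, and included files are only recorded, not read.

```python
def parse_attset(text):
    """Collect the directives of an attribute set file into a dict."""
    attset = {"includes": [], "attributes": {}}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # assumption: '#' starts a comment
        if not line:
            continue
        keyword, *args = line.split()
        if keyword in ("name", "reference"):
            attset[keyword] = args[0]
        elif keyword == "ordinal":
            attset["ordinal"] = int(args[0])
        elif keyword == "include":
            attset["includes"].append(args[0])  # recorded only, not loaded
        elif keyword == "att":
            value, name = int(args[0]), args[1]
            # The optional local-value is what gets stored in the index.
            stored = int(args[2]) if len(args) > 2 else value
            attset["attributes"][name] = {"value": value, "stored": stored}
        else:
            raise ValueError(f"unknown directive: {keyword}")
    return attset

GILS_ATT = """\
name gils
reference GILS-attset
include bib1.att
ordinal 2

att 2001 distributorName
att 2003 purpose
"""
gils = parse_attset(GILS_ATT)
```

For the excerpt, every attribute is stored under its own value because no local-value is given.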
<sect1>The Tag Set (.tag) Files

<p>
This file type defines the tag set of the profile, possibly by
referencing other tag sets (most tag sets, for instance, will include
tagsetG and tagsetM from the Z39.50 specification). The file may
contain the following directives.

<descrip>
<tag>name <it/symbolic-name/</tag> This provides a shorthand name or
description for the tag set. Mostly useful for diagnostic purposes.

<tag>reference <it/OID-name/</tag> The reference name of the OID for
the tag set. The reference names can be found in the <bf/util/
module of <bf/YAZ/.

<tag>type <it/integer/</tag> The type number of the tag within the schema
profile.

<tag>include <it/filename/</tag> (repeatable) This directive is used
to include the definitions of other tag sets into the current one.

<tag>tag <it/number names type/</tag> (repeatable) Introduces a new
tag to the set. The <it/number/ is the tag number as used in the protocol
(there is currently no mechanism for specifying string tags,
but this would be quick work to add). The <it/names/ parameter
is a list of names by which the tag should be recognized in the input
file format. The names should be separated by slashes (/). The
<it/type/ is the recommended datatype of the tag. It should be one of
the following:
<itemize>
<item>structured
<item>string
<item>numeric
<item>bool
<item>oid
<item>generalizedtime
<item>intunit
<item>int
<item>octetstring
<item>null
</itemize>
</descrip>

The following is an excerpt from the TagsetG definition file.

<tscreen><verb>
name tagsetg
reference TagsetG
type 2

tag 1  title            string
tag 2  author           string
tag 3  publicationPlace string
tag 4  publicationDate  string
tag 5  documentId       string
tag 6  abstract         string
tag 7  name             string
tag 8  date             generalizedtime
tag 9  bodyOfDisplay    string
tag 10 organization     string
</verb></tscreen>

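To show how the tag directive ties input-file names to protocol tags, here is a small Python sketch. The names <tt/parse_tag/ and <tt/name_index/ are hypothetical, as is the alias <tt/fullName/ used in the usage note. The sketch assumes whitespace-separated fields and exact, case-sensitive name matching, which may differ from the behaviour of the module.

```python
DATATYPES = {
    "structured", "string", "numeric", "bool", "oid",
    "generalizedtime", "intunit", "int", "octetstring", "null",
}

def parse_tag(line):
    """Parse one 'tag number names type' directive."""
    keyword, number, names, datatype = line.split()
    if keyword != "tag":
        raise ValueError("not a tag directive")
    if datatype not in DATATYPES:
        raise ValueError(f"unknown datatype: {datatype}")
    # Alternative names for the same tag are separated by slashes.
    return {"number": int(number), "names": names.split("/"), "type": datatype}

def name_index(lines, tag_type):
    """Map every input-file name to its (tag type, tag number) pair."""
    index = {}
    for line in lines:
        tag = parse_tag(line)
        for name in tag["names"]:
            index[name] = (tag_type, tag["number"])
    return index
```

With the type 2 tag set above, a directive such as <tt>tag 7 name/fullName string</tt> would let both spellings in an input file resolve to the tag (2,7).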
<sect1>The Variant Set (.var) Files

<p>
The variant set file is a straightforward representation of the
variant set definitions associated with the protocol. At present, only
the <it/Variant-1/ set is known.

These are the directives allowed in the file.

<descrip>
<tag>name <it/symbolic-name/</tag> This provides a shorthand name or
description for the variant set. Mostly useful for diagnostic purposes.

<tag>reference <it/OID-name/</tag> The reference name of the OID for
the variant set, if one is required. The reference names can be found
in the <bf/util/ module of <bf/YAZ/.

<tag>class <it/integer class-name/</tag> (repeatable) Introduces a new
class to the variant set.

<tag>type <it/integer type-name datatype/</tag> (repeatable) Adds a
new type to the current class (the one introduced by the most recent
<bf/class/ directive). The type names belong to the same name space as
the one used in the tag set definition file.
</descrip>

The following is an excerpt from the file describing the variant set
<it/Variant-1/.

<tscreen><verb>
name variant-1
reference Variant-1

class 1 variantId

  type 1 variantId octetstring

class 2 body

  type 1 iana   string
  type 2 z39.50 string
  type 3 other  string
</verb></tscreen>

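The rule that a type directive belongs to the most recent class directive can be sketched as follows. This Python fragment is an illustration only; the function name <tt/parse_varset/ and the resulting dictionary layout are our own assumptions.

```python
def parse_varset(text):
    """Group the type directives of a variant set file under their classes."""
    varset = {"classes": {}}
    current = None                      # the class opened most recently
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue
        keyword, *args = fields
        if keyword in ("name", "reference"):
            varset[keyword] = args[0]
        elif keyword == "class":
            current = {"name": args[1], "types": {}}
            varset["classes"][int(args[0])] = current
        elif keyword == "type":
            if current is None:
                raise ValueError("type directive before any class directive")
            current["types"][int(args[0])] = (args[1], args[2])
        else:
            raise ValueError(f"unknown directive: {keyword}")
    return varset

VARIANT_1 = """\
name variant-1
reference Variant-1

class 1 variantId
type 1 variantId octetstring

class 2 body
type 1 iana string
type 2 z39.50 string
type 3 other string
"""
variant1 = parse_varset(VARIANT_1)
```

Note that type numbers are only unique within a class: both class 1 and class 2 define a type 1.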
<sect1>The Element Set (.est) Files

<p>
The element set specification files describe a selection of a subset
of the elements of a database record. The element selection mechanism
is equivalent to the one supplied by the <it/Espec-1/ syntax of the
Z39.50 specification. In fact, the internal representation of an
element set specification is identical to the <it/Espec-1/ structure,
and we'll refer you to the description of that structure for most of
the detailed semantics of the directives below.

<it>
NOTE: Not all of the Espec-1 functionality has been implemented yet.
The fields that are mentioned below all work as expected, unless
otherwise noted.
</it>

The directives available in the element set file are as follows:

<descrip>
<tag>defaultVariantSetId <it/OID-name/</tag> If variants are used in
the following, this should provide the name of the variant set used
(it's not currently possible to specify a different set in the
individual variant request). In almost all cases (certainly all
profiles known to us), the name <tt/Variant-1/ should be given here.

<tag>defaultVariantRequest <it/variant-request/</tag> This directive
provides a default variant request for
use when the individual element requests (see below) do not contain a
variant request. Variant requests consist of a blank-separated list of
variant components. A variant component is a comma-separated,
parenthesized triple of variant class, type, and value (the two former
values being represented as integers). The value can currently only be
entered as a string (this will change to depend on the definition of
the variant in question). The special value (@) is interpreted as a
null value, however.

<tag>simpleElement <it/path ['variant' variant-request]/</tag>
This corresponds to a simple element request in <it/Espec-1/. The
path consists of a sequence of tag-selectors, where each of these can
consist of either:

<itemize>
<item>A simple tag, consisting of a comma-separated type-value pair in
parentheses, possibly followed by a colon (:) followed by an
occurrences-specification (see below). The tag-value can be a number
or a string. If the first character is an apostrophe ('), this forces
the value to be interpreted as a string, even if it appears to be numerical.

<item>A WildThing, represented as a question mark (?), possibly
followed by a colon (:) followed by an occurrences-specification (see
below).

<item>A WildPath, represented as an asterisk (*). Note that the last
element of the path should not be a WildPath (WildPaths don't work in
this version).
</itemize>

The occurrences-specification can be either the string <tt/all/, the
string <tt/last/, or an explicit value-range. The value-range is
represented as an integer (the starting point), possibly followed by a
plus (+) and a second integer (the number of elements, default being
one).

The variant-request has the same syntax as the defaultVariantRequest
above. Note that it may sometimes be useful to give an empty variant
request, simply to disable the default for a specific set of fields
(we aren't certain if this is proper <it/Espec-1/, but it works in
this implementation).
</descrip>

The following is an example of an element specification belonging to
the GILS profile.

<tscreen><verb>
simpleelement (1,10)
simpleelement (1,12)
simpleelement (2,1)
simpleelement (1,14)
simpleelement (4,1)
simpleelement (4,52)
</verb></tscreen>

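The occurrences-specification and the variant request are the two small sub-syntaxes of this file format, and the Python sketch below decodes both. The function names are hypothetical, and the sketch assumes that variant values contain no blanks or parentheses, a restriction this document does not spell out.

```python
import re

def parse_occurrences(spec):
    """Return 'all', 'last', or a (start, count) pair for a value-range."""
    if spec in ("all", "last"):
        return spec
    match = re.fullmatch(r"(\d+)(?:\+(\d+))?", spec)
    if match is None:
        raise ValueError(f"bad occurrences-specification: {spec!r}")
    start = int(match.group(1))
    count = int(match.group(2)) if match.group(2) else 1   # default is one element
    return (start, count)

# One variant component: "(class,type,value)" with integer class and type.
COMPONENT = re.compile(r"\((\d+),(\d+),([^()]*)\)")

def parse_variant_request(request):
    """Parse blank-separated (class,type,value) triples; '@' becomes None."""
    triples = []
    for part in request.split():
        match = COMPONENT.fullmatch(part)
        if match is None:
            raise ValueError(f"bad variant component: {part!r}")
        value = match.group(3)
        triples.append(
            (int(match.group(1)), int(match.group(2)), None if value == "@" else value)
        )
    return triples
```

An empty request string yields an empty list, which corresponds to the empty variant request that disables the default for a specific set of fields.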
<sect1>The Schema Mapping (.map) Files

<p>
Sometimes, the client might want to receive a database record in
a schema that differs from the native schema of the record. For
instance, a client might only know how to process WAIS records, while
the database record is represented in a more specific schema, such as
GILS. In this module, a mapping of data to one of the MARC formats is
also thought of as a schema mapping (mapping the elements of the
record into fields consistent with the given MARC specification, prior
to actually converting the data to the ISO2709 format). This use of the
object identifier for USMARC as a schema identifier represents an
overloading of the OID which might not be entirely proper. However,
it represents the dual role of schema and record syntax which
is assumed by the MARC family in Z39.50.

<it>
NOTE: The schema-mapping functions are so far limited to a
straightforward mapping of elements. This should be extended with
mechanisms for conversions of the element contents, and conditional
mappings of elements based on the record contents.
</it>

These are the directives of the schema mapping file format:

<descrip>
<tag>targetName <it/name/</tag> A symbolic name for the target schema
of the table. Useful mostly for diagnostic purposes.

<tag>targetRef <it/OID-name/</tag> An OID name for the target schema.
This is used, for instance, by a server receiving a request to present
a record in a different schema from the native one. The name, again,
is found in the <bf/oid/ module of <bf/YAZ/.

<tag>map <it/element-name target-path/</tag> (repeatable) Adds
an element mapping rule to the table.
</descrip>

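Since no excerpt of a mapping file is given here, the following Python sketch only shows the shape of the three directives. Everything in the sample is hypothetical: the target name, the OID name and in particular the target paths, which are written as opaque placeholders because their exact syntax is not described in this section. The sketch also matches directive names without regard to case, which is an assumption on our part.

```python
def parse_maptab(text):
    """Collect the directives of a schema mapping file into a dict."""
    table = {"targetName": None, "targetRef": None, "rules": []}
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue
        keyword, args = fields[0].lower(), fields[1:]   # assumption: case-insensitive
        if keyword == "targetname":
            table["targetName"] = args[0]
        elif keyword == "targetref":
            table["targetRef"] = args[0]
        elif keyword == "map":
            # The target path is kept as an opaque string.
            table["rules"].append((args[0], args[1]))
        else:
            raise ValueError(f"unknown directive: {fields[0]}")
    return table

# Hypothetical sample; the paths are placeholders, not real target-path syntax.
SAMPLE = """\
targetName usmarc
targetRef USMARC
map distributorName PLACEHOLDER-PATH-1
map purpose PLACEHOLDER-PATH-2
"""
table = parse_maptab(SAMPLE)
```

Each rule is kept in file order as an (element-name, target-path) pair, which is all a straightforward element-to-element mapping needs.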
<sect1>The MARC (ISO2709) Representation (.mar) Files

<p>
This file provides rules for representing a record in the ISO2709
format. The rules pertain mostly to the values of the constant-length
header of the record.

<it>NOTE: This will be described better.</it>

<sect>The Input (Data) File Format

<p>
The retrieval module is designed to manage data derived from a
variety of different input sources. When used on the client side, the
source format may be GRS-1 or ISO2709. On the server side, the source may
be a structured ASCII file, augmented by a set of patterns that
describe the structure of the document.

What we think of as the native source format - the one that is
guaranteed to provide complete access to the facilities of the module -
is an &dquot;SGML-like&dquot; syntax, based on an inferred DTD, which
is in turn based on the profile information from the various files
mentioned in this document.

Like SGML, an input record consists of tags and data. The tags are
enclosed by angle brackets (&lt;...&gt;). As a general rule, each tag should
be matched by a corresponding close tag, identified by the same tag
name preceded by a slash (/).

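The matching rule for open and close tags can be checked mechanically. The Python sketch below is a hypothetical helper, not part of the module: it reports problems instead of rejecting the record outright, since the rule is stated above only as a general one.

```python
import re

TAG = re.compile(r"<(/?)([^<>\s]+)>")

def unmatched_tags(text):
    """Return a list of problems found while pairing open and close tags."""
    problems, stack = [], []
    for closing, name in TAG.findall(text):
        if not closing:
            stack.append(name)                 # remember the open tag
        elif stack and stack[-1] == name:
            stack.pop()                        # properly nested close tag
        else:
            problems.append(f"unexpected </{name}>")
    # Anything left on the stack was never closed; report innermost first.
    problems.extend(f"unclosed <{name}>" for name in reversed(stack))
    return problems
```

A well-formed record produces an empty list. A record in which an inner element is left open is reported both at the offending close tag and for each tag that remains open.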
<sect>License

<p>
Copyright © 1995-2000, Index Data.

This is the Index Data &dquot;P&dquot; license - it applies exclusively to
the record management module of the YAZ system, and to this
document.

Permission to use, copy, modify, distribute, and sell this software and
its documentation, in whole or in part, for any purpose, is hereby granted,
provided that:

1. This copyright and permission notice appear in all copies of the
software and its documentation. Notices of copyright or attribution
which appear at the beginning of any file must remain unchanged.

2. The names of Index Data or the individual authors may not be used to
endorse or promote products derived from this software without specific
prior written permission.

THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT WARRANTY OF ANY KIND,
EXPRESS, IMPLIED, OR OTHERWISE, INCLUDING WITHOUT LIMITATION, ANY
WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
IN NO EVENT SHALL INDEX DATA BE LIABLE FOR ANY SPECIAL, INCIDENTAL,
INDIRECT OR CONSEQUENTIAL DAMAGES OF ANY KIND, OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER OR
NOT ADVISED OF THE POSSIBILITY OF DAMAGE, AND ON ANY THEORY OF
LIABILITY, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE
OF THIS SOFTWARE.

<sect>About Index Data

<p>
Index Data is a consulting and software-development enterprise that
specialises in library and information management systems. Our
interests and expertise span a broad range of related fields, and one
of our primary, long-term objectives is the development of a powerful
information management
system with open network interfaces and hypermedia capabilities.

We make this software available free of charge, under a fairly unrestrictive
license, as a service to the networking community, and to further the
development of quality software for open network communication.

We'll be happy to answer questions about the software, and about ourselves
in general.

<tscreen>
Index Data
Ryesgade 3
DK-2200 Copenhagen N
</tscreen>

<p>
<tscreen><verb>
Phone: +45 3536 3672
Fax  : +45 3536 0449
Email: [email protected]
</verb></tscreen>

</article>