source: trunk/gsdl-documentation/manuals/xml-source/en/User_en.xml@ 13781

1<?xml version="1.0" encoding="UTF-8"?>
2<Manual id="User" lang="en">
3<Heading>
4<Text id="1">GREENSTONE DIGITAL LIBRARY</Text>
5</Heading>
6<Title>
7<Text id="2">USER'S GUIDE</Text>
8</Title>
9<Author>
10<Text id="3">Ian H. Witten, Stefan Boddie and John Thompson</Text>
11</Author>
12<Affiliation>
13<Text id="4">Department of Computer Science <br/>University of Waikato, New Zealand</Text>
14</Affiliation>
15<SupplementaryText>
16<Text id="manual_index">Back to manual index</Text>
17<Text id="top_index">Back to top index</Text>
18</SupplementaryText>
19<Text id="5">Greenstone is a suite of software for building and distributing digital library collections. It provides a new way of organizing information and publishing it on the Internet or on CD-ROM. Greenstone is produced by the New Zealand Digital Library Project at the University of Waikato, and developed and distributed in cooperation with UNESCO and the Human Info NGO. It is open-source software, available from <i>http://greenstone.org</i> under the terms of the GNU General Public License.</Text>
20<Comment>
21<Text id="6">We want to ensure that this software works well for you. Please report any problems to <i>[email protected]</i></Text>
22</Comment>
23<Version>
24<Text id="7">Greenstone gsdl-2.70</Text>
25</Version>
26<Date>
27<Text id="8">March 2006</Text>
28</Date>
29<Section id="about_this_manual">
30<Title>
31<Text id="9">About this manual</Text>
32</Title>
33<Content>
34<Text id="10">This manual provides a comprehensive description of how to use the Greenstone software for accessing and building digital library collections.</Text>
35<Text id="11">Section <CrossRef target="Chapter" ref="overview_of_greenstone"/> gives an overview of the capabilities of the software. Section <CrossRef target="Chapter" ref="using_greenstone_collections"/> explains how to use Greenstone collections. The interface is self-explanatory (the best way to learn is by doing), and this section comprises the on-line help information for a typical collection. Section <CrossRef target="Chapter" ref="making_greenstone_collections"/> explains how to build your own library collections using the Greenstone Librarian Interface. Section <CrossRef target="Chapter" ref="administration"/> introduces the administration facility that allows the system administrator to monitor what is going on and control who can build collections.</Text>
36<Text id="12">Appendices list the features of the Greenstone software, and give a glossary of terms used throughout the Greenstone documentation.</Text>
37</Content>
38</Section>
39<Section id="companion_documents">
40<Title>
41<Text id="13">Companion documents</Text>
42</Title>
43<Content>
44<Text id="14">The complete set of Greenstone documents includes four volumes:</Text>
45<BulletList>
46<Bullet>
47<Text id="15">Greenstone Digital Library Installer's Guide</Text>
48</Bullet>
49<Bullet>
50<Text id="16">Greenstone Digital Library User's Guide <i>(this document)</i></Text>
51</Bullet>
52<Bullet>
53<Text id="17">Greenstone Digital Library Developer's Guide</Text>
54</Bullet>
55<Bullet>
56<Text id="18">Greenstone Digital Library: From Paper to Collection</Text>
57</Bullet>
58</BulletList>
59</Content>
60</Section>
61<Section id="acknowledgements">
62<Title>
63<Text id="19">Acknowledgements</Text>
64</Title>
65<Content>
66<Text id="20">The Greenstone software is a collaborative effort between many people. Rodger McNab and Stefan Boddie are the principal architects and implementors. Contributions have been made by David Bainbridge, George Buchanan, Hong Chen, Michael Dewsnip, Katherine Don, Elke Duncker, Carl Gutwin, Geoff Holmes, Dana McKay, John McPherson, Craig Nevill-Manning, Dynal Patel, Gordon Paynter, Bernhard Pfahringer, Todd Reed, Bill Rogers, John Thompson, and Stuart Yeates. Other members of the New Zealand Digital Library project provided advice and inspiration in the design of the system: Mark Apperley, Sally Jo Cunningham, Matt Jones, Steve Jones, Te Taka Keegan, Michel Loots, Malika Mahoui, Gary Marsden, Dave Nichols and Lloyd Smith. We would also like to acknowledge all those who have contributed to the GNU-licensed packages included in this distribution: MG, GDBM, PDFTOHTML, PERL, WGET, WVWARE and XLHTML.</Text>
67</Content>
68</Section>
69<Chapter id="overview_of_greenstone">
70<Title>
71<Text id="21">Overview of Greenstone</Text>
72</Title>
73<Content>
74<Text id="22">Greenstone is a comprehensive system for constructing and presenting collections of thousands or millions of documents, including text, images, audio and video.</Text>
75<Section id="collections">
76<Title>
77<Text id="23">Collections</Text>
78</Title>
79<Content>
80<Text id="24">A typical digital library built with Greenstone will contain many collections, individually organized—though they bear a strong family resemblance. Easily maintained, collections can be augmented and rebuilt automatically.</Text>
81<Text id="25">There are several ways to find information in most Greenstone collections. For example, you can <i>search for particular words</i> that appear in the text, or within a section of a document. You can <i>browse documents by title</i>: just click on a book to read it. You can <i>browse documents by subject</i>. Subjects are represented by bookshelves: just click on a bookshelf to look at the books. Where appropriate, documents come complete with a table of contents: you can click on a chapter or subsection to open it, expand the full table of contents, or expand the full document into your browser window (useful for printing). The New Zealand Digital Library website (<i>nzdl.org</i>) provides numerous example collections.</Text>
82<Text id="26">On the front page of each collection is a statement of its purpose and coverage, and an explanation of how the collection is organized. Most collections can be accessed by both <i>searching</i> and <i>browsing</i>. When searching, the Greenstone software looks through the entire text of all documents in the collection (this is called “full-text search”). In most collections the user can choose between indexes built from different parts of the documents. Some collections have an index of full documents, an index of paragraphs, and an index of titles, each of which can be searched for particular words or phrases. Using these you can find all documents that contain a particular set of words (the words may be scattered far and wide throughout the document), or all paragraphs that contain the set of words (which must all appear in the same paragraph), or all documents whose titles contain the words (the words must all appear in the document's title). There might be other indexes, perhaps an index of sections, and an index of section headings. Browsing involves lists that the user can examine: lists of authors, lists of titles, lists of dates, hierarchical classification structures, and so on. Different collections offer different browsing facilities.</Text>
83</Content>
84</Section>
85<Section id="finding_information">
86<Title>
87<Text id="27">Finding information</Text>
88</Title>
89<Content>
90<Text id="28">Greenstone constructs full-text indexes from the document text—that is, indexes that enable searching on any words in the full text of the document. Indexes can be searched for particular words, combinations of words, or phrases, and results are ordered according to how relevant they are to the query.</Text>
91<Text id="29">In most collections, descriptive data such as author, title, date, keywords, and so on, is associated with each document. This information is called <i>metadata</i>. Many document collections also contain full-text indexes of certain kinds of metadata. For example, many collections have a searchable index of document titles.</Text>
92<Text id="30">Users can browse interactively around lists, and hierarchical structures, that are generated from the metadata that is associated with each document in the collection. Metadata forms the raw material for browsing. It must be provided explicitly or be derivable automatically from the documents themselves. Different collections offer different searching and browsing facilities. Indexes for both searching and browsing are constructed during a “building” process, according to information in a collection configuration file.</Text>
93<Text id="31">Greenstone creates all index structures automatically from the documents and supporting files: nothing is done manually. If new documents in the same format become available, they can be merged into the collection automatically. Indeed, for many collections this is done by processes that awake regularly, scout for new material, and rebuild the indexes, all without manual intervention.</Text>
94</Content>
95</Section>
96<Section id="document_formats">
97<Title>
98<Text id="32">Document formats</Text>
99</Title>
100<Content>
101<Text id="33">Source documents come in a variety of formats, and are converted into a standard XML form for indexing by “plugins.” Plugins distributed with Greenstone process plain text, HTML, Word and PDF documents, and Usenet and E-mail messages. New ones can be written for different document types (to do this you need to study the <i>Greenstone Digital Library Developer's Guide</i>). To build browsing structures from metadata, an analogous scheme of “classifiers” is used. These create browsing indexes of various kinds: scrollable lists, alphabetic selectors, dates, and arbitrary hierarchies. Again, Greenstone programmers can create new browsing structures.</Text>
102</Content>
103</Section>
104<Section id="multimedia_and_multilingual_documents">
105<Title>
106<Text id="34">Multimedia and multilingual documents</Text>
107</Title>
108<Content>
109<Text id="35">Collections can contain text, pictures, audio and video. Non-textual material is either linked into the textual documents or accompanied by textual descriptions (such as figure captions) to allow full-text searching and browsing.</Text>
110<Text id="36">Unicode, which is a standard scheme for representing the character sets used in the world's languages, is used throughout Greenstone. This allows any language to be processed and displayed in a consistent manner. Collections have been built containing Arabic, Chinese, English, French, Māori and Spanish. Multilingual collections embody automatic language recognition, and the interface is available in all the above languages (and more).</Text>
112</Content>
113</Section>
114<Section id="distributing_greenstone">
115<Title>
116<Text id="37">Distributing Greenstone</Text>
117</Title>
118<Content>
119<Text id="38">Collections are accessed over the Internet or published, in precisely the same form, on a self-installing Windows CD-ROM. Compression is used to compact the text and indexes. A CORBA protocol supports distributed collections and graphical query interfaces.</Text>
120<Text id="39">The New Zealand Digital Library (<i>nzdl.org</i>) provides many example collections, including historical documents, humanitarian and development information, technical reports and bibliographies, literary works, and magazines.</Text>
121<Text id="40">Being open source, Greenstone is readily extensible, and benefits from the inclusion of GNU-licensed modules for full-text retrieval, database management, and text extraction from proprietary document formats. Only through international cooperative efforts will digital library software become sufficiently comprehensive to meet the world's needs with the richness and flexibility that users deserve.</Text>
122</Content>
123</Section>
124</Content>
125</Chapter>
126<Chapter id="using_greenstone_collections">
127<Title>
128<Text id="41">Using Greenstone Collections</Text>
129</Title>
130<Content>
131<Text id="42">The Greenstone software is designed to be easy to use. Web-based and CD-ROM collections have interfaces that are identical. Installing the Greenstone software from CD-ROM on any Windows or Linux computer is very easy indeed; a standard installation setup program is used in conjunction with pre-compiled binaries. A collection can be used locally on the computer where it is installed; also, if this computer is connected to a network, the software automatically and transparently allows all other computers on the network to access the same collection.</Text>
132<Text id="43">The next section describes how to install a Greenstone CD-ROM. Then we look at the searching and browsing facilities offered by a typical Greenstone collection, the “Demo” collection that is supplied with the Greenstone software. Other collections offer similar facilities; if you can use one, you can use them all. The following section explains how to customize the interface for your own requirements using the Preferences page.</Text>
133<Section id="using_a_greenstone_cd-rom">
134<Title>
135<Text id="44">Using a Greenstone CD-ROM</Text>
136</Title>
137<Content>
138<Text id="45">The Greenstone digital library software itself comes on a CD-ROM, and you or your system manager have probably installed it on your system, following the instructions in the <i>Greenstone Digital Library Installer's Guide.</i> If so, Greenstone is already installed on your computer and you should skip the rest of this section.</Text>
139<Text id="46">Some Greenstone collections come on a self-contained Greenstone CD-ROM that includes enough of the software to run just that collection. To use it, simply put it into the CD-ROM drive on any Windows PC. Most likely (if “autorun” is enabled on your PC), a window will appear inviting you to install the Greenstone software. If not, find the CD-ROM disk drive (on current Windows systems you can get this by clicking on the <i>My Computer</i> icon on the desktop) and double-click it, then double-click the <i>Setup.exe</i> file inside it. The Greenstone <i>Setup</i> program will start and guide you through the setup procedure. Most people respond <i>yes</i> to all the questions.</Text>
140<Text id="47">When the installation procedure has finished, you'll find the library in the <i>Programs</i> submenu of the Windows <i>Start</i> menu, under the name of the collection (for example, “Development Library” or “United Nations University”).</Text>
141<Text id="48">Once the software has been installed, the library will start automatically every time you re-insert the CD-ROM, provided autorun is enabled.</Text>
142</Content>
143</Section>
144<Section id="finding_information_1">
145<Title>
146<Text id="49">Finding information</Text>
147</Title>
148<Content>
149<Text id="50">The easiest way to learn how to use a Greenstone collection is to try it out. Don't worry—you can't break anything. Click liberally: most images that appear on the screen are clickable. If you hold the mouse stationary over an image, most browsers will soon pop up a message that tells you what will happen if you click.</Text>
150<Text id="51">Experiment! Choose common words like “the” and “and” to search for—that should evoke some responses, and nothing will break.</Text>
151<Text id="52">Greenstone digital library systems usually comprise several separate collections—for example, computer science technical reports, literary works, internet FAQs, magazines. There will be a home page for the digital library system which allows you to access any publicly-accessible collection; in addition, each collection has its own “about” page that gives you information about how the collection is organized and the principles governing what is included in it. To get back to the “about” page at any time, just click on the “collection” icon that appears at the top left side of all searching and browsing pages.</Text>
152<Text id="53">Figure 1 shows a screenshot of the “Demo” collection supplied with the Greenstone software, which is a very small subset of the Development Library collection; we will use it as an example to describe the different ways of finding information. (If you can't find the Demo collection, use the Development Library instead; it looks just the same.) First, almost all icons are clickable. Several icons appear at the top of almost every page; Table 1 shows you what they mean.</Text>
153<Figure id="using_the_demo_collection">
154<Title>
155<Text id="54">Using the Demo collection</Text>
156</Title>
157<File width="397" height="286" url="images/User_Fig_1.png"/>
158</Figure>
159<Table id="what_the_icons_at_the_top_of_each_page_mean">
160<Title>
161<Text id="55">What the icons at the top of each page mean</Text>
162</Title>
163<TableContent>
164<tr>
165<th width="132">
166<File width="68" height="32" url="images/User_Icon_1.png"/>
167</th>
168<th width="397">
169<Text id="56">This takes you to the “about” page</Text>
170</th>
171</tr>
172<tr>
173<th width="132">
174<File width="36" height="16" url="images/User_Icon_2.png"/>
175</th>
176<th width="397">
177<Text id="57">This takes you to the Digital Library's home page, from which you can select another collection</Text>
178</th>
179</tr>
180<tr>
181<th width="132">
182<File width="29" height="16" url="images/User_Icon_3.png"/>
183</th>
184<th width="397">
185<Text id="58">This provides help text similar to what you are reading now</Text>
186</th>
187</tr>
188<tr>
189<th width="132">
190<File width="70" height="16" url="images/User_Icon_4.png"/>
191</th>
192<th width="397">
193<Text id="59">This allows you to set some user interface and searching options that will then be used henceforth</Text>
194</th>
195</tr>
196</TableContent>
197</Table>
198<Text id="60">The “<i>search subjects titles a-z organization how to</i>” bar underneath gives access to the searching and browsing facilities. The leftmost button is for searching, and the ones to the right of it (four, in this collection) evoke different browsing facilities. These last four may differ from one collection to another.</Text>
199<Subsection id="how_to_find_information">
200<Title>
201<Text id="61">How to find information</Text>
202</Title>
203<Content>
204<Text id="62">Table 2 shows the five ways to find information in the Demo collection.</Text>
205<Table id="table_icons_on_the_search_browse_bar">
206<Title>
207<Text id="63">What the icons on the search/browse bar mean</Text>
208</Title>
209<TableContent>
210<tr>
211<th width="123">
212<File width="61" height="11" url="images/User_Icon_5.png"/>
213</th>
214<th width="407">
215<Text id="64">Search for particular words</Text>
216</th>
217</tr>
218<tr>
219<th width="123">
220<File width="58" height="11" url="images/User_Icon_6.png"/>
221</th>
222<th width="407">
223<Text id="65">Access publications by subject</Text>
224</th>
225</tr>
226<tr>
227<th width="123">
228<File width="58" height="11" url="images/User_Icon_7.png"/>
229</th>
230<th width="407">
231<Text id="66">Access publications by title</Text>
232</th>
233</tr>
234<tr>
235<th width="123">
236<File width="71" height="11" url="images/User_Icon_8.png"/>
237</th>
238<th width="407">
239<Text id="67">Access publications by organization</Text>
240</th>
241</tr>
242<tr>
243<th width="123">
244<File width="58" height="11" url="images/User_Icon_9.png"/>
245</th>
246<th width="407">
247<Text id="68">Access publications by “how to” listing</Text>
248</th>
249</tr>
250</TableContent>
251</Table>
252<Text id="69">You can <i>search for particular words</i> that appear in the text from the “search” page. (This is just like the “about” page shown in Figure 1, except that it doesn't contain the <i>about this collection</i> text.) The search page can be reached from other pages by pressing the <i>search</i> button. You can <i>access publications by subject</i> by pressing the <i>subjects</i> button. This brings up a list of subjects, represented by bookshelves that can be further expanded by clicking on them. You can <i>access publications by title</i> by pressing the <i>titles a-z</i> button. This brings up a list of books in alphabetic order. You can <i>access publications by organization</i> by pressing the <i>organization</i> button. This brings up a list of organizations. You can <i>access publications by how to listing</i> by pressing the <i>how to</i> button. This brings up a list of “how to” hints. All these buttons are visible in Figure 1.</Text>
253</Content>
254</Subsection>
255<Subsection id="how_to_read_the_documents">
256<Title>
257<Text id="70">How to read the documents</Text>
258</Title>
259<Content>
260<Text id="71">In the Demo collection, you can tell when you have arrived at an individual book because there is a photograph of its front cover (Figure 2). Beside the photograph is a table of contents: the entry in bold face marks where you are, in this case <i>Introduction and Summary</i> —Section 1 of the chosen book. This table is expandable: click on the folders to open them or close them. Click on the open book at the top to close it.</Text>
261<Text id="72">Underneath is the text of the current section (“The international demand for tropical butterflies” in the example, beginning at the very bottom of the illustration). When you have read through it, there are arrows at the end to take you on to the next section or back to the previous one.</Text>
262<Text id="73">Below the photograph are four buttons. Click on <i>detach</i> to make a new browser window for this book. (This is useful if you want to compare books, or read two at once.) If you have reached this book through a search, the search terms will be highlighted: the <i>no highlighting</i> button turns this off. Click on <i>expand text</i> to expand out the whole text of the current section, or book. Click on <i>expand contents</i> to expand out the whole table of contents so that you can see the titles of all chapters and subsections.</Text>
263<Text id="74">In some collections, the documents do not have this kind of hierarchical structure. In this case, no table of contents is displayed when you get to an individual document—just the document text. In some cases, the document is split into pages, and you can read sequentially or jump about from one page to another.</Text>
264<Figure id="a_book_in_the_demo_collection">
265<Title>
266<Text id="75">A book in the Demo collection</Text>
267</Title>
268<File width="397" height="476" url="images/User_Fig_2.png"/>
269</Figure>
270</Content>
271</Subsection>
272<Subsection id="what_the_icons_mean">
273<Title>
274<Text id="76">What the icons mean</Text>
275</Title>
276<Content>
277<Text id="77">When you are browsing around the collection, you will encounter the items shown in Table 3.</Text>
278</Content>
279</Subsection>
280<Subsection id="how_to_search_for_particular_words">
281<Title>
282<Text id="78">How to search for particular words</Text>
283</Title>
284<Content>
285<Text id="79">From the search page, follow these simple steps to make a query:</Text>
286<BulletList>
287<Bullet>
288<Text id="80">Specify what units you want to search: in the Demo collection you can search section titles or the full text of the books.</Text>
289</Bullet>
290<Bullet>
291<Text id="81">Say whether you want to search for all or just some of the words</Text>
292</Bullet>
293<Bullet>
294<Text id="82">Type the words you want to search for into the query box</Text>
295</Bullet>
296<Bullet>
297<Text id="83">Click the <i>Begin Search</i> button</Text>
298</Bullet>
299</BulletList>
300<Text id="84">When you make a query, the titles of up to twenty matching documents will be shown. There is a button at the end to take you on to the next twenty. From there you will find buttons to take you on to the third twenty or back to the first twenty, and so on. However, for efficiency reasons a maximum of 100 is imposed on the number of documents returned. You can change these numbers by clicking the <i>preferences</i> button at the top of the page.</Text>
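<Text>The paging behaviour described above can be sketched in a few lines of Python. This is an illustration only and is not taken from the Greenstone source: the function name <i>page_of_results</i> is hypothetical, and the two constants simply restate the default numbers quoted in the text.</Text>

```python
# Illustrative sketch only; not part of Greenstone.
PAGE_SIZE = 20      # titles shown per page (default quoted in the text)
MAX_RESULTS = 100   # cap on the number of documents returned (default quoted in the text)


def page_of_results(matches, page, page_size=PAGE_SIZE, max_results=MAX_RESULTS):
    """Return the titles shown on the given results page (pages start at 1)."""
    if not page >= 1:
        raise ValueError("page numbers start at 1")
    # Documents beyond the cap are never shown, however many match.
    capped = matches[:max_results]
    start = (page - 1) * page_size
    return capped[start:start + page_size]


if __name__ == "__main__":
    titles = ["Document %d" % i for i in range(1, 131)]  # 130 hypothetical matches
    print(page_of_results(titles, 1)[0])     # first title on the first page
    print(len(page_of_results(titles, 6)))   # nothing is left after the cap
```

<Text>With these defaults a query can fill at most five pages. Changing the two numbers on the Preferences page corresponds to passing different values for <i>page_size</i> and <i>max_results</i> in this sketch.</Text>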
301<Table id="table_icons_that_you_will_encounter_when_browsing">
302<Title>
303<Text id="85">Icons that you will encounter when browsing</Text>
304</Title>
305<TableContent>
306<tr>
307<th width="123">
308<File width="19" height="12" url="images/User_Icon_10.png"/>
309</th>
310<th width="407">
311<Text id="86">Click on a book icon to read the corresponding book</Text>
312</th>
313</tr>
314<tr>
315<th width="123">
316<File width="22" height="17" url="images/User_Icon_11.png"/>
317</th>
318<th width="407">
319<Text id="87">Click on a bookshelf icon to look at books on that subject</Text>
320</th>
321</tr>
322<tr>
323<th width="123">
324<File width="25" height="16" url="images/User_Icon_12.png"/>
325</th>
326<th width="407">
327<Text id="88">View this document</Text>
328</th>
329</tr>
330<tr>
331<th width="123">
332<File width="25" height="16" url="images/User_Icon_13.png"/>
333</th>
334<th width="407">
335<Text id="89">Open this folder and view contents</Text>
336</th>
337</tr>
338<tr>
339<th width="123">
340<File width="29" height="25" url="images/User_Icon_14.png"/>
341</th>
342<th width="407">
343<Text id="90">Click on this icon to close the book</Text>
344</th>
345</tr>
346<tr>
347<th width="123">
348<File width="25" height="17" url="images/User_Icon_15.png"/>
349</th>
350<th width="407">
351<Text id="91">Click on this icon to close the folder</Text>
352</th>
353</tr>
354<tr>
355<th width="123">
356<File width="32" height="17" url="images/User_Icon_16.png"/>
357</th>
358<th width="407">
359<Text id="92">Click on the arrow to go on to the next section ...</Text>
360</th>
361</tr>
362<tr>
363<th width="123">
364<File width="32" height="17" url="images/User_Icon_17.png"/>
365</th>
366<th width="407">
367<Text id="93">... or back to the previous section</Text>
368</th>
369</tr>
370<tr>
371<th width="123">
372<File width="49" height="31" url="images/User_Icon_18.png"/>
373</th>
374<th width="407">
375<Text id="94">Open this page in a new window</Text>
376</th>
377</tr>
378<tr>
379<th width="123">
380<File width="61" height="32" url="images/User_Icon_19.png"/>
381</th>
382<th width="407">
383<Text id="95">Expand table of contents</Text>
384</th>
385</tr>
386<tr>
387<th width="123">
388<File width="48" height="32" url="images/User_Icon_20.png"/>
389</th>
390<th width="407">
391<Text id="96">Display all text</Text>
392</th>
393</tr>
394<tr>
395<th width="123">
396<File width="55" height="31" url="images/User_Icon_21.png"/>
397</th>
398<th width="407">
399<Text id="97">Highlight search terms</Text>
400</th>
401</tr>
402</TableContent>
403</Table>
404<Text id="98">Click the title of any document, or the little icon beside it, to open it. The icon may show a book, or a folder, or a page: it will be a book icon if you are searching books; otherwise if you are searching sections it will be a folder or page icon depending on whether or not the section found has subsections.</Text>
405<Part id="search_terms">
406<Title>
407<Text id="99"><b><i>Search terms</i></b></Text>
408</Title>
409<Content>
410<Text id="100">Whatever you type into the query box is interpreted as a list of words called “search terms.” Each search term contains nothing but alphabetic characters and digits. Terms are separated by white space. Any other characters, such as punctuation, separate terms just as though they were spaces and are otherwise ignored, so you cannot search for words that include punctuation.</Text>
411<Text id="101">For example, the query</Text>
412<Text type="code" id="102">Agro-forestry in the Pacific Islands: Systems for Sustainability (1993)</Text>
413<Text id="103">will be treated the same as</Text>
414<Text type="code" id="104">Agro forestry in the Pacific Islands Systems for Sustainability 1993</Text>
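<Text>The rule can be sketched as follows. This is a minimal Python illustration, not Greenstone's own code: the function name <i>search_terms</i> is hypothetical, and treating every Unicode letter or digit as part of a term is an assumption.</Text>

```python
import re


def search_terms(query):
    """Split a query into search terms made only of letters and digits.

    Illustrative sketch only. The pattern [^\\W_]+ matches runs of letters
    and digits; every other character acts as a separator and is dropped.
    """
    return re.findall(r"[^\W_]+", query)


if __name__ == "__main__":
    typed = "Agro-forestry in the Pacific Islands: Systems for Sustainability (1993)"
    plain = "Agro forestry in the Pacific Islands Systems for Sustainability 1993"
    print(search_terms(typed))
    print(search_terms(typed) == search_terms(plain))
```

<Text>Both example queries produce the same list of ten terms, which is why they are treated identically.</Text>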
415</Content>
416</Part>
417<Part id="query_type">
418<Title>
419<Text id="105"><b><i>Query type</i></b></Text>
420</Title>
421<Content>
422<Text id="106">There are two different kinds of query.</Text>
423<BulletList>
424<Bullet>
425<Text id="107">Queries for all the words. These look for documents (or chapters, or titles) that contain all the words you have specified. Documents that satisfy the query are displayed.</Text>
426</Bullet>
427<Bullet>
428<Text id="108">Queries for some of the words. Just list some terms that are likely to appear in the documents you are looking for. Documents are displayed in order of how closely they match the query. When determining the degree of match,</Text>
429<BulletList>
430<Bullet>
431<Text id="109">the more search terms a document contains, the closer it matches;</Text>
432</Bullet>
433<Bullet>
434<Text id="110">rare terms are more important than common ones;</Text>
435</Bullet>
436<Bullet>
437<Text id="111">short documents match better than long ones.</Text>
438</Bullet>
439</BulletList>
440</Bullet>
441</BulletList>
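<Text>To make the three criteria for a <i>some</i> query concrete, here is a toy scoring function in Python. It is not the formula Greenstone uses; the function name <i>rank</i>, the logarithmic weight and the square-root length penalty are all assumptions chosen only to show the three effects.</Text>

```python
import math


def rank(query_terms, documents):
    """Order documents by a toy relevance score, best match first.

    documents maps a document name to its list of (lowercase) words.
    Illustrative sketch only; documents containing no query term are omitted.
    """
    n_docs = len(documents)
    wanted = set(query_terms)
    # Document frequency: the number of documents in which each term occurs.
    doc_freq = {
        term: sum(1 for words in documents.values() if term in words)
        for term in wanted
    }
    scores = {}
    for name, words in documents.items():
        score = 0.0
        for term in wanted:
            if term in words:
                # Each matching term adds to the score; rare terms add more.
                score += math.log(1 + n_docs / doc_freq[term])
        if score > 0:
            # Dividing by a measure of length favours short documents.
            scores[name] = score / math.sqrt(len(words))
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    docs = {
        "short, both terms": ["butterfly", "farming", "in", "papua"],
        "long, one term": ["butterfly"] + ["the"] * 15,
        "long, both terms": ["farming", "butterfly"] + ["the"] * 14,
        "no terms": ["the", "cat"],
    }
    print(rank(["butterfly", "farming"], docs))
```

<Text>In this hypothetical data the short document containing both terms comes first, the long document containing both comes second, and the long document containing only one term comes last.</Text>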
442</Content>
443</Part>
444<Text id="112">Use as many search terms as you like—a whole sentence, or even a whole paragraph. If you specify only one term, it doesn't much matter whether you use an <i>all</i> or a <i>some</i> query, except that in the second case the results will be sorted by the search term's frequency of occurrence.</Text>
445</Content>
446</Subsection>
447<Subsection id="scope_of_queries">
448<Title>
449<Text id="113">Scope of queries</Text>
450</Title>
451<Content>
452<Text id="114">In most collections you can choose different indexes to search. For example, there might be author or title indexes. Or there might be chapter or paragraph indexes. Generally, the full matching document is returned regardless of which index you search.</Text>
453<Text id="115">If documents are books, they will be opened at the appropriate place.</Text>
454</Content>
455</Subsection>
456<Subsection id="advanced_search_features">
457<Title>
458<Text id="116">Advanced search features</Text>
459</Title>
460<Content>
461<Text id="117">While the above is enough to meet most searching needs, some more advanced search features are provided. These are activated from the Preferences page, which is reached by clicking the <i>preferences</i> button at the top of the page—see Section <CrossRef target="Section" ref="changing_the_preferences"/> below. After changing your preferences, do not click your browser's <i>Back</i> button—that would undo the changes. Instead, click any of the buttons on the search/browse bar.</Text>
462<Part id="case_sensitivity_and_stemming">
463<Title>
464<Text id="118"><i><b>Case sensitivity and stemming</b></i></Text>
465</Title>
466<Content>
467<Text id="119">When you specify search terms, you can choose whether upper and lower case must match between the query and the document: this is called “case sensitivity.” You can also choose whether to ignore word endings or not: this is called “stemming.”</Text>
468<Text id="120">Under <i>Search options</i> on the Preferences page you will see a pair of buttons labeled <i>ignore case differences</i> and <i>upper/lower case must match</i>; these control the case sensitivity of your queries. Below is a pair of buttons labeled <i>ignore word endings</i> and <i>whole word must match</i>: these control stemming.</Text>
469<Text id="121">For example, if the buttons <i>ignore case differences</i> and <i>ignore word endings</i> are selected, the query</Text>
470<Text type="code" id="122">African building</Text>
471<Text id="123">will be treated the same as</Text>
472<Text type="code" id="124">africa builds</Text>
473<Text id="125">because the uppercase letter in “African” will be transformed to lowercase, and the suffixes “n” and “ing” will be removed from “African” and “building” respectively (also, “s” would be removed from “builds”).</Text>
474<Text id="126">Generally case differences and word endings should be ignored unless you are querying for particular names or acronyms.</Text>
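<Text>The effect of the two pairs of buttons can be imitated with a toy normalization function in Python. This is a sketch under stated assumptions, not Greenstone's stemmer: the function name <i>normalize</i> is hypothetical, and the three-entry suffix list was chosen only to reproduce the example above.</Text>

```python
# Illustrative sketch only. A real stemmer uses far more elaborate rules;
# this toy suffix list exists purely to reproduce the example in the text.
SUFFIXES = ("ing", "s", "n")


def normalize(term, ignore_case=True, ignore_endings=True):
    """Fold case and strip one word ending, as selected by the two options."""
    if ignore_case:
        term = term.lower()
    if ignore_endings:
        for suffix in SUFFIXES:
            # Keep at least three letters so that short words survive intact.
            if term.endswith(suffix) and len(term) - len(suffix) >= 3:
                return term[:-len(suffix)]
    return term


if __name__ == "__main__":
    first = [normalize(t) for t in "African building".split()]
    second = [normalize(t) for t in "africa builds".split()]
    print(first, second, first == second)
```

<Text>With both options on, the two example queries reduce to the same pair of terms. With <i>ignore_case</i> turned off, “African” and “africa” stay distinct, which is the behaviour wanted when searching for names or acronyms.</Text>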
475</Content>
476</Part>
477<Part id="phrase_searching">
478<Title>
479<Text id="127"><b><i>Phrase searching</i></b></Text>
480</Title>
481<Content>
482<Text id="128">If your query includes a phrase in quotation marks, only documents containing that phrase, exactly as typed, will be returned.</Text>
483<Text id="129">If you want to use phrase searching, you need to learn a little about how it works. Phrases are processed by a post-retrieval scan. First the query is issued in the normal way—all the words in the phrase are included as search terms—and then the documents returned are scanned to eliminate those in which that phrase does not appear.</Text>
484<Text id="130">During the post-retrieval scan, phrases are checked just as they are, including any punctuation. For example, the query</Text>
485<Text type="code" id="131">what's a “post-retrieval scan?”</Text>
486<Text id="132">will first retrieve all documents that match all of the words</Text>
487<CodeLine>what s a post retrieval scan</CodeLine>
488<Text id="133">and then the documents returned will be checked for the phrase</Text>
489<Text type="code" id="134">post-retrieval scan?</Text>
490<Text id="135">Phrase matches are case-insensitive if <i>ignore case differences</i> is set on the Preferences page.</Text>
491</Content>
492</Part>
493<Part id="advanced_query_mode">
494<Title>
495<Text id="136"><b><i>Advanced query mode</i></b></Text>
496</Title>
497<Content>
498<Text id="137">In <i>advanced query mode</i>, which can be selected on the Preferences page, the queries for <i>all</i> of the words, described above, are actually Boolean queries. They consist of a list of terms joined by the logical operators &amp; (and), | (or), and ! (not). Absent operators between search terms are interpreted as &amp; (and): thus a query without any operators returns documents that match <i>all</i> the terms.</Text>
499<Text id="138">If the words AND, OR, and NOT appear in your query they are treated as ordinary search terms, not operators. For operators you must use &amp;, |, and !. In addition, parentheses can be used for grouping.</Text>
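<Text id="138_example">As an illustration, the following advanced-mode query should return documents that contain either “sheep” or “cattle”, that also contain “farm”, and that do not contain “pig”. The search terms are invented for this example and are not drawn from any particular collection.</Text>
<CodeLine>(sheep | cattle) &amp; farm &amp; !pig</CodeLine>
<Text id="138_example_note">Without the parentheses, the grouping would depend on the precedence that the search engine gives to the operators, so it is safest to parenthesize whenever a query mixes &amp; and |.</Text>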
500</Content>
501</Part>
502<Part id="using_search_history">
503<Title>
504<Text id="139"><b><i>Using search history</i></b></Text>
505</Title>
506<Content>
507<Text id="140">When you switch on the “search history” feature on the Preferences page you will be shown your last few searches, along with a summary of how many results they generated. Click the button beside one of the previous searches to copy the text into the search box. This makes it easy to repeat slightly modified versions of previous queries.</Text>
508</Content>
509</Part>
510</Content>
511</Subsection>
512</Content>
513</Section>
514<Section id="changing_the_preferences">
515<Title>
516<Text id="141">Changing the preferences</Text>
517</Title>
518<Content>
519<Figure id="the_preferences_page">
520<Title>
521<Text id="142">The Preferences page</Text>
522</Title>
523<File width="394" height="439" url="images/User_Fig_3.png"/>
524</Figure>
525<Text id="143">When you click the <i>preferences</i> button at the top of the page you will be able to change some features of the interface to suit your own requirements. The preferences depend on the collection; an example is shown in Figure 3. When you adjust your search preferences, you should press the <i>set preferences</i> button shown in Figure 3. After setting preferences, do not use your browser's “back” button—that would unset them! Instead, click one of the buttons on the access bar near the top of the page.</Text>
526<Subsection id="collection_preferences">
527<Title>
528<Text id="144">Collection preferences</Text>
529</Title>
530<Content>
531<Text id="145">Some collections comprise several subcollections, which can be searched independently or together, as one unit. If so, you can select which subcollections to include in your searches on the Preferences page.</Text>
532</Content>
533</Subsection>
534<Subsection id="language_preferences">
535<Title>
536<Text id="146">Language preferences</Text>
537</Title>
538<Content>
539<Text id="147">Each collection has a default presentation language, but you can switch to a different language if you like. You can also alter the encoding scheme used by Greenstone for output to the browser; the software chooses sensible defaults, but with some browsers better visual results can be obtained by switching to a different encoding scheme. All collections allow you to switch from the standard graphical interface format to a textual one. This is particularly useful for visually impaired users who use large screen fonts or speech synthesizers for output.</Text>
540</Content>
541</Subsection>
542<Subsection id="presentation_preferences">
543<Title>
544<Text id="148">Presentation preferences</Text>
545</Title>
546<Content>
547<Text id="149">Depending on the collection, there may be other options you can set that control the presentation. Collections of web pages allow you to suppress the Greenstone navigation bar at the top of each document page, so that once you have done a search you land at the exact web page that matches without any Greenstone header. To do another search you will have to use your browser's “back” button. These collections also allow you to suppress Greenstone's warning message when you click a link that takes you out of the digital library collection and on to the web itself. And in some web collections you can control whether the links on the “Search Results” page take you straight to the actual URL in question, rather than to the digital library's copy of the page.</Text>
548</Content>
549</Subsection>
550<Subsection id="search_preferences">
551<Title>
552<Text id="150">Search preferences</Text>
553</Title>
554<Content>
555<Text id="151">Under <i>Search preferences</i> in Figure 3, the first pair of buttons allows you to get a large query box, so that you can easily do paragraph-sized searching. In Greenstone, it is surprisingly quick to search for large amounts of text. The next two pairs of buttons control the kind of text matching in the searches that you make. The first set (labeled “case differences”) controls whether upper and lower case must match. The second (“word endings”) controls whether to ignore word endings or not.</Text>
556<Text id="152">Using the next button pair you can switch to the “advanced” query mode described above, which allows you to specify more precise queries by combining terms using AND (&amp;), OR (|), and NOT (!). You can turn the search history feature, described above, on and off. Finally, you can control the number of hits returned, and the number presented on each screenful, through the last entry in Figure 3.</Text>
557</Content>
558</Subsection>
559</Content>
560</Section>
561</Content>
562</Chapter>
563<Chapter id="making_greenstone_collections">
564<Title>
565<Text id="153">Making Greenstone Collections</Text>
566</Title>
567<Content>
568<Text id="154">The simplest way to build new collections is to use Greenstone's “librarian” interface (GLI). This allows you to collect sets of documents, import or assign metadata, and build them into a Greenstone collection. It supports five basic activities, which can be interleaved but are nominally undertaken in this order:</Text>
569<NumberedList>
570<NumberedItem>
571<Text id="155">Copy documents from the computer's file space, including existing collections, into the new collection. Any existing metadata remains “attached” to these documents. Documents may also be gathered from the web through a built-in mirroring facility.</Text>
572</NumberedItem>
573<NumberedItem>
574<Text id="156">Enrich the documents by adding further metadata to individual documents or groups of documents.</Text>
575</NumberedItem>
576<NumberedItem>
577<Text id="157">Design the collection by determining its appearance and the access facilities that it will support.</Text>
578</NumberedItem>
579<NumberedItem>
580<Text id="158">Build the collection using Greenstone.</Text>
581</NumberedItem>
582<NumberedItem>
583<Text id="159">Preview the newly created collection, which will have been installed on your Greenstone home page as one of the regular collections.</Text>
584</NumberedItem>
585</NumberedList>
586<Text id="160">The librarian interface allows you to add what people call “external” metadata to documents, metadata that pertains to the document as a whole. But documents often need to be structured into sections and subsections, and “internal” metadata might be associated with each part. In Greenstone, source documents can be tagged with this information, and we explain this in Section <CrossRef target="Section" ref="tagging_document_files"/>.</Text>
587<Text id="161">Finally, an alternative way of building collections is provided by the Collector, which helps you create new collections, modify or add to existing ones, or delete collections. It predates the librarian interface, and for most practical purposes the librarian interface should be used instead of the Collector. It is described in Section <CrossRef target="Section" ref="the_collector"/>.</Text>
588<Text id="162">To harness the full power of Greenstone to build advanced collections, you will also need to read Chapter <CrossRef external="Develop" lang="en" target="Chapter" ref="getting_the_most_out_of_your_documents"/> of the <i>Developer's Guide</i>.</Text>
589<Section id="the_librarian_interface">
590<Title>
591<Text id="163">The librarian's interface</Text>
592</Title>
593<Content>
594<Text id="164">To convey the operation of Greenstone's librarian interface, we work through a simple example. Figures 4 to 15 are screen snapshots at various points during the interaction. This example uses documents in the Development Library Subset (DLS) collection, which is distributed with Greenstone. For expository purposes, the walkthrough takes the form of a single pass through the steps listed above. A more realistic pattern of use, however, is for users to switch back and forth through the various stages as the task proceeds.</Text>
595<Text id="165">The librarian interface can be run in one of four modes: Librarian Assistant, Librarian, Library Systems Specialist, and Expert. Modes control the level of detail within the interface, and can be changed through 'Preferences' in the 'File' menu. The walkthrough in this section assumes that the librarian interface is operating in the default mode, Librarian.</Text>
596<Subsection id="getting_started">
597<Title>
598<Text id="166">Getting started</Text>
599</Title>
600<Content>
601<Text id="167">Launch the librarian interface under Windows by selecting <i>Greenstone Digital Library</i> from the <i>Programs</i> section of the <i>Start</i> menu and choosing <i>Librarian Interface</i>. If you are using Unix, instead type</Text>
602<CodeLine>cd ~/gsdl</CodeLine>
603<CodeLine>cd gli</CodeLine>
604<CodeLine>./gli.sh</CodeLine>
605<Text id="168">where <i>~/gsdl</i> is the directory containing your Greenstone system. To begin, you must either open an existing collection or start a new one. Figure 4 shows the user in the process of starting a new collection. She has selected <i>New</i> from the file menu and begun to fill out general information about the collection—its title, the E-mail address of the person responsible for it, and a brief description of the content—in the popup window. The collection title is a short phrase used throughout the digital library to identify the collection's content: existing collections have names like <i>Food and Nutrition Library</i>, <i>World Environmental Library</i>, and so on. When you type the title, the system assigns a unique mnemonic identifier, the collection “name”, for internal use (you can change it if you like). The E-mail address specifies the first point of contact for any problems encountered with the collection.</Text>
606<Text id="169">The brief description is a statement describing the principles that govern what is included in the collection. It appears under the heading <i>About this collection</i> on the collection's initial page.</Text>
607<Figure id="starting_a_new_collection">
608<Title>
609<Text id="170">Starting a new collection</Text>
610</Title>
611<File width="407" height="317" url="images/User_Fig_4.png"/>
612</Figure>
613<Figure id="exploring_the_local_file_space">
614<Title>
615<Text id="171">Exploring the local file space</Text>
616</Title>
617<File width="407" height="318" url="images/User_Fig_5.png"/>
618</Figure>
619<Text id="172">At this point, the user decides whether to base the new collection on the same structure as an existing collection, or to build an entirely new kind of collection. In Figure 4 she has chosen to base it on the <i>Development Library Subset</i> collection. This implies that the “DLS” metadata set which is used in this collection will be used for the new collection. (In fact, this metadata set has been used to build several Greenstone collections that share a common structure and organization but with different content, including the <i>Development Library Subset</i> and <i>Demo</i> collections delivered as samples with Greenstone.)</Text>
620<Text id="173">The DLS metadata set contains these items:</Text>
621<BulletList>
622<Bullet>
623<Text id="174">Title</Text>
624</Bullet>
625<Bullet>
626<Text id="175">Subject</Text>
627</Bullet>
628<Bullet>
629<Text id="176">Language</Text>
630</Bullet>
631<Bullet>
632<Text id="177">Organization</Text>
633</Bullet>
634<Bullet>
635<Text id="178">Keyword (e.g. “Howto”).</Text>
636</Bullet>
637</BulletList>
638<Text id="179">(There is, in addition, a metadata item called <i>AZList</i> which is used to determine which bucket of the alphabetic list contains the document's title, with values like “A-B” or “C-D-E”. This is used to give precise control over the divisions in the list. For most other collections it is absent, and Greenstone assigns the buckets itself.)</Text>
639<Text id="180">If, instead, the user had chosen “New Collection” at this point, she would have been asked to select what metadata sets should be used in the new collection. Three standard sets are pre-supplied: Dublin Core, the DLS metadata set mentioned above, and a set that comprises metadata elements extracted automatically by Greenstone from the documents in the collection. The user can also create new metadata sets using a popup panel activated through the “metadata” menu.</Text>
640<Text id="181">Several different metadata sets can be associated with the same collection; the system keeps them distinct (so that, for example, documents can have both a Dublin Core <i>Title</i> and a DLS <i>Title</i>). The different sets are clearly distinguished in the interface. Behind the scenes, metadata sets are represented in XML.</Text>
641</Content>
642</Subsection>
643<Subsection id="assembling_the_source_material">
644<Title>
645<Text id="182">Assembling the source material</Text>
646</Title>
647<Content>
648<Text id="183">After clicking the <i>OK</i> button on the “new collection” popup, the remaining parts of the interface, which were grayed out before, become active. The <i>Gather</i> panel, selected by the eponymous tab near the top of Figure 4, is displayed initially. This allows the user to explore the local file space and existing collections, gathering up selected documents for the new collection. The panel is divided into two sections, the left for browsing existing structures and the right for the documents in the collection.</Text>
649<Text id="184">Operations available at this stage include:</Text>
650<BulletList>
651<Bullet>
652<Text id="185">Navigating the existing file structure hierarchy, and the one being created, in the usual way.</Text>
653</Bullet>
654<Bullet>
655<Text id="186">Dragging and dropping files into the new collection.</Text>
656</Bullet>
657<Bullet>
658<Text id="187">Multiple selection of files.</Text>
659</Bullet>
660<Bullet>
661<Text id="188">Dragging and dropping entire sub-hierarchies.</Text>
662</Bullet>
663<Bullet>
664<Text id="189">Deleting documents from the nascent collection.</Text>
665</Bullet>
666<Bullet>
667<Text id="190">Creating new sub-hierarchies within the collection.</Text>
668</Bullet>
669<Bullet>
670<Text id="191">Filtering the files that are visible, in both the local file system and the collection, based on predetermined groups or on standard file matching terms.</Text>
671</Bullet>
672<Bullet>
673<Text id="192">Invoking the appropriate program to display the contents of a selected file, by double-clicking it.</Text>
674</Bullet>
675</BulletList>
676<Text id="193">Care is taken to deal appropriately with name clashes when files of the same name in different parts of the computer's directory structure are copied into the same folder of the collection.</Text>
677<Text id="194">In Figure 5 the user is using the interactive file tree display to explore the local file system. At this stage, the collection on the right is empty; the user populates it by dragging and dropping files of interest from the left to the right panel. Such files are “copied” rather than “moved”, so as not to disturb the original file system. The usual techniques for multiple selection, dragging and dropping, structuring the new collection by creating subdirectories (“folders”), and deleting files from it by moving them to a trashcan, are all available.</Text>
678<Text id="195">Existing collections are represented by a subdirectory on the left called “Greenstone Collections,” which can be opened and explored like any other directory. However, the documents therein differ from ordinary files because they already have metadata attached, and this is preserved when they are copied into the new collection. Conflicts may arise because their metadata may have been assigned using a different metadata set from the one in use for the new collection, and the user must resolve these. In Figure 6 the user has selected some documents from an existing collection and dragged them into the new one. The popup window explains that the metadata element <i>Organization</i> cannot be automatically imported, and asks the user to either select a metadata set and press <i>Add</i> to add the metadata element to that set,<FootnoteRef id="1"/> or choose a metadata set, then an element, and press <i>Merge</i> to effectively rename the old metadata element to the new one by merging the two. Metadata in subsequent documents from the same collection will automatically be handled in the same way.</Text>
679<Text id="196">When large file sets are selected, dragged, and dropped into the new collection, the copying operation may take some time—particularly if metadata conversion is involved. To indicate progress, the interface shows which file is being copied and what percentage of files has been processed.</Text>
680<Text id="197">Special facilities are provided for dealing with large file sets. For example, the user can choose to filter the file tree to show only certain files, using a dropdown menu of file types displayed underneath the trees. In Figure 7, only the HTM and HTML files are being shown (and only these files will be copied by drag and drop).</Text>
681</Content>
682</Subsection>
683<Subsection id="enriching_the_documents">
684<Title>
685<Text id="198">Enriching the documents</Text>
686</Title>
687<Content>
688<Text id="199">The next phase in collection building is to enrich the documents by adding metadata. The <i>Enrich</i> tab brings up a new panel of information (Figure 8), which shows the document tree representing the collection on the left and on the right allows metadata to be added to individual documents, or groups of documents.</Text>
689<Text id="200">Documents that are copied during the first step come with any applicable metadata attached. If a document is part of a Greenstone collection, previously defined metadata is carried over to the new collection. Of course, this new collection may have a different metadata set, or perhaps just a subset of the defined metadata, and only metadata that pertains to the new collection's set is carried over. Resolution of such conflicts may require user intervention via a supplementary dialog (Figure 6). Any choices made are remembered for subsequent file copies.</Text>
690<Text id="201">The <i>Enrich</i> panel allows metadata values to be assigned to documents in the collection. For example, new values can be added to the set of existing values for an element. If the element's values have a hierarchical structure, the hierarchy can be extended in the same way.</Text>
691<Figure id="importing_existing_metadata">
692<Title>
693<Text id="202">Importing existing metadata</Text>
694</Title>
695<File width="407" height="317" url="images/User_Fig_6.png"/>
696</Figure>
697<Figure id="filtering_the_file_trees">
698<Title>
699<Text id="203">Filtering the file trees</Text>
700</Title>
701<File width="407" height="318" url="images/User_Fig_7.png"/>
702</Figure>
703<Figure id="assigning_metadata_using_enrich_view">
704<Title>
705<Text id="204">Assigning metadata using <i>Enrich</i> view</Text>
706</Title>
707<File width="407" height="318" url="images/User_Fig_8.png"/>
708</Figure>
709<Figure id="viewing_all_metadata_for_selected_files">
710<Title>
711<Text id="205">Viewing all metadata for selected files</Text>
712</Title>
713<File width="407" height="317" url="images/User_Fig_9.png"/>
714</Figure>
715<Text id="206">Metadata values can also be assigned to folders, in just the same way. Documents in these folders for which this metadata is unspecified inherit the metadata values. However, they can subsequently be overridden by supplying different ones for the document itself.</Text>
716<Text id="207">Operations at this stage include:</Text>
717<BulletList>
718<Bullet>
719<Text id="208">Assigning new and existing metadata values to documents.</Text>
720</Bullet>
721<Bullet>
722<Text id="209">Assigning metadata to an individual document.</Text>
723</Bullet>
724<Bullet>
725<Text id="210">Assigning metadata to a folder (this is inherited by all documents in the folder, including those in nested folders).</Text>
726</Bullet>
727<Bullet>
728<Text id="211">Assigning hierarchical metadata, whose structure can be dynamically updated if required.</Text>
729</Bullet>
730<Bullet>
731<Text id="212">Editing or updating assigned metadata.</Text>
732</Bullet>
733<Bullet>
734<Text id="213">Reviewing the metadata assigned to a selection of files and directories.</Text>
735</Bullet>
736</BulletList>
737<Text id="214">For our walkthrough example, in Figure 8 the user has selected the folder <i>ec121e</i> and assigned “EC Courier” as its <i>Organization</i> metadata. The buttons for updating and removing metadata become active depending on what selections have been made.</Text>
738<Text id="215">During the enrichment phase, or indeed at any other time, the user can choose to view all the metadata that has been assigned to documents in the collection. This is done by selecting a set of documents and choosing <i>Assigned Metadata</i> from the metadata sets menu, which brings up a popup window like that in Figure 9 that shows the metadata in spreadsheet form. For large collections it is useful to be able to view the metadata associated with certain document types only, and if the user has specified a file filter as mentioned above, only the selected documents are shown in the metadata display.</Text>
739<Text id="216">The panel in Figure 10 allows the user to edit metadata sets. Here, the user is looking at the <i>Subject</i> element of the DLS set. The values of this element form a hierarchy, and the user is examining, and perhaps changing, the list of values assigned to it. The same panel also allows you to change the “profile” for mapping elements of one metadata set to another. This profile is created when importing documents from collections that have pre-assigned metadata.</Text>
740<Figure id="editing_the_metadata_set">
741<Title>
742<Text id="217">Editing the metadata set</Text>
743</Title>
744<File width="407" height="317" url="images/User_Fig_10.png"/>
745</Figure>
746<Figure id="designing_the_collection">
747<Title>
748<Text id="218">Designing the collection</Text>
749</Title>
750<File width="407" height="318" url="images/User_Fig_11.png"/>
751</Figure>
752<Figure id="specifying_which_plug-ins_to_use">
753<Title>
754<Text id="219">Specifying which plug-ins to use</Text>
755</Title>
756<File width="407" height="318" url="images/User_Fig_12.png"/>
757</Figure>
758<Figure id="configuring_arguments_to_a_plug-in">
759<Title>
760<Text id="220">Configuring arguments to a plug-in</Text>
761</Title>
762<File width="407" height="317" url="images/User_Fig_13.png"/>
763</Figure>
764</Content>
765</Subsection>
766<Subsection id="designing_the_collection_1">
767<Title>
768<Text id="221">Designing the collection</Text>
769</Title>
770<Content>
771<Text id="222">The <i>Design</i> panel (Figures 11 to 13) allows one to specify the structure, organization, and presentation of the collection being created. As noted earlier, the result of this process is recorded in a “collection configuration file,” which is Greenstone's way of expressing the facilities that a collection requires. This step involves a series of separate interaction screens, each dealing with one aspect of the collection design. In effect, it serves as a graphical equivalent to the usual process of editing the configuration file manually.</Text>
772<Text id="223">Operations include:</Text>
773<BulletList>
774<Bullet>
775<Text id="224">Reviewing and editing collection-level metadata such as title, author and public availability of the collection.</Text>
776</Bullet>
777<Bullet>
778<Text id="225">Defining what full-text indexes are to be built.</Text>
779</Bullet>
780<Bullet>
781<Text id="226">Creating sub-collections and having indexes built for them.</Text>
782</Bullet>
783<Bullet>
784<Text id="227">Adding or removing support for predefined interface languages.</Text>
785</Bullet>
786<Bullet>
787<Text id="228">Constructing a list of plug-ins to be used, and their arguments.</Text>
788</Bullet>
789<Bullet>
790<Text id="229">Presenting the list to the user for review and modification.</Text>
791</Bullet>
792<Bullet>
793<Text id="230">Configuring individual plug-ins.</Text>
794</Bullet>
795<Bullet>
796<Text id="231">Constructing a list of “classifiers,” their arguments, assignment and configuration.</Text>
797</Bullet>
798<Bullet>
799<Text id="232">Assigning formatting strings to various controls within the collection, thus altering its appearance.</Text>
800</Bullet>
801<Bullet>
802<Text id="233">Reviewing the metadata sets, and their elements, used in the collection.</Text>
803</Bullet>
804</BulletList>
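<Text id="233_example">To give a feel for what the <i>Design</i> panel edits on your behalf, the lines below sketch a small fragment of a collection configuration file. It is illustrative only: the e-mail address and the index, plug-in and classifier choices are assumptions made for this example rather than the settings of any collection described here, and the <i>Developer's Guide</i> remains the reference for the file format.</Text>
<CodeLine>creator librarian@example.org</CodeLine>
<CodeLine>public true</CodeLine>
<CodeLine>indexes document:text</CodeLine>
<CodeLine>plugin HTMLPlug</CodeLine>
<CodeLine>classify AZList -metadata Title</CodeLine>
<Text id="233_example_note">In this sketch the first two lines correspond to collection-level metadata, the third to a full-text index, and the last two to one plug-in and one classifier with its argument.</Text>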
805<Text id="234">In Figure 11 the user has clicked the <i>Design</i> tab and is reviewing the general information about the collection, entered when the new collection was created. On the left are listed the various facets that the user can configure: General, Document Plug-ins, Search Types, Search Indexes, Partition Indexes, Cross-Collection Search, Browsing Classifiers, Format Features, Translate Text, Metadata Sets. Appearance and functionality vary between these. For example, clicking the <i>Plug-in</i> button brings up the screen shown in Figure 12, which allows you to add, remove or configure plug-ins, and change the order in which the plug-ins are applied to documents.</Text>
806<Text id="235">Plug-ins and classifiers have many different arguments or “options” that the user can supply. The dialog box in Figure 13 shows the user specifying arguments to some of the plug-ins. The grayed-out fields become active when the user adds the option by clicking the tick-box beside it. Because Greenstone is a continually growing open-source system, the number of options tends to increase as developers add new facilities. To help cope with this, Greenstone has a “plug-in information” utility program that lists the options available for each plug-in, and the librarian interface automatically invokes this to determine what options to show. This allows the interactive user interface to automatically keep pace with developments in the software.</Text>
807<Figure id="getting_ready_to_create_new_collection">
808<Title>
809<Text id="236">Getting ready to create new collection</Text>
810</Title>
811<File width="407" height="318" url="images/User_Fig_14.png"/>
812</Figure>
813<Figure id="previewing_the_newly_built_collection">
814<Title>
815<Text id="237">Previewing the newly built collection</Text>
816</Title>
817<File width="407" height="291" url="images/User_Fig_15.png"/>
818</Figure>
819</Content>
820</Subsection>
821<Subsection id="building_the_collection">
822<Title>
823<Text id="238">Building the collection</Text>
824</Title>
825<Content>
826<Text id="239">The <i>Create</i> panel (Figure 14) is used to construct a collection based on the documents and assigned metadata. The brunt of this work is borne by the Greenstone code itself. The user controls this external process through a series of separate interaction screens, each dealing with the arguments provided to a certain stage of the creation process.</Text>
827<Text id="240">The user observes the building process through a window that shows not only the text output generated by Greenstone's importing and index-building scripts, but also progress bars that indicate the overall degree of completion of each script.</Text>
828<Text id="241">Figure 14 shows the <i>Create</i> view. At the top are shown some options that can be applied during the creation process. The user selects appropriate values for the options. This figure illustrates a popup “tool tip” that is available throughout the interface to explain the function of each argument.</Text>
829<Text id="242">When satisfied with the arguments, the user clicks <i>Build Collection</i>. Greenstone continually prints text that indicates progress, and this is shown along with a more informative progress bar.</Text>
830</Content>
831</Subsection>
832<Subsection id="previewing">
833<Title>
834<Text id="243">Previewing</Text>
835</Title>
836<Content>
837<Text id="244">The <i>Preview Collection</i> button (Figure 14) is used to view the collection that has been built. Clicking this button launches a web browser showing the home page of the collection (Figure 15). In practice, previewing often shows up deficiencies in the collection design, or in the individual metadata values, and the user frequently returns to earlier stages to correct these. This button becomes active once the collection has been created. The newly created collection will also have been installed on your Greenstone home page as one of the regular collections.</Text>
838</Content>
839</Subsection>
840<Subsection id="help">
841<Title>
842<Text id="245">Help</Text>
843</Title>
844<Content>
845<Text id="246">On-line help is always available, and is invoked using the <i>Help</i> item at the right of the main menu bar at the top of each of the Figures. This opens up a hierarchically structured file of help text, and account is taken of the user's current context to highlight the section that is appropriate to the present stage of the interaction. Furthermore, as noted above, whenever the mouse is held still over any interactive object a small window pops up to give a textual “tool tip,” as illustrated near the bottom of Figure 14.</Text>
846</Content>
847</Subsection>
848</Content>
849</Section>
850<Section id="librarian_interface_user_guide">
851<Title>
852<Text id="247">Librarian Interface user guide</Text>
853</Title>
854<Content>
855<Subsection id="starting_off">
856<Title>
857<Text id="248">Starting Off</Text>
858</Title>
859<Content>
860<Text id="249">This section covers how to create, load, save and delete collections.</Text>
861<Part id="creating_a_new_collection">
862<Title>
863<Text id="250">Creating a New Collection</Text>
864</Title>
865<Content>
866<Text id="251">To create a new collection, open the "File" menu and choose "New". Several fields need to be filled out -- but you can change their values later if you need to, in the design view.</Text>
867<Text id="252">"Collection title" is the text displayed at the top of your collection's home page. It can be any length.</Text>
868<Text id="253">"Description of content" should describe, in as much detail as possible, what the collection is about. Use the [Enter] key to break it into paragraphs.</Text>
869<Text id="254">Finally you must specify whether the new collection will have the same appearance and metadata sets as an existing collection, or whether to start a default "New Collection".</Text>
870<Text id="255">Click "OK" to create the collection. If you chose "New Collection" you are prompted for the metadata sets to use in it. You can choose more than one, and you can add others later.</Text>
871<Text id="256">Clicking "Cancel" returns you to the main screen immediately.</Text>
872</Content>
873</Part>
874<Part id="saving_the_collection">
875<Title>
876<Text id="257">Saving the Collection</Text>
877</Title>
878<Content>
879<Text id="258">Save your work regularly by opening the "File" menu and choosing "Save". Saving a collection is not the same as making it ready for use in Greenstone (see Producing Your Collection).</Text>
880<Text id="259">The Librarian Interface protects your work by saving it whenever you exit the program or load another collection.</Text>
881<Text id="260">Saved collections are written to a file named for the collection and with file extension ".col", located in a folder of the same name within your Greenstone installation's "collect" folder.</Text>
882</Content>
883</Part>
884<Part id="opening_an_existing_collection">
885<Title>
886<Text id="261">Opening an Existing Collection</Text>
887</Title>
888<Content>
889<Text id="262">To open an existing collection, choose "Open" from the "File" menu to get the Open Collection prompt. A list of your Greenstone collections appears. Select one to see its description, and click "Open" to load it. If you seek a collection that resides outside Greenstone's "collect" folder, click "Browse" for a file system browsing dialog.</Text>
890<Text id="263">In case more than one Greenstone Librarian Interface program is running concurrently, the relevant directories are "locked" to prevent interference. On opening a collection, a small temporary lock file is created in its folder. Before opening a collection, the Librarian Interface checks to ensure that no lock file already exists. You can tell whether a collection is locked by the colour of its icon: green for a normal collection, red for a locked one. However, when the Librarian Interface is exited prematurely the lock file is sometimes left in place. When you open such a collection, the Librarian asks if you want to "steal" control of it. Never steal a collection that someone else is currently working on.</Text>
891<Text id="264">When you open a collection that the Greenstone Librarian Interface did not create, you will be asked to select a metadata set (or sets). If none are selected, any existing metadata will be ignored. Otherwise, metadata will be imported just as it is when you drag in files with existing metadata. The process is described in the Importing Previously Assigned Metadata section.</Text>
892</Content>
893</Part>
894<Part id="deleting_collections">
895<Title>
896<Text id="265">Deleting Collections</Text>
897</Title>
898<Content>
899<Text id="266">To permanently delete collections from your Greenstone installation, choose "Delete..." from the "File" menu. A list of your Greenstone collections appears. Select one to see its description, then tick the box at the bottom of the dialog and click "Delete" to delete the collection. This action is irreversible, so check carefully that you no longer need the collection before proceeding!</Text>
900</Content>
901</Part>
902</Content>
903</Subsection>
904<Subsection id="downloading_files_from_the_internet">
905<Title>
906<Text id="267">Downloading Files From the Internet</Text>
907</Title>
908<Content>
909<Text id="268">The "Download" view helps you download resources from the internet. This section explains the Librarian Interface's mirroring process.</Text>
910<Part id="the_download_view">
911<Title>
912<Text id="269">The Download view</Text>
913</Title>
914<Content>
915<Text id="270">This section describes how to configure a download task and control the downloading process. Access the "Download" view by clicking its tab. The top half of the screen shows the downloading controls. The bottom half is initially empty, but will show a list of pending and completed downloading jobs.</Text>
916<Text id="271">Files are downloaded into a folder in the workspace called "Downloaded Files" (only present when mirroring is enabled), and can be used in all collections built with the Librarian Interface. Files in this area are named by their full web URL. A new folder is created for each host, followed by others for each part of the path. This ensures that each file is distinct.</Text>
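<Text id="example_download_naming_intro">The naming scheme described above (one folder for the host, then one for each part of the path) can be sketched as follows. This is an illustration only, not the Librarian Interface's own code: the function name <i>cache_path</i>, the sample URL and the exact on-disk layout are assumptions made for the example.</Text>
<Text id="example_download_naming_code"><![CDATA[
```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def cache_path(url: str, root: str = "Downloaded Files") -> PurePosixPath:
    """Map a URL to a folder path: the host first, then one folder per path part.

    Hypothetical helper; the real layout used by the Librarian Interface may differ.
    """
    parts = urlsplit(url)
    # Drop empty segments produced by leading or doubled slashes.
    segments = [segment for segment in parts.path.split("/") if segment]
    return PurePosixPath(root, parts.netloc, *segments)


# Two files with the same name on different hosts end up in different folders.
print(cache_path("http://www.example.org/docs/guide/index.html"))
print(cache_path("http://mirror.example.net/docs/guide/index.html"))
```
]]></Text>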
917<Text id="272">Use the first of the download configuration controls, "Source URL", to enter the URL of a target resource. Use the "Download Depth" control to limit how many hyperlinks to follow when downloading: set this to 0 to download a single web page; set it to 1 to download a page and all the pages it points to. The depth limit is ignored when downloading media other than HTML pages. Next, there are several checkbox controls which can be set to turn on the specified feature for a specific download. Once the configuration is set up, click "Download" to start the new download job. There are two other button controls: "Preferences", which links to the connection section of the Preferences where proxy settings can be edited; and "Clear Cache", which deletes all previously downloaded files.</Text>
918<Text id="273">The download list has an entry for each web page download. Each entry has a text region that gives details of the task along with a progress bar showing current activity. Three buttons appear to the left of each entry. "Pause" is used for pausing a currently downloading task. "View Log" opens a window showing the download log file. "Close" terminates the download and removes the task from the list.</Text>
919<Text id="274">The Preferences section describes how to establish an Internet connection via a proxy. If authentication is needed, the proxy server prompts for identification and password. The Librarian Interface does not store passwords between sessions.</Text>
920</Content>
921</Part>
922</Content>
923</Subsection>
924<Subsection id="collecting_files_for_your_collection">
925<Title>
926<Text id="275">Collecting Files for Your Collection</Text>
927</Title>
928<Content>
929<Text id="276">Once you have a new collection you need to get some files into it. These may come from your ordinary file space, or from other Greenstone collections. Some may already have attached metadata. This section describes how to import files.</Text>
930<Part id="the_gather_view">
931<Title>
932<Text id="277">The Gather View</Text>
933</Title>
934<Content>
935<Text id="278">This section introduces the Gather area that you use to select what files to include in the collection you are building. The Librarian Interface starts with the Gather view. To return to this view later, click the "Gather" tab directly below the menu bar.</Text>
936<Text id="279">The two large areas titled "Workspace" and "Collection" are used to move files into your collection. They contain "file trees", graphical structures that represent files and folders.</Text>
937<Text id="280">Select an item in the tree by clicking it. (There are other ways; see below.) Double-click a folder, or single-click the switch symbol beside it, to expand (or collapse) its contents. Double-click a file to open it using its associated application program (see File Associations).</Text>
938<Text id="281">The Workspace file tree shows the sources of data available to the Librarian Interface -- the local file system (including disk and CD-ROM drives), the contents of existing Greenstone collections, and the cache of downloaded files. You can copy and view these files but you cannot move, delete, or edit them, with the exception of the downloaded files, which can be deleted. Navigate this space to find the files you want to include in the collection.</Text>
939<Text id="282">The Collection file tree represents the contents of the collection so far. Initially, it is empty.</Text>
940<Text id="283">You can resize the spaces by mousing over the grey bar that separates the trees (the shape of the pointer changes) and dragging.</Text>
941<Text id="284">At the bottom of the window is a status area that shows the progress of actions involving files (copying, moving and deleting). These can take some time to complete. The "Stop" button stops any action that is currently in progress.</Text>
942<Text id="285">Two large buttons occupy the lower right corner of the screen. "New Folder", with a picture of a folder, creates new folders (see Creating folders). "Delete", with a garbage can, removes files. Clicking the Delete button will remove any selected files from the Collection file tree. Alternatively, files can be deleted by dragging them onto the Delete button.</Text>
943<Text id="286">To select several sequential items, select the first and then hold down [Shift] and click on the last -- the selection will encompass all intervening items. Select non-sequential files by holding down [Ctrl] while clicking. Use these two methods together to select groups of non-adjacent items.</Text>
944<Text id="287">Certain folders -- such as the one containing your own web pages -- sometimes have special significance. The Librarian Interface can map such folders to the first level of the file tree. To do this, right-click the desired folder. Select "Create Shortcut", and enter a name for the folder. To remove a shortcut, right-click the mapped folder and select "Remove Shortcut".</Text>
945</Content>
946</Part>
947<Part id="creating_folders">
948<Title>
949<Text id="288">Creating Folders</Text>
950</Title>
951<Content>
952<Text id="289">Use folders in the Collection file tree to group files together and make them easier to find. Folders can be placed inside folders. There is virtually no limit to how many folders you can have or how deeply they can be nested.</Text>
953<Text id="290">To create a new folder, optionally select an existing folder in the Collection Tree and click the New Folder button. The new folder appears within the selected one, or at the top level if none is selected. You are prompted for the folder's name (default "New Folder").</Text>
954<Text id="291">Folders can also be created by right-clicking over a folder, choosing "New Folder" and proceeding as above.</Text>
955</Content>
956</Part>
957<Part id="adding_files">
958<Title>
959<Text id="292">Adding Files</Text>
960</Title>
961<Content>
962<Text id="293">Files can be copied into the collection by dragging and dropping. The mouse pointer becomes a ghost of the selected item (or, if more than one is selected, the number of them). Drop the selection into the Collection Tree to copy the files there (if the source was the Workspace Tree) or move them around within the collection (if the source was the Collection Tree).</Text>
963<Text id="294">When you copy multiple files, they are all placed in the target folder at the same level, irrespective of the folder structure they occupied originally. When you copy a second file with the same name into the same folder, you are asked whether to overwrite the first one. Respond "No" and the file will not be copied, but the others will be. To cancel all remaining copy actions, click the "Stop" button.</Text>
964<Text id="295">Only the "highest" items in a selection are moved. A folder is higher than its children. You cannot select files within a folder and also the folder itself.</Text>
965<Text id="296">When you add a file, the Librarian Interface searches through the source folders for auxiliary files containing metadata previously assigned to the added file and, if it finds one, begins to import this metadata. As the operation proceeds, you may be prompted (perhaps several times) for extra information to match the imported metadata to the metadata sets in your collection. This process involves many different prompts, described in the Importing Previously Assigned Metadata section. For a more detailed explanation of associating metadata with files read Chapter 2 of the Greenstone Developer's Guide -- Getting the most out of your documents.</Text>
966</Content>
967</Part>
968<Part id="removing_files">
969<Title>
970<Text id="297">Removing Files</Text>
971</Title>
972<Content>
973<Text id="298">There are several methods for removing files and folders. You must first indicate what items to remove by selecting one or more files and folders as described in The Gather View.</Text>
974<Text id="299">Once files have been selected, click the "Delete" button to remove them, or press the [Delete] key on your keyboard, or drag them from the collection to the "Delete" button and drop them there.</Text>
975</Content>
976</Part>
977<Part id="filtering_the_tree">
978<Title>
979<Text id="300">Filtering the Tree</Text>
980</Title>
981<Content>
982<Text id="301">"Filtering" the collection tree allows you to narrow down the search for particular files.</Text>
983<Text id="302">The "Show Files" pull-down menu underneath each tree shows a list of predefined filters, such as "Images". Choosing this temporarily hides all other files in the tree. To restore the tree, change the filter back to "All Files". These operations do not alter the collection, nor do they affect the folders in the tree.</Text>
984<Text id="303">You can specify a custom filter by typing in a pattern to match files against (Library Systems Specialist and Expert modes only). Use standard file system abbreviations such as "*.*" or "*.doc" ("*" matches any characters).</Text>
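<Text id="example_filter_pattern_intro">To see how patterns of this kind select files, the sketch below applies them to a short list of file names using the <i>fnmatchcase</i> function from Python's standard library. It is not the Librarian Interface's implementation: the function name <i>apply_filter</i> and the file names are made up for the example, and case-sensitive matching is an assumption.</Text>
<Text id="example_filter_pattern_code"><![CDATA[
```python
from fnmatch import fnmatchcase


def apply_filter(file_names, pattern):
    """Return the names that match a wildcard pattern ("*" matches any characters).

    Hypothetical helper; matching here is case-sensitive, which is an assumption.
    """
    return [name for name in file_names if fnmatchcase(name, pattern)]


files = ["report.doc", "photo.jpg", "notes.txt", "README"]

print(apply_filter(files, "*.doc"))  # only names ending in ".doc"
print(apply_filter(files, "*.*"))    # only names that contain a dot
print(apply_filter(files, "*"))      # every name
```
]]></Text>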
985</Content>
986</Part>
987</Content>
988</Subsection>
989<Subsection id="enriching_the_collection_with_metadata">
990<Title>
991<Text id="304">Enriching the Collection with Metadata</Text>
992</Title>
993<Content>
994<Text id="305">Having gathered several files into the collection, now enrich them with additional information called "metadata". This section explains how metadata is created, edited, assigned and retrieved, and how to use external metadata sources (also see Chapter 2 of the Greenstone Developer's Guide -- Getting the most out of your documents).</Text>
995<Part id="the_enrich_view">
996<Title>
997<Text id="306">The Enrich View</Text>
998</Title>
999<Content>
1000<Text id="307">Use the Enrich view to assign metadata to the documents in the collection. Metadata is data about data -- typically title, author, creation date, and so on. Each metadata item has two parts: "element" tells what kind of item it is (such as author), and "value" gives the value of that metadata element (such as the author's name).</Text>
1001<Text id="308">On the left of the "Enrich" view is the Collection Tree. To the right is the Metadata Table, which shows metadata for any selected files or folders in the Collection Tree. Columns are named in grey at the top, and can be resized by dragging the separating line. If several files are selected, black text indicates that the value is common to all of the selected files, while grey text indicates that it is not. Black values may be updated or removed, while grey ones can be removed from the files that have them, or appended to the others.</Text>
1002<Text id="309">A folder icon may appear beside some metadata entries. This indicates that the values are inherited from a parent (or ancestor) folder. Inherited metadata cannot be edited or removed, only appended to or overwritten. Click on the folder icon to go immediately to the folder where the metadata is assigned.</Text>
1003<Text id="310">Clicking on a metadata element in the table will display the existing values for that element in the "Existing values for..." area below the table. The Value Tree expands and collapses. Usually it is a list that shows all values entered previously for the selected element. Clicking an entry automatically places it into the value field. Conversely, typing in the text field selects the Value Tree entry that starts with the characters you have typed. Pressing [Tab] auto-completes the typing with the selected value.</Text>
1004<Text id="311">Metadata values can be organised into a hierarchy. This is shown in the Value Tree using folders for internal levels. Hierarchical values can be entered using the character "|" to separate the levels. For example, "Cards|Red|Diamonds|Seven" might be used in a hierarchy that represents a pack of playing cards. This enables values to be grouped together. Groups can also be assigned as metadata to files.</Text>
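<Text id="example_hierarchy_values_intro">The way "|"-separated values form a hierarchy can be sketched with nested dictionaries. This is only an illustration of the idea, not how Greenstone stores its value tree: the function names <i>add_value</i> and <i>render</i> and the extra sample values are hypothetical.</Text>
<Text id="example_hierarchy_values_code"><![CDATA[
```python
def add_value(tree: dict, value: str, separator: str = "|") -> None:
    """Insert a hierarchical value into a nested dict, one level per separator."""
    node = tree
    for level in value.split(separator):
        # Reuse the existing branch if there is one, otherwise create it.
        node = node.setdefault(level, {})


def render(tree: dict, depth: int = 0) -> list:
    """Return the tree as indented lines, with children sorted by name."""
    lines = []
    for name in sorted(tree):
        lines.append("  " * depth + name)
        lines.extend(render(tree[name], depth + 1))
    return lines


tree = {}
for value in [
    "Cards|Red|Diamonds|Seven",
    "Cards|Red|Hearts|Ace",
    "Cards|Black|Spades|King",
]:
    add_value(tree, value)

print("\n".join(render(tree)))
```
]]></Text>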
1005<Text id="312">Greenstone extracts metadata automatically from documents into a metadata set whose elements are prefixed by "ex.". This has no value tree and cannot be edited.</Text>
1006</Content>
1007</Part>
1008<Part id="selecting_metadata_sets">
1009<Title>
1010<Text id="313">Selecting Metadata Sets</Text>
1011</Title>
1012<Content>
1013<Text id="314">Sets of predefined metadata elements are known as "metadata sets". An example is the Dublin Core metadata set. When you add a metadata set to your collection, its elements become available for selection. You can have more than one set; to prevent name clashes, a short identifier for the metadata set is prepended to the element name. For instance, the Dublin Core element Creator becomes "dc.Creator". Metadata sets are stored in the Librarian Interface's metadata folder and have the suffix ".mds".</Text>
1014<Text id="315">To control the metadata sets used in a collection, use the "Metadata Sets" entry on the Design view.</Text>
1015</Content>
1016</Part>
1017<Part id="appending_new_metadata">
1018<Title>
1019<Text id="316">Appending New Metadata</Text>
1020</Title>
1021<Content>
1022<Text id="317">We now add a metadata item -- both element and value -- to a file. First select the file from the Collection file tree on the left. The action causes any metadata previously assigned to this file to appear in the table at the right.</Text>
1023<Text id="318">Next select the metadata element you want to add by clicking its row in the table.</Text>
1024<Text id="319">Type the value into the value field. Use the "|" character to add structure, as described in The Enrich View. Pressing the [Up] or [Down] arrow keys will save the metadata value and move the selection appropriately. Pressing [Enter] will save the metadata value and create a new empty entry for the metadata element, allowing you to assign multiple values to a metadata element.</Text>
1025<Text id="320">You can also add metadata to a folder, or to several selected files at once. It is added to all files within the folder or selection, and to child folders. Keep in mind that if you assign metadata to a folder, any new files in it automatically inherit the folder's values.</Text>
1026</Content>
1027</Part>
1028<Part id="adding_previously_defined_metadata">
1029<Title>
1030<Text id="321">Adding Previously Defined Metadata</Text>
1031</Title>
1032<Content>
1033<Text id="322">To add metadata that has an existing value, first select the file, then select the required value from the value tree, expanding hierarchy folders as necessary. The value of the selected entry automatically appears in the Value field (alternatively, use the value tree's auto-select and auto-complete features).</Text>
1034<Text id="323">The process of adding metadata with already-existing values to folders or multiple files is just the same.</Text>
1035</Content>
1036</Part>
1037<Part id="editing_or_removing_metadata">
1038<Title>
1039<Text id="324">Editing or Removing Metadata</Text>
1040</Title>
1041<Content>
1042<Text id="325">To edit or remove a piece of metadata, first select the appropriate file, and then the metadata value from the table. Edit the value field, deleting all text if you wish to remove the metadata.</Text>
1043<Text id="326">The process is the same when updating a folder with child folders or multiple files, but you can only update metadata that is common to all files/folders selected.</Text>
1044<Text id="327">The value tree shows all currently assigned values as well as previous values for the current session, so changed or deleted values will remain in the tree. Closing the collection and then re-opening it will remove the values which are no longer assigned.</Text>
1045</Content>
1046</Part>
1047<Part id="reviewing_assigned_metadata">
1048<Title>
1049<Text id="328">Reviewing Assigned Metadata</Text>
1050</Title>
1051<Content>
1052<Text id="329">Sometimes you need to see the metadata assigned to many or all files at once -- for instance, to determine how many files are left to work on, or to get some idea of the spread of dates.</Text>
1053<Text id="330">Select the files you wish to examine, then right-click and choose "Assigned Metadata...". A window called "All Metadata", dominated by a large table with many columns, appears. The first column shows file names; the rows show all metadata values assigned to those files.</Text>
1054<Text id="331">Drawing the table can take some time if many files are selected. You can continue to use the Librarian Interface while the "All Metadata" window is open.</Text>
1055<Text id="332">When it gets too large, you can filter the "All Metadata" table by applying filters to the columns. As new filters are added, only those rows that match them remain visible. To set, modify or clear a filter, click on the "funnel" icon at the top of a column. You are prompted for information about the filter. Once a filter is set, the column header changes colour.</Text>
1056<Text id="333">The prompt has a "Simple" and an "Advanced" tab. The Simple version filters columns so that they only show rows that contain a certain metadata value ("*" matches all values). You can select metadata values from the pull-down list. The Advanced version allows different matching operations: must start with, does not contain, alphabetically less than and is equal to. The value to be matched can be edited to be any string (including "*"), and you can choose whether the matching should be case insensitive. Finally, you can specify a second matching condition that you can use to specify a range of values (by selecting AND) or alternative values (by selecting OR). Below this area is a box that allows you to change the sort order (ascending or descending). Once you have finished, click "Set Filter" to apply the new filter to the column. Click "Clear Filter" to remove a current filter. Note that the filter details are retained even when the filter is cleared.</Text>
1057<Text id="334">For example, to sort the "All Metadata" table, choose a column, select the default filter setting (a Simple filter on "*"), and choose ascending or descending ordering.</Text>
1058</Content>
1059</Part>
1060<Part id="importing_previously_assigned_metadata">
1061<Title>
1062<Text id="335">Importing Previously Assigned Metadata</Text>
1063</Title>
1064<Content>
1065<Text id="336">This section describes how to import previously assigned metadata: metadata assigned to documents before they were added to the collection.</Text>
1066<Text id="337">If metadata in a form recognized by the Librarian Interface has been previously assigned to a file -- for example, when you choose documents from an existing Greenstone collection -- it is imported automatically when you add the file. To do this, the metadata must be mapped to the metadata sets available in the collection.</Text>
1067<Text id="338">The Librarian Interface prompts for the necessary information. The prompt gives brief instructions and then shows the name of the metadata element that is being imported, just as it appears in the source file. This field cannot be edited or changed. Next you choose what metadata set the new element should map to, and then the appropriate metadata element in that set. The system automatically selects the closest match, in terms of set and element, for the new metadata.</Text>
1068<Text id="339">Having checked the mapping, you can choose "Add" to add the new metadata element to the chosen metadata set. (This is only enabled if there is no element of the same name within the chosen set.) "Merge" maps the new element to the one chosen by the user. Finally, "Ignore" does not import any metadata with this element name. Once you have specified how to import a certain piece of metadata, the mapping information is retained for the collection's lifetime.</Text>
1069<Text id="340">For details on the metadata.xml files which Greenstone uses to store the metadata, see Chapter 2 of the Greenstone Developer's Guide -- Getting the most out of your documents.</Text>
1070</Content>
1071</Part>
1072</Content>
1073</Subsection>
1074<Subsection id="designing_your_collection_appearance">
1075<Title>
1076<Text id="341">Designing Your Collection's Appearance</Text>
1077</Title>
1078<Content>
1079<Text id="342">Once your files are marked up with metadata, you next decide how they should appear to users as a Greenstone collection. What kind of information is searchable? What ways are provided to browse through the documents? What languages are supported? Where do the buttons appear on the page? These things can be customized; this section describes how to do it.</Text>
1080<Part id="the_design_view">
1081<Title>
1082<Text id="343">The Design View</Text>
1083</Title>
1084<Content>
1085<Text id="344">This section introduces you to the design view and explains how to navigate between the various views within this pane.</Text>
1086<Text id="345">With the Librarian Interface, you can configure how the collection appears to the user. The configuration options are divided into different sections, each associated with a particular stage of navigating or presenting information.</Text>
1087<Text id="346">On the left is a list of different views, and on the right are the controls associated with the current one. To change to a different view, click its name in the list.</Text>
1088<Text id="347">To understand the stages and terms involved in designing a collection, first read Chapters 1 and 2 of the Greenstone Developer's Guide.</Text>
1089</Content>
1090</Part>
1091<Part id="general">
1092<Title>
1093<Text id="348">General</Text>
1094</Title>
1095<Content>
1096<Text id="349">This section explains how to review and alter the general settings associated with your collection. First, under "Design Sections", click "General".</Text>
1097<Text id="350">Here the values provided during collection creation can be modified.</Text>
1098<Text id="351">First are the contact emails of the collection's creator and maintainer. The following field allows you to change the collection title. The folder that the collection is stored in is shown next, but this cannot be edited. The next one specifies (in the form of a URL) the icon to show at the top left of the collection's "About" page, and the next is the icon used in the Greenstone library page to link to the collection. Then, a checkbox controls whether the collection should be publicly accessible. Finally comes the "Collection Description" text area as described in Creating a New Collection.</Text>
1099</Content>
1100</Part>
1101<Part id="document_plugins">
1102<Title>
1103<Text id="352">Document Plugins</Text>
1104</Title>
1105<Content>
1106<Text id="353">This section describes how to configure the document plugins the collection uses. It explains how you specify what plugins to use, what parameters to pass to them, and in what order they occur. Under "Design Sections", click "Document Plugins".</Text>
1107<Text id="354">To add a plugin, select it using the "Select plugin to add" pull-down list near the bottom and then click "Add Plugin". A window appears entitled "Configuring Arguments"; it is described later. Once you have configured the new plugin, it is added to the end of the "Currently Assigned Plugins" list. Note that, except for UnknownPlug, each plugin may only occur once in the list.</Text>
1108<Text id="355">To remove a plugin, select it in the list and click "Remove Plugin".</Text>
1109<Text id="356">Plugins are configured by providing arguments. To alter them, select the plugin from the list and click "Configure Plugin" (or double-click the plugin). A "Configuring Arguments" dialog appears with various controls for specifying arguments.</Text>
1110<Text id="357">There are different kinds of controls. Some are checkboxes, and clicking one adds the appropriate option to the plugin. Others are text strings, with a checkbox and a text field. Click the box to enable the argument, then type appropriate text (regular expression, file path etc) in the box. Others are pull-down menus from which you can select from a given set of values. To learn what an argument does, let the mouse hover over its name for a moment and a description will appear.</Text>
1111<Text id="358">When you have changed the configuration, click "OK" to commit the changes and close the dialog, or "Cancel" to close the dialog without changing any plugin arguments.</Text>
1112<Text id="359">The plugins in the list are executed in order, and the ordering is sometimes important. The order of the plugins can be changed in Library Systems Specialist and Expert modes only (see Preferences).</Text>
1113</Content>
1114</Part>
1115<Part id="search_types">
1116<Title>
1117<Text id="360">Search Types</Text>
1118</Title>
1119<Content>
1120<Text id="361">This section explains how to modify a new design feature in Greenstone, Search Types, which allow fielded searching. The search types specify what kind of search interface should be provided: form, for fielded searching, and/or plain for regular searching. Under "Design Sections", click "Search Types".</Text>
1121<Text id="362">When you enter the Search Types view, first check "Enable Advanced Searches", which activates the other controls. This changes the collection to use an indexing mechanism that allows fielded searching. Index specification is slightly different in this mode. (When switching between standard and advanced searching, the GLI does its best to convert the index specification, but may not get it completely right.)</Text>
1122<Text id="363">To add a search type, select it from the "Search Types" list and click "Add Search Type". Each type can only appear in the list once. The first search type will be the default, and will appear on the search page of the built collection. Any others will be selectable from the preferences page.</Text>
1123<Text id="364">To remove a search type, select it from the "Currently Assigned Search Types" list and click "Remove Search Type". The list must contain at least one search type.</Text>
1124</Content>
1125</Part>
1126<Part id="search_indexes">
1127<Title>
1128<Text id="365">Search Indexes</Text>
1129</Title>
1130<Content>
1131<Text id="366">Indexes specify what parts of the collection are searchable. This section explains how to add and remove indexes, and set a default index. Under "Design Sections", click "Search Indexes".</Text>
1132<Text id="367">To add an index, type a name for it into the "Index Name" field. Select which of the possible information sources to index by clicking the checkboxes beside them. The list shows all the assigned metadata elements, as well as the full text. Having selected the data sources, choose the granularity of the index, using the "At the level" menu. Once these details are complete, "Add Index" becomes active (unless there is an existing index with the same settings). Click it to add the new index.</Text>
1133<Text id="368">To edit an index, select it and change the index details, then click "Replace Index".</Text>
1134<Text id="369">To remove an index, select it from the list of assigned indexes and click "Remove Index".</Text>
1135<Text id="370">To create an index covering text and all metadata, click "Add All".</Text>
1136<Text id="371">The default index, the one used on the collection's search page, is tagged with "[Default Index]" in the "Assigned Indexes" list. To set it, select an index from the list and click "Set Default".</Text>
1137<Text id="372">If advanced searching is enabled (via the Search Types view), the index controls are different. There is a new pseudo-data source "allfields" which provides searching across all specified indexes at once. Levels are not assigned to a specific index, but apply across all indexes: thus indexes and levels are added separately. "Add All" creates a separate index for each metadata field in this mode.</Text>
1138<Text id="373">The name of each index will default to the source name. To change the name, select an index, change its details, and click "Replace Index".</Text>
1139</Content>
1140</Part>
1141<Part id="partition_indexes">
1142<Title>
1143<Text id="374">Partition Indexes</Text>
1144</Title>
1145<Content>
1146<Text id="375">Indexes are built on particular text or metadata sources. The search space can be further controlled by partitioning the index, either by language or by a predetermined filter. This section describes how to do this. Under "Design Sections", click "Partition Indexes".</Text>
1147<Text id="376">The "Partition Indexes" view has three tabs: "Define Filters", "Assign Partitions" and "Assign Languages". To learn more about partitions, read about subcollections and subindexes in Chapter 2 of the Greenstone Developer's Guide.</Text>
1148<Text id="377">The Partition Indexes screen is only enabled in Library Systems Specialist and Expert modes (see Preferences). Note that the total number of partitions generated is the product of the numbers of indexes, subcollection filters and languages chosen. For example, two indexes with two subcollection filters in two languages would yield eight index partitions.</Text>
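<Text>The partition count described above is a simple product. The following is a minimal sketch in Python, not part of Greenstone: the function name partition_count is hypothetical, and it assumes that every index is combined with every subcollection filter and every language.</Text>

```python
def partition_count(num_indexes, num_filters, num_languages):
    """Return the number of index partitions for the given choices.

    Illustrative only: assumes every index is combined with every
    subcollection filter and every language, as in the example above.
    """
    return num_indexes * num_filters * num_languages


# Two indexes, two subcollection filters, two languages.
print(partition_count(2, 2, 2))  # prints 8
```

<Text>Because the factors multiply, adding one more language to this example would raise the count from eight to twelve, which is worth bearing in mind for large collections.</Text>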
1149</Content>
1150</Part>
1151<Part id="define_filters">
1152<Title>
1153<Text id="378">Define Filters</Text>
1154</Title>
1155<Content>
1156<Text id="379">Filters allow you to group together into a subcollection all documents in an index for which a metadata value matches a given pattern.</Text>
1157<Text id="380">To create a filter, click the "Define Filters" tab and enter a name for the new filter into the "Subcollection filter name:" field. Next choose a document attribute to match against, either a metadata element or the name of the file in question. Enter a regular expression to use during the matching. You can toggle between "Including" documents that match the filter, or "Excluding" them. You can also specify any of the standard Perl regular expression flags to use when matching (e.g. "i" for case-insensitive matching). Finally, click "Add Filter" to add the filter to the "Defined Subcollection Filters" list.</Text>
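<Text>To make the including and excluding behaviour concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Greenstone's own code: the function matches_filter and the sample titles are hypothetical, Python's re module stands in for Perl's regular expressions (the two differ in some details), and only the "i" flag is handled.</Text>

```python
import re


def matches_filter(value, pattern, flags="", include=True):
    """Decide whether a document belongs to a subcollection.

    value   -- the metadata value or file name being tested
    pattern -- the regular expression entered for the filter
    flags   -- flag letters; only "i" (ignore case) is handled here
    include -- True for "Including" matches, False for "Excluding"
    """
    re_flags = re.IGNORECASE if "i" in flags else 0
    found = re.search(pattern, value, re_flags) is not None
    return found if include else not found


# Hypothetical Title values, for illustration only.
titles = ["Farming in Africa", "Water supply", "farming tools"]
# Keep titles that start with "farming", ignoring case.
print([t for t in titles if matches_filter(t, "^farming", "i")])
```

<Text>Without the "i" flag the same pattern would keep only the lower-case title, and switching to "Excluding" would keep exactly the titles that the pattern does not match.</Text>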
1158<Text id="381">To remove a filter, select it from the list and click "Remove Filter".</Text>
1159<Text id="382">To alter a filter, select it from the list, change any of the values that appear in the editing controls and click "Replace Filter" to commit the changes.</Text>
1160</Content>
1161</Part>
1162<Part id="assign_partitions">
1163<Title>
1164<Text id="383">Assign Partitions</Text>
1165</Title>
1166<Content>
1167<Text id="384">Having defined a subcollection filter, use the "Assign Partitions" tab to build indexes for it (or for a group of filters). Select the desired filter (or filters) from the "Defined Subcollection Filters" list, enter a name for your partition in the "Partition Name" field, and click "Add Partition".</Text>
1168<Text id="385">To remove a partition, select it from the list and click "Remove Partition".</Text>
1169<Text id="386">To make a partition the default one, select it from the list and click "Set Default".</Text>
1170</Content>
1171</Part>
1172<Part id="assign_languages">
1173<Title>
1174<Text id="387">Assign Languages</Text>
1175</Title>
1176<Content>
1177<Text id="388">This section details how to restrict search indexes to particular languages. You do this by generating a partition using the "Assign Languages" tab of the "Partition Indexes" view.</Text>
1178<Text id="389">To add a new language to partition by, use the "Assign Languages" tab to build an index for it. Select the desired language from the "Language to add" pull-down list and click "Add Language".</Text>
1179<Text id="390">To remove a language, select it from the "Language Selection" list and click "Remove Language".</Text>
1180<Text id="391">To set the default language, select it from the list and click "Set Default".</Text>
1181</Content>
1182</Part>
1183<Part id="cross-collection_search">
1184<Title>
1185<Text id="392">Cross-Collection Search</Text>
1186</Title>
1187<Content>
1188<Text id="393">Greenstone can search across several different collections as though they were one. This is done by specifying a list of other collections to be searched along with the current one. Under "Design Sections", click "Cross-Collection Search".</Text>
1189<Text id="394">The Cross-Collection Search view shows a checklist of available collections. The current collection is ticked and cannot be deselected. To add another collection to be searched in parallel, click it in the list (click again to remove it). If only one collection is selected, there is no cross-collection searching.</Text>
1190<Text id="395">If the individual collections do not have the same indexes (including subcollection partitions and language partitions) as each other, cross-collection searching will not work properly. The user will only be able to search using indexes common to all collections.</Text>
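<Text>The restriction to shared indexes amounts to a set intersection. The sketch below, in Python, shows the idea only; the function common_indexes, the collection names and the index names are all hypothetical, and Greenstone's own handling may differ.</Text>

```python
def common_indexes(collections):
    """Return the index names present in every collection.

    collections -- dict mapping a collection name to its list of indexes
    """
    index_sets = [set(names) for names in collections.values()]
    if not index_sets:
        return set()
    return set.intersection(*index_sets)


# Hypothetical collections and index names, for illustration only.
example = {
    "demo": ["text", "dc.Title", "dc.Creator"],
    "reports": ["text", "dc.Title"],
}
print(sorted(common_indexes(example)))
```

<Text>In this example a search across both collections could use only the two indexes they share; the third index of the first collection would not be usable.</Text>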
1191<Text id="396">For further details, see Chapter 1 of the Greenstone Developer's Guide.</Text>
1192</Content>
1193</Part>
1194<Part id="browsing_classifiers">
1195<Title>
1196<Text id="397">Browsing Classifiers</Text>
1197</Title>
1198<Content>
1199<Text id="398">This section explains how to assign "classifiers", which are used for browsing, to the collection. Under "Design Sections", click "Browsing Classifiers".</Text>
1200<Text id="399">To add a classifier, select it using the "Select classifier to add" pull-down list near the bottom and then click "Add Classifier". A window appears entitled "Configuring Arguments"; instructions for this dialog are just the same as for plugins (see Document Plugins). Once you have configured the new classifier, it is added to the end of the "Currently Assigned Classifiers" list.</Text>
1201<Text id="400">To remove a classifier, select it from the list and click "Remove Classifier".</Text>
1202<Text id="401">To change the arguments of a classifier, select it from the list and click "Configure Classifier" (or double-click on the classifier in the list).</Text>
1203<Text id="402">The ordering of classifiers in the collection's navigation bar is reflected in their order here. To change it, select the classifier you want to move and click "Move Up" or "Move Down".</Text>
1204<Text id="403">For further information on classifiers, read Chapter 2 of the Greenstone Developer's Guide, "Getting the most out of your documents".</Text>
1205</Content>
1206</Part>
1207<Part id="format_features">
1208<Title>
1209<Text id="404">Format Features</Text>
1210</Title>
1211<Content>
1212<Text id="405">The web pages you see when using Greenstone are not pre-stored but are generated 'on the fly' as they are needed. Format commands are used to change the appearance of these generated pages. They affect such things as where buttons appear when a document is shown, and what links are displayed by the DateList classifier. Format commands are not easy to develop, and you should read Chapter 2 of the Greenstone Developer's Guide. This section discusses the format settings, and how the Librarian Interface gives access to them. Under "Design Sections", click "Format Features".</Text>
1213<Text id="406">You can apply a format command to anything in the "Choose Feature" pull-down list, which includes each classifier and a predefined list of features. When you select a feature, there are two types of control. Some features are simply enabled or disabled, and this is controlled by a checkbox. Others require a format string to be specified. For these there is a pull-down list ("Affected Component") for selecting which part of the feature the string applies to (if necessary), a text area ("HTML Format String") for entering the string, and a selection of predefined "Variables". To insert a variable into the current position in the format string, select it from the pull-down list and click "Insert".</Text>
1214<Text id="407">You can specify a default format for a particular component by selecting the blank feature. This format is then applied to all applicable features unless otherwise specified.</Text>
1215<Text id="408">To add a new format command, fill out the information as explained above and click "Add Format". The new format command appears in the list of "Currently Assigned Format Commands". Only one format command can be assigned to each feature/component combination.</Text>
1216<Text id="409">To remove a format command, select it from the list and click "Remove Format".</Text>
1217<Text id="410">To change a format command, select it from the list, modify the settings, and click "Replace Format".</Text>
1218<Text id="411">For more information about variables and the feature components, read Chapter 2 of the Greenstone Developer's Guide.</Text>
1219<Text id="412">If the "Allow Extended Options" checkbox is ticked, some advanced formatting options are enabled. The list of features that can be formatted is changed slightly, and more variables are available to be used in the format command, providing greater control over the page layout.</Text>
1220</Content>
1221</Part>
1222<Part id="translate_text">
1223<Title>
1224<Text id="413">Translate Text</Text>
1225</Title>
1226<Content>
1227<Text id="414">This section describes the translation view, where you can define language-specific text fragments for parts of the collection's interface. Under "Design Sections", click "Translate Text".</Text>
1228<Text id="415">First choose an entry from the "Features" list. The language-specific strings associated with this feature appear below. Use the "Language of translation" pull-down list to select the target language, and type the translated text into the text area, referring to the "Initial Text Fragment" if necessary. Click "Add Translation" when finished.</Text>
1229<Text id="416">To remove an existing translation, select it in the "Assigned Translations" table and click "Remove Translation".</Text>
1230<Text id="417">To edit a translation, select it, edit it in the "Translated Text" text area, and click "Replace Translation".</Text>
1231</Content>
1232</Part>
1233<Part id="metadata_sets">
1234<Title>
1235<Text id="418">Metadata Sets</Text>
1236</Title>
1237<Content>
1238<Text id="419">This section explains the metadata set review panel. Under "Design Sections", click "Metadata Sets".</Text>
1239<Text id="420">This view is used to review the metadata sets that the collection uses, and the elements that are available within each set. Choose from the list of "Available Metadata Sets" in order to see details of their elements.</Text>
1240<Text id="421">To use another metadata set with the loaded collection, click "Add Metadata Set" and select the metadata set file (.mds) for the new metadata set.</Text>
1241<Text id="422">Editing metadata sets is done with the Greenstone Editor for Metadata Sets (GEMS). Clicking the "Edit Metadata Set" button provides information on how to run the GEMS.</Text>
1242<Text id="423">If you no longer need a metadata set, select it and press "Remove Metadata Set" to remove it. If you have assigned any metadata to elements in the removed set you will be asked how to deal with this metadata when you next open the collection.</Text>
1243</Content>
1244</Part>
1245</Content>
1246</Subsection>
1247<Subsection id="producing_your_collection">
1248<Title>
1249<Text id="424">Producing Your Collection</Text>
1250</Title>
1251<Content>
1252<Text id="425">Having collected the documents for the collection, annotated them with metadata, and designed how the collection will appear, you can now produce the collection using Greenstone. This section explains how.</Text>
1253<Part id="the_create_view">
1254<Title>
1255<Text id="426">The Create View</Text>
1256</Title>
1257<Content>
1258<Text id="427">The Create view is used to create the collection by running Greenstone collection-building scripts on the information you have provided. Clicking "Build Collection" initiates the collection building process. The time this takes depends on the size of the collection and the number of indexes being created (for huge collections it can be hours). A progress bar indicates how much of the process has been completed. To cancel the process at any time, click "Cancel Build".</Text>
1259<Text id="428">Once the collection has been built successfully, clicking "Preview Collection" will launch a web browser showing the home page of the collection.</Text>
1260<Text id="429">In Expert mode, you can use the "Message Log" entry at the left to review previous attempts to build the collection, whether successful or not. Select the log you want by clicking on the desired date in the "Log History" list.</Text>
1261</Content>
1262</Part>
1263<Part id="import_and_build_settings">
1264<Title>
1265<Text id="430">Import and Build Settings</Text>
1266</Title>
1267<Content>
1268<Text id="431">This section explains how to access the various import and build settings. For more information on importing and building, read Chapter 1 of the Greenstone Developer's Guide, "Understanding the collection-building process".</Text>
1269<Text id="432">Controlling the various settings is done in a similar way to the "Configuring Arguments" window described in the Document Plugins section. Some fields require numeric arguments, and you can either type these in or use the up and down arrows to increase or decrease the current value (in some cases, the interface restricts the range you can enter). Others are enabled by clicking a checkbox (click again to disable).</Text>
1270</Content>
1271</Part>
1272</Content>
1273</Subsection>
1274<Subsection id="miscellaneous">
1275<Title>
1276<Text id="433">Miscellaneous</Text>
1277</Title>
1278<Content>
1279<Text id="434">This section describes features of the Librarian Interface that are not associated with any particular view.</Text>
1280<Part id="preferences">
1281<Title>
1282<Text id="435">Preferences</Text>
1283</Title>
1284<Content>
1285<Text id="436">This section explains the preferences dialog, accessed by opening "File" -&gt; "Preferences".</Text>
1286<Text id="437">The first "General" option is a text field for entering your e-mail address. This will be used for the "creator" and "maintainer" collection metadata items. The next option is a pull-down list of the languages in which the Librarian Interface can be presented. If you change the dictionary by choosing one from the list, you must restart the Librarian Interface in order to load the new language strings from the dictionary.</Text>
1287<Text id="438">If "View Extracted Metadata" is checked, the various controls dealing with metadata always show all metadata that has been extracted automatically from documents. Deselecting it hides this metadata (although it is still available during collection design, and within the final Greenstone collection). If "Show file sizes" is checked, the file size is shown next to each file in the Workspace and Collection file trees in the Gather and Enrich views.</Text>
1288<Text id="439">The "Mode" panel is used to control the level of detail within the interface. At its lowest setting, "Library Assistant", the design view is disabled, arguments requiring regular expressions are hidden and the collection building produces a minimal log of events. In contrast the highest setting, "Expert", provides access to all of the features of design, including plugin positioning and regular expression arguments, and also allows the full output from the collection building to be recorded in the logs. To change or review modes, click the radio button next to the mode you are interested in. You can quickly review what mode you are in by looking at the Librarian Interface's title bar.</Text>
1289<Text id="440">The Librarian Interface can support different workflows by determining which of the various view tabs are visible. Use the "Workflow" tab to customise what views are available by checking the boxes next to the views that you want to be available. Alternatively, use the pull-down list at the bottom to select predetermined configurations. Closing the preferences dialog establishes these workflow settings. These settings are stored with the collection, not in the Librarian Interface configuration file.</Text>
1290<Text id="441">The "Connection" tab lets you alter the path to the locally-running Greenstone library server, which is used when previewing collections. It also lets you set proxy information for connecting to the Internet (e.g. when downloading files; see the Downloading Files From the Internet section for details). Check the box to enable proxy connection and supply details of the proxy host address and port number. The proxy connection is established when you close the Preferences dialog.</Text>
1291<Text id="442">During the course of a session the Librarian Interface may give warning messages which inform you of possibly unforeseen consequences of an action. You can disable the messages by checking the "Do not show this warning again" box. You can re-enable warning messages using the "Warnings" tab. Check the box next to warning messages you want to see again.</Text>
1292</Content>
1293</Part>
1294<Part id="file_associations">
1295<Title>
1296<Text id="443">File Associations</Text>
1297</Title>
1298<Content>
1299<Text id="444">The Librarian Interface uses particular application programs to open particular file types. To alter file associations open the "File" menu and click "File Associations...".</Text>
1300<Text id="445">To add an association, select the target file extension from the pull-down list, or type in a new extension (do not include the "."). Next either type the command that launches the desired application in the appropriate field, or choose the application from the "Browse" dialog. "%1" can be used in the launch command to insert the name of the file being opened. Once these are filled out, "Add" is enabled and can be clicked to add the association.</Text>
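<Text>As a rough illustration of the "%1" placeholder, the Python sketch below substitutes a file name into a launch command. The function build_launch_command and the example command are hypothetical; how the Librarian Interface itself performs the substitution, or what it does when "%1" is absent, is not covered here.</Text>

```python
def build_launch_command(template, filename):
    """Insert the file name wherever "%1" appears in the launch command."""
    return template.replace("%1", filename)


# Hypothetical launch command and file name, for illustration only.
print(build_launch_command('notepad.exe "%1"', "report.txt"))
```

<Text>Putting quotation marks around "%1" in the command, as in this example, is a common precaution for file names that contain spaces.</Text>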
1301<Text id="446">To edit an association, select an existing file extension. Any existing associated command is shown in the launch command field. Edit it, and then click "Replace".</Text>
1302<Text id="447">To remove an association, select an existing file extension and click "Remove". (The file extension remains in the "For Files Ending" pull-down list.)</Text>
1303<Text id="448">File associations are stored in the Librarian Interface's main folder, in a file called "associations.xml".</Text>
1304</Content>
1305</Part>
1306<Part id="exporting_collections_to_cddvd">
1307<Title>
1308<Text id="449">Exporting Collections to CD/DVD</Text>
1309</Title>
1310<Content>
1311<Text id="450">Greenstone can export one or more collections to a self-installing CD/DVD for Windows. To do so, Greenstone's "Export to CD-ROM" package must be installed. This is not included by default, so you may need to modify your installation to include it.</Text>
1312<Text id="451">To export a collection, open the "File" menu and choose "Write CD/DVD Image". A list of Greenstone collections appears; click on any one to see its description. Tick the check boxes of the collections to export. You can enter the CD/DVD's name in the box: this is what will appear in the Start menu when the CD/DVD has been installed. Then click "Export". The process involves copying many files and may take a few minutes.</Text>
1313<Text id="452">Upon completion, Greenstone will show the name of a folder containing the exported collections. Use a CD/DVD writer to copy its contents to a blank CD/DVD.</Text>
1314</Content>
1315</Part>
1316</Content>
1317</Subsection>
1318</Content>
1319</Section>
1320<Section id="tagging_document_files">
1321<Title>
1322<Text id="453">Tagging document files</Text>
1323</Title>
1324<Content>
1325<Text id="454">Source documents often need to be structured into sections and subsections, and this information needs to be communicated to Greenstone so that it can preserve the hierarchical structure. Also, metadata - typically the title - might be associated with each section and subsection.</Text>
1326<Text id="455">The source documents from an OCR process are typically a set of word processor files, including images. If these are represented as Microsoft Word files, they can be input into Greenstone using the Word plugin. Alternatively, they can be converted to HTML and input using the HTML plugin.</Text>
1327<Text id="456">In either case, the hierarchical structure of a document may be indicated by inserting tags in the text as follows:</Text>
1328<CodeLine>&lt;!--</CodeLine>
1329<CodeLine>&lt;Section&gt;</CodeLine>
1330<CodeLine>&lt;Description&gt;</CodeLine>
1331<Text type="code" id="457">&lt;Metadata name="Title"&gt;Realizing human rights for poor people: Strategies for achieving the international development targets&lt;/Metadata&gt;</Text>
1332<CodeLine>&lt;/Description&gt;</CodeLine>
1333<CodeLine>--&gt;</CodeLine>
1334<Text id="458"><i>(text of section goes here)</i></Text>
1335<CodeLine>&lt;!--</CodeLine>
1336<CodeLine>&lt;/Section&gt;</CodeLine>
1337<CodeLine>--&gt;</CodeLine>
1338<Text id="459">The &lt;!-- ... --&gt; markers are used because they indicate comments in HTML; thus these section tags will not affect document formatting. You must include these markers around your section tags, even if the document you are working with is not HTML (e.g. if it's a Microsoft Word file).</Text>
1339<Text id="460">In the Description part (between the &lt;Description&gt; and &lt;/Description&gt; tags) other kinds of metadata can be specified, but this is not done for the style of collections we are describing here.</Text>
1340<Text id="461">It is important to remember that you are creating a hierarchical table of contents when you insert section tags into your document. This means that sections can be nested within other sections. In fact, all sections must be nested within a single enclosing section that encompasses the entire document.</Text>
1341<Text id="462">The following example demonstrates a document with two chapters, the second of which contains two subsections. For real examples of source documents tagged in this way, look at the source documents for the Demo or DLS collections.</Text>
1342<CodeLine>&lt;!--</CodeLine>
1343<CodeLine>&lt;Section&gt;</CodeLine>
1344<CodeLine>&lt;Description&gt;</CodeLine>
1345<CodeLine>&lt;Metadata name="Title"&gt;My Document&lt;/Metadata&gt;</CodeLine>
1346<CodeLine>&lt;/Description&gt;</CodeLine>
1347<CodeLine>&lt;Section&gt;</CodeLine>
1348<CodeLine>&lt;Description&gt;</CodeLine>
1349<CodeLine>&lt;Metadata name="Title"&gt;Chapter 1&lt;/Metadata&gt;</CodeLine>
1350<CodeLine>&lt;/Description&gt;</CodeLine>
1351<CodeLine>--&gt;</CodeLine>
1352<Text type="code" id="463">(text of chapter 1 goes here)</Text>
1353<CodeLine>&lt;!--</CodeLine>
1354<CodeLine>&lt;/Section&gt;</CodeLine>
1355<CodeLine>&lt;Section&gt;</CodeLine>
1356<CodeLine>&lt;Description&gt;</CodeLine>
1357<CodeLine>&lt;Metadata name="Title"&gt;Chapter 2&lt;/Metadata&gt;</CodeLine>
1358<CodeLine>&lt;/Description&gt;</CodeLine>
1359<CodeLine>&lt;Section&gt;</CodeLine>
1360<CodeLine>&lt;Description&gt;</CodeLine>
1361<CodeLine>&lt;Metadata name="Title"&gt;Subsection 1&lt;/Metadata&gt;</CodeLine>
1362<CodeLine>&lt;/Description&gt;</CodeLine>
1363<CodeLine>--&gt;</CodeLine>
1364<Text type="code" id="464">(text of sub-section 1 goes here)</Text>
1365<CodeLine>&lt;!--</CodeLine>
1366<CodeLine>&lt;/Section&gt;</CodeLine>
1367<CodeLine>&lt;Section&gt;</CodeLine>
1368<CodeLine>&lt;Description&gt;</CodeLine>
1369<CodeLine>&lt;Metadata name="Title"&gt;Subsection 2&lt;/Metadata&gt;</CodeLine>
1370<CodeLine>&lt;/Description&gt;</CodeLine>
1371<CodeLine>--&gt;</CodeLine>
1372<Text type="code" id="465">(text of sub-section 2 goes here)</Text>
1373<CodeLine>&lt;!--</CodeLine>
1374<CodeLine>&lt;/Section&gt;</CodeLine>
1375<CodeLine>&lt;/Section&gt;</CodeLine>
1376<CodeLine>&lt;/Section&gt;</CodeLine>
1377<CodeLine>--&gt;</CodeLine>
1378<Text id="466">Note that metadata assigned from within a section tag in a source document takes precedence over that assigned to the document as a whole. This means that you should not explicitly specify Title metadata for the top-level section within a source document unless you want it to override the title you gave it when specifying metadata. In the above example, unless you want to override the document's existing title you should omit the line that reads:</Text>
1379<CodeLine>&lt;Metadata name="Title"&gt;My Document&lt;/Metadata&gt;</CodeLine>
1380</Content>
1381</Section>
1382<Section id="the_collector">
1383<Title>
1384<Text id="467">The Collector</Text>
1385</Title>
1386<Content>
1387<Text id="468">The Collector is a facility that helps you create new collections, modify or add to existing ones, or delete collections. To do this you will be guided through a sequence of web pages which request the information that is needed. The sequence is self-explanatory: this section takes you through it. As an alternative to using the Collector, you can also build collections from the command line; the first few pages of the Developer's Guide give a detailed walk-through of how to do this. The Collector predates the Librarian Interface described in Section 3.1, and for most practical purposes the Librarian Interface should be used instead of the Collector.</Text>
1388<Text id="469">Building and distributing information collections carries responsibilities that you should reflect on before you begin. There are legal issues of copyright: being able to access documents doesn't mean you can necessarily give them to others. There are social issues: collections should respect the customs of the community out of which the documents arise. And there are ethical issues: some things simply should not be made available to others. The pen is mightier than the sword!—be sensitive to the power of information and use it wisely.</Text>
1389<Text id="470">To access the Collector, click the appropriate link on the digital library home page.</Text>
1390<Text id="471">In Greenstone, the structure of a particular collection is determined when the collection is set up. This includes such things as the format of the source documents, how they should be displayed on the screen, the source of metadata, what browsing facilities should be provided, what full-text search indexes should be provided, and how the search results should be displayed. Once the collection is in place, it is easy to add new documents to it—so long as they have the same format as the existing documents, and the same type of metadata is provided, in exactly the same way.</Text>
1391<Text id="472">The Collector has the following basic functions:</Text>
1392<NumberedList>
1393<NumberedItem>
1394<Text id="473">create a new collection with the same structure as an existing one;</Text>
1395</NumberedItem>
1396<NumberedItem>
1397<Text id="474">create a new collection with a different structure from existing ones;</Text>
1398</NumberedItem>
1399<NumberedItem>
1400<Text id="475">add new material to an existing collection;</Text>
1401</NumberedItem>
1402<NumberedItem>
1403<Text id="476">modify the structure of an existing collection;</Text>
1404</NumberedItem>
1405<NumberedItem>
1406<Text id="477">delete a collection; and</Text>
1407</NumberedItem>
1408<NumberedItem>
1409<Text id="478">write an existing collection to a self-contained, self-installing CD-ROM.</Text>
1410</NumberedItem>
1411</NumberedList>
1412<Text id="479">Figure 16 shows the Collector being used to create a new collection, in this case from a set of HTML files stored locally. You must first decide whether to work with an existing collection or build a new one. The former case covers options 3 to 6 above; the latter covers options 1 and 2. In Figure 16a, the user opts to create a new collection.</Text>
1413<Figure id="using_the_collector_to_build_a_new_collection">
1414<Title>
1415<Text id="480">Using the Collector to build a new collection (continued on next pages)</Text>
1416<SubTitle>
1417<Text id="481">(a)</Text>
1418</SubTitle>
1419</Title>
1420<File width="369" height="440" url="images/User_Fig_16a.png"/>
1421</Figure>
1422<Subsection id="logging_in">
1423<Title>
1424<Text id="482">Logging in</Text>
1425</Title>
1426<Content>
1427<Text id="483">Either way it is necessary to log in before proceeding. Note that in general, people use their web browser to access the collection-building facility on a remote computer, and build the collection on that server. Of course, we cannot allow arbitrary people to build collections (for reasons of propriety if nothing else), so Greenstone contains a security system which forces people who want to build collections to log in first. This allows a central system to offer a service to those wishing to build information collections and use that server to make them available to others. Alternatively, if you are running Greenstone on your own computer you can build collections locally, but it is still necessary to log in because other people who use the Greenstone system on your computer should not be allowed to build collections without prior permission.</Text>
1428</Content>
1429</Subsection>
1430<Subsection id="dialog_structure">
1431<Title>
1432<Text id="484">Dialog structure</Text>
1433</Title>
1434<Content>
1435<Figure id="using_the_collector_to_build_a_new_collection_1">
1436<Title>
1437<Text id="485">Using the Collector to build a new collection (Continued)</Text>
1438<SubTitle>
1439<Text id="486">(b)</Text>
1440</SubTitle>
1441</Title>
1442<File width="369" height="435" url="images/User_Fig_16b.png"/>
1443</Figure>
1444<Text id="487">Upon completion of login, the page in Figure 16b appears. This shows the sequence of steps that are involved in collection building. They are:</Text>
1445<NumberedList>
1446<NumberedItem>
1447<Text id="488">Collection information</Text>
1448</NumberedItem>
1449<NumberedItem>
1450<Text id="489">Source data</Text>
1451</NumberedItem>
1452<NumberedItem>
1453<Text id="490">Configuring the collection</Text>
1454</NumberedItem>
1455<NumberedItem>
1456<Text id="491">Building the collection</Text>
1457</NumberedItem>
1458<NumberedItem>
1459<Text id="492">Viewing the collection.</Text>
1460</NumberedItem>
1461</NumberedList>
1462<Text id="493">The first step is to specify the collection's name and associated information. The second is to say where the source data is to come from. The third is to adjust the configuration options, a step that becomes more useful as you gain experience with Greenstone. The fourth step is where all the (computer's) work is done. During the “building” process the system makes all the indexes and gathers together any other information that is required to make the collection operate. The fifth step is to view the collection that has been created.</Text>
1463<Text id="494">These five steps are displayed as a linear sequence of gray buttons at the bottom of the screen in Figure 16b, and at the bottom of all other pages generated by the Collector. This display helps users keep track of where they are in the process. The button that should be clicked to continue the sequence is shown in green (<i>collection information</i> in Figure 16b). The gray buttons (all the others, in Figure 16b) are inactive. The buttons change to yellow as you proceed through the sequence, and the user can return to an earlier step by clicking the corresponding yellow button in the diagram. This display is modeled after the “wizards” that are widely used in commercial software to guide users through the steps involved in installing new software.</Text>
1464</Content>
1465</Subsection>
1466<Subsection id="collection_information">
1467<Title>
1468<Text id="495">Collection information</Text>
1469</Title>
1470<Content>
1471<Figure id="using_the_collector_to_build_a_new_collection_2">
1472<Title>
1473<Text id="496">Using the Collector to build a new collection (Continued)</Text>
1474<SubTitle>
1475<Text id="497">(c)</Text>
1476</SubTitle>
1477</Title>
1478<File width="369" height="504" url="images/User_Fig_16c.png"/>
1479</Figure>
1480<Text id="498">The next step in the sequence, collection information, is shown in Figure 16c. When creating a new collection, it is necessary to enter some information about it:</Text>
1481<BulletList>
1482<Bullet>
1483<Text id="499">title,</Text>
1484</Bullet>
1485<Bullet>
1486<Text id="500">contact E-mail address, and</Text>
1487</Bullet>
1488<Bullet>
1489<Text id="501">brief description.</Text>
1490</Bullet>
1491</BulletList>
1492<Text id="502">The collection title is a short phrase used throughout the digital library to identify the content of the collection. Example titles include <i>Food and Nutrition Library</i>, <i>World Environmental Library</i>, <i>Development Library</i>, and so on. The E-mail address specifies the first point of contact for any problems encountered with the collection. If the Greenstone software detects a problem, a diagnostic report may be sent to this address. Finally, the brief description is a statement describing the principles that govern what is included in the collection. It appears under the heading <i>About this collection</i> on the first page when the collection is presented.</Text>
1493<Text id="503">The user's current position in the collection-building sequence is indicated by an arrow that appears in the display at the bottom of each screen—in this case, as Figure 16c shows, the collection information stage. The user proceeds to Figure 16d by clicking the green source data button.</Text>
1494</Content>
1495</Subsection>
1496<Subsection id="source_data">
1497<Title>
1498<Text id="504">Source data</Text>
1499</Title>
1500<Content>
1501<Figure id="using_the_collector_to_build_a_new_collection_3">
1502<Title>
1503<Text id="505">Using the Collector to build a new collection (Continued)</Text>
1504<SubTitle>
1505<Text id="506">(d)</Text>
1506</SubTitle>
1507</Title>
1508<File width="368" height="532" url="images/User_Fig_16d.png"/>
1509</Figure>
1510<Text id="507">Figure 16d is the point where the user specifies the source text that comprises the collection. You may either base your collection on a default structure that is provided, or on the structure of an existing collection.</Text>
1511<Text id="508">If you opt for the default structure, the new collection may contain html documents (files ending in <i>.htm, .html</i>), plain text documents (files ending in <i>.txt, .text</i>), Microsoft Word documents (files ending in <i>.doc</i>), PDF documents (files ending in <i>.pdf</i>) or E-mail documents (files ending in <i>.email</i>). More information about the different document formats that can be accommodated is given in the section on “Document formats” below.</Text>
1512<Text id="509">If you base your new collection on an existing one, the files in the new collection must be exactly the same type as those used to build the existing one. Note that some collections use non-standard input file formats, while others use metadata specified in auxiliary files. If your new input lacks this information, some browsing facilities may not work properly. For example, if you clone the Demo collection you may find that the <i>subjects</i>, <i>organization</i>, and <i>how to</i> buttons don't work.</Text>
1513<Text id="510">Boxes are provided to indicate where the source documents are located: up to three separate input sources can be specified in Figure 16d. If you need more, just click the button marked “more sources.”</Text>
1514<Text id="511">There are three kinds of specification:</Text>
1515<BulletList>
1516<Bullet>
1517<Text id="512">a directory name on the Greenstone server system (beginning with “file://”)</Text>
1518</Bullet>
1519<Bullet>
1520<Text id="513">an address beginning with “http://” for files to be downloaded from the web</Text>
1521</Bullet>
1522<Bullet>
1523<Text id="514">an address beginning with “ftp://” for files to be downloaded using anonymous FTP.</Text>
1524</Bullet>
1525</BulletList>
1526<Text id="515">If you use <i>file://</i> or <i>ftp://</i> to specify a file, that file will be downloaded.</Text>
1527<Text id="516">If you use <i>http://</i>, the outcome depends on whether the URL leads to a normal web page in your browser or to a list of files. If it leads to a page, that page will be downloaded, and so will all pages it links to, all pages they link to, and so on, provided they reside on the same site, below the URL.</Text>
1528<Text id="517">If you use <i>file://</i> or <i>ftp://</i> to specify a folder or directory, or give an <i>http://</i> URL that leads to a list of files, everything in the folder and all its subfolders will be included in the collection.</Text>
1529<Text id="518">You can specify sources of more than one type.</Text>
1530<Text id="519">In this case (Figure 16d) the new collection will contain documents taken from a local file system as well as a remote web site, which will be mirrored during the building process.</Text>
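<Text>As a rough illustration of the three kinds of specification described above, the following Python sketch classifies a source by its prefix. It is not part of Greenstone: the function name <i>classify_source</i>, the descriptive labels and the example addresses are all assumptions made for this example.</Text>

```python
from urllib.parse import urlsplit

# Hypothetical labels for the three kinds of source specification.
# The wording is illustrative and not taken from the Greenstone software.
SOURCE_KINDS = {
    "file": "directory or file on the Greenstone server system",
    "http": "files to be downloaded from the web",
    "ftp": "files to be downloaded using anonymous FTP",
}


def classify_source(spec):
    """Return the kind of source a specification names, or None if unrecognised."""
    scheme = urlsplit(spec).scheme.lower()
    return SOURCE_KINDS.get(scheme)


if __name__ == "__main__":
    # Example addresses are made up for the demonstration.
    examples = ("file:///data/docs", "http://example.org/reports/", "ftp://example.org/pub/")
    for spec in examples:
        print(spec, "->", classify_source(spec))
```

<Text>A specification with any other prefix yields <i>None</i> in this sketch; how the Collector itself treats such input is not described here.</Text>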
1531<Text id="520">When you click the <i>configure collection</i> button to proceed to the next stage of building, the Collector checks that all the sources of input you specified can be reached. This might take a few seconds, or even a few minutes if you have specified several sources. If one or more of the input sources you specified is unavailable, you will be presented with a page like that in Figure 16e, where the unavailable sources are marked (both of them in this case).</Text>
1532<Figure id="using_the_collector_to_build_a_new_collection_4">
1533<Title>
1534<Text id="521">Using the Collector to build a new collection (Continued)</Text>
1535<SubTitle>
1536<Text id="522">(e)</Text>
1537</SubTitle>
1538</Title>
1539<File width="368" height="531" url="images/User_Fig_16e.png"/>
1540</Figure>
1541<Text id="523">Sources might be unavailable because</Text>
1542<BulletList>
1543<Bullet>
1544<Text id="524">the file, FTP site or URL does not exist;</Text>
1545</Bullet>
1546<Bullet>
1547<Text id="525">you need to dial up your ISP first;</Text>
1548</Bullet>
1549<Bullet>
1550<Text id="526">you are trying to access a URL from behind a firewall.</Text>
1551</Bullet>
1552</BulletList>
1553<Text id="527">The last case is potentially the most mysterious. It occurs if you normally have to present a username and password to access the Internet. Sometimes you can see the page from your Web browser if you enter the URL, but the Collector claims that it is unavailable. The explanation is that the page in your browser may be coming from a locally cached copy. Unfortunately, locally cached copies are invisible to the Collector. In this case we recommend that you download the pages using your browser first.</Text>
1554</Content>
1555</Subsection>
1556<Subsection id="configuring_the_collection">
1557<Title>
1558<Text id="528">Configuring the collection</Text>
1559</Title>
1560<Content>
1561<Figure id="using_the_collector_to_build_a_new_collection_5">
1562<Title>
1563<Text id="529">Using the Collector to build a new collection (Continued)</Text>
1564<SubTitle>
1565<Text id="530">(f)</Text>
1566</SubTitle>
1567</Title>
1568<File width="369" height="467" url="images/User_Fig_16f.png"/>
1569</Figure>
1570<Text id="531">Figure 16f shows the next stage. The construction and presentation of all collections are controlled by specifications in a special collection configuration file (see below). Advanced users may use this page to alter the configuration settings. Most, however, will proceed directly to the final stage. Indeed, in Figure 16d both the <i>configure collection</i> and the <i>build collection</i> buttons are displayed in green, signifying that step 3 can be bypassed completely.</Text>
1571<Text id="532">In our example the user has made a small modification to the default configuration file by including the <i>file_is_url</i> flag with the html plugin. This flag causes URL metadata to be inserted in each document, based on the filename convention that is adopted by the mirroring package. This metadata is used in the collection to allow readers to refer to the original source material, rather than to a local copy.</Text>
1572</Content>
1573</Subsection>
1574<Subsection id="building_the_collection_1">
1575<Title>
1576<Text id="533">Building the collection</Text>
1577</Title>
1578<Content>
1579<Figure id="using_the_collector_to_build_a_new_collection_6">
1580<Title>
1581<Text id="534">Using the Collector to build a new collection (Continued)</Text>
1582<SubTitle>
1583<Text id="535">(g)</Text>
1584</SubTitle>
1585</Title>
1586<File width="369" height="304" url="images/User_Fig_16g.png"/>
1587</Figure>
1588<Text id="536">Figure 16g shows the “building” stage. Up until now, the responses to the dialog have merely been recorded in a temporary file. The building stage is where the action takes place.</Text>
1589<Text id="537">During building, indexes for both browsing and searching are constructed according to instructions in the collection configuration file. The building process takes some time: minutes to hours, depending on the size of the collection and the speed of your computer. Some very large collections take a day or more to build.</Text>
1590<Text id="538">When you reach this stage in the interaction, a status line at the bottom of the web page gives feedback on how the operation is progressing, updated every five seconds. The message visible in Figure 16g indicates that when the snapshot was taken, Title metadata was being extracted from an input file.</Text>
1591<Text id="539">Warnings are written if a requested input file or URL does not exist, if it exists but no plugin can process it, or if a plugin cannot find an associated file, such as an image file embedded in an html document. The intention is that you will monitor progress by keeping this window open in your browser. If any errors cause the process to terminate, they are recorded in this status area.</Text>
1592<Text id="540">You can stop the building process at any time by clicking on the <i>stop building</i> button in Figure 16g. If you leave the web page (and have not cancelled the building process with the <i>stop building</i> button), the building operation will continue, and the new collection will be installed when the operation completes.</Text>
1593</Content>
1594</Subsection>
1595<Subsection id="viewing_the_collection">
1596<Title>
1597<Text id="541">Viewing the collection</Text>
1598</Title>
1599<Content>
1600<Text id="542">When the collection is built and installed, the sequence of buttons visible at the bottom of Figures 16b–f appears at the bottom of Figure 16g, with the View collection button active. This takes the user directly to the newly built collection.</Text>
1601<Text id="543">Finally, there is a facility for E-mail to be sent to the collection's contact E-mail address, and to the system's administrator, whenever a collection is created (or modified). This allows those responsible to check when changes occur, and monitor what is happening on the system. The facility is disabled by default but can be enabled by editing the <i>main.cfg</i> configuration file (see the <i>Greenstone Digital Library Developer's Guide</i>, Section <CrossRef target="Chapter" external="Develop" lang="en" ref="configuring_your_greenstone_site"/>).</Text>
1602</Content>
1603</Subsection>
1604<Subsection id="working_with_existing_collections">
1605<Title>
1606<Text id="544">Working with existing collections</Text>
1607</Title>
1608<Content>
1609<Text id="545">When you enter the Collector you have to specify whether you want to create an entirely new collection or work with an existing one, adding data to it or deleting it. By creating all searching and browsing structures automatically from the documents themselves, Greenstone makes it easy to add new information to existing collections. Because no links are inserted by hand, when new documents in the same format become available they can be merged into the collection automatically.</Text>
1610<Text id="546">To work with an existing collection, you first select the collection from a list that is provided. Some collections are “write protected” and cannot be altered: these do not appear in the selection list. With the collection, you can</Text>
1611<BulletList>
1612<Bullet>
1613<Text id="547">Add more data and rebuild the collection</Text>
1614</Bullet>
1615<Bullet>
1616<Text id="548">Edit the collection configuration file</Text>
1617</Bullet>
1618<Bullet>
1619<Text id="549">Delete the collection entirely</Text>
1620</Bullet>
1621<Bullet>
1622<Text id="550">Export the collection to CD-ROM.</Text>
1623</Bullet>
1624</BulletList>
1625<Part id="add_new_data">
1626<Title>
1627<Text id="551">Add new data</Text>
1628</Title>
1629<Content>
1630<Text id="552">The files that you specify will be added to the collection. Make sure that you do not re-specify files that are already in the collection—otherwise two copies will be included. Files are identified by their full pathname, web pages by their absolute web address. You specify directories and files just as you do when building a new collection.</Text>
1631<Text id="553">If you add data to a collection and for some reason the building process fails, the old version of the collection remains unchanged.</Text>
1632</Content>
1633</Part>
1634<Part id="edit_configuration_file">
1635<Title>
1636<Text id="554">Edit configuration file</Text>
1637</Title>
1638<Content>
1639<Text id="555">Advanced users can edit the collection configuration file, just as they can when a new collection is built.</Text>
1640</Content>
1641</Part>
1642<Part id="delete_the_collection">
1643<Title>
1644<Text id="556">Delete the collection</Text>
1645</Title>
1646<Content>
1647<Text id="557">You will be asked to confirm whether you really want to delete the collection. Once a collection has been deleted, Greenstone cannot bring it back!</Text>
1648</Content>
1649</Part>
1650<Part id="export_the_collection">
1651<Title>
1652<Text id="558">Export the collection</Text>
1653</Title>
1654<Content>
1655<Text id="559">You can export the collection in a form that allows it to be written to a self-contained, self-installing Greenstone CD-ROM for Windows. Because commercial software that creates self-installing CD-ROMs is expensive, this facility includes a homegrown installer module.</Text>
1656<Text id="560">When you export the collection, the dialogue informs you of the directory name in which the result has been placed. The entire contents of the directory should be written on to CD-ROM using a standard CD-writing utility.</Text>
1657<Text id="561">The immense variety of different possible Windows configurations has made it difficult for us to test and debug the Greenstone installer under all possible conditions. Although the installer produces CD-ROMs that operate on most Windows systems, it is still under development. If you experience problems and you possess a commercial installation package (e.g. InstallShield), you can use it to create CD-ROMs from the information that Greenstone provides. The above-mentioned export directory contains four files that relate to the installation process, and three subdirectories that contain the complete collection and software. Remove the four files and use InstallShield to make a CD-ROM image that installs these directories and creates a shortcut to the program <i>gsdl\server.exe</i>.</Text>
1658</Content>
1659</Part>
1660</Content>
1661</Subsection>
1662<Subsection id="document_formats_1">
1663<Title>
1664<Text id="562">Document formats</Text>
1665</Title>
1666<Content>
1667<Text id="563">When building collections, Greenstone processes each different format of source document by seeking a “plugin” that can deal with that particular format. Plugins are specified in the collection configuration file. Greenstone generally uses the filename to determine document formats—for example, <i>foo.txt</i> is processed as a text file, <i>foo.html</i> as html, and <i>foo.doc</i> as a Word file.</Text>
1668<Text id="564">Here is a summary of the plugins that are available for widely used document formats. More detail about these plugins, and additional plugins for less commonly used formats, can be found in the <i>Greenstone Digital Library Developer's Guide</i>.</Text>
1669<Part id="textplug">
1670<Title>
1671<Text id="565">TEXTPlug (*.txt, *.text)</Text>
1672</Title>
1673<Content>
1674<Text id="566">TEXTPlug interprets a plain text file as a simple document. It adds <i>title</i> metadata based on the first line of the file.</Text>
1675</Content>
1676</Part>
1677<Part id="htmlplug">
1678<Title>
1679<Text id="567">HTMLPlug (*.htm, *.html; also .shtml, .shm, .asp, .php, .cgi)</Text>
1680</Title>
1681<Content>
1682<Text id="568">HTMLPlug processes html files. It extracts <i>title</i> metadata based on the &lt;title&gt; tag; other metadata expressed using html's metatag syntax can be extracted too. There are many options available with this plugin, documented in the <i>Greenstone Digital Library Developer's Guide</i>.</Text>
1683</Content>
1684</Part>
1685<Part id="wordplug">
1686<Title>
1687<Text id="569">WORDPlug (*.doc)</Text>
1688</Title>
1689<Content>
1690<Text id="570">WORDPlug imports Microsoft Word documents. There are many different variants on the Word format—and even Microsoft programs frequently make conversion errors. Greenstone uses independent programs to convert Word files to html. For some older Word formats the system resorts to a simple extraction algorithm that finds all text strings in the input file.</Text>
1691</Content>
1692</Part>
1693<Part id="pdfplug">
1694<Title>
1695<Text id="571">PDFPlug (*.pdf)</Text>
1696</Title>
1697<Content>
1698<Text id="572">PDFPlug imports documents in PDF, Adobe's Portable Document Format. Like WORDPlug, it uses an independent program, in this case <i>pdftohtml</i>, to convert PDF files to html.</Text>
1699<Text id="573">As with WORDPlug, by default collections will display the html equivalent of the file when the user clicks the <i>document</i> icon; however, the format strings in the collection configuration file can be adjusted to give the user access to the original PDF file instead, and we recommend that you do this. To do so, replace the <i>&lt;link&gt; &lt;/link&gt;</i> tags by <i>&lt;srclink&gt; &lt;/srclink&gt;</i> ones.</Text>
1700<Text id="574">The <i>pdftohtml</i> program fails on some PDF files. What happens is that the conversion process takes an exceptionally long time, and often an error message relating to the conversion process appears on the screen. If this occurs, the only solution that we can offer is to remove the offending document from the collection. Also, PDFPlug cannot handle encrypted PDF files.</Text>
1701</Content>
1702</Part>
1703<Part id="psplug">
1704<Title>
1705<Text id="575">PSPlug (*.ps)</Text>
1706</Title>
1707<Content>
1708<Text id="576">PSPlug imports documents in PostScript. It works best if a standard Linux program, called <i>ps2ascii</i>, is already installed on your computer. This is available on most Linux installations, but not on Windows. If this program is not available, PSPlug resorts to a simple text extraction algorithm.</Text>
1709</Content>
1710</Part>
1711<Part id="emailplug">
1712<Title>
1713<Text id="577">EMAILPlug (*.email)</Text>
1714</Title>
1715<Content>
1716<Text id="578">EMAILPlug imports files containing E-mail, and deals with common E-mail formats such as are used by the Netscape, Eudora, and Unix mail readers. Each source document is examined to see if it contains an E-mail, or several E-mails joined together in one file, and if so its contents are processed. The plugin extracts <i>Subject</i>, <i>To</i>, <i>From</i>, and <i>Date</i> metadata. However, this plugin does not yet handle MIME-encoded E-mails properly—although legible, they often look rather strange.</Text>
1717</Content>
1718</Part>
1719<Part id="zipplug">
1720<Title>
1721<Text id="579">ZIPPlug (.gz, .z, .tgz, .taz, .bz, .zip, .tar)</Text>
1722</Title>
1723<Content>
1724<Text id="580">ZIPPlug handles the following compressed and/or archived input formats: gzip (.<i>gz</i>, .<i>z</i>, .<i>tgz</i>, .<i>taz</i>), bzip (.<i>bz</i>), zip (.<i>zip</i>, .<i>jar</i>), and tar (.<i>tar</i>). It relies on the programs <i>gunzip</i>, <i>bunzip</i>, <i>unzip</i>, and <i>tar</i>, which are standard Linux utilities. ZIPPlug is disabled on Windows computers.</Text>
1725</Content>
1726</Part>
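<Text>The filename conventions listed in the plugin summaries above can be pictured as a lookup table from file extension to plugin name. The Python sketch below is only an illustration of that idea and is not how Greenstone itself selects plugins; the names <i>PLUGIN_EXTENSIONS</i> and <i>plugin_for</i> are hypothetical, and treating extensions as case-insensitive is an assumption.</Text>

```python
import os

# Extensions copied from the plugin summaries above. The table is an
# illustration only, not a data structure used by Greenstone.
PLUGIN_EXTENSIONS = {
    "TEXTPlug": (".txt", ".text"),
    "HTMLPlug": (".htm", ".html", ".shtml", ".shm", ".asp", ".php", ".cgi"),
    "WORDPlug": (".doc",),
    "PDFPlug": (".pdf",),
    "PSPlug": (".ps",),
    "EMAILPlug": (".email",),
    "ZIPPlug": (".gz", ".z", ".tgz", ".taz", ".bz", ".zip", ".jar", ".tar"),
}

# Invert the table so that each extension points at its plugin.
PLUGIN_BY_EXTENSION = {
    extension: plugin
    for plugin, extensions in PLUGIN_EXTENSIONS.items()
    for extension in extensions
}


def plugin_for(filename):
    """Return the plugin name suggested by the filename, or None if no entry matches."""
    # Lower-casing the extension is an assumption made for this sketch.
    extension = os.path.splitext(filename)[1].lower()
    return PLUGIN_BY_EXTENSION.get(extension)


if __name__ == "__main__":
    for name in ("foo.txt", "foo.html", "foo.doc", "archive.tar.gz", "picture.xyz"):
        print(name, "->", plugin_for(name))
```

<Text>Only the final extension is examined, so in this sketch <i>archive.tar.gz</i> is matched by its <i>.gz</i> suffix.</Text>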
1727</Content>
1728</Subsection>
1729</Content>
1730</Section>
1731</Content>
1732</Chapter>
1733<Chapter id="administration">
1734<Title>
1735<Text id="581">Administration</Text>
1736</Title>
1737<Content>
1738<Text id="582">An “administrative” facility is included with every Greenstone installation. To access this facility, click the appropriate link on the front page.</Text>
1739<Text id="583">The entry page, shown in Figure 17, gives information about each of the collections offered by the system. Note that <i>all</i> collections are included, because there may be “private” ones that do not appear on the Greenstone home page. For each collection the page gives its short name, its full name, whether it is publicly displayed, and whether or not it is running. Clicking a particular collection's abbreviation (the first column of links in Figure 17) brings up information about that collection, gathered from its collection configuration file and from other internal structures created for that collection. If the collection is both public and running, clicking the collection's full name (the second link) takes you to the collection itself.</Text>
1740<Text id="584">A collection named <i>wohiex</i>, for <i>Women's History Excerpt</i>, is visible near the bottom of Figure 17. Figure 18 shows the information that is displayed when this link is clicked. The first section gives some information from the configuration file, and the size of the collection (about 1000 documents, about a million words, over 6 Mb). The next sections contain internal information related to the communication protocol through which collections are accessed. For example, the filter options for “QueryFilter” show the options and possible values that can be used when querying the collection.</Text>
1741<Text id="585">The administrative facility also presents configuration information about the installation and allows it to be modified. It facilitates examination of the error logs that record internal errors, and the user logs that record usage. It enables a specified user (or users) to authorize others to build collections and add new material to existing ones. All these facilities are accessed interactively from the menu items at the left-hand side of Figure 17.</Text>
1742<Figure id="greenstone_administration_facility">
1743<Title>
1744<Text id="586">Greenstone Administration facility</Text>
1745</Title>
1746<File width="335" height="699" url="images/User_Fig_17.png"/>
1747</Figure>
1748<Figure id="information_about_the_womens_history_excerpt_collection">
1749<Title>
1750<Text id="587">Information about the <i>Women's History Excerpt</i> collection</Text>
1751</Title>
1752<File width="331" height="699" url="images/User_Fig_18.png"/>
1753</Figure>
1754<Section id="configuration_files">
1755<Title>
1756<Text id="588">Configuration files</Text>
1757</Title>
1758<Content>
1759<Text id="589">Two configuration files control Greenstone's operation: the site configuration file <i>gsdlsite.cfg</i> and the main configuration file <i>main.cfg</i>.</Text>
1760<Text id="590">The <i>gsdlsite.cfg</i> file is used to configure the Greenstone software for the site where it is installed. It is designed for keeping configuration options that are particular to a given site. Examples include the name of the directory where the Greenstone software is kept, the http address of the Greenstone system, and whether the <i>fastcgi</i> facility is being used. The entries in this file are described in the <i>Greenstone Digital Library Installation Guide</i>.</Text>
1761<Text id="591">The <i>main.cfg</i> file contains information that is common to the interface of all collections served by a Greenstone site. It includes the E-mail address of the system maintainer, whether the status and collector pages are enabled, whether logs of user activity are kept, and whether Internet “cookies” are used to identify users.</Text>
1762</Content>
1763</Section>
1764<Section id="logs">
1765<Title>
1766<Text id="592">Logs</Text>
1767</Title>
1768<Content>
1769<Text id="593">Three kinds of logs can be examined: usage logs, error logs and initialization logs. The last two are only really of interest to people maintaining the software.</Text>
1770<Text id="594">All user activity—every page that each user visits—can be recorded by the Greenstone software, though no personal names are included in the logs. Logging, disabled by default, is enabled by including the lines</Text>
1771<CodeLine>logcgiargs true</CodeLine>
1772<CodeLine>usecookies true</CodeLine>
1773<Text id="595">in the main system configuration file. Both options are false by default, so no logging is done unless they are set. It is the <i>logcgiargs</i> line that actually turns logging on and off. Activating <i>usecookies</i> assigns a unique identification code to each user, which enables each individual user's interactions to be traced through the log file.</Text>
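<Text>To make the defaults concrete, the following Python sketch reads the two options from configuration text and falls back to false when a line is absent. It is not Greenstone code: the function name <i>logging_options</i> is hypothetical, and the sketch assumes that each option sits on its own line as a name followed by a value, as in the two lines shown above.</Text>

```python
def logging_options(config_text):
    """Return the logcgiargs and usecookies settings found in the given text."""
    # Both options are false unless the configuration sets them.
    options = {"logcgiargs": False, "usecookies": False}
    for line in config_text.splitlines():
        parts = line.split()
        # Assumed layout: option name, whitespace, value.
        if len(parts) == 2 and parts[0] in options:
            options[parts[0]] = parts[1] == "true"
    return options


if __name__ == "__main__":
    sample = "logcgiargs true\nusecookies true\n"
    print(logging_options(sample))
    print(logging_options(""))
```

<Text>In this sketch, text that sets only <i>logcgiargs</i> leaves <i>usecookies</i> at its default of false, which corresponds to logging without per-user identification codes.</Text>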
1774<Text id="596">Each line in the user log records a page visited—even the pages generated to inspect the log files! It contains (a) the IP address of the user's computer, (b) a timestamp in square brackets, (c) the CGI arguments in parentheses, and (d) the name of the user's browser (Netscape is called “Mozilla”). Here is a sample line, split and annotated for ease of reading:</Text>
1775<CodeLine>/fast-cgi-bin/niupepalibrary</CodeLine>
1776<CodeLine>(a) its-www1.massey.ac.nz</CodeLine>
1777<CodeLine>(b) [Thu Dec 07 23:47:00 NZDT 2000]</CodeLine>
1778<CodeLine>(c) (a=p, b=0, bcp=, beu=, c=niupepa, cc=, ccp=0, ccs=0, cl=, cm=, cq2=, d=, e=, er=, f=0, fc=1, gc=0, gg=text, gt=0, h=, h2=, hl=1, hp=, il=l, j=, j2=, k=1, ky=, l=en, m=50, n=, n2=, o=20, p=home, pw=, q=, q2=, r=1, s=0, sp=frameset, t=1, ua=, uan=, ug=, uma=listusers, umc=, umnpw1=, umnpw2=, umpw=, umug=, umun=, umus=, un=, us=invalid, v=0, w=w, x=0, z=130.123.128.4-950647871)</CodeLine>
1779<CodeLine>(d) “Mozilla/4.08 [en] (Win95; I ;Nav)”</CodeLine>
1780<Text id="597">The last CGI argument, “z”, is an identification code or “cookie” generated by the user's browser: it comprises the user's IP number followed by the timestamp when they first accessed the digital library.</Text>
1781<Text id="598">The log file <i>usage.txt</i> is placed in the <i>etc</i> directory in the Greenstone file structure (see the <i>Greenstone Digital Library Developer's Guide</i>). When logging is enabled, every action by every user is logged. However, only the last 100 entries in the log file are displayed by the <i>usage log</i> link in Figure 17.</Text>
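<Text>The structure of a usage log entry described above can be explored with a short script. The Python sketch below splits an entry into its parts and separates the “z” argument into IP number and timestamp. It is an illustration under assumptions, not a Greenstone tool: the sample entry is abbreviated, the exact single-line layout (fields separated by spaces, straight quotation marks around the browser name) is assumed because the sample above is split and annotated, and the names <i>parse_usage_line</i> and <i>split_cookie</i> are hypothetical.</Text>

```python
import re

# Assumed layout of one entry: program, (a) host, (b) [timestamp],
# (c) (CGI arguments), (d) "browser". Numbered groups are used throughout.
LINE_RE = re.compile(r'^(\S+)\s+(\S+)\s+\[([^\]]*)\]\s+\((.*)\)\s+"(.*)"\s*$')

# Abbreviated version of the sample entry; most CGI arguments are omitted.
SAMPLE = (
    '/fast-cgi-bin/niupepalibrary its-www1.massey.ac.nz '
    '[Thu Dec 07 23:47:00 NZDT 2000] '
    '(a=p, c=niupepa, l=en, p=home, z=130.123.128.4-950647871) '
    '"Mozilla/4.08 [en] (Win95; I ;Nav)"'
)


def parse_usage_line(line):
    """Split one log entry into its fields, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    program, host, timestamp, raw_args, browser = match.groups()
    args = {}
    for pair in raw_args.split(","):
        key, _, value = pair.strip().partition("=")
        if key:
            args[key] = value
    return {
        "program": program,
        "host": host,
        "timestamp": timestamp.strip(),
        "args": args,
        "browser": browser,
    }


def split_cookie(z_value):
    """Separate the z argument into (IP number, first-access timestamp)."""
    ip_number, _, first_access = z_value.rpartition("-")
    return ip_number, first_access


if __name__ == "__main__":
    record = parse_usage_line(SAMPLE)
    print(record["host"], record["args"]["c"], split_cookie(record["args"]["z"]))
```

<Text>Grouping parsed entries by their “z” value is one way to follow a single user through the log, which is the purpose of the <i>usecookies</i> option.</Text>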
1782</Content>
1783</Section>
1784<Section id="user_management">
1785<Title>
1786<Text id="599">User management</Text>
1787</Title>
1788<Content>
1789<Text id="600">Greenstone incorporates an authentication scheme which can be used to control access to certain facilities. At the moment this is only used to restrict the people who are allowed to enter the Collector and certain administration functions. (If, for a particular collection, it were necessary to authenticate users before returning information to them, this is possible too: for example, documents could be protected on an individual basis so that they can only be accessed by registered users on presentation of a password. However, no current collections use this facility.) Authentication is done by requesting a user name and password, as illustrated in Figure 16a.</Text>
1790<Text id="601">From the administration page users can be listed, new ones added, and old ones deleted. The ability to do this is of course also protected: only users who have administrative privileges can add new users. It is also possible for each user to belong to different “groups”. At present, the only extant groups are “administrator” and “colbuilder”. Members of the first group can add and remove users, and change their groups. Members of the second can access the facilities described above to build new collections and alter (and delete) existing ones.</Text>
1791<Text id="602">When Greenstone is installed, there is one user called <i>admin</i> who belongs to both groups. The password for this user is set during the installation process. This user can create new names and passwords for users who belong just to the <i>colbuilder</i> group, which is the recommended way of giving other users the ability to build collections. User information is recorded in two databases that are placed in the Greenstone file structure (see the <i>Greenstone Digital Library Developer's Guide</i>).</Text>
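<Text>The relationship between users, groups and permitted operations can be summarised in a few lines of Python. This is a sketch of the scheme as described above, not Greenstone's implementation: only the group names “administrator” and “colbuilder” come from the text, while the capability labels and the function names <i>capabilities</i> and <i>can</i> are hypothetical.</Text>

```python
# Capability labels are invented for this example; only the two group
# names are taken from the description above.
GROUP_CAPABILITIES = {
    "administrator": {"add users", "remove users", "change user groups"},
    "colbuilder": {"build collections", "alter collections", "delete collections"},
}


def capabilities(groups):
    """Return the union of the capabilities granted by the given groups."""
    granted = set()
    for group in groups:
        granted |= GROUP_CAPABILITIES.get(group, set())
    return granted


def can(groups, capability):
    """Report whether membership of the given groups grants the capability."""
    return capability in capabilities(groups)


if __name__ == "__main__":
    admin_groups = ["administrator", "colbuilder"]  # the initial admin user
    builder_groups = ["colbuilder"]                 # a user who may only build
    print(can(admin_groups, "add users"))
    print(can(builder_groups, "add users"))
    print(can(builder_groups, "build collections"))
```

<Text>A user placed only in the <i>colbuilder</i> group can build and alter collections in this model but cannot manage other users, which matches the recommended way of delegating collection building.</Text>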
1792</Content>
1793</Section>
1794<Section id="technical_information">
1795<Title>
1796<Text id="603">Technical information</Text>
1797</Title>
1798<Content>
1799<Text id="604">The links under the <i>Technical information</i> heading show further information on the installation. The <i>general</i> link gives access to technical information, including the directories where things are stored. The <i>protocols</i> menu item gives, for each possible protocol type, information about each of the collections supported by that protocol.</Text>
1800<Text id="605">Finally, the user interface code (called the “receptionist”) uses <i>actions</i> to communicate the wishes of the user. These actions correspond to the CGI argument labeled <i>a</i>. For example, if <i>a=status</i> the receptionist invokes the <i>status</i> action (which displays the status page). A menu item gives access to lists of all actions supported by the system, and another leads to the arguments that these actions take.</Text>
1801</Content>
1802</Section>
1803</Content>
1804</Chapter>
1805<Chapter id="appendix_a_software_features">
1806<Title>
1807<Text id="606">Appendix A Software features</Text>
1808</Title>
1809<Content>
1810<Table class="hidden" id="table_appendixa">
1811<Title/>
1812<TableContent>
1813<tr>
1814<th width="132">
1815<Text id="607"><i>Accessible via web browser</i></Text>
1816</th>
1817<th width="397">
1818<Text id="608">Collections are accessed through a standard web browser (Netscape or Internet Explorer) and combine easy-to-use browsing with powerful search facilities.</Text>
1819</th>
1820</tr>
1821<tr>
1822<th width="132">
1823<Text id="609"><i>Full-text and fielded search</i></Text>
1824</th>
1825<th width="397">
1826<Text id="610">The user can search the full text of the documents, or choose between indexes built from different parts of the documents. For example, some collections have an index of full documents, an index of sections, an index of titles, and an index of authors, each of which can be searched for particular words or phrases. Results can be ranked by relevance or sorted by a metadata element.</Text>
1827</th>
1828</tr>
1829<tr>
1830<th width="132">
1831<Text id="611"><i>Flexible browsing facilities</i></Text>
1832</th>
1833<th width="397">
1834<Text id="612">The user can browse lists of authors, lists of titles, lists of dates, classification structures, and so on. Different collections may offer different browsing facilities and even within a collection, a broad variety of browsing interfaces are available. Browsing and searching interfaces are constructed during the building process, according to collection configuration information.</Text>
1835</th>
1836</tr>
1837<tr>
1838<th width="132">
1839<Text id="613"><i>Creates access structures automatically</i></Text>
1840</th>
1841<th width="397">
1842<Text id="614">The Greenstone software creates information collections that are very easy to maintain. All searching and browsing structures are built directly from the documents themselves. No links are inserted by hand, but existing links in originals are maintained. This means that if new documents in the same format become available, they can be merged into the collection automatically. Indeed, for some collections this is done by processes that wake up regularly, scout for new material, and rebuild the indexes—all without manual intervention.</Text>
1843</th>
1844</tr>
1845<tr>
1846<th width="132">
1847<Text id="615"><i>Makes use of available metadata</i></Text>
1848</th>
1849<th width="397">
1850<Text id="616">Metadata, which is descriptive information such as author, title, date, keywords, and so on, may be associated with each document, or with individual sections within documents. Metadata is used as the raw material for browsing indexes. It must be either provided explicitly or derivable automatically from the source documents. The Dublin Core metadata scheme is used for most electronic documents; however, provision is made for other schemes.</Text>
1851</th>
1852</tr>
1853<tr>
1854<th width="132">
1855<Text id="617"><i>Plugins extend the system's capabilities</i></Text>
1856</th>
1857<th width="397">
1858<Text id="618">In order to accommodate different kinds of source documents, the software is organized in such a way that “plugins” can be written for new document types. Plugins currently exist for plain text, HTML, Word, PDF, PostScript, e-mail, some proprietary formats, and for recursively traversing directory structures and compressed archives containing such documents. A collection may have source documents in different forms. In order to build browsing indexes from metadata, an analogous scheme of “classifiers” is used: classifiers create browsing indexes of various kinds based on metadata.</Text>
1859</th>
1860</tr>
1861<tr>
1862<th width="132">
1863<Text id="619"><i>Designed for multi-gigabyte collections</i></Text>
1864</th>
1865<th width="397">
1866<Text id="620">Collections can contain millions of documents, making the Greenstone system suitable for collections up to several gigabytes.</Text>
1867</th>
1868</tr>
1869<tr>
1870<th width="132">
1871<Text id="621"><i>Documents can be in any language</i></Text>
1872</th>
1873<th width="397">
1874<Text id="622">Unicode is used throughout the software, allowing any language to be processed in a consistent manner. To date, collections have been built containing French, Spanish, Maori, Chinese, Arabic and English. On-the-fly conversion is used to convert from Unicode to an alphabet supported by the user's web browser.</Text>
1875</th>
1876</tr>
1877<tr>
1878<th width="132">
1879<Text id="623"><i>User interface available in multiple languages</i></Text>
1880</th>
1881<th width="397">
1882<Text id="624">The interface can be presented in multiple languages. Currently, the interface is available in Arabic, Chinese, Dutch, English, French, German, Maori, Portuguese, and Spanish. New languages can be added easily.</Text>
1883</th>
1884</tr>
1885<tr>
1886<th width="132">
1887<Text id="625"><i>Collections can contain text, pictures, audio, and video</i></Text>
1888</th>
1889<th width="397">
1890<Text id="626">Greenstone collections can contain text, pictures, audio and video clips. Most non-textual material is either linked in to the textual documents or accompanied by textual descriptions (such as figure captions) to allow full-text searching and browsing. However, the architecture permits implementation of plugins and classifiers even for non-textual data.</Text>
1891</th>
1892</tr>
1893<tr>
1894<th width="132">
1895<Text id="627"><i>Uses advanced compression techniques</i></Text>
1896</th>
1897<th width="397">
1898<Text id="628">Compression techniques are used to reduce the size of the indexes and text. Reducing the size of the indexes via compression has the added advantage of increasing the speed of text retrieval.</Text>
1899</th>
1900</tr>
1901<tr>
1902<th width="132">
1903<Text id="629"><i>Administrative function provided</i></Text>
1904</th>
1905<th width="397">
1906<Text id="630">An “administrative” function enables specified users to authorize new users to build collections, protect documents so that they can only be accessed by registered users on presentation of a password, examine the composition of all collections, and so on. Logs of user activity can record all queries made to every Greenstone collection.</Text>
1907</th>
1908</tr>
1909<tr>
1910<th width="132">
1911<Text id="631"><i>New collections appear dynamically</i></Text>
1912</th>
1913<th width="397">
1914<Text id="632">Collections can be updated and new ones brought on-line at any time, without bringing the system down; the process responsible for the user interface will notice (through periodic polling) when new collections appear and add them to the list presented to the user.</Text>
1915</th>
1916</tr>
1917<tr>
1918<th width="132">
1919<Text id="633"><i>Collections can be published on the Internet or on CD-ROM</i></Text>
1920</th>
1921<th width="397">
1922<Text id="634">The software can be used to serve collections over the World-Wide Web. Greenstone collections can be made available, in precisely the same form, on CD-ROM. The user interface is through a standard web browser (Netscape is provided on each disk), and the interaction is identical to accessing the collection on the web—except that response times are more predictable. The CD-ROMs run under all versions of the Windows operating system.</Text>
1923</th>
1924</tr>
1925<tr>
1926<th width="132">
1927<Text id="635"><i>Collections can be distributed amongst different computers</i></Text>
1928</th>
1929<th width="397">
1930<Text id="636">A flexible process structure allows different collections to be served by different computers, yet be presented to the user in the same way, on the same web page, as part of the same digital library.</Text>
1931</th>
1932</tr>
1933<tr>
1934<th width="132">
1935<Text id="637"><i>Operates on both Windows and Unix</i></Text>
1936</th>
1937<th width="397">
1938<Text id="638">Greenstone runs under both Windows (3.1/3.11, 95/98/Me, NT/2000) and Unix (Linux and SunOS). Any of these systems can be used as a webserver. Collections cannot be built on low-end Windows systems (3.1/3.11), but pre-built collections can be transferred to them.</Text>
1939</th>
1940</tr>
1941<tr>
1942<th width="132">
1943<Text id="639"><i>What you get with Greenstone</i></Text>
1944</th>
1945<th width="397">
1946<Text id="640">The Greenstone Digital Library is open-source software, available from the New Zealand Digital Library (<i>nzdl.org</i>) under the terms of the Gnu General Public License. The software includes everything described above: web serving, CD-ROM creation, collection building, multi-lingual capability, plugins and classifiers for a variety of different source document types. It includes an autoinstall feature to allow easy installation on both Windows and Unix. In the spirit of open-source software, users are encouraged to contribute modifications and enhancements.</Text>
1947</th>
1948</tr>
1949</TableContent>
1950</Table>
1951</Content>
1952</Chapter>
1953<Chapter id="appendix_b_glossary_of_terms">
1954<Title>
1955<Text id="641">Appendix B Glossary of terms</Text>
1956</Title>
1957<Content>
1958<Table id="appendixb" class="hidden">
1959<Title/>
1960<TableContent>
1961<tr>
1962<th width="123">
1963<Text id="642"><b>Term</b></Text>
1964</th>
1965<th width="406">
1966<Text id="643"><b>Meaning</b></Text>
1967</th>
1968</tr>
1969<tr>
1970<th width="123">
1971<Text id="644"><i>autoconf</i></Text>
1972</th>
1973<th width="406">
1974<Text id="645">Unix program used to configure the Greenstone software installation package to suit your system</Text>
1975</th>
1976</tr>
1977<tr>
1978<th width="123">
1979<Text id="646"><i>Autorun</i></Text>
1980</th>
1981<th width="406">
1982<Text id="647">Windows feature that starts a program automatically whenever a CD-ROM is inserted</Text>
1983</th>
1984</tr>
1985<tr>
1986<th width="123">
1987<Text id="648">Boolean query</Text>
1988</th>
1989<th width="406">
1990<Text id="649">Query to an information retrieval system that may contain AND, OR, NOT</Text>
1991</th>
1992</tr>
1993<tr>
1994<th width="123">
1995<Text id="650">Browsing</Text>
1996</th>
1997<th width="406">
1998<Text id="651">Accessing a collection by scanning an organized list of metadata values associated with the documents (such as author, title, date, keywords)</Text>
1999</th>
2000</tr>
2001<tr>
2002<th width="123">
2003<Text id="652"><i>buildcol.pl</i></Text>
2004</th>
2005<th width="406">
2006<Text id="653">Greenstone program used to build collections</Text>
2007</th>
2008</tr>
2009<tr>
2010<th width="123">
2011<Text id="654">Building</Text>
2012</th>
2013<th width="406">
2014<Text id="655">Process of creating the indexing and browsing structures that are used to access a collection</Text>
2015</th>
2016</tr>
2017<tr>
2018<th width="123">
2019<Text id="656">C++</Text>
2020</th>
2021<th width="406">
2022<Text id="657">Programming language in which the majority of the Greenstone software is written</Text>
2023</th>
2024</tr>
2025<tr>
2026<th width="123">
2027<Text id="658">Casefolding</Text>
2028</th>
2029<th width="406">
2030<Text id="659">Making uppercase and lowercase words look the same, for searching purposes</Text>
2031</th>
2032</tr>
2033<tr>
2034<th width="123">
2035<Text id="660">CGI</Text>
2036</th>
2037<th width="406">
2038<Text id="661">Common Gateway Interface, a scheme that allows users to activate programs on the host computer by clicking on web pages</Text>
2039</th>
2040</tr>
2041<tr>
2042<th width="123">
2043<Text id="662">CGI script</Text>
2044</th>
2045<th width="406">
2046<Text id="663">Code associated with a button, menu, or link on a web page that specifies what the host computer is to do when it is clicked</Text>
2047</th>
2048</tr>
2049<tr>
2050<th width="123">
2051<Text id="664"><i>cgi-bin</i></Text>
2052</th>
2053<th width="406">
2054<Text id="665">Directory in which CGI scripts are stored</Text>
2055</th>
2056</tr>
2057<tr>
2058<th width="123">
2059<Text id="666">Classifier</Text>
2060</th>
2061<th width="406">
2062<Text id="667">Greenstone code module that examines document metadata to form an index for browsing</Text>
2063</th>
2064</tr>
2065<tr>
2066<th width="123">
2067<Text id="668">Collection</Text>
2068</th>
2069<th width="406">
2070<Text id="669">Set of documents that are brought together under a uniform searching and browsing interface</Text>
2071</th>
2072</tr>
2073<tr>
2074<th width="123">
2075<Text id="670">Collection configuration file</Text>
2076</th>
2077<th width="406">
2078<Text id="671">File that specifies how a collection is to be imported and built, what indexes and language interfaces are to be provided, etc.</Text>
2079</th>
2080</tr>
2081<tr>
2082<th width="123">
2083<Text id="672">Collection server</Text>
2084</th>
2085<th width="406">
2086<Text id="673">Program responsible for providing access to a collection when it is being used</Text>
2087</th>
2088</tr>
2089<tr>
2090<th width="123">
2091<Text id="674">Configuration file</Text>
2092</th>
2093<th width="406">
2094<Text id="675">See collection configuration file, main configuration file, site configuration file</Text>
2095</th>
2096</tr>
2097<tr>
2098<th width="123">
2099<Text id="676">CVS</Text>
2100</th>
2101<th width="406">
2102<Text id="677">Concurrent Versions System, the scheme used for maintaining the Greenstone source code</Text>
2103</th>
2104</tr>
2105<tr>
2106<th width="123">
2107<Text id="678"><i>db2txt</i></Text>
2108</th>
2109<th width="406">
2110<Text id="679">Greenstone tool for viewing a GDBM database as text (see GDBM)</Text>
2111</th>
2112</tr>
2113<tr>
2114<th width="123">
2115<Text id="680">Demo collection</Text>
2116</th>
2117<th width="406">
2118<Text id="681">A subset of the Humanities Development Library, distributed with the Greenstone software and used for illustration in this manual</Text>
2119</th>
2120</tr>
2121<tr>
2122<th width="123">
2123<Text id="682">Digital library</Text>
2124</th>
2125<th width="406">
2126<Text id="683">Collection of digital objects (text, audio, video), along with methods for access and retrieval, and for selection, organization, and maintenance</Text>
2127</th>
2128</tr>
2129<tr>
2130<th width="123">
2131<Text id="684">DL</Text>
2132</th>
2133<th width="406">
2134<Text id="685">Development Library, a Greenstone collection of humanitarian information for developing countries</Text>
2135</th>
2136</tr>
2137<tr>
2138<th width="123">
2139<Text id="686">Document</Text>
2140</th>
2141<th width="406">
2142<Text id="687">Basic unit from which digital library collections are constructed; it may include text, graphics, sound, video, etc.</Text>
2143</th>
2144</tr>
2145<tr>
2146<th width="123">
2147<Text id="688">Dublin Core</Text>
2148</th>
2149<th width="406">
2150<Text id="689">A standard way of describing metadata</Text>
2151</th>
2152</tr>
2153<tr>
2154<th width="123">
2155<Text id="690">Fast CGI</Text>
2156</th>
2157<th width="406">
2158<Text id="691">Facility that allows CGI scripts to remain continuously active so that they do not have to be restarted from scratch every time they are invoked</Text>
2159</th>
2160</tr>
2161<tr>
2162<th width="123">
2163<Text id="692">Filter program</Text>
2164</th>
2165<th width="406">
2166<Text id="693">That part of a Greenstone collection server that implements querying and browsing operations</Text>
2167</th>
2168</tr>
2169<tr>
2170<th width="123">
2171<Text id="694">Format string</Text>
2172</th>
2173<th width="406">
2174<Text id="695">A string that specifies how documents and other listings are to be displayed in Greenstone</Text>
2175</th>
2176</tr>
2177<tr>
2178<th width="123">
2179<Text id="696">GB-encoding</Text>
2180</th>
2181<th width="406">
2182<Text id="697">Standard way of encoding the Chinese language</Text>
2183</th>
2184</tr>
2185<tr>
2186<th width="123">
2187<Text id="698">GDBM</Text>
2188</th>
2189<th width="406">
2190<Text id="699">Gnu DataBase Manager, a program used within the Greenstone software to store metadata for each document</Text>
2191</th>
2192</tr>
2193<tr>
2194<th width="123">
2195<Text id="700">GIMP</Text>
2196</th>
2197<th width="406">
2198<Text id="701">Gnu Image Manipulation Program, used (on Unix) to create icons in Greenstone</Text>
2199</th>
2200</tr>
2201<tr>
2202<th width="123">
2203<Text id="702">GML</Text>
2204</th>
2205<th width="406">
2206<Text id="703">Greenstone Markup Language, an XML-compliant format used for storing documents internally</Text>
2207</th>
2208</tr>
2209<tr>
2210<th width="123">
2211<Text id="704">Gnu license</Text>
2212</th>
2213<th width="406">
2214<Text id="705">Software license that permits users to copy and distribute computer programs freely, and modify them—so long as all modifications are made publicly available</Text>
2215</th>
2216</tr>
2217<tr>
2218<th width="123">
2219<Text id="706">Greenstone</Text>
2220</th>
2221<th width="406">
2222<Text id="707">The name of this digital library software</Text>
2223</th>
2224</tr>
2225<tr>
2226<th width="123">
2227<Text id="708">GSDL</Text>
2228</th>
2229<th width="406">
2230<Text id="709">Abbreviation for Greenstone Digital Library</Text>
2231</th>
2232</tr>
2233<tr>
2234<th width="123">
2235<Text id="710"><i>%GSDLHOME%</i></Text>
2236</th>
2237<th width="406">
2238<Text id="711">Operating system variable that represents the top-level directory in which all Greenstone programs and collections are stored (<i>$GSDLHOME</i> on Unix systems)</Text>
2239</th>
2240</tr>
2241<tr>
2242<th width="123">
2243<Text id="712"><i>%GSDLOS%</i></Text>
2244</th>
2245<th width="406">
2246<Text id="713">Operating system variable that represents the operating system currently being used (<i>$GSDLOS</i> on Unix systems)</Text>
2247</th>
2248</tr>
2249<tr>
2250<th width="123">
2251<Text id="714"><i>hashfile</i></Text>
2252</th>
2253<th width="406">
2254<Text id="715">Greenstone program used at import or build time to generate the OID of each document</Text>
2255</th>
2256</tr>
2257<tr>
2258<th width="123">
2259<Text id="716">HTML</Text>
2260</th>
2261<th width="406">
2262<Text id="717">HyperText Markup Language, the language in which web documents are written</Text>
2263</th>
2264</tr>
2265<tr>
2266<th width="123">
2267<Text id="718"><i>import.pl</i></Text>
2268</th>
2269<th width="406">
2270<Text id="719">Greenstone program used to import documents</Text>
2271</th>
2272</tr>
2273<tr>
2274<th width="123">
2275<Text id="720">Importing</Text>
2276</th>
2277<th width="406">
2278<Text id="721">Process of bringing collections of documents into the Greenstone system</Text>
2279</th>
2280</tr>
2281<tr>
2282<th width="123">
2283<Text id="722">Index</Text>
2284</th>
2285<th width="406">
2286<Text id="723">Information structure that is used for searching or browsing a collection</Text>
2287</th>
2288</tr>
2289<tr>
2290<th width="123">
2291<Text id="724">InstallShield</Text>
2292</th>
2293<th width="406">
2294<Text id="725">Windows program, used by Greenstone CD-ROMs, that allows a system to be installed from a CD-ROM</Text>
2295</th>
2296</tr>
2297<tr>
2298<th width="123">
2299<Text id="726">Main configuration file</Text>
2300</th>
2301<th width="406">
2302<Text id="727">File that contains specifications common to all collections served by this site</Text>
2303</th>
2304</tr>
2305<tr>
2306<th width="123">
2307<Text id="728">Metadata</Text>
2308</th>
2309<th width="406">
2310<Text id="729">Descriptive data such as author, title, date, keywords, and so on, that is associated with a document (or document collection)</Text>
2311</th>
2312</tr>
2313<tr>
2314<th width="123">
2315<Text id="730">MG</Text>
2316</th>
2317<th width="406">
2318<Text id="731">Managing Gigabytes, a program used by the Greenstone system for full-text indexing, that incorporates compression techniques (see Witten, I.H., Moffat, A. and Bell, T. <i>Managing Gigabytes: compressing and indexing documents and images</i>, Morgan Kaufmann, second edition, 1999)</Text>
2319</th>
2320</tr>
2321<tr>
2322<th width="123">
2323<Text id="732"><i>mgbuild</i></Text>
2324</th>
2325<th width="406">
2326<Text id="733">MG program for building a compressed full-text index</Text>
2327</th>
2328</tr>
2329<tr>
2330<th width="123">
2331<Text id="734"><i>mgquery</i></Text>
2332</th>
2333<th width="406">
2334<Text id="735">MG program for querying a compressed full-text index</Text>
2335</th>
2336</tr>
2337<tr>
2338<th width="123">
2339<Text id="736"><i>mkcol.pl</i></Text>
2340</th>
2341<th width="406">
2342<Text id="737">Greenstone program that creates and initializes the directory structure for a new collection</Text>
2343</th>
2344</tr>
2345<tr>
2346<th width="123">
2347<Text id="738">New Zealand <br/>Digital Library Project</Text>
2348</th>
2349<th width="406">
2350<Text id="739">Research project in the Computer Science Department at the University of Waikato, New Zealand, that created the Greenstone software (<i>nzdl.org</i>)</Text>
2351</th>
2352</tr>
2353<tr>
2354<th width="123">
2355<Text id="740">OID</Text>
2356</th>
2357<th width="406">
2358<Text id="741">Object Identifier, a unique identification code associated with a document</Text>
2359</th>
2360</tr>
2361<tr>
2362<th width="123">
2363<Text id="742">Perl</Text>
2364</th>
2365<th width="406">
2366<Text id="743">Programming language used for many of the text-processing operations that occur during the building process</Text>
2367</th>
2368</tr>
2369<tr>
2370<th width="123">
2371<Text id="744">Ping</Text>
2372</th>
2373<th width="406">
2374<Text id="745">Message sent to a system to determine whether it is running or not</Text>
2375</th>
2376</tr>
2377<tr>
2378<th width="123">
2379<Text id="746">Plugin</Text>
2380</th>
2381<th width="406">
2382<Text id="747">Code module for handling documents of different formats, used during the importing and building processes</Text>
2383</th>
2384</tr>
2385<tr>
2386<th width="123">
2387<Text id="748">Protocol</Text>
2388</th>
2389<th width="406">
2390<Text id="749">Set of conventions by which a Greenstone receptionist communicates with a collection server</Text>
2391</th>
2392</tr>
2393<tr>
2394<th width="123">
2395<Text id="750">Ranked query</Text>
2396</th>
2397<th width="406">
2398<Text id="751">Natural-language query to an information retrieval system, for which the documents that match the query are sorted in order of relevance</Text>
2399</th>
2400</tr>
2401<tr>
2402<th width="123">
2403<Text id="752">Receptionist</Text>
2404</th>
2405<th width="406">
2406<Text id="753">Program that organizes the Greenstone user interface</Text>
2407</th>
2408</tr>
2409<tr>
2410<th width="123">
2411<Text id="754">RTF</Text>
2412</th>
2413<th width="406">
2414<Text id="755">Rich Text Format, a standard format for interchange of text documents</Text>
2415</th>
2416</tr>
2417<tr>
2418<th width="123">
2419<Text id="756">Searching</Text>
2420</th>
2421<th width="406">
2422<Text id="757">Accessing a collection through a full-text search of its contents (or parts of contents, such as section titles)</Text>
2423</th>
2424</tr>
2425<tr>
2426<th width="123">
2427<Text id="758">Server</Text>
2428</th>
2429<th width="406">
2430<Text id="759">See Collection server and Web server</Text>
2431</th>
2432</tr>
2433<tr>
2434<th width="123">
2435<Text id="760"><i>setup.bat, setup.sh, setup.csh</i></Text>
2436</th>
2437<th width="406">
2438<Text id="761">Scripts used to set up your environment to recognize the Greenstone software</Text>
2439</th>
2440</tr>
2441<tr>
2442<th width="123">
2443<Text id="762">Site configuration file</Text>
2444</th>
2445<th width="406">
2446<Text id="763">File that contains specifications used to configure the Greenstone software for the site on which it is installed</Text>
2447</th>
2448</tr>
2449<tr>
2450<th width="123">
2451<Text id="764">Stemming</Text>
2452</th>
2453<th width="406">
2454<Text id="765">Stripping endings off a query term to make it more general</Text>
2455</th>
2456</tr>
2457<tr>
2458<th width="123">
2459<Text id="766">STL</Text>
2460</th>
2461<th width="406">
2462<Text id="767">Standard Template Library, a widely available library of C++ code developed by Silicon Graphics</Text>
2463</th>
2464</tr>
2465<tr>
2466<th width="123">
2467<Text id="768"><i>txt2db</i></Text>
2468</th>
2469<th width="406">
2470<Text id="769">Greenstone program used at build time to create the GDBM database</Text>
2471</th>
2472</tr>
2473<tr>
2474<th width="123">
2475<Text id="770">Unicode</Text>
2476</th>
2477<th width="406">
2478<Text id="771">Standard scheme for representing the character sets used in the world's languages</Text>
2479</th>
2480</tr>
2481<tr>
2482<th width="123">
2483<Text id="772">UNU</Text>
2484</th>
2485<th width="406">
2486<Text id="773">The United Nations University; also used to refer to a Greenstone collection created for that organization</Text>
2487</th>
2488</tr>
2489<tr>
2490<th width="123">
2491<Text id="774">Web server</Text>
2492</th>
2493<th width="406">
2494<Text id="775">Standard program that computers use to make information accessible over the World Wide Web</Text>
2495</th>
2496</tr>
2497<tr>
2498<th width="123">
2499<Text id="776">XML</Text>
2500</th>
2501<th width="406">
2502<Text id="777">A standard format for structured documents and data on the web (the Greenstone Markup Language is an XML-compliant format)</Text>
2503</th>
2504</tr>
2505</TableContent>
2506</Table>
2507</Content>
2508</Chapter>
2509<FootnoteList>
2510<Footnote id="1">
2511<Text id="778">This option is disabled if an element of the same name already exists.</Text>
2512</Footnote>
2513</FootnoteList>
2514</Manual>