How to Make a Collection - A Quick Introduction

Cristian Francu
[email protected]
Jan 12, 2000

First, go to the directory where you installed GSDL. To make sure
that you can run the Perl scripts, source either setup.bash or
setup.csh, depending on the shell you're using:

    source setup.bash

or

    source setup.csh

These scripts set the variables GSDLHOME, GSDLOS and PATH. You can
also source them from .profile or .cshrc to have the variables set
automatically.

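As a minimal sketch of the automatic setup, a function like the one
below could go in .profile. The function name gsdl_setup and the path
in the usage line are hypothetical. It assumes, as the first step
suggests, that setup.bash has to be sourced from inside the GSDL
directory, and that your login shell can read setup.bash.

```shell
# Hypothetical helper: source GSDL's setup.bash from a given install directory.
# Assumption: setup.bash must be sourced from inside that directory, so we
# change into it first and return to the previous directory afterwards.
gsdl_setup() {
    gsdl_dir="$1"
    if [ ! -f "$gsdl_dir/setup.bash" ]; then
        echo "setup.bash not found in $gsdl_dir" >&2
        return 1
    fi
    prev_dir="$PWD"
    cd "$gsdl_dir" || return 1
    . ./setup.bash
    cd "$prev_dir" || return 1
}
```

In .profile you would then call it with your own install location,
for example: gsdl_setup "$HOME/gsdl" (a placeholder path).
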
Next, run mkcol.pl to create the collection. This Perl script sets up
the environment the collection needs, such as its directories and the
file collect.cfg. The script mkcol.pl is located in the directory

    bin/script

This directory contains all the scripts that you'll need, so it's a
good idea to take a look at it.

If you run mkcol.pl without arguments, it will tell you how to use it:

    $ mkcol.pl

    usage: mkcol.pl [options] collection-name

      options:
       -creator email       Your email address
       -maintainer email    The current maintainer's email address
       -public true|false   If this collection has anonymous access
       -beta true|false     If this collection is still under development

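For illustration, a full invocation using the options listed above
might look like the sketch below. The collection name and email
address are placeholders. The check for mkcol.pl on PATH is an added
convenience, not part of GSDL.

```shell
# Placeholder values for illustration; substitute your own.
collection="tutorial"
creator="you@example.org"

# The setup scripts put the GSDL scripts on PATH, so a missing mkcol.pl
# usually means setup.bash or setup.csh has not been sourced yet.
if command -v mkcol.pl >/dev/null 2>&1; then
    mkcol.pl -creator "$creator" -maintainer "$creator" \
        -public true -beta true "$collection"
else
    echo "mkcol.pl not found on PATH; source setup.bash or setup.csh first" >&2
fi
```
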
After running mkcol.pl the collection will reside in
collect/<collection-name>. The next thing you should do is edit the
file

    collect/<collection-name>/etc/collect.cfg

You should do at least two things. The first is to add a line like this:

    collectionmeta iconcollection "http://sequence.rutgers.edu/~gsdl/collect/cstr/images/cstr.jpg"

This line sets the icon of the collection (the image that users will
click to access the collection once it's online). Make sure you type
the full URL of the image between the quotes. Do this now, because
changing the icon later means rebuilding the collection, which is a
time-consuming operation. Hey, gurus, is there a simpler way to change
the icon once the collection is already built?

The second thing to do in collect.cfg is to list the proper plugins on
the plugin lines:

    plugin GMLPlug
    plugin TEXTPlug
    plugin ArcPlug
    plugin RecPlug

The plugins you need depend on the format of your documents. If the
documents are plain text, or in GSDL's own format, GML, you don't
need to change anything. If your documents are in other formats, look
for a suitable plugin in the directory

    perllib/plugins

A very useful plugin is HTMLPlug, which can process files with .html and
.htm file extensions. You would normally replace the TEXTPlug plugin with
the one you want to use. Say your collection is in HTML format; then you
would change the plugin lines to:

    plugin GMLPlug
    plugin HTMLPlug
    plugin ArcPlug
    plugin RecPlug

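If you prefer not to edit the file by hand, the swap from TEXTPlug to
HTMLPlug can be sketched with sed. The function name use_html_plugin
is hypothetical, and the sketch assumes each plugin sits on its own
line starting with the word "plugin", as in the listing above.

```shell
# Hypothetical helper: replace the TEXTPlug line in a collect.cfg with HTMLPlug.
# Writes to a temporary file first so a failed sed leaves the original intact.
use_html_plugin() {
    cfg="$1"
    tmp_cfg="$cfg.tmp.$$"
    sed 's/^\([[:space:]]*plugin[[:space:]][[:space:]]*\)TEXTPlug[[:space:]]*$/\1HTMLPlug/' \
        "$cfg" > "$tmp_cfg" && mv "$tmp_cfg" "$cfg"
}
```

For the example collection used later on, the call would be:
use_html_plugin collect/tutorial/etc/collect.cfg
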
You're finally done with collect.cfg. Suppose you are creating a
collection named "tutorial". The next thing to do is go to the
directory collect/tutorial and create two directories, import and
archives:

    cd collect/tutorial
    mkdir import
    mkdir archives

The material to be indexed should reside in the 'import' directory. You
can either copy it there or create links to its directory. The
material can contain directories and subdirectories: the building
script descends into them recursively and searches for files to
index. This is what the plugin RecPlug does.

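The linking option can be sketched as follows. The function name
link_into_import and the paths in the usage line are hypothetical;
the sketch creates the import and archives directories if they are
missing and then adds a symbolic link to your document directory.

```shell
# Hypothetical helper: link a document directory into a collection's import
# directory instead of copying it.
link_into_import() {
    collection_dir="$1"   # e.g. collect/tutorial
    docs_dir="$2"         # directory holding the documents to index
    if [ ! -d "$docs_dir" ]; then
        echo "no such directory: $docs_dir" >&2
        return 1
    fi
    mkdir -p "$collection_dir/import" "$collection_dir/archives" || return 1
    # Use an absolute target so the link resolves from inside import/.
    abs_docs=$(cd "$docs_dir" && pwd) || return 1
    ln -s "$abs_docs" "$collection_dir/import/$(basename "$abs_docs")"
}
```

Example call, with a placeholder document directory:
link_into_import collect/tutorial "$HOME/my-documents"
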
Next, make sure the documents to be indexed are in the import
directory. You are now ready to run the processing scripts. The
fastest way to build a collection is in two steps:

1. process the documents in the 'import' directory and generate their
   equivalent in .gml format in the 'archives' directory

2. process the documents in the 'archives' directory (now in .gml format)
   and create the necessary indexes in the 'building' directory

For the first step just run the script import.pl:

    import.pl tutorial

Depending on the size of your documents this might take anywhere from
minutes to hours. You might also want to redirect stdout and stderr
to files to capture any errors. You can also change the verbosity of
the script; run it without arguments to get a complete list of
options.

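One way to capture the output is a small wrapper like the sketch
below. The function name run_logged and the log file names are
hypothetical; the redirections themselves are ordinary Bourne shell
syntax.

```shell
# Hypothetical helper: run a command, sending stdout to NAME.out and
# stderr to NAME.err, and report the command's exit status.
run_logged() {
    log_name="$1"
    shift
    "$@" > "$log_name.out" 2> "$log_name.err"
    status=$?
    echo "$1 exited with status $status (see $log_name.out, $log_name.err)"
    return "$status"
}
```

For the import step this would be: run_logged import import.pl tutorial
which leaves the messages in import.out and import.err.
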
For the second step run the script buildcol.pl:

    buildcol.pl tutorial

Again, depending on the size of the material to be processed, this may
take minutes to hours. Keep in mind that you must have enough space on
your hard drive for both steps, as the .gml documents take up about the
same amount of space as the original documents.

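A rough way to check the space beforehand is sketched below. The
function name has_room_for is hypothetical, and the factor of two
(one copy for the .gml files, one for the indexes) is only an
assumption for illustration, not a figure from GSDL.

```shell
# Hypothetical helper: succeed if the filesystem holding DIR has at least
# twice DIR's current size free. The factor of two is an assumption.
has_room_for() {
    dir="$1"
    need_kb=$(du -sk "$dir" | awk '{print $1}')
    free_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    if [ -z "$need_kb" ] || [ -z "$free_kb" ]; then
        echo "could not measure $dir" >&2
        return 1
    fi
    echo "documents: ${need_kb} KB, free: ${free_kb} KB"
    [ "$free_kb" -ge $((need_kb * 2)) ]
}
```

Run it against the import directory before starting, for example:
has_room_for collect/tutorial/import
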
If everything went fine, you should now have a directory named
'building' under collect/tutorial. That directory contains the results
of processing your documents. To use them you have to move the
contents of the 'building' directory to a new directory named
'index'. First create it:

    cd collect/tutorial
    mkdir index

Then move the contents:

    mv building/* index

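The two commands above can be combined into a slightly more careful
sketch that refuses to run when the build produced nothing. The
function name install_index is hypothetical.

```shell
# Hypothetical helper: move a freshly built collection from building/ to index/.
# Refuses to run if building/ is missing or empty, so a failed build does not
# leave an empty index/ behind.
install_index() {
    collection_dir="$1"
    building="$collection_dir/building"
    if [ ! -d "$building" ] || [ -z "$(ls -A "$building")" ]; then
        echo "nothing to install: $building is missing or empty" >&2
        return 1
    fi
    mkdir -p "$collection_dir/index" || return 1
    mv "$building"/* "$collection_dir/index"/
}
```

Example call: install_index collect/tutorial
If index already holds a previous build, mv may refuse to overwrite
existing directories; this sketch does not handle that case.
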
As long as your collect.cfg file contains the line

    public true

and the collection built successfully, the GSDL software should
automatically notice your new collection. The collection should now appear
on the main page, which can be accessed at:

    http://hostname.domain.edu/cgi-bin/library?a=p&p=home

(Replace hostname.domain.edu with the name of your server.)

Keep in mind that these instructions are just a jump start to get you
up and running quickly. There are more options you can use, and you can
explore more of GSDL by reading the documentation carefully. You can
also email the creators for further details.