package projects


#######################################################################
# java images/scripts
#######################################################################

# the _javalinks_ macros are the flashy image links at the top right of
# the page.

_javalinks_ {_imagehome_}
_javalinks_ [v=1] {
_imagehome_<br>
}


#######################################################################
# icons
#######################################################################

## "projects and demonstrations" ## green_title ## demo ##
_httpicondemo_ {_httpimg_/demo.gif}
_widthdemo_ {450}
_heightdemo_ {57}

_icondemo_ {<img src="_httpicondemo_" width=_widthdemo_ height=_heightdemo_>}

#######################################################################
# page content
#######################################################################

_pagetitle_ {New Zealand Digital Library projects}

_imagethispage_ {_icondemo_}

_content_ {
_iconblankbar_
<p>
New Zealand Digital Library Project members have developed a range
of practical software packages in the course of their research.
Much of this software is available for
<a href="_httppagex_(download)">download</a>.

<p><h4>Digital libraries and indexing</h4>

<ul>
<li><a href="_httppagex_(gsdl)">Greenstone</a>
is the digital library system that generates every page of
this website.
It is freely available under the GNU General Public License,
and has been adopted by numerous other projects.
Humanitarian organisations, including Global Help Projects and
United Nations organisations, use it to disseminate information.
<ul>
<li> <a href="_gwcgi_">Our website</a> hosts exotic collections, humanitarian collections, and reference collections.
<li> <a href="_gwcgi_">Other websites</a> mirror these collections, and host many others.
<li> Greenstone is available for <a href="_httppagex_(download)">download</a>.
</ul>
<p>

<li><a href="_httppagex_(mg)">MG</a>
is an enhancement of the
<a href="http://www.cs.mu.oz.au/mg">Managing Gigabytes</a>
full-text retrieval system that provides flexible stemming methods,
weighting terms, term frequencies, merged indexes,
machine-independent indexes, and a port to MS-DOS.
<ul>
<li> MG is available for <a href="_httppagex_(download)">download</a>.
</ul><p>

<li><a href="_httppagex_(prescript)">PreScript</a>
converts PostScript to plain ASCII or HTML.
It detects paragraph boundaries, removes hyphenation,
and interprets many ligatures.
<ul>
<li> PreScript is available for <a href="_httppagex_(download)">download</a>.
</ul><p>
</ul>

<p><h4>Extracting data and metadata</h4>
<ul>
<li>
<a href="http://www.cs.waikato.ac.nz/sequitur">Sequitur</a>
is a method for inferring compositional hierarchies from strings: it detects
repetition and factors it out of the string by forming rules in a
grammar.
Sequitur is useful for recognizing lexical structure in strings,
and excels at very long sequences.
<ul>
<li>The <a href="http://www.cs.waikato.ac.nz/sequitur">Sequitur WWW interface</a> detects structure in text sequences.
<li> Sequitur is available for <a href="_httppagex_(download)">download</a>.
</ul><p>

<li><a href="http://www.nzdl.org/Kea">Kea</a>
is a program for automatically extracting keywords and keyphrases
from the full text of documents.
Candidate keyphrases are identified using rudimentary lexical processing,
features are computed for each candidate, and machine learning is used to
determine which candidates should be assigned as keyphrases.
<ul>
<li>The <a href="http://nzdl2.cs.waikato.ac.nz/cgi-bin/WebKea">Kea WWW interface</a> will extract keyphrases from any web page you specify.
<li> Kea is available for <a href="_httppagex_(download)">download</a>.
</ul><p>
</ul>

<p><h4>Text mining</h4>
<ul>
See our <a href="http://www.cs.waikato.ac.nz/~nzdl/textmining/">text mining web page</a>.
</ul>

<p><h4>Browsing interfaces</h4>
<ul>
<li>
<a href="http://www.nzdl.org/phind/">Phind</a>
is an interface for browsing the phrases that occur in a collection.
The phrases form an approximation of the topics covered.
They are extracted from the noun phrases occurring in the text,
so nonsense phrases and phrases with very little information content
are excluded.
Each phrase is part of a hierarchy,
and the user can browse more specialised topics,
or retrieve documents that contain the phrase, at any point.
<ul>
<li> Phind has been applied to the web pages of the UN
<a href="http://www.nzdl.org/phind/fao.html">Food and Agriculture Organisation</a>.
</ul><p>

<li><a href="http://www.cs.waikato.ac.nz/~stevej/Research/Phrasier/">Phrasier</a>
is a tool to support information-seeking activities in a digital library.
Its novel design reflects the fact that reading, writing, browsing and
searching activities are rarely carried out independently of each other.
They overlap and interleave in ways that have not been effectively supported
by conventional information retrieval interfaces.
Consequently, Phrasier blurs the distinction between
writing a document and finding material related to it;
between reading a document and finding others on the same or similar topics;
and between keyword searching and subject browsing.
<ul>
<li> A demonstration version of Phrasier is available for <a href="_httppagex_(download)">download</a>.
</ul><p>


<li><a href="http://nzdl2.cs.waikato.ac.nz/cgi-bin/Kniles">Kniles</a>
is a web-based system for inserting topic-based hypertext links
into existing, large-scale digital library collections.
The links are generated at runtime using keyphrases (provided by the author
or extracted by Kea), and let you browse collections of documents that
do not already have embedded hypertext links.
<ul>
<li> Kniles has been used to insert links in the text of 45,000
<a href="http://nzdl2.cs.waikato.ac.nz/cgi-bin/Kniles?c=cstr">Computer Science Technical Reports</a>
that were originally in PostScript format.
</ul><p>

</ul><p>

<p><h4>Word segmentation</h4>
<ul>

<li>
<a href="http://www.nzdl.org/cgi-bin/congb">Word segmentation</a>
is designed to find word boundaries in languages like Chinese and
Japanese, which are (unlike English) written without spaces
or other word delimiters (except for punctuation marks).
It plays a significant role in applications that use the word as the
basic unit, because machine-readable Chinese text
is invariably stored in unsegmented form.
<ul>
<li> We have implemented a
<a href="http://www.nzdl.org/cgi-bin/congb">WWW interface</a>
for segmenting Chinese text.
<li> If your web browser does not support Chinese text,
illustrations of the transformation are available.
</ul><p>

</ul><p>

_nzdlpagefooter_
<br>April 2000
}