| 1 | package projects
|
---|
| 2 |
|
---|
| 3 |
|
---|
| 4 | #######################################################################
|
---|
| 5 | # java images/scripts
|
---|
| 6 | #######################################################################
|
---|
| 7 |
|
---|
| 8 | # the _javalinks_ macros are the flashy image links at the top right of
|
---|
| 9 | # the page.
|
---|
| 10 |
|
---|
| 11 | _javalinks_ {_imagehome_}
|
---|
| 12 | _javalinks_ [v=1] {
|
---|
| 13 | _imagehome_<br>
|
---|
| 14 | }
|
---|
| 15 |
|
---|
| 16 |
|
---|
| 17 | #######################################################################
|
---|
| 18 | # icons
|
---|
| 19 | #######################################################################
|
---|
| 20 |
|
---|
| 21 | ## "projects and demonstrations" ## green_title ## demo ##
|
---|
| 22 | _httpicondemo_ {_httpimg_/demo.gif}
|
---|
| 23 | _widthdemo_ {450}
|
---|
| 24 | _heightdemo_ {57}
|
---|
| 25 |
|
---|
| 26 | _icondemo_ {<img src="_httpicondemo_" width=_widthdemo_ height=_heightdemo_>}
|
---|
| 27 |
|
---|
| 28 | #######################################################################
|
---|
| 29 | # page content
|
---|
| 30 | #######################################################################
|
---|
| 31 |
|
---|
| 32 | _pagetitle_ {New Zealand Digital Library projects}
|
---|
| 33 |
|
---|
| 34 | _imagethispage_ {_icondemo_}
|
---|
| 35 |
|
---|
| 36 | _content_ {
|
---|
| 37 | _iconblankbar_
|
---|
| 38 | <p>
|
---|
| 39 | New Zealand Digital Library Project members have developed a range
|
---|
| 40 | of practical software packages in the course of their research.
|
---|
| 41 | Much of this software is available for
|
---|
| 42 | <a href="_httppagex_(download)">download</a>.
|
---|
| 43 |
|
---|
| 44 | <p><h4>Digital libraries and indexing</h4>
|
---|
| 45 |
|
---|
| 46 | <ul>
|
---|
| 47 | <li><a href="_httppagex_(gsdl)">Greenstone</a>
|
---|
| 48 | is the digital library system that generates each and every page of
|
---|
| 49 | this website.
|
---|
| 50 | It is freely available under the GNU General Public License,
|
---|
| 51 | and has been adopted by numerous other projects.
|
---|
| 52 | It is used to disseminate information by humanitarian
|
---|
| 53 | organisations including Global Help Projects and
|
---|
| 54 | United Nations organisations.
|
---|
| 55 | <ul>
|
---|
| 56 | <li> <a href="_gwcgi_">Our website</a> hosts exotic collections, humanitarian collections, and reference collections.
|
---|
| 57 | <li> <a href="_gwcgi_">Other websites</a> mirror these collections, and host many others.
|
---|
| 58 | <li> Greenstone is available for <a href="_httppagex_(download)">download</a>.
|
---|
| 59 | </ul>
|
---|
| 60 | <p>
|
---|
| 61 |
|
---|
| 62 | <li><a href="_httppagex_(mg)">MG</a>
|
---|
| 63 | is an enhancement of the
|
---|
| 64 | <a href="http://www.cs.mu.oz.au/mg">Managing Gigabytes</a>
|
---|
| 65 | full-text retrieval system that provides flexible stemming methods,
|
---|
| 66 | weighting terms, term frequencies, merged indexes,
|
---|
| 67 | machine-independent indexes, and a port to MS-DOS.
|
---|
| 68 | <ul>
|
---|
| 69 | <li> MG is available for <a href="_httppagex_(download)">download</a>.
|
---|
| 70 | </ul><p>
|
---|
| 71 |
|
---|
| 72 | <li><a href="_httppagex_(prescript)">PreScript</a>
|
---|
| 73 | converts PostScript to plain ASCII or HTML.
|
---|
| 74 | It detects paragraph boundaries, removes hyphenation,
|
---|
| 75 | and interprets many ligatures.
|
---|
| 76 | <ul>
|
---|
| 77 | <li> PreScript is available for <a href="_httppagex_(download)">download</a>.
|
---|
| 78 | </ul><p>
|
---|
| 79 | </ul>
|
---|
| 80 |
|
---|
| 81 | <p><h4>Extracting data and metadata</h4>
|
---|
| 82 | <ul>
|
---|
| 83 | <li>
|
---|
| 84 | <a href="http://www.cs.waikato.ac.nz/sequitur">Sequitur</a>
|
---|
| 85 | is a method for inferring compositional hierarchies from strings by detecting
|
---|
| 86 | repetition and factoring it out of the string by forming rules in a
|
---|
| 87 | grammar.
|
---|
| 88 | Sequitur is useful for recognizing lexical structure in strings,
|
---|
| 89 | and excels at very long sequences.
|
---|
| 90 | <ul>
|
---|
| 91 | <li>The <a href="http://www.cs.waikato.ac.nz/sequitur">Sequitur WWW interface</a> detects structure in text sequences.
|
---|
| 92 | <li> Sequitur is available for <a href="_httppagex_(download)">download</a>.
|
---|
| 93 | </ul><p>
|
---|
| 94 |
|
---|
| 95 | <li><a href="http://www.nzdl.org/Kea">Kea</a>
|
---|
| 96 | is a program for automatically extracting keywords and keyphrases
|
---|
| 97 | from the full text of documents.
|
---|
| 98 | Candidate keyphrases are identified using rudimentary lexical processing,
|
---|
| 99 | features are computed for each candidate, and machine learning is used to
|
---|
| 100 | determine which candidates should be assigned as keyphrases.
|
---|
| 101 | <ul>
|
---|
| 102 | <li>The <a href="http://nzdl2.cs.waikato.ac.nz/cgi-bin/WebKea">Kea WWW interface</a> will extract keyphrases from any web page you specify.
|
---|
| 103 | <li> Kea is available for <a href="_httppagex_(download)">download</a>.
|
---|
| 104 | </ul><p>
|
---|
| 105 | </ul>
|
---|
| 106 |
|
---|
| 107 | <p><h4>Text Mining</h4>
|
---|
| 108 | <ul>
|
---|
| 109 | <li>See our <a href="http://www.cs.waikato.ac.nz/~nzdl/textmining/">Text Mining Webpage</a>.
|
---|
| 110 | </ul>
|
---|
| 111 |
|
---|
| 112 | <p><h4>Browsing interfaces</h4>
|
---|
| 113 | <ul>
|
---|
| 114 | <li>
|
---|
| 115 | <a href="http://www.nzdl.org/phind/">Phind</a>
|
---|
| 116 | is an interface for browsing the phrases that occur in a collection.
|
---|
| 117 | The phrases form an approximation of the topics covered.
|
---|
| 118 | They are extracted from the noun-phrases occurring in the text,
|
---|
| 119 | so nonsense phrases and phrases with very little information content
|
---|
| 120 | are excluded.
|
---|
| 121 | Each phrase is part of a hierarchy,
|
---|
| 122 | and the user can browse more specialised topics,
|
---|
| 123 | or retrieve documents that contain the phrase, at any point.
|
---|
| 124 | <ul>
|
---|
| 125 | <li> Phind has been applied to the web pages of the UN
|
---|
| 126 | <a href="http://www.nzdl.org/phind/fao.html">Food and Agriculture Organisation</a>.
|
---|
| 127 | </ul><p>
|
---|
| 128 |
|
---|
| 129 | <li><a href="http://www.cs.waikato.ac.nz/~stevej/Research/Phrasier/">Phrasier</a>
|
---|
| 130 | is a tool to support information-seeking activities in a digital library.
|
---|
| 131 | Its novel design reflects the fact that reading, writing, browsing and
|
---|
| 132 | searching activities are rarely carried out independently of each other.
|
---|
| 133 | They overlap and interleave in ways that have not been effectively supported
|
---|
| 134 | by conventional information retrieval interfaces.
|
---|
| 135 | Consequently, Phrasier blurs the distinction between
|
---|
| 136 | writing a document and finding material related to it;
|
---|
| 137 | between reading a document and finding others on the same or similar topics;
|
---|
| 138 | between keyword searching and subject browsing.
|
---|
| 139 | <ul>
|
---|
| 140 | <li> A demonstration version of Phrasier is available for <a href="_httppagex_(download)">download</a>.
|
---|
| 141 | </ul><p>
|
---|
| 142 |
|
---|
| 143 |
|
---|
| 144 | <li><a href="http://nzdl2.cs.waikato.ac.nz/cgi-bin/Kniles">Kniles</a>
|
---|
| 145 | is a web-based system for inserting topic-based hypertext links
|
---|
| 146 | into existing, large-scale digital library collections.
|
---|
| 147 | The links are generated at runtime using keyphrases (provided by the author
|
---|
| 148 | or extracted by Kea), and let you browse collections of documents that
|
---|
| 149 | do not already have embedded hypertext links.
|
---|
| 150 | <ul>
|
---|
| 151 | <li> Kniles has been used to insert links in the text of 45,000
|
---|
| 152 | <a href="http://nzdl2.cs.waikato.ac.nz/cgi-bin/Kniles?c=cstr">Computer Science Technical Reports</a>
|
---|
| 153 | that were originally in PostScript format.
|
---|
| 154 | </ul><p>
|
---|
| 155 |
|
---|
| 156 | </ul><p>
|
---|
| 157 |
|
---|
| 158 | <p><h4>Word segmentation</h4>
|
---|
| 159 | <ul>
|
---|
| 160 |
|
---|
| 161 | <li>
|
---|
| 162 | <a href="http://www.nzdl.org/cgi-bin/congb">Word segmentation</a>
|
---|
| 163 | is designed to find word boundaries in languages like Chinese and
|
---|
| 164 | Japanese, which are (unlike English) written without spaces
|
---|
| 165 | or other word delimiters (except for punctuation marks).
|
---|
| 166 | It plays a significant role in applications that use the word as the
|
---|
| 167 | basic unit, because machine-readable Chinese text
|
---|
| 168 | is invariably stored in unsegmented form.
|
---|
| 169 | <ul>
|
---|
| 170 | <li> We have implemented a
|
---|
| 171 | <a href="http://www.nzdl.org/cgi-bin/congb">WWW interface</a>
|
---|
| 172 | for segmenting Chinese text.
|
---|
| 173 | <li> If your web browser does not support Chinese text,
|
---|
| 174 | illustrations of the transformation are available.
|
---|
| 175 | </ul><p>
|
---|
| 176 |
|
---|
| 177 | </ul><p>
|
---|
| 178 |
|
---|
| 179 | _nzdlpagefooter_
|
---|
| 180 | <br>April 2000
|
---|
| 181 | }
|
---|
| 182 |
|
---|