|
|
@1442
|
24 years |
dmm9 |
date->Coverage
|
|
|
@1436
|
24 years |
davidb |
Due to rearrangement of ConvertTo hierarchy, this file is now redundant.
|
|
|
@1435
|
24 years |
davidb |
Rearrangement of ConvertTo inheritance so HTMLPlug and TextPlug do not …
|
|
|
@1431
|
24 years |
sjboddie |
Made a few minor adjustments to perl building code for use with …
|
|
|
@1424
|
24 years |
sjboddie |
Added a -out option to most of the perl building scripts to allow …
|
|
|
@1420
|
24 years |
davidb |
Moved read_file and read from ConvertToBasPlug to ConvertToPlug.
|
|
|
@1418
|
24 years |
davidb |
Small modification to improve handling of file names with spaces in.
|
|
|
@1417
|
24 years |
davidb |
Additions so ConvertPlug etc. can handle filenames with spaces in them.
|
|
|
@1415
|
24 years |
davidb |
Removed some diagnostic print statements.
|
|
|
@1412
|
24 years |
dmm9 |
adding the date extractor
|
|
|
@1411
|
24 years |
dmm9 |
added the options for the date extractor
|
|
|
@1410
|
24 years |
davidb |
Introduction of "ConvertTo" family of plugins. This establishes a new …
|
|
|
@1405
|
24 years |
say1 |
fixed acronym bugs
|
|
|
@1404
|
24 years |
say1 |
fixed acronyms option file. trimmed text at start of bibliographies to …
|
|
|
@1403
|
24 years |
say1 |
taught HTMLPlug about shtml, asp, cgi, php and html query files …
|
|
|
@1401
|
24 years |
davidb |
Fixed small problem with associated files.
|
|
|
@1400
|
24 years |
davidb |
General tidying of code.
|
|
|
@1396
|
24 years |
say1 |
changed initialisation code for acronyms
|
|
|
@1393
|
24 years |
say1 |
acronym markup functionality
|
|
|
@1388
|
24 years |
sjboddie |
fixed a bit of a bug (more of a typo really) in the recent changes …
|
|
|
@1384
|
24 years |
paynter |
Changed language extraction to ignore encoding information, so that …
|
|
|
@1382
|
24 years |
paynter |
Less common languages moved into a subdirectory of textcat so that the …
|
|
|
@1379
|
24 years |
paynter |
Fixed bug that gave gsdlsourcedocument metadata relative path instead …
|
|
|
@1377
|
24 years |
paynter |
Added "mirror interval N" command for use with update.pl
|
|
|
@1374
|
24 years |
sjboddie |
made set_OID use original document text instead of document object
|
|
|
@1362
|
24 years |
say1 |
removed use statement so other files could be compiled with use strict …
|
|
|
@1361
|
24 years |
say1 |
rewrote recursively to handle stop words and more cases
|
|
|
@1360
|
24 years |
say1 |
clarified status messages
|
|
|
@1358
|
24 years |
nzdl |
Fixed bug I recently introduced into HTMLPlug (<pre> tags were being …
|
|
|
@1341
|
24 years |
paynter |
Licensing information for TextCat language models.
|
|
|
@1336
|
24 years |
say1 |
fixed acronym extraction so it now runs in time linear to the …
|
|
|
@1335
|
24 years |
say1 |
many acronym changes
|
|
|
@1317
|
24 years |
paynter |
Added -extract_language option, which uses the textcat language …
|
|
|
@1316
|
24 years |
paynter |
The textcat language identification package.
|
|
|
@1315
|
24 years |
paynter |
Language models for the textcat language identification package.
|
|
|
@1313
|
24 years |
sjboddie |
Added David's version of AZCompactList which handles multiple value metadata
|
|
|
@1312
|
24 years |
sjboddie |
fixed a bug in the HTML plugin that showed up under windows
|
|
|
@1304
|
24 years |
sjboddie |
fixed an intermittent bug (I hope) when building under windows
|
|
|
@1302
|
24 years |
kjm18 |
buildtype and indexfields added to configuration file entries. these …
|
|
|
@1301
|
24 years |
kjm18 |
building now writes 'buildtype mgpp' to build.cfg - indicates an mgpp …
|
|
|
@1287
|
24 years |
sjboddie |
Implemented a -sortmeta option for import.pl to sort archives.inf file …
|
|
|
@1269
|
24 years |
sjboddie |
Added ZIPPlug plugin for handling input documents that have been …
|
|
|
@1252
|
24 years |
sjboddie |
Building code now extracts a couple more statistics from mg and …
|
|
|
@1251
|
24 years |
sjboddie |
Added some stat reporting and a warning message to the build code. Now …
|
|
|
@1250
|
24 years |
sjboddie |
Tidied up the classifiers slightly, made them a little more object …
|
|
|
@1246
|
24 years |
sjboddie |
Now prevent "notbuilt" field from going in the build.cfg file unless …
|
|
|
@1245
|
24 years |
sjboddie |
Fixed a bug that davidb found in a couple of regular expressions
|
|
|
@1244
|
24 years |
sjboddie |
Caught up most general plugins (that's the ones in …
|
|
|
@1243
|
24 years |
sjboddie |
Caught HTMLPlug up with BasPlug. A few minor changes to some …
|
|
|
@1242
|
24 years |
sjboddie |
Added Stuart Yeates's acronym extraction code and made it a standard …
|
|
|
@1241
|
24 years |
sjboddie |
merged ascii_doc.pm and doc.pm back together (removing basedoc.pm). To …
|
|
|
@1240
|
24 years |
gwp |
Resolved conflicts between previous two versions.
|
|
|
@1239
|
24 years |
gwp |
Replaced references to @_ in subroutine parse with a new variable …
|
|
|
@1235
|
24 years |
nzdl |
* empty log message *
|
|
|
@1231
|
24 years |
gwp |
Bug fix on the H1 metadata option: if the file has no <H1> tag, …
|
|
|
@1230
|
24 years |
gwp |
Added an additional H1 metadata field that extracts the text between …
|
|
|
@1229
|
24 years |
sjboddie |
fixed bug in options
|
|
|
@1227
|
24 years |
sjboddie |
Modified the perl code for importing arabic encoded documents. Plugins …
|
|
|
@1225
|
24 years |
sjboddie |
minor change to parsargv.pm to allow for parsing of options within …
|
|
|
@1224
|
24 years |
sjboddie |
added handling of arabic encoding and ability to read in an entire …
|
|
|
@1223
|
24 years |
sjboddie |
added an arabic2unicode conversion function to unicode.pm
|
|
|
@1222
|
24 years |
sjboddie |
changed some ghtml.pm regular expressions to handle multiline strings
|
|
|
@1221
|
24 years |
sjboddie |
Added a new HBSPlug which is kind of a generalisation of HBPlug …
|
|
|
@1220
|
24 years |
sjboddie |
Caught HTMLPlug up with the changes I made to BasPlug. HTMLPlug now …
|
|
|
@1219
|
24 years |
sjboddie |
Made BasPlug take options (these options are available to all plugins …
|
|
|
@1218
|
24 years |
sjboddie |
fixed bug in gb.pm preventing gb encoding text from being translated …
|
|
|
@1206
|
24 years |
gwp |
A thorough rewrite; some of the metadata was flawed in such a way that …
|
|
|
@1204
|
24 years |
gwp |
updated htmlsafe to substitute quotes with &quot;
|
|
|
@1190
|
24 years |
gwp |
The first 200 chars of body text can now be extracted as metadata by …
|
|
|
@1181
|
24 years |
sjboddie |
got end-user collection building to work (almost) on windows 95. …
|
|
|
@1178
|
24 years |
sjboddie |
modified perl dmsafe function to handle backslashes
|
|
|
@1086
|
24 years |
sjboddie |
Added AZCompactList.pm to distribution (and altered List.pm slightly …
|
|
|
@1072
|
24 years |
sjboddie |
Fixed bug - Control B's and C's were only being removed from body of …
|
|
|
@1046
|
24 years |
sjboddie |
added comment to make me feel better for having spent an hour testing …
|
|
|
@1044
|
24 years |
nzdl |
don't output doctype field to gdbm if document already has metadata …
|
|
|
@1020
|
24 years |
sjboddie |
changed paths to collection images (again!)
|
|
|
@1010
|
24 years |
sjboddie |
renamed old html module ghtml -- it clashed with builtin html module …
|
|
|
@1006
|
24 years |
sjboddie |
fixed bug in previous changes
|
|
|
@983
|
24 years |
sjboddie |
link() function isn't supported on windows - use copy
|
|
|
@973
|
24 years |
sjboddie |
new path to images
|
|
|
@965
|
24 years |
sjboddie |
fixed bug - added assoc_files option
|
|
|
@932
|
24 years |
kjm18 |
new building programs for mgpp added
|
|
|
@918
|
24 years |
kjm18 |
fixed bug where it was creating two doc_obj per file instead of just one.
|
|
|
@900
|
24 years |
sjboddie |
tweaked the way associated files are handled at build time - some …
|
|
|
@899
|
24 years |
sjboddie |
small change to doc data structure to allow for some hacking in WebPlug
|
|
|
@898
|
24 years |
sjboddie |
fixed small bug (groupsize had no default)
|
|
|
@897
|
24 years |
sjboddie |
lots of stuff
|
|
|
@863
|
24 years |
sjboddie |
fixed a couple of bugs that I introduced when including David's stuff
|
|
|
@862
|
24 years |
sjboddie |
fixed a couple of bugs that were preventing multiple document gml files …
|
|
|
@850
|
24 years |
sjboddie |
added use strict - tidied a few things up etc.
|
|
|
@849
|
24 years |
sjboddie |
Fixed a bit of a bug
|
|
|
@847
|
25 years |
sjboddie |
fixed CVS burp
|
|
|
@846
|
25 years |
sjboddie |
don't use hashdoc for now
|
|
|
@842
|
25 years |
davidb |
base object for 'doc' objects (UTF8 or ASCII)
|
|
|
@840
|
25 years |
davidb |
Optimisations to make plugin go faster
|
|
|
@839
|
25 years |
davidb |
added extra_metadata function
|
|
|
@838
|
25 years |
davidb |
added options passed into 'new' subroutine
|
|
|
@837
|
25 years |
davidb |
added alpha_numeric search
|
|
|
@836
|
25 years |
davidb |
improvements to utils
|
|
|
@835
|
25 years |
davidb |
added 'begin' and 'end' function for plugins
|
|
|