|
|
@12970
|
18 years |
kjdon |
set_keepold and self->{'keepold'} have been changed to set_incremental …
|
|
|
@12630
|
18 years |
mdewsnip |
Plugins now output a <Processes> and <Blocks> information field, so …
|
|
|
@12624
|
18 years |
mdewsnip |
Added an extra argument to print_xml_usage for specifying whether to …
|
|
|
@12546
|
18 years |
kjdon |
changed parse2::parse so that it returns -1 on error, 0 on success, or …
|
|
|
@12270
|
18 years |
kjdon |
set_OIDtype now takes two arguments, the type and the metadata (used …
|
|
|
@11966
|
18 years |
mdewsnip |
(Profiling) Creating new textcat objects (one for each plugin) is …
|
|
|
@11880
|
18 years |
kjdon |
added a #" to line 1100 so that emacs colouring is not stuffed up
|
|
|
@11834
|
18 years |
mdewsnip |
Replaced all "_httpcollection_" in metadata (especially srclink) with …
|
|
|
@11681
|
18 years |
kjdon |
print_xml_usage and print_xml_header now take arguments
|
|
|
@11669
|
18 years |
kjdon |
need to pass a parameter to print_xml_header so it knows which DTD to …
|
|
|
@11389
|
18 years |
jrm21 |
try to get the encoding from a '<meta http-equiv' tag if HTML.
make …
|
|
|
@11368
|
18 years |
kjdon |
For some reason smart_block was hidden for gli, so made it visible
|
|
|
@11332
|
18 years |
mdewsnip |
Added a mechanism for plugins to do tidying up after exploding. …
|
|
|
@11122
|
18 years |
davidb |
Introduction of -associate_tail_re option to BasPlug. This is a …
|
|
|
@11089
|
18 years |
kjdon |
removed a couple of unnecessary bits of code like repeated arguments, …
|
|
|
@11069
|
18 years |
mdewsnip |
Added an option to use Kea 4.0 -- this isn't included with Greenstone, …
|
|
|
@11044
|
18 years |
mdewsnip |
The "-extract_keyphrase" and "-extract_keyphrase_options" arguments …
|
|
|
@10833
|
19 years |
jrm21 |
store the names of files we've already checked when looking for a …
|
|
|
@10620
|
19 years |
kjdon |
now prints out some gli tags when bad args are encountered for plugins …
|
|
|
@10579
|
19 years |
kjdon |
copied classify.pm and BasClas.pm, added -gsdlinfo flag - if this is …
|
|
|
@10478
|
19 years |
kjdon |
arcPlug now knows about keepold, and if it's not set, it won't try to do …
|
|
|
@10446
|
19 years |
chi |
Modifications for converting windows-1252 to windows_1252.
|
|
|
@10442
|
19 years |
chi |
To retrieve encoding information for the HTML file generated from …
|
|
|
@10347
|
19 years |
kjdon |
removed the unneeded 'use parsargv'
|
|
|
@10329
|
19 years |
mdewsnip |
Changed the default_language string to be of type "string", since …
|
|
|
@10280
|
19 years |
chi |
Some major changes to allow secondary plugin setting.
|
|
|
@10254
|
19 years |
kjdon |
added 'use strict' to all plugins, and made modifications (mostly …
|
|
|
@10229
|
19 years |
kjdon |
fixed up some stuff for printing args (pluginfo.pl, classinfo.pl)
|
|
|
@10218
|
19 years |
kjdon |
Jeffrey's new parsing modifications, committed approx 6 July, 15.16
|
|
|
@10155
|
19 years |
davidb |
deinit subroutine added that balances out init routine. 'init' called …
|
|
|
@9961
|
19 years |
davidb |
Minor refinement made to print statements (warnings) generated by BasPlug.
|
|
|
@9853
|
19 years |
kjdon |
fixed up maxdocs - now pass an extra parameter to the read function
|
|
|
@9703
|
19 years |
mdewsnip |
Improvement to previous change so "file not processed" messages are …
|
|
|
@9586
|
19 years |
mdewsnip |
Added a ProcessingError message so the GLI knows when a file failed to …
|
|
|
@9584
|
19 years |
mdewsnip |
Plugins that return -1 from their read function now must output the …
|
|
|
@9413
|
19 years |
jrm21 |
if we are trying to automatically determine the encoding, look for a …
|
|
|
@9403
|
19 years |
jrm21 |
need to 'bless' an object before you can call functions in it
(for …
|
|
|
@9398
|
19 years |
davidb |
Introduction of GISBasPlug for Geographic Information System support. …
|
|
|
@9351
|
19 years |
davidb |
Two changes:
1. Fusing files with the same root filename is meant …
|
|
|
@9067
|
19 years |
kjdon |
moved smart blocking stuff in htmlplug metadata_read into basplug …
|
|
|
@8915
|
19 years |
chi |
Add an option, -smart_block_BN, for the BN Portugal Collection.
|
|
|
@8908
|
19 years |
davidb |
BasPlug now sets a piece of metadata [hascover] if document has a …
|
|
|
@8892
|
19 years |
davidb |
Addition of new minus option to BasPlug: -associate_ext.
This new …
|
|
|
@8818
|
19 years |
mdewsnip |
Title tags over multiple lines will now be removed correctly before …
|
|
|
@8814
|
19 years |
mdewsnip |
Updated files for Kea 3.0, thanks to Olena.
|
|
|
@8789
|
19 years |
mdewsnip |
Better documentation of the extract keyphrases (Kea) code, thanks to Olena.
|
|
|
@8761
|
19 years |
mdewsnip |
XML plugin descriptions now include an <Explodes> tag that records …
|
|
|
@8716
|
19 years |
kjdon |
added some changes made by Emanuel Dejanu (Simple Words)
|
|
|
@8678
|
20 years |
kjdon |
cover images are now turned on by default, and the option is changed …
|
|
|
@8510
|
20 years |
chi |
Add a new method metadata_read to deal with specific (or external) …
|
|
|
@8166
|
20 years |
mdewsnip |
Added FileSize metadata in most plugins.
|
|
|
@7818
|
20 years |
jrm21 |
improvements to the handling of textcat's guessed encoding
|
|
|
@7668
|
20 years |
jrm21 |
renamed "kea" to "Keyphrase" metadata, and add one for each extracted …
|
|
|
@7645
|
20 years |
jrm21 |
don't fail if we can't load the diagnostics package.
|
|
|
@7644
|
20 years |
jrm21 |
don't print "wrong encoding" message for text in English.
textcat …
|
|
|
@7508
|
20 years |
kjdon |
changed the plugin metadata - instead of having eg HTMLPlug metadata …
|
|
|
@7504
|
20 years |
davidb |
ImagePlug, MP3Plug, UnknownPlug modified to set Title metadata based …
|
|
|
@7362
|
20 years |
kjdon |
plugin read functions now return 'undef' - didn't recognise, '-1' - …
|
|
|
@7105
|
20 years |
kjdon |
changed the max century arg to a string instead of an int - need to be …
|
|
|
@7023
|
20 years |
kjdon |
fixed up the <tag> display for pluginfo and classinfo. < and > should …
|
|
|
@6987
|
20 years |
mdewsnip |
Missed changing some print()s to gsprintf()s.
|
|
|
@6945
|
20 years |
mdewsnip |
Updated the resource bundle handling code some more. Strings are first …
|
|
|
@6932
|
20 years |
kjdon |
changed the output slightly, and now outputs the classifier/plugin …
|
|
|
@6925
|
20 years |
mdewsnip |
Changed the way display in different languages is done. Instead of …
|
|
|
@6918
|
20 years |
mdewsnip |
Removed some code I commented out.
|
|
|
@6584
|
20 years |
kjdon |
Fiddled around with segmenting for Chinese text. Haven't changed how …
|
|
|
@6408
|
20 years |
jmt12 |
Added two new attributes for script arguments. HiddenGLI controls …
|
|
|
@6332
|
20 years |
jmt12 |
When -gli argument is provided to calling script these modules will …
|
|
|
@5924
|
21 years |
kjdon |
changed the new metadata to eg WordPlug instead of Word, cos a clash …
|
|
|
@5919
|
21 years |
kjdon |
each plugin now adds a metadata field to the doc obj based on the …
|
|
|
@5681
|
21 years |
mdewsnip |
Rewritten option display code (used by all plugins) to use the new …
|
|
|
@4873
|
21 years |
mdewsnip |
Further work on standardising option descriptions. Specifically, in …
|
|
|
@4845
|
21 years |
jrm21 |
use add_metadata instead of add_utf8_metadata for Source and URL …
|
|
|
@4785
|
21 years |
mdewsnip |
Commented out print_usage functions - plugins should now call …
|
|
|
@4778
|
21 years |
mdewsnip |
Modified the code for generating the usage texts to use the methods in …
|
|
|
@4764
|
21 years |
mdewsnip |
Replaced call to removed function print_generic_usage() with a call to …
|
|
|
@4750
|
21 years |
mdewsnip |
Improved formatting of usage texts automatically generated from John's …
|
|
|
@4746
|
21 years |
mdewsnip |
Initial attempt at a generic print usage function which works with the …
|
|
|
@4744
|
21 years |
mdewsnip |
Tidied up and structures (representing the options of the plugin) in …
|
|
|
@3834
|
21 years |
sjboddie |
Prevent "use bytes" from causing errors for older perls
|
|
|
@3767
|
21 years |
sjboddie |
Scattered some "use bytes" pragmas around to try to prevent perl-5.8 …
|
|
|
@3731
|
21 years |
jrm21 |
If textcat returns too many possibilities, use the default language …
|
|
|
@3540
|
22 years |
kjdon |
added John T's changes into CVS - added info to enable retrieval of …
|
|
|
@3515
|
22 years |
jrm21 |
call a plugin's set_OID() method if one exists, otherwise use the …
|
|
|
@3427
|
22 years |
sjboddie |
The input encoding will now default to utf8 instead of iso-8859-1. …
|
|
|
@3086
|
22 years |
nzdl |
* empty log message *
|
|
|
@2835
|
23 years |
dmm9 |
Corrected pluginfo entry and renamed extract_date to …
|
|
|
@2816
|
23 years |
sjboddie |
Added cover_image option to BasPlug for associating a jpeg image as a …
|
|
|
@2811
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2796
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2795
|
23 years |
sjboddie |
Got ZIPPlug working under windows
|
|
|
@2785
|
23 years |
sjboddie |
The build process now creates a summary of how many files were …
|
|
|
@2755
|
23 years |
jrm21 |
import.pl now takes an option for saving file conversion failures to a …
|
|
|
@2751
|
23 years |
sjboddie |
Had a go at enriching the default document structure.
Added …
|
|
|
@2734
|
23 years |
sjboddie |
Chinese text segmentation is now done whenever language="zh" instead …
|
|
|
@2604
|
23 years |
jrm21 |
when extracting email addresses, we now include people in the .net …
|
|
|
@2601
|
23 years |
jrm21 |
modified usage to not mention HTMLPlug blocking rtf.
|
|
|
@2327
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2235
|
23 years |
sjboddie |
Hacked the textcat package about so that it only reads all the …
|
|
|
@2219
|
23 years |
sjboddie |
Had another go at suppressing the "subroutine redefined" warnings as …
|
|
|