source: trunk/gsdl/perllib/plugins

Diff Rev Age Author Log Message
(edit) @6769   20 years kjdon added two new options: noheaderpage to suppress the empty first …
(edit) @6651   20 years kjdon fixed a bug I introduced last time
(edit) @6649   20 years kjdon changed the regex for getting info out of meta tags so it now works if …
(edit) @6584   20 years kjdon Fiddled around with segmenting for chinese text. Haven't changed how …
(edit) @6555   20 years kjdon a new plugin for processing sequences of page images
(edit) @6408   20 years jmt12 Added two new attributes for script arguments. HiddenGLI controls …
(edit) @6332   20 years jmt12 When -gli argument is provided to calling script these modules will …
(edit) @6214   20 years mdewsnip Added David's fixes for spaces in filenames and setting srclink metadata.
(edit) @6138   21 years mdewsnip Added plugin type metadata.
(edit) @6137   21 years kjdon added new metadata field - SourceSegment, set when the source doc has …
(edit) @6132   21 years kjdon importfrom now puts src docs into srcdocs instead of .orig
(edit) @6123   21 years mdewsnip A small fix to prevent ampersands being included in metadata field names.
(edit) @6107   21 years mdewsnip First stab at a plugin for reading CDS/ISIS databases. This plugin …
(edit) @6079   21 years jrm21 oops... became too liberal for attachment filenames... now fixed to …
(edit) @6062   21 years jrm21 "use strict" and picked up quite a few typos. escape _ into \_ …
(edit) @5924   21 years kjdon changed the new metadata to eg WordPlug instead of Word, cos a clash …
(edit) @5919   21 years kjdon each plugin now adds a metadata field to the doc obj based on the …
(edit) @5878   21 years mdewsnip Added the perllib/cpan path back to INC, to prevent errors (under …
(edit) @5866   21 years davidb Flag for 'metadata_mapping' was mistakenly 'metdata_map'. …
(edit) @5845   21 years mdewsnip David's fixes for running ImagePlug under Windows.
(edit) @5765   21 years mdewsnip Commented out check for spaces in filenames - why is it there?? …
(edit) @5681   21 years mdewsnip Rewritten option display code (used by all plugins) to use the new …
(edit) @5680   21 years mdewsnip Moved plugin descriptions into the resource bundle …
(edit) @5616   21 years davidb Position of @args moved, so it records what the options passed to the …
(edit) @5454   21 years jrm21 seemed to need a \Q to protect a variable in LHS of an s/;
(edit) @5295   21 years mdewsnip Added one tiny little option to help the GLI out with monitoring the …
(edit) @5139   21 years mdewsnip Changed parsing the use_sections option to parse a flag.
(edit) @5103   21 years mdewsnip Newer versions of identify display more accurate file sizes (eg. …
(edit) @5096   21 years jmt12 Metadata fields actually has nothing to do with the metadata elements …
(edit) @5078   21 years mdewsnip Fixed bug where ScreenHeight and ScreenWidth would not be set if …
(edit) @5066   21 years kjdon changed HTMLPlug to extract multiple values for the same metadata name
(edit) @4908   21 years mdewsnip Another missing ']' causing problems.
(edit) @4894   21 years mdewsnip Added a missing ']' that was causing problems. Thanks to Ben Dwyer for …
(edit) @4873   21 years mdewsnip Further work on standardising option descriptions. Specifically, in …
(edit) @4845   21 years jrm21 use add_metadata instead of add_utf8_metadata for Source and URL …
(edit) @4844   21 years jrm21 database plugin doesn't take the "title_sub" option.
(edit) @4843   21 years mdewsnip Added check to ConvertToRogPlug creation so that 'pluginfo.pl …
(edit) @4842   21 years mdewsnip Added check when creating a ConvertToPlug object so that 'pluginfo.pl …
(edit) @4821   21 years jrm21 corrected extract_first_NNNN function so that it doesn't get confused …
(edit) @4792   21 years davidb Modified so BibTeX records with no key processed correctly.
(edit) @4791   21 years davidb Modified so -input_encoding flag used.
(edit) @4790   21 years davidb Addition of 'quotemeta' to protect directory separator under Windows …
(edit) @4785   21 years mdewsnip Commented out print_usage functions - plugins should now call …
(edit) @4778   21 years mdewsnip Modified the code for generating the usage texts to use the methods in …
(edit) @4764   21 years mdewsnip Replaced call to removed function print_generic_usage() with a call to …
(edit) @4750   21 years mdewsnip Improved formatting of usage texts automatically generated from John's …
(edit) @4748   21 years mdewsnip Changed "metadatum" type to "metadata".
(edit) @4747   21 years mdewsnip Added $options structure for storing plugin description.
(edit) @4746   21 years mdewsnip Initial attempt at a generic print usage function which works with the …
(edit) @4745   21 years mdewsnip Uncommented a line which shouldn't have been committed commented.
(edit) @4744   21 years mdewsnip Tidied up and structures (representing the options of the plugin) in …
(edit) @4726   21 years davidb Initial version of OAI plugin for parsing records downloaded from an …
(edit) @4724   21 years davidb ImagePlug now stores metadata for srcicon, thumbicon and screenicon to …
(edit) @4429   21 years jrm21 new plugin for importing data from perl's DBI database interface - eg …
(edit) @4224   21 years jrm21 fixed regexp for when we have a content type without a charset
(edit) @4103   21 years sjboddie Added a -nohidden PDFPlug option and made it pass the -hidden option …
(edit) @4089   21 years jrm21 added "\n" to headers as we weren't picking up messages that were only …
(edit) @3932   21 years jrm21 need to escape _ chars.
(edit) @3919   21 years jrm21 tidy and fix reg-exps when looking for #includes... it got stuck in a …
(edit) @3856   21 years davidb General improvement to the translator facility.
(edit) @3834   21 years sjboddie Prevent "use bytes" from causing errors for older perls
(edit) @3833   21 years jrm21 fixed up parsing the use_sections argument.
(edit) @3767   21 years sjboddie Scattered some "use bytes" pragmas around to try to prevent perl-5.8 …
(edit) @3737   21 years davidb Used to support music-content based collections
(edit) @3732   21 years jrm21 need to escape ",", "<", and ">" in title metadata
(edit) @3731   21 years jrm21 If textcat returns too many possibilities, use the default language …
(edit) @3726   21 years jrm21 minor fix for "_" chars in urls... escape them after, not before. …
(edit) @3724   21 years kde2 Submission of Interface Translation Agency
(edit) @3721   21 years jrm21 bug where some text/plain messages weren't having < > & properly …
(edit) @3720   21 years sjboddie Added options to PDFPlug to take advantage of the improvements in …
(edit) @3708   21 years sjboddie Fixed a bug where HTMLPlug failed to associate files whose filenames …
(edit) @3630   21 years jrm21 1) Correct typo in print_usage(): process_exp -> split_exp 2) Fixed …
(edit) @3629   21 years jrm21 need to look for associated files in the assocfilepath, if this …
(edit) @3627   21 years jrm21 added less-obfuscated quoted-printable parsing in qp_decode()
(edit) @3614   22 years jrm21 modified section-handling stuff to work with output from v.0.34 of …
(edit) @3590   22 years jrm21 modified the split regular expression so it works with newer versions …
(edit) @3587   22 years jrm21 removed comments about storing "BibTex" metadata as we don't do that …
(edit) @3542   22 years jrm21 ghtml returns utf8, not iso-8859-1, so any html entities were being …
(edit) @3540   22 years kjdon added John T's changes into CVS - added info to enable retrieval of …
(edit) @3539   22 years kjdon added jpe to the process and block expressions
(edit) @3537   22 years jrm21 if process() returns undef, then the plugin couldn't process that …
(edit) @3524   22 years kjdon added the help message for the previous change
(edit) @3523   22 years kjdon now EMAILPlug accepts the split_exp option - a regular expression that …
(edit) @3517   22 years davidb ImagePlug modified so 'Source' metadata set to be consistent with …
(edit) @3515   22 years jrm21 call a plugin's set_OID() method if one exists, otherwise use the …
(edit) @3508   22 years jrm21 modified copyright statement
(edit) @3430   22 years jrm21 Added MARCPlug, mostly done by David Bainbridge. It needs a …
(edit) @3427   22 years sjboddie The input encoding will now default to utf8 instead of iso-8859-1. …
(edit) @3426   22 years jrm21 Don't add \n to the end of each metadata value.
(edit) @3414   22 years jrm21 Need to escape "_" characters so that greenstone doesn't interpret them…
(edit) @3411   22 years jrm21 Now takes a "-use_sections" option to make a section per page.
(edit) @3400   22 years sjboddie WordPlug now handles .dot files as well as .doc files.
(edit) @3398   22 years jrm21 Oops... the last change to the regex was too permissive... fixed up to …
(edit) @3397   22 years jrm21 minor change to the regex for marking up urls (to allow #anchor at the end)
(edit) @3369   22 years sjboddie HTMLPlug will no longer prevent metadata extraction when the …
(edit) @3352   22 years jrm21 We can now properly handle messages with a content type of …
(edit) @3351   22 years jrm21 If a message is in an unsupported encoding, we assume iso8859-1. …
(edit) @3350   22 years sjboddie Added -use_strings option to ConvertToPlug. The default behaviour for …
(edit) @3349   22 years sjboddie Bug fix.
(edit) @3329   22 years jrm21 Oops, removed debugging statement!