source: trunk/gsdl/perllib/plugins

Rev Age Author Log Message
@9143   19 years davidb Added handling of <embed> tag in a similar fashion to <img> Also, …
@9125   19 years mdewsnip Added a substr function to unicode.pm that should work correctly on …
@9122   19 years mdewsnip Grrr... why doesn't anyone think about Windows when writing code?
@9120   19 years kjdon BibTex plug can do exploding - set 'explodes' to yes in xml description
@9118   19 years kjdon MARC plug can do exploding - set 'explodes' to yes in xml description
@9067   19 years kjdon moved smart blocking stuff in htmlplug metadata_read into basplug …
@9057   19 years kjdon tidied up previous commit
@9056   19 years kjdon added an option to not strip html tags from metadata in description …
@9053   19 years kjdon changed the description tags metadata handling again. now uses an …
@9046   19 years kjdon added an empty block exp so that we don't block images - any cases …
@9044   19 years kjdon changed some of the comments
@9043   19 years kjdon new plugin for processing the .nul files produced by exploding …
@8915   19 years chi Add an option-smart_block_BN for BN Portugal Collection.
@8914   19 years chi Add a smart_block option to deal with associated files of HTML document.
@8913   19 years chi program layout change
@8909   19 years davidb PageImgPlug updated so read function follows more consistently the …
@8908   19 years davidb BasPlug now sets a piece of metadata [hascover] if document has a …
@8904   19 years jrm21 need to do qp decoding before doing text_into_html so we don't keep …
@8902   19 years jrm21 slightly better way of recognising gb charset names (mapped to 'gb')
@8893   19 years davidb Additional check added to plugins read function to remain compatible …
@8892   19 years davidb Addition of new minus option to BasPlug: -associate_ext. This new …
@8891   19 years davidb Revision of argument types to a few plugin options to better reflect …
@8843   19 years jrm21 fix problem for -metadata_fields if tag1<Tag2> given for mapping to a …
@8818   19 years mdewsnip Title tags over multiple lines will now be removed correctly before …
@8814   19 years mdewsnip Updated files for Kea 3.0, thanks to Olena.
@8795   19 years kjdon if use_sections is on, now we are a bit more relaxed about what the …
@8794   19 years jrm21 remove trailing \n from meta tags (bug reported by Tim Finney, 13 Dec 2004)
@8789   19 years mdewsnip Better documentation of the extract keyphrases (Kea) code, thanks to Olena.
@8767   19 years jrm21 add 'use utf8' so hopefully substr() is smart enough to cut between …
@8764   19 years chi Modifications of the use of BasPlug
@8762   19 years mdewsnip The files this plugin processes can be exploded by the …
@8761   19 years mdewsnip XML plugin descriptions now include an <Explodes> tag that records …
@8749   19 years mdewsnip Now escapes '<' and '>' characters in metadata values correctly.
@8740   19 years chi Modifications for validated METS format.
@8739   19 years chi A new plugin - BNContentePlug to deal with Portugal BN collections.
@8737   19 years davidb Extension to RecPlug so metadata that goes with a file that is in a …
@8716   19 years kjdon added some changes made by Emanuel Dejanu (Simple Words)
@8684   20 years mdewsnip Ooops... the OAIPlug has never worked properly on Windows! Regular …
@8678   20 years kjdon cover images are now turned on by default, and the option is changed …
@8668   20 years kjdon when processing description tags, it used to use …
@8646   20 years mdewsnip Made ISISPlug.pm a bit more robust to crap files.
@8563   20 years mdewsnip Ripped all the obtaining referenced documents and exploding database …
@8519   20 years mdewsnip Fixed the extra Title metadata problem with David's help.
@8514   20 years chi Modify the namespace in METS file as "gsdl3"
@8513   20 years chi Add a method metadata_read in order to go straight to BasPlug and …
@8512   20 years chi Add a new metadata_read method in the first pass in order to identify …
@8511   20 years chi A new plugin to import the collections in DSpace format to GS2.
@8510   20 years chi Add a new method metadata_read to deal with specific (or external) …
@8509   20 years chi Add new methods (with a smart_block option) to store the blocked …
@8402   20 years kjdon fixed up the header page stuff with pagedimgplug - docs always have a …
@8366   20 years kjdon added script to the list of tags to process as relative links, and js …
@8365   20 years kjdon put double quotes around values of <a href=xxx> and <img src=xxx>
@8350   20 years kjdon assign the fall back title after processing any other metadata, so …
@8315   20 years mdewsnip Was adding Source metadata twice.
@8278   20 years jrm21 sanity check for a valid date before trying to add it as metadata, …
@8246   20 years kjdon changed the default to have noheaderpage, so the option is now …
@8245   20 years kjdon a few fixes for problems found on Ian's laptop
@8227   20 years jrm21 all perl things should "use strict;" to catch errors! $cursection was …
@8226   20 years jrm21 tell HTMLPlug to extract the author metadata, and rename it to Creator.
@8225   20 years jrm21 support tag<tagname> as described in the pluginfo for HTMLPlug. The …
@8218   20 years jrm21 use the unicode::ensure_utf8() function on the extracted text so we …
@8171   20 years mdewsnip FileFormat metadata for PostScript files should now be set correctly.
@8170   20 years mdewsnip Fixed some of the new FileFormat metadata so you only get one value …
@8166   20 years mdewsnip Added FileSize metadata in most plugins.
@8145   20 years mdewsnip Check for ImageMagick being installed and on the path, and bail early …
@8139   20 years mdewsnip Now adds NumPages metadata.
@8138   20 years mdewsnip Added FileFormat metadata.
@8121   20 years chi Add the "FileFormat" metadata to each of the Plugins.
@8119   20 years jrm21 allow multiple callbacks, one for each metadata field (using the …
@8117   20 years mdewsnip Fixed a bug where extra dots in filenames would cause the file …
@8098   20 years jrm21 * guess a title if no \title tag * \it tag * fractions in maths mode
@8097   20 years jrm21 added extra accent for \"i
@8090   20 years davidb Switching RecPlug over to using XMLParser wrapper rather than …
@8071   20 years davidb When title metadata is derived from first 100 chars of text, extra =~ …
@8069   20 years davidb Introduction of XMLParser.pm as a wrapper for XML::Parser (a standard …
@7966   20 years mdewsnip Updated my fix from yesterday, so the collections will work correctly …
@7949   20 years mdewsnip Added a bit of a hack for the wv 0.7.1 bug under Windows that causes …
@7932   20 years mdewsnip A first cut at modifying RecPlug to resolve Windows shortcuts …
@7911   20 years mdewsnip Finally got around to committing Christy Kuo's (cyk2) COMP517-03B …
@7901   20 years chi Greenstone2 now supports METS format as an archiving option. …
@7900   20 years chi Some minor changes in preparation for the introduction of METSPlug.
@7830   20 years jrm21 change a couple of error messages to using gsprintf translated strings …
@7818   20 years jrm21 improvements to the handling of textcat's guessed encoding
@7703   20 years jrm21 1) use the email's message ID instead of document hash for Identifier. …
@7701   20 years jrm21 allow utf-8 as an alias for utf8
@7693   20 years mdewsnip Improvements to the new code so that it works on Windows as well as Unix.
@7688   20 years mdewsnip Fixed a small bug in the code I just added.
@7686   20 years mdewsnip First cut at upgrading the CDS/ISIS plugin to obtain and index …
@7685   20 years jrm21 make sure warning/error messages go to $outhandle.
@7683   20 years jrm21 better code for reading in the config file, so not just unix-specific.
@7681   20 years jrm21 more portable way of reading config file
@7668   20 years jrm21 renamed "kea" to "Keyphrase" metadata, and add one for each extracted …
@7645   20 years jrm21 don't fail if we can't load the diagnostics package.
@7644   20 years jrm21 don't print "wrong encoding" message for text in english. textcat …
@7640   20 years mdewsnip Removed the reference to WebPlug, which no longer exists.
@7595   20 years mdewsnip Seem to have fixed the problem with anchors being added to images (for …
@7571   20 years kjdon original filename is now used for gsdlsourcefilename, and converted …
@7570   20 years kjdon now sets the converted filename to be used for hashing
@7559   20 years kjdon added use BasPlug
@7556   20 years kjdon set the original filename as gsdlsourcefilename so that GLI can assign …