source: trunk/gsdl/perllib/plugins

Diff Rev Age Author Log Message
(edit) @8139   20 years mdewsnip Now adds NumPages metadata.
(edit) @8138   20 years mdewsnip Added FileFormat metadata.
(edit) @8121   20 years chi Add the "FileFormat" metadata to each of the Plugins.
(edit) @8119   20 years jrm21 allow multiple callbacks, one for each metadata field (using the …
(edit) @8117   20 years mdewsnip Fixed a bug where extra dots in filenames would cause the file …
(edit) @8098   20 years jrm21 * guess a title if no \title tag * \it tag * fractions in maths mode
(edit) @8097   20 years jrm21 added extra accent for \"i
(edit) @8090   20 years davidb Switching RecPlug over to using XMLParser wrapper rather than …
(edit) @8071   20 years davidb When title metadata is derived from first 100 chars of text, extra =~ …
(edit) @8069   20 years davidb Introduction of XMLParser.pm as a wrapper for XML::Parser (a standard …
(edit) @7966   20 years mdewsnip Updated my fix from yesterday, so the collections will work correctly …
(edit) @7949   20 years mdewsnip Added a bit of a hack for the wv 0.7.1 bug under Windows that causes …
(edit) @7932   20 years mdewsnip A first cut at modifying RecPlug to resolve Windows shortcuts …
(edit) @7911   20 years mdewsnip Finally got around to committing Christy Kuo's (cyk2) COMP517-03B …
(edit) @7901   20 years chi Greenstone2 now supports METS format as an archiving option. …
(edit) @7900   20 years chi Some minor changes in preparation for the introduction of METSPlug.
(edit) @7830   20 years jrm21 change a couple of error messages to using gsprintf translated strings …
(edit) @7818   20 years jrm21 improvements to the handling of textcat's guessed encoding
(edit) @7703   20 years jrm21 1) use the email's message ID instead of document hash for Identifier. …
(edit) @7701   20 years jrm21 allow utf-8 as an alias for utf8
(edit) @7693   20 years mdewsnip Improvements to the new code so that it works on Windows as well as Unix.
(edit) @7688   20 years mdewsnip Fixed a small bug in the code I just added.
(edit) @7686   20 years mdewsnip First cut at upgrading the CDS/ISIS plugin to obtain and index …
(edit) @7685   20 years jrm21 make sure warning/error messages go to $outhandle.
(edit) @7683   20 years jrm21 better code for reading in the config file, so not just unix-specific.
(edit) @7681   20 years jrm21 more portable way of reading config file
(edit) @7668   20 years jrm21 renamed "kea" to "Keyphrase" metadata, and add one for each extracted …
(edit) @7645   20 years jrm21 don't fail if we can't load the diagnostics package.
(edit) @7644   20 years jrm21 don't print "wrong encoding" message for text in english. textcat …
(edit) @7640   20 years mdewsnip Removed the reference to WebPlug, which no longer exists.
(edit) @7595   20 years mdewsnip Seem to have fixed the problem with anchors being added to images (for …
(edit) @7571   20 years kjdon original filename is now used for gsdlsourcefilename, and converted …
(edit) @7570   20 years kjdon now sets the converted filename to be used for hashing
(edit) @7559   20 years kjdon added use BasPlug
(edit) @7556   20 years kjdon set the original filename as gsdlsourcefilename so that GLI can assign …
(edit) @7555   20 years kjdon removed ugly quotes from a printed message
(edit) @7553   20 years kjdon apparently 80 columns is too wide, use 64
(edit) @7547   20 years kjdon the text::wrap module is not a standard one apparently, so I have made …
(edit) @7541   20 years jrm21 new plugin. Eventually I'll get bibtex to inherit from latexplug …
(edit) @7533   20 years kjdon wrap the lines of the record at 80 columns
(edit) @7528   20 years davidb Plugin was setting title twice (derived from filename and extracted …
(edit) @7513   20 years davidb User specified fields need to be converted to all uppercase to be …
(edit) @7508   20 years kjdon changed the plugin metadata - instead of having eg HTMLPlug metadata …
(edit) @7506   20 years kjdon changed 'dummy text to sidestep display bug' to 'this document has no …
(edit) @7504   20 years davidb ImagePlug, MP3Plug, UnknownPlug modified to set Title metadata based …
(edit) @7498   20 years kjdon oops, accidentally committed a testing return statement which meant that …
(edit) @7496   20 years davidb srcicon metadata added for mp3 icon
(edit) @7490   20 years davidb Language specific text labels for MP3Plugin added.
(edit) @7487   20 years davidb Plugin for processing MP3 files. Based on UnknownPlug, it …
(edit) @7458   20 years mdewsnip Added another case for when the dbt file doesn't exist.
(edit) @7362   20 years kjdon plugin read functions now return 'undef' - didn't recognise, '-1' - …
(edit) @7353   20 years kjdon removed all the old print_usage functions - they may be misleading as …
(edit) @7352   20 years kjdon added a bit more to the description
(edit) @7337   20 years kjdon fixed another error in the example format statement
(edit) @7336   20 years kjdon fixed a couple of errors in the example format statement
(edit) @7304   20 years kjdon now adds Source metadata
(edit) @7287   20 years mdewsnip Now extracts date metadata from the output of pdftohtml 0.36 and …
(edit) @7244   20 years kjdon made this abstract so it doesn't show up in the GLI plugin list
(edit) @7243   20 years kjdon David said these were abstract plugins so set abstract to yes - GLI …
(edit) @7235   20 years kjdon fixed a couple of bugs and added a bit of output to do with extracting …
(edit) @7202   20 years jrm21 rewrote the <meta> tag handling to be more robust and more efficient.
(edit) @7195   20 years mdewsnip First cut at a plugin for processing (exported) ProCite databases. It …
(edit) @7107   20 years kjdon added a range to the zoom arg
(edit) @7106   20 years kjdon modified the comments at the top, made the size args have ranges > 0
(edit) @7105   20 years kjdon changed the max century arg to a string instead of an int - need to be …
(edit) @7049   20 years mdewsnip Reduced the error messages just added because they would be confusing …
(edit) @7048   20 years mdewsnip Now checks for the .xrf file required by the IsisGdl program. Also, …
(edit) @7023   20 years kjdon fixed up the <tag> display for pluginfo and classinfo. < and > should …
(edit) @7021   20 years mdewsnip The filename argument to IsisGdl is now quoted, for Windows systems …
(edit) @7019   20 years jrm21 fall back gracefully if -use_sections argument was given but no …
(edit) @6987   20 years mdewsnip Missed changing some print()s to gsprintf()s.
(edit) @6945   20 years mdewsnip Updated the resource bundle handling code some more. Strings are first …
(edit) @6943   20 years kjdon now it increments the num_processed docs counter
(edit) @6932   20 years kjdon changed the output slightly, and now outputs the classifier/plugin …
(edit) @6925   20 years mdewsnip Changed the way display in different languages is done. Instead of …
(edit) @6918   20 years mdewsnip Removed some code I commented out.
(edit) @6916   20 years jrm21 Don't store Headers metadata by default (it's quite wasteful of …
(edit) @6860   20 years kjdon moved the type_list to before the arguments, so it can actually be used
(edit) @6812   20 years mdewsnip Additions for the GsdlCollageApplet: a classifier that displays a …
(edit) @6769   20 years kjdon added two new options: noheaderpage to suppress the empty first …
(edit) @6651   20 years kjdon fixed a bug I introduced last time
(edit) @6649   20 years kjdon changed the regex for getting info out of meta tags so it now works if …
(edit) @6584   20 years kjdon Fiddled around with segmenting for chinese text. Haven't changed how …
(edit) @6555   20 years kjdon a new plugin for processing sequences of page images
(edit) @6408   20 years jmt12 Added two new attributes for script arguments. HiddenGLI controls …
(edit) @6332   20 years jmt12 When -gli argument is provided to calling script these modules will …
(edit) @6214   20 years mdewsnip Added David's fixes for spaces in filenames and setting srclink metadata.
(edit) @6138   20 years mdewsnip Added plugin type metadata.
(edit) @6137   20 years kjdon added new metadata field - SourceSegment, set when the source doc has …
(edit) @6132   21 years kjdon importfrom now puts src docs into srcdocs instead of .orig
(edit) @6123   21 years mdewsnip A small fix to prevent ampersands being included in metadata field names.
(edit) @6107   21 years mdewsnip First stab at a plugin for reading CDS/ISIS databases. This plugin …
(edit) @6079   21 years jrm21 oops... became too liberal for attachment filenames... now fixed to …
(edit) @6062   21 years jrm21 "use strict" and picked up quite a few typos. escape _ into \_ …
(edit) @5924   21 years kjdon changed the new metadata to eg WordPlug instead of Word, cos a clash …
(edit) @5919   21 years kjdon each plugin now adds a metadata field to the doc obj based on the …
(edit) @5878   21 years mdewsnip Added the perllib/cpan path back to INC, to prevent errors (under …
(edit) @5866   21 years davidb Flag for 'metadata_mapping' was mistakenly 'metdata_map'. …
(edit) @5845   21 years mdewsnip David's fixes for running ImagePlug under Windows.
(edit) @5765   21 years mdewsnip Commented out check for spaces in filenames - why is it there?? …