source: trunk/gsdl/perllib

Diff Rev Age Author Log Message
(edit) @8090   20 years davidb Switching RecPlug over to using XMLParser wrapper rather than …
(edit) @8087   20 years mdewsnip On Windows we use the XML::Parser stuff in bin/windows/perl/lib rather …
(edit) @8079   20 years davidb docsave.pm had been saving both GA and METS format. if-statement …
(edit) @8072   20 years davidb Support for building collections with lucene.
(edit) @8071   20 years davidb When title metadata is derived from first 100 chars of text, extra =~ …
(edit) @8069   20 years davidb Introduction of XMLParser.pm as a wrapper for XML::Parser (a standard …
(edit) @7966   20 years mdewsnip Updated my fix from yesterday, so the collections will work correctly …
(edit) @7955   20 years davidb Introduction of some strings for various new options to command line …
(edit) @7954   20 years davidb Minor tweak to regular expression that modifies white space.
(edit) @7953   20 years davidb Minor modifications relating to storing buildtype information. …
(edit) @7949   20 years mdewsnip Added a bit of a hack for the wv 0.7.1 bug under Windows that causes …
(edit) @7933   20 years mdewsnip Win32::Shortcut 0.03 module (from CPAN) to support RecPlug resolving …
(edit) @7932   20 years mdewsnip A first cut at modifying RecPlug to resolve Windows shortcuts …
(edit) @7929   20 years davidb doc.pm modified so filename stored under gsdlsourcefilename is local …
(edit) @7911   20 years mdewsnip Finally got around to committing Christy Kuo's (cyk2) COMP517-03B …
(edit) @7909   20 years mdewsnip CPAN module for processing XPath expressions.
(edit) @7905   20 years chi New entries in dictionary for -saveas METS option.
(edit) @7904   20 years chi Minor changes to layout of code.
(edit) @7903   20 years chi Added unescaping routine for HTML entities < > &. Used in METSPlug.
(edit) @7902   20 years chi Saving of documents (in archive format) extended to generate METS …
(edit) @7901   20 years chi Greenstone2 now supports METS format as an archiving option. …
(edit) @7900   20 years chi Some minor changes in preparation for the introduction of METSPlug.
(edit) @7835   20 years jrm21 record mdoffset for each document when adding to a sub list. List.pm …
(edit) @7830   20 years jrm21 change a couple of error messages to using gsprintf translated strings …
(edit) @7829   20 years jrm21 use strict and declare all vars (think this fixes a "not a glob …
(edit) @7828   20 years jrm21 use strict (caught an error/typo). use perl's Exporter module, so we …
(edit) @7818   20 years jrm21 improvements to the handling of textcat's guessed encoding
(edit) @7817   20 years jrm21 new ensure_utf8() function was returning the wrong thing, so the utf8 …
(edit) @7815   20 years jrm21 added a comment... ascii2utf8 takes iso-8859-1, not just plain ascii.
(edit) @7798   20 years jrm21 added a function, unicode::ensure_utf8(), that will test that the …
(edit) @7704   20 years jrm21 oops... we were failing on documents that start with a 0 (zero)... if …
(edit) @7703   20 years jrm21 1) use the email's message ID instead of document hash for Identifier. …
(edit) @7702   20 years jrm21 handle metadata values that start with a "-", instead of screwing up …
(edit) @7701   20 years jrm21 allow utf-8 as an alias for utf8
(edit) @7693   20 years mdewsnip Improvements to the new code so that it works on Windows as well as Unix.
(edit) @7692   20 years mdewsnip Fixed up the parsing of quoted strings so strings like "Hello " (with …
(edit) @7688   20 years mdewsnip Fixed a small bug in the code I just added.
(edit) @7686   20 years mdewsnip First cut at upgrading the CDS/ISIS plugin to obtain and index …
(edit) @7685   20 years jrm21 make sure warning/error messages go to $outhandle.
(edit) @7683   20 years jrm21 better code for reading in the config file, so not just unix-specific.
(edit) @7682   20 years mdewsnip Added strings for the new RecPlug and ISISPlug options to support …
(edit) @7681   20 years jrm21 more portable way of reading config file
(edit) @7668   20 years jrm21 renamed "kea" to "Keyphrase" metadata, and add one for each extracted …
(edit) @7645   20 years jrm21 don't fail if we can't load the diagnostics package.
(edit) @7644   20 years jrm21 don't print "wrong encoding" message for text in english. textcat …
(edit) @7640   20 years mdewsnip Removed the reference to WebPlug, which no longer exists.
(edit) @7595   20 years mdewsnip Seem to have fixed the problem with anchors being added to images (for …
(edit) @7589   20 years kjdon some util stuff for the two unbuild scripts (though only v2 uses it at …
(edit) @7580   20 years kjdon added a new function: generate_title_from_metadata to BasClas - takes …
(edit) @7577   20 years kjdon added min range values to the numeric args
(edit) @7572   20 years mdewsnip Changed regular expression so lines entirely in UTF-8 are still read …
(edit) @7571   20 years kjdon original filename is now used for gsdlsourcefilename, and converted …
(edit) @7570   20 years kjdon now sets the converted filename to be used for hashing
(edit) @7569   20 years kjdon can now set gsdlconvertedfilename - gsdlsourcefilename is the original …
(edit) @7560   20 years kjdon added a (simple) description for LaTeXPlug
(edit) @7559   20 years kjdon added use BasPlug
(edit) @7557   20 years kjdon when adding metadata to the list, use 'if @{}' instead of 'if defined …
(edit) @7556   20 years kjdon set the original filename as gsdlsourcefilename so that GLI can assign …
(edit) @7555   20 years kjdon removed ugly quotes from a printed message
(edit) @7553   20 years kjdon apparently 80 columns is too wide, use 64
(edit) @7549   20 years kjdon new/renamed/redescribed metadata options to AZCompactList
(edit) @7548   20 years kjdon renamed the onlyfirst and allmetadata options to be firstvalueonly and …
(edit) @7547   20 years kjdon the text::wrap module is not a standard one apparently, so I have made …
(edit) @7544   20 years kjdon comma separated list of metadata now uses all values of the first …
(edit) @7541   20 years jrm21 new plugin. Eventually I'll get bibtex to inherit from latexplug …
(edit) @7533   20 years kjdon wrap the lines of the record at 80 columns
(edit) @7528   20 years davidb Plugin was setting title twice (derived from filename and extracted …
(edit) @7518   20 years jrm21 fixed typo in entity name (been there for years... :p )
(edit) @7513   20 years davidb User specified fields need to be converted to all uppercase to be …
(edit) @7508   20 years kjdon changed the plugin metadata - instead of having eg HTMLPlug metadata …
(edit) @7507   20 years kjdon filename_cat now checks for an empty first path - so don't get an …
(edit) @7506   20 years kjdon changed 'dummy text to sidestep display bug' to 'this document has no …
(edit) @7504   20 years davidb ImagePlug, MP3Plug, UnknownPlug modified to set Title metadata based …
(edit) @7498   20 years kjdon oops, accidentally committed a testing return statement which meant that …
(edit) @7497   20 years kjdon removed BasClas.metadata.deft from the buttonname default - now its …
(edit) @7496   20 years davidb srcicon metadata added for mp3 icon
(edit) @7492   20 years davidb giget.pm is a module for accessing Google Images. Used by MP3Plug to …
(edit) @7490   20 years davidb Language specific text labels for MP3Plugin added.
(edit) @7488   20 years davidb Perl module from CPAN for extracting ID3 tags from MP3 files.
(edit) @7487   20 years davidb Plugin for processing MP3 files. Based on UnknownPlug, it …
(edit) @7459   20 years mdewsnip Changed ReferPlug.desc to ReferPlug.longdesc (not used, but should …
(edit) @7458   20 years mdewsnip Added another case for when the dbt file doesn't exist.
(edit) @7409   20 years davidb Abstract field missing from Collage options (fix, and set to noi -- …
(edit) @7407   20 years davidb Additional strings added for Collage classifier.
(edit) @7406   20 years davidb Two changes made: the first to allow compact nodes to be formed …
(edit) @7405   20 years davidb Labels for strings.rb tidied up.
(edit) @7363   20 years kjdon plugin read functions now return 'undef' - didn't recognise, '-1' - …
(edit) @7362   20 years kjdon plugin read functions now return 'undef' - didn't recognise, '-1' - …
(edit) @7359   20 years kjdon changed some strings in plugin to do with import messages
(edit) @7353   20 years kjdon removed all the old print_usage functions - they may be misleading as …
(edit) @7352   20 years kjdon added a bit more to the description
(edit) @7351   20 years kjdon tell the GLI if nothing processed the document
(edit) @7346   20 years davidb Collage specific code made more general.
(edit) @7345   20 years davidb Unused code removed.
(edit) @7337   20 years kjdon fixed another error in the example format statement
(edit) @7336   20 years kjdon fixed a couple of errors in the example format statement
(edit) @7331   20 years davidb Collage classifier extended to include "dynamic" parameters: that is …
(edit) @7304   20 years kjdon now adds Source metadata
(edit) @7287   20 years mdewsnip Now extracts date metadata from the output of pdftohtml 0.36 and …
(edit) @7244   20 years kjdon made this abstract so it doesn't show up in the GLI plugin list