|
|
@8647
|
19 years |
mdewsnip |
Added a "-newest_first" option to DateList for reverse chronological …
|
|
|
@8646
|
19 years |
mdewsnip |
Made ISISPlug.pm a bit more robust to crap files.
|
|
|
@8563
|
20 years |
mdewsnip |
Ripped all the obtaining referenced documents and exploding database …
|
|
|
@8519
|
20 years |
mdewsnip |
Fixed the extra Title metadata problem with David's help.
|
|
|
@8518
|
20 years |
chi |
A new program to deal with export.pl function.
|
|
|
@8517
|
20 years |
chi |
Add and modify methods to deal with exporting GS collections to "METS" …
|
|
|
@8516
|
20 years |
chi |
Add new messages for export.pl function.
|
|
|
@8515
|
20 years |
chi |
Add a new method metadata_read to identify any specific or extra …
|
|
|
@8514
|
20 years |
chi |
Modify the namespace in METS file as "gsdl3"
|
|
|
@8513
|
20 years |
chi |
Add a method metadata_read in order to go straight to BasPlug and …
|
|
|
@8512
|
20 years |
chi |
Add a new metadata_read method in the first pass in order to identify …
|
|
|
@8511
|
20 years |
chi |
A new plugin to import the collections in DSpace format to GS2.
|
|
|
@8510
|
20 years |
chi |
Add a new method metadata_read to deal with specific (or external) …
|
|
|
@8509
|
20 years |
chi |
Add new methods (with a smart_block option) to store the blocked …
|
|
|
@8504
|
20 years |
chi |
Modification of METS format in order to be compatible with GS3. Also, …
|
|
|
@8502
|
20 years |
kjdon |
changed mets:FLocate to mets:FLocat
|
|
|
@8479
|
20 years |
kjdon |
fixed a typo in the arg list which meant it didn't work with -xml
|
|
|
@8446
|
20 years |
kjdon |
new classifier:AutoHierarchy. Does the same thing as Hierarchy …
|
|
|
@8445
|
20 years |
kjdon |
fixed a bug I introduced with the remove_empty_classifications thing - …
|
|
|
@8402
|
20 years |
kjdon |
fixed up the header page stuff with pagedimgplug - docs always have a …
|
|
|
@8366
|
20 years |
kjdon |
added script to the list of tags to process as relative links, and js …
|
|
|
@8365
|
20 years |
kjdon |
put double quotes around values of <a href=xxx> and <img src=xxx>
|
|
|
@8363
|
20 years |
kjdon |
renamed build option 'allclassifications' to …
|
|
|
@8362
|
20 years |
kjdon |
added a new option to the phind classifier: min_occurs. this is the …
|
|
|
@8361
|
20 years |
kjdon |
renamed build option 'allclassifications' to …
|
|
|
@8350
|
20 years |
kjdon |
assign the fall back title after processing any other metadata, so …
|
|
|
@8315
|
20 years |
mdewsnip |
Was adding Source metadata twice.
|
|
|
@8278
|
20 years |
jrm21 |
sanity check for a valid date before trying to add it as metadata, …
|
|
|
@8275
|
20 years |
cs025 |
Avoids problems with 'oai' being visible better than the previous version.
|
|
|
@8252
|
20 years |
kjdon |
changed the pagedimgplug -noheaderpage to -headerpage
|
|
|
@8246
|
20 years |
kjdon |
changed the default to have noheaderpage, so the option is now …
|
|
|
@8245
|
20 years |
kjdon |
a few fixes for problems found on Ian's laptop
|
|
|
@8227
|
20 years |
jrm21 |
all perl things should "use strict;" to catch errors!
$cursection was …
|
|
|
@8226
|
20 years |
jrm21 |
tell HTMLPlug to extract the author metadata, and rename it to Creator.
|
|
|
@8225
|
20 years |
jrm21 |
support tag<tagname> as described in the pluginfo for HTMLPlug. The …
|
|
|
@8221
|
20 years |
cs025 |
Added AllList to provide a universal list of all documents, which …
|
|
|
@8220
|
20 years |
cs025 |
Extensions to underpin OAI - e.g. creation of the OAI classifier, …
|
|
|
@8218
|
20 years |
jrm21 |
use the unicode::ensure_utf8() function on the extracted text so we …
|
|
|
@8217
|
20 years |
jrm21 |
added a safety check to ensure_utf8()
|
|
|
@8171
|
20 years |
mdewsnip |
FileFormat metadata for PostScript files should now be set correctly.
|
|
|
@8170
|
20 years |
mdewsnip |
Fixed some of the new FileFormat metadata so you only get one value …
|
|
|
@8166
|
20 years |
mdewsnip |
Added FileSize metadata in most plugins.
|
|
|
@8154
|
20 years |
kjdon |
added a bit more to the sortmeta description
|
|
|
@8145
|
20 years |
mdewsnip |
Check for ImageMagick being installed and on the path, and bail early …
|
|
|
@8139
|
20 years |
mdewsnip |
Now adds NumPages metadata.
|
|
|
@8138
|
20 years |
mdewsnip |
Added FileFormat metadata.
|
|
|
@8121
|
20 years |
chi |
Add the "FileFormat" metadata to each of the Plugins.
|
|
|
@8119
|
20 years |
jrm21 |
allow multiple callbacks, one for each metadata field (using the …
|
|
|
@8117
|
20 years |
mdewsnip |
Fixed a bug where extra dots in filenames would cause the file …
|
|
|
@8102
|
20 years |
mdewsnip |
Unfinished, but I'm committing it now so I don't lose it.
|
|
|
@8098
|
20 years |
jrm21 |
* guess a title if no \title tag
* \it tag
* fractions in maths mode
|
|
|
@8097
|
20 years |
jrm21 |
added extra accent for \"i
|
|
|
@8094
|
20 years |
jrm21 |
fix errors with uninitialised variables if 'saveas' not specified.
…
|
|
|
@8090
|
20 years |
davidb |
Switching RecPlug over to using XMLParser wrapper rather than …
|
|
|
@8087
|
20 years |
mdewsnip |
On Windows we use the XML::Parser stuff in bin/windows/perl/lib rather …
|
|
|
@8079
|
20 years |
davidb |
docsave.pm had been saving both GA and METS format. if-statement …
|
|
|
@8072
|
20 years |
davidb |
Support for building collections with lucene.
|
|
|
@8071
|
20 years |
davidb |
When title metadata is derived from first 100 chars of text,
extra =~ …
|
|
|
@8069
|
20 years |
davidb |
Introduction of XMLParser.pm as a wrapper for XML::Parser (a standard …
|
|
|
@7966
|
20 years |
mdewsnip |
Updated my fix from yesterday, so the collections will work correctly …
|
|
|
@7955
|
20 years |
davidb |
Introduction of some strings for various new options to command line …
|
|
|
@7954
|
20 years |
davidb |
Minor tweak to regular expression that modifies white space.
|
|
|
@7953
|
20 years |
davidb |
Minor modifications relating to storing buildtype information. …
|
|
|
@7949
|
20 years |
mdewsnip |
Added a bit of a hack for the wv 0.7.1 bug under Windows that causes …
|
|
|
@7933
|
20 years |
mdewsnip |
Win32::Shortcut 0.03 module (from CPAN) to support RecPlug resolving …
|
|
|
@7932
|
20 years |
mdewsnip |
A first cut at modifying RecPlug to resolve Windows shortcuts …
|
|
|
@7929
|
20 years |
davidb |
doc.pm modified so filename stored under gsdlsourcefilename is local …
|
|
|
@7911
|
20 years |
mdewsnip |
Finally got around to committing Christy Kuo's (cyk2) COMP517-03B …
|
|
|
@7909
|
20 years |
mdewsnip |
CPAN module for processing XPath expressions.
|
|
|
@7905
|
20 years |
chi |
New entries in dictionary for -saveas METS option.
|
|
|
@7904
|
20 years |
chi |
Minor changes to layout of code.
|
|
|
@7903
|
20 years |
chi |
Added unescaping routine for HTML entities < > &. Used in METSPlug.
|
|
|
@7902
|
20 years |
chi |
Saving of documents (in archive format) extended to generate METS …
|
|
|
@7901
|
20 years |
chi |
Greenstone2 now supports METS format as an archiving option. …
|
|
|
@7900
|
20 years |
chi |
Some minor changes in preparation for the introduction of METSPlug.
|
|
|
@7835
|
20 years |
jrm21 |
record mdoffset for each document when adding to a sub list. List.pm …
|
|
|
@7830
|
20 years |
jrm21 |
change a couple of error messages to using gsprintf translated strings …
|
|
|
@7829
|
20 years |
jrm21 |
use strict and declare all vars (think this fixes a "not a glob …
|
|
|
@7828
|
20 years |
jrm21 |
use strict (caught an error/typo).
use perl's Exporter module, so we …
|
|
|
@7818
|
20 years |
jrm21 |
improvements to the handling of textcat's guessed encoding
|
|
|
@7817
|
20 years |
jrm21 |
new ensure_utf8() function was returning the wrong thing, so the utf8 …
|
|
|
@7815
|
20 years |
jrm21 |
added a comment... ascii2utf8 takes iso-8859-1, not just plain ascii.
|
|
|
@7798
|
20 years |
jrm21 |
added a function, unicode::ensure_utf8(), that will test that the …
|
|
|
@7704
|
20 years |
jrm21 |
oops... we were failing on documents that start with a 0 (zero)...
if …
|
|
|
@7703
|
20 years |
jrm21 |
1) use the email's message ID instead of document hash for Identifier. …
|
|
|
@7702
|
20 years |
jrm21 |
handle metadata values that start with a "-", instead of screwing up …
|
|
|
@7701
|
20 years |
jrm21 |
allow utf-8 as an alias for utf8
|
|
|
@7693
|
20 years |
mdewsnip |
Improvements to the new code so that it works on Windows as well as Unix.
|
|
|
@7692
|
20 years |
mdewsnip |
Fixed up the parsing of quoted strings so strings like "Hello " (with …
|
|
|
@7688
|
20 years |
mdewsnip |
Fixed a small bug in the code I just added.
|
|
|
@7686
|
20 years |
mdewsnip |
First cut at upgrading the CDS/ISIS plugin to obtain and index …
|
|
|
@7685
|
20 years |
jrm21 |
make sure warning/error messages go to $outhandle.
|
|
|
@7683
|
20 years |
jrm21 |
better code for reading in the config file, so not just unix-specific.
|
|
|
@7682
|
20 years |
mdewsnip |
Added strings for the new RecPlug and ISISPlug options to support …
|
|
|
@7681
|
20 years |
jrm21 |
more portable way of reading config file
|
|
|
@7668
|
20 years |
jrm21 |
renamed "kea" to "Keyphrase" metadata, and add one for each extracted …
|
|
|
@7645
|
20 years |
jrm21 |
don't fail if we can't load the diagnostics package.
|
|
|
@7644
|
20 years |
jrm21 |
don't print "wrong encoding" message for text in english.
textcat …
|
|
|
@7640
|
20 years |
mdewsnip |
Removed the reference to WebPlug, which no longer exists.
|
|
|
@7595
|
20 years |
mdewsnip |
Seem to have fixed the problem with anchors being added to images (for …
|
|
|