|
|
@1954
|
23 years |
jmt14 |
* empty log message *
|
|
|
@1929
|
23 years |
dg5 |
Modified: ConvertToPlug and HTMLPlug to handle files in binary mode to …
|
|
|
@1903
|
23 years |
sjboddie |
We now use textcat's best guess if it returns 3 or fewer possibilities …
|
|
|
@1895
|
23 years |
jrm21 |
Email plug now uses SplitPlug for mbox mail files. Hopefully this …
|
|
|
@1894
|
23 years |
jrm21 |
updated by copying BasPlug's new language/encoding stuff over for the …
|
|
|
@1891
|
23 years |
paynter |
Named characters like é and ì are translated
to UTF8 …
|
|
|
@1874
|
23 years |
sjboddie |
* empty log message *
|
|
|
@1870
|
23 years |
sjboddie |
Tidied up language support stuff.
|
|
|
@1869
|
23 years |
paynter |
Regular expression fix.
|
|
|
@1868
|
23 years |
sjboddie |
Made a bunch of changes to the building code to support lots of new …
|
|
|
@1857
|
23 years |
dmm9 |
date extraction options documented
|
|
|
@1855
|
23 years |
paynter |
Trivial change to warning message.
|
|
|
@1846
|
23 years |
sjboddie |
Removed a call to a function that I removed in my previous changes - oops
|
|
|
@1845
|
23 years |
paynter |
Changed a "!=" to a "ne".
|
|
|
@1844
|
23 years |
sjboddie |
Added an 'auto' argument to BasPlug's '-input_encoding' option ('auto' …
|
|
|
@1838
|
23 years |
sjboddie |
Added support for Cyrillic languages (windows codepage 1251) - yet to …
|
|
|
@1812
|
24 years |
sjboddie |
ZIPPlug is now disabled under windows
|
|
|
@1810
|
24 years |
sjboddie |
Fixed a bug that showed up when using Perl 5.6 on windows
|
|
|
@1787
|
24 years |
jrm21 |
"allow_extra_options" missing, to get inherited options
|
|
|
@1758
|
24 years |
say1 |
added minimum image size and a few bug fixes
|
|
|
@1757
|
24 years |
say1 |
tightened the criteria for email files to avoid matching all dynamic …
|
|
|
@1756
|
24 years |
say1 |
added detection and handling of unreadable files
|
|
|
@1755
|
24 years |
say1 |
added better cycle detection (but still not perfect)
|
|
|
@1754
|
24 years |
say1 |
added support for jar files (which are actually just fancy zip files)
|
|
|
@1744
|
24 years |
say1 |
about a billion changes to ImagePlug
|
|
|
@1742
|
24 years |
jrm21 |
Added a comment to the usage stuff about PRESCRIPT.
|
|
|
@1741
|
24 years |
sjboddie |
Fixed a little bug that was causing pluginfo.pl to print some dodgy …
|
|
|
@1740
|
24 years |
jrm21 |
We now escape underscores so that any macros in source code (wrt to …
|
|
|
@1735
|
24 years |
say1 |
fixed about a billion little Image things.
|
|
|
@1733
|
24 years |
say1 |
new plugin for images
|
|
|
@1731
|
24 years |
jrm21 |
New and improved! Now gets #include information from std C files as …
|
|
|
@1730
|
24 years |
jrm21 |
removed a debugging statement left in accidentally…
|
|
|
@1729
|
24 years |
jrm21 |
title regexp should have started "\s*", not "\s+" - it's optional …
|
|
|
@1728
|
24 years |
jrm21 |
Minor change so that leading whitespace is skipped when grabbing the …
|
|
|
@1720
|
24 years |
dmm9 |
Added information to the usage text about date extraction option
|
|
|
@1719
|
24 years |
dmm9 |
Added information to the usage text about date extraction option
|
|
|
@1718
|
24 years |
dmm9 |
Added information to the usage text about date extraction option
|
|
|
@1712
|
24 years |
say1 |
cleaned up metadata extraction.
|
|
|
@1711
|
24 years |
say1 |
fixed minor spelling mistake
|
|
|
@1710
|
24 years |
say1 |
RecPlug now skips CVS directories.
|
|
|
@1707
|
24 years |
jrm21 |
Plugin for source code (primarily for putting Greenstone src into a …
|
|
|
@1706
|
24 years |
say1 |
cleaned up the Title code to strip away standard prefixes inserted by …
|
|
|
@1705
|
24 years |
say1 |
fixed to handle filenames with multiple dots.
|
|
|
@1700
|
24 years |
say1 |
changed PSPlug to extract CreationDate, Title and Pages info.
|
|
|
@1699
|
24 years |
say1 |
fixed the bug in HTML plug which broke images for Dave
|
|
|
@1691
|
24 years |
jrm21 |
return "" instead of exit 1 on error. This means that if 1 file …
|
|
|
@1686
|
24 years |
jrm21 |
HTMLPlug no longer blocks .pdf files. (also updated reference to this …
|
|
|
@1685
|
24 years |
jrm21 |
PSPlug based heavily on PDFPlug…
|
|
|
@1677
|
24 years |
paynter |
Added the BibTex entry type as metadata.
|
|
|
@1676
|
24 years |
paynter |
Plugins for processing files of bibliography records in BibTex and …
|
|
|
@1658
|
24 years |
paynter |
Fixed a bug reading the headers that confused "To" with "In-Reply-To".
|
|
|
@1653
|
24 years |
paynter |
Fixed a few bugs where incorrect variable names were used.
|
|
|
@1609
|
24 years |
say1 |
fixed print_usage
|
|
|
@1605
|
24 years |
say1 |
fixed some of my earlier mistakes. sorry Stefan
|
|
|
@1602
|
24 years |
say1 |
metadata extraction work. (email addresses, generalised HTML tags, …
|
|
|
@1503
|
24 years |
davidb |
A bit of extra error checking.
|
|
|
@1482
|
24 years |
davidb |
Small modification so Index files can be in subdirectories of an …
|
|
|
@1448
|
24 years |
paynter |
Changed regular expressions for extracting metadata from META tags …
|
|
|
@1446
|
24 years |
paynter |
Major overhauls; works with the new gsConvert.pl instead of …
|
|
|
@1436
|
24 years |
davidb |
Due to rearrangement of ConvertTo hierarchy, this file is now redundant.
|
|
|
@1435
|
24 years |
davidb |
Rearrangement of ConvertTo inheritance so HTMLPlug and TextPlug do not …
|
|
|
@1431
|
24 years |
sjboddie |
Made a few minor adjustments to perl building code for use with …
|
|
|
@1424
|
24 years |
sjboddie |
Added a -out option to most of the perl building scripts to allow …
|
|
|
@1420
|
24 years |
davidb |
Moved read_file and read from ConvertToBasPlug to ConvertToPlug.
|
|
|
@1418
|
24 years |
davidb |
Small modification to improve handling of file names with spaces in.
|
|
|
@1417
|
24 years |
davidb |
Additions so ConvertPlug etc. can handle filenames with spaces in them.
|
|
|
@1415
|
24 years |
davidb |
Removed some diagnostic print statements.
|
|
|
@1411
|
24 years |
dmm9 |
added the options for the date extractor
|
|
|
@1410
|
24 years |
davidb |
Introduction of "ConvertTo" family of plugins. This establishes
a new …
|
|
|
@1403
|
24 years |
say1 |
taught HTMLPlug about shtml, asp, cgi, php and html query files …
|
|
|
@1401
|
24 years |
davidb |
Fixed small problem with associated files.
|
|
|
@1400
|
24 years |
davidb |
General tidying of code.
|
|
|
@1396
|
24 years |
say1 |
changed initialisation code for acronyms
|
|
|
@1393
|
24 years |
say1 |
acronym markup functionality
|
|
|
@1384
|
24 years |
paynter |
Changed language extraction to ignore encoding information, so that …
|
|
|
@1379
|
24 years |
paynter |
Fixed bug that gave gsdlsourcedocument metadata relative path instead …
|
|
|
@1360
|
24 years |
say1 |
clarified status messages
|
|
|
@1358
|
24 years |
nzdl |
Fixed bug I recently introduced into HTMLPlug (<pre> tags were being …
|
|
|
@1335
|
24 years |
say1 |
many acronym changes
|
|
|
@1317
|
24 years |
paynter |
Added -extract_language option, which uses the textcat language …
|
|
|
@1312
|
24 years |
sjboddie |
fixed a bug in the HTML plugin that showed up under windows
|
|
|
@1269
|
24 years |
sjboddie |
Added ZIPPlug plugin for handling input documents that have been …
|
|
|
@1245
|
24 years |
sjboddie |
Fixed a bug that davidb found in a couple of regular expressions
|
|
|
@1244
|
24 years |
sjboddie |
Caught up most general plugins (that's the ones in …
|
|
|
@1243
|
24 years |
sjboddie |
Caught HTMLPlug up with BasPlug. A few minor changes to some …
|
|
|
@1242
|
24 years |
sjboddie |
Added Stuart Yeates' acronym extraction code and made it a standard …
|
|
|
@1235
|
24 years |
nzdl |
* empty log message *
|
|
|
@1231
|
24 years |
gwp |
Bug fix on the H1 metadata option: if the file has no <H1> tag, …
|
|
|
@1230
|
24 years |
gwp |
Added an additional H1 metadata field that extracts the text
between …
|
|
|
@1229
|
24 years |
sjboddie |
fixed bug in options
|
|
|
@1227
|
24 years |
sjboddie |
Modified the perl code for importing arabic encoded documents. Plugins …
|
|
|
@1221
|
24 years |
sjboddie |
Added a new HBSPlug which is kind of a generalisation of HBPlug …
|
|
|
@1220
|
24 years |
sjboddie |
Caught HTMLPlug up with the changes I made to BasPlug. HTMLPlug now …
|
|
|
@1219
|
24 years |
sjboddie |
Made BasPlug take options (these options are available to all plugins …
|
|
|
@1206
|
24 years |
gwp |
A thorough rewrite; some of the metadata was flawed in such a way
that …
|
|
|
@1190
|
24 years |
gwp |
The first 200 chars of body text can now be extracted as metadata
by …
|
|
|
@1020
|
24 years |
sjboddie |
changed paths to collection images (again!)
|
|
|
@1010
|
24 years |
sjboddie |
renamed old html module ghtml -- it clashed with builtin html module …
|
|
|
@1006
|
24 years |
sjboddie |
fixed bug in previous changes
|
|
|
@973
|
24 years |
sjboddie |
new path to images
|
|
|