|
|
@2858
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2847
|
23 years |
sjboddie |
Altered EMAILPlug a little so it now treats all text that it used to …
|
|
|
@2846
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2845
|
23 years |
sjboddie |
Caught SplitPlug up with recent changes
|
|
|
@2837
|
23 years |
sjboddie |
added hlist_at_top option to Hierarchy classifier
|
|
|
@2835
|
23 years |
dmm9 |
Corrected pluginfo entry and renamed extract_date to …
|
|
|
@2819
|
23 years |
sjboddie |
Altered HTMLPlug's description_tags option a bit so it should now also …
|
|
|
@2818
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2817
|
23 years |
sjboddie |
Implemented a description_tags option to HTMLPlug for splitting an …
|
|
|
@2816
|
23 years |
sjboddie |
Added cover_image option to BasPlug for associating a jpeg image as a …
|
|
|
@2813
|
23 years |
sjboddie |
Altered RecPlug's -use_metadata_files option to use better XML files …
|
|
|
@2812
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2811
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2810
|
23 years |
sjboddie |
Created GAPlug (and XMLPlug base class) to replace the old GMLPlug. …
|
|
|
@2808
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2804
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2803
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2799
|
23 years |
sjboddie |
Fixed a bug where Word documents containing non-ascii characters …
|
|
|
@2797
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2796
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2795
|
23 years |
sjboddie |
Got ZIPPlug working under windows
|
|
|
@2793
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2785
|
23 years |
sjboddie |
The build process now creates a summary of how many files were …
|
|
|
@2781
|
23 years |
jrm21 |
oops - left off a '$' at end of a pattern match.
|
|
|
@2779
|
23 years |
jrm21 |
Be a little more flexible when looking for boundary field in a …
|
|
|
@2772
|
23 years |
kjm18 |
changes to enable language specific collectionmeta in collect.cfg …
|
|
|
@2771
|
23 years |
kjm18 |
updated this to include the browselist/doclist stuff that's now in …
|
|
|
@2761
|
23 years |
sjboddie |
added HTMLPlug2 temporarily while testing a new extract_subsections option
|
|
|
@2755
|
23 years |
jrm21 |
import.pl now takes an option for saving file conversion failures to a …
|
|
|
@2754
|
23 years |
jrm21 |
oops - left a debugging statement in there.
|
|
|
@2751
|
23 years |
sjboddie |
Had a go at enriching the default document structure.
Added …
|
|
|
@2735
|
23 years |
sjboddie |
Fixed up bugs I introduced with recent change to BasPlug
|
|
|
@2734
|
23 years |
sjboddie |
Chinese text segmentation is now done whenever language="zh" instead …
|
|
|
@2733
|
23 years |
jrm21 |
minor regex fixes/improvements.
|
|
|
@2732
|
23 years |
jrm21 |
needed <pre> tags when using the text/plain part of a multipart message.
|
|
|
@2730
|
23 years |
jrm21 |
1) Non-ascii characters should now work for any encoding handled by …
|
|
|
@2717
|
23 years |
jrm21 |
Do some email munging - @ symbols become @. Both netscape and IE …
|
|
|
@2713
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2711
|
23 years |
sjboddie |
Removed the "beta" collect.cfg option to avoid awkward questions from …
|
|
|
@2700
|
23 years |
cs025 |
fixed this up for building under windows
|
|
|
@2695
|
23 years |
jrm21 |
Allow spaces in img src=... tags if surrounded with dbl quotes.
|
|
|
@2685
|
23 years |
jrm21 |
Improved regex for when the last category is too small, and we need to …
|
|
|
@2681
|
23 years |
jrm21 |
fixed a few more minor MIME header parsing cases.
|
|
|
@2680
|
23 years |
jrm21 |
1. we escape 'and' chars in headers so greenstone doesn't try to …
|
|
|
@2667
|
23 years |
jrm21 |
protect against < and > chars, as <pre> tags don't preserve them.
|
|
|
@2666
|
23 years |
jrm21 |
Modified phind classifier so that special delimiters are always …
|
|
|
@2662
|
23 years |
jrm21 |
oops, that's a bit stupid (of me) - changed:
if …
|
|
|
@2661
|
23 years |
jrm21 |
added a default block exp of "" so it doesn't inherit HTMLPlugs…
|
|
|
@2658
|
23 years |
jrm21 |
fixed a typo
|
|
|
@2657
|
23 years |
jrm21 |
fixed a bug when #including a macro (ie no "... or <... on the line)
|
|
|
@2652
|
23 years |
jrm21 |
Needed to replace \s with s. Also checked for multipart/related.
|
|
|
@2638
|
23 years |
jrm21 |
typo in regexp broke import... encoding type should have had [\s], …
|
|
|
@2632
|
23 years |
jrm21 |
added an option "-bymonth=1", to group by (eg) 2000-January, …
|
|
|
@2631
|
23 years |
jrm21 |
Don't assume funny dates are 20th C - eg 101 -> 19101 - add to 1900 …
|
|
|
@2630
|
23 years |
jrm21 |
Mime support for multipart messages. Doesn't extract attachments …
|
|
|
@2604
|
23 years |
jrm21 |
when extracting email addresses, we now include people in the .net …
|
|
|
@2601
|
23 years |
jrm21 |
modified usage to not mention HTMLplug blocking rtf.
|
|
|
@2576
|
23 years |
sjboddie |
Moved phind's stopword directory from etc to etc/packages/phind
|
|
|
@2564
|
23 years |
jrm21 |
Added RTFPlug. (It's the smallest one so far - 1511 bytes - yay!)
…
|
|
|
@2539
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2529
|
23 years |
sjboddie |
added quoting to system calls in phind classifier - needed when …
|
|
|
@2525
|
23 years |
kjm18 |
removed unneeded output
|
|
|
@2516
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2515
|
23 years |
sjboddie |
Fixed a couple of bugs/inconsistencies in word and pdf plugins that …
|
|
|
@2510
|
23 years |
sjboddie |
renamed phind's stopwords directory and contents to use Win3.1 …
|
|
|
@2509
|
23 years |
sjboddie |
Fixed (bypassed really) a problem with the phind classifier on windows …
|
|
|
@2507
|
23 years |
sjboddie |
Tidied up the phind client a little more. It now belongs to the …
|
|
|
@2506
|
23 years |
dmm9 |
added writing of collection document list to db (OID browselist)
|
|
|
@2505
|
23 years |
dmm9 |
added collection of collection document list
|
|
|
@2503
|
23 years |
sjboddie |
fixed a small bug in the datelist classifier that caused year ranges …
|
|
|
@2500
|
23 years |
sjboddie |
Removed test for phindcgi from phind classifier as it is no longer used
|
|
|
@2493
|
23 years |
paynter |
Changed at the request of Marcio - see mailing list.
|
|
|
@2492
|
23 years |
paynter |
Fixed trivial bug in the new set_OID function.
|
|
|
@2489
|
23 years |
dmm9 |
adding the browse interface as a classifier option
|
|
|
@2487
|
23 years |
sjboddie |
Changes to get phind working under windows
|
|
|
@2484
|
23 years |
say1 |
Changed SplitPlug to allow control over the OID. Changed BibTexPlug to …
|
|
|
@2483
|
23 years |
say1 |
added an "if" to catch the case where someone tries to convert an …
|
|
|
@2481
|
23 years |
kjm18 |
changed mgpp system calls to use the new executable names
|
|
|
@2480
|
23 years |
kjm18 |
added the store_text option as done in mgbuildproc.pm
|
|
|
@2479
|
23 years |
kjm18 |
added indexmap and indexfieldmap to build.cfg fields
|
|
|
@2478
|
23 years |
kjm18 |
brought it in line with changes to buildcol.pl, mgbuilder.pm
now uses …
|
|
|
@2453
|
23 years |
jrm21 |
Slightly smarter title extraction from body's text.
|
|
|
@2452
|
23 years |
jrm21 |
-title_sub works now -- previously had a leading "--" argument, which …
|
|
|
@2451
|
23 years |
jrm21 |
PSPlug now uses the -title_sub option to TEXTPlug, to remove any …
|
|
|
@2450
|
23 years |
jrm21 |
now accepts the "-title_sub" option, a regexp to remove when …
|
|
|
@2432
|
23 years |
say1 |
switched the order of removing the symbolic link and checking for …
|
|
|
@2412
|
23 years |
sjboddie |
Added a tar archive of all the perl modules required to make ping.pl work
|
|
|
@2364
|
23 years |
jrm21 |
turn "\" into " " so that we don't lose backslashes along the way…
|
|
|
@2363
|
23 years |
jrm21 |
fixed nasty bug where </srclink></a><srclink> was being matched …
|
|
|
@2359
|
23 years |
sjboddie |
Altered the help text a little for mkcol.pl, import.pl, buildcol.pl, …
|
|
|
@2356
|
23 years |
sjboddie |
Renamed HBSPlug to BookPlug in the hope that it's a little less cryptic
|
|
|
@2355
|
23 years |
sjboddie |
All options to import.pl and buildcol.pl may now be specified from …
|
|
|
@2342
|
23 years |
sjboddie |
renamed HTMLPlug's w3mir option to file_is_url
|
|
|
@2336
|
23 years |
sjboddie |
added a -no_text option to buildcol.pl to allow collections to be …
|
|
|
@2333
|
23 years |
kjm18 |
closed all filehandles that had remained open, to fix the bug that was …
|
|
|
@2327
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2326
|
23 years |
sjboddie |
fixed a small bug in the new XML gml code that caused metadata tags …
|
|
|
@2267
|
23 years |
davidb |
GML file syntax altered to be XML compliant. This basically meant …
|
|
|
@2241
|
23 years |
sjboddie |
Tidied up the ConvertToPlug stuff to get it working on Windows 95/98
|
|
|
@2237
|
23 years |
sjboddie |
Added a unicode2koi8r function to unicode.pm (because I needed one). …
|
|
|