|
|
@3038
|
22 years |
jrm21 |
Put \" \" around href for srclink, in case the collection name has …
|
|
|
@3037
|
22 years |
jrm21 |
title_sub seems to always get defined by parsargv, so we test that it …
|
|
|
@3019
|
22 years |
jrm21 |
Fixes for Windows - it was having a lot of trouble sorting out …
|
|
|
@2996
|
22 years |
sjboddie |
* empty log message *
|
|
|
@2995
|
22 years |
sjboddie |
Fixed a bug preventing HTML headers from being removed correctly when …
|
|
|
@2994
|
22 years |
jrm21 |
Added some mime types, and gave a url for "the list" of types at iana.org
|
|
|
@2990
|
22 years |
jrm21 |
Do MS Excel using ConvertToPlug, which currently uses the xlhtml package.
|
|
|
@2981
|
22 years |
jrm21 |
Added a minimal powerpoint plugin that causes an external converter to …
|
|
|
@2980
|
22 years |
jrm21 |
Added converted_to, which tells us what format the last input file we …
|
|
|
@2979
|
22 years |
jrm21 |
Use self->converted_to instead of convert_to, in case the file could …
|
|
|
@2975
|
22 years |
jrm21 |
Tidied up usage info to fit in 80 columns. Fixed title_sub stuff, so …
|
|
|
@2974
|
22 years |
jrm21 |
added a newline to soft link error message
|
|
|
@2973
|
22 years |
sjboddie |
Fixed a bug in the Hierarchy classifier
|
|
|
@2956
|
22 years |
jrm21 |
Added Don Gourley's changes for getting Sections to work properly.
|
|
|
@2955
|
22 years |
jrm21 |
Added removeprefix option. Added better usage information of the options.
|
|
|
@2954
|
22 years |
jrm21 |
added a remove_prefix option to strip from metadata before sorting for …
|
|
|
@2925
|
22 years |
sjboddie |
Altered the format of the GreenstoneArchive and …
|
|
|
@2918
|
22 years |
jrm21 |
Add [Title] metadata so that the default format strings will show …
|
|
|
@2916
|
22 years |
jrm21 |
Tidied up the usage output.
|
|
|
@2901
|
22 years |
jrm21 |
We now interpret some LaTeX commands in the input, mostly to do with …
|
|
|
@2899
|
23 years |
sjboddie |
Added Alan Christensen's W3ImagePlug
|
|
|
@2897
|
23 years |
sjboddie |
Added AZCompactSectionList which was contributed by Don Gourley …
|
|
|
@2896
|
23 years |
sjboddie |
Fixed a small bug in the way XMLPlug was implemented - previously it …
|
|
|
@2891
|
23 years |
jrm21 |
Don't print out segment number if verbosity is set to zero.
|
|
|
@2890
|
23 years |
sjboddie |
Added xml_entity function to XMLPlug
|
|
|
@2889
|
23 years |
jrm21 |
Need to define $outhandle before using it in reclassify.
|
|
|
@2888
|
23 years |
sjboddie |
Removed extra white space that was being added inside all <Content> …
|
|
|
@2886
|
23 years |
jrm21 |
Fixed some encoding issues - need to convert to utf-8 after …
|
|
|
@2883
|
23 years |
paynter |
This Plugin can be used to import any file to Greenstone, regardless …
|
|
|
@2882
|
23 years |
paynter |
Compensate for change to "convert" output (size data goes to STDERR …
|
|
|
@2858
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2847
|
23 years |
sjboddie |
Altered EMAILPlug a little so it now treats all text that it used to …
|
|
|
@2846
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2845
|
23 years |
sjboddie |
Caught SplitPlug up with recent changes
|
|
|
@2837
|
23 years |
sjboddie |
added hlist_at_top option to Hierarchy classifier
|
|
|
@2835
|
23 years |
dmm9 |
Corrected pluginfo entry and renamed extract_date to …
|
|
|
@2819
|
23 years |
sjboddie |
Altered HTMLPlug's description_tags option a bit so it should now also …
|
|
|
@2818
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2817
|
23 years |
sjboddie |
Implemented a description_tags option to HTMLPlug for splitting an …
|
|
|
@2816
|
23 years |
sjboddie |
Added cover_image option to BasPlug for associating a jpeg image as a …
|
|
|
@2813
|
23 years |
sjboddie |
Altered RecPlug's -use_metadata_files option to use better XML files …
|
|
|
@2812
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2811
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2810
|
23 years |
sjboddie |
Created GAPlug (and XMLPlug base class) to replace the old GMLPlug. …
|
|
|
@2808
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2804
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2803
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2799
|
23 years |
sjboddie |
Fixed a bug where Word documents containing non-ascii characters …
|
|
|
@2797
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2796
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2795
|
23 years |
sjboddie |
Got ZIPPlug working under Windows
|
|
|
@2793
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2785
|
23 years |
sjboddie |
The build process now creates a summary of how many files were …
|
|
|
@2781
|
23 years |
jrm21 |
oops - left off a '$' at end of a pattern match.
|
|
|
@2779
|
23 years |
jrm21 |
Be a little more flexible when looking for boundary field in a …
|
|
|
@2772
|
23 years |
kjm18 |
changes to enable language specific collectionmeta in collect.cfg …
|
|
|
@2771
|
23 years |
kjm18 |
updated this to include the browselist/doclist stuff that's now in …
|
|
|
@2761
|
23 years |
sjboddie |
added HTMLPlug2 temporarily while testing a new extract_subsections option
|
|
|
@2755
|
23 years |
jrm21 |
import.pl now takes an option for saving file conversion failures to a …
|
|
|
@2754
|
23 years |
jrm21 |
oops - left a debugging statement in there.
|
|
|
@2751
|
23 years |
sjboddie |
Had a go at enriching the default document structure.
Added …
|
|
|
@2735
|
23 years |
sjboddie |
Fixed up bugs I introduced with recent change to BasPlug
|
|
|
@2734
|
23 years |
sjboddie |
Chinese text segmentation is now done whenever language="zh" instead …
|
|
|
@2733
|
23 years |
jrm21 |
minor regex fixes/improvements.
|
|
|
@2732
|
23 years |
jrm21 |
needed <pre> tags when using the text/plain part of a multipart message.
|
|
|
@2730
|
23 years |
jrm21 |
1) Non-ascii characters should now work for any encoding handled by …
|
|
|
@2717
|
23 years |
jrm21 |
Do some email munging - @ symbols become @. Both netscape and IE …
|
|
|
@2713
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2711
|
23 years |
sjboddie |
Removed the "beta" collect.cfg option to avoid awkward questions from …
|
|
|
@2700
|
23 years |
cs025 |
fixed this up for building under windows
|
|
|
@2695
|
23 years |
jrm21 |
Allow spaces in img src=... tags if surrounded with dbl quotes.
|
|
|
@2685
|
23 years |
jrm21 |
Improved regex for when the last category is too small, and we need to …
|
|
|
@2681
|
23 years |
jrm21 |
fixed a few more minor MIME header parsing cases.
|
|
|
@2680
|
23 years |
jrm21 |
1. we escape 'and' chars in headers so greenstone doesn't try to …
|
|
|
@2667
|
23 years |
jrm21 |
protect against < and > chars, as <pre> tags don't preserve them.
|
|
|
@2666
|
23 years |
jrm21 |
Modified phind classifier so that special delimiters are always …
|
|
|
@2662
|
23 years |
jrm21 |
oops, that's a bit stupid (of me) - changed:
if …
|
|
|
@2661
|
23 years |
jrm21 |
added a default block exp of "" so it doesn't inherit HTMLPlugs…
|
|
|
@2658
|
23 years |
jrm21 |
fixed a typo
|
|
|
@2657
|
23 years |
jrm21 |
fixed a bug when #including a macro (ie no "... or <... on the line)
|
|
|
@2652
|
23 years |
jrm21 |
Needed to replace \s with s. Also checked for multipart/related.
|
|
|
@2638
|
23 years |
jrm21 |
typo in regexp broke import... encoding type should have had [\s], …
|
|
|
@2632
|
23 years |
jrm21 |
added an option "-bymonth=1", to group by (eg) 2000-January, …
|
|
|
@2631
|
23 years |
jrm21 |
Don't assume funny dates are 20th C - eg 101 -> 19101 - add to 1900 …
|
|
|
@2630
|
23 years |
jrm21 |
Mime support for multipart messages. Doesn't extract attachments …
|
|
|
@2604
|
23 years |
jrm21 |
when extracting email addresses, we now include people in the .net …
|
|
|
@2601
|
23 years |
jrm21 |
modified usage to not mention HTMLplug blocking rtf.
|
|
|
@2576
|
23 years |
sjboddie |
Moved phind's stopword directory from etc to etc/packages/phind
|
|
|
@2564
|
23 years |
jrm21 |
Added RTFPlug. (It's the smallest one so far - 1511 bytes - yay!)
…
|
|
|
@2539
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2529
|
23 years |
sjboddie |
added quoting to system calls in phind classifier - needed when …
|
|
|
@2525
|
23 years |
kjm18 |
removed unneeded output
|
|
|
@2516
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2515
|
23 years |
sjboddie |
Fixed a couple of bugs/inconsistencies in word and pdf plugins that …
|
|
|
@2510
|
23 years |
sjboddie |
renamed phind's stopwords directory and contents to use Win3.1 …
|
|
|
@2509
|
23 years |
sjboddie |
Fixed (bypassed really) a problem with the phind classifier on windows …
|
|
|
@2507
|
23 years |
sjboddie |
Tidied up the phind client a little more. It now belongs to the …
|
|
|
@2506
|
23 years |
dmm9 |
added writing of collection document list to db (OID browselist)
|
|
|
@2505
|
23 years |
dmm9 |
added collection of collection document list
|
|
|
@2503
|
23 years |
sjboddie |
fixed a small bug in the datelist classifier that caused year ranges …
|
|
|