|
|
@3307
|
22 years |
davidb |
Some minor modifications to Image Plugin: filenames can now
include …
|
|
|
@3306
|
22 years |
davidb |
Removed some debugging print statements
|
|
|
@3303
|
22 years |
davidb |
Classifier extended to support frequency sort option through -freqsort …
|
|
|
@3302
|
22 years |
davidb |
Classifier modified so it does not include A-Z letters at top of
page …
|
|
|
@3249
|
22 years |
jrm21 |
1) add a space when joining consecutive lines, just in case.
2) Don't …
|
|
|
@3248
|
22 years |
jrm21 |
If we convert to HTML, we post-process to change named entities (eg …
|
|
|
@3247
|
22 years |
jrm21 |
Modified automatic title extraction to also recognise utf-8 nbsp as …
|
|
|
@3244
|
22 years |
jrm21 |
we no longer exit with an error if the suffix program failed to create …
|
|
|
@3226
|
22 years |
jrm21 |
Don't allow fields Encoding or Language for search - these are internal?!?
|
|
|
@3215
|
22 years |
jrm21 |
Fixed up some regexes for mime header encodings - eg people with …
|
|
|
@3206
|
22 years |
jrm21 |
Oops! Bad things were happening when the headers said utf-8 encoding, …
|
|
|
@3196
|
22 years |
sjboddie |
Added to the list of entities that HTMLPlug doesn't convert to utf-8
|
|
|
@3195
|
22 years |
kjdon |
create_shortname (turns a long metadata name into 2 char name) changed …
|
|
|
@3181
|
22 years |
sjboddie |
Altered the getcharequiv() function so it now converts entities to raw …
|
|
|
@3158
|
22 years |
kjdon |
the indexfieldmap list is now in sorted order with TextOnly at the …
|
|
|
@3156
|
22 years |
jrm21 |
Added a few extra accented characters, and recognise some …
|
|
|
@3148
|
22 years |
jrm21 |
If a document has associated files that are also given a subdirectory, …
|
|
|
@3146
|
22 years |
sjboddie |
textcat now returns "id" for Indonesian instead of "in"
|
|
|
@3144
|
22 years |
kjdon |
added mgpp's metadata field map to the gdbm file
For metadata, it uses …
|
|
|
@3143
|
22 years |
jrm21 |
Minor tweak for badly formatted dates. We now use a window, so …
|
|
|
@3142
|
22 years |
jrm21 |
1) We can't use "Date" for the year metadata, as greenstone assumes …
|
|
|
@3137
|
22 years |
paynter |
Changed the way Width, Height, Size and Type metadata is calculated. …
|
|
|
@3136
|
22 years |
paynter |
Reconciled John's version of my changes to EMAILPlug with my version …
|
|
|
@3135
|
22 years |
jrm21 |
modified process_exp to process php3-named files too.
|
|
|
@3134
|
22 years |
jrm21 |
1) Convert headers to detected charset if possible.
2) Convert header …
|
|
|
@3132
|
22 years |
jrm21 |
Try to determine the encoding used in the headers in case it is not …
|
|
|
@3130
|
22 years |
jrm21 |
Added map files for iso-8859-15 encoding, which is basically Latin1 …
|
|
|
@3116
|
22 years |
sjboddie |
RecPlug will now die with an error if it finds a metadata.xml file …
|
|
|
@3115
|
22 years |
jrm21 |
Redirect mg(pp)_passes stderr to /dev/null if the "-out xxx" option is …
|
|
|
@3112
|
22 years |
jrm21 |
minor changes to formatted values (eg if enclosed in { and } ) and …
|
|
|
@3111
|
22 years |
jrm21 |
Allow .eml extension (IE and mozilla default to this for individual …
|
|
|
@3109
|
22 years |
jrm21 |
When getting first char for classification, s/(.).*$/$1/g isn't good …
|
|
|
@3108
|
22 years |
jrm21 |
Don't recurse into directories if they are symbolic links and point …
|
|
|
@3107
|
22 years |
jrm21 |
fixed problem where documents after a "bad" document would not be
read …
|
|
|
@3095
|
22 years |
jrm21 |
Added check for reading an empty file (ie read_line() returns undef).
|
|
|
@3094
|
22 years |
jrm21 |
Needed to add failhandle to the init() function, to pass to BasPlug.
|
|
|
@3086
|
22 years |
nzdl |
* empty log message *
|
|
|
@3073
|
22 years |
jrm21 |
1) Default Title now correctly escapes [ and ] chars.
2) …
|
|
|
@3038
|
22 years |
jrm21 |
Put \" \" around href for srclink, in case the collection name has …
|
|
|
@3037
|
22 years |
jrm21 |
title_sub seems to always get defined by parsargv, so we test that it …
|
|
|
@3019
|
22 years |
jrm21 |
Fixes for when running on Windows - it was having a lot of trouble sorting out …
|
|
|
@2996
|
22 years |
sjboddie |
* empty log message *
|
|
|
@2995
|
22 years |
sjboddie |
Fixed a bug preventing HTML headers from being removed correctly when …
|
|
|
@2994
|
22 years |
jrm21 |
Added some mime types, and gave a url for "the list" of types at iana.org
|
|
|
@2990
|
22 years |
jrm21 |
Do MS Excel using ConvertToPlug, which currently uses the xlhtml package.
|
|
|
@2981
|
22 years |
jrm21 |
Added a minimal powerpoint plugin that causes an external converter to …
|
|
|
@2980
|
22 years |
jrm21 |
Added converted_to, which tells us what format the last input file we …
|
|
|
@2979
|
22 years |
jrm21 |
Use self->converted_to instead of convert_to, in case the file could …
|
|
|
@2975
|
22 years |
jrm21 |
Tidied up usage info to fit in 80 columns. Fixed title_sub stuff, so …
|
|
|
@2974
|
22 years |
jrm21 |
added a newline to soft link error message
|
|
|
@2973
|
22 years |
sjboddie |
Fixed a bug in the Hierarchy classifier
|
|
|
@2956
|
22 years |
jrm21 |
Added Don Gourley's changes for getting Sections to work properly.
|
|
|
@2955
|
22 years |
jrm21 |
Added removeprefix option. Added better usage information of the options.
|
|
|
@2954
|
22 years |
jrm21 |
added a remove_prefix option to strip from metadata before sorting for …
|
|
|
@2925
|
22 years |
sjboddie |
Altered the format of the GreenstoneArchive and …
|
|
|
@2918
|
22 years |
jrm21 |
Add [Title] metadata so that the default format strings will show …
|
|
|
@2916
|
22 years |
jrm21 |
Tidied up the usage output.
|
|
|
@2901
|
22 years |
jrm21 |
We now interpret some LaTeX commands in the input, mostly to do with …
|
|
|
@2899
|
22 years |
sjboddie |
Added Alan Christensen's W3ImagePlug
|
|
|
@2897
|
22 years |
sjboddie |
Added AZCompactSectionList which was contributed by Don Gourley …
|
|
|
@2896
|
22 years |
sjboddie |
Fixed a small bug in the way XMLPlug was implemented - previously it …
|
|
|
@2891
|
22 years |
jrm21 |
Don't print out segment number if verbosity is set to zero.
|
|
|
@2890
|
22 years |
sjboddie |
Added xml_entity function to XMLPlug
|
|
|
@2889
|
22 years |
jrm21 |
Need to define $outhandle before using it in reclassify.
|
|
|
@2888
|
22 years |
sjboddie |
Removed extra white space that was being added inside all <Content> …
|
|
|
@2886
|
22 years |
jrm21 |
Fixed some encoding issues - need to convert to utf-8 after …
|
|
|
@2883
|
22 years |
paynter |
This Plugin can be used to import any file to Greenstone, regardless …
|
|
|
@2882
|
22 years |
paynter |
Compensate for change to "convert" output (size data goes to STDERR …
|
|
|
@2858
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2847
|
23 years |
sjboddie |
Altered EMAILPlug a little so it now treats all text that it used to …
|
|
|
@2846
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2845
|
23 years |
sjboddie |
Caught SplitPlug up with recent changes
|
|
|
@2837
|
23 years |
sjboddie |
added hlist_at_top option to Hierarchy classifier
|
|
|
@2835
|
23 years |
dmm9 |
Corrected pluginfo entry and renamed extract_date to …
|
|
|
@2819
|
23 years |
sjboddie |
Altered HTMLPlug's description_tags option a bit so it should now also …
|
|
|
@2818
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2817
|
23 years |
sjboddie |
Implemented a description_tags option to HTMLPlug for splitting an …
|
|
|
@2816
|
23 years |
sjboddie |
Added cover_image option to BasPlug for associating a jpeg image as a …
|
|
|
@2813
|
23 years |
sjboddie |
Altered RecPlug's -use_metadata_files option to use better XML files …
|
|
|
@2812
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2811
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2810
|
23 years |
sjboddie |
Created GAPlug (and XMLPlug base class) to replace the old GMLPlug. …
|
|
|
@2808
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2804
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2803
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2799
|
23 years |
sjboddie |
Fixed a bug where Word documents containing non-ascii characters …
|
|
|
@2797
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2796
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2795
|
23 years |
sjboddie |
Got ZIPPlug working under Windows
|
|
|
@2793
|
23 years |
sjboddie |
* empty log message *
|
|
|
@2785
|
23 years |
sjboddie |
The build process now creates a summary of how many files were …
|
|
|
@2781
|
23 years |
jrm21 |
oops - left off a '$' at end of a pattern match.
|
|
|
@2779
|
23 years |
jrm21 |
Be a little more flexible when looking for boundary field in a …
|
|
|
@2772
|
23 years |
kjm18 |
changes to enable language specific collectionmeta in collect.cfg …
|
|
|
@2771
|
23 years |
kjm18 |
updated this to include the browselist/doclist stuff that's now in …
|
|
|
@2761
|
23 years |
sjboddie |
added HTMLPlug2 temporarily while testing a new extract_subsections option
|
|
|
@2755
|
23 years |
jrm21 |
import.pl now takes an option for saving file conversion failures to a …
|
|
|
@2754
|
23 years |
jrm21 |
oops - left a debugging statement in there.
|
|
|
@2751
|
23 years |
sjboddie |
Had a go at enriching the default document structure.
Added …
|
|
|
@2735
|
23 years |
sjboddie |
Fixed up bugs I introduced with recent change to BasPlug
|
|
|