#
# ChangeLog for trunk/gsdl/perllib
#
# Generated by Trac 1.4.2
# 2024-06-21T05:49:00+12:00

Tue, 29 Oct 2002 04:23:42 GMT jrm21 [3506]
	* trunk/gsdl/perllib/cfgread.pm (modified)

	need to allow escaped \" inside a multiline "...". Eg ...

Thu, 24 Oct 2002 03:37:12 GMT kjdon [3472]
	* trunk/gsdl/perllib/classify/Phind.pm (added)
	* trunk/gsdl/perllib/classify/phind.pm (deleted)

	renamed phind.pm to Phind.pm in keeping with the names of the other ...

Tue, 01 Oct 2002 03:05:47 GMT jrm21 [3433]
	* trunk/gsdl/perllib/classify/AZList.pm (modified)

	If a metadata value becomes empty (because of the removeprefix ...

Tue, 24 Sep 2002 05:17:39 GMT jrm21 [3430]
	* trunk/gsdl/etc/marctodc.txt (added)
	* trunk/gsdl/perllib/cpan/MARC (added)
	* trunk/gsdl/perllib/cpan/MARC/Batch.pm (added)
	* trunk/gsdl/perllib/cpan/MARC/Field.pm (added)
	* trunk/gsdl/perllib/cpan/MARC/File (added)
	* trunk/gsdl/perllib/cpan/MARC/File.pm (added)
	* trunk/gsdl/perllib/cpan/MARC/File/MicroLIF.pm (added)
	* trunk/gsdl/perllib/cpan/MARC/File/USMARC.pm (added)
	* trunk/gsdl/perllib/cpan/MARC/Lint.pm (added)
	* trunk/gsdl/perllib/cpan/MARC/Record.pm (added)
	* trunk/gsdl/perllib/plugins/MARCPlug.pm (added)

	Added MARCPlug, mostly done by David Bainbridge. It needs a ...

Tue, 17 Sep 2002 08:51:08 GMT sjboddie [3427]
	* trunk/gsdl/perllib/plugins/BasPlug.pm (modified)

	The input encoding will now default to utf8 instead of iso-8859-1. ...

Fri, 13 Sep 2002 03:20:47 GMT jrm21 [3426]
	* trunk/gsdl/perllib/plugins/BibTexPlug.pm (modified)

	Don't add \n to the end of each metadata value.

Tue, 10 Sep 2002 07:15:46 GMT jrm21 [3418]
	* trunk/gsdl/perllib/cfgread.pm (modified)

	Allow fields to stretch over multiple lines if enclosed in double ...

Tue, 10 Sep 2002 01:43:58 GMT jrm21 [3416]
	* trunk/gsdl/perllib/arcinfo.pm (modified)

	Fix up problem if no documents were processed and accepted.

Tue, 10 Sep 2002 01:40:40 GMT jrm21 [3415]
	* trunk/gsdl/perllib/docsave.pm (modified)

	don't try to write to and close an archive file if one wasn't opened ...

Tue, 03 Sep 2002 06:21:24 GMT jrm21 [3414]
	* trunk/gsdl/perllib/plugins/EMAILPlug.pm (modified)

	Need to escape "_" characters so that greenstone doesn't interpret ...

Tue, 03 Sep 2002 06:19:33 GMT jrm21 [3413]
	* trunk/gsdl/perllib/classify/AZCompactList.pm (modified)

	Added "\" to the characters we need to escape for classifying.

Thu, 29 Aug 2002 04:37:54 GMT jrm21 [3411]
	* trunk/gsdl/perllib/plugins/PDFPlug.pm (modified)

	Now takes a "-use_sections" option to make a section per page.

Mon, 26 Aug 2002 04:26:56 GMT sjboddie [3402]
	* trunk/gsdl/bin/script/import.pl (modified)
	* trunk/gsdl/perllib/plugin.pm (modified)

	import.pl now tells user where the fail.log lives

Sun, 25 Aug 2002 23:43:17 GMT sjboddie [3400]
	* trunk/gsdl/bin/script/gsConvert.pl (modified)
	* trunk/gsdl/perllib/plugins/WordPlug.pm (modified)

	WordPlug now handles .dot files as well as .doc files.

Fri, 23 Aug 2002 03:31:53 GMT jrm21 [3398]
	* trunk/gsdl/perllib/plugins/EMAILPlug.pm (modified)

	Oops... the last change to the regex was too permissive... fixed up ...

Fri, 23 Aug 2002 03:22:34 GMT jrm21 [3397]
	* trunk/gsdl/perllib/plugins/EMAILPlug.pm (modified)

	minor change to the regex for marking up urls (to allow #anchor at ...

Tue, 20 Aug 2002 05:09:03 GMT sjboddie [3369]
	* trunk/gsdl/perllib/plugins/HTMLPlug.pm (modified)

	HTMLPlug will no longer prevent metadata extraction when the ...

Wed, 14 Aug 2002 04:46:11 GMT jrm21 [3352]
	* trunk/gsdl/perllib/plugins/EMAILPlug.pm (modified)

	We can now properly handle messages with a content type of ...

Tue, 13 Aug 2002 05:25:35 GMT jrm21 [3351]
	* trunk/gsdl/perllib/plugins/EMAILPlug.pm (modified)

	If a message is in an unsupported encoding, we assume iso8859-1. ...

Tue, 13 Aug 2002 00:28:15 GMT sjboddie [3350]
	* trunk/gsdl/bin/script/gsConvert.pl (modified)
	* trunk/gsdl/perllib/plugins/ConvertToPlug.pm (modified)

	Added -use_strings option to ConvertToPlug. The default behaviour for ...

Mon, 12 Aug 2002 23:29:48 GMT sjboddie [3349]
	* trunk/gsdl/perllib/plugins/HTMLPlug.pm (modified)

	Bug fix.

Fri, 09 Aug 2002 02:05:00 GMT jrm21 [3329]
	* trunk/gsdl/perllib/plugins/EMAILPlug.pm (modified)

	Oops, removed debugging statement!

Fri, 09 Aug 2002 02:04:07 GMT jrm21 [3328]
	* trunk/gsdl/perllib/plugins/EMAILPlug.pm (modified)

	Make sure that sender's name is more than 0 chars long, otherwise use ...

Wed, 31 Jul 2002 15:05:38 GMT davidb [3307]
	* trunk/gsdl/perllib/plugins/ImagePlug.pm (modified)

	Some minor modifications to Image Plugin: filenames can now include ...

Wed, 31 Jul 2002 15:03:31 GMT davidb [3306]
	* trunk/gsdl/perllib/classify/AZCompactList.pm (modified)

	Removed some debugging print statements

Tue, 30 Jul 2002 16:10:41 GMT davidb [3303]
	* trunk/gsdl/perllib/classify/AZCompactList.pm (modified)

	Classifier extended to support frequency sort option through ...

Tue, 30 Jul 2002 15:41:10 GMT davidb [3302]
	* trunk/gsdl/perllib/classify/AZCompactList.pm (modified)

	Classifier modified so it does not include A-Z letters at top of ...

Fri, 12 Jul 2002 03:19:17 GMT jrm21 [3249]
	* trunk/gsdl/perllib/plugins/BibTexPlug.pm (modified)

	1) add a space when joining consecutive lines, just in case. 2) ...

Thu, 11 Jul 2002 06:11:01 GMT jrm21 [3248]
	* trunk/gsdl/perllib/plugins/ConvertToPlug.pm (modified)

	If we convert to HTML, we post-process to change named entities (eg ...

Thu, 11 Jul 2002 05:59:16 GMT jrm21 [3247]
	* trunk/gsdl/perllib/plugins/HTMLPlug.pm (modified)

	Modified automatic title extraction to also recognise utf-8 nbsp as ...

Mon, 08 Jul 2002 05:59:24 GMT jrm21 [3244]
	* trunk/gsdl/perllib/classify/phind.pm (modified)

	we no longer exit with an error if the suffix program failed to ...

Fri, 05 Jul 2002 04:55:56 GMT jrm21 [3226]
	* trunk/gsdl/perllib/mgppbuildproc.pm (modified)

	Don't allow fields Encoding or Language for search - these are ...

Thu, 04 Jul 2002 23:24:42 GMT jrm21 [3215]
	* trunk/gsdl/perllib/plugins/EMAILPlug.pm (modified)

	Fixed up some regexs for mime header encodings - eg people with ...

Tue, 02 Jul 2002 05:15:34 GMT jrm21 [3206]
	* trunk/gsdl/perllib/plugins/EMAILPlug.pm (modified)

	Oops! Bad things were happening when the headers said utf-8 encoding, ...

Tue, 25 Jun 2002 22:08:57 GMT sjboddie [3196]
	* trunk/gsdl/perllib/plugins/HTMLPlug.pm (modified)

	Added &nbsp; to the list of entities that HTMLPlug doesn't convert to ...

Tue, 25 Jun 2002 21:55:21 GMT kjdon [3195]
	* trunk/gsdl/perllib/mgppbuildproc.pm (modified)

	create_shortname (turns a long metadata name into 2 char name) ...

Tue, 25 Jun 2002 08:15:26 GMT sjboddie [3181]
	* trunk/gsdl/perllib/classify/phind.pm (modified)
	* trunk/gsdl/perllib/ghtml.pm (modified)
	* trunk/gsdl/perllib/plugins/HTMLPlug.pm (modified)

	Altered the getcharequiv() function so it now converts entities to ...

Fri, 21 Jun 2002 02:39:08 GMT kjdon [3158]
	* trunk/gsdl/perllib/mgppbuilder.pm (modified)

	the indexfieldmap list is now in sorted order with TextOnly at the ...

Tue, 18 Jun 2002 02:01:26 GMT jrm21 [3156]
	* trunk/gsdl/perllib/plugins/BibTexPlug.pm (modified)

	Added a few extra accented characters, and recognise some bibtex- ...

Sun, 16 Jun 2002 22:28:16 GMT jrm21 [3148]
	* trunk/gsdl/perllib/mgbuildproc.pm (modified)
	* trunk/gsdl/perllib/mgppbuildproc.pm (modified)
	* trunk/gsdl/perllib/plugins/HTMLPlug.pm (modified)

	If a document has associated files that are also given a ...

Tue, 11 Jun 2002 09:16:30 GMT sjboddie [3146]
	* trunk/gsdl/perllib/textcat/id-iso_8859_1.lm (added)
	* trunk/gsdl/perllib/textcat/in-iso_8859_1.lm (deleted)

	textcat now returns "id" for Indonesian instead of "in"

Tue, 11 Jun 2002 04:26:04 GMT kjdon [3144]
	* trunk/gsdl/perllib/mgppbuilder.pm (modified)

	added mgpp's metadata field map to the gdbm file For metadata, it ...

Tue, 11 Jun 2002 02:21:46 GMT jrm21 [3143]
	* trunk/gsdl/perllib/plugins/EMAILPlug.pm (modified)

	Minor tweak for badly formatted dates. We now use a window, so ...

Mon, 10 Jun 2002 06:14:45 GMT jrm21 [3142]
	* trunk/gsdl/perllib/plugins/BibTexPlug.pm (modified)

	1) We can't use "Date" for the year metadata, as greenstone assumes ...

Fri, 07 Jun 2002 00:32:11 GMT paynter [3137]
	* trunk/gsdl/perllib/plugins/ImagePlug.pm (modified)

	Changed the way Width, Height, Size and Type metadata is calculated. ...

Fri, 07 Jun 2002 00:25:02 GMT paynter [3136]
	* trunk/gsdl/perllib/plugins/EMAILPlug.pm (modified)

	Reconciled John's version of my changes to EMAILPlug with my version ...

Thu, 06 Jun 2002 03:34:54 GMT jrm21 [3135]
	* trunk/gsdl/perllib/plugins/HTMLPlug.pm (modified)

	modified process_exp to process php3 -named files too.

Tue, 28 May 2002 03:32:33 GMT jrm21 [3134]
	* trunk/gsdl/perllib/plugins/EMAILPlug.pm (modified)

	1) Convert headers to detected charset if possible. 2) Convert ...

Wed, 22 May 2002 05:27:41 GMT jrm21 [3132]
	* trunk/gsdl/perllib/plugins/EMAILPlug.pm (modified)

	Try to determine the encoding used in the headers in case it is not ...

Wed, 22 May 2002 02:53:12 GMT jrm21 [3130]
	* trunk/gsdl/mappings/from_uc/8859_15.ump (added)
	* trunk/gsdl/mappings/to_uc/8859_15.ump (added)
	* trunk/gsdl/perllib/encodings.pm (modified)

	Added map files for iso-8859-15 encoding, which is basically Latin1 ...

Thu, 16 May 2002 02:44:45 GMT sjboddie [3116]
	* trunk/gsdl/perllib/plugins/RecPlug.pm (modified)

	RecPlug will now die with an error if it finds a metadata.xml file ...

Tue, 14 May 2002 05:42:21 GMT jrm21 [3115]
	* trunk/gsdl/perllib/mgbuilder.pm (modified)
	* trunk/gsdl/perllib/mgppbuilder.pm (modified)

	Redirect mg(pp)_passes stderr to /dev/null if the "-out xxx" option ...

Mon, 13 May 2002 05:06:05 GMT jrm21 [3112]
	* trunk/gsdl/perllib/plugins/BibTexPlug.pm (modified)

	minor changes to formatted values (eg if enclosed in { and } ) and ...

Tue, 07 May 2002 03:33:35 GMT jrm21 [3111]
	* trunk/gsdl/perllib/plugins/EMAILPlug.pm (modified)

	Allow .eml extension (IE and mozilla default to this for individual ...

Mon, 29 Apr 2002 03:38:12 GMT jrm21 [3109]
	* trunk/gsdl/perllib/classify/AZCompactList.pm (modified)
	* trunk/gsdl/perllib/classify/AZList.pm (modified)

	When getting first char for classification, s/^(.).*$/$1/g isn't good ...

Mon, 29 Apr 2002 01:11:51 GMT jrm21 [3108]
	* trunk/gsdl/perllib/plugins/RecPlug.pm (modified)

	Don't recurse into directories if they are symbolic links and point ...

Mon, 29 Apr 2002 00:42:27 GMT jrm21 [3107]
	* trunk/gsdl/perllib/plugins/XMLPlug.pm (modified)

	fixed problem where documents after a "bad" document would not be ...

Wed, 24 Apr 2002 01:45:59 GMT jrm21 [3095]
	* trunk/gsdl/perllib/multiread.pm (modified)

	Added check for reading an empty file (ie read_line() returns undef).

Wed, 24 Apr 2002 01:40:41 GMT jrm21 [3094]
	* trunk/gsdl/perllib/plugins/SplitPlug.pm (modified)

	Needed to add failhandle to the init() function, to pass to BasPlug.

Mon, 22 Apr 2002 01:00:48 GMT nzdl [3086]
	* trunk/gsdl/macros/english.dm (modified)
	* trunk/gsdl/perllib/plugins/BasPlug.pm (modified)

	*** empty log message ***

Wed, 03 Apr 2002 03:44:42 GMT jrm21 [3073]
	* trunk/gsdl/perllib/plugins/EMAILPlug.pm (modified)

	1) Default Title now correctly escapes [ and ] chars. 2) ...

Sun, 03 Mar 2002 23:23:40 GMT jrm21 [3038]
	* trunk/gsdl/perllib/plugins/ConvertToPlug.pm (modified)

	Put \" \" around href for srclink, in case the collection name has ...

Sun, 03 Mar 2002 22:47:12 GMT jrm21 [3037]
	* trunk/gsdl/perllib/plugins/TEXTPlug.pm (modified)

	title_sub seems to always get defined by parsargv, so we test that it ...

Wed, 27 Feb 2002 05:57:30 GMT jrm21 [3019]
	* trunk/gsdl/perllib/plugins/HTMLPlug.pm (modified)

	Fixes for when on windows - it was having a lot of trouble sorting ...

Mon, 25 Feb 2002 20:50:41 GMT sjboddie [2996]
	* trunk/gsdl/perllib/plugins/W3ImgPlug.pm (modified)

	*** empty log message ***

Sat, 23 Feb 2002 21:09:50 GMT sjboddie [2995]
	* trunk/gsdl/perllib/plugins/HTMLPlug.pm (modified)

	Fixed a bug preventing HTML headers from being removed correctly when ...

Thu, 21 Feb 2002 05:13:16 GMT jrm21 [2994]
	* trunk/gsdl/perllib/ghtml.pm (modified)

	Added some mime types, and gave a url for "the list" of types at iana.org

Thu, 21 Feb 2002 04:15:31 GMT jrm21 [2990]
	* trunk/gsdl/perllib/plugins/ExcelPlug.pm (added)

	Do MS Excel using ConvertToPlug, which currently uses the xlhtml package.

Wed, 20 Feb 2002 03:38:48 GMT jrm21 [2981]
	* trunk/gsdl/perllib/plugins/PPTPlug.pm (added)

	Added a minimal powerpoint plugin that causes an external converter ...

Wed, 20 Feb 2002 03:38:08 GMT jrm21 [2980]
	* trunk/gsdl/perllib/plugins/ConvertToPlug.pm (modified)

	Added converted_to, which tells us what format the last input file we ...

Wed, 20 Feb 2002 03:36:58 GMT jrm21 [2979]
	* trunk/gsdl/perllib/plugins/PDFPlug.pm (modified)
	* trunk/gsdl/perllib/plugins/PSPlug.pm (modified)
	* trunk/gsdl/perllib/plugins/RTFPlug.pm (modified)
	* trunk/gsdl/perllib/plugins/WordPlug.pm (modified)

	Use self->converted_to instead of convert_to, in case the file could ...

Wed, 20 Feb 2002 03:23:35 GMT jrm21 [2975]
	* trunk/gsdl/perllib/plugins/HTMLPlug.pm (modified)

	Tidied up usage info to fit in 80 columns. Fixed title_sub stuff, so ...

Wed, 20 Feb 2002 01:03:23 GMT jrm21 [2974]
	* trunk/gsdl/perllib/util.pm (modified)

	added a newline to soft link error message

Tue, 19 Feb 2002 09:15:04 GMT sjboddie [2973]
	* trunk/gsdl/perllib/classify/Hierarchy.pm (modified)

	Fixed a bug in the Hierarchy classifier

Thu, 07 Feb 2002 03:44:47 GMT jrm21 [2956]
	* trunk/gsdl/perllib/classify/AZCompactList.pm (modified)

	Added Don Gourley's changes for getting Sections to work properly.

Thu, 07 Feb 2002 03:28:48 GMT jrm21 [2955]
	* trunk/gsdl/perllib/classify/AZCompactList.pm (modified)

	Added removeprefix option. Added better usage information of the options.

Thu, 07 Feb 2002 03:18:36 GMT jrm21 [2954]
	* trunk/gsdl/perllib/classify/AZList.pm (modified)
	* trunk/gsdl/perllib/classify/AZSectionList.pm (modified)

	added a remove_prefix option to strip from metadata before sorting ...

Wed, 30 Jan 2002 01:23:31 GMT sjboddie [2925]
	* trunk/gsdl/collect/demo/import/metadata.xml (modified)
	* trunk/gsdl/perllib/docsave.pm (modified)
	* trunk/gsdl/perllib/plugins/GAPlug.pm (modified)
	* trunk/gsdl/perllib/plugins/RecPlug.pm (modified)

	Altered the format of the GreenstoneArchive and ...

Thu, 24 Jan 2002 02:28:47 GMT jrm21 [2918]
	* trunk/gsdl/perllib/plugins/EMAILPlug.pm (modified)

	Add [Title] metadata so that the default format strings will show ...

Tue, 22 Jan 2002 22:28:05 GMT jrm21 [2916]
	* trunk/gsdl/perllib/classify/DateList.pm (modified)

	Tidied up the usage output.

Mon, 14 Jan 2002 04:38:47 GMT jrm21 [2901]
	* trunk/gsdl/perllib/plugins/BibTexPlug.pm (modified)

	We now interpret some latex commands in the input, mostly to do with ...

Fri, 14 Dec 2001 00:18:02 GMT sjboddie [2899]
	* trunk/gsdl/perllib/plugins/W3ImgPlug.pm (added)
	* trunk/gsdl/src/recpt/vlistbrowserclass.cpp (modified)

	Added Alan Christensen's W3ImagePlug

Thu, 13 Dec 2001 02:52:28 GMT sjboddie [2897]
	* trunk/gsdl/perllib/classify/AZCompactList.pm (modified)
	* trunk/gsdl/perllib/classify/AZCompactSectionList.pm (added)

	Added AZCompactSectionList which was contributed by Don Gourley ...

Thu, 13 Dec 2001 02:25:43 GMT sjboddie [2896]
	* trunk/gsdl/perllib/plugins/XMLPlug.pm (modified)

	Fixed a small bug in the way XMLPlug was implemented - previously it ...

Mon, 10 Dec 2001 02:38:43 GMT jrm21 [2891]
	* trunk/gsdl/perllib/plugins/SplitPlug.pm (modified)

	Don't print out segment number if verbosity is set to zero.

Mon, 10 Dec 2001 01:28:31 GMT sjboddie [2890]
	* trunk/gsdl/perllib/plugins/XMLPlug.pm (modified)

	Added xml_entity function to XMLPlug

Mon, 10 Dec 2001 00:49:53 GMT jrm21 [2889]
	* trunk/gsdl/perllib/classify/AZCompactList.pm (modified)

	Need to define $outhandle before using it in reclassify.

Thu, 06 Dec 2001 00:45:32 GMT sjboddie [2888]
	* trunk/gsdl/perllib/doc.pm (modified)

	Removed extra white space that was being added inside all ...

Tue, 04 Dec 2001 03:02:17 GMT jrm21 [2886]
	* trunk/gsdl/perllib/plugins/EMAILPlug.pm (modified)

	Fixed some encoding issues - need to convert to utf-8 after un- ...

Mon, 03 Dec 2001 17:30:01 GMT paynter [2883]
	* trunk/gsdl/perllib/plugins/UnknownPlug.pm (added)

	This Plugin can be used to import any file to Greenstone, regardless ...

Mon, 03 Dec 2001 17:29:00 GMT paynter [2882]
	* trunk/gsdl/perllib/plugins/ImagePlug.pm (modified)

	Compensate for change to "convert" output (size data goes to STDERR ...

Mon, 26 Nov 2001 00:48:52 GMT sjboddie [2858]
	* trunk/gsdl/perllib/doc.pm (modified)

	*** empty log message ***

Fri, 23 Nov 2001 03:14:39 GMT sjboddie [2847]
	* trunk/gsdl/perllib/plugins/EMAILPlug.pm (modified)

	Altered EMAILPlug a little so it now treats all text that it used to ...

Fri, 23 Nov 2001 03:10:45 GMT sjboddie [2846]
	* trunk/gsdl/perllib/doc.pm (modified)

	*** empty log message ***

Thu, 22 Nov 2001 23:38:00 GMT sjboddie [2845]
	* trunk/gsdl/perllib/plugins/SplitPlug.pm (modified)

	Caught SplitPlug up with recent changes

Wed, 21 Nov 2001 22:38:42 GMT sjboddie [2837]
	* trunk/gsdl/perllib/classify/Hierarchy.pm (modified)

	added hlist_at_top option to Hierarchy classifier

Wed, 21 Nov 2001 00:11:40 GMT dmm9 [2835]
	* trunk/gsdl/perllib/plugins/BasPlug.pm (modified)

	Corrected pluginfo entry and renamed extract_date to ...

Mon, 05 Nov 2001 09:49:46 GMT sjboddie [2819]
	* trunk/gsdl/perllib/plugins/HTMLPlug.pm (modified)

	Altered HTMLPlug's description_tags option a bit so it should now ...

Mon, 05 Nov 2001 03:31:17 GMT sjboddie [2818]
	* trunk/gsdl/collect/demo/etc/collect.cfg (modified)
	* trunk/gsdl/collect/demo/etc/org.txt (modified)
	* trunk/gsdl/collect/demo/etc/sub.txt (modified)
	* trunk/gsdl/collect/demo/import/bostid (deleted)
	* trunk/gsdl/collect/demo/import/ecourier (deleted)
	* trunk/gsdl/collect/demo/import/faobetf (deleted)
	* trunk/gsdl/collect/demo/import/wb (deleted)
	* trunk/gsdl/perllib/plugins/HTMLPlug2.pm (deleted)
	* trunk/gsdl/perllib/plugins/RecPlug.pm (modified)

	*** empty log message ***

Mon, 05 Nov 2001 03:30:27 GMT sjboddie [2817]
	* trunk/gsdl/perllib/plugins/HTMLPlug.pm (modified)

	Implemented a description_tags option to HTMLPlug for splitting an ...