source: trunk/gsdl/perllib

Diff Rev Age Author Log Message
(edit) @2096   23 years jrm21 Minor changes to regexs, so that header fields have to be at start of …
(edit) @2086   23 years jrm21 We create a copy of any args to new() because parsargs might modify …
(edit) @2085   23 years jrm21 When importing, we need to escape any escape codes otherwise mg(?) …
(edit) @2084   23 years jrm21 usage message is now formatted to fit within 80 columns.
(edit) @2083   23 years paynter Fixed a stupid mistake that I know I've fixed before.
(edit) @2082   23 years jrm21 added bzip2 support (untested).
(edit) @2080   23 years jrm21 When creating nodes, now need to pass -buttonname instead of -title.
(edit) @2079   23 years paynter Added a new binary field to the savephrases output that indicates …
(edit) @2064   23 years paynter Sort thesaurus phrases by frequency then type.
(edit) @2048   23 years sjboddie * empty log message *
(edit) @2041   23 years jrm21 don't strip all whitespace from tmp filename, only from base name. …
(edit) @2040   23 years sjboddie * empty log message *
(edit) @2039   23 years jrm21 do eval{symlink()} because platforms that don't support symlink …
(edit) @2036   23 years jrm21 don't use strict; anymore, as we want to be able to write error msgs …
(edit) @2029   23 years jrm21 Return 0 instead of "" on error in read() so that RecPlug can continue.
(edit) @2027   23 years jrm21 read() is now completely independent of BasPlug::read(), as the latter …
(edit) @2025   23 years paynter You can now have several phind classifiers on one collection. This …
(edit) @2024   23 years paynter Store classifier-specific parameters in gdbm file if required. …
(edit) @2022   23 years sjboddie Caught some of the classifiers up with the documentation (finally). …
(edit) @2018   23 years jrm21 removed "use BasPlug" lines from metadata extractors, as they …
(edit) @2008   23 years paynter Marginally better support for non-English documents.
(edit) @2007   23 years sjboddie * empty log message *
(edit) @2001   23 years sjboddie Added a hack that mysteriously converts iso639 language codes …
(edit) @2000   23 years sjboddie Re-added iso639.pm
(edit) @1999   23 years sjboddie Fixed a small problem with language detection code.
(edit) @1995   23 years jmt14 * empty log message *
(edit) @1989   23 years jmt14 * empty log message *
(edit) @1974   23 years cs025 Fixed omission of encoding from parameters in read_file
(edit) @1973   23 years kjm18 fixed up language stuff
(edit) @1972   23 years jmt14 * empty log message *
(edit) @1954   23 years jmt14 * empty log message *
(edit) @1949   23 years paynter Fixed bug that prevented tokeniser from distinguishing between languages.
(edit) @1948   23 years jrm21 Updated to now pass arguments using the new parsargv list format, …
(edit) @1947   23 years dmm9 updated documentation
(edit) @1929   23 years dg5 Modified: ConvertToPlug and HTMLPlug to handle files in binary mode to …
(edit) @1920   23 years sjboddie * empty log message *
(edit) @1919   23 years sjboddie * empty log message *
(edit) @1917   23 years kjm18 minor changes
(edit) @1905   23 years sjboddie * empty log message *
(edit) @1904   23 years sjboddie Added support for a couple more encodings that I'm told are in common …
(edit) @1903   23 years sjboddie We now use textcats best guess if it returns 3 or less possibilities …
(edit) @1901   23 years sjboddie * empty log message *
(edit) @1897   23 years paynter Convert_gml_into_tokens function a little more language tolerant, and …
(edit) @1895   23 years jrm21 Email plug now uses SplitPlug for mbox mail files. Hopefully this …
(edit) @1894   23 years jrm21 updated by copying BasPlug's new language/encoding stuff over for the …
(edit) @1891   23 years paynter Named characters like &eacute; and &igrave; are translated to UTF8 …
(edit) @1890   23 years paynter When multiple metadata fields have multiple values, get them all. …
(edit) @1885   23 years paynter Added a classinfo.pl script, analogous to pluginfo.pl, that provides …
(edit) @1884   23 years paynter Added some documentation.
(edit) @1883   23 years paynter Supports new parameters of suffix program and new stopword file …
(edit) @1874   23 years sjboddie * empty log message *
(edit) @1871   23 years paynter Use two-letter codes for language names, updated docs.
(edit) @1870   23 years sjboddie Tidied up language support stuff.
(edit) @1869   23 years paynter Regular expression fix.
(edit) @1868   23 years sjboddie Made a bunch of changes to the building code to support lots of new …
(edit) @1857   23 years dmm9 date extraction options documented
(edit) @1855   23 years paynter Trivial change to warning message.
(edit) @1852   23 years kjm18 heaps of changes
(edit) @1851   23 years kjm18 added levels and buildtype for mgpp collections
(edit) @1846   23 years sjboddie Removed a call to a function that I removed in my previous changes - oops
(edit) @1845   23 years paynter Changed a "!=" to a "ne".
(edit) @1844   23 years sjboddie Added an 'auto' argument to BasPlug's '-input_encoding' option ('auto' …
(edit) @1843   23 years sjboddie Re-included some languages for which we had removed support
(edit) @1840   23 years paynter Changed default suffix size, clean up phrases.3 file
(edit) @1839   23 years paynter Updated classifiers to use the parsearg library instead of ad-hoc …
(edit) @1838   23 years sjboddie Added support for Cyrillic languages (windows codepage 1251) - yet to …
(edit) @1829   23 years paynter Accept a "thesaurus=name" option that identifies a thesaurus in a …
(edit) @1812   23 years sjboddie ZIPPlug is now disabled under windows
(edit) @1810   23 years sjboddie Fixed a bug that showed up when using Perl 5.6 on windows
(edit) @1808   23 years paynter Option to save the phind phrases to a text file.
(edit) @1803   23 years paynter Moved the phind classifier's data directory into the index directory. …
(edit) @1799   23 years sjboddie fixed a little bug in the building code that caused an endless loop if …
(edit) @1787   23 years jrm21 "allow_extra_options" missing, to get inherited options
(edit) @1778   23 years sjboddie Implemented the new MailServer, LogEvents, EmailEvents and …
(edit) @1772   23 years kjm18 removed Paragraph stuff - now only has Document and Section; added …
(edit) @1762   23 years sjboddie Added support for the new LogEvents, EmailEvents, EmailUserEvents and …
(edit) @1758   23 years say1 added minimum image size and a few bug fixes
(edit) @1757   23 years say1 tightened the criteria for email files to avoid matching all dynamic …
(edit) @1756   23 years say1 added detection and handling of unreadable files
(edit) @1755   23 years say1 added better cycle detection (but still not perfect)
(edit) @1754   23 years say1 added support for jar files (which are actually just fancy zip files)
(edit) @1744   23 years say1 about a billion changes to ImagePlug
(edit) @1742   23 years jrm21 Added a comment to the usage stuff about PRESCRIPT.
(edit) @1741   23 years sjboddie Fixed a little bug that was causing pluginfo.pl to print some dodgy …
(edit) @1740   23 years jrm21 We now escape underscores so that any macros in source code (wrt to …
(edit) @1735   23 years say1 fixed about a billion little Image things.
(edit) @1733   23 years say1 new plugin for images
(edit) @1732   23 years say1 check metadata before adding
(edit) @1731   23 years jrm21 New and improved! Now gets #include information from std C files as …
(edit) @1730   23 years jrm21 removed a debugging statement left in accidentally…
(edit) @1729   23 years jrm21 title regexp should have started "\s*", not "\s+" - it's optional …
(edit) @1728   23 years jrm21 Minor change so that leading whitespace is skipped when grabbing the …
(edit) @1720   23 years dmm9 Added information to the usage text about date extraction option
(edit) @1719   23 years dmm9 Added information to the usage text about date extraction option
(edit) @1718   23 years dmm9 Added information to the usage text about date extraction option
(edit) @1716   23 years jrm21 minor change to allow the -title option to display correctly on HTML page.
(edit) @1712   23 years say1 cleaned up metadata extraction.
(edit) @1711   23 years say1 fixed minor spelling mistake
(edit) @1710   23 years say1 RecPlug now skips CVS directories.
(edit) @1707   23 years jrm21 Plugin for source code (primarily for putting Greenstone src into a …