Changeset 15869

05.06.2008 09:21:50 (12 years ago)

Plugin overhaul: BasPlug has been split into several base plugins.

PrintInfo just does the printing, and does the argument parsing in the constructor. All plugins and supporting extractors etc. inherit directly or indirectly from this.

AbstractPlugin adds a few methods to this and is used by the Directory and ArchivesInf plugins. These are not really plugins, so can we remove them? Anyway, not sure if AbstractPlugin will live for very long.

BasePlugin is a proper base plugin with read and read_into_doc_obj methods. It does nothing with reading in the file or textcat stuff. It makes a basic doc obj and adds some metadata. It also handles all the blocking stuff, associate ext stuff etc. Binary plugins can implement the process method to do file-specific stuff.

AutoExtractMetadata inherits from BasePlugin and adds automatic metadata extraction using the new Extractor plugins.

ReadTextFile is the equivalent in functionality to the old BasPlug: it does lang and encoding extraction, and reads in the file. It inherits from AutoExtractMetadata.

If your file type is binary and will have no text, then inherit from BasePlugin. If it's binary but ends up with text (e.g. using convert_to), then inherit from AutoExtractMetadata. If your file is a text type file, then inherit from ReadTextFile.
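
The hierarchy and the inheritance guidance above can be sketched roughly as follows. This is an illustrative Python sketch only, not the actual plugin code: the class names and the read, read_into_doc_obj and process method names come from the message above, while the method bodies, the dict used as a doc obj, the "Source" metadata key and the choose_base helper are all assumptions made for illustration. The message does not say whether BasePlugin inherits from PrintInfo directly or via AbstractPlugin; the sketch assumes directly.

```python
class PrintInfo:
    """Printing, plus argument parsing in the constructor."""

    def __init__(self, **args):
        # Placeholder for argument parsing: just keep the options.
        self.args = dict(args)


class AbstractPlugin(PrintInfo):
    """Adds a few methods to PrintInfo (used by Directory and ArchivesInf)."""


class BasePlugin(PrintInfo):  # assumption: direct parent is PrintInfo
    """Proper base plugin: makes a basic doc obj and adds some metadata."""

    def read(self, filename):
        return self.read_into_doc_obj(filename)

    def read_into_doc_obj(self, filename):
        # Hypothetical doc obj: a plain dict with one metadata entry.
        doc = {"metadata": {"Source": filename}, "text": ""}
        self.process(filename, doc)
        return doc

    def process(self, filename, doc):
        """Hook that binary plugins override for file-specific work."""


class AutoExtractMetadata(BasePlugin):
    """BasePlugin plus automatic metadata extraction (body omitted)."""


class ReadTextFile(AutoExtractMetadata):
    """Also does language/encoding extraction and file reading (body omitted)."""


def choose_base(is_binary, has_text):
    """Hypothetical helper encoding the guidance at the end of the message."""
    if not is_binary:
        return ReadTextFile
    return AutoExtractMetadata if has_text else BasePlugin
```

Under these assumptions, a plugin for a binary format with no text would subclass BasePlugin and override process, while a plugin for a binary format that is converted to text (e.g. using convert_to) would subclass AutoExtractMetadata instead.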

1 added