Timestamp:
2006-09-29T15:38:44+12:00
Author:
kjdon
Message:

Added a new -extract_style option to HTMLPlug. It looks for style, script and link tags in the HTML head tag and saves them as ex.DocumentHeader metadata. -metadata_fields can now be used with -description_tags: there is no reason not to have metadata in the header as well as in the description tags, and head metadata can always be turned off using -no_metadata. -hunt_creator_metadata no longer needs the -metadata_fields option to be set.
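
To illustrate what the -extract_style behaviour described above amounts to, here is a minimal sketch in Python. It is not the HTMLPlug code (HTMLPlug is a Perl plugin and its implementation is not shown in this changeset); the class and function names are hypothetical, and the sketch only assumes what the message states: style, script and link tags inside the head are collected into a single header value.

```python
from html.parser import HTMLParser


class HeadStyleExtractor(HTMLParser):
    """Collects <style>, <script> and <link> markup found inside <head>.

    Hypothetical helper for illustration only; not part of Greenstone.
    """

    WANTED = {"style", "script", "link"}

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.capturing = None  # name of the style/script element being copied
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif self.in_head and tag in self.WANTED:
            # Keep the start tag exactly as it appeared in the source.
            self.parts.append(self.get_starttag_text())
            if tag == "link":
                self.parts.append("\n")  # <link> has no content or end tag
            else:
                self.capturing = tag

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False
        elif tag == self.capturing:
            self.parts.append("</%s>\n" % tag)
            self.capturing = None

    def handle_data(self, data):
        # Only copy text that belongs to a <style> or <script> in the head.
        if self.capturing:
            self.parts.append(data)


def extract_document_header(html):
    """Return the head's style, script and link markup as one string."""
    parser = HeadStyleExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()
```

Calling `extract_document_header` on a page returns only the markup taken from the head; a script tag in the body is ignored. In the real plugin the resulting value would be stored as ex.DocumentHeader metadata rather than returned.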

File:
1 edited

  • trunk/gsdl/perllib/strings.properties

    --- trunk/gsdl/perllib/strings.properties (r12817)
    +++ trunk/gsdl/perllib/strings.properties (r12947)
    @@ -778,9 +778,11 @@
     HTMLPlug.desc:This plugin processes HTML files
     
    -HTMLPlug.description_tags:Split document into sub-sections where <Section> tags occur. Note that by setting this option you implicitly set -no_metadata, as all metadata should be included within the <Section> tags. Also, '-keep_head' will have no effect when this option is set.
    +HTMLPlug.description_tags:Split document into sub-sections where <Section> tags occur. '-keep_head' will have no effect when this option is set.
    +
    +HTMLPlug.extract_style:Extract style and script information from the HTML <head> tag and save as DocumentHeader metadata. This will be set in the document page as the _document:documentheader_ macro.
     
     HTMLPlug.file_is_url:Set if input filenames make up url of original source documents e.g. if a web mirroring tool was used to create the import directory structure.
     
    -HTMLPlug.hunt_creator_metadata:Find as much metadata as possible on authorship and place it in the 'Creator' field. Requires the -metadata_fields flag.
    +HTMLPlug.hunt_creator_metadata:Find as much metadata as possible on authorship and place it in the 'Creator' field.
     
     HTMLPlug.keep_head:Don't remove headers from html files.