Changeset 24431

Timestamp: 19.08.2011 17:04:41 (8 years ago)
Author: ak19
Message:

Dr Bainbridge fixed HTMLPlugin so that the empty Keywords and Subject values generated when converting PDFs to HTML are no longer entered into ex.Meta as the escaped, quoted strings Keywords and Subject. Also implemented John Rose's request that PDFPlugin automatically extract the Title, Author, Subject and Keywords metadata fields and store them as ex.Metadata.

Location: main/trunk/greenstone2/perllib/plugins
Files: 2 modified

  • main/trunk/greenstone2/perllib/plugins/HTMLPlugin.pm

    r23835 r24431
    @@ -1256,10 +1256,12 @@
         $value=$2;
     
    -    if (! $value) {
    -        $metatag =~ m/(?:name|http-equiv)\s*=\s*([^\s\>]+)/is;
    -        $value=$1;
    -    }
    -    if (!defined $value) {
    -        print $outhandle "HTMLPlugin: can't find VALUE in \"$metatag\"\n";
    +    # The following code assigns the metaname to value if value is
    +    # empty. Why would we do this?
    +    #if (! $value) {
    +    #    $metatag =~ m/(?:name|http-equiv)\s*=\s*([^\s\>]+)/is;
    +    #    $value=$1;
    +    #}
    +    if (!defined $value || $value eq "") {
    +        print $outhandle "HTMLPlugin: can't find VALUE in <meta $metatag >\n" if ($self->{'verbosity'} > 2);
             next;
         }
  • main/trunk/greenstone2/perllib/plugins/PDFPlugin.pm

    r24419 r24431
    @@ -73,5 +73,5 @@
           'desc' => "{HTMLPlugin.metadata_fields}",
           'type' => "string",
    -      'deft' => "" },
    +      'deft' => "Title,Author,Subject,Keywords" },
         { 'name' => "metadata_field_separator",
           'desc' => "{HTMLPlugin.metadata_field_separator}",
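
The effect of the HTMLPlugin change can be illustrated in isolation. The following is a minimal standalone sketch, not the plugin's actual code: extract_meta and the sample HTML are hypothetical, and the regular expressions are simplified assumptions rather than the ones HTMLPlugin uses. It shows the new behaviour only, in which a meta tag whose content is missing or empty is skipped instead of having its name substituted as the value.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of Greenstone): collect name/content pairs
# from <meta> tags, skipping any tag whose content is missing or empty.
sub extract_meta {
    my ($html) = @_;
    my %meta;

    while ($html =~ m/<meta\s+([^>]*)>/gis) {
        my $metatag = $1;

        # Simplified patterns, assumed for this sketch only.
        my ($name)  = $metatag =~ m/(?:name|http-equiv)\s*=\s*["']?([^"'\s>]+)/i;
        my ($value) = $metatag =~ m/content\s*=\s*"([^"]*)"/i;

        next if !defined $name;

        # Mirrors the new check in the changeset: an undefined or empty
        # value means the tag is ignored rather than given a fallback value.
        next if !defined $value || $value eq "";

        $meta{$name} = $value;
    }
    return \%meta;
}

# Sample input resembling a PDF-to-HTML conversion with empty fields.
my $html = <<'HTML';
<meta name="Title" content="Annual Report">
<meta name="Keywords" content="">
<meta name="Subject" content="">
<meta name="Author" content="A. Writer">
HTML

my $meta = extract_meta($html);
print "$_ => $meta->{$_}\n" for sort keys %$meta;
```

With this input only Author and Title are reported, because the empty Keywords and Subject tags are dropped.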