Changeset 32215

Timestamp:
26.06.2018 13:55:28 (3 months ago)
Author:
ak19
Message:

Before reorganising our PDFPlugin in whatever way we ultimately decide on, committing a version where, in paged_html output mode, the pages produced by Xpdf's pdftohtml are sectionalised by default if the total number of pages is more than 10. Also changing the inserted HTML heading tags so that the page title still appears correctly.

Files:
1 modified

  • main/trunk/greenstone2/perllib/plugins/PDFPlugin.pm

--- main/trunk/greenstone2/perllib/plugins/PDFPlugin.pm (r32210)
+++ main/trunk/greenstone2/perllib/plugins/PDFPlugin.pm (r32215)
@@ -397,4 +397,5 @@
     $start_text .= "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">\n";
     $start_text .= "</head>\n<body>\n\n";
+    $start_text .= "<h1>$output_tailname</h1>\n\n";
 
     #handle content encodings the same way that default_convert_post_process does
@@ -542,10 +543,12 @@
     my $start_range = $page_num - ($page_num % 10) + 1;
     my $end_range = $page_num + 10 - ($page_num % 10);
-    $page_div .= "<h1 style=\"font-size:1em;font-weight:normal;\">Pages ".$start_range . "-" . $end_range."</h1>\n";
-    }
-
-    # Whether we're starting a new bucket or not, add a simpler heading: just the pagenumber, "Page #"
-    $page_div .= "<h2 style=\"font-size:1em;font-weight:normal;\">Page ".$page_num."</h2>\n";
-    $new_dom->at('div')->append_content($new_dom->new_tag('h2', "Page ".$page_num))->root;
+    $page_div .= "<h2 style=\"font-size:1em;font-weight:normal;\">Pages ".$start_range . "-" . $end_range."</h2>\n";
+    }
+
+    # No sectionalising for 10 pages or under. Otherwise, every page is a section too, not just buckets
+    if($num_html_pages > 10) {
+        # Whether we're starting a new bucket or not, add a simpler heading: just the pagenumber, "Page #"
+        $page_div .= "<h3 style=\"font-size:1em;font-weight:normal;\">Page ".$page_num."</h3>\n";
+    }
 
     $page_div .= $inner_div_str;
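
The heading scheme after this change (h1 for the document title, h2 for each bucket of 10 pages, h3 for each page) can be sketched in isolation as below. This is a minimal illustration, not the code in PDFPlugin.pm: the helper name page_headings is hypothetical, and the condition that starts a new bucket is not visible in the diff, so the sketch assumes a bucket begins on pages 1, 11, 21 and so on.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDFPlugin.pm): returns the heading markup
# that would precede the content of one page.
sub page_headings {
    my ($page_num, $num_html_pages) = @_;
    my $style    = 'font-size:1em;font-weight:normal;';
    my $headings = "";

    # ASSUMPTION: a new bucket starts on the first page of each group of 10.
    # The real condition lies outside the lines shown in the changeset.
    if ($page_num % 10 == 1) {
        my $start_range = $page_num - ($page_num % 10) + 1;
        my $end_range   = $page_num + 10 - ($page_num % 10);
        $headings .= "<h2 style=\"$style\">Pages $start_range-$end_range</h2>\n";
    }

    # As in the changeset: per-page headings only when there are more than 10 pages.
    if ($num_html_pages > 10) {
        $headings .= "<h3 style=\"$style\">Page $page_num</h3>\n";
    }

    return $headings;
}

# Page 11 of 25 opens the bucket "Pages 11-20" and gets its own page heading;
# page 12 of 25 gets only the page heading.
print page_headings(11, 25);
print page_headings(12, 25);
```

Under these assumptions, the range formulas give the intended bucket only on the first page of a bucket. For example, a page number that is a multiple of 10 would yield the following bucket's range, which is presumably why the real code computes the range only when a new bucket starts.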