Changeset 24475

Timestamp:
25.08.2011 19:03:02 (8 years ago)
Author:
ak19
Message:

John Thompson's fix for efficient file-reading is useful here too, as he suggested. Tested replace_srcdoc_with_html.pl with ASCII and non-ASCII content in the input (txt) file.

Files:
1 modified

  • main/trunk/greenstone2/bin/script/replace_srcdoc_with_html.pl

    --- replace_srcdoc_with_html.pl (r18590)
    +++ replace_srcdoc_with_html.pl (r24475)
    @@ -192,8 +192,6 @@
         open(FIN,"<$output_filename") or die "replace_srcdoc_with_html.pl: Unable to open $output_filename to ensure utf8...ERROR: $!\n";
         my $html_contents;
    -    {
    -    local $/ = undef;        # Read entire file at once
    -    $html_contents = <FIN>;  # Now file is read in as one single 'line'
    -    &unicode::ensure_utf8(\$html_contents); # turn any high bytes that aren't valid utf-8 into utf-8.
    -    }
    +    # Read in the entire contents of the file in one hit
    +    sysread(FIN, $html_contents, -s FIN);
    +    &unicode::ensure_utf8(\$html_contents); # turn any high bytes that aren't valid utf-8 into utf-8.
         close(FIN);
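The two ways of reading a whole file that this change swaps between can be compared in isolation. The following is a minimal sketch, not code from replace_srcdoc_with_html.pl: the helper names slurp_with_readline and slurp_with_sysread are hypothetical, and lexical filehandles, binmode and the read loop are additions made here for robustness, on the assumption that the input is a regular file whose size -s can report.

```perl
use strict;
use warnings;

# Hypothetical helper: the approach removed in this changeset.
# Undefining $/ makes <FH> return the whole file as one 'line'.
sub slurp_with_readline {
    my ($filename) = @_;
    open(my $fin, '<', $filename) or die "Unable to open $filename: $!\n";
    binmode($fin);                 # treat the content as raw bytes
    local $/ = undef;              # no record separator: read everything
    my $contents = <$fin>;
    close($fin);
    return defined $contents ? $contents : '';
}

# Hypothetical helper: the approach added in this changeset.
# sysread bypasses buffered I/O and asks for the file size in bytes.
sub slurp_with_sysread {
    my ($filename) = @_;
    open(my $fin, '<', $filename) or die "Unable to open $filename: $!\n";
    binmode($fin);
    my $size     = -s $fin;        # size of the open file in bytes
    my $contents = '';
    my $offset   = 0;
    # sysread may return fewer bytes than requested, so loop until done.
    while ($offset < $size) {
        my $n = sysread($fin, $contents, $size - $offset, $offset);
        die "Unable to read $filename: $!\n" unless defined $n;
        last if $n == 0;           # unexpected end of file
        $offset += $n;
    }
    close($fin);
    return $contents;
}
```

Both helpers return the file as a byte string, which could then be passed by reference to a routine such as unicode::ensure_utf8, as the script does. Note that sysread should not be mixed with buffered reads on the same filehandle.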