Changeset 3248


Timestamp:
2002-07-11T18:11:01+12:00 (22 years ago)
Author:
jrm21
Message:

If we convert to HTML, we post-process to change named entities (e.g. &eacute;)
to utf-8.

File:
1 edited

  • trunk/gsdl/perllib/plugins/ConvertToPlug.pm

    r3038  r3248
    272    272    }
    273    273
           274    # if we converted to HTML, convert &eacute; and etc to utf-8.
           275    # this should really happen before language_extraction, but that means
           276    # modifying a file on disk...
           277    $text =~ s/&([^;]+);/&ghtml::getcharequiv($1,0)/ge;
           278
    274    279    # create a new document
    275    280    my $doc_obj = new doc ($conv_filename, "indexed_doc");
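
The substitution added in r3248 can be tried in isolation with a small stand-alone sketch. This is only an illustration under stated assumptions: `entity_to_utf8`, `convert_named_entities`, and the four-entry `%ENTITY_CODEPOINT` table are hypothetical stand-ins, not the real `ghtml::getcharequiv` or its table, and the pattern is deliberately narrowed from `[^;]+` to `\w+` so that a stray `&` followed much later by a `;` is not treated as an entity.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the lookup done by ghtml::getcharequiv:
# a tiny table of named entities mapped to Unicode code points.
# The real table is not reproduced here.
my %ENTITY_CODEPOINT = (
    eacute => 0xE9,
    egrave => 0xE8,
    uuml   => 0xFC,
    ntilde => 0xF1,
);

# Return the UTF-8 byte sequence for a named entity, or the entity
# text unchanged when the name is not in the table.
sub entity_to_utf8 {
    my ($name) = @_;
    return "&$name;" unless exists $ENTITY_CODEPOINT{$name};
    my $char = chr($ENTITY_CODEPOINT{$name});
    utf8::encode($char);    # in place: character string -> UTF-8 bytes
    return $char;
}

# Replace every &name; in the text using s///ge, the same mechanism as
# the line added in the changeset. Assumption: \w+ instead of [^;]+.
sub convert_named_entities {
    my ($text) = @_;
    $text =~ s/&(\w+);/entity_to_utf8($1)/ge;
    return $text;
}

my $html = "caf&eacute; &amp; cr&egrave;me";
print convert_named_entities($html), "\n";
```

Names missing from the table (such as `&amp;` here) are left untouched in this sketch. Whether the real routine behaves the same way for unknown names is not shown in the changeset.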