Changeset 3247


Timestamp:
2002-07-11T17:59:16+12:00
Author:
jrm21
Message:

Modified automatic title extraction to also recognise utf-8 nbsp as well as
just plain HTML &nbsp;

File:
1 edited

  • trunk/gsdl/perllib/plugins/HTMLPlug.pm

    r3196 r3247
    @@ -579,9 +579,10 @@
             $tmptext =~ s/<\/([^>]+)><\1>//g; # (eg) </b><b> - no space
             $tmptext =~ s/<[^>]*>/ /g;
    -        $tmptext =~ s/&nbsp;/ /g;
    +        $tmptext =~ s/(?:&nbsp;|\xc2\xa0)/ /g; # utf-8 for nbsp...
             $tmptext =~ s/^\s+//s;
             $tmptext =~ s/\s+$//;
             $tmptext =~ s/\s+/ /gs;
    -        $tmptext =~ s/$self->{'title_sub'}// if ($self->{'title_sub'});
    +        $tmptext =~ s/^$self->{'title_sub'}// if ($self->{'title_sub'});
    +        $tmptext =~ s/^\s+//s; # in case title_sub introduced any...
             $tmptext = substr ($tmptext, 0, 100);
             $tmptext =~ s/\s\S*$/.../;
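
The effect of the two changes can be tried outside the plugin. What follows is a minimal sketch, not the plugin's actual code path: `clean_title` is a hypothetical standalone helper, its `$title_sub` argument stands in for `$self->{'title_sub'}`, the sample strings are invented, and the final truncation steps (`substr` and the `...` suffix) are left out so the cleanup itself stays visible.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper mirroring the r3247 cleanup steps (not part of HTMLPlug.pm).
# $title_sub is an optional regex string, standing in for $self->{'title_sub'}.
sub clean_title {
    my ($tmptext, $title_sub) = @_;

    $tmptext =~ s/<\/([^>]+)><\1>//g;             # (eg) </b><b> - no space
    $tmptext =~ s/<[^>]*>/ /g;                    # any remaining tag becomes a space
    $tmptext =~ s/(?:&nbsp;|\xc2\xa0)/ /g;        # entity, or the raw utf-8 bytes for nbsp
    $tmptext =~ s/^\s+//s;                        # trim leading whitespace
    $tmptext =~ s/\s+$//;                         # trim trailing whitespace
    $tmptext =~ s/\s+/ /gs;                       # collapse internal runs of whitespace
    $tmptext =~ s/^$title_sub// if ($title_sub);  # anchored: only a leading match is removed
    $tmptext =~ s/^\s+//s;                        # in case title_sub left leading whitespace
    return $tmptext;
}

# Invented sample: a raw utf-8 nbsp, an &nbsp; entity, and a leading "Page 3:" prefix.
my $raw = "<p>\xc2\xa0Page 3:&nbsp;<b>Soil</b><b> erosion</b> in the uplands</p>";
print clean_title($raw, 'Page \d+:'), "\n";   # prints "Soil erosion in the uplands"
```

Because the `title_sub` pattern is now anchored with `^`, a title such as `Index of Page 4: maps` passes through unchanged in this sketch, whereas the unanchored r3196 form would have removed the `Page 4:` in the middle of it.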