Changeset 16641

Timestamp:
04.08.2008 12:45:59
Author:
kjdon
Message:

upgraded this (using unicode 4.0) to include more Chinese characters and Japanese and Korean characters

Files:
1 modified

  • gsdl/trunk/perllib/cnseg.pm

    r15894  r16641
    55      55        my $space = 1; # start doesn't need a space
    56      56        foreach $c (@$uniin) {
    57              -     if (($c >= 0x4e00 && $c <= 0x9fa5) ||
    58              -         ($c >= 0xf900 && $c <= 0xfa2d)) {
    59              -         # Chinese character
            57      +     if (($c >= 0x2e80 && $c <= 0xfa6a) || # main east asian codes
            58      +         ($c >= 0x20000 && $c <= 0x2a6d6) || # cjk unified ideographs ext B
            59      +         ($c >= 0x2f800 && $c <= 0x2fa1d)) { #cjk compatibility ideographs supplement
            60      +         # CJK character
    60      61                push (@$out, 0x200b) unless $space;
    61      62                push (@$out, $c);
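
The effect of the widened range test can be tried in isolation with a small sketch. It is not the actual `cnseg.pm` routine: the names `is_cjk` and `segment_sketch` are hypothetical, and because the diff above stops after `push (@$out, $c);`, everything the loop does past that line (the trailing U+200B and the handling of `$space`) is an assumption about one plausible completion. Only the three ranges and the two `push` calls come from the changeset.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the changeset): true if the code point falls
# inside one of the three ranges tested in r16641.
sub is_cjk {
    my ($c) = @_;
    return (($c >= 0x2e80  && $c <= 0xfa6a)  ||    # main east asian codes
            ($c >= 0x20000 && $c <= 0x2a6d6) ||    # cjk unified ideographs ext B
            ($c >= 0x2f800 && $c <= 0x2fa1d)) ? 1 : 0;
}

# Hypothetical wrapper around the loop shown in the diff. Takes a reference
# to an array of Unicode code points and returns a new array reference with
# U+200B (zero width space) separating CJK characters from their neighbours.
sub segment_sketch {
    my ($uniin) = @_;
    my $out   = [];
    my $space = 1;    # start doesn't need a space
    foreach my $c (@$uniin) {
        if (is_cjk($c)) {
            # These two lines mirror the diff.
            push(@$out, 0x200b) unless $space;
            push(@$out, $c);
            # ASSUMPTION: the diff is truncated here. This sketch emits a
            # trailing separator and records that one was just written.
            push(@$out, 0x200b);
            $space = 1;
        } else {
            # ASSUMPTION: other characters are copied through unchanged.
            push(@$out, $c);
            $space = 0;
        }
    }
    return $out;
}

# Example input: "A", two Chinese characters, "B".
my $result = segment_sketch([0x41, 0x4e2d, 0x6587, 0x42]);
print join(" ", map { sprintf("U+%04X", $_) } @$result), "\n";
```

With the old test (0x4e00 to 0x9fa5 and 0xf900 to 0xfa2d), code points such as U+3042 (hiragana) or U+AC00 (Hangul) would have fallen through to the non-CJK branch; under the new first range they are treated like Chinese characters. That first range is contiguous, so it also covers every other block between U+2E80 and U+FA6A.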