Timestamp:
2019-07-23T17:29:18+12:00
Author:
ak19
Message:

Better comments. Tested macronised and unmacronised versions of the Māori language test string: both are detected as mri, but the unmacronised version is detected with lower confidence (about 0.596 vs 0.683). Added a note on that in the README.

File:
1 edited

  • gs3-extensions/maori-lang-detection/README.txt

r33339 r33350
@@ -38,7 +38,24 @@
 
 
-
-
-For reading materials, see the OLD README section below.
+For links to background reading materials, see the OLD README section further below.
+
+
+NOTE: The OpenNLP Language Detection Model can detect non-macronised Māori text too,
+but as anticipated, the same text produces a lower confidence level for the language prediction. Compare:
+
+$maori-lang-detection/src>java -cp ".:$OPENNLP_HOME/lib/opennlp-tools-1.9.1.jar" MaoriTextDetector -
+   Waiting to read text from STDIN... (press Ctrl-D when done entering text)>
+   Ko tenei te Whare Wananga o Waikato e whakatau nei i nga iwi o te ao, ki roto i te riu o te awa e rere nei, ki runga i te whenua e hora nei, ki raro i te taumaru o nga maunga whakaruru e tau awhi nei.
+   Best language: mri
+   Best language confidence: 0.5959533972070814
+   Exitting program with returnVal 0...
+
+$maori-lang-detection/src>java -cp ".:$OPENNLP_HOME/lib/opennlp-tools-1.9.1.jar" MaoriTextDetector -
+   Waiting to read text from STDIN... (press Ctrl-D when done entering text)>
+   Ko tēnei te Whare Wānanga o Waikato e whakatau nei i ngā iwi o te ao, ki roto i te riu o te awa e rere nei, ki runga i te whenua e hora nei, ki raro i te taumaru o ngā maunga whakaruru e tau awhi nei.
+   Best language: mri
+   Best language confidence: 0.6825737450092515
+   Exitting program with returnVal 0...
+
 
 -------------------------
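
The comparison in the README note needs the same sentence with and without macrons. Below is a minimal sketch of one way to derive the unmacronised variant from the macronised one, using only the standard java.text.Normalizer class. The class name MacronStripper and the method stripMacrons are hypothetical names chosen for illustration; they are not part of MaoriTextDetector, and nothing here indicates how the unmacronised test string was actually produced.

```java
import java.text.Normalizer;

// Hypothetical helper for illustration only; not part of MaoriTextDetector.
class MacronStripper {

    // Returns the text with macrons removed (e.g. "ā" becomes "a"),
    // leaving other diacritics untouched.
    static String stripMacrons(String text) {
        // NFD decomposes "ā" into "a" followed by U+0304 (combining macron).
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        // Drop only the combining macron, then recompose what remains.
        String stripped = decomposed.replace("\u0304", "");
        return Normalizer.normalize(stripped, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // "Ko tēnei te Whare Wānanga o Waikato", written with Unicode escapes
        // so the source compiles regardless of the file encoding.
        String macronised = "Ko t\u0113nei te Whare W\u0101nanga o Waikato";
        // prints: Ko tenei te Whare Wananga o Waikato
        System.out.println(stripMacrons(macronised));
    }
}
```

Both variants could then be fed to the detector on STDIN, as in the README note, to compare the reported confidence values.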