#
# This is a sample user dictionary for Kuromoji (JapaneseTokenizer)
#
# Add entries to this file in order to override the statistical model in terms
# of segmentation, readings and part-of-speech tags. Notice that entries do
# not have weights since they are always used when found. This is by design,
# in order to maximize ease of use.
#
# Entries are defined using the following CSV format:
# <text>,<token 1> ... <token n>,<reading 1> ... <reading n>,<part-of-speech tag>
#
# Notice that a single half-width space separates tokens and readings, and
# that the number of tokens and the number of readings must match exactly.
#
# Also notice that the behavior of multiple entries with the same <text> is
# undefined.
#
# Whitespace-only lines are ignored. Comments are not allowed on entry lines.
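#
# As an illustration (a hypothetical entry, not part of the original sample and
# kept commented out so that it is not loaded), an entry that splits one <text>
# into two tokens with two matching readings could look like this:
# 東京スカイツリー,東京 スカイツリー,トウキョウ スカイツリー,カスタム名詞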
#

# Custom segmentation for kanji compounds
日本経済新聞,日本 経済 新聞,ニホン ケイザイ シンブン,カスタム名詞
関西国際空港,関西 国際 空港,カンサイ コクサイ クウコウ,カスタム名詞

# Custom segmentation for compound katakana
トートバッグ,トート バッグ,トート バッグ,かずカナ名詞
ショルダーバッグ,ショルダー バッグ,ショルダー バッグ,かずカナ名詞

# Custom reading for former sumo wrestler
朝青龍,朝青龍,アサショウリュウ,カスタム人名