source: gs3-extensions/solr/trunk/src/conf/lang/userdict_ja.txt@ 29135

Last change on this file since 29135 was 29135, checked in by ak19, 10 years ago

Part of port from lucene3.3.0 to lucene4.7.2. Solr related. conf and lib folders for solr4.7.2.

File size: 1.3 KB
#
# This is a sample user dictionary for Kuromoji (JapaneseTokenizer)
#
# Add entries to this file in order to override the statistical model in terms
# of segmentation, readings and part-of-speech tags. Note that entries do
# not have weights, since they are always used when found. This is by design,
# in order to maximize ease of use.
#
# Entries are defined using the following CSV format:
# <text>,<token 1> ... <token n>,<reading 1> ... <reading n>,<part-of-speech tag>
#
# Note that a single half-width space separates tokens and readings, and
# that the number of tokens and the number of readings must match exactly.
#
# Also note that the behavior for multiple entries with the same <text> is undefined.
#
# Whitespace-only lines are ignored. Comments are not allowed on entry lines.
#

# Custom segmentation for kanji compounds
日本経済新聞,日本 経済 新聞,ニホン ケイザイ シンブン,カスタム名詞
関西国際空港,関西 国際 空港,カンサイ コクサイ クウコウ,カスタム名詞

# Custom segmentation for compound katakana
トートバッグ,トート バッグ,トート バッグ,かずカナ名詞
ショルダーバッグ,ショルダー バッグ,ショルダー バッグ,かずカナ名詞

# Custom reading for former sumo wrestler
朝青龍,朝青龍,アサショウリュウ,カスタム人名
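
The entry format described in the header comments can be checked before the file is deployed. The following is a minimal sketch, not the parser Kuromoji itself uses: the function name `parse_user_dict` and the `SAMPLE` text are hypothetical, fields are split on plain commas (no CSV quoting is handled), and duplicate `<text>` values are rejected here only because the header says their behavior is undefined.

```python
import io

# Hypothetical sample input; the entries mirror the format used in this file.
SAMPLE = """\
# Custom segmentation for kanji compounds
日本経済新聞,日本 経済 新聞,ニホン ケイザイ シンブン,カスタム名詞

朝青龍,朝青龍,アサショウリュウ,カスタム人名
"""


def parse_user_dict(text):
    """Parse user dictionary text into a list of entry dicts.

    Assumption: a simple comma split is enough, i.e. no field contains
    a comma or CSV quoting.
    """
    entries = []
    seen = set()
    for lineno, raw in enumerate(io.StringIO(text), start=1):
        line = raw.strip()
        # Whitespace-only lines and full-line comments carry no entry.
        if not line or line.startswith("#"):
            continue
        fields = line.split(",")
        if len(fields) != 4:
            raise ValueError(f"line {lineno}: expected 4 fields, got {len(fields)}")
        surface, token_field, reading_field, pos = fields
        # A single half-width space separates tokens and readings.
        tokens = token_field.split(" ")
        readings = reading_field.split(" ")
        if len(tokens) != len(readings):
            raise ValueError(
                f"line {lineno}: {len(tokens)} tokens but {len(readings)} readings"
            )
        # Duplicate <text> is undefined per the header, so this sketch rejects it.
        if surface in seen:
            raise ValueError(f"line {lineno}: duplicate entry for {surface!r}")
        seen.add(surface)
        entries.append(
            {"text": surface, "tokens": tokens, "readings": readings, "pos": pos}
        )
    return entries


if __name__ == "__main__":
    for entry in parse_user_dict(SAMPLE):
        print(entry["text"], entry["tokens"], entry["readings"], entry["pos"])
```

Raising on the first malformed line keeps the sketch short; a real check might instead collect every problem and report them together.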