root/gs3-extensions/solr/trunk/src/collect/solr-jdbm-demo/etc/conf/lang/stoptags_ja.txt @ 30001

Revision 30001, 16.3 KB (checked in by ak19, 4 years ago)

Final commit (I think) to get the update to solr getTerms() working on a gs3 checkout. The solr-jdbm-demo collection needed to be rebuilt with the changes to the index. This time, added the other .xml files from the lucene/solr upgrade to the collection, and updated schema.xml and solrconfig.xml. The latter is especially necessary, as it uses the new Greenstone custom SearchHandler to get getTerms() to work.

#
# This file defines a Japanese stoptag set for JapanesePartOfSpeechStopFilter.
#
# Any token whose part-of-speech tag exactly matches one of the tags defined in
# this file is removed from the token stream.
#
# Set your own stoptags by uncommenting the lines below.  Note that comments are
# not allowed on the same line as a stoptag.  See LUCENE-3745 for frequency lists,
# etc. that can be useful for building your own stoptag set.
#
# The entire possible tagset is provided below for convenience.
#
#####
#  noun: unclassified nouns
#名詞
#
#  noun-common: Common nouns or nouns where the sub-classification is undefined
#名詞-一般
#
#  noun-proper: Proper nouns where the sub-classification is undefined
#名詞-固有名詞
#
#  noun-proper-misc: miscellaneous proper nouns
#名詞-固有名詞-一般
#
#  noun-proper-person: Personal names where the sub-classification is undefined
#名詞-固有名詞-人名
#
#  noun-proper-person-misc: names that cannot be divided into surname and
#  given name; foreign names; names where the surname or given name is unknown.
#  e.g. お市の方
#名詞-固有名詞-人名-一般
#
#  noun-proper-person-surname: Mainly Japanese surnames.
#  e.g. 山田
#名詞-固有名詞-人名-姓
#
#  noun-proper-person-given_name: Mainly Japanese given names.
#  e.g. 太郎
#名詞-固有名詞-人名-名
#
#  noun-proper-organization: Names representing organizations.
#  e.g. 通産省, NHK
#名詞-固有名詞-組織
#
#  noun-proper-place: Place names where the sub-classification is undefined
#名詞-固有名詞-地域
#
#  noun-proper-place-misc: Place names excluding countries.
#  e.g. アジア, バルセロナ, 京都
#名詞-固有名詞-地域-一般
#
#  noun-proper-place-country: Country names.
#  e.g. 日本, オーストラリア
#名詞-固有名詞-地域-国
#
#  noun-pronoun: Pronouns where the sub-classification is undefined
#名詞-代名詞
#
#  noun-pronoun-misc: miscellaneous pronouns.
#  e.g. それ, ここ, あいつ, あなた, あちこち, いくつ, どこか, なに, みなさん, みんな, わたくし, われわれ
#名詞-代名詞-一般
#
#  noun-pronoun-contraction: Spoken language contraction made by combining a
#  pronoun and the particle 'wa'.
#  e.g. ありゃ, こりゃ, こりゃあ, そりゃ, そりゃあ
#名詞-代名詞-縮約
#
#  noun-adverbial: Temporal nouns such as names of days or months that behave
#  like adverbs. Nouns that represent amount or ratios and can be used adverbially.
#  e.g. 金曜, 一月, 午後, 少量
#名詞-副詞可能
#
#  noun-verbal: Nouns that take arguments with case and can appear followed by
#  'suru' and related verbs (する, できる, なさる, くださる)
#  e.g. インプット, 愛着, 悪化, 悪戦苦闘, 一安心, 下取り
#名詞-サ変接続
#
#  noun-adjective-base: The base form of adjectives, words that appear before な ("na")
#  e.g. 健康, 安易, 駄目, だめ
#名詞-形容動詞語幹
#
#  noun-numeric: Arabic numbers, Chinese numerals, and counters like 何 (回), 数.
#  e.g. 0, 1, 2, 何, 数, 幾
#名詞-数
#
#  noun-affix: noun affixes where the sub-classification is undefined
#名詞-非自立
#
#  noun-affix-misc: Of adnominalizers, the case-marker の ("no"), and words that
#  attach to the base form of inflectional words, words that cannot be classified
#  into any of the other categories below. This category includes indefinite nouns.
#  e.g. あかつき, 暁, かい, 甲斐, 気, きらい, 嫌い, くせ, 癖, こと, 事, ごと, 毎, しだい, 次第,
#       順, せい, 所為, ついで, 序で, つもり, 積もり, 点, どころ, の, はず, 筈, はずみ, 弾み,
#       拍子, ふう, ふり, 振り, ほう, 方, 旨, もの, 物, 者, ゆえ, 故, ゆえん, 所以, わけ, 訳,
#       わり, 割り, 割, ん-口語/, もん-口語/
#名詞-非自立-一般
#
#  noun-affix-adverbial: noun affixes that can behave as adverbs.
#  e.g. あいだ, 間, あげく, 挙げ句, あと, 後, 余り, 以外, 以降, 以後, 以上, 以前, 一方, うえ,
#       上, うち, 内, おり, 折り, かぎり, 限り, きり, っきり, 結果, ころ, 頃, さい, 際, 最中, さなか,
#       最中, じたい, 自体, たび, 度, ため, 為, つど, 都度, とおり, 通り, とき, 時, ところ, 所,
#       とたん, 途端, なか, 中, のち, 後, ばあい, 場合, 日, ぶん, 分, ほか, 他, まえ, 前, まま,
#       儘, 侭, みぎり, 矢先
#名詞-非自立-副詞可能
#
#  noun-affix-aux: noun affixes treated as 助動詞 ("auxiliary verb") in school grammars
#  with the stem よう(だ) ("you(da)").
#  e.g.  よう, やう, 様 (よう)
#名詞-非自立-助動詞語幹
#
#  noun-affix-adjective-base: noun affixes that can connect to the indeclinable
#  connection form な (aux "da").
#  e.g. みたい, ふう
#名詞-非自立-形容動詞語幹
#
#  noun-special: special nouns where the sub-classification is undefined.
#名詞-特殊
#
#  noun-special-aux: The そうだ ("souda") stem form that is used for reporting news, is
#  treated as 助動詞 ("auxiliary verb") in school grammars, and attaches to the base
#  form of inflectional words.
#  e.g. そう
#名詞-特殊-助動詞語幹
#
#  noun-suffix: noun suffixes where the sub-classification is undefined.
#名詞-接尾
#
#  noun-suffix-misc: Of the nouns or stem forms of other parts of speech that connect
#  to ガル or タイ and can combine into compound nouns, words that cannot be classified into
#  any of the other categories below. In general, this category is more inclusive than
#  接尾語 ("suffix") and is usually the last element in a compound noun.
#  e.g. おき, かた, 方, 甲斐 (がい), がかり, ぎみ, 気味, ぐるみ, (した) さ, 次第, 済 (ず) み,
#       よう, (でき)っこ, 感, 観, 性, 学, 類, 面, 用
#名詞-接尾-一般
#
#  noun-suffix-person: Suffixes that form nouns and attach to person names more often
#  than other nouns.
#  e.g. 君, 様, 著
#名詞-接尾-人名
#
#  noun-suffix-place: Suffixes that form nouns and attach to place names more often
#  than other nouns.
#  e.g. 町, 市, 県
#名詞-接尾-地域
#
#  noun-suffix-verbal: Of the suffixes that attach to nouns and form nouns, those that
#  can appear before スル ("suru").
#  e.g. 化, 視, 分け, 入り, 落ち, 買い
#名詞-接尾-サ変接続
#
#  noun-suffix-aux: The stem form of そうだ (様態) that is used to indicate conditions,
#  is treated as 助動詞 ("auxiliary verb") in school grammars, and attaches to the
#  conjunctive form of inflectional words.
#  e.g. そう
#名詞-接尾-助動詞語幹
#
#  noun-suffix-adjective-base: Suffixes that attach to other nouns or the conjunctive
#  form of inflectional words and appear before the copula だ ("da").
#  e.g. 的, げ, がち
#名詞-接尾-形容動詞語幹
#
#  noun-suffix-adverbial: Suffixes that attach to other nouns and can behave as adverbs.
#  e.g. 後 (ご), 以後, 以降, 以前, 前後, 中, 末, 上, 時 (じ)
#名詞-接尾-副詞可能
#
#  noun-suffix-classifier: Suffixes that attach to numbers and form nouns. This category
#  is more inclusive than 助数詞 ("classifier") and includes common nouns that attach
#  to numbers.
#  e.g. 個, つ, 本, 冊, パーセント, cm, kg, カ月, か国, 区画, 時間, 時半
#名詞-接尾-助数詞
#
#  noun-suffix-special: Special suffixes that mainly attach to inflecting words.
#  e.g. (楽し) さ, (考え) 方
#名詞-接尾-特殊
#
#  noun-suffix-conjunctive: Nouns that behave like conjunctions and join two words
#  together.
#  e.g. (日本) 対 (アメリカ), 対 (アメリカ), (3) 対 (5), (女優) 兼 (主婦)
#名詞-接続詞的
#
#  noun-verbal_aux: Nouns that attach to the conjunctive particle て ("te") and are
#  semantically verb-like.
#  e.g. ごらん, ご覧, 御覧, 頂戴
#名詞-動詞非自立的
#
#  noun-quotation: text that cannot be segmented into words, proverbs, Chinese poetry,
#  dialects, English, etc. Currently, the only entry for 名詞 引用文字列 ("noun quotation")
#  is いわく ("iwaku").
#名詞-引用文字列
#
#  noun-nai_adjective: Words that appear before the auxiliary verb ない ("nai") and
#  behave like an adjective.
#  e.g. 申し訳, 仕方, とんでも, 違い
#名詞-ナイ形容詞語幹
#
#####
#  prefix: unclassified prefixes
#接頭詞
#
#  prefix-nominal: Prefixes that attach to nouns (including adjective stem forms)
#  excluding numerical expressions.
#  e.g. お (水), 某 (氏), 同 (社), 故 (氏), 高 (品質), お (見事), ご (立派)
#接頭詞-名詞接続
#
#  prefix-verbal: Prefixes that attach to the imperative form of a verb or a verb
#  in conjunctive form followed by なる/なさる/くださる.
#  e.g. お (読みなさい), お (座り)
#接頭詞-動詞接続
#
#  prefix-adjectival: Prefixes that attach to adjectives.
#  e.g. お (寒いですねえ), バカ (でかい)
#接頭詞-形容詞接続
#
#  prefix-numerical: Prefixes that attach to numerical expressions.
#  e.g. 約, およそ, 毎時
#接頭詞-数接続
#
#####
#  verb: unclassified verbs
#動詞
#
#  verb-main:
#動詞-自立
#
#  verb-auxiliary:
#動詞-非自立
#
#  verb-suffix:
#動詞-接尾
#
#####
#  adjective: unclassified adjectives
#形容詞
#
#  adjective-main:
#形容詞-自立
#
#  adjective-auxiliary:
#形容詞-非自立
#
#  adjective-suffix:
#形容詞-接尾
#
#####
#  adverb: unclassified adverbs
#副詞
#
#  adverb-misc: Words that can be segmented into one unit and where adnominal
#  modification is not possible.
#  e.g. あいかわらず, 多分
#副詞-一般
#
#  adverb-particle_conjunction: Adverbs that can be followed by の, は, に,
#  な, する, だ, etc.
#  e.g. こんなに, そんなに, あんなに, なにか, なんでも
#副詞-助詞類接続
#
#####
#  adnominal: Words that only have noun-modifying forms.
#  e.g. この, その, あの, どの, いわゆる, なんらかの, 何らかの, いろんな, こういう, そういう, ああいう,
#       どういう, こんな, そんな, あんな, どんな, 大きな, 小さな, おかしな, ほんの, たいした,
#       「(, も) さる (ことながら)」, 微々たる, 堂々たる, 単なる, いかなる, 我が」「同じ, 亡き
#連体詞
#
#####
#  conjunction: Conjunctions that can occur independently.
#  e.g. が, けれども, そして, じゃあ, それどころか
接続詞
#
#####
#  particle: unclassified particles.
助詞
#
#  particle-case: case particles where the subclassification is undefined.
助詞-格助詞
#
#  particle-case-misc: Case particles.
#  e.g. から, が, で, と, に, へ, より, を, の, にて
助詞-格助詞-一般
#
#  particle-case-quote: the "to" that appears after nouns, a person's speech,
#  quotation marks, expressions of decisions from a meeting, reasons, judgements,
#  conjectures, etc.
#  e.g. ( だ) と (述べた.), ( である) と (して執行猶予...)
助詞-格助詞-引用
#
#  particle-case-compound: Compounds of particles and verbs that mainly behave
#  like case particles.
#  e.g. という, といった, とかいう, として, とともに, と共に, でもって, にあたって, に当たって, に当って,
#       にあたり, に当たり, に当り, に当たる, にあたる, において, に於いて, に於て, における, に於ける,
#       にかけ, にかけて, にかんし, に関し, にかんして, に関して, にかんする, に関する, に際し,
#       に際して, にしたがい, に従い, に従う, にしたがって, に従って, にたいし, に対し, にたいして,
#       に対して, にたいする, に対する, について, につき, につけ, につけて, につれ, につれて, にとって,
#       にとり, にまつわる, によって, に依って, に因って, により, に依り, に因り, による, に依る, に因る,
#       にわたって, にわたる, をもって, を以って, を通じ, を通じて, を通して, をめぐって, をめぐり, をめぐる,
#       って-口語/, ちゅう-関西弁「という」/, (何) ていう (人)-口語/, っていう-口語/, といふ, とかいふ
助詞-格助詞-連語
#
#  particle-conjunctive:
#  e.g. から, からには, が, けれど, けれども, けど, し, つつ, て, で, と, ところが, どころか, とも, ども,
#       ながら, なり, ので, のに, ば, ものの, や ( した), やいなや, (ころん) じゃ(いけない)-口語/,
#       (行っ) ちゃ(いけない)-口語/, (言っ) たって (しかたがない)-口語/, (それがなく)ったって (平気)-口語/
助詞-接続助詞
#
#  particle-dependency:
#  e.g. こそ, さえ, しか, すら, は, も, ぞ
助詞-係助詞
#
#  particle-adverbial:
#  e.g. がてら, かも, くらい, 位, ぐらい, しも, (学校) じゃ(これが流行っている)-口語/,
#       (それ)じゃあ (よくない)-口語/, ずつ, (私) なぞ, など, (私) なり (に), (先生) なんか (大嫌い)-口語/,
#       (私) なんぞ, (先生) なんて (大嫌い)-口語/, のみ, だけ, (私) だって-口語/, だに,
#       (彼)ったら-口語/, (お茶) でも (いかが), 等 (とう), (今後) とも, ばかり, ばっか-口語/, ばっかり-口語/,
#       ほど, 程, まで, 迄, (誰) も (が)([助詞-格助詞] および [助詞-係助詞] の前に位置する「も」)
助詞-副助詞
#
#  particle-interjective: particles with interjective grammatical roles.
#  e.g. (松島) や
助詞-間投助詞
#
#  particle-coordinate:
#  e.g. と, たり, だの, だり, とか, なり, や, やら
助詞-並立助詞
#
#  particle-final:
#  e.g. かい, かしら, さ, ぜ, (だ)っけ-口語/, (とまってる) で-方言/, な, ナ, なあ-口語/, ぞ, ね, ネ,
#       ねぇ-口語/, ねえ-口語/, ねん-方言/, の, のう-口語/, や, よ, ヨ, よぉ-口語/, わ, わい-口語/
助詞-終助詞
#
#  particle-adverbial/conjunctive/final: The particle "ka" when unknown whether it is
#  adverbial, conjunctive, or sentence final. For example:
#       (a) 「A か B か」. Ex:「(国内で運用する) か,(海外で運用する) か (.)」
#       (b) Inside an adverb phrase. Ex:「(幸いという) か (, 死者はいなかった.)」
#           「(祈りが届いたせい) か (, 試験に合格した.)」
#       (c) 「かのように」. Ex:「(何もなかった) か (のように振る舞った.)」
#  e.g. か
助詞-副助詞/並立助詞/終助詞
#
#  particle-adnominalizer: The "no" that attaches to nouns and modifies
#  non-inflectional words.
助詞-連体化
#
#  particle-adverbializer: The "ni" and "to" that appear following nouns and adverbs
#  that are giongo, giseigo, or gitaigo.
#  e.g. に, と
助詞-副詞化
#
#  particle-special: A particle that does not fit into one of the above classifications.
#  This includes particles that are used in Tanka, Haiku, and other poetry.
#  e.g. かな, けむ, ( しただろう) に, (あんた) にゃ(わからん), (俺) ん (家)
助詞-特殊
#
#####
#  auxiliary-verb:
助動詞
#
#####
#  interjection: Greetings and other exclamations.
#  e.g. おはよう, おはようございます, こんにちは, こんばんは, ありがとう, どうもありがとう, ありがとうございます,
#       いただきます, ごちそうさま, さよなら, さようなら, はい, いいえ, ごめん, ごめんなさい
#感動詞
#
#####
#  symbol: unclassified Symbols.
記号
#
#  symbol-misc: A general symbol not in one of the categories below.
#  e.g. [○◎@$〒→+]
記号-一般
#
#  symbol-comma: Commas
#  e.g. [,、]
記号-読点
#
#  symbol-period: Periods and full stops.
#  e.g. [.。]
記号-句点
#
#  symbol-space: Full-width whitespace.
記号-空白
#
#  symbol-open_bracket:
#  e.g. [({‘“『【]
記号-括弧開
#
#  symbol-close_bracket:
#  e.g. [)}’”』」】]
記号-括弧閉
#
#  symbol-alphabetic:
#記号-アルファベット
#
#####
#  other: unclassified other
#その他
#
#  other-interjection: Words that are hard to classify as noun-suffixes or
#  sentence-final particles.
#  e.g. (だ)ァ
その他-間投
#
#####
#  filler: Aizuchi that occurs during a conversation or sounds inserted as filler.
#  e.g. あの, うんと, えと
フィラー
#
#####
#  non-verbal: non-verbal sound.
非言語音
#
#####
#  fragment:
#語断片
#
#####
#  unknown: unknown part of speech.
#未知語
#
##### End of file
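
The header of this file says that a token is dropped when its part-of-speech tag exactly matches an uncommented line. The sketch below illustrates that rule in plain Python. It is not Lucene's loader or the real JapanesePartOfSpeechStopFilter: the names load_stoptags and drop_stoptagged, the sample lines, and the hand-tagged tokens are all assumptions made for illustration.

```python
from typing import Iterable, List, Set, Tuple

Token = Tuple[str, str]  # (surface form, part-of-speech tag)


def load_stoptags(lines: Iterable[str]) -> Set[str]:
    """Return the active stoptags: non-blank lines that do not start with '#'."""
    stoptags = set()
    for raw in lines:
        line = raw.strip()
        if line and not line.startswith("#"):
            stoptags.add(line)
    return stoptags


def drop_stoptagged(tokens: Iterable[Token], stoptags: Set[str]) -> List[Token]:
    """Keep only tokens whose tag is not in the set (exact string match)."""
    return [(surface, tag) for surface, tag in tokens if tag not in stoptags]


# Hypothetical excerpt written in the same format as this file.
SAMPLE_LINES = [
    "#  particle-case-misc: Case particles.",
    "助詞-格助詞-一般",
    "#名詞-一般",
    "助詞",
    "助動詞",
]

# Hand-tagged tokens (assumed tags); a real analyzer would supply them.
TOKENS = [
    ("私", "名詞-代名詞-一般"),
    ("は", "助詞-係助詞"),
    ("本", "名詞-一般"),
    ("を", "助詞-格助詞-一般"),
    ("読ん", "動詞-自立"),
    ("だ", "助動詞"),
]

STOPTAGS = load_stoptags(SAMPLE_LINES)
KEPT = drop_stoptagged(TOKENS, STOPTAGS)
print(KEPT)
```

In this sample, は is kept even though 助詞 is listed, because the comparison is an exact match rather than a prefix match; that is why the file enables 助詞 and each of its sub-tags on separate lines. 本 is kept because 名詞-一般 is still commented out.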