What is the default list of stop words used in Lucene's StopFilter?
2022-09-01 20:29:19
Lucene has a default stop filter (http://lucene.apache.org/core/4_0_0/analyzers-common/org/apache/lucene/analysis/core/StopFilter.html). Does anyone know which words are in its list?
The default stop words used by StandardAnalyzer and EnglishAnalyzer come from StopAnalyzer.ENGLISH_STOP_WORDS_SET. As found in the source file, they are:
"a", "an", "and", "are", "as", "at", "be", "but", "by",
"for", "if", "in", "into", "is", "it",
"no", "not", "of", "on", "or", "such",
"that", "the", "their", "then", "there", "these",
"they", "this", "to", "was", "will", "with"
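StopFilter itself does not define a default set of stop words; the set is supplied by whoever constructs the filter, typically an analyzer.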
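
To see roughly what filtering with this list does, here is a minimal sketch in plain Java. It does not use the Lucene API: the class name `StopWordSketch`, the method `removeStopWords`, and the simple lowercase-and-split tokenization are all assumptions made for illustration, and Lucene's own tokenizers behave differently in the details. Only the 33 words are taken from the list above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for illustration only; not part of Lucene.
public class StopWordSketch {

    // The 33 words copied from the list in the answer above.
    static final Set<String> ENGLISH_STOP_WORDS = Set.of(
        "a", "an", "and", "are", "as", "at", "be", "but", "by",
        "for", "if", "in", "into", "is", "it",
        "no", "not", "of", "on", "or", "such",
        "that", "the", "their", "then", "there", "these",
        "they", "this", "to", "was", "will", "with");

    // Lowercases the text, splits on runs of non-word characters,
    // and drops every token that appears in the stop word set.
    static List<String> removeStopWords(String text) {
        List<String> kept = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty() && !ENGLISH_STOP_WORDS.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(removeStopWords("The quick brown fox is in the garden"));
    }
}
```

With a real Lucene analyzer you would pass a stop word set to the analyzer or filter rather than filtering strings by hand; the sketch only shows the effect of the list on a sentence.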