public class CmsGallerySearchAnalyzer
extends org.apache.lucene.analysis.util.StopwordAnalyzerBase
The gallery search uses a single index that may contain content in multiple languages.
According to the Lucene JavaDocs (version 3.0), the Lucene StandardAnalyzer
already uses
"a good tokenizer for most European-language documents". The only caveat is that its
stop word list contains English words only.
This extended analyzer uses a compound list of stop words compiled from several languages.
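
The idea of a compound stop word list can be sketched in plain Java without Lucene. The class name `CompoundStopWordsSketch`, its helper methods, and the tiny per-language word lists are all hypothetical and chosen for illustration only. The actual stop word lists and the Lucene-based implementation of this analyzer are not shown on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class CompoundStopWordsSketch {

    // Hypothetical, deliberately tiny per-language lists (illustration only).
    static final List<String> ENGLISH = Arrays.asList("the", "and", "of");
    static final List<String> GERMAN = Arrays.asList("der", "und", "von");

    /** Merges several per-language lists into one lower-cased set. */
    @SafeVarargs
    static Set<String> compound(List<String>... lists) {
        Set<String> result = new HashSet<>();
        for (List<String> list : lists) {
            for (String word : list) {
                result.add(word.toLowerCase(Locale.ROOT));
            }
        }
        return result;
    }

    /** Splits on whitespace, lower-cases, and drops tokens found in the stop set. */
    static List<String> filter(String text, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            String lower = token.toLowerCase(Locale.ROOT);
            if (!lower.isEmpty() && !stopWords.contains(lower)) {
                kept.add(lower);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> stop = compound(ENGLISH, GERMAN);
        // Mixed-language input: stop words from both lists are removed.
        System.out.println(filter("The house und der Garten", stop)); // prints [house, garten]
    }
}
```

Because the index mixes languages, a single merged set is applied to every document. The trade-off is that a word which is a stop word in one language is also dropped from text in any other language.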
Modifier and Type | Field and Description |
---|---|
static int | DEFAULT_MAX_TOKEN_LENGTH: Default maximum allowed token length. |
Constructor and Description |
---|
CmsGallerySearchAnalyzer(org.apache.lucene.util.Version version): Constructor with version parameter. |
Modifier and Type | Method and Description |
---|---|
protected org.apache.lucene.analysis.Analyzer.TokenStreamComponents | createComponents(java.lang.String fieldName, java.io.Reader reader) |

Methods inherited from class org.apache.lucene.analysis.util.StopwordAnalyzerBase: getStopwordSet, loadStopwordSet, loadStopwordSet, loadStopwordSet
public static final int DEFAULT_MAX_TOKEN_LENGTH
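
As a rough illustration of what a maximum token length does, the following plain-Java sketch discards tokens longer than a given limit. Several things here are assumptions: the class name `MaxTokenLengthSketch` and the method `dropOverlong` are hypothetical, the value 255 is borrowed from the default of Lucene's StandardAnalyzer because this page does not state the constant's value, and discarding (rather than splitting) overlong tokens follows the documented behavior of StandardAnalyzer in the Lucene generation that still took a `Version` argument.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MaxTokenLengthSketch {

    // Assumed value, borrowed from Lucene's StandardAnalyzer default;
    // the value used by CmsGallerySearchAnalyzer is not stated on this page.
    static final int ASSUMED_MAX_TOKEN_LENGTH = 255;

    /** Keeps only tokens whose length does not exceed maxLength. */
    static List<String> dropOverlong(List<String> tokens, int maxLength) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (token.length() <= maxLength) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // A small limit of 8 makes the effect visible on short sample input.
        List<String> tokens = Arrays.asList("gallery", "a-very-long-token", "image");
        System.out.println(dropOverlong(tokens, 8)); // prints [gallery, image]
    }
}
```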