Question about exact-match results in Lucene 2.4.1 - lucene - Lucene enthusiasts

[lucene] Question about exact-match results in Lucene 2.4.1

aaati 2009-03-23

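There is very little documentation online for version 2.4.1. I want to implement exact-match queries. From what I found, TermQuery should meet this need, but in practice exact matching works for non-Chinese input and still fails for Chinese input. Could everyone please take a look? Thanks.
My example code is as follows: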

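import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.Directory;

// Note: the analyzer parameter is unused here; TermQuery does not analyze its term.
public static void search(Directory dir, Analyzer analyzer) {
    try {
        Searcher searcher = new IndexSearcher(dir);

        // Matches only documents whose "name" field contains the exact indexed term "555".
        TermQuery termQuery = new TermQuery(new Term("name", "555"));
        ScoreDoc[] docs = searcher.search(termQuery, searcher.maxDoc()).scoreDocs;
        System.out.println(docs.length);

        for (int i = 0; i < docs.length; i++) {
            Document doc = searcher.doc(docs[i].doc);
            System.out.println(doc.get("name"));
        }
        searcher.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}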

chester60 2009-03-23

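"555" matches exactly because "555" is a single term. Chinese does not match exactly because the Chinese text was tokenized. Unless the keyword you search for happens to be one complete term, your requirement cannot be met.

Example: "中国电信" (China Telecom) may be split into "中国/电信", or it may be kept as "中国电信". An exact match is possible only in the latter case.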
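The point above can be illustrated with a small plain-Java model. This is only a sketch and not Lucene code: the class ToyTermIndex and its tokenizer are hypothetical, and the tokenizer assumes an analyzer that emits one token per Chinese character while keeping runs of letters and digits together. An exact term lookup then finds "555" but not a multi-character Chinese string.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: this is NOT Lucene code.
class ToyTermIndex {
    // term -> ids of the documents that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Hypothetical tokenizer: each Han character becomes its own token,
    // consecutive non-Han letters/digits form a single token.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder run = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean han = Character.UnicodeBlock.of(c)
                    == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS;
            if (!han && Character.isLetterOrDigit(c)) {
                run.append(c);
                continue;
            }
            if (run.length() > 0) {
                tokens.add(run.toString());
                run.setLength(0);
            }
            if (han) {
                tokens.add(String.valueOf(c));
            }
        }
        if (run.length() > 0) {
            tokens.add(run.toString());
        }
        return tokens;
    }

    void add(int docId, String text) {
        for (String token : tokenize(text)) {
            postings.computeIfAbsent(token, k -> new HashSet<>()).add(docId);
        }
    }

    // Exact term lookup, the same idea as a TermQuery: the query text is not analyzed.
    Set<Integer> termQuery(String term) {
        return postings.getOrDefault(term, Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        ToyTermIndex index = new ToyTermIndex();
        index.add(0, "555");
        index.add(1, "中国电信");
        System.out.println(index.termQuery("555"));      // doc 0: "555" is a single term
        System.out.println(index.termQuery("中国电信")); // empty: no such term was indexed
        System.out.println(index.termQuery("中"));       // doc 1: single characters are the terms
    }
}
```

How a real analyzer splits Chinese text depends on the analyzer in use, so the per-character split here is an assumption made for the illustration.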

aaati 2009-03-24

Is it possible, then, to keep Chinese text from being tokenized?

amigobot 2009-03-29

Use WhitespaceAnalyzer. Of course, you must use it both at index time and at search time.

amigobot 2009-03-29

2.4.1 only fixes a few bugs and has no other changes, so it is certainly not the cause of the problem you describe.

sunjie 2009-06-30

amigobot wrote:

Use WhitespaceAnalyzer. Of course, you must use it both at index time and at search time.

Can this really achieve exact matching for Chinese?
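As a rough illustration of the question: an analyzer that breaks text only at whitespace leaves a Chinese value without spaces as one token, so an exact term lookup for the whole value can match it, while a value containing spaces is still split. The sketch below is plain Java, not Lucene code; the class name WhitespaceSplitSketch is hypothetical and the splitting rule (Character.isWhitespace) is an assumption about how such an analyzer behaves.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only, not Lucene code: splits on whitespace and nothing else.
class WhitespaceSplitSketch {
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                // A whitespace character ends the current token, if any.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("中国电信"));  // one token: the whole value
        System.out.println(tokenize("中国 电信")); // two tokens, split at the space
    }
}
```

So this only gives whole-value matching for values that contain no whitespace. It does not help when the search string is a fragment inside a longer value.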

luckaway 2009-07-03

/**
   * Create a field by specifying its name, value and how it will
   * be saved in the index. Term vectors will not be stored in the index.
   *
   * @param name The name of the field
   * @param value The string to process
   * @param store Whether <code>value</code> should be stored in the index
   * @param index Whether the field should be indexed, and if so, if it should
   * be tokenized before indexing
   * @throws NullPointerException if name or value is <code>null</code>
   * @throws IllegalArgumentException if the field is neither stored nor indexed
   */
public Field(String name, String value, Store store, Index index) {
    this(name, value, store, index, TermVector.NO);
}

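You can control how a given field is stored and indexed.

Index options: Index.NO (not indexed)
         Index.ANALYZED (tokenized, then indexed)
         Index.ANALYZED_NO_NORMS (not sure what this does)
         Index.NOT_ANALYZED (indexed as a whole, which is what you need here)
         Index.NOT_ANALYZED_NO_NORMS (not sure)

Store options: not stored, stored, stored compressed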
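What "indexed as a whole" means for a term lookup can be sketched in plain Java. This is a toy model, not Lucene code: the class NotAnalyzedSketch and its methods are hypothetical. It assumes that a field indexed as a whole contributes exactly one term, its complete value, so only a query for the complete value matches.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only, not Lucene code: a field indexed "as a whole" contributes
// exactly one term, namely its complete value.
class NotAnalyzedSketch {
    // term -> ids of the documents whose field value equals that term
    private final Map<String, List<Integer>> postings = new HashMap<>();

    void addNotAnalyzed(int docId, String value) {
        postings.computeIfAbsent(value, k -> new ArrayList<>()).add(docId);
    }

    // Exact term lookup: the query string must equal the whole indexed value.
    List<Integer> termQuery(String term) {
        return postings.getOrDefault(term, Collections.<Integer>emptyList());
    }

    public static void main(String[] args) {
        NotAnalyzedSketch index = new NotAnalyzedSketch();
        index.addNotAnalyzed(0, "中国电信");
        index.addNotAnalyzed(1, "中国电信集团");
        System.out.println(index.termQuery("中国电信")); // only doc 0
        System.out.println(index.termQuery("中国"));     // nothing: fragments are not terms
    }
}
```

The trade-off is that a fragment of the value no longer matches at all, so this fits fields such as names or codes better than free text.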

opengloves 2009-08-06

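I have also just started with Lucene and hit the same problem: Chinese queries do not match exactly.
My main goal is to count the occurrences of an exactly matching phrase.
Since Lucene tokenizes text, perhaps exact matching is simply not possible?
For example: 一人一个
Different Analyzers may tokenize it differently, so the terms available to a Term-based query differ,
and the search results differ as well.
With StandardAnalyzer or ChineseAnalyzer, only single characters such as "一" or "个" can be found.
With WhitespaceAnalyzer, TermQuery can only find tokens delimited by whitespace.
Other Analyzers can find 2- or 3-character words, but with longer strings an exact match probably won't work...
Is there no way to get the effect of a database query such as like '%keyword%'?
The real issue is that there is a lot of data, and I need highlighting, paging, and a list view;
otherwise I would just use String.split(regex) myself.

Has anyone implemented exact matching with Lucene?
Please help with an answer. Thanks.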
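One idea that may help with the like '%keyword%' case, not verified here against 2.4.1: keep the one-character-per-token analysis and search for the characters as a phrase, meaning they must occur at consecutive positions (in Lucene terms this would presumably be a PhraseQuery with one Term per character). The plain-Java sketch below models only the position logic and the occurrence count. The class CharPhraseSketch is hypothetical and it indexes every character, punctuation included, which is a simplification.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only, not Lucene code: a positional index with one term per character.
class CharPhraseSketch {
    // term -> (docId -> positions of that term in the document)
    private final Map<String, Map<Integer, List<Integer>>> postings = new HashMap<>();

    void add(int docId, String text) {
        for (int pos = 0; pos < text.length(); pos++) {
            String term = String.valueOf(text.charAt(pos));
            postings.computeIfAbsent(term, k -> new HashMap<>())
                    .computeIfAbsent(docId, k -> new ArrayList<>())
                    .add(pos);
        }
    }

    // Counts how often the characters of phrase occur at consecutive positions in one document.
    int countPhrase(int docId, String phrase) {
        if (phrase.isEmpty()) {
            return 0;
        }
        int count = 0;
        for (int start : positions(docId, phrase.charAt(0))) {
            boolean match = true;
            for (int k = 1; k < phrase.length() && match; k++) {
                match = positions(docId, phrase.charAt(k)).contains(start + k);
            }
            if (match) {
                count++;
            }
        }
        return count;
    }

    private List<Integer> positions(int docId, char c) {
        Map<Integer, List<Integer>> byDoc = postings.get(String.valueOf(c));
        if (byDoc == null) {
            return Collections.emptyList();
        }
        return byDoc.getOrDefault(docId, Collections.<Integer>emptyList());
    }

    public static void main(String[] args) {
        CharPhraseSketch index = new CharPhraseSketch();
        index.add(0, "一人一个，一人一个座位");
        System.out.println(index.countPhrase(0, "一人一个")); // 2
        System.out.println(index.countPhrase(0, "一个人"));   // 0
    }
}
```

Because the match is on consecutive positions rather than on whole terms, the search string can be any fragment of the indexed text, which is close to the like '%keyword%' behaviour asked about.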

trh3037 2012-08-06

I am stuck at the same point right now. Oracle's full-text search can solve it, but it comes with other limitations, so I am still torn.
