Can IndexSearcher be declared as a member variable?
tan8888
2009-11-14
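I see a lot of code that declares IndexSearcher as a local variable and creates a new one for every search. Why not declare IndexSearcher as a member variable, so the index does not have to be opened from disk each time? Wouldn't that be much faster? I once built a search example, and with a member variable it ran about three times faster than with a local variable, and the results seemed fine. Is this a reasonable approach? And does the same apply to IndexReader?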
|
|
luckaway
2009-11-14
Keep the IndexReader around; initializing an IndexReader is expensive!
Creating a new IndexSearcher each time does no harm!
|
tan8888
2009-11-14
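So you mean both IndexSearcher and IndexReader can be declared as member variables, and there is nothing wrong with that? I need to build a search where speed matters. Creating a new one for every search is slow, and declaring it as a member variable is much faster.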
|
|
luckaway
2009-11-15
// IndexReaderPool.java
import java.util.List;

import org.apache.lucene.index.IndexReader;

/**
 * A thread-safe pool that holds and maintains IndexReader instances.
 * An application needs only one instance of it.
 */
public interface IndexReaderPool {

    void init();

    IndexReader getIndexReader(String indexDirName);

    void destroy();

    List<IndexReader> getIndexReaderList();

    int size();
}

// DefaultIndexReaderPool.java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.lucene.index.IndexReader;

public class DefaultIndexReaderPool implements IndexReaderPool {

    private final String rootDir = "/home/admin/keeper/index";

    private final List<String> indexDirNameList = new ArrayList<String>();

    private final Map<String, IndexReader> indexReaderMap = new ConcurrentHashMap<String, IndexReader>();

    @Override
    public void init() {
        String[] indexDirNames = new File(rootDir).list();
        if (indexDirNames == null) {
            return; // rootDir is missing or is not a directory
        }
        for (String indexDirName : indexDirNames) {
            if (!isEmptyIndexDir(indexDirName)) {
                indexDirNameList.add(indexDirName);
                IndexReader indexReader = createIndexReader(indexDirName);
                if (indexReader != null) {
                    indexReaderMap.put(indexDirName, indexReader);
                }
            }
        }
    }

    /** Treats a directory that is unreadable or contains no files as empty. */
    private boolean isEmptyIndexDir(String indexDirName) {
        String[] files = new File(getPath(indexDirName)).list();
        return files == null || files.length == 0;
    }

    @Override
    public void destroy() {
        for (Map.Entry<String, IndexReader> entry : indexReaderMap.entrySet()) {
            try {
                entry.getValue().close();
            } catch (IOException e) {
                throw new PoolException("IOException while closing the IndexReader for "
                        + entry.getKey() + ", root cause: " + e.getMessage(), e);
            }
        }
        // Clear the map instead of setting it to null, so later calls do not hit a NullPointerException.
        indexReaderMap.clear();
    }

    @Override
    public IndexReader getIndexReader(String indexDirName) {
        // Create and reopen under one lock, so two threads cannot both reopen and close the same reader.
        synchronized (indexReaderMap) {
            IndexReader indexReader = indexReaderMap.get(indexDirName);
            if (indexReader == null) {
                indexReader = createIndexReader(indexDirName);
                if (indexReader != null) {
                    indexReaderMap.put(indexDirName, indexReader);
                }
                return indexReader;
            }
            try {
                // Pick up index changes; reopen() returns the same instance if nothing changed.
                IndexReader newIndexReader = indexReader.reopen();
                if (newIndexReader != indexReader) {
                    // The old reader must be closed, otherwise "too many open files" can occur.
                    // Caution: a search still running on the old reader will fail once it is closed.
                    indexReader.close();
                    indexReader = newIndexReader;
                    // Replace the map entry so the closed reader is never handed out again.
                    indexReaderMap.put(indexDirName, newIndexReader);
                }
            } catch (IOException e) {
                // Log the error (CorruptIndexException is a subclass of IOException).
            }
            return indexReader;
        }
    }

    /**
     * Creates an IndexReader for the given index directory.
     *
     * @param indexDirName name of the index directory
     * @return the {@link IndexReader}, or null if it could not be opened
     */
    private IndexReader createIndexReader(String indexDirName) {
        try {
            return IndexReader.open(getPath(indexDirName));
        } catch (IOException e) {
            // Log the error (CorruptIndexException is a subclass of IOException).
        }
        return null;
    }

    /**
     * Returns the full path of an index directory.
     *
     * @param indexDirName name of the index directory
     * @return full path of the index directory
     */
    private String getPath(String indexDirName) {
        return new StringBuilder(rootDir).append(File.separator).append(indexDirName).toString();
    }

    @Override
    public List<IndexReader> getIndexReaderList() {
        List<IndexReader> indexReaderList = new ArrayList<IndexReader>();
        for (String indexDirName : indexDirNameList) {
            IndexReader indexReader = getIndexReader(indexDirName);
            if (indexReader != null) {
                indexReaderList.add(indexReader);
            }
        }
        return indexReaderList;
    }

    @Override
    public int size() {
        return indexReaderMap.size();
    }
}

// PoolException.java
public class PoolException extends RuntimeException {

    private static final long serialVersionUID = 389302495480038940L;

    public PoolException(Exception e) {
        super(e);
    }

    public PoolException(String errMsg) {
        super(errMsg);
    }

    public PoolException(String errMsg, Throwable cause) {
        super(errMsg, cause);
    }
}

// SearcherFactory.java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MultiSearcher;
import org.apache.lucene.search.Searchable;
import org.apache.lucene.search.Searcher;

public class SearcherFactory {

    private final IndexReaderPool indexReaderPool;

    // The pool must be supplied by the caller; the original field was never assigned.
    public SearcherFactory(IndexReaderPool indexReaderPool) {
        this.indexReaderPool = indexReaderPool;
    }

    /** Builds a new searcher for each call on top of the pooled readers. */
    public Searcher createSearcher() throws IOException {
        List<IndexReader> indexReaderList = indexReaderPool.getIndexReaderList();
        List<Searchable> searchableList = new ArrayList<Searchable>();
        for (IndexReader indexReader : indexReaderList) {
            searchableList.add(new IndexSearcher(indexReader));
        }
        return new MultiSearcher(searchableList.toArray(new Searchable[searchableList.size()]));
    }
}

With a few small tweaks, this should meet your needs!
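One point in the pool code deserves care: the reopen step closes the old reader right away, while another thread may still be searching on it. A common way to handle this is reference counting, where the reader is really closed only after its last user lets go. The sketch below shows the idea with a hypothetical stand-in class (`CountedReader` is not a Lucene type, and the "close" is only a flag), so treat it as an illustration of the pattern rather than as the pool's actual behavior.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a reader that holds open files.
final class CountedReader {
    // Starts at 1: the pool itself holds one reference.
    private final AtomicInteger refs = new AtomicInteger(1);
    private volatile boolean closed = false;

    // A search thread calls this before using the reader.
    // Returns false if the reader has already been fully released.
    boolean tryAcquire() {
        while (true) {
            int n = refs.get();
            if (n == 0) {
                return false;
            }
            if (refs.compareAndSet(n, n + 1)) {
                return true;
            }
        }
    }

    // Every holder calls this once when done; the last one triggers the close.
    void release() {
        if (refs.decrementAndGet() == 0) {
            closed = true; // real code would close the underlying files here
        }
    }

    boolean isClosed() {
        return closed;
    }
}

final class CountedReaderDemo {
    public static void main(String[] args) {
        CountedReader old = new CountedReader();
        old.tryAcquire();                   // a search starts using the reader
        old.release();                      // the pool swaps in a newer reader and drops its reference
        System.out.println(old.isClosed()); // still open: the search holds a reference
        old.release();                      // the search finishes
        System.out.println(old.isClosed()); // now closed
    }
}
```

With this pattern the pool calls `release()` instead of closing directly, and each search wraps its work in `tryAcquire()` and `release()`.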
|
luckaway
2009-11-15
The code has not been tested, so check it yourself!
|
|
tan8888
2009-11-15
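I ran a quick test and the speed is pretty good! Thank you. I have only just started with Lucene and have a lot of questions, so I would appreciate your guidance.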
|
|
tan8888
2009-11-15
Thanks for the code, and I look forward to learning more from you!
|
|
balixiao
2009-11-22
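That works if your index files never change, but documents added to the index after the IndexSearcher was initialized will not be found. Also, I don't think it is a good idea to keep an I/O-related object open all the time.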
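The staleness described here can be shown with a small model. The sketch below uses hypothetical stand-in classes (`ToyIndex`, `SnapshotReader`, `ReaderHolder`; none of them are Lucene types) and assumes a reader sees only what existed when it was opened. The holder keeps one reader as a member and refreshes it only when the index has changed, which is one way to keep the speed benefit without missing new documents.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory "index"; stands in for index files on disk.
final class ToyIndex {
    private final List<String> docs = new ArrayList<>();

    synchronized void add(String doc) { docs.add(doc); }
    synchronized List<String> snapshot() { return new ArrayList<>(docs); }
    synchronized int version() { return docs.size(); }
}

// Hypothetical point-in-time reader: sees only what existed when it was opened.
final class SnapshotReader {
    private final ToyIndex index;
    private final List<String> docs;
    private final int version;

    SnapshotReader(ToyIndex index) {
        this.index = index;
        synchronized (index) { // take the copy and its version together
            this.docs = index.snapshot();
            this.version = index.version();
        }
    }

    boolean contains(String doc) { return docs.contains(doc); }

    // Returns this reader if the index is unchanged, otherwise a fresh one.
    SnapshotReader reopen() {
        return index.version() == version ? this : new SnapshotReader(index);
    }
}

// Keeps one reader as a member variable and refreshes it on demand.
final class ReaderHolder {
    private SnapshotReader current;

    ReaderHolder(ToyIndex index) { this.current = new SnapshotReader(index); }

    synchronized SnapshotReader get() {
        current = current.reopen();
        return current;
    }
}

final class StaleReaderDemo {
    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("a");
        ReaderHolder holder = new ReaderHolder(index);
        SnapshotReader stale = holder.get();
        index.add("b");                                 // added after the reader was opened
        System.out.println(stale.contains("b"));        // false: the old reader cannot see it
        System.out.println(holder.get().contains("b")); // true: the refreshed reader can
    }
}
```

When nothing has changed, `get()` returns the same object, so the refresh check costs little compared with reopening on every search.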
|
|
tan8888
2009-12-22
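I know, but creating a new IndexSearcher for every search is much slower, and some files are indexed only once and rarely reindexed afterwards. Also, about "Keep the IndexReader around; initializing an IndexReader is expensive! Creating a new IndexSearcher each time does no harm!": as I recall, IndexSearcher also uses an IndexReader every time. Have I misunderstood?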
|
luckaway
2009-12-22
Yes, IndexSearcher does use an IndexReader every time!
Once you pool the IndexReader, every IndexSearcher is a new object, but they all share the same IndexReader object!
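To make the point concrete, here is a minimal sketch of "one shared reader, a new searcher per search". The classes are hypothetical stand-ins (`CostlyReader` plays the role of IndexReader, `LightSearcher` the role of IndexSearcher), and the open counter exists only to show how often the expensive step runs.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a resource that is expensive to open.
final class CostlyReader {
    static final AtomicInteger OPEN_COUNT = new AtomicInteger();
    private final String path;

    CostlyReader(String path) {
        this.path = path;
        OPEN_COUNT.incrementAndGet(); // a real reader would load index files here
    }

    String path() { return path; }
}

// Hypothetical stand-in for a cheap wrapper around the reader.
final class LightSearcher {
    private final CostlyReader reader;

    LightSearcher(CostlyReader reader) { this.reader = reader; }

    CostlyReader reader() { return reader; }
}

final class SearchService {
    // Member variable: the expensive object is opened once and reused.
    private final CostlyReader reader;

    SearchService(String path) { this.reader = new CostlyReader(path); }

    // Local use: a new lightweight searcher for every search.
    LightSearcher newSearcher() { return new LightSearcher(reader); }
}

final class SharedReaderDemo {
    public static void main(String[] args) {
        SearchService service = new SearchService("/tmp/index"); // hypothetical path
        LightSearcher first = service.newSearcher();
        LightSearcher second = service.newSearcher();
        System.out.println(first != second);                   // two searcher objects
        System.out.println(first.reader() == second.reader()); // one shared reader
        System.out.println(CostlyReader.OPEN_COUNT.get());     // number of expensive opens
    }
}
```

However many searchers are created, the expensive open happens once per service, which is why creating a searcher per request stays cheap.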