Can IndexSearcher be declared as a member variable?

tan8888 2009-11-14
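I've seen a lot of code that declares IndexSearcher as a local variable and creates a new one for every search. Why not declare IndexSearcher as a member variable instead, so the index doesn't have to be opened from disk each time? Wouldn't that be much faster? In a search example I wrote, the member-variable version was three times faster than the local-variable version, and the results seemed fine. Is this a reasonable approach? And does the same apply to IndexReader?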
luckaway 2009-11-14
Keep the IndexReader around; initializing an IndexReader is expensive!

Creating a new IndexSearcher each time is fine!
tan8888 2009-11-14
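So you mean both IndexSearcher and IndexReader can be declared as member variables, and there's nothing wrong with that? I'm building a search feature where speed matters. Creating a new one for every search is slow, and declaring it as a member variable is much faster.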
luckaway 2009-11-15
import java.util.List;

import org.apache.lucene.index.IndexReader;

/**
 * A thread-safe pool that holds and maintains IndexReaders. An application needs only one instance.
 **/

public interface IndexReaderPool {

	public void init();

	public IndexReader getIndexReader(String indexDirName);

	public void destory();

	public List<IndexReader> getIndexReaderList();

	public int size();
}



import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexReader;

public class DefaultIndexReaderPool implements IndexReaderPool {

	private String rootDir = "/home/admin/keeper/index";

	private List<String> indexDirNameList = new ArrayList<String>();

	public Map<String, IndexReader> indexReaderMap = new ConcurrentHashMap<String, IndexReader>();

	@Override
	public void init() {
		File rootIndexDir = new File(rootDir);
		String[] indexDirNames = rootIndexDir.list();
		// list() returns null when rootDir does not exist or is not a directory
		if (indexDirNames == null)
			return;
		for (String indexDirName : indexDirNames) {
			if (!isEmptyIndexDir(indexDirName)) {
				indexDirNameList.add(indexDirName);
				IndexReader indexReader = createIndexReader(indexDirName);
				if (indexReader != null)
					indexReaderMap.put(indexDirName, indexReader);
			}
		}
	}

	private boolean isEmptyIndexDir(String indexDirName) {
		// TODO Auto-generated method stub
		return false;
	}

	@Override
	public void destory() {
		Iterator<Entry<String, IndexReader>> iterator = indexReaderMap.entrySet().iterator();
		while (iterator.hasNext()) {
			Entry<String, IndexReader> entry = iterator.next();
			IndexReader indexReader = entry.getValue();
			String indexDirName = entry.getKey();
			try {
				indexReader.close();
			} catch (IOException e) {
				throw new PoolException("IOException while closing indexReader whose name is " + indexDirName
						+ ",the root cause is " + e.getMessage());
			}
		}
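		// clear() instead of null, so that size() and getIndexReader() cannot hit a NullPointerException later
		indexReaderMap.clear();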
	}

	@Override
	public IndexReader getIndexReader(String indexDirName) {
		// Synchronize so that two threads cannot create or reopen the same reader at once
		synchronized (indexReaderMap) {
			IndexReader indexReader = indexReaderMap.get(indexDirName);
			if (indexReader == null) {
				indexReader = createIndexReader(indexDirName);
				if (indexReader != null)
					indexReaderMap.put(indexDirName, indexReader);
				return indexReader;
			}
			try {
				// reopen() returns the same instance when the index has not changed
				IndexReader newIndexReader = indexReader.reopen();
				if (newIndexReader != indexReader) {
					// The old reader must be closed, otherwise file handles leak
					// ("too many open files"). Caution: a search that is still
					// running on the old reader will fail once it is closed.
					indexReader.close();
					indexReader = newIndexReader;
					indexReaderMap.put(indexDirName, newIndexReader);
				}
			} catch (CorruptIndexException e) {
				// log error info
			} catch (IOException e) {
				// log error info
			}
			return indexReader;
		}
	}

	/**
	 * Creates an IndexReader for the given index directory.
	 * 
	 * @param indexDirName
	 *            name of the index directory
	 * @return the {@link IndexReader}, or null if it could not be opened
	 */
	private IndexReader createIndexReader(String indexDirName) {
		try {
			return IndexReader.open(getPath(indexDirName));
		} catch (CorruptIndexException e) {
			// print error log
		} catch (IOException e) {
			// print error log
		}
		return null;
	}

	/**
	 * Returns the full path of an index directory.
	 * 
	 * @param indexDirName
	 *            name of the index directory
	 * @return full path of the index directory
	 **/
	private String getPath(String indexDirName) {
		return new StringBuilder(rootDir).append(File.separator).append(indexDirName).toString();
	}

	@Override
	public List<IndexReader> getIndexReaderList() {
		List<IndexReader> indexReaderList = new ArrayList<IndexReader>();
		for (String indexDirName : indexDirNameList) {
			IndexReader indexReader = getIndexReader(indexDirName);
			if (indexReader != null)
				indexReaderList.add(indexReader);
		}
		return indexReaderList;
	}

	@Override
	public int size() {
		return indexReaderMap.size();
	}
}



public class PoolException extends RuntimeException {

	private static final long serialVersionUID = 389302495480038940L;

	public PoolException(Exception e) {
		super(e);
	}

	public PoolException(String errMsg) {
		super(errMsg);
	}
}




import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MultiSearcher;
import org.apache.lucene.search.Searchable;
import org.apache.lucene.search.Searcher;

public class SearcherFactory {
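	private final IndexReaderPool indexReaderPool;

	// The pool must be supplied by the caller; a null field would fail in createSearcher()
	public SearcherFactory(IndexReaderPool indexReaderPool) {
		this.indexReaderPool = indexReaderPool;
	}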

	public Searcher createSearcher() throws IOException {
		List<IndexReader> indexReaderList = indexReaderPool.getIndexReaderList();
		List<Searchable> searchableList = new ArrayList<Searchable>();
		for (IndexReader indexReader : indexReaderList) {
			searchableList.add(new IndexSearcher(indexReader));
		}
		return new MultiSearcher(searchableList.toArray(new Searchable[searchableList.size()]));
	}
}



With a few small changes, this should cover your needs!
luckaway 2009-11-15
The code is untested, so check it yourself!
tan8888 2009-11-15
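I ran a quick test and the speed is pretty good! Thank you. I've only just started with Lucene and have a lot of questions, so I'd appreciate your guidance.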
tan8888 2009-11-15
Thanks for the code; I look forward to learning more from you!
balixiao 2009-11-22
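That works if your index files never change. But documents added to the index after the IndexSearcher was initialized will not be found by it. I also don't think it's a good idea to keep an I/O-related object open all the time.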
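The staleness issue described above can be illustrated without Lucene. The following is only a rough sketch: FakeIndex, SnapshotReader and RefreshingReaderHolder are hypothetical stand-ins invented for this example, not Lucene classes. The sketch assumes a reader sees only what existed when it was opened, and that the holder swaps in a new reader only when the index has changed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an index that a writer keeps appending to.
class FakeIndex {
    private final List<String> docs = new ArrayList<String>();

    synchronized void add(String doc) { docs.add(doc); }

    synchronized List<String> copyDocs() { return new ArrayList<String>(docs); }

    // Documents are only ever added, so the size can serve as a version number.
    synchronized int version() { return docs.size(); }
}

// Hypothetical stand-in for a reader: it sees only what existed when it was opened.
class SnapshotReader {
    private final List<String> docs;
    private final int version;

    SnapshotReader(FakeIndex index) {
        synchronized (index) {
            this.docs = index.copyDocs();
            this.version = index.version();
        }
    }

    boolean contains(String doc) { return docs.contains(doc); }

    int version() { return version; }
}

// Keeps one reader as a field and replaces it only when the index has changed.
class RefreshingReaderHolder {
    private final FakeIndex index;
    private SnapshotReader current;

    RefreshingReaderHolder(FakeIndex index) {
        this.index = index;
        this.current = new SnapshotReader(index);
    }

    synchronized SnapshotReader get() {
        if (current.version() != index.version()) {
            current = new SnapshotReader(index);
        }
        return current;
    }
}

class StaleReaderDemo {
    public static void main(String[] args) {
        FakeIndex index = new FakeIndex();
        index.add("doc1");
        RefreshingReaderHolder holder = new RefreshingReaderHolder(index);
        SnapshotReader old = holder.get();
        index.add("doc2");
        // The reader obtained before the update does not see the new document.
        System.out.println(old.contains("doc2"));
        // Asking the holder again yields a refreshed reader that does.
        System.out.println(holder.get().contains("doc2"));
    }
}
```

As long as the index is unchanged, get() keeps returning the same reader, so the cost of opening is paid only after an actual update.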
tan8888 2009-12-22
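I know, but creating a new IndexSearcher every time is much slower, and some files are indexed once and rarely re-indexed afterwards. Also, about "Keep the IndexReader around; initializing an IndexReader is expensive!

Creating a new IndexSearcher each time is fine!": as I recall, IndexSearcher uses an IndexReader every time too. Or have I misunderstood?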
luckaway 2009-12-22
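Yes, IndexSearcher goes through an IndexReader every time!
Once you pool the IndexReaders, each IndexSearcher is still newly created,
but the IndexReader underneath is always the same object!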
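The split between "new searcher each time" and "same reader object" can be sketched roughly as follows. CostlyReader and CheapSearcher are hypothetical stand-ins made up for this illustration, not Lucene classes. The sketch assumes that opening the reader is the expensive step and that wrapping it in a searcher costs almost nothing.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an expensive resource such as an index reader.
class CostlyReader {
    // Counts how many readers have been opened in total.
    static final AtomicInteger OPEN_COUNT = new AtomicInteger();
    private final String name;

    CostlyReader(String name) {
        // Imagine disk I/O here; this sketch only counts how often it happens.
        OPEN_COUNT.incrementAndGet();
        this.name = name;
    }

    String name() { return name; }
}

// Hypothetical stand-in for a lightweight wrapper such as a searcher.
class CheapSearcher {
    private final CostlyReader reader;

    CheapSearcher(CostlyReader reader) { this.reader = reader; }

    CostlyReader reader() { return reader; }

    String search(String query) { return reader.name() + ":" + query; }
}

class SharedReaderDemo {
    // Opened once and kept as a member variable for the lifetime of the object.
    private final CostlyReader sharedReader = new CostlyReader("index-a");

    // A new wrapper per request is cheap because it reuses the shared reader.
    CheapSearcher newSearcher() {
        return new CheapSearcher(sharedReader);
    }

    public static void main(String[] args) {
        SharedReaderDemo demo = new SharedReaderDemo();
        CheapSearcher first = demo.newSearcher();
        CheapSearcher second = demo.newSearcher();
        System.out.println(first != second);                   // two distinct searchers
        System.out.println(first.reader() == second.reader()); // one shared reader
        System.out.println(first.search("lucene"));
    }
}
```

However many searchers are created, the reader is opened only once, which is where the time saving comes from.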