Hibernate Search freshly baked features

June 6, 2007

I had to release Hibernate Search Beta3 early after we discovered a fairly severe bug in Beta2, but I still had time to add some new features. Coming after those introduced in Beta2, that's a fairly good week :)

Batch size limit on object indexing
If you don't pay attention when initially indexing (or reindexing) your data, you may run into out-of-memory errors. The old solution was to execute indexing in several smaller transactions, but the code ended up being fairly complex. Here is the new solution: cap the number of queued indexing operations with a configuration property, then clear the session periodically while scrolling through the data:

hibernate.search.worker.batch_size=5000

int batchSize=5000;
//scroll will load objects as needed
ScrollableResults results = fullTextSession.createCriteria( Email.class )
    .scroll( ScrollMode.FORWARD_ONLY );
int index = 0;
while( results.next() ) {
    index++;
    fullTextSession.index( results.get(0) ); //index each element
    if (index % batchSize == 0) fullTextSession.clear(); //clear every batchSize
}

Wrap that in one transaction and you are good to go.
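To see why the periodic clear keeps memory bounded, here is a minimal, self-contained sketch of the same loop shape. It does not use Hibernate Search at all: IndexingSession and CountingSession are hypothetical stand-ins invented for illustration, and the toy session simply holds on to every indexed object until clear() is called, which is only an approximation of what a real session does.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class BatchIndexingSketch {

    /** Hypothetical stand-in for the two session calls used in the loop (not a Hibernate type). */
    interface IndexingSession {
        void index(Object entity);
        void clear();
    }

    /** Toy session: keeps a reference to every indexed entity until clear() is called. */
    static final class CountingSession implements IndexingSession {
        private final List<Object> held = new ArrayList<>();
        private int peak = 0;
        private int indexed = 0;

        @Override
        public void index(Object entity) {
            held.add(entity);
            indexed++;
            peak = Math.max(peak, held.size()); // track the largest number of objects held at once
        }

        @Override
        public void clear() {
            held.clear(); // release everything accumulated so far
        }

        int peak() { return peak; }
        int indexed() { return indexed; }
    }

    /** Index each element and clear the session every batchSize elements. */
    static void indexAll(Iterator<?> entities, IndexingSession session, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int index = 0;
        while (entities.hasNext()) {
            index++;
            session.index(entities.next());
            if (index % batchSize == 0) {
                session.clear();
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> emails = new ArrayList<>();
        for (int i = 0; i < 12_000; i++) {
            emails.add(i);
        }
        CountingSession session = new CountingSession();
        indexAll(emails.iterator(), session, 5_000);
        System.out.println("indexed=" + session.indexed() + " peak=" + session.peak());
    }
}
```

However many objects are scrolled, the toy session never holds more than batchSize of them at a time, which is the property the real loop relies on.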

Native Lucene
The APIs were never officially published (until Beta3), but Hibernate Search lets you fall back to native Lucene when needed. All the APIs you need are exposed by SearchFactory.

DirectoryProvider provider = searchFactory.getDirectoryProvider(Order.class);
org.apache.lucene.store.Directory directory = provider.getDirectory();

This is the brute-force approach: it gives you direct access to the Lucene Directory containing Orders. A smarter way, if you intend to execute a search query, is to use the ReaderProvider:

DirectoryProvider clientProvider = searchFactory.getDirectoryProvider(Client.class);
ReaderProvider readerProvider = searchFactory.getReaderProvider();
IndexReader reader = readerProvider.openReader(clientProvider);

try {
   //do read-only operations on the reader
}
finally {
   readerProvider.closeReader(reader);
}

This is smarter because you share the same IndexReaders as Hibernate Search, and so avoid unnecessarily opening and warming up an IndexReader.
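The benefit of sharing can be illustrated with a small, self-contained sketch. This is not how Hibernate Search implements its ReaderProvider: FakeReader and SharingProvider are hypothetical names made up for the example, and the only point is that borrowing and returning a shared reader avoids repeating the expensive open.

```java
import java.util.function.Supplier;

public class SharedReaderSketch {

    /** Hypothetical stand-in for an index reader that is expensive to open. */
    static final class FakeReader {
        final int id;
        FakeReader(int id) { this.id = id; }
    }

    /** Toy provider: opens the underlying reader once and hands the same instance to every caller. */
    static final class SharingProvider {
        private final Supplier<FakeReader> opener;
        private FakeReader shared;
        private int borrowed = 0;

        SharingProvider(Supplier<FakeReader> opener) {
            this.opener = opener;
        }

        synchronized FakeReader openReader() {
            if (shared == null) {
                shared = opener.get(); // the expensive open happens only on first use
            }
            borrowed++;
            return shared;
        }

        synchronized void closeReader(FakeReader reader) {
            if (reader != shared || borrowed == 0) {
                throw new IllegalStateException("reader was not borrowed from this provider");
            }
            borrowed--; // handed back to the provider, deliberately left open for the next caller
        }

        synchronized int borrowed() { return borrowed; }
    }

    public static void main(String[] args) {
        int[] opens = {0}; // counts how many times the expensive open runs
        SharingProvider provider = new SharingProvider(() -> new FakeReader(++opens[0]));
        for (int i = 0; i < 3; i++) {
            FakeReader reader = provider.openReader();
            try {
                // do read-only operations on the reader
            } finally {
                provider.closeReader(reader);
            }
        }
        System.out.println("physical opens: " + opens[0]);
    }
}
```

The try/finally shape is the same as in the snippet above: always hand the reader back to the provider that gave it to you rather than closing it yourself.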

Finally, you can optimize a Lucene index (roughly a defragmentation):

SearchFactory searchFactory = fullTextSession.getSearchFactory();
searchFactory.optimize(Order.class);
//or searchFactory.optimize();
