Discover CommitWithin in Solr

You may have been using Apache Solr for some time, and you know that you have to do a <commit/> in order for the <add>ed content to become searchable. But what commit strategy should you choose? Many rely on explicit commits from the client, or perhaps AutoCommit in solrconfig.xml. Explicit commits leave all the responsibility to the client, and you soon end up with either too frequent, unnecessary commits (wasting resources) or too few commits.

Sure, we have AutoCommit, where clients don’t need to think about committing, but then it gets less flexible. What if you sometimes want to index in larger batches, while other times you need low latency?

Discover CommitWithin! CommitWithin is a commit strategy introduced in Solr 1.4, which lets the client ask Solr to make sure this <add> request gets committed within a certain time. This leaves the decision of when to commit to Solr itself, keeping the number of commits to a minimum while still fulfilling the update latency requirements. If I send an <add commitWithin="10000"> (in an XML update), that tells Solr to make sure the document gets committed within 10000 ms, i.e. 10 s. You can then continue to add other documents, and Solr will automatically do a <commit/> when the oldest <add> is due.
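To make the XML update concrete, here is a minimal Python sketch that builds an <add> message carrying a commitWithin deadline. The helper name build_add_xml and the field names id and title are made up for illustration; only the commitWithin attribute on <add> comes from the text above.

```python
import xml.etree.ElementTree as ET


def build_add_xml(docs, commit_within_ms):
    """Build a Solr XML <add> message with a commitWithin deadline.

    build_add_xml is a hypothetical helper, not part of any Solr client.
    docs is a list of dicts mapping field names to values.
    """
    # commitWithin is given in milliseconds, as an attribute on <add>.
    add = ET.Element("add", {"commitWithin": str(commit_within_ms)})
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", {"name": name})
            field.text = str(value)
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(add, encoding="unicode")


# Example document with illustrative field names (assumed schema).
payload = build_add_xml([{"id": "doc1", "title": "Hello CommitWithin"}], 10000)
print(payload)
```

The resulting string would then be POSTed to your Solr update handler with a Content-Type of text/xml. The exact URL depends on your installation, so it is left out here.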

Becoming a committer

The Apache way of developing open source software relies on an active community of users, contributors and developers. All of us can contribute in some way or another. Being a committer means that you participate actively in the software development work and have write access to the source code repository. Each project is led by the PMC (Project Management Committee), which consists of some of the committers, who take on the extra responsibility of staking out the future of the project.

New Solr MeetUp group in Oslo

The number of Solr users in the Oslo area is growing, and many have wished for a better community for open source search in the area. Therefore Cominvent, together with FindWise, has founded a MeetUp group to gather the Oslo Solr community. We’ll hold the first gathering at The Scotsman right after work on Wednesday

Apache Solr 3.1 released

It’s been a long wait, and now it’s here: the release of Solr version 3.1. The 1.4.1 release was in June 2010, and for various reasons there was never a 1.4.2 or a 1.5 release. Part of the reason is the merge of the Lucene and Solr codebases, which is also why the version number is 3.1 instead of 1.5.

So what’s new? For me, the most important features are the Extended Dismax parser (SOLR-1553) and Geospatial search. The full list of improvements is found in CHANGES.TXT, but here are my favorites: