Solr 3.5 released

Today a new version of Apache Solr was released, version 3.5.0. Here’s the release statement from the Lucene PMC:

The Lucene PMC is pleased to announce the release of Apache Solr 3.5.0!

See the CHANGES.txt file included with the release for a full list of details.

Solr 3.5.0 Release Highlights:

  • Bug fixes and improvements from Apache Lucene 3.5.0, including a very substantial (3-5X) RAM reduction required to hold the terms index on opening an IndexReader. (LUCENE-2205)
  • Added support for distributed result grouping. (SOLR-2066, SOLR-2776)
  • Added support for Hunspell stemmer TokenFilter supporting stemming for 99 languages. (SOLR-2769)
  • A new contrib module “langid” adds language identification capabilities as an Update Processor, using Tika’s LanguageIdentifier or the Cybozu language-detection library. (SOLR-1979)
  • Numeric types including Trie and date types now support sortMissingFirst/Last. (SOLR-2881)
  • Added hl.q parameter. It is optional and if it is specified, it overrides q parameter in Highlighter. (SOLR-1926)
  • Several minor bugfixes like date parsing for years from 0001-1000, ignored configurations when using QueryAnalyzer with SpellCheckComponent and many more. See CHANGES.txt entries for full details.
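
To illustrate the hl.q highlight above, here is a minimal sketch of a request URL that searches on one query and highlights on another. The base URL, the field name "content", and the helper name build_highlight_query are assumptions made for illustration; only the q, hl, hl.fl and hl.q parameters are Solr's own.

```python
from urllib.parse import urlencode

# Assumed location of a local Solr instance; adjust to your own setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_highlight_query(q, hl_q=None, hl_fl="content"):
    """Build a select URL that searches for `q`.

    When `hl_q` is given it is sent as hl.q, so the highlighter uses it
    instead of `q`. The field name in `hl_fl` is a hypothetical example.
    """
    params = {"q": q, "hl": "true", "hl.fl": hl_fl}
    if hl_q is not None:
        params["hl.q"] = hl_q
    return SOLR_SELECT_URL + "?" + urlencode(params)


# Search on the title field, but highlight occurrences of "commitWithin".
url = build_highlight_query("title:solr", hl_q="commitWithin")
print(url)
```

Leaving hl_q out produces a plain highlighting request, in which case the highlighter falls back to q.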

Contributions from Cominvent include the LanguageIdentifier, plugging the Hunspell stemmer into Solr, and SOLR-2742, which makes commitWithin more accessible through the SolrJ APIs. Also, Apache Tika is upgraded to version 0.10, fixing several bugs in parsing PDFs and Office documents.

Posted in Search technology, Technology

Discover CommitWithin in Solr

You may have been using Apache Solr for some time, and you know that you have to do a <commit/> in order for the <add>ed content to become searchable. But what commit strategy should you choose? Many rely on explicit commits from the client, or perhaps AutoCommit in solrconfig.xml. Explicit commits leave all the responsibility to the client, and you soon end up with either too frequent or unnecessary commits (wasting resources) or too few commits.

Sure, we have AutoCommit, where clients don’t need to think about committing, but that is less flexible. What if you sometimes want to index in larger batches, while at other times you need low latency?

Discover CommitWithin! CommitWithin is a commit strategy introduced in Solr 1.4 that lets the client ask Solr to make sure an <add> request gets committed within a certain time. This leaves the decision of when to commit to Solr itself, keeping the number of commits to a minimum while still meeting the update latency requirements. If I send an <add commitWithin="10000"> (in an XML update), that tells Solr to make sure the document gets committed within 10000 ms, i.e. 10 s. You can then continue to add other documents, and Solr will automatically do a <commit/> when the oldest <add> is due.
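
As a rough sketch of what such an XML update could look like from a client, the snippet below builds an <add> message carrying a commitWithin attribute. The helper names (build_add_xml, post_update), the update URL and the document fields are assumptions for illustration, not part of Solr or SolrJ.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_add_xml(docs, commit_within_ms):
    """Build an <add commitWithin="..."> message for a list of dicts.

    Each dict maps field names to values; the field names used here
    are hypothetical and must match your own schema.
    """
    add = ET.Element("add", {"commitWithin": str(commit_within_ms)})
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", {"name": name})
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def post_update(xml_body, url="http://localhost:8983/solr/update"):
    """POST the XML body to Solr. The default URL is an assumption."""
    req = urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


# Ask Solr to commit this document within 10 seconds; no explicit commit.
xml_body = build_add_xml([{"id": "doc1", "title": "Hello"}], 10000)
print(xml_body)
# post_update(xml_body)  # uncomment when a Solr instance is running
```

Note that the client never sends a <commit/> here; it only states the latency it can tolerate and leaves the timing to Solr.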


Posted in Technology

Mac OS X Lion still has some itches

After upgrading to Lion this week I ran into several issues, even though I’m on 10.7.1. I thought I’d share them, and their solutions, with you.

Spinning beachball at login screen

I bought a new SSD and performed a clean install to start from scratch. But even before restoring any of my old settings, I got a spinning beach ball on the login screen before I could log in. Sometimes it also went straight to a “bluescreen” telling me to restart.

The solution was found here. In short, you need to log in quickly before the lockup, then open the Energy Saver preferences and disable automatic graphics switching. It solved the issue for me.

Posted in Technology

Becoming a committer

The Apache way of developing open source software relies on an active community of users, contributors and developers. All of us can contribute in some way or another. Being a committer means that you participate actively in the software development work and have write access to the source code repository. Each project is led by the PMC (Project Management Committee), which consists of some of the committers taking on the extra responsibility of staking out the future of the project.

Posted in Search technology

New Solr MeetUp group in Oslo

The number of Solr users in the Oslo area is growing, and many have wished for a better community for open source search in the area. Therefore Cominvent, together with FindWise, has founded a MeetUp group to gather the Oslo Solr community. We’ll hold the first gathering at The Scotsman right after work on Wednesday, June 8th. See details at meetup.com.
