Archive for the ‘Solr’ Category

Super flexible AutoComplete with Solr

Wednesday, January 25th, 2012

AutoComplete or AutoSuggest has in recent years become a “must-have” search feature. Solr can do AutoComplete in a number of ways (such as Suggester, TermsComponent and Faceting using facet.prefix), but in this post we’ll consider a more advanced and flexible option, namely querying a dedicated Solr Core search index for the suggestions. You may think that this sounds heavyweight, but we’re talking small data here so it is really efficient and snappy!
Even if it takes some work to set up, the benefits of this approach are really compelling: (more…)

Solr 3.5 released

Sunday, November 27th, 2011

Today a new version of Apache Solr was released, version 3.5.0. Here’s the release statement from the Lucene PMC:

The Lucene PMC is pleased to announce the release of Apache Solr 3.5.0!

See the CHANGES.txt file included with the release for a full list of details.

Solr 3.5.0 Release Highlights:

  • Bug fixes and improvements from Apache Lucene 3.5.0, including a very substantial (3-5X) RAM reduction required to hold the terms index on opening an IndexReader. (LUCENE-2205)
  • Added support for distributed result grouping. (SOLR-2066, SOLR-2776)
  • Added support for Hunspell stemmer TokenFilter supporting stemming for 99 languages. (SOLR-2769)
  • A new contrib module “langid” adds language identification capabilities as an Update Processor, using Tika’s LanguageIdentifier or Cybozu language-detection library (SOLR-1979)
  • Numeric types including Trie and date types now support sortMissingFirst/Last. (SOLR-2881)
  • Added hl.q parameter. It is optional and if it is specified, it overrides q parameter in Highlighter. (SOLR-1926)
  • Several minor bugfixes like date parsing for years from 0001-1000, ignored configurations when using QueryAnalyzer with SpellCheckComponent and many more. See CHANGES.txt entries for full details.

Contributions from Cominvent include LanguageIdentifier, Plugging in Hunspell stemmer in Solr and SOLR-2742 which makes commitWithin more accessible through the SolrJ APIs. Also, Apache Tika is upgraded to version 0.10, fixing several bugs in parsing PDFs and Office documents.

Discover CommitWithin in Solr

Friday, September 9th, 2011

You may have been using Apache Solr for some time, and you know that you have to do a <commit/> in order for the <add>ed content to become searchable. But what commit strategy should you choose? Many rely on explicit commits from the client, or perhaps AutoCommit in solrconfig.xml. Explicit commits leave all the responsibility to the client, and you soon end up with commits that are too frequent or unnecessary (wasting resources) or with too few commits.

Sure, we have AutoCommit, where clients don’t need to think about committing, but then it gets less flexible: what if you sometimes want to index in larger batches, while other times you need low latency?

Discover CommitWithin! CommitWithin is a commit strategy introduced in Solr 1.4, which lets the client ask Solr to make sure this <add> request gets committed within a certain time. This leaves the control of when to commit to Solr itself, keeping the number of commits to a minimum while still fulfilling the update latency requirements. If I send an <add commitWithin="10000"> (in an XML update), that tells Solr to make sure the document gets committed within 10000 ms, i.e. 10 s. You can then continue to add other documents, and Solr will automatically do a <commit> when the oldest <add> is due.
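
To make the XML update concrete, here is a minimal Python sketch that builds an <add> message carrying a commitWithin attribute. The helper name build_add_xml and the example field names (id, title) are made up for illustration and are not part of Solr; the sketch only constructs the message and does not send it.

```python
import xml.etree.ElementTree as ET


def build_add_xml(docs, commit_within_ms):
    """Build a Solr XML <add> message that asks for a commit within
    commit_within_ms milliseconds (hypothetical helper, not a Solr API)."""
    # commitWithin is given in milliseconds as an attribute on <add>.
    add = ET.Element("add", {"commitWithin": str(commit_within_ms)})
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", {"name": name})
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


# Example field names are assumptions; use the fields from your own schema.
payload = build_add_xml([{"id": "doc1", "title": "Hello Solr"}], 10000)
print(payload)
```

You would then POST the resulting string to your core's XML update handler with a Content-Type of text/xml. The exact URL depends on your installation, so it is left out here.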

(more…)

Becoming a committer

Thursday, June 16th, 2011

The Apache way of developing open source software relies on an active community of users, contributors and developers. All of us can contribute in some way or another. Being a committer means that you participate actively in the software development work and have write access to the source code repository. Each project is led by the PMC (Project Management Committee), which consists of some of the committers who take on the extra responsibility of staking out the future of the project. (more…)

New Solr MeetUp group in Oslo

Thursday, May 19th, 2011

The number of Solr users in the Oslo area is growing, and many have wished for a better community for open source search in the area. Therefore Cominvent, together with FindWise, has founded a MeetUp group to gather the Oslo Solr Community. We’ll hold the first gathering at The Scotsman right after work on Wednesday, June 8th. See details at meetup.com.