The Solr distros are coming

Open source search is gaining more and more traction. First we had Lucene (2001), giving great search to programmers. Then we got Solr (2006), making search accessible to non-programmers, although a certain level of expertise is still needed. And then came Constellio, an open source (GPL) enterprise search distribution (distro) built on Solr, adding a slick GUI, connector and crawling support, and more.

Say again. A Solr distro?

I call it “distro” because I like to compare the evolution to what we have seen in GNU/Linux. First there was the Linux core. Then there were the GNU tools that made Linux so much more usable, but still only for engineers comfortable with the command line. And last, companies like Red Hat and SUSE built complete distros including a modern GUI and ready-to-use tools such as OpenOffice, Thunderbird and more. Without these distros, Linux would have remained just a “core”, leaving it to the user to add the extra sugar.

I dare say that the same is about to happen with open source search. Many companies out there already have their own proprietary Apache Solr/Lucene-based “distro”, but Constellio is the first open source one I have seen so far.

As a Solr user, you’ll feel at home within ./constellio/tomcat/webapps/constellio/WEB-INF/solrcores/<your_core> where you’ll find the regular schema, solrconfig etc. But I do suspect that any manual edits here will be overwritten by the GUI…
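One way to test that suspicion is to fingerprint the core directory before and after making a change in the GUI. The following is a minimal sketch, not part of Constellio: the helper names `snapshot` and `changed_files` are made up for illustration, and it assumes nothing about the directory layout beyond it being a readable folder of files.

```python
import hashlib
from pathlib import Path


def snapshot(core_dir):
    """Map each file under core_dir (by relative path) to its SHA-256 digest."""
    root = Path(core_dir)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(before, after):
    """Return the sorted relative paths that were added, removed, or modified."""
    return sorted(
        name
        for name in before.keys() | after.keys()
        if before.get(name) != after.get(name)
    )
```

Take a snapshot of your core directory right after a manual edit, save something in the admin GUI, then take a second snapshot. If the file you edited by hand shows up in `changed_files(before, after)`, the GUI has rewritten it.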

Tapping into Google Search Appliance

The creators of Constellio have done a pretty good job with this first 1.0 release: easy installation, a nice administration GUI, easy to get started with crawling, and so on. And they have been bold enough to tap into Google’s open-sourced GSA connectors, available at Google Code, as opposed to using ManifoldCF from Apache or another connector framework. They also hook into the Google OneBox APIs, enabling users to plug into all the smart search “widgets” that can, for instance, intercept the query and, if it contains a stock ticker, deliver a stock price graph on top of the search results. Nifty! I bet Google didn’t anticipate their connector framework being used outside of the GSA…
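To make the widget idea concrete, here is a rough sketch of the query interception pattern described above. It is not the Google OneBox API and not Constellio code: `KNOWN_TICKERS`, `intercept_query`, `search`, and the shape of the returned dictionaries are all hypothetical, chosen only to illustrate the flow.

```python
import re

# Hypothetical ticker list; a real module would consult a market data source.
KNOWN_TICKERS = {"GOOG", "MSFT", "IBM"}

# A query made up of one short alphabetic token is treated as a ticker candidate.
TICKER_PATTERN = re.compile(r"^[A-Za-z]{1,5}$")


def intercept_query(query):
    """Return a widget description if the query is a known ticker, else None."""
    candidate = query.strip()
    if TICKER_PATTERN.match(candidate) and candidate.upper() in KNOWN_TICKERS:
        return {"widget": "stock_chart", "symbol": candidate.upper()}
    return None


def search(query, backend):
    """Run the normal search and attach any intercepted widget to the response."""
    return {"onebox": intercept_query(query), "results": backend(query)}
```

The regular results are always returned. The widget is an optional extra that the front end can render on top when `onebox` is not `None`.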

So what’s the catch?

Well, for one, it is GPL (v3), meaning that it excludes some potential users right away (unless they are able to dual license?). You have to register on the site in order to download, meaning you’ll probably be contacted at some point in time by sales – no big deal. It is open source and the source code is available, but it is not developed by a community in an open way. You can download the source as a zip, but if you change it, who’s gonna maintain your changes? Probably yourself…

Luckily there are no limits on the number of documents you can index or on the QPS rate. Thus it is a truly free (as in free beer) solution, which cannot be said about the weak MS Search Server Express or the old and maxdoc-limited OmniFind Yahoo! Edition. Being free™ may be reason enough for many users who would otherwise have to pay consultants to build a solution from scratch out of the individual components.

Constellio’s business model is to live from support and consulting fees, and that may very well work. But I cannot see how they will be able to create a true open community around their product, and for that reason I believe it will be a distro without very large adoption.


It is obviously an early version 1.0. If it were an ASF project, it would probably carry a 0.x version number. A few quirks: the logo upload did not work, it identified my Norwegian web pages as Danish, and it crashed on me (see screen shots). But good luck to the creators in making this into a mature Solr distro.

Screen shots

Constellio search page

Constellio collection admin

Constellio - edit collection

Constellio - server management tab

Constellio - connectors management

Constellio - edit field types in GUI

Constellio - configuring a field type with analysis

Constellio - hey, it's only version 1.0 🙂

Comments (2)

  1. Rida Benjelloun

    Constellio 1.1 is under LGPL licence.
