Monday, August 23, 2010

A First Look at Apache Lucene Java Version 3

Apache Lucene is "open-source search software" and its "flagship sub-project," Lucene Java, is a "high-performance, full-featured text search engine library written entirely in Java" (web site, 23 August 2010).  Lucene Java 3 has been available since late November 2009 (with 3.0.1 and 3.0.2 releases since then).  Because Lucene Java Version 3 is the first "Lucene release with Java 5 as a minimum requirement," its API is now able to take advantage of features such as enums, generics, varargs, and autoboxing.  As a result, Lucene Java 3 offers more type safety in the API and removes the need for those dreaded explicit casts.  Conveniently, the Second Edition of Lucene in Action covers Lucene 3.
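
To make the Java 5 point a bit more concrete, here is a minimal sketch of how enums, generics, varargs, and autoboxing can tighten up an API and eliminate casts.  The class, enum, and method names below (Java5StyleApiSketch, StoreOption, SimpleField, fieldsOf, countStored) are hypothetical names of my own for illustration; they are not Lucene classes and this is not Lucene's actual API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: none of these types are Lucene classes.
public class Java5StyleApiSketch {

    // An enum replaces loosely typed constants, so an invalid option cannot compile.
    public enum StoreOption { YES, NO }

    // A tiny stand-in for a document field (hypothetical, not Lucene's Field).
    public static final class SimpleField {
        public final String name;
        public final String value;
        public final StoreOption store;

        public SimpleField(String name, String value, StoreOption store) {
            this.name = name;
            this.value = value;
            this.store = store;
        }
    }

    // Varargs let callers pass any number of fields without building an array.
    // The generic return type tells the compiler what the list contains.
    public static List<SimpleField> fieldsOf(SimpleField... fields) {
        return new ArrayList<SimpleField>(Arrays.asList(fields));
    }

    // Count the stored fields; the enhanced for loop needs no cast from Object.
    public static int countStored(List<SimpleField> fields) {
        int stored = 0;
        for (SimpleField field : fields) {
            if (field.store == StoreOption.YES) {
                stored++;
            }
        }
        return stored;
    }

    public static void main(String[] args) {
        List<SimpleField> fields = fieldsOf(
            new SimpleField("title", "A First Look", StoreOption.YES),
            new SimpleField("body", "Full text here", StoreOption.NO));

        // Autoboxing: the int result is wrapped into an Integer automatically.
        List<Integer> counts = new ArrayList<Integer>();
        counts.add(countStored(fields));

        // No explicit cast is needed to read the element back out.
        SimpleField first = fields.get(0);
        System.out.println(first.name + " stored=" + counts.get(0));
    }
}
```

With a pre-Java 5 API, the list would hold plain Objects, so reading an element back would require a cast such as (SimpleField) fields.get(0), and a mistake would only surface at runtime.  With generics and enums, the compiler catches those mistakes instead.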

Apache Lucene is open source, is sponsored by the Apache Software Foundation (ASF), and is available via the Apache License, Version 2.  Apache Lucene can be downloaded from the Apache Download Mirrors. There is also third-party commercial support available through organizations such as Lucid Imagination.

Besides the implication of maturity conveyed by its Version 3 status, there is additional evidence of Lucene's maturity.  For example, Java.net featured the article Lucene Intro in July 2003 and the 2004 JavaOne conference (six years ago!) featured a Lucene presentation by Erik Hatcher called Lucene in Action.  According to that presentation, the first open source version of Lucene (0.1) was released clear back in 2000, and Release 2.0.0 came out in mid-2006.

Lucene is a widely used product.  According to its PoweredBy Wiki page, Lucene Java is used by many applications and web applications that are familiar to most of us.  The list includes AOL.com, Comcast, Disney, Eclipse, IBM, jGuru, JIRA, LinkedIn, Project Roller, TheServerSide, SourceForge.net, and Wikipedia.

There are several other useful resources for starting with Lucene Java.  The Lucene Java Wiki has several interesting pages including LuceneFAQ and HowTo.   The developerWorks article Using Apache Lucene to Search Text covers Apache Lucene 2.4.1.

This post has been a brief introduction to Apache Lucene with particular focus on Lucene Java.  One of the nice things about Lucene is that its non-Java implementations can use the same indexes as those built by Lucene Java.  I hope to write a few more blog posts in the near future talking about various aspects of using Lucene Java 3.
