Monday, December 31, 2012

Significant Software Development Developments of 2012

I have written before (2007, 2008, 2009, 2010, 2011) on my biased perspective of the most significant developments in software development for that year. This post is the 2012 version with all my biases and skewed perspectives freely admitted.

10. Groovy 2.0

Groovy 2.0 has been an important version for Groovy. Arguably the most notable new features of Groovy 2 are its static type checking and static compilation capabilities.
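As a rough illustration of the static compilation feature, here is a minimal sketch using the @CompileStatic annotation that arrived with Groovy 2.0. The totalLength method and its sample input are hypothetical names chosen only for this illustration.

```groovy
import groovy.transform.CompileStatic

// With @CompileStatic, the method body is type checked and compiled statically.
// A mistake such as calling word.lenght() would be reported at compile time
// rather than failing at runtime.
@CompileStatic
int totalLength(List<String> words)
{
   int total = 0
   for (String word : words)
   {
      total += word.length()
   }
   return total
}

println totalLength(['Groovy', 'Java'])   // prints 10
```

The related @TypeChecked annotation applies the same checks without changing how the code is compiled.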

9. Perl Turns 25

Perl celebrated its 25th anniversary in 2012. Love it or loathe it, Perl has definitely become the predominant scripting language, especially in the non-Windows environments. Perl is not my favorite (I'd rather use Groovy, Python, or Ruby), but I find myself needing to use it at times, usually because I'm modifying or using an existing script or set of scripts already written in Perl. Happy Birthday, Perl!

8. Git and GitHub

Git is the trendy choice now in version control and GitHub is equally trendy for hosting code. The post Why Would Anyone use Git in their Enterprise? states, "Git has a cult-like following in the development community." The book Pro Git (2009) is freely available for reading online and can be downloaded as a PDF, mobi, or ePub electronic book.

7. NoSQL and Polyglot Persistence

The NoSQL concept seems to be maturing and moving from unabated hype and hyperbole to understanding when it works well and when it doesn't. In 7 hard truths about the NoSQL revolution, Peter Wayner writes: "NoSQL systems are far from a perfect fit and often rub the wrong way. The smartest developers knew this from the beginning. ... the smart NoSQL developers simply noted that NoSQL stood for 'Not Only SQL.' If the masses misinterpreted the acronym, that was their problem."

Martin Fowler's nosql page states: "The rise of NoSQL databases marks the end of the era of relational database dominance. But NoSQL databases will not become the new dominators. Relational will still be popular, and used in the majority of situations. They, however, will no longer be the automatic choice." With this, Fowler introduced the concept of polyglot persistence (which he mentions was previously coined by Scott Leberknight in 2008) and explicitly compared this to the concept of polyglot programming. If we as a software development community believe that the advantages of using multiple languages in the same application (polyglot programming) are worth the costs, then it follows that we might also determine that the advantages of using multiple data stores within the same application (polyglot persistence) are also worth the costs of doing so.

6. Mobile Development

Mobile development continued its rapid rise in 2012. The December 2012 write-up on the Tiobe Index states that Objective-C is likely to be the language of the year again in 2012 due to its rapid rise (third in December 2012 behind only C and Java and ahead of C++ and C#). The writers of that summary conclude about language ratings on this index, "In fact it seems that if you are not in the mobile phone market you are losing ground."

Suzanne Kattau's post Mobile development in the year 2012 succinctly summarizes the changes in popular mobile device platforms and operating systems in 2012.

5. Scala (and Typesafe Stack 2.0 with Play and Akka)

I have highlighted Scala multiple times in these year-end review posts, but this is my highest rating of Scala because Scala has seen a tremendous year in 2012. On 23 August 2012, Cameron McKenzie asked, "Is Scala the new Spring framework?" An answer to that question might be implied by the 1 October 2012 announcement that Spring founder Rod Johnson had joined Typesafe's Board of Directors (Johnson left SpringSource in July). Scala provides the intersection of again-trendy functional programming with widely popular and proven object-oriented programming and is part of the increasingly popular movement to languages other than Java on the JVM. It's not difficult to see why it had a big year in 2012.

The Typesafe Blog features a post called Why Learn Scala in 2013? that begins with the statement, "2012 was a big year for the Scala programming language - with monumental releases, adoption by major enterprises and social sites alike and a Scala Days conference that knocked the socks off years past." The post then goes on to list reasons to learn Scala in 2013 with liberal references to other recent positive posts regarding Scala.

Ted Neward has predicted that in 2013 "Typesafe (and their Scala/Akka/Play! stack) will begin to make some serious inroads into the enterprise space, and start to give Groovy/Grails a serious run for their money." I am not calling Play and Akka out in this post as separate significant developments in 2012, but instead lump them together with Scala as part of justifying Scala taking the #5 spot for 2012. There is no question, however, that 2012 was a big year for Akka and Play. The year 2012 saw the release of Typesafe Stack 2.0 along with Play 2.0 and Akka 2.0.

4. Big Data

Big Data was big in 2012. AOL Government has named Big Data its Best of 2012 for the technology category. Geoff Nunberg argues that "'Big Data' Should Be The Word Of The Year." Interest in the statistical computing language R has (not surprisingly) risen along with the surging interest in Big Data.

3. HTML5

2012 was another big year for HTML5. Although HTML5 continued to be evangelized as a standards-friendly favorite of developers, some hard truths (such as performance versus native code) about the current state of HTML5 also became more readily obvious.

That being stated, I think HTML5 still has a very positive future ahead of it. Although it has certainly been over-hyped with emphasis on what it might one day become rather than what it is today, it would also be foolhardy to ignore it or underestimate its usefulness. Two articles that remind us of this are FT exec: HTML5 is not dead and HTML5 myth busting. The article 'HTML5 is ready' say creators of mobile HTML5 Facebook clone talks about attempts to prove HTML5 is ready today from a performance standpoint.

2. Security

Awareness of security holes, risks, and vulnerabilities has been increasing for the past several years largely due to high-profile incidents of lost sensitive data and new legal requirements. However, 2012 seemed to be a bigger year than most in terms of increasing awareness of security requirements and expectations in software architecture and design.

Java seemed to be particularly hard hit by bad security news in 2012. Articles and posts that provide examples of this include 1 Billion computers at risk from Java exploit, Oracle's Java Security Woes Mount As Researchers Spot A Bug In Its Critical Bug Fix, Java Vulnerability Affects 1 Billion Plug-ins, Another Week, Another Java Security Issue Found, Oracle and Apple Struggle to Deal with Java Security Issues, and Java still has a crucial role to play—despite security risks.

The article Oracle to stop patching Java 6 in February 2013 suggests that users of Java should upgrade to Java 7 before February 2013 when Oracle will supply the last publicly available security patch to Java SE 6 outside of an Oracle support plan. Another article is called Oracle's Java security update lacking, experts say.

1. Cloud Computing

It seemed like everybody wanted a cloud in 2012 even if they didn't really need one. Archana Venkatraman put it this way, "2012 was the year cloud computing hit the mainstream." Steve Cerocke stated, "2012 will go down as the year of cloud computing." Other articles and posts on the biggest cloud stories of 2012 include The 10 Biggest Cloud Stories Of 2012 and Top five cloud computing news stories in 2012.

Cloud Computing is in the sweet spot many trendy technologies and approaches go through when enthusiasm is much greater than negativism. Charles Babcock's Cloud Computing: Best And Worst News Of 2012 is more interesting to me than many cloud-focused publications because it highlights the good and the bad of cloud computing in 2012.

Honorable Mention

I couldn't fit everything that interested me about software development in 2012 into the Top Ten. Here are some others that barely missed my cut.

C

As mentioned earlier, the C programming language appears headed for #1 on the Tiobe Index for 2012. One of programming's most senior languages is also one of its most popular. When one considers that numerous other languages are themselves built on C and when one considers that many languages strive for C-like syntax, the power and influence of C is better appreciated. C's popularity has remained strong for years and 2012 was another very strong year for C.

Another piece of evidence arguing C's case is the late 2012 O'Reilly publication of Ben Klemens's book 21st Century C: C Tips from the New School. The author discusses this book and C today in the O'Reilly post New school C.

Although I have not written C code for several years now, I've always had a fondness for the language. It was the language I used predominantly in college (with Pascal and C++ being the other languages I used to a lesser degree) and I wrote the code for my electrical engineering capstone project in C. I remember (now fondly) spending almost an entire Saturday on one of my first C assignments fighting a bug only to realize that it was not working because I was using the assignment operator (=) rather than the equality operator (==). This lesson served me well as I learned other languages in terms of both what is important to differentiate and in terms of how to better debug programs even when a debugger is not readily available. I think my C experience grounded me well for later professional development with C++ and Java.

Gradle 1.x

Using an expressive programming language rather than XML or archaic make syntax to build software seems like an obviously beneficial thing to do. Although make, Ant, and Maven have continued to dominate in this area, Groovy-based Gradle shows early signs of providing the alternative we've all been looking for. Gradle still has a long way to go in terms of acceptance and popularity and there are many other build systems with some of Gradle's ideals that have failed, but Gradle did seem to capture significant attention in 2012 and can hopefully build upon that in future years. Gradle 1.0 was formally released in June 2012 and Gradle 1.3 was released in November 2012.

DevOps

Among others, Scott Ambler predicted that "DevOps will become the new IT buzzword" in 2012. If it is not "the" buzzword of 2012, it is not for a lack of trying on the DevOps evangelists' part. The DevOps movement continued to gain momentum in 2012. The DZone DevOps Zone sees one or more posts on the subject added each day. The only reason this did not make it into my Top Ten is that I still don't see "Typical Everyday Coder" talking about it when I am away from the blogosphere talking to in-the-trenches developers.

Ambler's concluding paragraph begins with this prediction, "Granted, there’s likely going to be a lot of talk and little action within most organizations due to the actual scope of DevOps, but a few years from now, we’ll look back on 2012 as the year when DevOps really started to take off." Only time will tell. There continue to be posts trying to explain what exactly DevOps is.

Departures of Noteworthy Development Personnel

There were some separations of key developers from their long-time organizations in 2012. As mentioned previously, Spring Framework founder Rod Johnson left VMware/SpringSource (and ultimately ended up on the Board of Directors for Scala company Typesafe). Josh Bloch, perhaps still best known for his work at Sun on the JDK and for writing Effective Java, left Google in 2012 after working there for eight years.

Resurgence of Widely Popular but Aged Java Frameworks

Two very popular long-in-the-tooth Java-based frameworks saw a resurgence in 2012. Tomek Kaczanowski recently posted JUnit Strikes Back, in which he cites several pieces of evidence indicating a resurgence in JUnit, arguably the most commonly used testing framework in Java (and, in many ways, the inspiration for numerous other nUnit-based unit testing frameworks). Christian Grobmeier's recent post The new log4j 2.0 talks about many benefits of Log4j2 and how it can be used with more recently popular logging frameworks such as SLF4J and even Apache Commons Logging.

Electronic Books (E-books)

Electronic books (e-books) are becoming widely popular in general, but also specifically within software development books. This is not surprising because e-books provide many general benefits, but also have benefits particular to software development. In particular, it is nice to be able to electronically search for terms (overcoming the poor index problem common to many printed programming books). Other advantages include the ability to have the book on laptops, mobile devices, e-readers, and other portable devices. This not only makes a particular book readily available, but makes it easy to carry many books on many different technical subjects when traveling. It is also less likely for the electronic book to be "borrowed" unknowingly by others or turn up missing.

Perhaps the biggest advantage of electronic books is cost. It is fairly well known that technical books are generally not big profit makers for publishers. However, with printing and distribution costs being a significant portion of traditional publication costs, e-books make it easier for publishers to price these books at a lower cost than the printed equivalent.

The reduced cost to the publisher for an electronic book can be passed onto the consumer. I recently took advantage of an offer from Packt Publishing to purchase a total of eight of their books as electronic books for a total price of $40. Given that a single printed programming book can cost $40 or more, this was a bargain. I have also blogged on other good deals on e-books provided by other technical publishers such as O'Reilly and Manning.

I have especially appreciated the Manning Early Access Program (MEAP). This program is only viable thanks to electronic books and allows readers to read a book as it is developed. Because technologies change so quickly, it is often helpful to get access to even portions of these books as quickly as possible.

Finally, another advantage of e-books is their ultimate disposal. In reality, they take up such a small portion of even an old-fashioned CD or DVD, that I can usually dig up a copy if I want to. However, I can remove them from my electronic devices when I no longer need them and need the space. There are no environmental or logistic concerns about their disposal. This is important because these books do tend to get outdated quickly and sometimes an outdated programming book is worse than having no book at all because it can be very misleading.

PhoneGap / Cordova

Given the popularity of mobile development and HTML5 in 2012, it's not surprising that PhoneGap and Cordova had big years in 2011/2012. In the web versus native debate, one of the advantages of web is the portability of web apps across multiple devices. The PhoneGap/Cordova approach brings some of this benefit for writing code but maintains some of the performance advantages of running native applications.

Objective-C

Objective-C looks to win the Tiobe Index language of the year again in 2012. This is yet another indicator of mobile development prevalence as Objective-C's popularity is almost exclusively tied to iPhone/iPad development, though Objective-C's history is closely coupled with the NeXT workstations and even has been called an inspiration for Java (advertised as quoted by Patrick Naughton) instead of C++.

Kotlin

For several years now, it has been trendy for the "cool kids" to post feedback messages on articles or blogs about Java features proclaiming that Groovy or Scala does anything covered in that blog post or article better than Java does it. Many of the "cool kids" (or maybe different "cool kids" with the same modus operandi) now seem to be doing the same on Scala blog posts and articles, advocating the advantages of Kotlin over Scala.

As Scala and Groovy still lag Java in terms of overall popularity and usage, Kotlin lags both Groovy and Scala in terms of adoption at this point. However, there are definitely some characteristics of Kotlin in its favor. First, the Kotlin web page describes the language as, "a statically typed programming language that compiles to JVM byte codes and JavaScript." I could definitely see how "statically typed" and "compiles ... to JavaScript" would be endearing to developers who must write JavaScript but prefer static compilation. Andrey Breslav, Kotlin Project Lead, believes that static languages compiling to "typed JavaScript" will be a major development of 2013 and he cites Dart and TypeScript as other examples of this. Being able to run on the JVM can also be an advantage, though this is no different than Groovy or Scala.

One major positive for Kotlin is its sponsor: JetBrains. It is likely that their IDE, IntelliJ IDEA, will provide elegant support for the Kotlin language. This is also a sponsor/owner with the resources (monetary and people) to improve chances for success. Because JetBrains is "planning to roll out a beta of Kotlin and start using it extensively in production," they are more likely to continue investing resources in this new language.

There was no way I could justify to myself putting Kotlin in my top ten for 2012, but once it is released for production use, it is possible that Kotlin may make another year's Top Ten list if it is widely adopted.

Ceylon

Kotlin isn't the only up-and-coming static language for the JVM; Ceylon is also relatively young in this space. I wrote about the JavaOne 2012 presentation Introduction to Ceylon and that post provides brief overview and description information.

The first milestone of Ceylon IDE (Eclipse plug-in) was released in early 2012 and was followed in March with the release of Ceylon M2 ("Minitel"). This was followed by the Ceylon M3 ("V2000") release in June and Ceylon M4 ("Analytical Engine") in October.

The newer JVM-friendly languages with seemingly the best chances of long-term viability are those with strong sponsors: Groovy has SpringSource, Scala has Typesafe, Kotlin has JetBrains, and Ceylon has Red Hat.

End of Oracle/Google Android Lawsuit

The lawsuit between Oracle and Google over Android seems to have, for the most part, concluded in 2012. There still seems to be bad blood between the two companies, but the settlement of this probably allows for continued success of the Android platform and potentially for collaboration at some future point between the two companies on Java. It will be interesting to see if Google allows its employees to submit abstracts to JavaOne 2013.

Everyone a Programmer

When HTML first started to expand across the universities and colleges in the early-to-mid 1990s, it seemed that everyone I knew was learning HTML. Most of us were "learning" it by copying other peoples' HTML source and editing it for our own use. Of course, everything then was static and fairly easy to pick up. It probably also skewed my perspective that I was majoring in electrical engineering with an emphasis in computer science and so was around people who had a tendency to adopt and use new technology. Perhaps for the first time since then, I have felt that there is an ever-growing interest in pushing everyone to learn how to program at a certain level. I don't need to provide any supporting points for this because I can instead reference Joab Jackson's 2012: The year that coding became de rigueur. Not only does this post enumerate several examples of the debate about whether everyone should learn programming, but it also makes cool use of "de rigueur" in its title.

Java

I did not include Java itself in my Top Ten in 2012. Perhaps this is an indication that I too felt that 2012 was a slow year for Java (and agree that this is not necessarily a bad thing). That being stated, Martijn Verburg has listed some "personal highlights of the year" in the world of Java in 2012 in What will 2013 bring? Developers place their bets. These include the JVM's entry into the cloud, Java/JVM community growth, OpenJDK, Java EE 7, and Mechanical Sympathy.

It's a small thing in many ways, but I think James Gosling returning to JavaOne and throwing out t-shirts was symbolic of a strengthening resurgence among an already strong Java development community.

Jelastic

Java on the cloud in general had a big year in 2012. Jelastic had a particularly big year. The screen snapshot below is from an e-mail sent out by Jelastic COO Dmitry Sotnikov.

Jelastic was prominently mentioned at JavaOne 2012 multiple times. Some of these mentions were due to winning the Duke's Choice Award for 2012. Other mentions resulted from James Gosling's positive review of his use of Jelastic. As I blogged on earlier, Gosling described himself as "a real Jelastic fan" at the JavaOne 2012 Community Keynote.

Linux-based ARM Devices

Oracle announced release of a Linux ARM JDK in 2012. The ability to run even JavaFX on Linux ARM devices provides evidence of Oracle's interest in supporting Linux ARM devices. Given that Oracle is well-known for investing in areas where returns are often great, it follows that perhaps Oracle sees great potential for the Linux ARM devices in the future. An interesting article that looks into this is Java 8 on ARM: Oracle's new shot against Android?

One couldn't go to a keynote presentation at JavaOne 2012 without hearing about one very trendy Linux ARM device, the Raspberry Pi. Similarly, the BeagleBoard and PandaBoard have also become very popular.

Improving Job Market for Software Developers

2012 seemed to be a good year for those with careers in software development and this seems likely to continue. For example, CNN Money ranked software developer as #9 in its Best Jobs in America story (software architect was #3). Lauren Hepler has also written that Software developers top 2013 job projection, citing a Forbes slides-based presentation.

Perhaps more important than these stories are my anecdotal observations of a surging market for software developers. I have seen an uptick in the number of unsolicited employment queries from potential employers and clients. I am also seeing an increase in billboard advertising for developers, especially in areas of the United States with high concentrations of software development. This improving job market might be one of many reasons for increasing interest in programming in general.

Other Resources

There are other posts of potential interest. Katherine Slattery's Takeaways from the Top Development News Stories of 2012 talks about the Node.js "hype cycle," open source hardware, native apps versus HTML5 apps, and the "'learn to code' craze."

Ted Neward's annual predictions (for 2013 in this case) and review of his prior year's predictions (for 2012 in this case) are an interesting read.

Conclusion

2012 was another big year in software development across many different areas and types of development. Many of the themes discussed in this post overlap and are somehow associated with mobile development, cloud computing, and greater availability of data.

Friday, December 21, 2012

$5 E-books at Packt

Packt Publishing is offering $5 (USD) e-books when two or more are purchased through 3 January 2013. Their Stock Your Reader for Christmas page contains the details and I have included a snapshot of that here.

I've had my eye on a few of Packt's books, but wasn't sure how much time I'd have to read them. However, at $5 apiece, it was an easy decision to purchase them, even if I'm only able to browse them and read the most interesting portions. To take advantage of the deal, two or more e-books must be purchased. Fortunately, all of the Packt Publishing books appear to be part of this deal rather than a limited selection. As the next two screen snapshots indicate, I was able to purchase six e-books of interest to me for $30 (USD).

I don't know how good each of these books is, but at $5 per book, it wasn't much of a gamble. As the screen snapshots above indicate, I ended up purchasing Play Framework Cookbook (August 2011), HTML5 Mobile Development Cookbook (February 2012), PhoneGap Beginner's Guide (September 2011), EJB 3.1 Cookbook (June 2011), Apache Maven 3 Cookbook (August 2011), and Java EE 6 with GlassFish 3 Application Server (July 2010). I am tempted to go back and purchase the PhoneGap Mobile Application Development Cookbook (October 2012) e-book and, because you need at least two for the $5 each deal, I may be "forced" to purchase HTML5 Canvas Cookbook (November 2011) as well. For the typical price of a single book, I am able to get 6 to 8 books in a variety of subjects with this offer.

Saturday, December 15, 2012

Groovy: Multiple Values for a Single Command-line Option

One of the many features that make Groovy an attractive scripting language is its built-in command-line argument support via CliBuilder. I have written about CliBuilder before in the posts Customizing Groovy's CliBuilder Usage Statements and Explicitly Specifying 'args' Property with Groovy CliBuilder. In this post, I look at Groovy's CliBuilder's support for multiple arguments passed via a single command-line flag.

The Groovy API Documentation includes this sentence about CliBuilder:

Note the use of some special notation. By adding 's' onto an option that may appear multiple times and has an argument or as in this case uses a valueSeparator to separate multiple argument values causes the list of associated argument values to be returned.

As this documentation states, Groovy's built-in CliBuilder support allows a parsed command line flag to be treated as having multiple values and the convention for referencing this argument is to add an "s" after the "short" name of the command-line option. Doing so makes the multiple values associated with a single flag available as a collection of Strings that can be easily iterated to access the multiple values.
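The convention can be sketched in a few lines. This is only a minimal illustration under some assumptions: the option name, usage text, and hard-coded argument array are hypothetical, and the explicit CliBuilder import assumes Groovy 2.5 or later with the groovy-cli-commons module available (on older Groovy versions, CliBuilder is imported by default from groovy.util and that import line should be removed).

```groovy
// Assumption: Groovy 2.5+ with groovy-cli-commons; remove this import on older versions.
import groovy.cli.commons.CliBuilder
import org.apache.commons.cli.Option

def cli = new CliBuilder(usage: 'demo.groovy -d <directories>')
cli.d(longOpt: 'directories', args: Option.UNLIMITED_VALUES, valueSeparator: ',', 'Comma-separated directories')

// Hypothetical arguments standing in for what a user would type on the command line.
def sampleArgs = ['-d', '/tmp,/var,/opt'] as String[]
def opt = cli.parse(sampleArgs)

// The short name alone provides only the first value.
println "opt.d:  ${opt.d}"

// Appending 's' to the short name provides all of the values as a collection.
def directories = opt.ds as List
directories.each { println "Directory: ${it}" }
```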

In the post Customizing Groovy's CliBuilder Usage Statements, I briefly looked at the feature supporting multiple values passed to the script via a single command line argument. I described the feature in that post as follows:

The use of multiple values for a single argument can also be highly useful. The direct use of Apache Commons CLI's Option class (and specifically its UNLIMITED_VALUES constant field) allows the developer to communicate to CliBuilder that there is a variable number of values that need to be parsed for this option. The character that separates these multiple values (a comma in this example) must also be specified by specifying the character via "valueSeparator."

The usefulness of this Apache CLI-powered Groovy feature can be demonstrated by adapting a script for finding class files contained in JAR files that I talked about in the post Searching JAR Files with Groovy. The script in that post searched one directory recursively for a single specified String contained as an entry in the searched JARs. A few minor tweaks to this script change it so that it can support multiple specified directories to recursively search for multiple expressions.

The revised script is shown next.

#!/usr/bin/env groovy

/**
 * findClassesInJars.groovy
 *
 * findClassesInJars.groovy -d <<root_directories>> -s <<strings_to_search_for>>
 *
 * Script that looks for provided String in JAR files (assumed to have .jar
 * extensions) in the provided directory and all of its subdirectories.
 */

def cli = new CliBuilder(
   usage: 'findClassesInJars.groovy -d <root_directories> -s <strings_to_search_for>',
   header: '\nAvailable options (use -h for help):\n',
   footer: '\nInformation provided via above options is used to generate printed string.\n')
import org.apache.commons.cli.Option
cli.with
{
   h(longOpt: 'help', 'Help', args: 0, required: false)
   d(longOpt: 'directories', 'Two arguments, separated by a comma', args: Option.UNLIMITED_VALUES, valueSeparator: ',', required: true)
   s(longOpt: 'strings', 'Strings (class names) to search for in JARs', args: Option.UNLIMITED_VALUES, valueSeparator: ',', required: true)
}
def opt = cli.parse(args)
if (!opt) return
if (opt.h) { cli.usage(); return }

def directories = opt.ds
def stringsToSearchFor = opt.ss

import java.util.zip.ZipFile
import java.util.zip.ZipException

def matches = new TreeMap<String, Set<String>>()
directories.each
{ directory ->
   def dir = new File(directory)
   stringsToSearchFor.each
   { stringToFind ->
      dir.eachFileRecurse
      { file->
         if (file.isFile() && file.name.endsWith(".jar"))
         {
            try
            {
               def zip = new ZipFile(file)
               def entries = zip.entries()
               entries.each
               { entry->
                  if (entry.name.contains(stringToFind))
                  {
                     def pathPlusMatch = "${file.canonicalPath} [${entry.name}]"
                     if (matches.get(stringToFind))
                     {
                        matches.get(stringToFind).add(pathPlusMatch)
                     }
                     else
                     {
                        def containingJars = new TreeSet<String>()
                        containingJars.add(pathPlusMatch)
                        matches.put(stringToFind, containingJars)
                     }
                  }
               }
            }
            catch (ZipException zipEx)
            {
               println "Unable to open file ${file.name}"
            }
         }
      }
   }
}

matches.each
{ searchString, containingJarNames ->
   println "String '${searchString}' Found:"
   containingJarNames.each
   { containingJarName ->
      println "\t${containingJarName}"
   }
}

Lines 11 through 28 are where Groovy's internal CliBuilder is applied. The "directories" (short name of 'd') and "strings" (short name of 's') command-line flags are set up in lines 20 and 21. Those lines use the Option.UNLIMITED_VALUES to specify multiple values applicable for each argument and they also use valueSeparator to specify the token separating the multiple values for each flag (comma in these cases).

Lines 27-28 obtain the multiple values for each argument. Although the options had short names of 'd' and 's', appending 's' to each of them (now 'ds' and 'ss') allows their multiple values to be accessed. The rest of the script takes advantage of these and iterates over the multiple strings associated with each flag.

The next screen snapshot demonstrates the above script being executed.

The above screen snapshot demonstrates the utility of being able to provide multiple values for a single command-line flag. Groovy's built-in support for Apache CLI makes it easy to employ customizable command-line parsing.

Wednesday, December 12, 2012

Groovy JDK (GDK): Date and Calendar

I have looked at some highly useful methods available in Groovy GDK's extensions to the Java JDK in blog posts such as Groovy JDK (GDK): File.deleteDir(), Groovy JDK (GDK): Text File to String, Groovy JDK (GDK): More File Fun, Groovy JDK (GDK): String Support, and Groovy JDK (GDK): Number Support. In this post, I look at some of the endearing features of Groovy's GDK extensions to the Java JDK java.util.Date and java.util.Calendar classes.

Java's current standard support for dates and times is generally disliked in the Java development community. Many of us look forward to JSR-310 and/or already use Joda Time to get around the shortcomings of Java's treatment of dates and times. Groovy makes working with dates and times a little easier when third-party frameworks are not available or cannot be used.

The Groovy GDK extension of Date provides several new and highly useful methods as shown in the screen snapshot of its documentation.

Some of these useful methods that I will highlight in this post are clearTime(), format(String), getDateString(), getTimeString(), parse(String, String), parseToStringDate(String), toCalendar(), toTimestamp(), and updated(Map). Many of the other methods listed in the API support Groovy operator overloading and are not highlighted in this post.

Date.clearTime() and Calendar.clearTime()

There are times when one wishes to represent a date only and the time portion of a Date or Calendar is not important (which is exactly why JSR 310 is bringing date-only constructs such as LocalDate to JDK 8). In such cases, Groovy's extensions to Date and Calendar make it easy to "clear" the time component. The next code listing demonstrates use of Date.clearTime() followed by a screen snapshot showing that code executed. Note that the clearTime() method mutates the object it acts upon.

/**
 * Demonstrates Groovy's GDK Date.clearTime() method. Note that the clearTime()
 * method acts upon the Date object upon which it is called, mutating its value
 * in addition to returning a reference to that changed object.
 */
def demoClearTime()
{
   printTitle("Groovy GDK Date.clearTime()")
   def now = new Date()
   println "Now: ${now}"
   def timelessNow = now.clearTime()
   println "Now sans Time: ${timelessNow}"
   println "Mutated Time:  ${now}"
}

Calendar's clearTime() works similarly as shown in the next code snippet and its accompanying screen snapshot of its execution.

/**
 * Demonstrates Groovy's GDK Calendar.clearTime() method. Note that the
 * clearTime() method acts upon the Calendar object upon which it is called,
 * mutating its value in addition to returning a reference to that changed object.
 */
def demoCalendarClearTime()
{
   printTitle("Groovy GDK Calendar.clearTime()")
   def now = Calendar.getInstance()
   println "Now: ${now}"
   now.clearTime()
   println "Now is Timeless: ${now}"
}
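
For comparison, here is a rough plain-JDK sketch of what clearing the time portion involves without the GDK. The class and method names (ClearTimeSketch, withoutTime) are hypothetical, and unlike Groovy's clearTime() this helper deliberately returns a new Date rather than mutating its argument.

```java
import java.util.Calendar;
import java.util.Date;

/** Hypothetical plain-JDK helper; not part of Groovy or the JDK. */
final class ClearTimeSketch
{
   /**
    * Returns a new Date set to midnight (in the default time zone) of the
    * provided date. The provided Date is NOT mutated, which is an assumption
    * of this sketch and differs from Groovy's Date.clearTime().
    */
   static Date withoutTime(final Date date)
   {
      final Calendar calendar = Calendar.getInstance();
      calendar.setTime(date);
      calendar.set(Calendar.HOUR_OF_DAY, 0);
      calendar.set(Calendar.MINUTE, 0);
      calendar.set(Calendar.SECOND, 0);
      calendar.set(Calendar.MILLISECOND, 0);
      return calendar.getTime();
   }

   public static void main(final String[] arguments)
   {
      final Date now = new Date();
      System.out.println("Now: " + now);
      System.out.println("Now sans Time: " + withoutTime(now));
      System.out.println("Now Unchanged: " + now);
   }
}
```

Four explicit field resets in Java versus a single clearTime() call in Groovy is a small but representative example of the ceremony the GDK removes.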
Date.format and Calendar.format

It is common in Java development to need to display a Date or Calendar in a specific user-friendly format and this is typically accomplished using instances of SimpleDateFormat. Groovy simplifies this process of applying a format to a Date or Calendar with the respective methods Date.format(String) and Calendar.format(String). Code listings demonstrating each are shown next with each code listing followed by a screen snapshot displaying the executed code.

/**
 * Demonstrate how much more easily a formatted String representation of a Date
 * can be acquired in Groovy using GDK Date.format(String). No need for an
 * explicit instance of SimpleDateFormat or any other DateFormat implementation
 * here!
 */
def demoFormat()
{
   printTitle("Groovy GDK Date.format(String)")
   def now = new Date()
   println "Now: ${now}"
   def dateString = now.format("yyyy-MMM-dd HH:mm:ss a")
   println "Formatted Now: ${dateString}"
}
/**
 * Demonstrate how much more easily a formatted String representation of a
 * Calendar can be acquired in Groovy using GDK Calendar.format(String). No need
 * for an explicit instance of SimpleDateFormat or any other DateFormat
 * implementation here!
 */
def demoCalendarFormat()
{
   printTitle("Groovy GDK Calendar.format(String)")
   def now = Calendar.getInstance()
   println "Now: ${now}"
   def calendarString = now.format("yyyy-MMM-dd HH:mm:ss a")
   println "Formatted Now: ${calendarString}"
}
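
To make the comparison with standard Java concrete, the following is a minimal sketch of the SimpleDateFormat code that the GDK format methods spare us from writing. The FormatSketch class is hypothetical, and pinning the locale to Locale.US and passing an explicit time zone are my assumptions to make the output predictable; the Groovy calls above rely on platform defaults.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical helper showing the explicit SimpleDateFormat approach. */
final class FormatSketch
{
   /** Formats the date with the given pattern, using U.S. names and the given zone. */
   static String format(final Date date, final String pattern, final TimeZone timeZone)
   {
      final SimpleDateFormat formatter = new SimpleDateFormat(pattern, Locale.US);
      formatter.setTimeZone(timeZone);
      return formatter.format(date);
   }

   public static void main(final String[] arguments)
   {
      final Date now = new Date();
      System.out.println("Now: " + now);
      System.out.println("Formatted Now: "
         + format(now, "yyyy-MMM-dd HH:mm:ss a", TimeZone.getDefault()));
   }
}
```

One detail worth noticing in the pattern itself: HH is the 24-hour hour-of-day, so the trailing AM/PM marker is redundant there. A 12-hour display would use hh instead.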
Date.getDateString(), Date.getTimeString(), and Date.getDateTimeString()

The format methods shown previously allow customized representation of a Date or Calendar and the clearTime methods shown previously allow the time element to be removed from an instance of a Date or Calendar. Groovy provides some convenience methods on Date for displaying a user-friendly date only, time only, or date and time without specifying a format or clearing the time component. These methods print dates and times in the predefined format specified by DateFormat.SHORT (for date portions) and DateFormat.MEDIUM (for time portions). Code listings of each of these methods are shown next and are each followed by screen snapshots of that code being executed.

/**
 * Demonstrates Groovy's GDK Date.getDateString() method. Note that this
 * method doesn't change the underlying date, but simply presents only the date
 * portion (no time portion is presented) using the JDK's DateFormat.SHORT
 * constant (which defines the locale-specific "short style pattern" for
 * formatting a Date).
 */
def demoGetDateString()
{
   printTitle("Groovy GDK Date.getDateString()")
   def now = new Date()
   println "Now: ${now}"
   println "Date Only: ${now.getDateString()}"
   println "Now Unchanged: ${now}"
}
/**
 * Demonstrates Groovy's GDK Date.getTimeString() method. Note that this
 * method doesn't change the underlying date, but simply presents only the time
 * portion (no date portion is presented) using the JDK's DateFormat.MEDIUM
 * constant (which defines the locale-specific "medium style pattern" for
 * formatting a Date).
 */
def demoGetTimeString()
{
   printTitle("Groovy GDK Date.getTimeString()")
   def now = new Date()
   println "Now: ${now}"
   println "Time Only: ${now.getTimeString()}"
   println "Now Unchanged: ${now}"
}
/**
 * Demonstrates Groovy's GDK Date.getDateTimeString() method. Note that this
 * method doesn't change the underlying date, but simply presents the date and
 * time portions as a String. The date is presented with locale-specific format
 * as defined by DateFormat.SHORT and the time is presented with locale-specific
 * format as defined by DateFormat.MEDIUM.
 */
def demoGetDateTimeString()
{
   printTitle("Groovy GDK Date.getDateTimeString()")
   def now = new Date()
   println "Now: ${now}"
   println "Date/Time String: ${now.getDateTimeString()}"
   println "Now Unchanged: ${now}"
}
Date.parse(String, String)

The GDK Date class provides a method Date.parse(String, String) that is a "convenience method" that "acts as a wrapper for SimpleDateFormat." A code snippet and corresponding screen snapshot of the code's output follow and demonstrate this method's usefulness.

/**
 * Demonstrate Groovy GDK's Date.parse(String, String) method which parses a
 * String (second parameter) based on its provided format (first parameter).
 */
def demoParse()
{
   printTitle("Groovy GDK Date.parse(String, String)")
   def nowString = "2012-Nov-26 11:45:23 PM"
   println "Now String: ${nowString}"
   def now = Date.parse("yyyy-MMM-dd hh:mm:ss a", nowString)
   println "Now from String: ${now}"
}
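
Because the documentation describes this method as a wrapper for SimpleDateFormat, it is easy to sketch what the equivalent plain Java might look like. The ParseSketch class below is hypothetical and is not Groovy's actual implementation; it fixes the locale to Locale.US (my assumption, so that "Nov" and "PM" parse regardless of platform locale) and converts the checked ParseException into an unchecked exception.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

/** Hypothetical helper approximating a format-then-input parse call. */
final class ParseSketch
{
   /** Parses text according to pattern (format first, input second, as in the GDK method). */
   static Date parse(final String pattern, final String text)
   {
      try
      {
         return new SimpleDateFormat(pattern, Locale.US).parse(text);
      }
      catch (ParseException parseEx)
      {
         throw new IllegalArgumentException(
            "Unable to parse '" + text + "' with pattern '" + pattern + "'", parseEx);
      }
   }

   public static void main(final String[] arguments)
   {
      final String nowString = "2012-Nov-26 11:45:23 PM";
      System.out.println("Now String: " + nowString);
      System.out.println("Now from String: " + parse("yyyy-MMM-dd hh:mm:ss a", nowString));
   }
}
```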
Date.parseToStringDate(String)

The GDK Date.parseToStringDate(String) method can be used to obtain an instance of Date from a String matching the exact format put out by the Date.toString() method. In other words, this method can be useful for converting back to a Date from a String that was generated from a Date's toString() method.

Use of this method is demonstrated with the following code snippet and screen snapshot of the corresponding output.

/**
 * Demonstrate Groovy GDK's Date.parseToStringDate(String) method which parses
 * a String generated by a Date.toString() call, but assuming U.S. locale to
 * do this.
 */
def demoParseToStringDate()
{
   printTitle("Groovy GDK Date.parseToStringDate(String)")
   def now = new Date()
   println "Now: ${now}"
   def nowString = now.toString()
   def nowAgain = Date.parseToStringDate(nowString)
   println "From toString: ${nowAgain}"
}

There is one potentially significant downside to the GDK Date.parseToStringDate(String) method. As its documentation states, it relies on "US-locale-constants only."
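
The U.S. locale dependence makes sense once one remembers that Date.toString() always writes English day and month abbreviations in the documented "dow mon dd hh:mm:ss zzz yyyy" layout. The following sketch shows a plain-JDK round trip built on that layout. The class name and the pattern constant are my own, and I am only assuming that Groovy does something along these lines internally.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

/** Hypothetical round trip from Date.toString() output back to a Date. */
final class ToStringRoundTripSketch
{
   /** Pattern mirroring the layout documented for Date.toString(). */
   static final String TO_STRING_PATTERN = "EEE MMM dd HH:mm:ss zzz yyyy";

   /** Parses text produced by Date.toString(); milliseconds are lost because toString() omits them. */
   static Date parseToStringOutput(final String text)
   {
      try
      {
         return new SimpleDateFormat(TO_STRING_PATTERN, Locale.US).parse(text);
      }
      catch (ParseException parseEx)
      {
         throw new IllegalArgumentException("Not in Date.toString() format: " + text, parseEx);
      }
   }

   public static void main(final String[] arguments)
   {
      final Date now = new Date();
      System.out.println("Now: " + now);
      System.out.println("From toString: " + parseToStringOutput(now.toString()));
   }
}
```

Time zone abbreviations in the text are not always unambiguous, so a round trip like this is best treated as a convenience rather than a reliable serialization format.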

Date.toCalendar() and Date.toTimestamp()

It is often useful to convert a java.util.Date to a java.util.Calendar or java.sql.Timestamp. Groovy makes these common conversions particularly easy with the GDK Date-provided methods Date.toCalendar() and Date.toTimestamp(). These are demonstrated in the following code snippets with their output displayed in corresponding screen snapshots.

/**
 * Demonstrates how easy it is to get a Calendar instance from a Date instance
 * using Groovy's GDK Date.toCalendar() method.
 */
def demoToCalendar()
{
   printTitle("Groovy GDK Date.toCalendar()")
   def now = new Date()
   println "Now: ${now}"
   def calendarNow = now.toCalendar()
   println "Now: ${calendarNow} [${calendarNow.class}]"
}
/**
 * Demonstrates how easy it is to get a Timestamp instance from a Date instance
 * using Groovy's GDK Date.toTimestamp() method.
 */
def demoToTimestamp()
{
   printTitle("Groovy GDK Date.toTimestamp()")
   def now = new Date()
   println "Now: ${now}"
   def timestampNow = now.toTimestamp()
   println "Now: ${timestampNow} [${timestampNow.class}]"
}
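
Neither conversion is difficult in plain Java, but each takes a little more typing. A minimal sketch follows; ConversionSketch and its method names are hypothetical stand-ins that merely mirror the GDK method names.

```java
import java.sql.Timestamp;
import java.util.Calendar;
import java.util.Date;

/** Hypothetical plain-JDK equivalents of the two GDK conversions. */
final class ConversionSketch
{
   /** Returns a Calendar (default locale and time zone) set to the given date's instant. */
   static Calendar toCalendar(final Date date)
   {
      final Calendar calendar = Calendar.getInstance();
      calendar.setTime(date);
      return calendar;
   }

   /** Returns a java.sql.Timestamp representing the same instant as the given date. */
   static Timestamp toTimestamp(final Date date)
   {
      return new Timestamp(date.getTime());
   }

   public static void main(final String[] arguments)
   {
      final Date now = new Date();
      System.out.println("Calendar class:  " + toCalendar(now).getClass());
      System.out.println("Timestamp value: " + toTimestamp(now));
   }
}
```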

Date.updated(Map) [and Calendar.updated(Map)]

The final convenience method provided by the GDK Date that I'm going to discuss in this post is Date.updated(Map), which its documentation describes as "Support creating a new Date having similar properties to an existing Date (which remains unaltered) but with some fields updated according to a Map of changes." In other words, this method allows one to start with a certain Date instance and acquire another Date instance with the same properties other than changes specified in the provided Map.

The next code listing acquires a new Date instance from an existing Date instance with a few fields updated using the Date.updated(Map) method. The code listing is followed by a screen snapshot of its execution.

/**
 * Demonstrate Groovy GDK's Date.updated(Map) with adaptation of the example
 * provided for that method in that method's Javadoc-based GDK documentation.
 * Note that the original Date upon which updated is invoked is NOT mutated and
 * the updates are on the returned instance of Date only. The YEAR, DATE, and
 * MONTH constants assume a static import of java.util.Calendar.* elsewhere in
 * the script.
 */
def demoUpdated()
{
   printTitle("Groovy GDK Date.updated(Map)")
   def now = new Date()
   def nextYear = now[YEAR] + 1
   def nextDate = now[DATE] + 1
   def prevMonth = now[MONTH] - 1
   def oneYearFromNow = now.updated(year: nextYear, date: nextDate, month: prevMonth)
   println "Now: ${now}"
   println "1 Year from Now: ${oneYearFromNow}"
}

The demonstration shows that the original Date instance does remain unaltered and that a copy with specified fields changed is provided. There is also an equivalent for the GDK Calendar called Calendar.updated(Map).
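
The "copy with changes" idea can be approximated in plain Java by working on a Calendar initialized from the original Date. The sketch below is only an illustration under my own assumptions: UpdatedSketch is a hypothetical class, and it keys the changes by Calendar field constants rather than by the property names (year, month, date) that the Groovy method accepts.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical plain-JDK approximation of "copy a Date with some fields changed". */
final class UpdatedSketch
{
   /**
    * Returns a new Date equal to the provided date except for the Calendar
    * fields named in changes (Calendar field constant -> new value).
    * The provided Date is left unaltered.
    */
   static Date updated(final Date date, final Map<Integer, Integer> changes)
   {
      final Calendar calendar = Calendar.getInstance();
      calendar.setTime(date);
      for (final Map.Entry<Integer, Integer> change : changes.entrySet())
      {
         calendar.set(change.getKey(), change.getValue());
      }
      return calendar.getTime();
   }

   public static void main(final String[] arguments)
   {
      final Date now = new Date();
      final Calendar current = Calendar.getInstance();
      current.setTime(now);
      final Map<Integer, Integer> changes = new HashMap<Integer, Integer>();
      changes.put(Calendar.YEAR, current.get(Calendar.YEAR) + 1);
      System.out.println("Now: " + now);
      System.out.println("1 Year from Now: " + updated(now, changes));
   }
}
```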

Conclusion

One of the things I like about Groovy is the GDK extensions to JDK classes. In this post, I looked at how the GDK Date extension of the JDK's Date provides many useful convenience methods that lead to more concise and more readable code.

Wednesday, November 28, 2012

When Premature Optimization Isn't

Earlier this month, after hearing more than one developer in the same week invoke "avoid premature optimization" as an excuse for not making a better decision, I decided I wanted to write a post on why not all optimization is premature optimization. Bozhidar Bozhanov beat me to it with his post Not All Optimization Is Premature, which makes some excellent points, different from the ones I had planned to make, in postulating that there is nothing wrong with optimizing early "if neither readability nor maintainability are damaged and the time taken is negligible."

Bozhanov's post reminded me that I wanted to write this post. I use this post to provide additional examples and support to back up my claim that too many in our community have allowed "avoid premature optimization" to become a "bumper sticker practice." In my opinion, some developers have taken the appropriate advice to "avoid premature optimization" out of context or do not want to spend the time and effort to really think about the reasons behind this statement. It may seem easier to blindly apply it to all situations, but that is just as dangerous as prematurely optimizing.

Good Design is Not Premature Optimization

I like Rod Johnson's differentiation between "design" and "code optimization" in his book Expert One-on-One J2EE Design and Development (2002). Perhaps the most common situation in which I have seen developers make bad decisions under the pretense of "avoiding premature optimization" is when making architecture or design choices. The incident earlier this month that prompted me to want to write this post was a developer's assertion that we should not design our services to be coarser grained than he wanted because that was premature optimization and his idea of making numerous successive calls on the same service was "easiest" to implement. In my mind, this developer was mistaking the valid warning about not prematurely optimizing code as an excuse to not consider an appropriate design that might require a barely noticeable amount of extra effort.

In his differentiation between design and code optimization, Johnson highlighted, "Minimize the need for optimization by heading off problems with good design." In that same section he warns against "code optimization" for four main reasons: "optimization is hard" ("few things in programming are harder than optimizing existing code"), "most optimization is pointless," "optimization causes many bugs," and "inappropriate optimization may reduce maintainability forever." Johnson states (and I agree), "There is no conflict between designing for performance and maintainability, but optimization may be more problematic."

Applying Appropriate Language Practices is Not Premature Optimization

Most programming languages I'm familiar with offer multiple ways to accomplish the same thing. In many cases, one of the alternatives has well-known advantages. In such cases, is it really premature optimization to use the better performing alternative? For example, if I'm writing Java code to append characters onto a String within a loop, why would I ever do this with a Java String instead of StringBuilder? Use of StringBuilder is not much different in terms of maintainability or readability for even a relatively new Java developer and there is a known performance benefit that requires no profiling to recognize. It seems silly to write it with String in the name of "avoiding premature optimization" and only change it to StringBuilder when the profiler shows it's a performance issue. That being stated, it would be just as silly to use a StringBuilder for simple concatenations outside of a loop "just in case" that code was ever placed within a loop.
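
To illustrate how little readability is at stake in the String versus StringBuilder choice, here is a small sketch of both approaches side by side. The class and method names are hypothetical and the alphabet-cycling loop body is just filler to have something to append.

```java
/** Hypothetical side-by-side comparison of appending characters in a loop. */
final class AppendInLoopSketch
{
   /** Appends count characters using a single StringBuilder. */
   static String withStringBuilder(final int count)
   {
      final StringBuilder builder = new StringBuilder(count);
      for (int i = 0; i < count; i++)
      {
         builder.append((char) ('a' + i % 26));
      }
      return builder.toString();
   }

   /** Appends count characters with String concatenation; each pass creates a new String. */
   static String withConcatenation(final int count)
   {
      String result = "";
      for (int i = 0; i < count; i++)
      {
         result += (char) ('a' + i % 26);
      }
      return result;
   }

   public static void main(final String[] arguments)
   {
      System.out.println(withStringBuilder(30));
      System.out.println(withConcatenation(30));
   }
}
```

The two methods produce the same text and are about equally easy to read, which is the point: choosing the first version up front costs nothing in maintainability.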

Similarly, it's not "premature optimization" to write a conditional such that the most likely condition is encountered first as long as doing so does not make the code confusing. Another example is the common use of short circuit evaluation in conditionals that can be effective without being premature optimization. Finally, there are cases where certain data structures or collections are more likely to be appropriate than others for a given operation or set of expected operations.

Writing Cleaner Code is Not Premature Optimization

Some developers might confuse more "efficient" (cleaner) source code with premature optimization. Optimizing source code for readability and maintainability (such as in refactoring or carefully crafting original code) has nothing to do with Knuth's original quote. Writing cleaner code often leads to better performance, but this does not mean writing cleaner code is a form of premature optimization.

Others' Thoughts on Misunderstanding of Premature Optimization

Besides the Not All Optimization Is Premature post, other posts on the misapplication of the "avoid premature optimization" mantra include Joe Duffy's The 'premature optimization is evil' myth and James Hague's 'Avoid Premature Optimization' Does Not Mean 'Write Dumb Code'.

Duffy puts it so well that I'll quote him directly here:

I have heard the "premature optimization is the root of all evil" statement used by programmers of varying experience at every stage of the software lifecycle, to defend all sorts of choices, ranging from poor architectures, to gratuitous memory allocations, to inappropriate choices of data structures and algorithms, to complete disregard for variable latency in latency-sensitive situations, among others. Mostly this quip is used defend sloppy decision-making, or to justify the indefinite deferral of decision-making. In other words, laziness.

The James Hague post points out one of the signs of potentially having crossed into premature optimization: "The warning sign is when you start sacrificing clarity and reliability while chasing some vague notion of performance." Hague also writes, "What's often missed in these discussions is that the advice to 'avoid premature optimization' is not the same thing as 'write dumb code.'" I like this last sentence because I believe that just as some developers have adulterated the agile concept to mean (to them) "no documentation," some developers have adulterated the sound advice to "avoid premature optimization" to mean (to them) "blindly write code."

Premature Optimization is a Real Problem

Premature optimization is a problem we developers must guard against. As Johnson states in the previously cited book, "Few things in programming are harder than optimizing existing code. Unfortunately, this is why optimization is uniquely satisfying to any programmer's ego. The problem is that the resources devoted to such optimization may well be wasted." There are many examples of premature optimization wasting significant resources and in some cases even making things perform worse. There is indeed a reason that the well-regarded Knuth wrote that "premature optimization is the root of all evil." I'm not saying that premature optimization doesn't exist or that it's not harmful. I'm just saying that avoiding this admittedly dysfunctional behavior is often used as an excuse to avoid thinking or to avoid implementing sound decisions.

Conclusion

Like pithy bumper stickers on cars that naively boil down complex issues to a few clever and catchy words, "avoid premature optimization" is often applied much more broadly than it was intended. Even the best of recommended practices can cause more harm than benefit when applied improperly and the misuse of "avoid premature optimization" is one of the best examples of this. I have seen the high cost paid in maintainability, readability, and even in performance when supposed "optimization" was implemented too early and at the expense of readability and maintainability for no measurable benefit. However, just as high of a cost can be incurred by blindly using "avoid premature optimization" as an excuse to avoid designing and writing better performing software. "Avoiding premature optimization" is not an excuse to stop thinking.

Tuesday, November 27, 2012

Type-safe Empty Collections in Java

I have blogged before on the utility of the Java Collections class and have specifically blogged on Using Collections Methods emptyList(), emptyMap(), and emptySet(). In this post, I look at the sometimes subtle but significant differences between using the relevant fields of the Collections class for accessing an empty collection versus using the relevant methods of the Collections class for accessing an empty collection.

The following code demonstrates accessing Collections's fields directly to specify empty collections.

Using Collections's Fields for Empty Collections
   /**
    * Instantiate my collections with empty versions using Collections fields.
    * This will result in javac compiler warnings stating "warning: [unchecked]
    * unchecked conversion".
    */
   public void instantiateWithEmptyCollectionsFieldsAssigment()
   {
      this.stringsList = Collections.EMPTY_LIST;
      this.stringsSet = Collections.EMPTY_SET;
      this.stringsMap = Collections.EMPTY_MAP;      
   }

The code above compiles with javac, but leads to the warning message (generated by NetBeans and Ant in this case):

-do-compile:
    [javac] Compiling 1 source file to C:\java\examples\typesafeEmptyCollections\build\classes
    [javac] Note: C:\java\examples\typesafeEmptyCollections\src\dustin\examples\Main.java uses unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.

Specifying -Xlint:unchecked as an argument to javac (in this case via the javac.compilerargs=-Xlint:unchecked in the NetBeans project.properties file) helps get more specific warning messages for the earlier listed code:

    [javac] Compiling 1 source file to C:\java\examples\typesafeEmptyCollections\build\classes
    [javac] C:\java\examples\typesafeEmptyCollections\src\dustin\examples\Main.java:27: warning: [unchecked] unchecked conversion
    [javac]       this.stringsList = Collections.EMPTY_LIST;
    [javac]                                     ^
    [javac]   required: List<String>
    [javac]   found:    List
    [javac] C:\java\examples\typesafeEmptyCollections\src\dustin\examples\Main.java:28: warning: [unchecked] unchecked conversion
    [javac]       this.stringsSet = Collections.EMPTY_SET;
    [javac]                                    ^
    [javac]   required: Set<String>
    [javac]   found:    Set
    [javac] C:\java\examples\typesafeEmptyCollections\src\dustin\examples\Main.java:29: warning: [unchecked] unchecked conversion
    [javac]       this.stringsMap = Collections.EMPTY_MAP;      
    [javac]                                    ^
    [javac]   required: Map<String,String>
    [javac]   found:    Map

NetBeans will also show these warnings if the appropriate hint box is checked in its options. The next three images demonstrate ensuring that the appropriate hint is set to see these warnings in NetBeans and provide an example of how NetBeans presents the code shown above with warnings.

Fortunately, it is easy to take advantage of the utility of the Collections class and access empty collections in a typesafe manner that won't lead to these javac warnings and corresponding NetBeans hints. That approach is to use Collections's methods rather than its fields. This is demonstrated in the next simple code listing.

Using Collections's Methods for Empty Collections
   /**
    * Instantiate my collections with empty versions using Collections methods.
    * This will avoid the javac compiler warnings alluding to "unchecked conversion".
    */
   public void instantiateWithEmptyCollectionsMethodsTypeInferred()
   {
      this.stringsList = Collections.emptyList();
      this.stringsSet = Collections.emptySet();
      this.stringsMap = Collections.emptyMap();
   }

The above code will compile without warning and no NetBeans hints will be shown either. The Javadoc documentation for each field of the Collections class does not address why these warnings occur for the fields, but the documentation for each of the like-named methods does discuss this. Specifically, the documentation for Collections.emptyList(), Collections.emptySet(), and Collections.emptyMap() each state, "(Unlike this method, the field does not provide type safety.)"

The Collections methods in the last code listing provided type safety without any explicitly specified element types because the compiler inferred those types from the assignments to already declared instance attributes with explicitly specified element types. When the type cannot be inferred, compiler errors result when the Collections methods are used without an explicitly specified type. This is shown in the next screen snapshot of attempting to do this in NetBeans.

The specific compiler error message is:

    [javac] C:\java\examples\typesafeEmptyCollections\src\dustin\examples\Main.java:62: error: method populateList in class Main cannot be applied to given types;
    [javac]       populateList(Collections.emptyList());
    [javac]       ^
    [javac]   required: List<String>
    [javac]   found: List<Object>
    [javac]   reason: actual argument List<Object> cannot be converted to List<String> by method invocation conversion
    [javac] C:\java\examples\typesafeEmptyCollections\src\dustin\examples\Main.java:63: error: method populateSet in class Main cannot be applied to given types;
    [javac]       populateSet(Collections.emptySet());
    [javac]       ^
    [javac]   required: Set<String>
    [javac]   found: Set<Object>
    [javac]   reason: actual argument Set<Object> cannot be converted to Set<String> by method invocation conversion
    [javac] C:\java\examples\typesafeEmptyCollections\src\dustin\examples\Main.java:64: error: method populateMap in class Main cannot be applied to given types;
    [javac]       populateMap(Collections.emptyMap());
    [javac]       ^
    [javac]   required: Map<String,String>
    [javac]   found: Map<Object,Object>
    [javac]   reason: actual argument Map<Object,Object> cannot be converted to Map<String,String> by method invocation conversion
    [javac] 3 errors

These compiler errors are avoided and type safety is achieved by explicitly specifying the types of the collections' elements in the code. This is shown in the next code listing.

Explicitly Specifying Element Types with Collections's Empty Methods
   /**
    * Pass empty collections to another method for processing and specify those
    * empty collections using Collections methods. This will result in javac
    * compiler ERRORS unless the type is explicitly specified.
    */
   public void instantiateWithEmptyCollectionsMethodsTypeSpecified()
   {
      populateList(Collections.<String>emptyList());
      populateSet(Collections.<String>emptySet());
      populateMap(Collections.<String, String>emptyMap());
   }
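
For anyone wanting to try this outside of my example project, the following is a compact, self-contained sketch that puts both styles in one place. The class and method names are hypothetical. As far as I know, the explicit type arguments are what a Java 7 compiler requires when an empty collection is passed directly as a method argument; the improved target typing in Java 8 and later generally lets the compiler infer them in that position as well, though writing them out remains legal.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical self-contained illustration of type-safe empty collections. */
final class EmptyCollectionsSketch
{
   /** Stand-in for methods that accept specifically typed collections. */
   static int countAll(
      final List<String> list, final Set<String> set, final Map<String, String> map)
   {
      return list.size() + set.size() + map.size();
   }

   static int demonstrate()
   {
      // Element type inferred from the declared type of the assignment target.
      final List<String> inferred = Collections.emptyList();

      // Element types stated explicitly where the call is a method argument.
      return countAll(
         inferred,
         Collections.<String>emptySet(),
         Collections.<String, String>emptyMap());
   }

   public static void main(final String[] arguments)
   {
      System.out.println("Total elements: " + demonstrate());
   }
}
```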

The Collections class's methods for obtaining empty collections are preferable to use of Collections's similarly named fields for that same purpose because of the type safety the methods provide. This allows greater leveraging of Java's static type system, a key theme of books such as Effective Java. A nice side effect is the removal of cluttering warnings and marked NetBeans hints, but the more important result is better, safer code.

Saturday, November 24, 2012

Scripted Reports with Groovy

Groovy has become my favorite scripting language and in this post I look at some of Groovy's features that make it particularly attractive for presenting text-based reports. The post will show how custom text-based reports of data stored in the database can be easily presented with Groovy. I will highlight several attractive features of Groovy along the way.

I use the Oracle Database 11g Express Edition (XE) for the data source in my example in this post, but any data source could be used. This example does make use of Groovy's excellent SQL/JDBC support and uses the Oracle sample schema (HR). A visual depiction of that sample schema is available in the sample schema documentation.

My example of using Groovy to write a reporting script involves retrieving data from the Oracle HR sample schema and presenting that data via a text-based report. One portion of the script needs to acquire this data from the database and Groovy adds only minimal ceremony to the SQL statement needed to do this. The following code snippet from the script shows use of Groovy's multi-line GString to specify the SQL query string in a user-friendly format and to process the results of that query.

def employeeQueryStr =
"""SELECT e.employee_id, e.first_name, e.last_name,
          e.email, e.phone_number,
          e.hire_date, e.job_id, j.job_title,
          e.salary, e.commission_pct, e.manager_id,
          e.department_id, d.department_name,
          m.first_name AS mgr_first_name, m.last_name AS mgr_last_name
     FROM employees e, departments d, jobs j, employees m
    WHERE e.department_id = d.department_id
      AND e.job_id = j.job_id
      AND e.manager_id = m.employee_id(+)"""

def employees = new TreeMap<Long, Employee>()
import groovy.sql.Sql
def sql = Sql.newInstance("jdbc:oracle:thin:@localhost:1521:xe", "hr", "hr",
                          "oracle.jdbc.pool.OracleDataSource")
sql.eachRow(employeeQueryStr)
{
   def employeeId = it.employee_id as Long
   def employee = new Employee(employeeId, it.first_name, it.last_name,
                               it.email, it.phone_number,
                               it.hire_date, it.job_id, it.job_title,
                               it.salary, it.commission_pct, it.manager_id as Long,
                               it.department_id as Long, it.department_name,
                               it.mgr_first_name, it.mgr_last_name)
   employees.put(employeeId, employee)
}

The Groovy code above only adds a small amount of code on top of the Oracle SQL statement. The specified SELECT statement joins multiple tables and includes an outer join as well (outer join needed to include the President in the query results despite that position not having a manager). The vast majority of the first part of the code is the SQL statement that could be run as-is in SQL*Plus or SQL Developer. No need for verbose exception catching and result set handling with Groovy's SQL support!

There are more Groovy-specific advantages to point out in the code snippet above. Note that the import statement to import groovy.sql.Sql was allowed when needed and did not need to be at the top of the script file. The example also used Sql.newInstance and Sql.eachRow(GString,Closure). The latter method allows for easy application of a closure to the results of the query. The special word it is the default name for items being processed in the closure. In this case, it can be thought of as a row in the result set. Values in each row are accessed by the underlying database columns' names (or aliases in the case of mgr_first_name and mgr_last_name).

One of the advantages of Groovy is its seamless integration with Java. The above code snippet also demonstrated this via Groovy's use of TreeMap, which is advantageous because it means that the new Employee instances placed in the map based on data retrieved from the database will always be available in order of employee ID.
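
The ordering guarantee comes from TreeMap itself, which keeps its entries sorted by key (here the natural ordering of Long). A tiny standalone sketch, with hypothetical names and made-up IDs inserted out of order, shows the effect:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration of TreeMap's key-ordered iteration. */
final class TreeMapOrderSketch
{
   /** Puts the IDs into a TreeMap in the order given and returns the keys in iteration order. */
   static List<Long> keysInIterationOrder(final long... ids)
   {
      final Map<Long, String> employees = new TreeMap<Long, String>();
      for (final long id : ids)
      {
         employees.put(id, "employee-" + id);
      }
      return new ArrayList<Long>(employees.keySet());
   }

   public static void main(final String[] arguments)
   {
      // Inserted out of order; iteration is by ascending key regardless.
      System.out.println(keysInIterationOrder(205L, 100L, 150L));
   }
}
```

This is why the report can simply iterate over the map without a separate sort step or an ORDER BY clause in the query.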

In the code above, the information retrieved from the database and processed via the closure is stored for each row in a newly instantiated Employee object. This Employee object provides another place to show off Groovy's brevity and is shown next.

Employee.groovy
@groovy.transform.Canonical
class Employee
{
   Long employeeId
   String firstName
   String lastName
   String emailAddress
   String phone_number
   Date hireDate
   String jobId
   String jobTitle
   BigDecimal salary
   BigDecimal commissionPercentage
   Long managerId
   Long departmentId
   String departmentName
   String managerFirstName
   String managerLastName
}

The code listing just shown is the entire class! Groovy's property support makes getter/setter methods automatically available for all the defined class attributes. As I discussed in a previous blog post, the @Canonical annotation is a Groovy AST (transformation) that automatically creates several useful common methods for this class [equals(Object), hashCode(), and toString()]. There is no explicit constructor because @Canonical also handles this, providing a constructor that accepts the class's properties in the order they are specified in their declarations. It is difficult to imagine a scenario in which it would be easier to quickly create an object to store retrieved data values in a script.

A JDBC driver is needed for this script to retrieve this data from the Oracle Database XE and the JAR for that driver could be specified on the classpath when running the Groovy script. However, I like my scripts to be as self-contained as possible and this makes Groovy's classpath root loading mechanism attractive. This can be used within this script (rather than specifying it externally when invoking the script) as shown next:

this.class.classLoader.rootLoader.addURL(
   new URL("file:///C:/oraclexe/app/oracle/product/11.2.0/server/jdbc/lib/ojdbc6.jar"))

Side Note: Another nifty approach for accessing the appropriate dependent JAR or library is use of Groovy's Grape-provided @Grab annotation. I didn't use that here because Oracle's JDBC JAR is not available in any legitimate Maven central repositories that I am aware of. An example of using this approach when a dependency is available in the Maven public repository is shown in my blog post Easy Groovy Logger Injection and Log Guarding.

With the data retrieved from the database and placed in a collection of simple Groovy objects built for holding this data and providing easy access to it, it is almost time to start presenting this data in a text report. Some constants defined in the script are shown in the next excerpt from the script code.

int TOTAL_WIDTH = 120
String HEADER_ROW_SEPARATOR = "=".multiply(TOTAL_WIDTH)
String ROW_SEPARATOR = "-".multiply(TOTAL_WIDTH)
String COLUMN_SEPARATOR = "|"
int COLUMN_SEPARATOR_SIZE = COLUMN_SEPARATOR.size()
int COLUMN_WIDTH = 22
int TOTAL_NUM_COLUMNS = 5
int BALANCE_COLUMN_WIDTH = TOTAL_WIDTH-(TOTAL_NUM_COLUMNS-1)*COLUMN_WIDTH-COLUMN_SEPARATOR_SIZE*(TOTAL_NUM_COLUMNS-1)-2

The declaration of constants just shown exemplifies more advantages of Groovy. For one, the constants are statically typed, demonstrating Groovy's flexibility in allowing types to be specified statically as well as dynamically. Another feature of Groovy worth special note in the last code snippet is the use of the String.multiply(Number) method on the literal Strings. Everything, even a String or a numeric, is an object in Groovy. The multiply method makes it easy to create a String that repeats the same character the specified number of times.
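The column arithmetic is easier to trust when worked through. The following sketch repeats the constants from the script and checks that four standard columns, one wider final column, and the separators add up to the full 120-character row. The rowWidth variable is my own addition, introduced only for this check.

```groovy
int TOTAL_WIDTH = 120
String COLUMN_SEPARATOR = "|"
int COLUMN_SEPARATOR_SIZE = COLUMN_SEPARATOR.size()
int COLUMN_WIDTH = 22
int TOTAL_NUM_COLUMNS = 5

// multiply(Number) and the * operator are interchangeable on Strings.
assert "=".multiply(5) == "====="
assert "-" * 3 == "---"

// 120 - (4 * 22) - (1 * 4) - 2 = 26 characters left for the last column.
int BALANCE_COLUMN_WIDTH = TOTAL_WIDTH-(TOTAL_NUM_COLUMNS-1)*COLUMN_WIDTH-COLUMN_SEPARATOR_SIZE*(TOTAL_NUM_COLUMNS-1)-2

// Rebuild the width of one printed row: a leading separator, four
// 22-character columns each followed by a separator, then the wider
// last column and its closing separator.
int rowWidth = COLUMN_SEPARATOR_SIZE +
               (TOTAL_NUM_COLUMNS-1) * (COLUMN_WIDTH + COLUMN_SEPARATOR_SIZE) +
               BALANCE_COLUMN_WIDTH + COLUMN_SEPARATOR_SIZE

println "Last column width: ${BALANCE_COLUMN_WIDTH}, full row width: ${rowWidth}"
```

The trailing "-2" in the formula accounts for the separators at the very start and very end of each row.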

The first part of the text output is the header. The following lines of the Groovy script write this header information to standard output.

println "\n\n${HEADER_ROW_SEPARATOR}"
println "${COLUMN_SEPARATOR}${'HR SCHEMA EMPLOYEES'.center(TOTAL_WIDTH-2*COLUMN_SEPARATOR_SIZE)}${COLUMN_SEPARATOR}"
println HEADER_ROW_SEPARATOR
print "${COLUMN_SEPARATOR}${'EMPLOYEE ID/HIRE DATE'.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
print "${'EMPLOYEE NAME'.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
print "${'TITLE/DEPARTMENT'.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
print "${'SALARY INFO'.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
println "${'CONTACT INFO'.center(BALANCE_COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
println HEADER_ROW_SEPARATOR

The code above shows some more addictive features of Groovy. One of my favorite aspects of Groovy's GString support is the ability to use Ant-like ${} expressions to provide executable code inline with the String. The code above also shows off the Groovy GDK String's center(Number) method, which automatically centers the given String within the specified number of characters. This is a powerful feature for easily writing attractive text output.
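A small, self-contained sketch of center(Number) may help show what it does at the edges. The string values and variable names here are illustrative only. I check lengths rather than exact padding where the padding cannot be split evenly.

```groovy
String COLUMN_SEPARATOR = "|"
int COLUMN_WIDTH = 22

// A header cell built the same way as in the script: separator, centered text, separator.
String headerCell = "${COLUMN_SEPARATOR}${'SALARY INFO'.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
println headerCell

// Padding that divides evenly is split equally between both sides.
String evenlyPadded = 'abc'.center(7)

// center(Number, CharSequence) pads with something other than spaces.
String starred = 'x'.center(5, '*')

// A String longer than the requested width is returned unchanged, not truncated,
// so an overly long value will push the remaining columns out of alignment.
String tooLong = 'Administration Vice President'.center(10)
println tooLong
```

That last behavior is one reason the script abbreviates job titles before centering them.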

With the data retrieved and available in our data structure and with the constants defined, the output portion can begin. The next code snippet shows the each method that Groovy adds to the standard collections, used here to iterate over the previously populated TreeMap with a closure applied to each entry.

employees.each
{ id, employee ->
   // first line in each output row
   def idStr = id as String
   print "${COLUMN_SEPARATOR}${idStr.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   def employeeName = employee.firstName + " " + employee.lastName
   print "${employeeName.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   def jobTitle = employee.jobTitle.replace("Vice President", "VP").replace("Assistant", "Asst").replace("Representative", "Rep")
   print "${jobTitle.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   def salary = '$' + (employee.salary as String)
   print "${salary.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   println "${employee.phone_number.center(BALANCE_COLUMN_WIDTH)}${COLUMN_SEPARATOR}"

   // second line in each output row
   print "${COLUMN_SEPARATOR}${employee.hireDate.getDateString().center(COLUMN_WIDTH)}"
   def managerName = employee.managerFirstName ? "Mgr: ${employee.managerFirstName[0]}. ${employee.managerLastName}" : "Answers to No One"
   print "${COLUMN_SEPARATOR}${managerName.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   print "${employee.departmentName.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   String commissionPercentage = employee.commissionPercentage ?: "No Commission"
   print "${commissionPercentage.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   println "${employee.emailAddress.center(BALANCE_COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   println ROW_SEPARATOR
}

The last code snippet is where the data retrieved from the database is output in a relatively attractive text format. The example shows how the parameters of a closure can be given meaningful names. In this case, they are named id and employee and represent the key (Long) and value (Employee) of each entry in the TreeMap.
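The two-parameter form of each on a Map can be tried without a database. In this sketch the map contents and the variable names (lastNames, lines, entryLines) are made up for illustration; the point is that the closure receives key and value separately and that a TreeMap hands them over in key order.

```groovy
// Entries are deliberately added out of order; TreeMap keeps them sorted by key.
def lastNames = new TreeMap<Long, String>()
lastNames.put(102L, "De Haan")
lastNames.put(100L, "King")
lastNames.put(101L, "Kochhar")

// Two closure parameters: the first receives the key, the second the value.
def lines = []
lastNames.each { id, lastName ->
   lines << "${id}: ${lastName}".toString()
}
lines.each { println it }

// With a single parameter the closure receives the Map.Entry instead.
def entryLines = []
lastNames.each { entry ->
   entryLines << "${entry.key}: ${entry.value}".toString()
}
```

Both forms visit the same entries in the same order; the two-parameter form simply saves the entry.key and entry.value lookups.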

There are other Groovy features in the last code snippet worth special mention. The presentation of commission uses Groovy's Elvis operator (?:), which makes even Java's conditional ternary look verbose. In this example, if the employee's commission percentage meets Groovy truth standards (for a number, that means it is neither null nor zero), that percentage is used; otherwise, "No Commission" is printed.
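Here is a minimal sketch of that Elvis behavior in isolation. The describeCommission method is a hypothetical helper I introduce only for this example; it mirrors the assignment in the script, where a BigDecimal or a fallback String is assigned to a String-typed variable.

```groovy
// Hypothetical helper mirroring the script's commission handling.
String describeCommission(BigDecimal commissionPercentage) {
   // Elvis: use the left side if it is "true" by Groovy truth, else the right side.
   // Assigning to a String-typed variable converts a BigDecimal to its text form.
   String text = commissionPercentage ?: "No Commission"
   return text
}

println describeCommission(0.25)   // a non-zero value is used as-is
println describeCommission(null)   // null falls through to the default
println describeCommission(0.0)    // zero is also "false" in Groovy truth
```

The zero case is worth knowing about: an employee with an explicit commission of zero would be reported the same way as one with no commission recorded at all.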

The handling of the hire date provides another opportunity to tout Groovy's GDK benefits. In this case, Groovy GDK Date.getDateString() is used to easily access the date-only portion of the Date class (time not desired for hire date) without explicit use of a String formatter. Nice!

The last code example also demonstrates use of the as keyword to coerce (cast) variables in a more readable way and demonstrates more leverage of Java features, in this case taking advantage of Java String's replace(CharSequence, CharSequence) method. Groovy adds some more goodness to String in this example as well: subscript (array) notation ([0]) extracts just the first character of the manager's first name.
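These three String conveniences can be exercised on their own. The values and variable names below are illustrative and are not taken from the script's actual output.

```groovy
// Coercion with "as": a Long becomes its String form.
def id = 100L
def idStr = id as String

// Java's String.replace(CharSequence, CharSequence), chained as in the script.
def jobTitle = "Administration Vice President"
      .replace("Vice President", "VP")
      .replace("Assistant", "Asst")
      .replace("Representative", "Rep")

// Subscript notation on a String returns a String, not a char.
def managerFirstName = "Steven"
def managerLastName = "King"
def initial = managerFirstName[0]
def managerName = "Mgr: ${managerFirstName[0]}. ${managerLastName}".toString()

// Negative indexes and ranges work too.
def lastLetter = managerFirstName[-1]
def firstThree = managerFirstName[0..2]

println "${idStr} / ${jobTitle} / ${managerName}"
```

Because the subscript returns a String, the initial can be dropped straight into a GString or concatenated without any conversion.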

So far in this post, I've shown snippets of the overall script as I explained the various features of Groovy that are demonstrated in each snippet. The entire script is shown next and that code listing is followed by a screen snapshot of how the output appears when the script is executed. The complete code for the Groovy Employee class was shown previously.

generateReport.groovy: The Complete Script
#!/usr/bin/env groovy

// Add JDBC driver to classpath as part of this script's bootstrapping.
// See http://marxsoftware.blogspot.com/2011/02/groovy-scripts-master-their-own.html.
// WARNING: This location needs to be adjusted for specific user environment.
this.class.classLoader.rootLoader.addURL(
   new URL("file:///C:/oraclexe/app/oracle/product/11.2.0/server/jdbc/lib/ojdbc6.jar"))


int TOTAL_WIDTH = 120
String HEADER_ROW_SEPARATOR = "=".multiply(TOTAL_WIDTH)
String ROW_SEPARATOR = "-".multiply(TOTAL_WIDTH)
String COLUMN_SEPARATOR = "|"
int COLUMN_SEPARATOR_SIZE = COLUMN_SEPARATOR.size()
int COLUMN_WIDTH = 22
int TOTAL_NUM_COLUMNS = 5
int BALANCE_COLUMN_WIDTH = TOTAL_WIDTH-(TOTAL_NUM_COLUMNS-1)*COLUMN_WIDTH-COLUMN_SEPARATOR_SIZE*(TOTAL_NUM_COLUMNS-1)-2



// Get instance of Groovy's Sql class
// See http://marxsoftware.blogspot.com/2009/05/groovysql-groovy-jdbc.html
import groovy.sql.Sql
def sql = Sql.newInstance("jdbc:oracle:thin:@localhost:1521:xe", "hr", "hr",
                          "oracle.jdbc.pool.OracleDataSource")

def employeeQueryStr =
"""SELECT e.employee_id, e.first_name, e.last_name,
          e.email, e.phone_number,
          e.hire_date, e.job_id, j.job_title,
          e.salary, e.commission_pct, e.manager_id,
          e.department_id, d.department_name,
          m.first_name AS mgr_first_name, m.last_name AS mgr_last_name
     FROM employees e, departments d, jobs j, employees m
    WHERE e.department_id = d.department_id
      AND e.job_id = j.job_id
      AND e.manager_id = m.employee_id(+)"""

def employees = new TreeMap<Long, Employee>()
sql.eachRow(employeeQueryStr)
{
   def employeeId = it.employee_id as Long
   def employee = new Employee(employeeId, it.first_name, it.last_name,
                               it.email, it.phone_number,
                               it.hire_date, it.job_id, it.job_title,
                               it.salary, it.commission_pct, it.manager_id as Long,
                               it.department_id as Long, it.department_name,
                               it.mgr_first_name, it.mgr_last_name)
   employees.put(employeeId, employee)
}

println "\n\n${HEADER_ROW_SEPARATOR}"
println "${COLUMN_SEPARATOR}${'HR SCHEMA EMPLOYEES'.center(TOTAL_WIDTH-2*COLUMN_SEPARATOR_SIZE)}${COLUMN_SEPARATOR}"
println HEADER_ROW_SEPARATOR
print "${COLUMN_SEPARATOR}${'EMPLOYEE ID/HIRE DATE'.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
print "${'EMPLOYEE NAME'.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
print "${'TITLE/DEPARTMENT'.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
print "${'SALARY INFO'.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
println "${'CONTACT INFO'.center(BALANCE_COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
println HEADER_ROW_SEPARATOR

employees.each
{ id, employee ->
   // first line in each row
   def idStr = id as String
   print "${COLUMN_SEPARATOR}${idStr.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   def employeeName = employee.firstName + " " + employee.lastName
   print "${employeeName.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   def jobTitle = employee.jobTitle.replace("Vice President", "VP").replace("Assistant", "Asst").replace("Representative", "Rep")
   print "${jobTitle.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   def salary = '$' + (employee.salary as String)
   print "${salary.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   println "${employee.phone_number.center(BALANCE_COLUMN_WIDTH)}${COLUMN_SEPARATOR}"

   // second line in each row
   print "${COLUMN_SEPARATOR}${employee.hireDate.getDateString().center(COLUMN_WIDTH)}"
   def managerName = employee.managerFirstName ? "Mgr: ${employee.managerFirstName[0]}. ${employee.managerLastName}" : "Answers to No One"
   print "${COLUMN_SEPARATOR}${managerName.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   print "${employee.departmentName.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   String commissionPercentage = employee.commissionPercentage ?: "No Commission"
   print "${commissionPercentage.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   println "${employee.emailAddress.center(BALANCE_COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   println ROW_SEPARATOR
}

In this blog post, I've attempted to show how Groovy provides numerous features and other syntax support that make it easier to write scripts for generating readable and relatively attractive output. For more general Groovy scripts that provide text output support, see Formatting simple tabular text data. Although these are nice general solutions, an objective of my post has been to show that it is easy and does not take much time to write customized scripts for generating custom text output with Groovy. Small Groovy-isms such as easily centering a String, easily converting a Date to a String, extracting any desired character from a string based on array position notation, and easily accessing database data make Groovy a powerful tool in generating text-based reports.