Build Automation Software is still just Software

Whenever you move out of your cozy Java (language)/Eclipse/Maven comfort zone, you realize there is a whole plenitude of build systems that come with different environments. While you would probably associate the classic Java language with Ant or Maven (or even 5000+ lines of Bash scripts written a hundred years ago), the polyglot programmer must be aware of:

  • Scala and SBT
  • Groovy and Gradle (becoming more and more popular in Java)
  • Clojure and Leiningen
  • (J)Ruby and Rake

Leaving the JVM community, there’s still C/C++ with Make (with the beloved automake, autoconf & Co.), NAnt for .NET and certainly a lot more.

How languages rub off on their build tools

There seems to be a strong need to carry the feel and the qualities of the production language over to the build system. The bad properties get inherited, too:

  • Ant and Maven rely heavily on XML (remember EJB 2.x?) and, as far as I remember, internally represent most domain values as Strings.
  • Non-trivial Gradle scripts lead to these “Oh, there is a conditionally, dynamically added method that makes just the release artifacts for the customers fail” moments.
  • SBT forces you to leave blank lines – wonder when tabbed punchcards come back.
  • Leiningen invents yet another syntax for GAV coordinates.

Business risks of 3rd party tools

Especially one-man shows and small companies should think about the impact of having yet another third-party tool that needs updates twice a year:

  • Maven Plugins have defects fixed in features that you don’t actually use and defects introduced in features your business relies on.
  • The Maven Release Plugin is still not the right way to do continuous delivery.
  • Testing of Maven plugins was cumbersome and still is no fun.
  • Using String ${properties} and not objects that provide the proper level of abstraction remains the source of many hacks in build scripts. javaCompilerConfiguration.invokeJavaC(buildChain, context) is quite different from setting untyped properties for a Maven plugin, right?
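
To make the contrast in the last point concrete, here is a minimal sketch of what a typed configuration object could look like. The class name JavaCompilerConfiguration is borrowed from the call above, but its fields, the validation rule and the toJavacArguments method are assumptions invented for illustration; none of this is part of Maven or any existing library.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

/** Hypothetical typed replacement for a handful of untyped ${properties}. */
final class JavaCompilerConfiguration {
    private final int release;
    private final Path sourceDirectory;
    private final Path outputDirectory;

    JavaCompilerConfiguration(int release, Path sourceDirectory, Path outputDirectory) {
        // Illustrative rule only: a mistyped value fails here, not halfway through a release build.
        if (release < 1) {
            throw new IllegalArgumentException("release must be positive: " + release);
        }
        this.release = release;
        this.sourceDirectory = Objects.requireNonNull(sourceDirectory, "sourceDirectory");
        this.outputDirectory = Objects.requireNonNull(outputDirectory, "outputDirectory");
    }

    /** Translates the typed settings into javac command-line arguments. */
    List<String> toJavacArguments() {
        List<String> arguments = new ArrayList<>();
        arguments.add("--release");
        arguments.add(Integer.toString(release));
        arguments.add("-sourcepath");
        arguments.add(sourceDirectory.toString());
        arguments.add("-d");
        arguments.add(outputDirectory.toString());
        return Collections.unmodifiableList(arguments);
    }
}
```

A misspelled property name silently does nothing in a String-based configuration; here the compiler and the constructor reject the mistake before the build starts.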

Why you might want to roll your own Build Automation System

Many people still think that programming in a general purpose language and programming by configuring a model are basically two different things. They are not. Anneke Kleppe introduced the term Mogram, which is a portmanteau of Model and Program. Think of JSON, which is data as a program. You are not configuring Maven or Gradle, you are programming them.

The problem with, say, a Maven pom.xml is that it is hard to test its abstract behavior. Does it generate all the artifacts with the content we expect? While the plugin might be correct, you might have missed a configuration parameter or set up a wrong (or no) base path. I am always afraid when someone changes the build system – did they do more than a manual smoke test after the last-minute change?
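
One way to answer the "content we expect" question automatically is a small check that opens the built archive and reports what is missing. The following is only a sketch using the JDK's java.util.zip API; the class name ArtifactContentCheck and the method missingEntries are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Collection;
import java.util.Enumeration;
import java.util.Set;
import java.util.TreeSet;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Hypothetical helper: verifies that a built JAR/ZIP contains the entries we rely on. */
final class ArtifactContentCheck {
    private ArtifactContentCheck() {
    }

    /** Returns the expected entry names that are absent from the given archive. */
    static Set<String> missingEntries(Path archive, Collection<String> expectedEntries)
            throws IOException {
        Set<String> missing = new TreeSet<>(expectedEntries);
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                // Every entry found in the archive is no longer missing.
                missing.remove(entries.nextElement().getName());
            }
        }
        return missing;
    }
}
```

A unit or integration test can then assert that the returned set is empty for each artifact the build is supposed to produce.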

Test-First development of your Build Automation System

Build Automation is extremely relevant for the delivery of the final product, yet it is often treated as a kind of glorified AUTOEXEC.BAT: The build system people tinker around with plug-in configurations in the pom.xml like in the old MS-DOS days. But it’s real software!

So, why not treat it as a first-class citizen? Having an easily maintainable, customized process written in your production general purpose language prevents 1500+ line build files (your coding standard says ~200 lines per unit max, right?), hard-coded values and doomsday events on important releases.

A few Ideas

  • Use Test-First Development, TDD, all of it. I want my build system to be at least as good as the final product – when it has a defect or is not extensible I cannot deliver my products with the desired quality or in time.
    • The machines that build a consumer product often outlive the product they build for economical reasons.
  • Apply the same (or even more strict) design and coding rules.
  • Look out for self-similar structures. Even a Build Tool needs a Build Tool – but usually just not that complicated.
  • Do not use Language A to build Language B (except when A is a subset of B). Otherwise you will have more tools that may misbehave after an innocent update.
  • Leverage compiler type checks (an ArtifactId is pretty well defined…) and IDE auto-completion.
  • Favor small build tasks over the Maven do-it-all-and-in-one-file approach (try configuring Maven to distribute an artifact to 5 different repositories, depending on different conditions).
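
The point about compiler type checks can be illustrated with a tiny value type. This is a sketch under assumptions: the class ArtifactId below is hypothetical, and the character pattern is my approximation of what Maven accepts for identifiers, so adjust it to the rules of your own repository.

```java
import java.util.regex.Pattern;

/** Hypothetical value type: an artifact id that cannot be confused with any other String. */
final class ArtifactId {
    // Assumption: letters, digits, underscore, dot and hyphen only.
    private static final Pattern VALID = Pattern.compile("[A-Za-z0-9_.-]+");

    private final String value;

    private ArtifactId(String value) {
        this.value = value;
    }

    /** Creates an ArtifactId or fails immediately if the text is not well-formed. */
    static ArtifactId of(String value) {
        if (value == null || !VALID.matcher(value).matches()) {
            throw new IllegalArgumentException("not a valid artifact id: " + value);
        }
        return new ArtifactId(value);
    }

    String value() {
        return value;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ArtifactId && ((ArtifactId) other).value.equals(value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }

    @Override
    public String toString() {
        return value;
    }
}
```

A method that takes an ArtifactId parameter cannot accidentally be called with a group id or a version string, and the IDE can complete it.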

And please: Don’t invent yet another embedded DSL: A ten line build script is something you only find in textbooks (just like SELECT * FROM ORDERS …).


Project templates: Declarative Maven Archetypes vs. Rolling your own Generator (with NIO2)

Abstract

A Maven Archetype is a template that enables developers to create several Maven projects with common features, such as Maven settings, dependencies and default resources. Creating archetypes can be cumbersome if done seriously, and they may be difficult to test. A lot of public archetypes seem to be outdated and not very flexible. This article describes experiences with the archetype mechanism and is a reminder that some simple Java objects can do the same trick using basic Java APIs and the StringTemplate v4 template library, while being easily testable and requiring no knowledge of yet-another-maven-plug-in.

My basic-java-archetype

Years ago I started putting every idea, learning test and snippet into a single project in my workspace (an “experimental”, “playground” or “prototype” project). This worked well until I really wanted to use the outcome ;-). If an idea evolves, it is sensible to create a separate project to have cohesive parts testable and in one place, with minimal dependencies. Additionally, I leave ideas unfinished, even non-compiling or with failing unit tests, which prevents using the Infinitest plug-in that supports continuous unit testing.

One of my most valuable assets is a simple Maven archetype for Java projects that I created in 2011. Creating a Maven archetype was the obvious solution to the problem: a simple template that sets up common dependencies and plug-ins. I chose a minimal variant, only including a reference to a parent POM for setting up basic plug-ins (compiler source/target 1.6 and so on), and two POM projects including common run-time and test dependencies (this helps to keep the parent POM unpolluted).

Maven Archetypes are heavy-weight

What turned out to be tricky was that, unless you decide to work with snapshots, you’ll need a stable, released parent POM, released dependency POMs and an actual working template project to begin with, as it makes no sense to create a non-working template. These need to go into version control (and additionally the POMs are required in the artifact repository – Nexus, Artifactory etc.).

Secondly you need an archetype project, which also is put into version control because it needs to get released and installed in your central Maven repository. From my experience it is best to keep templates free of any example code (such as HelloWorldService.java or MyFirstUnitTest.java). As I create a lot of small projects these get deleted right away and only bother me. You could add an “example-archetype” to aid new developers if you feel your organization requires it but keep a “real”, minimal template for the experienced staff.

This turned out to be non-trivial, because there’s a lot that can go wrong. Items need to be Maven-released in the right order, bugs require re-releases and you might also discover Maven (plug-in) bugs along the way. “Make simple things simple and hard things possible” is definitely not the Maven way. Except for some default resources required by the Unitils test library, I wanted to keep the directories clean, that is, empty. The resources plug-in omits empty directories by default (there is a configuration setting) and I think some other plug-in wasn’t happy either. It was definitely hard work to create a working archetype with everything released and working flawlessly, although it included only three files and a few directories. But it was still worth the effort: I use it to this very day.

The archetype mechanism is basically not very flexible. I observed that some organizations tend to use obsolete or inefficient archetypes because it is regarded as a hassle to improve and refactor them. Parent POMs are several versions ahead of what is declared in the archetype, so there are several versions in use, depending on what the developers thought was the current parent POM version. With me it’s the same: there are some settings in the parent POM and in the template I’d like to change, but as I get along with it, it’s low priority.

Java generation scripts

Learning from Ruby and Python, I’ve found a way that suits my requirements much better. Currently I am working out several ideas regarding continuous delivery and automatic deployment processes, which involves a lot of file handling tasks. Java NIO2 provides near-native possibilities for file-system handling, including POSIX attributes. I wrote a lot of small learning tests using UnitilsIO to get familiar with it. Writing small, script-like components feels a lot like Ruby and Python, without sacrificing compile-time type checking.
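
As an illustration of the POSIX support mentioned above, the following sketch writes a small script file and marks it executable. The class DeployScriptWriter, the method name and the chosen permission string are assumptions made for this example; the NIO2 calls themselves (Files.write, Files.setPosixFilePermissions, PosixFilePermissions.fromString) are standard JDK API.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.nio.file.attribute.PosixFilePermissions;

/** Hypothetical helper for deployment scripts: writes a file and makes it executable. */
final class DeployScriptWriter {
    private DeployScriptWriter() {
    }

    static Path writeExecutable(Path target, String content) throws IOException {
        // CREATE_NEW refuses to overwrite an existing file.
        Files.write(target, content.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE_NEW);
        // POSIX permissions are only available on file systems that support them.
        if (target.getFileSystem().supportedFileAttributeViews().contains("posix")) {
            Files.setPosixFilePermissions(target, PosixFilePermissions.fromString("rwxr-x---"));
        }
        return target;
    }
}
```

On file systems without POSIX support the permission step is simply skipped, so the same script-like class still runs there.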

What is great about NIO2 is that it lends itself very well to scripting. It seems heavily inspired by commons-io, making it possible to create files with one line:

Files.write(aPath, myRenderedContent.getBytes(StandardCharsets.UTF_8), StandardOpenOption.CREATE);

Using commons-cli it is easy to create a clearly structured command-line parser (that is even unit-testable) if you need one. Using StringTemplate it is straightforward to create a custom Maven pom.xml. Directories are simply created using

Files.createDirectories(Paths.get("src", "main", "java"));

From my experience it is sufficient to have one archetype project per team or per department that includes several project generation classes and uses some common building blocks to ease directory handling. I prevent creation of projects in existing directories (say /basedir/myproject/ already exists). Failure to create any file or directory results in termination of the generator – no sophisticated error handling is required because generating projects should be a simple process.
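
A generator along these lines might look like the sketch below. It is not my actual generator: the class name SimpleMavenJarGenerator, the directory layout and the POM content are assumptions for illustration, and plain String.format stands in for StringTemplate so that the example needs nothing beyond the JDK (it does no XML escaping).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

/** Hypothetical project generator: creates a minimal Maven JAR project layout. */
final class SimpleMavenJarGenerator {
    // Assumption: a deliberately tiny POM; a real template would reference a parent POM.
    private static final String POM_TEMPLATE =
            "<project xmlns=\"http://maven.apache.org/POM/4.0.0\">\n"
            + "  <modelVersion>4.0.0</modelVersion>\n"
            + "  <groupId>%s</groupId>\n"
            + "  <artifactId>%s</artifactId>\n"
            + "  <version>0.1.0-SNAPSHOT</version>\n"
            + "  <packaging>jar</packaging>\n"
            + "</project>\n";

    private SimpleMavenJarGenerator() {
    }

    /** Generates the project below baseDir; any failure simply propagates as an exception. */
    static Path generate(Path baseDir, String groupId, String artifactId) throws IOException {
        Path projectDir = baseDir.resolve(artifactId);
        // createDirectory throws FileAlreadyExistsException if the project directory exists.
        Files.createDirectory(projectDir);
        Files.createDirectories(projectDir.resolve(Paths.get("src", "main", "java")));
        Files.createDirectories(projectDir.resolve(Paths.get("src", "test", "java")));
        String pom = String.format(POM_TEMPLATE, groupId, artifactId);
        Files.write(projectDir.resolve("pom.xml"), pom.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE_NEW);
        return projectDir;
    }
}
```

Because the generator is an ordinary class, a unit test can run it against a temporary directory and inspect the result, including the refusal to generate into an existing project directory.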

You can use package- or class names to encode different versions of the same generator. Your archetype project may look like:

MyDepartmentsArchetypes:

src/main/java/
              com/myorg/mydep/archetypes
                                        /SimpleMavenJARGeneratorv1.java
                                        /SimpleMavenJARGeneratorv2.java
              com/myorg/mydep/archetypes/simplemaven/v1/
                                        /SimpleMaven.java
src/test/java/
              ... // don't forget to test archetypes
                  // it may pay off to use integration-tests that
                  // fire up a Maven instance with separate
                  // configuration and canned local repo!

Note that custom Java archetype generation scripts are easy to test, to document and to debug because you’re using existing infrastructure. And OO’s “tell, don’t ask” principle may come in handy when solving really complex project and deployment situations. If Java is not your primary language, all of this can of course be applied there, too.

Please comment if you have some hints or practices on creating Maven archetypes and if you find the plain Java approach useful and applicable.