Sunday, March 1, 2009

Build tools

I referred to the Maven change at my work the other day, but I didn't take the time to describe some of the basic principles that I was using.  These principles generate questions rather than exclude or include things.  As I mentioned in this post, any build tool can be configured to support your environment.  Using that logic, why would anyone move beyond shell scripts and batch files?  That's the first item in the philosophy:

1.  Can this project use the build tool in the way it was designed?
For make, this means trying to use only the implicit rules and organizing your pieces so that all the interdependent files are within the same directory.  For Maven, this means organizing your project in a modular fashion where each of the configuration files describes only the dependencies between the modules.  When you move out of the comfort zone for the tool, you increase the likelihood of mistakes, or design cul-de-sacs.  Every design has certain features that shape it, and those features exclude other items or make them difficult to achieve.  Using Maven to build a project developed entirely in C raises a flag.

2. Can the build tool handle the corner case you know about now?
Pretty simple thing to remember - if the tool can't handle all your current requirements, how will you deal with the unexpected?

3. Can the tool be set up to run correctly from a sandbox?
By sandbox I mean can the tool be useful without referring to non-local items.  I understand that one of the features of Maven is that it can keep various libraries up-to-date by retrieving the latest version.  By "retrieve" it means connect via HTTP, I believe.  I've been informed that this behaviour can be disabled, which would be a plus in my view.  When building a particular project, it is preferable to be able to identify, before the build, the files that will be included.  That doesn't really provide any more or less protection from error, but it makes the process more repeatable and leads to the next point.
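
As a rough illustration of identifying, before the build, the files that will be included, here is a minimal Python sketch that records every file under a directory together with a SHA-256 digest. The function name `build_manifest` and the idea of a single input directory are assumptions made for the example; they are not part of Maven or any other build tool.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Return {relative_path: sha256_hex} for every regular file under root.

    Hypothetical helper: 'root' stands for whatever directory holds the
    build inputs (sources plus locally stored libraries).
    """
    root = Path(root)
    manifest = {}
    # Sort so the manifest is produced in the same order on every machine.
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest
```

Two builds that start from identical manifests used the same inputs, which is the repeatability this question is after.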

4. Can the tool be set up to run correctly from source control?
Imagine you have a project in a source control system and you connect with a new machine.  Can you do a single checkout, run the build tool, and generate the proper output?  This is something that I value highly.  It speaks to repeatability and how fast you can get new project members up to speed.  It also means that as long as you have the project source, you can build it.  Someone I knew at university was doing his PhD, but he said that he could no longer print or display his Master's thesis.  He wrote it in a PostScript or LaTeX dialect and could no longer find the pieces to render the source as it appeared in the hard copy.  Many companies and individuals rely on open source tools.  If they do not keep the tools along with the project, they risk not being able to build the project from source in the future.

5. Can developers operate effectively without modifying the configuration files?
One of the problems I've seen in various workplaces is build configuration differences.  If the build configuration files have to be checked out of source control and modified to build correctly, that is a large potential problem.  Ideally, build configurations would not need to be altered unless files are added or removed - even some tolerance of those changes would be good.  If these files have to be checked out, different groups produce slightly different builds, which leads to errors arising from these small changes.  These errors are very hard to track down and can be avoided with proper tool selection.
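
One way to catch this kind of drift early is to compare the build configuration in a working copy against what is recorded in source control. The following Python sketch assumes both sides are available as mappings from file path to content digest; the function name `config_drift` and the shape of its result are hypothetical and chosen only for illustration.

```python
def config_drift(recorded, current):
    """Compare two {path: digest} mappings and report how they differ.

    'recorded' is assumed to describe the configuration files as checked in;
    'current' describes the same files in a developer's working copy.
    """
    modified = sorted(
        p for p in recorded if p in current and recorded[p] != current[p]
    )
    missing = sorted(p for p in recorded if p not in current)
    extra = sorted(p for p in current if p not in recorded)
    return {"modified": modified, "missing": missing, "extra": extra}
```

If all three lists come back empty, the working copy is building with the configuration everyone else has.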

To summarize - the idea is that doing the right thing should be easy and doing the wrong (or unexpected) thing should be hard.  It should be easy to identify where everything came from and how something was built.  These things may take a large amount of configuration - a ramp-up time learning how to use a tool properly.  As for Maven, so far I think it's right for what we are doing at work - but there are caveats.  The main one is the potential for cross-module changes, of which I know of one current example at work.  Items that touch many modules can be refactored by adding more indirection (or abstraction), but this has its own cost.  Oh well - such complexity is what makes things difficult and what keeps me employed.  I just want to make sure that I don't forget the potential problem areas in the future.
