Summary
Code generators can automatically implement certain types of functionality, saving time and eliminating the possibility of certain classes of bugs.
Although they have much to recommend them, code generators also have a cost that should be considered carefully before incorporating them into your project.
Favor generators that allow a clear separation between generated and non-generated functionality, but make sure you understand the trade-offs you are making before including any generator in your project.
Details
Code generators can be grouped into three general types:
- Boilerplate generators
- Compile time annotation processors
- Runtime generators/frameworks
Boilerplate Generators
Boilerplate generators are the simplest form of code generation. They can be further split into:
- Generators that insert code into existing classes (e.g. methods auto-generated by an IDE).
- Generators that produce scaffolding that is checked into version control and modified.
- Generators that produce new classes from a model. The generated code is not normally checked into version control.
We recommend that the first type is used sparingly, if at all. This is discussed further in "Know How to Implement Hashcode and Equals".
Generating code from a model (such as a schema or grammar) can be a useful approach as long as the generated code is not modified and is packaged separately. If generated and non-generated code are packaged within the same module then this can start to cause friction (see below).
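As a rough illustration of generating classes from a model, the sketch below renders the source of an immutable class from a map of field names to types. The ValueClassGenerator class and its map-based model format are hypothetical and exist only for this example; a real generator would normally read a schema or grammar and write its output to a separate module during the build.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: ValueClassGenerator and its model format are hypothetical.
final class ValueClassGenerator {

    // Renders Java source for an immutable class from a model of field name -> field type.
    static String generate(String className, Map<String, String> fields) {
        StringBuilder src = new StringBuilder();
        src.append("// GENERATED CODE - DO NOT EDIT\n");
        src.append("public final class ").append(className).append(" {\n");

        // One private final field per model entry.
        for (Map.Entry<String, String> field : fields.entrySet()) {
            src.append("    private final ").append(field.getValue())
               .append(' ').append(field.getKey()).append(";\n");
        }

        // A constructor that takes every field, in model order.
        StringBuilder params = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (params.length() > 0) {
                params.append(", ");
            }
            params.append(field.getValue()).append(' ').append(field.getKey());
        }
        src.append("\n    public ").append(className).append('(').append(params).append(") {\n");
        for (String name : fields.keySet()) {
            src.append("        this.").append(name).append(" = ").append(name).append(";\n");
        }
        src.append("    }\n");

        // One accessor per field.
        for (Map.Entry<String, String> field : fields.entrySet()) {
            src.append("\n    public ").append(field.getValue()).append(' ')
               .append(field.getKey()).append("() {\n")
               .append("        return ").append(field.getKey()).append(";\n")
               .append("    }\n");
        }
        src.append("}\n");
        return src.toString();
    }

    public static void main(String[] args) {
        // The model: an ordered map so that fields are emitted in a stable order.
        Map<String, String> model = new LinkedHashMap<>();
        model.put("name", "String");
        model.put("age", "int");
        System.out.print(generate("Person", model));
    }
}
```

The "DO NOT EDIT" header in the output reflects the advice above: the model is the thing developers change, and the generated class is regenerated rather than modified by hand.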
Compile Time Annotation Processors
JSR 269 introduced a standard framework for processing annotations at build time. Several tools exist that use JSR 269 to generate code.
Most use the annotated classes purely as input, from which new classes are generated. Often, the new classes extend or implement the annotated class or interface but remain separate. These are really just a subset of model-based boilerplate generators where the input model is annotated Java classes.
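A minimal sketch of such a processor, using the standard javax.annotation.processing API, might look like the following. The @GenerateInfo annotation, the InfoProcessor class and the shape of the generated "Info" class are all hypothetical. For brevity the sketch assumes everything lives in the unnamed package and keeps the types package-private in a single file; a real processor would be a public class in its own module, registered with the compiler.

```java
import java.io.IOException;
import java.io.Writer;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;
import javax.tools.JavaFileObject;

// Hypothetical marker annotation. It is only read at build time, so SOURCE retention is enough.
@Retention(RetentionPolicy.SOURCE)
@Target(ElementType.TYPE)
@interface GenerateInfo {}

// Hypothetical processor: the annotated type is input only and is never modified.
@SupportedAnnotationTypes("GenerateInfo")
class InfoProcessor extends AbstractProcessor {

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        for (Element annotated : roundEnv.getElementsAnnotatedWith(GenerateInfo.class)) {
            String sourceName = annotated.getSimpleName().toString();
            String generatedName = sourceName + "Info";
            try {
                // Ask the compiler for a brand new source file, separate from the annotated class.
                JavaFileObject file = processingEnv.getFiler().createSourceFile(generatedName, annotated);
                try (Writer out = file.openWriter()) {
                    out.write(render(generatedName, sourceName));
                }
            } catch (IOException e) {
                // Report the failure as a compile error attached to the annotated element.
                processingEnv.getMessager().printMessage(Diagnostic.Kind.ERROR, e.toString(), annotated);
            }
        }
        return true; // claim the annotation so that no other processor handles it
    }

    // Renders the source of the generated class as plain text.
    static String render(String generatedName, String sourceName) {
        return "// GENERATED CODE - DO NOT EDIT\n"
             + "public final class " + generatedName + " {\n"
             + "    public static final String SOURCE_TYPE = \"" + sourceName + "\";\n"
             + "}\n";
    }
}
```

Under these assumptions, compiling a class Person annotated with @GenerateInfo would produce a separate PersonInfo class, leaving Person exactly as written.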
Some (such as Project Lombok) update the annotated classes themselves, adding additional behavior. This is likely to increase both surprise and friction, which are discussed below.
Downsides
There are clearly a lot of upsides to code generators, so why wouldn't they always make sense?
The main issues they cause are surprise and friction.
Surprise
If you generate code at compile time or runtime then you are no longer programming in Java.
You are programming in an augmented Java that does things that developers maintaining the code may not be aware of.
It may do things that they do not expect.
It may break fundamental assumptions that programmers have about what can or cannot happen within their code.
Runtime generators will usually generate more surprise than compile time systems - they add an element of magic that breaks the usual Java rules. Runtime generators also often weaken type safety, moving classes of problem a developer would normally expect to occur at compile time to runtime.
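One way to see how runtime generation moves errors from compile time to runtime is the JDK's own java.lang.reflect.Proxy, which creates an implementation of an interface while the program is running. The sketch below is not how any particular framework works; the PriceService interface and RuntimeProxyDemo class are hypothetical names chosen for the example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical service interface used only for illustration.
interface PriceService {
    int priceInCents(String productCode);
}

final class RuntimeProxyDemo {

    // Builds an implementation of PriceService at runtime. No source for it exists anywhere.
    // The handler returns Object, so the compiler cannot check it against the int return type.
    static PriceService createProxy(Object cannedResult) {
        InvocationHandler handler = (proxy, method, methodArgs) -> cannedResult;
        return (PriceService) Proxy.newProxyInstance(
                PriceService.class.getClassLoader(),
                new Class<?>[] {PriceService.class},
                handler);
    }

    public static void main(String[] args) {
        // A compatible result is unboxed to int as expected.
        System.out.println(createProxy(499).priceInCents("A1"));

        // An incompatible result compiles without complaint and only fails when the method is called.
        try {
            createProxy("four ninety-nine").priceInCents("A1");
        } catch (ClassCastException e) {
            System.out.println("Failed at runtime: " + e);
        }
    }
}
```

A hand-written implementation that returned a String from priceInCents would be rejected by the compiler; here the same mistake surfaces as a ClassCastException at the call site.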
The first time a developer encounters a code generator in a project, everything it does will be surprising.
After a period of learning, most of the surprise should go away, but each developer will need to go through this learning period. The learning involved can be significant: gaining a complete understanding of a framework such as Spring, for example, takes considerable effort.
The most worrying problem is when there is still some surprise left after the initial learning period.
If you find yourself asking the question "could this be because of the code generator?" when something unexpected happens with your system, and having to eliminate that possibility each time, then you have introduced a very real cost into your project.
Friction
Code using compile-time generators will not import cleanly into IDEs unless the IDE understands how to run the generator. Even when the system is supported by an IDE, it may require plugins to be installed, configuration options to be set, and so on.
The amount of friction, and how often it is encountered, will depend on the IDE and the quality of the support. There may be little friction and it may only be encountered when a new developer joins a project. Or it may be considerable and triggered each time code is cleaned.
The most effective way to reduce the friction is to package the generated code separately from the code that depends upon it. The generated code then becomes a normal binary dependency, and the fact that it is auto-generated becomes an internal implementation detail.
While this works well, it may also have a downside. It may create artificial modules. If the code was not auto-generated, would it have made sense to package it as a separate module?
Runtime generators do not usually introduce much friction, although sometimes issues might be experienced if javaagents are not present when running tests from the IDE.
The Trade-off
So those are the issues.
Surprise and friction sound like minor concerns compared to the promise of functionality for free, but their impact can be significant.
Whether or not it makes sense to introduce a code generator often depends on how much it will be used. If there is a large amount of functionality that can be auto-generated then it probably makes sense. If the amount is relatively small, it may be best to stick with vanilla Java.