


Ken Arnold

Ken Arnold's Blog

Generics Considered Harmful

Posted by arnold on June 27, 2005 at 10:53 AM | Comments (10)

I don't know how to ease into this gently. So I'll just spit it out.

Generics are a mistake.

This is not a problem based on technical disagreements. It's a fundamental language design problem.

Any feature added to any system has to pass a basic test: If it adds complexity, is the benefit worth the cost? The more obscure or minor the benefit, the less complexity it's worth. This is sometimes called a “complexity budget”. A design should have a complexity budget to keep its overall complexity under control.

Generics are way out of whack. I have just finished (well, nearly) my work on the fourth edition of The Java Programming Language. I am glad to say that David Holmes, not I, was the one who covered generics. Just reviewing it and reading the specifications was enough to put my brain through a Cuisinart stuck on “pulse”.

We went through that chapter multiple times, consulting with several people who wrote the specs and are otherwise experts. We were only able to cover the highest level in the book, and it's still pretty hard to understand (although David exceeded himself in making it as comprehensible as possible).

Learning to use generified types can get very complicated. It's hard to understand why you cannot do some things without casts, for example. But writing generified classes is rocket science. Here's one that showed up at the very last minute: It's a bad idea to declare a method that returns an array of a type parameter. That is, you shouldn't do this:

    interface Holder<T> {
        T[] toArray();
    }
Why, you ask? Well, the problem is that T might itself be a generic type. That is, someone might declare a Holder<Set<String>>. And, ... uh, hold on, I'm trying to remember the issue here...

Actually, I'm only mildly embarrassed to say that I've forgotten. But I remember that it took a few back-and-forths between David and the advising expert so that David — remember, David is the guy who has been writing a chapter on generics after several months of experimentation and research and over a year of thinking about how to approach it — could understand the problem. So our book recommends against it because it isn't good.

Although there is an exception. It's okay to do this if the method takes as a parameter an array of T or a Class<T> object:

        T[] toArray(Class<T> type);
That's OK to do.
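
As a minimal sketch of that exception, assuming a simple list-backed implementation (ListHolder and HolderDemo are illustrative names, not taken from the book), the Class<T> argument is what lets the method build an array with the right runtime type:

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; only Holder and toArray come from the post.
class HolderDemo {

    // The variant described as acceptable: the caller passes a Class<T> token.
    interface Holder<T> {
        T[] toArray(Class<T> type);
    }

    // ListHolder is an assumed, illustrative implementation.
    static class ListHolder<T> implements Holder<T> {
        private final List<T> items = new ArrayList<T>();

        void add(T item) {
            items.add(item);
        }

        public T[] toArray(Class<T> type) {
            // "new T[n]" does not compile, so the array is created reflectively
            // from the runtime class the caller supplied.
            @SuppressWarnings("unchecked")
            T[] result = (T[]) Array.newInstance(type, items.size());
            return items.toArray(result);
        }
    }

    public static void main(String[] args) {
        ListHolder<String> holder = new ListHolder<String>();
        holder.add("a");
        holder.add("b");
        String[] out = holder.toArray(String.class);
        // prints "2 String[]"
        System.out.println(out.length + " " + out.getClass().getSimpleName());
    }
}
```

Even here an unchecked cast is needed inside the implementation, which is roughly the kind of wrinkle being complained about.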

Which brings up the problem that I always cite for C++: I call it the “Nth order exception to the exception rule.” It sounds like this: “You can do x, except in case y, unless y does z, in which case you can if ...”

Humans can't track this stuff. They are always losing which exception to what other exception applies (or doesn't) in any given case.

Or, to show the same point in brief, consider this: Enum is actually a generic class defined as Enum<T extends Enum<T>>. You figure it out. We gave up trying to explain it. Our actual footnote on the subject says:

Enum is actually a generic class defined as Enum<T extends Enum<T>>. This circular definition is probably the most confounding generic type definition you are likely to encounter. We're assured by the type theorists that this is quite valid and significant, and that we should simply not think about it too much, for which we are grateful.

And we are grateful. But if we (meaning David) can't explain it so programmers can understand it, something is seriously wrong.
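
For what it's worth, one way to read the circular bound is that it lets a base class mention "the subclass's own type" in its method signatures, which is how Enum can implement Comparable against the specific enum type. The sketch below imitates that shape with made-up classes (Ranked, Size and SelfBoundDemo are hypothetical names, not JDK types):

```java
// Hypothetical sketch of a self-referential bound, shaped like Enum<T extends Enum<T>>.
class SelfBoundDemo {

    // T must be a subclass of Ranked that names itself as the type argument.
    static abstract class Ranked<T extends Ranked<T>> implements Comparable<T> {
        private final int rank;

        Ranked(int rank) {
            this.rank = rank;
        }

        final int rank() {
            return rank;
        }

        // Because T is known to be a Ranked<T>, other.rank() is available,
        // and compareTo accepts only the subclass's own type.
        public final int compareTo(T other) {
            return Integer.compare(rank, other.rank());
        }
    }

    static final class Size extends Ranked<Size> {
        static final Size SMALL = new Size(1);
        static final Size LARGE = new Size(2);

        private Size(int rank) {
            super(rank);
        }
    }

    static final class Priority extends Ranked<Priority> {
        static final Priority LOW = new Priority(1);

        private Priority(int rank) {
            super(rank);
        }
    }

    public static void main(String[] args) {
        // prints "true"
        System.out.println(Size.SMALL.compareTo(Size.LARGE) < 0);
        // Size.SMALL.compareTo(Priority.LOW); // does not compile: a Priority is not a Size
    }
}
```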

So now we know it's complex. But if it really saved your programming butt a lot it could be worth it. So what does it save you? It saves you making mistakes, like putting a String in a list that should only contain Longs, or attempting to pull out a String from such a list.
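
To make the trade concrete, here is a small sketch of that mistake with and without generics (ListMistakeDemo and its method names are illustrative assumptions). The raw list accepts the String and fails only when the element is read back, while the generified list turns the same slip into a compile error:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class contrasting a raw list with a List<Long>.
class ListMistakeDemo {

    // Pre-generics style: the raw list accepts anything, and the mistake
    // surfaces only when an element is cast on the way out.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawListFailsAtRuntime() {
        List longs = new ArrayList();
        longs.add(Long.valueOf(42L));
        longs.add("forty-two"); // compiles, although the list is meant for Longs
        try {
            Long second = (Long) longs.get(1);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Generified style: the wrong element type is rejected before the code runs.
    static long genericListSum() {
        List<Long> longs = new ArrayList<Long>();
        longs.add(42L);
        // longs.add("forty-two"); // rejected by the compiler
        long sum = 0;
        for (Long value : longs) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(rawListFailsAtRuntime()); // prints "true"
        System.out.println(genericListSum());        // prints "42"
    }
}
```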

But we have a demonstration proof that we can live without it, namely that we have for nearly a decade. Of course there are such bugs in code, and if you generify a bunch of code you might even find one or two that were waiting to bite you (unless that code is actually orphaned). But I have yet to find someone who believes this to be a major source of error in their code, compared to other problems.

So we have a feature whose complexities are high, whose learning curve is steep, and whose benefit is limited. Add to that the fact that the feature is ubiquitous -- with Java 5 it is nearly impossible to write code that doesn't interact with generics.

The complexity of Java has been turbocharged to what seems to me relatively small benefit. I don't see that the value is there to justify the cost. Not that we can change things, but I think we should at least view it as a demonstration proof of the value of an explicit complexity budget against which features must be justified. Without such a budget, it feels like the JSR process ran far ahead, without a step back to ask “Is this feature really necessary?” It seemed to just be understood that it was necessary.

It was understood wrong.



Comments
Comments are listed in date ascending order (oldest first)

  • I've found several bugs in my code because of generics. I was checking collections for items that could never be there, and putting actual objects instead of wrappers into collections. They didn't show up often at runtime for various reasons, but with generics I found them. I think most people who use generics for a few months will get the hang of it. I definitely have. Especially, if you use a smart editor like IDEA or maybe Eclipse, it does a lot of the work for you, and you don't need to think much about it. (Although IDE support for variance could be improved in some important ways.)

    Posted by: keithkml on June 27, 2005 at 01:08 PM

  • Generics are a great tool (and the benefit vastly exceeds the learning curve, and eventual deficiencies in the current design of Java5) when you really need them. And I very often need this, for example when I have complex collection structures like this:

    /** [QualityParameter->HashMap[String->ConfigurationItem]] Calculation data, indexed by QP. */
    private static final HashMap mapCalcData;
    

    The variable above produced code that was difficult to read and maintain, with a boatload of typecasts and iterators in loops. (Notice that I developed an informal generics-like documentation for the structure of collections.) Now with Java5 it's much better:

    /** Calculation data, indexed by QP. */
    private final HashMap<QualityParameter,HashMap<String,ConfigurationItem>> mapCalcData;
    

    The generic collection above, combined with enhanced-for, has made my code MUCH more readable and robust (even uncovered a couple bugs or bad practices). I have lots and lots of examples like that, because I'm often implementing high-performance, server-side runtimes when I don't have the luxury of wrapping every complex data structure in several type-safe container classes just to make code read better.

    When used in public interfaces, generics are also invaluable to enforce correctness, prevent bugs and reduce testing effort. For example, in one project I have defined an interface that declares a method that returns a Map<Integer, Object[][]>[]. Without generics, this method would return a simple Map[]. There are dozens of classes implementing this method, but I no longer need to worry that somebody may be putting the wrong kind of object in the returned map, which would bomb a critical part of the app. This is especially important with teams when one developer defines an interface that others will implement.

    I agree that Java5's generics does have limitations. It should be improved (hopefully in Mustang?) at least to the point that I could write fully generic and warning-free code. Right now, there are some idioms that you cannot code without a type-safety warning, such as creating arrays of generic types. I don't know what it takes to solve these issues -- more syntax sugar, extensions to the bytecode/VM spec, harder type inference in javac, whatever -- I just want it fixed.

    P.S.: I was also a C++ programmer in a previous life, and the complexity of templates exceeds Java5's generics by orders of magnitude. I still remember losing a full hour simply to understand a horrendous, half-page-length compiler message generated by a bug in the expansion of deeply nested templates in STL classes, and sometimes there was no bug at all, just an incompatibility between an older compiler and newer libraries requiring support for the very latest template black magic... so, I think Java strikes a very good balance between power and complexity.

    Posted by: opinali on June 27, 2005 at 02:09 PM

  • The problem with the advent of generics is not that they are hard; it is that they require a mindset shift from "This will somehow work" to "I am sure this will work since I thought about it". Anyway, I somewhat agree that the advantages of generics are too small compared to their complexity; just look at Bruce Eckel's research in the field. I grasped java5 generics quite easily since I was already thinking in terms of covariance/contravariance/rich type systems, but I think this stuff is out of place in java. Oh, and the worst thing is that I think there were some things that could have produced a much greater productivity boost with little or no mental effort, such as adding mixins.

    Posted by: homersimpson on June 28, 2005 at 02:01 AM

  • /** [QualityParameter->HashMap[String->ConfigurationItem]] Calculation data, indexed by QP. */
    private static final HashMap mapCalcData;
    ==============

    Just a thought on the above code - I'm not sure if generics are the best solution for this. Wouldn't returning a wrapping class with methods like "getConfigurationItem(String)" be more helpful? There are places in PMD where I have code that returns Maps that contain Lists, and I always feel like I'm forcing too much information onto my client code. I need to clean those up...

    Posted by: tcopeland on June 28, 2005 at 07:03 AM

  • Generics themselves have a high 'complexity cost' associated with them alone, but the greatest cost comes from the unwillingness to change what is 'Java' and break backwards compatibility to simplify new features. Generics in Java 5 are "compiler fluff", and users must deal with a lot of warnings and flaws as a result. For example, your T[] return above - arrays have tons of shortfalls with Java generics, including several common array operations being completely illegal with generic types. This is not a flaw of generics (C# does not have these problems) - it is a problem in Java not accepting generics as a "Java" feature, only accepting it as a language feature alone.

    Posted by: cyberakuma on June 28, 2005 at 12:34 PM

  • homersimpson said above

    Oh, and the worst thing is that I think there were some things that could have produced a much greater productivity boost with little or no mental effort, such as adding mixins.

    Java 5 does mixins. Check out the rapt project here on java.net. The documentation for mixins is here and yes, it supports generics.

    And if you're at JavaOne at the moment, these mixins are going to be discussed in the "apt uses of apt" BOF tonight at 7:30

    Posted by: brucechapman on June 28, 2005 at 06:30 PM

  • Ok, so the help screen for posting a comment was bullshitting me when it says

    Allowable html: a href,br/,p,b,strong,em,i,ol,ul,li,blockquote,pre

    The mixin documentation is here
    https://rapt.dev.java.net/nonav/docs/api/net/java/dev/rapt/exploratory/mixin/package-summary.html#package_description
    and the rapt project is here https://rapt.dev.java.net/

    Posted by: brucechapman on June 28, 2005 at 06:34 PM

  • The problem is not generics, which have been proven technology for decades (see Ada) and are well understood. The problem is also not recursive types as in enum«E extends enum«E»»¹, which just restricts the type parameter to subclasses of enum«E». The problem is the half-hearted approach Sun has chosen to implement generics. The limitations on generic arrays, for instance, are just a side effect of type erasure. Microsoft's .NET VM did it better (although not completely satisfyingly). Sun had better consider implementing generics at the VM level. To say "Generics are a mistake" is a short-sighted response. Generic concepts in general are powerful and intuitive. The perverted implementation in Java makes them appear harder than they are.

    --------------------
    1 Had to use french quotes here since the HTML processing messed this up.

    Posted by: albert_bachmann on June 29, 2005 at 02:12 AM

  • tcopeland: Yes, I could have used wrapper classes, but like I stated, the cost of doing that was substantial for this particular program because that data structure could be very large.

    Even in the 95% of the cases, where my collections don't have a million items, wrapper classes have a cost in making my code bigger and harder to maintain. The generic collections are as safe and efficient as wrapped collections, require zero extra code, and the resulting navigation of the collections is even more readable because I can use standard collection methods like get(), or even better, enhanced for. Of course I could emulate this by defining identical methods in my wrapper classes, and implementing Iterable in those classes...

    There are many other great uses for generics too. Collections are just the bread-and-butter that is easier to demo, but my 'killer' case for generics is reifying type relationships that are otherwise implicit. See this example:

    class Account<C extends Client>
    class PremierAccount extends Account<PremierClient>
    

    Here, I have a hierarchy of Clients and another hierarchy of Accounts, and the two are tightly coupled by type: certain kinds of accounts require certain kinds of clients. But now I can use generics to create constraints binding the account type to the client type. The type-safety benefit exists here (methods like Account.getClient() will return the proper client type and the compiler won't let me create a PremierAccount for a non-Premier Client), but I consider an even bigger benefit the fact that the typing relationship between the two hierarchies is explicit in the code, in a way that beats any documentation, and (when CASE tools support Java5) might smooth even more the transition between modeling and code.

    Posted by: opinali on June 29, 2005 at 11:02 AM

  • cyberakuma, albert_bachmann: I agree wholeheartedly that Java5's generics has several limitations that could be avoided. Even with the erasure model, we can do better, if we are prepared to require some changes in the VM level, or some compatibility-breaking in the APIs.

    But I am happy with Java5 because backwards compatibility is a HUGE win for a language so entrenched as Java. In the last few months I ported several projects to adopt generics (and other Java5 features), and I could do that with a very small effort. Because generics (as well as most stuff from JSR201) is only syntax sugar, at the end of the day, the updated code is identical to the original code. It's using the same libraries, it's exporting the same non-generic signatures to non-generified client code (which will only report type-safety warnings), the performance is the same, everything is the same – I barely need to test the new code! In fact, the transition to J2SE5.0 is so smooth that my only constraint is being able to deploy the new runtime, so the only time I cannot do it is when the deployment environment is using some old appserver from a three-letter vendor that lags behind in new J2SE support. And even in this case there are workarounds, like RetroWeaver, javac -jsr14 (check it: it's undocumented but produces 1.4-compatible classes with generics!), and similar solutions. I would love to work only in all-new projects with no legacy code, libraries and components, but most often that's not the case...

    Now, consider that we are 10 months past 1.5.0-FCS, and the adoption of Java5 is still modest (at least in my POV). If Sun/JCP had chosen a less compatible way, the adoption would be even slower. I still remember my C++ days when I had to wait years to actually use the latest cool stuff in the language due to the slow pace of compiler vendors, tool and library vendors, other developers (due to the big learning curve), and managers (due to the risk of changing "stuff that works")... all these factors being related to the language changes being very big, very complex and very risky.

    My proposed solution: let's use releases 5.0 and 6.0 (Mustang) to stabilize and popularize generics, but for Dolphin (7.0), let's make generics substantially better even if this requires major rework of the VM and libraries, and forces people to move to generics if they haven't yet done so (e.g., dropping support for non-generic use of generic classes, like "ArrayList list", which is one of the major blockers for enhancements in Java5's generics.)

    Posted by: opinali on June 29, 2005 at 11:24 AM
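
Several of the comments above tie the array restrictions and the type-safety warnings to type erasure. As a minimal sketch of what that means in practice (ErasureDemo and its method names are illustrative assumptions), two differently parameterized lists share one runtime class, and an array of a generic type can only be obtained through a raw array and an unchecked cast:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class illustrating erasure and generic array creation.
class ErasureDemo {

    // After erasure both lists are plain ArrayList objects at runtime.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Long> longs = new ArrayList<Long>();
        return strings.getClass() == longs.getClass();
    }

    // "new List<String>[n]" does not compile. Creating the raw array and
    // casting works, but only with an unchecked warning (suppressed here).
    @SuppressWarnings({"unchecked", "rawtypes"})
    static List<String>[] listArray(int n) {
        return (List<String>[]) new List[n];
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());   // prints "true"
        System.out.println(listArray(3).length);  // prints "3"
    }
}
```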


