


Stanley Ho

Stanley Ho's Blog

Versioning in the Java Module System (Part 1)

Posted by stanleyh on May 29, 2008 at 04:01 PM | Comments (11)

This is the first of two articles about versioning in the Java Module System. The second article is about version ranges.

Versioning in the Java Module System

In order for concrete module systems to interoperate, the abstract Java Module System defines a standard versioning scheme. Our aim as spec leads is not to cause division or confusion with this scheme, but to support popular versioning practices while recognizing that versioning is new to most Java programmers. The JSR 277 Expert Group made a thorough examination of practices some time ago, and we also bring Sun's practices to the table.

Most people agree that three numbers are the core of a pragmatic versioning scheme. Most also agree that textual qualifiers best characterize qualitative behavior for a given version, e.g. "alpha release", "for customer X". The open questions are then:

  • is a fourth number reasonable?
  • if no, what are the semantics of a qualifier?
  • if yes, what are the semantics of a qualifier?

Three numbers or four?

The JDK uses four numbers like 1.2.2_18 or 1.4.2_16 because of its longstanding practice that the major version is 1. You can argue whether this is good practice, but it's the way the JDK is. And the JDK is not alone - Windows, .NET, Linux, Skype, Firefox and Thunderbird all use four numbers because it gives them room to breathe. If you took all the programs in the world and drew the distribution of how many use one, two, three, four or more version numbers, we suspect it would be roughly a normal distribution with a mean slightly past three.

So the JSR 277 EDR2 proposes the format of a version number as:

    major[.minor[.micro[.update]]][-qualifier]
where major, minor, micro, and update are non-negative integers, i.e. 0 <= x <= java.lang.Integer.MAX_VALUE. qualifier is a string, and it can contain regular alphanumeric characters, -, and _.
  • Major version number should be incremented for making changes that are not backward-compatible. The minor and the micro numbers should then be reset to zero and the update number omitted.
  • Minor version number should be incremented for making medium or minor changes where the software remains largely backward-compatible (although minor incompatibilities might be possible); the micro number should then be reset to zero and the update number omitted.
  • Micro version number should be incremented for changing implementation details where the software remains largely backward compatible (although minor incompatibilities might be possible); the update number should then be omitted.
  • Update version number should be incremented for adding bug fixes or performance improvements in a highly compatible fashion.
  • Qualifier should be changed when the build number or milestone is changed.
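To make the format concrete, here is a minimal parsing sketch. It is not the JSR 277 API: the class name `ParsedVersion`, its fields, and the choice to treat omitted numbers as 0 are all assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical holder for a parsed version string; not part of any JSR 277 API. */
final class ParsedVersion {
    // major[.minor[.micro[.update]]][-qualifier]; the nested optional groups mirror the grammar.
    private static final Pattern FORMAT = Pattern.compile(
        "(\\d+)(?:\\.(\\d+)(?:\\.(\\d+)(?:\\.(\\d+))?)?)?(?:-([A-Za-z0-9_-]+))?");

    final int major;
    final int minor;
    final int micro;
    final int update;
    final String qualifier; // empty string when no qualifier is present

    private ParsedVersion(int major, int minor, int micro, int update, String qualifier) {
        this.major = major;
        this.minor = minor;
        this.micro = micro;
        this.update = update;
        this.qualifier = qualifier;
    }

    /** Parses the text or throws IllegalArgumentException if it does not fit the format. */
    static ParsedVersion parse(String text) {
        Matcher m = FORMAT.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a valid version: " + text);
        }
        String qualifier = m.group(5) == null ? "" : m.group(5);
        return new ParsedVersion(
            number(m.group(1)), number(m.group(2)), number(m.group(3)), number(m.group(4)), qualifier);
    }

    // Assumption: an omitted number is read as 0. Integer.parseInt rejects values
    // above Integer.MAX_VALUE with a NumberFormatException (an IllegalArgumentException).
    private static int number(String digits) {
        return digits == null ? 0 : Integer.parseInt(digits);
    }
}
```

With this sketch, `ParsedVersion.parse("1.2.3.1-special")` yields the numbers 1, 2, 3, 1 and the qualifier `special`, while a five-number string such as `1.2.3.4.5` is rejected.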
The fact that some programs cope with three or even two numbers is not a prima facie reason against four. The OSGi policy of three-number versions where the third is for bug fixes (OSGi R4.1 3.6.2) leaves only two numbers for the mainstream product version. This simply isn't enough for some programs, whose OSGi metadata then has to use all three numbers for the mainstream version and the qualifier for bug fixes. Unfortunately, this requires care because qualifiers are compared lexicographically (OSGi R4.1 3.5.3). People who try to simulate a fourth number with the qualifier - 1.2.3.1, 1.2.3.2 - are confused when OSGi deems 1.2.3.2 higher than - and preferred to - 1.2.3.10. So using a qualifier, rather than a fourth number, for bug fixes doesn't really fly. (If OSGi had adopted natural order string comparison, then qualifiers with numeric content would be much more user-friendly, and would be a reasonable alternative to a fourth version number.)
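The 1.2.3.2 versus 1.2.3.10 surprise is easy to reproduce with plain Java strings. The sketch below is only an illustration of the two orderings; the class and method names are hypothetical and it does not call any OSGi code.

```java
/** Hypothetical helper contrasting the two ways a trailing "2" and "10" can be ordered. */
final class QualifierOrdering {
    /** Lexicographic order: the strings are compared character by character. */
    static int lexicographic(String a, String b) {
        return a.compareTo(b);
    }

    /** Numeric order: how a real fourth version number would be compared. */
    static int numeric(String a, String b) {
        return Integer.compare(Integer.parseInt(a), Integer.parseInt(b));
    }
}
```

`QualifierOrdering.lexicographic("2", "10")` is positive, because the character '2' sorts after '1', so the qualifier "2" ranks above "10". `QualifierOrdering.numeric("2", "10")` is negative, which is the order most people expect for bug fix counts.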

(Interestingly, while you shouldn't use the qualifier to count bug fixes, it's possible in the Java Module System to do the reverse - use the bug fix "update" number in a four-number version as a kind of qualifier. A company could say that updates in the 10000-19999 range are performance-related bug fixes while updates in the 20000-29999 are fixes for a particular customer, etc. The qualifier is then available for build numbers or datestamps in text format. We don't necessarily recommend this scheme to everyone, any more than we recommend Sun's use of a fixed major number to everyone; but it's worth allowing this kind of flexibility given the incredible diversity of Java programs.)

Alex Blewitt recently characterized the JDK's fourth number as a "(failed) marketing exercise" because it allowed Sun to "hide" the fact that JDK x.y.z had n bug fixes after release (x.y.z.1 thru x.y.z.16 for JDK 1.4.2). This would be valid criticism if Sun had adopted OSGi versioning, since OSGi suggests bumping the third number for bug fixes. But that's in the context of three-number versions! We believe the guidance is really to bump the last number, which makes perfect sense. So Sun isn't "hiding" anything - the JDK has four numbers and we've bumped the fourth number over the years. Incorporating a bug fix in the first-class version number, rather than relegating it to the ranks of a qualifier, is responsible engineering. And Sun is hardly alone in using the fourth number for bug fixes.

For people who like more than four numbers, one option is to define the semantics of four numbers then allow an unbounded number of further numbers. Unfortunately, many tools will falter when presented with more than four numbers, so JSR 277 is sticking with four for now.

Qualifier semantics

Alex Blewitt also noted: "According to the documentation of 2006, the qualifier must be used for beta releases, and must not be used for released versions. In fact, the empty string for qualifier is deemed greater than any other value." The forthcoming JSR 277 EDR2 will not require the qualifier to indicate a beta version - that's a non-testable statement.

The Java Module System is sticking with the policy that "an empty string for qualifier is deemed greater than any other value". That is, we view a qualifier as somewhat subservient to "real" versions; a version without a qualifier is higher than the same version with a qualifier. 1.2.3 is higher than 1.2.3-alpha and 1.2.3-beta. This contrasts with OSGi, where 1.2.3 is less than 1.2.3.alpha and 1.2.3.beta. (OSGi R4.1 3.5.3)
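The rule that an empty qualifier outranks any other value can be sketched as a comparison method. This is an illustration under assumptions, not the JSR 277 implementation: the class name `QualifiedVersion` is hypothetical, and ordering two non-empty qualifiers lexicographically is an assumption the article does not state.

```java
/** Hypothetical version type; only the ordering rule is taken from the article. */
final class QualifiedVersion implements Comparable<QualifiedVersion> {
    private final int[] numbers;    // major, minor, micro, update
    private final String qualifier; // empty string when absent

    QualifiedVersion(int major, int minor, int micro, int update, String qualifier) {
        this.numbers = new int[] {major, minor, micro, update};
        this.qualifier = qualifier;
    }

    @Override
    public int compareTo(QualifiedVersion other) {
        // The numbers decide first, from major down to update.
        for (int i = 0; i < numbers.length; i++) {
            int byNumber = Integer.compare(numbers[i], other.numbers[i]);
            if (byNumber != 0) {
                return byNumber;
            }
        }
        // With equal numbers, the version without a qualifier is the higher one.
        boolean thisBare = qualifier.isEmpty();
        boolean otherBare = other.qualifier.isEmpty();
        if (thisBare || otherBare) {
            return Boolean.compare(thisBare, otherBare);
        }
        // Assumption: two non-empty qualifiers are ordered lexicographically.
        return qualifier.compareTo(other.qualifier);
    }
}
```

Under this sketch 1.2.3 compares higher than 1.2.3-beta, and 1.2.3.1-special compares higher than 1.2.3 because the update number is consulted before the qualifier.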

Neither scheme is provably more "intuitive" than the other, but the Java Module System follows the traditional versioning policy of the JDK. It is worth examining that policy. (We mean the policy for real developer versions - 1.1.x thru 1.6.x - not the "product" versions - "Java 5.0", "Java SE 6" - used for marketing. Yes, we know the developer/product split is controversial.)

Consider a product which uses the qualifier to distinguish pre-FCS (pre-First Customer Ship) and FCS versions, e.g. 1.2.3-beta and 1.2.3-fcs. A programmer who wants to import the FCS version would have to know its exact qualifier, which is inconvenient considering each vendor will likely have their own FCS qualifier. An easier scheme, which is JDK policy, is to standardize on the unqualified version as the default "acceptable" version, and use qualifiers to denote pre-FCS versions.

If the FCS version is unqualified, how should post-FCS changes be described? The JDK policy is that once a product is given an FCS version, that version's code never changes again. Any code change raises a new version number. (Maybe with a qualifier, maybe not.) This makes support more manageable than if qualifiers express both pre-FCS and post-FCS information on the same version number.

This pair of policies lets programmers easily distinguish between pre-FCS and FCS versions and between FCS and post-FCS versions.

How three numbers v. four numbers impacts qualifiers

The OSGi policy that a qualified version is preferred to an unqualified version runs into trouble with three-number versions.

Suppose you absolutely insist on using three numbers plus a qualifier. (We'll use '-' to clearly separate the qualifier from the numbers.) Pre-FCS, you have 1.2.3-alpha and 1.2.3-beta, then 1.2.3 for FCS. There are really two branches after FCS:

  1. the pre-FCS of the next public version, and
  2. the post-FCS patches on 1.2.3.
Let's consider branch 2 first. If you want a qualifier on 1.2.3 to express its post-FCS state, the qualifier must be pre-planned to be lexicographically later than any pre-FCS qualifier you use with 1.2.3 like -alpha and -beta. A convention for pre-FCS and post-FCS qualifiers on the same number is less maintainable than using the numbers themselves to differentiate pre-FCS and post-FCS states. Bumping 1.2.3 to 1.2.4-special for a post-FCS state is clearly later than 1.2.3 ... while if the version is unchanged, 1.2.3-special relates to 1.2.3-alpha only via a convention that the programmer must remember.

It's true that you could use a qualifier convention like "YYYYMMDDhhmmss-MILESTONE" to automatically put pre-FCS alpha builds before post-FCS customer patches. But in OSGi, 1.2.3 is less than any 1.2.3-20080601... version so you cannot use "just" 1.2.3 anymore; your public releases must show the qualifier with its build times and all. This is not necessarily desirable! And the more information you shoehorn into a qualifier, the easier it is to make a typing error and the harder it is for programmers to understand the meaning at a glance.

So, with only three numbers, post-FCS patches after 1.2.3 (branch 2) require either a complicated qualifier :-( or a move to 1.2.4-special. Let's assume a move to 1.2.4-special. Now, where do you do pre-FCS development of the public version after 1.2.3? (Branch 1.) If you do it as 1.2.4-alpha, 1.2.4-beta etc, you're immediately back to conventions for qualifiers on 1.2.4. Undesirable for the same reasons as above.

Using only three numbers plus a qualifier always requires complicated qualifier conventions. Four numbers gives you room to breathe.

We recognize that Sun's belief that a version without a qualifier is higher than the version with a qualifier is implicitly fond of four numbers. Four numbers allow post-FCS patches after 1.2.3 (branch 2) to go into 1.2.3.1-special (the qualifier is optional really) ... and mainstream development (branch 1) to proceed in the 1.2.4 family. Numbers and qualifiers cooperate to differentiate branches and document which branch is maintenance and which branch is mainstream development.

What about interop?

Module systems and frameworks which use more or less than four numbers should consider how their numbers map to the four available in the Java Module System, and whether their semantics for numbers and qualifiers agree with those of the Java Module System.

A five-number version a.b.c.d.e could map to a.b.c.d-ve where the qualifier is 'v' followed by the e number.

A three-number version a.b.c could map to either:

  • a.b.c - this is the easy option. Technically it uses the micro (third) number to represent bug fixes, and the update (fourth) number which is supposed to represent bug fixes is 0, but this doesn't really matter.
  • a.b.0.c - this respects the original version's semantics by keeping the bug fix number (the third of three in OSGi) in the "right" place (the fourth of four in the Java Module System).
Evolving a.b.c to a.b.(c+1) in the three-number scheme is then mirrored by evolving a.b.c to a.b.(c+1) or a.b.0.(c+1) respectively in the Java Module System.

Three numbers plus a numeric qualifier, a.b.c-d, work immediately in the Java Module System with the qualifier becoming the fourth number, a.b.c.d.

Textual qualifiers are a bit complicated due to the qualifier semantics described earlier. Recall that evolving 1.2.3 to 1.2.3-special in OSGi causes the qualified 1.2.3-special to be preferred. If you map 1.2.3 to 1.2.3 and 1.2.3-special to 1.2.3-special, the Java Module System prefers 1.2.3. You should map textually-qualified versions like 1.2.3-special to 1.2.3.1-special, bumping the version number to force the (higher) qualified version to beat the (lower) unqualified version.
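The mapping rules for three-number versions can be collected into one small function. This is a sketch under assumptions rather than anything defined by JSR 277 or OSGi: the class and method names are hypothetical, it uses the "easy" a.b.c option for unqualified versions, and treating numeric qualifiers of more than nine digits (such as datestamps) as textual is a choice made here so the fourth number always fits in an int.

```java
/** Hypothetical mapper from a three-number version plus qualifier to a Java Module System version string. */
final class OsgiVersionMapper {
    static String toModuleSystemVersion(int major, int minor, int micro, String qualifier) {
        String base = major + "." + minor + "." + micro;
        if (qualifier == null || qualifier.isEmpty()) {
            // Unqualified a.b.c maps straight across (the "easy option").
            return base;
        }
        if (qualifier.matches("\\d{1,9}")) {
            // A short numeric qualifier becomes the fourth (update) number.
            return base + "." + Integer.parseInt(qualifier);
        }
        // A textual qualifier: bump the update number so the qualified version
        // still beats the unqualified a.b.c, as it does in OSGi.
        return base + ".1-" + qualifier;
    }
}
```

For example, `toModuleSystemVersion(1, 2, 3, "special")` returns `1.2.3.1-special`, and `toModuleSystemVersion(1, 2, 3, "4")` returns `1.2.3.4`.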

Conclusion

The Java Module System in the forthcoming JSR 277 EDR2 supports up to four numbers in a version for flexibility, and defines the semantics of a qualifier as somewhat subservient to numbers (1.2.3 beats 1.2.3-beta) for readability. Versions in the Java Module System are a superset of versions in OSGi, because a.b.c-qual is valid in both.

Comments

  • You seem to have done a great job of assessing existing practice, so please don't interpret the following comment as a criticism.
    The optional text qualifier has the desirable property that, if absent, it has the highest priority, e.g. 1.2.3 is greater than 1.2.3-beta. Why not apply this same logic to all levels by allowing all numbers to be preceded with alpha text, where absent alpha text has the highest value, e.g. 1.2.3 is greater than 1.2.beta3? This system is more flexible and does not need an extra qualifier level.
    However you have probably thought about this alternate scheme or similar and there is good reason not to do it. I would be interested in your thoughts.

    Posted by: hlovatt on May 29, 2008 at 10:34 PM

  • Major? Minor? Micro? Update? 3 or 4 numbers?

    IMHO, doesn't matter at all...

    Stanley, what happened to or what is wrong with my proposal of versioning?

    With the proposed solution (version compatibility based on version history), the module system would not have to make any potentially incorrect assumptions, it would not enforce the-one-versioning-policy-to-rule-them-all ... simply, it would be more flexible

    Point is:
    (a)library client MAY NOT know what future versions he will be compatible with, but
    (b)library provider DOES know if he breaks anything from the past or not…

    You could have versioning sequence like this (trying to make my point, not to be factually exact):

    kestrel
    merlin (backward-compatible=false)
    hopper
    mantis
    tiger (backward-compatible=false)
    6.0 (backward-compatible=false)


    (legend)


    Version 7.0 might break your app or might not… You don't know TODAY. Developers of 7.0 WILL know better… TOMORROW

    Your dependency could read like:

    depends: merlin (ignore:mantis) #mantis has some known regression bug

    Module system gives you either Merlin, Hopper, Tiger or 6.0.
    When 7.0 is released, you might end up running under 7.0 as well, if it WILL be declared compatible… If not (backward-compatible=false), 7.0 won't get in your way…


    I would LOVE to know what is wrong with THIS approach… Because to me this is a more sound solution than playing games with numbers...

    Posted by: patrikbeno on May 30, 2008 at 04:25 AM

  • This reasoning can be most accurately summarised as

    "Sun needs 4 level versioning, so that's what we're going to do."

    Posted by: grandinj on May 30, 2008 at 06:29 AM

  • Hi Stanley,


    I'm not trying to be critical of your efforts as I'm sure being a spec lead is no easy task.


    However, after reading your last two blog posts, I'm left with a sense of "this is it? this is what we've been waiting for?" Both posts read like the persuasive papers we had to write back in primary school. "Our versioning system is better than yours because it....just is" You make a lot of hoopla about comparing it to OSGi, however how does it stack up to other versioning schemes like Oracle's? If you really feel the need to spend several paragraphs justifying your scheme, at least do a complete job in comparing it to the existing alternatives. I don't want to sound like an OSGi zealot, I'm far from it, but since that's the only alternative you mention in your posts, I'm going to keep with that theme.


    You seem to be quite defensive with respect to OSGi. In some ways I can understand that, as the OSGi camp has been critical of your efforts in the past. I think their criticism is justified, however, as they have a module system that works today. You (in the collective sense as I realize that you're not the only one behind the module system effort) have yet to convince me as a developer that we really need something new.


    From a technical perspective, presumably each existing module system will have some adapter code that will be responsible for loading their modules and wiring them into the JSR277 system. So this code can be responsible for mapping the 4 number JSR277 version scheme to the existing module system's scheme and preserving the intent and the semantics. So whatever version scheme you decide on for JSR-277 is really moot since there will need to be mapping done under the covers anyways.


    I also think that the decision to make a dependency of "1.2.3.4" mean version 1.2.3.4 and only version 1.2.3.4 is the wrong move. I agree with you that the first time a developer saw that dependency listed they would assume exactly version 1.2.3.4. However, in reality, this is very rarely what you want when deploying an application that consists of modules from multiple sources/developers. I'm a lazy programmer, so if I'm working with Library X version 1.2.3.4, then I'll likely specify that as my dependency. If I incorporate a module from someone else and they also depend on Library X but they were working with version 1.2.3.5, all of a sudden the application has to keep two copies of the same library around when in 99% of the cases, version 1.2.3.5 would satisfy both sets of dependencies. I guess it is an ideological difference, but I think the common case is 1.2.3.4 or greater rather than exactly a particular version. If you are working with a library so unstable that you need an exact version, then in my mind you should have to go to the extra work to say [1.2.3.4, 1.2.3.4]. Make the common situation (of 1.2.3.4 or greater) easy for us poor, lazy programmers; make it easy for us to do the right thing.


    So I guess my take away from these two blog posts is that you've worked really hard to come up with a versioning scheme and dependency syntax that is close but different enough from OSGi that you can call it new, unique, different, and right. You won. However, it is a pyrrhic victory as you've spent (wasted?) all this time inventing something that really didn't need to be invented while we're still no closer to seeing how it's all going to work together. To me, as a developer, it's all about where the rubber meets the road and you've done nothing to show me how it's going to work, nor really even convinced me that we need a new approach.

    Cheers,
    Josh

    Posted by: jareed on May 30, 2008 at 06:43 AM

  • Hi Stanley. Sorry to be perfectly blunt, but I think you're suffering from a severe case of Hubris. Versioning, as has been clearly demonstrated by the long, long line of dead bodies that stretch many decades into the past, is a *hard* thing to do. Further, as has been repeatedly demonstrated by the versioning schemes we currently have with us, the chances of getting it right the first time, when you haven't put it into practice, are exactly ZERO. You are, in effect, coming up with a great THEORY. You have no PRACTICE. It's hubris of the highest order to think you've managed to come up with something that, like all battle plans, won't survive the first 5 minutes of use without finding serious, major bugs that you simply didn't understand at the time you came up with the scheme.

    Please, dear god, PLEASE stop this madness. We don't need another hero. The industry needs something that will work from DAY ONE. Right out of the gate. You're taking the literally YEARS of debugging and real world use of OSGi's versioning scheme and throwing it out because of? Well, because of no good reason that we can see.

    Seriously. Stop the hubris. Stop trying to reinvent not just the wheel, but the carbon atom, the oxygen atom and - heck - the quarks, all because of this silliness.

    If you want to change the world, then do it right by BUILDING on existing, proven, well used technology. Dear god, don't unleash yet another shiny turd that you've plucked from the bowl simply because you can.

    The industry simply can't survive this reoccurring pattern of Sun's. And you're operating at such a fundamental level here with 277 that your hubris is going to have repercussions that go far beyond much of the other stuff that Sun does.

    Posted by: hellblazer on May 30, 2008 at 07:32 AM

  • The fixed number of version components has been problematic for Maven, so I wouldn't be too happy to see modules go the same way. It'd be worth considering the proposal on Maven's wiki to support a flexible versioning scheme that can cope with an arbitrary number of components:

    http://docs.codehaus.org/display/MAVEN/Versioning

    Posted by: mihobson on May 30, 2008 at 07:47 AM

  • Ok, I am trying to get some feedback here, so if anybody wants to comment my proposal (above), don't do it here. Instead, follow this link.

    Posted by: patrikbeno on May 30, 2008 at 10:01 AM

  • Stanley, your argument boils down to one point. It is amazing so many words were used in the explanation, but I must agree with the other posters: Sun uses 4 version numbers, and so will the world too.

    Unconvincing argument.

    Posted by: paulusbenedictus on May 31, 2008 at 05:22 PM

  • The scheme being proposed here isn't new, but rather very slightly extended over that already in use for WebStart. As such it has already been in use for quite a few years.

    Posted by: mthornton on June 02, 2008 at 02:34 AM

  • I agree with other posters:

    1) one-versioning-policy-to-rule-them-all is a bad idea and should be avoided if possible. This is a very subjective matter so if you can avoid forcing a policy please do so.

    2) Barring that, I'd like to state I am opposed to Sun's four-number versioning. It is utterly meaningless to end-users. There is little or no practical difference between "micro" and "update" numbers and this is further aggravated by Sun's policy of never changing "major" from 1. If you're not going to ever change a number it becomes meaningless and you might as well omit it from the versioning scheme. The same goes for the String qualifier: a certain version number can be promoted to a beta or release-candidate "status" but why do these subjective attributes have to be part of the software version number? Isn't it enough to leave those qualitative tags in the code repository or web page linking to the download?

    Sorry to say but this smells of a "designed by committee" mindset.

    Posted by: cowwoc on June 05, 2008 at 11:40 AM

  • People, please stop bashing SUN as they are not the only ones using a "more than 3 digits" versioning scheme: IBM (yes, the same IBM that created Eclipse and pushed OSGi) uses 4 digits for Websphere and DB2, Oracle uses 5 digits for the database, the Linux kernel uses 4, and I could go on.

    The point is: bashing the proposal and saying that it is trying to bash OSGi because they use one numbering schema over another is pure... vapour, as other known and successful products have demonstrated the need for a "more than 3 digits version schema" for a lot longer than the OSGi platform has existed (who chose the 3 digit versioning schema anyway? what are the reasons?)

    This bashing is even more futile if the spec lead states that the new versioning and the current OSGi versioning will be compatible....

    Posted by: soronthar on July 17, 2008 at 02:17 PM


