Missing Version Numbers

Back in June it was announced that the next Perl release would be called Perl 7. A quick look back at Perl's history reveals that this is a little unusual. Perl 5 was originally released on October 17, 1994 and served its users well for a number of years. In 2000 Perl 6 was announced as a ground-up redesign of the language and it spent a long time in development. It eventually surfaced in 2015 but in 2019 was renamed "Raku" to distinguish it from traditional Perl. This leaves Perl 6 as a sort of "orphaned" version number.

In the meantime, while Raku née Perl 6 was being developed, the team behind Perl 5 had delivered numerous point releases and retained an active community of loyal users. Many of these had no desire to migrate to Raku and wanted a number of quality of life improvements in their language, ecosystem and libraries. The sort of changes being proposed were bigger than you'd expect from a point release, so a new major version was required. Hence Perl 7 - the real next version of Perl.

So we now have quite an interesting situation. Let's take a look at the release history:

Version Date Comment
Perl 4.000 21st March 1991
Perl 5.000 17th October 1994
Perl 6 - Actually "Raku", little resemblance to Perl 5
Perl 7 Planned 2021+ Refined and improved Perl 5

Over the years many commercial and open source projects have skipped version numbers, leaving their own little version holes with their own various explanations. Today we'll take a look at a number of similar situations, and we'll try to understand what happened in each case and why.

A fairly typical Perl program which calculates π

PHP 6

Perl was not the first language to skip a version; in fact it wasn't even the first scripting language beginning with "P" to skip version 6. A quick glance at PHP's release history reveals the following:

Version Date Comment
PHP 4.0.0 22 May 2000
PHP 5.0.0 13 July 2004
PHP 6.0.0 - Never released
PHP 7.0.0 3 Dec 2015

So what's the story here? Well in common with most software written in the early '00s PHP did not have particularly good Unicode support. The developers behind PHP recognised this and planned a rewrite - PHP 6 - that used Unicode to represent strings internally, specifically UTF-16. Work began in 2005 and would continue for a few years. While some of the planned PHP 6 features were completed on schedule, the Unicode rewrite stalled and became a major blocker to the release. In the meantime Unicode did indeed begin to take off ...

... however UTF-16 didn't. UTF-8, a different encoding, rapidly emerged as the preferred Unicode encoding on the web. This was particularly problematic since it became apparent that PHP 6 had some serious performance issues converting strings to and from UTF-8, and since web applications were PHP's bread and butter this was a big deal. In 2010, five years into PHP 6's development, the project was abandoned. Many of the non-Unicode features of PHP 6 were ported into PHP 5 and released in version 5.4.
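
To make the conversion cost a bit more concrete, here is a minimal Python sketch of what happens at the boundary of a runtime that stores strings as UTF-16 while the web speaks UTF-8. It is only an illustration, not PHP 6's actual implementation; the function names and the sample text are made up for this example.

```python
# Illustrative only: this is NOT how PHP 6 was implemented internally.
# It just shows that a UTF-16 runtime must transcode at every I/O boundary.

def utf8_to_utf16(data: bytes) -> bytes:
    """Transcode UTF-8 bytes (as received from the web) to UTF-16."""
    return data.decode("utf-8").encode("utf-16-le")


def utf16_to_utf8(data: bytes) -> bytes:
    """Transcode UTF-16 bytes back to UTF-8 for output."""
    return data.decode("utf-16-le").encode("utf-8")


sample = "na\u00efve caf\u00e9"      # hypothetical sample text: "naïve café"
as_utf8 = sample.encode("utf-8")     # what a browser would typically send
as_utf16 = utf8_to_utf16(as_utf8)    # internal form in a UTF-16 design

# 10 characters, 12 bytes in UTF-8, 20 bytes in UTF-16
print(len(sample), len(as_utf8), len(as_utf16))
```

Every request and every response pays for both conversions, and mostly-ASCII web content also takes up more memory in the UTF-16 form.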

In 2014, as a new major release of PHP was being planned, the topic of what it should be called was raised, and it was decided that for the sake of removing any confusion or ambiguity the next version of PHP would be PHP 7. There are plenty of justifications given, but the first one given in the RFC I linked really seals it:

First and foremost, PHP 6 already existed and it was something completely different. The decimal system (or more accurately the infinite supply of numbers we have) makes it easy for us to skip a version, with plenty more left for future versions to come.

This makes plenty of sense - PHP 6 was already a thing, and it's not like integers are a finite resource. If this was Ubuntu and they were considering discarding a good adjective/animal pair then perhaps it'd be slightly different.

ECMAScript 4

Brendan Eich developed Javascript over the course of 10 short days at Netscape, and it soon became the de facto scripting language for the web. This status was confirmed by Ecma, which issued a number of standards and referred to it as "ECMAScript". Nowadays they release these standards on a yearly basis (the latest being ECMAScript 2020, published June 2020) but at the end of 2009 we had five ... kinda:


Version Date Comment
ECMAScript 1 June 1997
ECMAScript 2 June 1998
ECMAScript 3 December 1999
ECMAScript 4 - Abandoned
ECMAScript 5 December 2009

The first two largely captured and formalised the existing Javascript behaviour, but the committee began to get a little more adventurous in the third version, introducing features like regular expressions and try/catch. I can't find a very good summary other than the Wikipedia one (which seems to be copy-pasted everywhere) and I'm not going to read the entire spec.

Anyway the folks at Ecma recognised the potential of rich client-side web applications, and recognised that in its current state Javascript aka ECMAScript was not particularly well suited to this task:

Though flexible and formally powerful, the abstraction facilities of ES3 are often inadequate in practice for the development of large software systems. ECMAScript programs are becoming larger and more complex with the adoption of Ajax programming on the web and the extensive use of ECMAScript as an extension and scripting language in applications. The development of large programs can benefit substantially from facilities like static type checking, name hiding, early binding and other optimization hooks, and direct support for object-oriented programming, all of which are absent from ES3.
So they intended to address this and more by creating ECMAScript 4, which was supposed to completely overhaul Javascript. Object-Oriented Programming was to be embraced (with classes and interfaces being introduced) and a rich type system was to back it up (with type annotations available to assist with correctness and provide the runtime with hints to help with compile and runtime optimisations). Numeric values would no longer just be limited to floating point - developers would now gain access to byte, int, uint, double, and even decimal numeric types. They'd even be able to define Algebraic Data Types which, at the time, were only really available in niche languages like Haskell, SML and OCaml and have only recently started to break into the mainstream. The ability to structure applications using namespaces and packages would make building larger applications composed of many separate modules more manageable. So while previous specifications either captured the status quo or made small incremental enhancements, ES4 was to be a complete redesign of the language with sweeping changes on a scale the committee hadn't yet encountered.

This concerned a number of people - most notably it drew the ire of Microsoft's IE team, who were concerned that many of the changes were not backwards-compatible. Chris Wilson, platform architect of the IE team at the time said this:
For ECMAScript, we here on the IE team certainly believe that thoughtful evolution is the right way to go; as I've frequently spoken about publicly, compatibility with the current web ecosystem - not "breaking the Web" - is something we take very seriously. In our opinion, a revolution in ECMAScript would be best done with an entirely new language, so we could continue supporting existing users as well as freeing the new language from constraints (including the constraint of performantly supporting scripts written in the old language)
Brendan Eich did not agree, and wrote a blog post titled "Open letter to Chris Wilson" which accused Wilson of being disingenuous, and Microsoft of suddenly proposing ECMAScript 3.1 without regard for the ES4 effort:
You seem to be repeating falsehoods in blogs since the Proposed ECMAScript 4th Edition Language Overview was published, claiming dissenters including Microsoft were ignored by me, or “shouted down” by the majority, in the ECMAScript standardization group
...
These ES3.1 proposals came suddenly in March after years of passive participation by Microsoft, while the rest of the group worked on ES4. They have not been touched since April.
It was clear that there would be no agreement. Microsoft effectively held all the cards - IE was at that time the dominant browser by some distance. If Ecma somehow pushed forward with ES4, it's entirely possible that Microsoft could've just ignored it altogether. ES4 was abandoned, and ES5 followed in December 2009 with slightly more modest, incremental changes. This is a strategy that has continued to this day - the language evolves but remains backwards compatible. Some ES4 features (such as classes) ended up in ECMAScript anyway, and in a curious twist Microsoft created Typescript, which takes even more ES4 features like interfaces, union types and type annotations.

IPv5

IPv4 and IPv6 are pretty well known - but you never really hear of any other versions. The IP specification took shape in the late 1970s. As engineers struggled to figure out the exact problem they were solving and the way it should be solved they quickly issued a number of iterations and ultimately produced what we now know as IPv4. In the 1990s it became apparent that the number of devices accessing the Internet would ultimately exceed the number of available addresses IPv4 could provide (a bit less than 4.3 billion when you factor in reserved blocks) - so a new version of the protocol was drafted called IPv6. There were of course other changes and motivations in IPv6, but literally the first bullet point in the IPv6 spec deals with increasing this address space - bumping it up to 128 bits (good for 340,282,366,920,938,463,463,374,607,431,768,211,456 addresses).
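
Those figures fall straight out of the address widths, so a quick sanity check in Python is easy. The constant names here are just for illustration, and the IPv4 total is the raw 32-bit space before any reserved blocks are subtracted.

```python
# Address space arithmetic for 32-bit (IPv4) and 128-bit (IPv6) addresses.
IPV4_BITS = 32
IPV6_BITS = 128

ipv4_total = 2 ** IPV4_BITS   # raw total, reserved blocks not subtracted
ipv6_total = 2 ** IPV6_BITS

print(f"IPv4: {ipv4_total:,}")
print(f"IPv6: {ipv6_total:,}")
print(f"IPv6 is 2**{IPV6_BITS - IPV4_BITS} times larger")
```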

Trying to produce a coherent history is a little difficult due to how both TCP and IP were reshaped and redefined during the early period. Version headers in packets are resized and rebased from one version to the next. Thankfully we have RFC 755 which puts its foot down and authoritatively states the following timeline:
    Version Date Comment
    IPv0 March 1977
    IPv1 February 1978
    IPv2 February 1978
    IPv3 February 1978
    IPv4 June 1978
    IPv5 - Abandoned?
    IPv6 December 1995

    Note: For IPv6 we could have linked RFC 1883 (where it was first specified) or RFC 8200 (which made a few additional changes to the protocol).

    So the only real missing version is IPv5 - which actually spent a couple of decades being defined as the "Internet Stream Protocol", never really went anywhere and is effectively abandoned. Honestly though if you're interested you should check out this article by Matthäus Wander. I spent ages bouncing around different RFCs and IENs and couldn't really piece everything together until Matthäus laid it all out so go read his article.

    IPv6 adoption is slow and steady, sitting at 33% worldwide as of 16th October 2020.

    Node.js 1.0.0 to 3.0.0

    Node.js came into existence in 2009, promising developers that they could implement performant backend code in a simple language like Javascript thanks in no small part to Google's V8 engine. I personally played with it a bit every now and then and used things like Ghost, but no more than every couple of months. Then one day I tried to run some example code only to find it didn't work and was 5 major versions out of date. I was pretty confused and tried to figure out if there was some reason I was so far behind. A quick look through Node.js' release history reveals the reason behind my confusion:

    Version Date Comment
    Node.js 0.10.0 11 March 2013
    Node.js 0.11.0 28 March 2013
    Node.js 0.12.0 6 February 2015
    Node.js 1.0.0 - Never existed
    Node.js 2.0.0 - Never existed
    Node.js 3.0.0 - Never existed
    Node.js 4.0.0 8 September 2015
    Node.js 5.0.0 29 October 2015

    So between February and September in 2015 Node.js jumped from version 0.12 to 4.0, and then a month later to 5.0. So what happened here? Well apparently a number of members of the Node.js community were dissatisfied with Joyent's stewardship of the project, forked it under the name "io.js" and released a handful of major versions - 1.0, 2.0 and 3.0.

    Ultimately Joyent, io.js and the community reached some sort of agreement, merged their codebases, and released the new version as Node.js 4.0. So if we amend our timeline to include the io.js releases ...

    Version Date Comment
    Node.js 0.10.0 11 March 2013
    Node.js 0.11.0 28 March 2013
    io.js 1.0.0 14 January 2015
    Node.js 0.12.0 6 February 2015
    io.js 2.0.0 4 May 2015
    io.js 3.0.0 4 August 2015
    Node.js 4.0.0 8 September 2015
    Node.js 5.0.0 29 October 2015

    ... everything looks a little bit clearer.
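
The combined timeline is just the two release lists merged and sorted by date. As a rough Python sketch, using the dates from the tables above (the variable names are made up for this example):

```python
from datetime import date

# Release dates copied from the tables above.
node_releases = [
    ("Node.js 0.10.0", date(2013, 3, 11)),
    ("Node.js 0.11.0", date(2013, 3, 28)),
    ("Node.js 0.12.0", date(2015, 2, 6)),
    ("Node.js 4.0.0", date(2015, 9, 8)),
    ("Node.js 5.0.0", date(2015, 10, 29)),
]
io_releases = [
    ("io.js 1.0.0", date(2015, 1, 14)),
    ("io.js 2.0.0", date(2015, 5, 4)),
    ("io.js 3.0.0", date(2015, 8, 4)),
]

# Merge both histories and order them by release date.
merged = sorted(node_releases + io_releases, key=lambda release: release[1])

for name, released in merged:
    print(f"{released.isoformat()}  {name}")
```

Sorted this way, the io.js majors slot neatly into the gap between Node.js 0.12 and 4.0.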

    Java 2.x to 4.x

    I have only ever been a casual observer of Java - I used it a little at University for my Data Structures & Algorithms modules but rarely touched it otherwise. Whenever I needed to think about Java versions it was because it was used by a program I needed to use, and I never quite followed its versioning. I just know that I never really knew what I was using. It was basically "Java" on versions 1.0 and 1.1, then "Java 2 Standard Edition" for a while after ... but retained the "1.x" major version - so versions 1.2, 1.3 and 1.4. Then at some point we were dealing with Java/J2SE 5.0 aka J2SE 1.5.

    Version Date Comment
    JDK 1.0 January 1996
    JDK 1.1 February 1997
    J2SE 1.2 December 1998
    J2SE 1.3 May 2000
    J2SE 1.4 February 2002
    J2SE 2.0 - Never existed
    J2SE 3.0 - Never existed
    J2SE 4.0 - Never existed
    J2SE 5.0 September 2004
    Java SE 6.0 December 2006

    Sun talk a little bit about the version number here, but really here's all you need to know:
    Both version numbers "1.5.0" and "5.0" are used to identify this release of the Java 2 Platform Standard Edition. Version "5.0" is the product version, while "1.5.0" is the developer version. The number "5.0" is used to better reflect the level of maturity, stability, scalability and security of the J2SE.

    The number "5.0" was arrived at by dropping the leading "1." from "1.5.0". Where you might have expected to see 1.5.0, it is now 5.0 (and where it was 1.5, it is now 5).

    This was some years before the concept of semantic versioning came about so I suspect this is mostly just a kind of marketing effort - they realised they'd stuck with 1.x for a while, couldn't really come up with a good path to a 2.x and had kind-of already burned that by using "J2SE", so decided to hit the reset button with 5.0.
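
The renaming rule in Sun's note is simple enough to write down. This is only a sketch of the rule as quoted above (drop the leading "1."); the function name is made up, and it makes no claim about how Sun's own tooling reported versions.

```python
def product_version(developer_version: str) -> str:
    """Map a developer version such as "1.5.0" to its product version "5.0".

    Follows the quoted rule literally: drop the leading "1.".
    """
    prefix = "1."
    if not developer_version.startswith(prefix):
        raise ValueError(f"expected a 1.x version, got {developer_version!r}")
    return developer_version[len(prefix):]


print(product_version("1.5.0"))  # 5.0
print(product_version("1.5"))    # 5
```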

    DirectX 4

    Anyone who played PC games within the last 25 years will have had some interaction with DirectX - the collective name for Microsoft's suite of graphics, sound and networking APIs meant for gaming.

    Version Date Comment
    DirectX 1 30 September 1995
    DirectX 2 5 June 1996
    DirectX 3 15 September 1996
    DirectX 4 - Never released
    DirectX 5 4 August 1997
    DirectX 6 7 August 1998

    After Microsoft released DirectX 3 they launched two successor projects concurrently - one was a smaller, incremental release (DirectX 4) and the other was a larger release with a swathe of new functionality (DirectX 5). Apparently after Microsoft consulted with the community they discovered that there was little interest in the features planned for DirectX 4, and a lot of enthusiasm for DirectX 5. Microsoft simply ceased development on DirectX 4 and focused on delivering DirectX 5. Fun fact: around this time John Carmack called DirectX's D3D "a horribly broken API".

    X-Wing vs TIE Fighter, one of the games to use DirectX 5

    Watcom 5.0 and earlier

    Back in the 1980s Open Source development tools were not yet widespread, so you could buy compilers from a few different companies. One such company was Watcom, which was formed by University of Waterloo students and had a fair amount of success with a Fortran compiler to which they later added a C frontend. Their compilers initially targeted IBM's mainframe architectures (Series 1, System/370 etc) but as the 1980s wore on the IBM PC platform started to take off. In 1988 they launched their first C compiler for the IBM PC - Watcom 6.0.


    Version Date Comment
    Watcom 5.0 - Never existed
    Watcom 6.0 1988
    Watcom 7.0 1989
    Watcom 8.0 1990
     
    At the time the dominant compiler products were the Borland and Microsoft C compilers - both of which were on version 5. In an effort to be seen as competitive with (or perhaps better than) their rivals, Watcom launched its compiler as version 6.0 - literally one-upping the more established competition. Watcom's C compiler was for a brief moment hot property - famed for being one of the better optimising compilers available for the PC, it was widely used in the gaming industry, notably by the newly formed id Software who would go on to use it to build Doom and Doom 2 (more on these at filfre.net). There are also references to Watcom in the Quake source code, but I don't know if that means anything.

    I couldn't find a screenshot of anything related to Watcom 6.0 but here's the Watcom 9.0 debugger "WVIDEO"

    Slackware 5 and 6

    For some reason Slackware was the first ever Linux distro I used. I have no idea why I chose it, but looking back it would've been better if I had picked a more user-friendly one like RedHat or SuSE. Using software on Slackware often involved compiling it yourself, which usually meant hunting down, downloading and compiling dependencies (possibly even encountering more dependencies which require you to hunt down, download and compile sub-dependencies (possibly even encountering more dependencies ... etc)). Between May and October in 1999 they jumped from 4.0 to 7.0.


    Version Date Comment
    Slackware 1.0 17 July 1993
    Slackware 2.0 2 July 1994
    Slackware 3.0 30 November 1995
    Slackware 4.0 17 May 1999
    Slackware 5.0 - Never existed
    Slackware 6.0 - Never existed
    Slackware 7.0 25 October 1999


    The reason for skipping these versions is explained in the FAQ, which is still up:
    I think it's clear that some other distributions inflated their version numbers for marketing purposes, and I've had to field (way too many times) the question "why isn't yours 6.x" or worse "when will you upgrade to Linux 6.0" which really drives home the effectiveness of this simple trick. With the move to glibc and nearly everyone else using 6.x now, it made sense to go to at least 6.0, just to make it clear to people who don't know anything about Linux that Slackware's libraries, compilers, and other stuff are not 3 major versions behind. I thought they'd all be using 7.0 by now, but no matter. We're at least "one better", right? :)
    Let's take a look at the latest version numbers of some of the "other distributions" at the time of Slackware 7.0's release - October 1999:
    • Mandrake 6.1 (released September 1999)
    • RedHat 6.1 (released September 1999)
    • SuSE 6.2 (released August 1999)
    • Debian 2.1 (released March 1999)
    I'm sure you'll see a pretty clear difference - Mandrake, RedHat and SuSE were all commercial Linux distributions in competition with each other. So it's understandable that they wouldn't want to seem out of alignment or behind. As for Debian, it was always more of a hackers distro - it was and still is very much Free Software at its core (which you'll know if you downloaded the default ISO and tried to get your wifi working...) so presumably they felt no pressure to take part in this.

    So going back to Slackware - this always puzzled me. In my opinion Slackware was more similar to Debian than to the commercial distros. I imagine users would've been perfectly content to move on from Slackware 4 to Slackware 5 without worrying about what that version number implied. I don't think it's controversial to suggest that Debian has had the last laugh. Mandrake became Mandriva then largely disappeared. RedHat the company has pivoted to where its distro is not its main focus. SuSE still exists but is relatively niche compared to Debian and its derivative distributions.

    Slackware 4 running KDE 1. Sidenote: that World of Spectrum page looks amazing on what looks like a very old Netscape Navigator

    SuSE, Mandrake and Ubuntu oddities


    There are three other major Linux distributions which seemingly have missing versions - SuSE, Mandrake and Ubuntu. They all have initial releases which are not 0.x or 1.x and which have simple explanations so I've lumped them all together here.

    Version Date Comment
    SuSE Linux 4.0 - Never existed
    SuSE Linux 4.1 - Never existed
    SuSE Linux 4.2 May 1996
    SuSE Linux 4.3 September 1996

    The first release of SuSE Linux was version 4.2. This is a reference to The Hitchhiker's Guide to the Galaxy, where a computer called Deep Thought calculates the answer to "the Ultimate Question of Life, The Universe, and Everything" as 42. Incidentally the first version of its package manager YaST was 0.42.

    Version Date Comment
    Mandrake 4.0 - Never existed
    Mandrake 5.0 - Never existed
    Mandrake 5.1 July 1998
    Mandrake 5.2 December 1998
    Mandrake 5.3 February 1999

    Mandrake Linux's first ever release in July 1998 was version 5.1. It was based on RedHat Linux 5.1 so it's likely they just borrowed that number. Perhaps they intended to stay aligned with RH's releases and wanted to show they were compatible.

    Version Date Comment
    Ubuntu 2.xx - Never existed
    Ubuntu 3.xx - Never existed
    Ubuntu 4.10 October 2004
    Ubuntu 5.04 April 2005
    Ubuntu 5.10 October 2005

    The final Linux distro I'll talk about is a dubious one to include as there aren't really any missing or skipped version numbers. Ubuntu's first ever release was 4.10 - so it would seem that they skipped versions 1, 2 and 3. However Ubuntu's version numbers are based on the year and month of release, so version 4.10 implies a release date of October 2004, 5.04 is April 2005 and so on.
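
Decoding an Ubuntu version is therefore just a matter of splitting it into year and month. Here is a small Python sketch; the function name is hypothetical, and it assumes a plain "Y.MM" or "YY.MM" string with the year counted from 2000.

```python
from typing import Tuple


def ubuntu_release_month(version: str) -> Tuple[int, int]:
    """Return (year, month) implied by an Ubuntu version such as "4.10"."""
    year_part, month_part = version.split(".")
    year = 2000 + int(year_part)   # assumption: years are counted from 2000
    month = int(month_part)
    if not 1 <= month <= 12:
        raise ValueError(f"not a valid month in version {version!r}")
    return year, month


print(ubuntu_release_month("4.10"))  # (2004, 10)
print(ubuntu_release_month("5.04"))  # (2005, 4)
```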

    Conclusion

    I only scratched the surface of this subject, really. If you look around then you'll start to see missing versions everywhere. For example, what happened to Windows 9? How about .NET [Core] 4, MySQL 6 and 7 or Akumulátor 2? As soon as you apply a sequential version to some software you make a statement about how that software relates to any previous or future releases - whether or not you adhere to a strict set of rules like semver. This is generally intended to help users and it largely works as intended, but it does not really give any strong prescription for how to handle abandoned releases or forked-then-remerged projects. As a consequence we have to be cool with the idea that skipping a version or two is perhaps preferable to introducing confusion or even discarding any notion of sequence whatsoever - imagine wondering if it's safe to upgrade from PHP e1daa19 to PHP caf3c64.
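
To illustrate why sequence matters, here is a rough Python sketch that treats versions as tuples of integers. The helper names are made up for this example, and it only handles plain dotted numeric versions, not full semver with pre-release tags.

```python
from typing import List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn "5.6.0" into (5, 6, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def missing_majors(released: List[str]) -> List[int]:
    """List the major versions skipped between the oldest and newest release."""
    majors = {parse_version(version)[0] for version in released}
    return [m for m in range(min(majors), max(majors) + 1) if m not in majors]


# Sequential versions carry an ordering, holes and all.
print(parse_version("5.6.0") < parse_version("7.0.0"))   # True
print(missing_majors(["4.0.0", "5.0.0", "7.0.0"]))       # [6]

# Commit hashes can be compared as strings, but the answer says
# nothing about which build is newer.
print("e1daa19" < "caf3c64")                             # False
```

A hole in the sequence is still easy to reason about, while a pair of hashes gives you no ordering at all.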