JBQ's spot on the Wild Wild Web
The musings of a French mathematician living in the heart of the American technology industry

Sun flirting with Open Source and Open Standards
Two interesting tidbits of news on Slashdot this morning. Both happen to involve Sun Microsystems, but they come from different departments and are not related to each other:

"The Open Office XML Format Will Probably Become an ISO Standard"

In summary, Sun's Open Office group is considering submitting the file formats to ISO in order to create an international standard, in a move that would please the EU.

The first thought that went through my mind was "wait, XML itself isn't an ISO standard", but XML is closely related to SGML (which is an ISO standard) and XML is a very small standard so it could be included entirely in the OO standard.

The second thing I thought was "wait, a standard like that will slow down innovation", with the assumption that a standard creates at least a psychological barrier (if not a real one) that prevents people from implementing extensions that are not covered by the standard. But in the real world a standard does not prevent proprietary extensions. Looking at the examples of C or HTML, it's pretty clear that the standards didn't stop Netscape, GNU or Microsoft from implementing proprietary extensions in their products. Some of those extensions eventually made it into the respective standards, but the standard version of the extension isn't always compatible with the original proprietary version.

The third thing that went through my mind was "wait, having a standard doesn't mean anything". My day job is essentially to write and maintain an HTML browser that is written in C code (and often associated with some C++ code). I am therefore constantly exposed to the reality of those standards. A standard does not guarantee that all implementations are compliant.

On the compiler side of things, it's painfully obvious that compiler vendors don't read the standards, misinterpret them, or have some serious bugs in their products (it often turns out to be all three of those). In many cases it is obvious that those compiler vendors don't have decent QA departments. I am fairly familiar with the ISO C standard, and I sometimes hit compiler issues that are in direct, gross violation of a standard requirement. To make things worse, many compiler vendors don't maintain their code, and there are plenty of compilers out there that are full of well-known and well-documented bugs that the vendors don't fix.

On the HTML side, things aren't much better. Sure, you can write just about any HTML-looking text, pass it to just about any browser less than 10 years old, and you're likely to get something that looks more or less like what you expected, but if you start to look at the details everything becomes nightmarish: table layout and display is probably the worst of all, followed closely by the handling of invalid markup. The interaction between markup and CSS is often a wild guess, as is the behavior of floating objects, the way object tags work, or even the layout and display of special and invalid characters and, in general, the handling of character sets. And the standard itself is very poor: imprecise in places, convoluted in others, sometimes self-contradictory.

All that was to say, essentially, that in my opinion having a standard for something as complex as an office suite format is of questionable value. It could very well end up having a "feel good" value, the kind that makes program managers feel warm and fuzzy but that engineers know is worthless. It could even have a negative value, with people attempting to write compatible software and relying so much on the standard that they'd forget to do reasonable interoperability testing.


"The Case for Open Source/Closed Standards"

In summary, Sun is attempting to find ways to open the source of Java while maintaining control over the system by forcing people to pass a predefined test suite before they can ship binaries.

It's very hard to create a valuable test suite that can't be easily bypassed. The value of a test suite comes from how reproducible its results are, and how quickly the suite can be run: if the results aren't reproducible in a short period or if they aren't reproducible at all, the test suite has no value since non-compliance cases can't practically be debugged. On the other hand a highly reproducible test suite will have a very rigid set of inputs and expected outputs, meaning that it is very easy to recognize the tests, hard-code the expected outputs, consistently pass the entire test suite and still not work at all.
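To make that last point concrete, here is a toy sketch in Python. Everything in it is made up for illustration (the five-case `TEST_SUITE`, the "return the square of the input" specification, the function names); it has nothing to do with Sun's actual test suite. It only shows how a rigid set of inputs and expected outputs can be replayed by an implementation that does no real work.

```python
# Hypothetical conformance suite: a fixed list of (input, expected output) pairs.
# The made-up "specification" is simply: return the square of the input.
TEST_SUITE = [(0, 0), (1, 1), (2, 4), (3, 9), (10, 100)]


def honest_square(x):
    """An implementation that actually does what the specification says."""
    return x * x


# A cheating implementation: it recognizes the suite's inputs and replays
# the expected outputs without implementing the behavior at all.
_CANNED_ANSWERS = dict(TEST_SUITE)


def cheating_square(x):
    # Anything that is not one of the suite's inputs gets a bogus answer.
    return _CANNED_ANSWERS.get(x, 0)


def passes_suite(impl):
    """Run every test case and report whether all of them pass."""
    return all(impl(arg) == expected for arg, expected in TEST_SUITE)


if __name__ == "__main__":
    print("honest passes:  ", passes_suite(honest_square))
    print("cheating passes:", passes_suite(cheating_square))
    print("cheating on an input outside the suite:", cheating_square(7))
```

Both implementations pass the suite, yet only one of them works on an input the suite never tries. A real suite would be far larger, but the larger and more rigid it is, the easier its cases are to recognize.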

The killer sentence in the article is "By requiring that any derivative works pass the test suite, Sun could ensure that no one could publish derivative versions of Java that were incompatible with their version" but that is total nonsense. The reality would be "By requiring that any derivative works pass the test suite, Sun could ensure that no one could publish derivative versions of Java that did not pass the test suite" and this is obviously a much weaker statement, not worth anything.

And that's only the beginning. A test suite won't prevent extensions. Nothing in the license (as I read its sketch) would prevent Microsoft or anybody else from implementing MSJVM extensions again. It could be done in a way such that the entire test suite passes, even such that Microsoft's implementation runs all or almost all Java applications (probably more than what some other careless engineers would do), and yet have some extensions that would make applications developed with Microsoft's environment totally incompatible with other Java environments. Microsoft could be sneaky and introduce gratuitous incompatibilities, or they could actually introduce extensions that have value. The big problem here is that what has value for Microsoft's customers or for Microsoft themselves (which are often the same thing) doesn't necessarily have value for Sun.
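The same thing can be sketched as a toy in Python. All the names here (`ReferenceRuntime`, `ExtendedRuntime`, `concat_upper`, the two-case suite) are hypothetical and don't correspond to any real Java or MSJVM API. The point is only that a conformance suite checks what the reference platform specifies, so it has nothing to say about what a vendor adds on top.

```python
class ReferenceRuntime:
    """Hypothetical stand-in for the reference platform's API."""

    def concat(self, a, b):
        return a + b


class ExtendedRuntime(ReferenceRuntime):
    """Hypothetical vendor runtime: the same API plus one proprietary extension."""

    def concat_upper(self, a, b):
        return (a + b).upper()


def conformance_suite(runtime):
    """The suite only exercises the API that the reference platform specifies."""
    return runtime.concat("ab", "cd") == "abcd" and runtime.concat("", "x") == "x"


def vendor_app(runtime):
    # An application written with the vendor's tools relies on the extension.
    return runtime.concat_upper("ab", "cd")


def runs_on(app, runtime):
    """Report whether the application runs at all on the given runtime."""
    try:
        app(runtime)
        return True
    except AttributeError:
        return False


if __name__ == "__main__":
    for runtime in (ReferenceRuntime(), ExtendedRuntime()):
        name = type(runtime).__name__
        print(name, "passes suite:", conformance_suite(runtime),
              "| runs vendor app:", runs_on(vendor_app, runtime))
```

Both runtimes pass the suite, but the application written against the extended one only runs there.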

Even worse, such forking is a nightmare of forward-compatibility, and if I remember correctly this was an issue when Microsoft had their Java extensions: when Sun released a new version of Java (might have been 1.2, I'm not sure), Microsoft's VM, which was an extension of a previous version (might have been 1.1, I'm not sure), was obviously incapable of running the applications developed for the newest version of Sun's Java. Similarly, Sun's Java was incapable of running applications written for Microsoft's VM.

I have the feeling that this is another "feel-good" attempt by Sun, but in this case there's a very definite risk of a negative impact. I would much rather have Sun distribute the source under a "no redistribution" license, along with a program that allows people or organizations to create custom versions for internal use and to submit changes back to Sun, which would then consider them for inclusion in the official version of Java.
Posted on 27 Sep 2004