Google has claimed that Microsoft's proposed Office Open XML document standard is unnecessary and should be rolled into the rival OpenDocument Format.
In a Monday post on Google's official blog, open-source programs manager Zaheda Bhorat said the issue affects everybody who uses editable documents.
"A document standards decision may not matter to you today but, as someone who relies on constant access to editable documents, spreadsheets and presentations, it may matter immensely in the near future," wrote Bhorat.
Document formats are shifting towards the use of the Extensible Markup Language (XML), which allows types of data to be defined and tagged within documents. The OpenDocument Format (ODF) has already been ratified by the International Organization for Standardization (ISO), but Microsoft's alternative, Office Open XML (OOXML), is currently undergoing its second attempt to gain ISO approval.
Google's technical analysis of the OOXML specification — which notoriously runs to 6,000 pages of code, compared with ODF's 860 pages — has led the company to believe that "OOXML would be an insufficient and unnecessary standard, designed purely around the needs of Microsoft Office", Bhorat claimed.
OOXML is this week being debated for the second time by ISO, having failed to gain approval in a preliminary vote held six months ago. A ballot will not be held during the meeting, but the various national standards bodies that voted in September are being invited to adjust their positions, if they wish, by the end of March.
"We join the OpenDocument Format Alliance and many other experts in our belief that OOXML doesn't meet the criteria required for a globally-accepted standard," Bhorat wrote. "As ISO member bodies around the world work on possible revisions of their vote previously submitted, the deadline of 30 March approaches fast. I invite you to pay close attention and heed the call of many for unification of OOXML into ODF."
A Microsoft spokesperson defended OOXML by saying that its customers "have told us their data needs can't be addressed by a one-format-fits-all approach".
"Everyone wants to use their data in slightly different ways," said Microsoft's spokesperson. "Furthermore, multiple standards can foster a healthy, competitive industry. By developing tools like the [Office] Open XML-ODF translators and making them widely available, we are promoting customer choice, which is our top priority."







Talkback
Contrary to what the article claims, this is still the first time that ISO is dealing with the standardization. There was an earlier vote in this process, but that was more of an intermediate vote; those votes are still valid now but can be altered based on the improvements currently being made in the standardization process.
It should be noted that Google has built its Google apps around the competing ODF format, which is simpler and has less functionality than the OOXML format. Therefore Google apps cannot compete on functionality with applications using the full possibilities of OOXML.
Especially things like the easy business data integration possibilities in OOXML, which are lacking in ODF.
So it is in Google's interest to be against the standardization of a format that they are not supporting themselves.
Also interesting is that the ODF format was standardized prematurely in order to be faster than OOXML. It is therefore not finished yet. OASIS, which maintains ODF, suggests that a next version of ODF (1.2) that is more complete could become an ISO standard in the summer of 2009 at the earliest.
Customer choice is the antithesis of a standard - if there are two "standards", there is no standard.
ISO already has a document standard. It may well be good for this to evolve and take on board new capabilities, but it would be counterproductive to allow additional "standards".
The document formats we have used for years have been driven by the proprietary, commercial needs of the supplier. The usual standardisation in businesses on the MS Office formats has been driven by the overwhelming commercial success of Office in destroying the competition - buy Office or you cannot edit the file.
The world has moved on and we now have the opportunity to have standards for the benefit of users rather than a forced "choice" of a specific product to a particular company's commercial benefit.
I do not mind whether the document standard is based on ODF or MS Office, but it is of fundamental importance that there be just one standard, and that it is fully open so that any office product can support it.
JPL
Given that Google apps are based on the ODF standard, a rival to Microsoft’s OOXML, I couldn’t have expected a fair assessment of OOXML from a Google employee.
Having said that, why is it that Microsoft feel they have to have their own standard for everything?
"Multiple standards can foster a healthy, competitive industry. By developing tools like the [Office] Open XML-ODF translators and making them widely available, we are promoting customer choice, which is our top priority." ‘said Microsoft's spokesperson’
No Microsoft, you’re just complicating things, you might even try to charge people for those translator tools. There’s choice between Zoho and Google docs and yet they use the same standard.
If you really want to help, you could contribute to the existing standard.
In this case and others, such as HTML or electrical sockets, the claim that multiple standards will foster competition simply is not true. It's an international document standard, useful for archiving public documents and such, assuring that they will always be freely accessible.
If Microsoft's customers need something better, fine. Put it in Office 2009, make it the default format and call it OpenDRM or whatever. Who cares? In addition to this, Microsoft can still support the already existing ISO certified document format, ODF.
I thought Jim Zemlin, the Executive Director of the Linux Foundation had a funny quote:
"It's akin to Microsoft going to the United States Congress and proposing alternative bumper heights."
So you think Google could have based Google apps on a proprietary, closed undocumented Microsoft format.
I do not know if that would have made Microsoft very happy or very unhappy.
It would have made Google very stupid, however.
I think you should read a lot more.
Try consortiuminfo.org. They have followed the whole story from the beginning.
For instance:
http://www.consortiuminfo.org/standardsblog/article.php?story=20080224143425160
With some cutting and pasting.
Here is the conclusion.
>> If the eligible members of ISO/IEC JTC1 vote not to approve OOXML, then OOXML will still be an Ecma standard, and all of the benefits to Microsoft customers and developers will still be preserved. Microsoft will also reap the principal benefits that OOXML can provide for it: its developers will be more likely to continue to support Office, and new developers will doubtless become motivated to become part of that environment. In short, a vote against OOXML does not deprive either the marketplace or Microsoft of the value of OOXML having been made public, and all of the changes already made by Microsoft will still bear fruit.
>> But if the National Bodies vote to approve OOXML, what then?
>> If they do, OOXML will achieve titular parity with ODF in the eyes of legislators around the world, most of whom will lack the existing knowledge and the time and interest to learn whether there would still be a reason to prefer products that implement ODF over OOXML. Presumably, the high water mark of interest in ODF would have passed, and the credibility of ODF-compliant products, as well as the importance of open document formats in general, would begin to recede from public and legislative view.
>> Microsoft, like any other publicly held company, would then have no incentive at all to consider moving even one step farther down the path to openness with OOXML than it had on the date of the vote, except to the extent compelled to do so by the European Commission – a glacial process, as witnessed by the more than nine-year duration of the EC’s last prosecution. Microsoft would not have even the incentive to fully implement OOXML, nor to agree to implement any later Ecma-approved change that it did not find to its liking. Nor to work towards merging ODF, OOXML and UOF (the Chinese open document standard). And then we would be back where we started.
>> Perhaps most tellingly, neither Microsoft nor any other dominant vendor would be any more likely to cooperate in the creation of another Civil ICT Standard that threatened its hegemony than Microsoft has done in the past. There is an historical antecedent for this as well, because Microsoft stood aside rather than join the working group in OASIS that created ODF, despite the fact that it held a seat on the Board of Directors. Had it chosen to participate rather than bet that the ODF effort would fail, we might have one standard today instead of two, and everyone would be better off, including Microsoft’s customers and ISVs. I believe that this is the type of behavior that government should encourage, rather than the opposite.
>> What is needed for the future is a commitment by governments to ensure that proper Civil ICT standards are created and adopted. I believe that this will happen sooner or later, and the question is only how it will be accomplished. Too often, industry holds out as long as it can, until legislators finally act legislatively, usually long after the point in time at which the public would best have been served (e.g., in the United States, where domestic car manufacturers successfully resisted an increase in government-mandated fleet mileage efficiency requirements for over 20 years).
>> If industry (and not just Microsoft) wishes to preserve its freedom to act, and indeed if the formal global standards infrastructure itself wishes to retain a role in the process of creating Civil ICT Standards at all, then each would be wise to consider the fact that a vote against OOXML is a vote not just to serve the public interest, but also a vote to preserve the right of self regulation. While ISO and IEC lack the treaty recognition of the ITU, they have traditionally enjoyed quasi-governmental status nonetheless. With that privilege comes responsibility to serve the public, or to lose the credibility of their imprimaturs entirely.
Regards
Really. A Microsoft spokesperson said that customers have told them that their data needs can't be addressed by a one-format-fits-all approach? I just started giggling and laughing at that! Come on! Microsoft HAS pushed a one-size-fits all approach! That IS what Office represents! That IS what Windows represents! Here we go, let's pooooooouuuuuur in the features! You don't need anything but Microsoft products! No, it's "Our size fits all," or "One size to rule them all."
What Microsoft doesn't like is not being in control. If they can get OOXML pushed through then they can use the usual "embrace, extend, eliminate" methodology to marginalize ODF and anyone else attempting to implement OOXML support. Remember Java? I'm sure they'll try it anyway, but if OOXML is approved then they can do it using approval as pretense.
And, "Multiple standards can foster a healthy competitive industry"? Really? How? And does ODF not support any kind of extension? At all? Let's see. It's XML...
And they promote "choice" by providing translators? Hah! No, they maintain control. Somehow the translators are released a bit later than the official new OOXML revisions put into Microsoft's products. Hmmm. How did that happen? Huh. Musta been an accident...
Microsoft uses the same playbook over and over and over again. Hopefully the approval committee will understand that.
I was stating the motivation for Google to be against the format. They are using a competing format that is less feature rich. That is why Google is against OOXML becoming an ISO standard, because then Google will have to compete more against office software that has features Google apps cannot provide (at least for a while).
You suggest that Google cannot use OOXML, but the specs have been publicly available for more than a year now. Also, Google is able to interpret the old binary Office formats, whose specifications are a lot worse than the OOXML spec.
The ODF format Google is using cannot supply what is needed for a faithful representation of most current documents, so when converting to ODF you will lose information. The OOXML format does allow faithful representation of the data in current MS Office files after conversion. This is very important for the many current Office users, who have billions of existing Office documents. Also, the OOXML format is a lot more feature-rich than ODF and can support the features currently used in MS Office. Supporting the features of MS Office in a simpler format like ODF would require extensive MS extensions to ODF, which would cripple ODF for any other users. Even now it is difficult for the different ODF implementations to interpret OpenOffice proprietary features in ODF. This would become a lot more difficult if ODF documents were to consist more of proprietary extensions than of official standard-compliant code.
Btw, referring to consortiuminfo is like referring to the Linux Foundation, as Andy, who writes that blog, is a director of the Linux Foundation and a lawyer working with IBM in an anti-OOXML crusade. He is also participating in the anti-OOXML meeting in Geneva at the moment, sitting on a panel with IBM VP Bob Sutor.
So I read the consortiuminfo blog as an IBM blog (of course, it very visibly often refers directly to the blogs of IBM employees like Bob Sutor).
[quote]In addition to this, Microsoft can still support the already existing ISO certified document format, ODF. [/quote]
ODF's limited spec can't support all MS Office features unless Microsoft goes on a major extending trip. Also, the format does not contain performance enhancements like OOXML does. So using ODF would mean, for Microsoft, a very controversial extending of ODF (I can hear the protest already) and still having to use a format that performs a lot worse (especially in large spreadsheets).
It would mean downgrading MS Office. Nice for MS Office competitors who are unable to compete with Microsoft on implementing Office software.
I am hearing the statement that OOXML is more featureful than ODF a lot, but the only actual examples of extra functionality that people ever care to name are Digital Restrictions Management (DRM) ones. And DRM is something that I wouldn't miss one little bit. (Hint: Someone has always come along and broken each DRM scheme, e.g. CSS and AACS for DVDs. So all that DRM has ever really been good at is thwarting interoperability.)
You can debate the differences between the open ODF and the partially open OOXML from now to eternity. Fact is, Microsoft has had enough time to spread enough money around, or to make the appropriate threats, to get OOXML approved as a standard. They will continue to slow innovation and force an unwitting public to use their products, as they choke out the competition.
Whoa! Albert, XML by nature is not going to be as efficient as a binary encoding, assuming you follow the spirit of XML. Sure you can do a base64 dump of binary and stick tags around it for example, but that isn't really XML's intention. I've written an XML/object translation library and I know of what I speak.
So what are they doing to "speed up" large spreadsheet loading and saving, or anything else? Tags are tags, and the encoding/decoding needs to write/interpret them. The only way to speed up XML interpretation is to get rid of tags by encoding multiple pieces of information for a particular tag. That introduces portability problems for obvious reasons and defies the intent of XML.
By the way, the controversy over extending things never stopped Microsoft in the past. Again, remember Java?
Lastly, how would it mean downgrading MS Office? If they want speed, use binary. If they want portability, use XML. Those two really don't mix. It's a trade-off. Do both. Doing "performance enhancements" in OOXML, which I suspect is tag-glomming, is NOT proper XML.
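The speed-versus-portability trade-off described above can be made concrete with a small size comparison. This is only a sketch under stated assumptions: the `row` and `cell` tag names and the binary layout (a 4-byte count followed by 8-byte floats) are invented for illustration and are not taken from ODF, OOXML, or any Office format.

```python
import struct
import xml.etree.ElementTree as ET

def encode_xml(values):
    # One tagged element per value; tag names are hypothetical.
    root = ET.Element("row")
    for v in values:
        ET.SubElement(root, "cell").text = repr(v)
    return ET.tostring(root, encoding="utf-8")

def decode_xml(data):
    return [float(c.text) for c in ET.fromstring(data).iter("cell")]

def encode_binary(values):
    # Hypothetical layout: little-endian uint32 count, then float64 values.
    return struct.pack("<I", len(values)) + struct.pack(f"<{len(values)}d", *values)

def decode_binary(data):
    (count,) = struct.unpack_from("<I", data, 0)
    return list(struct.unpack_from(f"<{count}d", data, 4))

values = [i * 0.5 for i in range(1000)]
xml_bytes = encode_xml(values)
bin_bytes = encode_binary(values)
print(len(xml_bytes), len(bin_bytes))
```

Both encodings round-trip the same numbers; the tagged form is simply larger and has to be parsed as text, which is the cost being argued about. It says nothing about how either real format behaves once zipped.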
You suggest ODF is more open than OOXML but in reality they are equally open.
Although Sun and IBM having had a clear majority of votes in the OASIS TC on ODF for 5 years is maybe less open than MS having only a single vote in Ecma. The IBM/Sun power base in OASIS was shown in their refusal to add elements that would allow better compatibility with existing office documents, because those elements would not fit in OpenOffice.
Or in purposely NOT using the W3C MathML 2.0 schema in ODF because OpenOffice does not support MathML 2.0 yet, even though the ODF spec states that ODF implementations should support v2.0.
Just to name a few:
* ODF does not allow for compatibility with existing office binary files so that you cannot faithfully convert those to ODF files
* ODF does not have the easy integration of custom (business) data that OOXML provides
* ODF does not have performance enhancements to allow for fast processing of for instance large spreadsheets
* ODF does not allow integration of Math and Office tags, which OOXML does. For instance, OOXML can track revision changes in math formulas
* ODF has much more limited grafical support than OOXML
* OOXML package files contain relationship information which shows all interpackage and external dependencies. If you want to change a URL in an OOXML file you need only change the relationship information and not scan the entire file to look for all places the URL might be (as you would in ODF).
And there are a lot lot more, but you would need a real expert for that
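The relationship point in the list above can be sketched in a few lines. The example builds a toy zip package in memory and changes an external URL by touching only the relationships part. The relationships namespace and the `Id`/`Target`/`TargetMode` attributes follow the Open Packaging Conventions as I understand them; the `<doc>`/`<link>` body is a made-up stand-in for real document markup, and the URLs are placeholders.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
HYPERLINK = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink"
RELS_PART = "word/_rels/document.xml.rels"
BODY_PART = "word/document.xml"

def make_toy_package():
    # The body refers to the link only by relationship id, never by URL.
    rels = (
        f'<Relationships xmlns="{REL_NS}">'
        f'<Relationship Id="rId1" Type="{HYPERLINK}" '
        'Target="http://old.example.com/report" TargetMode="External"/>'
        "</Relationships>"
    )
    body = '<doc><link ref="rId1">quarterly report</link></doc>'  # hypothetical markup
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr(RELS_PART, rels)
        z.writestr(BODY_PART, body)
    return buf.getvalue()

def external_targets(package_bytes):
    # Read only the small relationships part, not the document body.
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as z:
        root = ET.fromstring(z.read(RELS_PART))
    return {
        r.get("Id"): r.get("Target")
        for r in root.iter(f"{{{REL_NS}}}Relationship")
        if r.get("TargetMode") == "External"
    }

def retarget(package_bytes, rel_id, new_url):
    # Copy every part unchanged except the relationships part.
    ET.register_namespace("", REL_NS)
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as src, zipfile.ZipFile(out, "w") as dst:
        for name in src.namelist():
            data = src.read(name)
            if name == RELS_PART:
                root = ET.fromstring(data)
                for r in root.iter(f"{{{REL_NS}}}Relationship"):
                    if r.get("Id") == rel_id:
                        r.set("Target", new_url)
                data = ET.tostring(root)
            dst.writestr(name, data)
    return out.getvalue()

pkg = make_toy_package()
updated = retarget(pkg, "rId1", "http://new.example.com/report")
print(external_targets(updated))
```

Whether this indirection is an advantage over ODF in practice depends on how a given ODF producer stores its links, which this toy does not model.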
[quote]So what are they doing to "speed up" large spreadsheet loading and saving, or anything else? Tags are tags, and the encoding/decoding needs to write/interpret them. The only way to speed up XML interpretation is to get rid of tags by encoding multiple pieces of information for a particular tag.[/quote]
OOXML uses much shorter tag names for the most common tags. This allows for faster unzipping and less memory use than an ODF file when parsing the XML against an XML schema.
OOXML spreadsheet files optionally allow for tracking which cells have formulas in them. This means that when you load a spreadsheet file with 10 million data fields and only 100 formula fields, the spreadsheet implementation knows immediately which cells with formulas need to be recalculated, without checking all fields for formulas at least once as an ODF spreadsheet implementation would have to.
OOXML packages have relationship information showing all internal and external dependencies within an OOXML package. This means you need only scan a minor fraction of the OOXML file to get this information, whereas in an ODF file this info could be spread all over the file and requires parsing of all parts in the package.
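The formula-tracking claim above amounts to keeping a side index of formula cells next to the cell data. The sketch below is a toy model of that idea only: the element names (`sheet`, `c`, `f`, `v`, `formulaIndex`, `ref`) are hypothetical and are not the actual markup of either specification.

```python
import xml.etree.ElementTree as ET

def build_toy_sheet(n_cells, formula_every):
    """Return (sheet, index): one column of cells plus a separate list of formula cells."""
    sheet = ET.Element("sheet")
    index = ET.Element("formulaIndex")  # hypothetical side index
    for i in range(1, n_cells + 1):
        ref = f"A{i}"
        cell = ET.SubElement(sheet, "c", r=ref)
        if i % formula_every == 0:
            ET.SubElement(cell, "f").text = f"A{i - 1}*2"
            ET.SubElement(index, "ref", r=ref)
        else:
            ET.SubElement(cell, "v").text = str(i)
    return sheet, index

def formulas_by_scan(sheet):
    # Without an index: every cell must be visited once.
    visited, found = 0, []
    for cell in sheet.iter("c"):
        visited += 1
        if cell.find("f") is not None:
            found.append(cell.get("r"))
    return found, visited

def formulas_by_index(index):
    # With an index: only the listed formula cells are visited.
    found = [entry.get("r") for entry in index.iter("ref")]
    return found, len(found)

sheet, index = build_toy_sheet(10000, 1000)
scan_refs, scan_visits = formulas_by_scan(sheet)
index_refs, index_visits = formulas_by_index(index)
print(scan_visits, index_visits)
```

Both routes find the same cells; the index just avoids the full pass. The price is that whoever writes the file has to keep the index consistent with the cells, and a reader that has already parsed the whole document could build the same index itself.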
Thank you, Albert, for an intelligent response, and for taking the time to do so. So let's discuss this. I'll try to keep it short and to the point.
Tag name length is a matter of debate to be sure. XML by nature, however, is intended to allow for text editing. Too short names introduce possible confusion over intent, and possible name wrangling in the future over additions. Having a tag like "CannonicalCellContents" is ridiculously long, but having a tag like "C" is ambiguously short.
That's why, in my opinion, discussing XML efficiency in terms of tag length is a red herring. XML is what it is, and truly there is a faddishness about it. That is why, again in my opinion, if one is concerned about tag creation/parsing performance, one should go with a binary format. It's not like one cannot create portable binary. Lua does it. I've worked with the byte code and it handles nuxi issues just fine. I'm sure Java does as well. I'd be for BODF (Binary ODF). The design also isn't particularly complicated. (Again, look at Lua.)
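For readers unfamiliar with the "nuxi" problem mentioned above: it is the byte-order mix-up in which the text "unix", stored as two 16-bit words, comes back as "nuxi" on a machine with the opposite byte order. A minimal sketch, assuming a made-up record layout (it does not depict Lua's or Java's actual bytecode format), shows the problem and the usual fix of declaring the byte order in the format itself:

```python
import struct

def nuxi_demo(text=b"unix"):
    # Read the bytes as two big-endian 16-bit words, then write the same
    # words back in little-endian order: the bytes in each pair swap.
    words = struct.unpack(">2H", text)
    return struct.pack("<2H", *words)

def write_record(count, value):
    # "<" fixes the byte order (little-endian) and disables padding, so the
    # bytes are identical no matter which machine writes them.
    return struct.pack("<Id", count, value)

def read_record(data):
    return struct.unpack("<Id", data)

print(nuxi_demo())
print(read_record(write_record(3, 1.5)))
```

Once the byte order is part of the format definition rather than an accident of the writing machine, a binary file is as portable as a text one; what it gives up is being readable in a text editor.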
As for formula tracking, I understand your point, although I'd have to see the tracking mechanism to be able to discuss this intelligently. On the other hand, I'm assuming that with OOXML you need to read in the whole document before doing any processing since tags in general can occur in any order, so what you are really describing is the internal representation of the document, and that doesn't have to have much of any relationship to the XML at all. The real issue here is whether there is any unnecessary duplication of tag values in the XML for formulas. If there is, then a reference mechanism should have been put into ODF, and that is certainly something that could be presented to the committee.
Lastly, as for relationship information, again I'd have to see the scheme, but again, if it isn't outlandish/Microsoftish that could also in some way end up in the standard. You said that with ODF it is spread all over the document. Depending on how this is used, this implies that some XML structure rearrangement should have been done. And again, that is a reasonable thing to present to the committee. In the end, structured documents are databases, and are subject to the same indexing issues as databases.
To conclude, however, both the formula issue and relationship information issue you present are abstract issues, and a design should be approached from that standpoint. I suspect the problem here is that it wasn't.
How could it ever be? Are you expecting bug-for-bug compatibility? Are the binary Microsoft formats even documented? Besides, not even today's MS Office is backwards compatible with all the old Office formats so that's hardly a new issue. Basically, you leave the job of converting old documents to new formats to conversion tools, and you are probably right to fear that only Microsoft can write a 100% reliable conversion tool from any of its old formats to ODF. But this is actually a criticism of *Microsoft* rather than ODF itself.
And how on Earth do you expect a *file format* to have "performance enhancements"? How does a file format even "perform"? I think you are confusing the format with a program that *implements* that format.
I wouldn't expect ODF to incorporate Office tags either; I just expect to be able to use ODF to represent a document that I wish to create.
As for the rest of your items, they are a bit vague. E.g. "easy integration"? "More limited grafical[sic] support"? I'm not familiar with the intricacies of either ODF or OOXML, but "more limited" could just be a euphemism for "doesn't support a boatload of useless bells and whistles".
These people say that even MS Office 2007 cannot create documents in OOXML, which makes the format about as useful as a chocolate teapot:
http://www.zdnet.co.uk/talkback/0,1000001161,39348275-39001068c-20091749o,00.htm
http://www.zdnet.co.uk/talkback/0,1000001161,39348275-39001068c-20091752o,00.htm
http://www.zdnet.co.uk/talkback/0,1000001161,39348275-39001068c-20091755o,00.htm
Talkbacks from this story:
http://news.zdnet.co.uk/software/0,1000000121,39348275,00.htm
The way I look at this particular issue is that Microsoft would be insane to build OOXML into Office, since it hasn't been ISOd yet - and may require alterations to gain that honour. Then again, if they were really as confident about its wonderfulness as they claim, you'd think it'd be in there...
What is it supposed to be reading? Just empty files?
Actually ODF also has several means for backwards compatibility with OpenOffice. The Office settings, for instance, or the omission of the current MathML schema so that the old MathML version used in OpenOffice can still be used. Or the draw functionality, which is a carbon copy of the OpenOffice draw functionality and not very generic at all (although support for SVG is claimed, it does not exist in ODF).
It is very strange that there is a lot of fuss about OOXML providing backwards compatibility features for billions of existing documents but little comment on ODF providing backwards compatibility with OpenOffice for a fraction of that document base.
Those people are incorrect.
MS Office 2007 produces a very good implementation of OOXML.
In fact the MS Office 2007 implementation of OOXML is a lot better than any existing implementation of ODF.
The MS Office 2007 implementation of OOXML may not be 100% perfect, which is possibly what starts those suggestions, but it is a lot better and more complete than OOo, which does not implement a number of ODF tags, or does not implement them fully, and also implements several things incorrectly. (For instance, OOo is using a modified MathML 1.0 version and ODF requires MathML 2.0.)
Also I should note that perfect, bug-free support of a complex format is not required, as even common formats are not supported 100% perfectly and we still call that full support. So I would put the suggestions of those people in a range similar to "Firefox does not support HTML 4" or "OpenDocument does not support ODF".
You suggest that Microsoft could influence the ODF development technical committee. Actually, influencing the development has been tried before by independent ODF supporters like the ODF Foundation, but has failed because IBM and Sun have full control over ODF development.
This control, for instance, guarantees that when new formulas are added to ODF, Sun's StarOffice/OpenOffice and IBM's Lotus Notes support those new formulas.
There is no way that those two companies (controlling 70% of the votes in the OASIS ODF TC) will let Microsoft add features that Microsoft already can support but their applications do not yet have support for.
[quote] The MS Office 2007 implementation of OOXML may not be 100% perfect, which is possibly what starts those suggestions, but it is a lot better and more complete than OOo, which does not implement a number of ODF tags, or does not implement them fully, and also implements several things incorrectly. (For instance, OOo is using a modified MathML 1.0 version and ODF requires MathML 2.0.) [/quote]
From my understanding of the situation, this statement is a little bit misleading. The important difference is that every file written by OpenOffice.org is ODF compliant. MS Office does NOT write compliant OOXML files. When (if?) they do, it remains to be seen whether they'll implement every single feature covered in the 6000 pages of spec.
Many are worried about the line which separates the OOXML “standard” from the MS-OOXML “reality”, where features such as scripts, macros and DRM are used but not documented in the OOXML specification. If Microsoft's “Open XML” is really open, then why is it tied so heavily to a single vendor's product? In comparison, 40 applications are now capable of supporting the ODF specification.
I do hope you're speaking from *personal experience* when you say that Office 2007 writes OOXML files!
You are incorrect.
OOo is less compliant with ODF than MS Office 2007 is with OOXML.
This however is typical rhetoric, as both are good implementations of their respective native formats.
It has however become somewhat of a battle cry for anti-OOXML activists to state this, because they are losing more and more ground on the technical deficiencies. And more and more technical problems are being found with ODF as well.
Then this Rupert is an idiot
I have used Office Open XML format files extensively and also used the XSD schema files that are in the Ecma standard, which I downloaded from the official Ecma standard site.
All the files I created in the Microsoft Office 2007 default file format fully validated against the XML schemas in the OOXML standard.
Can you direct me to the place where this idiot is suggesting otherwise?
This looks like more bogus storytelling from anti-OOXML activists.
But seriously, from what I can tell, "OOXML" is in quite a state of flux (as well it might be, given the more-than-3000 comments to MS' last proposed version). It's unfinished. Stuff like date formats and language definitions are being changed to comply with various ISO standards, obsolete behaviours are being removed to annexes, and all that good stuff.
What _is_ it that Office 2007 saves in .docx? Which version of OOXML? Can I make it save in strict OOXML?
Additionally some info
Here is an example of why OOo does not comply with the ODF spec: http://idippedut.dk/post/2008/01/Do-your-math---ODF-and-MathML.aspx
Here is a list of some applications that support Office Open XML (OOXML):
http://www.openxmlcommunity.org/applications.aspx
Although it is not a complete list, you can see interoperability is not a problem with OOXML. Also, OOXML has only been an open standard for about 14 months and ODF has been an open standard for about double that, so the growth of OOXML support is actually a LOT faster than for ODF.
Also many businesses are supporting OOXML in their products:
Here is an astoundingly long list that already consists of German companies alone.
http://nomina.de/openxml/software_liste.php
So suggesting that OOXML is not interoperable or does not have widespread adoption is incorrect. A lot of support already exists, including a lot of support that ODF has not gained even in twice the time.
Currently there is only one official open standard version of OOXML and that is the Ecma 376 standard.
http://www.ecma-international.org/publications/standards/Ecma-376.htm
MS Office 2007 uses that version and validates against those XML schemas.
Soon there might be a new version of ISO OOXML and that is then likely to become the new Ecma 376 version as well. The ISO version will probably become Ecma 376 v1.1.
You might not have noticed but ODF actually has
ODF 1.0,
ODF ISO,
ODF v1.0 SE (second edition)
ODF v1.1
ODF v1.2 (in draft phase)
Amazingly, the main OOo implementation does not fully support ANY of these ODF versions.
Also, the changes currently planned for v1.2 of ODF are MUCH MUCH MUCH more drastic and complex changes to the format than the total of changes coming out of the ISO standardization process for OOXML.
To call the state of OOXML "a state of flux" might be correct but in comparison the state of ODF might translate as "a state of chaos".
An important reason for this is that for ODF, OASIS is doing maintenance on the standard. For OOXML the maintenance will be done by Ecma and ISO together, so the Ecma and ISO versions will be the same. The OASIS versions actually are not the same as the ISO version, because OASIS is not submitting all versions to ISO. So at the moment not only OOo but actually most ODF implementations are NOT compliant with the ISO ODF version.
Basically, you're saying that OOXML having backwards compatibility with old binary Microsoft formats is a Good Thing, whereas ODF not having such backwards compatibility with Microsoft is a Bad Thing. This is an absurd position because only Microsoft can truly be in a position to leverage its old formats. If ODF contained proprietary Microsoft information then it could not be an open standard.
As for OOXML calling itself "open", there are those who claim that the proprietary Microsoft information within OOXML should actually be considered as a "patent ambush":
http://www.noooxml.org/patents
But personally, I don't care to clutter up a specification with decades of past mistakes. All that is really needed is a decent format-conversion tool, which isn't difficult to create provided you understand both the old and new formats and can map the former correctly into the latter. And that would mean we wouldn't need to carry the old cruft such as "autoSpaceLikeWord95" or "useWord97LineBreakRules" elements around like a millstone around our necks.
A 'standard' is open for anyone and everyone to adopt and use, free of incumberance. It is not propreiery, it is not, per se, software, it is not copyright controlled, and it is not the exclusive property of an individual or company, nor does it have such dependencies.
I'm not, as such, against an OOXML standard develeloped, in part, by Microsoft if it meets the needs of the freedom to be used and adopted without cost and license implications by all and sundry and so long as it provides the basis for future secure document retrieval, not just Microsoft documents, into the foreseeable future.
But there's the rub, history teaches us that Microsoft has it's own agenda, to the detriment of most fair competition and it's very unwise to imagine that the leopard does change its spots. The EU does not seem to think so, so far.
Albert, you're repeating a myth propagated by Microsoft. I'm not in love with either ODF or OOXML. Both are standards in name only.
But the OpenDocument Foundation's da Vinci plug-in for Excel repeatedly loaded a million-row ODF spreadsheet faster than Excel loaded the same spreadsheet rendered in OOXML. The margin decreases with repeated loading due to caching, but ODF is still the winner in that regard.
And if you study the OOXML specification, you will see that much of the markup, e.g., the entire volume in Part 5, specifies verbose markup. If processing speed were actually the criterion, one would expect that all markup would be terse, not just some of it.
It is important to realize that the Office apps still use the binary formats in their internal processes. OOXML support is via conversions performed by the major apps' native file support APIs using plug-ins. OOXML processing speed, to the extent it is a factor at all, is only relevant to the conversion of OOXML to the binary formats and vice versa. OOXML is out of the processing picture entirely until it comes time to convert the file to OOXML when saving.
If you study the OOXML spec, it quickly becomes apparent that the dividing line between terse markup and verbose markup runs between the portions Microsoft originally submitted to Ecma and the unique portions developed through the Ecma standard development process.
The reason is fairly simple. The terse markup is a dump to XML of the markup used in the intermediate formats of the Office apps' native file support APIs, the formats used in the conversion processes by the particular builds that were the beginning point for the standards work at Ecma. It is terse because it was far easier for Microsoft engineers to simply convert that markup to XML.
But the markup created later in the Ecma standardization process is verbose, as XML markup should be. See e.g., <a href="http://www.w3.org/TR/xml/#sec-origin-goals">W3C XML v.1.0</a> (4th ed.).
<blockquote>"XML documents should be human-legible and reasonably clear. ... XML documents shall be easy to create. ... Terseness in XML markup is of minimal importance."</blockquote>
Terse markup mattered in the days when documents were stored on floppy discs or slow hard drives and storage space and memory capacity were barriers, making a necessity of compressed binary file formats that were dumps to file of the in-memory binary representation ("IMBR") of a document. But verbose OOXML markup gets converted to terse markup anyway once the OOXML document is converted to IMBR for internal processing.
The better question is why Microsoft inflicted terse markup on developers. One might plausibly answer that it was to save money and development time or even hypothesize something more Machiavellian.
But the bottom line is that OOXML markup terseness is a bug, not a feature. The terse markup forces developers and authors to constantly refer to reference materials to determine markup functionality and gives scant clues for the electronic archaeologists of the future. OOXML violates the principle that "XML documents shall be easy to create .. [and] be human-legible and reasonably clear."
I tend to ascribe more weight to your words, Rupert, than to a faceless talkback person who only surfaces when OOXML is being discussed. Particularly when that person's activity skyrockets shortly before an important vote.
What I am saying is that a good point for OOXML is its backwards compatibility with the existing Office formats.
For users and customers who want this compatibility this is very positive.
That ODF does not have this compatibility is their own fault. They never made any effort to get the office market leader interested in their format.
The OOo format base of ODF was already decided upon before the first meeting of the TC. And at the first meeting of the Open Office TC (as it was called back then) they stated that compatibility with the existing document base was not needed for ODF. And that is what they have done well: create a format that is not fully compatible with most existing documents.
Maybe next time they should ask the people actually using office formats if they also agree that compatibility is not important.
If you had really looked at the OOXML spec you would have seen that the terseness is mainly used for the tags that are used most, like text, cells and paragraphs, and for the namespaces used in every tag.
These tags are the easiest to remember anyway.
It is like complaining about HTML because the tags are too short.
For the other XML tags the naming convention in OOXML is actually very meaningful.
Also you must have forgotten that ODF actually does the same (although less than OOXML).
In ODF a paragraph is "p" and a header is shortened to "h".
So I guess it is OK to be terse when it is in ODF but not in OOXML.
Oh but wait, Marbux, you were actually a member of the OASIS TC that came up with those terse names for paragraphs and headers, weren't you? I guess then it is OK.
If MS Office 2007 were not compliant then show us all those files which were created in the MS Office 2007 default format and do not comply with the official standard XML schemas.
Rupert bounces words.
I bounce back the official standard specification from Ecma, and I have actually used Altova's XMLSpy to validate those Office 2007 files against the official Ecma standard schemas.
[quote]I'm not, as such, against an OOXML standard developed, in part, by Microsoft if it meets the needs of the freedom to be used and adopted without cost and license implications by all and sundry and so long as it provides the basis for future secure document retrieval, not just Microsoft documents, into the foreseeable future. [/quote]
I couldn't agree more.
[quote] Here an example of why OOo does not comply with ODF spec: link [/quote]
First off, I'm no expert on document formats and not qualified to debate their development. I'm pretty sure that most mathematicians like TEX for displaying mathematical formulas. That would be my choice, but I guess it's not an ISO standard.
Looks to me like the example you give demonstrates an error in how OpenOffice.org displays the MathML given a particular set of circumstances. Like you said, the standard is relatively new, there's bound to be kinks. For the life of me, I can't see how this equates to “OOo does not comply with the ODF specification”. This seems more akin to a software bug to me.
Also, this seems a whole world away from producing documents which contain many elements not specified in OOXML (Ecma 376), such as binary code, macros, OLE objects, ActiveX, DRM, SharePoint metadata and other technology, proprietary or otherwise. I'm pretty sure Microsoft Office 2007 uses some or all of these features.
So I'm wondering what happens if I use one or more of these features and save the file in the *default format*. What the heck gets saved? OOXML (Ecma 376), according to Albert. How can this be if the spec doesn't mention them? What happens if another application sends me an OOXML (Ecma 376) compliant file and I make edits to it? Will it get saved in the same format?
Seems I'm not the only one confused, even amongst the semi-experts in our community. Maybe we can get some clarification from a neutral party?
Goldie, the problem really is more that both ODF and OOXML are signed, blank checks. Neither specifies a minimum set of features that must be supported and both allow vendor- and application-specific extensions. These failings take both outside the definition of a standard: a specification for a standard product that specifies all product characteristics in mandatory form.
They also produce an interoperability nightmare for the same reasons. E.g., if you check Part 4 of the OOXML spec and grep for extLst, you will find 573 "future extension points" whose functionality is unspecified. But implementations that use those extension points are deemed conformant and documents that use them will validate against the schema.
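For anyone who wants to repeat that kind of count without grep, here is a minimal Python sketch. It only counts whole-word occurrences of a token such as extLst in a piece of text. The sample fragment and the `count_token` helper are made up for illustration and are not taken from the spec; the figure of 573 comes from the comment above, not from this code.

```python
import re

def count_token(text, token="extLst"):
    """Count whole-word occurrences of token in text (roughly `grep -ow | wc -l`)."""
    pattern = r"\b" + re.escape(token) + r"\b"
    return len(re.findall(pattern, text))

# Made-up schema-like fragment, for illustration only (not copied from the spec).
SAMPLE = """
<xsd:element name="extLst" type="ExampleType" minOccurs="0"/>
<xsd:element name="extLst" type="ExampleType" minOccurs="0"/>
<xsd:element name="other" type="ExampleType"/>
"""

print(count_token(SAMPLE))
```

To run it against real spec text you would read the schema or document file into a string and pass that to `count_token` instead of `SAMPLE`.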
Likewise, the ODF standard's "conformance" section 1.5 states:
<blockquote>Documents that conform to the OpenDocument specification MAY contain elements and
attributes not specified within the OpenDocument schema. Such elements and attributes must not be part of a namespace that is defined within this specification and are called foreign elements and attributes.
...
There are no rules regarding the elements and attributes that actually have to be supported by conforming applications, except that applications should not use foreign elements and attributes for features defined in the OpenDocument schema.</blockquote>
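To make "foreign elements" concrete, here is a rough Python sketch that lists the elements in a document whose namespace is not one of the OpenDocument ones. It assumes, as a simplification, that every namespace beginning with `urn:oasis:names:tc:opendocument:xmlns:` counts as defined by the spec and everything else is foreign; the real spec also relies on some outside namespaces, so treat this as an illustration only. The `acme` extension and the sample document are invented.

```python
import xml.etree.ElementTree as ET

# Assumption: anything outside this prefix is treated as "foreign".
ODF_PREFIX = "urn:oasis:names:tc:opendocument:xmlns:"

# Invented sample document with one vendor-specific element.
SAMPLE = """<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:acme="http://example.com/acme-extension">
  <office:body>
    <office:text>
      <text:p>Ordinary paragraph</text:p>
      <acme:widget>Vendor-specific data</acme:widget>
    </office:text>
  </office:body>
</office:document-content>"""

def foreign_elements(xml_text):
    """Return the tags of all elements whose namespace is not an ODF one."""
    root = ET.fromstring(xml_text)
    found = []
    for el in root.iter():
        tag = el.tag
        # ElementTree writes namespaced tags as "{namespace}localname".
        ns = tag[1:].split("}", 1)[0] if tag.startswith("{") else ""
        if not ns.startswith(ODF_PREFIX):
            found.append(tag)
    return found

print(foreign_elements(SAMPLE))
```

A document like the sample is still conformant under the wording quoted above, which is the point being made: a second application has no way of knowing what the foreign element means.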
Not surprisingly, OpenOffice.org uses some 150 application-specific extensions and destroys all extensions created by other conformant applications other than paragraphs and text spans.
<a href="http://lists.oasis-open.org/archives/odf-adoption/200709/msg00032.html">As was stated</a> by the lead developer of the KDE KOffice word processor:
<blockquote>One thing I have always dreamed to be possible is that when I write a doc in KOffice I can then open it in OOo to use that one feature that's useful to me and then save it and continue in KOffice without loosing lots of data.
Its still a dream, of course. Most features are lost on opening and saving it in OOo, but its a nice goal[.]</blockquote>
And if you check the OOXML conformance section (in Part 1), you get a series of excuses for not having any conformance requirements at all. Validation in OOXML is a two-step process, with elements and attributes specified by the schema validated by one method and extensions validated by another. The latter step is near worthless, since it can only check whether specified extension points have been used and whether "compatibility markup" is valid. Because extensions are by definition unspecified, the validity of the extensions can not be confirmed; there is no adequate specification/schema against which they could be validated.
Yet ISO/IEC JTC 1 Directives require that:
<blockquote>Standards designed to facilitate interoperability need to specify clearly and unambiguously the conformity requirements that are essential to achieve the interoperability. Complexity and the number of options should be kept to a minimum and the implementability of the standards should be demonstrable.</blockquote>
Both ODF and OOXML flunk that test badly. Their interoperable implementation neither has been nor can be demonstrated. Both are designed for the waging of feature wars, not for interoperability. Both attempt to legitimize market-leading companies embracing and extending their own formats. They are standards in name only. What we are watching is a contest to decide which big vendor formats will be allowed to undeservedly claim the title of "international standard."
ODF wins a point because it is not patent encumbered, whereas the OOXML IPR documents are RAND-Z, requiring a negotiated patent license from Microsoft. But openness is pretty irrelevant without interoperability and it is an utter myth that an open format is necessarily interoperable. Indeed, because both ODF and OOXML allow vendor-specific extensions, one might observe that neither is an open format. Both permit extensions whose fu
You're really reaching, Albert. I didn't say that ODF uses no terse markup. That wasn't the point you raised. You said the reason for the terseness in OOXML was for faster processing. It is not. As I said, the Office processors do not process the documents in OOXML. OOXML is only a read-write format accomplished through conversions. The processing is done in the IMBR version of the same old binary formats.
And yes, I was a member of the ODF TC, although not long enough to take credit for the few terse tags in the spec. But I jumped ship on ODF and the TC when the TC refused to do anything about the interoperability warts and actually made interoperability worse for apps implementing the forthcoming ODF v. 1.2.
I don't condone terse XML markup regardless of whose spec calls for it. But you exaggerate how little of OOXML is terse markup. It's a lot. I do note that your new argument also tends to undercut your previous argument that the terse OOXML markup was necessary for processing speed. If there's very little terse markup in OOXML as you now claim, then it sounds to me like you are in effect arguing that there's so little terse markup in OOXML that it wouldn't affect the processing speed anyway.
You can't have it both ways. So please take a firm position instead of dodging a principled discussion. Do you still contend that the terse markup in OOXML is necessary for processing speed? If so, how do you reconcile that position with your new argument that there is very little terse markup in OOXML?
By the way, I am not an ODF fan. Neither ODF nor OOXML deserves to be an international standard. Interoperability is a threshold technical and legal requirement for international standards. Neither ODF nor OOXML enables interop.
--Buck "Marbux" Martin
<a href="http://www.ui-council.org">Universal Interoperability Council</a>
You claim that OOXML enables interoperability. Please provide me with a single example of two OOXML implementations capable of non-lossy round-trip interoperability that do not require application-specific work-arounds such as those employed by the MindJet mindmapper to establish interop with MS Word.
Please do not bother giving Microsoft Word and Microsoft Sharepoint as examples. Their interoperability depends on a concurrent push of data moving between the apps using the proprietary and binary FrontPage protocols.
It's the same with OOXML as it is with ODF. If you want interop, everyone has to use the same app and app version. That is not standards-based interoperability among competing apps.
Buck "Marbux" Martin
<a href="http://www.ui-council.org">Universal Interoperability Council</a>
Terse markup provides for faster unzipping and zipping and less use of internal memory space whilst unzipping and parsing the XML file.
The data might be held in memory much more efficiently later on, but at the start of processing it isn't.
This is only noticeable with very large files and in batch handling of large numbers of OOXML files. OOXML's design is better suited to accommodate those large files, or to process a lot of files in batch processes, such as with custom data. It shows that Microsoft was preparing for much larger files to be used in office software and is thinking of other ways the document can be produced, not just by their own office suite.
So terse markup does add performance. Not whilst the document is in internal memory, but in the opening and saving of documents and in parsing the XML.
[quote]If there's very little terse markup in OOXML as you now claim, then it sounds to me like you are in effect arguing that there's so little terse markup in OOXML that it wouldn't affect the processing speed anyway. [/quote]
Especially in large files with a lot of data, only a few tags will make up 90% of all tags. By making those terse you improve speed most.
So instead of you can use and instead of you could use .
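The size side of this argument is easy to try for yourself. Below is a small Python sketch that builds the same synthetic data twice, once with a one-letter tag and once with a longer tag, and reports the raw and zlib-compressed sizes (zlib uses DEFLATE, the compression method ZIP containers normally use). The tag names `c` and `table-cell`, the row count and the helper names are chosen for illustration and are assumptions, not markup lifted from either spec. It measures size only, not parse time, so it cannot settle the speed question by itself.

```python
import zlib

def build_rows(tag, n):
    """Build n elements named `tag`, each wrapping a small number."""
    return "".join(f"<{tag}>{i}</{tag}>" for i in range(n)).encode("utf-8")

def sizes(tag, n=10000):
    """Return (raw_bytes, compressed_bytes) for n elements named `tag`."""
    raw = build_rows(tag, n)
    return len(raw), len(zlib.compress(raw))

# Illustrative tag names only; pick whatever pair you want to compare.
terse_raw, terse_zip = sizes("c")
verbose_raw, verbose_zip = sizes("table-cell")

print("terse:  ", terse_raw, "bytes raw,", terse_zip, "bytes compressed")
print("verbose:", verbose_raw, "bytes raw,", verbose_zip, "bytes compressed")
```

Comparing the raw and compressed figures side by side shows how much of the difference survives compression for this particular synthetic input; real documents with mixed content may behave differently.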
Of course the size problem of tags in ODF is more limited because they actually are not using namespace prefixes in their markup. A very strange choice for a format that claims extensibility through adding namespaces. So I guess it is either more terse tags (especially for the most used ones) or dropping namespace prefixes from your tags.
Using the namespace prefixes also improves things like XML validation and extensibility compared to ODF. It is a tradeoff: ODF you can read better manually, OOXML you can process better. What do you think will be happening 99.9999% of the time: processing of the XML in OOXML, or manual reading of the XML in it?
[quote]It's the same with OOXML as it is with ODF. If you want interop, everyone has to use the same app and app version. That is not standards-based interoperability among competing apps. [/quote]
It is correct that true interoperability is in the applications. The formats can only provide a means of doing that.
But what I am saying is that OOXML, despite being called overly large and complex, seems to be doing well in the interoperability parts.
For 100% interoperability with complex MS Office files you would need an equally extensive office suite. But if OOXML provides support for all features in such an extensive office suite it also provides a lot of support for the less extensive suites. The format is no limiting factor in interoperability.
It does not work the other way around. By using a less extensive format like ODF you can't even support all the features in the most commonly used Office software to date. That limits interoperability at the format level already, and not just at the application level.
There are two important issues to consider, especially wherever there are government pilot studies and ODF mandate proposals. The first is whether or not we can convert volumes of existing MSOffice binary documents to ODF, and do so with an acceptable fidelity loss. The second is whether or not ODF plug-ins for MSOffice can convert documents with sufficient fidelity such that MSOffice bound business processes can continue without costly disruption or re engineering.
For the sake of discussion, these issues are often referred to as <i>“compatibility with existing documents”</i>, and <i>“interoperability with existing applications”</i>. Or, as many do, the entire issue is lumped together as <b>“interoperability”</b>.
ODF must pass this real world test to be considered “implementable” or have any hope of success in the marketplace. Unfortunately, for much of the world this isn't a clean-slate implementation. Conversion of binary to ODF is a fact of life that can't be avoided.
Fortunately the reverse engineering and conversion of the MS binary formats is a sector as rich with success as it is rife with MS inspired turmoil, confusion, and obfuscation. Yes, the binaries have been a moving target. But the conversion sector can routinely hit an 85-95% fidelity and higher on import. Export however is a far more difficult issue.
Given the importance of being able to convert MS binaries to ODF, one would think that the OASIS ODF TC would do whatever it takes to improve conversion fidelity for both import and export. This sadly is not the case. Over the past five years some of the world's foremost conversion – reverse engineering experts have worked on the ODF TC, but have seen their efforts defeated or pushed out to future generations of ODF.
Two names in particular stand out; Phil Boutros and Florian Reuter. Phil was responsible for the outstanding Stellent conversion filters, considered by most to be the best in the business. Florian wrote the da Vinci ODF plug-in for MSOffice, and, the OOXML conversion plug-in for Novell's OpenOffice. Florian also represented Novell on the OASIS ODF TC, the Ecma 376 Workgroup, the Cleverage open source “Translator” project, and the EU-ISO authorized DIN Workgroup “Harmonization” study. Prior to his work with Novell and the OpenDocument Foundation, Florian was Sun's resident RTF – MS Binary conversion expert tasked with assisting and advising OpenOffice/StarOffice community developers.
That the many conversion sector inspired efforts to improve “compatibility-interoperability” within the OASIS ODF TC were defeated or kicked forward to future generations of ODF is a matter of record. But here's something else to consider. Microsoft is claiming that Sun and IBM have had access to the binary blueprints since 2003. I personally don't know if that's true. But what I do know is that Sun and IBM have had the binary blueprints since early 2006. And I do know that in the past two years, neither Sun nor IBM has introduced any proposals to enhance or improve ODF interoperability with MSOffice, or compatibility with the billions of binary documents the marketplace seeks to convert to ODF.
So much for the value of the binary blueprints.
If you want to truly understand this difficult <i>“compatibility-interoperability”</i> issue, the best place to look is at independent conversion efforts. Sadly, all you'll ever get from the application vendors is a lot of useless finger pointing, heated politics, and refusal to compromise or cooperate. They have a different agenda than that of interoperability with the enemy's product line.
And what do the independent conversion efforts tell us? They scream loudly that there is a fundamental difference between how OpenOffice and MSOffice implement basic document structures such as lists, tables, fields, sections and page dynamics. These basic <i>layout engine</i> differences are further complicated by differences in feature s
Sorry, but I was unaware that ZD-UK cut off the comments. Please find the full comment at this link, <b><a href="http://docs.google.com/Doc?id=dghfk5w9_94fvpf9scv">Interoperability and the binary ODF conversion dilemma</a></b>
Thanks,
~ge~
I think that especially Sun in the ODF TC is against any changes they cannot implement in StarOffice/OpenOffice at the time that the format specification is standardized.
IBM on the other hand is against any changes that will allow Microsoft to use the format better than they themselves are likely to.
Those two companies having total control of the format development therefore does not signal improvements in interoperability with MS Office, and effectively makes the format totally useless for Microsoft (unless MS were to custom-extend it into a monstrosity, which nobody wants either).
I think this has also been correctly identified by Patrick Durusau of the ODF TC (one of the few non-Sun/IBM voting members), who has made a case in these recent weeks for having these two standards together. Both ODF and OOXML.
Great article.
“IMHO, the real problem is that both ODF and OOXML began life as an XML encoding of the originating application's binary dump. Neither started life as a clean slate, generic, document structure focused format. Neither started life as an application-platform-vendor independent effort...”
Pretty much sums up the whole situation.
~Goldie
PS That last paragraph was kinda scary.