XML 2: The XML 1.0 specification is here. Does it cut the mustard?

The story of XML as a standard goes back to early 1996. Although HTML had been the darling of a new breed of Internet programmers for a number of years, it was becoming increasingly obvious that it was reaching the limits of its power, and was basically getting on a bit. The Internet is a ruthless arena for even the most popular standards.
So the online community decided it wanted to achieve different goals with the next-generation electronic document format. The new standard should lose none of the simplicity, GUI appearance and hypertext talents of HTML, while having the added benefit of enabling automation of multi-vendor applications.
The World Wide Web Consortium, or W3C, launched the W3C XML Activity project in May 1996. The working group that designed and tweaked XML comprised an interesting mixture of publishing-industry veterans and Web pioneers, working from the position of strength that a vendor-independent body enjoys. This small working group (the XML Working Group) was also helped out with technical input from a larger Special Interest Group, or SIG. Following ten design goals and passing through a succession of interim drafts, XML reached the 1.0 Recommendation.
The W3C was particularly proud of the fact that all technical discussions were held, and all decisions taken, via teleconference, email and web postings, with very little face-to-face interaction. The group believes that not only did this allow members worldwide to play an important role in development, it also sped up the normally lengthy specification process as a whole.
Since version 1.0, there have been a number of requests for enhancements to the spec, but the working group is reluctant to make any radical changes before XML is deployed more widely and greater hands-on experience has been gained. XML.com wisely suggests that if you want to understand XML, you really have to read the specification for yourself.
But as Tim Bray, co-editor of the XML 1.0 specification, once said, most people never get round to reading the basics of how to operate a toaster safely, so ploughing through a technical spec is probably not for everyone. Still, this specification is only 40 pages long and, thanks to being authored in XML, is available all over the web in a number of different formats, so it's definitely worth a look. Luckily, Bray has written a number of idiot-friendly papers with potted explanations of various aspects of XML 1.0, something that I for one was very pleased about.
I don't think that it's unfair to say XML 1.0 was something of a hit, but to be honest the demand for an HTML successor was such that it could hardly have been anything else. XML 1.0 was designed from the ground up to do things that HTML never could. HTML is great for displaying text, but for automating Web processes you need something like XML. It gives you the ability to make rich documents that are open to manipulation by computer programs. For example, a web robot could be employed to index items, or a Java applet used to push content into graphs or tables.
It also managed to take away the rigidity of HTML tagging. With XML 1.0 it's up to the user how elements are specified. Tags like "chart-position" or "goals-scored" are commonplace in XML documents but will never feature in HTML. You may be concerned that you'll have to knock up a new range of tags each time you write or share a document, but the specification includes something called a Document Type Definition, or DTD, that allows you to define the tags you've made for future use by you or anyone else you want.
A document that conforms to its DTD (if it has one) is called "valid". Valid or not, every XML document has to be "well-formed". This means all the tags begin and end correctly (empty elements have their own self-closing form), all attribute values are correctly quoted, and all entities are declared. This is a fab idea.
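To make the ideas of user-defined tags and well-formedness a little more concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. The league-table document, its DTD and the function names are all invented for illustration; only the tag names "chart-position" and "goals-scored" come from the text above. Note one assumption in particular: ElementTree checks well-formedness only, so the DTD is shown but not used to check validity.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element names and the DTD are made up for this
# sketch. ElementTree reads past the DTD but does NOT validate against it.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE league [
  <!ELEMENT league (team+)>
  <!ELEMENT team (chart-position, goals-scored)>
  <!ATTLIST team name CDATA #REQUIRED>
  <!ELEMENT chart-position (#PCDATA)>
  <!ELEMENT goals-scored (#PCDATA)>
]>
<league>
  <team name="Rovers">
    <chart-position>1</chart-position>
    <goals-scored>42</goals-scored>
  </team>
  <team name="United">
    <chart-position>2</chart-position>
    <goals-scored>37</goals-scored>
  </team>
</league>
"""


def goals_by_team(xml_text):
    """Pull the data out of the custom tags, as a program or robot might."""
    root = ET.fromstring(xml_text)
    return {
        team.get("name"): int(team.findtext("goals-scored"))
        for team in root.iter("team")
    }


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it rejects it."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(goals_by_team(DOCUMENT))
    # Each of these breaks one of the three rules listed above.
    print(is_well_formed("<team><goals-scored>3</team>"))  # tag not closed
    print(is_well_formed("<team name=Rovers/>"))            # unquoted attribute
    print(is_well_formed("<team>&sponsor;</team>"))         # undeclared entity
```

Run as a script, this prints the goals keyed by team name, followed by False for each of the three broken fragments. Checking validity against the DTD would need a validating parser, which the Python standard library does not provide.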
Surf the web and you will find a lot of crap HTML out there, with unclosed tags, broken links and so on. This makes automated processing extremely unreliable, as you can't be sure that all documents will comply with the same rule sets. Well-formed documents are easy to parse, that is, to break down into a structure that programs can manipulate.
Browsers are very forgiving of bad HTML, but the XML 1.0 spec clearly states that if a document is not well-formed, it is not an XML document at all: the parser must report a fatal error rather than try to patch things up. The committee decided that it was easy to use good XML practice, so if writers couldn't be bothered, tough. The controversial decision was pushed through primarily by Netscape and Microsoft, and these rules mean that once Navigator or Explorer displays an XML page, you know it's well-formed.
The most complimentary thing that can be said about the XML 1.0 specification is that it works and does everything the W3C set out for it to achieve. But then it was knocked up by a load of old hacks who loved SGML and had a vested interest in making the web a more dynamic, friendly place to code.
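The contrast between forgiving HTML handling and strict XML parsing can be sketched with two parsers from the Python standard library. This is only an illustration under stated assumptions: html.parser stands in for a lenient browser, xml.etree.ElementTree for an XML processor, and the sloppy snippet and helper names are invented for the example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Invented example of sloppy markup: neither tag is ever closed.
SLOPPY = "<p>Unclosed paragraph <b>and unclosed bold"


class TagCollector(HTMLParser):
    """Records every start tag the lenient HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_tags(text):
    """Feed text to the forgiving HTML parser and list the tags it saw."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.tags


def xml_error(text):
    """Return the XML parser's error message, or None if text is well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return str(err)
    return None


if __name__ == "__main__":
    print(html_tags(SLOPPY))   # the HTML parser carries on regardless
    print(xml_error(SLOPPY))   # the XML parser refuses the same text
    print(xml_error("<p>Closed <b>properly</b></p>"))
```

The HTML parser reports both tags without complaint, while the XML parser stops with an error on the same input and accepts it only once every tag is closed.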
