XML 4: Of all the mark-ups in all the WWW, why walk into XML?

The phenomenon that is the World Wide Web is fuelled by the ability it gives geographically remote web authors to distribute documents around the globe easily and cheaply. XML aims to provide a basic syntax that allows such information to be shared between different systems running different applications without layer upon layer of conversion.
Most documents on the Web, even now, are transmitted in HTML (HyperText Markup Language), a simple language based on SGML (Standard Generalised Markup Language) and suited to simple documentation and hypertext. HTML applications are limited to a small, fixed set of tags conforming to a single SGML specification. This allows users to leave the language specification out of the document and makes it easier to build applications, but it limits HTML in terms of extensibility, structure and validation. HTML users can't specify their own tag sets. They can't represent high-end structures such as database schemas or object-oriented hierarchies. And without a language specification, HTML users can't check that documents are valid when importing or exporting them.
SGML can do all these things and more, but it contains many optional features that web users don't really need, and so it has proven cumbersome and expensive for browser companies and end users alike.
XML is expressly targeted at a web-focused audience, although it does have applications beyond the Web. XML was designed to be easy to use, but it is not backwards-compatible with existing HTML documents. Users who are used to working with HTML should, however, be able to pick up the basics of XML pretty quickly, and as documents conforming to the W3C (World Wide Web Consortium) HTML 3.2 specification can easily be converted to XML, this isn't really a barrier to XML's acceptance.
Although many arguments for XML are made at the expense of HTML, no one really believes that the huge volume of useful HTML pages out there is about to become obsolete in the short term. The W3C has an abiding interest in HTML, as have many of the W3C's members. The ISO (International Organization for Standardization) has also standardised HTML in the conviction that HTML will persist for at least another 25 years, which is quite telling. For all its growing pains, then, HTML remains a very successful common denominator for building web content. The applications that will promote XML as the markup language of choice will be those that cannot be undertaken within HTML. Specifically, these will be:
  • applications that need to mediate between two or more heterogeneous databases;
  • applications that require web agents to customise information delivery according to the needs of individual users;
  • applications that need to present different views of the same information to different users (e.g. desktop users, handheld users, kiosks);
  • and those that need to shift a high load from the web server to the web client.
HTML can handle some of these tasks to an extent, using proprietary code embedded as "script elements" and delivered with the help of proprietary plug-ins or Java applets in Navigator or Explorer, but it's far from ideal for the job.
One of XML's key selling points is its simplicity. XML gives programmers and site authors a friendly environment in which to work. Well, friendly in computing terms... XML documents are built upon a core set of basic nested structures. Although you can take these key elements and create very complicated structures through layering, the underlying objects themselves remain simple and understandable to the less than brilliant.
The obvious aspect of XML to look at is the "X". The language is eXtensible, meaning it can grow and develop as demands require. The initial developers' contribution to extensibility was the provision of user-definable tag sets. DTDs (Document Type Definitions) are the most obvious sign of extensibility within XML. At the end of the day, XML is a meta-language: it outlines a set of rules that can be used to create a further set of rules for a particular type of document. DTDs give builders a set of tools with which they can define structure. Flexible yet standard: a compelling combination.
XML itself is also still being extended, with bolt-ons providing authors with additional styling, linking and referencing. XML can already use many existing web standards, such as Cascading Style Sheets (CSS) and the Hypertext Transfer Protocol (HTTP). XML Linking (XLink) offers linking facilities that HTML developers can only admire from afar. XPointers provide a consistent way of referencing portions of documents. Extensible Style Language (XSL) provides a more complex tool set than that provided by CSS, and uses XML syntax to define style sheets. XML is well supported and is growing at a comforting rate, with more standards on the way.
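The nested structures and DTD-defined tag sets described above can be illustrated with a small sketch. The `catalogue` document, its tag names and the `summarise` helper are all invented for illustration; they are not taken from any real specification. The sketch uses the `xml.etree.ElementTree` module from Python's standard library, which checks that a document is well-formed but does not validate it against its DTD, so full validation would need a separate validating parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the DTD at the top declares a made-up tag set,
# and the body uses those tags as simple nested structures.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE catalogue [
  <!ELEMENT catalogue (book+)>
  <!ELEMENT book (title, author)>
  <!ATTLIST book isbn CDATA #REQUIRED>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
]>
<catalogue>
  <book isbn="0-000-00000-0">
    <title>An Example Title</title>
    <author>A. N. Author</author>
  </book>
  <book isbn="0-000-00000-1">
    <title>Another Example</title>
    <author>B. Writer</author>
  </book>
</catalogue>
"""


def summarise(xml_text):
    """Return (isbn, title, author) for every <book> in the document."""
    # fromstring raises ET.ParseError if the text is not well-formed XML.
    root = ET.fromstring(xml_text)
    return [
        (book.get("isbn"), book.findtext("title"), book.findtext("author"))
        for book in root.iter("book")
    ]


if __name__ == "__main__":
    for isbn, title, author in summarise(DOCUMENT):
        print(isbn, title, author)
```

Because the tag names carry the meaning of the data, the same few lines of traversal code would work for any document following this made-up tag set, however deeply the structures are layered.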
But there are still further arguments for adopting XML at this stage over HTML or full SGML. Because its documents behave consistently, and because it is supported by platform-independent languages such as Java, by third-party APIs, and by parsers for C++, C, JavaScript, Tcl and Python, XML is extremely interoperable. The standard itself is also open, and thus freely available on the Web. Developers could create obscure DTDs or encrypt data, but why bother and lose one of the main benefits of XML? In addition, skilled XML developers, in the shape of members of the sizeable SGML community, are already out there, ensuring penetration.
As well as all the advantages for the Web already discussed, XML has potential as a universal file transfer format. The adoption of XML in Microsoft Internet Explorer 5 and, presumably, in the long-awaited Navigator/Communicator revision 5 from Netscape will open a lot of these doors. XML can act as a gateway for communications between disparate systems, platforms and applications. Unless you are a web house with demands that can only be satisfied by a full SGML implementation, it's difficult at this stage to envision a scenario in which XML fails to become your de facto markup of choice and you persevere with, or turn back to, pure HTML.
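As a rough illustration of XML acting as a transfer format between disparate systems, the sketch below has one side serialise some records to XML text and the other side read them back. The `orders` tag set, the record fields and both function names are assumptions made up for this example, and Python's standard `xml.etree.ElementTree` module stands in for whatever parser each system would really use.

```python
import xml.etree.ElementTree as ET


def records_to_xml(records):
    """Sending side: turn a list of dicts into XML text (hypothetical tag set)."""
    root = ET.Element("orders")
    for rec in records:
        order = ET.SubElement(root, "order", id=str(rec["id"]))
        ET.SubElement(order, "customer").text = rec["customer"]
        ET.SubElement(order, "total").text = f'{rec["total"]:.2f}'
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(root, encoding="unicode")


def xml_to_records(xml_text):
    """Receiving side: rebuild the records from the XML text alone."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": int(order.get("id")),
            "customer": order.findtext("customer"),
            "total": float(order.findtext("total")),
        }
        for order in root.findall("order")
    ]


if __name__ == "__main__":
    sent = [
        {"id": 1, "customer": "Example Ltd", "total": 19.99},
        {"id": 2, "customer": "Sample & Co", "total": 5.0},
    ]
    wire_text = records_to_xml(sent)
    print(wire_text)
    print(xml_to_records(wire_text))
```

The receiving function needs nothing from the sender except the text itself and an agreed tag set, which is the point the article makes about XML as a gateway; a receiver written in Java or C++ could read the same text with its own parser.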