Extensible Markup Language (XML) has very rapidly established itself as a viable technology with a huge range of real-world applications. One of the main reasons for its growing importance and wide acceptance is that it offers a working solution to one of the key problems faced by software developers and computer users alike: the exchange of data between incompatible programs. Many software programs create their own type of binary file, which only they can understand. When data is exported in XML format, it becomes a known, clearly defined quantity, independent of the environment in which it originated.

The PDF format is another example of a platform-independent format which has gained worldwide acceptance. Once a document is saved as a PDF, its appearance is fixed: it can be viewed and printed with its layout and formatting intact, without the need for the software which created the original document. However, where the PDF format concerns itself mainly with the presentation of information, XML is used to describe and encapsulate the information itself.

Though XML itself is still fairly new, the idea behind it is not. Standard Generalized Markup Language (SGML), which grew out of work begun in the 1970s and became an ISO standard in 1986, was developed to create an application-independent method of describing data. SGML is a text-based language in which markup is added to data in order to describe that data. An SGML document contains both data and a set of rules defining the structure of the data. SGML is a complex language and, unlike XML, has never become mainstream. In the early 1990s, SGML was used to define HTML, and in the late 1990s it also served as the basis for the development of XML. In essence, XML is a restricted form of SGML.

XML has already proved itself to be an excellent medium for storing, describing and transporting data, particularly over the web. It offers developers flexibility, clarity and simplicity. An XML document resembles an HTML document and uses the same style of human-readable tags. However, the tags used to mark up an HTML document are predetermined: only a fixed set of tags can legitimately be used. XML allows you to create your own markup language and define the tags which are legitimate for your data. It does this via the mechanism of a schema document, which can itself be an XML document. The schema document defines the vocabulary and grammar which may be used within the XML document containing your data.
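To make the idea of inventing your own tags concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. The vocabulary (library, book, title, author, year and the isbn attribute) and the helper list_titles are made up for illustration and are not part of any standard.

```python
import xml.etree.ElementTree as ET

# A made-up vocabulary: <library>, <book>, <title>, <author>, <year> and the
# "isbn" attribute are illustrative names chosen to suit the data.
DOCUMENT = """\
<library>
  <book isbn="0000000001">
    <title>First Example</title>
    <author>A. Writer</author>
    <year>1999</year>
  </book>
  <book isbn="0000000002">
    <title>Second Example</title>
    <author>B. Author</author>
    <year>2001</year>
  </book>
</library>
"""


def list_titles(xml_text):
    """Parse the document and return the text of every <title> under a <book>."""
    root = ET.fromstring(xml_text)
    return [book.findtext("title") for book in root.findall("book")]


titles = list_titles(DOCUMENT)
print(titles)
```

Because the tags name what the data is rather than how it should look, any program that can parse XML can pull out the titles without knowing which application produced the file.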

Because you can set all the rules when you create an XML document, you never have to force your data into a container which was not designed to hold it. You define tags which reflect the nature of your data; you create a schema document which specifies the hierarchical structure of your information; and you decide on the type of information each element within your document is permitted to contain. In short, if you end up creating an XML document which is unsuitable for holding your information, the responsibility lies with you!
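The kind of checking a schema makes possible can be sketched by hand. The example below is only an illustration, assuming a hypothetical book element with title, author and year children: the BOOK_RULES dictionary stands in for a real schema document, and check_book is a made-up helper, not a substitute for a proper schema validator.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules for a <book> element: each required child element and
# the Python type its text must convert to. This dict is a stand-in for a
# real schema document, not a schema language.
BOOK_RULES = {"title": str, "author": str, "year": int}


def check_book(book):
    """Return a list of problems found in one <book> element."""
    if book.tag != "book":
        return [f"expected <book>, found <{book.tag}>"]
    problems = []
    # Every element named in the rules must be present and well typed.
    for name, expected_type in BOOK_RULES.items():
        child = book.find(name)
        if child is None:
            problems.append(f"missing <{name}>")
            continue
        try:
            expected_type((child.text or "").strip())
        except ValueError:
            problems.append(f"<{name}> is not a valid {expected_type.__name__}")
    # Reject any child element the rules do not mention.
    for child in book:
        if child.tag not in BOOK_RULES:
            problems.append(f"unexpected <{child.tag}>")
    return problems


good = ET.fromstring(
    "<book><title>T</title><author>A</author><year>1999</year></book>"
)
bad = ET.fromstring(
    "<book><title>T</title><year>soon</year><price>5</price></book>"
)

print(check_book(good))
print(check_book(bad))
```

The first document satisfies every rule, so no problems are reported; the second lacks an author, has a year that is not a number, and contains a price element the rules never allowed. A real schema language expresses the same three kinds of constraint (which elements exist, how they nest, and what type of content each holds) declaratively rather than in program code.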

The writer of this article is part of an organisation that offers training courses on web design in London and throughout the UK.