SGML vs. XML

Harold Hunt huntharo@msu.edu
Wed, 16 May 2001 19:13:02 -0400


Ed,

From http://www.oasis-open.org/cover/xml.html:
	Valid XML documents are designed to be valid
	SGML documents, but XML documents have additional restrictions.

Essentially, XML is a strict dialect of SGML.  For example, in SGML, and
therefore HTML, you may define a tag whose closing tag is optional; this
is not allowed in XML.  In SGML, tags may take number or string arguments,
and the quotation marks around a value can often be omitted; for example,
in SGML you might have <foo parm1=foo_string parm2="foo_string_2" parm3=45
parm4="56">, whereas in XML you would have <foo parm1="foo_string"
parm2="foo_string_2" parm3="45" parm4="56">.  Notice that *every* parameter
has quotation marks around its data.  With the default SGML declaration,
tag names are not case-sensitive, so <FOO> and <foo> are the same tag.  XML
names are case-sensitive, and the DocBook XML DTD defines all of its tags
in lower case, so in practice DocBook XML tags are written in lower case.
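The rules above can be checked with any XML parser.  Here is a minimal
sketch using Python's standard xml.etree.ElementTree module; the helper
name is_well_formed and the sample strings are my own illustrations, not
part of DocBook or of any DTD.  Note that it tests only well-formedness;
it does not validate against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as a well-formed XML document.

    Hypothetical helper for illustration; it checks well-formedness
    only and does not validate against any DTD.
    """
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# SGML-style tag: some attribute values are left unquoted.
sgml_style = '<foo parm1=foo_string parm2="foo_string_2" parm3=45 parm4="56"></foo>'

# XML-style tag: every attribute value is quoted.
xml_style = '<foo parm1="foo_string" parm2="foo_string_2" parm3="45" parm4="56"></foo>'

# Closing tags omitted, as SGML may permit when the DTD says so.
unclosed = '<doc><para>first<para>second</doc>'

# Opening and closing tags differ only in case.
mixed_case = '<Foo></foo>'

results = {
    'sgml_style': is_well_formed(sgml_style),
    'xml_style': is_well_formed(xml_style),
    'unclosed': is_well_formed(unclosed),
    'mixed_case': is_well_formed(mixed_case),
}

for name, ok in results.items():
    print(name, 'well-formed' if ok else 'rejected')
```

Only the xml_style string should be accepted; an XML parser rejects the
other three before a DTD ever comes into play.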

XML is, in essence, a language processor's language: it is easier to parse
and process XML because the format of the data is very regular.
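As a rough illustration of that regularity, a few lines of Python are
enough to turn a document into a list of tags and attributes.  This is
only a sketch: the fragment below is made up for the example and has not
been validated against the DocBook DTD, and the variable names are mine.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment written to XML rules: lower-case tags, every
# tag closed, every attribute value quoted.  Not validated against
# any DTD.
doc = """<article>
  <title>SGML vs. XML</title>
  <para role="intro">XML is a strict dialect of SGML.</para>
  <para>Every attribute value is quoted.</para>
</article>"""

root = ET.fromstring(doc)

# Walk the tree in document order, recording each tag and its attributes.
outline = [(elem.tag, dict(elem.attrib)) for elem in root.iter()]

for tag, attrs in outline:
    print(tag, attrs)
```

Because closing tags are never implied and attribute values are always
quoted, the parser needs no DTD to work out where each tag begins and ends.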

Currently the DocBook XML and SGML DTDs are roughly identical, but in the
future the SGML DTD will probably be mothballed.

You can write SGML according to XML rules in a file that is compiled against
the DocBook SGML DTD; this is how I write my DocBook currently.  I find that
writing to the XML rules makes the source much easier to read, and I would
like to avoid having to tidy up my SGML files when the SGML DTD is
discontinued.

I'd be glad to answer any other DocBook questions you have.

Harold