Well-formed XML documents simply mark up pages with descriptive tags. You do not need to
describe or explain what these tags mean. In other words, a well-formed XML document does
not need a DTD, but it must conform to the XML syntax rules. If all tags in a document are
correctly formed and follow the XML guidelines, the document is considered well formed.
Syntax is the grammar of a language. For an XML document to be
well formed, it must obey the following rules, which are the most important ones:
Well Formed:

<title>Tootsie</title>

Not Well Formed:

"Tootsie"

An XML document must contain exactly one pair of opening and closing tags that
encloses the whole document, forming what is called the root element. In this
example, the version that is not well formed lacks the root element
<videocollection>...</videocollection> that the well-formed version has:
Well Formed:

<videocollection>
<title>Tootsie</title>
<title>Jurassic Park</title>
<title>Mission Impossible</title>
</videocollection>

Not Well Formed:

<title>Tootsie</title>
<title>Jurassic Park</title>
<title>Mission Impossible</title>

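The root-element rule can be tried out with any XML parser. The following is a minimal sketch in Python, assuming the standard library module xml.etree.ElementTree as the parser (this tutorial does not prescribe one). The helper name is_well_formed is made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts text as a well-formed XML document.

    Hypothetical helper, not part of any library.
    """
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One root element encloses all titles.
with_root = (
    "<videocollection>"
    "<title>Tootsie</title>"
    "<title>Jurassic Park</title>"
    "</videocollection>"
)

# The same titles without an enclosing root element.
without_root = "<title>Tootsie</title><title>Jurassic Park</title>"

print(is_well_formed(with_root))     # True
print(is_well_formed(without_root))  # False
```

The parser stops at the second <title> element because content after the end of the first top-level element is not allowed.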
All other tags must be nested properly, i.e. every opening tag must have a
closing tag, and tags cannot overlap. Tags that would normally stand alone in
HTML, such as <img> or <br>, are called "empty tags" when used in an XML
document. In XML, empty tags look like this: <BR/>.
In the first example below, </title... has no closing angle bracket, therefore
the tag is not complete.
</title)... has the wrong closing bracket, therefore the tag is not complete either.
In the second example, the tags are not properly nested.
Well Formed:

<videocollection>
<title>Tootsie</title>
<title>Jurassic Park</title>
<title>Mission Impossible</title>
</videocollection>

<videocollection>
<title>Tootsie</title>
</videocollection>

Not Well Formed:

<videocollection>
<title>Tootsie</title
<title>Jurassic Park</title)
<title>Mission Impossible</title>
</videocollection>

<videocollection>
<title>Tootsie
</videocollection></title>

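The nesting rules can be checked the same way. This sketch assumes Python's standard xml.etree.ElementTree module as the parser. The sample labels and the check function are illustrative names, not part of the tutorial.

```python
import xml.etree.ElementTree as ET

# Each sample isolates one of the nesting situations described above.
samples = {
    "properly nested": "<videocollection><title>Tootsie</title></videocollection>",
    "overlapping tags": "<videocollection><title>Tootsie</videocollection></title>",
    "missing bracket": "<videocollection><title>Tootsie</title</videocollection>",
    "empty tag": "<videocollection><BR/></videocollection>",
    "unclosed HTML-style tag": "<videocollection><BR></videocollection>",
}

def check(text):
    """Return a short verdict string for one sample (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return "well formed"
    except ET.ParseError as err:
        # The parser's message includes the line and column of the problem.
        return f"not well formed ({err})"

results = {label: check(text) for label, text in samples.items()}
for label, verdict in results.items():
    print(f"{label}: {verdict}")
```

Only the "properly nested" and "empty tag" samples are accepted. The other three are rejected with a parser message that points at the offending position.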
Tags in XML are case sensitive,
which means that <CREW>, <Crew> and <crew> are not the same. The XML declaration (<?xml ... ?>) must be all
lowercase, whereas keywords in DTDs must be all UPPERCASE, such as ELEMENT, ATTLIST, #REQUIRED, #IMPLIED, NMTOKEN, ID, etc. Your own elements and attributes may use any case you
choose, as long as you are consistent.
Well Formed:

<crew>Sydney Pollak</crew>

Not Well Formed:

<CREW>Sydney Pollak</crew>
<crew>Sydney Pollak</Crew>

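Case sensitivity is easy to observe with a parser. A small sketch, again assuming Python's standard xml.etree.ElementTree module (an assumption of this example, not something the tutorial requires):

```python
import xml.etree.ElementTree as ET

# The element name is kept exactly as written, so these are different names.
upper = ET.fromstring("<CREW>Sydney Pollak</CREW>")
lower = ET.fromstring("<crew>Sydney Pollak</crew>")
print(upper.tag == lower.tag)  # False

# Mixing cases between the opening and the closing tag is an error.
mismatched = ["<CREW>Sydney Pollak</crew>", "<crew>Sydney Pollak</Crew>"]
rejected = []
for text in mismatched:
    try:
        ET.fromstring(text)
    except ET.ParseError:
        rejected.append(text)

print(len(rejected))  # 2
```

Both <CREW>...</CREW> and <crew>...</crew> are well formed on their own. The error only arises when the opening and closing tag of one element differ in case.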
Attribute values must always be enclosed in quotation marks.

Well Formed:

<title id="1">Tootsie</title>

Not Well Formed:

<title id="1>Tootsie</title>
<title id=1>Tootsie</title>

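The attribute examples can be verified with a short sketch, assuming Python's standard xml.etree.ElementTree module as the parser. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# A quoted attribute value is accepted and is available as a string.
title = ET.fromstring('<title id="1">Tootsie</title>')
print(title.get("id"))  # 1

# XML accepts single quotes as well as double quotes.
single = ET.fromstring("<title id='1'>Tootsie</title>")
print(single.get("id"))  # 1

# An unterminated quote and a missing quote are both rejected.
bad_samples = ['<title id="1>Tootsie</title>', "<title id=1>Tootsie</title>"]
errors = []
for text in bad_samples:
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        errors.append(str(err))

print(len(errors))  # 2
```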
These are just some examples of the well-formedness constraints mentioned above.
They are the most important ones, but the list is far from complete.
Please check the XML Specification for the complete set of constraints.
To check whether a document is well formed, you only need to check
whether it is properly marked up according to the XML syntax rules.
In our example, the following section of XML data is well formed.
<?xml version="1.0"?>
<videocollection>
<title id="1">Tootsie</title>
<genre>comedy</genre>
<year>1982</year>
<language>English</language>
<cast>Dustin Hoffman</cast>
<cast>Jessica Lang</cast>
<cast>Teri Gar</cast>
<cast>Sydney Pollak</cast>
<crew>
<director>Sydney Pollak</director>
</crew>
<title id="2">Jurassic Park</title>
<genre>science fiction</genre>
<year>1993</year>
<language>English</language>
<cast>Sam Neil</cast>
<cast>Laura Dern</cast>
<cast>Jeff Goldblum</cast>
<crew>
<director>Steven Spielberg</director>
</crew>
<title id="3">Mission Impossible</title>
<genre>action</genre>
<year>1996</year>
<language>English</language>
<cast>Tom Cruise</cast>
<cast>Jon Voight</cast>
<cast>Emmanuelle Beart</cast>
<cast>Jean Reno</cast>
<crew>
<director>Brian de Palma</director>
</crew>
</videocollection>
In the chapter on parsers, you can use a parser to check whether this document
really is well formed.
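As a rough preview of such a check, here is a minimal sketch that assumes Python's standard xml.etree.ElementTree module as the parser. The document is shortened to part of the first film, and the report function is a hypothetical helper that also returns where the parser gave up.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample document. The XML declaration must be the
# very first thing in the text, so the string starts right after the quotes.
document = """<?xml version="1.0"?>
<videocollection>
  <title id="1">Tootsie</title>
  <genre>comedy</genre>
  <year>1982</year>
  <crew>
    <director>Sydney Pollak</director>
  </crew>
</videocollection>
"""

def report(text):
    """Return (True, None) if text is well formed, else (False, (line, column))."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return False, err.position
    return True, None

ok, where = report(document)
print(ok, where)  # True None

# Break the document on purpose by changing the case of one closing tag.
broken = document.replace("</crew>", "</Crew>")
ok2, where2 = report(broken)
print(ok2, where2)
```

For the broken copy, the first value is False and the second is the line and column at which the parser reported the mismatched tag.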
 