Markup Languages: 3.2. XML

XML: eXtensible Markup Language

Developed after HTML (in the late 1990s), partly in response to HTML's quirkiness.

Goals:

  • Universal format for exchanging both documents and structured data (e.g., records in a database or computer data structures).
  • Many languages in one: each XML document can define its own set of tags.

Sample XML:

    <?xml version="1.0" encoding="utf-8"?>
    <person>
      <name>Ironman</name>
      <age>49</age>
      <state>New York</state>
      <occupation>Engineer</occupation>
    </person>
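To make the element hierarchy concrete, here is a minimal sketch that reads the sample with Python's standard xml.etree.ElementTree module. The helper name person_to_dict is hypothetical, and the sketch assumes every child of <person> holds plain text.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<?xml version="1.0" encoding="utf-8"?>
<person>
  <name>Ironman</name>
  <age>49</age>
  <state>New York</state>
  <occupation>Engineer</occupation>
</person>"""


def person_to_dict(xml_text):
    """Parse a <person> document and return its children as a dict.

    Hypothetical helper: assumes each child element contains only text.
    """
    # Passing bytes lets the parser apply the encoding named in the header line.
    root = ET.fromstring(xml_text.encode("utf-8"))
    # Iterating over an element yields its direct children in document order.
    return {child.tag: child.text for child in root}


record = person_to_dict(SAMPLE)
print(record["name"], record["age"])
```

Note that every value comes back as a string; XML itself does not say that <age> is a number, so the application has to convert it.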

Format:

  • Information in <...> is markup; everything else is raw text.
  • Header line: identifies this as an XML document, indicates character encoding and XML version.
  • Body of document consists of a hierarchical collection of elements, starting with a single outer element (<person>).
  • XML does not require any particular element structure or tag names; each document can have its own schema.
  • Every element must have an explicit start and end (but can use <foo /> as shorthand for <foo></foo>).
  • An element's start tag can carry attributes, such as <person name="Ironman" age="49" state="New York" height="49">.
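  • Tags can be repeated and nested.
  • Cannot use < directly in document text, and > is conventionally escaped as well; these characters delimit tags.
    • Use entities instead: &lt; and &gt;
    • This means & is a special character also: use &amp;
    • Also, need &quot; to include a double quote in an attribute value that is delimited by double quotes.
    • XML itself predefines only one more entity, &apos; (apostrophe); further entities can be declared in a DTD, and any character can be written as a numeric reference such as &#169;.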
  • XML has a few other features we won't cover here, such as namespaces for organizing tag names.
  • Two optional mechanisms available to enforce a particular structure on an XML document:
    • DTD (Document Type Definition): a document, written in its own non-XML syntax, that describes permissible structure for a class of XML documents ("... each <person> element must contain <name> and <age> children ...").
    • XML Schema: newer than DTDs, designed to get around some shortcomings of DTDs; written in XML itself, and also more complex.
    • There exist programs that will read a DTD or XML Schema file and validate an XML file against it.
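The escaping and structure rules above can be sketched with Python's standard library. All three helper names (make_note, is_well_formed, missing_children) and the <note> element are hypothetical. The last helper is only a hand-written stand-in for the quoted rule about <person>; it is not DTD or XML Schema validation, which the standard library does not provide.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def make_note(text, author):
    """Build a <note> element by hand, escaping the variable parts."""
    # escape() rewrites &, < and > as &amp;, &lt; and &gt;.
    # quoteattr() adds the surrounding quotes and escapes the value
    # so that it is safe inside them.
    return "<note author=%s>%s</note>" % (quoteattr(author), escape(text))


def is_well_formed(xml_text):
    """Return True if the text parses as a single XML element tree."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def missing_children(person, required=("name", "age")):
    """List the required child tags that a <person> element lacks."""
    return [tag for tag in required if person.find(tag) is None]


note = make_note("a < b && c > d", 'Tony "Ironman" Stark')
parsed = ET.fromstring(note)
print(parsed.text)            # the parser turns the entities back into characters
print(parsed.get("author"))

print(is_well_formed("<p>a < b</p>"))       # raw < inside text
print(is_well_formed("<p>a &lt; b</p>"))    # same text, escaped

incomplete = ET.fromstring("<person><name>Ironman</name></person>")
print(missing_children(incomplete))
```

The split between is_well_formed and missing_children mirrors the split in the notes: any XML parser rejects text that breaks the syntax rules, while checking a particular structure is a separate, optional step.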

Benefits:

  • Textual format, supports Unicode and UTF-8 for internationalization.
  • Simple, clean syntax.
  • Can be used for a variety of different purposes.
  • Heavily used for data exchange in Web-based applications, and for storing application data.
  • The common syntax has allowed a large collection of tools to be developed, most of which will work on any XML document (e.g., validators).
  • XML documents can be read by humans when necessary; the tags make them almost self-documenting.
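As a small illustration of a tool that works on any XML document, the sketch below counts how often each tag occurs without knowing anything about the document's schema. The function name tag_counts is hypothetical; it uses only the standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tag_counts(xml_text):
    """Count occurrences of every element tag in an XML document."""
    root = ET.fromstring(xml_text)
    # iter() walks the root element and all of its descendants.
    return Counter(element.tag for element in root.iter())


people = "<people><person><name>A</name></person><person><name>B</name></person></people>"
for tag, count in sorted(tag_counts(people).items()):
    print(tag, count)
```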

Weaknesses:

  • Verbose.
  • Not as fast to generate or parse as other formats (but fairly efficient parsers have become available).
  • At the beginning people hoped XML would instantly allow any application that understands XML to communicate with any other application that understands XML, but this hasn't come to pass: if two applications use different XML schemas, they can't interact in a meaningful way without an explicit translation between them.
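The last weakness can be seen in a short sketch. Both documents below are well-formed XML describing the same person, but the second layout (<employee> with fullName and years attributes) is invented here for illustration, as are the two reader functions. A reader written for one schema finds nothing useful in the other.

```python
import xml.etree.ElementTree as ET

# Same information, two hypothetical schemas.
DOC_A = "<person><name>Ironman</name><age>49</age></person>"
DOC_B = '<employee fullName="Ironman" years="49"/>'


def read_schema_a(xml_text):
    """Reader for the element-based layout used by DOC_A."""
    root = ET.fromstring(xml_text)
    return {"name": root.findtext("name"), "age": int(root.findtext("age"))}


def read_schema_b(xml_text):
    """Reader for the attribute-based layout used by DOC_B."""
    root = ET.fromstring(xml_text)
    return {"name": root.get("fullName"), "age": int(root.get("years"))}


# Each reader works on its own schema and yields the same record.
print(read_schema_a(DOC_A) == read_schema_b(DOC_B))

# A generic parser accepts DOC_B, but the lookup that schema A relies on
# comes back empty: there is no <name> child to find.
print(ET.fromstring(DOC_B).findtext("name"))  # None
```

Sharing a syntax gets both applications as far as a parsed tree; agreeing on what the tags mean still has to happen outside of XML.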
