Troubleshooters.Com Presents

Troubleshooting Professional Magazine

Volume 5 Issue 3, March 2001
XML
Copyright (C) 2001 by Steve Litt. All rights reserved. Materials from guest authors copyrighted by them and licensed for perpetual use to Troubleshooting Professional Magazine. All rights reserved to the copyright holder, except for items specifically marked otherwise (certain free software source code, GNU/GPL, etc.). All material herein provided "As-Is". User assumes all risk and responsibility for any outcome.

IDL code snippets and other information from the DOM specification are copied from http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/, Copyright © 1998 World Wide Web Consortium , (Massachusetts Institute of Technology , Institut National de Recherche en Informatique et en Automatique , Keio
University ). All Rights Reserved. Status of this document is a w3c recommendation.

[ Troubleshooters.Com | Back Issues ]

Steve Litt is the author of Troubleshooting Techniques of the Successful Technologist and
Rapid Learning: Secret Weapon of the Successful Technologist.

Three Rules of Work : 1) Out of clutter, find Simplicity. 2) From discord, find Harmony.  3) In the middle of difficulty lies Opportunity. -- Albert Einstein


 

CONTENTS

  • Editors Desk
  • About this Issue's Exercises, PLEASE READ!!
  • What is XML?
  • Some Definitions
  • Anatomy of an XML App
  • Simplified Explanation of the DOM API
  • Learning from the Masters: How Dia Uses XML
  • XML Coding Exercises
  • Wrapup
  • Where to Go From Here
  • Apache Software Foundation and W3C Rule!
  • Thanks and Acknowledgements
  • Linux Log: Open Source Means Quality Education
  • Letters to the Editor
  • How to Submit an Article
  • URLs Mentioned in this Issue
  • Editors Desk

    By Steve Litt
    What's up with XML? Is it a revolutionary technology destined to be our livelihood the next few years, or a passing fad? Is it a universal standard specified by the W3C, or has it been usurped and proprietarized by Microsoft? And for some, the most nagging question is "how the heck do I learn it?". This issue of Troubleshooting Professional will attempt to answer all 3 questions. But for those who turn to the last page of the book, let me answer the questions now:
    1. XML is a revolutionary technology destined to be our livelihood the next few years.
    2. XML is a universal standard specified by the W3C.
    3. You can learn the basics of XML in this issue of Troubleshooting Professional.
    XML was detected by trade mags' radar in 1997 or 1998. It was proclaimed a world changing technology. Learn it and you're rich.

    We were all skeptical. After all, the trades had predicted similar futures for push technology, ATM, and a hundred other technologies we've all forgotten. But the trades get it right sometimes. Witness Java and Linux. And definitely XML.

    It's 2001. XML is being incorporated in all sorts of projects. The reason you don't hear about it constantly is the *app* that reads, writes, changes and renders the XML is written in a traditional language such as Java, Perl, Python or C++. In that respect XML is data. But used correctly, much of an application's logic can be stored as easily modified XML. The actual C++, Java, Python or Perl code then becomes primarily the user interface. Imagine how nice it would be to implement your business rules as XML. You can!

    Then there's the Microsoft connection. Microsoft is gung-ho about XML. Does that make XML an unwise move?

    Probably not. Even if Microsoft does what they do best, and somehow manage to proprietarize some dialects of XML, it will be easy to reverse engineer, and may even be legal to do so in spite of UCITA supported anti-reverse engineering license language. Meanwhile, the rest of us can use our own dialects.

    "Dialects" are numerous. As will be explained later in this magazine, XML itself is just an extremely intuitive general specification for how to declare something that could be considered hierarchical data, or markup language, depending on your viewpoint. Within that specification, an implementer specifies his own set of rules for naming XML elements, and what other elements each element can contain. That specification can be implemented on paper, or technologically enforced with a DTD or schema. If this paragraph loses you don't worry -- everything in this paragraph will be explained in detail in this magazine.

    Unfortunately, XML is poorly documented. There are exceptions. The W3C specifications are easily readable and understandable. But for the most part, XML books do nothing but document XML's syntax, rules and vocabulary, leading the novice reader to ask "so how can I do something with it". If you follow along with the Java examples in this magazine, you'll know exactly what you can do with XML. Once you understand XML at that level, you can port that knowledge to Perl, Python, C++ and other languages that have XML APIs.

    XML derives its power from the fact that it can represent anything the human mind can conceive. And that representation is very readable both for a human and for a machine. The concept is so clean that upon understanding it, my first question was "why didn't I invent XML?". I certainly have the intelligence to have invented it -- XML's not rocket science. I've needed it for years, but had to "roll my own" every time I needed a configuration file or data format.

    So get familiar with XML. Whether you're in the Microsoft world or the Open Source world, or somewhere in between, you'll need to interface with it in the next couple years.

    How can a Troubleshooter benefit from XML? XML should make applications simpler to diagnose and simpler to tweak. And an XML file provides loads of testpoints from which you can manipulate the apps interacting with it. It brings back some of the Troubleshooting advantages of the intermediate files of the Cobol era, but unlike those, it's persistent and useful in and of itself.

    So whether you're a Troubleshooter, programmer, DBA, Sysadmin, or just a person who likes technological progress, kick back, relax, and enjoy your magazine.

    Steve Litt is the documentor of the Universal Troubleshooting Process.  He can be reached at Steve Litt's email address.

    About this Issue's Exercises, PLEASE READ!!

    By Steve Litt
    If you complete the XML tutorials in this issue of Troubleshooting Professional, you will have mastered the following: This issue of Troubleshooting Professional Magazine is organized as a tutorial. It's takes you through all aspects of beginning level XML, well into the intermediate level. I strongly recommend you go through this tutorial in the order it's written. That means going down this page through the "Learning from the Masters: How Dia Uses XML" article, then go down the "XML Java Coding Exercises" page, and then come back to this page and continue where you left off.  Everywhere necessary, there are links to point you in the right direction.

    The coding exercises are all in Java. My research indicates Java has the most mature support for XML. Once you download Xerces from the Apache Foundation and install it, these exercises work on a Linux box with Java installed. Java is the most straightforward way I could offer coding exercises.

    I had originally intended to do the exercises in both Perl and Java, but Perl DOM support proved problematic, and there wasn't enough time.


    !! STOP THE PRESSES !!

    Xerces-Perl for Linux has shipped!

    After I had done most of the exercises in Java, I got an email message that there now exists. It's so new it's not on CPAN, and I couldn't find it on xml.apache.org. It's been tested only on Debian. But Xerces is a killer tool, and a Perl/Linux version is a good thing. Stay tuned. More info as it comes in.


    Rest assured, though, if you're a Perl, Python or C++ person, everything you learn in this tutorial will apply to XML in your language of choice. In every exercise, I used only calls defined in the DOM and SAX specifications. I used no "native Javaisms" to manipulate XML.

    Java is a killer language. It's portable, ubiquitous, free beer and in some implementations free speech, it's fast enough, and it's corporationally correct. These are some more reasons I chose Java for the XML coding.

    This tutorial was written, tech edited, and tested in Linux (Mandrake 7.2). No effort was made to test under Windows. Instead I used the time to delve deeper into XML. That being said, I know of no reason the Java exercises shouldn't work on a Windows box that's properly configured with Java and Xerces. If you don't have a Linux box, and you can't get your hands on one, by all means use a Windows box for the Java exercises. You'll need to convert some of the shellscripts to batch files, and you'll need to do a Windows install of the JDK and Xerces instead of a Linux install, but that should be pretty easy.

    The Dia diagramming program, basis of the "Learning from the Masters: How Dia Uses XML", originated on Linux but has been ported to Windows. The Linux package is more mature, so if you have a choice you might want to do that exercise on a Linux box. And that's an exceptionally important exercise, so even if you don't have a Linux box, please try to find someone who will let you use theirs for this exercise. If you don't know anyone with a Linux box, find your local Linux User Group (LUG) and beg someone there to let you use their box to do the Dia exercises.

    Personally, I felt more comfortable working on a Linux platform. If you feel more comfortable on a Windows platform, I'd imagine you should be able to get this tutorial to work from within Windows, although of course I haven't tested it on Windows.

    Steve Litt is the main author of Samba Unleashed.  He can be reached at Steve Litt's email address.

    What is XML?

    By Steve Litt
    In this Article You Will Learn
    • XML is a styles based markup language
    • XML is hierarchical in nature.
    • XML is extremely readable and easy to understand.
    • XML can represent almost any concept.
    • XML can implement a major part of an application.

    This is a far trickier question than you can imagine, and I think once you master the answer, everything else falls into place.

    One possible answer is that XML is a markup language. And that's absolutely true, as anyone who sees the bracketed begin and end tags for its elements can attest. This answer is true, but almost useless. Because to think of XML as HTML on steroids is to relinquish 90% of XML's functionality.

    Another possible answer is that XML is a styles-based markup language, rather than an appearance-based markup language like HTML. Once again, so true, and so useless.

    I think a much better definition for XML is a specification for a markup language that can be used to represent almost any concept. Keeping in mind that neither phonebook, person, info nor name are keywords, imagine how the following could be used:

    <phonebook>
      <person lname="Smith" fname="John">
        <info name="workphone">800-555-1212</info>
        <info name="homephone">407-555-5555</info>
        <info name="relationship">Skating buddy</info>
        <info name="skatetype">Racing inlines</info>
        <bicycle serialno="432845" speeds="21" tires="700cc"></bicycle>
      </person>
      <person lname="Jones" fname="Mary">
        <info name="workphone">800-555-1234</info>
        <info name="homephone">407-555-2222</info>
        <info name="relationship">Coworker</info>
        <info name="yearsatcompany">8</info>
      </person>
    </phonebook>
    You've just implemented a phone book. Add a user interface and you're done. The user interface reads the fields from the XML, and places the values from those fields in on-screen text boxes, queries the user to change the contents of those fields And notice that if you write that user interface well, you can add new fields simply by changing the XML. You can have a program on the other end that puts the finished XML into a database, assuming the database is flexible enough to represent such data.

    Notice a few facts about the preceding XML code:

    Because elements can contain other elements, to a certain extent attributes and sub-elements are interchangeable. For instance, in the person element I described the person's name with an lname and an fname attribute. Instead, I could have had each person element contain an lname and a fname subelement, each of which had the appropriate name between the begin and end tag. In other words:
      <person>
        <lname>Smith</lname>
        <fname>John</fname>
        <info name="workphone">800-555-1212</info>
        <info name="homephone">407-555-5555</info>
        <info name="relationship">Skating buddy</info>
        <info name="skatetype">Racing inlines</info>
      </person>
    Please remember there are no reserved words in the preceding example. info and name are just strings I decided upon to make it self documenting. As an alternative to the preceding, I could have even used info tags to accomplish the same purpose:
      <person>
        <info name="lname">Smith</info>
        <info name="fname">John</info>
        <info name="workphone">800-555-1212</info>
        <info name="homephone">407-555-5555</info>
        <info name="relationship">Skating buddy</info>
        <info name="skatetype">Racing inlines</info>
      </person>
    Your choice of attributes vs. elements depends on things such as whether you'll need more than one of the entity (no two attributes of a single element can have the same name), and whether you should always have the entity (that might favor using an attribute). Also, use elements if order is important, because the XML specification doesn't specify the order of attributes, so parsers don't necessarily preserve attribute order. All this will be explained later in this magazine.

    The preceding examples have used XML as a hierarchical representation. But it can also be used as stylized markup:
     
    <heading level="3"> Why XML is So Great</heading>
    <paragraph>XML is <emphasis>absolutely wonderful!</emphasis>And
    it's not just because <emphasis>XML is <newword>Corporationally
    Correct</newword>!</emphasis></paragraph>
    <paragraph>Now let's talk about...

    In the preceding, the XML markup describes the styles, or functionality, of marked up text. It's up to the application rendering the XML to assign an appearance to such styles. Even the relationship between style and appearance can be moved out of the application using XSL (Extensible Style Language). XSL is a separate but related subject that is not discussed in this issue of Troubleshooting Professional.

    Tags must be nested, never interlaced. The following is not allowed:
     
    XML <emph>is <italic>great</emph> and good.</italic>

    The well formed way to write the preceding would be to nest tags, like this:
     
    XML <emph>is <italic>great</italic></emph><italic> and good.</italic>

    Because tags can't be interlaced, but instead must be nested, all XML represents a hierarchy. For instance, the preceding snippet could be thought of like this:

    XML is
    <emphasis>
        truly 
        <italic>
            great
    <italic>
        and fantastic.
    Generally speaking, in XML intended to represent a hierarchy, an element containing a text node contains no other elements or text nodes, but in XML intended to represent markup, an element often contains several text nodes and several other elements. But this is not a rule, only a custom.

    I believe the best way to learn XML is through the DOM (Document Object Model) spec, so DOM is discussed voluminously in later portions of this issue of Troubleshooting Professional Magazine.
     

    In this Article You Have Learned
    • XML is a styles based markup language
    • XML is hierarchical in nature. Tags must be nested, never interlaced.
    • XML is extremely readable and easy to understand.
    • XML can represent almost any concept.
    • XML can implement a major part of an application.

     
    Steve Litt is the author of  Rapid Learning: Secret Weapon of the Successful Technologist".  He can be reached at Steve Litt's email address.

    Some Definitions

    By Steve Litt
    In this Article You Will Learn
    • You'll learn definitions for the following:
      • Document
      • Element
      • Attribute
      • Text Node
      • Node
      • Document element
      • DTD
      • Well formed
      • Valid
      • Schema
      • DOM
      • SAX
      • Dom document
      • Namespace
    Document The data contained in an entire XML file:
    <?xml version="1.0"?>
    <shellinterface>
        <getch mthd="cmd" access="backtick">getchbsd.pl</getch>
        <pathseparator>;</pathseparator>
    </shellinterface>
    Element The entity defined by a start tag and end tag, but not the entities contained between the start and end tags:
    <getch mthd="cmd" access="backtick">getchbsd.pl</getch>
    Note that all elements are nodes, but not all nodes are elements. Elements inherit all methods of nodes, and add some of their own. Nodes are discussed later in this table.
    Attribute The name/value pairs enumerated in an element's start tag:
    <getch mthd="cmd" access="backtick">getchbsd.pl</getch>
    Text Node The text between the open and close tag of its parent element:
    <getch mthd="cmd" access="backtick">getchbsd.pl</getch>
    Node The most atomic XML entity that is programmatically useful. Elements, attributes and text nodes are all nodes. There are other node types which are described in the DOM spec:
    // NodeType
    const unsigned short ELEMENT_NODE = 1;
    const unsigned short ATTRIBUTE_NODE = 2;
    const unsigned short TEXT_NODE = 3;
    const unsigned short CDATA_SECTION_NODE = 4;
    const unsigned short ENTITY_REFERENCE_NODE = 5;
    const unsigned short ENTITY_NODE = 6;
    const unsigned short PROCESSING_INSTRUCTION_NODE = 7;
    const unsigned short COMMENT_NODE = 8;
    const unsigned short DOCUMENT_NODE = 9;
    const unsigned short DOCUMENT_TYPE_NODE = 10;
    const unsigned short DOCUMENT_FRAGMENT_NODE = 11;
    const unsigned short NOTATION_NODE = 12;
    The Node interface of the DOM spec contains most of the navigational methods.  Note that all elements are nodes, but not all nodes are elements. Elements inherit all methods of nodes, and add some of their own.
    Document element Top level element, of which there can be only one per XML file:
    <shellinterface>
        <getch mthd="cmd" access="backtick">getchbsd.pl</getch>
        <pathseparator>;</pathseparator>
    </shellinterface>
    DTD A sort of type declaration for XML. Here's an ultra-simple one:
    <!DOCTYPE docelement [
    <!ELEMENT docelement (#PCDATA)>
    ]>
    Note that docelement is NOT a reserved word.
    Well formed An XML file conforming to the XML syntax rules, including:
    • Every start tag has an end tag, and none are "interlaced", but instead all are properly nested. 
    • Every attribute has a name followed by an equal sign followed by a quoted value. 
    • There is one and only 1 top level element
    Valid Well formed, AND conforming to the rules of the DTD.
    Schema Performs a function similar to a DTD.
    DOM Stands for Document Object Model. A method of placing an entire XML file's hierarchy, with all its elements, in a memory object. This memory object is built for quick lookup, traversal and modification.
    SAX Stands for Simple API for XML. An event driven method of dealing with an XML file. Instead of containing the entire hierarchy in memory at one time, it presents elements as events which can then be exploited by your code. SAX has the advantage of less memory consumption for large files, but has the disadvantage that the programmer must write code to save anything he wants saved, and must write changes to the XML file in sequential order. DOM allows random changes to elements. Because needn't keep entire files in memory at once, SAX is universally useful, whereas DOM is not useful for truly huge XML files.
    DOM Document In the DOM standard, an object containing the entire hierarchy, elements, and information of an XML file. 
    DOM object Any object contained within a DOM document. Vague, ambiguous, and misunderstood -- don't use this term. THIS TERM IS NOT A SYNONYM FOR DOM document!!!
    Namespace A method of uniquifying tag names from various XML varients:
    <shape xmlns="http://www.daa.com.au/~james/dia-shape-ns"
    
    xmlns:svg=
    "http://www.w3.org/TR/2000/03/WD-SVG-20000303/DTD/svg-20000303-stylable.dtd">
      <name>Circuit - Vertical Zener Diode</name>
      <svg:svg width="3.0" height="3.0">
        <svg:line x1="0" y1="0" x2="0" y2="2" />
        <svg:polygon points="-0.8,2 0.8,2 1,2.6 0.8,2.1 -0.8,2.1 -1,1.5"
              style="fill: inverse" />
        <svg:polygon points="0,2.1 -1,3.85 1,3.85" style="fill: default" />
        <svg:line x1="0" y1="3.85" x2="0" y2="5.85" />
      </svg:svg>

     
    In this Article You Have Learned
    • You'll learn definitions for the following:
      • Document
      • Element
      • Attribute
      • Text Node
      • Node
      • Document element
      • DTD
      • Well formed
      • Valid
      • Schema
      • DOM
      • SAX
      • Dom document
      • Namespace
    Steve Litt is the developer of The Universal Troubleshooting Process troubleshooting courseware. He can be reached at Steve Litt's email address.

    Anatomy of an XML App

    By Steve Litt
    In this Article You Will Learn
    • The high level architecture of a DOM/XML application.
    • Interaction of the XML file, parser, DOM document, XML write logic, Renderer, Modification logic, and output.

    An XML application reads an XML file, after which it can modify and rewrite the XML, and/or it can print output based on that XML (commonly called "rendering"). Note that "rendering" can take widely diverse forms, including changing which fields are available on a form, printing a vector graphic, or the most obvious case of rendering marked up text. Rendering can even take the form of configuring an application, or executing remote procedures.

    The DOM model is easiest to understand, so here is the architecture of an XML app using DOM:
    Diagram of XML app architecture

    So here's what happens: A parser reads the XML file and builds a DOM document to match the XML file. From that point until a save is performed, all interaction between the app and XML hits the DOM document rather than the corresponding XML file. It's interesting to note that almost all XML parsers use SAX. The reason is simple enough. Before you build a DOM document you must detect events such as start of element (start tag encountered), end of element (end tag encountered), new attribute (name followed by equal sign followed by quoted string encountered), and the like. So DOM can be thought of as an extra abstraction to lessen the programmer's workload, at the expense of memory usage.

    Modifications are made directly to the DOM document. Elements can be added, deleted, renamed, rearranged. Text nodes can be added, deleted or changed. Elements can be moved either within the same level, or promoted or demoted to different levels.

    Obviously, the DOM is modified in apps that rewrite the XML file. But DOM modification is also often done in an app that only renders the XML. The classic example is in a "DOMWalker" app, which simply walks the DOM tree and prints what it finds in a hierarchical outline. In fact, the newlines and spaces intended to make the XML file more readable are actually legitimate text nodes in XML, but in an XML app concerned only with a hierarchy they're extraneous. Therefore, the first thing a DOMWalker program does is delete text nodes made up only of whitespace. Source code for an example DOMWalker is given later in this magazine.

    Rendering is the heavy part of most XML apps. It's often graphics intensive. Consider the Dia vector drawing program, which keeps all drawing information in XML but renders as geometric shapes. Often there are several rendering processes, one for each kind of output. Thus a book authored in XML could be rendered as a paper book, as a PDF, as a Postscript file, or as an HTML page or series of HTML pages. Indeed, this is one of the primary benefits of styles based documents. Often the rendering itself is decoupled from the app by use of XSL (eXtensible Style Language), much the same as program logic is decoupled from the app using XML.

    Rewriting the XML file is actually easy -- about what you'd expect for your last class project in a college Programming 101 course. In the case of DOM, you've already assembled the output in a DOM document, so you just walk its tree and write the markup.

    In the case of SAX based XML apps it's a little harder because you often don't read the information in the same order you want to write it. In other words, if your app's specification calls for something occuring later in the input modifying something earlier in the output, you can't just use a read-write loop. So you do the typical stuff -- keep some things in memory, or maybe write an intermediate file and then sort it, or run 2 passes through the XML. This is why for apps interacting with guaranteed small XML files, DOM is better.
     
     

    In this Article You Have Learned
    • The high level architecture of a DOM/XML application.
    • Interaction of the XML file, parser, DOM document, XML write logic, Renderer, Modification logic, and output.
    Steve Litt is the documentor of the Universal Troubleshooting Process.  He can be reached at Steve Litt's email address.

    Simplified Explanation of the DOM API

    By Steve Litt
    In this Article You Will Learn
    • How to create a DOM Document object with a parser.
    • The three main DOM activities.
    • Using the "checker metaphor" to understand iterative document navigation.
    • The major DOM navigation public variables and equivalent methods.
    • The Down, right, up, done DOM walking algorithm.
    • A simple Java implementation of that algorithm.
    • How to access elements by name.
    • How to add, change and delete information in the DOM document.
    • Navigating attributes by name and sequentially.

    If you understand DOM, you're 90% of the way to understanding XML.

    What you might think of as a "DOM object" is really an instance of the  Document class:
     
    DOMParser dp = new DOMParser();
    dp.parse("myfile.xml");
    Document doc = dp.getDocument();

    In the preceding code, the parser delivers an instance of Document, called doc, which contains the entire information hierarchy contained in the original file myfile.xml. You can use methods from the DOM API to extact any info from the DOM document if that information was in the original XML file (with a very few exceptions)

    The simplest explanation of a DOM document is that it's an in-memory tree containing all info from the XML file hierarchy, together with with varous methods to navigate that tree, to get information from a specific node, and to add, delete, rearrange or modify nodes. If you can navigate, get, and change, that's pretty much all you need to do with a hierarchy.

    There's no better documentation on DOM than W3C's DOM specification papers, available at their website. To learn XML, you should spend about a day reading the parts dealing with XML (not with HTML). It is time *very* well spent.

    The purpose of this article is to help you understand what you will see when you read the DOM spec, so that you don't go off on the wrong track and you aren't overwhelmed.

    Throughout this article, keep in mind that DOM methods enable three main activities:

    1. Navigating the hierarchy tree
    2. Viewing information (get)
    3. Modifying information(put, delete, add, move, etc.)

    Navigating the hierarchy tree

    The DOM navigation methods are defined so you can navigate the tree without recursion. They do this using methods that move your current position around like a checker on top of the various nodes. Here I use the word "checker" like the round plastic play pieces used in the board game called "Checkers".

    Note: The "checker" metaphor will be used extensively throughout this issue of Troubleshooting Professional.
    Most of the methods to read and modify elements operate on the element with the checker. HOWEVER...

    My assertion that they operate by moving a checker around is not quite accurate, because these navigation methods do not change the state of the DOM document. Instead, they simply deliver a node. The programmer records the current position by assigning the returns of these methods to a node object. That node object marks the place of the "checker".
     

    !! CAUTION !!
    Although attributes are nodes, they are invisible to the navigation methods and public variables listed below. There are specialized methods and public variables to access and navigate attributes.

    The following is a list of the major navigational methods, and the equivalent public variables, and the interfaces in which these methods and public variables are implemented. Immediately below the list is a sample hierarchy to walk. Observe the naming convention that in general the variable name is converted to the method name by capitalizing the first leter, and prepending either get or set as appropriate. For the time being, don't worry about the Interface column:
     
    Method Equivalent public variable Interface
    getOwnerDocument()
    getDocumentElement()
    getFirstChild()
    getLastChild()
    getNextSibling()
    getPreviousSibling()
    getParentNode()
    readonly attribute Document ownerDocument;
    readonly attribute Element DocumentElement;
    readonly attribute Node firstChild;
    readonly attribute Node lastChild;
    readonly attribute Node nextSibling;
    readonly attribute Node previousSibling;
    readonly attribute Node parentNode;
    Node
    Document
    Node
    Node
    Node
    Node
    Node
    Graphical representation of an object tree

    In plain English, you start with the checker on the document element. At every juncture:

    Trace the preceding pseudocode algorithm on the hierarchy diagram above and you'll see what I mean. Starting at the document element, you go down to A, then right to B, then down to 1, then right to 2, then right to 3, then up to B, then right to C, then up to the document element, at which time you're done because you've already been there.

    That brings up an important point. You shouldn't be able to go down from an element if you've already done so. When you first arrive at an element via a downward or a rightward movement, you descend if you can. But sooner or later, you'll come back up to that same element after you've gone as far right as you can in the level below the element. Obviously, you don't want to descend again, as that would make an infinite loop as described in the following indented paragraph:

    From A move right to B. From B move down to 1. From 1 move right to 2. From 2 move right to 3. From 3 move up to B. From B move down to 1...
    So you implement a boolean control variable (let's call it ascending) that is true when you ascend to a node, and false otherwise. The definition of "can go down" then becomes not only that there are children, but also that you are not ascending. The following Java loop walks a tree and calls once printNodeInfo() for each element:
     
    mynode=doc.getDocumentElement();
    while (true) {
      if (!ascending) {
        printNodeInfo(mynode);
      }
    
      if ((mynode.hasChildNodes()) && (!ascending)) {
        mynode=mynode.getFirstChild();
        ascending = false;
      }
      else if (mynode.getNextSibling() != null) {
        mynode=mynode.getNextSibling();
        ascending = false;
      }
      else if (mynode.getParentNode() != null) {
        mynode=mynode.getParentNode();
        ascending = true;
      }
      else {
        break;
      }
    }

    In the preceding Java code, object mynode is the "checker". Basically, what the code says is perform an action (printNodeInfo() in this case) on the checkered element, and then make your move. Move the checker down if you can, otherwise move it right if you can, otherwise move it up if you can, otherwise you're done (because you've returned to the document element).

    Oh, and one more thing. The preceding navigation accesses not only elements, but also text nodes. You can discern the two types with the nodeType public variable or the getNodeType() method implemented in the Node interface. However, remember that the preceding navigation methods do NOT bring the checker to rest on attributes. Attributes have their own navigation and access methods. Using the "checker" metaphor, they could be said to have their own checker.

    Accessing Elements by Name

    The preceding section of this article discussed navigating elements by tree moves. That's ideal when you don't know what elements you'll encounter. But sometimes, because of the nature of the application, you know it's likely that under a particular element, and that it's likely you'll have one or more elements of a certain name. The following XML is an example:
      <person>
        <lname>Smith</lname>
        <fname>John</fname>
        <info name="workphone">800-555-1212</info>
        <info name="homephone">407-555-5555</info>
        <info name="relationship">Skating buddy</info>
        <info name="skatetype">Racing inlines</info>
      </person>
    It's likely you'll have info elements, and you might want to list them. That's when you use the getElementsByTagName(name)syntax, which delivers a NodeList (similar to an array) of all such subelements. You can then loop through the NodeList to put your checker on each of those similarly named elements. This can be done even when you know there will be only one such named element.

    Viewing information (get)

    Once your checker is on an element, in some DOM implementations, including Python, you can access that element's information with variables:
    readonly attribute DOMString nodeName;
    attribute DOMString nodeValue;
    readonly attribute unsigned short nodeType;
    In other implementations, including Java, you use methods to accomplish these same things:
    public String getNodeName();
    public String getNodeValue();
    public short getNodeType();
    Some implementations allow you to do either.

    Modifying information(put, delete, add, etc.)

    To change the value of an element, use the nodeValue public variable or its setNodeValue() method equivalent. It's read/write. To change name of the element, you'll need to replace the element with a different element, using the replaceChild(newChild,oldChild) syntax. Note that this works not on the node with the checker, but a child of the node with the checker. To do this you need to move your checker up. Depending on language and DOM implementation, and assuming the checker is on myElement, this might be possible with a 1 liner:
    myElement.parentNode.replaceChild(newElement,myElement)
    Otherwise, try something like this:
    Element tempElement = myElement;
    myElement = (Element)myElement.getParentNode();
    myElement.replaceChild(newElement,tempElement);
    An element can be inserted before the checker like this:
    Element tempElement = myElement;
    myElement = (Element)myElement.getParentNode();
    myElement.insertBefore(newElement,tempElement);
    myElement = tempElement;  //Return to original position
    An element can be appended after the checker like this:
    Element tempElement = myElement;
    myElement = myElement.getParentNode();
    myElement.appendChild(newElement,tempElement);
    myElement = tempElement;  //Return to original position
    Once the new node is in place, you can change its value with its nodeValue public variable, or the setNodeValue() method.

    The "checker" element can be deleted like this:

    Element tempElement = myElement;
    myElement = myElement.getParentNode();
    myElement.removeChild(newElement,tempElement);
    In the case of deletion, you can't move the checker back to the original node because the original node is gone. The programmer handles this by storing where he wants to go after the deletion. For instance, a DOM walker that deletes all blank text nodes keeps a copy of where the checker was in the previous iteration, and upon deletion goes back there. In the next iteration, it gets the node "after" the deleted one.

    Navigating Attribute Nodes

    So far this article focused exclusively on navigating or accessing elements and text nodes. But within elements there are sometimes attribute nodes. There are two broad ways to access an attribute node:
    1. By attribute name
    2. Sequentially

    Navigating Attributes by Name

    Believe it or not, accessing nodes by attribute name is by far the more useful. That's because if your app has never heard of a given attribute, there's not a whole lot it can do with it, assuming you're using attributes as they're designed to be used. So it's rare to access attributes sequentially, but it can be done.

    Navigating attributes is simpler than navigating elements because attributes cannot contain anything else, and because you cannot have two attributes with the same name.

    To get the value of a named attribute, use the my element.getAttribute(attribname) syntax. To get an attribute object, use the element.getAttributeNode(attribname) syntax. An attribute object contains the attribute name, its value, whether the value was specified as opposed to default, and the element that owns the attribute.

    Navigating Attributes Sequentially

    Getting attributes sequentially is much more difficult, and various DOM implementations have their own glitches. You'll need to experiment to get it just right. A typical use of sequential access to attributes is a reporting program, or writing the DOM document out to an XML file.

    An element's attributes are accessed as an array, not with a getNext type of API. Different implementations are different, and you'll need to experiment, but typically you get the array, get the array's length, and then loop through the attribute nodes. You get the array with the attributes public variable or the getAttributes() method, defined in the Node interface, and the number of elements with the length public variable defined in the NamedNodeMap interface, then loop, accessing each attribute with the item() method implemented in the NamedNodeMap interface, then accessing the attribute's name and value public variables from the Attr interface. If your implementation uses only methods, use getNodeName() and getNodeValue(). The following is some Java code to do that:
     
    NamedNodeMap attribs = thisNode.getAttributes(); 
    for(int i=0; i < attribs.getLength(); i++){
      Node attrib = attribs.item(i);
      System.out.print(attrib.getNodeName());
      System.out.print("=\"");
      System.out.print(attrib.getNodeValue());
      System.out.print("\"\n");
    }

    Once again, in many DOM implementations the preceding doesn't work. In some cases attribs is an array in the computer language's native format, after which it can be traversed using constructs of the language. Experiment.
     
     

    In this Article You Have Learned
    • How to create a DOM Document object with a parser (DOMParser object).
    • The three main DOM activities are navigating, viewing and modifying.
    • Using the "checker metaphor" to understand iterative document navigation.
    • The major DOM navigation public variables and equivalent methods.
      • The methods:
        • getOwnerDocument() 
        • getDocumentElement() 
        • getFirstChild() 
        • getLastChild() 
        • getNextSibling() 
        • getPreviousSibling() 
        • getParentNode()
      • The Public Variables:
        • readonly attribute Document ownerDocument; 
        • readonly attribute Element DocumentElement; 
        • readonly attribute Node firstChild; 
        • readonly attribute Node lastChild; 
        • readonly attribute Node nextSibling; 
        • readonly attribute Node previousSibling; 
        • readonly attribute Node parentNode;
    • The Down, right, up, done DOM walking algorithm:
      • Go down if you can
      • else go right if you can
      • else go up if you can
      • else you're done
      • Do not descend if you got to the node by ascending.
      • Do not process the node if you got to it by ascending.
    • A simple Java implementation of that algorithm.
    • Using getElementsByTagName(name) to access elements by name.
    • How to add, change and delete information in the DOM document with appendChild(newElement,tempElement) and insertBefore(newElement,tempElement), replaceChild(newElement,tempElement), and removeChild(newElement,tempElement).
    • Navigating attributes by name using getAttribute(attribname) and getAttributeNode(attribname), and navigating attributes sequentially with getAttributes(), getLength(), and getNodeName().
    Steve Litt is the documentor of The Universal Troubleshooting Process. He can be reached at Steve Litt's email address.

    Learning from the Masters: How Dia Uses XML

    By Steve Litt
    In this Article You Will Learn
    • How to use Dia to learn good XML construction.
    • Dia is a vector drawing package that stores its drawing information in XML format.
    • Modifying the drawing modifies the XML, and Modifying the XML modifies the drawing.

    This article may seem very tedious. You might be tempted to skip it. But unless you already have a deep understanding of XML and a feel for what makes good XML, this is the most important article in this magazine. If you skip this article, you'll likely fail (or at least not understand what you're doing) when you try coding the XML app exercises later in this issue. But if you spend the hour it takes to do this article's exercises, and the extra 1 to 3 hours to debrief yourself so you really understand what has happened, you will have a deep, intuitive grasp of XML, and nothing will stop you.

    !! CAREFULLY READ AND PARTICIPATE IN THIS ARTICLE !!

    Many Linux distros come with a vector graphics drawing program called Dia. Dia is an Open Source alternative to Visio. It stores not only drawings but also template shapes in XML, so it's very extensible and could surpass Visio. Using only a text editor, you can create brand new template shapes, each with an arbitrary number and placement of connnection points. It's incredible.

    Dia is available on many Linux distros. I know it's on Mandrake 7.1 and 7.2, although it's not on the menu. But it's in /usr/bin. If Dia isn't installed, see if you can install it from your distribution CD (check for a file with a name like dia-0.86-2mdk.i586.rpm in your RPMS directory on Red-Hat derived distros).

    If your distro didn't come with Dia, here are some places you can get it:
     
    Type of install Where to find it
    Source http://www.lysator.liu.se/~alla/dia/dia.html
    Debian Package http://packages.debian.org/unstable/graphics/dia.html
    RPM files http://www.rpmfind.net, then search for dia.

    Dia is a diagramming tool most suitable for data flow diagrams, network system diagrams, or basically anything resembling a block diagram. Connection lines stay connected as you move components around. You can add bends to connection points by right-clicking a multi-segment connection line and choosing "add new segment". Outstanding!

    All drawings are stored as gzipped XML files. You can modify a drawing two ways -- graphically, or by editing the XML. Although the latter is much more time consuming and harder to visualize, for work requiring exact measurements it might be preferable.

    Hello World Dia XML investigation

    But never mind. I came to use Dia, not to praise it. We're going to use Dia to learn how it uses XML, in preparation for our own XML app. Start by running Dia from the command line. Among other screens which are relatively extraneous, you'll see a screen like the following:

    That's the Dia toolbox. From the menu, click file, then new, and you'll be brought to a blank page. Right click the blank page, choose file, then save as, and save it as blank.xml.gz. Now close the drawing by right clicking the empty drawing and choosing close.

    Remember, Dia saves its drawings as gzipped xml files. View blank.xml.gz with the following command:

    zless blank.xml.gz
    You'll see an XML file whose document element is <diagram>  (with a namespace appended -- well discuss this much later). Second level element are <diagramdata> and <layer>. Examine the <layer> element's XML code:
    <layer name="Background" visible="true"/>
    There's no end tag. In XML, when an element contains no subelements or text nodes, the start tag and end tag would butt up next to each other. To enhance readability in such cases, XML syntax allows a forward slash before the ending angle bracket of the start tag to denote an end tag. The layer element has two attributes, name, with value "Background", and visible, with value "true". Remember that none of these strings are XML reserved words.

    In the case of the <diagramdata> element, it has tons of subelements, most of which are <attribute> elements (this is not an XML reserved word). As you can see, there's an <attribute> element for the drawing's background, an <attribute> element for the "paper" used with the drawing (size, margins, portrait/landscape and the like), an <attribute> element for the grid to be used, and an <attribute> for something called "guides", of which there's apparently a horizontal and a vertical instance. People hear me well, a lot of the Dia application is specified by this layout, and this layout is extremely readable. Behold the power of XML!

    You'll notice a couple other things.  <attribute> elements contain other <attribute> elements (or don't, as the individual elements data dictates). XML allows storage of very freeform data. You'll also notice a <composite> element. This is intended as a container for multiple elements.

    What's in an Ellipse

    We're going to draw an ellipse, save it as ellipse.xml.gz, and then compare it with blank.xml.gz. The result will be the Dia application's XML representation of an ellipse.
      Now use the following commands to view the difference between blank.xml.gz and ellipse.xml.gz:
    $ gunzip ellipse.xml.gz blank.xml.gz
    $ diff ellipse.xml blank.xml | less
    You get something like the following:
     
    58,76c58
    <   <layer name="Background" visible="true">
    <     <object type="Standard - Ellipse" version="0" id="O0">
    <       <attribute name="obj_pos">
    <         <point val="1.3,3.6"/>
    <       </attribute>
    <       <attribute name="obj_bb">
    <         <rectangle val="1.25,3.55;6.9,5.95"/>
    <       </attribute>
    <       <attribute name="elem_corner">
    <         <point val="1.3,3.6"/>
    <       </attribute>
    <       <attribute name="elem_width">
    <         <real val="5.55"/>
    <       </attribute>
    <       <attribute name="elem_height">
    <         <real val="2.3"/>
    <       </attribute>
    <     </object>
    <   </layer>
    ---
    >   <layer name="Background" visible="true"/>    

    Look it over for a second. All that happened was a single <object> element, whose type attribute has value "Standard - Ellipse", has been inserted into the <layer> object whose name attribute has value "Background". The <object> element contains several <attribute> elements describing all the "attributes" you'd expect of an ellipse, such as position (X and Y coords), the top left corner (X and Y coords), the width and the length. There's also an <attribute> element called obj_bb which is the four points comprising the bounding box of the object. It's all very readable.

    Notice there's no color listed? Let's give the ellipse a fill color and observe the change.

    How Colors are Implemented

    First, be sure to gzip ellipse.xml:
    gzip ellipse.xml
    Now open ellipse.xml.gz in Dia. Drag a rectangle around the ellipse to select it without the risk of moving it. Now right click the ellipse, choose dialogs, then properties. Click the color bar next to "Fill colour", and crank the blue all the way down until it's a pure yellow. Now click the color bar next to "Line Colour", and crank up the blue until the line is pure blue. Right click the drawing, choose File/save as, and save the drawing as colors.xml.gz. Finally, click the drawing, choose file/close to close the drawing.

    Now use the following commands to view the difference between blank.xml.gz and ellipse.xml.gz:

    $ gunzip colors.xml.gz ellipse.xml.gz
    $ diff colors.xml ellipse.xml | less
    You get the following:
     
    75,83d74
    <       <attribute name="border_width">
    <         <real val="0.1"/>
    <       </attribute>
    <       <attribute name="border_color">
    <         <color val="#0000ff"/>
    <       </attribute>
    <       <attribute name="inner_color">
    <         <color val="#ffff00"/>
    <       </attribute>  

    It's simple to see what happened. An <attribute> element with attribute name having value "inner_color" was created with a subelement called <color>, with a val attribute whose value is"#ffff00" (pure yellow), to describe the fill color. An <attribute> element called "border_color" was created with a subelement <color> with attribute val valued at "#0000ff" (pure blue), to describe the line color. And an <attribute> element called "border_width" with a subelement called <real>, whose val attribute has value at "0.1". Note that when I say the <attribute> elements were called such and so, what I really meant was that they had an XML attribute called name, and the value of that attribute was such and so.

    If you're like me, you wonder why a border width entity was created. I'd guess that there was no border until you specified its color.


    : NOTE :

    Look what the application has done. Every property of the ellipse is described with an <attribute> element.  They could have had special elements called <border_width> and the like, but they didn't.  Likewise, they have a subelement to describe the value of the property. Each such subelement has a name corresponding to what is being measured, and a value corresponding to the actual value of the property. Why did they do this? they could have just as easily done something like this:

    <border_color units="color" value="#0000ff"/>
    But that wouldn't have been as generic. What the authors of Dia have done is to create a system where any property can be described, and all properties can be read into the app. This is how the pros do XML.


    Anyway, you now see how it handles colors. We've done quite a bit of work manipulating Dia and noting the result in XML. Now let's go the other way.

    Modifying a Drawing with a Text Editor

    Because you gunzipped colors.xml.gz, you now have an XML file called colors.xml. Using your favorite text editor, edit that file. Pay particular attention to the following two elements, which might not be next to each other in your experiment:
        <attribute name="elem_width">
         <real val="5.55"/>
        </attribute>
        <attribute name="elem_height">
          <real val="2.3"/>
        </attribute>
    As you remember, you made the ellipse much wider than it was high. That's why the elem_width is much bigger than elem_height. Using the cut and paste of your text editor, carefully exchange the values associated with elem_width and elem_height, and then resave the file. If things go as expected, pulling the diagram up in Dia should now show an ellipse higher than wide.

    Naturally, you need to gzip the file again:

    gzip colors.xml
    And finally pull the drawing up in Dia. And sure enough, the ellipse is now higher than wide (if not, troubleshoot).

    Creating a Dia Exploration Script

    We really didn't need to go through all the gzipping and gunzipping, file save and file close and text editing and lessing. We did that just to minimize the extraneous variables so you could see the exact effects of tiny changes in the drawing. Now it's time to make a script to quickly alternate between the graphic and text view of drawings, with the ability to change in either view and view the changes in the other. Here's the script:
     
    resp='y'
    echo $resp
    while test "$resp" = "y"; do
       dia test.xml.gz
       rm test.xml
       gunzip test.xml.gz
       vi test.xml
       gzip test.xml
       echo -n "Do it again? (y/n)===>"
       read resp
    done

    Save the preceding script as rdia and chmod rdia as executable by all (chmod a+x rdia).


    : NOTE :
     
    If you don't like the VI editor, substitute the name of your favorite Linux editor for vi in the script
    This script won't function if there doesn't exist a test.xml.gz, so before using the script go into Dia, create a blank drawing, and save it as test.xml.gz. Finally, run the script and  experiment editing both the XML text and the Dia graphics, and note how changes in one environment appropriately change the other.
     
    ! CAUTION !

    This script will not procede to editing the XML file until you completely exit the Dia application. You exit Dia after saving your work by clicking the close icon on the Dia toolbox.

    If you "gum up" the XML so badly that you can't pull up the file in Dia, simply create a new blank test.xml.gz in Dia.

    Exploring Shapes and Connectors

    While in the rdia loop, make a drawing with a single ellipse, a single rectangle, a single triangle (make it with the  button), and a single line. View it in the XML view, noting how each shape is specified in XML. Feel free to move and magnify things. Go back and forth. Have fun.

    Who's on Top?

    In Dia, make a yellow ellipse on top of a blue rectangle. Make sure the yellow ellipse is not completely inside the blue rectangle, and that the yellow ellipse doesn't completely cover the blue rectangle. If you have trouble putting the yellow ellipse on top, right click the yellow ellipse, choose objects, and then click bring to front. The yellow ellipse will now be on top. You should be able to see parts of the blue rectangle below it, and the yellow ellipse should not be entirely inside the blue rectangle. Save and exit Dia.

    Once in the XML file, you'll notice that the blue box object appears before the yellow ellipse object. That's intuitive, because objects appearing later get thrown on the canvas "on top of" existing objects. To test this theory, cut the XML for the yellow ellipse object, and place it below the XML for the blue box object. Save the file and continue. If everything's gone right you should now see the blue box on top of the yellow ellipse.

    See the beauty of XML. A concept like "how do I signify which objects are on top of which others would normally be difficult to implement. But if the app stores its info in XML, it's a no-brainer.

    Notice that all objects are inside a single layer object. If you want to have a little fun, within the Dia environment send different objects to different layers, then view the results in XML.  Note that I've had cases where seemingly correct changes to layers caused Dia not to load the file, and I've even seen where simply saving the file in VI caused Dia not to load the file. The good news is I've always been able to correct this type of problem by deleting the new layer in VI, after which  Dia would load the file. When all is working well, you can manipulate layers in the XML file and have the results show up exactly as expected in Dia.

    Experiment. The possibilities are endless.

    A Little Theory

    In your editor, go to the top of the XML file and note the following:
     
    <?xml version="1.0"?>
    <diagram xmlns:dia="http://www.lysator.liu.se/~alla/dia/">

    The first line is the XML Declaration, and basically gives the XML version. The second line declares a namespace, called dia, and equates it with the URI http://www.lysator.liu.se/~alla/dia/.  Note that I said URI, not URL. There's a subtle difference. But anyway, don't expect to find anything at that URI. It would be coincidence if you did.

    The second line is a namespace declaration. It declares a namespace called dia, associating it with the unique identifier "http://www.lysator.liu.se/~alla/dia/". The reason URI's are used is because they are the best hope for a unique identifier worldwide. For instance, if I were to author a new XML file and wanted to give it a new namespace, I could name it after a directory on Troubleshooters.Com, knowing that Troubleshooters.Com is mine to control. Of course, this doesn't stop someone else from using Troubleshooters.Com as part of their unique identifier, but that would be very bad ettiquette.

    A namespace is simply an "area" or "scope" (for want of better words) within which each name is guaranteed unique. This is important as Internet enabled apps use more and more XML files from more and more sources. The basic idea is that all elements could be prepended with dia:, in which case, for instance, <dia:attribute> would be differentiated from, let's say, <umenu:attribute>.

    For the time being this is isn't important in the learning process, but remember it in case you later see this syntax. And remember, you WILL NOT find a DTD or schema for the XML file at the URI. The URI is just a method of unique identification.

    Exploring Groups

    So much for theory -- back to action. Here we'll explore groups of objects in Dia. Run your ./rdia script, and delete all objects from the drawing using the Ctrl+X keystroke combination. Also, if you created extra layers in prior exercises, be sure to delete them. Right click anywhere on the drawing, choose dialogs, then layers. The X button is what enables you to delete a layer. Be sure not to delete the original layer, which is probably called Background.

    Now draw yourself an ellipse (wider than high) and a rectangle (wider than high). Now drag a band around both to select them, right click on either object, select objects then group, and note that instead of both objects being selected, the pair of objects is selected. You cannot select one object without selecting both, and when you move one both move. Save the drawing and quit Dia.

    Viewing it XML, you see that the two objects are now between a <group></group> pair of tags. The complex task of grouping objects is handled just that simply.

    Go back to Dia, select the group, right click either object in the group, and choose objects and ungroup. The objects become two separate objects now. Save the drawing and exit Dia. Looking at the XML, you'll notice everything's the same except the <group></group> pair of tags is gone.

    Exploring Connection Drawings

    Let's make a real diagram and see how it's implemented in XML. We'll be creating a diagram similar to the following:
    Diagram: Battery, resistor and zener

    First erase everything from the drawing with the Ctrl+X keystroke combination. Next, draw the battery, resistor and zener diode using their buttons from the Circuit template group, which will probably be the default. The buttons for battery, resistor, and zener are, respectively: , and . Each of these graphic circuit components have connection points at their leads, so use the Zig Zag Line button () to create zig zag lines, clicking on one connection point and dragging to the one on the next electronic component. Try to arrange the components so they look something like the drawing above. When you have something resembling the preceding drawing, save it and exit Dia to see the XML you've created.
     



    : NOTE:
    If you're really having trouble drawing this drawing in Dia, click here.


    First, notice that you've created the following objects:

    <object type="Circuit - Vertical Powersource (European)" version="0" id="O0">
    <object type="Circuit - Horizontal Resistor" version="0" id="O1">
    <object type="Circuit - Vertical Zener Diode" version="0" id="O2">
    <object type="Standard - ZigZagLine" version="0" id="O3">
    <object type="Standard - ZigZagLine" version="0" id="O4">
    <object type="Standard - ZigZagLine" version="0" id="O5">
    Observe that each object has an id attribute with values from "O0" through "O5". That fact will come in handy investigating the mechanics of line to circuit component connections. Note that depending on the order in which you placed components and lines, your ID numbers may vary.

    Next, notice that each Zig Zag line has a <connections> element, containing two <connection> elements. The following code shows the <connections> elements for the first, second, and third zigzag lines respectively:

    <connections>
      <connection handle="0" to="O0" connection="0"/>
      <connection handle="1" to="O1" connection="0"/>
    </connections>
    <connections>
      <connection handle="0" to="O1" connection="1"/>
      <connection handle="1" to="O2" connection="0"/>
    </connections>
    <connections>
      <connection handle="0" to="O2" connection="1"/>
      <connection handle="1" to="O0" connection="1"/>
    </connections>
    So let's describe each zigzag line's connections in English. Line 1 connects to handle 0 of object O0 (the battery), and also to handle 1 of object O1 (the resistor). It connects the battery to the resistor. Zigzag 2 connects to handle 0 of object O1 (the resistor, and please remember that handle 1 of the resistor is already taken), and also to handle 1 of object O2, the zener. It connects the resistor to the zener. Zigzag 3 connects to handle 0 of object O2 (the zener), and handle 1 of object O0 (the battery). It connects the zener to the battery, completing the circuit.

    You can actually modify the XML to put the program in an illegal state. On the following line:

      <connection handle="0" to="O2" connection="1"/>
     Change the 0 to 1 in the handle="0" attribute, save, and pull it up in Dia, and note that everything looks fine. Now click on the zener diode, and note that Dia aborts. Change the 1 back to 0 and confirm that the illegal state has been taken care of.

    Basically, what happened is that a line with nonzero length had its begin and end handles at the same point -- an error condition in both Dia and mathematics.

    That brings up an interesting hypothesis. Perhaps we should strive to make XML apps so mutable that you can't put them in an illegal state by editing the XML. Such an app would indeed have all its logic in the XML file, and the executable app would merely be a viewer. Perhaps that hypothesis is a little over the top, but it's an interesting thought.

    Creating a new Template Shape

    Indeed, part of Dia's logic is in its XML files. The template shapes are determined by two XML files and one file with a dot picture. The XML files determine the properties of the template shape -- its image and its connection points. The dot file determines the icon on its button. During this exercise we'll add a new template shape to the beginning of the circuit template group. We'll call that new shape a smily, and it will be a smily face, in a rectangular head, with a connection point at each ear. These are the four steps to making this new shape:
    1. Find the Dia shared directory
    2. Create the icon, shapes/Circuit/smily.xpm
    3. Create the smily shape, shapes/Circuit/smily.shape
    4. Add the Smily Face shape to the Circuits template group, sheets/Circuit.sheet
    #2 creates the icon. #3 defines the shape, and associates it with the icon. #4 incorporates the new shape in the Circuits template group (sheet).
     

    Find the Dia shared directory

    It's probably going to be /usr/share/dia, and it will have two subdirectories, shapes and sheets. The best way to find it is to find Circuit.sheet, which resides in the sheets directory under the Dia shared directory. First try the locate command:
    $ locate Circuit.sheet
    If that doesn't produce results, do a brute force search through the /usr tree:
    # find /usr -type f | grep "Circuit\.sheet"
    /usr/share/dia/sheets/Circuit.sheet
    /usr/src/RPM/SOURCES/dia-0.86/sheets/Circuit.sheet
    #
    In the preceding example, the Dia shared directory would be /usr/share/dia.

    Create the icon, shapes/Circuit/smily.xpm

    This file specifies the appearance of the icon on the new shape's button. Copy and paste the contents of the following box to a file called smily.xpm in the Circuit directory below the Dia shared directory:
     
    /* XPM */
    static char * smily_xpm[] = {
    "22 22 3 1",
    "       c None",
    ".      c #000000",
    "+      c #FFFFFF",
    "                      ",
    "                      ",
    "                      ",
    "                      ",
    "                      ",
    "                      ",
    "    ..............    ",
    "    .            .    ",
    "    .  .      .  .    ",
    "    .            .    ",
    "    .     .      .    ",
    "    . .       .  .    ",
    "    .  ..   ..   .    ",
    "    .    ...     .    ",
    "    ..............    ",
    "                      ",
    "                      ",
    "                      ",
    "                      ",
    "                      ",
    "                      ",
    "                      "};

    As you can see, it's really just a dot picture of the icon to be displayed, plus the size and a few other properties.

    Create the smily shape, shapes/Circuit/smily.shape

    This is the specification of the new shape. Copy the contents of the following box to a file called smily.shape in the Circuit directory below the Dia shared directory:
     
    <?xml version="1.0"?>
    
    <shape xmlns="http://www.daa.com.au/~james/dia-shape-ns"
           xmlns:svg="http://www.w3.org/TR/2000/03/WD-SVG-20000303/DTD/svg-20000303-stylable.dtd">
      <name>Circuit - Smily Face</name>
      <description>A smily face to brighten your day</description>
      <icon>smily.xpm</icon>
      <connections>
    
        <point x="0" y="4.5"/>
        <point x="13" y="4.5"/>
      </connections>
      <svg:svg width="3.0" height="3.0">
        <svg:polygon points="0 0,13 0,13 9,0,9"
              style="fill: default" />
        <svg:polygon points="2.5 1.5,3.5 1.5,3.5 2.5,2.5 2.5" style="fill: inverse" />
        <svg:polygon points="9.5 1.5,10.5 1.5,10.5 2.5,9.5 2.5" style="fill: inverse" />
        <svg:polygon points="6.0 4.5,7.0 4.5,7.0 5.5,6.0 5.5" style="fill: inverse" />
        <svg:polygon points="1.0 4.5,3.0 5.5,6.5 6.5,10.0 5.5,12.0 4.5 
                             12.0 5.5,10.0 6.5,6.5 7.5,3.0 6.5,1.0 5.5"
                                  style="fill: inverse" />
      </svg:svg>
    </shape>

    The document element is <shape>. It has the following subelements:
    Subelement Function
    <name> The name by which this shape is known.
    <description> A human readable description of the shape.
    <icon> The Icon file associated with the shape, in this case the one you made earlier.
    <connections> The connection points for persistant line connections.
    <svg:svg> The actual shape information, built from further subelements which are geometric shapes such as polygons.

    Now you have a shape file describing your new template, and associating it with an icon. The last step is to inform the Circuit template group (sheet) that this shape has been added...

    Add the Smily Face shape to the Circuits template group, sheets/Circuit.sheet

    There's a file called Circuit.sheet in the sheets directory under the Dia shared directory. Copy that file to Circuit.sheet.org in the same directory. Now if you hopelessly mess up Circuit.sheet it won't be necessary to reinstall Dia to fix the mess. You can simply copy the original file back.

    Now open Circuit.sheet with your favorite text editor. You'll note that the document element is <sheet>, with several <description> subelements, each of which gives a description in a different language. It also has one <contents> subelement. That's where the rubber meets the road, because <contents> contains many <object> subelements, each of which describe a template shape. If your file has not been modified, the first <object> will be named "Circuit - Vertical Resistor". You're going to insert the smily object before the vertical resistor. Simply copy the contents of the following box between the <contents> tag and the first <object> tag:
     
        <object name="Circuit - Smily Face">
          <description xml:lang="no">En smilyface</description>
          <description xml:lang="fr">Une smilyface</description>
          <description xml:lang="de">Eine smilyface</description>
          <description>A smilyface</description>
        </object>

    The value of the name attribute of the <object> element must be spelled, capitalized and spaced exactly as shown. That's because that value is how Dia finds the shape file you previously created. In fact, this text must exactly match the <name> text in the shape file. Once you've completed this, quit Dia, restart it, and you should see the smily icon on the Circuit template. Click it, and drag on the drawing to make a smily. Connect zigzag lines to it, put it into a circuit. Save, quit, and see the results in XML. Within XML, experiment with changing the boolean value of the <attribute> elements whose name attributes are flip_vertical and show_background, and view the results in Dia.

    This is XML at its best. You've just given an application a new capability, using only a text editor.

    Summary

    The exercises in this article were long, and possibly tedious. But they were worth it, because if you did them, you now know XML intellectually and intuitively. You've seen an XML dialect that has been specified for maximum adaptibility. You've seen how the masters put as much of the implementation as possible in the XML, rather than in the executable app. You've seen true round trip development between a GUI and a text editor environment.

    Allow your imagination to relish all the possibilities. Imagine how XML would help you write that app you always wanted to write but never figured out how.  With XML, the only limit is your imagination.

    Congratulations. You've learned as much XML as you can get from reading. Now it's time to do. You've graduated. The next article walks you through writing your own XML Hello World. Enjoy.
     
     

    In this Article You Have Learned
    • How to use Dia to learn good XML construction.
    • Dia is a vector drawing package that stores its drawing information in XML format.
    • Modifying the drawing modifies the XML, and Modifying the XML modifies the drawing..
    Steve Litt is the author of  Troubleshooting Techniques of the Successful Technologist".  He can be reached at Steve Litt's email address.

    XML Coding Exercises

    The XML coding exercises are on a different page. This tutorial is most effective if at this point you perform the XML Coding Exercises, in order, at this point. When you are done with all the XML coding exercises, continue down from here. The XML Coding Exercises page has a link to come back to this point.

    Wrapup

    By Steve Litt
    It's been fun, hasn't it. Starting with XML basics such as terminology, XML construction and syntax, you moved to XML application architecture (with a DOM accent) followed by DOM tree navigation.

    You spent some time with the Dia diagramming application to see how it wrote XML, and how it interpreted the XML changes we wrote to its files. You observed Dia as a benchmark for XML file design.

    Then it was time to code. Starting with a "Hello World" DOM app, continuing with a DOM walker, and then building a DOM Document object in memory from scratch. You then wrote the code to write that DOM Document out to a file (actually to stdout) as XML. You accessed DOM elements and attributes by name.

    YOUR "Hello World" SAX  app quickly progressed to a SAX tree walker, then a SAX Explorer complete with a subclass of ErrorHandler, and finally SAX with per-record DOM Documents for a combination of small footprint and random access within each record.

    You then thoroughly investigated DTD's, ending up with a methodology to build a DTD from an existing XML file.

    So kick back and put your feet up. You did great. If you started out being intimidated by XML, you certainly aren't any more. You now have enough XML knowledge to quickly read and understand intermediate and advanced XML books, specification documents from standards bodies, and XML websites.

    If an XML opportunity presents itself at work tomorrow, you can step forward with confidence. Sure, you don't know everything of XML. You know only a small part. But you've learned how to code XML, and more important you're going to learn where to find the additional information you need in order to do whatever XML task presents itself.

    Read on...

    Steve Litt is the developer of The Universal Troubleshooting Process troubleshooting courseware. He can be reached at Steve Litt's email address.

    Where to Go From Here

    By Steve Litt
    If you've done all the exercises you're pretty comfortable at XML, and could probably take on an XML project at work tomorrow. Does that mean you know all necessary XML information? Not even close.

    We covered the tip of the iceberg. Just the basics of reading and writing XML from a program. There's a world of other XML knowledge to gain. There are excellent XML processing models besides SAX and DOM. Schemas offer an alternative to DTD validation. There's the entire discipline of rendering, complete with XSL, XSLT, and XML frameworks like Cocoon from Apache Software Foundation. There are specific XML varients such as SVG for vector graphics, as well as chemical and mathematical markup languages. There's even a markup language called XML-RPC for remote procedure control. And there's much, much more.

    Luckily, the information is easy to find, and you now have the skills to exploit it. This article gives some web resources and a couple books that you'll find helpful.

    Here are some URL's that will help you with specific projects. And we definitely covered enough so that computer programs you write can access and work with XML. Without the info in this tutorial, going farther would have been folly.But there's a world of other XML available to you:

    1. XML Schemas: http://www.w3.org/XML/Schema.html
    2. XML processing models other than SAX and DOM:
    3. XSL: http://www.w3.org/Style/XSL/
    4. XSLT: http://www.w3.org/TR/xslt
    5. Cocoon: http://xml.apache.org/cocoon/
    6. XML-RPC: http://www.xmlrpc.com/

    Excellent XML Books

    Beware. Most XML books talk about little else except XML syntax and validation. Most of the XML books out there don't understand that programmers learn by programming, not by memorizing YAAFAX (Yet Another Arcane Fact About Xml). Most XML books out there devote several chapters to HTML, SGML, browser rendering, and blatantly Microsoft specific applications, while having little or no programming to show how a real programmer can manipulate and render XML.

    Most XML books out there are trash. That's why I wrote this tutorial -- to undo the damage done by those so-called XML books that did nothing but scare would-be XML programmers out of XML, before they started. I assume that by this point in this month's Troubleshooting Professional Magazine, you understand that XML is anything but rocket science.

    Yes, most XML books are trash. But in my travels I discovered some good ones, and one truly outstanding one.

    Java and XML by Brett McLaughlin: ISBN 0-596- 00016-2

    This is an astoundingly excellent book! Because it's intermediate level, once you've finished this tutorial you can graduate directly to this book.

    The first 8 chapters are built quite a bit like this tutorial, with code progressions to walk you through the process of learning the principles. McLaughlin starts the reader on SAX, then walks you through creating, parsing and interpreting DTD's and Schemas, and finally giving a thorough indoctrination in DOM and JDOM. Chapter 12, "Creating XML with Java", is also necessary to basic XML programming. All the material is thorough and rigorous, with programs done in a Javanically compliant way.

    From there he goes on to discuss all the Kewl things you can do with XML now that you know its principles, including XML publishing frameworks, XML-RPC, XML and Enterprise Java Beans, and finally business to business examples (think that might be a valuable skill?).

    This is the book I would have written if I knew as much XML and Java as McLaughlin.

    I don't say that lightly. In scope, depth, organization, and writing style, I find "Java and XML" quite similar to Samba Unleashed.

    If you have anything to do with XML, even if you're not a Java person -- get this book!

    XML Processing with Python by Sean McGrath: ISBN 0-13-021119-2

    This book was where I got my first real XML knowledge. It's an excellent book, especially if you consider Python easier than Perl and Java. It comes with a CD full of everything you'll need for the exercises in the book. There are chapters on DOM and SAX. If you're a Python programmer, this is the XML book for you.

    One word of warning: "XML Processing with Python" is heavily weighted toward Pyxie XML methodology, and the utilities and tools on the CD aren't those you'd typically download from python.org. So if you want to learn generic XML, especially language independent, that's a disadvantage. BUT, if you're a Python guy (and most of us are whether we admit it or not), the tools that come with this book, ESPECIALLY Pyxie, can have you up and running with XML in record time.

    The DOM Specification:

    One might say this isn't a book, but the Level 2 Core PDF is 107 pages. Taken together, they're as big as a book. And man, they're well written and informative. I didn't really understand XML until I read the DOM spec, and then it was obvious. Read it. URL's below:

    XML Devcon

    There comes a time when reading and performing exercises just aren't enough. You want to rub elbows with your peers. Luckily, there are two more XML Devcons this year. www.xmldevcon2001.com lists this year's remaining conferences as: This is put on by Camelot Communications, the same people who put on ApacheCon. I went to the 2000 ApacheCon in Orlando, and it rocked. These conferences sell out, so if you want to go to the New York conference, sign up soon at  http://www.xmldevcon2001.com/NY/html/registration.html. You can order a free Exhibit Pass for the New York event, good for all Expo Days that gets you into the exhibit floor, the Keynotes, the Technical Briefings, the Management Briefings, and All Special Events. Go to registration URL and click "For Free Exhibit Pass".
    Steve Litt is the author of Rapid Learning: Secret Weapon of the Successful Technologist. He can be reached at Steve Litt's email address.

    Apache Software Foundation and W3C Rule!

    By Steve Litt
    As you can imagine, writing this issue of Troubleshooting Professional took some time. And the more time I spent researching, the more obvious it became that the Apache Software Foundation and the World Wide Web Consortium are two of the most powerful software entities on earth. I think of them as the legislative and executive branch. W3C manages the creation of the specifications. And Apache Software Foundation maintains the actual projects.

    When it comes to standards based specs, look what W3C has to offer. They offer the standard specifications for XML, XSL, XML Schemas, DOM, HTML, SVG (Scalable Vector Graphics varient of XML), Cascading Style Sheets. There's a working draft for XQuery -- a query language to extract info from XML docs. They have a working draft of the WAI -- the Web Accessibility Initiative. I've just scratched the surface. And best of all, these are *standards*. They won't be changed or kidnapped at the whim of a corporation.

    As I researched for this month's magazine, it started looking like whatever W3C recommends, Apache Software Foundation builds or maintains. Sometimes the projects are initiated at corporations, but ASF has a reputation for running Open Source projects, so when the originators want to leave their project in good hands and move on to other things, they leave it to ASF. And of course, many ASF projects start at ASF. During the writing of this magazine, I saw so much kewl stuff from ASF that I almost forgot they're the source of the worlds most popular web server.

    I'd like to take a quick look at just a few of the software tools you can download from the Apache Software Foundation website.

    Near and dear to my heart is Xerces, the "parser" that made it possible to do this tutorial. The reason I put quotes around the word is Xerces can do things far beyond mere parsing. It contains the entire DOM interface, and all sorts of other things. And every bit of it works consistently, exactly like you'd expect it to. I had forgotten how much fun it is to work with software tools so solid you can spend your mental effort in design, rather than workaround. I used Xerces for Java, but there appear to be versions for C++ and Perl. And according to an email I just got, the Perl version now works with Linux.

    Xalan is an XSLT processor for transforming XML documents into HTML, text, or other XML document types.

    Cocoon is an XML framework. The way I interpret their description, you author your content in XML, which has no appearance component. There's then a logic component, which I don't understand, and finally an XSL component to map each XML entity to an appearance. The way I read this, the subject matter expert can write his content without having to worry about appearance. Obviously I don't understand the full picture. You might want to have a look yourself. It sounds mighty powerful.

    What can I say about SOAP: :-) :-) :-). My understanding is that Microsoft originated SOAP, which is a lightweight data exchange mechanism for distributed computing. Now it's been submitted to W3C. IBM has implemented it, and as you can see Apache supports it. The way it looks to me, the community intercepted Microsofts pass and scored a touchdown.

    Then there's Batik: Whoooaaa! This is a series of core modules to work with the SVG (Scalable Vector Graphics) XML dialect. Something very similar to the data format of Dia. Batik works with Java. Imagine being able to draw a picture in a Vector Drawing Program (vector drawings consume much less bandwidth than their bitmap graphics cousins), and have it visible in any browser with the proper Batik plugin! Ya know, I'm sick of creating diagrams in Dia and then having to tweak them in Gimp to show them to the world. I'm not saying Batik can do that yet, but it's what crossed my mind when I read what it is. The W3C tested six SVG implementations, including Adobe and JASC (the Paintshop Pro people), and Batik did exceptionally well in all areas except animation. See the results at http://www.w3.org/Graphics/SVG/Test/BE-ImpStatus.

    Summary

    What do you think of when someone utters the phrase "the most powerful software entity of our time". Until a few days ago I thought of the crumbling but still mighty empire. And if money were the measure of power, Microsoft would still be the most powerful. But if quality, reliability, and staying power are the measure, W3C and ASF far surpass mere corporations.

    So the next time you get an assignment in "new technology", your very first move should be to check W3C for a recommendation or working draft, and to check Apache Software Foundation for an implementation. After all, nobody wants to reinvent the wheel. And certainly we wouldn't pay a corporation for a reinvented imitation of the free wheel at W3C or ASF.

    Steve Litt is the documentor of The Universal Troubleshooting Process. He can be reached at Steve Litt's email address.

    Thanks and Acknowledgements

    By Steve Litt
    A guy named Andrew Anderson works at Red Hat. He's also an active member of the mailing list at my local LUG, Linux Enthusiasts and Professionals (LEAP). In November 2000, when I knew absolutely nothing about XML, Andrew posted some XML on Leaplist, and answered my questions about that code to the point where I could begin my XML studies. Thanks Andrew!

    Another Leapster named Aaron Wadley took the time to tell me what he'd learned from some professional XML apps he'd done. Thanks Aaron!

    A third Leapster, Scott Porter, chuckled and called me wrong when I expressed the opinion that it's easier to learn using Python than Java. Scott's a smart guy whose opinion you must take seriously, especially in matters of education. He's Dean of Business and Technology at the DeVry Institute campus in Orlando. So when I hit a brick wall with Perl::DOM, I investigated Java with an open mind, forgetting my experience writing a gui applet (the Symptom Description Wizard) before Swing existed. Scott was right. Java is a very straightforward, consistant language with a remarkably short debug cycle. I used Java to demonstrate XML, and the results speak for themselves. Thanks Scott.

    A big thank you goes out to LEAP itself. LEAP has served as an incubator for so many of my ideas. LEAP is a parallel processing superbrain anyone can join. Thanks LEAP.

    My wife Sylvia is a former grade school teacher. She's always told me the bottleneck in my teaching techniques is not telling the student what he's going to learn, then teaching it, then telling him what he learned. So I did that for every tutorial article in this magazine. Thanks Sylvia.

    The Apache Foundation maintains the Xerces package, which includes validating parsers, DOM, SAX, and tons of other stuff. Xerces works right out of the .tar.gz, every time. It's what software should be. The Apache Foundation maintains many other astounding XML products, including the Coccoon XML Framework. Oh, and I guess I should mention that they maintain the most used web server software on the planet, Apache. Thanks Apache Foundation.

    Where would we be without the W3C? What would we do without a solid, well written specification for what is and isn't XML, and what is and isn't DOM? I guess each year we'd use whatever definition the biggest commercial player happened to use that year. Thanks World Wide Web Consortium.

    A guy named Brett McLaughlin wrote a killer book called "Java and XML". In any subject area Brett's writing, scope and depth would be exceptional, but in the XML world, where most books are useless, "Java and XML" can make the difference between learning and ignorance. Thanks Brett.

    A big thank you goes out to everyone who has made the free software community what it is -- Richard, Linus, Eric, Maddog, Larry, Guido, Tim, and literally thousands of others.

    Last and most, thanks to all you Troubleshooting Professional Magazine readers, for your critiques, suggestions, contributions and encouragement. This is truly your magazine.

    Steve Litt is the author of Samba Unleashed. He can be reached at Steve Litt's email address.

    Linux Log: Open Source Means Quality Education

    Linux Log is a regular column in Troubleshooting Professional Magazine, authored by Steve Litt. Each month we'll explore a facet of Linux as it relates to that month's theme.
    A couple nights ago President Bush pitched his new budget to Congress. Time and again, he emphasized education. We don't have infinite money, but we're going to improve education for every child. The president is very concerned that no child, regardless of race or economics, fall through the cracks. That's the president's job.

    My job is to see that adults get educated. In a world where technical learning obsoletes every 18 months, it's all too easy to fall off through the cracks. Especially if you're old, poor, or minority. This is one of the reasons I wrote "Rapid Learning: Secret Weapon of the Successful Technologist". And it's one of the reasons I'm so enthusiastic about Open Source and Linux.

    College costs a fortune. Only slightly less are trade schools, corporate training courses, certification preparation courses, and even those self-guided courses on CD.

    And then there's the bargain of the century, my Linux distribution. Just for today, don't think of it as an operating system. Think of it as a voluminous reference work and a killer computer lab, on a CD costing $29.00.

    Let me ask you a question. You don't think I really knew all this XML information when I started writing this TPM issue 15 days ago, do you?

    My Java was mighty rusty. No problem, the entire (voluminous) JDK documentation is contained in /usr/java/docs.  I simply navigated to it with Konqueror, and looked up what I need. Sure, I could have gone online, but at 56K dialup, it sure is nice to have it on disk. Java books are great for learning Java, but when it comes to a class and method reference, nothing beats the JDK docs that come on your Linux distro.

    I knew DOM pretty much by heart, but nothing else. SAX? Not a bit. DTD's? I knew their purpose, and I had some books giving partial syntax that wouldn't validate anything, but I'd never used it. Oh, and my Java was mighty rusty. Using my Linux distro, the Internet, and my Rapid Learning Process, I learned almost as fast as I could write.

    Try the following command:

    $ find /usr -type f | grep -i xml | less
    I found literally hundreds of files, many of which contained valuable XML information. I found docs on the SAX functions for Gnome, parsing info for QT, and for KDE's KXMLGUIBuilder, KXMLGUIClient, and KXMLGUIFactory classes.  I found the XML functions for PHP.

    But the educational value of a Linux distro goes far beyond Java and XML. Try some of the following:
    Subject Command
    DNS find /usr -type f | grep -i named | less
    Samba find /usr -type f | grep -i samba | less
    find /usr -type f | grep -i smb | less
    Apache find /usr -type f | grep -i apache | less
    find /usr -type f | grep -i httpd | less
    Java find /usr -type f | grep -i java | less
    C++ find /usr -type f | grep -i c++ | less
    find /usr -type f | grep -i gcc | less
    Perl find /usr -type f | grep -i perl | less
    Python find /usr -type f | grep -i python | less
    LDAP find /usr -type f | grep -i ldap | less
    SSL find /usr -type f | grep -i ssl | less
    The preceding list could go on forever, but you get the idea. The output of these commands will give you ideas for similar commands you can run until you find what you're looking for. And don't forget the apropos and man commands.

    Your Linux distro pays for itself many times over as a reference. But wait, there's more! If you order your Linux distro now, you also get...

    A computer lab! That's right, a UNIX workalike where YOU have the root password. I ask myself how far I would have gotten trying to learn web serving, TCP/IP, DNS, sockets, and the like without Linux. The answer is self evident. I've been in "IT" for 17 years, but until I got my first Linux distro in November 1998 I had to ask my network administrator how to solve connectivity problems. Now I can solve my own problems (and other people's too).

    But what about Java XML? Java and Xerces are available on Windows. Why didn't I use Windows?

    In a word, BSOD (Blue Screen of Death). When I'm learning, I have enough variables to consider without having to worry if what I'm seeing is a result of an operating system quirk, rather than a facet of the technology under investigation. Technology just seems to work as advertised on Linux. That can't always be said of Windows.

    Then there's the issue of democratization of education. Doesn't it seem like proprietary software is like the camel -- first his nose is in the tent, and then the whole camel is in the tent. Theoretically Windows is cheap, but somehow it always ends up sucking up big bucks. Once you buy your Linux distro, your spending days are over, but your learning days have just begun. And oh yeah, you can run Linux on cheaper hardware. We recently inaugurated a president who many refer to as "the education president" -- a man constantly campaigning for the democratization of education. Could it be that Linux is now politically correct?

    So the next time one of your Windows buddies refuses to look at Linux, try a different approach. Tell him for $29.00 he can get a CD to teach him how to get ahead in the Windows world, and how to do better on his certification tests. Tell him that CD is a wishing well -- wish to know anything and he can find it there. Just don't tell him it's Linux.

    The days of throwing money at education hoping it will somehow improve are over. We heard it the other night. These are times of accountability in education. In the spirit of the day, if the certification mills and the colleges can't do the job, people must have an alternative. A Linux distribution CD is part of that alternative. And it's so affordable vouchers aren't necessary.

    Steve Litt is a member of Linux Enthusiasts and Professionals of Central Florida (LEAP-CF). He can be reached at Steve Litt's email address.

    Letters to the Editor

    All letters become the property of the publisher (Steve Litt), and may be edited for clarity or brevity. We especially welcome additions, clarifications, corrections or flames from vendors whose products have been reviewed in this magazine. We reserve the right to not publish letters we deem in bad taste (bad language, obscenity, hate, lewd, violence, etc.).
    Submit letters to the editor to Steve Litt's email address, and be sure the subject reads "Letter to the Editor". We regret that we cannot return your letter, so please make a copy of it for future reference.

    How to Submit an Article

    We anticipate two to five articles per issue, with issues coming out monthly. We look for articles that pertain to the Troubleshooting Process, or articles on tools, equipment or systems with a Troubleshooting slant. This can be done as an essay, with humor, with a case study, or some other literary device. A Troubleshooting poem would be nice. Submissions may mention a specific product, but must be useful without the purchase of that product. Content must greatly overpower advertising. Submissions should be between 250 and 2000 words long.

    By submitting content, you give Troubleshooters.Com the non-exclusive, perpetual right to publish it on Troubleshooters.Com or any A3B3 website. Other than that, you retain the copyright and sole right to sell or give it away elsewhere. Troubleshooters.Com will acknowledge you as the author and, if you request, will display your copyright notice and/or a "reprinted by permission of author" notice. Obviously, you must be the copyright holder and must be legally able to grant us this perpetual right. We do not currently pay for articles.

    Troubleshooters.Com reserves the right to edit any submission for clarity or brevity. Any published article will include a two sentence description of the author, a hypertext link to his or her email, and a phone number if desired. Upon request, we will include a hypertext link, at the end of the magazine issue, to the author's website, providing that website meets the Troubleshooters.Com criteria for links and that the author's website first links to Troubleshooters.Com. Authors: please understand we can't place hyperlinks inside articles. If we did, only the first article would be read, and we can't place every article first.

    Submissions should be emailed to Steve Litt's email address, with subject line Article Submission. The first paragraph of your message should read as follows (unless other arrangements are previously made in writing):

    I (your name), am submitting this article for possible publication in Troubleshooters.Com. I understand that by submitting this article I am giving the publisher, Steve Litt, perpetual license to publish this article on Troubleshooters.Com or any other A3B3 website. Other than the preceding sentence, I understand that I retain the copyright and full, complete and exclusive right to sell or give away this article. I acknowledge that Steve Litt reserves the right to edit my submission for clarity or brevity. I certify that I wrote this submission and no part of it is owned by, written by or copyrighted by others.
    After that paragraph, write the title, text of the article, and a two sentence description of the author.
     

    URLs Mentioned in this Issue