Tuesday, January 1, 2008

Resolution for 2008: Let's cut down on XML abuse

When XML started off, it was envisioned as a more general markup than HTML, one in which custom tags could be created and content in custom document structures could be circulated on the World Wide Web. I believe, however, that XML technology is being misused. It is popping up in the Semantic Web, in SOA, and in pure data modeling efforts. It is true that XML is human AND machine readable, but processors pay a heavy price parsing and ingesting the content of complex XML documents. I recently attended an ontology conference in Maryland, and the biggest theme of the conference was that there isn't enough machine power to process all of the complex business logic needed to make inferences about content. Even though XML is human and machine readable, it is extremely verbose, and it is not practical to have a machine process all of the tags just to get the data out of an XML document.

The next question to ask is: "When should you use XML?" The answer is: when the XML documents are not overly complicated, meaning they do not combine deeply nested tags with a large amount of data. If the data is quite big, it should instead be passed around as binary content, for example through COM objects or Enterprise JavaBeans (EJBs). Here are some arguments regarding XML:
  1. XML is interoperable - True, XML is interoperable for cross-platform communication. However, please don't send massive XML documents that can bog the system down. XML may be interoperable, but it is not performance friendly.
  2. XML is a good format for web service communication - Web Services (SOAP and REST) largely deal with XML-based content. However, Web Services are NOT reusable if the XML format is proprietary between the Service Provider and its clients. Please avoid sending large amounts of data over the wire as XML. SOAP with Attachments is a good alternative for sending large chunks of data.
  3. XML is human readable - This is true. However, if a developer creates an XML format that only he understands, then the XML is effectively unreadable and its definition needs to be published. For example, a developer decides to create an XML document which looks like this: Person. This XML document is a complete waste, since not every human understands it. However, if tag names are chosen properly and their definitions are captured, then the XML is human readable.
  4. XML is the future of the Semantic Web - Recently I came across the OWL-S editor software and tested it out. The software generates OWL and RDF documents, but these XML documents are so verbose that they are impractical to process. The Semantic Web should look at options other than XML-based formats.
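
To make the verbosity argument concrete, here is a minimal Python sketch that encodes the same records once as XML and once as a fixed-width binary layout, then compares the sizes. Everything in it is an assumption chosen for illustration: the record fields, the tag names (people, person, id, age, heightCm), and the binary layout are made up and do not come from any real service or from the tools mentioned above.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical sample data: (id, age, height in cm). Purely illustrative.
records = [(i, 20 + i % 50, 150.0 + i % 40) for i in range(1000)]

# Assumed binary layout: little-endian uint32 id, uint16 age, float64 height.
# With "<" there is no padding, so each record takes exactly 14 bytes.
RECORD = struct.Struct("<IHd")


def to_xml(rows):
    """Serialize the rows as an XML document with one element per field."""
    root = ET.Element("people")
    for pid, age, height in rows:
        person = ET.SubElement(root, "person")
        ET.SubElement(person, "id").text = str(pid)
        ET.SubElement(person, "age").text = str(age)
        ET.SubElement(person, "heightCm").text = str(height)
    return ET.tostring(root, encoding="utf-8")


def from_xml(data):
    """Parse the XML back into tuples; every value must be converted from text."""
    root = ET.fromstring(data)
    return [
        (int(p.findtext("id")), int(p.findtext("age")), float(p.findtext("heightCm")))
        for p in root.iter("person")
    ]


def to_binary(rows):
    """Pack the rows back to back using the fixed-width layout."""
    return b"".join(RECORD.pack(*row) for row in rows)


def from_binary(data):
    """Unpack the fixed-width records back into tuples."""
    return list(RECORD.iter_unpack(data))


if __name__ == "__main__":
    xml_bytes = to_xml(records)
    bin_bytes = to_binary(records)
    print("XML size in bytes:   ", len(xml_bytes))
    print("Binary size in bytes:", len(bin_bytes))
    print("XML is roughly %.1fx larger" % (len(xml_bytes) / len(bin_bytes)))
```

The trade-off is the one argued above: the binary form is compact and cheap to decode, but it is unreadable without the layout definition, while the XML form describes itself at the cost of repeating every tag name for every record.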
In conclusion, XML is a great technology, but people tend to misuse it and then label it a wasted technology. XML is not going to go away, so let's use it properly and take care of it.
