The XML vs. HTTP article has been retired

The "XML vs. HTTP" article you are looking for was written in December 2000, when very little XML was being written and used on the Web. Developers were crafting their own XML-over-HTTP Web applications from scratch, and were running into character encoding problems as they tried to transport XML using mechanisms designed for HTML form data.

These encoding issues at the HTTP transport level are mostly gone, thanks to APIs in the Web service frameworks that most developers use nowadays. If you're using a .NET, SOAP, or XML-RPC toolkit, or any major servlet engine or other popular framework, then you probably don't need to deal with any of these issues. You only need to deal with them if you're writing such a toolkit or framework, or crafting a standalone XML-over-HTTP application.

For example, when I wrote the article, I was using a then-popular Java servlet engine that was internally hard-coded to assume that form data was ISO-8859-1 encoded. So when I asked it for a parameter value from an HTTP request, it'd give me a String with the ISO-8859-1 interpretation every time, which was no good for UTF-8 data. I had to undo that problem by converting the String back to bytes, and then converting the bytes to characters again according to UTF-8 or whatever I knew the right encoding was. It's a rather obvious solution, but some programmers needed the help. I wouldn't have to do that today, because now there's a setCharacterEncoding() method that I can call before asking for the String.
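
The bytes-and-back workaround can be sketched in a few lines of Java. This is only an illustration, not the code from the original article: the class name FormParamRecoder and the method redecode are made up for this example, and the "wrongly decoded" String is simulated in main() rather than fetched from a real servlet request. The trick depends on ISO-8859-1 mapping every byte value to exactly one character, so encoding the String back to ISO-8859-1 recovers the original bytes without loss.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative only.
class FormParamRecoder {

    // Takes a String that was wrongly decoded as ISO-8859-1 and
    // re-decodes its underlying bytes using the charset that the
    // sender actually used.
    static String redecode(String misdecoded, Charset actual) {
        // ISO-8859-1 maps each byte 0x00-0xFF to one character,
        // so this step recovers the original bytes exactly.
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, actual);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an acute accent on the e

        // Simulate the old servlet engine: UTF-8 bytes arrive on the
        // wire but are interpreted as ISO-8859-1.
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);
        String misdecoded = new String(wire, StandardCharsets.ISO_8859_1);

        String repaired = redecode(misdecoded, StandardCharsets.UTF_8);
        System.out.println(repaired.equals(original));
    }
}
```

With a current servlet API, the same result normally comes from calling request.setCharacterEncoding("UTF-8") before the first getParameter() call, so the round trip above is needed only when the container has already decoded the data with the wrong charset.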

If you're dealing with data at the raw HTTP level, or if you just want to read the original article anyway, read on (click).

—Mike Brown, 28 Jun 2008