An XHTML file of mine that worked fine four years ago now displays as a blank white page
in the latest Opera. The problem is a self-closing <script /> tag that the parser
fails to properly close. The real problem is that Opera ignored the XHTML 1.1 DOCTYPE and
the tag and decided that the document must be HTML 4 or earlier
because it was not presented with the proper application/xhtml+xml MIME type, even though
it was loaded directly off my hard drive and had no MIME type. The real real problem is
that browser developers were so upset with the merger of HTML and XML that they allowed
it to break.
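To make the difference concrete, here is a minimal Python sketch of what an XML parser does with that tag. The page content and the file name app.js are made up for illustration, and Python's xml.etree module stands in for a browser's XML parser, which it only approximates.

```python
import xml.etree.ElementTree as ET

# Hypothetical page for illustration; app.js is a made-up file name.
PAGE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<head><title>Demo</title><script src="app.js" /></head>'
    '<body><p>Hello</p></body>'
    '</html>'
)

XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def summarize(page: str) -> dict:
    """Parse the page as XML and report what ended up inside script and body."""
    root = ET.fromstring(page)
    script = root.find(f"{XHTML_NS}head/{XHTML_NS}script")
    body = root.find(f"{XHTML_NS}body")
    return {
        "script_children": len(script),   # child elements inside <script>
        "script_text": script.text,       # None when the element is empty
        "body_text": "".join(body.itertext()),
    }


if __name__ == "__main__":
    print(summarize(PAGE))
```

Under XML rules the script element is empty and the body survives intact. A parser that instead treats <script /> as an opening tag swallows the rest of the file as script text, which is how a page like mine ends up blank.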
At the time that XHTML was introduced, it was common for web site developers to write
bad code and then complain to the browser developers that their page was not displaying
correctly in that browser, but it worked in a different browser. The browser developers'
reply to these web developers was to follow the standards. Then Microsoft
tried to coerce web developers to use its own alternate markup, alternate scripting
language, alternate authentication method, and other technologies only supported by
Microsoft software. To keep the web free and competitive, browser developers advised web
developers to follow the standards. This would also become the
rallying cry for destroying XHTML.
Because old browsers will not understand the new format, the standard
for XHTML 1.0 recommends that developers either follow backwards-compatibility guidelines or
serve XHTML files with an alternate MIME type. This advice was reinforced by later
documents saying that application/xhtml+xml
is the proper MIME type for XHTML. Mozilla, Opera, and Safari all chose to read this
to mean that their newer browsers should not recognize XHTML when served as HTML.
Internet Explorer went the other way and refused to display content that was served
as application/xhtml+xml until Internet Explorer 9 in 2011. As a result, XHTML web pages simply stopped working
in alternative web browsers when they decided to make that change, and if developers did
things "the right way", the page would not work in the most popular browser. This heavily
suppressed adoption of XHTML.
To see how bad a decision this was, consider that web pages can have a DOCTYPE. This is an instruction that tells the reader whether
the page is HTML or XHTML and what specific version of that standard it uses.
Browser developers chose to follow a policy of ignoring the DOCTYPE when deciding
whether to use the new XHTML or old HTML parsing and display rules. Web developers not only
had to fix their own bugs but would now routinely encounter inexplicable
situations where the same page displayed differently because another server sent a
different MIME type, or because an edit introduced a minor parsing error. Browsers provided no feedback as to
which display mode they were using and were steadfastly ignoring the web developer's
direct, explicit instruction as to which exact mode the browser should use.
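The policy described above can be sketched in a few lines of Python. This is my own simplified illustration, not any browser's actual code: the function name choose_parser is hypothetical, real browsers look at more than this (other XML MIME types, file extensions for local files), and the treatment of a missing MIME type as HTML is an assumption based on what I saw with my own file.

```python
# The XHTML 1.1 DOCTYPE, used here only to show that it makes no difference.
XHTML11_DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" '
    '"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">'
)


def choose_parser(content_type, doctype):
    """Return "xml" or "html" for the parser this simplified policy would pick.

    Hypothetical helper: the doctype argument is accepted and then ignored,
    which is exactly the behaviour being criticised.
    """
    del doctype  # deliberately unused
    if content_type is None:
        # Assumption: no MIME type (e.g. a local file) falls back to HTML.
        return "html"
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/xhtml+xml":
        return "xml"
    return "html"


if __name__ == "__main__":
    for ctype in ("text/html", "application/xhtml+xml; charset=utf-8", None):
        print(ctype, "->", choose_parser(ctype, XHTML11_DOCTYPE))
```

The same file with the same DOCTYPE lands in a different parser depending only on a header the page author may not control.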
Another shock came when hand-coded pages with minor typos suddenly stopped working.
Indeed, the XML standard says parsers should fail and produce an error if there
is a parse error. It should also be universally recognized that this is stupid
behaviour for a web browser and no problems would be caused if the browser failed to comply
with this part of the standard and instead attempted to recover.
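For anyone who has not seen this strictness first hand, a short Python sketch shows it. The helper name parse_strictly and the sample markup are mine, and Python's xml.etree parser is only a stand-in for what a browser's XML parser does.

```python
import xml.etree.ElementTree as ET


def parse_strictly(markup: str) -> str:
    """Report whether a strict XML parser accepts or rejects the markup."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        # A single well-formedness error rejects the whole document.
        return f"rejected: {err}"
    return "accepted"


if __name__ == "__main__":
    print(parse_strictly("<p>Hello <b>world</b></p>"))  # well-formed
    print(parse_strictly("<p>Hello <b>world</p>"))      # forgot to close <b>
```

One forgotten closing tag is enough to reject the entire document, where an HTML parser would have quietly recovered and shown the text.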
Compatibility problems could be caused if one browser allowed an extension to the
language, such as case insensitivity, and others did not. Web developers wanted to keep
their case insensitivity; common practice at the time was to make tag names upper-case,
and there was grumbling when the XHTML standard said they must all be lower-case. For consistency, this lower-case rule
was also applied to the HTTP methods used by data entry forms, GET and POST, which are
upper-case in their own standard. The major browser developers could have grouped together
and said "We're supporting mixed-case. People who need web pages in lower case for data
processing can just run them through tidy." One browser developer could have done
this and coerced the rest to follow. The cost for future browser developers to implement
this change is near zero; they only need to run certain strings through tolower() or
toupper().
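As a rough sketch of how little work that would be, here is the idea in Python. The function names are hypothetical, and this only covers the two cases mentioned above: element names and the form method value.

```python
def normalize_tag_name(name: str) -> str:
    """Fold a tag name to the lower-case form that XHTML requires."""
    return name.lower()


def normalize_form_method(method: str) -> str:
    """Fold a form's method value to the upper-case form that HTTP uses."""
    return method.upper()


def same_element(a: str, b: str) -> bool:
    """Compare two tag names without regard to case."""
    return normalize_tag_name(a) == normalize_tag_name(b)


if __name__ == "__main__":
    print(normalize_tag_name("TABLE"))
    print(normalize_form_method("post"))
    print(same_element("Body", "BODY"))
```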
Browser developers followed a policy of working to rule
for the standard that they did not want to support. XHTML was so hated that its
replacement, HTML5, recognizes the self-closing tag instruction introduced with XHTML only
to instruct browser developers not to support it unless it is on a tag that
cannot contain data and would not be open in any case. Other tags are to be
considered open when such a closing instruction is seen. This is a direct
introduction of incompatibility for no apparent reason other than to disconnect
HTML5 from XHTML.
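The rule can be written down as a small Python sketch. The function name stays_open is hypothetical, the list is the set of void elements as I understand the HTML standard to define it, and the sketch ignores embedded SVG and MathML, where the trailing slash is honoured.

```python
# Elements that can never have content, per my reading of the HTML standard.
VOID_ELEMENTS = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
})


def stays_open(tag_name: str, has_trailing_slash: bool) -> bool:
    """Return True if the element is still open after its start tag is read.

    The trailing slash is accepted as an argument only to show that it
    has no effect on the outcome.
    """
    del has_trailing_slash  # deliberately unused
    return tag_name.lower() not in VOID_ELEMENTS


if __name__ == "__main__":
    for tag, slash in (("br", True), ("br", False), ("script", True), ("div", True)):
        print(tag, slash, "->", "open" if stays_open(tag, slash) else "closed")
```

So <br /> is closed because br is always closed, and <script /> stays open because the slash is simply not consulted.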
The W3C, for its part, failed to respond to the desires of web developers for
a more flexible system than XML provides. XHTML could have been redefined as
a language inspired by and easily translatable to XML, or a flexible alternative
to XML could have been defined, or they could have simply drawn up a list of exceptions
to the XML standard that XHTML did not need to follow, and they could have produced
a translator to convert loose XHTML to strict XML. They could also have loosened
up the XML standard for the ease of web users and developers. They did nothing.
The intransigence of both parties caused a ten-year delay in the advancement of
HTML. Things had been looking up for a short time; XHTML had been adopted by web developers as the new standard
and the first draft of XHTML 2.0
had been released, but no browser developer chose to support XHTML 2, and then XHTML 1 fell
out of use when it stopped working in the browsers. We could have had a better HTML ten
years ago if browser developers and the W3C had found a way to get along. Meanwhile,
advancements have happened in CSS and JavaScript; much of what is new in "HTML5" has
nothing to do with HTML, but is the collection of these other advancements that browser
developers have made and agreed upon.