tangaroa | Resolving incompatibilities between XHTML and HTML5 (Reply)

To resolve the incompatibilities between XHTML and HTML (discussed earlier), I propose that browser developers adopt these rules:

HTML is a superset of XML

Any XML is lexically valid HTML.
- HTML readers MUST accept syntactically valid XML. For example, <script src="..." /> SHOULD be read as closing the script element, not as opening a script whose body is the rest of the web page. Alternate parsing methods SHOULD be attempted only if the page fails to lex.
- An HTML reader MUST accept certain XML control sequences. For example, a reader reading <script><![CDATA[ ... ]]></script> MUST treat the contents as CDATA and MUST NOT pass the XML control sequences themselves (<![CDATA[ and ]]>) to the script reader.
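
The two rules above can be sketched as a reader that tries the strict XML reading first and falls back only when lexing fails. This is a minimal illustration in Python using the standard library's XML parser, not a browser implementation; the function name read_document, the "html-fallback" label, and the sample page (including app.js) are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

def read_document(text):
    """Try the strict XML reading first; fall back only when lexing fails."""
    try:
        return "xml", ET.fromstring(text)
    except ET.ParseError:
        # Hypothetical fallback: a real reader would run its lenient
        # HTML parser here instead of returning None.
        return "html-fallback", None

# Hypothetical page using both constructs named in the rules above.
page = (
    '<html><head><script src="app.js" /></head>'
    '<body><p>Hello</p>'
    '<script><![CDATA[ if (a < b) { run(); } ]]></script>'
    '</body></html>'
)

mode, root = read_document(page)
external = root.find("head/script")  # the self-closed script element
inline = root.find("body/script")    # the script wrapped in CDATA

# The self-closed script has no body, so the rest of the page is still markup.
print(mode, external.text, root.find("body/p").text)
# The script reader would receive only the code, without the CDATA markers.
print(repr(inline.text))

# A page that is not well-formed XML takes the fallback path.
print(read_document("<p>Hello World")[0])
```

Under the XML reading, the paragraph after the self-closed script survives as markup, and the inline script's text contains the unescaped '<' with no CDATA markers.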
Certain HTML is NOT lexically valid XML.
- HTML tag names and attribute names MAY be case-insensitive.
- Certain HTML elements MAY be self-closing without a '/' self-closing mark; for example, <br>.
- Any fragment of HTML which is a complete element is itself a valid HTML document. For example, "<p>Hello World" is a valid HTML document.
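
To make the three cases above concrete, here is a small check, again using Python's standard XML parser as a stand-in for "lexically valid XML". The helper is_well_formed_xml and the sample strings are assumptions chosen for illustration; the last sample is the XHTML-style spelling of the second, included for contrast.

```python
import xml.etree.ElementTree as ET

def is_well_formed_xml(text):
    """Return True if a strict XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

samples = {
    "mixed-case tags": "<P>Hello</p>",
    "void tag without '/'": "<p>one<br>two</p>",
    "unclosed fragment": "<p>Hello World",
    "XHTML-style equivalent": "<p>one<br />two</p>",
}

# Map each sample name to whether it is well-formed XML.
results = {name: is_well_formed_xml(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'XML' if ok else 'HTML only'}")
```

Only the last sample passes the strict parser; the first three are the kinds of input that would need the HTML reading.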
XHTML is the subset of HTML which is also valid XML.
- Use of XHTML by web developers is optional.
- Documents which claim to be XHTML SHOULD comply with all of the rules of XML.
- Readers SHOULD consider a document's declarations of its own file type, such as DOCTYPE and control sequences, in considering whether to interpret a document as XHTML.
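
One way a reader might weigh a document's own declarations is sketched below. This is a rough heuristic under stated assumptions, not a specified algorithm: the function name claims_xhtml, the 1024-character window, and the choice of signals (an XML declaration, a DOCTYPE that mentions XHTML, or the XHTML namespace on an element) are all illustrative choices.

```python
import re

# The XHTML namespace URI defined by the W3C.
XHTML_NAMESPACE = "http://www.w3.org/1999/xhtml"

def claims_xhtml(text):
    """Heuristic: does the document declare itself to be XHTML?"""
    head = text[:1024]  # assumption: declarations appear near the start

    # Signal 1: an XML declaration at the very start.
    has_xml_decl = head.lstrip().startswith("<?xml")

    # Signal 2: a DOCTYPE whose identifiers mention XHTML.
    doctype = re.search(r"<!DOCTYPE\s+html\b[^>]*>", head, re.IGNORECASE)
    has_xhtml_doctype = bool(doctype and "XHTML" in doctype.group(0).upper())

    # Signal 3: the XHTML namespace declared on an element.
    ns_pattern = r'xmlns\s*=\s*["\']' + re.escape(XHTML_NAMESPACE) + r'["\']'
    has_xhtml_ns = re.search(ns_pattern, head) is not None

    return has_xml_decl or has_xhtml_doctype or has_xhtml_ns

xhtml_doc = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<head><title>t</title></head><body/></html>'
)
html_doc = "<!DOCTYPE html><p>Hello World"

print(claims_xhtml(xhtml_doc))
print(claims_xhtml(html_doc))
```

A reader following the rules above would hold the first document to XML's rules and would be free to read the second as plain HTML.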