tangaroa | Entries tagged with webdev

A long time ago the w3c proposed new standards for web forms to supplement the simple first-generation forms. They were heralded as a step forward in web development, but no browser ever implemented them. Their current status in Mozilla:

Bug 97806 - Implement W3C XForms in browser and composer - Status: RESOLVED WONTFIX

Comment 111 Robin Whittleton 2013-04-10 06:47:31 PDT
I’m guessing this is a WONTFIX now that they’ve been marked as obsolete on MDN? https://developer.mozilla.org/en/docs/XForms
Comment 112 Benjamin Smedberg [:bsmedberg] 2013-04-10 06:54:28 PDT
Indeed, we would not accept an implementation even if somebody wrote it.

Seen around the web, a website displays the copyright symbol as: Ãƒâ€šÃ‚Â©

The Unicode garbage is in the source:

Ã&#402;â&#8364;&#353;Ã&#8218;Â

Googling either string shows the same characters added before special characters on other websites. I wonder what causes this.

Saving the raw HTML + xxd produces:

          c326 2334 3032 3be2 2623 3833      .&#402;.&#83
3634 3b26 2333 3533 3bc3 2623 3832 3138  64;&#353;.&#8218
3bc2 a9                                  ;..

Converting the HTML escape sequences produces:

c3 0192 e2 20ac 0161 c3 201a c2a9

Here c3 and e2 are one-byte characters, the entities are multibyte Unicode characters, and c2a9 at the end is the UTF-8 representation of Unicode 0xa9, the copyright symbol. Looking up the entity values shows that the garbage string is being produced exactly as specified by the HTML.
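The decimal-to-hex conversion of the entities can be reproduced in a few lines of JavaScript. This is only a minimal sketch: the helper name entityCodePoints is made up for illustration, and it handles decimal numeric entities and nothing else.

```javascript
// Hypothetical helper: list the code points of decimal entities as 4-digit hex.
function entityCodePoints(html) {
  return Array.from(html.matchAll(/&#(\d+);/g), (m) =>
    Number(m[1]).toString(16).padStart(4, "0")
  );
}

// The four entities seen in the page source.
const found = entityCodePoints("&#402;&#8364;&#353;&#8218;");
console.log(found.join(" ")); // 0192 20ac 0161 201a
```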

So what's going on here? People are presumably copying and pasting a symbol from some other location, most likely Word or another website, into their web browser or HTML editor. Somehow these extra characters get passed along without anyone noticing and fixing them. In the case of the website where I first saw this, it looks like an HTML editor translated the characters into HTML escape sequences; nobody would do that manually. I don't know what causes this or what the characters are supposed to mean.
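One likely explanation, which the post itself does not establish, is repeated mis-decoding: the UTF-8 bytes of the copyright symbol get read as Windows-1252 text, saved as UTF-8 again, and the cycle repeats. The sketch below simulates that under the assumption that Windows-1252 is the wrong encoding involved; misdecode and CP1252_HIGH are names invented for this example.

```javascript
// Windows-1252 code points for bytes 0x80-0x9f; every other byte maps to itself.
const CP1252_HIGH = [
  0x20ac, 0x0081, 0x201a, 0x0192, 0x201e, 0x2026, 0x2020, 0x2021,
  0x02c6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008d, 0x017d, 0x008f,
  0x0090, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
  0x02dc, 0x2122, 0x0161, 0x203a, 0x0153, 0x009d, 0x017e, 0x0178,
];

// Encode the text as UTF-8, then misread those bytes as Windows-1252.
function misdecode(text) {
  let out = "";
  for (const b of new TextEncoder().encode(text)) {
    out += String.fromCodePoint(b >= 0x80 && b <= 0x9f ? CP1252_HIGH[b - 0x80] : b);
  }
  return out;
}

let s = "\u00a9"; // the copyright symbol
for (let round = 1; round <= 3; round++) {
  s = misdecode(s);
  // Print the code points of each round's result.
  console.log(round, Array.from(s, (c) => c.codePointAt(0).toString(16)).join(" "));
}
```

After three rounds the code points are c3 192 e2 20ac 161 c3 201a c2 a9, which lines up with the entities 402, 8364, 353 and 8218 found in the page source, followed by the copyright symbol.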

Firefox, Chrome, and Safari have all decided to remove support for scrollbars and fixed sizes for table and tbody elements. Their justification is that an old version of the CSS spec does not say they need to support that feature, even though the latest version does. HTML4 introduced the tbody element for the specific purpose of allowing future browsers to make the tbody scrollable separately from the table's header row. So they removed a feature that web developers have been using for close to a decade, and the other major browsers went along with them.

It reminds me of this from earlier: XHTML and the revolt of the browser developers.

On a related note, Gnome removed support for transparent terminal backgrounds and then started insulting and banning users who complained. Transparent terminal backgrounds were one of the nifty features that attracted me to Linux back in the day, along with multiple desktops and focus following the mouse.

XHP has been out for three years and I hadn't heard of it before? It looks really useful. An interesting point is the reversal of flow control: Instead of a PHP page being an HTML template that breaks into and out of PHP, XHP promotes the design style of a PHP file that breaks into and out of HTML.

Another unrelated awesome project from three years ago: The LBW project is Wine for Windows, allowing ELFs to be run under Windows XP. Immediately impressive is that "it's adequate for ... downloading and installing packages with apt and dpkg" since I could never get apt and dpkg to compile under mingw or cygwin.


One of my side projects is a Javascript drag-and-drop list using the modern drag and drop specification with the goal of allowing users to rearrange the order of items in a list. I quickly ran into a problem: the drag-and-drop spec was designed for dragging one object onto another single discrete object which is expecting a drag event. This does not translate well to a draggable list's use cases of dragging above, below, and between objects (or subobjects), or of dragging an item off of the drag area to bring it to the top or bottom of the list. To do anything fancy, we need to find a relationship between the coordinates of the MouseEvent parent of a drag event, and the coordinates of the elements on the screen.

Visual elements have these coordinate attributes:

offsetTop
scrollTop

Drag events have these coordinate attributes:

clientY
pageY
screenY

There is no correlation between the two sets of coordinates. I tried summing the offsetTop of an item and its ancestors but found no correlation between that sum and any of the mouse coordinates. I also had no luck using the various page and scroll properties for window and document. Since I couldn't find the answer, I changed the question. Element.clientHeight reliably works across browsers, so we can do this:

Save the initial drag event at the start of a drag.
Calculate the difference between the start and end events.
Count the heights of the elements to see where to place the dragged item.
If we run out of elements, place the dragged item at the head or tail of the list.

This should work. The MouseEvent gives us three sets of coordinates, so we should be able to pick one and it should work.

Hah.

Among the problems:

In Firefox, the clientY of the starting DragEvent is zero. This is bug #505521 which has been open since 2009.
In Safari 5.1.7, the clientY of the starting DragEvent is measured from the top of the window while the clientY of the ending DragEvent is measured from the bottom of the window.
In Safari, the pageY of the ending DragEvent is some ridiculous number that seems to be measured from some point over 500px off the bottom of the screen.
In both Firefox and Safari, the differences in clientY, pageY, and screenY are different for the same beginning and ending mouse position.
In Opera, the Y values for MouseEvents ending on the sidebar panel are different from the Y values for MouseEvents on a page at the same vertical level.

I decided to use the difference in screenY because it produces the fewest compatibility problems across browsers, even though there is the obvious bug that the math will be wrong if the screen scrolls in the middle of a drag.
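The height-counting steps described above can be sketched as a pure function. This is a rough illustration, not the code from my project: dropIndex and its parameters are hypothetical names, the heights array stands in for each item's clientHeight, deltaY is the ending screenY minus the starting screenY, and the rule of moving past a neighbour once half of its height is covered is an assumption made for the example.

```javascript
// heights:   clientHeight of every list item, in display order
// fromIndex: position of the dragged item when the drag started
// deltaY:    endEvent.screenY - startEvent.screenY
function dropIndex(heights, fromIndex, deltaY) {
  const step = deltaY < 0 ? -1 : 1;
  let remaining = Math.abs(deltaY);
  let index = fromIndex;
  while (true) {
    const next = index + step;
    // Ran out of elements: the item lands at the head or tail of the list.
    if (next < 0 || next >= heights.length) break;
    // Assumption: stop unless at least half of the next item was covered.
    if (remaining < heights[next] / 2) break;
    remaining -= heights[next];
    index = next;
  }
  return index;
}

console.log(dropIndex([20, 20, 20, 20], 0, 45));    // 2
console.log(dropIndex([20, 20, 20, 20], 2, -1000)); // 0
```

The second call shows the "run out of elements" case: a drag far above the list puts the item at the head.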

Side note: The best practice for defining class methods in Javascript is to use the prototype:

ClassName.prototype.method = function(){...}

This allows every instance of the class to use the same function instead of giving each instance its own copy of the function.

Member variables are not in scope in prototype methods; the method is expected to use this to access them. In the context of an event handler, however, this is not the containing object. Therefore, using prototyped methods as event handlers is not a good idea. A solution is to use the old-fashioned "this.method=" declaration which suffers inefficiency but does the job:

function ClassName() {
    this.method = function(){...};
}

I ran into this problem when I tried to fix my old-style drag-and-drop code to use the best practice instead.
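Another option, which I did not use in the project, is to keep the method on the prototype and bind it to the instance once in the constructor. A minimal sketch, with Counter and its members as made-up names; the detached call stands in for what an event dispatcher does with a stored handler.

```javascript
function Counter() {
  this.count = 0;
  // Bind once so the shared prototype method keeps this instance as `this`.
  this.onClick = this.increment.bind(this);
}

// One function shared by every instance.
Counter.prototype.increment = function () {
  this.count += 1;
  return this.count;
};

const counter = new Counter();
// An event system stores the function and calls it without the object.
const handler = counter.onClick;
handler();
handler();
console.log(counter.count); // 2
```

In a browser the bound function would be passed as element.addEventListener("click", counter.onClick). Each instance still carries one small bound wrapper, but the method body itself lives on the prototype.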

validator.nu now reports an error for the empty action in <form action="">. An empty action used to send the form data back to the current page, whatever it was. This was useful for repeating the same form code on different pages, and for developing a form on a site that is in development and where the official target URLs can change.

In January 2011, the HTML5 spec changed to specify that action may not be empty. Yet the whatwg spec still says that blank actions are allowed (while saying that is a violation of RFC3986 which defines a URL, which makes no sense). Of course, HTML5 is a "living spec" so it can change tomorrow (Next week: every landing page must have a commented ascii goatse in its source). So what's the story?

The story is in the comments to w3 bug #12561. The chief complaint hixie makes is that this use of a relative path to the current URL conflicts with the <base> tag which allows site developers to redefine all relative paths. This sequence can happen:

Developers create pages with form action="" to send data back to the current URL.
An automated process later adds a <base> tag which rewrites the base of all relative URLs.
Form data is now sent back to the URL relative to the <base> tag, meaning it goes to the <base> URL.

I consider that to be operating exactly as defined, with <form action=""> going to the same place that <a href=""> goes to, but hixie sees it as a problem whose best solution is to outlaw empty action attributes.

Another of hixie's criticisms is that action="" is "too confusingly similar to action=" " with a space which has different behaviour." By my understanding of things, action=" " with a single space should be treated the same as action="%20" and should refer to the filename that is a single space in the same directory as the current URL. In testing, Opera strips the whitespace and sends data back to the current URL, so I don't see a difference in behaviour. The html5 spec says that URLs should be trimmed of whitespace.

One resolution that hixie recommends is to exclude the action attribute altogether. My recollection from many years ago is that browsers would do nothing when no action attribute is specified, but today this has the same effect as a blank action. This seems to be the best thing to do.

Another recommendation is to use action="?", which I do not agree with. For one, this should have the exact same problem with <base> that "" does, since it is a URL relative to the base URL: the base URL + "?". In my testing, it does in fact have this problem. It also causes the minor inconvenience of there being two URLs for the same resource when POSTing data. When GETing, it could theoretically produce invalid URLs that have two "?" characters in them, except that browser developers saw plenty of buggy URLs like this produced by shit code in the 1990s and know how to handle them.
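The relative-URL argument can be checked with the standard URL constructor. This sketch shows plain relative-URL resolution only; it is not a model of what any particular browser's form submission code does with an empty action. The example.com URLs and the resolveAction helper are invented for the illustration.

```javascript
// Hypothetical URLs: the page holding the form, and the href of an added <base> tag.
const pageUrl = "http://example.com/app/form.php?id=1";
const baseHref = "http://example.com/other/";

// Resolve an action value the way any relative URL is resolved against a base.
function resolveAction(action, base) {
  return new URL(action, base).href;
}

console.log(resolveAction("", pageUrl));   // without <base>: the current page
console.log(resolveAction("", baseHref));  // with <base>: the base URL
console.log(resolveAction("?", baseHref)); // "?" is resolved against the base too
console.log(resolveAction(" ", baseHref)); // the lone space is trimmed away
```

Under this resolution, "" and "?" both follow the <base> tag, and " " behaves the same as "".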

In Bug #10332 and Bug #11161, people have read the spec to expect the DOM to report the current URL when the action attribute is an empty string. I believe this is mistaken.

I wanted to use a stylesheet to add an &mdash; before certain content on a web page. This should be easy with modern CSS, but span.foo:before{ content: '&mdash; '} places the literal text &mdash; on the web page, as if the CSS engine had run the string through a sanitizer and turned it into &amp;mdash;.

A Stack Overflow discussion says I need to know the character's Unicode value and use that instead. This is stupid.
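For what it's worth, the Unicode value can be looked up mechanically. The helper below is a hypothetical sketch (cssStringEscape is not a built-in) that turns a character into a CSS hex escape. One detail worth knowing is that the CSS parser swallows a single space directly after a hex escape, so a second space is needed if one should actually be displayed.

```javascript
// Hypothetical helper: build a CSS hex escape for a single character.
function cssStringEscape(ch) {
  // The trailing space terminates the escape and is not rendered.
  return "\\" + ch.codePointAt(0).toString(16) + " ";
}

// "\u2014" is the character that &mdash; stands for.
const rule = "span.foo:before { content: '" + cssStringEscape("\u2014") + " '; }";
console.log(rule); // span.foo:before { content: '\2014  '; }
```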

Useful reference: numeric values of all named HTML escape sequences.

My website will have a directory where I do not want CGIs to run. The obvious and wrong answer is to create an .htaccess file with "Options -ExecCGI", but that causes the errors "Access forbidden!" and "Options ExecCGI is off" when I try to access a .pl file. For the next obvious and wrong solution, I tried renaming the file to ".pl.txt". That did not work.

Given that I would like to serve .pl files as plain text, I tried "AddType text/plain .pl". That did not work either. RemoveType also does not work; note that the docs say that this is what you use for this case. RewriteRule \.pl$ - [T=text/plain] did not work; note, again, the docs say it should.

What did work was to add "RemoveHandler .pl".

You can set variables inside an Apache .htaccess file. To copy directly from someone else's example:

 RewriteCond  %{REQUEST_URI}  ^/category_abc/ 
 RewriteRule .* - [E=cat_id:1] 

 RewriteCond  %{REQUEST_URI}  ^/category_def/ 
 RewriteRule .* - [E=cat_id:2]

 - The E= tells Apache we're creating a new ENV variable. 
 - The cat_id is the name of the variable we're creating 
 - The :x is the value of the variable (simple key : value syntax).

And from another example:

RewriteCond %{HTTP:Accept-Language} ^.*(de|es|fr|it|ja|ru|en).*$ [NC]
RewriteRule ^(.*)$ - [env=lang:%1]
# Set lang var to URI
RewriteEngine On
RewriteBase /
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /(.+)/(de|es|fr|it|ja|ru|en)/\ HTTP/ [NC]
RewriteRule ^(.*)$ - [env=lang:%2]

Environment variables are referenced as %{ENV:varname}. How they are used is another story. There are places you can't use them.

# Make a variable for RewriteBase
RewriteCond "/base_dir/" ^(.*)$
RewriteRule ^(.*)$ - [E=RewriteBase:%1]
# Fails, causes an internal server error.
RewriteBase %{ENV:RewriteBase}

You can't do this either:

# RequestPrefix is a previously declared variable. 
RewriteCond %{REQUEST_URI} ^%{ENV:RequestPrefix}(.*)$
RewriteRule ^(.*)$ - [E=RequestSuffix:%1]

A quick web search finds someone claiming that rewrite condition patterns are compiled before variables are interpreted.

An XHTML file of mine that worked fine four years ago now displays as a blank white page in the latest Opera. The problem is a self-closing <script /> tag that the parser fails to properly close. The real problem is that Opera ignored the XHTML 1.1 DOCTYPE and the tag and decided that the document must be HTML 4 or earlier because it was not presented with the proper application/xhtml+xml MIME type, even though it was loaded directly off my hard drive and had no MIME type. The real real problem is that browser developers were so upset with the merger of HTML and XML that they allowed it to break.

At the time that XHTML was introduced, it was common for web site developers to write bad code and then complain to the browser developers that their page was not displaying correctly in that browser, but it worked in a different browser. The browser developers' reply to these web developers was to follow the standards. Then Microsoft tried to coerce web developers to use its own alternate markup, alternate scripting language, alternate authentication method, and other technologies only supported by Microsoft software. To keep the web free and competitive, browser developers advised web developers to follow the standards. This would also become the rallying cry for destroying XHTML.

Because old browsers will not understand the new format, the standard for XHTML 1.0 recommends that developers either follow backwards-compatibility guidelines or serve XHTML files with an alternate MIME type. This advice was reinforced by later documents saying that application/xhtml+xml is the proper MIME type for XHTML. Mozilla, Opera, and Safari all chose to read this to mean that their newer browsers should not recognize XHTML when served as HTML. Internet Explorer went the other way and refused to display content that was served as application/xhtml+xml until 2009. As a result, XHTML web pages simply stopped working in alternative web browsers when they decided to make that change, and if developers did things "the right way", the page would not work in the most popular browser. This heavily suppressed adoption of XHTML.

To see how bad a decision this was, consider that web pages can have a DOCTYPE. This is an instruction that tells the reader whether the page is HTML or XHTML and what specific version of that standard it uses. Browser developers chose to follow a policy of ignoring the DOCTYPE when deciding whether to use the new XHTML or old HTML parsing and display rules. Web developers not only had to fix their own bugs but would now routinely encounter inexplicable situations where a small change would cause the page to display in a different manner if the MIME type was different on a different server or if their edits introduced a minor parsing error. Browsers provided no other feedback as to which display mode they were using and were steadfastly ignoring the web developer's direct, explicit instruction as to which exact mode the browser should use.

Another shock came when hand-coded pages with minor typos suddenly stopped working. Indeed, the XML standard says parsers should fail and produce an error if there is a parse error. It should also be universally recognized that this is stupid behaviour for a web browser and no problems would be caused if the browser failed to comply with this part of the standard and instead attempted to recover.

Compatibility problems could be caused if one browser allowed an extension to the language, such as case insensitivity, and others did not. Web developers wanted to keep their case insensitivity; common practice at the time was to make tag names upper-case, and there was grumbling when the XHTML standard said they must all be lower-case. For consistency, this lower-case rule was also applied to the HTTP methods used by data entry forms, GET and POST, which are upper-case in their own standard. The major browser developers could have grouped together and said "We're supporting mixed-case. People who need web pages in lower case for data processing can just run them through tidy." One browser developer could have done this and coerced the rest to follow. The cost for future browser developers to implement this change is near zero; they only need to run certain strings through tolower() or toupper().

Browser developers followed a policy of working to rule for the standard that they did not want to support. XHTML was so hated that its replacement, HTML5, recognizes the self-closing tag instruction introduced with XHTML only to instruct browser developers not to support it unless it is on a tag that cannot contain data and would not be open in any case. Other tags are to be considered open when such a closing instruction is seen. This is a direct introduction of incompatibility for no apparent reason other than to disconnect HTML5 from XHTML.

The w3c, for its part, failed to respond to the desires of web developers for a more flexible system than XML provides. XHTML could have been redefined as a language inspired by and easily translatable to XML, or a flexible alternative to XML could have been defined, or they could have simply drawn up a list of exceptions to the XML standard that XHTML did not need to follow, and they could have produced a translator to convert loose XHTML to strict XML. They could also have loosened up the XML standard for the ease of web users and developers. They did nothing.

The intransigence of both parties caused a ten-year delay in the advancement of HTML. Things had been looking up for a short time; XHTML had been adopted by web developers as the new standard and the first draft of XHTML 2.0 had been released, but no browser developer chose to support XHTML 2 and then XHTML 1 became un-adopted when it stopped working in the browsers. We could have had a better HTML ten years ago if browser developers and the w3c had found a way to get along. Meanwhile, advancements have happened in CSS and Javascript; much of what is new in "HTML5" has nothing to do with HTML, but is the collection of these other advancements that browser developers have made and agreed upon.

A spinner is a debugging tool.

A spinner moves to inform the user that a long-running process is continuing in the background. The process updates the appearance of the spinner as it sends or receives data, as it does something, to inform the user that the process is actively working and is not being blocked and has not gone into an infinite loop. The speed of the spinner hints at the rate of processing or data flow, whichever is being measured.
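A spinner of this traditional kind can be sketched as an object that only advances when the worker tells it that something happened. The names FRAMES and makeSpinner are made up for the example, and the loop over three chunks stands in for real data arriving.

```javascript
const FRAMES = ["|", "/", "-", "\\"];

function makeSpinner() {
  let position = 0;
  return {
    // Called by the worker each time it actually sends or receives a chunk.
    tick() {
      position = (position + 1) % FRAMES.length;
      return FRAMES[position];
    },
    // The frame currently on display; it never changes without a tick.
    frame() {
      return FRAMES[position];
    },
  };
}

const spinner = makeSpinner();
for (const chunk of ["a", "b", "c"]) {
  spinner.tick(); // one step per chunk of data handled
}
console.log(spinner.frame()); // \
```

If the process stalls, no ticks arrive and the spinner stops, which is the whole point.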

The spinner's movement is analogous to the sounds made by a modem when connecting to a computer system through a telephone network. If everything works correctly, there is no need for a modem to have a speaker to repeat signals for the user to hear. The speaker makes it possible for the user to detect that something is wrong when the sounds are not normal. Similarly, a spinner informs the user that something has gone wrong when it stops moving.

You're doing it wrong.

The modern use of spinners is as a distraction to keep the user's attention occupied for a few moments while the process does its work. "Go watch a cartoon, Mommy's too busy to deal with you." This works if the process runs correctly and completes within a few seconds, but it fails to accurately communicate the state of the program. In the worst cases, a user interface will display a spinner when the process has failed and exited or has not successfully launched in the first place. If your software does this, consider it a bug.

Instead of reporting anything useful about the background process, animated GIF spinners report the state of the browser's main timing loop and its image rendering library. Question: "Is the network connection working?" Answer: "The client's UI can display images." It's a non sequitur.

Spinners might not be right for the web.

Web programming is like war: you spend most of your time waiting, and then there's a small amount of action. Spinners don't map onto this model well. The action is too short to get any useful information from a spinner. A traditional spinner would be seen to be stopped for a few seconds and then finish and exit in the blink of an eye. The extra effort to implement a traditional spinner would be wasted and you may as well just show an animated GIF.

With faster computers and greater bandwidth, the time spent processing and transferring data is no longer as significant as the delays in waiting for change of state. A process spends its time resolving the domain, waiting for a connection, and waiting for a response. If an interface element is to provide useful information to the user, it should change with these changes of state. For example: an animation could be told to advance to the next frame, a message can be posted on the display, a light could change colour, or a transparent coloured overlay can be thrown onto an existing animation.
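A state-driven indicator along these lines might look like the sketch below. Everything in it is illustrative: the stage names, makeIndicator, and the render callback are assumptions, and a real client would call enter() from whatever hooks its network code exposes.

```javascript
// Hypothetical stage names for one request; a real client would pick its own.
const STAGES = ["resolving host", "connecting", "waiting for response", "receiving data", "done"];

// The indicator changes only when the process reports a change of state.
function makeIndicator(render) {
  let current = null;
  return {
    enter(stage) {
      if (!STAGES.includes(stage)) throw new Error("unknown stage: " + stage);
      current = stage;
      render(stage);
    },
    fail(reason) {
      // Report the failure instead of leaving an animation running.
      render("failed while " + (current || "starting") + ": " + reason);
    },
  };
}

const shown = [];
const indicator = makeIndicator((message) => shown.push(message));
indicator.enter("resolving host");
indicator.enter("connecting");
indicator.fail("connection timed out");
console.log(shown);
```

Here render just collects messages; in a page it could advance an animation frame, post a message, or change a colour.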

Page Summary

XForms is dead
What is this garbage: Ãƒâ€šÃ‚Â
Browser stupidity of the nonce
XHP: Making mixed PHP+HTML cleaner
Thoughts on filtering include()d HTML
Notes on dragging and dropping in Javascript
HTML5 forms cannot have a blank action
HTML escape sequences and content-before
RemoveHandler .pl .cgi
.htaccess variables
XHTML and the revolt of the browser developers
Most spinners are wrong

Page generated Jul. 3rd, 2025 06:33 pm
