During a classroom exercise, the programming teacher told us not
to put argument-checking code such as ptr != NULL inside a
function, but instead to check the arguments before calling the function.
My first reaction was: well, you're old. To be fair, he was teaching us
recursion at the time, where small amounts of wasted time add up through
repetition, and such precautions can sometimes be skipped for internal
functions that will never be called from other places or by other
developers. But let me explain my initial reaction.
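To make the two styles concrete, here is a minimal C sketch; the function
and its names are invented purely for illustration.

#include <stdbool.h>
#include <stddef.h>

/* Style the teacher discouraged: the function defends itself. */
bool count_items_checked(const char *list)
{
    if (list == NULL)            /* argument check inside the function */
        return false;
    /* ... do the real work ... */
    return true;
}

/* Style the teacher preferred: the function trusts its caller... */
bool count_items(const char *list)
{
    /* ... do the real work, assuming list is not NULL ... */
    return true;
}

/* ...and the caller checks before calling. */
void process(const char *list)
{
    if (list != NULL)            /* argument check before the call */
        count_items(list);
}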
There were arguments in the 1990s over who is to blame when a bad
parameter causes a function to crash a program. Coders of the old school
maintained that crashing libraries were the fault of the third-party
programmers for passing in bad values when the documentation clearly
said that such values were not allowed. This practice probably stems from
the 1970s and earlier when every CPU cycle counted, when programmer effort
was cheaper than CPU effort and memory space. The new and contrary idea
which won the argument was that libraries should be so solid and robust
that third-party programmers should not be able to crash them.
Still, the old-school programmers recognized a useful concept.
The data-checking code, which only validates the arguments, can be
considered separate from the operational code that does whatever the
function is meant to do with them. This might not change how we write
code, but perhaps it could change how code is compiled and run.
Applying this concept to the build process
Consider this pseudocode:
myfunction (x,y,z)
10: return false if x == NULL
20: return false if y > 24
30: return false if z < 0
40: Do something
...
90: return true
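A literal C rendering might look like the sketch below; the types are
invented, and the numbered comments correspond to the pseudocode lines
referenced in the rest of this post.

#include <stdbool.h>
#include <stddef.h>

bool myfunction(const int *x, int y, int z)
{
    /* 10 */ if (x == NULL) return false;
    /* 20 */ if (y > 24)    return false;
    /* 30 */ if (z < 0)     return false;
    /* 40 */ /* do something with *x, y, and z */
    /* ... */
    /* 90 */ return true;
}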
The first several lines of the function ensure that the arguments
are passed correctly. A sufficiently smart compiler could recognize the
parameter checking code -- perhaps as code which returns false or throws
an error before any data is modified -- and create some metadata saying
that for all calls to myfunction(), x must be non-null, y must not
exceed 24, and z must not be negative.
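Hints along these lines already exist in real toolchains. GCC and Clang,
for example, accept a nonnull attribute on a declaration; as far as I know
there is no equally standard attribute for the range constraints on y and
z. A sketch:

#include <stdbool.h>

/* Declares that the first argument must never be NULL. With -Wall/-Wnonnull
 * the compiler warns when a literal NULL is passed, and the optimizer is
 * allowed to assume the pointer is non-null, possibly dropping a redundant
 * x == NULL check inside the function. */
bool myfunction(const int *x, int y, int z) __attribute__((nonnull(1)));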
An even smarter compiler[1] could inspect later code and see if the
parameters passed into myfunction() are set to known valid values,
which would include constants, or non-volatile values that have already
been checked against the same constraints and have not changed since.
If the values are all knowable and within acceptable ranges, the
compiler can have this function call jump to line 40 instead of line 10,
saving a whole three to six ops.
[1] I've earlier used the term "never-watches" to describe
this kind of inspection. I hear that modern compilers have some
functionality like this, but I don't know what they are capable of.
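One way to picture the transformation is a second, unchecked entry point
that proven-safe call sites use directly. This is purely hypothetical; I am
not claiming any compiler emits exactly this.

#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "line 40" entry point, with the checks stripped out. */
static bool myfunction_unchecked(const int *x, int y, int z)
{
    /* do something with *x, y, and z */
    return true;
}

/* The public entry point keeps lines 10 through 30. */
bool myfunction(const int *x, int y, int z)
{
    if (x == NULL) return false;   /* 10 */
    if (y > 24)    return false;   /* 20 */
    if (z < 0)     return false;   /* 30 */
    return myfunction_unchecked(x, y, z);
}

/* A call whose arguments the compiler can prove valid could then be
 * compiled as if it went straight to the unchecked entry point. */
bool trusted_caller(void)
{
    static int data = 42;
    return myfunction_unchecked(&data, 10, 3);
}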
Problems with the idea
That's not much of a savings
The CPU is probably spending more time blocked on memory I/O
than it would spend running these checks at the start of each function.
Code reordering means there might be literally no time savings in the
common case: a data-fetch instruction can be moved ahead of the checks,
so the checks run while the CPU is waiting for the data anyway.
There are also no memory savings, not that it would matter today.
The compiler cannot leave lines 10 through 30 out of the library
because third-party developers will often pass in variables whose
range of possible values cannot be determined at compile time. This
is essentially a halting-problem situation: some ranges can be determined
at compile time, others cannot be determined at all.
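A hypothetical caller shows both cases side by side: the compiler can
prove everything about the first call and nothing useful about the second.

#include <stdbool.h>
#include <stdio.h>

/* Declaration of the checked function from the pseudocode above. */
bool myfunction(const int *x, int y, int z);

int main(void)
{
    static int data = 7;
    int y = 0, z = 0;

    /* Constant arguments: the compiler could prove the checks pass and,
     * in principle, jump straight to line 40. */
    myfunction(&data, 10, 3);

    /* Values read at run time: the ranges of y and z are unknowable at
     * compile time, so lines 10 through 30 have to stay in the library. */
    if (scanf("%d %d", &y, &z) == 2)
        myfunction(&data, y, z);

    return 0;
}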
If we were to go further and leave lines 10 through 30 out of the
library, relying on the compiler and linker to reject code that does
not match the constraints, then anybody could link in bad code by using
their own development tools.
Those aren't always constraint checks
Functions like isalpha() may use tests indistinguishable from argument
checking code as part of their functional logic. A value which causes
these functions to immediately return false is not an invalid value.
When our too-sufficiently smart compiler says that the arguments
should be restricted to a range, it is wrong.
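Here is a hand-rolled, ASCII-only stand-in for isalpha() (the real one
lives in <ctype.h> and is locale-aware); the early returns look exactly
like the guards at lines 10 through 30, but they are the function's
answer, not precondition checks.

#include <stdbool.h>

static bool my_isalpha(int c)
{
    /* These tests are the logic, not argument validation: '9' and '~'
     * are perfectly valid inputs that simply produce false. */
    if (c < 'A') return false;
    if (c > 'z') return false;
    if (c > 'Z' && c < 'a') return false;
    return true;
}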
The one advantage of the sufficiently smart compiler is that it
would have code with known arguments jump straight to the logic for
handling these arguments. An even smarter compiler could inline and
optimize that code to get rid of the function call entirely.
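Compilers can already do this when the function body is visible at the
call site: inlining plus constant propagation folds the tests away. A
sketch, reusing the my_isalpha() defined above:

#include <stdbool.h>

static bool my_isalpha(int c)    /* repeated here so the sketch stands alone */
{
    if (c < 'A') return false;
    if (c > 'z') return false;
    if (c > 'Z' && c < 'a') return false;
    return true;
}

bool first_char_is_letter(void)
{
    /* The argument is a compile-time constant, so an optimizing compiler
     * can inline my_isalpha(), evaluate the three tests, and reduce this
     * function to a plain "return true". */
    return my_isalpha('Q');
}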
The data may change at runtime
So a particular function call goes straight to line 40 because only
constants were used in the code. Then a debugger, or some hostile code,
sets x to NULL.