Optimizable Malleability, and other thoughts
Optimizable Malleability
The life of a productive environment may look like:
- High malleability - you can do almost anything, with few restrictions.
- Exploration - certain workflows are found to be better than others.
- Optimization - simplified interfaces increase productivity while leaving out lesser-used features.
- Stagnation - The existence of a standard way of doing things discourages the development of alternative ideas.
- Replacement - Somebody invents a new system that replaces the entire old system.
Examples of malleable systems and optimized systems:
- Assembly language lets you do anything, leading to goto spaghetti. C gives you formal looping and function call patterns along with a subset of assembly's features. Higher level languages have additional formal implementations of common programming patterns and lack some of C's features to prevent entire categories of bugs. Driver writers and OS programmers still need the lower-level features, but 90% of programming can be done without them.
- Text files allow you to store data in any format, and many Unix tools invented their own text file format. INI, CSV, JSON, and XML are types of text files that are optimized toward storing certain types of information.
- Tail call recursion is an optimizable subset of recursion.
- Excel lets you freely place data without structure. Access requires you to define the structure first.
- Static typing versus dynamic typing. A language could be dynamic by default with optional static typing and optional type constraints. In this case the optimized system (static typing) came first and the malleable system came later.
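The tail call item above can be made concrete. What follows is a minimal sketch in Python, with hypothetical function names, of why the tail-call subset is optimizable: because the recursive call is the last action, it can be rewritten mechanically as a loop. CPython does not perform this rewrite itself, so the second function is done by hand.

```python
def sum_to_recursive(n, acc=0):
    # Tail call: the recursive call is the very last thing the function does,
    # so nothing in the current stack frame is needed after it returns.
    if n == 0:
        return acc
    return sum_to_recursive(n - 1, acc + n)


def sum_to_loop(n, acc=0):
    # Mechanical rewrite of the function above: rebind the parameters and
    # jump back to the top instead of pushing a new stack frame.
    while n != 0:
        n, acc = n - 1, acc + n
    return acc
```

General recursion, where work remains after the call returns, cannot be rewritten this simply. That is the sense in which tail recursion is the optimized subset.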
Stagnated systems and their replacements:
- Why aren't web forums based on Usenet/NNTP? The environment of HTML+SQL allowed for the development of alternatives at a lower cost.
- Why don't new languages use object files? The burden of compatibility is not worth it.
- Why aren't most programs made of shell scripts? Fewer features, security concerns, etc.
Reconsidering the notion of a "program"
Let us view the entire operating environment as a set of mechanisms for directing input to a function.
Consider the command line shell.
- The user provides the program with representations of the function's arguments. The arguments are not passed directly; instead the user provides a string that represents each argument. Code inside the program converts the string to a filename or URL, parses the data, and provides the result as the argument to the function.
- The program provides standard input and output methods, using pipes.
- The program may allow the user to set the initial state of the program, using switches.
We could say that a program is a wrapper that sets state, handles the standard input and output, and provides access to the desired function(s). Let us consider a future operating environment that separates these tasks.
- The future system could automate the generation of wrapper code.
- The future system may use dynamic type casting, late binding, just-in-time compilation, etc. to rewrite part of the program based on the user's input.
- The future system might allow the caller to directly set global variables / class members by name, doing away with the need to parse switches.
- Inputs and outputs may be something other than raw byte streams. The pipes may have a class type. The system may provide a standard iterator for a given type and handle casting automatically. The standard Unix tools may be rewritten to work on interfaces.
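As a rough illustration of automated wrapper generation, the sketch below derives a command-line parser from a function's signature, so the function author never writes switch-parsing code. It uses Python's standard `inspect` and `argparse` modules. `make_cli` and `repeat` are hypothetical names invented for this example, and it assumes every parameter carries a simple type annotation that can convert a string.

```python
import argparse
import inspect


def make_cli(func):
    """Hypothetical helper: build a command-line wrapper from func's signature."""
    parser = argparse.ArgumentParser(prog=func.__name__)
    for name, param in inspect.signature(func).parameters.items():
        # Use the annotation as the string-to-value converter; default to str.
        if param.annotation is inspect.Parameter.empty:
            convert = str
        else:
            convert = param.annotation
        if param.default is inspect.Parameter.empty:
            # No default: a required positional argument.
            parser.add_argument(name, type=convert)
        else:
            # Has a default: an optional switch named after the parameter.
            parser.add_argument("--" + name, type=convert, default=param.default)

    def run(argv):
        args = parser.parse_args(argv)
        return func(**vars(args))

    return run


# The "desired function", written with no knowledge of the command line.
def repeat(text: str, times: int = 2, sep: str = " "):
    return sep.join([text] * times)


repeat_cli = make_cli(repeat)
```

Calling `repeat_cli(["hi", "--times", "3"])` converts the strings and invokes `repeat("hi", times=3)`. This covers only the wrapper-generation and set-by-name ideas; typed pipes would need support from the environment itself.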
Incomplete classes and comprehensive dynamic typing
Consider the evolution of classes.
- 1970s - C - A struct is, basically, a way to arrange data
- 1980s - C++ - A class is, basically, a way to arrange functions around the data they operate on
- 1990s - C#/Java - An interface is, basically, a definition for a set of functions that we will expect any given type to implement.
The struct, class, and interface can all be viewed as sets of key/value pairs: member names mapped to data, functions, or function signatures.
Consider the function:
sub doStuff(x){ return x + 1 }
In the "message passing" concept we might consider x:Dynamic and expect the runtime to check whether the given x has a ._plus() method at call time. (A more complex system could check for this when the object to be passed as x is known.) In the "interface" concept we might check that x implements the INumeric interface which includes a ._plus() function. "INumeric" is a way of saying that a dynamic type will implement a standard set of methods, what C# calls a contract.
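The two approaches can be sketched in Python, under the assumption that the `._plus()` method maps onto Python's `__add__`. The `INumeric` protocol below is a stand-in written for this example, not a standard library type. A `runtime_checkable` protocol only checks that the method exists, so it is a weak form of contract.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class INumeric(Protocol):
    # The contract: any conforming type provides a plus method.
    def __add__(self, other): ...


def do_stuff_dynamic(x):
    # "Message passing": send the + message and let the runtime
    # discover at call time whether x can handle it.
    return x + 1


def do_stuff_checked(x):
    # "Interface": verify the contract before doing any work.
    if not isinstance(x, INumeric):
        raise TypeError("x does not implement INumeric")
    return x + 1
```

Both versions reject `None`, but the checked version fails at the boundary of the function with a message that names the missing contract, which is the information a compiler or IDE would need to report the error earlier.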
An interface might be seen as an incomplete data type. An incomplete data type might be seen as a type of filter. A data type filter could describe the partially resolved value of a dynamic type, or could be used to implement very strict typing.
interface IPercent:UInt8 {
    _value = 0..100   // Value is between 0 and 100 inclusive
}

interface asdf {
    foo:String and len = 0..8   // .foo will be a string whose length is at most 8

    // let's propose something outrageous...
    baz: fnord(_) between &("camel"), dronf("llama") and & > 3
    // * we will have a .baz member
    // * the namespace will also include functions fnord() and dronf()
    // * the result of fnord(baz) has a comparator that puts its result
    //   between the results of fnord("camel") and dronf("llama"),
    // * and the value of fnord(baz) will be greater than 3
    // * or else the program throws a type mismatch exception,
    //   preferably at compile time
}
This would be of little use to a programmer, but might be useful to a compiler, runtime, or IDE.
// as written by the programmer
function doSomething(futz):

// as interpreted internally
function doSomething(futz:{
    foo:String = "foo"   // string with a known value
    bar():Int            // an undefined function that returns int
})
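The simpler filters proposed above (IPercent and the bounded-length .foo member) can be approximated as runtime checks. This is only a sketch: the names `in_range`, `is_percent`, `ASDF`, and `conforms` are invented here, and the checks run when called rather than at compile time as the text would prefer.

```python
from types import SimpleNamespace


def in_range(lo, hi):
    # Build a predicate for the inclusive range lo..hi.
    return lambda value: lo <= value <= hi


def is_percent(value):
    # IPercent: an integer whose value is between 0 and 100 inclusive.
    return isinstance(value, int) and in_range(0, 100)(value)


# The "asdf" filter: member name mapped to the constraint it must satisfy.
ASDF = {
    "foo": lambda v: isinstance(v, str) and in_range(0, 8)(len(v)),
}


def conforms(obj, filt):
    # An object passes the filter if it has every named member
    # and each member's value satisfies its constraint.
    return all(
        hasattr(obj, name) and check(getattr(obj, name))
        for name, check in filt.items()
    )
```

For instance, `conforms(SimpleNamespace(foo="camel"), ASDF)` holds, while an object whose `foo` is longer than 8 characters, or that lacks `foo` entirely, is rejected.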
The same sort of filter or incomplete class can also be used as a search object.
interface SalariesOver30000:Employee {
    salary:Numeric >30000
}

select SalariesOver30000 from Employees   // pseudo-sql
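The pseudo-sql above might look like this in Python, treating the incomplete class as a predicate that doubles as a query. The `Employee` fields, the `select` helper, and the sample rows are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    name: str
    salary: float


def salaries_over_30000(row):
    # The filter: an Employee whose salary member is constrained to > 30000.
    return isinstance(row, Employee) and row.salary > 30000


def select(filt, rows):
    # "select <filter> from <rows>": keep the rows that satisfy the filter.
    return [row for row in rows if filt(row)]


# Hypothetical sample data.
employees = [
    Employee("Ada", 45000),
    Employee("Bob", 28000),
    Employee("Cy", 30000),
]
well_paid = select(salaries_over_30000, employees)
```

Here `well_paid` contains only the record for Ada, since the constraint is strictly greater than 30000. The same predicate could also serve as a type check on a function argument, which is the point of reusing one filter definition for both jobs.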
Function constraints
A compiler could theoretically prove certain facts about a function:
- the function never modifies its inputs
- the function never makes any asynchronous calls
- the function never opens any additional inputs or outputs
- whether the function's inputs have values known at compile time or only at runtime
A compiler could potentially use this information to optimize the function, possibly running it at compile time if enough information is known.
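A toy version of running a function at compile time can be built with Python's standard `ast` module. This sketch assumes a hand-maintained registry, `PURE`, of functions known to be free of side effects; a real compiler would have to prove that property itself. The names `PURE`, `FoldPureCalls`, and `fold` are hypothetical.

```python
import ast

# Functions assumed to be pure: no input mutation, no I/O, no async calls.
PURE = {"square": lambda n: n * n}


class FoldPureCalls(ast.NodeTransformer):
    def visit_Call(self, node):
        self.generic_visit(node)  # fold nested calls first
        if (
            isinstance(node.func, ast.Name)
            and node.func.id in PURE
            and not node.keywords
            and all(isinstance(arg, ast.Constant) for arg in node.args)
        ):
            # Every input is known, so run the function now and
            # replace the call with its result.
            value = PURE[node.func.id](*(arg.value for arg in node.args))
            return ast.copy_location(ast.Constant(value=value), node)
        return node


def fold(source):
    # Parse the source, fold what can be folded, and print it back out.
    tree = FoldPureCalls().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))
```

Given `"y = square(square(3)) + f(x)"`, `fold` returns `"y = 81 + f(x)"`: the pure calls with known inputs are evaluated ahead of time, while `f(x)` is left for runtime because nothing is known about `f` or `x`.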
My TODO list for language design is to look into:
- intermediate languages
- continuation-passing-style (CPS) interpreters
- A-normal form (ANF) representations
- static single assignment (SSA) representations
- Haskell's universal machine and Higher Order Abstract Syntax
- read: static single assignment for functional programmers
- ANTLR