PC-BSD needs a better startup system

It's 2015 and we still haven't fixed init systems yet.

  • Problem: Duplication of code in the scripts. For example, every script contains code that checks to make sure the script is not disabled. This is best handled by the caller.
  • Problem: Startup scripts are disabled by finding a magic keyword in the startup script and manually adding it to a global config file. This is clunky and there has to be a better way.
  • Problem: Startup waits 30 seconds for a wireless connection that will never come when I boot my laptop away from home. I can ^C past it, but I shouldn't have to.

Goal: allow a parallel system startup.

  • Any operation may take operands (parameters, arguments). Example: function foo($arg1, $arg2, $arg3)
  • Any operands may have attributes. Example: f(long int $arg1, fancy blue dishwasher $arg2)
  • The reader (compiler or interpreter) takes operation + operands and produces instructions that will have the desired effect. This can be done at runtime or at compile time.

Operations, operands, and attributes might be seen respectively as Verbs, Nouns, and Adjectives.

The question of what instructions are to run depends upon the attributes of the operands. For the simplest example, the code "2 + x" will produce different instructions depending upon the attributes of x.

small int x = 42
return 2 + x // add ax, 2

long int x = 42
return 2 + x // add eax, 2

float x = 42.0
return 2 + x // fadd ; floating-point add, result in st0

At some point the system determines "x is an int, use integer add" or "x is a float, use floating point add". The process of compilation can be described as the resolution of high-level code to lower-level implementation. This may involve translation through intermediary languages.

An interpreted high-level language will make these determinations at run time. A compiler may make the process more efficient by stripping out all of these determinations at compile time. At the extreme of efficiency, the reader may determine that the code will always produce the same result (42 + 2 = 44) and insert the resulting value rather than the code to produce it.
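As a minimal sketch of that last point in Go: a constant expression is folded at compile time, while the same arithmetic on a runtime value has to be emitted as an instruction (readFromUser is a made-up stand-in for any input the compiler cannot predict).

package main

import "fmt"

const x = 42

// readFromUser stands in for any value the compiler cannot predict.
func readFromUser() int {
    var n int
    fmt.Scan(&n)
    return n
}

func main() {
    fmt.Println(2 + x) // constant expression: computed at compile time, the binary just holds 44
    y := readFromUser()
    fmt.Println(2 + y) // unknown until runtime, so an add instruction must be emitted
}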

Optimization vs. reuse

A downside of such optimizations is that it becomes impossible to modify the behavior of a program at runtime if the behavior-determining code is optimized out. Correspondingly, an advantage to keeping code unoptimized is that the program may be modified by a wrapper that changes the resolution of a function call.

A system could provide the best of both worlds by using an object format that stores both the optimized version and a high-level interpretation, and recompiling on the fly when directed to change part of the program. To allow the recompilation of one subroutine at a time, instead of rebuilding the whole program, the metadata may include a catalogue of entry points.
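A rough sketch, in Go, of what such an object format's metadata might look like; the type and field names are invented for illustration and the recompiler itself is left as a stub.

package objfmt

import "errors"

// HybridObject is a hypothetical object format that keeps both the
// optimized machine code and the high-level source it was built from.
type HybridObject struct {
    Optimized   []byte         // native instructions, ready to run
    Source      string         // high-level form, kept for recompilation
    EntryPoints map[string]int // subroutine name -> offset into Optimized
}

// Recompile would rebuild a single subroutine from Source and patch
// only its region of Optimized, leaving the rest of the program alone.
func (o *HybridObject) Recompile(subroutine string) error {
    offset, ok := o.EntryPoints[subroutine]
    if !ok {
        return errors.New("no entry point for " + subroutine)
    }
    _ = offset // ... regenerate the code for this one subroutine here ...
    return nil
}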

After seeing Episode 7, I have some ideas of where Episodes 8 and 9 could go.

The current Star Wars mania puts Star Wars on my mind, which reminds me that "Episode 1" sucked. As a shitty fanfic writer, I will ask the question: what could be changed to make the movie better? Let's skip the popular opinion of cutting Jar Jar and consider other possibilities.

Go claims to have strings. That would be a nice feature except that most of the core library functions operate on byte arrays and you need to make an explicit cast every time you move from strings to byte arrays and back.
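A typical round trip looks something like this; os.ReadFile and sha256.Sum256 are just two examples of the pattern, not the only offenders.

package main

import (
    "crypto/sha256"
    "fmt"
    "os"
)

func main() {
    data, err := os.ReadFile("foo.txt") // file contents arrive as []byte
    if err != nil {
        panic(err)
    }
    s := string(data)               // explicit conversion to get a string
    sum := sha256.Sum256([]byte(s)) // and straight back to []byte to hash it
    fmt.Printf("%x\n", sum)
}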

Do you want to interpret a boolean as an int? Fuck you.

It is difficult to define complex types such as sets of function references. The syntax that works in a parameter definition produces a syntax error when I try to cast an array. By the way, you need to cast your arrays when you define them because the compiler cannot determine that information from the array's contents.
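A slice of function values shows the problem: the element type has to be written out in full in the composite literal, because the compiler will not infer it from the contents (double and square are made up for the example).

package main

import "fmt"

func double(n int) int { return n * 2 }
func square(n int) int { return n * n }

func main() {
    // The full element type must be spelled out; the compiler will
    // not deduce it from double and square alone.
    fs := []func(int) int{double, square}
    for _, f := range fs {
        fmt.Println(f(3))
    }
}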

Error handling is... different. For example, if you wanted to write to a file in any other language you would do something like this:

status = fopen("foo.txt", "w")

or:

try {
 fopen("foo.txt", "w")
} catch e {
...
}

In Golang you need to do something like this:

var writer *bufio.Writer
f1, err1 := os.Create("foo.txt") // open the file for writing
if err1 != nil {
    return writer, err1
}
err2 := f1.Chmod(0644) // any further operation on the file can also fail
if err2 != nil {
    return writer, err2
}

return bufio.NewWriter(f1), nil // finally, wrap it in a buffered writer

... so you end up with long repeating branches in a function that would be a few lines in any other language.

Do you want to close the file, allow other subprocesses to work on it, and reopen it later for reading? Fuck you.

Have I mentioned that I don't like golang's error handling pattern? I don't like golang's error handling pattern.

Go gets one thing right and that is its goroutines. Parallelization of a function should be as easy as putting the word "go" in front of the function. Other languages should pick up on this.
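A minimal sketch of the pattern; sync.WaitGroup is just one standard way to wait for the goroutines to finish.

package main

import (
    "fmt"
    "sync"
)

func process(id int, wg *sync.WaitGroup) {
    defer wg.Done()
    fmt.Println("working on", id)
}

func main() {
    var wg sync.WaitGroup
    for i := 0; i < 4; i++ {
        wg.Add(1)
        go process(i, &wg) // "go" is all it takes to run this concurrently
    }
    wg.Wait()
}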

Sorting

Nov. 29th, 2015 08:44 pm

Streamable sorting

A streamable sorting algorithm is one that has an intermediary state in which the entire set may not yet be sorted, but it can be guaranteed that some number of items at the head are sorted. For example, the bubble sort will bring the optimal item to the head of the set before the rest of the set is sorted. The sorted elements at the head may be dispatched while the sorting algorithm continues to run on the remainder of the list.
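A minimal Go sketch of the idea, using a selection-style pass that moves the smallest remaining item to the head and sends it down a channel as soon as it is known to be in its final place; the function names are made up for the example.

package main

import "fmt"

// StreamSort sorts data in place and emits each element on out as soon
// as it is guaranteed to be in its final position at the head.
func StreamSort(data []int, out chan<- int) {
    for i := 0; i < len(data); i++ {
        // One pass: move the smallest remaining element into position i.
        min := i
        for j := i + 1; j < len(data); j++ {
            if data[j] < data[min] {
                min = j
            }
        }
        data[i], data[min] = data[min], data[i]
        out <- data[i] // data[0..i] is final, so this item can be dispatched
    }
    close(out)
}

func main() {
    out := make(chan int)
    go StreamSort([]int{5, 3, 8, 1, 4}, out)
    for v := range out {
        fmt.Println(v) // a downstream consumer can start work immediately
    }
}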

The use case is to pipe data to a different operation in a multi-stage process so that the entire process will be completed more quickly than if the process had waited for the set to be sorted. This could be in shell scripting or any I/O bound transfer of data. Another use case is in a user-facing interface where the appearance of doing something is more important than total runtime.

Speed considerations

A strong argument against use of a streamable sorting algorithm is that non-streamable algorithms are so much faster than the bubble sort that the process is likely to complete sooner if one chooses the better algorithm and waits for it to complete.

Made-up example:

  1. Algorithm A will sort in 10 seconds and can start transferring after 4 seconds because it knows that the head is optimal.
  2. Algorithm B will sort in 5 seconds, but I/O cannot begin transfer until all items are sorted.

The slower algorithm will get the data across sooner if certain conditions are true:

  1. Transfer time is slower than the sort time of the faster algorithm.
  2. The transfer process will not interfere with the sorting process. On modern multi-cpu systems this should not be a problem.
  3. The fraction of the streamable algorithm's sort time at which transfer may begin (say, 40%) is lower than the ratio of the non-streamable algorithm's sort time to the streamable algorithm's (say, algorithm B finishes sorting in 50% of the time of the streamable algorithm A). Note that this will depend heavily on the set size due to the performance differences between sorting algorithms.

Variables:

  • Tx = (Time to transfer X objects)
  • At = (start of transfer for algorithm A)
  • Bt = (start of transfer for algorithm B)

When transfer time is slower than the sorting time of either sort algorithm, the overall process is bound by when the transfer begins. The streamable algorithm will be faster when it begins transfer before the non-streamable algorithm would finish and begin transfer.

When transfer time is faster than the slower streamable sorting algorithm, the streamable algorithm is bound by its own slowness. The process will be faster than a non-streamable algorithm only if transfer time remains slow enough for (Tx + Bt) to be greater than the time needed for the streamable algorithm to finish sorting.
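Plugging the made-up numbers from above into those two bounds — the sort times and the 4-second transfer start are the figures from the example, the two transfer times are arbitrary:

package main

import (
    "fmt"
    "math"
)

func main() {
    const (
        sortA  = 10.0 // streamable: finishes sorting at 10s
        startA = 4.0  //             but can begin transfer at 4s
        sortB  = 5.0  // non-streamable: transfer cannot begin before 5s
    )
    for _, tx := range []float64{3, 8} { // two arbitrary transfer times
        finishA := math.Max(sortA, startA+tx) // A is bound by sorting or by transfer
        finishB := sortB + tx                 // B sorts fully, then transfers
        fmt.Printf("Tx=%.0fs: A done at %.0fs, B done at %.0fs\n", tx, finishA, finishB)
    }
}

With the fast 3-second transfer B wins (8s against 10s); with the slow 8-second transfer A wins (12s against 13s), which matches the conditions described above.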


Dividing sort time across sender and receiver

Imagine the following:

  • We need to sort and transfer data between two systems that both need to minimize CPU use.
  • We have a sorting algorithm that divides the set into chunks and sorts the chunks over multiple passes.
  • We can predict the number of passes needed by the algorithm from the set size.

Assuming that it is still the 1990s when these limits would matter (this is more of an intellectual exercise than a serious proposal), we can divide the effort of sorting between client and server. The sender runs half of the expected number of passes and sends the data partially-sorted. The receiver finishes sorting the half-sorted data as the data is received and inserts the items coming down the pipe where they best fit.
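A rough sketch of the receiver's half in Go: items arrive on a channel already roughly ordered, and each one is inserted where it best fits into the data received so far (sort.Search does the binary search for the insertion point; the names are invented for the example).

package main

import (
    "fmt"
    "sort"
)

// receive finishes the sender's half-done sorting by inserting each
// arriving item into its proper place among the items received so far.
func receive(in <-chan int) []int {
    sorted := make([]int, 0)
    for v := range in {
        i := sort.Search(len(sorted), func(j int) bool { return sorted[j] >= v })
        sorted = append(sorted, 0)     // grow by one
        copy(sorted[i+1:], sorted[i:]) // shift the tail right
        sorted[i] = v                  // drop the new item into place
    }
    return sorted
}

func main() {
    in := make(chan int)
    go func() {
        for _, v := range []int{1, 3, 2, 5, 4} { // a partially-sorted stream
            in <- v
        }
        close(in)
    }()
    fmt.Println(receive(in))
}

Because the stream arrives partially sorted, each insertion point tends to be near the end of the received data, so the receiver's share of the work stays small.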

Let us say that we have a data object and we want to manage metadata about the object. There are multiple ways of handling this.

Proposed: The basic data types of a programming language should be interfaces, the standard library should work on interfaces, and the low-level data types should be implementations of the basic interface.

The PHP devs spent years trying to convert their default string type from ASCII to Unicode before giving up. Golang shits a brick if you try to convert between a string and a []byte without an explicit cast. Consider instead a language in which every function is written to operate on a string interface rather than a string data type, and the compiler and optimizer figure out what to do with whatever is passed in. []byte can implement the string interface. The programmer can send the standard string functions data in any charset (Korean, Japanese, whatever) and it will just work, because a string in any charset will implement the string interface.
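A rough Go-flavoured sketch of the proposal; Go's real string is a concrete type, so the Text interface and everything built on it here is entirely made up.

package main

import (
    "fmt"
    "unicode"
)

// Text is a hypothetical basic string interface; the standard library
// would be written against it instead of against a concrete string type.
type Text interface {
    Runes() []rune // the characters, whatever the underlying encoding
}

// RawBytes lets a plain []byte satisfy Text, assuming UTF-8 contents.
type RawBytes []byte

func (b RawBytes) Runes() []rune { return []rune(string(b)) }

// ToUpperText is what a standard-library string function would look
// like: it accepts any Text implementation, []byte-backed or otherwise.
func ToUpperText(t Text) []rune {
    out := t.Runes()
    for i, r := range out {
        out[i] = unicode.ToUpper(r)
    }
    return out
}

func main() {
    fmt.Println(string(ToUpperText(RawBytes("hello"))))
}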

Another use case: you read in a number as an integer. It could be a bigint or a small int. You don't care until you need to store it again. Performance and storage do not matter; you just need some math or a comparison to work. You write your code to work on an "int" interface and let the compiler figure out how to implement it.

The conventional handling of technology upgrades in a video game is fairly simple:

  • Item stats go up
  • Cost of next upgrade goes up

The better games will open up new branches of gameplay possibilities, but let's leave that out of the discussion.

I've been playing a game where the tech upgrades often produce surprisingly large leaps in power. The game makes the player choose between replacing items with the newer technology (which costs a limited supply of money) or waiting for the next upgrade (which may be either a minor upgrade or a huge upgrade). These upgrades appear to be hardcoded. What if they were randomized?

It is common for modern roguelike games to have categories of upgrades. The classical D&D +1, +2, +3 system is only the start. Then there is a category of upgrades that improves one stat, a "rare" category that improves two stats, and a "legendary" category that improves three or more stats. The upgrade categories are usually color-coded for your convenience.

Let us consider a game where the player pours resources into research, and imagine the possibilities of different outcomes; a sketch of one way to roll between them follows the list.

  • Complete failure. The scientists shamefully report that the plan didn't work and they have to go back to the drawing board. However, there is an increased chance of succeeding in the next research attempt. This was used in the DOS game Stellar Conquest 2469 (stelcon).
  • Two steps forward, one step back. Two (or more) stats are upgraded, while one (or more) other stats are downgraded. This reflects the TANSTAAFL principle.
  • Technology branch. A new branch of research is opened for this type of item. This may involve a downgrade of the item stats. However, because this is a new and unexplored branch of research, the cost of advancement is reduced. This can be considered a subcategory of "two steps forward, one step back".
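A sketch of how such a roll might be weighted across those outcomes; the outcome names and weights are invented for the example.

package main

import (
    "fmt"
    "math/rand"
)

type outcome struct {
    name   string
    weight int
}

// rollResearch picks one research outcome, with the odds set by weight.
func rollResearch(outcomes []outcome) string {
    total := 0
    for _, o := range outcomes {
        total += o.weight
    }
    r := rand.Intn(total)
    for _, o := range outcomes {
        if r < o.weight {
            return o.name
        }
        r -= o.weight
    }
    return outcomes[len(outcomes)-1].name // unreachable, but keeps the compiler happy
}

func main() {
    outcomes := []outcome{
        {"complete failure (next attempt is cheaper)", 20},
        {"plain upgrade", 50},
        {"two steps forward, one step back", 20},
        {"new technology branch", 10},
    }
    fmt.Println(rollResearch(outcomes))
}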

In a game where different item choices require different gameplay styles, the optimal choice of items may become different each time the game is played. Imagine a space combat game where missiles got cheap and powerful in one playthrough while their range and reloading speed improved in another playthrough. Imagine an RTS where in one playthrough your ranged units get increased range while in another playthrough they get increased hit points. Imagine an RPG where in one playthrough your healer gets reduced cost and casting time on the healing spell while in another playthrough only the healing power improves significantly. The behavior of the game components may change just enough to make the player choose a different strategy, increasing replay value.

This is starting to sound like removing the upgrade tree from player control. My intended vision was for randomized upgrade paths resembling organic evolution, where each upgrade is the starting point for the next upgrade. Imagine that a crafter develops a new technique which requires more time and resources but produces a more resilient product; you can continue using the old technique or refine the new technique.

Counter-thought: the Dragon Warrior series randomized the values of upgraded stats when a player gained a level. It barely affected the game because the characters were different enough in their skillsets that the trivial differences in stats did not matter; the stat upgrades were guided by weights for each character class; and the randomness approached the mean over enough levels. If the upgrades are expected to be different enough to affect gameplay when the game is replayed, these normalizing factors need to be overcome.

Layers of abstraction

At the hardware layer we have the processor's native micro-operations, translated from amd64 machine code by the CPU's own translator. We operate on registers, have to keep track of what data is in what register, and if the data has any meaning there is no way for the computer to know about it.

At the low level of abstraction we operate on bytes and sets of bytes. OS, POSIX, and standard library interfaces are written in C. One might consider high level assembler with strings and matrices to be at this level. There are multiple data definition formats to define how data is laid out in memory: C header files, SQL data definitions, a custom format for each language.

At the intermediate level of abstraction we define classes to associate the data structures with the methods that work on the data. We have templates, interfaces, abstract classes, and generics to allow code reuse.

At higher levels everything is abstract. We create anonymous objects on the fly and tell the computer to run operations on members that we hope will have been defined by runtime.

Method abstraction

There are similar layers of abstraction for methods, functions, subroutines. At the hardware layer a method is just a series of instructions that you jump to. Low-level programming allows you to describe the inputs and outputs: this function takes a byte pointer and returns a byte pointer. At the intermediate level of abstraction you can define a template function that allows the same code to be reused on different data types. At higher levels of abstraction all functions do that by default.
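To make the middle level concrete in Go terms: the low-level version of a function nails its types down, while a type-parameterised version (type parameters arrived in Go well after this post, so this is purely illustrative; maxInt and maxOf are made-up names) reuses one body for any ordered type.

package main

import (
    "cmp"
    "fmt"
)

// maxInt is the low-level version: inputs and output are nailed down.
func maxInt(a, b int) int {
    if a > b {
        return a
    }
    return b
}

// maxOf is the template-style version: one body, reused for any
// ordered type the caller brings.
func maxOf[T cmp.Ordered](a, b T) T {
    if a > b {
        return a
    }
    return b
}

func main() {
    fmt.Println(maxInt(2, 3), maxOf(2.5, 3.1), maxOf("a", "b"))
}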

Abstracting operating environments

We have had all of these advances in abstraction in the programming area but our shells are still stuck in the early 1990s. Programs are written to read bytes or lines at a time. Streams of characters are passed between programs with no information about what the characters mean. Each program in the sequence must re-parse the stream.

Consider this sequence.

sh$ ls -l | bar > baz

Presently this outputs a chunk of text, and the bar program needs to know what to do with the text. Imagine instead that ls -l generates a table or recordset of files and their attributes, and what it outputs is context-dependent.

  • If the receiver is a text terminal, the shell uses a formatter provided with ls to generate a text table compatible with today's ls -l.
  • If the receiver is a hypertext terminal running in a windowed environment, the shell uses a formatter to send the data and metadata to the terminal in a given standard format that the terminal will use to produce an interactive display of files.
  • If the receiver is a .csv file, the shell automatically uses the standard table-to-CSV converter from the operating system's libcsv.
  • If the receiver is a program that requests input of a given class, the shell attempts to present the input as that class.
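To make the ls -l example above concrete, here is a rough Go sketch of the kind of record such a shell might pass around instead of text; the struct, its field names, and the choice of JSON as the serialized form are all arbitrary.

package main

import (
    "encoding/json"
    "fmt"
    "time"
)

// FileRecord is a hypothetical structured record that a high-level
// "ls -l" would emit instead of a formatted line of text.
type FileRecord struct {
    Name    string    `json:"name"`
    Size    int64     `json:"size"`
    Mode    string    `json:"mode"`
    ModTime time.Time `json:"modtime"`
}

func main() {
    rec := FileRecord{Name: "foo.txt", Size: 1234, Mode: "-rw-r--r--", ModTime: time.Now()}
    // A text terminal would get this through a formatter; a CSV or
    // hypertext receiver could consume the structured form directly.
    out, _ := json.Marshal(rec)
    fmt.Println(string(out))
}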

In a high-level shell design,

  • Programs are written to read more abstract sequences of objects. The implementation of the reader is chosen at runtime. Programs will have metadata exposing their expected input types.
  • Programs may pass both data and metadata. There will likely be a standard for a data+metadata format, and all standard readers will automatically recognize it.
  • Programs may pass pre-parsed chunks of data to skip the step of parsing by subsequent programs. This may be as simple as copying a chunk of memory from one program to another on the same computer. The high level program code will be the same whether the receiving program runs on the same computer (memory copy) or sends the data over the network.

Members of Turkey's opposition Republican People's Party have accused the Turkish state of responsibility for the Ghuta gas attack, claiming to have wiretaps proving that the sarin and its delivery system were produced or acquired by the Turkish Mechanical and Chemical Industry Corporation.

From John Schindler:

We've reached the point of security collapse in DC that the general reaction to the CIA director getting hacked & doxxed is "yeah, sure, ok"

Browsing through the news a little too quickly, I badly misread the headline "Nato warns Russia on Syria strikes" and now I can't stop imagining a certain whiskered ninja bragging about his jitsu skills to a flexing, nonchalant Vladimir Putin.

Link related.
