Nifty link of the nonce
"Why Python Is Slow" is a quick read on Python's dynamic typing. I had not known that it was possible to change the value of an integer constant in Python.
Math has a concept called singular value decomposition. The short version is that you put in one matrix and get three out. It is apparently a well-known tool in engineering, so it is implemented in the data analysis language IDL and in the NumPy library for Python, and you can probably guess where this is going: the singular value decomposition functions in IDL and NumPy produce different matrices for the same input matrix.
I've only tested one chunk of data, so I do not know if the pattern will hold for different input matrices.
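For anyone who wants to poke at this, here is a minimal NumPy sketch (the input matrix is mine, purely illustrative). An SVD is only unique up to sign and ordering conventions, so two libraries can disagree on the individual factors while both being correct; the reconstruction is what has to match:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# One matrix in, three out: A == U * diag(s) * Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Individual factors can differ between libraries (e.g. flipped signs in
# paired columns of U and rows of Vt), but the product should not.
A_rec = U @ np.diag(s) @ Vt
print(np.allclose(A, A_rec))  # prints True
```

Comparing U @ diag(s) @ Vt from IDL and NumPy against the input would show whether the discrepancy is a real bug or just a convention difference.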
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 6034: ordinal not in range(128)
The solution is to comment out the
config = config.decode('ascii')
line in Lib/site-packages/setuptools/easy_install.py.
Here's an even more fun one:
File "geopts.py", line 128, in xp2str
    s = etree.tostring(resultset[0], method="text")
File "lxml.etree.pyx", line 3165, in lxml.etree.tostring (src\lxml\lxml.etree.c:69399)
exceptions.TypeError: Type '_ElementStringResult' cannot be serialized.
This says that the XML library's own tostring() function cannot convert one of its own string types to a string. What's especially brilliant about this is that _ElementStringResult inherits from the native string class.
Here is a hacky attempt to manage the problem:
r = resultset[0]
if isinstance(r, etree._ElementStringResult):
    s = r
else:
    s = etree.tostring(r, method="text")
Chromatic's book Modern Perl (2011) is a free download. It's a good refresher/updater for people who already have some experience with Perl, like me, whose copy of the Camel Book is the 1996 edition.
Consider a generic function like this one:
function square(x){ return x * x; }
In C, you tell the compiler what data type 'x' is, the compiler emits corresponding assembly code, and that becomes the canonical square() function. An object-oriented system adds a layer of overhead to track which objects have which interfaces and decides what to do at runtime.
Imagine something in between with JIT-like compilation based on how the caller uses the function.
There would be no one canonical square() implementation. The compiler/interpreter would be aware of several different square() implementations and would choose to use or create a particular one based on the circumstances.
Now consider having these different compiled segments stored in the binary, with the high level version also available for any subroutines that might need it.
Do any language environments do this?
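The idea can be sketched as a toy specialization cache (in Python; all names here are mine, and a real implementation would generate machine code rather than pick from a dict):

```python
def square_int(x):
    return x * x      # stands in for a compiled integer-multiply version

def square_float(x):
    return x * x      # stands in for a compiled float-multiply version

_specializations = {int: square_int, float: square_float}

def square(x):
    # Choose (or, in a real JIT, generate) an implementation based on how
    # the caller is actually using the function.
    impl = _specializations.get(type(x))
    if impl is None:
        _specializations[type(x)] = impl = lambda v: v * v  # generic fallback
    return impl(x)

print(square(7), square(2.5))  # 49 6.25
```

The high-level definition stays available as the fallback, which is roughly the "high level version also available" case above.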
I found myself with a bit of free time, so I decided to try getting back into programming for fun and see what Haxe is like these days. ( Read more... )
One of my side projects is a Javascript drag-and-drop list using the modern drag and drop specification with the goal of allowing users to rearrange the order of items in a list. I quickly ran into a problem: the drag-and-drop spec was designed for dragging one object onto another single discrete object which is expecting a drag event. This does not translate well to a draggable list's use cases of dragging above, below, and between objects (or subobjects), or of dragging an item off of the drag area to bring it to the top or bottom of the list. To do anything fancy, we need to find a relationship between the coordinates of the MouseEvent parent of a drag event, and the coordinates of the elements on the screen.
Visual elements have coordinate attributes such as offsetTop/offsetLeft and clientHeight/clientWidth.
Drag events have the coordinate attributes of their MouseEvent parent: screenX/screenY, clientX/clientY, and pageX/pageY.
There is no correlation between the two sets of coordinates. I tried summing the offsetTop of an item and its ancestors but found no correlation between that sum and any of the mouse coordinates. I also had no luck using the various page and scroll properties for window and document. Since I couldn't find the answer, I changed the question. Element.clientHeight reliably works across browsers, so we can do this:
This should work: the MouseEvent gives us three sets of coordinates, so we just pick one and compare its changes against the item heights.
Hah.
Among the problems: the coordinate sets behave differently across browsers, and some of them shift when the page scrolls.
I decided to use the difference in screenY, despite the obvious bug that the math will be wrong if the page scrolls in the middle of a drag, because it causes the fewest compatibility problems across browsers.
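A minimal sketch of the screenY-difference approach (the function and its names are mine, illustrative only):

```javascript
// Record the pointer's screenY at dragstart, then on each drag event
// convert the vertical delta into a number of list rows crossed.
function rowsMoved(startScreenY, currentScreenY, itemHeight) {
    // itemHeight would come from Element.clientHeight, which is reliable
    // across browsers; a positive result means the item moved down the list.
    return Math.round((currentScreenY - startScreenY) / itemHeight);
}
```

The caller would stash event.screenY in the dragstart handler and call rowsMoved from the dragover handler to decide where the item should land.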
Side note: The best practice for defining class methods in Javascript is to use the prototype:
ClassName.prototype.method = function(){...}
This allows every instance of the class to use the same function instead of giving each instance its own copy of the function.
Member variables are not in scope in prototype methods; a method is expected to access them through this. In the context of an event handler, however, this is not the containing object, so prototyped methods make poor event handlers. A solution is the old-fashioned "this.method=" declaration, which is less efficient but does the job:
function ClassName(){
    this.method = function(){...}
}
I ran into this problem when I tried to fix my old-style drag-and-drop code to use the best practice instead.
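One way to see both the problem and the fix is to capture the instance in a closure (Dragger is a hypothetical class of mine, not code from the project):

```javascript
function Dragger(name) {
    var self = this;   // capture the instance; inside the handler, `this`
    this.name = name;  // would be the DOM element, not the Dragger
    this.onDrop = function () {
        return self.name + " dropped";
    };
}

var d = new Dragger("item1");
var handler = d.onDrop;   // detached, the way an event dispatcher holds it
console.log(handler());   // "item1 dropped"
```

Because the handler closes over self rather than relying on this, it still works after being detached from the object, which is exactly the situation an event listener is in.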
Recommended reading: Douglas Crockford's tutorial: Private Members in Javascript.
I was getting an unexplained, unlogged 500 internal server error response for a perl Hello World script.
#!/usr/bin/perl
print "Content-type: text/html\n\n<p>Hello World</p>\n";
This was especially odd because I have a perl program running elsewhere on the same server. After comparing .htaccess settings and triple-checking my Content-type syntax, I found the apparent cause: the server requires either the -w (warnings) or the -T (taint checks) flag be turned on.
I was unable to determine which setting causes this. mod_perl's PerlSwitches can force all scripts to be run with -wT turned on, but I could find no setting to refuse to run scripts which lack either flag.
For a description of taint mode, read perlsec.
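Either flag goes on the shebang line, so the working version of the script above is simply:

```
#!/usr/bin/perl -wT
print "Content-type: text/html\n\n<p>Hello World</p>\n";
```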
Reason #1:
foreach($items as $i){ // $i is an item reference
    ...
    // Now let's loop through something
    for($i=0,$max=10; $i<$max; $i++){ // d'oh
Reason #2:
$dir = $asdf ? -1 : +1; // direction
...
$dir = getDirectory(...); // d'oh
Another reason PHP sucks: references to static methods are not supported. There is a hacky way to use them, however. ( Read more... )
Some notes on keypress handling in Python's PyGame library, from about seven years ago: ( Read more... )
I finally got my Javascript implementation of Craig Reynolds's boids algorithm to work. Here is the relevant code:
var weights = new Array(1.0, 1.0, 1.0);
sepvect = sepvect.multiply(weights[0]);
alivect = alivect.multiply(weights[1]);
cohvect = cohvect.multiply(weights[2]);
I just needed to record magnitudes for one run and fiddle with the weights. (0.5,1.0,0.2) seemed to do the trick, though I should test with different numbers of boids to see if the magnitudes are relative to that variable.
As regards "finally", I started on this so long ago that I forget when and have intermittently picked it up and re-abandoned it since then. The oldest timestamp that I can find for it is 2006, but I think it goes back to 2003 or 2004 when it was going to be something that I would put together during spring break. The biggest problem was a trig error that I fixed last month after having almost fixed it earlier, causing directions to be wrong in one or two of the four quadrants.
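Wrong directions in one or two quadrants is the classic symptom of using Math.atan where Math.atan2 is needed (a generic illustration of that class of bug, not necessarily the project's actual code):

```javascript
// atan(dy/dx) cannot tell (dx, dy) from (-dx, -dy), so headings in two
// quadrants come out pointing the opposite way; atan2 keeps the signs.
var dx = -1, dy = 1;
var naive  = Math.atan(dy / dx);   // -PI/4: wrong quadrant
var proper = Math.atan2(dy, dx);   //  3*PI/4: correct
console.log(naive.toFixed(4), proper.toFixed(4));
```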
While looking through old files, I found a TODO list so old that I've actually done most of the things on it. Usually these things double in size every year. I shall celebrate with a mocha. ( Details below the cut, if anyone cares. )
Consider these changes to the try/catch model of C++ and Java:
Has any language already done this or something similar? How would this affect program design, code quality, and readability? What would language developers need to do to implement these features, and how would it impact performance?
Also posted to HN, where one of the users informs me that I've reinvented the Common Lisp condition system.
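For a taste of what the condition system does differently, here is a toy Python sketch (names entirely mine; real Lisp conditions are far richer): a handler supplies a recovery value and the signaling code resumes where it was, instead of the stack unwinding to a catch block.

```python
# A toy "condition": the handler chooses a recovery value and the
# signaling code continues from the point of the error.
handlers = []

def signal(condition):
    # Ask the innermost handler for a recovery value; None means "declined".
    for handler in reversed(handlers):
        result = handler(condition)
        if result is not None:
            return result
    raise RuntimeError(condition)  # unhandled: fall back to unwinding

def parse_number(token):
    try:
        return int(token)
    except ValueError:
        # Signal, then *resume* with whatever the handler supplies.
        return signal(("bad-number", token))

handlers.append(lambda cond: 0 if cond[0] == "bad-number" else None)
print([parse_number(t) for t in ["1", "x", "3"]])  # [1, 0, 3]
```

The key difference from try/catch is that the loop over the tokens never stops: the error is repaired in place rather than aborting the computation.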
I've finally gotten around to downloading those US state department cables that Wikileaks acquired. You can acquire them at Cryptome:
http://cryptome.org/z/z.7z
John Young would probably appreciate it if you can find another way of acquiring the file so he doesn't have to pay the bandwidth bill. I'm not linking directly so he doesn't get hammered by bots following the link.
The cables will need to be imported into a database before they can be read. I am using MySQL from the XAMPP distribution.
Create a final table and an all-text table for importing into. (Attempting to import directly into the date field will zero out the dates).
create table Cables (
    id int PRIMARY KEY,
    date datetime,
    local_title varchar(128),
    origin varchar(255),
    classification varchar(128),
    referenceIDs text, -- really a pipe-separated array
    header text,
    data mediumtext -- some larger than 65536 chars
);
create table Cables2 (
    id text,
    datetime text,
    local_title text,
    origin text,
    classification text,
    referenceIDs text,
    header text,
    data mediumtext
);
Load the data using mysql's LOAD DATA INFILE, which needs some help to learn to read multi-line CSV correctly.
load data infile 'c:\\cables.csv' into table cables2
    fields ENCLOSED BY '"' escaped by '\\' terminated by ',';
There should be zero warnings. If you see warnings, you can use "show warnings" to see at what record number the import failed.
Copy from the staging table to the final table. This will take about two minutes.
insert into cables (
    select id, str_to_date(datetime, "%m/%d/%Y %k:%i"), local_title, origin,
        classification, referenceIDs, header, data
    FROM cables2
);
Then create an index on the dates to speed up future searches. This takes two minutes on my computer.
create index idx_date on cables (date);
You will want to search the cables for a specific subject. Two methods are MySQL's built-in fulltext indexing and building a custom index.
MySQL has fulltext indexing. I found it to be too slow.
Creating the index took 25 minutes:
create fulltext index idx_text on cables (header, data);
Fulltext index queries using MATCH AGAINST took 1-2 minutes to run.
select count(*) from cables where match (header, data) against ('Sudan');
select count(*) from cables where match (header, data) against ('Sudan') OR match (header,data) against ('Sudanese');
I found the fulltext index to be slower than sequentially searching the table for "data like '%SUDAN%'". YMMV.
I built my own index using separate tables for all search terms and for the connections between the search terms and the cables.
create table words(
    wordID int auto_increment NOT NULL,
    word varchar(64) NOT NULL,
    CONSTRAINT cx_word_uniqueness UNIQUE (wordID, word)
);
create table idx_words(
    wordID int NOT NULL REFERENCES words(wordID),
    cableID int NOT NULL REFERENCES cables(id),
    INDEX idx_word_match (wordID, cableID),
    CONSTRAINT cx_words UNIQUE (wordID, cableID)
);
The custom index can be seeded with two queries:
INSERT INTO words (word) VALUES ('SUDAN');
INSERT INTO idx_words (
    SELECT words.wordID, cables.id
    FROM words, cables
    WHERE words.word = 'SUDAN' && upper(cables.data) LIKE '%SUDAN%'
);
It takes a few minutes to seed each word, but searches are instantaneous with a small number of seeded words. I have not tested this with a large number of seeded words.
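Once a word is seeded, a lookup is a straightforward join (a sketch against the schema above):

```
SELECT cables.*
FROM cables
JOIN idx_words ON idx_words.cableID = cables.id
JOIN words ON words.wordID = idx_words.wordID
WHERE words.word = 'SUDAN';
```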
If you know of a better search method, please mention it in comments.
The cables have additional information that a sufficiently intelligent program can pull out of the data field and add to the database metadata. If someone has already done this work, please mention it in comments.
You will need a program to pull the data out of the database in a form that you can read. I wrote a quick and dirty PHP program to display search results as HTML.
Cables before circa 2000 were in ALL CAPS and are difficult to read. A program could potentially convert the text to normal mixed case, although it would need to be able to recognize acronyms and people's names. If someone has already done this work, please mention it in comments.
A program could potentially recognize key words such as people's names, and link these words to other sources of information such as History Commons and Wikipedia. If someone has already done this work, please mention it in comments.