nontemplate-0.12

2010-09-25

Finally got around to pushing nontemplate v0.12 out the door, fixing the most glaring problems with 0.1. It is amazing what eating your own dogfood can do for you :-) . It is still pre-alpha, and there are still some bigger changes I’m thinking about for 0.2, but I’m pretty happy with the general direction at least.


(sort of) first class classes in C#

2010-08-10

I find myself writing some C# code while still thinking in Python. One thing in particular caught me out … it seems, at first, that C# doesn’t have first class classes. This is annoying, because I’d started writing some device driver classes where each class is a type of device, and instances represent the individual devices themselves. And I wanted to construct a list of these classes, and call a “probe” classmethod on each of them to ask the class to go search out any devices which were available. In Python, this would look like:


    device_classes = (FooDevice, BarDevice, BazDevice)

    for device_class in device_classes:
        device_class.probe()

See? The classes are being treated just like any other variable, because that is what they are: they’re just instances of type ‘classobj’ (or of ‘type’, for new-style classes). But the equivalent doesn’t work in C#. Doing this:


    Type[] DeviceClasses = {
        FooDevice,
        BarDevice,
        BazDevice
    };

… complains that “‘FooDevice’ is a ‘type’ but is used like a ‘variable’”. At first it seemed that C# didn’t have first class classes, and indeed a few web searches came up empty handed.

Thankfully after a bit more exploration it turns out that all that is needed is some syntactic nastiness … namely, typeof(), GetMethod() and Invoke() (Passing “null” to Invoke works for static methods).


    Type[] DeviceClasses = {
        typeof(FooDevice),
        typeof(BarDevice),
        typeof(BazDevice)
    };

    foreach (Type dct in DeviceClasses) {
        dct.GetMethod("Probe").Invoke(null, new object[] {} );
    }

Now, quite why a shiny new programming language has to get saddled with such godawful syntax is a bit beyond me, but so it goes.

As always, this is lovingly documented in MSDN, in such a way that the answer is clear so long as you already know what you’re looking for.

(As a bonus, yes, you can use reflection to find the list of Devices in the first place. It just wasn’t all that relevant to this example.)
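
For comparison, here is a self-contained Python sketch of the same probe loop, including the discovery step. The Device base class and the probe_all() helper are made up for illustration, and the probe() bodies are placeholders rather than real driver code. Note that __subclasses__() only returns direct subclasses, so a deeper hierarchy would need a recursive walk.

```python
class Device(object):
    """Hypothetical base class; each subclass is one type of device."""

    @classmethod
    def probe(cls):
        # Placeholder: a real driver would go and search for hardware here.
        return []


class FooDevice(Device):
    @classmethod
    def probe(cls):
        # Pretend exactly one FooDevice was found.
        return [cls()]


class BarDevice(Device):
    # Inherits the default probe(), which finds nothing.
    pass


def probe_all(base=Device):
    """Call probe() on every direct subclass of base and collect the results."""
    found = []
    for device_class in base.__subclasses__():
        found.extend(device_class.probe())
    return found
```

Here the classes never need to be listed by hand: defining a new subclass of Device is enough for probe_all() to pick it up.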


Fibonacci Regex Perversity

2010-06-01

Consider these two regex substitutions:


s/fi?b/i/
s/fii(i*)b/f$1bfi$1b/

For those unfamiliar with Perlish regexes: that first one says “replace the string ‘fb’ or ‘fib’ with the string ‘i’”. The second one says “replace a string ‘fiiXb’ with ‘fXbfiXb’, where X is zero or more ‘i’s.”

We can repeatedly apply these rules to a string until the string stops changing. So for example, our string might mutate as follows:

* fiiiiib
* fiiibfiiiib
* fibfiibfiibfiiib
* ifiibfiibfiiib
* ifbfibfbfibfibfiib
* iiiiiifbfib
* iiiiiiii

Expanding the path of fiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiib is left as an exercise to the reader :-) .

What on earth is all this substitution doing? Well, it is calculating Fibonacci numbers of course!

Regexes don’t handle arithmetic well, so we represent numbers in unary … a string of n ‘i’s represents the number n. When dealing with unary, you can add numbers by simply appending them. ‘f’ and ‘b’ are like parens around the number we’re calculating the Fibonacci number of. So “iiiii” represents the number 5, and “fiiiiib” represents the fifth Fibonacci number.

So the sequence of strings above could also be written:

* fib(5)
* fib(3)+fib(4)
* fib(1)+fib(2)+fib(2)+fib(3)
* 1+fib(2)+fib(2)+fib(3)
* 1+fib(0)+fib(1)+fib(0)+fib(1)+fib(1)+fib(2)
* 6+fib(0)+fib(1)
* 8

So really, any language that allows a sufficiently powerful regex mechanism is able to calculate Fibonacci numbers.

And it is pretty easy to see how to implement a Turing machine by representing each state transition as a regex substitution, so these languages are bound to be Turing complete as well, even if they do turn out to be Turing tarpits.

I’m quite interested in substitution as a kind of pure functional programming. More on that later.
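
For anyone who would rather not expand these by hand, here is a minimal Python sketch of the rewriting loop. It assumes Python's re module as the regex engine, so the Perl-style $1 becomes \g<1>; rewrite() and fib() are names invented here. Each pass applies both rules to every match in the string, so the intermediate strings can differ from the hand-expanded list above even though the string it finally settles on is the same.

```python
import re

# The two substitution rules from the post, translated to Python syntax.
RULES = [
    (re.compile(r"fi?b"), "i"),                          # fib(0) = fib(1) = 1
    (re.compile(r"fii(i*)b"), r"f\g<1>bfi\g<1>b"),       # fib(n) = fib(n-2) + fib(n-1)
]


def rewrite(s):
    """Apply both rules repeatedly until the string stops changing.

    Returns the list of strings seen, starting with the input.
    """
    steps = [s]
    while True:
        new = s
        for pattern, replacement in RULES:
            new = pattern.sub(replacement, new)
        if new == s:
            return steps
        steps.append(new)
        s = new


def fib(n):
    """Encode n in unary between 'f' and 'b', rewrite, and count the 'i's left."""
    return len(rewrite("f" + "i" * n + "b")[-1])
```

The work grows exponentially with n, just like the naive recursive definition, so the 31-‘i’ exercise above will keep a machine busy for a while.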


NonTemplate

2010-04-30

It is rather sketchy still, but I’ve just put up a little idea about a way to avoid doing template languages at all, and embedding HTML into Python code directly instead.

It is called NonTemplate. Let me know what you think!

UPDATE: Some vague performance figures, using the same very simple benchmark as the previous template language performance comparison posts.

These figures are running on Python 2.6.5, on a linux laptop (x64) with output to /dev/null …

print: 1.612
Mako: 1.756
Jinja: 10.803
nontemplate: 18.198
django: 42.212
SimpleTAL: 59.024
genshi: 81.460

… Mako is very clearly the winner here … its code generation is head-and-shoulders above the rest, producing pretty much exactly the same code as you’d get if you wrote a whole lot of “print” statements yourself. Nontemplate is stuck in the middle … unfortunately, all the ‘with’ shenanigans turn out to be pretty slow. On the other hand, it is still a lot quicker than Django templates, SimpleTAL or Genshi, and a lot smaller than any of them, so I guess it is not all bad news.
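
To give a feel for where the ‘with’ overhead might come from, here is a rough sketch of the general technique. This is not NonTemplate’s actual API: tag() and render_table() are hypothetical names, it assumes Python 3, and it does no HTML escaping. The point is only that every element costs a context-manager entry and exit, where Mako’s generated code just writes strings.

```python
import io
from contextlib import contextmanager


@contextmanager
def tag(out, name):
    """Write <name> on entry and </name> on exit (hypothetical helper)."""
    out.write("<%s>" % name)
    try:
        yield
    finally:
        out.write("</%s>" % name)


def render_table(rows):
    """Render a list of rows as an HTML table using nested 'with' blocks."""
    out = io.StringIO()
    with tag(out, "table"):
        for row in rows:
            with tag(out, "tr"):
                for cell in row:
                    with tag(out, "td"):
                        out.write(str(cell))
    return out.getvalue()
```

For a table of 100 rows by 10 columns, that is over a thousand generator-based context managers created and torn down per render.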


Functional Parallel Programming

2010-04-05

Guy Steele’s [VID] [PDF] [PAPER] talk “Organizing Functional Code for Parallel Execution” came up recently on reddit and I found it very interesting.

I haven’t had a lot to do with functional programming in my career, but I’m kind of perpetually hovering on the fringes of learning more and I find myself attracted to the functional way within imperative languages such as Perl and Python.

One of the things I’ve always found fascinating about Haskell (and Lisp, for that matter) is that once you dig down far enough, you keep hitting the singly linked list, formed by consing new elements onto the start of a slightly shorter list. It always seems strange that there, after hiking all this way into the functional wilderness, is the data structure you left back in the carpark on day 1 of Comp Sci 101.

Anyway, the lecture, and trying to learn Haskell at the moment, got me thinking. It’s pretty likely that this has already been covered in Guy’s talk, but in order to understand what he’s on about I had to write it out myself, so I thought I’d do it here. It won’t make a lot of sense unless you’ve watched the first half of the video or at least read the slides, so see you in a while :-)


Why NoSQL Will Not Die, And Also Why SQL Will Not Die, And Also Why BDB Will Not Die, And Also Why Flat Files Will Not Die …

2010-03-30

Because there’s no such thing as “Die”, that’s why. There’s no real reason why an SQL problem can’t be run on an SQL database, a key-value shaped problem can’t be run on a key-value store, and a company with both kinds of problem can’t run both.

(So far, I’m a big fan of MongoDB, for sitting somewhere in the middle of the spectrum, e.g. supplying its own indexing mechanism. But that doesn’t mean I’ll never work with MySQL again …)


Milestones …

2009-11-11

Years ago, a personality test told me that there were two kinds of people: Process-oriented people and Goal-oriented people. Shortly thereafter, it told me that I was the latter. (It also told me I should become either a Mad Scientist or a Park Ranger, so I’m not sure how much credibility to assign it. I think it might have predated the era of Software Engineering in any case.)

Anyway, so: 3000 lines of hopefully not too awful Perl, another 3000 lines of HTML/CSS/JS. 450 commits. 63 closed bugs (and a bunch more little ones which never made the bugtracker, of course). This B2B client project I’ve been working on over the last year or so has finally finished “Phase 1”. It’s actually out there, being marketed hard by the client and currently being used by 250 or so customers.

Sure, it needs a bunch of stuff rewritten, refactored or just plain rethought. There’s some infrastructure work that needs doing if it’s going to scale to 10,000 customers. But I’m pretty damned proud of it anyway, and it’s great to get to this point.

There are 28 features currently labelled as “Phase 2”, so the job isn’t over yet …


SyntaxError: keyword argument repeated

2009-09-22

This one is amusing … the case of the mysterious syntax error … turns out that repeated keyword arguments (kwargs) were illegal in python 2.4, ignored in python 2.5 and illegal again in python 2.6. This means that if some have crept into your codebase, you’ve now got a handful of syntax errors!

For example, here’s test.py:

    def foo(**kwargs):
        for k, v in kwargs.iteritems():
            print "%s: %s" % (k, v)

    foo(a = 1, b = 2, a = 3)

And here’s what happens when you run it under 2.4, 2.5 and 2.6:

    nick@pluto:~/tmp$ python2.4 test.py
      File "test.py", line 5
        foo(a = 1, b = 2, a = 3)
    SyntaxError: duplicate keyword argument
    nick@pluto:~/tmp$ python2.5 test.py
    a: 3
    b: 2
    nick@pluto:~/tmp$ python2.6 test.py
      File "test.py", line 5
        foo(a = 1, b = 2, a = 3)
    SyntaxError: keyword argument repeated

How could this creep in? Most likely through revision control merges. Function calls with lots of kwargs are often laid out like:

    foo(
        kwarg_with_a_long_name=1,
        another_self_documenting_flag=False,
    )

Alice adds a kwarg for yadda="yadda yadda", at the start of the kwargs list, before kwarg_with_a_long_name. So does Bob, but Bob adds it at the end. The revision control system will happily merge these two changes without conflicts as:

    foo(
        yadda="yadda yadda",
        kwarg_with_a_long_name=1,
        another_self_documenting_flag=False,
        yadda="yadda yadda",
    )

And under python 2.5, neither Alice nor Bob is likely to notice … after all, yadda="yadda yadda" just as they intended.

But when the underlying system is upgraded to python 2.6, a mysterious “SyntaxError: keyword argument repeated” appears.
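
One way to catch these before an upgrade is to compile every source file under the new interpreter without running it. A minimal sketch using the built-in compile(); syntax_error_in() is a name made up here, and the exact wording of the message varies between Python versions:

```python
def syntax_error_in(source, filename="<string>"):
    """Return the SyntaxError message for source, or None if it compiles.

    compile() only parses and compiles the code; nothing is executed,
    so undefined names such as foo do not matter.
    """
    try:
        compile(source, filename, "exec")
    except SyntaxError as error:
        return error.msg
    return None


# The merged call from above is rejected at compile time ...
merged = 'foo(yadda="yadda yadda", kwarg_with_a_long_name=1, yadda="yadda yadda")\n'
merged_error = syntax_error_in(merged)

# ... while the same call with the duplicate removed compiles cleanly.
clean_error = syntax_error_in('foo(yadda="yadda yadda", kwarg_with_a_long_name=1)\n')
```

Looping this over the codebase (or running the standard compileall module) under the new interpreter flags the problem files before anything is deployed.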


Multiple Inequalities in Google AppEngine

2009-08-11

So I’m playing around with Google AppEngine (still!) trying to put together some kind of sensible use for it. AppEngine is neat-O, but it is also quite limited in what it can and can’t do. One of the most glaring problems (for my toy app) is the datastore query API, which has various restrictions, including:

Inequality Filters Are Allowed On One Property Only

Now this is pretty obviously an efficiency measure: retrieving on inequalities involves iterating along one index, and the datastore isn’t in the business of picking which one to iterate along. But it’s also really annoying if you actually want to do something which needs multiple inequalities.
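
To make the restriction concrete, here is a plain-Python sketch of the idea (no datastore API involved; the entity fields and the query() helper are invented for illustration). The inequalities on one property become a contiguous slice of a sorted index, and any inequality on a second property has to be checked entity by entity afterwards:

```python
import bisect

# Hypothetical entities with two properties we might want to range-filter on.
people = [
    {"name": "ann", "age": 31, "height": 160},
    {"name": "bob", "age": 25, "height": 180},
    {"name": "cat", "age": 40, "height": 175},
    {"name": "dan", "age": 35, "height": 190},
]


def query(entities, age_min, age_max, height_min):
    """Find entities with age_min <= age <= age_max and height > height_min."""
    # A sorted list stands in for the index on "age".
    by_age = sorted(entities, key=lambda e: e["age"])
    ages = [e["age"] for e in by_age]

    # Both age bounds are on the same property, so together they
    # select one contiguous run of the index.
    start = bisect.bisect_left(ages, age_min)
    end = bisect.bisect_right(ages, age_max)
    candidates = by_age[start:end]

    # The second property cannot use that scan; filter it in memory.
    return [e for e in candidates if e["height"] > height_min]
```

The in-memory step is cheap here, but its cost grows with however many entities the first range lets through.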


Templates Fugit 3

2009-07-29

I’m still messing around with template languages … I’ve added a couple of new ones and here’s the revised leader board. I tried adding ClearSilver for Perl & Python but I couldn’t get either of those to work … pity, it’s an interesting approach. As with the previous posts, this is time in seconds for 1000 tables of 100 rows by 10 columns each, not counting template compilation time. The 200x difference in performance is what makes it interesting: