I'm doing a typical display listing orders for certain customers together with the total price of each order. In the past, what I would have done is write some SQL joining the orders table with order_items and summing the price column from order_items. But since I'm using Django now, I figured there must be a way to do this through the Django API without getting my hands dirty with plain SQL anymore.

A simple Google search on 'django aggregation' quickly took me to a blog post with a link to the Django documentation page. Turns out aggregation support first landed in Django 1.1. So how do we use it? These are my models:-

from django.db import models
from django.contrib.auth.models import User

# Customer is another model defined elsewhere in this app

class Item(models.Model):
    id = models.IntegerField(primary_key=True)
    description = models.TextField()
    price = models.FloatField()
    name = models.CharField(unique=True, max_length=255)  # was -1, which is invalid; pick a sensible limit
    komisen = models.FloatField()
    stock_threshold = models.IntegerField()
    category = models.CharField(max_length=255)
    komisen_ahli = models.FloatField()
    status = models.IntegerField()

class Order(models.Model):
    id = models.IntegerField(primary_key=True)
    transaction_date = models.DateField()
    customer = models.ForeignKey(Customer)
    status = models.IntegerField()
    payment_due_date = models.DateField()
    category = models.CharField(max_length=255)
    uid = models.ForeignKey(User, db_column='uid')
    items = models.ManyToManyField(Item, through='OrderItem')

What I want is some output like:-

ID |    Date    |  Total
-------------------------
 8 | 12-01-2009 |  89.00
 9 | 12-01-2009 | 100.00

And here’s the Django code to get it:-

from django.db.models import Sum

orders = Order.objects.filter(customer=1301).annotate(total_price=Sum('items__price'))

Yup, it's only one line. And here's how to see the actual query executed by Django, exactly what I would have written by hand:-

from django.db import connection
# connection.queries is only populated when DEBUG = True, and the
# queryset must have been evaluated first (e.g. by iterating over it)
connection.queries.pop()

{'time': '0.005', 'sql': 'SELECT "orders"."id", "orders"."transaction_date", "orders"."customer_id", "orders"."status", "orders"."payment_due_date", "orders"."category", "orders"."uid", SUM("items"."price") AS "total_price" FROM "orders" LEFT OUTER JOIN "order_items" ON ("orders"."id" = "order_items"."order_id") LEFT OUTER JOIN "items" ON ("order_items"."item_id" = "items"."id") WHERE "orders"."customer_id" = 1301  GROUP BY "orders"."id", "orders"."transaction_date", "orders"."customer_id", "orders"."status", "orders"."payment_due_date", "orders"."category", "orders"."uid" LIMIT 1 OFFSET 2'}
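To render the little table above from that queryset, a simple loop will do (the formatting here is mine; the point is that the annotated total_price is just another attribute on each Order):-

for order in orders:
    print '%2d | %s | %6.2f' % (order.id,
                                order.transaction_date.strftime('%d-%m-%Y'),
                                order.total_price)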


After having to go through this again and again, I think it's time to take some notes for future reference. A common buildout.cfg:-

[buildout]
newest = false
parts = 
    pylons
    omelette
    paster-serve
find-links = http://pylonshq.com/download/0.9.7
unzip = true
extensions = gp.vcsdevelop
vcs-extend-develop = git+git://github.com/countergram/PylonsTemplates.git#egg=PylonsTemplates
develop = clinic

[pylons]
recipe = zc.recipe.egg
interpreter = python
eggs =
    nose>=0.11.1
    Paste>=1.7.2
    PasteScript>=1.7.3
    PasteDeploy>=1.3.3
    Pylons==0.9.7
    SQLAlchemy==0.5.5
    Jinja2==2.1.1
    FormAlchemy==1.2.3

[omelette]
recipe = collective.recipe.omelette
eggs = ${pylons:eggs}

[paster-serve]
recipe = zc.recipe.egg
eggs = ${pylons:eggs}
arguments = args=['serve', '--reload', 'clinic/development.ini']
scripts = paster=run

The issue here is with the develop eggs, both in develop = and vcs-extend-develop. The buildout ran fine, and in the develop-eggs directory I could see that the egg links were created:-

clinic.egg-link PylonsTemplates.egg-link  setuptools.egg-link

But those two develop eggs were nowhere in the sys.path list inside the custom interpreter script. Only then did I realize that buildout just creates the eggs. We need to explicitly 'install' the eggs in order to use them, just like any other eggs we have. So:-

[pylons]
recipe = zc.recipe.egg
interpreter = python
eggs =
    nose>=0.11.1
    Paste>=1.7.2
    PasteScript>=1.7.3
    PasteDeploy>=1.3.3
    Pylons==0.9.7
    SQLAlchemy==0.5.5
    Jinja2==2.1.1
    FormAlchemy==1.2.3
    PylonsTemplates
    clinic

Yup, we must list the develop eggs in the eggs = of any part that uses zc.recipe.egg. Hopefully I won't forget about this again next time.
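A quick sanity check that the develop eggs actually made it onto the interpreter's path, using the custom interpreter this buildout generates in bin/:-

$ ./bin/python -c "import clinic; print clinic.__file__"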

Easy Install has become kind of the standard way to install Python packages these days, with the Python Package Index (PyPI) now officially at python.org (previously at the Cheeseshop). So to install any module in Python, you simply run the following command:-

$ easy_install Werkzeug

Provided you already have easy_install properly set up, that command will download Werkzeug and all its dependencies and install them in your Python site-packages directory. One problem I have is when working on a number of environments (through virtualenv), so the following transcript better explains the situation:-

# Setting up a new virtual environment for project XXX.
# This project uses Werkzeug, Jinja2 and SQLAlchemy.
$ virtualenv XXX
$ cd XXX
$ . ./bin/activate
$ easy_install Werkzeug
$ easy_install Jinja2
$ easy_install SQLAlchemy

# Now I want to work on a different environment, project YYY,
# which happens to use Pylons as a framework but still
# depends on Jinja2 and SQLAlchemy.
$ virtualenv YYY
$ cd YYY
$ . ./bin/activate
$ easy_install Pylons
$ easy_install Jinja2
$ easy_install SQLAlchemy

Here we can see that I had to download Jinja2 and SQLAlchemy from PyPI twice. There must be a better way. Reading through the Pylons book, I noticed that we can actually pass -f to easy_install, telling it the location of the eggs. So what I did was go through all the virtual environments' site-packages directories, copying all the eggs into one directory (named eggs):-

$ cd XXX/lib/python2.5/site-packages
$ cp -a *.egg ~/python/eggs
# And now to install any egg that already exists on my local system:
$ easy_install -f /home/kamal/python/eggs Jinja2

Now I can grab all my favourite eggs from PyPI just once, and no need to worry anymore that PyPI would go down right at the time I really need it.
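One refinement: easy_install will also pick up a find_links default from ~/.pydistutils.cfg (documented setuptools behaviour; the path below is the same local directory as above), so we can skip typing -f every time:-

[easy_install]
find_links = /home/kamal/python/eggs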

On my tweet "no innovation in local hosting industry. All we got is Yet Another Cpanel Hosting.", r0kawa of Zenpipe asked what my suggestions are to improve our local hosting industry. My quick answer to him was to just look at how webfaction runs their hosting services. Now that I have some time to write a little more, I'd like to share my thoughts on this topic.

Fully embracing what webfaction has done in hosting services is maybe a bit too much to ask, so here are two things that I'd like to see in our local hosting services.

  • Process isolation

All the local shared hosting providers that I know of use mod_php in their infrastructure, running as a single global process shared by every user on that server. To enable functionality such as file uploads, users have to give this global process write access to their directories, which as a result makes them writable to everyone on that system. Workarounds have been tried, such as enabling the safe_mode directive in PHP, but most of the time it causes more trouble than the problems it tries to solve.

Webfaction solved this by using mpm_peruser instead of the usual mpm_prefork or mpm_worker, which allows them to assign each user a dedicated Apache process, independent of the others. Another possible solution is mpm_itk, which can accomplish the same thing; it's just a matter of benchmarking to see which one performs best in a massive deployment. My experience on webfaction has been quite satisfying. Webfaction also goes one step further by allowing users to have their own Apache stack running on a custom port, which is then proxied through by the main Apache process (later on they switched to nginx as their main front end server). This means a user's processes are truly isolated from the main process and from other users on that shared server (see the sketch after this list).

  • Shell access

These days, I just can't imagine deploying my site using FTP. If you can't give full shell access, at least make it possible to use tools such as rsync or scp to transfer files to the server. While I understand the fear of granting shell access to random users, webfaction includes it even in their basic plan, so there must be some way to do this correctly.
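Back to the proxying sketch promised above: a minimal front end config in the spirit of webfaction's setup might look like this (the hostname and port are made up for illustration; their actual setup is surely more involved):-

server {
    listen      80;
    server_name someuser.example.com;

    location / {
        # someuser's own Apache instance, bound to localhost only
        proxy_pass http://127.0.0.1:21000;
        proxy_set_header Host      $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}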

There could be much more, but these are the things that I'd like to see from local shared hosting providers. Yes, I can get all of this from a VPS, but to a developer like me, shared hosting such as webfaction is much more appealing because it frees me from doing all the sysadmin work to maintain the server, while at the same time I still have all the tools that I like for development. Otherwise, you are just 'Yet Another Cpanel Hosting'.

Setting up hgwebdir.cgi on Apache, which is then proxied through Nginx at the front. SSL is enabled at the Nginx level but not on Apache, since it is just running on localhost. Everything works perfectly fine, except that when pushing I got this error:-

searching for changes
ssl required
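The cause: SSL terminates at Nginx, so by the time a push reaches Apache it looks like plain HTTP, and Mercurial refuses it. One way out, assuming you trust the Nginx front end to always provide the SSL, is to tell hgweb not to insist on SSL itself, in hgweb.config (or the repository's .hg/hgrc):-

[web]
# SSL is already enforced by the Nginx front end
push_ssl = false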

Two weeks ago, I signed up for Zenpipe VPS hosting. It's a local hosting company run by quite a well-known figure in the local development community. I currently use Slicehost and webfaction for my hosting needs, and to me they are the best so far in the VPS and shared hosting businesses respectively. The only thing they lack is something out of their control: latency. Two weeks ago, latency on the international link from .my was really bad. It jumped to 600-700ms, unacceptable for most sites hosted outside of the local Internet link. Things are back to normal now (300+ms), but we don't know when it will happen again.

When we add new fields to a Drupal content type, CCK creates a new table in the database named content_type_<content_type_name>. Example:-

mysql> desc content_type_tweet;
+--------------------+------------------+------+-----+---------+-------+
| Field              | Type             | Null | Key | Default | Extra |
+--------------------+------------------+------+-----+---------+-------+
| vid                | int(10) unsigned | NO   | PRI | 0       |       | 
| nid                | int(10) unsigned | NO   | MUL | 0       |       | 
| field_follow_value | longtext         | YES  |     | NULL    |       | 
+--------------------+------------------+------+-----+---------+-------+
3 rows in set (0.00 sec)

The table basically stores references to the nid and vid of that node, plus the value for that field. Any new field added to this content type will be added as a new column in this table. The column naming is fixed and you can't customize it; CCK always creates it as field_<field_name>_value. Things look simple so far. It looks cool, and it lets us add new fields on the fly and have them as new columns in the content type table. But it's not.
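To make the structure concrete, reading a field value by hand means joining through the node table on vid (a sketch against the standard Drupal 6 schema; the query itself is mine):-

-- field value for the current revision of each node of this type
SELECT n.nid, n.title, t.field_follow_value
FROM node n
JOIN content_type_tweet t ON t.vid = n.vid
WHERE n.type = 'tweet';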

I currently use Kohana as a PHP framework. I'd already lost faith in PHP frameworks, so I used to build my applications directly on Drupal. But after starting my new (current) job, we needed to use a framework. The framework in use at that time was CodeIgniter, which was embedded inside Drupal so we didn't have to deal with any authentication and ACL issues and had a CMS out of the box. I'm not a fan of CodeIgniter. You know what, it suggests people write HTML forms by hand, one by one, and its validation library is the fugliest I've ever seen in my life. You define the rules inside an array, and if you want a friendly error message for that validation, you have to define another array, repeating yourself again. Wtf! The way the validation library was designed makes it really hard to create a wrapper that would automate all the tedious tasks. We tried our best, and we have a wrapper that automates form generation from a schema file and handles the validation part automatically, but it's no fun with ugly hacks here and there, which are really hard to maintain and extend. Every time a user comes with a feature request, oh damn, I have to hack it again.
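For the record, this is roughly the duplication I mean (reconstructed from memory of CodeIgniter 1.x validation, so treat the exact API as a sketch):-

<?php
// One array for the rules...
$rules['email'] = 'required|valid_email';
$rules['name']  = 'required';
$this->validation->set_rules($rules);

// ...and a second array, repeating every field again, just to get
// friendly field names into the error messages.
$fields['email'] = 'Email Address';
$fields['name']  = 'Full Name';
$this->validation->set_fields($fields);
?>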

Webfaction's Django setup uses the user's own Apache instance running on a custom port, which is then proxied by the main Apache process running at the front. I tried to replicate this setup on my laptop so I can easily play around with it without going through the slow SSH connection to the webfaction server. Webfaction uses CentOS while my laptop runs Ubuntu 8.04, but it looks like the setup can easily be replicated (once you get through the hardest part).

The hardest part seems to be getting the latest mod_wsgi (2.x) on Ubuntu, which still only ships the 1.x version through apt-get.
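The usual way around that is to build mod_wsgi from source (the version number and package names below are from memory of that era, so adjust as needed):-

# build dependencies for the stock prefork Apache and Python
$ sudo apt-get install apache2-prefork-dev python-dev
$ wget http://modwsgi.googlecode.com/files/mod_wsgi-2.5.tar.gz
$ tar xzf mod_wsgi-2.5.tar.gz
$ cd mod_wsgi-2.5
$ ./configure --with-apxs=/usr/bin/apxs2
$ make
$ sudo make install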

Spent a few hours getting Plone running behind Nginx through proxy_pass. It was supposed to be as simple as:-

server {
    listen      80;
    server_name cvt.int-prokab.com;
    root html;

    rewrite ^/(.*)$ /cvt last;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header        Host            $host;
        proxy_set_header        X-Real-IP       $remote_addr;
        proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

The Plone site is running on port 8080, at 127.0.0.1:8080/cvt. I want it exposed to the outside as http://cvt.int-prokab.com/, but it turns out that Plone generates all URLs on the site as 127.0.0.1:8080/cvt, the address behind the proxy and not the exposed public URL. Reading the explanation on the ZMI Virtual Hosting interface:-

The VHM doesn't do anything unless it sees one of the following special path elements in a URL: VirtualHostBase sets the protocol and host, while VirtualHostRoot sets the path root. If the URL path of a request begins with /VirtualHostBase/http/www.buystuff.com, for instance, then URLs generated by Zope will start with http://www.buystuff.com. Since the port number was not specified, it is left unchanged. If your Zope is running on port 8080, and you want generated URLs not to include this port number, you must use /VirtualHostBase/http/www.buystuff.com:80. If the URL contains VirtualHostRoot, then all path elements up to that point are removed from generated URLs. For instance, a request with path /a/b/c/VirtualHostRoot/d will traverse a/b/c/d and then generate a URL with path /d.

Took me some time and hair pulling before I could understand how it works. The rewrite is supposed to be like this:-

rewrite ^/(.*)$ /VirtualHostBase/http/cvt.int-prokab.com:80/cvt/VirtualHostRoot/$1 last;
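Dropped back into the server block from above, the working config reads (everything else unchanged):-

server {
    listen      80;
    server_name cvt.int-prokab.com;

    rewrite ^/(.*)$ /VirtualHostBase/http/cvt.int-prokab.com:80/cvt/VirtualHostRoot/$1 last;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header        Host            $host;
        proxy_set_header        X-Real-IP       $remote_addr;
        proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}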