Older blog entries for cinamod (starting at number 76)


So Caleb is some sort of coding machine. In a matter of a few days, we've (read: mostly Caleb) (mostly) finished the Cairo backend for librsvg. In honor of this work, I've just released version 2.13.0 (it may take a few minutes for the FTP mirrors to sync).

What's the big deal? Well:

  • It doesn't use libart.
  • It can generate more formats than just PNG - drawing to PDF, X11, Win32, Quartz, PS should all be possible now.
  • More conformant output - with the exception of some CSS and text, we nearly pass the W3C SVG 1.1 conformance test. We even beat Batik's conformance for a few tests...
  • It is significantly faster than its libart-based counterpart - on some tests, it's 6x faster. Using pre-multiplied RGBA clearly has some performance penalties.
  • It has the beginnings of a DOM API.
  • Did I mention that it doesn't use libart?

The library is stable - it can handle our own test suite, the W3C suite, and Batik's suite without incident. Only in 1 of the 182 W3C conformance tests did I notice any appreciable memory leakage directly attributable to librsvg (which we'll be correcting shortly). All told, this is one rockin' release. Carl, Caleb, Chris - you all rock. Thanks.

I will warn consumers of the library that the API and ABI have changed to accommodate this work. It's mostly backward-compatible, but there are some changes to be aware of. I expect the API and ABI to change further during the release cycle as things get ironed out and the DOM API formalizes. Caveat emptor.

16 Oct 2005 (updated 16 Oct 2005 at 17:47 UTC) »
Reverse Outsourcing?

It's been an odd week. I've had 3 Indians contact me, repeatedly, via email, asking me to do their work for them.

  • The first was an MCSE who works for Intel. This guy basically couldn't figure out how his compiler works. He doesn't know anything about building packages from source and wants bugfixes for wvWare that are available only in its recent 1.2.0 release. His machine runs (as best I can tell) a 4+-year-old Linux distro, and he's complaining about glib-2.0 dependencies. I've been as patient and helpful as I can be (repeating things from the README and INSTALL files before telling him to RTFM), but don't I know that his project for Intel was due yesterday? Does no one at Intel know about ./configure && make?
  • The second was a student in Bangalore who "demands" that I do his CS homework for him. I pointed this guy to some relevant literature for his project (text summarization, like I worked on for OTS) and politely told him that he would only learn by doing the work himself. Outrage, I tell you! Didn't I know his homework was due?
  • The third was a guy who "demands" that I export Microsoft Word art as LaTeX from wvWare immediately. I told this guy that we currently convert the art to PNGs and that converting it to LaTeX would be a lot of work that I don't feel like doing. Heresy!

These guys have all been extremely polite, but forceful, persistent, and presumptuous. What's in it for me? Intel isn't going to fire me and I won't be kicked out of school due to bad grades. Like I said, it's been an odd week...


Maybe the 0.3s Fontconfig startup penalty will be a thing of the past with its new devel version. Patrick Lam has been optimizing FC in order to reduce memory usage and startup penalties related to computing glyph metrics.

1 Oct 2005 (updated 1 Oct 2005 at 22:42 UTC) »
10 gallons of basil

So, in mid June, Ruth and I bought two small basil plants from a local nursery. Soon, the plants started to get big, so we split them into 4. Not long after that, the plants became a hedge of basil and overtook our back deck.

So today, we harvested the plants and filled a 13 gallon trash bag almost to the brim - about 38 liters of basil, all told.

So, I'm looking for suggestions, besides the obvious pesto and pasta. What should we do with it? If you're in Cambridge, MA, a perfectly good answer might be "give me some" :-) (/me looks in Luis', Krissa's and Bryan's direction).

librsvg 2.12.3

Thanks, Philip. A new release is out that should fix the problems you've run into.

I'm amused at Michael Meeks' latest interview on OpenOffice performance and startup time.

I'd like to preface this by saying that there are real problems with g++, kernel I/O schedulers, and the like, and I'm glad that someone like Michael is looking into them. I have little doubt that OOo will be speedy enough for many people's uses soon.

Sure, kernels aren't the greatest at knowing what resources you'll use next - see Red Hat's attempts to speed up Linux distro boot times and RML's "disk seeks are evil" paper. And, sure, g++ isn't the greatest C++ compiler on the planet, though it's getting better lately, in part due to Michael's efforts.

But at some point, comments like Michael's start looking like an exercise in passing the buck, and it's hackneyed and tiring by now. OK, comparing OOo's and vi's load times isn't exactly a fair fight. But what about when Microsoft Office loads faster under Crossover Office than a native copy of OOo does? Is that a fairer comparison?

At some point (and after something like 4 or 5 years of optimization work), some of the blame simply must amount to bad design, bad implementation, or both. Copping out and saying that "it's complex" doesn't cut it. Lots of things are high-performance, complex, written in C++, run on Linux, and hit the disk. But these things tend not to require a JVM, a VB interpreter, thousands of C++ vtables, hundreds of images loaded from disk, and their own CORBA-like component system in place before the first window gets rendered to the screen.

From the outside looking in, it looks like something is unnecessarily complex, fundamentally broken, or else not well thought through. From the outside looking in, it looks like you're optimizing around the edges. And while it may be simpler to do that than to reorganize OOo's behemoth codebase, I can't say that I'm encouraged. The "code first, optimize later" philosophy doesn't scale up well to projects of OOo's size.

Unnecessary complexity creates itself from nothing when there isn't constant vigilance. And it's time for the OOo group to look in the mirror and shoulder at least some of the blame, or at least stop passing the buck. I'm tired of hearing about it.


You can think of copyright as a certain kind of social contract between the creator of a work and society at large.

Copyright doesn't view an author's control over his/her work as absolute. In the US, at least, copyright is codified in the Constitution as a necessary evil - something that Congress is entitled to grant authors in order to promote science and the useful arts.

Copyright isn't all-encompassing. It expires. Some things aren't copyrightable. Other societal rights trump your rights as a copyright holder. You can't stop someone from writing a news report about your book, or writing a scholarly essay about it. Until the DMCA, you couldn't stop people from making a backup copy of a work. Even with the DMCA in place, you still might not be able to stop them. In short, there are several "fair use rights" that the copyright holder simply can't deny you, no matter how much he or she wishes to. It's simply not in society's interests for them to do so.

The GPL doesn't interfere with any of these fair use rights; in fact, it encourages something much stronger than fair use. It says, "Please, embrace and extend this work." With traditional licenses, only a narrow amount of "embracing and extending" is permitted under fair use rights. The GPL serves to limit your rights as an author, where traditional licenses generally seek to maximize them. You're always free to give up your own rights, but you may not force others to give up theirs. So your analogy with the GPL's enforceability is a poor one.

Certainly under today's laws, authors of a work have the right to license their work under any contract they see fit. But then I don't think (from a moral standpoint) they're entitled to the protections Copyright would otherwise provide, because they've encumbered reasonable societal benefits and rights. Copyright holders simply aren't entitled to do whatever they see fit in order to "protect" their works, because the laws are set up to protect society.

22 Sep 2005 (updated 22 Sep 2005 at 04:28 UTC) »

This whole Google print lawsuit bungle has got me thinking.

I'm confused by those who think that Google isn't unambiguously in the clear here. Not because they're doing it for a scholarly purpose. Or because they're only reproducing a terse portion of the work. Or even because what Google is doing can't possibly affect these authors' past decisions to create their works, and thus can't retroactively disincentivize those works' creation (ahem, Sonny Bono CTEA, I'm looking at you here...).

I think that if you're arguing those points, perhaps you're looking at this from an overly-narrow perspective. I invite you to look outside the box. Sure, Google might (or might not) win on the above points alone. But I don't think that Google is copying expressive works. Google is copying databases of words.

In the case of Feist v. Rural, the Court ruled that databases (in that case, telephone directories) were not entitled to copyright protection, as they contained little (if any) expressive content. Copyright protects expression fixed in a tangible medium, and even then only within certain limitations.

Here, I believe that Google is treating otherwise expressive, copyrighted texts as databases, thus stripping them of their expressivity in the context of the texts' uses. I think that the use of a derivative work matters a great deal in determining that work's expressivity before the Court - that the use of a work has a transformative effect on the expressivity of that work, possibly even voiding that work's expressiveness in a given context. In Google's case, the works are copied - perhaps verbatim - but their expressiveness is lost in the process. Granted, this may seem non-obvious.

The search results pages rendered by Google most likely have some expressiveness, and would be copyrightable. The texts that Google OCR'd are expressive and copyrightable. But Google's treatment of these texts as search indexes - reverse text lookup databases - is in itself not expressive. They're just unexpressive token sequences, capable of being searched. It is in the translation from meaningful, expressive words into an ordered sequence of cold, machine-searchable tokens that the work loses its expressivity. Note that this distinction would still attach copyright protection to things like eBooks, as the purpose of eBooks is to convey expressivity to a human reader via an electronic medium. The purpose of the tokens is to convey an ordered sequence of words to a machine algorithm incapable of appreciating the work's expressivity or content in any way that we'd call "meaningful".

From that, we're left to conclude that the tokens "John Galt" appearing on page 1 of Ayn Rand's "Atlas Shrugged" next to the tokens "Who is" is merely a fact, absent any inherent meaningful expressivity. And absent this expressivity, copyright doesn't attach to this sentence (which, fwiw, is probably considerably too short for copyright to attach to, anyway). Facts - even collections of facts - simply aren't protected under copyright law.

In the end, Google's Print project is just a fact retrieval system - in essence, no different from the index in the back of the book that they're OCR'ing. Copyright law needn't get involved, because at no point does it affix to what Google is doing. Or so I hope that the Courts decide.

25 Aug 2005 (updated 25 Aug 2005 at 16:06 UTC) »

What does garnome vs. jhbuild have to do with anything? The bug report clearly says "pango 1.9.1", which is their development series. "Cairo 0.9.0" is also a development release. Both now have stable releases (1.10.0 and 1.0.0, respectively). Development releases aren't usually supported by their own maintainers, even when they come in tarball form. I certainly am not going to support people using my library against unstable versions of other people's libraries. Further, if you look at the bug report, you'll see that it's not my job to track my dependencies' dependencies. That's upstream of me, and should be handled by pkg-config.
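Tracking dependencies-of-dependencies is exactly what pkg-config's Requires field is for. A minimal, hypothetical .pc file (the paths, versions, and Requires entries here are illustrative, not librsvg's actual metadata) might look like:

```ini
# librsvg-2.0.pc -- illustrative sketch, not the real file
prefix=/usr
libdir=${prefix}/lib
includedir=${prefix}/include

Name: librsvg
Description: SVG rendering library
Version: 2.13.0
Requires: glib-2.0 >= 2.8.0 cairo >= 1.0.0
Libs: -L${libdir} -lrsvg-2
Cflags: -I${includedir}/librsvg-2
```

When a consumer runs `pkg-config --cflags --libs librsvg-2.0`, the Requires entries are expanded recursively, so downstream projects never have to hardcode pango's or cairo's own flags by hand.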

What's rude is assuming that librsvg tracks or would want to track CVS HEAD GNOME dependencies, which it doesn't. What's rude is assuming that its development and release schedule is tied to GNOME's, which it isn't. We do things on our own timeline, and I'll release a tarball when I think that the thing is ready to be released, and not because you say so. What's rude is then asking your users to file bugs against my product because you've gone and done something stupid. I won't be a support line for your "highly unofficial" mistakes. What's confusing is that you used the librsvg HEAD branch instead of the GNOME-2-12 branch to create this "highly unofficial" tarball. Why haven't you submitted your 'make dist' patch upstream? Or your version patch?

You made several assumptions about librsvg and my intentions without consulting me or Caleb. You then acted upon those wrong assumptions (granted, with the admirable intention of wanting to help out your users) in a way that affected your users. Now you contend that your missteps and wrongheaded assumptions should somehow reflect poorly on me instead of you. That's absurd - pull your head out and get a clue. How you handle your users is your problem, not mine.

WRT bug 314400, librsvg has never had a promise of API/ABI stability. Even so, I've had those few functions marked as deprecated for the better part of 2 years. I've informed Sven and others that they're deprecated. So now, in an *unreleased development version* of librsvg, I've removed the deprecated functions. As librsvg is in the midst of a major overhaul of both its API and its implementation (Cairo vs. libart), I don't think that's a problem. The HEAD version wasn't meant to be used against the GNOME 2.x series anyway. Use the GNOME-2-12 branch if you want API/ABI stability, which I've left in to make folks like Sven happy. As far as I'm concerned, bug 314400 is a bug in the Gimp or in your faulty assumptions. For now, I've reassigned it to the Gimp.

Anyhow, don't you think that it's hypocritical to file bug 313349 against librsvg because I allegedly haven't tracked new and unstable pango/cairo/gtk+ API/ABI changes, and then file bug 314400 against librsvg because I've broken API/ABI in an unreleased, unstable version? By your reasoning, why shouldn't the Gimp be forced to track my unstable, fluctuating API/ABI? So it's my fault both times? I contend that this is inconsistent reasoning. When you mix and match development versions of things - especially unreleased development versions of things - caveat emptor.

Finally, WTF's up with your ChangeLog complaints? You're meandering and really ranting here. In an ideal world, I'd paste in the whole bug conversation and sample files, but I think that'd be a bit absurd. Learn how to use bugzilla. You can figure out what's been fixed in a certain module since a certain day. Don't be lazy and complain that you don't know "what's been fixed" when the ChangeLog and bugzilla will both tell you so readily. Don't cop out and blame me for your own laziness, ineptitude, or both.

You could've avoided all of this if you'd just taken 2 seconds out of your day to join #librsvg and talk with me and Caleb. Pull your finger out and join the channel or write an email. But instead you decided to assume things and then act like a jerk because your assumptions didn't pan out like you'd hoped they would.

I've grown a bit annoyed at a certain MacOS competitor of - er, I mean "partner" of - AbiWord's that shall remain nameless for now.

This company has re-molded AbiWord's import/export capabilities into a standalone program. (Unlike one of their competitors still allegedly illegally using GPL'd wvWare after several C&D letters, $company does this in order to comply with the GPL, and kindly publishes their modifications to the AbiWord sources necessary to build this standalone program.)

Because Abi does such a good job of import/export, supports so many formats, and can competently speak RTF, $company can focus on just dealing with their core competencies - namely producing a nice MacOSX Word Processor. And from what I've seen, they have a very nice product. It's very well integrated with the MacOS environment, with quite the attention to detail. Import/export was just a mundane detail that they shouldn't have to pay attention to. This is how OSS is supposed to work.

However, there's a gotcha. There's always a gotcha. $company has been submitting bugs up to our bugzilla, which is a nice thing to do, as it improves both our products. But their rep has a habit of getting a little crotchety if we don't attend to these issues quickly, as if we were somehow obliged to. (In his defense, I'm not exactly the nicest person you'll run across in a bugzilla.) It seems that they're willing to "work around" complex problems in their own code rather than even attempt to identify or fix the problem inside of Abi. It's as though they have some allergy to having their programmers fix bugs in what's substantively their own product.

There is no support contract in place. There is no "help" from them other than these bug reports. As far as I know, we're not even publicly getting credit for being "the man behind the man," so to speak. Yet some of their developers are comfortable stating in their blogs that we are in a "partnership" with them. I ask - what partnership do they speak of? I'm not aware of it, and I'm allegedly 1/2 of this partnership.

I don't mind helping out. And I (usually) don't mind fixing bugs in my product. I sincerely appreciate these bug reports - Abi is a much better product because of them. I do mind being chided for not fixing things quickly enough by a company that is making money from my product (and, worse still, not passing any of it down the food chain). If you want to call this a partnership, contract with some Abi devs to fix the bugs. Or "donate" some of your programmers' time to fixing these problems. Partnership means both sides helping each other toward a common goal. So far, this has been a one-way street.
