Solid reusability

Posted 16 Aug 2000 at 20:48 UTC by gord

To date, there have been many promises that if we write software a certain way, it will be reusable in the future. Free software authors would especially benefit from the fulfillment of such a promise, because many of us are interested in maximizing the usefulness of our work. I have some ideas on what we can do to nail down this dream, and I want to use this discussion to help make software reuse a reality.

I've spoken often about Figure, which is my attempt to build a software environment that is useful for all kinds of programs, and is portable to all kinds of environments.

The thing that I find wrong with existing portable environments is that they are bound to policy (whether to use a garbage collector, whether to run programs in the same address space or separate spaces, what object representation to use, what instruction set to use, what programming language, etc). I feel that this is a natural consequence of starting to design and implement a new software environment (thereby fixing on certain policies), then choosing to make it portable at a later date. Portability, to me, means to be independent of policy, and so I believe it is difficult to attain once the policies have snuck into the design.

What makes Figure different is that it was designed from the ground up to be portable. A common theme in Figure is interfaces that are chosen to hide policies from Figure's user. For example, it is possible to write garbage-collected and reference-counted programs and have Figure use the memory reclamation scheme you choose. To go one step further, you can use the same interfaces Figure uses for memory reclamation, and thereby make your program as portable (in this respect) as Figure is.

The ideas used by Figure are not particularly innovative... I'm mostly trying to collate the different approaches that programmers have found useful in the past, and structure them in a way that hides their differences. However, note that this does not introduce much overhead... you as a programmer are still able to exert as much platform-dependent control as you need. Figure is a toolbox, not a religion.

I started designing Figure three years ago, when I was looking for a way to write an application that would run under both Linux and the Hurd, while taking advantage of the special features of each platform. Since that time, I have not found an application framework that would do the job, and so I began writing Figure.

Now, I am looking for people who know of existing software that fills this job, who have comments and suggestions, or who want to help with implementation (including documentation). But before I turn the floor over to you, I recommend that you take a look at the existing design (if you feel comfortable with C, nested functions and lots of FIXMEs), which you can find at the Figure home page.

Why should I use it?, posted 16 Aug 2000 at 23:45 UTC by kjk » (Journeyer)

I understand that this post is half an invitation to a discussion about reusability and half evangelism for the project you're working on. Nothing wrong with that. I'll only post my comments about the latter. My comments will be negative, but please don't treat them as flames. My intent is to provide a constructive critique.

Two minor points: the web site sucks (I know that it's modelled after Pliant's web site, but their web site sucks too), and the mailing list should have an archive.

One major point: I heard about Figure some time ago (I think you announced it on pliant, or I came to your web page through pliant). I've also read the current web page carefully. And I still don't know what Figure is all about. It looks like its goal is the same as Java's (a portable application framework), but I'm not sure. I know it will be object-oriented, it will have its own programming language, etc., but I don't know how it will help me achieve anything.

So here's the constructive critique part. I assume that the main purpose of the web page is to attract developers (I might be wrong, but that's my best guess), so think of your visitors as users or customers. You want to sell them something (an idea) and they will have to pay (with something very valuable: their time). So they are developers with very little time, looking (maybe) for something to do. They encounter your page (e.g. by following a link from Advogato), they read it (if you're lucky), and they're gone unless you give them a good reason to stay. The current web page doesn't give a good reason to stay. I would have to download the sources and probably look at them to get even the slightest idea of what Figure is all about.

What I would like to see is a (convincing) answer to the following question: how can Figure help me? I presume that the goal of Figure is to help in writing programs, so the more precise questions are: can Figure help me write better software? Will I be able to write software faster? If yes, how? Give me concrete examples (such and such program has been written in half the time it would take in C and it ran faster on 57 platforms, including PDP-11). Compare it with other approaches (will it be better than Lisp, Perl, whatever?). OK, I know what kind of programs will be able to use it (well, one sentence claims that it will be useful for all kinds of programs, but this doesn't seem convincing and is unlikely to be true - there is NO silver bullet).

In other words: think about marketing. Decide who you are targeting Figure at (application developers? script writers?). Then figure out what they need, and convince them that Figure gives them what they need. And more.

Maybe put the presentation online?, posted 17 Aug 2000 at 00:40 UTC by matt » (Journeyer)

I, too, am a little lost (though I'm not interested in critiquing). I've been studying (or trying to study) Figure for a bit since you mentioned it in my article on path-relocatable software. I guess I'm trying to figure out how it will help my goals and others'.

You mentioned in a diary entry that you had a well-received presentation on it; perhaps you could post the contents of that presentation?

A White Paper Perhaps, posted 17 Aug 2000 at 00:58 UTC by nymia » (Master)

A White Paper would be a very good way of presenting your proofs. Perhaps, you could post your presentation online too.

code reuse, posted 17 Aug 2000 at 05:30 UTC by mettw » (Observer)

One thing I've never seen with code reuse projects is a solid study of which forms of programming result in good reuse and which don't. If you want Figure to meet its design goals then I think you'll have to start with such a study, rather than just a new idea.

My experience with code reuse evangelism is that people seem to put all of their faith in yet another programming language or idea. We now have GNOME putting all of its faith in CORBA. Given the woeful failure of other such claims, I think I'm justified in being skeptical.

But to be more constructive, here are the successes in code reuse that I've seen:

  • Good API Systems like Java don't achieve good code reuse because of any characteristic like the VM, OOP or so forth. Most code reuse with Java comes from the well designed standard APIs such as DOM. If a programme only uses the DOM API then it is assured to run regardless of what DOM library the computer actually has. This is especially important in cases where every programme has a different GUI library. If there were a standard GUI API then each library could have wrappers for this API, and therefore you would only need one GUI library on a system.
  • Low levelness The standard C libraries enjoy enormous code reuse mainly because they only implement the bare bones. There are no policies in the C libraries that can conflict with a programmer's view of the world, simply because they are too low level to have any policy at all.
  • Pipes and lazy eval Unix pipes and lazy evaluation in some functional languages give great code reuse because they provide a common, unrestrictive way to share data between programmes. The most important part is that the second programme doesn't need to know how the first is going to serve up the data, because the first can only serve it up one way - sequentially. This of course needs some strong glue to tie together the different programmes/functions by converting output from the format of the first programme to that of the second. In the shell world you have AWK for this; in functional programmes the language itself provides the glue.
  • Types C++ templates try to get at something that exists in functional languages, but miss the mark IMO. Basically, in a language like Haskell you define a function like add a b = a + b and this function will work for any type for which the `+' operator makes sense. This is the ultimate in code reuse.
  • Flexibility Going back to UNIX shell utilities, a large part of the code reuse in them comes from their incredible flexibility. There is no such thing as a view in the Unix shell world, so you just don't see things like the DOM and RAX APIs being needed for differing views of the same thing.
No matter how much effort you put into getting things right in your base system though, you'll never get good code reuse without good engineering on the part of those using it.

Essentials?, posted 17 Aug 2000 at 10:20 UTC by tetron » (Journeyer)

mettw: you're dancing around a few points, let me see if I can put my finger on them.

The basics of software reuse are twofold: interfaces and adaptability.

I'll take the second one first. Adaptability or flexibility is not only a matter of how general a case of the problem at hand the code solves, but of what sort of meta-level hooks the code has to be extended with. A good example of this is the C library function qsort(). Instead of enforcing a policy ("this function only sorts integers", for example), it allows the user to supply their own function pointer, which qsort() will then use when it does comparisons as it sorts. Taking this further out, you get ideas like parametric polymorphism (which C++ templates try to be, and which is found properly in languages like ML and Haskell), reflection (the ability to analyze program structure at runtime, as in the Java reflection API java.lang.reflect) and, way out there, meta-object protocols such as those found in the Common Lisp Object System (CLOS).

What these things do is allow the programmer to take the essential structure of a piece of code and decorate it with her OWN code. The MVC (Model-View-Controller) pattern is a good architectural example of this, because it is built on the concept of using events (model changes) to trigger hooks (viewers) which have been added on at a later time by the application programmer. The point here is that you are not simply using the code, but actually changing or augmenting the way it works for your application. Applications with embedded scripting languages are another good example of this; the scripting system lets you reuse the native application logic (for example, Emacs and elisp).

The other half of reuse is interfaces. These of course define how exactly one module is going to interact with the other modules that need to use it. Interfaces and extensibility are orthogonal issues, I think; you can have a useful API that is totally nonextensible, or a very extensible system with a horrible API. However, the best systems are going to have both.

One of the dreams of component-based architectures (and when I say components here I'm also referring to object-oriented systems more generally) is that no piece of code needs to be written more than once. Then all other programs can simply use the interface exported by that component, and everyone is happy. Well, this suffers from several problems. One is that the API may not match up exactly to your needs, and if it is not extensible as I discussed above, you're out of luck. Another problem is that APIs tend to change over time, which breaks dependencies. Also, as anyone who has done much object-oriented programming can attest, components themselves tend to form dense networks of interdependencies, so that component X depends on component Y which depends on component Z (and if you're really unlucky, X and Z will both depend on Q, but different versions of Q).

I should note that interfaces bear many resemblances to (and are in fact related to) type systems. For example, a pipe is a universal interface, but it's a lot like "void *" in C. It is completely untyped, and left up to the parties at each end to make any sort of sense out of the data being exchanged. They have to agree on their own protocol, and a change in how data is interpreted at one end of the pipe may completely throw off the logic at the other end. At the other extreme you have strong interfaces, such as CORBA IDL (Interface Definition Language), which are somewhat like typesafe languages like ML. You get strong typechecking at compile time, and are basically guaranteed that data will arrive in a certain format, or not at all. If one end sends a bad message to the ORB, it will (I assume) reject it for not following the previously-agreed interface. The problem now becomes that you have lost the ability to change your interfaces at runtime, which actually works against the ability to extend your program logic at runtime.

It's an interesting problem. If interfaces were a piece of rope, very strict interfaces would be enough rope to tie yourself up with, and pipes would be enough rope to hang yourself with. Interfaces are something that most programming languages don't give you much choice about, of course. Most of them tend towards the strict, typesafe model, with Lisp being a notable exception; C would be more strongly typed but has a much too lenient compiler (that is to say, C has a decent type system, but the compiler tends to ignore it).

Good interface design is a difficult thing, as you need to balance ease of use, complexity, and overall power, in addition to the aforementioned issues of dealing with interfaces that don't quite match up. A good API, rather than simply providing certain services, should completely encapsulate a certain conceptual computational structure and, more importantly, expose that structure in the API at both high and low levels - for example, the OSI (I think that's the right acronym) layered network model, going from hardware protocols (Ethernet) to routing (IP) and streams (TCP). However, the application can, at its discretion, select which layer it actually deals with, or tweak operations at a lower layer to better support the higher-layer operations.

Whoops! I just realized it's 6:25 in the morning and I want to go to bed so I'm going to stop here :-)

Documentation, posted 17 Aug 2000 at 20:19 UTC by gord » (Master)

Since the primary complaint so far is that Figure is lacking documentation, I've made that my priority for the next little while.

My only regret is that the docs may not be usable for a while yet, so it seems this discussion may be premature. On the other hand, I like the points raised by mettw and tetron.

I will try to post more as soon as I get the chance. In the meantime, I hope there may be others with their own thoughts on what portability and reuse entail.

My two cents about reusability, posted 17 Aug 2000 at 21:38 UTC by nymia » (Master)

It looks like you're soliciting for ideas, so here are some of my simple ideas worth two cents.

I've always looked at software reuse in the light of the following:
(1) Abstraction and Typing
(2) Syntax and Semantics

Abstraction and Typing
Software reuse became possible because of abstraction. In compiler design, abstraction is done by defining a type for an object, and is known as typing. A type means a storage location having defined operations. For example, an integer might be a 16-bit storage location capable of performing addition and subtraction. Multiplication and division can then be implemented using addition and subtraction respectively. With that kind of system, programmers can easily grasp the idea and reuse it at the language level.

Syntax and Semantics
Once the types are defined, the next step is to establish how an idea gets translated into an expression that is both readable and maintainable. Readable in the sense that I can understand with ease what the expression is trying to accomplish. As a result, it becomes maintainable: programmers can then add more features as the software goes through upgrades.

This is an area where most compiler makers stumble. They seem to be afraid of committing to a concrete definition of what kind of framework a language should have. An example would be K&R and Stroustrup, who believed that it was not the language's responsibility to force programmers to work inside a framework. That's why we see, hear, or read Stroustrup stating that C++ is a language, not an environment. IMHO, if these people had managed to define a framework for the language, then software reuse probably wouldn't be an issue at all. But that didn't happen, and now projects like Java, Jini, COM, CORBA, GNOME, KDE and the New Amiga are tackling this issue.

Overall, I simply must submit myself to the idea that software reuse at the framework level will never happen. Or maybe I'm wrong; prove it if you can.

concrete suggestions, posted 18 Aug 2000 at 00:11 UTC by mettw » (Observer)

OK, I've thought some more about code reuse and I think I may have hit upon two suggestions for how to get it.

  • Expressions By expressions I mean things like regex and XPath. These offer enormous flexibility, and their use is not fixed at compile time. Compare, for example, a series of DOM requests with a function like xpath(Node*, char*). The former is limited to however many cases you account for in your code, while the latter can have the char* argument completely constructed during execution, giving much greater flexibility. You also won't see DOM and RAX APIs with XPath, because XPath accounts for every view of the underlying data. So, generally, every data structure should be examined using expressions rather than an API.
  • Cascading Style Sheets These have many more applications than just HTML. I came up with this idea while trying to come up with a structure for a generalised GUI API. My idea was that, rather than telling the GUI library how to do something, the programme should only tell it what to do, and use CSS to convert this into a GUI. For example, a programme wouldn't tell a GUI library to put a menu bar at the top and construct these menus/submenus etc. Instead, the programme would tell the GUI library that it wishes to export the following callbacks to the UI, and would also group them together by what sort of callback they contain. This then leaves the user free to have his own style sheet for binding keys to these callbacks; if he is a hacker or blind, he might have a stylesheet that binds all of these callbacks to an Emacs-style minibuffer instead of a menubar. Someone new to the application, but who wishes to use a minibuffer, might have a stylesheet that binds the callbacks to a menu system as well as custom keybindings and a minibuffer. The real point here being that the setting of policy is taken away from the programmer and given to the user.

Heh..., posted 18 Aug 2000 at 14:37 UTC by sab39 » (Master)

mettw: Three letters - X U L :)

Seriously, I think that things like XUL and Glade, combined with flexible scripting architectures that give accessibility to the underlying components of the application, have definite promise in this area. Aphrodite, for example, is a complete re-implementation of a XUL-based browser that doesn't share much (any?) XUL or javascript with the Navigator browser, but can do all the same things through use of the same components. Basically, XUL allowed the whole user-interface to be re-implemented from scratch at very little cost (cost == time, of course).

This has obvious implications for experimentation with good UIs and alternative placement of elements, but it also has implications for code reuse - in a XUL/scripting environment, if I want to write (say) an HTML-enabled help application, I just need to pop up a XUL window and embed the same HTML viewing component that the browser does. If I want an HTTP connection, I can (presumably) instantiate the scriptable http object. If I want to construct and send a MIME email from my application, there are objects for it. Scriptable components go a long way towards code reuse, and the best way to ensure that the scripting interfaces are useful is to build your actual application using those scripting interfaces!

Wide-spread adoption, posted 18 Aug 2000 at 22:31 UTC by kjk » (Journeyer)

The single most important thing in reuse is widespread adoption. It's much more important than engineering details like the stuff already mentioned: good APIs, abstraction, flexibility, etc. This doesn't imply that those are unimportant. It's just that if you have widespread adoption, you can get away with not having the most brilliant design; if you have the most brilliant design and no adoption at all, no reuse will happen.

You want proof? I could try to win a popularity contest here and pick on a few technologies from a certain company we all love to hate, but I'll limit myself to the wonderful world of Unix. Exhibit number one: X Windows. xlib is just another name for the Evil. Really. People wrote programs in it only because they had no choice and xlib was there, on every Unix machine. Then came Motif. Some would say it's even more Evil, but people were just desperate and would do anything not to use xlib directly. And Motif was there, on every Unix machine (I'm talking about pre-Linux times, when Motif was shipped with every major Unix workstation). Now we have Gtk. Do people use it because it's oh so great? No. The file manager sucks like you wouldn't believe, the list/tree widget is useless if you have a large number of items, and the fonts...

My point is: even if you have a less than stellar design, wide adoption will make up for it tenfold. libc, perl, and Java are all evidence in support of this statement. Conclusion: if anyone wants to promote reuse of any technology (like Gord's Figure), he should strive for technical excellence and wide adoption, with the latter being more important.

Online demonstration, posted 18 Aug 2000 at 23:05 UTC by gord » (Master)

[Don't mind me... I'll just keep posting to this thread.]

As before, I appreciate the comments that have come out thus far.

I just wanted to mention that I have an online demonstration of a bit of Figure, that you can check out if you're interested in seeing it, but don't want to bother with the code yet.

Just do telnet and follow your nose.

Concrete reusability, posted 19 Aug 2000 at 16:32 UTC by terop » (Journeyer)

I always try to divide the code into three different pieces:

  • Client is outside the system you're building and outside of your control. You need to make client's life as simple as possible.
  • Co-ordinator is inside your system, but only deals with complexity of one system.
  • Reusable code is inside your system, but it needs to deal with complexity of all systems that it can be used with.
The important part here is that the client and co-ordinator are at the same level of abstraction. They are only designed to work with one system, while reusable code should be usable with many systems. The co-ordinator thus hides many details that are irrelevant for the client, but which the reusable code provides for other systems. The co-ordinator's interface should be simpler than the reusable code's ever can be -- this comes from the fact that it needs to implement a much smaller set of requirements than the reusable code.
This division of roles/responsibilities gives the following design(UML):

Client <>---->1 Co-ordinator <>----->n Reusable_Class1

Usually the co-ordinator's interface is very simple -- methods like DoAllOfIt() -- while the reusable code has been split into very many small methods that each do a simple part of the whole problem. Maintenance of reusable code usually involves adding new methods to customize different aspects of the code; rewrites of the main algorithm inside the reusable code are often needed when new parameters appear. Soon that code implements support for many parameters, and the co-ordinators using it need to do a lot of setup to utilize the code.

This is also called Facade in Design Patterns. (Though there are differences: a facade is usually added after you have existing classes and want to make them simpler, whereas co-ordinators should be designed into the system. Design Patterns also does not mention the important connection between the client and the co-ordinator: they're at the same level of abstraction, and a co-ordinator is only responsible for one kind of client, while the subsystem/reusable classes are responsible for all possible clients. There are usually many co-ordinators for the same reusable classes, each utilizing different aspects of the reusable code.)

For example, this has happened to Gtk+: they've added a large number of methods to make it usable in as many contexts as possible. This allows people to build co-ordinators (Gtk+-based libraries/applications) that provide simple interfaces to clients... => Most of the existing code already works this way...
