sccons Design Overview

Steven Knight
knight@baldmt.com
June 2000

Abstract

The sccons tool provides an easy-to-use, feature-rich interface for constructing software. Architecturally, sccons separates its dependency analysis and external object management into an interface-independent Build Engine that could be embedded in any software system that can run Python.

At the command line, sccons presents an easily-grasped tool where configuration files are Python scripts, reducing the need to learn new build-tool syntax. Inexperienced users can use intelligent methods that ``do the right thing'' to build software with a minimum of fuss. Sophisticated users can use a rich set of underlying features for finer control of the build process, including mechanisms for easily extending the build process to new file types. Dependencies are tracked using digital signatures, which provide more robust dependency analysis than file time stamps. Implicit dependencies are determined automatically by scanning the contents of source files, avoiding the need for laborious and fragile maintenance of static lists of dependencies in configuration files. The tool supports use of files from one or more central code repositories, a mechanism for caching derived files, and parallel builds. The sccons tool also includes a framework for sharing build environments, which allows system administrators or integrators to define appropriate build parameters for use by other users.


Index

The full sccons design is spread across several pages, each concentrating on a specific aspect of the overall submission.


Background

Most of the ideas in sccons originate with Cons, a Perl-based software construction utility that has been in use by a small but growing community since its development by Bob Sidebotham at FORE Systems in 1996. The Cons copyright has recently been transferred from Marconi (who purchased FORE Systems) to the Free Software Foundation. The author of this proposal is currently a principal maintainer of Cons, working closely with the primary maintainer, Rajesh Vaidheeswarran.

Cons was originally designed to handle complicated software build problems (multiple directories, variant builds) while keeping the input files simple and maintainable. The general philosophy is that the build tool should ``do the right thing'' with minimal input from an unsophisticated user, while still providing a rich set of underlying functionality for more complicated software construction tasks needed by experts.

I've used the name sccons to distinguish this design from the already-available Perl implementation of Cons, knowing that the actual name of the tool isn't important and should change to reflect some common naming with other Software Carpentry tools.


Goals

The limitations of the classic Make utility, which sccons is intended to replace, have been recounted repeatedly in other papers, notably including the other design submissions for the Software Carpentry Build tool. This paper won't belabor those points, but will instead describe a set of goals that sccons aims to satisfy as a next-generation build tool. Figuring out which goals are directly driven by limitations of Make is left as an exercise for the reader.

Practicality
The sccons design emphasizes an implementable feature set that lets users get practical, useful work done. sccons is helped in this regard by its roots in Cons, which has had its feature set honed by several years of input from a dedicated band of users.

Portability
sccons is intended as a portable build tool, able to handle software construction tasks on a variety of operating systems. It should be possible (although not mandatory) to use sccons so that the same configuration file builds the same software correctly on, for example, both Linux and Windows NT. Consequently, sccons should hide from users operating-system-dependent details such as filename extensions (for example, .o vs. .obj).

Usability
Novice users should be able to grasp quickly the rudiments of using sccons to build their software.

This extends to installing sccons, too. Installation should be painless, and the installed sccons should work ``out of the box'' to build most software.

This goal should be kept in mind during implementation, when there is always a tendency to try to optimize too early. Speed is nice, but not as important as clarity and ease of use.

Utility
sccons should also provide a rich enough set of features to accommodate building more complicated software projects. However, the features required for building complicated software projects should not get in the way of novice users. (See the previous goal.) In other words, complexity should be available when it's needed but not required to get work done. Practically, this implies that sccons shouldn't be dumbed down to the point it excludes complicated software builds.

Sharability
As a key element in balancing the conflicting needs of Usability and Utility, sccons should provide mechanisms to allow sccons users to share build rules, dependency scanners, and other objects and recipes for constructing software. A good sharing mechanism should support the model wherein most developers on a project use rules and templates that are created and maintained by a local integrator or build-master.

Extensibility
sccons should provide mechanisms for easily extending its capabilities, including building new types of files, adding new types of dependency scanning, being able to accommodate dependencies between objects other than files, etc.

Flexibility
In addition to providing a useful command-line interface, sccons should provide the right architectural framework for embedding its dependency management in other interfaces. sccons would help strengthen other GUIs or IDEs, and the additional requirements of the other interfaces would help broaden and solidify the core sccons dependency management.


Architectural Overview

The heart of sccons is its Build Engine. The sccons Build Engine is a Python module that manages dependencies between external objects such as files or database records. The Build Engine is designed to be interface-neutral and easily embeddable in any software system that needs dependency analysis between updatable objects.

The key parts of the Build Engine architecture are captured in the following quasi-UML diagram:

The point of sccons is to manage dependencies between arbitrary external objects. Consequently, the Build Engine does not restrict or specify the nature of the external objects it manages, but instead relies on one or more Intercessors to interact with the external system or systems (file systems, database management systems) that maintain the objects being examined or updated.

An Intercessor provides a general interface for interacting with external objects, by encapsulating knowledge of interacting with a specific external system, such as a file system or a database management system. This includes translating human-readable descriptions of the external object (e.g., a path name string) into and out of the internal Node objects; creating or deleting external objects; copying external objects; etc. (It may help to recognize that, in Object-Oriented pattern terms, the Intercessor class is an Abstract Factory, generating product objects through Concrete Factory subclasses.)
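As a rough illustration of the Abstract Factory idea, the following sketch shows how an Intercessor subclass might translate path name strings into Node objects. All class and method names here (Intercessor.node, canonical, make_node, FileIntercessor, FileNode) are hypothetical and are not taken from the sccons specification.

```python
import os
from abc import ABC, abstractmethod


class Node:
    """Internal representation of one external object (hypothetical)."""

    def __init__(self, name):
        self.name = name


class FileNode(Node):
    """A Node that stands for a file on disk."""


class Intercessor(ABC):
    """Abstract factory: translates descriptions into Node objects."""

    def __init__(self):
        self._nodes = {}  # canonical description -> Node

    @abstractmethod
    def canonical(self, description):
        """Normalize a human-readable description."""

    @abstractmethod
    def make_node(self, key):
        """Create the concrete Node for a canonical key."""

    def node(self, description):
        # Return the same Node for equivalent descriptions.
        key = self.canonical(description)
        if key not in self._nodes:
            self._nodes[key] = self.make_node(key)
        return self._nodes[key]


class FileIntercessor(Intercessor):
    """Concrete factory for a file system."""

    def canonical(self, description):
        return os.path.normpath(description)

    def make_node(self, key):
        return FileNode(key)


fs = FileIntercessor()
a = fs.node("src/./main.c")
b = fs.node("src/main.c")
```

Both calls return the same FileNode, because the two strings describe the same external object. A database Intercessor would supply its own canonical and make_node methods.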

The Build Engine presents to the software system in which it is embedded a Python API for specifying source (input) and target (output) objects, rules for building/updating objects, rules for scanning objects for dependencies, etc. Above its Python API, the Build Engine is completely interface-independent, and can be encapsulated by any other software that supports embedded Python.

Software that chooses to use the Build Engine for dependency management interacts with it through Construction Environments. A Construction Environment consists of a dictionary of environment variables, and one or more associated Scanner objects and Builder objects. The Python API is used to form these associations.

A Scanner object specifies how to examine a type of source object (C source file, database record) for dependency information. A Scanner object may use variables from the associated Construction Environment to modify how it scans an object: specifying a search path for included files, which field in a database record to consult, etc.
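A minimal sketch of what a Scanner for C source files might do, assuming a simple regular-expression match on quoted #include lines. The function name scan_c_source and its search_path parameter are hypothetical; search_path stands in for a variable that would come from the associated Construction Environment.

```python
import os
import re
import tempfile

INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.MULTILINE)


def scan_c_source(path, search_path):
    """Return the files named by #include "..." lines that can be found."""
    with open(path) as handle:
        names = INCLUDE_RE.findall(handle.read())
    found = []
    for name in names:
        # Look beside the source file first, then along the search path.
        for directory in [os.path.dirname(path)] + list(search_path):
            candidate = os.path.join(directory, name)
            if os.path.exists(candidate):
                found.append(candidate)
                break
    return found


# Build a tiny source tree in a temporary directory and scan it.
with tempfile.TemporaryDirectory() as top:
    include_dir = os.path.join(top, "include")
    os.mkdir(include_dir)
    header = os.path.join(include_dir, "util.h")
    source = os.path.join(top, "main.c")
    with open(header, "w") as handle:
        handle.write("int util(void);\n")
    with open(source, "w") as handle:
        handle.write('#include <stdio.h>\n#include "util.h"\n')
    deps = scan_c_source(source, search_path=[include_dir])
```

Changing the search path changes which header is found, which is why the scan result depends on the environment as well as on the source file. This sketch ignores angle-bracket includes and nested includes.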

A Builder object specifies how to update a type of target object: executable program, object file, database field, etc. Like a Scanner object, a Builder object may use variables from the associated Construction Environment to modify how it builds an object: specifying flags to a compiler, using a different update function, etc.
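The relationship between a Construction Environment and a Builder can be sketched as follows. The Environment and Builder classes, the add_builder method, and the variable names CC, CFLAGS, TARGET and SOURCES are assumptions made for this illustration, not the actual sccons API.

```python
class Builder:
    """Describes how to update one kind of target (hypothetical)."""

    def __init__(self, command_template):
        self.command_template = command_template

    def command(self, env, target, sources):
        # Expand construction variables into the command line.
        values = dict(env.variables, TARGET=target, SOURCES=" ".join(sources))
        return self.command_template.format(**values)


class Environment:
    """A dictionary of variables plus associated Builders."""

    def __init__(self, **variables):
        self.variables = dict(variables)
        self.builders = {}

    def add_builder(self, name, builder):
        self.builders[name] = builder


compile_c = Builder("{CC} {CFLAGS} -c -o {TARGET} {SOURCES}")

optimized = Environment(CC="cc", CFLAGS="-O2")
optimized.add_builder("Object", compile_c)

debug = Environment(CC="cc", CFLAGS="-g")
debug.add_builder("Object", compile_c)

optimized_cmd = optimized.builders["Object"].command(optimized, "foo.o", ["foo.c"])
debug_cmd = debug.builders["Object"].command(debug, "foo.o", ["foo.c"])
```

The same Builder produces different command lines depending on the environment it is associated with, which is the behavior the paragraph above describes.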

Scanner and Builder objects will return one or more Node objects that represent external objects. Node objects are the means by which the Build Engine tracks dependencies: A Node may represent a source (input) object that should already exist, or a target (output) object which may be built, or both. The Node class is sub-classed to represent external objects of specific types: files, directories, database fields or records, etc. However, because dependency information is tracked by the top-level Node methods and attributes, dependencies can exist between nodes representing different external object types. For example, building a file could be made dependent on the value of a given field in a database record, or a database table could depend on the contents of an external file.
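To make the cross-type dependency idea concrete, here is a small sketch in which dependency tracking lives entirely in the base class. The names depends_on, all_dependencies, FileNode and DatabaseFieldNode are invented for this example.

```python
class Node:
    """Base class: dependency tracking lives here (hypothetical sketch)."""

    def __init__(self, name):
        self.name = name
        self.dependencies = []

    def depends_on(self, *nodes):
        self.dependencies.extend(nodes)

    def all_dependencies(self):
        # Walk the graph depth-first, visiting each Node once.
        seen, stack = [], list(self.dependencies)
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.append(node)
                stack.extend(node.dependencies)
        return seen


class FileNode(Node):
    """Represents a file."""


class DatabaseFieldNode(Node):
    """Represents one field of a database record."""


version = DatabaseFieldNode("releases.current.version")
header = FileNode("version.h")
program = FileNode("prog")

header.depends_on(version)   # a file depends on a database field
program.depends_on(header)

names = [node.name for node in program.all_dependencies()]
```

Because depends_on and all_dependencies are defined on Node itself, neither sub-class needs to know what kind of object sits on the other end of a dependency.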

The Build Engine uses a Job class (not displayed) to manage the actual work of updating external target objects: spawning commands to build files, submitting the necessary commands to update a database record, etc. The Job class has sub-classes to handle differences between spawning jobs in parallel and serially.
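One way the serial and parallel sub-classes might differ is sketched below, using the standard subprocess and concurrent.futures modules. The class names SerialJob and ParallelJob and the run_one and run_all methods are assumptions, and the sketch handles only commands that are independent of one another; a real engine would also have to respect dependency order.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


class Job:
    """Runs a list of commands; sub-classes decide the scheduling."""

    def run_one(self, command):
        # Return the exit status of a single spawned command.
        return subprocess.run(command).returncode

    def run_all(self, commands):
        raise NotImplementedError


class SerialJob(Job):
    def run_all(self, commands):
        return [self.run_one(command) for command in commands]


class ParallelJob(Job):
    def __init__(self, max_workers=2):
        self.max_workers = max_workers

    def run_all(self, commands):
        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
            # map() preserves the order of the input commands.
            return list(pool.map(self.run_one, commands))


# Two stand-in "build tools" that simply exit with a known status.
commands = [
    [sys.executable, "-c", "raise SystemExit(0)"],
    [sys.executable, "-c", "raise SystemExit(3)"],
]
serial_status = SerialJob().run_all(commands)
parallel_status = ParallelJob().run_all(commands)
```

Both sub-classes return the same list of exit statuses, so the caller does not need to know which scheduling strategy was used.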

The Build Engine also uses a Signature class (not displayed) to maintain information about whether an external object is up-to-date. Target objects with out-of-date signatures are updated using the appropriate Builder object.

Details on the composition, methods, and attributes of these classes are available in the Internals page.


Build Engine

More detailed discussion of some of the Build Engine's characteristics:

Python API

The Build Engine can be embedded in any other software that supports embedding Python: in a GUI, in a wrapper script that interprets classic Makefile syntax, or in any other software that can translate its dependency representation into the appropriate calls to the Build Engine API. An attached page describes in detail the specification for a ``Native Python'' interface that will drive the sccons implementation effort.

Single-image execution

When building/updating the objects, the Build Engine operates as a single executable with a complete Directed Acyclic Graph (DAG) of the dependencies in the entire build tree. This is in stark contrast to the commonplace recursive use of Make to handle hierarchical directory-tree builds.
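The benefit of a single graph can be illustrated with the standard graphlib module (Python 3.9 or later). The file names below are invented; the point is only that one ordering covers targets in several directories at once.

```python
from graphlib import TopologicalSorter  # Python 3.9 or later

# Each target maps to the objects it depends on, across all directories.
graph = {
    "bin/prog": ["src/main.o", "lib/libutil.a"],
    "src/main.o": ["src/main.c", "include/util.h"],
    "lib/libutil.a": ["lib/util.o"],
    "lib/util.o": ["lib/util.c", "include/util.h"],
}

# static_order() yields every node after all of its dependencies and
# raises graphlib.CycleError if the graph is not acyclic.
order = list(TopologicalSorter(graph).static_order())
```

With recursive Make, the src and lib directories would each see only their own part of this graph, and the shared dependency on include/util.h would have to be stated in both places.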

Dependency analysis

Dependency analysis is carried out via digital signatures (a.k.a. ``fingerprints''). The contents of an object are examined and reduced to a number that can be stored and compared to see if the object has changed. Additionally, sccons uses the same signature technique on the command lines that are executed to update an object. If the command line has changed since the last time, then the object must be rebuilt.
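A minimal sketch of signature-based checking follows, assuming SHA-256 from the standard hashlib module as the fingerprint function (the choice of algorithm is an assumption of this example). The function names are hypothetical.

```python
import hashlib


def content_signature(data):
    """Reduce an object's contents (bytes) to a short fingerprint."""
    return hashlib.sha256(data).hexdigest()


def target_signature(source_contents, command_line):
    """Combine the signatures of all sources with the command line."""
    combined = hashlib.sha256()
    for data in source_contents:
        combined.update(content_signature(data).encode("ascii"))
    combined.update(command_line.encode("utf-8"))
    return combined.hexdigest()


def is_up_to_date(stored_signature, source_contents, command_line):
    return stored_signature == target_signature(source_contents, command_line)


sources = [b"int main(void) { return 0; }\n"]
stored = target_signature(sources, "cc -O2 -c main.c")

same = is_up_to_date(stored, sources, "cc -O2 -c main.c")
new_flags = is_up_to_date(stored, sources, "cc -g -c main.c")
new_source = is_up_to_date(stored, [b"int main(void) { return 1; }\n"],
                           "cc -O2 -c main.c")
```

Here same is true, while changing either the compiler flags or the source contents makes the stored signature stale and so triggers a rebuild.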

Customized Output

The output of the Build Engine is customizable through user-defined functions. This could be used to print additional desired information about what sccons is doing, or tailor output to a specific build analyzer, GUI, or IDE.
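As a sketch of how a user-defined output function might be plugged in, assume the engine passes each (target, command) pair through a reporter before printing it. The announce function and both reporters are hypothetical names.

```python
def default_reporter(target, command):
    # Show the full command line, as a classic build tool would.
    return command


def quiet_reporter(target, command):
    # Show only what is being built, not how.
    return "Building " + target


def announce(steps, reporter=default_reporter, write=print):
    """Pass each build step through a user-supplied reporter function."""
    for target, command in steps:
        write(reporter(target, command))


steps = [("foo.o", "cc -c -o foo.o foo.c"), ("prog", "cc -o prog foo.o")]
lines = []
announce(steps, reporter=quiet_reporter, write=lines.append)
```

A GUI or IDE could supply its own write function to route the same messages into a log window instead of standard output.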

Build failures

sccons detects build failures via the exit status from the tools used to build the target files. By default, a failed exit status (non-zero on UNIX systems) terminates the build with an appropriate error message. An appropriate class from the Python library will interpret build-tool failures via an OS-independent API.

If multiple tasks are executing in a parallel build, and one tool returns failure, sccons will not initiate any further build tasks, but allow the other build tasks to complete before terminating.

A -k command-line option may be used to ignore errors and continue building other targets. In no case will a target that depends on a failed build be rebuilt.
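The failure rules above can be sketched for the serial case as follows. The build function, its parameters, and the keep_going flag (standing in for the -k option) are assumptions of this example.

```python
import subprocess
import sys


def build(order, dependencies, commands, keep_going=False):
    """Run commands in dependency order; return (built, failed, skipped).

    order:        targets listed so that dependencies come first
    dependencies: target -> list of targets it depends on
    commands:     target -> argument list to execute
    """
    built, failed, skipped = [], [], []
    for target in order:
        blocked = any(dep in failed or dep in skipped
                      for dep in dependencies.get(target, []))
        if blocked:
            skipped.append(target)      # never build on top of a failure
            continue
        status = subprocess.run(commands[target]).returncode
        if status == 0:
            built.append(target)
        else:
            failed.append(target)
            if not keep_going:
                break                   # default: stop at the first failure
    return built, failed, skipped


# Stand-in tools that exit with a known status.
ok = [sys.executable, "-c", "raise SystemExit(0)"]
bad = [sys.executable, "-c", "raise SystemExit(1)"]
order = ["a.o", "b.o", "prog", "doc"]
dependencies = {"prog": ["a.o", "b.o"]}
commands = {"a.o": bad, "b.o": ok, "prog": ok, "doc": ok}

stopped = build(order, dependencies, commands)
continued = build(order, dependencies, commands, keep_going=True)
```

Without keep_going the build stops as soon as a.o fails. With it, the unrelated targets b.o and doc are still built, but prog is skipped because one of its dependencies failed.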


Interfaces

As previously described, the sccons Build Engine is interface-independent above its Python API, and can be embedded in any software system that can translate its dependency requirements into the necessary Python calls.

The ``main'' sccons interface for implementation purposes uses Python scripts as configuration files. Because this exposes the Build Engine's Python API to the user, it is currently called the ``Native Python'' interface.

This section will also discuss how sccons will function in the context of two other interfaces: the Makefile interface of the classic Make utility, and a hypothetical graphical user interface (GUI).

Native Python interface

The Native Python interface is intended to be the primary interface by which users will know sccons--i.e., it is the interface they will use if they actually type sccons at a command-line prompt.

In the Native Python interface, sccons configuration files are simply Python scripts that directly invoke methods from the Build Engine's Python API to specify target files to be built, rules for building the target files, and dependencies. Additional methods, specific to this interface, are added to handle functionality that is specific to the Native Python interface: reading a subsidiary configuration file; copying target files to an installation directory; etc.

Because configuration files are Python scripts, Python flow control can be used to provide very flexible manipulation of objects and dependencies. For example, a function could be used to invoke a common set of methods on a file, and called iteratively over an array of files.
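For instance, a configuration file might wrap a common sequence of calls in a function and loop over a list of sources. The sketch below uses a stand-in environment that only records calls; the method names Object and Install are hypothetical and are not taken from the Native Python specification.

```python
class RecordingEnvironment:
    """Stand-in for a construction environment; it only records calls."""

    def __init__(self):
        self.calls = []

    def Object(self, target, source):
        self.calls.append(("Object", target, source))

    def Install(self, directory, target):
        self.calls.append(("Install", directory, target))


def compile_and_install(env, source):
    # Apply the same sequence of methods to one source file.
    target = source.rsplit(".", 1)[0] + ".o"
    env.Object(target, source)
    env.Install("lib", target)


env = RecordingEnvironment()
for source in ["alpha.c", "beta.c", "gamma.c"]:
    compile_and_install(env, source)
```

Adding a fourth source file is then a one-word change to the list, rather than another block of copied rules.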

As an additional advantage, syntax errors in sccons Native Python configuration files will be caught by the Python parser. Target-building does not begin until after all configuration files are read, so a syntax error will not cause a build to fail half-way.
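The idea of catching syntax errors before any building starts can be sketched with Python's built-in compile function, which parses a script without executing it. The check_configuration function and the in-memory files mapping are assumptions of this example.

```python
def check_configuration(files):
    """Compile every configuration script before any building starts.

    files: mapping of file name to Python source text
    Returns a list of (file name, line number) pairs for syntax errors.
    """
    errors = []
    for name, text in files.items():
        try:
            compile(text, name, "exec")   # parse only; nothing is executed
        except SyntaxError as error:
            errors.append((name, error.lineno))
    return errors


files = {
    "Conscript": "sources = ['a.c', 'b.c']\n",
    "test/Conscript": "for s in ['x.c', 'y.c']\n    pass\n",  # missing colon
}
errors = check_configuration(files)
bad_files = [name for name, _ in errors]
```

Only the second script is reported, and no target has been touched by the time the error is found.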

Makefile interface

An alternate sccons interface would provide backwards compatibility with the classic Make utility. This would be done by embedding the sccons Build Engine in a Python script that can translate existing Makefiles into the underlying calls to the Build Engine's Python API for building and tracking dependencies. Here are approaches to solving some of the issues that arise from marrying these two pieces:

Lest this seem like too outlandish an undertaking, there is a working example of this approach: Gary Holt's Make++ utility is a Perl script that provides admirably complete parsing of complicated Makefiles around an internal build engine inspired, in part, by the classic Cons utility.

Graphical interfaces

The sccons Build Engine is designed from the ground up to be embedded into multiple interfaces. Consequently, embedding the dependency capabilities of sccons into a graphical interface would be a matter of mapping the GUI's dependency representation (either implicit or explicit) into corresponding calls to the Python API of the sccons Build Engine.

Note, however, that this proposal leaves the problem of designing a good graphical interface for representing software build dependencies to people with actual GUI design experience...


Other Issues

Interaction with SC-config

The SC-config tool will be used in the sccons installation process to generate an appropriate default construction environment so that building most software works ``out of the box'' on the installed platform. The SC-config tool will find reasonable default compilers (C, C++, Fortran), linkers/loaders, library archive tools, etc. for specification in the default sccons construction environment.

Interaction with test infrastructures

sccons can be configured to use SC-test (or some other test tool) to provide controlled, automated testing of software. The Link method could link a test subdirectory to a build subdirectory:

        Link('test', 'build')
        Conscript('test/Conscript')

Any test cases checked in with the source code will be linked into the test subdirectory and executed. If Conscript files and test cases are written with this in mind, then invoking:

        % sccons test

would run all the automated test cases that depend on any changed software.

Java dependencies

Java dependencies are difficult for an external dependency-based construction tool to accommodate. Determining Java class dependencies is more complicated than the simple pattern-matching of C or C++ #include files. From the point of view of an external build tool, the Java compiler behaves ``unpredictably'' because it may create or update multiple output class files and directories as a result of its internal class dependencies.

An obvious sccons implementation would be to have the Scanner object parse output from Java -depend -verbose to calculate dependencies, but this has the distinct disadvantage of requiring two separate compiler invocations, thereby slowing down builds.

Limitations of digital signature calculation

In practice, calculating digital signatures of a file's contents is a more robust mechanism than time stamps for determining what needs building. However:

1) Developers used to the time stamp model of Make can initially find digital signatures counter-intuitive. The assumption that:

        % touch file.c

will cause a rebuild of file is strong...

2) Abstracting dependency calculation into a single digital signature loses a little information: It is no longer possible to tell (without laborious additional calculation) which input file dependency caused a rebuild of a given target file. A feature that could report, ``I'm rebuilding file X because it's out-of-date with respect to file Y,'' would be good, but a digital-signature implementation of such a feature is non-obvious.
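The first point can be demonstrated directly: updating a file's time stamp leaves a content signature unchanged. The sketch below assumes SHA-256 as the signature function and uses os.utime as the equivalent of touch.

```python
import hashlib
import os
import tempfile


def file_signature(path):
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


with tempfile.TemporaryDirectory() as top:
    path = os.path.join(top, "file.c")
    with open(path, "w") as handle:
        handle.write("int x;\n")

    before_sig = file_signature(path)
    before_mtime = os.stat(path).st_mtime

    # Equivalent of "touch file.c": move the time stamp forward by a
    # minute without changing the contents.
    os.utime(path, (before_mtime + 60, before_mtime + 60))

    after_sig = file_signature(path)
    after_mtime = os.stat(path).st_mtime

timestamp_changed = after_mtime != before_mtime
signature_changed = after_sig != before_sig
```

A time-stamp-based tool would see the file as changed, while a signature-based tool would not, which is exactly the behavior that surprises developers used to Make.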

Remote execution

The ability to use multiple build systems through remote execution of tools would be good. This should be implementable through the Job class. Construction environments would need modification to specify build systems.

Conditional builds

The ability to check run-time conditions as suggested on the sc-discuss mailing list (``build X only if: the machine is idle / the file system has Y megabytes free space'') would also be good, but is not part of the current design.


Summary

sccons offers a robust and feature-rich design for an SC-build tool. With a Build Engine based on the proven design of the Cons utility, it offers increased simplification of the user interface for unsophisticated users with the addition of the ``do-the-right-thing'' env.Make method, increased flexibility for sophisticated users with the addition of Builder and Scanner objects, a mechanism to allow tool-masters (and users) to share working construction environments, and embeddability to provide reliable dependency management in a variety of environments and interfaces.


Acknowledgements

I'm grateful to the following people for their influence, knowing or not, on the design of sccons:

Bob Sidebotham
As the original author of Cons, Bob did the real heavy lifting of creating the underlying model for dependency management and software construction, as well as implementing it in Perl. During the first years of Cons' existence, Bob did a skillful job of integrating input and code from the first users, and consequently is a source of practical wisdom and insight into the problems of real-world software construction. His continuing advice has been invaluable.

The Cons Community
The real-world build problems that the users of Cons share on the cons-discuss mailing list have informed much of the thinking that has gone into the sccons design. In particular, Rajesh Vaidheeswarran, the current maintainer of Cons, has been a very steady influence. I've also picked up valuable insight from mailing-list participants Johan Holmberg, Damien Neil, Gary Oberbrunner, Wayne Scott, and Greg Spencer.

Peter Miller
Peter has indirectly influenced two aspects of the sccons design:

Reading Miller's influential paper ``Recursive Make Considered Harmful'' was what led me, indirectly, to my involvement with Cons in the first place. Experimenting with the single-Makefile approach he describes in ``RMCH'' led me to conclude that while it worked as advertised, it was not an extensible scheme. This solidified my frustration with Make and led me to try Cons, which at its core shares the single-process, universal-DAG model of the ``RMCH'' single-Makefile technique.

The testing framework that Miller created for his Aegis change management system changed the way I approach software development by providing a framework for rigorous, repeatable testing during development. It was my success at using Aegis for personal projects that led me to begin my involvement with Cons by creating the cons-test regression suite.

Stuart Stanley
An experienced Python programmer, Stuart has provided valuable advice and insight into some of the more useful Python idioms at my disposal, and has contributed greatly to the actual usability of sccons.

Gary Holt
I don't know which came first, his first-round contest entry or the tool itself, but Gary's design for Make++ showed me that it is possible to marry the strengths of Cons-like dependency management with backwards compatibility for Makefiles. Striving to support both Makefile compatibility and a native Python interface cleaned up the sccons design immeasurably by factoring out the common elements into the Build Engine.

Last modified 2000/07/02 21:12:58.4037 US/Mountain