UQBT - A Resourceable and Retargetable Binary Translator

Cristina Cifuentes, Mike Van Emmerik, The University of Queensland
Norman Ramsey, Harvard University.
 
 
Group photo
The UQBT team (June 1999)


News Flash! The source code for UQBT will be released soon! Watch for details!

Table of Contents

Funding support UQ UniNews articles on UQBT-related topics

What is Binary Translation?

Binary translation is the process of automatically translating a binary executable program from one machine to another. This process normally involves different machines, Mi, different operating systems, OSi, and different binary-file formats, BFFi, in order to translate programs from (M1,OS1,BFF1) to (M2,OS2,BFF2).

We are developing general, platform-independent techniques for binary translation based on descriptions of machines and operating system-related issues. Such techniques are valuable to any software developer that wants to port legacy software to a new machine. Our aim is to develop tools that will make software available quickly on new machines, without requiring source code or re-programming, and at low cost, by reuse of machine descriptions.

Our project is constrained to translations across operating systems that are the same or similar enough, such as Solaris and Linux, i.e. OS1 ~= OS2, in order to concentrate on analyses of machine instructions and their semantics, rather than API calls. When the operating systems are not the same, i.e. OS1 != OS2, there is a lot of tedious work translating API (Application Programmer Interface) calls. Cross operating system translation issues are addressed by projects such as wine and to a lesser extent, freeMWare.

Resourceable and Retargetable Binary Translation - An Overview

Like a compiler, a binary translator can be loosely divided into front end, analyzer and optimizer, and back end. The front end decodes a source-machine binary file, produces RTLs, and then lifts the level of abstraction to HRTL (the high-level intermediate representation) by using knowledge of the source-machine calling conventions and instruction set. The analyzer and optimizer map from source-machine locations to target-machine locations, and it may apply other machine-specific optimizations to prepare for the back end. The back end translates the intermediate HRTL to target-machine instructions, and it writes a binary file in the required format.

Compilers and other tools have traditionally been called retargetable when they can support multiple target machines at low cost. By extension, we call a binary-analysis tool resourceable if it can analyze binaries from multiple source machines at low cost. Retargetability is supported in the UQBT framework through specifications of properties of machines and operating system conventions. Figure 1 illustrates the components of the translation process in the form of boxes. Specifications for different machines are illustrated with the document shape. Blue shapes mean that they are not currently supported by UQBT and therefore an implementation for a particular format or analysis has been done instead. Arrows represent conceptual flow of control in the translation process.

Translation framework Figure 1: Resourceable and Retargetable Binary Translation Framework.


The UQBT framework is currently implemented reusing other's backends, hence we generate low-level C code and interface to GNU's gcc, Sun's cc and Jack Davidson's VPO. This shortcut is done as we don't have the resources to build yet another optimizer using standard compilet technology. Figure 2 illustrates this process.

UQBT implementation Figure 2: Current UQBT Implementation.


The UQBT framework generates fat binaries in the form described in Figure 3.

Fat binary format
Figure 3: Generated Fat Target Binary.


Specification languages

UQBT uses a variety of descriptions of properties of machine instructions and operating systems conventions.

The characteristics of a machine's instruction set are its syntax (i.e. which bits match which assembly instructions), its semantics (i.e. what a particular assembly instruction mean), and its control transfer instructions (i.e. which instructions change the flow of control of a program and how). Machines that exhibit delayed transfers of control can specify this fact as well. These characteristics have been specified in a variety of languages for historical reasons, but could be specified in the one language. The following are the implementation languages that UQBT uses:

Conventions and formats set by an operating system come in the form of calling conventions, including where parameters are passed (e.g. stack or registers), descriptions of a procedure's stack frame and where to find local variables, and the format of the binary file that the OS expects. These pieces of information are represented by the following implementation languages:

We also use specialized analysis or APIs for other parts of the translator, such as:


Demos

The static UQBT translation framework is at the stage where simple/small programs can be translated successfully. A full implementation of the framework is expected sometime :-)

In the mean time, we show two examples of translations between Sparc binaries and Pentium binaries (one in each direction). Sources and targets are included below for the fibonacci and the knights programs.
 

Fibonacci

Here is a very simple program to calculate Fibbonaci numbers:
#include <stdio.h>

int fib (int x)
{
        if (x > 1)
                return (fib(x - 1) + fib(x - 2));
        else return (x);
}

int main (void)
{       int number, value;

        printf ("Input number: ");
        scanf ("%d", &number);
        value = fib(number);
        printf("fibonacci(%d) = %d\n", number, value);
        return (0);
}

The original Pentium Solaris program is here; you won't be able to run it unless you have a Pentium Solaris system. Here is a disassembly of the fib function:
 
fib()

fib()
8048960:  55                  pushl  %ebp
8048961:  8b ec               movl   %esp,%ebp
8048963:  53                  pushl  %ebx
8048964:  83 7d 08 01         cmpl   $0x1,8(%ebp)
8048968:  7e 2e               jle    0x2e <8048998>
804896a:  8b 45 08            movl   8(%ebp),%eax
804896d:  48                  decl   %eax
804896e:  50                  pushl  %eax
804896f:  e8 ec ff ff ff      call   0xffffffec <fib>
8048974:  83 c4 04            addl   $0x4,%esp
8048977:  8b d8               movl   %eax,%ebx
8048979:  8b 45 08            movl   8(%ebp),%eax
804897c:  05 fe ff ff ff      addl   $0xfffffffe,%eax
8048981:  50                  pushl  %eax
8048982:  e8 d9 ff ff ff      call   0xffffffd9 <fib>
8048987:  83 c4 04            addl   $0x4,%esp
804898a:  8b c0               movl   %eax,%eax
804898c:  8d 14 18            leal   (%eax,%ebx),%edx
804898f:  8b c2               movl   %edx,%eax
8048991:  eb 0d               jmp    0xd <80489a0>
8048993:  90                  nop    
8048994:  eb 0a               jmp    0xa <80489a0>
8048996:  90                  nop    
8048997:  90                  nop    
8048998:  8b 55 08            movl   8(%ebp),%edx
804899b:  8b c2               movl   %edx,%eax
804899d:  eb 01               jmp    0x1 <80489a0>
804899f:  90                  nop    
80489a0:  8b 5d fc            movl   -4(%ebp),%ebx
80489a3:  c9                  leave  
80489a4:  c3                  ret    


If you know a bit about Pentium assembler, you can see the standard function prologue and epilogue (in blue) which are Pentium specific (they set up the stack frame by manipilating the esp and ebp registers), that the parameters are passed on the stack (the purple push instructions), and that  there are some redundant jumps and no-operation instructions (orange instructions). After each call, the stack pointer (esp) is restored with the green instructions.

The UQBT static translated version of fibo is here; it's a Sparc Solaris binary. Here is the translated version of function fib:

fib()
8050b54:  9d e3 bf 90      save         %sp, -112, %sp
8050b58:  80 a6 20 01      cmp          %i0, 1
8050b5c:  04 80 00 08      ble          0x8050b7c
8050b60:  01 00 00 00      nop     
8050b64:  7f ff ff fc      call         fib
8050b68:  90 06 3f ff      add          %i0, -1, %o0
8050b6c:  a0 10 00 08      mov          %o0, %l0
8050b70:  7f ff ff f9      call         fib
8050b74:  90 06 3f fe      add          %i0, -2, %o0
8050b78:  b0 02 00 10      add          %o0, %l0, %i0
8050b7c:  81 c7 e0 08      ret     
8050b80:  81 e8 00 00      restore
You can immediately see it's quite different - for one thing it's a lot shorter! The original binary program was not well optimised, so there are some savings there. The Pentium prologue and epilogue are gone, replaced by special Sparc save and restore instructions. These set up Sparc's register windows (which don't exist on Pentiums, or on most if not all other architectures). The green instructions restoring the stack pointer are gone. Function arguments are no longer pushed to the stack; instead they are placed in registers (here, only register %o0 is used). Some instructions have been "moved to delay slots" (e.g. the add instructions performing "x-1" and "x-2" occur after the calls that use them!). Again, this is a property of the Sparc system that the Pentium does not have, although several RISC machines have delay slots.

It's hard to believe that it's the same program, but the semantics of the original program have been preserved, using different instructions to perform the same tasks.

Knights


For a larger example, we compiled an ncurses demo program (slightly modified) that plays the Knight's Tour game (where you try to visit every square on a chess board exactly once using only knight moves). Here are the Sparc source and Linux target binary files. Below is a screen capture of the translated program in action:

                    KNIGHT'S MOVE -- a logical solitaire

+---+---+---+---+---+---+---+---+  Possible moves are shown with `-'.        
|###|   |   |   |   |   |   |   |  
+---+---+---+---+---+---+---+---+  You can move around with the arrow keys or
|   |   |   |   |   |   |   |   |  with the rogue/hack movement keys.  Other
+---+---+---+---+---+---+---+---+  commands allow you to undo moves or redraw.
|   |###|   | - |   |   |   |   |  Your mouse may work; try left-button to   
+---+---+---+---+---+---+---+---+  move to the square under the pointer.
| - |   |   |   | - |   |   |   |
+---+---+---+---+---+---+---+---+  x,q -- exit             y k u    7 8 9   
|   |   |#X#|   |   |   |   |   |  r -- redraw screen       \|/      \|/   
+---+---+---+---+---+---+---+---+  ^h, bsp -- undo move    h-+-l    4-+-6  
| - |   |   |   | - |   |   |   |                           /|\      /|\ 
+---+---+---+---+---+---+---+---+                          b j n    1 2 3
|   | - |   | - |   |   |   |   |
+---+---+---+---+---+---+---+---+  You can place your knight on the selected
|   |   |   |   |   |   |   |   |  square with spacebar, Enter, or the keypad
+---+---+---+---+---+---+---+---+  center key.  You can quit with `x' or `q'.


                                   Press `?' to go to game explanation

The program has a few bugs (some original, some caused by the translation), but it demonstrates the potential for the UQBT static translator.


Preliminary Results

UQBT is written in C++ and compiles using gcc on Solaris and Linux systems. It uses SLED and SSL descriptions for SPARC and Pentium, and PAL descriptions for SPARC and Pentium Unix System V ABI calling conventions, which are used under Solaris. UQBT uses gcc or cc as optimizing backends. We have extended gcc's backend to generate bytecode for the JVM. We have several translators in place, we present results for the following: A SPARC to SPARC and a Pentium to Pentium translators are useful to test the adequacy of the internal representation, as translated programs should not slow down when translating from machine M to machine M using the same optimizer compiler. The test programs are: For all programs, we measured the time in seconds to execute the program on the target machine and compared that to the time measurement produced by a native compiler on that target machine; this allows us to see the quality of the translation. Each test program also lists on the second row the size in bytes of the executable file for comparison purposes. SPARC results were obtained on an UltraSPARC II, 250MHz machine with 320Mb RAM running Solaris 2.6. Pentium results were obtained on a Pentium MMX, 250 MHz machine with 128Mb RAM running Solaris 2.6. The source binary programs (input to the translator) were all compiled with gcc 2.8.1 -O4. Translated code programs used two different optimizing C compilers; gcc 2.8.1 and cc 4.2, on both SPARC and Pentium machines. Native code for the target machine was compiled using gcc 2.8.1 with -O0 and -O4 options, on both SPARC and Pentium. The reported results are as at October 1999.
Translated Code Native Code
Program gcc opt cc opt -O0 -O4
Fibo(40) sec 18.2 21.3 41.0 23.0
bytes 24,924 6,700 24,628 24,564
Sieve(3000) sec 23.7 24.1 29.3 24.5
bytes 24,732 6,316 24,552 24,452
Mbanner(500K) sec 25.8 22.2 63.7 26.6
bytes 30,500 12,248 30,652 30,268
Static SPARC to SPARC Translation

Tranlated Code Native Code
Program gcc opt cc opt -O0 -O4
Fibo(40) sec 23.0 24.3 41.0 23.0
bytes 24,916 6,680 24,628 24,564
Sieve(3000) sec 26.9 23.9 29.3 24.5
bytes 24,776 6,312 24,552 24,452
Mbanner(500K) sec 53.3 36.9 63.7 26.6
bytes 34,188 21,448 30,652 30,268
Static Pentium to SPARC Translation

Translated Code Native Code
Program gcc opt cc opt -O0 -O4
Fibo(40) sec 27.7 28.5 28.6 25.9
bytes 16,512 7,292 16,144 16,152
Sieve(3000) sec 17.8 17.4 18.9 18.6
bytes 16,244 6,548 15,964 15,944
Mbanner(500K) sec 42.5 n/a 80.5 44.8
bytes 22,240 21,524 25,436
Static SPARC to Pentium Translation

Translated Code Native Code
Program gcc opt cc opt -O0 -O4
Fibo(40) sec 25.8 24.5 28.6 25.9
bytes 16,496 7,268 16,144 16,152
Sieve(3000) sec 18.6 17.1 18.9 18.6
bytes 16,228 6,536 15,964 15,944
Mbanner(500K) sec 48.7 46.5 80.5 44.8
bytes 25,664 16,016 21,524 25,436
Static Pentium to Pentium Translation

Translated Code Native Code
Program gcc opt cc opt -O0 -O4
Fibo(40) sec 421.64 58.02 41.0 23.0
bytes 739 739 24,628 24,565
Sieve(3000) sec 103.66 20.52 29.3 24.5
bytes 677 677 24,552 24,452
Static SPARC to Java Bytecode Translation


Java Backend for GCC

Some of our work with UQBT has involved translating from binary files to Java bytecode (.class) files. One of the ways that we do this is with a port of the GNU GCC compiler to the Java Virtual Machine (JVM). Gcc emits assembler code (usually that fact is hidden from the user); our port uses Jasmin as the "Java assembler". This software is highly experimental, but adventurous readers can download the software here as a gzipped tar file (245K).

Note: egcs 1.1.2 source code and Jasmin are required, and possibly many other packages; see the README files in the tar file.

Limited support is available by emailing the author, Trent Waddington, at s337240@student.uq.edu.au.


Publications

The documents distributed by this server have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.
Last updated: 2 January 2001

cristina@csee.uq.edu.au

This page: http://www.csee.uq.edu.au/csm/uqbt.html