Click here to learn
about this Sponsor:
home  |  news  |  articles  |  polls  |  forum  |  events  |  links  |  products  |  search
Articles & white papers about Linux based embedded applications ...
Keywords: Match: Sort by:
Why ARM's EABI matters
by Andres Calderon and Nelson Castillo (Mar. 14, 2007)

Foreword: -- This article describes the floating point performance benefits of the ARM EABI (embedded application binary interface), which now has its own Debian port. The authors use an open source floating point benchmark to compare how compiling with EABI support can improve floating point performance.

Incidentally, the authors did their benchmark testing on Carlos Camargo's ECB_AT91, a two-layer, 3.3 x 3 inch (85 x 77mm) board with an open source hardware design. Enjoy . . . !



Why ARM's EABI Matters

by Andres Calderon and Nelson Castillo


It's common nowadays to hear of the new ARM EABI (embedded application binary interface) Linux port. There are many motivations to start using it, but there is one we especially like -- it's much faster for floating point operations. Since many ARM cores lack a hardware FPU (floating point unit), any software acceleration is more than welcome.

It might be hard to switch to EABI, though. For instance, for the Debian distribution, EABI is actually considered a new port.

Without EABI

The ARM EABI improves the floating point performance. This is not surprising, if you read how your processor is wasting a lot of cycles now. From the Debian ARM-EABI wiki:
The current Debian port creates hardfloat FPA instructions. FPA comes from "Floating Point Accelerator." Since the FPA floating point unit was implemented only in very few ARM cores, these days FPA instructions are emulated in kernel via Illegal instruction faults. This is of course very inefficient: about 10 times slower that -msoftfloat for a FIR test program. The FPA unit also has the peculiarity of having mixed-endian doubles, which is usually the biggest grief for ARM porters, along with structure packing issues.
So, what does this mean? It means that the compilers usually generate instructions for a piece of hardware, namely a Floating Point Unit, that is not actually there! When you make a floating point operation, such at 3.58*x, the CPU runs into an illegal instruction, and it raises an exception. The kernel catches this specific exception and performs the intended float point operation, and then resumes executing the program. And this is slow because it implies a context switch.

The benchmark

We decided to make a simple benchmark using our Open Hardware Free ECB_AT91 ARM(ARMv4t) development board, based on an Atmel AT91RM9200 processor.


The ECB_AT91, top and bottom
(Click each image to enlarge)

We used a simple benchmark we have used before: the dot product of two given vectors, the Euclidean distance of the vectors, and the FFT (fast Fourier transform) algorithm (complex valued, Cooley and Tukey radix-2). The source code we used is available here (GPL).

It's common to use the number of floating point operations per second (FLOPS) performed by a given program for benchmarking purposes. However, this can be misleading, because some operations (e.g. division) take more time than others (e.g. addition). To ensure uniformity, we ran the same program in both setups, with similar compiler flags.

First we tried the Old ABI using the Debian distribution (Debian Sid), and an image that we bootstrapped. Then, for the EABI test, we used the Angstrom Distribution, part of the OpenEmbedded project.

Results


EABI vx. OABI, floating point benchmark (Free_ECP_AT91_V1.5, AT92RM9200)
(Click to enlarge. Source: emQbit)



EABI/OABI speed-up, floating point benchmark (Free_ECB_AT91_V1.5, AT92RM9200)
(Click to enlarge)

In each context switch, both the data and instruction cache are flushed, and this hurts the Old ABI's performance. You will notice it in the graphs because the performance with the old ABI does not depend on the size (N) of the input data, whereas in EABI the impact of the cache in the performance is seen clearly. The dot-product performance only goes down when N > 4096 (When we use more than 16KB in memory); the Atmel processor we're using has a 16 Kbyte data cache.



About the authors: Andres Calderon (left) and Nelson Castillo (right) are co-founders of emQbit, an embedded hardware/software co-design company based in Columbia, South America. Both are long time Linux users. Andres has experience in high performance computing and in embedded systems design, DSP, and FPGA programming. Nelson has experience in Unix Network Programming, Linux driver programming, and embedded system programming.


Comments? Questions?

If you have questions or comments about this article, Talkback here






Related Stories:



Got a HOT tip?   please tell us!
Free weekly newsletter
Enter your email...
Click here for a profile of each sponsor:
Platinum Sponsors:
(Become a sponsor)
Gold Sponsors:
(Become a sponsor)

(Advertise here)

Reference Guides
•  Embedded Linux distros
•  Real-time Linux
•  Embedded graphics
•  Embedded Java
•  Linux Devices Showcase
•  Little Linux systems
•  Single board computers
•  Embedded processors
•  Embedded Linux books

2006 Embedded Linux market snapshot:
Check out the latest Linux-powered...

mobile phones!

other cool
gadgets

...in our 2007 Embedded Linux Market Survey

BREAKING NEWS

• 11 from IBM: SPUs, RHEL4, Emacs, XQuery, Acegi, GEF...
• DIY embedded Linux service deemed "disruptive"
• Tux Droid... cool toy, or Tuxploitation?
• Tiny Linux SBC steps up to PXA270
• Industrial PC captures multi-channel video
• 2D barcodes link Linux devices to online info
• Dual core rail-mount industrial PC runs Linux
• Embedded graphics toolkit gains features
• Analysts, legal eagles dissect new GPLv3 draft
• Linux-friendly microkernel embraces PowerPC 440
• Driver dev kit supports recent Linux kernels
• Enea acquires Swedish Linux expertise
• Startup hopes to redefine smartphone Web content access
• AMD brings Linux to East Africans
• Telematics platform gains portable UI support


Don't miss the...

"Great Gadget Smack-Down"

Linux-Watch headlines:
• Red Hat's gross income grows, net disappoints
• Analysts, legal eagles dissect new GPLv3 draft
• RIP: Community Linux (1991-2007)
• GPLv3 draft 3 arrives, bans patent pacts
• How late could the GPLv3 be?
• Linux kernel gains paravirtualization option
• Linux Foundation names board of directors
• Oracle joins Linux patent commons
• FSF revises GPLv3 release schedule
• Clearing up anti-GPL3 FUD



CAREER CENTER
Search more than 60,000 tech jobs. Search by keywords, skill, job title, and location.
Check it out today!




"Website of the Year"




SUBMISSIONS...
Want to submit news, events, articles, resource links, product listings? Contact us here.

Also visit our sister site:


Sign up for LinuxDevices.com's...

news feed

home  |  news  |  articles  |  polls  |  forum  |  events  |  links  |  products  |  search  |  aboutus  |  contactus
 
Use of this site is governed by our Terms of Use and Privacy Policy. Except where otherwise specified, the contents of this site are copyright © 1999-2007 Ziff Davis Publishing Holdings Inc. All Rights Reserved. Reproduction in whole or in part without permission is prohibited. Linux is a registered trademark of Linus Torvalds. All other marks are the property of their respective owners.