Following Instructions

by John Fruehe

Men don’t like to follow instructions. If you don’t believe me, come down to my garage and I will show you a box called “the extra parts box” which has all the leftover parts from my projects. I once heard a quote that “the instructions are only the manufacturer’s opinion about how something should be put together.” My wife believes that men think this way.

But instructions are something totally different when you start talking about processors and instruction sets. As a follow-up to the Flex FP blog, I thought I would discuss some of the things that we mentioned there.

There are several new instructions supported with our “Bulldozer” core, which is due out next year. It’s kind of an alphabet soup – SSSE3, SSE4.1, SSE4.2, AES-NI, PCLMULQDQ, AVX, XOP, and FMA4. When you see this list of new instructions, you are probably thinking “well, that means I am going to have to change my software.” Well, yes and no.

To make use of the functionality of these instructions you will need software that supports them. This can be achieved with software that was written to directly call these instructions – like operating systems, development tools, system-level utilities and even specialized applications. Or it could be your own software that has been developed using compilers and libraries that support these instructions.
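
One way to see which of these extensions a given machine reports is to look at the CPU feature flags that the operating system exposes. The following is a minimal Python sketch, assuming a Linux system, where /proc/cpuinfo lists flags under the kernel's own spellings (for example sse4_1 for SSE4.1 and aes for AES-NI). The function name and the sample flags line are hypothetical and for illustration only.

```python
# Map the extension names used in this post to the flag spellings
# that the Linux kernel prints in /proc/cpuinfo.
LINUX_FLAG_NAMES = {
    "SSSE3": "ssse3",
    "SSE4.1": "sse4_1",
    "SSE4.2": "sse4_2",
    "AES-NI": "aes",
    "PCLMULQDQ": "pclmulqdq",
    "AVX": "avx",
    "XOP": "xop",
    "FMA4": "fma4",
}


def supported_extensions(flags_line):
    """Return {extension name: bool} for one 'flags : ...' line."""
    # Everything after the first colon is a space-separated list of flags.
    _, _, flag_text = flags_line.partition(":")
    present = set(flag_text.split())
    return {name: flag in present for name, flag in LINUX_FLAG_NAMES.items()}


# Hypothetical, shortened flags line (not taken from real hardware).
sample = "flags : fpu sse sse2 ssse3 sse4_1 sse4_2 aes pclmulqdq avx"
report = supported_extensions(sample)
for name, ok in report.items():
    print(f"{name:10s} {'yes' if ok else 'no'}")
```

On a real Linux machine, the same function could be given the line from /proc/cpuinfo that starts with "flags". A reported flag only says the hardware offers the instructions; the software still has to be built or written to use them.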

Some of the instructions supported in “Bulldozer” are new to AMD processors but have been out in the market for some time. These include SSSE3, SSE4.1, and SSE4.2, which together provide over 75 different extensions for speeding up applications as diverse as video rendering, virus scanning and XML parsing, making them valuable for specialized software used for media handling, cloud computing, and streaming. Because hardware that supports these instructions has been available for some time, software support already exists. For example, the compilers most widely used by developers – Microsoft’s Visual Studio compilers and, for Linux®, the GCC compilers – support these instructions.

Two additional instruction sets, AES-NI and PCLMULQDQ, are tied to encryption and provide hardware acceleration of certain security algorithms. Hardware that supports these instructions is just beginning to appear on the market, which is triggering initial software development efforts. One example of widely deployed software is Windows 7, which supports AES-NI via its cryptography APIs. Generally, the companies that develop security software are careful about introducing new instructions and functionality due to the nature of their products. And like most software developers, they want to have supporting hardware widely available to ensure a solid return on their development work. We should see a rich ecosystem of software with this functionality by the time we launch products based on our “Bulldozer” core next year.
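
As an illustration of what one of these instructions does: PCLMULQDQ performs a carry-less multiplication of two 64-bit values, which is a building block of AES-GCM authentication. The pure-Python sketch below is only a model of the arithmetic (the name clmul is made up for this example); real software would reach the hardware instruction through a compiler intrinsic or a cryptography library.

```python
MASK64 = (1 << 64) - 1


def clmul(a, b):
    """Carry-less multiply of two 64-bit integers (model only)."""
    a &= MASK64
    b &= MASK64
    result = 0
    while b:
        if b & 1:
            result ^= a  # XOR instead of add, so no carries propagate
        a <<= 1
        b >>= 1
    return result


print(bin(clmul(0b11, 0b11)))  # 0b101, whereas ordinary 3 * 3 is 0b1001
```

The loop above takes up to 64 iterations per multiply, which gives a sense of why a single hardware instruction for the same operation is attractive to security software.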

We have talked in the past about the AVX instructions, which provide a big benefit in terms of the ability to execute floating point code in 256-bit pieces. Development tools used by the HPC community have support for AVX in progress, so you will be able to recompile your code with an AVX-capable compiler, like x86 Open64, PGI 2010, Visual Studio 2010, or GCC, to gain AVX functionality. Keep in mind that AMD is implementing AVX in the same manner as our competitor. That means any application that supports AVX will work the same on both of our platforms. I am sure that you can appreciate that decision – more consistency in software code is always a good thing for the ecosystem.
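
To put the 256-bit figure in perspective, the sketch below counts how many floating point values fit in one register and how many vector operations an idealized loop would need. It is back-of-the-envelope arithmetic only: the helper names are hypothetical, and real speedups depend on memory bandwidth, the hardware implementation, and how well the compiler vectorizes the code.

```python
import math


def lanes(register_bits, element_bits):
    """Number of elements that fit in one vector register."""
    return register_bits // element_bits


def vector_ops(n_elements, register_bits, element_bits):
    """Idealized count of vector operations to touch n_elements once."""
    return math.ceil(n_elements / lanes(register_bits, element_bits))


for name, bits in (("128-bit SSE", 128), ("256-bit AVX", 256)):
    print(name,
          "| floats per register:", lanes(bits, 32),
          "| doubles per register:", lanes(bits, 64),
          "| ops for 1000 doubles:", vector_ops(1000, bits, 64))
```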

“Bulldozer” will also have support for XOP and FMA4. Here is where AMD is striking out in a leadership role. We have added this functionality to support a wide variety of numeric-intensive, multimedia, and cryptographic applications, and to allow some new cases of automatic vectorization by compilers. While our competitor has not yet agreed to support these instructions, we feel we need to implement them to enable software to get the very most out of our new hardware. These instructions will help put our newest set of products out ahead of the competition from an instruction standpoint. XOP and FMA4, when combined with AVX, create a set of instructions similar to what AMD had originally proposed with SSE5. More on that (along with a discussion of FMA3) in a later blog.
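
For readers unfamiliar with fused multiply-add: an FMA instruction computes a*b + c with a single rounding at the end, and FMA4 takes four operands (a destination plus three sources), so no source register is overwritten. The Python sketch below only models the rounding behavior, using exact rational arithmetic from the standard library; the function name is hypothetical and nothing here calls the hardware instruction.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """Model of FMA: compute a*b + c exactly, then round once to a float."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

separate = a * b + c                 # product is rounded before the add
fused = fused_multiply_add(a, b, c)  # one rounding at the very end

print(separate)  # 0.0: the tiny term was lost when a*b was rounded
print(fused)     # a small negative number, exactly -2**-60
```

Beyond saving an instruction per multiply-add pair, keeping the intermediate product unrounded is one reason numeric-intensive code can benefit from FMA.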

The net of this discussion is that while we are introducing new instructions with “Bulldozer,” we expect few issues around software. Some of the instructions are already integrated into existing code bases, and we went out of our way to try to minimize the impact these instructions will have on you. To make use of these instructions you will need hardware and software that offer support for them. And what many may not grasp is that these instructions are being actively integrated into OSs and compilers, and are often invisible to the end user.

For a more detailed description of these instructions and how they will be implemented, be sure to check out David Christie’s blog.

John Fruehe is the Director of Product Marketing for Server, Embedded and FireStream products at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.

6 Comments

  • Pingback: Bulldozer Instructions

  • Pingback: [JF blog] AMD new Instructions - Overclock.net - Overclocking.net

  • Pingback: CHW » Los juegos de instrucciones soportados por AMD Bulldozer

  • Pingback: New Processors Will Mean New AVX Instructions | insideHPC.com

  • mehmet November 24, 2010

    I want to ask some questions about Bulldozer processors.

    The main question:

    On Athlon 64 processors, the second core gave an 80% performance increase.

    As far as I know, Phenom processors did not improve on this scaling.

    So on Phenom processors, the second core adds 80%, the third core 40%, and the fourth core 20%.

    Will Bulldozer processors improve on this scaling compared with Athlon 64 processors?

    If so, how much of an improvement will there be?

    As far as I have read in the tech news, Bulldozer processors will perform only 30% better than Phenom processors.

    Is this true?

    If true, the difference is slight (a substandard and unsatisfactory improvement).

    • John Fruehe November 24, 2010

      All of your math is way off. I recommend waiting until the actual product is launched. I don’t know where you got the 30% performance increase number, because we have not released any performance estimates for the client products. If the tech news is assigning performance increases to products that they have not worked with yet (and that are based on a completely different architecture that they do not know all the details of), then those are merely guesses and should be treated that way.
