

# **XLP832 Processor**

## Multi-Core, Multi-Thread Superscalar Communication Processor

# **Overview**

The NetLogic Microsystems XLP<sup>™</sup> processor family uses a highly scalable design that incorporates key functions of a high-end communication system, including wired and wireless security, networking, storage, data center acceleration, load balancing, and other acceleration engines. The XLP processor is a third-generation architectural enhancement to NetLogic Microsystems' industry-leading multi-core, multi-threaded XLR<sup>®</sup> processor family. The XLP processor family is designed using 40-nm technology and offers processor core frequencies from 500 MHz to 2 GHz, providing a greater than 2X performance-per-watt improvement over its XLR predecessor. The XLP processor family is software backward-compatible with the XLR and XLS<sup>®</sup> processor families.

## **EC4400 Processor Cores**

The XLP832 processor has eight EC4400 processor cores that provide optimum performance for both data plane and control plane applications. Each core includes field-proven multi-threading capabilities to provide the highest possible performance for throughput-oriented data plane processing. Each EC4400 core features a superscalar engine with out-of-order execution capabilities, and combines quad-issue instruction scheduling with simultaneous 4-way multi-threading. These features enable new classes of systems with uncompromised performance in a single-chip solution. Each of the four VirtuCore<sup>™</sup> virtual threads embedded in an EC4400 core appears to software as a completely separate processing element, enabling extremely flexible software architectures that simultaneously simplify software development and increase overall system performance. Each EC4400 core is MIPS64 Release-II ISA-compliant and contains an IEEE754 and MIPS-compliant floating point unit. By combining architectural improvements and frequency enhancements, the EC4400 enables the XLP processor to deliver a greater than 2X performance-per-watt improvement over NetLogic Microsystems' performance-leading XLR processor product offering.

## **Processor Cache Architecture**

The XLP832 processor contains a MOESI+ coherent, three-level cache architecture. Each of the eight EC4400 cores contains a dedicated 64-KByte instruction cache, a 32-KByte L1 data cache, and a 512-KByte 8-way set-associative L2 cache. The cores also share access to an 8-bank, 16-way set-associative 8-MByte L3 cache, providing a total of more than 12 MBytes of cached data on the XLP832 processor.
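The "more than 12 MBytes" total can be checked by adding up the per-core and shared caches. The short sketch below uses only the sizes quoted in the paragraph above; the constant and function names are illustrative and not part of any NetLogic software.

```python
# Cache sizes in KBytes, taken from the paragraph above.
L1_INSTRUCTION_KB = 64      # per core
L1_DATA_KB = 32             # per core
L2_KB = 512                 # per core
L3_SHARED_KB = 8 * 1024     # shared by all cores (8 MBytes)
CORES = 8


def total_cache_kb(cores=CORES):
    """Sum the dedicated per-core caches and the shared L3 cache."""
    per_core_kb = L1_INSTRUCTION_KB + L1_DATA_KB + L2_KB
    return cores * per_core_kb + L3_SHARED_KB


total_kb = total_cache_kb()
print(f"{total_kb} KBytes = {total_kb / 1024:.2f} MBytes")
```

With eight cores this gives 13056 KBytes, or 12.75 MBytes, which is consistent with the figure quoted above.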

## **Memory Subsystem**

The XLP832 processor's high-performance memory subsystem contains four on-chip DDR3 memory controllers with 51.2 Gigabytes/sec of bandwidth.
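The 51.2 Gigabytes/sec figure matches the peak transfer rate of four DDR3-1600 channels. The sketch below is a back-of-envelope check rather than a vendor formula. It assumes that 64 of the 72 bits on each controller carry data, with the remaining 8 bits used for ECC, and that DDR-1600 means 1600 million transfers per second. The 72-bit width and DDR-1600 rating come from the product specification list; the 64/8 split is an assumption.

```python
def ddr_peak_bandwidth_gbytes(controllers, data_bits, mega_transfers_per_s):
    """Peak bandwidth in GBytes/sec for a set of identical DDR channels."""
    bytes_per_transfer = data_bits // 8
    bytes_per_second = controllers * bytes_per_transfer * mega_transfers_per_s * 1e6
    return bytes_per_second / 1e9


# Assumption: 64 data bits per 72-bit controller (8 bits of ECC).
peak = ddr_peak_bandwidth_gbytes(controllers=4, data_bits=64, mega_transfers_per_s=1600)
print(f"{peak:.1f} GBytes/sec")
```

This is a theoretical peak; sustained bandwidth in a real system depends on access patterns and memory timing.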

## **Fast Messaging Network<sup>™</sup>**

A low-latency, high-speed Fast Messaging Network<sup>™</sup> (FMN) system allows for non-intrusive internal communication and control messaging among VirtuCore threads, acceleration engines, and I/O. The FMN enables inter-unit communication without the need for spin-locks or semaphores. By passing control descriptors, the FMN also permits lockless simultaneous access to peripheral devices, which dramatically simplifies and increases the performance of the associated device drivers.
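The idea of passing descriptors instead of sharing locked state can be illustrated with a small software analogy. The sketch below is a toy model only. The `Descriptor` and `Station` names are hypothetical, and it does not represent the FMN hardware, its message format, or any NetLogic API. It shows the ownership rule that makes locks unnecessary: whoever holds the descriptor owns the buffer it refers to.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Descriptor:
    """Hypothetical control message referring to a packet buffer."""
    buffer_id: int   # which buffer this message refers to
    length: int      # payload length in bytes


class Station:
    """Hypothetical message endpoint: a thread, an engine, or an I/O block."""

    def __init__(self, name):
        self.name = name
        self.inbox = deque()  # only this station removes entries from its inbox

    def send(self, dest, desc):
        # Handing over the descriptor transfers ownership of the buffer;
        # the sender must not touch that buffer after this point.
        dest.inbox.append(desc)

    def receive(self):
        """Return the oldest pending descriptor, or None if the inbox is empty."""
        return self.inbox.popleft() if self.inbox else None


core_thread = Station("thread0")
engine = Station("security-engine")

core_thread.send(engine, Descriptor(buffer_id=7, length=64))  # hand off work
work = engine.receive()                                       # engine now owns buffer 7
engine.send(core_thread, work)                                # return the result
done = core_thread.receive()
print(done)
```

Because each buffer has exactly one owner at any time, neither side needs a spin-lock or semaphore around the buffer itself.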

## **Acceleration Engines**

The XLP832 processor contains numerous Autonomous Acceleration Engine<sup>®</sup> modules that offload processing tasks from the EC4400 cores, thus freeing up the cores to perform other compute-intensive application-dependent tasks:

- A robust Autonomous Network Acceleration Engine® module supports up to 40 Gbps of packet throughput. Features include a programmable packet parsing engine, FCoE, iSCSI and SCTP checksum/CRC generation and verification, TCP/UDP/IP checksum on both ingress and egress, TCP segmentation offload, and IEEE1588v2 precision timing protocol support.
- A Packet Ordering Engine (POE) supports packet ordering for up to 64K flows. The POE can handle up to 60 million packets per second, which corresponds to 40 Gbps with 64-byte packets.
- An Autonomous Security Acceleration Engine® module with 40 Gbps of bandwidth.
- A 10 Gbps compression/decompression engine.
- An 8-channel DMA and Storage Acceleration Engine with RAID-5 XOR acceleration, RAID-6 P+Q Galois computations, and deduplication acceleration hardware assistance.
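For readers unfamiliar with the RAID computations named in the last item, the following software sketch shows what a RAID-5 XOR parity (P) and a RAID-6 Galois-field syndrome (Q) look like. It assumes the field polynomial 0x11D and generator 2, a common convention in software RAID-6. The source does not state which parameters the hardware uses, and all function names here are illustrative.

```python
def gf_mul2(x):
    """Multiply a byte by 2 in GF(2^8), reducing by the polynomial 0x11D."""
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
    return x & 0xFF


def pq_syndromes(blocks):
    """Return (P, Q) for a list of equal-length data blocks.

    P is the bytewise XOR of all blocks. Q is the sum of g^i * D_i with
    g = 2, evaluated with Horner's rule from the last block to the first.
    """
    size = len(blocks[0])
    p = bytearray(size)
    q = bytearray(size)
    for block in reversed(blocks):
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] = gf_mul2(q[j]) ^ byte
    return bytes(p), bytes(q)


def recover_with_p(surviving_blocks, p):
    """Rebuild a single missing data block from P and the surviving blocks."""
    missing = bytearray(p)
    for block in surviving_blocks:
        for j, byte in enumerate(block):
            missing[j] ^= byte
    return bytes(missing)


data = [b"\x01\x02", b"\x10\x20", b"\x80\xff"]
p, q = pq_syndromes(data)
rebuilt = recover_with_p([data[0], data[2]], p)  # pretend block 1 was lost
print(p.hex(), q.hex(), rebuilt == data[1])
```

Recovering from two simultaneous block failures additionally requires Q and Galois-field division, which is omitted here for brevity.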

## **Interchip Cache Coherency**

Three Interchip Coherency Interfaces seamlessly interconnect up to four XLP832 processors. Each interface has 80 Gbps of full-duplex bandwidth. These interfaces are fully software transparent. Hardware manages the chip-to-chip coherency, message passing between threads, and the sharing of memory and I/O resources.

# **Product Specification**

## **Next Generation Processor Cores**

- 8 cores, each quad-issue with 4-way simultaneous multi-threading
- 64-bit MIPS64 Release-II ISA with enhanced instructions
- Out-of-order processing
- IEEE754-compliant floating point unit per core
- 500 MHz to 2.0 GHz core frequency
- Enhanced TLB support with hardware page table walker

## **Cache Subsystem**

- Fully cache-coherent MOESI+ protocol
- 32-KByte 2-way write-through L1 data cache per core
- 64-KByte 2-way L1 instruction cache per core
- 512-KByte 8-way write-back L2 cache per core
- Shared 8-MByte 16-way write-back L3 cache with simultaneous access

## **High Performance Memory Controller**

- Four 72-bit DDR3 memory controllers supporting DDR-1600
- 51.2 GBps memory bandwidth available

## **Cache Coherent Scalability**

- Three high-speed low-latency Interchip Coherency Interfaces per chip provide seamless connection between up to four processors
- Fully software transparent with hardware management for message passing, chip-to-chip coherency, and global shared memory and I/O resources

## **High Speed Distributed Interconnects**

- Bi-directional Memory Distributed Interconnect® bus connects both cores to memory system
- I/O Distributed Interconnect® bus moves data among I/O interfaces, memory and CPUs automatically
- Fast Messaging Network system non-intrusively passes packet descriptors and control messages among CPUs, acceleration engines and I/O

## **Autonomous Network Acceleration Engine® Module**

- 40 Gbps packet throughput (with 64-byte packets)
- TCP/UDP/IP checksum for both ingress and egress
- Generation and verification of checksum/CRCs used by iSCSI, FCoE, and SCTP protocols
- TCP segmentation offload
- Egress scheduling support for QoS applications
- Highly programmable (micro-coded) packet parsing engine
- IEEE1588v2-compliant Precision Timing Protocol (PTP) controller
- Lossless Ethernet support for Data Center applications
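The 40 Gbps throughput figure for 64-byte packets can be related to a packet rate with a short calculation. The sketch below assumes standard Ethernet framing, where each frame is accompanied on the wire by an 8-byte preamble and start-of-frame delimiter plus a 12-byte minimum inter-frame gap. The source does not say how its packet-rate figure of up to 60 million packets per second was derived, so this is one plausible reading.

```python
# Assumption: standard Ethernet per-frame overhead on the wire.
ETHERNET_OVERHEAD_BYTES = 8 + 12  # preamble/SFD + minimum inter-frame gap


def packets_per_second(line_rate_bps, frame_bytes,
                       overhead_bytes=ETHERNET_OVERHEAD_BYTES):
    """Maximum frame rate for a given line rate and frame size."""
    bits_on_wire = (frame_bytes + overhead_bytes) * 8
    return line_rate_bps / bits_on_wire


rate = packets_per_second(40e9, 64)
print(f"{rate / 1e6:.1f} Mpps")
```

Under these assumptions a 40 Gbps line carries roughly 59.5 million 64-byte packets per second, in line with the "up to 60 million" figure quoted in the overview.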

## **Hardware Packet Ordering Engine**

- Ensures packets transmit in the same order as received within a single flow while enabling simultaneous processing across multiple threads
- Support for 64K flows
- Optional modes can impose single-packet-per-flow processing or bind a flow to a single VirtuCore or acceleration engine
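The ordering guarantee described above can be pictured with a small software model: packets in a flow receive a sequence number on arrival, may finish processing in any order, and are released for transmit only when every earlier packet in the same flow has been released. The `FlowReorderer` class below is a hypothetical illustration of that behaviour and says nothing about how the hardware engine is implemented.

```python
from collections import defaultdict


class FlowReorderer:
    """Releases packets per flow in arrival order, whatever order work completes in."""

    def __init__(self):
        self.next_arrival = defaultdict(int)  # next sequence number to assign, per flow
        self.next_release = defaultdict(int)  # next sequence number allowed out, per flow
        self.finished = defaultdict(dict)     # flow -> {seq: packet} awaiting release

    def arrive(self, flow):
        """Tag an arriving packet with its position within the flow."""
        seq = self.next_arrival[flow]
        self.next_arrival[flow] += 1
        return seq

    def complete(self, flow, seq, packet):
        """Record a processed packet and return the packets that may now transmit."""
        self.finished[flow][seq] = packet
        released = []
        while self.next_release[flow] in self.finished[flow]:
            released.append(self.finished[flow].pop(self.next_release[flow]))
            self.next_release[flow] += 1
        return released


poe = FlowReorderer()
s0, s1, s2 = (poe.arrive("flowA") for _ in range(3))

first = poe.complete("flowA", s2, "pkt2")   # finished early, must wait
second = poe.complete("flowA", s0, "pkt0")  # head of flow, released at once
third = poe.complete("flowA", s1, "pkt1")   # unblocks pkt2 as well
print(first, second, third)
```

Different flows are tracked independently, so a slow packet in one flow does not delay packets in another.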

## **Autonomous Security Acceleration Engine® Module**

- 40 Gbps of bulk encryption/decryption
- 10 high-speed crypto cores
- DES/3DES, AES (128, 192, 256), ARC4/RC4, MD5, SHA-1, SHA-256/384/512, and SNOW3G (All HMAC)
- RSA / DH exponentiation for SSL / IPSec with up to 60K RSA key exchanges per second
- ECC (Elliptic Curve Cryptography)
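As a point of reference for the HMAC modes listed above, the following sketch computes an HMAC-SHA-256 tag in software using Python's standard library. It only illustrates the kind of operation the engine offloads from the cores. The key and message are made-up example values, and nothing here reflects the engine's programming interface.

```python
import hashlib
import hmac


def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA-256 authentication tag in software."""
    return hmac.new(key, message, hashlib.sha256).digest()


# Example values only; a real system would use a securely provisioned key.
tag = hmac_sha256(b"example-key", b"packet payload")

# Verification recomputes the tag and compares in constant time.
is_valid = hmac.compare_digest(tag, hmac_sha256(b"example-key", b"packet payload"))
print(tag.hex(), is_valid)
```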

## **DMA and Storage Application Acceleration Engine**

- 8-channel DMA controller
- RAID5 XOR engine
- RAID6 P + Q syndrome engine
- De-duplication acceleration hardware assistance

## **Compression/Decompression**

- Four Compression/Decompression Engines (CDEs)
- 10 Gbps operation: Deflate or Inflate or combination (across all CDEs)
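For readers who want to see the Deflate and Inflate operations in software terms, the sketch below runs a round trip with Python's standard `zlib` module. Note that `zlib.compress` wraps the Deflate stream in a zlib header and checksum. Whether the hardware engines produce raw Deflate, zlib, or gzip framing is not stated in the source, so this is only an analogy.

```python
import zlib

payload = b"XLP832 " * 64                 # repetitive example data compresses well

deflated = zlib.compress(payload, 6)      # Deflate (compression level 6)
inflated = zlib.decompress(deflated)      # Inflate

print(len(payload), len(deflated), inflated == payload)
```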

## **Networking Interfaces**

- Multi-rate SERDES interfaces support Interlaken, XAUI and SGMII

## **Integrated System Interfaces**

- Four PCIe 2.0 controllers
- LA Interlaken for TCAM
- Three USB 2.0 ports: 2 Host mode and 1 Host or Device mode
- NAND, NOR, MMC Flash memory interfaces
- SPI interface
- Two I2C interfaces
- Two 16550 UART interfaces
- GPIO interface

## **Power Management**

- Dynamic frequency scaling with per-core granularity
- Dynamic voltage control over the 2-core complex





## **Contact Information**

#### Phone:

+1 (650) 961-6676

#### Internet:

www.netlogicmicro.com

#### Worldwide Headquarters:

1875 Charleston Road, Mountain View, CA 94043, United States of America



Putting Intelligence in the Network™

NetLogic Microsystems, the NetLogic Microsystems logo, XLP, XLR, XLS, Fast Messaging Network, DedicatedPlus Cache, Autonomous Acceleration Engine, Autonomous Network Acceleration Engine, Autonomous Security Acceleration Engine, Memory Distributed Interconnect, I/O Distributed Interconnect, and VirtuCore are trademarks or registered trademarks of NetLogic Microsystems, Inc. Other names and brands may be claimed as the property of others.

The information contained herein is the property of NetLogic Microsystems, Inc., and is believed to be accurate at the time of publication. NetLogic Microsystems, Inc. assumes no liability for any error or omissions in this information, or for the use of this information or products described herein. NetLogic Microsystems, Inc. reserves the right to make changes to its products or product documentation at any time without notice. Disclosure of the information herein does not convey a license or any other right in any patent, trademark, or other intellectual property of NetLogic Microsystems, Inc.

Copyright © 2010 NetLogic Microsystems, Inc. All Rights Reserved.

