## DDR2 SDRAM interfaces for next-gen systems

By Razak Mohammed Ali, Product Marketing Manager, High-Density Products, Altera Corp. E-mail: rmohamme@altera.com

Memory devices are critical components of electronic systems. As growing end-market demands increase system complexity, next-generation systems require newer memory architectures. For example, high-end servers, router boxes and even video game systems need high-speed, low-latency memory architectures, so memory developers are designing innovative devices.

Representing the latest high-speed memories available today are DDR2 SDRAM, RLDRAM II and QDRII SRAM. Designers must understand their respective strengths and weaknesses to balance system performance with bandwidth, density, latency, power and cost. **Table 1** shows various memory device options available today.

SDRAMs have traditionally been used in PCs. As processor core speeds exceeded 2GHz, changes in memory speed, efficiency, size and cost became necessary to support processor enhancements. DDR SDRAMs were then introduced as a cost-effective solution for upgrading data bandwidth to memory, and quickly became the memory of choice for the PC and server markets (**Figure 1**).

Figure 2: DRAMs are integrated into networking applications requiring high-speed memories.

The price drop was noted in other applications such as networking. Several memory requirements exist for networking applications. Among these, DRAMs are primarily used for packet buffer memory, which involves large packet memory to store entire packets during network processing of packet headers (**Figure 2**).

## Improved design

DDR2 SDRAM is the next evolutionary step from DDR memory. Both DDR and DDR2 SDRAM have mostly identical addressing and command control interfaces. The basic difference between the two lies in the data interface. DDR uses both the rising and falling edges of the clock to transfer data. DDR2 SDRAM does the same, but uses a 4n-prefetch architecture in which the internal data bus is four times the width of the external data bus. A single read/write cycle consists of a single 4n-bit-wide, one-clock-cycle data transfer at the memory core and four corresponding n-bit-wide, one-half-clock-cycle data transfers at the I/O. Note that two external clock cycles correspond to two rising and two falling edges. This structure enables high-speed operation, as internal column accesses occur at a quarter of the frequency of the external data transfer rate.
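The relationship between the external clock, the per-pin data rate and the core access rate can be sketched in a few lines. This is only an illustration of the arithmetic described above; the function name `prefetch_rates` and the 200MHz example clock are assumptions chosen for the sketch, not part of any DDR2 specification or vendor tool.

```python
def prefetch_rates(clock_mhz: float, prefetch: int = 4) -> dict:
    """Relate the external clock to the per-pin data rate and core access rate."""
    # Data moves on both clock edges, so each pin carries two bits per clock.
    data_rate_mbps = 2 * clock_mhz
    # The core fetches `prefetch` words per column access, so it only has to
    # run at 1/prefetch of the external data transfer rate.
    core_access_mhz = data_rate_mbps / prefetch
    # One core access is drained over prefetch/2 external clock cycles.
    clocks_per_access = prefetch / 2
    return {
        "data_rate_mbps": data_rate_mbps,
        "core_access_mhz": core_access_mhz,
        "clocks_per_access": clocks_per_access,
    }


# Hypothetical example: a 200MHz external clock with 4n-prefetch.
rates = prefetch_rates(200.0)
print(rates)
```

With a 200MHz clock, each pin transfers 400Mbps while column accesses in the core occur at 100MHz, one access for every two external clock cycles.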

The data interface is designed to transfer two n-bit-wide words per clock cycle. However, if data transfers were based on a free-running system clock, the maximum frequency would be reached when the total output access and flight time equaled the bit time. Moreover, in such a scheme, the data does not track the clock during changes in temperature and loading, which narrows the effective data valid window and further limits the maximum attainable frequency.

To overcome these limitations, DDR2 SDRAMs use a byte-wide, bidirectional, differential or single-ended data strobe (DQS) that is transmitted externally, along with data (DQ), for capture. DQS is transmitted edge-aligned by the DDR2 SDRAM during reads, and center-aligned by the controller during writes to the memory. DDR2 SDRAM uses on-chip delay-locked loops (DLLs) to clock out DQS and the corresponding DQs, ensuring that they are well-matched and can track each other during changes in voltage and temperature.

Figure 4: Stratix II FPGAs feature dedicated DQS circuitry.

| Parameter | DDR SDRAM | DDR2 SDRAM | RLDRAM II | QDRII SRAM |
|---|---|---|---|---|
| Performance (MHz) | 100-200 | 200-400 | 200, 300, 400 | 154-333 |
| Density | 64Mb-1Gb<br>32Mb-2Gb (DIMM) | 256Mb-1Gb<br>32Mb-2Gb (DIMM) | 288Mb, 576Mb | 8-72Mb |
| Important features | Doubling raw bandwidth over SDR SDRAM | Higher speed, density and lower power than DDR SDRAM | … bandwidth, efficient … | |
| Target market | Desktops, servers, storage, LCDs, displays, networking and communication equipment | Desktops, servers, storage, LCDs, displays, networking and communication equipment | Main memory, cache memory, networking, packet processing, traffic management | Cache memory, routers, ATM switches, packet memories, lookup and classification memories |
| I/O standard | SSTL-2 Class I, II | SSTL-18 Class I, II | HSTL-1.8V / 1.5V | HSTL-1.8V |
| Data width (bits) | 4, 8, 16, 32 | 4, 8, 16 | 9, 18, 36 | 8, 9, 18, 36 |
| Burst length | 2, 4, 8 | 4, 8 | 2, 4, 8 | 2, 4 |
| Number of banks | 4 | 8 (>1Gb), 4 | 8 | N/A |
| Row/column access | Row before column | Row before column | Row and column together or multiplexed option | N/A |
| CAS latency | 2, 2.5, 3 | 3, 4, 5 | 4, 6, 8 | N/A |
| Posted CAS additive latency | N/A | 0, 1, 2, 3, 4 | N/A | N/A |
| Read latency | RL = CL | RL = CL + AL | RL = CL | 1.5 clk cycles |
| On-die termination | No | Yes | Yes | Yes |
| Data strobe | Single-ended bidirectional strobe | Differential or single-ended bidirectional strobe | Free-running differential read and write clocks | Free-running read and write clocks |
| Refresh requirement | Yes | Yes | Yes | No |
| Relative cost comparison | Lowest | Less than DDR SDRAM with market acceptance | Higher than DDR SDRAM, less than SRAM | Highest |
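Table 1 gives the DDR2 SDRAM read latency as RL = CL + AL, with CAS latencies of 3, 4 or 5 and posted CAS additive latencies of 0 to 4. The sketch below turns that relationship into clock cycles and nanoseconds. The function name `ddr2_read_latency` and the 200MHz example clock are assumptions made for illustration; real devices restrict which CL values are valid at a given speed grade, which this sketch does not model.

```python
# Allowed values as listed in Table 1 for DDR2 SDRAM.
DDR2_CAS_LATENCIES = (3, 4, 5)
DDR2_ADDITIVE_LATENCIES = (0, 1, 2, 3, 4)


def ddr2_read_latency(cl: int, al: int, clock_mhz: float) -> tuple:
    """Return the read latency as (clock cycles, nanoseconds)."""
    if cl not in DDR2_CAS_LATENCIES:
        raise ValueError(f"unsupported CAS latency: {cl}")
    if al not in DDR2_ADDITIVE_LATENCIES:
        raise ValueError(f"unsupported additive latency: {al}")
    rl_cycles = cl + al                 # RL = CL + AL
    period_ns = 1000.0 / clock_mhz      # clock period in nanoseconds
    return rl_cycles, rl_cycles * period_ns


# Hypothetical example: CL = 4, AL = 0 at a 200MHz clock.
cycles, latency_ns = ddr2_read_latency(cl=4, al=0, clock_mhz=200.0)
print(cycles, latency_ns)
```

Setting AL to a nonzero value lengthens the read latency by the same number of clock cycles, as the RL = CL + AL entry in the table implies.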

DDR2 SDRAMs feature differential clock inputs (CK and CK#) to mitigate the effects of duty-cycle variation on the clock inputs. Like SDR and DDR SDRAMs, DDR2 SDRAMs also support the use of data mask signals to mask data bits during write cycles. All I/Os comply with the JEDEC standard for SSTL-18.

A DDR2 SDRAM implementation includes several design blocks, such as the memory control, read physical and write physical blocks (Figure 3). The memory control block is designed to efficiently issue accesses from memory to the application-specific core logic or vice versa. The read physical block handles the external signal timing that captures data during read cycles. Likewise, the write physical block manages the issuance of clock and data with the appropriate external signal timing.

## FPGA support

In many digital systems, FPGAs connect multiple devices for interoperability and co-processing, implement features not supported by ASIC devices, or implement an entire device function. Some modern FPGAs, such as Altera's Stratix II, Stratix and Stratix GX FPGAs, are designed to support high-speed memory interfaces and the corresponding intellectual property cores. **Table 2** summarizes the memory interfaces supported by Stratix II and Stratix II GX FPGAs.

Figure 5: Eye diagram verifies the functionality of DDR2 SDRAM interface.

| Memory type | Maximum data rate peripherals (Mbps) | Maximum clock frequency (MHz) | Bandwidth for 32 bits (Gbps) |
|---|---|---|---|
| DDR2 SDRAM | 667 | 333 | 21 |
| DDR SDRAM | 400 | 200 | 13 |
| RLDRAM II | 600 | 300 | 19 |
| SDR SDRAM | 200 | 200 | 6 |
| QDR/QDRII SRAM | 1,200 | 300 | 38 |
| ZBT SRAM | 200 | 200 | 6 |

Table 2: Stratix II and Stratix II GX give high-speed memory interface support.

| Measurement | Drive strength | Vout (V) | DIMM spec (V) |
|---|---|---|---|
| V<sub>oh</sub> minimum | OCT25 | 1.15 | 1.025 |
| V<sub>oh</sub> minimum | 16mA | 1.23 | 1.025 |
| V<sub>ol</sub> maximum | OCT25 | 0.55 | 0.775 |
| V<sub>ol</sub> maximum | 16mA | 0.43 | 0.775 |

Table 3: Eye diagram results exceed DDR2 specifications.
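The 32-bit bandwidth column in Table 2 follows directly from the data rate column: the data rate in Mbps multiplied by 32 bits, then rounded to whole Gbps. The short sketch below reproduces that arithmetic from the table values; the names `DATA_RATE_MBPS` and `bandwidth_gbps` are hypothetical and exist only for this illustration.

```python
# Maximum data rates (Mbps) as listed in Table 2.
DATA_RATE_MBPS = {
    "DDR2 SDRAM": 667,
    "DDR SDRAM": 400,
    "RLDRAM II": 600,
    "SDR SDRAM": 200,
    "QDR/QDRII SRAM": 1200,
    "ZBT SRAM": 200,
}


def bandwidth_gbps(rate_mbps: float, width_bits: int = 32) -> float:
    """Aggregate bandwidth of a `width_bits`-wide interface, in Gbps."""
    return rate_mbps * width_bits / 1000.0


for memory, rate in DATA_RATE_MBPS.items():
    # Round to whole Gbps, matching the precision used in the table.
    print(f"{memory}: {round(bandwidth_gbps(rate))} Gbps")
```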

The major challenge in designing a high-speed DDR2 external interface is how to reliably capture the DQ data with the DQS strobe during read operations, and properly drive DQ data with the DQS strobe during write operations. DDR2 poses an added challenge over other high-speed memory interfaces because the DQS signal is a bidirectional strobe, not a unidirectional clock. PLLs cannot be used in the front end of the read path, and the bidirectional aspects introduce issues in tristating the signal at high speed. FPGAs like Stratix II have dedicated DQS phase-shift circuitry to handle such challenges.
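Because DQS arrives edge-aligned with DQ during reads, the controller has to delay the strobe so that its edges land inside the data valid window. A common nominal choice is a 90° shift, i.e. a quarter of the clock period; that figure is an assumption for this sketch, not a value stated in the article. The helper below (hypothetical name `dqs_shift_ps`) simply converts a phase shift into a delay in picoseconds for a given clock frequency.

```python
def dqs_shift_ps(clock_mhz: float, phase_deg: float = 90.0) -> float:
    """Delay in picoseconds that corresponds to `phase_deg` of one clock period."""
    period_ps = 1_000_000.0 / clock_mhz      # clock period in picoseconds
    return (phase_deg / 360.0) * period_ps   # fraction of the period to delay


# Hypothetical example: an assumed 90-degree shift at a 250MHz clock.
delay = dqs_shift_ps(250.0)
print(delay)
```

The required delay shrinks as the clock frequency rises, which is one reason the shift has to track the actual operating frequency rather than being a fixed value.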

Meanwhile, Altera's Quartus II software provides Megafunctions for DQ and DQS to instantiate RTL blocks based on the DDR2 memory configuration. The software analyzes the DQS input frequency to determine the optimal resynchronization clock phase, and provides Tcl commands to extract timing-analysis results. A simulation model is also provided to verify DLL and DQS delay behavior.

Figure 5 shows an eye diagram that verifies the functionality of the DDR2 SDRAM interface in Stratix II FPGAs. The eye diagram was captured during a write operation at 267MHz, with OCT 25Ω on the left and 16mA on the right. The measurement was taken at the DIMM (far) end, i.e. with the FPGA driving. Drive strength was 16mA and the termination was set to Class II. Measurements show that the measured levels exceed the DDR2 specification of 900mV ±125mV (775mV to 1.025V) (Table 3).
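The comparison in Table 3 can be expressed as a simple margin check: the minimum output-high voltage must sit above 1.025V and the maximum output-low voltage below 0.775V. The sketch below applies that check to the measured values from the table. The names `MEASURED` and `check_levels` are hypothetical, and the pass criterion is an interpretation of the table rather than a quotation of the DDR2 specification.

```python
# Limits from the article: 900mV +/- 125mV.
SPEC_VOH_MIN = 1.025   # volts, minimum acceptable output-high level
SPEC_VOL_MAX = 0.775   # volts, maximum acceptable output-low level

# Measured (Voh minimum, Vol maximum) in volts, from Table 3.
MEASURED = {
    "OCT25": (1.15, 0.55),
    "16mA": (1.23, 0.43),
}


def check_levels(voh_min: float, vol_max: float) -> dict:
    """Return the margin on each level and whether both limits are met."""
    voh_margin = voh_min - SPEC_VOH_MIN   # positive means Voh clears the limit
    vol_margin = SPEC_VOL_MAX - vol_max   # positive means Vol clears the limit
    return {
        "voh_margin_v": voh_margin,
        "vol_margin_v": vol_margin,
        "passes": voh_margin >= 0 and vol_margin >= 0,
    }


results = {setting: check_levels(*levels) for setting, levels in MEASURED.items()}
for setting, result in results.items():
    print(setting, result)
```

Both drive settings leave positive margin on each level, which is what the table caption summarizes.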

The DDR2 SDRAM interface offers a memory technology choice for next-generation systems. Designing for DDR2 SDRAM interfaces involves multiple challenges. However, FPGA-based memory interface solutions are emerging to overcome these challenges and provide a robust solution that meets the industry's performance needs.