## THE INTEL MEMORY DESIGN HANDBOOK

CONTENTS

1. Organization and Operation of Fixed Address RAM
Introduction
Designs Using Bipolar and Static MOS Circuits
Dynamic MOS Memory - The 1103
2. Read Only Memories - ROMs and PROMs
Introduction
Electrically Programmable MOS ROMs
PROMs - Field Programmable Bipolar ROMs
Preparing Data for ROMs/PROMs
ROM Implementation of Generalized Logic
Example of ROM Application
3. Serially Accessed Semiconductor Memory - SHIFT REGISTERS
Introduction
Principles of Operation
Circuit Considerations
Power Considerations in Shift Registers
RAMs as Serial Memory
Application of Shift Registers
4. Other Memory Structures - CAMs, Buffer Memories, and Multiport Memories
Introduction
Content Addressable Memory (CAM)
Buffer Memories
Multiport Memories
5. BCD Addressing
6. Appendices
Appendix A - Silicon Gate MOS
Appendix B - Schottky Bipolar Technology
Appendix C - Article Reprints

## PREFACE

It is hoped that these notes will be of assistance to the designer who requires memory function in his system. The variety of semiconductor memory products is already quite extensive and is growing rapidly. This booklet has attempted to give some understanding of the characteristics of the types of memory now available and some hints about how to solve a few system design problems. However, the variety of applications is so broad that it has been possible to touch only very briefly upon a few of them.


## INTRODUCTION

Memory function is one of the most important elements of a digital computer. Large computer systems often contain many types of memories: reels of tape and drawers of cards, disk cartridges, ferrite core or semiconductor main storage, fast registers, etc. Each of these different types of memory offers a means for storing and retrieving data (and programs which are also "data") for use by the computer system, yet they differ in cost, organization, retrieval time, etc.
Memory function is also important to many other systems which utilize digital processing. Digital filters, computer terminals, electronic cash registers, traffic-light controllers - all may require data storage to achieve their intended function.
Today there are many types of memory available to the designer. One of the most recent additions to the repertory of memory technologies is the semiconductor memory. In fact, there are many new semiconductor memory components, which utilize different technologies, and offer different organizations, operating characteristics, etc.

The purpose of this booklet is to offer the designer some basic information about the types of semiconductor memory available and how to utilize them effectively.
The booklet is divided into six parts. Part 1 describes basic random access, fixed-address memories (RAMs). This type of memory organization is most commonly used as the main memory of general purpose digital computers, and is also used to realize fast registers or scratch-pad memories, many types of buffer memories, etc. Part 2 discusses Read Only Memories (ROMs) and Programmable Read Only Memories (PROMs), which are almost always random-access, fixed-address memories as well. ROMs/PROMs are used for data tables, control units, as the main memory of certain dedicated special purpose computers, and to replace general random logic. Part 3 discusses serially addressed memory (shift registers). Part 4 discusses memory organizations other than fixed address, such as Content Addressable Memories (CAMs), buffered memories, virtual memory, etc. Part 5 discusses Binary Coded Decimal (BCD) addressing techniques. Part 6 includes selected article reprints on process technology and device applications.

INTRODUCTION
DESIGNS USING BIPOLAR AND STATIC MOS CIRCUITS
A. Buffer Circuits
B. Access Time Considerations
C. Chip Select Decoding
D. Power Considerations
E. Paging Techniques for Access Time Reduction
DYNAMIC MOS MEMORY - THE 1103
A. Operation of the 1103
B. Chip Select
C. Clock Signal Amplitudes
D. Array Connection of 1103s
E. System Timing
F. Power Considerations
G. Application Examples
H. Protection Against Catastrophic Damage
I. Summary

## INTRODUCTION

As defined here, a random access memory is one in which the writing time and retrieval (access or cycle) times are essentially independent of the location into which the data is entered. By fixed address, it is meant that data is entered into a known, pre-determined location, and is retrieved from that same location. In a fixed address RAM the data does not move around in the memory except by external control.

For many years, there has been just one practical way to build a random access high speed memory - use ferrite cores. But suddenly, the designer of memory systems is confronted with a proliferation of new semiconductor memory devices and a host of new rules and considerations to face. The designer who is now in that position, facing a pile of data sheets, and wondering where to start, shouldn't give up hope. Semiconductor memories are easy to use and offer high performance. Most important of all, some of the newest semiconductor components cost less per bit than core memory.
First consider some of the types of RAM building blocks now available. These can be first divided into three basic categories, related to the technology and circuit techniques used for their implementation:

## 1. Bipolar

Bipolar technology is most familiar to the logic designer in the form of RTL, TTL, ECL, and other high performance low voltage circuit families. Bipolar circuits are so called because the technology primarily realizes npn bipolar junction transistors. Other components which can be integrated include resistors and junction diodes. A more recently developed bipolar technology, the Schottky technology, also allows Schottky diodes, and substrate and lateral pnp transistors.
Bipolar semiconductor memories offer high performance and are usually directly compatible with one or more of the standard bipolar logic families. Presently available are bipolar memories of up to 256 bits per chip. When compared to MOS memories, bipolar units are usually faster, but cost more and usually dissipate more power.

## 2. Static MOS

Both static and dynamic MOS circuits use insulated gate field effect transistors as the primary circuit element. A few circuits may also include diffused resistors. However, in static circuits the devices are used as if they were DC amplifiers, while in dynamic circuits extensive use is made of capacitances for temporary storage.

In general, static MOS circuits are easier to use in systems than dynamic MOS circuits, but require more power and cost more per circuit function. With the availability of newer (low threshold voltage) MOS technology, many static MOS circuits can be directly interfaced to TTL or DTL logic circuits.

Static MOS designs are available in both n and p channel technologies. The n channel MOS technology offers even easier interface to TTL than the p channel technology. Static MOS memories are now available with up to 1024 bits per chip. These memories are usually slower than bipolar memories of a similar size by a factor of 5 to 10, but cost less because of the inherently higher yield and smaller size of the MOS devices.

## 3. Dynamic MOS

Dynamic MOS circuits make use of temporary storage of data on the parasitic capacitances within the circuit. Due to leakages associated with junctions within the circuit, the charge on these capacitances may leak off in a few milliseconds. To prevent loss of data, this charge must be restored (refreshed). Restoration of the charge is achieved by regenerating or recirculating the data. Dynamic circuits usually require externally generated clock voltages. These clock voltages are usually too high to be generated with ordinary integrated circuits, so special ICs or discrete circuit components must often be used. As a result, dynamic circuits are usually more difficult to interface to standard bipolar logic families than the other types of circuits described above.

Although more difficult to use, dynamic MOS circuits offer some very significant advantages. The chip area required per unit function is much smaller than for either bipolar or static MOS. This high functional density, combined with the higher yield of silicon-gate MOS technology, makes the dynamic MOS circuit the most economical of the three approaches, and permits large amounts of memory to be integrated on a single chip. 1024 bit dynamic MOS memories have been available for some time and larger units are in development. Speeds possible with dynamic MOS circuits are much faster than for static MOS, although usually not quite as fast as for bipolar circuits. Thus, because high functional density can be combined with high performance, dynamic MOS circuits are potentially the strongest contenders for use in large random access memory systems. Dynamic MOS circuits also offer the lowest power dissipation per function of the three basic approaches. In many cases, such circuits are designed so that they dissipate power only during access or data regeneration operations.
These properties are summarized in Table 1.1 below. Samples of different types of semiconductor memory elements are shown together with the technology used.

The access times and operating powers given in Table 1.1 refer to the memory parts themselves. However, when used in systems, actual access and cycle times are usually somewhat greater and may be affected significantly by memory size. In addition, system cost per bit may be a significant function of memory size. Figure 1.1 gives relative cost and access time for systems over a range of memory sizes for each of the products described above. Additional components for memory systems are assumed to be drawn from TTL logic families.

From the charts of Figure 1.1, the relative speed and cost of the different approaches can be determined. If speed is unimportant and a relatively large memory must be built, then the 1103 dynamic MOS RAM is the obvious choice. However, when the memory needed is small, one of the other components may be more economical, because they require less peripheral and control circuitry than does the dynamic MOS. In other cases, very high speed may be required so that one of the bipolar products must be used.

Table 1.1. Organization and Operation of Fixed Address RAM

| Part No. | Technology | Configuration | Access/Cycle | Operating Power/Bit | Standby Power/Bit |
| :--- | :--- | :---: | :---: | :---: | :---: |
| 3101 | Bipolar | $16 \times 4$ | 60/60 ns | 8 mW | 8 mW |
| 3101A | Bipolar | $16 \times 4$ | 35/35 ns | 8 mW | 8 mW |
| 3106/7 | Bipolar | $256 \times 1$ | 80/80 ns | 2.5 mW | 2.5 mW |
| 3106A/7A | Bipolar | $256 \times 1$ | 60/60 ns | 2.5 mW | 2.5 mW |
| 1101A | Static MOS, p-channel | $256 \times 1$ | 1500/1500 ns | 2.5 mW | .14 mW |
| 2102 | Static MOS, n-channel | $1024 \times 1$ | 1000/1000 ns | .35 mW | .09 mW |
| 1103 | Dynamic MOS, p-channel | $1024 \times 1$ | 300/600 ns | .45 mW | .06 mW |



Figure 1.1. Memory System Characteristic vs. Size

Having chosen a suitable memory part for the basic building block of the system, the designer must next organize the system. The design of the static MOS memories is quite similar to that of bipolar memories. However, dynamic MOS random access memories may require a number of additional considerations.

## DESIGNS USING BIPOLAR AND STATIC MOS CIRCUITS

Fixed address RAMs are usually organized as N words of M bits each. N may be any value from 8 to 16 for scratch pad memories to several million for very large computer memories. M usually falls in the range of 6 to several hundred. Because most computers use binary arithmetic, the address which selects one of the N words is usually a binary value of some k bits, with N designed to be equal to a power of two, that is, $2^{k}$.

To some extent, these organizations are the result of ferrite core economics, for core memory costs are usually lowest when $N=M^{4}$. Semiconductor memory offers greater freedom of organization, with the organization limits being determined by the size of the basic building blocks, i.e., memory components. Not all words of a semiconductor memory need have the same number of bits, nor is it necessary for N to always be exactly some power of two. For example, when using $1024 \times 1$ memory components such as the 1103 or 2102, a memory of 3072 or 5120 words could easily be constructed with a cost per bit about the same as for a 4096 word memory.
In general, each semiconductor memory chip realizes some $2^{p}$ words of $q$ bits each. As long as $N$ is a multiple of $2^{p}$ and $M$ a multiple of $q$, an $N$ word by $M$ bit memory can be realized without wasting bits.
To build an $N$ word by $M$ bit memory from $2^{p}$ word $\times$ $q$ bit components, the components are effectively laid out in a two dimensional array. The array must include a total of $N / 2^{p}$ rows and $M / q$ columns.
All of the semiconductor memory components listed in Table 1.1 are designed to make the connection in arrays as easy as possible, and with as few additional components as possible.
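
The row and column counts above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the function name `array_shape` and its parameters are illustrative and do not come from this handbook.

```python
def array_shape(n_words, m_bits, chip_words, chip_bits):
    """Return (rows, columns, chips) for an N word by M bit memory
    built from chip_words x chip_bits components.

    rows    = N / 2^p  (chip_words plays the role of 2^p)
    columns = M / q    (chip_bits plays the role of q)
    """
    if n_words % chip_words or m_bits % chip_bits:
        raise ValueError("N must be a multiple of 2^p and M a multiple of q")
    rows = n_words // chip_words
    columns = m_bits // chip_bits
    return rows, columns, rows * columns


# Hypothetical example: 4096 words of 16 bits from 1024 x 1 parts.
print(array_shape(4096, 16, 1024, 1))  # (4, 16, 64)
```

A 5120 word by 8 bit memory built from the same parts would need 5 rows and 8 columns, which illustrates that N need not be a power of two.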

All permit several devices to have inputs and outputs connected to common input and output busses. One chip at a time is enabled for writing or reading (placing its data on an output bus) by means of a chip select signal.
To wire an array of $2^{p}$ word by $q$ bit chips as an $N=2^{k}$ word memory, the array will contain $2^{k-p}$ rows of $M / q$ circuits (chips or packages) each.
Of the $k$ address lines to the memory, $p$ will be common to all chips. The remaining $k-p$ address lines will be decoded to generate $2^{k-p}$ row select signals. Each row select signal drives all of the chip select leads in its row.
Figure 1.2 shows this basic organization.

Figure 1.2. Organization of $2^{k}$ Word by $m$ Bit Memory from $2^{p}$ Word by $q$ Bit Memory

These general rules apply to most of the memory products listed in Table 1.1. Because the remaining considerations (buffer circuitry, etc.) are quite different for dynamic MOS systems than for bipolar and static MOS systems, the remainder of this section has been divided into two segments.
In general, the rules of wiring the array of Figure 1.2 are as follows:

1. All corresponding power supply leads are made common throughout the array. In general, a two dimensional grid should be used when possible. (See Figure 1.3.)
2. The write enable signal is made common throughout the array.
3. All corresponding addresses are made common throughout the array.
4. Corresponding data input and data output leads are made common within array columns.
5. Corresponding chip select leads are made common within each row. The function of the chip select leads is to permit the array interconnection. When conditions for chip (i.e., row) selection are not met, no input signal can affect the contents of that row, nor does any unselected chip affect the signals on the data output line to which it is attached. Thus the chip select leads permit output leads to be OR-tied and eliminate the necessity to decode the write pulse signals.
The array of Figure 1.2 can be conveniently laid out using two printed circuit layers. In Figure 1.2, buffer circuits on address and data input and outputs have been indicated. The degree to which such buffer circuits are necessary is a function of the type of circuit used, the loading that can be tolerated by the external circuits, and the degradation in access and cycle time which is acceptable.
Figure 1.3a shows the pin connections for three of the parts mentioned in Table 1.1, and Figure 1.3b shows typical layouts for each of the memory parts included in this category, with the orientation of all layouts chosen so that chip selects run horizontally in rows and data leads run vertically in columns. The remaining leads are run in the most convenient direction. All of these layouts make use of leads running between the pins of the dual in-line package. To prevent solder shorts, these lines are run on the component side of the board. In Figure 1.3b the lines on the soldered side of the board are shown dotted.


Figure 1.3. Printed Circuit Layout of Memory Arrays

## A. BUFFER CIRCUITS

## A.1 TTL to P-channel MOS

As an example of interfacing TTL to p-channel static MOS circuitry, consider the input requirements for the 1101 A memory, which is realized with p-channel (low-threshold voltage) silicon gate MOS technology.
1101A inputs are essentially purely capacitive, so that when driving 1101A memory arrays, the primary considerations involve (1) the effects of significant capacitive loads on address and data line buffers, and (2) guaranteeing proper voltage levels on these lines.
Because 1101A memory arrays are usually relatively slow, the speed degradation associated with capacitive loads is usually not so important as the effects of reflecting large capacitive loads back into the driving circuitry.
Figure 1.4 shows the nature of the TTL to 1101A static MOS interface. In Figure 1.4, $Q_{1}$ conducts when the TTL gate output is sufficiently negative with respect to $V_{CC}$ (at least 4.5 V below $V_{CC}$). The ratio of geometries between $Q_{1}$ and $Q_{2}$ is chosen so that when $Q_{1}$ conducts, the internal MOS signal line is brought within a volt or two of $V_{CC}$. When the TTL gate output is high, $Q_{1}$ does not conduct, and the internal MOS line is drawn toward the $-V$ supply by $Q_{2}$. This high level must be no lower than $V_{CC}-2$ V. Resistor R should be added to guarantee proper operating levels, as TTL outputs without the resistor are only guaranteed to reach $V_{CC}-2.35$ V (7400 series). Note that the input signal voltage is referenced to $V_{CC}$. It is, therefore, important that $V_{CC}$ be adequately bypassed to the TTL ground line. $V_{CC}$ is normally the same supply as used for the TTL logic. When driving p-channel MOS, it is undesirable to allow this voltage to fall too low, for noise immunity may suffer. Because the MOS input draws no DC current for normal bias, a very large number of devices may be driven by a single TTL gate, although the capacitive loading in large arrays may cause some speed degradation.


Figure 1.4. TTL to p-channel MOS Interface (1101A)

The output of the p-channel MOS circuit can be designed to drive a TTL or low power TTL (LPTTL) gate. The 1101A can drive at least one TTL gate over the full temperature range of 0 to $70^{\circ}\mathrm{C}$. Figure 1.5 shows the nature of the p-channel MOS to TTL interface. Parts which are disabled (not enabled by chip select) have neither $Q_{1}$ nor $Q_{2}$ conducting. For a logic "1", $Q_{1}$ conducts, providing a 1 level very nearly at $V_{CC}$. However, for a logic "0", $Q_{2}$ conducts. For the 1101A, $Q_{2}$ sinks at least 2.0 mA when the output is at .45 V above ground, thus exceeding the minimum TTL sinking requirement of 1.6 mA. Typically, the excess sinking current capability of the 1101A causes the TTL input clamp diode $D_{1}$ to be forward biased (at around -.7 V with respect to ground) so that some TTL substrate current flows. The maximum current from the 1101A is 13 mA at -1 V.


Figure 1.5. P-channel MOS to TTL Interface

## A.2 TTL to N-channel MOS (2102)

The 2102 is an n-channel 1024 word by 1 bit memory. It is ideally suited for applications where high performance, low cost, large bit storage, and simple interfacing are important design objectives. The 2102 requires only a single +5 V power supply. The 2102 has operating conditions which are almost identical to TTL, so that interfacing the 2102 to TTL is far easier than for p-channel MOS devices. Figure 1.6 shows the nature of the input and output interfaces.

Within the n-channel circuit, the signal voltage levels are very much like TTL levels. When the TTL output is high, $Q_{1}$ conducts, producing a low internal level. When the TTL output is low, $\mathrm{Q}_{1}$ is off, and $\mathrm{Q}_{2}$ produces a high internal level. Input requirements are for high input levels of 2.2 V or greater and low input levels of .65 V or less. As a result, noise margins are somewhat less than those of TTL circuitry.


Figure 1.6. N-channel TTL Interface

However, the slower n-channel circuits do not respond as rapidly to noise as bipolar TTL circuits. This slower response somewhat compensates for the reduced noise margins.
The output circuit is capable of driving one 1.6 mA TTL load with .3 mA in reserve. However, when OR-tying several devices in an array, this .3 mA is needed to compensate for output leakage currents. These output leakages ($100\ \mu\mathrm{A}$ maximum) are equivalent to conduction currents in output transistor $Q_{4}$ of disabled devices. As a result, no more than 4 devices (2102s) should be OR-tied when driving a standard TTL load. For larger arrays, OR-tie capability can be increased by an LPTTL buffer stage, or by organizing the memory in 4k word modules, each with its own output buffer. These output buffers might be open-collector TTL gates enabling data onto a bus or might be realized using multiplexers. The TTL circuits may be operated from the power supply provided for the 2102s.
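
The limit of four OR-tied devices follows directly from the current budget: one selected device drives the load, and each disabled device on the line may draw up to 100 µA of the .3 mA reserve. A minimal sketch of that arithmetic in Python follows; the function name and parameter names are illustrative, and currents are kept in whole microamperes to avoid floating point rounding.

```python
def max_or_tied_devices(reserve_ua=300, leakage_ua=100):
    """Largest number of outputs that can share one data line.

    reserve_ua: sink current left after driving one TTL load (microamps)
    leakage_ua: worst-case output leakage of one disabled device (microamps)
    """
    disabled_devices = reserve_ua // leakage_ua  # leakage the reserve can absorb
    return 1 + disabled_devices                  # plus the one selected device


print(max_or_tied_devices())  # 4
```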

## A.3 Bipolar Buffer Considerations

Once a bipolar memory has been laid out using the techniques shown in Figures 1.2 and 1.3, the signals at the periphery must be properly dealt with.
The 3101, 3101A, 3106, 3106A, and 3107 memories are all designed using a Schottky technology and, except for the 3101, have input loads less than 1/6 that of standard TTL.

In spite of these small loading factors, buffering is usually necessary because the large number of inputs which must be driven represent a significant capacitive load. Because of TTL loading restrictions, several levels of buffering may be necessary.

The bipolar components listed have either open collector outputs (3101, 3101A, 3107, 3107A) or three-state output (3106, 3106A).
Those components with open collector output require a pull-up resistor or a resistor terminator. When the output is going from "1" to "0", discharge current for load capacitance is provided by the output transistor in the memory device. When the output goes from "0" to "1", the capacitor charging current is provided by a pull-up resistor. A resistive terminator circuit may be used to reduce the time constant of the output circuit, yet still not exceed the current sinking capacity of the output devices. Such a circuit is shown in Figure 1.7. This terminator reduces the time for the one to zero transition, while slightly degrading the zero to one transition. For such memories as the 3101, a net reduction in worst case access time results.


Figure 1.7. Resistor Terminator for Bipolar Outputs

## B. ACCESS TIME CONSIDERATIONS

To estimate the access time of a system using these semiconductor memory elements, the worst case path for the addressing mode used must be found. Figure 1.8 shows symbolically some of the paths in a typical memory system.

To estimate the access time, the longest path through the system is found (taking into consideration the differences in delay for chip select and address inputs, and including any delays due to wire, excessive capacitive loading, etc.). Note that, as a general rule, every increase in memory size by a factor of 8 to 10 requires at least one more level of buffering on all inputs and requires one or more levels of multiplexing on outputs. Therefore, access time in TTL-buffered systems tends to increase by 20 to 30 nsec for each order of magnitude increase in memory size, until the physical size of the memory also contributes significant delay.
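
As a rough illustration of this rule of thumb, the added delay can be estimated from the ratio of memory sizes. The sketch below is in Python and treats the trend as a smooth logarithmic curve, which is an assumption made here for convenience; in practice buffering is added in discrete levels. The function name and parameters are illustrative.

```python
import math


def added_buffer_delay_ns(size_ratio, ns_per_decade):
    """Estimated increase in access time when memory size grows by size_ratio.

    ns_per_decade: delay added per tenfold size increase (the text gives
    20 to 30 nsec for TTL-buffered systems).
    """
    if size_ratio < 1:
        raise ValueError("size_ratio must be at least 1")
    return ns_per_decade * math.log10(size_ratio)


# A memory 100 times larger: two orders of magnitude.
print(added_buffer_delay_ns(100, 20), added_buffer_delay_ns(100, 30))
```

For a hundredfold increase in size this gives an estimate of roughly 40 to 60 nsec of added access time, before any contribution from the physical size of the memory.
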
All of the parts described allow an OR-tie connection of output leads. However, each additional device connected to the output adds capacitive loading to the output line. When very large memory arrays are used, this capacitive loading may seriously increase access time due to the time required for the output line to be charged or discharged. In general, the access time is increased by an amount approximately equal to the time constant of the data line.
Three-state outputs are provided on the 3106 and 3106A as opposed to open collector outputs on the 3107 and 3107A. Some designers prefer the three-state output because of its increased capacitance driving capability and ability to handle more TTL loads on the bus. (With open collector outputs, total TTL load driving capability is reduced by the pull-up resistor loading.) Other designers object to the increased system noise and power supply spiking which may take place if the switching of the various three-state devices on a bus line is not carefully synchronized.

Figure 1.8. Possible Delay Paths in 1101A, 3101 and 3106/7 Systems

## C. CHIP SELECT DECODING

Each of the buffered arrays described above presents a number of address leads, together with one or more chip selects per row. The address leads perform selection of a single word from each row while chip select leads act to select which row will deliver data to (or accept data from) the outside of the array.

To select a single word location in an array of $2^{p}$ words requires $p$ binary address bits. If each memory device contains $2^{n}$ words, $2^{p-n}$ rows must be used in the memory device array. To select a word, $n$ of the address leads are sent to the array as common addresses and p-n are decoded to provide row selection. The chip select signals are used in conjunction with row select decoding.
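
The split between common address lines and decoded row-select lines can be sketched as follows. This is an illustrative Python example, not a description of any particular part: it assumes the low-order $n$ bits are the ones sent to every chip and the high-order $p-n$ bits are decoded for row selection, although any assignment of bits would serve.

```python
def split_address(address, p, n):
    """Split a p-bit word address for an array of devices holding 2**n words.

    Returns (row, on_chip): 'row' is the value of the p-n bits that are
    decoded into a row select, 'on_chip' is the n-bit address common to
    all chips. Which bits go where is an assumption of this sketch.
    """
    if not 0 <= address < (1 << p):
        raise ValueError("address does not fit in p bits")
    on_chip = address & ((1 << n) - 1)  # low n bits: common address leads
    row = address >> n                  # remaining p-n bits: row select
    return row, on_chip


# 4096-word array (p = 12) of 1024-word devices (n = 10): 4 rows.
print(split_address(3077, 12, 10))  # (3, 5)
```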

For the 1101A and 3101A memory devices, which have a single chip select input, the additional address bits are decoded such that the chip select for the selected row is low, and all others remain high. For memory parts with multiple chip selects such as the 3106/7, the multiple chip select inputs may be used to decode some of the additional address bits. Examples of chip select decoding are shown in Figure 1.9. For small arrays, there may be sufficient chip select inputs to decode all of the additional address bits.

Figure 1.9 shows how the 3205 one-of-eight decoder can be used to decode 3 to 6 additional address input leads and generate 8 to 64 chip select signals. These chip select signals are compatible with the 3101, 3101A, 3106, 3106A, 3107, 3107A, and with the 1101A (if pull-up resistors are added to the outputs of the 3205).
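
The two-level arrangement that produces 64 chip selects from nine decoders can be modelled behaviorally. The Python sketch below is a simplified model under stated assumptions: it treats each decoder as a generic one-of-eight device with active-low outputs and a single active-low enable, and it does not reproduce the actual 3205 pin functions or enable structure.

```python
def one_of_eight(select, enabled=True):
    """Generic one-of-eight decoder with active-low outputs (assumed model).

    The selected output is 0 when the decoder is enabled; all other
    outputs, and all outputs of a disabled decoder, are 1.
    """
    return [0 if (enabled and i == select) else 1 for i in range(8)]


def decode_64(row_address):
    """Decode a 6-bit row address into 64 active-low chip selects.

    One first-level decoder handles the upper 3 bits; each of its outputs
    enables one of eight second-level decoders that all see the lower 3 bits.
    """
    if not 0 <= row_address < 64:
        raise ValueError("row address must fit in 6 bits")
    high, low = row_address >> 3, row_address & 0b111
    first_level = one_of_eight(high)                 # decoder 1 of 9
    selects = []
    for enable_n in first_level:                     # decoders 2 through 9
        selects.extend(one_of_eight(low, enabled=(enable_n == 0)))
    return selects


selects = decode_64(37)
print(selects.index(0), selects.count(0))  # 37 1
```

Exactly one of the 64 outputs is low for any row address, which matches the requirement that the chip select for the selected row be low while all others remain high.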

## D. POWER CONSIDERATIONS

The random access memory products discussed above all dissipate power at significantly greater levels per package than most LSI and MSI components. In addition, the array wiring shown provides much denser packing than is commonly achieved with random logic. As a result, the power dissipation per unit area of circuit board can approach an order of magnitude greater than ordinary TTL logic. The system designer must insure an adequate flow of cooling air in any design which uses more than a few packages.

Power distribution is also important. Although most static MOS and bipolar devices draw relatively constant current from the power supply, MOS output circuits, particularly those of p-channel MOS, can contribute significant transients to power supplies. Bipolar devices with relatively low pull-up resistors and those with three-state outputs can also introduce significant transients. To insure proper operation of both memory and the surrounding logic circuitry, power supplies should be adequately bypassed. Ceramic capacitors in the .001 to $.05\ \mu\mathrm{F}$ range are recommended, with one capacitor being used for every one to 10 parts. Bypassing requirements are usually less stringent for multi-layer boards using power and ground planes.
Two of the static memory components described here, the 2102 and the 1101A, offer reduced standby power modes of operation. These parts have been designed such that the memory cells can operate on reduced voltages without loss of data. In addition, the 1101A provides separate power leads for the memory cells and the peripheral decoders.
To enter standby mode, the chips should all be deselected, and in the read mode, before the voltages are lowered. For the 2102, all addresses should be held low to achieve the greatest power savings. Power supplies should switch between the standby and operating voltages without "glitches" or overshoot. At least one microsecond cycle time should be allowed between the last access or write cycle and the lowering of power, and one microsecond cycle time should be allowed after normal voltages are restored before operating the memory. Table 1.2 shows the possible power saving.

Figure 1.9. Chip Select Decoding Using the 3205: (a) 8 chip selects from one 3205; (b) 16 chip selects from two 3205s; (c) 64 chip selects using nine 3205s.

Table 1.2. Low Power Standby - Static MOS Memories

| Device | Normal Power | Standby Power |
| :--- | :---: | :---: |
| 1101A | 720 mW | 370 mW |
| 2102 | 370 mW | 90 mW |
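
The saving implied by Table 1.2 can be expressed as a percentage of normal power. The short Python sketch below uses only the figures from the table; the dictionary and function names are illustrative.

```python
# (normal power, standby power) in milliwatts per package, from Table 1.2
STATIC_MOS_POWER_MW = {"1101A": (720, 370), "2102": (370, 90)}


def standby_saving_percent(normal_mw, standby_mw):
    """Percentage of normal power saved by entering standby mode."""
    return 100.0 * (normal_mw - standby_mw) / normal_mw


for device, (normal, standby) in STATIC_MOS_POWER_MW.items():
    print(f"{device}: {standby_saving_percent(normal, standby):.1f}% saved")
# 1101A: 48.6% saved
# 2102: 75.7% saved
```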

## E. PAGING TECHNIQUES FOR ACCESS TIME REDUCTION

Semiconductor memory systems may exhibit different access times for different address changes. For example, when only the multiplexer control bits are changed for multiplexer connected modules, the access time for new data fetched from within an array is equal to the multiplexer switching time. Similarly, the chip select access for 1101As is much faster than the address-change access. In the 3106/7, access for addresses $A_{0}$ to $A_{3}$ is typically much faster than that of addresses $A_{4}$ through $A_{7}$.

It is often possible to organize a memory system such that some address bit changes result in shorter access times than others. Simple circuits may be added which compare the new address with the old to signal whether fast or slow access will be obtained.
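
The address comparison described above can be modelled in software to show the idea. The Python sketch below is a behavioral illustration only, not a circuit design: the class name `AccessPredictor`, the mask convention, and the choice to treat the first access as slow are all assumptions made for this example.

```python
class AccessPredictor:
    """Signals whether the next access will be fast or slow.

    slow_mask has a 1 in every address bit position whose change forces a
    slow access. For a part where A4 through A7 are the slow bits, the
    mask would be 0xF0 (assuming A0 is the least significant bit).
    """

    def __init__(self, slow_mask):
        self.slow_mask = slow_mask
        self.last_address = None

    def is_fast(self, address):
        previous, self.last_address = self.last_address, address
        if previous is None:
            return False  # no earlier address to compare with: assume slow
        changed_bits = previous ^ address
        return (changed_bits & self.slow_mask) == 0


predictor = AccessPredictor(slow_mask=0xF0)
print([predictor.is_fast(a) for a in (0x12, 0x15, 0x25)])  # [False, True, False]
```

In hardware the same function would be realized with a register holding the previous address and an exclusive-OR comparison on the slow address bits.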

## DYNAMIC MOS MEMORY - THE 1103

The Intel 1103, a 1024-bit, dynamic, silicon-gate, MOS random access memory chip, has become one of the most important semiconductor components ever produced. The memory system designer should consider this component for new random access memory system designs, as it offers significant advantages over cores in cost, system flexibility, and performance. Fourteen out of eighteen computer mainframe manufacturers throughout the world are now using the 1103 memory. With more than a dozen semiconductor manufacturers having announced their intentions to second source the 1103, this product is assured a position as an industry standard.

Because the 1103 is such an important component, and because dynamic MOS circuits require a number of unique considerations, this special section has been devoted exclusively to the 1103. This section is divided into articles on the principles of operation of the 1103, basic system design using the 1103, special considerations such as minimizing power consumption in 1103 memories, and examples of 1103 system design.

## A. OPERATION OF THE 1103

The Intel 1103 is a 1024-bit, random-access, fully decoded, read-write memory that uses for storage the cell shown in Figure A-7b of the Appendix. It is mounted in an 18-lead dual in-line package. Figure 1.10a shows a block diagram of the part, while Figure 1.10b shows the pin connections. A more complete diagram is shown in Figure 1.11.

The memory is organized as 32 rows of 32 cells each. Five address lines, $\mathrm{A}_{0}$ through $\mathrm{A}_{4}$, are decoded to select one row of cells. When accessed, the contents of this row are transferred to a row of 32 refresh amplifiers. In the course of a memory cycle, whether read or write, the data is regenerated and written back into the selected row of cells. Address bits $\mathrm{A}_{5}$ through $\mathrm{A}_{9}$ are decoded to select one refresh amplifier for communication with the data input and output terminals. Data output is sensed as a current. Activation of the "write" clock effectively disconnects the refresh amplifier outputs from the write data lines and permits the signal on the data input line to override the signal at the output of the selected refresh amplifier.
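As a rough illustration of the row and column decoding just described, the sketch below splits a 10-bit address into its row field (A0 through A4) and its refresh-amplifier field (A5 through A9). Treating A0 as the least significant bit of an integer address is an assumption made for the example, as is the function name.

```python
def split_1103_address(addr):
    """Split a 10-bit 1103 address into (row, column) fields.

    Assumes A0 is the least significant bit of addr:
    A0..A4 select one of 32 rows, A5..A9 one of 32 refresh amplifiers.
    """
    if not 0 <= addr < 1024:
        raise ValueError("the 1103 holds 1024 bits; address must be 0..1023")
    row = addr & 0x1F            # A0..A4
    column = (addr >> 5) & 0x1F  # A5..A9
    return row, column


print(split_1103_address(0b0001100101))  # row 5, column 3
```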


Figure 1.10. Block Diagram and External Connections of 1103



Figure 1.12 shows the basic timing of the 1103 memory cycle. The timing values specified for each of the input signals (shown in Table 1.3) are those guaranteed to permit operation over the specified operating temperature range.*

The cycle timing is established by the three clock signals: precharge, cenable, and write. Initially (prior to execution of a memory cycle) all clocks are at their high state, at a voltage approximately equal to $\mathrm{V}_{S S}$.

To begin a cycle, precharge is first brought low, to approximately $\mathrm{V}_{D D}$ potential. Referring to Figures 1.11 and 1.12, this operation activates the row and column decoders, and also charges all read and write data lines negatively (i.e., to the equivalent of a logic "high" state for the p-channel MOS). In the discussion which follows, clocks, etc. are considered "on" at $\mathrm{V}_{D D}$ level, and "off" at $\mathrm{V}_{S S}$ level. "High and low" refer to the magnitude of charge with respect to the MOS substrate.

*While individual units may typically be operable at much higher speeds than the guaranteed values, particularly at room temperature, the designer of a production system should observe all of the limits specified in the data sheet.

The decoder circuitry is somewhat faster than the line charging circuitry, so addresses need not be stable until somewhat after precharge is applied. The system designer may take advantage of this characteristic in several ways; for example, to utilize a less expensive (slower) address driver. Of course, addresses may be provided before precharge is turned on.

After precharge and addresses have been present long enough for the data lines to charge and decoders to stabilize, the cenable clock may be turned on - i.e., dropped to its low state. At this time, the desired read-select line is activated and the read-data line charging circuits are disabled.

These data lines begin to discharge selectively, with the signals on them approaching values corresponding to the complements of the data stored in the selected row of cells.

As the read-data lines selectively discharge, the precharge signal is turned off, i.e., raised high to $\mathrm{V}_{S S}$. This action removes the charging signal on the write-data lines, and closes a path so that the write-data lines may be selectively discharged. The write-select line corresponding to the selected read-select line is also activated, so that restoration of the cell contents occurs. The signal level on the write-data line is a function of the overlap time between precharge and cenable. If this overlap is too short, the read-data lines will not have discharged sufficiently when the discharge path from the refresh amplifiers to the write-data lines is closed. As a result, high (negative) levels written into the cells may be reduced.

If, however, the overlap time is excessive, weak lows within the cells may result in some discharge of the read lines before closure of the write path back, so that cells with weak lows have higher levels (even weaker lows) written back into them, eventually resulting in lows changing to highs. This problem is somewhat aggravated by the small but unavoidable capacitive coupling between the data and select lines and the cell storage capacitor.

When cenable is turned on, a path for current to flow from $\mathrm{V}_{S S}$ to output exists, for one column decoder is enabled and all write-data lines have been charged high (negatively). If the selected cell (the cell at the intersection of the selected column and selected row) contains a low, the write-data line will be discharged after precharge is removed and the output current will be cut off. If, however, the selected cell has been negatively charged, the output current will continue to flow.

Figure 1.12. 1103 Timing - Minimum Read/Write Cycle and Supply Current Variation

Cenable must remain present (after precharge turn off) for sufficient time to allow the contents of the selected row of cells to be refreshed. Even after cenable is turned off (raised to $\mathrm{V}_{S S}$), the addresses should remain for a few nanoseconds ($t_{CA} = 20$ nsec minimum) to allow completion of internal operations. Precharge should not be applied again until cenable has remained off for at least $t_{CP} = 85$ nsec.

If the memory cycle must include a write operation (with or without a read operation) all sequences proceed as above. However, before cenable is removed, but after a sufficient time has been allotted for stabilization of the write data lines, the write line may be activated. As a result, the read data lines are discharged, effectively disconnecting the refresh amplifiers from the write data line. Thus, a direct path from the data input to the selected cell is established.

Table 1.3. Timing Specifications for 1103
AC CHARACTERISTICS $T_{A}=0^{\circ} \mathrm{C}$ to $70^{\circ} \mathrm{C}, \mathrm{V}_{S S}=16 \pm 5 \%,\left(\mathrm{~V}_{B B}-\mathrm{V}_{S S}\right)=3.0 \mathrm{~V}$ to $4.0 \mathrm{~V}, \mathrm{~V}_{D D}=0 \mathrm{~V}$
READ, WRITE, AND READ/WRITE CYCLE

| SYMBOL | TEST | MIN. | TYP. | MAX. | UNIT | CONDITIONS |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| $\mathbf{t}_{\text {REF }}$ | TIME BETWEEN REFRESH |  |  | 2 | ms |  |
| $\mathrm{t}_{\text {AC }}$ (1) | ADDRESS TO CENABLE SET UP TIME | 115 |  |  | ns |  |
| $t_{C A}$ | CENABLE TO ADDRESS HOLD TIME | 20 |  |  | ns |  |
| $t_{\text {PC }}(1)$ | PRECHARGE TO CENABLE DELAY | 125 |  |  | ns |  |
| $t_{OVL}$ | PRECHARGE \& CENABLE OVERLAP, LOW | 25 |  | 75 | ns |  |
| $\mathbf{t}_{\text {CP }}$ | CENABLE TO PRECHARGE DELAY | 85 |  |  | ns |  |
| $\mathrm{t}_{\text {OVH }}$ | PRECHARGE \& CENABLE OVERLAP, HIGH |  |  | 140 | ns |  |

READ CYCLE


WRITE OR READ/WRITE CYCLE


Note 1: These times will degrade by 40 ns (worst case) if the maximum values for $V_{IL}$ (for precharge, cenable and read/write inputs) go to $V_{SS} - 14.2$ V @ $0^{\circ}\mathrm{C}$ and $V_{SS} - 14.5$ V @ $70^{\circ}\mathrm{C}$ as defined on page 2.

A signal on this input will then overwrite the contents of the cell. If the read/write line remains "on" ($\mathrm{V}_{D D}$ potential) after cenable goes off ($t_{CW} > 0$ nsec), those conditions caused by excessive overlap time are aggravated and improper operation of the device may result.

The timing specifications for operating the 1103 are shown in Table 1.3. All the time values listed, except $t_{PO}$, $t_{ACC1}$, and $t_{ACC2}$, are generated by the system in which the 1103 is installed. These times should be kept within their stated limits if proper operation over the full temperature range is to be achieved. In addition to these stated limits, the precharge duty cycle should be held below $40 \%$ to keep power dissipation at an acceptable value.

The time designated $t_{PO}$, which refers to the time delay observed between the turn-off of precharge and the availability of data at the 1103 output terminals, is a characteristic of the part. When operated within the proper voltage and timing limits, this delay is guaranteed not to exceed the stated value of 120 nsec.

The two access times, $t_{ACC1}$ and $t_{ACC2}$, represent a combination of system operating parameters and characteristics of the 1103. Thus, the stated "minimum" values of 300 and 310 nsec represent the shortest access times which can be guaranteed when parts are operated within the limits specified and with rise and fall times of 20 nsec. As will be discussed later, system access times will exceed these values because of the additional delays and tolerance introduced by the rest of the system.
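The system-generated limits of Table 1.3 lend themselves to a simple design-time check. The following is a minimal sketch, assuming the minimum and maximum values from the common section of Table 1.3 and the 40 ns allowance on $t_{AC}$ and $t_{PC}$ quoted for reduced clock levels. The dictionary layout and function name are illustrative only.

```python
# (minimum, maximum) in ns, taken from Table 1.3; None means "not specified".
LIMITS_NS = {
    "t_AC": (115, None),   # address to cenable set up time
    "t_CA": (20, None),    # cenable to address hold time
    "t_PC": (125, None),   # precharge to cenable delay
    "t_OVL": (25, 75),     # precharge and cenable overlap, low
    "t_CP": (85, None),    # cenable to precharge delay
    "t_OVH": (None, 140),  # precharge and cenable overlap, high
}


def timing_violations(timing_ns, reduced_clock_levels=False):
    """Return a list of messages for every limit the proposed timing breaks."""
    problems = []
    for name, (lo, hi) in LIMITS_NS.items():
        value = timing_ns[name]
        # Reduced clock amplitudes add 40 ns to the t_AC and t_PC minimums.
        if lo is not None and reduced_clock_levels and name in ("t_AC", "t_PC"):
            lo += 40
        if lo is not None and value < lo:
            problems.append(f"{name} = {value} ns is below the {lo} ns minimum")
        if hi is not None and value > hi:
            problems.append(f"{name} = {value} ns is above the {hi} ns maximum")
    return problems


cycle = {"t_AC": 115, "t_CA": 20, "t_PC": 125, "t_OVL": 25, "t_CP": 85, "t_OVH": 140}
print(timing_violations(cycle))
print(timing_violations(cycle, reduced_clock_levels=True))
```

A real design would also add driver skew and tolerance to each value before the comparison.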

## B. CHIP SELECT

In operation, the cenable clock also acts as a chip select. That is, precharge and write signals may be applied at their normal times in the cycle, but if cenable is not applied, the part will neither deliver current to the output terminal nor will the contents of any cell be altered. However, no refreshing of memory content takes place during such a cycle.

## C. CLOCK SIGNAL AMPLITUDES

To guarantee operation of the 1103 over the full temperature range, the clock amplitudes (precharge, cenable and write) must be maintained at the levels defined below in Table 1.4.

The reduced performance specification given in Table 1.4b allows simpler clock-driver designs or permits wider power supply variations for a given clock driver design, but requires adding 40 ns to $\mathrm{t}_{\mathrm{PC}}$ and $\mathrm{t}_{\mathrm{AC}}$. As a result, access and cycle times are increased by at least 40 ns , when this less stringent specification is used.

Table 1.4
A. Clock Signal Amplitudes - Full Performance

| Signal | Minimum | Maximum |
| :--- | :--- | :--- |
| $\mathrm{V}_{I H}$ (high) @ $0^{\circ} \mathrm{C}$ | $\mathrm{V}_{S S}-1.0$ | $\mathrm{~V}_{S S}+1.0$ |
| $\mathrm{~V}_{I H}$ (high) @ $70^{\circ} \mathrm{C}$ | $\mathrm{V}_{S S}-0.7$ | $\mathrm{~V}_{S S}+1.0$ |
| $\mathrm{~V}_{I L}$ (low) @ $0^{\circ} \mathrm{C}$ | $\mathrm{V}_{S S}-17$ | $\mathrm{~V}_{S S}-14.7$ |
| $\mathrm{~V}_{I L}$ (low) @ $70^{\circ} \mathrm{C}$ | $\mathrm{V}_{S S}-17$ | $\mathrm{~V}_{S S}-15.0$ |

B. Clock Signal Amplitudes - Reduced Performance

| Signal | Minimum | Maximum |
| :--- | :--- | :--- |
| $\mathrm{V}_{I H}$ (high) @ $0^{\circ} \mathrm{C}$ | $\mathrm{V}_{S S}-1.0$ | $\mathrm{~V}_{S S}+1.0$ |
| $\mathrm{~V}_{I H}$ (high) @ $70^{\circ} \mathrm{C}$ | $\mathrm{V}_{S S}-0.7$ | $\mathrm{~V}_{S S}+1.0$ |
| $\mathrm{~V}_{I L}$ (low) @ $0^{\circ} \mathrm{C}$ | $\mathrm{V}_{S S}-17$ | $\mathrm{~V}_{S S}-14.2$ |
| $\mathrm{~V}_{I L}$ (low) @ $70^{\circ} \mathrm{C}$ | $\mathrm{V}_{S S}-17$ | $\mathrm{~V}_{S S}-14.5$ |

NOTE: The maximum value for $V_{I L}$ and the minimum value of $V_{I H}$ are linear functions of temperature over the range 0 to $70^{\circ} \mathrm{C}$ and can be calculated using a straight line relationship.
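The straight-line relationship in the note can be worked out as below. This is a sketch only: it takes the end-point values from Table 1.4, assumes a nominal $V_{SS}$ of 16 V as the default, and uses function names chosen for the example.

```python
def interpolate(temp_c, at_0c, at_70c):
    """Straight-line interpolation between the 0 C and 70 C table entries."""
    if not 0 <= temp_c <= 70:
        raise ValueError("Table 1.4 only covers 0 to 70 degrees C")
    return at_0c + (at_70c - at_0c) * temp_c / 70.0


def clock_level_limits(temp_c, v_ss=16.0, reduced=False):
    """Return (maximum V_IL, minimum V_IH) in volts for the clock inputs."""
    # Offsets below V_SS: Table 1.4a (full) or Table 1.4b (reduced performance).
    vil_0, vil_70 = (14.2, 14.5) if reduced else (14.7, 15.0)
    vil_max = v_ss - interpolate(temp_c, vil_0, vil_70)
    vih_min = v_ss - interpolate(temp_c, 1.0, 0.7)
    return vil_max, vih_min


vil_max, vih_min = clock_level_limits(35.0)
print(f"At 35 C: V_IL max = {vil_max:.2f} V, V_IH min = {vih_min:.2f} V")
```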

## D. ARRAY CONNECTION OF 1103s

To design a memory system with 1103s, a number of factors must be considered. The system must be capable of generating the timing for the various clocks, etc. and distributing these signals at the proper voltage levels to the 1103s. Suitable power must be distributed to the 1103s and peripheral circuitry, and means must be provided to prevent catastrophic damage to the memory array in the event of peripheral circuit failure. Refresh control circuits may also have to be provided to guarantee that all memory cells retain data.

The nature of the 1103 permits a number of units to be wired in a rectangular array much like that used for bipolar and static MOS circuits. In this way, several thousand words of some selected number of bits can be realized with a single set of peripheral interface circuits. Figure 1.13 shows how 1103s may be so wired. When making access to an array such as that of Figure 1.13, the cenable clock is used to select a row of 1103s, while the 10 address lines select the word from within the selected row. In most cases, the system designer will probably find it most convenient to generate timing signals for the 1103 memory system using one of the standard high speed integrated circuit logic families such as TTL or ECL. To operate the 1103, the generated signals must be converted from the logic family levels to MOS levels. Similarly, the outputs from the 1103 must be converted back to suitable logic levels. In Figure 1.13, the small blocks labelled L represent level shifters from the ECL or TTL levels to MOS levels (e.g., the Intel 3207A), while the blocks labelled S are sense amplifiers (e.g., the Intel 3208A or 3408A) which convert the output current from the 1103s back to TTL or ECL levels.

The power supply connections are not shown in Figure 1.13. In practice, $\mathrm{V}_{B B}$, $\mathrm{V}_{S S}$, and $\mathrm{V}_{D D}$ connections are each made common throughout the array. It is especially important to provide adequate distribution for these supplies as will be described in the section on printed circuit layout.

Figure 1.13. 1103 Array

## D.1 Level Shift Circuits for TTL Interface

The function of the level shift circuits is to convert from the TTL or ECL levels used by the system to MOS levels. To avoid degrading the performance of the system, fast rise and fall times must be maintained into the load represented by the array of 1103s. The level shifter must also be capable of holding the shifted level within rated tolerances for the signal line being driven. The levels for clock and address lines are referenced with respect to $V_{SS}$. However, in actual practice, most level shifters use ground or the $V_{DD}$ supply as a reference value for the low level. As a result, the offset from ground which is characteristic of the level shifter, when combined with normal power supply variation, may result in an insufficiently negative level for some of the input signals. To give the system designer some freedom in his choice of drivers, the 1103 is characterized for two different levels of clock drive signals as outlined in Table 1.4. In many cases, a driver with a higher offset voltage may have lower cost or better load drive capability than those with low offset voltages. The designer may choose to sacrifice some speed to permit use of a net lower cost set of level shifters. Tables 1.4, 1.5, and 1.6 list some of the characteristics of the 1103 which are relevant to level shifter design.

The signal lines from an 1103 memory array represent significant capacitive loads. For example, consider a 4096 word, 16 bit per word memory array. If a connection equivalent to that of Figure 1.13 is used, the array will consist of 4 rows of 16 devices each. The worst case capacitances due to the memory parts are then as follows:

| Signal line | Devices per line $\times$ capacitance | Load |
| :---: | :---: | :---: |
| Each of 10 address lines | $64 \times 7$ pF | 448 pF |
| Each of 4 precharge lines | $16 \times 18$ pF | 288 pF |
| Each of 4 cenable lines | $16 \times 18$ pF | 288 pF |
| Each of 4 write lines | $16 \times 15$ pF | 240 pF |
| Each of 16 data input lines | $4 \times 5$ pF | 20 pF |
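The worst-case loads for the 4096 word by 16 bit example can be reproduced with a few lines of arithmetic. The sketch below assumes the plastic-package maximum capacitances of Table 1.6 and the wiring of Figure 1.13, in which address lines run to every device, clock lines run along one row, and each data input line serves one device per row. The function name and dictionary keys are made up for the example.

```python
# Worst-case (plastic package maximum) pin capacitances in pF, from Table 1.6.
C_MAX_PF = {"address": 7, "precharge": 18, "cenable": 18, "write": 15, "data_in": 5}


def array_line_loads(words, bits_per_word, chip_words=1024):
    """Return the device capacitance in pF seen by each kind of signal line."""
    if words % chip_words:
        raise ValueError("word count must be a multiple of the chip size")
    rows = words // chip_words      # rows of 1103s in the array
    devices = rows * bits_per_word  # total number of 1103s
    return {
        "address": devices * C_MAX_PF["address"],            # common to every device
        "precharge": bits_per_word * C_MAX_PF["precharge"],  # one line per row
        "cenable": bits_per_word * C_MAX_PF["cenable"],      # one line per row
        "write": bits_per_word * C_MAX_PF["write"],          # one line per row
        "data_in": rows * C_MAX_PF["data_in"],               # one device per row
    }


for line, load in array_line_loads(4096, 16).items():
    print(f"{line:10s} {load:4d} pF")
```

Wiring capacitance is not included and must be added for a real layout.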

Table 1.5. Input Level Requirements - 1103 Address and Data Lines

| SYMBOL | TEST | MIN. | TYP. | MAX. | UNIT | CONDITIONS |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| $I_{LI}$ | INPUT LOAD CURRENT (ALL INPUT PINS) |  |  | 1 | $\mu \mathrm{A}$ | $V_{IN}=0$ V |
| $I_{LO}$ | OUTPUT LEAKAGE CURRENT |  |  | 1 | $\mu \mathrm{A}$ | $V_{OUT}=0$ V |
| $V_{IL1}{}^{(1)}$ | INPUT LOW VOLTAGE (ALL ADDRESS \& DATA-IN LINES) | $V_{SS}-17$ |  | $V_{SS}-14.2$ | V | $T_{A}=0^{\circ} \mathrm{C}$ |
| $V_{IL2}{}^{(1)}$ | INPUT LOW VOLTAGE (ALL ADDRESS \& DATA-IN LINES) | $V_{SS}-17$ |  | $V_{SS}-14.5$ | V | $T_{A}=70^{\circ} \mathrm{C}$ |
| $V_{IL3}{}^{(1,2)}$ | INPUT LOW VOLTAGE (PRECHARGE, CENABLE \& READ/WRITE INPUTS) | $V_{SS}-17$ |  | $V_{SS}-14.7$ | V | $T_{A}=0^{\circ} \mathrm{C}$ |
| $V_{IL4}{}^{(1,2)}$ | INPUT LOW VOLTAGE (PRECHARGE, CENABLE \& READ/WRITE INPUTS) | $V_{SS}-17$ |  | $V_{SS}-15.0$ | V | $T_{A}=70^{\circ} \mathrm{C}$ |
| $V_{IH1}{}^{(1)}$ | INPUT HIGH VOLTAGE (ALL INPUTS) | $V_{SS}-1$ |  | $V_{SS}+1$ | V | $T_{A}=0^{\circ} \mathrm{C}$ |
| $V_{IH2}{}^{(1)}$ | INPUT HIGH VOLTAGE (ALL INPUTS) | $V_{SS}-0.7$ |  | $V_{SS}+1$ | V | $T_{A}=70^{\circ} \mathrm{C}$ |

Note 1: The maximum values for $V_{IL}$ and the minimum values for $V_{IH}$ are linearly related to temperature between $0^{\circ} \mathrm{C}$ and $70^{\circ} \mathrm{C}$. Thus any value in between $0^{\circ} \mathrm{C}$ and $70^{\circ} \mathrm{C}$ can be calculated by using a straight-line relationship.

Note 2: The maximum values for $V_{IL}$ (for precharge, cenable \& read/write) may be increased to $V_{SS}-14.2$ @ $0^{\circ} \mathrm{C}$ and $V_{SS}-14.5$ @ $70^{\circ} \mathrm{C}$ (same values as those specified for the address \& data-in lines) with a 40 ns degradation (worst case) in $t_{AC}$, $t_{PC}$, $t_{RC}$, $t_{WC}$, $t_{RWC}$, $t_{ACC1}$ and $t_{ACC2}$.

Table 1.6. Capacitance Value of 1103 Signal Lines*

| SYMBOL | TEST | TYP. | PLASTIC PKG. MAX. | CERAMIC PKG. MAX. | UNIT | CONDITIONS |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| $C_{AD}$ | ADDRESS CAPACITANCE | 5 | 7 | 12 | pF | $V_{IN}=V_{SS}$ |
| $C_{PR}$ | PRECHARGE CAPACITANCE | 15 | 18 | 19.5 | pF | $V_{IN}=V_{SS}$ |
| $C_{CE}$ | CENABLE CAPACITANCE | 15 | 18 | 21 | pF | $V_{IN}=V_{SS}$ |
| $C_{RW}$ | READ/WRITE CAPACITANCE | 11 | 15 | 19.5 | pF | $V_{IN}=V_{SS}$ |
| $C_{IN1}$ | DATA INPUT CAPACITANCE | 4 | 5 | 7.5 | pF | CENABLE $=0$ V, $V_{IN}=V_{SS}$ |
| $C_{IN2}$ | DATA INPUT CAPACITANCE | 2 | 4 | 6.5 | pF | CENABLE $=V_{SS}$, $V_{IN}=V_{SS}$ |
| $C_{OUT}$ | DATA OUTPUT CAPACITANCE | 2 | 3 | 7 | pF | $V_{OUT}=0$ V |

Conditions for all entries: $f=1$ MHz, all unused pins at A.C. ground.

In addition to these loads, the wiring capacitance of the printed board, etc., used to connect the array must also be considered. To estimate the average charging and discharging current $I$ required to change the voltage by a value of $\Delta E$ on a capacitor of value $C$ in a time $\Delta t$, the relationship $I = C \Delta E / \Delta t$ may be used. If $C$ is in picofarads (pF), $E$ in volts and $t$ in ns, the units of current $I$ will be mA. Thus, for 20 ns rise and fall times and 16 volt transitions, each of the precharge and cenable lines will require a drive current which averages about 230 mA over the transition. The peak charging current may be somewhat higher than this value. Thus, the circuits used to drive 1103 arrays must be capable of providing high peak currents.
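The relationship $I = C \Delta E / \Delta t$ is easy to evaluate for each line of the array. The sketch below simply restates it with the unit convention given in the text (pF, volts, ns, giving mA); the function name is chosen for the example, and the address-line case is an extra illustration using the $64 \times 7$ pF load.

```python
def avg_drive_current_ma(c_pf, delta_v, delta_t_ns):
    """Average charging current in mA for C in pF, voltage step in V, time in ns."""
    return c_pf * delta_v / delta_t_ns


# Precharge or cenable line of the 4k x 16 example: 288 pF, 16 V swing, 20 ns edge.
print(f"{avg_drive_current_ma(288, 16, 20):.1f} mA")
# An address line carrying 64 x 7 pF = 448 pF under the same edge conditions.
print(f"{avg_drive_current_ma(448, 16, 20):.1f} mA")
```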

Although a capacitive load dissipates no power, when a capacitive load is charged, energy is drawn from the driver power supply. Some of this energy is dissipated by the driver while the remainder is stored in the capacitor. When the capacitor is discharged, its stored energy is dissipated in the driver. If a capacitor C is first charged to a voltage V and then discharged back to ground at a rate of $f$ times per second, the minimum average current $I$ drawn from the driver power supply will be given by $I=f C V$. If $f$ is in megahertz, C in picofarads, and V in volts, I will be in microamps. If the power supply is also of voltage $V$, the power dissipated by the driver due to the capacitive load is given by $P=\mathrm{fCV}^{2}$, where $P$ is in microwatts if the same units listed above are used. Thus, worst case dissipation due to this factor for a precharge or cenable driver in the $4 \mathrm{k} \times 16$ memory listed above would be about 120 mW per driver at $\mathrm{V}=16$ volts, $\mathrm{C}=288 \mathrm{pF}$ and $\mathrm{f}=1.6 \mathrm{MHz}$. This dissipation must be added to any other dissipation associated with the driver.
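The two expressions $I = fCV$ and $P = fCV^2$ can be checked numerically against the 120 mW figure quoted above. This is a minimal sketch using the same unit convention as the text (MHz, pF, volts); the function names are illustrative.

```python
def supply_current_ua(f_mhz, c_pf, v):
    """Minimum average supply current in microamps: I = f * C * V."""
    return f_mhz * c_pf * v


def driver_dissipation_mw(f_mhz, c_pf, v):
    """Driver dissipation due to the capacitive load: P = f * C * V^2, in mW."""
    return f_mhz * c_pf * v * v / 1000.0  # microwatts to milliwatts


# Precharge or cenable driver in the 4k x 16 example.
print(f"I = {supply_current_ua(1.6, 288, 16) / 1000.0:.2f} mA")
print(f"P = {driver_dissipation_mw(1.6, 288, 16):.0f} mW")
```

The computed dissipation is just under 118 mW, which the text rounds to about 120 mW.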

The capacitive loading associated with the array may also produce ringing if the leads from the drivers to the array have any significant inductance. Series damping resistors must usually be used. The choice of the damping resistor will usually be a function of the layout used and the number of 1103s driven. In Figures 1.14 and 1.15 a nominal value of 10 ohms is shown in several of the circuits. In some cases a different value may be found to give better results. Figure 1.14 illustrates a number of level shifters for use with TTL logic.

Table 1.7 lists some of the characteristics of these drivers. The level shifter of Figure 1.14a is used primarily for driving data input lines. In very small arrays, or arrays where degraded performance can be tolerated, it may be used for driving address lines as well. A load resistor of 1 k is shown; however, other values may be used. Higher values reduce power dissipation but degrade speed. Lower load resistor values will speed up the level shifter, but increase power dissipation and may, if too low, exceed the current sinking capability of the driving gate.

The capacitive load driving capability of the driver of Figure 1.14a may be increased by adding a booster stage, as shown in Figure 1.14b. The complementary emitter follower considerably increases capacitive drive capability, as can be seen from Table 1.7. However, the emitter base drop of transistor $\mathrm{Q}_{2}$ in Figure 1.14b raises the low output level to about +1 volt. If VSS is allowed to fall to 15.2 volts, the output may fall out of the specified operating range at elevated temperatures and degrade the performance.

When using a driver such as that of Figure 1.14b, the designer has several choices:

1. He can restrict the allowable range (or tolerances) of the $V_{SS}$ supply to values which do not require speed degradation. For example, a $V_{SS}$ range of 15.7 to 16.8 allows precharge, etc. inputs not to exceed $V_{SS} - 14.7$ and still maintains $V_{SS}$ in the proper range.
2. He can raise $V_{SS}$ by approximately 1 V and compensate by biasing the $V_{DD}$ return from the 1103 array at about +1 V. (A diode in series with the $V_{DD}$ return results in a nominal 0.7 V $V_{DD}$ return value, but requires some reduction in allowable $V_{SS}$ tolerances.)

Figure 1.14. TTL to 1103 Interface Circuits

Table 1.7. Typical Output Characteristics of Level Shift Circuits

| Parameter | Condition | Driver (A) | Driver (B) | Driver (C) |
| :---: | :---: | :---: | :---: | :---: |
| I. Low Voltage ($V_{OL}$) | No Load | 0.25 V | $\approx 0$ V | 0.04 V |
|  | Sinking 3 mA | - | +0.97 V | 0.14 V |
| II. High Voltage ($V_{OH}$) | No Load | $V_{SS}+0.65$ V | $V_{SS} \pm 0.01$ V | $V_{SS} \pm 0.01$ V |
|  | Source 3 mA | - | $V_{SS}-0.05$ V | $V_{SS}-0.12$ V |
|  | Sink 0.5 mA |  | $V_{SS}+0.01$ V (1) | $V_{SS}+0.45$ V (1) |
| III. Rise Time ($t_{R}$) (2) | 10 pF | 25 ns | 25 ns | 10 ns |
|  | 50 pF | 90 ns | 25 ns | 15 ns |
|  | 100 pF | - | 25 ns | 20 ns |
|  | 200 pF | - | 25 ns | 25 ns |
|  | 470 pF |  | 40 ns | 50 ns |
| IV. Fall Time ($t_{F}$) (2) | 10 pF | 12 ns | 6 ns | 6 ns |
|  | 50 pF | 20 ns | 8 ns | 9 ns |
|  | 100 pF | - | 10 ns | 12 ns |
|  | 200 pF | - | 14 ns | 20 ns |
|  | 470 pF |  | 24 ns | 35 ns |

1. This voltage level is a function of transistor reverse gain. A diode clamp to $V_{SS}$ is recommended.
2. These values are measured between the $10 \%$ and $90 \%$ points.

The driver of Figure 1.14b holds the output line very close to $V_{S S}$ when the output is high. However, if the layout permits positive transients to be capacitively coupled to the driver line, the driver may not be capable of holding the line within tolerance. The reason for this is that the positive transient clamping capability of the driver is a function of the reverse gain of transistor $Q_{2}$. To insure that the high output remains within limits, it may be necessary to add a clamp diode (shown dotted in Figure 1.14b from the output line to $V_{S S}$ ).

The driver shown in Figure 1.14c has somewhat reduced capacitive load driving capability from that of Figure 1.14b, but does not have the offset problem for low outputs. Again, positive transient clamping for high outputs is limited by the reverse gain of transistor $\mathrm{Q}_{2}$ so a diode clamp to $\mathrm{V}_{S S}$ may be desirable.

Figure 1.14d shows a monolithic integrated quad level shifter and driver, the 3207A, that is available from Intel. Each of the 4 drivers in the 16 pin dual in-line package can drive up to 200 pF, with rise and fall times under 30 and 35 ns, respectively.

## D.2 Level Shift Circuits for ECL Interface

In Figure 1.15, a circuit for converting from ECL levels (ECL biased between ground and -5.2 V) to MOS levels ($V_{DD}=$ GND, $V_{SS}=+16$) is shown together with the performance characteristics of the circuit. This circuit does not have significant negative offset, but may require a diode clamp to $\mathrm{V}_{S S}$ to reduce positive going, capacitively coupled noise on the output line. A negative clamp to ground may also be desirable, as ringing can sometimes result in charging the output line to an excessively negative value. Excessively negative clock values may effectively reduce the timing tolerances for $\mathrm{t}_{O V L}$.


Figure 1.15. ECL to MOS Level Converter

## D.3 Sense Amplifier Circuits

The output of the 1103 is a current which is present for a logic " 1 " output and absent for a logic " 0 ". This current is usually converted to a logic level by a sensitive differential amplifier which measures the voltage drop across a resistor of a few hundred ohms. Larger resistor values slow performance, while smaller values result in very small signals which are difficult to sense. Table 1.8 lists some of the relevant 1103 output characteristics.

Figure 1.16 shows several different circuits for converting the 1103 output current back to standard logic levels. The first choice for a sense amplifier is the Intel 3208A or 3408A. The 3208A, shown in Figure 1.16a, is a hex sense amplifier specifically designed for use with the 1103. The 3408A, shown in Figure 1.16b, is similar to the 3208A, but includes an internal hex latch so that the 1103 outputs may be stored and delivered to the data bus at a later time.

In Figure 1.16a, a sense resistor $\mathrm{R}_{\boldsymbol{s}}$ is shown. In practice each sense input which is used would have a sense resistor $\mathrm{R}_{\boldsymbol{s}}$ to ground. Reference voltage for the 3208A (or 3408A) is generated by the voltage divider consisting of resistors $\mathrm{R}_{1}$ and $\mathrm{R}_{2}$.

The reference voltage of the 3208A and 3408A may be derived from either $\mathrm{V}_{S S}$ or $\mathrm{V}_{C C}$. The output current of the 1103 is directly proportional to $\mathrm{V}_{S S}$. To maintain close tracking of 1103 output current and sense amp. reference voltage, it is recommended that the reference voltage for the sense amp. be derived from $V_{S S}$. Due to the sensitivity of the sense amplifier input, the reference voltage must be well filtered and decoupled from noise and ripple. Any noise at the reference line may be falsely recognized as an input signal.

The choice of sense resistor for use with the 3208A and 3408A is influenced by two considerations. Larger values of resistor produce larger 1103 output voltages, but increase the time constant of the data output bus, and therefore increase the access time of the memory. Lower values result in high speed, but reduce the noise immunity of the system. If the resistor is too small, the minimum output "1" level of the 1103 may not be large enough to exceed the "1" threshold of the sense amp. An optimum value of sense resistor and reference voltage may be defined as the values which result in the least added circuit delay given a desired amount of noise immunity. The optimum value may be computed easily if the following assumptions are used (also see note below):

1. The added delay due to an increase in data bus time is approximately equal to the increase in time constant.
2. The added delay due to a change in current sensing level for the 1103 can be estimated at $.1 \mathrm{~ns} / \mu \mathrm{A}$ - that is, if the equivalent current level for sensing zeroes from the 1103 is reduced by $10 \mu \mathrm{~A}$, the time delay is increased by 1 nsec.

Based on these assumptions, the optimum value of reference voltage is given by the equation:

$$
V_{ref}=I_{min} \sqrt{\frac{2 V_{N}+V_{T}}{10\, C}}
$$

where $V_{ref}$ is in millivolts and

$I_{min}=$ minimum "1" level output current of the 1103 in $\mu \mathrm{A}$.

$V_{N}=$ desired noise immunity in mV.

$V_{T}=$ maximum threshold variation $\left(V_{SH}-V_{SL}\right)$ in mV: 50 mV for the 3208A, 60 mV for the 3408A.

$C=$ data bus capacitance in pF.

If $V_{ref}$ falls outside the range of 100 to 200 mV, then the value (100 mV or 200 mV) which is closest to $V_{ref}$ should be used as $V_{ref}$.

Once $V_{ref}$ has been chosen, the value for the sense resistor $R_{S}$ is given by

$$
R_{S}=\frac{1000 \times\left(V_{ref}+V_{N}\right)}{I_{min}}
$$

where $R_{S}$ is in ohms, $V_{ref}$ and $V_{N}$ are in millivolts, and $I_{min}$ is in $\mu \mathrm{A}$.

Using these assumptions, the increase in access time $\Delta T_{A}$ of the memory is given by

$$
\Delta T_{A}=t_{sa}+\left[I_{ref}-\left(\frac{V_{ref}-V_{T}-V_{N}}{R_{S} \times 0.001}\right)\right] \times \frac{1}{10}+\left[\frac{R_{S}-100}{1000}\right] \times C
$$

where $t_{sa}=$ sense amplifier delay in nsec, $I_{ref}=$ defined measuring current for 1103 access time, and $V_{ref}$, $I_{min}$, $R_{S}$, $V_{T}$ and $C$ are as defined above.

As an example, consider a data bus capacitance of 100 pF and desired noise immunity of 25 mV . Then for an 1103

Table 1.8. Output Characteristics of the 1103

| ${ }^{1} \mathrm{OH} 1$ | OUTPUT HIGH CURRENT | 600 | 900 | 4000 | $\mu \mathrm{A}$ | $\begin{aligned} & \mathrm{T}_{\mathrm{A}}=25^{\circ} \mathrm{C} \\ & \mathrm{~T}_{\mathrm{A}}=70^{\circ} \mathrm{C} \end{aligned}$ | $\mathrm{R}_{\mathrm{S}}=100 \Omega$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| ${ }^{1} \mathrm{OH} 2$ | OUTPUT HIGH CURRENT | 500 | 800 | 4000 | $\mu \mathrm{A}$ |  |  |
| ${ }^{1} \mathrm{OL}$ | OUTPUT LOW CURRENT |  | Note |  |  |  |  |
| $\mathrm{V}_{\mathrm{OH} 1}$ | OUTPUT HIGH VOLTAGE | 60 | 90 | 400 | mV | $\mathrm{T}_{\mathrm{A}}=25^{\circ} \mathrm{C}$, |  |
| $\mathrm{V}_{\mathrm{OH} 2}$ | OUTPUT HIGH VOLTAGE | 50 | 80 | 400 | mV | $\mathrm{T}_{\mathrm{A}}=70^{\circ} \mathrm{C}, \pm$ |  |
| $\mathrm{V}_{\mathrm{OL}}$ | OUTPUT LOW VOLTAGE |  | Notr: |  |  |  |  |

NOTE: The output current for the low state is the sum of
the 1103 leakage and externally coupled noise. $V_{O L}=I_{O L} \cdot R_{S}$
with $\mathrm{I}_{\text {min }}=500 \mu \mathrm{~A}$ and $\mathrm{I}_{r e f}=400 \mu \mathrm{~A}$,
$V_{\text {ref }}=500 \sqrt{\frac{100}{10 \times 100}}=150 \mathrm{mV}$
$R_{S}=\left(\frac{175}{500}\right) \times 1000=350 \Omega$
$\Delta \mathrm{T}_{A}=20+\left[\frac{400}{10}-\frac{750}{35}\right]+\frac{350-100}{1000} \times 100=64 \mathrm{nsec}$

Thus the sense amp (3208A) circuitry contributes an additional 64 ns to the access time of the system.
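The worked example above can be checked numerically. The following is a minimal sketch, not a design tool: the function names are illustrative, $\mathrm{V}_{ref}$ is taken as the 150 mV already chosen in the example (the formula for it is not repeated here), and the 20 nsec sense amplifier delay and 50 mV threshold variation are the 3208A figures used in the example.

```python
def clamp_vref_mv(vref_mv, low_mv=100.0, high_mv=200.0):
    """Limit V_ref to the 100-200 mV range called for in the text."""
    return min(max(vref_mv, low_mv), high_mv)


def sense_resistor_ohms(vref_mv, vn_mv, imin_ua):
    """R_S = 1000 * (V_ref + V_N) / I_min, with mV and uA inputs."""
    return 1000.0 * (vref_mv + vn_mv) / imin_ua


def access_time_increase_ns(tsa_ns, iref_ua, vref_mv, vt_mv, vn_mv, rs_ohms, c_pf):
    """Delta T_A: sense amp delay + current term + data bus RC term."""
    current_term = (iref_ua - (vref_mv - vt_mv - vn_mv) / (rs_ohms * 0.001)) / 10.0
    bus_term = (rs_ohms - 100.0) / 1000.0 * c_pf
    return tsa_ns + current_term + bus_term


# Values from the worked example: C = 100 pF, V_N = 25 mV,
# I_min = 500 uA, I_ref = 400 uA, V_T = 50 mV, t_sa = 20 ns.
vref = clamp_vref_mv(150.0)
rs = sense_resistor_ohms(vref, vn_mv=25.0, imin_ua=500.0)
delta_ta = access_time_increase_ns(20.0, 400.0, vref, 50.0, 25.0, rs, 100.0)
print(f"R_S = {rs:.0f} ohm, delta T_A = {delta_ta:.1f} ns")
```

With these inputs the sketch gives a 350 ohm sense resistor and an access time increase of roughly 63.6 nsec, which rounds to the 64 nsec quoted.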
[Figure 1.16, panel titles: a. 3208A Block Diagram; b. 3408A Block Diagram; c. Using SN75107/75108; d. Using Low Power TTL; f. Conversion to ECL Levels]

Figure 1.16 also shows several other circuits for converting the output current to logic levels.

The circuit shown in Figure 1.16c uses one half of an SN75107 or SN75108 line receiver as a sense amplifier. The SN75108 requires a differential signal of at least 25 mV for guaranteed operation. The high input side may require up to $75 \mu \mathrm{~A}$ input current. The SN75107/8 may be used with balanced or unbalanced input. In Figure 1.16c the SN75108 is shown with a balanced input circuit. Current from the 1103 array is sensed across resistor $\mathrm{R}_{S}$.
The reference threshold is established by the combination of resistors $\mathrm{R}_{1}$ and $\mathrm{R}_{2}$, with $\mathrm{R}_{2}$ usually made equal to $\mathrm{R}_{S}$. Using the minimum " 1 " level current of $500 \mu \mathrm{~A}$ from the 1103 and the 25 mV and $75 \mu \mathrm{~A}$ characteristic of the SN75108, the reference level must be less than $(.50-.075) \mathrm{R}_{S}-25 \mathrm{mV}$, where $\mathrm{R}_{S}$ is in ohms. However, the reference level must be in excess of 25 mV for proper detection of zeroes. For this reason, $\mathrm{R}_{S}$ must be larger than $100 \Omega$ when using the 75108. A value of 200 to $400 \Omega$ is more appropriate.
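The two limits on the reference level can be combined into a window, which shows why a 100 ohm sense resistor is too small. The sketch below is only an illustration of the inequality stated above; the function names are hypothetical, and the 0.50 mA, 0.075 mA and 25 mV defaults are the figures quoted in the text.

```python
def reference_window_mv(rs_ohms, imin_ma=0.50, ibias_ma=0.075, vdiff_mv=25.0):
    """Return (lower, upper) limits on the reference level in mV.

    lower: the reference must exceed the 25 mV differential for zeroes.
    upper: (I_min - I_bias) * R_S - 25 mV for guaranteed detection of ones.
    Currents in mA times resistance in ohms give millivolts.
    """
    lower = vdiff_mv
    upper = (imin_ma - ibias_ma) * rs_ohms - vdiff_mv
    return lower, upper


def min_sense_resistor_ohms(imin_ma=0.50, ibias_ma=0.075, vdiff_mv=25.0):
    """Smallest R_S for which the window is non-empty (upper == lower)."""
    return 2.0 * vdiff_mv / (imin_ma - ibias_ma)


for rs in (100.0, 200.0, 400.0):
    lower, upper = reference_window_mv(rs)
    state = "usable" if upper > lower else "no valid reference"
    print(f"R_S = {rs:.0f} ohm: {lower:.1f} to {upper:.1f} mV ({state})")
print(f"window opens above about {min_sense_resistor_ohms():.0f} ohm")
```

Under these assumptions the window only opens a little above 100 ohms, and a value in the suggested 200 to 400 ohm range leaves tens of millivolts of margin for setting the reference.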

In some systems, the effects of capacitively coupled noise on the data output line may be reduced by adding a dummy line in the array. This dummy line runs parallel to the sense line, but makes no connection. The noise signals then couple more or less equally into both lines and appear as common mode signals at the inputs to the SN75108. However, the output currents from the 1103s couple into only one line, and therefore contribute to a differential between the lines.
The SN75108 may be also used as an unbalanced sense amplifier by grounding one of the differential inputs and adding a negative bias current to the sense line input. This current may be supplied by a suitable resistor returned to a negative supply voltage.

The two versions of the part differ in that the SN75107 has a totem-pole TTL output, while the SN75108 has an open collector output. The SN75108 may be used to advantage in large systems, where several memory modules may share a common data bus.
A comparator, such as the $\mu \mathrm{A} 710$ or MC1710, could be used as a sense amplifier in a similar fashion. The tighter differential voltage tolerance ( 6.5 mV vs. 25 mV ) and lower bias current ($40 \mu \mathrm{~A}$ vs. $75 \mu \mathrm{~A}$) permit lower sense resistor values. However, the lack of strobe signal input and/or open collector output, and the more awkward power supplies required by the 710 make it a less desirable choice.
Unlike the 3208A and 3408A, the SN75108 requires an additional negative supply for its operation. Where performance can be sacrificed, sensing can also be done without an extra supply by using circuits such as those of Figures 1.16d and 1.16e. In Figure 1.16d, low power TTL is biased to allow direct sensing of the output current. In Figure 1.16e a discrete transistor circuit is used for sensing. Each of these circuits adds significantly to the access time of the system.

Figure 1.16f shows one possible connection for conversion to ECL levels. In this circuit the 1103 output terminals are biased about 1.5 V negative with respect to $\mathrm{V}_{DD}$. The relatively high impedance required adds delay to the system as described above. Another means for conversion to ECL levels is to use a discrete pnp transistor difference amplifier between the 1103 sense line and the ECL gate.

## E. SYSTEM TIMING

Many potential users of the 1103 have experienced some confusion as to the meaning of "minimum" and "maximum" as used in relation to the 1103 specifications. For a system-generated time value, such as $\mathrm{t}_{PW}$ or $\mathrm{t}_{OVL}$, the value, as measured at the terminals of any 1103, must fall within the stated limits. For example, the time between turn-on of cenable and turn-off of precharge ($\mathrm{t}_{OVL}$) must fall within the range of 25 to 75 nsec. Operation outside of this range may cause inferior performance. Those time values which represent direct measurement of 1103 properties, such as $\mathrm{t}_{PO}$ (the time from precharge turn-off until data is available at the output), are specified as not to exceed the stated maximum values. However, 1103 access and cycle times ($\mathrm{t}_{ACC1}, \mathrm{t}_{ACC2}, \mathrm{t}_{WC}, \mathrm{t}_{TWC}, \mathrm{t}_{RC}$) represent a combination of system driving characteristics and 1103 characteristics. The minimum values stated represent the minimum system cycle time (or system access time) for which all parts are guaranteed to operate. These minimum values are achieved by operating with all drive signal timing values set to their minima and the access time strobing signal set at a value corresponding to the maximum $\mathrm{t}_{PO}$. Of course, by selecting devices, a cycle faster than the stated minimum might be achieved. In a practical system, skews in driver and logic circuitry will usually prevent operation with all time values set at their minima, and delays in sense amplifiers and latches will add to access time.

In generating timing signals for an 1103 memory system, the skews (variations in the delays of gates and drivers) in the timing path must be considered. As an example, consider a system in which the driver delay (from input transition to start of rise or fall) is $15 \pm 5 \mathrm{~ns}$. In addition, let each timing signal pass through two gates, each with delays in the range of 5 to 10 ns. Thus, each timing signal generated experiences a delay of $30 \pm 10 \mathrm{~ns}$ and a rise time of $20 \pm 5 \mathrm{~ns}$. In Figure 1.17, each of the timing signals has been expanded to include the effects of these skews while still maintaining the timing requirements. All signals are assumed to have independent skews.
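The $30 \pm 10$ ns figure follows from adding the best-case and worst-case delays of each element in the path. A minimal sketch of that bookkeeping is shown below; the function name and the tuple representation of a delay range are illustrative choices, not part of the original design procedure.

```python
def path_delay_range_ns(driver=(10.0, 20.0), gates=((5.0, 10.0), (5.0, 10.0))):
    """Sum (min, max) delays of a driver and the gates ahead of it.

    The default driver range is 15 +/- 5 ns and each gate is 5 to 10 ns,
    as in the example in the text.
    """
    low = driver[0] + sum(g[0] for g in gates)
    high = driver[1] + sum(g[1] for g in gates)
    return low, high


low, high = path_delay_range_ns()
nominal = (low + high) / 2.0
spread = (high - low) / 2.0
# Two signals with independent skews can be displaced relative to each
# other by the full width of the delay range, in either direction.
relative_skew = high - low
print(f"delay = {nominal:.0f} +/- {spread:.0f} ns, relative skew up to +/- {relative_skew:.0f} ns")
```

The last figure is the reason the overlap interval is so sensitive: with independent skews, two edges launched a fixed time apart may arrive up to 20 ns closer together or further apart than intended, before rise time variation is considered.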

The delay spread for gates within a single package will typically be much less than the spread for the logic family under consideration. In some systems, it may be possible to take advantage of this reduced spread to achieve higher performance. Using the skew figures listed above, the shortest read/write cycle becomes almost 700 ns in length. A read-only cycle could be as little as 580 ns long. Of course, each system designer must use figures characteristic of his own system.

## E. 1 Delay Line Controllers

There are several techniques for generating the timing signals needed for the system. One such technique, shown in Figure 1.18, uses a delay line and controller to produce cycle timing signals. This circuit also includes control for refresh cycles, and generation of a ready/busy signal. At the end of each normal (non-refresh) cycle, the memory goes "ready", signalling readiness to accept another memory cycle request. A single shot multivibrator controls the timing of refresh cycle requests: a cycle is requested every $60 \mu \mathrm{~s}$. If such a refresh request is pending when the memory becomes ready, the refresh request is accepted and a refresh cycle is executed. However, the memory control continues to indicate "ready". Any normal requests received during the execution of the refresh cycle are acknowledged by the controller (by going "busy"), but are not executed until after completion of the refresh cycle. Thus, refresh cycles are visible to the normal requester only as occasionally longer access and cycle times.

Figure 1.17. 1103 System Timing

Figure 1.18. Delay Line Controller for Generating System Timing Signals

Cycle timing in the circuit of Figure 1.18 is established by launching a signal down the delay line. This signal, a square pulse of one half cycle in length, is used to generate the various timing signals by performing logic at the various taps on the line. It should be noted that the most critical timing is associated with the overlap interval, i.e., the time between the turn-on of cenable and the turn-off of precharge. As shown in Figure 1.17, with the skews shown, the source timing signals must have a spacing of exactly 70 ns. To make the system requirements more realistic, several choices are open to the designer:
a. Provide several delay line taps for precharge turn-off and cenable turn-on timing. The appropriate tap is selected for proper overlap values.
b. Generate overlap timing at (or closer to) the drivers to reduce the number of gates in the timing path. A suitably delayed signal derived from cenable turn-on may be used to control precharge turn-off. A variation of this technique utilizes a discrete circuit to detect turn-on of cenable at the 1103's MOS levels themselves. The output of this circuit is used to control precharge turn-off. Figure 1.19 shows such a circuit. The delay line can usually be a single LC lumped stage.


Figure 1.19. Local Control of $\mathrm{t}_{OVL}$

The Intel 3207A driver provides input leads which can tolerate MOS level inputs. As a result, the logic of Figure 1.19 can be implemented with fewer components when using the 3207A.
c. Match skews more closely in the path from timing generation to the cenable and precharge drivers by ensuring that equivalent gates are located in a common package. If integrated circuit drivers are used, corresponding precharge and cenable drivers should be located in the same package whenever possible. The Intel 3207A, quad level shifter and driver, is specified to have differential delays within one package not to exceed 10 ns for equal loads of 200 pF. In addition, rise and fall times of drivers within a common package will not differ by more than 10 ns under these conditions.
d. Faster logic, such as Schottky TTL, may be used in the critical paths of the timing control circuits.

The access time for a system with timing as shown in Figure 1.17 is at least 450 ns plus any additional delays associated with starting the cycle and propagating the data to the memory output.

## E. 2 Shift Register Based Controller

The circuit shown in Figure 1.20 uses clocked JK flip-flops wired as a shift register instead of delay lines to generate timing for an 1103 memory. Operation is otherwise similar to the controller of Figure 1.18. However, access time is increased by the time which elapses from the occurrence of the cycle request until the next clock pulse arrives. If timing similar to that of Figure 1.17 is used, the clock must operate at a 70 ns period. All other requirements could be met if an 840 ns read/write cycle were used. However, by using the $\mathrm{t}_{OVL}$ control circuit of Figure 1.19, much greater freedom is offered the designer.
The shift register controller results in a longer memory cycle than the delay line controller because the timing cannot be optimized with the shift register. When cost is more important than speed, the shift register may be a more favorable choice, as the flip-flops used typically cost less than the delay lines.
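The effect of clocking on the shift register controller can be illustrated with a short sketch. It assumes, purely for illustration, that clock edges fall at integer multiples of the 70 ns period and that a request is taken at the first edge at or after its arrival; the function names are hypothetical.

```python
CLOCK_NS = 70    # clock period required for Figure 1.17 style timing
CYCLE_NS = 840   # read/write cycle quoted for the shift register controller


def clock_periods_per_cycle(cycle_ns, clock_ns):
    """Number of clock periods in one cycle; the cycle must be a whole number of periods."""
    if cycle_ns % clock_ns != 0:
        raise ValueError("cycle is not a whole number of clock periods")
    return cycle_ns // clock_ns


def request_wait_ns(request_time_ns, clock_ns):
    """Time from a cycle request until the next clock edge (edges at multiples of clock_ns)."""
    return (-request_time_ns) % clock_ns


periods = clock_periods_per_cycle(CYCLE_NS, CLOCK_NS)
worst_wait = max(request_wait_ns(t, CLOCK_NS) for t in range(CYCLE_NS))
print(f"{periods} clock periods per cycle, worst-case added access time {worst_wait} ns")
```

In this model the 840 ns cycle spans 12 clock periods, and an asynchronous request can lose almost a full clock period before the cycle starts, which is the access time penalty described above.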
Of course, many other controller designs may be used, as long as the timing requirements of the 1103 are met.

## E. 3 Refresh Control

The maximum time interval between accesses to a row of memory cells within the 1103 ($\mathrm{t}_{REF}$) is specified as 2 ms over the normal operating temperature range. To guarantee that data is retained within the memory, at least one read or write cycle must be executed for each row of cells within any 2 ms interval. As the rows are selected by address inputs $\mathrm{A}_{0}$ through $\mathrm{A}_{4}$, at least 32 memory cycles, one for each state of address lines $\mathrm{A}_{0}$ through $\mathrm{A}_{4}$, must be executed in each 2 ms interval. These cycles may in some cases result from normal accessing, such as when a memory is used in a sequential-access mode. In other cases special refresh cycles must be executed.

Figure 1.20. Shift Register Based Controller for Generating System Timing Signals

In any of the systems described above, during refresh cycles the address is derived from a refresh address counter, rather than from the normal address source. Figure 1.21 shows how address generation and switching may be accomplished. Only the 5 bits controlling address lines $\mathrm{A}_{0}$ through $\mathrm{A}_{4}$ need to be switched during refresh cycles. In larger memories where cenable (and sometimes precharge) are decoded to act as a chip select signal, the decoding may be overridden during refresh cycles. In this way, the entire memory is refreshed with the execution of only 32 cycles. When this type of "bulk" refreshing is used together with the "local" $\mathrm{t}_{OVL}$ control circuit of Figure 1.19, the skews in turn-on of cenable drivers must be controlled to ensure that the $\mathrm{t}_{OVL}$ specification is met during refresh cycles. The two controllers described above each execute a refresh cycle every $60 \mu \mathrm{~s}$ so that 32 such cycles are executed within 2 ms. Refresh cycles are, of course, read cycles, and may, if the user desires, take advantage of the shorter cycle available when reading only. In small memories with slow cycles, it is sometimes convenient to use every other cycle as a refresh cycle.
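The refresh arithmetic can be checked with a few lines. This is a sketch under stated assumptions: the 32 rows, the 2 ms limit and the 60 microsecond request interval come from the text, while the 580 ns refresh cycle is borrowed from the read-only cycle in the skew example and is used only to estimate overhead.

```python
ROWS = 32            # one row per state of A0..A4
T_REF_US = 2000.0    # maximum interval between accesses to a row (2 ms)


def max_request_spacing_us(rows=ROWS, t_ref_us=T_REF_US):
    """Longest uniform spacing of refresh requests that still covers every row in t_REF."""
    return t_ref_us / rows


def sweep_time_us(request_interval_us, rows=ROWS):
    """Time to step the refresh address counter through every row once."""
    return request_interval_us * rows


def refresh_overhead(cycle_ns, request_interval_us):
    """Fraction of memory time spent on refresh cycles."""
    return (cycle_ns / 1000.0) / request_interval_us


print(f"maximum spacing: {max_request_spacing_us():.1f} us")
print(f"sweep at 60 us spacing: {sweep_time_us(60.0):.0f} us")
print(f"overhead with a 580 ns refresh cycle: {refresh_overhead(580.0, 60.0):.2%}")
```

A 60 microsecond interval completes the sweep in 1920 microseconds, leaving a small margin against the 2 ms limit, and under the assumed cycle time refresh occupies roughly one percent of the available memory cycles.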
For some applications, the normal cycles executed may themselves satisfy the refresh requirement. For example, random access memories used for maintenance of CRT displays are often operated in a "mostly sequential" access mode, or may have some recurring accesses which are sequential. If these sequential cycles access three percent or more of the memory in any 2 ms period, it may be possible to arrange the address connections such that these cycles alone are sufficient to maintain the data.

Figure 1.21. Refresh Address Switching

Another example where normal use can maintain the data involves transfers between disk or drum and 1103 memory. In some cases, it may be desired that these transfers take place at full memory rate, with no cycles devoted to refreshing. In most cases, addressing may be arranged so that the transfer itself refreshes the memory. Consider transfer between a 4096 word 1103 memory and a disk. Such transfers usually involve serial blocks of data. If the address bits that step through consecutive words include 1103 addresses $\mathrm{A}_{0}$ through $\mathrm{A}_{4}$ and the two bits decoded to perform row selection, then any transfer of at least 128 consecutive words will refresh the entire memory. Using this organization, the controller generated refresh cycles may be inhibited during execution of such transfers without the loss of data.
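The claim that 128 consecutive words refresh the whole memory can be verified by enumeration. The sketch below assumes one particular, hypothetical bit assignment: the five low-order word address bits drive $\mathrm{A}_{0}$ through $\mathrm{A}_{4}$ and the next two bits select one of the four rows of 1103s. Other assignments of the same seven bits would behave the same way.

```python
WORDS = 4096         # 4096-word memory: four rows of 1024-bit 1103s
CELL_ROW_BITS = 5    # A0..A4 select one of 32 cell rows inside each 1103
CHIP_ROW_BITS = 2    # two bits decoded to select one of four rows of devices
LOW_BITS = CELL_ROW_BITS + CHIP_ROW_BITS


def refresh_targets(start, length):
    """Set of (device row, cell row) pairs touched by a consecutive transfer."""
    targets = set()
    for offset in range(length):
        addr = (start + offset) % WORDS
        low = addr & ((1 << LOW_BITS) - 1)
        cell_row = low & ((1 << CELL_ROW_BITS) - 1)
        chip_row = low >> CELL_ROW_BITS
        targets.add((chip_row, cell_row))
    return targets


def transfer_refreshes_everything(start, length):
    """True if every cell row of every device row is accessed at least once."""
    return len(refresh_targets(start, length)) == (1 << LOW_BITS)


print(transfer_refreshes_everything(start=1000, length=128))
print(transfer_refreshes_everything(start=1000, length=127))
```

Any block of 128 consecutive words passes the check regardless of its starting address, while a block one word shorter misses one row. The 128 words also amount to about three percent of the 4096-word memory, in line with the figure given for sequentially accessed display memories.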

## F. POWER CONSIDERATIONS

## F. 1 Power Supplies

Signal and power supply levels are important to the proper operation of the 1103 . Speed is a function of both the $V_{S S}$ level and the clock amplitudes. In general, higher amplitudes or voltages result in faster operation. However, voltages outside of the specified operating ranges may result in marginal operation or more critical timing. Substrate bias $\mathrm{V}_{B B}$ also has an effect on performance. This bias improves noise immunity and prevents parasitic interaction within the device, but larger values of bias result in slower operation and lower output current.
Between $\mathrm{V}_{S S}$ and $\mathrm{V}_{B B}$ the 1103 presents the equivalent of a silicon junction diode, which is reverse biased under normal operating conditions. During power supply turn on, $\mathrm{V}_{B B}$ should rise at least as fast as $\mathrm{V}_{S S}$ and during operation, $\mathrm{V}_{B B}$ should not fall below $\mathrm{V}_{S S}$. To insure proper $\mathrm{V}_{S S}$ regulation and to guarantee these turn-on characteristics, $\mathrm{V}_{B B}$ is best generated by a $3-4 \mathrm{~V}$ power supply in series with $\mathrm{V}_{S S}$ or by regulating $\mathrm{V}_{S S}$ at $3-4 \mathrm{~V}$ below $\mathrm{V}_{B B}$. Because the 1103s draw very little current from $\mathrm{V}_{B B}$, some protective series resistance may be used, providing that $\mathrm{V}_{B B}$ is adequately bypassed to $\mathrm{V}_{S S}$ at the 1103 s , and that any level shifters connected to $\mathrm{V}_{B B}$ are connected to the power supply side of the series resistance.
In spite of these precautions, shorting $\mathrm{V}_{B B}$ to ground may result in the $V_{S S}$ supply delivering destructive current through the 1103 s . Again, a current limit on $V_{S S}$ to provide protection, or deriving $\mathrm{V}_{S S}$ from $\mathrm{V}_{B B}$ is advisable. The current handling capability of the device is limited by bond wire and metallization current capability. Typical 1103s appear to have a voltage drop of about 0.7 V with an additional series resistance of about $5 \Omega$ and can be destroyed by energies in excess of .01 joule, or steady currents in excess of 100 mA . For cases where current limit protection cannot be used, a power supply crow bar may be desirable, to shunt the $\mathrm{V}_{S S}$ supply in the event of a $\mathrm{V}_{B B}$ short circuit or precharge driver failure. A Schottky barrier rectifier from $V_{S S}$ to $V_{B B}$ may also be used to draw $V_{S S}$ to ground when $\mathrm{V}_{B B}$ is shorted. This type of rectifier has a lower voltage drop than the junctions of the 1103. Conventional silicon rectifiers also offer some, but reduced, protection. In each case, one or more rectifiers are connected such that the cathode is connected to $\mathrm{V}_{B B}$ and the anode to $\mathrm{V}_{S S}$.

## F. 2 Power Consumption

Power consumption of the 1103 is associated primarily with current flow from $\mathrm{V}_{S S}$ to $\mathrm{V}_{D D}$. This current varies significantly during execution of a memory cycle. The maximum consumption of power occurs during the interval when precharge is active, particularly when precharge and cenable are both active. Thus power consumption is a strong function of the precharge duty cycle.
In Figure 1.12, the current flow from $\mathrm{V}_{S S}$ to $\mathrm{V}_{D D}$ is shown in addition to the driving signals. Table 1.5 itemizes the absolute maximum ratings and limits on power supply current for various parts of the cycle. When precharge is first applied, a surge of current flows, associated with the charging of internal capacitances. The peak current is a function of precharge turn-on fall time, and is typically $80-100 \mathrm{~mA} @ 20 \mathrm{~ns}$. This current rapidly decays to a typical value of 37 mA ($\mathrm{I}_{DD1}$) until cenable is made active. Most of this current is associated with the decoder circuits. During the overlap period (precharge and cenable both low), a somewhat higher current ($\mathrm{I}_{DD2}$) flows. When precharge is removed, the device current typically falls to below 11 mA ($\mathrm{I}_{DD3}$), remaining at this level through the remainder of the cenable active period.
When cenable is removed, the current falls to a low level, under 4 mA ($\mathrm{I}_{DD4}$). This current is associated with the circuits which generate internal signals $R$ and $\bar{R}$ (see Figure 1.11). The circuit generating $R$ draws current when address line $A_{4}$ is low, and the circuit generating $\overline{\mathrm{R}}$ draws current when internal signal $\overline{\mathrm{A}}_{4}$ is low. Both currents may be blocked to implement a low power data retention mode of operation for 1103 memories. The technique is described in the standby power dissipation section (page 1-27).
To estimate the power drawn by an 1103 memory system, the memory array power, the level shifter dissipation, and the controller dissipation must all be considered. The power consumed by the memory array is a function of the memory cycle timing used, with precharge duty cycle and the number of devices made active within the system the most important parameters.
Referring to the dc power characteristics of the 1103, as itemized in Table 1.9, and for the cycle used as reference, 1103s executing memory cycles draw a maximum of 25 mA average current. Devices which are not executing memory cycles may draw a maximum current ($\mathrm{I}_{DD4}$) of 4 mA. Unless the memory cycle is altered such that the duty cycles of the various clocks are significantly different from those of the reference cycle, these figures will apply. For example, consider a $4 \mathrm{k} \times 16$ memory (4 rows of 16 1103s each), in which only the selected row is driven with precharge. Maximum average power supply current for the array is then $16 \times 0.025+3 \times 16 \times 0.004=0.592 \mathrm{amp}$.
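The array current estimate generalizes readily to other organizations. A minimal sketch follows; the function name and parameters are illustrative, and the 25 mA and 4 mA defaults are the maximum figures quoted above for devices that are and are not executing cycles.

```python
def array_supply_current_a(devices_per_row=16, rows=4, active_rows=1,
                           i_active_a=0.025, i_idle_a=0.004):
    """Maximum average V_SS-to-V_DD current for an array of 1103s.

    Devices in the selected row(s) draw the average cycling current;
    devices in the remaining rows draw the quiescent current I_DD4.
    """
    active_devices = devices_per_row * active_rows
    idle_devices = devices_per_row * (rows - active_rows)
    return active_devices * i_active_a + idle_devices * i_idle_a


current = array_supply_current_a()
print(f"4k x 16 array, one row precharged: {current:.3f} A")
```

For the 4k × 16 example this evaluates to about 0.59 A, with roughly two thirds of it drawn by the sixteen devices in the selected row.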
For memory cycles which differ significantly in timing, the average current per device executing a cycle may be estimated from the duty cycles associated with precharge, overlap, cenable, and transition periods.
To a first approximation, the user can base current requirements on precharge-on to precharge-off duty cycles (including transitions), precharge-off to cenable-off duty cycle, and the cenable-off to start of precharge turn-on. For the reference cycle of the 1103 data sheet, these times are as follows:
full precharge interval: $\mathrm{t}_{PC}+\mathrm{t}_{OVL}+3 \mathrm{t}_{T}=210 \mathrm{nsec}$
precharge-off to cenable-off: $\mathrm{t}_{POV}+\mathrm{t}_{T}$, or: $\mathrm{t}_{PW}+\mathrm{t}_{WP}+2 \mathrm{t}_{T}=285 \mathrm{nsec}$
cenable-off to precharge-on: $\mathrm{t}_{CP}=85 \mathrm{nsec}$

Using these current figures for these intervals and neglecting the extra current during the overlap period results in an estimate within a few percent of the current actually observed.
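The duty-cycle estimate can be reproduced as a time-weighted average. The sketch below pairs the three intervals above with the typical and maximum currents listed in Table 1.9 for the precharge, cenable-only and idle periods, neglecting the overlap current as suggested; the dictionary keys and function name are illustrative.

```python
# Interval lengths for the reference cycle, in ns (they sum to the 580 ns cycle).
INTERVALS_NS = {"precharge": 210.0, "cenable_only": 285.0, "idle": 85.0}

# Supply currents in mA from Table 1.9 (I_DD1, I_DD3, I_DD4).
TYPICAL_MA = {"precharge": 37.0, "cenable_only": 5.5, "idle": 3.0}
MAXIMUM_MA = {"precharge": 56.0, "cenable_only": 11.0, "idle": 4.0}


def average_current_ma(intervals_ns, currents_ma):
    """Time-weighted average of the per-interval supply currents."""
    cycle_ns = sum(intervals_ns.values())
    charge = sum(intervals_ns[name] * currents_ma[name] for name in intervals_ns)
    return charge / cycle_ns


typical = average_current_ma(INTERVALS_NS, TYPICAL_MA)
maximum = average_current_ma(INTERVALS_NS, MAXIMUM_MA)
print(f"estimated average: {typical:.1f} mA typical, {maximum:.1f} mA maximum")
```

The estimates come out near 16.5 mA typical and 26.3 mA maximum, which may be compared with the 17 mA and 25 mA average supply current listed in Table 1.9 for the 580 ns cycle.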
Executing memory cycles, the 1103 dissipation can approach 400 mW . Between cycles, dissipation is less than 64 mW . A significant amount of power can be saved in an 1103 system by applying precharge only to those rows of 1103 s which are to be accessed. The same circuits used to decode address bits to drive cenable are also used to select which precharge drivers are to be operated.
When precharge is decoded in the same manner as chip select, noise problems can result within the 1103 unless at least one of the following precautions is taken:

1. Decode "write" in the same manner as precharge and cenable.
2. Apply a minimum 40 ns "sliver" of precharge to the unselected devices in the array rather than fully block precharge.
3. Reduce the high level of the precharge driver so that it falls in the range of $\mathrm{V}_{S S}-0.5$ to $\mathrm{V}_{S S}-1.5$.

It is also undesirable to operate the array in a mode where write cycles are continuously executed at the same address. This action may aggravate noise levels in the system.

Arrays of 1103s draw large surges of current during normal operation. Power supplies must be adequately regulated and bypassed at the memory array to ensure proper maintenance of the required voltages. Distribution of bypass capacitors is discussed in more detail in the section on printed circuit layout.
A major contribution to power dissipation in 1103 memory systems is made by the level shifter circuits. Level shifters of the types shown in Figures 1.14a and 1.14b dissipate some 400 mW each when the output of the level shifter is low ($\mathrm{V}_{SS}=16 \mathrm{~V}$, $\mathrm{V}_{BB}-\mathrm{V}_{SS}=4 \mathrm{~V}$). A typical $4 \mathrm{k} \times 16$ memory requires ten address level shifters, 16 data level shifters, and from 6 to 12 clock level shifters depending upon the clock decoding scheme used. Because the clock level shifters have relatively low effective duty cycles, the dc dissipation is relatively small, perhaps 300-400 mW total. However, worst case dissipation for the 26 address and data level shifters could exceed 10 watts. Some reduction in power may be achieved by gating the data level shifters such that they deliver low outputs only during the cenable period of write cycles. In large systems, addresses may be held high on all but selected memory modules in order to conserve power.

Table 1.9. Absolute Maximum Ratings and Power Characteristics for the 1103

1103 ABSOLUTE MAXIMUM RATINGS*

| Parameter | Rating |
| :--- | :---: |
| Temperature Under Bias | $0^{\circ} \mathrm{C}$ to $70^{\circ} \mathrm{C}$ |
| Storage Temperature | $-65^{\circ} \mathrm{C}$ to $+150^{\circ} \mathrm{C}$ |
| All Input or Output Voltages with Respect to the Most Positive Supply Voltage, $\mathrm{V}_{BB}$ | -25 V to 0.5 V |
| Supply Voltages $\mathrm{V}_{DD}$ and $\mathrm{V}_{SS}$ with Respect to $\mathrm{V}_{BB}$ | -25 V to 0.5 V |
| Power Dissipation | 1.0 W |

*COMMENT: Stresses above those listed under "Absolute Maximum Rating" may cause permanent damage to the device. This is a stress rating only and functional operation of the device at these or at any other condition above those indicated in the operational sections of this specification is not implied. Exposure to absolute maximum rating conditions for extended periods may affect device reliability.

## 1103 POWER CHARACTERISTICS

$T_{A}=0^{\circ} \mathrm{C}$ to $+70^{\circ} \mathrm{C}, \mathrm{V}_{\mathrm{SS}}^{(1)}=16 \mathrm{~V} \pm 5 \%,\left(\mathrm{~V}_{\mathrm{BB}}-\mathrm{V}_{\mathrm{SS}}\right)=3 \mathrm{~V}$ to $4 \mathrm{~V}, \mathrm{~V}_{\mathrm{DD}}=0 \mathrm{~V}$ unless otherwise specified

| Symbol | Parameter | Typ. | Max. | Unit | Conditions |
| :---: | :---: | :---: | :---: | :---: | :---: |
| $\mathrm{I}_{BB}$ | $\mathrm{V}_{BB}$ SUPPLY CURRENT | 100 |  | $\mu \mathrm{A}$ |  |
| $\mathrm{I}_{DD1}$ [1] | SUPPLY CURRENT DURING $\mathrm{t}_{PC}$ | 37 | 56 | mA | ALL ADDRESSES $=0 \mathrm{~V}$; PRECHARGE $=0 \mathrm{~V}$; CENABLE $=\mathrm{V}_{SS}$; $\mathrm{T}_{A}=25^{\circ} \mathrm{C}$ |
| $\mathrm{I}_{DD2}$ [1] | SUPPLY CURRENT DURING $\mathrm{t}_{OVL}$ | 38 | 59 | mA | ALL ADDRESSES $=0 \mathrm{~V}$; PRECHARGE $=0 \mathrm{~V}$; CENABLE $=0 \mathrm{~V}$; $\mathrm{T}_{A}=25^{\circ} \mathrm{C}$ |
| $\mathrm{I}_{DD3}$ [1] | SUPPLY CURRENT DURING $\mathrm{t}_{POV}$ | 5.5 | 11 | mA | PRECHARGE $=\mathrm{V}_{SS}$; CENABLE $=0 \mathrm{~V}$; $\mathrm{T}_{A}=25^{\circ} \mathrm{C}$ |
| $\mathrm{I}_{DD4}$ [1] | SUPPLY CURRENT DURING $\mathrm{t}_{CP}$ | 3 | 4 | mA | PRECHARGE $=\mathrm{V}_{SS}$; CENABLE $=\mathrm{V}_{SS}$; $\mathrm{T}_{A}=25^{\circ} \mathrm{C}$ |
| $\mathrm{I}_{DD\,AV}$ [2] | AVERAGE SUPPLY CURRENT | 17 | 25 | mA | CYCLE TIME $=580 \mathrm{~ns}$; PRECHARGE WIDTH $=190 \mathrm{~ns}$; $\mathrm{T}_{A}=25^{\circ} \mathrm{C}$ |

NOTES: 1. These values are taken from a single pulse measurement.
2. This parameter is periodically sampled and is not $100 \%$ tested.

In level shifters which are driving large capacitive loads, the power associated with charging and discharging the capacitances must also be considered. This power, given by the relationship $P=\mathrm{fCV}^{2}$ (see above in the section on level shift circuits), is most significant in clock driver circuits. However, in high performance, heavily loaded address drivers it is very important to prevent switching transients from producing an effectively higher frequency and correspondingly higher dissipation. For example, in a $4 \mathrm{k} \times 16$ memory, using a 700 nsec cycle, addresses would typically change at most once per cycle, corresponding to a maximum frequency of half the memory cycle rate. Using a value of 16 volts and 448 pF, the dissipation associated with capacitance charging is about 84 mW. If, however, "glitches" result in two or more additional address transitions prior to starting a cycle, this figure may be tripled or more.
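Both level shifter contributions can be estimated with a short sketch. The function names are illustrative; the 400 mW per shifter, the 26 address and data shifters, the 700 nsec cycle, 16 volts and 448 pF are the figures quoted in the text, and the model of one address transition per cycle giving a frequency of half the cycle rate follows the same passage.

```python
def static_dissipation_w(n_shifters, watts_each=0.4):
    """Worst-case dc dissipation with every level shifter output low."""
    return n_shifters * watts_each


def switching_dissipation_w(cycle_s, cap_f, volts, transitions_per_cycle=1):
    """P = f * C * V^2, with f equal to half the transition rate."""
    freq_hz = transitions_per_cycle / (2.0 * cycle_s)
    return freq_hz * cap_f * volts ** 2


static_w = static_dissipation_w(10 + 16)
clean_w = switching_dissipation_w(700e-9, 448e-12, 16.0)
glitchy_w = switching_dissipation_w(700e-9, 448e-12, 16.0, transitions_per_cycle=3)
print(f"static: {static_w:.1f} W")
print(f"switching: {clean_w * 1000:.0f} mW clean, {glitchy_w * 1000:.0f} mW with two extra transitions")
```

The static term comes to a little over 10 watts. The switching term evaluates to the low 80s of milliwatts, close to the figure quoted above, and triples when two extra transitions occur in each cycle.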

In larger memory systems built with the 1103, the total system power may be reduced by insuring data and address level shifters in unaccessed modules remain in a low dissipation condition.

## F. 3 Standby Power Dissipation

Low power data retention (LPDR) refers to a mode of operation for an 1103 memory whereby data is retained within the memory with minimum power dissipation. With a suitably designed 1103 memory system, the power required to retain data can be reduced to below 5 watts per million bits of memory. This mode is used to make the memory appear non-volatile to the user. A small battery pack may be used to maintain the memory content in the event of the failure of the external power source. In the low power data retention mode, the data is not considered accessible. Only that power necessary for the memory to retain the data is consumed. When normal power is restored to the system, the memory is switched back to normal mode, but does not have to be reloaded before the system is usable. This feature is particularly useful for process control applications and for small machines that do not have disk or tape back-up storage.
Although in normal operation the 1103 has a potential dc current path from $V_{S S}$ to $V_{D D}$ even when cycles are not being executed, it is possible to design a system such that this path is blocked when cycles are not being executed. The ability to block this path permits very low standby power operation to be achieved.
To maintain data within an 1103 system, it is sufficient to execute the 32 required refresh cycles in each 2 msec period. Rather than space the refresh cycles $60 \mu \mathrm{sec}$ apart as is common in normal operation, when running in LPDR mode the refresh cycles are more conveniently executed in a burst of 32 cycles every 2 msec. With this mode of refreshing, no memory for refresh address selection is needed between bursts, so that the timing and refresh address circuitry may be turned off between bursts. This switching-off of power to the controller reduces the standby power associated with the controller logic. $\mathrm{V}_{B B}$ and $\mathrm{V}_{S S}$ power supplies must be maintained at the memory throughout LPDR operation, although the current paths through the 1103 are blocked between bursts. To block $\mathrm{I}_{D D 4}$ current in the 1103s between bursts, the last cycle of the burst should have address $\mathrm{A}_{4}$ low. Subsequent to this cycle, $\mathrm{A}_{4}$ is raised high for the between-burst interval.
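The benefit of burst refreshing comes from how little of each 2 msec period the burst occupies. The following sketch estimates that fraction; the 580 ns refresh cycle is an assumption borrowed from the read-only cycle in the skew example, and the function names are illustrative. It also scales the 5 watts per million bits figure to a $4 \mathrm{k} \times 16$ memory, assuming the bound scales linearly with memory size.

```python
def burst_duration_us(rows=32, cycle_ns=580.0):
    """Length of one refresh burst covering every row once."""
    return rows * cycle_ns / 1000.0


def burst_duty_cycle(rows=32, cycle_ns=580.0, period_ms=2.0):
    """Fraction of the refresh period during which the controller must be powered."""
    return burst_duration_us(rows, cycle_ns) / (period_ms * 1000.0)


def retention_power_bound_w(bits, watts_per_megabit=5.0):
    """Upper bound on LPDR power, scaling 5 W per million bits linearly."""
    return watts_per_megabit * bits / 1e6


print(f"burst length: {burst_duration_us():.2f} us")
print(f"duty cycle: {burst_duty_cycle():.2%}")
print(f"4k x 16 bound: {retention_power_bound_w(4096 * 16):.2f} W")
```

Under these assumptions the controller and array are active for less than one percent of the time, which is why switching off the controller supply and blocking the 1103 current paths between bursts has such a large effect on standby power.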
Level shift circuits should be designed so that their power consumption can be made negligible between bursts of cycles. The level shifter should also maintain the output levels at $V_{S S}$ with the TTL supply turned off. To prevent current flow from the $\mathrm{V}_{B B}$ supply to the $\mathrm{V}_{S S}$ supply, the $\mathrm{V}_{B B}$ supply to the level shifters should be dropped to $\mathrm{V}_{S S}$ voltage between bursts. However, normal $\mathrm{V}_{B B}$ voltage must be applied to the 1103s.
Those level shifters of Figure 1.14 which utilize the SN7406 may not operate properly with the TTL supply off, for the clamping action of an internal transistor across the emitter base diode of the output transistor may be necessary for the output device to sustain the 16 V supply. However, the circuit of Figure 1.14C has been successfully used in 1103 memories operated in LPDR mode.
In LPDR mode, the $\mathrm{V}_{B B}, \mathrm{~V}_{S S}$ and controller TTL supplies are maintained by batteries. For best results, these batteries are located on the unregulated side of the power supply. When main power fails, the memory is switched to LPDR mode, but voltages within the array do not experience switch-over transients. The controller TTL supply is switched on to the logic only during execution of the refresh bursts.

One form of control circuit necessary to implement LPDR mode is shown in Figure 1.22. A timer circuit realized by an RC network and difference amplifier establishes the 2 msec period between bursts. When this circuit times out, or when main power is switched on, the controller power is switched on. A power-on reset pulse initializes the controller and refresh address counter and resets the refresh mode control flip-flop to "burst" mode. After the power-on reset is removed, the refresh mode control flip-flop requests refresh cycles until all refresh addresses have been exercised. The carry-out from the refresh address counter sets the refresh mode control flip-flop to normal after the 32 refresh cycles have been executed. This action returns the memory to the mode where refresh cycles are executed every $60 \mu \mathrm{sec}$, and makes it available for normal cycle requests. However, if main power is not on, the setting of the refresh mode control flip-flop resets the 2 millisecond refresh interval timer and turns off the controller power. With controller power off, the 2 millisecond timer again charges for 2 msec then switches on controller power, and the process is repeated. Turning main power on or off also requests execution of a burst of cycles, so that at no point does any location ever experience more than 2 msec between refresh cycles.
In a typical $4 \mathrm{k} \times 16$ system, the following power levels were experienced:

|  | Normal Operation | LPDR Mode |
| :--- | :---: | :---: |
| $\mathrm{V}_{B B}$ (+19.5 V) | 5.7 W | 0.070 W |
| $\mathrm{V}_{S S}$ (+16 V) | 3.2 W | 0.145 W |
| $\mathrm{V}_{C C}$ (+5 V) | 3.6 W | 0.090 W |
| $\mathrm{V}_{C C 2}$ (+5 V) | 4.0 W | - |
| $\mathrm{V}_{C C-}$ (-5 V) | 0.4 W | - |
| Total | 16.9 W | 0.305 W |

$\mathrm{V}_{C C}$ is the +5 V supply backed up by batteries while $\mathrm{V}_{C C 2}$ supplies power to circuits required only for normal operation. $\mathrm{V}_{C C-}$ is a supply used only for the SN75107 sense amplifiers.
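The totals in the table, and the saving that LPDR mode gives, can be reproduced as follows. This is a sketch for checking the arithmetic only; it assumes that "4k x 16" means 4096 words of 16 bits, and the variable names are illustrative.

```python
# Power figures (watts) copied from the table for a typical 4k x 16 system.
NORMAL_W = {"VBB": 5.7, "VSS": 3.2, "VCC": 3.6, "VCC2": 4.0, "VCC-": 0.4}
LPDR_W = {"VBB": 0.070, "VSS": 0.145, "VCC": 0.090}

# Assumption: "4k x 16" means 4096 words of 16 bits.
BITS = 4096 * 16

total_normal_w = sum(NORMAL_W.values())
total_lpdr_w = sum(LPDR_W.values())
reduction_factor = total_normal_w / total_lpdr_w
lpdr_uw_per_bit = total_lpdr_w / BITS * 1e6

print(f"normal: {total_normal_w:.1f} W, LPDR: {total_lpdr_w:.3f} W")
print(f"reduction: about {reduction_factor:.0f}x")
print(f"LPDR standby: {lpdr_uw_per_bit:.2f} microwatts per bit")
```

With these figures the standby power is reduced by a factor of roughly 55, to a little under 5 microwatts per bit.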


Figure 1.22. LPDR Circuit

The power dissipation in LPDR mode is essentially proportional to the number of refresh bursts executed per second. This rate is limited to 500 per second or more by the 2 msec maximum interval between refreshes. However, the 2 msec figure applies to operation over the full range from 0 to $70^{\circ} \mathrm{C}$. Typical parts can tolerate much longer intervals between refresh cycles when operating at room temperature. Systems have been built which measure the operating temperature of the memory and vary the refresh interval accordingly. Standby power levels under one watt per million bits have been reported for memories at room temperature.*
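To see how a longer refresh interval translates into standby power, the proportionality described above can be sketched numerically. The sketch assumes strict proportionality between power and burst rate, takes the 0.305 W measured for the 4k x 16 system at a 2 msec interval as the starting point, and assumes 4096 x 16 bits; the systems reported in the cited paper need not match this model.

```python
BASE_POWER_W = 0.305      # LPDR power of the 4k x 16 system at a 2 msec interval
BASE_INTERVAL_MS = 2.0
BITS = 4096 * 16          # assumption: 4k x 16 = 65,536 bits

def standby_w_per_megabit(interval_ms):
    """Standby power per million bits, assuming power scales with burst rate."""
    power_w = BASE_POWER_W * BASE_INTERVAL_MS / interval_ms
    return power_w / BITS * 1e6

def interval_for_target_ms(target_w_per_megabit):
    """Refresh interval needed to reach a target standby power per million bits."""
    return (BASE_INTERVAL_MS * standby_w_per_megabit(BASE_INTERVAL_MS)
            / target_w_per_megabit)

print(f"{standby_w_per_megabit(2.0):.2f} W per million bits at 2 msec")
print(f"{interval_for_target_ms(1.0):.1f} msec interval for 1 W per million bits")
```

Under these assumptions the interval would have to be stretched to a little over 9 msec to reach one watt per million bits, which is plausible only at reduced temperature as the text notes.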

## F.4 Printed Circuit Layout Considerations

The pin connections of the 1103 permit an array to be laid out on $1 / 2^{\prime \prime} \times 1$ " centers if leads are run between pins of the dual in-line package. To achieve this packing density, leads running between pins need only be used on the component side of the board. Figure 1.23 shows a layout for use with the 1103.

*D. Appelt - Providing Non-volatile LSI Memory, Digest of Papers, COMPCON 72.

At the present time, there seem to be several schools of thought concerning the use of boards with ground planes or ground and $\mathrm{V}_{S S}$ planes vs. two-sided boards. Although empirical evidence seems to indicate that the ground and $\mathrm{V}_{S S}$ plane construction is superior in terms of noise within the system, successful two-sided layouts have been achieved. To keep noise low in two-sided layouts, a number of rules should be observed.

1. Occasional vertical busses on $\mathrm{V}_{S S}$ and $\mathrm{V}_{D D}$ make the power feed more closely approximate a grid, reducing power supply noise.
2. Adequate bypassing of the $\mathrm{V}_{S S}$ supply to $\mathrm{V}_{D D}$ is mandatory. At least $.05 \mu \mathrm{fd}$ of low inductance capacitance per device should be used. It is usually sufficient to use a $0.5 \mu \mathrm{fd}$ ceramic for every 8 to 10 devices in the array. One such capacitor is used at the end of each row of devices. The capacitors should be very close to the array or within the array. In general, no part should be further than a few inches from a bypass capacitor.
Referring to Figure 1.12 and Table 1.9, it may be noted that the 1103 draws a rapidly fluctuating current during execution of a memory cycle. Even if it were possible to design a regulator which could respond to these rapid fluctuations, the inductance of the leads from the power supply would result in poor regulation at the memory. (Most regulator circuits have response times of several tens of microseconds.) It is the function of the bypass capacitors to maintain constant voltage throughout the execution of a cycle.
3. The $V_{B B}$ supply should be bypassed to $V_{S S}$ rather than $\mathrm{V}_{D D}$, as the difference between $\mathrm{V}_{S S}$ and $\mathrm{V}_{B B}$ is the more important parameter. Bypasses to both $\mathrm{V}_{S S}$ and $\mathrm{V}_{D D}$ are commonly used. Although the 1103 draws little or no DC current from the $V_{B B}$ supply, transient currents do flow during clock transitions. Bypassing insures adequate tracking of the supplies, and reduces fluctuations due to capacitive coupling within the 1103. A typical bypass circuit is shown below:

4. Clock drivers and address level shifters should be located as close to the array as possible, and should be connected to the array with short leads. The array may otherwise resonate with the series lead inductance and produce ringing.
Excessive ringing can seriously impair memory operating margins. Series damping resistors and clamp diodes which prevent 1103 input lines from rising more than a volt above the $\mathrm{V}_{S S}$ supply may be utilized to improve drive signal quality, but at the expense of some increase in rise time.
5. When low power data retention is utilized, all devices in the array simultaneously execute a burst of 32 cycles. As a result, power supply surge currents are even more severe when this mode of operation is used. Because power supply regulators do not respond rapidly enough, the supply voltages must be maintained at the array by using additional bypass capacitors. For LPDR mode, a further 1 to $2 \mu \mathrm{fd}$ per device should be used in addition to the ceramic bypasses described above. For example, this capacitance might take the form of a $100 \mu \mathrm{fd}$ tantalum electrolytic for a 64 device array. This capacitance should be located immediately adjacent to the array.
6. With the availability of inexpensive integrated circuit regulators, local regulation may be used to advantage in some 1103 memory systems. Current limiting, and the shutting down of $\mathrm{V}_{S S}$ in the event of $\mathrm{V}_{B B}$ short may be more readily achieved with local regulation.
7. The large currents which flow during clock and address transitions can be troublesome if poor layout practices are used. These currents should not be allowed to flow to TTL signal or ground lines, or inferior drive waveforms and even oscillation may result. The connections from level shifters to the array provide for isolation of the array current paths from the TTL ground lines.
8. In the section covering sense amplifiers, the use of a dummy line to achieve a balanced system was mentioned. Although use of this technique is optional, more care must be used in unbalanced systems to prevent coupling of signals from data output busses back around through clock drivers, etc. High ground impedances may result in circuit instabilities.
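The bypass guidelines in rules 2 and 5 reduce to simple bookkeeping, sketched here for a given array size. The helper names are illustrative and the rounding rule (one ceramic per group of devices, rounded up) is an assumption; the capacitor values are those quoted in the rules.

```python
import math

MIN_CERAMIC_UFD_PER_DEVICE = 0.05   # rule 2: minimum low-inductance bypass
CERAMIC_UFD = 0.5                   # rule 2: one ceramic per 8 to 10 devices

def ceramic_count(devices, devices_per_cap=8):
    """Number of 0.5 ufd ceramics, one per group of devices (rounded up)."""
    return math.ceil(devices / devices_per_cap)

def ceramic_ufd_per_device(devices, devices_per_cap=8):
    """Average ceramic bypass capacitance available per device."""
    return ceramic_count(devices, devices_per_cap) * CERAMIC_UFD / devices

def lpdr_bulk_range_ufd(devices, low=1.0, high=2.0):
    """Rule 5: additional 1 to 2 ufd per device for LPDR operation."""
    return devices * low, devices * high

devices = 64
print(ceramic_count(devices), "to", ceramic_count(devices, 10), "ceramics... "
      "depending on grouping")
print(lpdr_bulk_range_ufd(devices))  # the 100 ufd tantalum falls in this range
```

For a 64 device array even the sparser grouping of ten devices per ceramic stays above the 0.05 ufd per device minimum, and the 100 ufd tantalum mentioned in rule 5 sits inside the 64 to 128 ufd range.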

## G. APPLICATION EXAMPLES

## G.1 1k x 8 Memory

The 1103 has been used in memories covering a wide range of sizes. Systems as small as $1 \mathrm{k} \times 8$ to over a million bits can be achieved economically and conveniently using the 1103.

Figure 1.24 shows a photograph of a $1 \mathrm{k} \times 8$ memory using eight 1103s with a clock counting controller. The system is synchronous, with alternate cycles reserved for refresh cycles. A control line permits inhibiting the refresh cycles. Some power can be saved by enabling refresh cycles for only $40-50 \mu \mathrm{sec}$ out of the 2 msec period. The board in the photo provides address and data output latches. Level shifters similar to those of Figure 16a are used for address and data. Clocks use a driver similar to that of Figure 16b although that of Figure 16c could also have been used.
Most of the design compromises for the $1 \mathrm{k} \times 8$ were made in favor of minimum cost. As a result, using medium quantity prices for the peripheral logic, the peripheral parts cost, including the board, amounts to less than four tenths of a cent per bit. In high volume production quantities, the overhead cost should be much less than this figure.
Memories on the order of $1 \mathrm{k} \times 8$ in size find application in cathode ray display terminals, when the designer wants the advantages of random access memory over conventional serial memory. Other applications include buffers for peripheral equipment, calculator memories, etc.


Figure 1.24. $1 \mathrm{k} \times 8$ 1103 Memory


Figure 1.25. $4 \mathrm{k} \times 18$ 1103 Memory Complete with Controller

## G.2 $4 \mathrm{k} \times 18$ Memory

A very popular application for the 1103 is as a mini-computer main frame memory. Figure 1.25 shows a photograph of a $4 \mathrm{k} \times 18$ memory complete with delay line controller similar to that of Figure 1.20. For level shifters and clock driver, the system utilizes the Intel 3207A integrated quad clock driver. This board also includes address and data output latches. The SN75108 is used for the sense amplifiers.
For some memories, cost savings may be achieved by using one controller for a number of memory modules. Figure 1.26 shows a photo of an example of a typical 1103 application. This memory board is available as part of a basic system that is being offered by Intel's Memory Systems Division. The basic memory consists of two plug-in boards: a memory board (MU) and a control board (CU). The control board is capable of operating up to 8 MUs in configurations of 32 k words by 18 bits or 65 k words by 9 bits. The overhead cost for the system approaches that of the individual module in large systems. In large systems, the designer may also wish to utilize some of the power saving schemes described above.
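The two configurations quoted for the control board are the same total storage arranged at different word widths, which the following sketch verifies. It assumes each MU holds 4096 words of 18 bits; that size is inferred from the 32k by 18 figure divided across 8 MUs and is not stated explicitly in the text.

```python
MAX_MUS = 8
MU_WORDS = 4096   # assumption: inferred from 32k words / 8 MUs
MU_BITS = 18      # assumption: each MU is organized as 4k x 18

total_bits = MAX_MUS * MU_WORDS * MU_BITS

def words_for_width(width_bits):
    """Number of words available when the same storage is organized at a given width."""
    return total_bits // width_bits

print(total_bits)            # total storage in bits
print(words_for_width(18))   # words at 18 bits per word
print(words_for_width(9))    # words at 9 bits per word
```

On this reading the "65k words by 9 bits" configuration corresponds to 65,536 words.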

The large 1103 memory system can offer the system designer some unusual extra features at very little additional cost. The addition of address and data registers to each module permits multiple simultaneous access to the memory (although only one access may be made to each module). At a slight increase in memory cycle, the data registers may be eliminated, and the various modules polled via the "strobe" input to the sense amplifiers. By using additional write clock drivers, a memory can be structured to have a wide data path for reading, and a variable width data path for writing.

## H. PROTECTION AGAINST CATASTROPHIC DAMAGE

As with any semiconductor component, 1103s can be damaged by misuse, or mis-applied voltages. The component should be protected from such misuse during shipment, handling, and when installed in the system.
MOS circuits are characterized by high impedances, and therefore are capable of being charged to high voltages by static charges. The gate circuits of MOS transistors are subject to destructive breakdown if excessively charged. The 1103 has an effective gate protection circuit for all input connections, and requires no elaborate precautions in normal use. However, some environments are subject to extreme buildup of static charge, and are capable of releasing sufficient amounts of energy to damage any semiconductor component. MOS components in particular should be protected from these static charges. Some precautions which are easy to implement and yet which are quite effective include:

1. Carrying components in conductive trays, such as metal or foil-lined pans.
2. Having personnel touch ground, the chassis, or the carrier tray before picking up components.
3. Avoiding high-static materials and fabrics in work areas.

In a circuit or tester, the part should not be subject to high voltage spikes. Although voltages somewhat in excess of the maximums may not cause damage if applied for short periods of time, voltages near the maximums should not be applied for long periods of time. Operation under excessive power dissipation conditions may also impair long-term reliability.


Figure 1.26. in-10 Memory System

In a system, adequate cooling air must be provided to prevent excessive temperature rise. As in the case of bipolar and static memory components, it is possible to install 1103s at packing densities exceeding those typical of TTL. In arrays with undecoded precharge, power dissipation per package can be higher than the typical TTL dual in-line package.

Power dissipation is a function of precharge duty cycle. If a precharge driver fails such that precharge remains permanently low within the array, excessive dissipation may result. Some form of overcurrent protection for the $V_{S S}$ supply is recommended to prevent potential damage due to this type of failure.

## I. SUMMARY

The 1103, a 1024 bit dynamic MOS random-access read-write memory, is now regarded as an industry standard. Any memory system designer should be aware of the advantages of using this component. The foregoing text was prepared to help the potential user understand the operation of the 1103, and to provide the designer with the information he needs to use the 1103 in practical systems.

INTRODUCTION
ELECTRICALLY PROGRAMMABLE MOS ROMs
A. Principles of Operation for Floating Gate Devices
B. Programming the 1602/1702
C. Programming the 1602A/1702A
D. Programmers
E. Erasing the 1702 or 1702A
PROMs - FIELD PROGRAMMABLE BIPOLAR ROMs
PREPARING DATA FOR ROMs/PROMs
ROM IMPLEMENTATION OF GENERALIZED LOGIC
A. Combinatorial Circuits
B. Sequential Circuits
EXAMPLE OF ROM APPLICATION

## INTRODUCTION

The devices discussed in this section are also random access memories, and may be wired in arrays much like the fixed address read-write memories of Part I. However, all of the products in this group are characterized by having the data, once it is entered into the array, essentially unchangeable by the system using the memory array. Entering data into the array is known as "programming" the read only memory (ROM).
Two categories of devices are discussed here: (1) mask-programmed read-only memories, which have the data entered during the manufacture of the components, and (2) field programmable read-only memories, which allow the user to enter data into the devices in the field. Field programmable read-only memories may be erasable or non-erasable. Erasable read-only memories provide some means for removing the data from the array and making the device again available for field programming.
Read only memories (ROMs) are used for a wide variety of applications. They may be used to perform character generation for TV displays, code conversion, to perform the equivalent of logic function in controllers, and to store programs (particularly certain commonly used routines such as loaders for computers, etc.). Primary advantages of ROM are its non-volatility and low cost, usually much lower than read-write memory. The main disadvantage, particularly for mask programmed ROM, is the inability to change erroneous data.

Table 2.1 lists the types of ROM that will be discussed in this section. All of the ROMs listed are general purpose devices, but vary in access time, erasability, configuration, and programming method.
All of the ROMs of Table 2.1 include one or more chip select leads and may be wired in arrays as were the read-write memories of Part I. All of the buffering considerations for p-channel MOS RAMs apply to p-channel MOS ROMs with the exception that there is no "write" lead to be bussed in the array. TTL input and output considerations are also similar to those for static p-channel MOS read-write memory.
Power dissipation considerations are also similar to those for read-write memories. However, some systems may operate with switched power to reduce overall power consumption. Because ROMs and PROMs are non-volatile, power may be turned on and off without altering memory content.

Table 2.1. Intel ROMs and PROMs

| Part Number | Technology | Programming Technology | Programming Time | Erasable | Size | Access Time (nsec) |
| :--- | :--- | :---: | :---: | :---: | :---: | :---: |
| 1302 | p-channel MOS | Mask | NA | No | $256 \times 8$ | 1000 |
| 1602 | p-channel MOS | Field | 20 min. | No | $256 \times 8$ | 1000 |
| 1602A | p-channel MOS | Field | 2 min. | No | $256 \times 8$ | 1000 |
| 1702 | p-channel MOS | Field | 20 min. | Yes | $256 \times 8$ | 1000 |
| 1702A | p-channel MOS | Field | 2 min. | Yes | $256 \times 8$ | 1000 |
| 3301A | Schottky Bipolar | Mask | NA | No | $256 \times 4$ | 45 |
| 3304 | Schottky Bipolar | Mask | NA | No | $512 \times 8$ or $1024 \times 4$ | 65 |
| 3601 | Schottky Bipolar | Field | 2 sec. | No | $256 \times 4$ | 70 |

## ELECTRICALLY PROGRAMMABLE MOS ROMs

The 1602, 1602A, 1702 and 1702A are 256 word by 8 bit electrically programmable ROMs ideally suited for uses where fast turn-around and pattern experimentation are important. The 1702 and 1702A are packaged in a 24 pin dual in-line package with a transparent quartz lid. The transparent quartz lid allows the user to expose the chip to ultraviolet light to erase the bit pattern. A new pattern can then be written into the device, unlike a metal mask ROM, whose pattern cannot be changed.
The 1602A and 1602 are packaged in a 24 pin dual in-line package with a metal lid. The 1602A and 1602 are intended for applications where field programmability and package hermeticity are desired. The 1602A and 1602 are not erasable due to the metal lid.
There is no electrical difference between the 1602A/1702A and the 1602/1702 in the read mode. However, the 1602A/1702A offers a faster programming time than the 1602/1702. The 1602A/1702A will program in 2 minutes versus 20 minutes for the 1602/1702. The power dissipation during programming is also greatly reduced in the 1602A/1702A.

## A. PRINCIPLES OF OPERATION FOR FLOATING GATE DEVICES

The 1602, 1602A, 1702 and 1702A all utilize the same principle for data storage. The storage medium for each data bit is a silicon gate MOS transistor. The gate of the transistor is left floating, completely surrounded by oxide. As manufactured, the floating gate is uncharged, and all of the transistors are in their "off" or non-conducting state. However, avalanching one of the junctions associated with the transistor produces electrons with sufficient energy to traverse the thin gate oxide layer. These electrons reach the floating gate and charge it negatively. With sufficient negative charge on the gate, the MOS transistor will turn on and enter the conducting state.
Once the floating gate is charged, considerable energy is necessary to give the electrons sufficient velocity to escape the gate. Typically in excess of 5 eV is necessary. Sunlight and ordinary illumination are insufficient, and the thermal excitation associated with normal operating temperatures is far too small. X-rays and stray radiation may discharge the gate, but only at dosages far in excess of those fatal to humans. However, special short-wave ultraviolet sources very effectively erase the devices by sufficiently exciting the trapped electrons. The 1702 and 1702A have transparent quartz lids to facilitate erasure.

## B. PROGRAMMING THE $1602 / 1702$

The 1602 and 1702 are 2048 bit static field programmable ROMs. As supplied, all data bits are initially ones (output high), with the programming operation forcing ones to zeroes or leaving ones unchanged. Ultraviolet erasure (of the 1702) restores the data to all ones.

Figure 2.1 shows the waveforms required to program the 1602 or 1702. During programming $V_{C C}$ is held at ground and $\mathrm{V}_{B B}$ is biased to +12 V. All voltages shown in Figure 2.1 are measured with respect to the $V_{C C}$ terminal.

During programming, address logic level "0" (corresponds to address low during reading) is -40 to -48 volts and address logic level "1" is approximately zero volts. To enter data at a specified address, the address signals are applied to the address leads, data to be entered is applied to the output leads, and the power supplies $V_{D D}$ and $\mathrm{V}_{\mathrm{GG}}$ and program lead are pulsed negatively. To program a location such that it will deliver a low when reading, during programming the data input terminal must be held high (0 volts with respect to $V_{D D}$); to enter a logic "1" (high output when reading), the data terminal must be held at -40 V during programming.

Once address and data are stable, the power and program pulses may be applied. The power leads, $V_{D D}$ and $V_{G G}$, must be on (at their negative extreme) about $1 \mu$ s before the program pulse is applied and should remain on for $1 \mu \mathrm{~s}$ after the program pulse is removed. Duty cycle for these pulses should not exceed $2 \%$ or the device may be destroyed by excessive power dissipation. A typical programming sequence consists of applying five 20 msec pulses 1 second apart.
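The 2% duty cycle limit and the 20 minute programming time quoted for the 1602/1702 follow from the typical sequence described above. The sketch below assumes the 256 addresses are programmed one after another, with all bits of a word programmed together; the constant names are illustrative.

```python
ADDRESSES = 256            # 256 words of 8 bits
PULSES_PER_ADDRESS = 5     # typical sequence: five pulses per location
PULSE_WIDTH_S = 0.020      # 20 msec program pulse
PULSE_SPACING_S = 1.0      # pulses applied 1 second apart

# Duty cycle of the typical sequence; it should not exceed the 2% limit.
duty_cycle = PULSE_WIDTH_S / PULSE_SPACING_S

# Assumption: addresses are programmed strictly one after another.
total_s = ADDRESSES * PULSES_PER_ADDRESS * PULSE_SPACING_S
total_min = total_s / 60

print(f"duty cycle: {duty_cycle:.0%}")
print(f"programming time: {total_min:.1f} minutes")
```

The result, a little over 21 minutes, agrees with the "20 minutes" figure given for these parts.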

## C. PROGRAMMING THE 1602A/1702A

The 1602A and 1702A are 2048 bit static field-programmable ROMs. As supplied, all data bits are initially zeroes (output low), with the programming operation forcing zeroes to ones or leaving zeroes unchanged. Ultraviolet erasure (of the 1702A) restores the data to all zeroes.

The 1602A and 1702A require a different programming sequence than the 1602/1702. Where the 1602/1702 draws a peak current during programming on the order of 1.2 A, the 1602A/1702A draws approximately 200 mA. As a result a much higher programming duty cycle may be used, and programming time is reduced from the 20 minutes for the 1602/1702 to 2 minutes for the 1602A/1702A.


Figure 2.1. Programming Waveforms for 1602, 1702.


Figure 2.2. Programming Waveforms for 1602A, 1702A

Figure 2.2 shows the waveforms for programming the 1602A/1702A. During programming $\mathrm{V}_{\mathrm{CC}}$ should be held at ground and $V_{B B}$ should be held at +12 volts. Address levels are approximately -40 volts for a logic " 0 " and approximately zero volts for a logic " 1 ". Note that these levels are larger in magnitude but in the same polarity sense as those used for reading from the memory:

$$
\begin{aligned}
& \text{logic "0"} \leqslant V_{C C}-4.2, \quad \text{logic "1"} \geqslant V_{C C}-2 \\
& \text{where } V_{C C}=5 \mathrm{~V} \pm 5 \%
\end{aligned}
$$

When programming, the negative going power supplies must be pulsed. $\mathrm{V}_{D D}$ is pulsed to $-47 \mathrm{~V} \pm 1 \mathrm{~V}$ and $\mathrm{V}_{G G}$ to -35 to -40 V . Before $\mathrm{V}_{D D}$ and $\mathrm{V}_{G G}$ are turned on, the complement of the address to be programmed should be applied. After the power has been applied for at least $25 \mu \mathrm{~s}$, the address must be returned to its true form. Some 10 or more $\mu \mathrm{s}$ after the address has reached its true form and at least $100 \mu \mathrm{~s}$ after turning on power, the 3 ms program pulse, at $-47 \pm 1 \mathrm{~V}$, may be applied. During the interval when $V_{D D}$ is applied, data signals must be applied to the data output lines. A data level of approximately zero volts will result in the location remaining unchanged, while a level of $-47 \pm 1 \mathrm{~V}$ will program a logic " 1 " (high output in read mode). After the program pulse is turned off, the $V_{D D}$ and $\mathrm{V}_{G G}$ voltages should be turned off. This turn off should occur from 10 to 100 microseconds after removal of the program pulse.
For best results, the 1602A/1702A should be programmed by scanning through the addresses in binary sequence some 32 times. Each pass repeats the same series of programming pulses. The duty cycle for applied power must not exceed $20 \%$. As a result, each pass takes about 4 seconds, with the 32 passes taking just over 2 minutes.
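The pass time and total time quoted above can be reconstructed from the pulse timings and the 20% duty cycle limit. This is a sketch under stated assumptions: power is taken to be on for the minimum 100 microseconds before the program pulse, the 3 ms pulse itself, and the minimum 10 microseconds afterward, and each address is allotted just enough time to meet the duty cycle limit. The constant names are illustrative.

```python
ADDRESSES = 256
PASSES = 32
MAX_DUTY = 0.20        # duty cycle limit for applied power

# Timings in microseconds, taken from the programming description.
LEAD_US = 100          # power on at least 100 usec before the program pulse
PULSE_US = 3000        # 3 ms program pulse
TRAIL_US = 10          # power off 10 to 100 usec after the pulse (minimum used)

power_on_us = LEAD_US + PULSE_US + TRAIL_US

# Assumption: each address gets the shortest slot that respects the duty limit.
slot_us = power_on_us / MAX_DUTY
pass_s = ADDRESSES * slot_us / 1e6
total_s = PASSES * pass_s

print(f"one pass: {pass_s:.2f} s")
print(f"{PASSES} passes: {total_s:.0f} s ({total_s / 60:.2f} minutes)")
```

The sketch gives just under 4 seconds per pass and a little over 2 minutes in total, in line with the figures in the text.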

## D. PROGRAMMERS

Although manual programmers may be constructed for the $1602 \mathrm{~A} / 1702 \mathrm{~A}$, the circuitry is fairly complex and manually programming a PROM of this capacity is quite tedious. There are several automatic programmers capable of programming the $1602 \mathrm{~A} / 1702 \mathrm{~A}$. One is an assembly using the Intel SIM4-03 or SIM8-01 micro computer board which executes a set of micro computer instructions for programming the device. It is assembled with the MP7-03 programmer card and an ASR-33 teletype.
Figure 2.3 is a diagram of the SIM4-03 or SIM8-01/MP7-03 programming assembly. Figure 2.4 shows the complete programming system with the boards (SIM4-03 or SIM8-01/MP7-03) and control chassis (MCB4-20 for the SIM4-03 and MCB8-10 for the SIM8-01). For further information about these units, refer to the MCS-4 and MCS-8 Users Manuals.

## E. ERASING THE 1702 OR 1702A

Figure 2.3. Microcomputer Programming Diagram

Figure 2.4. MCS-4 and MCS-8 Programming System

The 1702 and 1702A ROMs may be erased by exposure to high intensity short-wave ultraviolet light at a wavelength of $2537 \AA$. The recommended integrated dose (i.e., UV intensity $\times$ exposure time) is $6 \mathrm{~W}-\mathrm{sec} / \mathrm{cm}^{2}$. The devices are made with a transparent quartz lid covering the silicon die. Conventional room light, fluorescent light, or sunlight has no measurable effect on stored data, even after years of exposure. However, after 10 to 20 minutes under a suitable source, such as the Ultraviolet Products UVS-54 or UVS-52 (Ultraviolet Products, 5114 Walnut Grove Avenue, San Gabriel, California 91778), the device is erased to a state of all ones (1702) or all zeroes (1702A). To prevent damage to the device, it is recommended that no more ultraviolet exposure be used than is necessary to erase the 1702/1702A.
CAUTION: When using an ultraviolet source of this type, one should be careful not to expose one's eyes or skin to the ultraviolet rays because of the damage to vision, or burns which might occur. In addition, these shortwave rays may generate considerable amounts of ozone which is also potentially hazardous.
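The recommended dose and the 10 to 20 minute exposure quoted above together imply an intensity at the device, which can be worked out as a sketch. The helper names are illustrative, and the implied intensities are derived figures, not lamp specifications; the intensity actually reaching the die depends on the lamp and its distance, which the text does not give.

```python
DOSE_W_SEC_PER_CM2 = 6.0   # recommended integrated dose

def implied_intensity_mw_per_cm2(exposure_min):
    """Intensity needed at the device to deliver the dose in a given time."""
    return DOSE_W_SEC_PER_CM2 / (exposure_min * 60) * 1000

def exposure_time_min(intensity_mw_per_cm2):
    """Exposure time needed to deliver the dose at a given intensity."""
    return DOSE_W_SEC_PER_CM2 / (intensity_mw_per_cm2 / 1000) / 60

print(implied_intensity_mw_per_cm2(10))   # intensity for a 10 minute erase
print(implied_intensity_mw_per_cm2(20))   # intensity for a 20 minute erase
```

On these numbers a 10 to 20 minute erase corresponds to roughly 10 down to 5 milliwatts per square centimeter at the device.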

## PROMs - FIELD PROGRAMMABLE BIPOLAR ROMs

The 3601 is a bipolar 256 word by 4 bit Read Only Memory which is programmed in the field by blowing a polycrystalline silicon fuse for each bit to be programmed (to a logic "1"). The 3601 is guaranteed to have a maximum access time of 70 ns over the $0^{\circ} \mathrm{C}$ to $75^{\circ} \mathrm{C}$ operating range. It is pin compatible with the 3301A metal mask ROM.

The 3601 may be programmed using the basic circuit of Figure 2.5. Address inputs are at standard TTL levels. Only one output may be programmed at a time. The output to be programmed must be connected to $\mathrm{V}_{\mathrm{CC}}$ through a $300 \Omega$ resistor. This will force the proper programming current ($3-6 \mathrm{~mA}$) into the output when the $\mathrm{V}_{\mathrm{CC}}$ supply is later raised to 10 V. All other outputs must be held at a TTL low level (0.4 V).


Figure 2.5. 3601 Programming

The programming pulse generator produces a series of pulses to the 3601 $\mathrm{V}_{\mathrm{CC}}$ and $\mathrm{CS}_{2}$ leads. $\mathrm{V}_{\mathrm{CC}}$ is pulsed from a low of $4.5 \pm .25 \mathrm{~V}$ to a high of $10 \pm .25 \mathrm{~V}$, while $\mathrm{CS}_{2}$ is pulsed from a low of ground (TTL logic 0) to a high of $15 \pm .25 \mathrm{~V}$. It is important to maintain these voltage levels accurately; otherwise, improper programming may result.

The pulses applied must maintain a duty cycle of $50 \pm 10 \%$ and start with an initial width of $1( \pm 10 \%) \mu \mathrm{s}$, and increase linearly over a period of approximately 100 ms to a maximum width of $8( \pm 10 \%) \mu \mathrm{s}$. Typical devices have their fuse blown within 2 ms , but occasionally a fuse may take up to 400 ms .

During the application of the program pulse, current to $\mathrm{CS}_{2}$ must be limited to 100 mA . The output of the 3601 is sensed when $\mathrm{CS}_{2}$ is at a TTL low level output. A programmed bit will have a TTL high output. After a fuse is blown, the $\mathrm{V}_{\mathrm{CC}}$ and $\mathrm{CS}_{2}$ pulse trains must be applied for another $100 \mu \mathrm{~s}$.
One circuit which can be used to generate this pulse train is shown in Figure 2.6, while the characteristics of the pulse train are shown in Figure 2.7.
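The shape of the pulse train described above can be sketched as a function of time since the start of programming. The sketch uses the nominal values (1 microsecond initial width, 8 microsecond maximum, 100 ms ramp, 50% duty cycle) and assumes the width is simply held at its maximum once the ramp is complete; the function names are illustrative and tolerances are ignored.

```python
def pulse_width_us(t_ms, start_us=1.0, end_us=8.0, ramp_ms=100.0):
    """Nominal pulse width at time t_ms, rising linearly and then held at maximum."""
    fraction = min(max(t_ms / ramp_ms, 0.0), 1.0)
    return start_us + (end_us - start_us) * fraction

def pulse_period_us(t_ms, duty=0.5):
    """Pulse repetition period that keeps the stated duty cycle at time t_ms."""
    return pulse_width_us(t_ms) / duty

for t in (0, 50, 100, 400):
    print(t, "ms:", pulse_width_us(t), "usec wide,", pulse_period_us(t), "usec period")
```

Holding the width at 8 microseconds after the ramp is an assumption made to cover the occasional fuse that takes up to 400 ms.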


Figure 2.6. 3601 Programmer


Figure 2.7. Pulses During Programming

## PREPARING DATA FOR ROMs and PROMs

Entering the data into a ROM is called "programming" the ROM. Whether programming a field-programmed device or preparing to order mask-programmed devices, the data to be entered into the array must be prepared and very carefully checked (although erasable devices are more forgiving of errors than the other types). These data are usually in the form of truth tables or their equivalent, such as suitably prepared paper tapes, etc.
In the case of mask-programmed devices, the truth tables, tapes, computer punched cards, etc. are submitted to the integrated circuit manufacturer, who uses them to prepare at least one of the photo masks used for processing the wafers of read-only memory devices. He also uses this data to prepare test programs for testing the wafers after they are through the wafer fabrication process. Properly functioning devices are assembled, tested, and delivered to the customer.

Although the mask used for ROM programming is usually the metal mask, chosen because it is used at a late stage in the processing, the turnaround time for mask-programmed ROM's is still often several weeks. In addition the user incurs a significant charge for the mask and special handling and testing necessary.
Because of the time and cost associated with mask-programmed ROM's, most potential users develop and check out their truth tables using read-write memory simulators or field-programmable ROM's.
Data for field programmable ROMs is often prepared in the same form as for mask-programmed devices. Intel's SIM4-03/MP7-03 microcomputer set ROM programmer requires initial inputs via a suitably prepared paper tape.
When preparing truth tables and paper tapes, it is extremely important that careful attention be paid to logic level definitions, address connections, etc. A ROM is of little use if the pin assignments don't agree with those required by the system for which it was intended.
Intel has adopted the following conventions for paper tape preparation:

1. Tapes are prepared using ASCII code as produced by an ASR-33 teletype terminal.
2. The tape contains a leader of at least $6^{\prime \prime}$ of blank or rubbed-out tape.
3. Following the leader, the tape contains one data record for each ROM word (256 records for the 1302, 1602, 1602A, 1702, 1702A, 3301A, and 3601; 512 or 1024 for the 3304).
4. Each data record starts with the letter B, immediately followed by a string of P's and N's. There is one "P" or "N" for each bit in the word. The last P or N of the record is followed by an F.
5. The first data record represents the contents of address "0", i.e., all address lines at their low or negative extreme. This record is followed by a record for address 1 (all low but address A0), address 2 (all low but address A1), etc., in (binary) sequence. The last record represents the data for the location selected when all address lines are high or at their positive extreme.
6. The data records contain a P where the corresponding data bit is to be high or at the most positive output level, and an N where the output is to be low or at the most negative output level. The first P or N of each data record corresponds to the most significant or highest numbered output bit, while the last P or N of each record corresponds to the lowest numbered or least significant output bit. (Note that these numbers apply to Intel's signal labels, not to package pin numbers.) Between the highest and lowest output line labels, the remaining outputs are entered in descending sequence.
7. Following the last data record, a trailer of at least $6^{\prime \prime}$ of blank or rubbed-out tape must appear.
8. No character other than B, P, N, or F should appear in a data record. However, between data records (after F and before B) any other characters except B's and F's may appear. It is recommended that a carriage return and line feed be inserted after every fourth data record to allow listing of the tape on an ASR-33 teletype.
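
As an illustration of conventions 3 through 8, the following Python sketch formats a list of ROM words as data records. The function names `bpnf_record` and `bpnf_tape`, and the choice of a space as the separator between records, are assumptions made for this example and are not part of Intel's specification. The blank leader and trailer are physical tape and are not modeled.

```python
def bpnf_record(word, bits):
    """Return one data record: 'B', one P/N per bit (highest bit first), then 'F'."""
    body = "".join(
        "P" if (word >> i) & 1 else "N"      # P = high output, N = low output
        for i in range(bits - 1, -1, -1)     # most significant bit first
    )
    return "B" + body + "F"


def bpnf_tape(words, bits, records_per_line=4):
    """Format all words in address order, with CR/LF after every fourth record."""
    records = [bpnf_record(w, bits) for w in words]
    lines = [
        " ".join(records[i:i + records_per_line])   # space separator is an assumption
        for i in range(0, len(records), records_per_line)
    ]
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # Hypothetical 4-bit contents for addresses 0 through 4.
    print(bpnf_tape([0b0000, 0b1010, 0b1111, 0b0001, 0b1000], bits=4), end="")
```

A real 256-word device would be given a list of 256 words, with the first entry holding the contents of address 0.
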
When ROMs or PROMs are to be programmed at Intel, the data information may also be sent in the form of computer punched cards. See the Intel Data Catalog for the format.

## ROM IMPLEMENTATION OF GENERALIZED LOGIC

## A. COMBINATORIAL CIRCUITS

Digital circuits are often divided into two categories: combinatorial and sequential. Combinatorial circuits have no internal storage elements. As a result, the output signals are functions only of the inputs supplied at the time the output is measured (neglecting propagation delays). A ROM may be used to generate combinatorial functions when the number of input signals is not excessive. For example, a 256 word by 4 bit ROM has 8 input leads (addresses) and 4 output leads and so can be used to generate any 4 combinatorial functions of 8 variables. Additional functions may be generated by adding more ROM's - doubling the number of ROM's doubles the number of functions which can be generated.
Expanding the number of input variables is much more costly, however. Additional input variables may be decoded to operate chip selects, just as additional address inputs are decoded in a memory array. However, each additional input variable doubles the amount of ROM required.
Various authors have expressed the opinion that 8 to 16 bits of ROM are equivalent to one logic gate. However, this ratio does not apply to all designs. For example, to make a quad full adder (5 outputs, 9 inputs) would require $5 \times 2^{9}$ or 2560 bits of ROM, but can be realized with less than 40 gates, for a ratio greater than 64 bits/gate.
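
The arithmetic in this example can be checked with a short Python sketch. The address layout used to build the adder table (operand A on the low four address bits, operand B on the next four, carry-in on the highest bit) is an assumption chosen for illustration; any consistent assignment of inputs to address lines would serve.

```python
def rom_bits(n_inputs, n_outputs):
    """Bits of ROM needed for n_outputs arbitrary functions of n_inputs variables."""
    return n_outputs * 2 ** n_inputs


def build_adder_rom():
    """Contents of a 512-word by 5-bit ROM acting as a quad full adder."""
    rom = []
    for addr in range(2 ** 9):
        a = addr & 0xF            # assumed: A on address bits 0-3
        b = (addr >> 4) & 0xF     # assumed: B on address bits 4-7
        carry_in = (addr >> 8) & 1
        rom.append(a + b + carry_in)   # 4 sum bits plus carry-out fit in 5 bits
    return rom


if __name__ == "__main__":
    print(rom_bits(8, 4))         # the 256 word by 4 bit ROM
    print(rom_bits(9, 5))         # the quad full adder
    print(rom_bits(9, 5) / 40)    # bits per gate if 40 gates are assumed
```
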
When using ROM to replace wired logic gates, the designer should remember that the ROM is not guaranteed to give a single output transition for a single input transition. Figure 2.8 illustrates the way the designer should view the ROM's behavior. In Figure 2.8, after a short hold time, the outputs are undefined until a period equal to the ROM's access time has elapsed. During this undefined interval, the ROM outputs may show noise or extra transitions. Not all ROM's specify a hold time. Even when a hold time is specified, it is valid only when access to a location has been made, and is measured from the first address transition.

Figure 2.8. ROM Behavior for Combinatorial Logic

## B. SEQUENTIAL CIRCUITS

Sequential circuits are logic circuits with internal storage. As a result, outputs are a function of past as well as present inputs. Sequential circuits are often realized by a collection of storage elements (flip-flops) together with combinatorial logic. Outputs of the sequential network are combinatorial functions of the inputs to the network and the flip-flop outputs. The inputs to the flip-flops are combinatorial functions of network inputs and flip-flop outputs.

When a sequential digital system is described in the above manner, the state of the circuit is determined by the contents of the flip-flops. Therefore, a machine with $n$ flip-flops can have at most $2^{n}$ internal states. To describe the circuit behavior, two sets of information must be known:

1. The outputs as functions of inputs and internal states; and
2. The next states as functions of inputs and internal states.

This information may be presented via tables or graphically in the form of a state sequence diagram, such as that shown in Figure 2.9a. The state sequence diagram is usually drawn as a collection of circles, each labelled to correspond to one state of the machine. The circles (states) are connected by directed lines (arrows) indicating which state transitions may take place. Each such transition line is labelled with the values of the input variables for which the transition takes place, unless the input variables have no effect. In that case, the state transition always takes place and the arrow is unlabelled.

Some digital circuits are clocked, i.e., state transitions take place only upon occurrence of a clock pulse. If for some input conditions no state transition takes place at a clock time, it is indicated on the diagram as an arrow which leaves and re-enters the same circle. This arrow is labelled, like any other, with the corresponding input conditions. Clocked sequential circuits are readily designed using clocked flip-flops of the JK or D variety such as those shown in Figures 2.9b and 2.9c.


Figure 2.9. Sequential Circuit State Diagram and Realizations

## B.1. State Assignment

The state-sequence diagram describes the digital circuit behavior independent of the assignment of states to the circles of the diagram. Each circle in the diagram must be assigned a unique set of values for the state variables. Each state variable can take on the value of 1 or 0 , so that $n$ state variables can provide values for up to $2^{n}$ circles in the diagram. However, the way the values are assigned to the circles can make a significant difference in the ease of realization when JK or D flip-flops are used. At present, no known technique, other than repeated trials, exists for determining the minimum cost state assignment. The designer's insight and experience contribute significantly to the design efficiency. However, when ROM's are used, state assignments are less critical than for realization with wired logic gates.

## B.2. Asynchronous Input to Clocked Systems

When a clocked system has asynchronous input variables, i.e., variables which can change at other than clock times, proper behavior may depend upon the state assignment used. For example, if the values of a given asynchronous input variable can affect the values of two state variables in a given state transition, differential delays in the logic may allow 4 rather than 2 possible state changes to take place: neither, either, or both of the variables may change. To avoid this situation, state assignments should be such that only one state variable is a function of each asynchronous input variable or the asynchronous input variable should be made synchronous by clocking it into a flip-flop. Of course, the latter procedure increases the response time of the system to the input signal.

These considerations also apply to the asynchronous flip-flop forcing inputs. In general, these inputs can force the network into one or more of a subset of the states where it will remain until the forcing input is removed. If the network clocked transitions attempt to change more than one forced state variable, asynchronous removal of the forcing signal may result in any of several state transitions: any or all of the variables attempting to change may do so, depending upon differential delays in flip-flop responses, clock distribution, and distribution of the forcing signal.

## B.3. Realizations with D Flip-Flops

Having assigned state variable values for each state, realization with D flip-flops is very straightforward.* First, a truth table or set of Karnaugh maps is prepared. The source variables include all state variables and all input variables. The functions to be generated involve all state variables (next state value) and all output functions. Those functions representing the next state values are used as the data inputs to the corresponding $D$ flip-flops.
*For sequential networks wired with logic gates, JK flip-flops may reduce the gate count, as in Figures 2.9b and 2.9c. However, ROM realizations are more economical when D flip-flops are used, because fewer functions need be generated.

Figure 2.10 shows a symbolic diagram of such a network. The "clocked register" is an array of $n$ D-type flip-flops.
A read only memory array with $p$ address inputs and $q$ outputs ($2^{p} \times q$ bits) can generate a total of $q$ output functions of $p$ inputs. Thus for Figure 2.10, if $n$ state variables are required, $p-n$ input variables may be used and $q-n$ output signals may be generated.
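
The arrangement of a ROM feeding a clocked register can be sketched in Python. The machine below is a hypothetical example, not one taken from the figures: a modulo-3 counter with one enable input and one output that is high in state 2, so that $n=2$, $p=3$, and $q=3$. The address and data bit layouts are assumptions made for the sketch.

```python
N_STATE = 2   # n: state variables held in the clocked register
P = 3         # p: ROM address inputs = n state bits + (p - n) = 1 input variable
Q = 3         # q: ROM outputs = n next-state bits + (q - n) = 1 output signal


def build_rom():
    """Fill the 2**P word ROM: address = (enable, state), data = (output, next state)."""
    rom = [0] * (2 ** P)
    for addr in range(2 ** P):
        state = addr & 0b11
        enable = (addr >> N_STATE) & 1
        next_state = (state + 1) % 3 if enable else state
        if state == 3:                 # unused state: return to state 0
            next_state = 0
        output = 1 if state == 2 else 0
        rom[addr] = (output << N_STATE) | next_state
    return rom


def run(rom, inputs, state=0):
    """Apply one input value per clock; return the output sequence and final state."""
    outputs = []
    for x in inputs:
        word = rom[(x << N_STATE) | state]      # ROM access before the clock edge
        outputs.append(word >> N_STATE)
        state = word & (2 ** N_STATE - 1)       # clock edge loads the D register
    return outputs, state


if __name__ == "__main__":
    print(run(build_rom(), [1, 1, 1, 1, 0, 1]))
```

Changing the state assignment in this sketch only changes the contents of the table, not its size.
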


Figure 2.10. Realization of Digital Machine

Because a ROM's internal realization is quite different from that of a conventional combinatorial logic network, different considerations apply to ROM designs than to conventional designs. For example:

1. State variable assignment has little or no effect on circuit complexity when ROM realization is used. Therefore, the designer may use state variables to form output functions directly with greater ease than for conventional designs. If, however, additional logic circuits are added to reduce total ROM requirements or allow asynchronous input variables (see next paragraph), some of this design freedom may be removed.
2. All outputs of a ROM must be considered functions of all inputs. Therefore, asynchronous inputs to the ROM should not be permitted to change within an access time prior to clocking the output register, or the contents of the output register may be completely unpredictable. Additional latches or separate logic between the ROM output and D flip-flop inputs should be used so that the conditions described above (under Asynchronous Inputs to Clocked Systems) can be met. Additional ROM outputs may be used to enable or disable this logic.

Figure 2.11. ROM Realization of Sequential Machine

## B.4. Methods of Reducing ROM Size

If the number of input or output variables is large, a straightforward realization with ROM may not be practical. However, it may be possible in certain cases to reduce the amount of ROM by adding a small amount of additional logic. Several methods for reducing the size of a ROM needed to perform a given function are described below. The use of these techniques when appropriate may permit a ROM approach to be used in a situation where it would normally be impractical to do so. Most of these techniques are illustrated in Figure 2.11.

## Multiplexing Input Variables

Instead of using all input control variables at all times, many digital machines have only a few states where the next state decision is affected by the input variables. Therefore, a multiplexer may be used to select the input variables which are active for each given state of the machine. The effective number of input control variables at the ROM may be reduced to a number equivalent to the largest number active at any one time.

The control signals for the multiplexer may be generated by logic circuits which decode the state information or by extra output variables from the ROM. In general, these extra ROM output variables are far less expensive than the extra ROM inputs that would otherwise be necessary.
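
To give a feel for the saving, the sketch below compares ROM sizes for a hypothetical machine with 4 state variables and 8 input control variables, of which at most one is examined in any given state. All of these numbers are assumptions for illustration. The multiplexed version is assumed to take its three select lines from extra ROM outputs.

```python
def rom_bits(address_inputs, outputs):
    """Total bits in a ROM with the given number of address inputs and outputs."""
    return outputs * 2 ** address_inputs


STATE_BITS = 4        # hypothetical machine: 16 states
INPUT_VARIABLES = 8   # input control variables in the system
MAX_ACTIVE = 1        # largest number examined in any one state
SELECT_BITS = 3       # extra ROM outputs steering an 8-input multiplexer

# Every input variable wired directly to a ROM address line.
direct_bits = rom_bits(STATE_BITS + INPUT_VARIABLES, STATE_BITS)

# Only the multiplexer output reaches the ROM; select bits are added as outputs.
muxed_bits = rom_bits(STATE_BITS + MAX_ACTIVE, STATE_BITS + SELECT_BITS)

if __name__ == "__main__":
    print(direct_bits, muxed_bits)
```
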

## Bypassing the ROM for Input Control Variables

If the state assignments are made so that the next state is a simple function of the input variables, a small amount of logic may be placed between the ROM output and the clocked register. Some of the input control variables are then brought into the system via this logic rather than through the ROM. As in the case with input multiplexing, additional output signals may be used to enable this logic.

One simple form of this method uses a multiplexer between the ROM output and the clocked register. Certain of the state variables take on the values of the input variables whenever the multiplexer is set to accept these inputs. This method places restrictions on state assignment.
A similar technique is usually necessary for use with input control variables which are asynchronous.

## Output Function Distribution

When a large number of control functions must be generated, but only one or two are active at one time, data distributors may be used to generate a large number of control functions from a few ROM outputs. As an example of the type of coding which might be used, 8 non-simultaneous control functions might be generated using one data bit and 3 selection bits. The Intel 3205 decoder may also find use in ROM output expansion. Eight selection signals can be generated from three ROM function outputs.
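
The coding mentioned in this example (one data bit and three selection bits producing eight non-simultaneous control functions) can be modeled as follows. The function name `distribute` and the active-high polarity are assumptions for this sketch; an actual decoder may use the opposite output polarity.

```python
def distribute(data_bit, select):
    """Route one ROM data bit to one of eight control lines chosen by 3 select bits."""
    if not 0 <= select < 8:
        raise ValueError("select must fit in three bits")
    return [data_bit if line == select else 0 for line in range(8)]


if __name__ == "__main__":
    # Four ROM outputs (1 data + 3 select) stand in for eight control outputs.
    print(distribute(1, 5))
    print(distribute(0, 5))
```
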

## External State Generators/Partitioning/Factoring

When a large number of states of a state diagram fall in an easy to generate sequence, the number of state variables generated by the ROM may in some cases be reduced by generating the additional states with external circuits such as counters or shift registers. Functions of these separately realized state variables may be used as equivalent state variable inputs to the ROM.
As an example of this technique, consider a binary counter connected to a ROM such that the ROM can generate a preset or count enable variable and accept a carry output as equivalent to a state input variable. The ROM may be programmed so that for some states of the conventional state variables, the counter counts from its preset values until it overflows, with the ROM staying in the same state throughout the counting sequence. In this example, one input (equivalent state) variable replaces all of the state variables in the counter.

The example above is a special case of a more general technique which may be called partitioning. Instead of using an external counter with the ROM system, another ROM/register sequential machine might have been connected. The net result is a ROM/register realization of a sequential digital machine in which not all state variables are used as inputs to all of the ROM. In effect, the machine is partitioned into a number of smaller, but interactive, machines.

To partition a circuit, the state variables must be isolated into two or more groups. A new state diagram can be generated for each group. In these partitioned state diagrams, the state variables for one state diagram are treated as if they were input control variables in the other. In general, for partitioning to be effective, the state variables must be such that they can be divided into relatively independent groups.
These examples are but a few of those available to the designer wishing to take advantage of ROM. As the design complexity progresses, the structure approaches the complexity of a microprogrammed processor, one application where ROM is extensively applied.
ROM, even in complicated networks like that of Figure 2.11 or a microprogrammed processor, offers much easier modification of machine structure than wired logic. With the availability of programmable ROM's, ROM approaches to sequential circuit design merit serious consideration.
Even when a prototype system has been developed using the 3601 or 1702A, once ROM patterns have been fixed, the prototypes can easily be converted to use the 3301A or 1302 for production, for these mask-programmed devices are pin and signal compatible with their field-programmable counterparts.

## EXAMPLE OF A ROM APPLICATION

To show one simple application of ROM, consider the signal generator shown in Figure 2.12. An 8-bit counter driven by an oscillator drives a 2048-bit ROM (256 words of 8 bits). The ROM outputs are converted to an analog voltage by a digital to analog converter (DAC). By properly coding the ROM, a wide variety of waveforms may be generated.
For the system shown in Fig. 2.12, each step of the 8-bit counter represents $\frac{360}{256}$ degrees of phase angle. The value at each address in ROM is the digital number representing the signal output for that phase angle. Multiple ROM/DAC combinations might be used to generate several simultaneous waveforms, or multiple phases of a signal, for example. The outputs of the DACs will change in small discrete steps, each less than $1 \%$ of full scale. Where this might be a problem, filtering might be used. However, undesired harmonic content of the signal will be limited to the upper harmonics.
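
As one possible coding of the ROM, the Python sketch below fills the 256 words with one cycle of a sine wave. The choice of a sine wave, the offset-binary scaling (0 for the most negative value, 255 for the most positive), and the name `sine_rom` are assumptions for illustration.

```python
import math

WORDS = 256                      # one ROM word per step of the 8-bit counter
BITS = 8                         # DAC resolution
PHASE_STEP_DEG = 360 / WORDS     # phase angle represented by each counter step


def sine_rom(words=WORDS, bits=BITS):
    """Return ROM contents for one sine cycle, scaled to the DAC's full range."""
    full_scale = 2 ** bits - 1
    return [
        round((math.sin(2 * math.pi * k / words) + 1) / 2 * full_scale)
        for k in range(words)
    ]


if __name__ == "__main__":
    rom = sine_rom()
    print(PHASE_STEP_DEG)                 # degrees per counter step
    print(rom[64], rom[192])              # positive and negative peaks
```

A second ROM holding the same table rotated by a fixed number of addresses would give a second phase of the same signal.
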


Figure 2.12. Digitally Controlled Waveform Generator

## Part 3


INTRODUCTION
PRINCIPLES OF OPERATION
CIRCUIT CONSIDERATIONS
A. P-Channel MOS
B. Clock Driver Requirements
POWER CONSIDERATIONS IN SHIFT REGISTERS
RAMs AS SERIAL MEMORY
APPLICATION OF SHIFT REGISTERS
A. Shift Registers for Larger Memories
B. Addressing Shift Registers

## INTRODUCTION

One of the first semiconductor memory devices to find wide application was the MOS shift register. Two properties of MOS IC technology are uniquely compatible with the design of shift registers: the high impedance associated with the gate circuit permits temporary storage of charge on the parasitic capacitances, and MOS technology permits realization of bi-directional transmission gates which have zero dc offset. With the transmission gate, a gate-node may easily be connected to, or disconnected from, other points in the circuit.

The shift register structure offers layout economies as well. Few interconnections are required, basic shift register stages may be made using little silicon area, and no decoding or other "overhead" circuits need be placed on the chip. These features have made the shift register one of the lowest cost semiconductor memories on the market.

Shift register memories are already extensively used in computer display terminals, where the data to be displayed are recirculated through the shift register memory in synchronism with the CRT display. Shift registers are also used as the memory of most electronic calculators. By providing address counters and a comparator, the shift register is often used as a low-speed, random-access memory. For example, in line printers and card readers where access time is relatively non-critical, the MOS shift register may be used as a low-cost buffer memory. To locate a datum in a serially accessed memory, a sequence of intervening locations must first be accessed. Magnetic tapes, disks, and drums are all serially accessed.

## PRINCIPLES OF OPERATION

There are many variations on the basic shift register. A typical p-channel MOS shift register circuit, i.e., half of a dual 100-bit shift register (Intel 1406) is shown in Figure 3.1. Each bit of the shift register requires six MOS devices. Note that the devices of bit 2 have been labeled $Q_{2 A}$ through $Q_{2 F}$. The input datum to the stage is the charge on the gate of $Q_{2 A}$, deposited by the previous stage. When clock $\phi_{2}$ goes negative (for $p$-channel devices), devices $Q_{2 A}$ and $Q_{2 B}$ form an inverter stage. If the charge on the gate of $\mathrm{Q}_{2 \mathrm{~A}}$ is sufficiently negative to cause it to conduct strongly, the node common to $Q_{2 A}, Q_{2 B}$, and $Q_{2 C}$ will approach $\mathrm{V}_{\mathrm{CC}}$ (a positive level). However, if the charge at the gate of $Q_{2 A}$ is positive enough to leave $Q_{2 A}$ cut off, when $\phi_{2}$ becomes negative, the common node will approach $\mathrm{V}_{\mathrm{DD}}$, the negative supply.

At the same time, when $\phi_{2}$ goes negative, $Q_{2 C}$ conducts, charging the parasitic gate capacitance of $Q_{2 D}$ (shown as capacitor C in Fig. 3.1) to the same potential as the node common to $Q_{2 A}, Q_{2 B}$, and $Q_{2 C}$. When $\phi_{2}$ is removed, the gate of $Q_{2 D}$ retains its potential. Pulsing $\phi_{1}$ negative then transfers and again inverts the datum, depositing it at the input to the next stage.
In the circuit shown, an output device is provided so that charge need not be retained on external leads. This output buffer also acts as an inverter. To make input and output levels compatible, an input inverter is provided. In Figure 3.1, the data is transferred to the output during clock phase 1. Thus, the data is available during clock phase 2. Data at the input must be available prior to and during phase 2.


Figure 3.1. MOS Shift Register

The previously discussed circuit is a dynamic shift register, using only capacitive storage. To retain the data stored in the register, the rate at which the data is circulated (or clocked) through the register must not fall below some minimum, typically 1 kHz at room temperature. The minimum data rate increases with the ambient temperature.
Static shift registers may also be manufactured using MOS technology. The static register can retain the data indefinitely if stopped at the proper point in the clock cycle. However, static shift register cells are significantly larger than the dynamic cells and consume much more power. For these reasons, static registers are not only more expensive but also usually slower than dynamic registers.

The earliest MOS shift registers used high-voltage, metal gate technology, and required 15 to 20 volt power supplies and 30 volt clocks. However, with the introduction by Intel of p-channel silicon-gate technology in 1969, shift registers could be operated from 10 to 14 volt supplies ($+5, -5$ or $+5, -9$) and with clock swings of about 15 volts. Furthermore, by biasing the positive power terminal at +5 volts, TTL compatibility on data input and output became achievable.

The compatibility of MOS shift registers with TTL logic was further enhanced by the development of a 5 volt, n-channel, silicon gate MOS process. The Intel 2401 (dual 1024 bits) and 2405 (single 1024 bits) are n-channel, dynamic recirculating shift registers with a data rate of 1 MHz at $0^{\circ} \mathrm{C}$ to $70^{\circ} \mathrm{C}$. Only a single 5 V supply is required for operation. The input, output, and clock levels are completely TTL compatible. The 2401 and 2405 require only a single TTL level clock. Power dissipation is typically $120 \mu \mathrm{W} / \mathrm{bit}$. A photomicrograph of the 2401 is shown in Figure 3.2.

Table 3.1 shows some of the varieties of dynamic MOS shift registers which are currently available.


Figure 3.2. Photomicrograph of n-channel 2048 bit Shift Register

Table 3.1. Shift Registers

| Part Number | Technology | Organization | Data Rate (Max/Min) | Power Supplies (Volts) ${ }^{[2]}$ | TTL Compatible Data | TTL Compatible Clocks |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 1402A | p-channel | $256 \times 4$ | $5 \mathrm{MHz} / 10 \mathrm{kHz}$ | +5, -5 | Yes | No |
| 1403A | p-channel | $512 \times 2$ | $5 \mathrm{MHz} / 10 \mathrm{kHz}$ | +5, -5 | Yes | No |
| 1404A | p-channel | $1024 \times 1$ | $5 \mathrm{MHz} / 10 \mathrm{kHz}$ | +5, -5 | Yes | No |
| 1405A | p-channel | $512 \times 1$ recirc. ${ }^{[1]}$ | $2 \mathrm{MHz} / 10 \mathrm{kHz}$ | +5, -5 | Yes | No |
| 1406 | p-channel | $100 \times 2$ | $2 \mathrm{MHz} / 10 \mathrm{kHz}$ | +5, -5 | Yes | No |
| 1407 | p-channel | $100 \times 2$ | $2 \mathrm{MHz} / 10 \mathrm{kHz}$ | +5, -5 | Yes | No |
| 1506 | p-channel | $100 \times 2$ | $2 \mathrm{MHz} / 5 \mathrm{kHz}$ | +5, -5 | Yes | No |
| 1507 | p-channel | $100 \times 2$ | $2 \mathrm{MHz} / 5 \mathrm{kHz}$ | +5, -5 | Yes | No |
| 2401 | n-channel | $1024 \times 2$ recirc. ${ }^{[1]}$ | $1 \mathrm{MHz} / 25 \mathrm{kHz}$ | +5 only | Yes | Yes |
| 2405 | n-channel | $1024 \times 1$ recirc. ${ }^{[1]}$ | $1 \mathrm{MHz} / 25 \mathrm{kHz}$ | +5 only | Yes | Yes |

## NOTES:

1. The 1405A, 2401, and 2405 have recirculation logic on the chip.
2. The 1402A, 1403A, 1404A, and 1405A may also be operated at $+5 \mathrm{~V},-9 \mathrm{~V}$. Refer to the Intel Data Catalog for specifications.

## CIRCUIT CONSIDERATIONS

## A. P-CHANNEL MOS TO TTL INTERFACE

All of the p-channel MOS devices listed in Table 3.1 are TTL compatible on data input and output by means of at most a single added resistor per input/output pin.

With the shift register biased so that the $\mathrm{V}_{\mathrm{CC}}$ pin is +5 V and the $\mathrm{V}_{\mathrm{DD}}$ pin is at -5 V , a TTL gate (biased between ground and $\mathrm{V}_{\mathrm{CC}}$ ) can drive the data input pin of the shift register. To meet rated signal levels, a pull up resistor should be used for logic gates with active pull-ups. A suitable resistor on the output terminal allows a single TTL load to be driven. The connections are shown in Figure 3.3.
Figure 3.3 also shows an interface from the output of one register into another, along with a table of resistor values which apply to the different interfaces. The 1407 and 1507 have an internal pull down resistor of approximately 20 K . As a result, these devices do not require an MOS to MOS interface resistor.

For the devices in Fig. 3.3 operating at +5 V and -5 V, the clock amplitude is between +5 and some negative voltage, in the range of -9 to -12 V. The positive swing should approach $V_{CC}$ with a tolerance of $+0.3 \mathrm{V}, -1.0 \mathrm{V}$, so that no substrate current flows and no transmission gate remains conductive during the opposite clock pulse. Exceeding either limit can reduce the ability of the gate nodes to store charge. The effects may prevent operation or, at best, raise the minimum operating frequency.
While some variation in power supply and clock amplitudes may be tolerated, deviations of sufficient magnitude may reduce the range of frequency of operation from both ends. Excessive supplies may result in problems due to field inversion, i.e., creation of conducting channels under leads carrying the excessive voltage. These channels correspond to undesired connections between points of the circuit. Low supply voltage may lower the maximum and raise the minimum operating frequency.

Figure 3.3. Shift Register to TTL Interface

## B. CLOCK DRIVER REQUIREMENTS

Clock driver circuits must be capable of producing the proper clock voltage levels and timing. These circuits must also be capable of driving the significant load represented by the clock input terminals. For the devices listed in Table 3.1, each clock terminal presents a load of approximately 0.1 to 0.2 pF per bit.
Thus, for a system of 10,000 bits, a capacitance of 1,000 to $2,000 \mathrm{pF}$ can be expected. Large transient currents flow in clock driver circuits. A 15 V swing in 150 ns requires 0.1 mA average current per picofarad. As a result, clock drivers capable of driving a 10,000-bit memory with 150 ns rise and fall times must provide 200 mA drive currents. Rise and fall times this long would limit operation to below 1.2 MHz.
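
The figures above follow from the relation $I = C \, \Delta V / \Delta t$ for charging a capacitance through a voltage swing. A minimal Python sketch of the calculation, using the values quoted in the text (the function and variable names are illustrative only):

```python
def average_clock_current(capacitance_f, swing_v, transition_s):
    """Average current in amperes to move a capacitance through a swing: I = C*dV/dt."""
    return capacitance_f * swing_v / transition_s


PF = 1e-12   # farads per picofarad
NS = 1e-9    # seconds per nanosecond

# Current needed per picofarad of load for a 15 V swing in 150 ns, in mA.
per_pf_ma = average_clock_current(1 * PF, 15, 150 * NS) * 1e3

# Clock load of a 10,000-bit system at 0.1 to 0.2 pF per bit, in pF.
bits = 10_000
load_pf_low, load_pf_high = bits * 0.1, bits * 0.2

# Drive current for the larger load, in mA.
drive_ma = average_clock_current(load_pf_high * PF, 15, 150 * NS) * 1e3

if __name__ == "__main__":
    print(per_pf_ma, load_pf_low, load_pf_high, drive_ma)
```
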

The clock drivers must not only be capable of driving the capacitive load, but must do so without excessive ringing, because ringing voltages may violate the restrictions mentioned in Section A. In addition, when at the positive extreme (off condition), the clock driver should present a low impedance. As all shift registers exhibit some clock to clock coupling capacitance, drivers with insufficient damping capability (too high an impedance in the high state) may allow positive or negative spikes to be coupled from one clock to the opposite phase. Such coupled spikes, like any other noise, can degrade performance.

a. Clock Driver with Emitter Follower Output

b. Clock Driver with Non-Saturating Output Stage

Figure 3.4. Shift Register Clock Driver Circuits

Figure 3.4 shows two clock driver circuits which may be used to drive shift registers.

The typical waveform produced by a clock driver operating in a shift register system is similar to the idealized waveform shown in Figure 3.5.
Table 3.2 lists some of the characteristics of the clock driver waveforms as produced by the clock drivers of Figure 3.4. The parameters in the table are labelled on Figure 3.5. The measurements of Table 3.2 were taken using a test load.
The test load specified in Table 3.2 is as shown in Figure 3.6. This test load models load and coupling capacitance for a typical shift register. The two capacitance values in the table correspond to $\mathrm{C}_{1}$ and $\mathrm{C}_{2}$, respectively.
The values in Table 3.2 are derived by applying a pulse train to input 1 while holding driver 2 in the high output state by a suitable signal on input 2. Output rise and fall times are measured at test point 1, while coupled signals are measured at test point 2.

In the driver of Fig. 3.4a, resistors $R_{1}, R_{2}$, and $R_{3}$ control the damping and limit charging current. The diode serves to apply some negative bias to the base of $Q_{3}$ when the driver is in the high state. This bias improves the ability of $Q_{3}$ to damp positive transients.
The driver of Fig. 3.4b uses a transistor operated in a non-saturating mode ($Q_{1}$) to provide low output impedance in the high state. The output impedance will be the sum of resistance $R_{2}$ and the output impedance at the collector of $Q_{1}$. Positive pulse damping is limited by the current through $R_{1}$. $R_{3}$ limits the negative charging current to avoid excessively negative coupled pulses. Because of the limited positive clamping capability, loads in excess of $2,000 \mathrm{pF}$ are not recommended unless $R_{1}$ is lowered. However, a lower value of $R_{1}$ may require replacing the TTL gate with a TTL buffer/driver gate.
For very small loads, e.g., $C_{1}=200 \mathrm{pF}$, saturation effects in transistor $Q_{3}$ may become significant. Adding capacitor $C$ and diodes $D_{3}$ and $D_{4}$ to the circuit of Fig. 3.4b makes $Q_{3}$ a non-saturating transistor and speeds turn off time. (The line from the collector of $Q_{2}$ to the base of $Q_{3}$ must be opened at point X in Fig. 3.4b.) When capacitive coupling like that of capacitor $C$ is used, care must be taken that noise on the $V_{DD}$ supply does not cause $Q_{3}$ to conduct and produce stray negative clock pulses.

Figure 3.5. Clock Driver Waveforms

Figure 3.6. Clock Driver Test Load

Table 3.2. Clock Driver Characteristics

| Driver/Load | Delay to Start of Fall (ns) | Fall Time (ns) | Delay to Start of Rise (ns) | Rise Time (ns) | Coupled Max. | Coupled Min. | High Level | Low Level |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Circuit a, $R_{1}=0$ <br> $10,000 \mathrm{pF} / 1100 \mathrm{pF}$ | 70 | 200 | 20 | 90 | $V_{CC}+0.1$ | $V_{CC}-0.95$ | $V_{CC}-0.5$ | $V_{DD}+0.5$ |
| Circuit a, $R_{1}=100$ <br> $1000 \mathrm{pF} / 110 \mathrm{pF}$ | 45 | 45 | 20 | 25 | $V_{CC}-0.05$ | $V_{CC}-0.95$ | $V_{CC}-0.5$ | $V_{DD}+0.5$ |
| Circuit a, $R_{1}=100$ <br> $200 \mathrm{pF} / 22 \mathrm{pF}$ | 40 | 50 | 15 | 10 | $V_{CC}-0.05$ | $V_{CC}-0.8$ | $V_{CC}-0.5$ | $V_{DD}+0.5$ |
| Circuit b <br> $8000 \mathrm{pF} / 220 \mathrm{pF}$ | 40 | 130 | 30 | 150 | $V_{CC}+0.05$ | $V_{CC}-1.0$ | $V_{CC}-0.5$ | $V_{DD}+0.2$ |
| Circuit b <br> $1000 \mathrm{pF} / 110 \mathrm{pF}$ | 25 | 65 | 25 | 75 | $V_{CC}+0.00$ | $V_{CC}-0.95$ | $V_{CC}-0.5$ | $V_{DD}+0.2$ |
| Circuit b <br> $200 \mathrm{pF} / 22 \mathrm{pF}$ <br> $C, D_{3}, D_{4}$ added (see text) | 20 | 25 | 20 | 20 | $V_{CC}+0.2$ | $V_{CC}-0.6$ | $V_{CC}-0.3$ | $V_{DD}-0.7$ |

## POWER CONSIDERATION IN SHIFT REGISTER MEMORIES

Except for the 2401 and 2405, all of the shift registers listed in Table 3.1 exhibit a power dissipation which is directly proportional to clock duty cycle, because they draw power supply current only when one of the clocks is active. Suitably designed shift register memories can therefore retain data at very low power levels. Lowest operating power is achieved at the lowest operating frequency, with the clock operated at its rated minimum width. For example, the 1402A/1403A/1404A family of shift registers can typically be operated at room temperature with dissipations of less than $0.2 \mu \mathrm{W}$ per bit. At such low power levels, the dissipation of clock drivers, address control and interface circuits becomes significant, so special switching of power to clock drivers and interface pull down resistors may be necessary to achieve minimum power data retention.

## RAMs AS SERIAL MEMORY

Random-access memories may be made to appear serial by providing a counter to generate addresses for the RAM. Using a read/write cycle allows the contents of the selected location to be read, after which new data may be entered. A divide by N counter makes a RAM of N or more words appear much like an N-bit recirculating shift register.
Although a RAM may be used in this manner, certain features of a true shift register will be absent. For example, because data does not actually move, editing by switching extra stages in or out is not available. Similarly, while a "tap" between sections of a shift register does not slow its operation, simulating a tap with the RAM requires an additional access to memory at an address which must be computed by adding the tap displacement (from the end of the shift register) to the value in the address counter. Nevertheless, some CRT display terminals use RAMs for buffer storage, operating them in a mode similar to that of a shift register. The RAM used in this manner may be accessed in normal random-access mode at those times when it is not being "shifted", e.g., during CRT retrace. A static RAM used in this manner need not have a minimum shift rate. However, dynamic RAMs with refresh requirements (such as the 1103) may meet those requirements by using this pseudo-shift register organization with an appropriately chosen minimum shift rate.
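
The counter-addressed RAM described above can be sketched as a small behavioral model. This is only an illustration: the class name `PseudoShiftRegister` and its methods are hypothetical, and the model ignores timing and refresh.

```python
class PseudoShiftRegister:
    """RAM plus divide-by-N address counter, acting like an N-word
    recirculating shift register (behavioral sketch only)."""

    def __init__(self, n, ram_words=None):
        self.n = n
        self.ram = [0] * (ram_words or n)  # a RAM of N or more words
        self.counter = 0                   # divide-by-N address counter

    def shift(self, data_in=None):
        """One read/write cycle: read the selected word, optionally
        enter new data, then advance the address counter."""
        out = self.ram[self.counter]
        if data_in is not None:
            self.ram[self.counter] = data_in
        self.counter = (self.counter + 1) % self.n
        return out

    def tap(self, displacement):
        """Simulated tap: an additional access at counter + displacement."""
        return self.ram[(self.counter + displacement) % self.n]


sr = PseudoShiftRegister(4)
for value in (10, 20, 30, 40):
    sr.shift(value)          # load four words
first_out = sr.shift()       # recirculate without writing
tapped = sr.tap(2)           # word two positions behind the output
```

Note that `tap` costs a separate memory access in this model, which is the penalty the text describes relative to a true shift register tap.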

## APPLICATION OF SHIFT REGISTERS

Although a shift register memory behaves much like an acoustic delay line or magnetic drum memory, unlike these memories, the shift register can be "stopped" for some maximum period and then instantaneously restarted. As a result, it is much more convenient to synchronize a shift register memory to other circuits than it is to synchronize a drum or delay line. In addition, shift registers waste less storage than the other techniques when used in certain applications which do not have continuous data flow.
One such application is the refresh memory for a cathode ray tube (CRT) computer output terminal. These terminals are used to display alphanumeric computer data and usually consist of an input keyboard, CRT with buffer memory, and a relatively low data rate interface to a (remote) computer.
Figure 3.7 shows a block diagram of the display portion of such a terminal. The serial memory is usually realized with MOS dynamic shift registers and stores several hundred to a few thousand characters. To minimize the amount of memory, the characters are usually stored in a compact code of six to eight bits each. The display on the face of the CRT is usually a $5 \times 7$ or a $7 \times 9$ dot matrix, so that to display the character, the code must be converted from the compact code to the 35- or 63-bit display code. This conversion is performed by a read-only memory (ROM).


Figure 3.7. CRT Display Terminal

Several types of CRT raster scans may be used with this type of display. If a standard video scan is used, the serial memory must present the same data for several lines in succession, i.e., for each line on which part of that row of characters appears. Figure 3.8 shows one way to organize the memory so that no storage is wasted. One register is used as a recirculating register as well as a continuation of the main memory. In normal mode, control signal $Y$ is true, $X$ is false, and both clock sets are operated in synchronism. When data to be recirculated have been loaded into the recirculation register, $Y$ is made false, $X$ is made true, and the clocks to main memory are stopped. The data are then recirculated as many times as necessary by applying a sufficient number of clock pulses ($\phi_{1 \mathrm{B}}$ and $\phi_{2 \mathrm{B}}$). Note that the length of the recirculating register need not be exactly the same as the number of characters to be displayed - the shift register can be longer if there is sufficient time during retrace to shift the unwanted data past the output port.


Figure 3.8. Serial Memory with Recirculating Loop (1-bit path shown)

Editing of data in the shift register memory can be accomplished by providing an extra stage or two which can be switched in and out - effectively lengthening or shortening the shift register. Thus, when a bit must be inserted between two other bits, the data in the register are shifted until the two bits between which the new data must be inserted are adjacent to the extra stage. The data to be added are placed in the extra stage and the connections changed to include the extra stage in the shift register chain. To maintain the length of the shift register at a constant value, some other character must be deleted. By clocking the data until the character to be deleted is in the extra stage, and then disconnecting the extra stage, the deletion can be effected. Fig. 3.9 shows one circuit which may be used to provide this feature. When $A$ is false and $B$ and $X$ are true, the extra stage, realized by a D flip-flop, is included. When $A$ is true and $B$ is false, the stage is not included. Inputs $Y$ and $D$ are for loading the D flip-flop with data from an external source. The use of AND-OR-INVERT gates results in the data in the flip-flop being inverted, so the $\overline{Q}$ output is used. In Figure 3.9, the extra flip-flop stage is shown being clocked with clock phase $\phi_{1}$. This signal (supplied at the proper TTL logic level) would be correct for the 1406 and 1407 shift registers. However, when using the 1402A, 1403A, or 1404A shift registers, which are internally multiplexed, the signal to the clock input of the D flip-flop must be equivalent to $\phi_{1}$ and $\phi_{2}$ OR'ed together. In general, the D flip-flop must be clocked once for each data item transfer within the shift register.
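
The insert-then-delete editing sequence can be followed in a behavioral sketch. It models only the data movement (output, extra stage, input), not the gate-level circuit of Fig. 3.9; the class and method names are hypothetical.

```python
from collections import deque


class EditableShiftRegister:
    """Recirculating register with one switchable extra stage placed
    in the path from the output back to the input (behavioral sketch)."""

    def __init__(self, data):
        self.main = deque(data)  # main[0] is at the output end
        self.extra = None        # contents of the extra stage
        self.included = False    # is the extra stage in the chain?

    def clock(self):
        """Shift one position, recirculating output to input."""
        out = self.main.popleft()
        if self.included:
            self.main.append(self.extra)  # extra stage feeds the input
            self.extra = out              # output feeds the extra stage
        else:
            self.main.append(out)
        return out

    def load_and_include(self, item):
        """Place new data in the extra stage and switch it into the chain."""
        self.extra = item
        self.included = True

    def disconnect(self):
        """Switch the extra stage out, deleting whatever it holds."""
        dropped, self.extra, self.included = self.extra, None, False
        return dropped


reg = EditableShiftRegister("ABCD")
reg.clock()                    # A recirculates; B is now at the output
reg.load_and_include("X")      # X now sits between A and B
while reg.extra != "D":        # clock until D is in the extra stage
    reg.clock()
deleted = reg.disconnect()     # register length is back to four
```

After this sequence the register holds A, X, B, C in circulation order and the deleted item is D.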


Figure 3.9. Circuit for Adding and Deleting Data in Shift Register Memory

Another common organization used for CRT display terminals utilizes a "dither" scan or short range vertical scan superimposed on a slower and coarser raster than that described above. The vertical dither scan is chosen to be one character in height, with the coarse raster having one horizontal trace for each line being displayed. Keeping the vertical scan rate the same as that used for conventional raster scans results in a much slower horizontal line rate. In addition, because each character is fully displayed after being fetched from the shift register memory, the number of accesses to the memory and hence the data rate of the memory is reduced. As a result, the lower shift rate reduces the power dissipation of the shift register, and the single access per display eliminates the need for the one line recirculating buffer as shown in Figure 3.8.

## A. SHIFT REGISTERS FOR LARGER MEMORIES

The MOS shift register offers a low-cost, high-performance memory that falls, both in access time and price, between the drum memory and the main frame ferrite-core or semiconductor memory. The shift register may be used to realize file memories with performance in this intermediate range. While most drums show access times in excess of
several milliseconds, and mainframe computer memory access time is usually under $1 \mu \mathrm{~s}$, shift register memories are conveniently designed with access times in the order of 50 to $500 \mu \mathrm{~s}$.
Shift register access time may be reduced below these figures in several ways not available with drum systems. For example, a typical shift register may be stopped for periods up to 1 ms without loss of data. If the requirement for a given access to the shift register memory can be predicted in advance, it may be possible to bring the shift register to the proper location before the data are needed. To fully utilize this capability, it is necessary to predict the need for the access at least a worst case access time in advance of when the access is to be made. However, if the prediction occurs too early, the shift register may have to be clocked past the desired location to retain the integrity of stored data. Of course, the data must then be fully circulated to again reach the desired location. However, if it is not possible to predict the next access within 1 ms of when needed (a typical maximum stopping time), the shift register might be stopped several addresses before the desired location. As an example, consider a 2 MHz shift register, 1024 bits long, with 1 ms hold time. Suppose that subsequent accesses to the shift register memory can be predicted as occurring some time between 0.5 and 10 ms in the future. Normal worst case access would be 0.5 ms without prediction. By stopping 10 addresses in advance of the desired location, worst case access time is reduced to $5 \mu \mathrm{s}$, and yet the desired location can remain available within $5 \mu \mathrm{s}$ for at least 10 ms.
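
The figures in the 2 MHz example can be checked with a few lines of arithmetic. The function name is hypothetical, and the availability window assumes (as an interpretation of the text) that the register is advanced one address each time the hold limit is reached.

```python
def stop_ahead_timing(clock_hz, length_bits, hold_time_s, stop_ahead):
    """Return the three timing figures for a stop-ahead access scheme."""
    bit_time = 1.0 / clock_hz
    return {
        # one full circulation: worst case access without prediction
        "worst_case_s": length_bits * bit_time,
        # clocks needed to reach the location from the stopping point
        "stop_ahead_s": stop_ahead * bit_time,
        # assumed: one clock per hold period until the location is reached
        "window_s": stop_ahead * hold_time_s,
    }


timing = stop_ahead_timing(clock_hz=2e6, length_bits=1024,
                           hold_time_s=1e-3, stop_ahead=10)
```

With these inputs the full circulation takes 512 microseconds (the 0.5 ms quoted), the stop-ahead access takes 5 microseconds, and the window is 10 ms.
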
When the capacity of one shift register is insufficient to store a given string of data bits, several shift registers may be placed in series as shown in Fig. 3.8. The series cascade, while the least expensive organization, increases the time to access data in the memory.

An alternate structure is shown in Fig. 3.10. This circuit permits the outputs of several shift registers to be gated selectively onto the data bus. The select-and-read and select-and-write signals may be derived by decoding address bits in addition to those used for serial memory address. By using floating collector gates on the register outputs to drive the data bus, the wired OR connection may be used. Several of the shift registers listed in Table 3.1 include recirculation logic on the chip which permits interconnections equivalent to Fig. 3.10 without additional logic. The 1405A, 2401 and 2405 include this logic.

Figure 3.10. Combination Serial and Random Access Memory

When shift registers are used as file memories, it may be desirable to add some local processing. As was shown in Fig. 3.9, some editing operations may be implemented by switchable register stages. Additional logic may be added within the shift register "loop" to perform such operations as content searches, etc.

## B. ADDRESSING SHIFT REGISTERS

The data in a shift register memory moves within the memory. As a result, some form of control circuit must be provided to establish which data (equivalent address) is available at the shift register terminal.
One form of control utilizes a counter which cycles once for each circulation of the data in the shift register. That is, a divide by N counter would be used for an N-bit long shift register. (NOTE: N must correspond to the full length of the recirculation path. If a few additional stages of register are included in the recirculation logic, these stages must be included in the length count).
Data in such a shift register memory can be "randomly" accessed by combining the control counter with an address comparator, as shown in Fig. 3.11. When an address is applied at "address in" in Fig. 3.11 and if the comparator does not indicate a match, the shift register and divide by N counter are advanced until a match is achieved. Worst case access time is equal to N times the minimum clock interval, while average access time is half this value.
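
A minimal behavioral sketch of the counter-and-comparator control follows. The class name and the convention that the counter holds the address of the item at the register terminal are assumptions made for illustration.

```python
class AddressedShiftRegister:
    """N-stage recirculating register with a divide-by-N counter and
    an address comparator (behavioral sketch)."""

    def __init__(self, data):
        self.data = list(data)  # data[0] is at the register terminal
        self.n = len(self.data)
        self.counter = 0        # address of the item now at the terminal

    def clock(self):
        """Advance the register and the counter together."""
        self.data.append(self.data.pop(0))
        self.counter = (self.counter + 1) % self.n

    def access(self, address):
        """Clock until the comparator reports a match.
        Returns the item and the number of clocks that were needed."""
        if not 0 <= address < self.n:
            raise ValueError("address out of range")
        clocks = 0
        while self.counter != address:  # comparator: no match yet
            self.clock()
            clocks += 1
        return self.data[0], clocks


sr = AddressedShiftRegister(["a", "b", "c", "d"])
item, clocks = sr.access(2)
```

In this model the wait ranges from zero to N-1 clocks depending on where the data happen to be, which is consistent with a worst case on the order of N clock intervals.
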
In some cases, a length other than that provided by the standard components of Table 3.1 may be desired. While a custom chip may be designed for the application, it may be possible to use a double clocking scheme to make a given length register appear to be a few stages shorter. To utilize this technique requires that the normal operating frequency of the shift register be less than half of the maximum rate specified for the device used.
The double clocking scheme works as follows: Let N be the actual length of the shift register and $M$ be the desired length, where $\frac{N}{2}<M<N$.

Then using a structure similar to that of Fig. 3.11, let the address counter divide by $M$ rather than $N$. Provide a logic circuit which generates a function of the states of the address counter which is true for exactly $N-M$ of the counter states. When this logic function is true, clock the shift register twice rather than once. As a result the corresponding input data item is entered into two consecutive locations of the shift register. On output these two locations are shifted out in a single data interval, thus appearing as a single datum.
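
The double clocking scheme can be checked with a short simulation. The choice of which $N-M$ counter states trigger the second clock is arbitrary; the sketch below assumes the first $N-M$ states, and the class name is hypothetical.

```python
class DoubleClockedRegister:
    """N-stage register made to appear M stages long (N/2 < M < N)
    by clocking twice on N-M of the M counter states."""

    def __init__(self, n, m):
        if not n / 2 < m < n:
            raise ValueError("requires N/2 < M < N")
        self.reg = [None] * n    # reg[0] is at the output end
        self.m = m
        self.counter = 0         # divide-by-M address counter
        # assumed choice: the first N-M counter states are double clocked
        self.double_states = set(range(n - m))

    def interval(self, data_in=None, write=False):
        """One data interval: one or two clocks, depending on the counter."""
        clocks = 2 if self.counter in self.double_states else 1
        out = None
        for _ in range(clocks):
            out = self.reg.pop(0)
            # enter new data when writing, otherwise recirculate
            self.reg.append(data_in if write else out)
        self.counter = (self.counter + 1) % self.m
        return out


r = DoubleClockedRegister(n=8, m=6)
for ch in "abcdef":
    r.interval(ch, write=True)
stored = list(r.reg)                        # physical contents, 8 stages
readback = [r.interval() for _ in range(6)]  # logical contents, 6 items
```

Each counter cycle produces exactly N clocks, so the data return to the same alignment after every M data intervals.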


Figure 3.11. Random Access Control for Shift Register

An alternate method for addressing data in a shift register memory utilizes a form of content addressing. Certain word states may be reserved as markers. For example, if a 4-bit wide register is to be used for binary-coded decimal (BCD) data, only the codes for 0 through 9 normally appear. Some of the remaining codes, 10 through 15, may be used as markers to signify the starting point of each data string. To locate a given data item, the shift register is circulated until the proper marker is located. The desired data is usually placed in the locations immediately following the marker. In systems with very wide words, a number of separate marker tracks may be provided.
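
Marker addressing of BCD data can be sketched as below. The function name is hypothetical, and the rule that a data string ends at the next non-BCD code is an assumption; the text only says that the data follow the marker.

```python
def read_after_marker(ring, marker):
    """Circulate a ring of 4-bit words until `marker` (a code from 10
    to 15) appears, then collect the BCD digits that follow it.
    Returns None if the marker is not present."""
    n = len(ring)
    for start in range(n):
        if ring[start] == marker:
            digits = []
            pos = (start + 1) % n
            # assumed: the string ends at the next non-BCD code
            while ring[pos] <= 9 and len(digits) < n - 1:
                digits.append(ring[pos])
                pos = (pos + 1) % n
            return digits
    return None


ring = [3, 1, 0xA, 4, 2, 0xB, 7, 7, 7]
after_a = read_after_marker(ring, 0xA)
after_b = read_after_marker(ring, 0xB)
```

Because the register recirculates, the string that follows marker 0xB in this example wraps around the end of the list and continues up to marker 0xA.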
For minimum power systems, for example, where batteries are used to maintain data in a memory in an otherwise non-operating environment, the power associated with address control may be significant. If the retained data is to be useful when the system is restored to operation, the correlation of address counter to shift register data location must also be restored. One approach is to use a very low power technology, such as CMOS, for the address counter, and retain the correlation throughout the low power condition. An alternate technique, similar to the second addressing mode described above, may be used when certain memory word-states are not used as data items. One of these unused states may be used to mark a given address, perhaps address zero or address N-1. (Of course, this address is no longer available for data storage). After normal power is restored to the memory, but before operating it in the normal mode, the memory is circulated until the special marker code is found and the address counter is then reset to the corresponding value. In this way, the address count need not be retained during low power operation. However, the marker address and code must be reserved, and extra logic must be provided for writing the marker initially and for finding the marker and resetting the address counter.
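
The power-up resynchronization step might look like the following sketch. The function name and the return convention are hypothetical; the model simply circulates the data until the reserved marker reaches the terminal.

```python
from collections import deque


def resync(ring, marker, marker_address):
    """Circulate until the marker is at the register terminal, then
    return (counter value, clocks used, realigned contents)."""
    ring = deque(ring)
    for clocks in range(len(ring)):
        if ring[0] == marker:
            # the address counter is reset to the marker's address
            return marker_address, clocks, list(ring)
        ring.rotate(-1)  # one clock: terminal item recirculates
    raise LookupError("marker not found in memory")


counter, clocks, contents = resync([5, 2, 0xF, 9], marker=0xF,
                                   marker_address=0)
```

Here the marker is found after two clocks, and the counter is set to address zero with the marker at the terminal.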

## Part 4



# Other Memory Structures-CAMs, Buffer Memories, and Multiport Memories 

INTRODUCTION
CONTENT ADDRESSABLE MEMORY (CAM)
BUFFER MEMORIES
MULTIPORT MEMORIES

## INTRODUCTION

The first three parts of this booklet have discussed the most fundamental and common types of semiconductor memory. However, a number of other semiconductor memory organizations and structures are possible. Three memory or memory related structures are briefly discussed in this section.

## CONTENT ADDRESSABLE MEMORY (CAM)

All of the random-access memories previously discussed in this booklet used a fixed address structure. At the time of storage, data is assigned an address in memory. To retrieve the data, that address must be supplied. Content addressable memories retrieve data based on content rather than address. When data are entered, they may be assigned to the first available locations. For such data to be retrievable, they must contain "keys" or identifiers to aid in their location.
One realization of content addressable memory utilizes a memory cell which combines a basic random-access memory cell and a comparator. A Schottky technology bipolar realization of such a cell is shown in Fig. 4.1.

In Fig. 4.1, transistors $Q_{1}$ and $Q_{2}$ form a memory flip-flop while transistors $Q_{3}$ and $Q_{4}$ and the four Schottky diodes form a comparator. Data are entered into the cell in the conventional way using the row-select, data, and write-enable lines. However, cell data may be non-destructively compared with data placed on the data lines. The match line will be drawn positive by any mismatch between a connected cell and data line data. If both DATA and $\overline{\text { DATA }}$ lines are held low, the contents of the cell will not influence the signal on the match line.
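
The compare operation, including the effect of holding both data lines of a column low, can be modeled at the logic level. This sketch does not represent the transistor circuit; the class name and the use of an `enable` bit mask for the columns are assumptions.

```python
class Cam:
    """Logic-level model of a small CAM array (for example 4 words
    of 4 bits). A column whose enable bit is 0 has both of its data
    lines held low, so it cannot cause a mismatch."""

    def __init__(self, words=4, bits=4):
        self.mask = (1 << bits) - 1
        self.cells = [0] * words

    def write(self, row, value):
        self.cells[row] = value & self.mask

    def match(self, data, enable):
        """Return one flag per row: True when every enabled bit of the
        stored word equals the corresponding bit of `data`."""
        return [((word ^ data) & enable & self.mask) == 0
                for word in self.cells]


cam = Cam()
for row, word in enumerate((0b1010, 0b1000, 0b0110, 0b1011)):
    cam.write(row, word)
full = cam.match(0b1010, enable=0b1111)     # compare all four bits
partial = cam.match(0b1010, enable=0b1100)  # compare the top two bits
```

In the hardware the match line is pulled positive on a mismatch, so a `True` flag here corresponds to a match line that was not pulled.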

Figure 4.1. Content Addressable Memory Cell

CAM cells are arranged in a two dimensional array to form a memory. To expand such a memory, row select and match lines must be extended in the horizontal direction and data lines must be extended in the vertical direction. One of the problems in expanding a CAM is the connection of match lines. Each match line must be individually connected to all cells in its row. If a row must extend over more than one memory chip, its match line must be made common to all chips contributing to the row. As a result, a much larger number of connections are required in a given size CAM than in the same size RAM.

The Intel 3104 is a $4 \times 4$ bipolar CAM using the cell of Fig. 4.1. Figure 4.2 shows the package pin connections and a logic diagram of the 3104. As can be seen from Fig. 4.2b, each data input line has a corresponding enable line. Turning off the enable causes both internal data lines, DATA and $\overline{\text { DATA }}$ in Fig. 4.1, to be held low so that the column of cells cannot contribute to match.

Figure 4.2. Intel 3104 CAM (a. Pin Configuration; b. Logic Diagram)


Figure 4.3. CAM Organization

To use a CAM, each word entered into the memory contains a "key (search information)" as well as a data portion. In some systems, what is considered a data portion at one time may at a later time be used for searching. Figure 4.3 shows how an array of 3104s may be organized in an 8-word memory with a 4-bit key and a 4-bit data portion to each word.

Some organizations may permit multiple matches when searching a CAM. For example, the search may be equivalent to the question, "How many entries do I have which used X as the key?" The organization of Fig. 4.3 would result in simultaneous selection of several data words if several such entries were found. To resolve multiple finds, a priority encoder may be used. The encoder selects only one of the several addresses. The addresses selected by the search may be polled in sequence to determine the various data items. Some IC priority encoders give binary addresses as output. To use these with the 3104 would require a decoder to generate the linear select addresses for the 3104. If the data portion does not participate in searches, a 3101/3101A RAM might be used as in Figure 4.4.
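
Polling multiple matches through a priority encoder can be sketched as follows, with a key array standing in for the CAM and a separate data array standing in for the RAM. The function name and the rule that the lowest-numbered row has highest priority are assumptions.

```python
def search_all(keys, data, key):
    """Return the data items of every word whose key matches,
    resolved one at a time as a priority encoder would."""
    match = [stored == key for stored in keys]  # one match flag per row
    found = []
    while any(match):
        row = match.index(True)  # assumed priority: lowest row first
        found.append(data[row])  # binary row address selects the RAM word
        match[row] = False       # this match has now been polled
    return found


keys = [3, 7, 3, 1, 3, 0, 7, 2]  # 8 words, 4-bit keys
data = list("ABCDEFGH")          # data portion held in the RAM
hits = search_all(keys, data, 3)
```

Three words share key 3 in this example, so the encoder is polled three times.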

Small fast CAMs find application in buffer memory control (see below) and certain types of parallel processors. Possible applications are found in pipelined processors and the implementation of some functions, such as sorts, searches, etc.


Figure 4.4. CAM/RAM Combination for Multiple Entry CAM

Content addressable memory function can also be realized, with much longer effective access times, by actually searching the memory. For example, a shift register memory may be designed to include the comparison function as part of the recirculation logic. The data are completely searched in one circulation of the memory.

## BUFFER MEMORIES

A small, fast random-access memory (buffer) may be used to store images of a number of blocks of data selected from a much larger, but slower, memory. During execution of a program, a processor using a buffer needs to access the main memory only when data cannot be found in the buffer. When a buffer control strategy can be found which reduces the number of main memory accesses to a small fraction of the total number of accesses, the system may operate as if the large memory had an access time approaching that of the buffer. (In general, faster memories cost more per bit than slow ones - the buffer effectively reduces the cost of a large, fast memory). A block diagram of a buffered memory system is shown in Fig. 4.5.


Figure 4.5. Buffered Memory System

Considerable literature has been developed concerning design of buffer memories, control algorithms, and their performance vs. size. Since it is not possible within the scope of this booklet to include details of buffer design, the interested reader is referred to the literature.*

A number of the fast, bipolar memories mentioned in Part 1 may be used for the buffer memory. The 3101, 3101A, 3106, and 3107 may be used for the fast, random-access memory. Depending on the type of buffer control strategy used, the 3104 CAM may be useful for storing buffer control tables. These tables are used to identify which data blocks from the main memory reside in the buffer, and identify the location in the buffer of those that are present. The CAM, or an associated RAM, may also be used to store information about the degree of utilization of each block in the buffer and to indicate whether the contents of a block have been altered by program execution. When the buffer is full, a new block from main memory will have to replace a block in the buffer. The block to be replaced is often chosen based on the time of last access or frequency of access. If a replaced block has been altered, the data must be rewritten to main memory. Some designs write back to main memory whenever alteration of a location takes place, while others write only when the block is replaced.
When semiconductor memory is used for the main store, the system may be designed to take advantage of the non-destructive read and simplified output buffering. For example, the main memory may be organized in small,
interleaved modules with the data being transferred to the buffer by a polling technique.
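
One of the buffer control strategies mentioned above, replacement by time of last access with write-back only on replacement, can be sketched as follows. This is just one possible policy, the class name is hypothetical, and the `OrderedDict` stands in for the control tables that a CAM might hold.

```python
from collections import OrderedDict


class BufferMemory:
    """Small buffer of blocks in front of a larger main memory.
    Replaces the least recently used block; altered blocks are
    written back only when replaced (one of several possible policies)."""

    def __init__(self, main, capacity, block_size):
        self.main = main              # list of words: the large memory
        self.capacity = capacity      # number of blocks in the buffer
        self.block_size = block_size
        self.blocks = OrderedDict()   # block number -> [words, altered]
        self.main_accesses = 0        # block transfers to or from main

    def _fetch(self, block_no):
        if block_no in self.blocks:               # found in the buffer
            self.blocks.move_to_end(block_no)     # mark as most recent
            return self.blocks[block_no]
        if len(self.blocks) >= self.capacity:     # buffer full: replace
            old_no, (words, altered) = self.blocks.popitem(last=False)
            if altered:                           # write back if changed
                start = old_no * self.block_size
                self.main[start:start + self.block_size] = words
                self.main_accesses += 1
        start = block_no * self.block_size
        entry = [self.main[start:start + self.block_size], False]
        self.main_accesses += 1
        self.blocks[block_no] = entry
        return entry

    def read(self, address):
        block_no, offset = divmod(address, self.block_size)
        return self._fetch(block_no)[0][offset]

    def write(self, address, value):
        block_no, offset = divmod(address, self.block_size)
        entry = self._fetch(block_no)
        entry[0][offset] = value
        entry[1] = True                           # block has been altered


main = list(range(16))
buf = BufferMemory(main, capacity=2, block_size=4)
buf.read(1)        # miss: block 0 is brought into the buffer
buf.write(2, 99)   # hit: block 0 altered, main memory not yet updated
buf.read(5)        # miss: block 1 is brought in
buf.read(9)        # miss: block 0 is replaced and written back
```

The counter `main_accesses` gives a rough measure of how often the processor had to wait for the slower memory.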

*R.L. Mattson, et al., "Storage Hierarchy Designs", IEEE COMPCON 1972 Digest, p. 147.

## MULTIPORT MEMORIES

Large computer systems may include several processors, e.g., one or more central processors, several data channels, etc., all communicating with a very large memory. If the large memory is divided into a number of modules, each having several independent access paths, or "ports", one processor may access one memory module at the same time that another processor is accessing another module. In this way, the effective bandwidth of the memory is increased. The resultant structure, as diagrammed in Fig. 4.6, somewhat resembles a cross point switch array.
Each memory module in Fig. 4.6 watches the various processor busses. When requests for data within the module are received, the memory switches itself onto the highest priority requesting bus. As a result, processor No. 1 may access memory No. 2 even while processor No. 3 is accessing memory No. 1. Although interference is possible, and some delay is introduced by the extra logic, overall bandwidth is increased.
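
The way each module selects among requesting busses can be illustrated with a small arbitration sketch. The function name and the rule that lower processor numbers have higher priority are assumptions; the text does not specify a priority order.

```python
def arbitrate(requests):
    """requests maps processor number -> requested module (or None).
    Each module grants the highest priority processor requesting it.
    Returns a mapping of module -> granted processor."""
    grants = {}
    for processor in sorted(requests):  # assumed: low number wins
        module = requests[processor]
        if module is not None and module not in grants:
            grants[module] = processor
    return grants


# Processors 1 and 3 want different modules: both proceed at once.
parallel = arbitrate({1: 2, 3: 1})
# Processors 1 and 2 want the same module: processor 2 must wait.
conflict = arbitrate({1: 1, 2: 1, 3: 2})
```

The second case shows the interference mentioned in the text: a processor that loses arbitration is delayed until the module is free.
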
Another multiport structure is used in fast, scratch pad memories where it may be desirable to fetch two operands at once. Although memories with two or more ports may be designed with semiconductor technology, they are often significantly more expensive to produce than a single port memory. In some cases where two port access is needed, it may be desirable to duplicate the scratch pad memory. When reading from the memory, one datum is taken from each scratch pad. However, when writing to the scratch pad, identical data is written into both scratch pads. Thus, the two scratch pad machine appears as a two port memory for reading, but as a single port memory when writing.

Figure 4.6. Multiport/Multimodule Memory System
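
The duplicated scratch pad arrangement is simple enough to show directly. The class and method names below are hypothetical.

```python
class DualReadScratchPad:
    """Two identical scratch pads: two read ports, one write port."""

    def __init__(self, words):
        self.pad_a = [0] * words
        self.pad_b = [0] * words

    def write(self, address, value):
        """A write stores identical data in both scratch pads."""
        self.pad_a[address] = value
        self.pad_b[address] = value

    def read_two(self, address_a, address_b):
        """Fetch two operands at once, one from each scratch pad."""
        return self.pad_a[address_a], self.pad_b[address_b]


pad = DualReadScratchPad(16)
pad.write(3, 42)
pad.write(7, 5)
operands = pad.read_two(3, 7)
```

The cost of this approach is twice the storage, in exchange for using standard single port parts.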


All of the memory products described use binary addressing. For example, the 1103 provides 1024 address locations selected by 10 binary address lines. However, some machines are more conveniently organized to use binary coded decimal (BCD) addressing. For example, 1000 address locations may be desired, selectable by 12 BCD leads (Lead labelling might be U1, U2, U4, U8, T1, T2, T4, T8, H1, H2, H4, H8 - for units digit bit 1, units digit bit 2, . . . . , tens digit . . . . , hundreds digit bit 8). A BCD to binary encoder could be used to perform a translation from the 12 line BCD to a 10 line binary code so that a standard 1103 could be used for the memory. However, full translation is cumbersome. It should be noted that any one-to-one mapping of BCD addresses to binary addresses will serve this purpose. Thus, a true BCD to binary conversion need not be performed.

To generate a mapping, the designer should note that certain BCD codes cannot appear. For example, line U8 cannot be true when U4 or U2 is true. However, given any state for the 9 variables U8, U4, U2, T8, T4, T2, H8, H4, H2, all possible combinations of U1, T1, and H1 may appear. Thus, U1, T1, and H1 are all effectively binary variables. If the variables U8, T8, and H8 are all zero, all the remaining variables are effectively binary. Thus, if the binary address leads are designated $\mathrm{B}_{0}, \mathrm{B}_{1}, \ldots, \mathrm{B}_{9}$, a suitable mapping can be derived as shown in Table 5.1.

Table 5.1.
Generation of 10 -Line Address from 12-Line BCD

*These locations are "don't cares" - T2 and T4 have been chosen for easier implementation.

Table 5.1 is read as follows: Consider the second line of the tabular part where the code 001 appears on the left and the series of variables U2, U4, ..., 0, 0, 1 appear in the columns for $\mathrm{B}_{3}, \mathrm{B}_{4}, \ldots, \mathrm{B}_{9}$. This line implies that if $\mathrm{U} 8=0, \mathrm{T} 8=0$, and $\mathrm{H} 8=1$, then the output variable $\mathrm{B}_{3}$ should be equivalent to the input U2, the output $\mathrm{B}_{4}$ should be equivalent to the input U4, etc., with output $\mathrm{B}_{9}$ being held at 1. Table 5.1 may be filled out in a number of different ways. The following rules must apply:

1. For each code on the left, all variables which are effectively binary must appear on the right. Thus, except where the 8-bit of the digit is one, the corresponding 4- and 2-bit variables must appear on the right. This organization ensures that all possible input states on the left produce unique output states on the right.
2. Each row on the right must not overlap any other row. In other words, when any two rows on the right are exclusive NORed together, at least one column must be zero for all possible variable values.
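
The two rules can be checked by exhaustive enumeration. The sketch below uses an illustrative mapping of its own, not the assignment of Table 5.1: U1, T1, and H1 pass straight through as the three low address bits, and the seven upper bits carry the free 4- and 2-bit variables behind a fixed prefix that depends on which 8-bits are set. The function name and the particular prefixes are assumptions.

```python
def map_bcd_address(hundreds, tens, units):
    """Map three BCD digits to a unique 10-bit binary address
    without performing a true BCD to binary conversion."""
    digits = (units, tens, hundreds)
    if any(not 0 <= d <= 9 for d in digits):
        raise ValueError("each BCD digit must be 0 through 9")
    # B0..B2: U1, T1, H1 are always effectively binary
    low = (units & 1) | ((tens & 1) << 1) | ((hundreds & 1) << 2)
    eights = [d >> 3 for d in digits]        # U8, T8, H8
    pairs = [(d >> 1) & 3 for d in digits]   # (U4,U2), (T4,T2), (H4,H2)
    # only digits whose 8-bit is zero contribute free variables
    free = [p for p, e in zip(pairs, eights) if not e]
    count = sum(eights)
    if count == 0:    # prefix 0, then six variable bits
        upper = (free[0] << 4) | (free[1] << 2) | free[2]
    elif count == 1:  # prefix 1, two selector bits, four variable bits
        upper = 0b1000000 | (eights.index(1) << 4) | (free[0] << 2) | free[1]
    elif count == 2:  # prefix 111, two selector bits, two variable bits
        upper = 0b1110000 | (eights.index(0) << 2) | free[0]
    else:             # all three 8-bits set: one fixed code
        upper = 0b1111100
    return (upper << 3) | low


addresses = {map_bcd_address(h, t, u)
             for h in range(10) for t in range(10) for u in range(10)}
```

If `addresses` contains 1000 distinct values, all below 1024, the mapping is one-to-one and fits the 10 address lines of a 1024-word memory, which is all that the text requires.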

The logic to implement the function of Table 5.1 is shown in Fig. 5.1. A fast ROM with at least 512 words of at least 7 bits each could also have been used to realize these functions. As shown, about 6 packages of TTL SSI and MSI are required.


Figure 5.1. Realization of BCD Modified Binary Address


Appendix A-Silicon Gate MOS Appendix B-Schottky Bipolar Technology Appendix C-Article Reprints

## APPENDIX A-SILICON GATE MOS



Figure A-1. MOS Processes

Some of the advantages of using silicon gate processing are made obvious by Fig. A-1:

1. Gate oxide is protected immediately in the silicon gate process, while in conventional processing the gate oxide remains exposed throughout one entire masking step.
2. The self aligned gate permits a smaller device with less gate to drain capacitance than is possible in conventional technology. Faster, more compact circuits are made possible.
3. The silicon layer can be used for interconnections, permitting reduced chip area per function. In many LSI circuits, interconnection area affects cost even more than active component area.

Some other, less obvious, advantages include improved reliability due to the number of protecting layers above the gate oxide, and lowered threshold voltage due to the use of silicon rather than aluminum as the gate material. This threshold reduction is achieved with very little change in the intermediate field threshold, i.e., the voltages at which parasitic MOS devices are created under silicon lines over thick oxide.

## A. STATIC AND DYNAMIC MOS CIRCUITS

The unique characteristics of the MOS FET permit the construction of a wide variety of logic circuits with the devices. The MOS device may be used as an active amplifying device or as a load resistor. In Fig. A-2, the characteristics of a device are shown, together with a curve corresponding to the device used as a load resistor with a 15 V supply. These curves show drain current ( $\mathrm{I}_{\mathrm{D}}$ ) vs drain-to-source voltage ( $\mathrm{V}_{\mathrm{DS}}$ ) with gate-to-source voltage $\left(\mathrm{V}_{\mathrm{GS}}\right)$ as a parameter. The substrate is assumed at source potential. The load resistor curve is only approximate, as substrate bias effects have been neglected. In the discussions which follow, high and low refer to the relative magnitude of the voltage with respect to the substrate level. Polarities are to be assumed correct for p-channel devices, i.e., all voltages negative with respect to the substrate.


Figure A-2. Typical MOS FET Characteristics (Dotted line shows load line for device used as pull up resistor)

In Fig. A-3, four types of MOS logic inverter stages are shown. In Fig. A-3a, two MOS devices, $\mathrm{Q}_{1}$ and $\mathrm{Q}_{2}$, are wired as a static inverter. When input 1 is sufficiently high, $Q_{2}$ turns on and output 0 is low. If 1 is low, $Q_{2}$ is off and $Q_{1}$ pulls the output high. This circuit requires that $Q_{1}$ have radically different geometries than $Q_{2}$. For equivalent bias, $Q_{2}$ must have much higher conductance than $Q_{1}$ to get reasonable noise margins. This fact is symbolized in Fig. A-3a (and in A-3b) by showing $\mathrm{Q}_{1}$ as a MOS resistor rather than as a device. As a result of the low conductance of $Q_{1}$, current available from $Q_{1}$ for charging load capacitance is quite limited. As a consequence, low to high transitions are rather slow.

Fig. A-3b is a variation on Fig. A-3a. When the clock $\phi$ is active, the circuit is much like that of Fig. A-3a. By making the clock voltage $\phi$ higher than V , a more consistent high output level may be established.
Once the output level is established, the clock $\phi$ may be switched off to save power. This circuit technique may also be used to give improved noise margins.


In Fig. A-3b, the output behaves as an inverter when the clock $\phi$ is active. In Fig. A-3c, $Q_{1}$ has such a high conductance that when the clock is active (high), low outputs may have noise margins so poor that they are unusable. However, after the clock is removed, low outputs quickly approach usable values if the input " 1 " is maintained long enough. During the time that $\phi$ is active, this circuit may consume large amounts of power (if input 1 is high). This circuit may be used to drive relatively large capacitive loads because $Q_{1}$ may have relatively high conductance and 1 may have higher amplitude than V.
In the circuit of Fig. A-3d, the capacitive load is charged when $\phi$ is high, and then is discharged when $\phi$ goes low, if and only if 1 is high. This circuit draws current only to charge and discharge the load capacitance; there is no DC power drain. However, the load capacitance is reflected back to the clock driver.
The circuits of Fig. A-3b, A-3c and A-3d make use of temporary retention of data on the load capacitance, and are said to be dynamic, while the circuit of Fig. A-3a is DC stable and therefore is said to be static.

Larger capacitive loads may be driven by static inverters if a booster stage is used. However, a loss of "high" level is introduced unless bootstrap techniques are utilized. A booster stage is shown in Fig. A-4.


Figure A-4. MOS Inverter with Output Booster
In Fig. A-5, a transmission gate is shown. The load capacitance $\mathrm{C}_{\mathrm{L}}$ may be charged to a voltage equal to that on $\mathrm{C}_{\mathrm{S}}$ when the clock $\phi$ is high. When $\phi$ is returned low, $\mathrm{C}_{\mathrm{L}}$ retains the value to which it was charged.
The circuit of Fig. A-5 may be driven from a number of different source circuits. However, when the source circuit is dynamic, the transmission gate clock, $\phi$ should be kept high during the entire period when $\mathrm{C}_{\mathrm{S}}$ and $\mathrm{C}_{\mathrm{L}}$ are being charged and discharged or improper voltage levels may result. For example, if $\mathrm{C}_{\mathrm{S}}$ is initially charged and $\mathrm{C}_{\mathrm{L}}$ initially discharged, when the transmission gate clock is made active, $\mathrm{C}_{\mathrm{S}}$ and $\mathrm{C}_{\mathrm{L}}$ share their charge, resulting in a voltage level which is a function of the ratio of $\mathrm{C}_{\mathrm{L}}$ and $\mathrm{C}_{\mathrm{S}}$. Unless the circuit is designed to operate with such intermediate voltage levels, incorrect operation will result.
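The charge-sharing level mentioned above follows from conservation of charge. The sketch below assumes an ideal transmission gate (no threshold loss, no leakage); the function name and the example values are illustrative only.

```python
def shared_voltage(c_s, v_s, c_l, v_l):
    """Final voltage on both capacitors after an ideal switch connects them.

    Total charge c_s*v_s + c_l*v_l is conserved and spread over c_s + c_l.
    """
    return (c_s * v_s + c_l * v_l) / (c_s + c_l)

# C_S initially charged to 10 V, C_L initially discharged (values are hypothetical).
for c_s, c_l in ((1.0, 1.0), (4.0, 1.0), (1.0, 4.0)):
    v = shared_voltage(c_s, 10.0, c_l, 0.0)
    print(f"C_S/C_L = {c_s / c_l:.2f}: shared level {v:.1f} V")
```

With equal capacitances the level settles at half the initial voltage; only when $\mathrm{C}_{\mathrm{S}}$ is much larger than $\mathrm{C}_{\mathrm{L}}$ does the result stay close to a full logic level.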


Figure A-5. Transmission Gate
These circuits give some idea of the flexibility of logic design with MOS FET's. In addition to these techniques, MOS devices can easily realize compound gate structures. An example is shown in Fig. A-6. In Fig. A-6, only eight devices are used to realize the function

$$
f=(a b+c d+a x d+c x b+y z)
$$

This type of compound gate structure can be used with any of the forms of Fig. A-3.
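One way to see how so few devices can realize this function is to treat the input devices as switches in a bridge network. The sketch below assumes a topology (a bridge of a, b, c, d with x across the middle, plus a series pair y, z in parallel) that is consistent with the expression and with a count of seven input devices plus one load; the node names and the wiring are assumptions, not a transcription of Fig. A-6. It checks, for all 128 input combinations, that the network conducts exactly when the expression is true.

```python
from itertools import product

# (input name, node, node) for each switch; node labels are hypothetical.
SWITCHES = [
    ("a", "OUT", "N1"), ("b", "N1", "GND"),
    ("c", "OUT", "N2"), ("d", "N2", "GND"),
    ("x", "N1", "N2"),                       # bridging device
    ("y", "OUT", "N3"), ("z", "N3", "GND"),  # series pair in parallel with the bridge
]

def conducts(inputs):
    """Return True if the closed switches connect OUT to GND."""
    reached = {"OUT"}
    changed = True
    while changed:
        changed = False
        for name, n1, n2 in SWITCHES:
            if not inputs[name]:
                continue
            if n1 in reached and n2 not in reached:
                reached.add(n2)
                changed = True
            elif n2 in reached and n1 not in reached:
                reached.add(n1)
                changed = True
    return "GND" in reached

def expression(v):
    """The sum-of-products form given in the text."""
    return bool((v["a"] and v["b"]) or (v["c"] and v["d"])
                or (v["a"] and v["x"] and v["d"])
                or (v["c"] and v["x"] and v["b"])
                or (v["y"] and v["z"]))

NAMES = "abcdxyz"
mismatches = 0
for bits in product((False, True), repeat=len(NAMES)):
    v = dict(zip(NAMES, bits))
    if conducts(v) != expression(v):
        mismatches += 1
print("mismatches:", mismatches)
```

The bridging device x is shared by two product terms, which is why the network needs fewer devices than a gate-by-gate implementation of the same expression.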


Figure A-6. Compound Gate. Example shown realizes

$$
f=(a b+c d+a x d+c x b+y z)
$$

## B. STATIC AND DYNAMIC MOS MEMORY CELLS

Fig. A-7 illustrates two basic types of MOS memory cells. Fig. A-7a shows a static memory cell consisting of two static inverters wired as a flip-flop. A pair of transmission gates selectively connects the cell to a pair of vertical data wires. Such cells may be arranged in a two dimensional array. When one select wire is made high, the corresponding row of cells is effectively connected to the data wires. Writing is accomplished by forcing signals on the data wires which are sufficient to "flip" the flip-flops. To prevent flipping the cell when reading, both data wires are initially charged high prior to activating the select line.


Figure A-7. Memory Cells (a. Static, b. Dynamic)

Operation of the dynamic cell of Fig. A-7b is quite different from that of the static memory cell. Data is stored as charge on the parasitic capacitance $\mathrm{C}_{\mathrm{S}}$ associated with the gate of $Q_{2}$ and the connected junction of $Q_{1}$. Data may be written into this capacitance via the transmission gate formed by transistor $Q_{1}$. The data to be written is placed on the WDATA line and WSEL is activated (made high). To read from the cell, the RDATA line (with its associated capacitance $C_{R}$ ) is initially charged high by activating the signal $\phi$. When the line RSEL is activated, the RDATA line will be discharged if and only if the capacitor $\mathrm{C}_{\mathrm{S}}$ contains a high, and will remain high if and only if $\mathrm{C}_{\mathrm{S}}$ contains a low. We may, therefore, consider that the RDATA line then contains the logical complement of the cell data.
Although the read-out operation from the cell is non-destructive, the leakage associated with the junction of $Q_{1}$ may eventually result in the loss of the charge stored in $C_{S}$. To maintain the data stored in the cell, the data must be periodically regenerated. This regeneration is accomplished by reading the contents of the cell out onto the read data line, inverting and amplifying the signal, applying it to the WDATA line, and rewriting it into the cell by activating the WSEL line. A circuit which performs the inversion and amplification function is called a refresh amplifier.

In use, the dynamic cells are laid out in a two dimensional array. One entire row of cells is refreshed (or accessed) at one time, one refresh amplifier being provided for each column of cells in the array. To refresh the entire memory, each row of cells must be individually refreshed.
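The read, invert and rewrite sequence can be sketched as a small behavioral model. This is a toy illustration only: the class name, the "tick" time unit and the retention figure are hypothetical, and leakage is modeled crudely as a stored high disappearing once it has gone too long without being rewritten.

```python
class DynamicCellArray:
    """Toy behavioral model of an array of dynamic cells (all timing is hypothetical)."""

    def __init__(self, rows, cols, retention_ticks):
        self.retention_ticks = retention_ticks
        self.charge = [[False] * cols for _ in range(rows)]  # True = C_S holds a high
        self.age = [[0] * cols for _ in range(rows)]         # ticks since last write

    def tick(self, n=1):
        """Advance time; a high that is not rewritten in time leaks away."""
        for r, row in enumerate(self.age):
            for c in range(len(row)):
                row[c] += n
                if row[c] > self.retention_ticks:
                    self.charge[r][c] = False

    def write_row(self, r, data):
        """Drive WDATA for every column and pulse WSEL for row r."""
        for c, bit in enumerate(data):
            self.charge[r][c] = bool(bit)
            self.age[r][c] = 0

    def read_row(self, r):
        """RDATA is precharged high and discharged only where C_S holds a high,
        so it carries the complement of the stored data."""
        return [not stored for stored in self.charge[r]]

    def refresh_row(self, r):
        rdata = self.read_row(r)
        self.write_row(r, [not level for level in rdata])  # refresh amplifier inverts

    def refresh_all(self):
        for r in range(len(self.charge)):
            self.refresh_row(r)


pattern = [True, False, True, True]
kept = DynamicCellArray(2, 4, retention_ticks=8)
lost = DynamicCellArray(2, 4, retention_ticks=8)
for memory in (kept, lost):
    memory.write_row(0, pattern)

for t in range(1, 21):
    kept.tick()
    lost.tick()
    if t % 5 == 0:
        kept.refresh_all()   # only this array is refreshed

kept_data = [not level for level in kept.read_row(0)]
lost_data = [not level for level in lost.read_row(0)]
print("refreshed array:  ", kept_data)
print("unrefreshed array:", lost_data)
```

Note that `refresh_all` visits one row at a time, mirroring the row-by-row refresh described above; each call to `refresh_row` stands in for one refresh amplifier per column acting in parallel.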

## APPENDIX B-SCHOTTKY BIPOLAR TECHNOLOGY

Bipolar technologies are familiar to the user of logic in the form of RTL, TTL and ECL families. Most early bipolar technologies utilized doping with gold to reduce minority carrier lifetime. Without gold doping, the storage time of transistors in saturation would slow logic circuit operation.
While gold doping made fast circuits, the side effects limited circuit capabilities. Because of the reduced minority carrier lifetime, such wide base devices as lateral and substrate pnp transistors were not usable in gold-doped technologies. The effects of the doping were diminished at high temperatures, so that speed of performance would have to be derated for high temperature operation.
However, in the late 1960's, it was discovered that an aluminum to n-silicon contact formed a Schottky barrier diode. While early diodes were somewhat erratic in performance, these diodes exhibited a lower forward voltage drop than conventional p-n junction diodes. It was recognized that such a diode could be used in the circuit known as the Baker clamp to prevent transistor saturation. Using this circuit would then eliminate the need for gold doping. Figure B-1 shows the Baker clamp circuit.
In Figure B-1, when $\mathrm{i}_{\mathrm{b}}$ is sufficient to drive transistor Q into saturation, the collector voltage falls to a level which forward biases the Schottky diode D. Further increases in $\mathrm{i}_{\mathrm{b}}$ flow through diode D rather than into the base of transistor $Q$. Because of the diode, transistor $Q$ is not driven hard into saturation. Thus, upon turn off of the current $i_{b}$, transistor $Q$ exhibits negligible storage time.

Figure B-1. Schottky Clamped Transistor
The first to develop a Schottky process using this diode was Intel Corporation, with the first such product being the 3101 64-bit memory. The Schottky diodes are formed by contact of the aluminum metallization to the collector region. To make ohmic contact to the collector, the conventional technique of first diffusing an $n+$ region (emitter diffusion) into the $n-$ collector region is used. An aluminum contact to the $n+$ region is ohmic, while contacts to the $n-$ region become Schottky diodes. This process requires no additional masking steps to produce the Schottky diodes. All of Intel's bipolar devices utilize the Schottky technology.
Because this technology does not destroy the gain of substrate and lateral pnp transistors, these components become available to the circuit designer. Most of Intel's bipolar products use substrate pnp transistors on all inputs, as shown in Fig. B-2. As a result, the inputs offer far less DC loading than standard TTL logic. This feature is most helpful when interfacing to MOS logic.


Figure B-2. Intel Substrate - pnp Bipolar Input Circuit

## APPENDIX C-ARTICLE REPRINTS

Schottky Diodes Make IC Scene ..... 6-8
A 4096-Bit Dynamic MOS RAM ..... 6-15
N-Channel Goes to Work with TTL ..... 6-17
A Fully Decoded 2048-Bit Electrically Programmable FAMOS Read-Only Memory ..... 6-21
Standard Parts and Custom Design Merge in Four-Chip Processor Kit ..... 6-27
The MCS-4 - An LSI Micro Computer System ..... 6-32
Considerations for the Use of Micro Computers in Small System Design ..... 6-38
Micro Computer Applications of Electrically Alterable ROMs ..... 6-42
Impact of LSI on Micro Computer and Calculator Chips ..... 6-46
The New LSI Components ..... 6-50
A Multimillion Byte Core Replacement Utilizing MOS Dynamic Memory ..... 6-53
Product Reliability: 1601 ..... 6-58
Reliability Report - 1103 Silicon Gate MOS LSI RAM ..... 6-60

# Schottky diodes make IC scene 

With reproducibility problems licked, these devices make attractive elements; they permit unusual component combinations, save chip real estate, reduce power dissipation, and enhance production yields

By R.N. Noyce et al., Intel Corp., Santa Clara, California

Schottky diodes, because they don't store charge, open new worlds to digital integrated circuits. Diode-transistor logic with Schottky diodes can be made as fast as transistor-transistor logic and emitter-coupled logic. What's more, these devices reduce an IC's power dissipation and permit unusual combinations of components on a chip. They also make large-scale integration easier to achieve because more functions can be performed in less area. Not the least of their virtues is the fact that they can increase production yields.
For all these advantages, Schottky-diode IC's have not been commercially available. Intel Corp., however, has concentrated on their development and is now making Schottky diodes that are stable and reproducible. The company has committed Schottky IC's to commercial production, and has found that they live up to expectations. Their first Schottky IC will be introduced in the next few weeks.
Structurally, the Schottky diode is little more than a metal in contact with a semiconductor. As long as aluminum is used for the metal, formation of Schottky diodes in a monolithic IC is compatible with standard processing. Some provision is needed to prevent high-field effects at the edge of the metal contact, but this too is compatible with standard processing. In fact, once reproducibility problems have been solved, Schottky diodes actually simplify processing of high-speed digital IC's because gold doping is eliminated.
The Schottky diode has been around in one form or another for a long time-the crystal detector in early radios, for instance. The modern device, which gets its name from the German scientist who developed the first valid theory of metal-semiconductor rectification, can be used to enhance the performance of existing IC designs as well as to develop entirely new configurations with exceptional features.

In the forward-biased p-n junction, current is carried by holes flowing from the p-type material into the n-type material, and electrons flowing from the n to p material. Either case results in an excess of minority carriers near the junction. If the voltage is reversed these carriers will flow back across the junction, creating a high current until the supply is exhausted. In other words, the p-n junction can't be turned off immediately.
In a Schottky diode made of aluminum on n-type silicon, essentially all of the forward current is carried by electrons flowing from the semiconductor into the metal. They quickly come into equilibrium with the other electrons in the metal, so there is effectively no stored charge to prevent rapid switching. Another major point of difference between the Schottky and the p-n junction diode is that the former has lower forward voltage for a given current.
In a practical circuit the Schottky diode is placed in parallel with the base-collector junction of an npn transistor; the metal electrode is connected to the base and to the n region of the collector, where it forms a rectifying contact.

Since the Schottky diode has a lower forward voltage compared with the collector-base junction, the diode clamps the transistor and diverts most of the excess base current through the Schottky diode, preventing the transistor from saturating. There's no charge storage, either in the transistor or diode.

Clamping techniques have been used in the past to prevent charge storage. Most are variations of the Baker clamp proposed in 1956. ${ }^{1}$ In this scheme, a germanium diode shunts the collector-base junction of a silicon transistor to prevent it from being forward-biased; some charge is still stored in the germanium diode. The Schottky IC employs clamping but charge storage is eliminated.
In the past, other efforts to improve switching speed by reducing or avoiding stored charge have taken two approaches: process innovation and novel circuit design. ${ }^{2,3,4}$

Double duty. In a Schottky transistor a Schottky diode is in parallel with the collector-base junction. The base metal is also the anode of the diode; the collector diffusion is also the diode cathode. Edge effects can be avoided with the extended-metal or guard-ring structure. The curves compare storage time of the gold-doped and Schottky transistors. The former uses up 7 nsec of recombination time before it can change state; the Schottky transistor responds almost instantaneously.

Attempts to minimize storage time through novel circuit designs have had varying success. The most familiar is current-mode switching, which avoids saturation (and hence charge storage) by controlling the collector current so that the collector-base junction never goes into forward conduction. Current-mode logic is fast (propagation time through a gate can be as low as 1 nanosecond), but when such IC's are assembled in a system, they will oscillate unless precautions are taken to control their input impedance characteristic.

## Off the gold standard

The standard processing approach to charge-storage reduction in conventional IC's is gold doping. Since gold acts as a recombination center, it reduces the lifetime of minority carriers; that is, it shortens the time for recombination of holes and electrons. Intentional reduction of minority carrier lifetime is somewhat ironic. In the early days of transistors, much effort was expended in increasing the minority carrier lifetime to increase current gain and decrease junction leakage.

Unfortunately, the gold diffusion has some highly undesirable side effects. It tends to reduce gain ( $\mathrm{h}_{\mathrm{FE}}$ ). If this value gets too low, the IC's won't function because the transistors either don't saturate or have excessive turn-on delay. The manufacturer can use a narrower base to get higher $h_{\mathrm{FE}}$, but this can lead to excessive junction leakage.

The Schottky process sidesteps this problem. Without the need for gold doping, low storage time and high $\mathrm{h}_{\mathrm{FE}}$ can be achieved simultaneously.

Another difficulty with gold doping is the lack of selectivity: if one transistor must be high speed, all the others on the chip have to be high speed too. But with the Schottky process, fast and slow devices can be mixed on the same chip. This freedom of choice in component characteristics is exploited in some of Intel's new products.

Edge effect (panels: regular diode, extended-metal, guard-ring). The high field concentration at the edge of the regular Schottky structure can cause spurious currents. These can be avoided by means of the extended-metal structure or the guard ring.
But before Schottky diodes could be successfully integrated, manufacturers had to resolve some dilemmas: diode characteristics were often grossly different from theoretical predictions and devices were not reproducible. The cause, it has since been found, is the high electric-field concentration at the periphery of the metal electrode, as shown above. This field, through the mechanism of tunneling, can lead to spurious currents that dominate the ideal diode characteristic. Because the shape of the edge varies from one device to another, the magnitude of these excess currents can vary greatly.

Two solutions to this problem were recently proposed. ${ }^{5,6,7}$ In one, the metal electrode extends over the silicon-dioxide passivating layer. With this dielectric layer between the metal and the semiconductor, the electric field at the periphery of the electrode is greatly reduced, and the diode has ideal characteristics.

In the other solution, the edge of the metal extends over a diffused $\mathrm{p}^{+}$guard ring. This also reduces the field, and moves it away from the edge of the electrode so that it can't affect the diode. Again, the characteristics are ideal.
Electrically, the aluminum-silicon Schottky diode differs somewhat from the diffused p-n junction diode in its d-c forward current-voltage characteristic, shown at right. The forward voltage of the Schottky diode at any given current is 200 to 300 millivolts less than that of the junction diode, making it an ideal low-voltage clamp.
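To put the forward-voltage difference in perspective, the sketch below applies the textbook ideal-diode relation $I = I_S(e^{V/nV_T}-1)$ to the two points quoted in the forward-characteristic caption (0.45 volt and 0.74 volt at 1 ma). The ideality factor of 1 and the temperature of 300 K are assumptions made only for this illustration; real devices depart from the ideal law.

```python
import math

K_OVER_Q = 8.617333e-5   # Boltzmann constant over electron charge, volts per kelvin

def saturation_current(i_forward, v_forward, temp_k=300.0, ideality=1.0):
    """Solve the ideal-diode law I = Is * (exp(V / (n * Vt)) - 1) for Is."""
    v_t = K_OVER_Q * temp_k
    return i_forward / math.expm1(v_forward / (ideality * v_t))

# Forward drops at 1 ma as quoted in the article's forward-characteristic caption.
i_s_schottky = saturation_current(1e-3, 0.45)
i_s_junction = saturation_current(1e-3, 0.74)
ratio = i_s_schottky / i_s_junction

print(f"implied saturation-current ratio (Schottky / p-n): {ratio:.2e}")
```

Under these assumptions, a difference of roughly 0.3 volt in forward drop corresponds to a saturation current several orders of magnitude larger for the Schottky diode, which is why it conducts first when placed in parallel with a junction.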
Another major difference is the recovery characteristics, opposite page at bottom. The Schottky diode storage time is effectively zero, in contrast to typical values of 6 nsec for the gold-doped junction diode and 30 nsec for the junction diode without gold doping. In all three cases, the voltage across the diode decays with the same R-C time constant once the stored charge recombines.
The Schottky transistor is simply an npn IC transistor, produced by conventional photolithography, diffusion, and epitaxial growth techniques, with an Al-Si Schottky diode in parallel with the collector-base junction. The base contact of the transistor serves as the metal contact of the diode and the collector region of the transistor serves as the n region of the diode. It's convenient to regard the npn-Schottky diode combination as a single device and to represent it by a single symbol:


To form the Schottky transistor, the base-contact opening is extended beyond the base diffusion and over the collector region. When the aluminum metalization is deposited, it simultaneously functions as the contact to the base region and the anode of the Schottky diode. To prevent high field concentration, the metal can either be extended over the passivating oxide or terminated over a $\mathrm{p}^{+}$ guard ring, which can be diffused into the chip at the same time as the base, as shown on page 75 .
One big difference between the Schottky transistor and the conventional gold-doped npn device is in offset voltage, which is lower for the latter: typically 120 millivolts vs. 250 mv. However, the saturation voltage need not be excessive because of this large offset; by controlling both device geometry and processing, $\mathrm{V}_{\mathrm{CE}(\text { sat })}$ can be kept to 0.5 volt at 20 ma collector current and 1 ma base current. And because the Schottky diode has a lower temperature coefficient than the emitter-base diode, $\mathrm{V}_{\mathrm{CE}(\text { sat })}$ decreases with increasing temperature.
This behavior of the Schottky diode amounts to partial temperature compensation of the emitter-base diode and hence of $\mathrm{V}_{\mathrm{CE}(\mathrm{sat})}$. It helps to maintain a high worst-case logic 0 noise margin at high temperature when the Schottky transistor drives fan-out circuits that operate at a $2 \mathrm{~V}_{\text {BE }}$ logic threshold, as standard DTL and TTL IC's do.
The storage time of a Schottky transistor is effectively zero, as shown on page 78, in contrast to typically 7 nsec for a gold-doped device and 34 nsec for a non-gold-doped unit. Moreover, storage time hardly varies with temperature. At $125^{\circ} \mathrm{C}$, it's still less than a nanosecond, whereas the gold-doped storage time doubles to 15 nsec at that temperature.

Forward characteristic. At 1 ma forward current, voltage drop is 0.45 volts for the Schottky diode, 0.74 volts for the p-n junction diode.

No storage. Transient characteristic shows that the Schottky diode, unlike p-n junction diodes, has essentially zero storage time.
Aside from much faster switching speed, Schottky diodes offer the IC designer a greater range of devices to choose from. Because gold doping can't be done selectively, the designer has at his disposal only one active component: a fast-switching npn transistor whose emitter-base and collector-base junctions can be used as diodes. But since the Schottky process doesn't require gold doping, the designer has seven active components to choose from. And he can mix them at will: fast- and slow-switching devices can be placed on the same chip. In both the Schottky and the gold-doped transistor, various levels of resistivity are available in the bulk material, as well as in the isolation, emitter and base diffusions for use as resistors.

## Advantageous options

The Schottky diode across the base-collector junction of the npn transistor is an almost ideal active switch with less than 1 nsec storage time and high amplification. Without the Schottky diode, the device becomes a charge-storage npn transistor characterized by low saturation voltage, high inverse gain ( $\mathrm{h}_{\mathrm{FE}}$ ), and lengthy recovery time.
Substrate pnp transistors can be diffused into the chip too; these can have $\mathrm{h}_{\mathrm{FE}}$ of 10 or more and are ideal for input buffering. Lateral pnp transistors can be formed as well, with and without Schottky-clamped collectors, to provide fast or slow recovery time. These devices are useful as current sources and for voltage translation.
Then there are SCR's. Like transistors, these can be diffused into the chip in both charge-storage and Schottky-clamped versions, each of which can be optimized for different purposes. They make efficient bistable elements and give a high functional density to integrated shift registers and counters.
And, of course, there are diodes (emitter-base, charge-storage, and Schottky), each with different characteristics.
With just the Schottky diode and transistor, it's easy to upgrade old IC designs. Take DTL, for instance. If, without changing any of the resistor values, the basic DTL gate is modified with Schottky components as shown at top of page 80 , the performance of the gate is improved as follows:

- Less sensitivity to power supply variations, since the circuit will continue to function at a lower minimum supply voltage. This is because, when all inputs are in the high state, the voltage at the base of $\mathrm{Q}_{1}$ is typically 0.3 volts lower than it would be in the gold-doped counterpart.
- Less sensitivity to temperature variations because of the compensating effect of the Schottky diode temperature behavior.
- Faster switching; the gate turns on faster, and storage time is less than 1 nsec, remaining so throughout the operating temperature range.

In addition, modified DTL circuits can be fabricated at better yield (and lower cost to the user) because $h_{\mathrm{FE}}$ is higher and leakage is lower than in the gold-doped version.
The Schottky version of the TTL gate also has some important advantages over its gold-doped counterpart:
- Lower input inverse leakage current, typically less than 0.1 microampere. This is made up of the reverse-bias leakage of the emitter-base junction and the input clamp diode; there is no contribution from inverse $\mathrm{h}_{\mathrm{FE}}$ since the collector-base junction of the input transistor never becomes forward biased.

- Less line reflection, since the input clamp diodes (now Schottky diodes) have a lower forward-bias voltage.
- Lower power dissipation because no charge is stored (and wasted) in the transistors.
- Higher output high level, by approximately 0.3 volt.

Although the Schottky process significantly improves many of the existing circuits, its real potential lies in entirely new designs. For example, by designing from scratch with the Schottky process in mind, a DTL gate can be made just as fast as TTL. An example of such a circuit is shown on page 80 , bottom left.

When the circuit is loaded with two 200-ohm resistors, as shown, the switching speed is about 5 nsec, comparable to that of the fastest TTL circuits commercially available. The advantage of DTL over TTL, given equal speeds, is that the outputs of the DTL gates can be OR-tied. Moreover, DTL doesn't create heavy current transients in the power supply during switching.

Storage time. The Schottky transistor, because it doesn't store charge, responds much more rapidly to an input signal (top traces, ambient temperature $25^{\circ} \mathrm{C}$ ). This difference becomes even more pronounced at high temperatures (bottom curves, plotted against ambient temperature in ${ }^{\circ} \mathrm{C}$).
The input current to this Schottky DTL gate is low, typically 250 microamperes. This presents a very small load to the driving gate. As a result, each gate can drive a large number of similar gates; fan-out can be 25 or more. If the two 200-ohm resistors are used as a terminating resistance for a 100-ohm line, there will still be enough current left to drive 10 gates.
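The loading figures can be checked with a few lines of arithmetic. The sketch assumes that the two 200-ohm resistors appear in parallel as far as the line is concerned (for example, one returned to each supply rail); that arrangement is an assumption, since the circuit drawing is not reproduced here.

```python
def parallel(*resistances):
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances)

INPUT_CURRENT_A = 250e-6   # typical input current per gate, from the text

def load_current(fan_out, input_current=INPUT_CURRENT_A):
    """Total input current drawn by `fan_out` similar gates."""
    return fan_out * input_current

termination_ohms = parallel(200.0, 200.0)
print(f"termination seen by the line: {termination_ohms:.0f} ohms")
for n in (10, 25):
    print(f"fan-out {n}: {load_current(n) * 1e3:.2f} ma of input current")
```

Two 200-ohm resistors in parallel match a 100-ohm line, and even a fan-out of 25 draws only a few milliamperes of input current in total.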
Another novel circuit is the binary divider on page 80, bottom right. It combines the best of the slow and fast worlds: the Schottky transistors that comprise the flip-flop switch at high speed, and the slow charge-storage transistors serve as temporary-data-storage elements.

## Bridge work

The special advantages of these circuits, and of the Schottky process that makes them possible, are particularly applicable to complex circuits. One such IC built at Intel is a 64-bit high-speed scratchpad memory. It's organized as 16 words by 4 bits; each of the 16 words is addressable through its own binary decoder. It has four data inputs, four address inputs, and four outputs. If it's required, the outputs can be OR-tied with the outputs of other memory integrated circuit chips.

Components compatible with the Schottky process

| TYPE | PRINCIPAL CHARACTERISTICS |
| :---: | :---: |
| SCHOTTKY NPN TRANSISTOR | STORAGE TIME $<1$ NSEC; $V_{CE(on)}=0.2$ TO 0.4 VOLT; $h_{FE}$: 60 |
| CHARGE-STORAGE TRANSISTOR | STORAGE TIME $>20$ NSEC; $V_{CE(sat)}<0.1$ VOLT; $h_{FE}$ INVERSE $>2$ |
| SUBSTRATE PNP | $h_{FE}>10$ |
| LATERAL PNP | $h_{FE}>2$ |
| LATERAL PNP WITH SCHOTTKY COLLECTOR | STORAGE TIME $<1$ NSEC |
| CHARGE-STORAGE SCR | STORAGE TIME $>20$ NSEC; $V_{ON}<0.1$ VOLT |
| SCHOTTKY-CLAMPED SCR | STORAGE TIME $<1$ NSEC; $V_{ON}=0.2$ TO 0.4 VOLT |
| EMITTER-BASE DIODE | $V_{F}=0.6$ TO 0.8 VOLT |
| CHARGE-STORAGE DIODE | STORAGE TIME $>20$ NSEC |
| SCHOTTKY DIODE | $V_{F}=0.3$ TO 0.5 VOLT; STORAGE TIME $<1$ NSEC |
| DIFFUSED RESISTOR | $\rho_{s}=100$ TO 200 $\Omega/\square$ |
| COLLECTOR-PINCH RESISTOR | $\rho_{s}=400$ TO 1000 $\Omega/\square$ |
| BASE-PINCH RESISTOR | $\rho_{s}=2$ TO 10 k$\Omega/\square$ |

Updated. Modifying the standard DTL gate with Schottky diodes and transistors (top) affords significant performance improvement.

The new DTL. Without gold doping, many components become available to the IC designer. In the DTL gate at bottom left, input loading is reduced an order of magnitude by the substrate pnp's that replace the input diodes normally used.

Simplified. With the Schottky process, a binary counter stage (bottom right) is simple and occupies a small chip area. This is because charge-storage transistors ($Q_{1}$ and $\mathrm{Q}_{2}$), which are incompatible with the gold-doped process, can be used to store information.
The input load is equal to one unit TTL load, and the output can sink 20 milliamperes. The access delay, from address input to data output, is less than 50 nsec under a 20 ma resistive load. Power dissipation, including address buffering, decoding, sensing, and control logic, is typically 6 milliwatts per bit.
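The organization and power figure can be summarized in a short behavioral sketch. The class below is a hypothetical model of a 16-word by 4-bit store only; it does not attempt to represent write-enable, chip-select or output-stage behavior, none of which is described in the article.

```python
class Scratchpad16x4:
    """Toy model of a 16-word by 4-bit scratchpad (names are illustrative)."""

    WORDS = 16   # selected by four address inputs
    BITS = 4     # four data inputs and four outputs

    def __init__(self):
        self.words = [0] * self.WORDS

    def write(self, address, data):
        self._check(address, data)
        self.words[address] = data

    def read(self, address):
        self._check(address, 0)
        return self.words[address]

    @classmethod
    def _check(cls, address, data):
        if not 0 <= address < cls.WORDS:
            raise ValueError("address must fit in four bits")
        if not 0 <= data < (1 << cls.BITS):
            raise ValueError("data must fit in four bits")


TYPICAL_MW_PER_BIT = 6   # typical dissipation per bit, from the text
total_mw = TYPICAL_MW_PER_BIT * Scratchpad16x4.WORDS * Scratchpad16x4.BITS

memory = Scratchpad16x4()
memory.write(9, 0b1010)
print("word 9:", format(memory.read(9), "04b"))
print("typical total dissipation:", total_mw, "mw")
```

At 6 milliwatts per bit, the 64 bits together account for a typical dissipation of 384 milliwatts for the whole chip.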

These specifications represent major improvements over gold-doped IC performance. Because storage time has been eliminated by the Schottky process, the input-to-output propagation delay has been reduced by as much as $40 \%$ for certain critical delay paths. And the variation in propagation delay due to temperature change is considerably less than in a gold-doped circuit. The higher $\mathrm{h}_{\mathrm{FE}}$ values that result from the Schottky process also help keep propagation delay constant with temperature. This is particularly important for circuits that operate at low temperatures where large increase in delay, due to low $h_{\mathrm{FE}}$, is a chronic problem in gold-doped IC's.
Another advantage of this higher-valued $\mathrm{h}_{\mathrm{FE}}$ distribution is that transistors can be designed for higher forced-beta conditions, and power dissipation is therefore lower. For the bipolar logic-buffer chip, for example, power dissipation is reduced by about $30 \%$.
Most significant, perhaps, the Schottky process has resulted in higher functional density and smaller chips. A comparison of the Intel 64-bit memory with an equivalent gold-doped circuit is revealing: Although the gold-doped IC is made with the same mask tolerances, the Schottky version uses a $30 \%$ smaller chip.

With all that they have going for them (high speed, greater functional density, and flexibility of design), Schottky-diode IC's certainly appear to have a bright future.

## References

1. R. Baker, "Maximum Efficiency Switching Circuit", M.I.T., Lincoln Lab., Lexington, Mass., Report TR-110, 1956.
2. K. Tade et al., "Reduction of the Storage Time of a Transistor Using a Schottky Barrier Diode", PROC. IEEE, 55, pp. 2064-2065, 1967.
3. E.R. Chenette and R.A. Pedersen, "Integrated Schottky Diode Clamp for Transistor Storage Time Control", PROC. IEEE, 56, pp. 232-237, 1968.
4. A. Tarui, Y. Hayashi, H. Teshima and T. Sekigawa, "Transistor Schottky-Barrier-Diode Integrated Logic Circuit", Journal of Solid-State Circuits SC-4, pp. 3-12, 1969.
5. A.Y.C. Yu and E.H. Snow, "Surface Effects on Metal Silicon Contacts", Journal of Applied Physics 39, 3008, 1968.
6. M.P. Lepselter and S.M. Sze, "Silicon Schottky Barrier Diode with Near Ideal Characteristics", Bell System Technical Journal, 47, 195, 1968.
7. R.A. Zettler and A.M. Cowley, "The p-n Junction-Schottky Barrier Hybrid Diode", IEEE Trans. on Electron Devices, ED-16, 58, 1969.

## A 4096-BIT DYNAMIC MOS RAM

## SESSION I: Memory I

WAM 1.1: A 4096-Bit Dynamic MOS RAM

By J.A. Karp*, W.M. Regitz and S. Chou, Intel Corp., Santa Clara, California

*Now with Intersil Corp.

AN MOS SEMICONDUCTOR MEMORY ARRAY, designed for large main memory, will be described. High density has been achieved with a unique cell configuration, in combination with N-channel silicon-gate technology, while still maintaining manufacturing tolerances consistent with high volume and low cost production.

A block diagram of the memory array is shown in Figure 1. The 12-bit address buffer register accepts TTL inputs; 2.4-V minimum. The address is then decoded into 1 of 64 rows and 1 of 64 columns, thus selecting one of the 4096 cells contained in a $64 \times 64$ matrix. The decoders use 6-input dynamic NOR gates which accomplish high speed address decoding at minimum power dissipation. The address buffers, which also utilize dynamic circuitry, not only convert the TTL levels to MOS levels, but also serve as a register, thus requiring stable address for only 100 ns; Figure 2.
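The two-level decode can be sketched as a split of the 12-bit address into two 6-bit fields. Which address bits feed the row decoder and which feed the column decoder is not stated here; the assignment below (upper six bits to the row, lower six to the column) is an assumption made only for illustration.

```python
ROWS = 64
COLS = 64

def decode(address):
    """Split a 12-bit address into (row, column) for a 64 x 64 cell matrix."""
    if not 0 <= address < ROWS * COLS:
        raise ValueError("address must fit in 12 bits")
    row = address >> 6      # upper six bits, assumed to drive the row decoder
    col = address & 0x3F    # lower six bits, assumed to drive the column decoder
    return row, col

for address in (0, 65, 4095):
    print(address, "->", decode(address))
```

Each of the 4096 addresses maps to a distinct row and column pair, so a pair of 1-of-64 decoders is enough to select a single cell.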

The 4096-bit memory uses a single high-level clock (CE) from which all internal timing signals are triggered. The precharge clock input, normally used in dynamic RAMs to precondition all internal nodes, has been eliminated. The function of this clock is performed by the $\overline{\mathrm{CE}}$ driver; thus all dynamic nodes are preconditioned automatically between active memory cycles. The additional internal timing pulses in Figure 2 ($R$, $\bar{R}$ and Row Voltage) are representative of the types of signals that have been generated on the chip.

Chairman: D. A. Hodges, University of California, Berkeley, Cal.

Reprinted from the DIGEST OF TECHNICAL PAPERS, 1972 IEEE International Solid State Circuits Conference, February, 1972 with the permission of the copyright owner, IEEE, and the publisher, Lewis Winner.

- 4096 bits: 4096 words by 1 bit
- TTL compatible inputs; except clock
- Single clock input; high level
- Cell size $<2$ mil$^{2}$; 3 transistors
- Access time $<300$ ns
- Cycle time $<500$ ns
- Active power $<100\ \mu$W/bit
- Power supplies: $+12$ V, $+5$ V, $-5$ V
- Standby power $<1\ \mu$W/bit
- Standard 22-pin package

Memory array characteristics.

The cell has three minimum-geometry transistors and occupies an area of 1.8 sq. mils. This small area per bit is the result of the cell configuration (Figure 3), which needs only 2-1/2 interconnect lines per cell, as follows:

A) A single selection line ${ }^{1}$ activates a row of cells for both reading and writing. The Row Voltage is a tri-state signal gated onto the Row Select line by the Row Decoder. Reading of the cell is accomplished during the intermediate level of this signal, and writing or refreshing the cell is accomplished during the high level of this signal. The high level is terminated as soon as proper information is gated to the cell in order to conserve power.

B) A single column line ${ }^{2}$ allows transfer of information into or out of the cell. This line is connected to the column amplifier, which is used to sense the information in the cell or write new information into the cell.

C) A ground line is shared between two adjacent columns of cells, resulting in 1/2 connection per cell.

FIGURE 1-Block diagram of 4096-bit dynamic RAM.

Since the information is stored dynamically in the cell, it must be refreshed periodically. To refresh the 4096-bit array, it is only necessary to perform memory cycles at each of the 64 row addresses.
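A minimal sketch of the refresh procedure, assuming (as the row-select description suggests) that one memory cycle on any address in a row refreshes every cell on that row. The function name `refresh_all_rows` is illustrative and not taken from the paper.

```python
ROWS, COLS = 64, 64


def refresh_all_rows():
    """Run one memory cycle per row address and collect the cells refreshed."""
    refreshed = set()
    cycles = 0
    for row in range(ROWS):
        cycles += 1
        # Assumption: activating the row select line refreshes the whole row.
        refreshed.update((row, col) for col in range(COLS))
    return cycles, refreshed


cycles, refreshed = refresh_all_rows()
```

Under that assumption, 64 cycles cover all 4096 cells, so the refresh burden grows with the number of rows rather than with the number of bits.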

The memory can perform read, write or read-modify-write cycles controlled by a single TTL level input signal.

[^2]

FIGURE 2-Timing diagram showing inputs (ADD, CE) and internally generated signals ($\overline{\mathrm{CE}}, \mathrm{R}, \overline{\mathrm{R}}$, ROW VOLTAGE). Dashed line shows worst case as predicted by computer analysis.


# N-channel goes to work with TTL

> Two dynamic shift registers and a static RAM operate from a single 5-V supply, have 1-V thresholds

by R. Abbott, H. Gopen, T. Rowe and D. Bryson, Intel Corp., Santa Clara, California

Nearly ten years of increasing success with p-channel devices have brought MOS technology to the point where it can tackle the much harder job of n-channel processing. With the addition of silicon gates, such devices can outdo their p-channel forerunners in speed, packing density, and low threshold voltage. They also offer direct compatibility with transistor-transistor logic.

For instance, a family of dynamic recirculating shift registers now being built with a silicon-gate n-channel process has speeds typically of 2 megahertz and threshold voltages of less than a volt, and is TTL-compatible; it enables an entire memory subsystem to operate from a single +5-V power supply. The same process is being used to build a static 1,024-bit random-access memory, which also operates off a single +5-V supply.

A different n-channel process is being developed that will yield a 4,096-bit dynamic RAM. Like most MOS devices, this will require both positive and negative supply voltages.

An alternate approach to designing n-channel devices is to forego compatibility with TTL and instead go all out for speed. This requires the device to be operated like its p-channel equivalent, at drive voltages of 10 to 17 V. But for this initial family of circuits, TTL compatibility was preferred, in order to promote its use in existing systems.

## How to make n-channel devices

The processing steps used in building the shift registers and the static RAM are much the same as those for p-channel silicon-gate structures. The difference is that with n-channel devices the starting wafer is boron-doped, p-type silicon. The device threshold is then maintained at less than 1 V by controlling the boron impurity concentration and the thickness of the gate dielectric.

Fabrication starts with the p-type wafers being thermally oxidized to a thickness of about 1 micrometer, and parts of the oxide being removed to define the diffusion and gate regions in the first of five photomasking steps. Then a gate dielectric and a polycrystalline silicon are deposited, and areas defining the gate regions of the transistors and the deposited silicon undercrossing are etched out, again by photomasking.

Next comes a diffusion operation, in which n-type impurities form the source and drain regions and the diffused undercrossing (see Figs. 1a and b). Glass formed by the oxidation of silane is then deposited on the wafers at low temperature, and contact holes are etched through it to the polycrystalline silicon and the diffused regions (see Fig. 1c).

1. Making n-channels. Fabricating n-channels requires the deposition of a gate dielectric and polycrystalline silicon (a), after which an etching step defines the gate regions and the silicon undercrossing (b). Next, contact holes are etched through the glass to the polysilicon and diffused regions (c). The finished device has the usual aluminum interconnection and glass passivation (d).

Aluminum is evaporated onto this surface, and a fourth photomasking step forms interconnection patterns. Finally, the wafers are heated (alloyed) to insure ohmic contact between the aluminum and silicon, and a glass passivation layer is vapor-deposited over the surface. The fifth photomasking step removes the glass from the bonding pad area and the scribe line.

## SPECIAL REPORT

A cross section of the completed device is shown in Fig. 1d. For some more complex n-channel arrays, a buried contact may be desirable to save silicon space. The price is an additional photomasking step, used to etch contact windows before the polycrystalline silicon is deposited, so that the polysilicon may make direct contact with the diffused regions.

## Low voltage payoff: greater density

There's more to this low-voltage n-channel silicon-gate process than just the convenience of its compatibility with TTL. The size of a given chip layout is a function of the voltages applied to the various junctions, and since the low-voltage device requires only a +5-volt supply, and since the internal nodes are at even lower voltages, diffusions may be placed close together (0.4 mils) and channels may be short (0.25 mils). Indeed, tighter and smaller layouts are possible without the expense of tight alignment tolerances.

The resulting high packing density permits larger arrays at a lower cost per bit. Table 1 lists the area savings it makes possible, and Fig. 2 shows the photomicrographs of the cells for the devices listed in the table. For a RAM, the n-channel array needs a memory cell less than half the size of a comparable p-channel device cell, enabling a chip that could handle only 256 bits as a p-channel RAM to accommodate 1,024 bits as an n-channel RAM. A similar density advantage applies to shift registers.

2. Making room. Photomicrographs of RAMs and shift registers made with p- and n-channel technology show higher device density achieved by the n-channel devices. Compared here are static RAMs and a dynamic shift register, all units that are in production.

Table 1: Comparison of p- and n-channel static RAMs and shift registers

| Product | Parameter | p-channel process | n-channel process |
| :---: | :---: | :---: | :---: |
| Static RAM | Device \# | 1101A | 2102 |
|  | Size | 256 bits | 1,024 bits |
|  | Organization | $256 \times 1$ | $1,024 \times 1$ |
|  | Cell size | $17.2 \mathrm{~mil}^{2}$ | $7.9 \mathrm{~mil}^{2}$ |
| Dynamic shift register | Device \# | 1404A | 2401 |
|  | Size | 1,024 bits | 2,048 bits |
|  | Organization | $1,024 \times 1$ | $1,024 \times 2$ |
|  | Cell size | $10.0 \mathrm{~mil}^{2}$ | $5.6 \mathrm{~mil}^{2}$ |


Table 2: Comparison of p- and n-channel products

| Parameter | p-channel dynamic shift register | n-channel recirculating shift register |
| :---: | :---: | :---: |
| Size | 1,024 bits | 2,048 bits |
| Organization | $1,024 \times 1$ | $1,024 \times 2$ |
| Chip size | $118 \times 136 \mathrm{~mils}^{2}$ | $125 \times 151 \mathrm{~mils}^{2}$ |
| Cell size | $10.0 \mathrm{~mils}^{2}$ | $5.6 \mathrm{~mils}^{2}$ |
| Power supply or supplies | $+5/-5/-12 \mathrm{~V}$ | $+5 \mathrm{~V}$/ground |
| Data levels | $V_{CC}-2$ / $V_{CC}-4.2$ | 2.2 V/0.65 V (TTL) |
| Clock levels | $+5/-12 \mathrm{~V}$ | 2.2 V/0.65 V (TTL) |
| Number of clocks | 2 | 1 |
| Clock capacitance | 140 pF | 7 pF |
| Maximum power dissipation at $25^{\circ} \mathrm{C}$, maximum frequency | 500 mW (not including clock generator) | 350 mW (including clock generator) |
| Maximum frequency (over temperature range) | 5 MHz | 1 MHz |
| Minimum frequency | 10 kHz @ $T_{A}=70^{\circ} \mathrm{C}$ | 25 kHz @ $T_{A}=70^{\circ} \mathrm{C}$ |
| Output requirement | External $R_{L}$ needed | No external $R_{L}$ needed |
| Other features | None | Recirculating and chip select |

Another advantage of the low-voltage technology is the reduction of parasitic device interactions, which results from interconnection levels having threshold voltages higher than the voltages used to operate the circuit. This eliminates both large leakage paths and high capacitance caused by field inversion, problems that needed solving before n-channel devices could be implemented.

The reliability of n-channel silicon-gate technology is dependent on the stability of the threshold voltages of the MOS transistors and the two levels of interconnects used in the array. In addition, the stability of dynamic storage circuits will be determined by the change in junction leakage, since it is this leakage that determines the retention time of the memory cell.

The basis for the inherent high reliability of silicon-gate n-channel devices is the way the layers are arranged. In the cross section of an n-channel MOS transistor and the interconnects used in the silicon-gate process shown in Fig. 3, three parts of the device are identified: the thin field segment (a), the intermediate field with the silicon line interconnects (b), and the thick field segment with metal line interconnects (c).

3. Reliable. With the low-threshold silicon-gate process, reliability is maintained in all parts of the n-channel device. In the region labeled (a), the glass passivation insures against contamination, just as in p-channel fabrication. The silicon interconnects and the metal lines, regions (b) and (c), are also well protected because of the thick oxide step surrounding the silicon gate and aluminum electrode.

4. Extra protection. Devices made with n-channel technology have better gate protection than p-channel structures made with conventional fabricating techniques. The low n-channel breakdown voltage curve means that these n-channel structures operate at voltages substantially lower than those causing gate breakdown. This extra gate protection, compared to p-channel devices, adds to device reliability.

In (a) the MOS transistor is well protected from the external environment by a glass passivation barrier impervious to contamination. The resistance of this barrier to contamination ($\mathrm{Na}^{+}$) has been well established in p-channel technology. The interconnects of (b) and (c) are also fabricated as in the p-channel technology process, and so are also well protected from external contamination.

Included in Fig. 3 are the results obtained after 800,000 unit-hours of stress on test devices. They show that virtually no shifts in any of the threshold voltages were observed. The stress condition on these units was 5.5 volts (maximum value for the TTL circuits) at $125^{\circ} \mathrm{C}$. The junction leakages of these devices were also monitored, and no increases were observed.

An additional reliability feature of the n-channel technology is improved gate protection of the input devices, resulting in the lower breakdown voltage of the diode. Figure 4 compares the dc behavior of the same gate protection device for both n-channel and p-channel processes as a function of the static charge applied. If the same dynamic impedance is assumed for the two diodes, the breakdown voltage in the n-channel version is consistently lower than in the p-channel device, requiring a larger static charge to break down the gate dielectric.

## The real thing

The n-channel 2,048- and 1,024-bit dynamic recirculating shift registers, the Intel 2401 and 2405, are examples of products using this new technology. Both are directly TTL-compatible in all respects: inputs, outputs, clock line, and power supply. Table 2 compares a typical p-channel silicon-gate 1,024-bit shift register (1404A) with the n-channel silicon-gate 2,048-bit recirculating shift register (2401). A photomicrograph of the latter die is shown in Fig. 5.


Even though the n-channel device has twice as many bits as the p-channel version, it is only 20% larger in area. This is because the cell size of the 2401 is also nearly half the cell size of the 1404A, thanks primarily to the closer spacing allowed by the low voltage. Doubling the bit capacity in each package results in considerable savings for the user in printed-circuit board area, of course. The increase in bits per chip on the 2401 will far outweigh the loss in yield on the slightly larger chip and result in a much lower cost per bit.
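The area comparison can be checked directly from the chip and cell dimensions in Table 2. The sketch below is plain arithmetic on those published figures; the dictionary layout and variable names are illustrative only.

```python
# Figures taken from Table 2 (dimensions in mils, cell area in square mils).
chips = {
    "1404A (p-channel)": {"bits": 1024, "chip": (118, 136), "cell": 10.0},
    "2401 (n-channel)": {"bits": 2048, "chip": (125, 151), "cell": 5.6},
}

summary = {}
for name, c in chips.items():
    chip_area = c["chip"][0] * c["chip"][1]       # total die area
    array_area = c["bits"] * c["cell"]            # area taken by storage cells
    summary[name] = {
        "chip_area": chip_area,
        "array_fraction": array_area / chip_area,
        "bits_per_sq_mil": c["bits"] / chip_area,
    }

p = summary["1404A (p-channel)"]
n = summary["2401 (n-channel)"]
area_ratio = n["chip_area"] / p["chip_area"]              # how much larger the 2401 die is
density_ratio = n["bits_per_sq_mil"] / p["bits_per_sq_mil"]  # gain in bits per unit area

print(f"die area ratio: {area_ratio:.2f}, bit density ratio: {density_ratio:.2f}")
```

With these numbers the 2401 die comes out roughly 18% larger than the 1404A die, in line with the "only 20% larger" statement, while carrying about 1.7 times as many bits per unit of die area.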

Additionally, the input levels are all referenced to ground, unlike the p-channel MOS levels, which are referenced to the positive power supply voltage $V_{CC}$. Nearly 50% of the user problems with p-channel MOS shift registers are related to the clock levels. Incidentally, these are 16-V clock swings, and require either a voltage-multiplying stage off the 5-V TTL supply or a separate negative voltage supply. Since the 2401 has a TTL-level clock and the clock capacitance is only 7 picofarads (worst case) over the full temperature range, as compared to over 100 pF for p-channel MOS clock inputs, system design is greatly simplified.
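To see why the clock line matters so much to the system designer, the standard estimate for the power needed to charge and discharge a capacitive load, $P = C V^{2} f$, can be applied to the clock figures in Table 2. This is a rough sketch under stated assumptions: the 1-MHz clock rate is chosen only for comparison, the p-channel swing is taken as the 17 V between the +5 V and -12 V clock levels in Table 2, and the n-channel swing is taken as a full 5 V, which overstates a real TTL swing.

```python
def clock_drive_power(capacitance_f: float, swing_v: float, freq_hz: float) -> float:
    """Power (watts) to charge and discharge a capacitive clock line: P = C * V^2 * f."""
    return capacitance_f * swing_v ** 2 * freq_hz


FREQ_HZ = 1e6  # assumed comparison frequency, not a figure from the article

# p-channel 1404A: 140 pF per clock line, +5 V to -12 V levels (Table 2).
p_per_phase = clock_drive_power(140e-12, 17.0, FREQ_HZ)
p_total = 2 * p_per_phase  # two clock phases are required

# n-channel 2401: 7 pF, single TTL-level clock (5 V swing assumed as an upper bound).
n_total = clock_drive_power(7e-12, 5.0, FREQ_HZ)

print(f"p-channel: {p_total * 1e3:.1f} mW, n-channel: {n_total * 1e3:.3f} mW")
```

Under these assumptions each p-channel clock phase costs about 40 mW of drive power per device, against well under 1 mW for the n-channel part, which is why the clock driver largely disappears as a design problem.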

The maximum power dissipation of the 2401 is less than that of the 1404A in system configurations. The 400-milliwatt worst-case power dissipation over the temperature range at maximum frequency of operation includes the clock generator power of the 2401, which is about 40% of the total.

Since ease of use and TTL compatibility were emphasized for this n-channel silicon-gate design, the maximum data rate of the 2401 is actually slower than that of the 1404A. This is due to the large load presented to the internal drivers. But even under these circumstances the device is capable of operating at 2 MHz.

Finally, the 2401 saves the user some expense in external parts. Many p-MOS shift registers must have separate circuits to recirculate the data and to address the bit lines. With the 2401, on the other hand, recirculation and two chip selects for X-Y matrix selection are supplied on chip. Also included is an internal load resistor, removing the user's need for external matching.

5. Close fit. Unlike most p-channel memory devices, this 2,048-bit dynamic shift register made with n-channel technology is directly compatible with TTL interface circuits in input, output, and clock lines; most significantly, it can operate from a single 5-V supply.

# A Fully Decoded 2048-Bit Electrically Programmable FAMOS Read-Only Memory 

DOV FROHMAN-BENTCHKOWSKY, MEMBER, IEEE


#### Abstract

This paper describes a fully decoded 2048-bit electrically programmable read-only memory implemented with a novel floating-gate avalanche-injection MOS (FAMOS) charge-storage device as the basic nonvolatile memory element. The memory is organized as 256 words of 8 bits, is fully TTL compatible, and can be operated in either the static or the dynamic mode. The memory array was successfully fabricated with silicon-gate MOS technology, yielding functional devices with access times of 800 ns in the static mode and 500 ns in the dynamic mode of operation. The memory chip is assembled in a 24-lead dual-in-line package.


SEMICONDUCTOR read-only memories (ROMs) are presently implemented in a variety of digital system and computer applications. Most available semiconductor ROMs are programmed permanently at the integrated-circuit fabrication stage by a custom mask that defines the desired information pattern. As a result, program changes in microprogramming applications as well as pattern changes during the debugging phase of digital systems involve the generation of a new mechanical mask for every modified ROM pattern. In addition to being an expensive step, this also limits the flexibility of ROM applications because of the delay involved in the production process. These limitations of mask-programmable ROMs have led to a growing interest in electrically programmable semiconductor ROMs in which the permanent information pattern is recorded by application of an electrical signal. This allows programming changes to be effected without the expense and time delay involved in generation of a custom mask.

The different proposed electrically programmable ROMs can be divided into two main categories: 1) ROMs in which a permanent (irreversible) change in the memory metal interconnection pattern is effected by an electrical pulse; and 2) alterable ROMs in which a reversible change in active memory device characteristics is induced electrically.

The first category includes fusible-type ROMs, which are mainly bipolar memories with capacities of up to 512 memory bits. Their main disadvantage is that they cannot be reprogrammed, which prevents complete functional testing before shipment, as well as pattern modification in case of programming error or needed change.

The search for alterable (charge storage) ROMs stems from the need for low-cost fully tested field-programmable ROMs as well as an attempt to provide a substitute for the nonvolatile storage capability (retention of stored information without an external power source) of magnetic memories. Most proposed alterable ROMs rely on charge storage in a dielectric that forms part of the gate of an insulated-gate field-effect transistor. Feasibility has been demonstrated for a metal-nitride-oxide-silicon (MNOS) memory [1], [2], a metal-aluminum-oxide-silicon (MAS) memory [3], and a dual-gate MNOS memory [4]. Difficulties in controlling the electrical characteristics of the storage dielectrics and additional fabrication steps required to achieve on-the-chip decoding have limited the realization of these approaches to undecoded memory arrays of up to 256 memory bits.

Recently, feasibility of the ovonic amorphous semiconductor memory device has been demonstrated by fabrication of an undecoded 256-bit memory array [5].

The introduction of a novel MOS memory element, the floating-gate avalanche-injection MOS (FAMOS) charge-storage device, has led to the fabrication of a fully decoded 2048-bit electrically programmable ROM. The memory is organized as 256 words of 8 bits, is fully TTL compatible, and can be operated in either the static or the dynamic decoding and sensing mode. The monolithic memory array was successfully fabricated with silicon-gate MOS technology, yielding functional devices with access times below 800 ns in the static mode and less than 500 ns in the dynamic mode of operation. The memory chip is assembled in a 24-lead dual-in-line package.

## I. Device Structure and Operation

A cross section of the FAMOS structure is shown in Fig. 1 with its suggested electrical symbol. It is essentially a p-channel silicon-gate MOS field-effect transistor [6] in which no electrical contact is made to the silicon gate. The floating polysilicon gate is isolated from the silicon substrate by a $\mathrm{SiO}_{2}$ layer of approximately $1000 \AA$ and from the top surface by $1.0 \mu \mathrm{m}$ of vapor-deposited oxide. Operation of the FAMOS memory structure depends on charge transport to the floating gate by avalanche injection of electrons from either the source or drain p-n junctions. A junction voltage in excess of $-30$ V applied to a p-channel FAMOS device (Fig. 2) with $1000 \AA$ of oxide and 5-8-$\Omega \cdot \mathrm{cm}$ substrate resistivity will result in the onset of injection of high-energy electrons from the p-n junction surface avalanche region to the floating silicon gate. The gate charging current is of the order of $10^{-7} \mathrm{~A}/\mathrm{cm}^{2}$. Since the silicon gate is floating, the electron current through the oxide results in the accumulation of a negative charge on the gate. For a p-channel FAMOS transistor this negative charge will induce a conductive inversion layer connecting source and drain. The amount of charge transferred to the floating gate as a function of the amplitude and duration of the applied junction voltage is shown in Fig. 3. The presence or absence of charge can be sensed by measuring the conductance between the source and drain regions.

Fig. 1. (a) Cross section of FAMOS structure. (b) Suggested electrical symbol.

Fig. 2. Cross section of FAMOS device under bias.

Once the applied junction voltage is removed, no discharge path is available for the accumulated electrons since the gate is surrounded by thermal oxide, which is a very low conductivity dielectric. The electric field in the structure after the removal of junction voltage is due only to the accumulated electron charge and is not sufficient to cause charge transport across the polysilicon-thermal-oxide energy barrier. The maximum stored charge of $4 \times 10^{12}$ electrons/$\mathrm{cm}^{2}$ ($V_{W}=50$ V, Fig. 3) results in an electric field of approximately $2 \times 10^{6}$ V/cm across the thermal oxide. Assuming current transport by Fowler-Nordheim emission from the polysilicon gate into the oxide, the estimated discharge current (for a polysilicon-$\mathrm{SiO}_{2}$ energy barrier of at least 3.2 eV) is of the order of $10^{-40} \mathrm{~A}/\mathrm{cm}^{2}$ at 300 K.
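The quoted field can be cross-checked with a simple sheet-charge estimate, $E = \sigma / \varepsilon_{ox}$. This is a sketch, not the paper's own calculation: it assumes that the entire field of the stored charge appears across the thermal oxide and uses the textbook relative permittivity of 3.9 for $\mathrm{SiO}_{2}$, a value not given in the paper.

```python
ELECTRON_CHARGE_C = 1.602e-19   # coulombs per electron
EPS0_F_PER_CM = 8.854e-14       # vacuum permittivity in F/cm
EPS_R_SIO2 = 3.9                # assumed textbook value for thermal SiO2


def oxide_field_v_per_cm(electrons_per_cm2: float) -> float:
    """Field across the oxide produced by a sheet of stored charge: E = sigma / eps."""
    sigma = electrons_per_cm2 * ELECTRON_CHARGE_C   # charge density in C/cm^2
    return sigma / (EPS_R_SIO2 * EPS0_F_PER_CM)


field = oxide_field_v_per_cm(4e12)   # maximum stored charge quoted in the text
print(f"estimated oxide field: {field:.2e} V/cm")
```

The estimate comes out near $1.9 \times 10^{6}$ V/cm, consistent with the figure of approximately $2 \times 10^{6}$ V/cm stated above.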

Charge decay plots as a function of time at $125^{\circ}$ and $300^{\circ} \mathrm{C}$ are shown in Fig. 4. The rapid initial decay saturating with time cannot be explained by electron transport across the oxide. Its temperature and electric-field dependence correspond to that of positive-charge buildup at the $\mathrm{SiO}_{2}$-Si interface that occurs at high electric fields under negative gate bias conditions [7], [8]. The activation energy of charge decay was measured to be 1.0 eV. An extrapolation of the $300^{\circ} \mathrm{C}$ charge decay results indicates that 70 percent of the initial induced charge can be retained for as long as 10 years at $125^{\circ} \mathrm{C}$.

Fig. 3. Charge accumulation on the floating gate as a function of charging pulsewidth for different values of pulse amplitude.

Since the gate electrode is not electrically accessible, the charge cannot be removed by an electrical pulse. However, the initial condition of no electronic charge on the gate can be restored by two nonelectrical methods. Illumination of the unpackaged device with ultraviolet light will result in the flow of a photocurrent from the floating gate back to the silicon substrate thereby discharging the gate to its initial condition. This erase method allows complete testing of a complex memory array before the package goes through final seal. Once the package is sealed, information can be erased by exposure to X-ray radiation. The radiation dose required is in excess of $5 \times 10^{4} \mathrm{rad}$, which is many orders of magnitude higher than the average yearly atmospheric radiation dose and easily attainable with available commercial X-ray generators.

## II. Memory Cell

The FAMOS charge-storage device described above can be used as the basic storage element in a large memory array. A circuit schematic of the memory cell and its associated decoding circuitry is shown in Fig. 5. The decode and sense circuits shown are common to both the program and read modes. To select a bit to be programmed, the address-decode inputs as well as the $V_{DD}$ and $V_{GG}$ lines are energized to $-50$ V. Programming of a memory bit is accomplished by coincidence selection of the $X$ and $Y$ select lines. The applied programming pulse is transferred to the selected FAMOS device, which turns normally on due to the electron charge transferred to its floating gate. All other memory bits are not programmed due to either the lack of a pulse on the $Y$ select line or the absence of a transfer pulse on the $X$ select line. The programming signal $V_{P}$ is a $-50$-V, 5.0-ms pulse. The amount of charge stored in a memory cell in response to a programming pulse is typically $3.0 \times 10^{-7} \mathrm{~C}/\mathrm{cm}^{2}$, which is equivalent to 10 V on the gate of a conventional MOS transistor. The load current required to program a memory bit is approximately 5.0 mA. The increase in programming pulse amplitude and duration to charge the memory cell compared to the data presented in Fig. 3 is due to the additional voltage drop across the decode circuits. These high applied voltages in the PROGRAM mode could give rise to parasitic programming paths due to field inversions over the thick oxide regions on the chip. This potential problem was accounted for in both the circuit design and chip layout by providing a highly conductive programming path as compared to the parasitic path as well as careful routing of high-voltage lines on the chip to avoid sensitive circuit nodes.

Fig. 4. Charge decay in a FAMOS device as a function of time at $125^{\circ}$ and $300^{\circ} \mathrm{C}$.

Fig. 5. Memory cell with its associated decode and sense circuits.
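The equivalence between stored charge and gate voltage can be estimated with a parallel-plate relation, $V = Q\,t_{ox}/\varepsilon_{ox}$. This is a rough sketch under assumptions not stated in the paper: the $1000 \AA$ gate oxide from Section I is taken as the dielectric thickness, and the relative permittivity of $\mathrm{SiO}_{2}$ is taken as the textbook value of 3.9.

```python
EPS0_F_PER_CM = 8.854e-14   # vacuum permittivity in F/cm
EPS_R_SIO2 = 3.9            # assumed textbook value for thermal SiO2
T_OX_CM = 1000e-8           # 1000 angstroms expressed in cm


def equivalent_gate_voltage(charge_c_per_cm2: float) -> float:
    """Voltage across the gate oxide that corresponds to a stored sheet charge."""
    c_ox = EPS_R_SIO2 * EPS0_F_PER_CM / T_OX_CM   # oxide capacitance in F/cm^2
    return charge_c_per_cm2 / c_ox


v_equiv = equivalent_gate_voltage(3.0e-7)   # typical stored charge from the text
print(f"equivalent gate voltage: {v_equiv:.1f} V")
```

The estimate lands a little under 9 V, the same order as the 10 V quoted above; the difference depends on the permittivity and oxide thickness assumed.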

Memory-cell operation in the read mode is similar to that of other ROMs. A memory bit is selected by a coincidence of signals on the $X$ and $Y$ select lines. However, compared to the programming mode, the read mode voltages are substantially lower. Since the programming threshold is $-30.0$ V, the maximum READ voltage of $-15$ V guarantees a wide margin to avoid disturbing the information during the read mode. Information in the selected memory cell is sampled by the output sense circuit. If a "0" is stored in the cell (charge on the floating gate), the FAMOS device is on and the level at the input of the sense circuit is close to $V_{CC}$. A "1", corresponding to no charge stored in the cell, is reflected by a more negative level (close to $V_{DD}$) at the input of the sense circuit. Information can be decoded and sensed in either the static (no clocks) or dynamic mode (2 clocks). The static mode of operation eliminates the need for clocks at the expense of increased power dissipation and reduced speed, while the dynamic mode offers advantages in both performance categories. The option of the two modes of operation is achieved by parallel load transistors in the decode and sense circuitry, one connected to the clock lines and the other to $V_{GG}$ as shown in Fig. 5. The load device connected to the $V_{P}$ terminal is connected to $\phi_{1}$ in the dynamic read mode. Mode selection is done by activating the clocks with $V_{GG}$ connected to $V_{CC}$ in the dynamic mode. $V_{GG}$ is activated in the static mode with the clocks connected to $V_{CC}$.

Fig. 6. Block diagram of memory organization.

## III. Memory Organization and Operation

The electrically programmable memory chip consists of a monolithic array of 2048 FAMOS devices organized as 256 words of 8 bits. A block diagram of memory organization is shown in Fig. 6. All circuit blocks are common to both the program and read modes with the exception of the program data input buffers. In the program mode the eight output terminals are used as data inputs to determine the information pattern in the eight bits of each word, while word address selection is performed by the $X$ and $Y$ decoders through the input drivers. The program data input circuitry for one of eight outputs is shown in Fig. 7. To inhibit the programming of a bit, a negative voltage is applied to the data input terminal. This voltage level is transferred to the inhibit transistor (chip select is enabled), which turns on and overrides the $Y$ select signal. To allow programming of a selected bit, the data input terminal is kept at ground. Initially all 2048 bits are in the "1" state corresponding to normally off FAMOS devices (no charge on the floating gate). Information is introduced by selectively programming "0" in the proper bit locations through charging the FAMOS devices from the PROGRAM terminal. The supply ($V_{DD}, V_{GG}, V_{P}$), address ($A_{0}-A_{8}$) and data input ($D_{\text{IN1}}-D_{\text{IN8}}$) voltages in the program mode are detailed in Fig. 8. A timing diagram for the applied voltage sequence in the program mode is shown in Fig. 9.

In the read mode the program data input buffers are inhibited by the chip select signal to cut off the feedback path from the output to the memory array established in the program mode. Memory operation in the read mode is the same as in conventional mask-programmable ROMs. The input and output buffers provide for full TTL compatibility, and addressing is accomplished by the $X$ and $Y$ decoders operating at the voltage levels detailed in Fig. 8. A selected memory cell with a charged FAMOS device will be reflected by a low "0" TTL level at the output, while a memory bit that is not charged will result in a high "1" TTL level. Both the static and dynamic modes of decoding and sensing are available in a single package through use of parallel load transistors as described in Section II. Mode selection is done by activation of either the clock lines ($\phi_{1}, \phi_{2}$) or the $V_{GG}$ terminal in the dynamic and static modes, respectively. Typical access times are 400 ns in the dynamic mode and 700 ns in the static mode. A photomicrograph of the chip with the designation of the different circuit blocks is shown in Fig. 10.

Fig. 7. Circuit description of program data input buffers.

As can be seen from the above description, electrical programming of the memory is conceptually the same as operation in the read mode with the exception of the voltage levels (Fig. 8). Hence, the memory can be easily programmed from punched paper tape or other data input devices through an electrical programming terminal. Once programmed, the information pattern in the memory can be erased (restored to the all-"1" state) by ultraviolet light before packaging or by exposure to X-ray radiation as described in Section I. The ability to erase before packaging allows for complete testing of the memory chip prior to shipment, which is a distinct advantage over once-programmable ROMs. Erasure of information in packaged devices by placing them in a commercially available X-ray generator allows for correction of programming errors as well as unpredictable future pattern modifications.
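The program, read and erase behavior described in this section can be summarized in a small behavioral model. It is a sketch only: the class name `FamosRomModel` and its methods are hypothetical, voltages and timing are ignored, and a data bit of 0 stands for "data input at ground, bit programmed" while a data bit of 1 stands for "negative inhibit voltage applied, bit left unchanged".

```python
class FamosRomModel:
    """Behavioral model of a 256-word by 8-bit FAMOS ROM (illustrative only)."""

    WORDS = 256

    def __init__(self):
        # Erased state: no charge on any floating gate, every bit reads "1".
        self.words = [0xFF] * self.WORDS

    def program(self, address: int, data: int) -> None:
        """Charge the cells whose data bit is 0; bits set to 1 are inhibited."""
        # Programming can only move bits from 1 to 0, never back.
        self.words[address] &= data & 0xFF

    def read(self, address: int) -> int:
        """Return the stored word; a charged cell reads as 0."""
        return self.words[address]

    def erase(self) -> None:
        """Model ultraviolet or X-ray erasure: the whole array returns to all 1s."""
        self.words = [0xFF] * self.WORDS


rom = FamosRomModel()
rom.program(3, 0xA5)
first_read = rom.read(3)      # the pattern just written
rom.program(3, 0x0F)          # a second pass can only clear further bits
second_read = rom.read(3)
rom.erase()
erased_read = rom.read(3)
```

The model makes the key constraint explicit: a programming pass can only turn "1" bits into "0" bits, so correcting a wrongly programmed "0" requires erasing the whole array first.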

## IV. Reliability

One of the most important performance factors in an electrically programmable ROM (as in all other products) is its long-term reliability. Reliability in this case incorporates both the long-term functionality of the standard MOS transistors and the long-term information retention of the FAMOS devices. Reliability data on silicon-gate MOS memory products have been accumulated over a period of two years, resulting in many device hours without failure. This record is directly applicable to the programmable ROM since it is fabricated with the same silicon-gate MOS technology. As to long-term information retention, the charge-storage programmable ROM has a predictable time-temperature dependence for stored information.

| APPLIED VOLTAGE (VOLTS) | $V_{DD}$ | $V_{GG}$ | $V_{P}$ | $\phi_{1}$ | $\phi_{2}$ | $V_{CC}$ | DATA IN |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| PROGRAM | $-50.0$ | $-50.0$ | $-50.0$ | $V_{CC}$ | $V_{CC}$ | 0.0 | $-40.0$ |
| STATIC READ | $-9.0$ | $-9.0$ | $V_{CC}$ | $V_{CC}$ | $V_{CC}$ | $+5.0$ | TTL |
| DYNAMIC READ | $-9.0$ | $V_{CC}$ | $-9.0$ | $-9.0$ | $-9.0$ | $+5.0$ | TTL |

Fig. 8. Applied voltage levels to the memory in the PROGRAM and READ modes of operation.

Fig. 9. Timing diagram for the program mode.

The long-term decay (Fig. 3) has a logarithmic dependence on time with a slope of approximately 1.0 V per 5 decades of storage time at $125^{\circ} \mathrm{C}$. This rate of charge decay extrapolates to storage retention times greater than 10 years at $125^{\circ} \mathrm{C}$. The predictable time-temperature behavior of stored charge, corresponding to an activation energy of 1.0 eV, allows for accelerated life testing of the programmable ROM. To guarantee 10-year retention at $125^{\circ} \mathrm{C}$, the device has to be subjected to $300^{\circ} \mathrm{C}$ storage for only 10 h. This procedure is similar to existing high-temperature stress testing in MOS products to detect potential contamination in the dielectric.
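The time-temperature scaling described above can be illustrated with a simple Arrhenius model. The sketch below is an assumption-laden illustration, not the author's procedure: it assumes a single thermally activated decay mechanism with the stated 1.0 eV activation energy, and the function names are hypothetical. Under these assumptions a 10 h bake at 300°C works out to several years at 125°C, the same order as the 10-year figure quoted in the text; the factor for 125°C versus 85°C likewise comes out in the same range as, though not identical to, the equivalence quoted with Fig. 11.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K
HOURS_PER_YEAR = 8766.0           # average year, including leap days


def acceleration_factor(ea_ev: float, t_use_c: float, t_stress_c: float) -> float:
    """Arrhenius ratio of the decay rate at the stress temperature to that at the use temperature."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k))


def equivalent_hours(hours_at_stress: float, ea_ev: float,
                     t_use_c: float, t_stress_c: float) -> float:
    """Hours at the use temperature equivalent to a given time at the stress temperature."""
    return hours_at_stress * acceleration_factor(ea_ev, t_use_c, t_stress_c)


# 10 h at 300 C expressed as years at 125 C, with Ea = 1.0 eV (value from the text).
bake_years = equivalent_hours(10.0, 1.0, 125.0, 300.0) / HOURS_PER_YEAR
# One hour at 125 C expressed as hours at 85 C.
hours_at_85 = equivalent_hours(1.0, 1.0, 85.0, 125.0)
```

Because the exponent is steep, small changes in the assumed activation energy move the result considerably, which is why the sketch should be read as an order-of-magnitude check only.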

A summary of reliability results on the 2048-bit electrically programmable ROM is shown in Fig. 11. The extrapolation of time from $125^{\circ}$ and $200^{\circ} \mathrm{C}$ to the operating temperature of $85^{\circ} \mathrm{C}$ is based on an activation energy for charge decay of 1.0 eV. The bias configuration in the temperature/bias test corresponds to worst case operating voltages in the read mode. No failures have been observed in any of the long-term information-retention tests.

Fig. 10. Photomicrograph of the memory chip with designation of the different circuit blocks.

1601 RELIABILITY RESULTS

| TEST | $T_{A}$ | UNITS | HOURS | FAILURES | EQ. TIME <br> AT $85^{\circ} \mathrm{C}$ |
| :---: | :---: | :---: | :---: | :---: | :---: |
| TEMP/BIAS | $125^{\circ} \mathrm{C}$ | 15 | 1500 | 0 | 5 YEARS |
| STORAGE | $125^{\circ} \mathrm{C}$ | 10 | 1500 | 0 | 5 YEARS |
| STORAGE | $200^{\circ} \mathrm{C}$ | 10 | 250 | 0 | 100 YEARS |

1.0 HOUR AT $125^{\circ} \mathrm{C}$ = 35 HOURS AT $85^{\circ} \mathrm{C}$

Fig. 11. Reliability results for storage retention of the programmable ROM.

An additional reliability factor is the retention of the uncharged "1" memory state. It was pointed out in Section III that operation in the read mode differs from the program mode only in the lowering of the operating voltages from -50 to -14 V. This brings up the question of whether an uncharged memory bit can be slowly charged by repeated selection in the read mode. Retention of the uncharged memory state as measured on a typical memory cell is shown in Fig. 12. The time for a shift in FAMOS device turn-on voltage as a function of applied voltage to the memory cell is plotted for different values of turn-on voltage shift. For an operating voltage in the read mode of -14 V, a shift of $V_{T}=1.0 \mathrm{~V}$ is expected to take place after more than 100 years. A shift of at least 5.0 V is required before the uncharged memory state "1" is interpreted as a "0" by the sense circuitry. Another important aspect of reliability is the extent of preshipment testing. As pointed out in previous sections, because of its reprogrammable feature, the memory array can be completely tested before shipment for functionality and programming, as well as an accelerated high-temperature test to guarantee long-term retention. After the tests are performed, the information pattern is erased and the programmable ROM is shipped with a pattern of all "1" ready for programming.

Fig. 12. Retention of the uncharged memory state as a function of applied voltage to the memory cell.

## V. Applications

Despite the attempts of semiconductor manufacturers to lower mask charges and turn-around time for mask-programmable ROMs, the economics and design delay involved have limited the versatility of semiconductor ROM applications. The availability of a low-cost electrically field-programmable ROM opens new application areas, which heretofore have not been economically feasible, mainly those in which relatively small quantities of different ROM patterns are required. One example is the implementation of computer lookup tables that provide a hard-wired means of performing certain routine computer calculations. The variety of possible patterns makes the use of mask-programmable ROMs costly and impractical, while such applications are ideally suited to field-programmable ROMs. Other application areas that fall into the same category are binary-sequence generators, microprogramming control of central processors in computer terminals, and programmed logic arrays.

On the other hand, in areas such as code converters, character generators, and other applications in which large quantities of a few patterns are required, the field-programmable ROM augments the capability of mask-programmable ROMs. In these applications, the initial custom mask cost is insignificant compared to the total cost of the high-volume standard ROM pattern. However, in order to finalize a decision on a given high-volume standard pattern a method of debugging is needed to avoid the costly phase of correcting potential errors. In this initial pattern definition phase the electrically programmable ROM is most economical and flexible. Since the flexibility of field-programmable ROMs is generally achieved at the expense of chip area, once the pattern has been defined, the high-volume order is placed for mask-programmable ROMs, which are lower cost in high quantities. Hence the combination of the two programming methods offers both the flexibility and cost incentive required to proliferate the use of semiconductor read-only memories in high-volume applications.

## VI. Summary

The introduction of a novel semiconductor memory element, the FAMOS charge storage device, and its implementation in a fully decoded 2048-bit electrically programmable read-only memory constitutes a significant advance in the state of the art of semiconductor memories. It is the first available large-capacity programmable ROM in which the information pattern is recorded electrically by way of a reversible change in memory-device characteristics. This was achieved by a combination of a new charge-storage structure and circuit techniques that allow its implementation in large-scale memory arrays with existing MOS processing techniques.

## Acknowledgment

The author wishes to thank L. L. Vadasz, A. S. Grove, and G. E. Moore for many helpful discussions, G. Pasco for fabrication of the memory devices, and G. Greenwood for his help in testing and instrumentation.

## References

[1] H. A. Wegener, "MNOS memories," Digest Intermag. Conf., Apr. 1970.
[2] D. Frohman-Bentchkowsky, "An integrated metal-nitride-oxide-silicon (MNOS) memory," Proc. IEEE (Lett.), vol. 57, June 1969, pp. 1190-1192.
[3] S. Nakanuma et al., "A read-only memory using MAS transistors," ISSCC Digest Tech. Papers, Feb. 1970, pp. 68-69.
[4] H. G. Dill and T. N. Toombs, "A new MNOS charge storage effect," Solid-State Electron. vol. 12, 1969, pp. 981-987.
[5] R. G. Neale, D. L. Nelson, and G. E. Moore. "Amorphous semiconductors (Part I)," Electronics, Sept. 1970.
[6] L. L. Vadasz, A. S. Grove, T. A. Rowe, and G. E. Moore, "Silicon-gate technology," IEEE Spectrum, vol. 6, 1969, pp. 28-35.
[7] B. E. Deal, M. Sklar, A. S. Grove, and E. H. Snow, J. Electrochem. Soc., Solid-State Sci., vol. 114, 1967, p. 266.
[8] S. R. Hofstein, Solid-State Electron., vol. 10, 1967, p. 657.

# STANDARD PARTS AND CUSTOM DESIGN MERGE IN FOUR-CHIP PROCESSOR KIT 

by F. FAGGIN and M.E. HOFF<br>INTEL CORP. 3065 BOWERS AVE. SANTA CLARA, CALIFORNIA 95051

Reprinted from ELECTRONICS, April 24, 1972: © McGraw Hill 1972.

# Processors of various degrees of complexity can be designed from three standardized chip elements and a fourth, a read-only memory, customized to each user's specific purpose by his own microprogram 

A vast new array of applications has opened up with the advent of low-cost minicomputers that the user can microprogram himself for any number of repetitive functions. These applications range from control of computer peripheral devices to hospital patient monitors to gaming machines. Such widespread use of microprogramed minicomputers is made possible by large-scale integration that brings the cost about two orders of magnitude below the cost of the most inexpensive conventional minicomputer. One such computer, the Intel MCS-4, is composed of only four kinds of LSI chips. Three of these chip modules (a read/write memory, a processor, and a shift register) are standardized designs. The fourth module, a read-only memory, is programed to the user's specifications.

Among the potential applications for these microprogramed machines are, generically speaking, office equipment, process control, and instrumentation. More specific examples abound in these and other categories: billing machines and point-of-sale terminals; numerically controlled machines and traffic control; spectrum analyzers, navigational receivers, hospital patient monitors, intelligent terminals, and others.

The MCS-4 consists of the 4001 read-only memory (ROM), which also contains a four-bit input/output section, or port; the 4002 read/write memory (RWM), which includes both main storage and status memory, and a four-bit output port; the 4004 four-bit central processing unit (CPU); and the 4003 shift register (SR) for extending output functions, actually a medium-scale circuit because it holds only 10 bits and is on a relatively small chip. Input lines can be expanded with standard MSI multiplexers.

## Unexceptional chip size

All employ p-channel silicon-gate MOS circuitry and have conventional measurements typical of today's smaller LSI circuits: 53 by 85 mils for the shift register and roughly 110 by 150 mils for the other three. Circuit density is quite high; each chip holds more than these measurements imply. The standard LSI circuits (the RWM, the SR, and the CPU) are produced in high volume, and the only specialized part of the design, the ROM, uses the already proven economy of mask programing. Because the customer can maintain the mask patterns for his ROMs as proprietary, he still has the advantages of a custom design, without the high cost that usually accompanies custom designs.

The I/O port of the ROM contains latches that store data being transferred from the computer's data bus to the outside world, that is, the remainder of the system in which the computer is used. Should such external equipment require more than the number of output lines provided by the ROMs and RWMs, the SR makes more lines available, as described below.

[^4]

2. Maximal. Up to 16 ROMs and 16 RWMs can be attached to one processor in the new chip set. Input and output are also handled through these chips; output can be expanded beyond four bits in parallel through shift registers.

The status memory portion of the 4002 RWM is really only an extension of the main memory, differing from the latter primarily in its means of access. It is useful for storing labels, exponents, signs, and other control information associated with the data stored in the main memory.

A typical computer configuration uses one CPU, together with several RWMs and ROMs, and perhaps a few SRs. These configurations are flexible, economical, and easy to design.

Flexibility comes from easy program changes, ease of changing capacity, small size, and low power dissipation. A short design cycle is possible because the system is controlled by a microprogram, which is inherently easier to design and implement than either random logic or custom LSI circuits.

The system is economical to manufacture because all the chips come in standard 16-pin dual in-line packages that lend themselves to automatic insertion and enable the customer to implement rather complex functions with relatively few chips. Furthermore, because three of the four chips are standardized, the customer is not faced with a development charge for them, while mask charges for a new ROM are far less than development charges for custom LSI chips. Even these mask charges can be saved, particularly during prototype development, by using electrically alterable ROMs.

## Minimum system

The smallest possible system would require one CPU and one ROM. The latter, which contains 256 words of eight bits each, as shown in Fig. 1, would hold programs and subroutines for the CPU, and perhaps some fixed data. The maximum amount of ROM that can be directly addressed is 16 chips or 4,096 bytes (see Fig. 2). While these 4,096 bytes require a 12-bit binary address, the ROM communicates with the CPU over a four-bit-wide data path. Therefore the ROM contains, in addition to the usual decoding circuits, a demultiplexer to permit the address to come in three four-bit chunks and a multiplexer to divide the output word into two more four-bit chunks.

The CPU delivers the three address chunks in three cycles. The first two of these select one byte in each ROM chip, while the third selects one chip out of all the ROMs, which delivers its byte to the data bus. Following these three address cycles are two more cycles during which the CPU accepts the eight bits from the ROM, four during each cycle.

The ROM chip itself contains the logic circuitry required to keep track of which data transfer is in process at any time, so that it can react to each one in the proper way.
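The five-cycle exchange described above can be sketched in a few lines. This is a minimal model, not Intel's logic: the function names are hypothetical, and the nibble ordering (low-order address chunk first, high-order data chunk first) is an assumption, since the text only says that the first two chunks select a byte within each chip and the third selects the chip.

```python
from typing import List, Tuple


def address_nibbles(address: int) -> Tuple[int, int, int]:
    """Split a 12-bit ROM address into the three four-bit chunks sent by the CPU.

    Assumed order: low nibble, middle nibble (together the byte within a chip),
    then the high nibble, which selects one of up to 16 ROM chips.
    """
    if not 0 <= address < 4096:
        raise ValueError("address must fit in 12 bits")
    return address & 0xF, (address >> 4) & 0xF, (address >> 8) & 0xF


def fetch(roms: List[bytes], address: int) -> Tuple[int, int]:
    """Model one fetch: three address cycles in, two four-bit data cycles out."""
    low, mid, chip = address_nibbles(address)
    offset = (mid << 4) | low      # first two chunks pick one byte in every chip
    word = roms[chip][offset]      # third chunk picks the chip that answers
    return word >> 4, word & 0xF   # assumed order: high nibble, then low nibble


def assemble(first: int, second: int) -> int:
    """Rebuild the eight-bit word on the CPU side from the two data chunks."""
    return (first << 4) | second
```

With 16 ROM images of 256 bytes each, `assemble(*fetch(roms, 0x3A7))` returns byte `0xA7` of chip 3, which is the 4,096-byte address space described in the text.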

## Read/write memory

Each RWM (Fig. 3) contains 80 characters of four bits each, organized as four 20-character registers. In each register, 16 characters are part of the main storage, and the other four characters belong to the status memory previously mentioned. Therefore, the 16 chips in the largest memory configuration contain 64 registers, or 1,024 characters of main storage and 256 characters of status memory.

The CPU controls both the RWM and the ROM chips through six control lines. One of the six synchronizes the clocks of all chips to that of the CPU. The other five, called command lines, control the chip addressing and the input/output function. These six control lines permit the single family of chips to be used in a wide variety of system configurations, suitable for many applications.

If more than the 16 ROMs and 16 RWMs are needed for a particular application, external interface circuits and bank switching techniques similar to those used in some minicomputers permit the necessary number of additional memory chips of either type to be connected.

3. Read-write memory. Four 16-character data registers and four 4-character status registers appear in the rectangular array at left center in this photo. Remainder of chip holds control and timing circuits, and output buffers similar to those in the ROM.

The SR, a 10-bit static shift register into which data is loaded serially, has output lines that are accessible in parallel. Logic included on the chip disables the output lines until all desired data has been shifted into it from a ROM or RWM.

The CPU is a small but complete processor (Fig. 4) capable of executing 45 different instructions in a basic instruction cycle of 10.67 microseconds. It is built around a four-bit adder, and has a set of 16 four-bit scratchpad registers. Internally, the CPU is strictly binary; but for applications using decimal arithmetic, the binary result of adding two binary-coded decimal digits can be returned to BCD by a special instruction.

Of particular interest is the set of four 12-bit address registers that permit as many as three levels of subroutine nesting. During the execution of a program, when a jump to a subroutine is encountered, the instruction address register contains the address of the next instruction to be performed after the subroutine is executed.

## Bigger and better

Intel's new microcomputer set, as described in the accompanying article by Ted Hoff and Federico Faggin, is actually only one of several microcomputers now appearing on the scene. Recently, Intel also announced its 8008 microcomputer, an eight-bit CPU on one 125-by-170-mil chip [Electronics, March 13, p.143]. Like the 4004, the 8008 requires external memory and interface chips to make a full computer, which, in turn, would be only one part of a larger system in most applications. However, it does have an internal scratchpad memory of seven words, plus an instruction counter to use with the external memory.
"But the eight-bit chip doesn't make the four-bit obsolete," insists Hoff. "For example, the 4004 is more economical; a functioning computer can be made with it and only one additional memory chip." But the minimum system using the 8008, he says, would require 15 or 20 packages and would be a correspondingly more powerful system.
-W.B.R.

Its contents must be temporarily stored to permit the main program to resume following the subroutine. In the 4004 CPU, the subroutine may itself branch to a second subroutine, and that one to a third; three of the four address registers store three address words for this nesting process, while the fourth holds the address of the current instruction.
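The nesting scheme can be sketched as a small class. This is an illustrative model only: the class and method names are hypothetical, and raising an error on a fourth nested call is an assumption made for clarity, since the text does not say how the hardware behaves beyond three levels.

```python
class AddressRegisters:
    """Four 12-bit address registers: one current address plus three saved addresses."""

    DEPTH = 3  # levels of subroutine nesting described in the text

    def __init__(self) -> None:
        self.pc = 0        # address of the next instruction to be performed
        self._saved = []   # up to three return addresses

    def call(self, target: int) -> None:
        """Jump to a subroutine, saving the address at which to resume."""
        if len(self._saved) == self.DEPTH:
            # Assumption: the sketch refuses a fourth level rather than guessing
            # at what the real chip would do.
            raise OverflowError("more than three nested subroutines")
        self._saved.append(self.pc)
        self.pc = target & 0xFFF

    def ret(self) -> None:
        """Return from the innermost subroutine to the saved address."""
        self.pc = self._saved.pop()
```

A main program at address `0x010` can thus call three subroutines deep and unwind through the saved addresses in reverse order.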

## Extended capability

For a complex system, requiring more extensive processing capability, multiple processors can be used. In such a system with two central processing units, one CPU could be dedicated to control functions and the other to arithmetic functions. They could communicate with each other through the I/O ports, or they could share a common memory through a minimum of external circuitry.

Each 10.67-$\mu$s instruction cycle is divided into eight machine cycles of 1.33 $\mu$s each, as shown in Fig. 5. During the first three of these, an address is sent out to the ROM. Data from the ROM is returned to the CPU during the next two cycles and is processed during the last three. Some instructions require an additional 10.67-$\mu$s cycle to fetch eight more bits from the ROM.

The individual machine cycles are driven by a two-phase clock running at 750 kHz.
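The timing figures follow directly from the clock rate. The sketch below assumes that one machine cycle lasts exactly one period of the 750 kHz clock, which is consistent with the 1.33 $\mu$s quoted above; the function names are hypothetical.

```python
def machine_cycle_us(clock_hz: float) -> float:
    """Duration of one machine cycle in microseconds (assumed: one clock period)."""
    return 1e6 / clock_hz


def instruction_cycle_us(clock_hz: float, machine_cycles: int = 8) -> float:
    """Duration of one instruction cycle of eight machine cycles, in microseconds."""
    return machine_cycles * machine_cycle_us(clock_hz)


def instructions_per_second(clock_hz: float, words: int = 1) -> float:
    """Instruction rate for instructions occupying `words` eight-bit ROM words."""
    return clock_hz / (8 * words)
```

At 750 kHz this gives about 1.33 $\mu$s per machine cycle and about 10.67 $\mu$s per instruction cycle; an instruction that needs a second cycle to fetch eight more bits takes twice as long.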

## Steps in design

All four LSI chips may operate with a single -15 volt supply voltage with respect to ground; or they may be made compatible with transistor-transistor logic by operating them between -10 volts and +5 volts.

4. Processor. Heart of the four-chip set, this central processing unit is a small but complete general-purpose computer that can execute 45 instructions. The four-bit adder, its central component, is at left center, as key diagram shows. Address registers at right are important features. Processor works exclusively in binary, but has a special instruction that translates a pure binary sum to decimal form. Four white lines around left, bottom, and right are bus interconnecting major sections on chip.

For a particular application, designing a system around the MCS-4 is quite simple. There are essentially seven steps:

- Define the input/output requirements in terms of the peripheral equipment that will be needed.
- Define the amount of storage needed, and from that determine the number of RWM chips.
- Define the amount of control and/or program, and thus the amount of read-only storage needed; this determines the number of ROM chips. (This step may involve an iterative procedure.)
- Specify shift registers as needed to increase the capacity of output lines.
- Write the program.
- Build a prototype system, implement the program and controls in electrically programable ROMs, and get the bugs out.
- Submit the program for manufacturing mask-programed ROMs for volume production.

In defining the input/output requirements, the first step in the design sequence, a number of software/hardware alternatives are available. For example, if the system includes a keyboard, it will interface with the computer's data bus through either a ROM or a RWM. This interface may be capable of eliminating bounce from the key depressions, encoding the data, and rejecting the results of a multiple key depression; or all these functions may be implemented in software and executed in the CPU.

## Keyboard design

An approach that uses little hardware and a moderate amount of software, shown in Fig. 6, places each key at the intersection of one input line of a ROM port and one output line of another ROM or RWM port. The program continually scans the keyboard by sequentially placing a single binary 1 on each output line; if any key has been depressed, that 1 reappears on the corresponding input line. The particular combination of output and input lines establishes the key's identity, and may be translated by a code conversion routine in the CPU into a command or character in conventional ASCII or other code. By requiring an encoded character to appear on several successive scan cycles, the CPU obtains key debouncing. With other suitable programs the CPU can detect multiple keying or key rollover.
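The scan-and-debounce idea can be sketched as follows. This is a simplified software model, not MCS-4 code: the keyboard is represented as a set of closed (output line, input line) intersections, the names are hypothetical, and the requirement of three successive matching scans is an arbitrary illustrative choice.

```python
from typing import Iterable, Optional, Set, Tuple

Key = Tuple[int, int]  # (output line driven, input line read back)


def scan_once(closed: Set[Key], n_out: int, n_in: int) -> Optional[Key]:
    """One scan pass: put a single 1 on each output line in turn and read the inputs."""
    hits = []
    for out_line in range(n_out):
        for in_line in range(n_in):
            if (out_line, in_line) in closed:  # the 1 reappears on this input line
                hits.append((out_line, in_line))
    # Accept exactly one closed key; none, or a multiple depression, yields None.
    return hits[0] if len(hits) == 1 else None


def debounced_key(scans: Iterable[Optional[Key]], required: int = 3) -> Optional[Key]:
    """Return the first key seen on `required` successive scans, else None."""
    last, count = None, 0
    for key in scans:
        if key is not None and key == last:
            count += 1
        elif key is not None:
            last, count = key, 1
        else:
            last, count = None, 0
        if count >= required:
            return key
    return None
```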
5. Eight-in-one. Instruction cycle of 10.67 microseconds comprises eight 1.33-$\mu$s machine cycles in 4004 CPU. First two cycles transmit address of desired word in ROM, third selects one of up to 16 ROMs. Then the addressed word returns to the CPU in two cycles; instruction is executed in remaining three cycles, plus another full eight if needed. Clock frequency is 750 kHz.


A larger keyboard may be more conveniently scanned by the 4003 SR, as shown in Fig. 7. Here only two output port lines are used. One of the lines is programed to provide a clock pulse for the shift register, and the other supplies data. A single binary 1 is loaded into one end of the register and shifted along it; the parallel outputs effectively scan the large keyboard, and appear at the input port just as with the previous example.

6. Keyboard interface. For data entry to system, the MCS-4 scans all columns on keyboard; signal reappearing on a row input shows that a key has been depressed, and timing of signal relative to scan identifies the key. Program then encodes the signal, and also eliminates bounce and multiple keying.

7. More hardware, less software. For larger keyboards, the 4003 shift register extends the scan while simplifying its operation.

8. Still more hardware. Direct approach requires external encoding and debouncing, and works with minimum support from program.

A third design that uses the most hardware and very little software, shown in Fig. 8, requires an external diode encoder and debouncing circuits. In this arrangement the key depressions are read directly into the computer without any scanning.

The amount of read-write storage depends on the number of characters to be stored at one time and the number of bits that are to be stored in read/write storage as opposed to ROM.

For example, if a particular computer is to work with 16-digit numbers and may have to store 10 such numbers at any one time, these numbers will occupy the main-storage registers in three RWM packages, leaving two spare registers in one of the packages. These three packages also contain status memory with a capacity of 48 characters or 192 bits.
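The sizing arithmetic in this example can be written out explicitly. The sketch below uses the RWM organization given earlier (four registers per chip, 16 main-storage and four status characters per register, four bits per character); the names are hypothetical, and it assumes one digit per character with each number occupying whole registers. At four bits per character, 48 status characters come to 192 bits.

```python
import math
from dataclasses import dataclass

REGISTERS_PER_CHIP = 4
MAIN_CHARS_PER_REGISTER = 16
STATUS_CHARS_PER_REGISTER = 4
BITS_PER_CHAR = 4


@dataclass
class RwmPlan:
    chips: int             # RWM packages required
    spare_registers: int   # main-storage registers left unused
    status_chars: int      # status characters that come with those packages
    status_bits: int       # the same capacity expressed in bits


def plan_rwm(numbers: int, digits_per_number: int) -> RwmPlan:
    """Size the read/write memory for a given count of decimal numbers."""
    # Assumption: one digit per character, each number in whole registers.
    registers_per_number = math.ceil(digits_per_number / MAIN_CHARS_PER_REGISTER)
    registers = numbers * registers_per_number
    chips = math.ceil(registers / REGISTERS_PER_CHIP)
    status_chars = chips * REGISTERS_PER_CHIP * STATUS_CHARS_PER_REGISTER
    return RwmPlan(
        chips=chips,
        spare_registers=chips * REGISTERS_PER_CHIP - registers,
        status_chars=status_chars,
        status_bits=status_chars * BITS_PER_CHAR,
    )
```

Calling `plan_rwm(10, 16)` reproduces the three packages, two spare registers, and 48 status characters of the example.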

## Calculating ROM needs

In most applications using the MCS-4 chip family, normal programs are stored in the ROM. However, when the user desires to load programs at execution time, it is open to him to program the MCS-4 to operate in interpretive mode.

In this mode, the program in the ROM fetches "data" from RWM and interprets it as new instructions for the CPU, jumping as required to subroutines kept in the ROM. Interpretive mode programs may also run with pseudo-instructions kept in ROM; such programs often use ROM more efficiently than conventionally written programs. In either case, the designer planning to use interpretive mode must allow space for the programs when defining his memory requirements.
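The interpretive mode described above amounts to a fetch-and-dispatch loop. The sketch below is a generic illustration rather than MCS-4 code: the pseudo-instruction names (`LOAD`, `ADD`, `HALT`), the dictionary of routines standing in for subroutines kept in ROM, and the list standing in for RWM contents are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]


def op_load(state: State, operand: int) -> None:
    """Routine 'kept in ROM': load the four-bit accumulator."""
    state["acc"] = operand & 0xF


def op_add(state: State, operand: int) -> None:
    """Routine 'kept in ROM': add to the four-bit accumulator, discarding overflow."""
    state["acc"] = (state["acc"] + operand) & 0xF


ROM_ROUTINES: Dict[str, Callable[[State, int], None]] = {"LOAD": op_load, "ADD": op_add}


def run_interpreter(rwm: List[Tuple[str, int]],
                    routines: Dict[str, Callable[[State, int], None]],
                    state: State) -> State:
    """Fetch pseudo-instructions from read/write memory and dispatch to ROM routines."""
    pointer = 0
    while pointer < len(rwm):
        opcode, operand = rwm[pointer]   # "data" fetched from RWM
        pointer += 1
        if opcode == "HALT":
            break
        routines[opcode](state, operand)  # jump to the matching routine
    return state
```

The design point is that the fixed interpreter and its routines stay in ROM, while the pseudo-program in read/write memory can be replaced at execution time.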

Determining the amount of read-only storage is a good deal more difficult. It depends on the system complexity and sophistication. The ability to make an educated guess early in the design process largely depends on experience.

With the number of ROM and RWM chips established, another look at I/O requirements may be in order if the number of I/O lines provided by these packages is substantially different from that assumed in the first step of the design process. But if the I/O layout is firm, the number of lines can be increased if necessary through the use of SRs.

After development of the program that the computer will execute, the next step is to simulate the program before committing it to mask tooling for the production of ROMs. The suggested approach is to use programable ROMs, instead of the 4001 , at the prototype stage. When all the bugs are out of the system in the simulation, the truth tables for the program can provide the data for mass-produced mask-programed ROMs.

The MCS-4 is just a beginning. More complex CPUs are being designed [see "Bigger and better," p. 114], and will be common before long. Some of them will be faster, and some of them will be extremely flexible, thanks to the use of programable ROMs.

# THE MCS-4 - AN LSI MICRO COMPUTER SYSTEM

by F. FAGGIN, M. SHIMA*, M.E. HOFF, H. FEENEY, S. MAZOR<br>INTEL CORP. 3065 BOWERS AVE. SANTA CLARA, CALIFORNIA 95051<br>*BUSICOM CORPORATION, TOKYO, JAPAN

Reprinted from IEEE 1972 Region Six Conference; copyright 1972 by the Institute of Electrical and Electronics Engineers, Inc.

## ABSTRACT

The MCS-4 is a totally self-contained four-bit general purpose microprogrammable computer in component form. It consists of four basic elements: the CPU (central processing unit), the ROM (read only memory), the RAM (random access memory), and the SR (shift register). They are fabricated by MOS silicon gate technology and packaged in economical sixteen pin DIPs to minimize board area and reduce system cost.

Using combinations of these standard building blocks, any degree of customization may be built into these powerful microprogrammed four-bit computers. Using as few as two devices, a CPU and a ROM, a four-bit microprogrammed dedicated computer may be built for under \$50.

This paper describes this new micro-computer set, highlighting the system partitioning and the basic CPU hardware instruction set.

## INTRODUCTION

LSI technology provides new components for system design -- microcomputers. With the MCS-4, a four-bit LSI microcomputer set fabricated with the p-channel silicon gate MOS process, the power of a general purpose computer is available to every system designer as an alternative to conventional designs of random logic systems. The MCS-4 can provide the same control and computing functions of a minicomputer in as few as two sixteen pin DIPs and costs nearly two orders of magnitude less.

This set of components is not designed to compete with the minicomputer, but rather to extend the concept of the dedicated computer into new applications where the minimization of cost and size is very important, but where speed is not a major factor.

A major goal in the development of the MCS-4 was to devise a computer architecture allowing system partitioning into a minimum number of sixteen pin packages. The result is a set of standard LSI building blocks which are manufacturable in high volume and are flexible enough to be used in a variety of applications.

The partitioning resulted in four MOS microcomputer building blocks:

- CPU -- four-bit parallel processor
- ROM + I/O -- 256 words x eight bits / four I/O lines, metal mask programmable ROM
- RAM + output -- 80 four-bit characters / four output lines
- SR -- ten-bit shift register (serial in, serial out, parallel out)



Figure 1. MCS-4™ BASIC SYSTEM

Time multiplexing was used extensively to reduce the pin count and minimize circuit area. The MCS-4 is a totally self-contained system; no additional interface components are necessary.

The circuits operate with a single supply voltage of -15 V and two non-overlapping clock phases, $\phi_{1}$ and $\phi_{2}$.

The heart of each system is a single chip CPU which performs all the control and data processing functions. Directly interfacing with the CPU are ROMs which store microprograms and data tables and RAMs which store data and pseudo instructions. The MCS-4 communicates with peripheral devices through I/O "ports" provided on each ROM and RAM chip. In addition, ten-bit parallel shift registers can expand the I/O capability of the system.

Figure 1 shows the basic MCS-4 system. All address data and instruction communication is carried out on the four-bit nondedicated data bus; system synchronization and memory control are provided by the CPU.

The basic system timing for a typical instruction is shown in Figure 2. A total of eight clock periods is required for the addressing, fetching, and execution of a single word instruction. An instruction cycle is completed in $10.8 \mu \mathrm{~s}$ when operating at a clock frequency of 750 kHz. In a typical operating sequence, the CPU sends out a twelve-bit address in three successive four-bit bytes during the clock times $A_{1}, A_{2}$, and $A_{3}$.

This address is received by the ROMs, and an eight-bit instruction word is selected (one of 4096 instructions). The eight-bit instruction is sent back to the data bus in two four-bit bytes during the following two cycles, $M_{1}$ and $M_{2}$. During the final three clock cycles, $X_{1}$, $X_{2}$, and $X_{3}$, the instruction is interpreted and executed.

## THE CPU (SYSTEM CONTROL)

This single chip CPU is the fundamental component of the system. It provides the complete memory addressing, instruction interpretation, and control for the total system. The CPU contains the following functional blocks (refer to Figure 3):

Address Register and Address Incrementer
The address register is a dynamic RAM array of four twelve-bit words. One word is used to store the effective address (program counter) and the other three words are used as a stack for subroutine calls. Thus nesting up to three levels is possible. The program counter is incremented on each instruction cycle as the address is sent out from the CPU. With the twelve-bit address the CPU can directly address up to sixteen ROMs, each containing 256 eight-bit words.

(1) I/O instructions control the flow of information between the accumulator in the CPU and I/O. In this case the CPU will receive data from RAM storage locations or I/O input lines of 4001's.
(2) The SRC instruction designates the chip number and address for a following I/O instruction.

Figure 2. MCS-4™ BASIC INSTRUCTION CYCLE


Figure 3. 4004 CPU BLOCK DIAGRAM

Index Register
Sixteen four-bit general purpose registers are provided for use as scratch pad registers or memory pointers. These registers may be accessed individually or in eight-bit register pairs.
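The two access modes can be sketched with a small register-file model. The class and method names are hypothetical, and the choice that the even-numbered register of each pair holds the high-order four bits is an assumption; the text states only that the sixteen registers can be used singly or as eight-bit pairs.

```python
class IndexRegisters:
    """Sixteen four-bit registers, also addressable as eight eight-bit pairs."""

    def __init__(self) -> None:
        self.regs = [0] * 16

    def read(self, r: int) -> int:
        """Read one four-bit register (0..15)."""
        return self.regs[r]

    def write(self, r: int, value: int) -> None:
        """Write one four-bit register, keeping only the low four bits."""
        self.regs[r] = value & 0xF

    def read_pair(self, pair: int) -> int:
        """Read pair 0..7 as an eight-bit value (assumed: even register is high-order)."""
        return (self.regs[2 * pair] << 4) | self.regs[2 * pair + 1]

    def write_pair(self, pair: int, value: int) -> None:
        """Write an eight-bit value across the two registers of a pair."""
        self.regs[2 * pair] = (value >> 4) & 0xF
        self.regs[2 * pair + 1] = value & 0xF
```

Used as a memory pointer, a pair holds eight bits at once, while scratch-pad arithmetic can still address each four-bit half on its own.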

Four-bit Arithmetic Unit
This is a four-bit parallel adder with ripple through carry. All arithmetic instructions are executed using this functional block. Since this system operates on four-bit bytes of data, it can be used directly for computation in BCD. The result of an operation between two BCD numbers is computed in binary; using the decimal adjust accumulator (DAA) instruction, the result of the operation and the carry bit are converted back to BCD.
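The binary-add-then-adjust sequence can be illustrated as follows. This is a sketch of the general decimal-adjust technique under the assumption that the adjustment adds 6 whenever the four-bit sum exceeds 9 or produced a carry; the function names are hypothetical and the code is not a transcription of the 4004's internal logic.

```python
from typing import List, Tuple


def add4(a: int, b: int, carry_in: int = 0) -> Tuple[int, int]:
    """Four-bit binary addition: returns (four-bit sum, carry out)."""
    total = a + b + carry_in
    return total & 0xF, total >> 4


def daa(acc: int, carry: int) -> Tuple[int, int]:
    """Decimal adjust: turn a binary sum of two BCD digits back into BCD."""
    if carry or acc > 9:
        acc += 6
        if acc > 0xF:
            carry = 1
        acc &= 0xF
    return acc, carry


def bcd_add_digit(a: int, b: int, carry_in: int = 0) -> Tuple[int, int]:
    """Add two BCD digits: binary add, then decimal adjust."""
    return daa(*add4(a, b, carry_in))


def bcd_add(x: List[int], y: List[int]) -> Tuple[List[int], int]:
    """Add two equal-length BCD numbers given least-significant digit first."""
    carry, out = 0, []
    for a, b in zip(x, y):
        digit, carry = bcd_add_digit(a, b, carry)
        out.append(digit)
    return out, carry
```

For instance, adding the digits 7 and 8 gives binary 15, which the adjustment turns into digit 5 with a carry, and digit-by-digit addition of 258 and 167 yields 425.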

Instruction Register and Decoder
Eight-bit instructions are received, stored, and decoded to generate control signals for all other functional blocks.

In addition to the basic functional blocks, internal timing, ROM and RAM enable control, and data bus I/O control are included in the peripheral circuitry.

The instruction repertoire is permanently stored in an associative memory in the CPU (the instruction decoder). It consists of three basic groups of instructions:

Sixteen machine instructions (basic control of the address and index registers)

Fifteen I/O and RAM instructions (communication with peripheral devices through the ROM and RAM I/O ports and character storage in RAM memory)

Fourteen accumulator group instructions (basic accumulator processing instructions)

The complete instruction set including the mnemonic, binary code, and description is presented in Figure 4.

Figure 4. MCS-4™ INSTRUCTION SET
[Those instructions preceded by an asterisk (*) are 2-word instructions that occupy 2 successive locations in ROM.]

MACHINE INSTRUCTIONS

| MNEMONIC | OPR<br>$D_3 D_2 D_1 D_0$ | OPA<br>$D_3 D_2 D_1 D_0$ | DESCRIPTION OF OPERATION |
| :---: | :---: | :---: | :--- |
| NOP | 0 0 0 0 | 0 0 0 0 | No operation. |
| \*JCN | 0 0 0 1<br>$A_2 A_2 A_2 A_2$ | $C_1 C_2 C_3 C_4$<br>$A_1 A_1 A_1 A_1$ | Jump to ROM address $A_2 A_2 A_2 A_2$, $A_1 A_1 A_1 A_1$ (within the same ROM that contains this JCN instruction) if condition $C_1 C_2 C_3 C_4$ $^{(1)}$ is true, otherwise skip (go to the next instruction in sequence). |
| \*FIM | 0 0 1 0<br>$D_2 D_2 D_2 D_2$ | R R R 0<br>$D_1 D_1 D_1 D_1$ | Fetch immediate (direct) from ROM data $D_2$, $D_1$ to index register pair location RRR. $^{(2)}$ |
| SRC | 0 0 1 0 | R R R 1 | Send register control. Send the address (contents of index register pair RRR) to ROM and RAM at $X_2$ and $X_3$ time in the Instruction Cycle. |
| FIN | 0 0 1 1 | R R R 0 | Fetch indirect from ROM. Send contents of index register pair location 0 out as an address. Data fetched is placed into register pair location RRR. |
| JIN | 0 0 1 1 | R R R 1 | Jump indirect. Send contents of register pair RRR out as an address at $A_1$ and $A_2$ time in the Instruction Cycle. |
| \*JUN | 0 1 0 0<br>$A_2 A_2 A_2 A_2$ | $A_3 A_3 A_3 A_3$<br>$A_1 A_1 A_1 A_1$ | Jump unconditional to ROM address $A_3$, $A_2$, $A_1$. |
| \*JMS | 0 1 0 1<br>$A_2 A_2 A_2 A_2$ | $A_3 A_3 A_3 A_3$<br>$A_1 A_1 A_1 A_1$ | Jump to subroutine ROM address $A_3$, $A_2$, $A_1$, save old address. (Up 1 level in stack.) |
| INC | 0 1 1 0 | R R R R | Increment contents of register RRRR. $^{(3)}$ |
| \*ISZ | 0 1 1 1<br>$A_2 A_2 A_2 A_2$ | R R R R<br>$A_1 A_1 A_1 A_1$ | Increment contents of register RRRR. Go to ROM address $A_2$, $A_1$ (within the same ROM that contains this ISZ instruction) if result $\neq 0$, otherwise skip (go to the next instruction in sequence). |
| ADD | 1 0 0 0 | R R R R | Add contents of register RRRR to accumulator with carry. |
| SUB | 1 0 0 1 | R R R R | Subtract contents of register RRRR from accumulator with borrow. |
| LD | 1 0 1 0 | R R R R | Load contents of register RRRR to accumulator. |
| XCH | 1 0 1 1 | R R R R | Exchange contents of index register RRRR and accumulator. |
| BBL | 1 1 0 0 | D D D D | Branch back (down 1 level in stack) and load data DDDD to accumulator. |
| LDM | 1 1 0 1 | D D D D | Load data DDDD to accumulator. |

INPUT/OUTPUT AND RAM INSTRUCTIONS
[The RAM's and ROM's operated on in the I/O and RAM instructions are those previously selected by an SRC instruction.]

| MNEMONIC | OPR<br>$D_3 D_2 D_1 D_0$ | OPA<br>$D_3 D_2 D_1 D_0$ | DESCRIPTION OF OPERATION |
| :---: | :---: | :---: | :--- |
| WRM | 1 1 1 0 | 0 0 0 0 | Write the contents of the accumulator into the previously selected RAM main memory character. |
| WMP | 1 1 1 0 | 0 0 0 1 | Write the contents of the accumulator into the previously selected RAM output port. (Output Lines) |
| WRR | 1 1 1 0 | 0 0 1 0 | Write the contents of the accumulator into the previously selected ROM output port. (I/O Lines) |
| WR0 $^{(4)}$ | 1 1 1 0 | 0 1 0 0 | Write the contents of the accumulator into the previously selected RAM status character 0. |
| WR1 $^{(4)}$ | 1 1 1 0 | 0 1 0 1 | Write the contents of the accumulator into the previously selected RAM status character 1. |
| WR2 $^{(4)}$ | 1 1 1 0 | 0 1 1 0 | Write the contents of the accumulator into the previously selected RAM status character 2. |
| WR3 $^{(4)}$ | 1 1 1 0 | 0 1 1 1 | Write the contents of the accumulator into the previously selected RAM status character 3. |
| SBM | 1 1 1 0 | 1 0 0 0 | Subtract the previously selected RAM main memory character from accumulator with borrow. |
| RDM | 1 1 1 0 | 1 0 0 1 | Read the previously selected RAM main memory character into the accumulator. |
| RDR | 1 1 1 0 | 1 0 1 0 | Read the contents of the previously selected ROM input port into the accumulator. (I/O Lines) |
| ADM | 1 1 1 0 | 1 0 1 1 | Add the previously selected RAM main memory character to accumulator with carry. |
| RD0 $^{(4)}$ | 1 1 1 0 | 1 1 0 0 | Read the previously selected RAM status character 0 into accumulator. |
| RD1 $^{(4)}$ | 1 1 1 0 | 1 1 0 1 | Read the previously selected RAM status character 1 into accumulator. |
| RD2 $^{(4)}$ | 1 1 1 0 | 1 1 1 0 | Read the previously selected RAM status character 2 into accumulator. |
| RD3 $^{(4)}$ | 1 1 1 0 | 1 1 1 1 | Read the previously selected RAM status character 3 into accumulator. |

ACCUMULATOR GROUP INSTRUCTIONS

| MNEMONIC | OPR<br>$D_3 D_2 D_1 D_0$ | OPA<br>$D_3 D_2 D_1 D_0$ | DESCRIPTION OF OPERATION |
| :---: | :---: | :---: | :--- |
| CLB | 1 1 1 1 | 0 0 0 0 | Clear both. (Accumulator and carry) |
| CLC | 1 1 1 1 | 0 0 0 1 | Clear carry. |
| IAC | 1 1 1 1 | 0 0 1 0 | Increment accumulator. |
| CMC | 1 1 1 1 | 0 0 1 1 | Complement carry. |
| CMA | 1 1 1 1 | 0 1 0 0 | Complement accumulator. |
| RAL | 1 1 1 1 | 0 1 0 1 | Rotate left. (Accumulator and carry) |
| RAR | 1 1 1 1 | 0 1 1 0 | Rotate right. (Accumulator and carry) |
| TCC | 1 1 1 1 | 0 1 1 1 | Transmit carry to accumulator and clear carry. |
| DAC | 1 1 1 1 | 1 0 0 0 | Decrement accumulator. |
| TCS | 1 1 1 1 | 1 0 0 1 | Transfer carry subtract and clear carry. |
| STC | 1 1 1 1 | 1 0 1 0 | Set carry. |
| DAA | 1 1 1 1 | 1 0 1 1 | Decimal adjust accumulator. |
| KBP | 1 1 1 1 | 1 1 0 0 | Keyboard process. Converts the contents of the accumulator from a one out of four code to binary code. |
| DCL | 1 1 1 1 | 1 1 0 1 | Designate command line. |

NOTES:

$^{(1)}$ The condition code is assigned as follows:

| Bit | Meaning |
| :---: | :--- |
| $C_1 = 1$ | Invert jump condition |
| $C_1 = 0$ | Not invert jump condition |
| $C_2 = 1$ | Jump if accumulator is zero |
| $C_3 = 1$ | Jump if carry/link is a 1 |
| $C_4 = 1$ | Jump if test signal is a 0 |

$^{(2)}$ RRR is the address of 1 of 8 index register pairs in the CPU.

$^{(3)}$ RRRR is the address of 1 of 16 index registers in the CPU.

$^{(4)}$ Each RAM chip has 4 registers, each with twenty 4-bit characters subdivided into 16 main memory characters and 4 status characters. Chip number, RAM register, and main memory character are addressed by an SRC instruction. For the selected chip and register, however, status character locations are selected by the instruction code (OPA).
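The condition code in note (1) can be made concrete with a short Python sketch. The function name `jcn_taken` and its arguments are hypothetical. The sketch assumes that $C_1$ is the most significant bit of the OPA field (as the column order in Figure 4 suggests) and that, when more than one of $C_2$, $C_3$, $C_4$ is set, the individual conditions are combined with a logical OR; the notes do not state how multiple bits combine.

```python
def jcn_taken(condition, accumulator, carry, test):
    """Decide whether a JCN jump is taken.

    condition   -- four-bit OPA field C1 C2 C3 C4 (C1 assumed most significant)
    accumulator -- four-bit accumulator value
    carry       -- carry/link bit, 0 or 1
    test        -- level of the test signal, 0 or 1
    """
    c1 = (condition >> 3) & 1  # invert jump condition
    c2 = (condition >> 2) & 1  # jump if accumulator is zero
    c3 = (condition >> 1) & 1  # jump if carry/link is a 1
    c4 = condition & 1         # jump if test signal is a 0

    # Assumption: selected conditions are OR'ed together.
    met = bool((c2 and accumulator == 0)
               or (c3 and carry == 1)
               or (c4 and test == 0))
    return met != bool(c1)     # C1 = 1 inverts the result
```

Under these assumptions, condition `0100` jumps when the accumulator is zero, and `1100` jumps when it is not.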

## THE ROM (CONTROL MEMORY AND I/O)

This device performs two distinct and independent functions in the system: storage of the instruction sequence, and input/output for communication with peripheral devices. The four input/output lines are provided on the ROM rather than the CPU to reduce the individual package pin count. Inputs or outputs may be custom selected to individual system requirements at the same time that the metal mask ROM program is prepared. As more ROMs are added to a system, the number of I/O ports is also increased.

The ROM memory array is organized as 256 eight-bit words. When a particular ROM word is addressed by the CPU, the address is stored in a register included in the ROM and the eight-bit ROM word is multiplexed into two four-bit bytes and sent to the system data bus. Since the CPU can directly address up to sixteen ROMs, a binary code for each ROM must also be programmed in the metal mask.
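A rough Python model of one ROM chip shows how the mask-programmed chip code and the byte multiplexing fit together. The class name `Rom` and its `fetch` method are illustrative only. The sketch assumes that the upper four bits of the twelve-bit address carry the ROM code and that the high-order four bits of the word are sent first; neither detail is spelled out in the paragraph above.

```python
class Rom:
    """One 256 x 8 ROM chip with a mask-programmed chip code (illustrative)."""

    def __init__(self, chip_number, words):
        if not 0 <= chip_number < 16:
            raise ValueError("chip number must fit in four bits")
        if len(words) != 256:
            raise ValueError("a ROM holds exactly 256 words")
        self.chip_number = chip_number
        self.words = list(words)

    def fetch(self, address):
        """Return the word as two four-bit bytes, or None if not selected."""
        if (address >> 8) & 0xF != self.chip_number:
            return None  # another ROM answers this address
        word = self.words[address & 0xFF] & 0xFF
        # Assumed order: high-order four bits first, then low-order four bits.
        return (word >> 4) & 0xF, word & 0xF
```

With sixteen such objects numbered 0 to 15, exactly one of them responds to any twelve-bit address.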

The ability to program both the memory and the I/O ports gives the system a custom "personality" along with the economic advantage of using high-volume standard LSI devices.

## THE RAM (DATA STORAGE AND OUTPUTS)

The RAM also provides two distinct and independent functions: storage of data and pseudo-instructions, and output communication with external peripheral devices.

The RAM is organized as four registers, each containing sixteen four-bit main memory characters and four four-bit status characters. If the system is operating with decimal arithmetic (all decimal numbers represented in BCD), each register in the RAM can store a complete sixteen-digit decimal number. The status characters provide additional storage for the sign, decimal point position, exponent of the number, or other control information. RAM and I/O line addressing is accomplished in a rather unique way. The five CPU command lines (CM-ROM, CM-RAM$_i$) control the way in which ROM and RAM chips interpret the data on the data bus.

RAMs can be arranged in four banks of four chips each, each bank controlled by a separate CM-RAM$_i$ line (refer to Figure 5).
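The storage figures implied by this organization can be checked with a few lines of arithmetic. The constant names below are chosen for the sketch; the numbers themselves come from the description above.

```python
REGISTERS_PER_CHIP = 4
MAIN_CHARACTERS = 16    # per register
STATUS_CHARACTERS = 4   # per register
BITS_PER_CHARACTER = 4
CHIPS_PER_BANK = 4
BANKS = 4

characters_per_chip = REGISTERS_PER_CHIP * (MAIN_CHARACTERS + STATUS_CHARACTERS)
bits_per_chip = characters_per_chip * BITS_PER_CHARACTER
total_chips = CHIPS_PER_BANK * BANKS
total_characters = total_chips * characters_per_chip

print(characters_per_chip)  # characters in one RAM chip
print(bits_per_chip)        # bits in one RAM chip
print(total_characters)     # characters in a fully populated system
```

Each chip therefore holds 80 four-bit characters (320 bits), and sixteen chips hold 1280 characters in total.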


To operate on an arbitrary location in RAM, the following instruction sequence is required:

1. The command line must be designated (DCL instruction)
2. The chip, register, and character must be selected (SRC instruction)
3. The operation is performed on the selected character (RDM, WRM, ADM, etc. instructions). If an I/O instruction is executed (WMP), the content of the CPU accumulator is latched on the output port.

Operations on the I/O port of the ROM chips are accomplished in a similar manner.
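The three-step sequence for RAM access can be sketched as a small Python model. Everything named here (`RamSystem` and its methods) is hypothetical. Two simplifications are assumed and should not be read as the actual encoding: the bank is passed to `dcl` as a plain index 0 to 3, and the eight-bit address passed to `src` is split as two bits of chip number, two bits of register number, and four bits of character number.

```python
class RamSystem:
    """Four banks of four RAM chips, each with four 16-character registers."""

    def __init__(self):
        # main[bank][chip][register][character], four-bit values
        self.main = [[[[0] * 16 for _ in range(4)]
                      for _ in range(4)] for _ in range(4)]
        self.ports = [[0] * 4 for _ in range(4)]  # one output port per chip
        self.bank = self.chip = self.register = self.character = 0

    def dcl(self, bank):
        """Step 1: designate the command line (bank index is an assumption)."""
        self.bank = bank & 0x3

    def src(self, address):
        """Step 2: select chip, register, and character (assumed bit layout)."""
        self.chip = (address >> 6) & 0x3
        self.register = (address >> 4) & 0x3
        self.character = address & 0xF

    def wrm(self, accumulator):
        """Step 3: write the accumulator into the selected character."""
        self.main[self.bank][self.chip][self.register][self.character] = (
            accumulator & 0xF)

    def rdm(self):
        """Step 3: read the selected character."""
        return self.main[self.bank][self.chip][self.register][self.character]

    def wmp(self, accumulator):
        """Step 3: latch the accumulator on the selected chip's output port."""
        self.ports[self.bank][self.chip] = accumulator & 0xF
```

A write followed by a read after the same `dcl` and `src` calls returns the stored character; changing the `src` address selects a different character without touching the first.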

## THE SR (I/O EXPANDER)

To increase the number of output lines for peripheral communication, the SR chip (MSI function) was added to the set.

This is a ten-bit serial-in, parallel-out, serial-out, static shift register that directly interfaces with the I/O ports provided on ROM and RAM chips.
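A ten-bit serial-in, parallel-out, serial-out register is easy to model. The Python sketch below is an illustration only; the class name `ShiftRegister10` and the choice to store the newest bit at index 0 are assumptions of the sketch, not properties of the device.

```python
class ShiftRegister10:
    """Ten-bit serial-in, parallel-out, serial-out shift register (model)."""

    def __init__(self):
        self.bits = [0] * 10  # parallel outputs; index 0 holds the newest bit

    def clock(self, serial_in):
        """Shift one position; return the bit that leaves the far end."""
        serial_out = self.bits[-1]
        self.bits = [serial_in & 1] + self.bits[:-1]
        return serial_out


def shift_through(registers, serial_in):
    """Clock a chain of registers, feeding each serial-out to the next."""
    bit = serial_in
    for register in registers:
        bit = register.clock(bit)
    return bit
```

Because each register also has a serial output, several of them can be chained so that one port line feeds twenty, thirty, or more output lines, which is the sense in which the SR acts as an I/O expander.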

## CONCLUSION

A system using this set of devices will usually consist of one CPU and from one to sixteen ROMs, up to sixteen RAMs and an arbitrary number of SRs. A minimum system could be designed with just one CPU and one ROM.

Using microcomputers, changes in system function can be easily implemented by changing the ROM program rather than by costly alteration of random logic hardware. The MCS-4 offers tremendous flexibility of design and allows the user to have many of the desirable features of a custom MOS LSI design, small package count, a set of components uniquely his own (each user's programs are his proprietary property), and yet none of the disadvantages of a long development cycle and high development cost associated with custom LSI design. The short design cycle and flexibility associated with ROM programming allows much more rapid response to market demands than is possible with custom LSI, providing insurance against obsolescence.

The important features of the MCS-4 family are summarized in the following table:

- Four-bit parallel CPU with 45 instructions
- Instruction set includes conditional branching, jump to subroutine, and indirect fetching
- Nesting of subroutines up to three levels
- Sixteen four-bit general purpose registers
- Decimal and binary arithmetic modes
- Synchronous operation with memories
- Direct compatibility with ROM, RAM, and SR
- Directly drives up to:
  - 4096 eight-bit words of ROM (sixteen chips)
  - 1280 four-bit RAM characters (sixteen chips)
  - 128 I/O lines (without SR)
  - Unlimited I/O (with SR)
- Memory capacity expandable through bank switching
- Two-phase dynamic operation
- Single power supply ($\mathrm{V}_{\mathrm{DD}}=-15$ volts)
- $10.8\ \mu\mathrm{s}$ instruction cycle
- Addition of two eight-digit numbers in $850\ \mu\mathrm{s}$
- P-channel silicon gate MOS
- Sixteen-pin DIP package
- Minimum system: one CPU and one ROM

To add even more flexibility and further accelerate the design cycle, the CPU and RAMs may be interfaced with conventional electrically programmable and erasable ROMs. This will allow fast program development and provide a viable approach to building few-of-a-kind systems.

The microcomputer is already playing an important role in today's system designs.

## BIBLIOGRAPHY

Faggin, F., Klein, T., and Vadasz, L., "Insulated Gate Field Effect Transistor Integrated Circuits with Silicon Gates," presented at the IEEE International Electron Device Meeting, October 1968.

IEEE Transactions on Computers, "Special Issue on Microprogramming," Vol. C-20, No. 7, July 1971.

Noyce, R. N., "A Look at Future Costs of Large Integrated Arrays," AFIPS, Vol. 29, 1966 FJCC, Spartan Books, pp 111-114.

Roberts, William, "Microprogramming Concepts and Advantages as Applied to Small Digital Computers," Computer Design, November 1969, pp 147-150.

Vadasz, L. L., Grove, A. S., Rowe, T. A., and Moore, G. E., "Silicon Gate Technology," IEEE Spectrum, October 1969, pp 27-35.

# CONSIDERATIONS FOR THE USE OF MICRO COMPUTERS IN SMALL SYSTEM DESIGN 

by M.E. HOFF - MANAGER, APPLICATIONS RESEARCH<br>INTEL CORP. 3065 BOWERS AVE. SANTA CLARA, CALIFORNIA 95051

## SUMMARY

Microprocessors consisting of a very few LSI circuits are a new tool for the electronics designer. While they require the designer to learn to write programs in addition to performing electronic hardware design, the advantages offered by small systems using these microprocessors often make the effort worthwhile. An actual example of such a design, a unit for encoding data into field-programmable ROMs, is discussed.

Integrated circuit technology has made dramatic advances since the development of the first monolithic circuits in the early 1960's. This technology has proven particularly appropriate for manufacturing circuit elements for digital systems. Initially single gates were implemented on each silicon die, but the complexity of these circuits rapidly grew as more complicated functions such as flip-flops, multi-bit counters and shift registers, and ultimately memory arrays were developed. MOS processes have allowed even higher component densities than the early bipolar processes and now allow sufficient function to permit the central processor of a simple computer to be produced on a single chip of silicon. With this technology, a complete digital computer can be implemented with very few circuit packages.

Already the minicomputer has, because of its small size and low cost, produced a revolution in systems design. As the new single-chip computers cost an order of magnitude less than the typical minicomputer, an even greater revolution in the use of digital processing is likely.

The availability of such extremely low cost computers makes possible a number of new approaches to the design of such small systems as test instruments, computer terminals and controllers for computer peripherals. The use of these "microcomputers" rather than more conventional logic in such designs offers a number of advantages. Typically, most of the "personality" of the system using a microcomputer is determined by the program for the microcomputer rather than by wiring. This program is typically stored in read-only memory. As a result the characteristics of the system can be radically changed by changing the content of the read-only memory. This feature allows the designer to make radical changes in equipment characteristics far more easily than when conventional design techniques are used.

Nevertheless, there are certain disadvantages associated with the use of microcomputers for a system design. Perhaps the most important disadvantage is the necessity for the designer to be capable of efficiently programming the microcomputers as well as considering their electrical requirements. This programming must usually be done in a more difficult microcomputer machine language or assembly language rather than the higher level languages offered with large computers. Furthermore, the debugging of these programs in prototype equipment can be more difficult than debugging conventional hardware. To offset these difficulties, a number of tools may be used. For example, assembler and simulator programs which operate on larger, more powerful computers may be used to simplify the program development and debugging stages. The assembler allows the programmer to write in a symbolic assembly language rather than the direct machine language of the microprocessor. The simulator allows the user to observe the operation of a program as it is executed by simulating this execution on the larger computer. With these techniques, the development and debugging of microcomputer programs is greatly simplified, and need be no more difficult than a conventional hardware design.

To better illustrate the uses for these microprocessors, consider the following design problem for which a microprocessor provided an ideal solution:

Intel Corporation is a manufacturer of semiconductor memory products; included in the product line are a number of field-programmable read-only memory (ROM) devices. One family of these devices includes the 1701, a 2048-bit silicon-gate MOS ROM which can be both programmed and erased in the field. A second programmable ROM product is the 3601, a 1024-bit bipolar ROM which utilizes a blown-silicon fuse technology for its implementation. To support these field-programmable read-only memory products at factory sales offices and distributors throughout the country, it is necessary for Intel to provide a ROM-programmer unit. The basic mode of operation for the ROM-programmer should be to program new read-only memories from a paper tape or a previously programmed device. This ROM programmer should not only be capable of handling existing product categories, but should also be capable of handling future read-only memory developments.

Reprinted from 1972 Wescon Technical Papers, Session 26; copyright 1972, Western Electronics Show and Convention.

Because new and quite different programmable read-only memory products might be developed in the future, the use of a microcomputer to build the ROM programmer seemed ideal. As a result, the Intel MCS-4 set was chosen as the basis for the design.

The MCS-4 is a family of four integrated circuits which may be used to build a variety of microprocessor configurations. The four members of the family include the 4004 central processor, the 4001 mask-programmed read-only memory, the 4002 read-write memory and the 4003 10-bit shift register. The 4003 is used primarily as an output expander. The central processor implements some forty-five instructions, each of which is executed in either 10.8 or 21.6 microseconds. Instructions typically manipulate four or eight bits of data. The microprocessor family communicates with outside devices by means of 4-line input-output "ports" associated with each read-only memory and read-write memory. Read-write memory ports are output-only but read-only memory ports may be chosen for input or output at the time that the read-only memory program mask is taped.
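Given the two execution times quoted above, the running time of a straight-line instruction sequence can be estimated with a short calculation. The Python sketch below is illustrative; the function name and the split into "single-cycle" and "double-cycle" counts are assumptions of the sketch, and which instructions fall into each class must be taken from the instruction set documentation.

```python
CYCLE_US = 10.8  # instruction cycle time in microseconds, from the text


def execution_time_us(single_cycle, double_cycle):
    """Estimated time for a sequence of 10.8 us and 21.6 us instructions."""
    cycles = single_cycle + 2 * double_cycle
    return round(CYCLE_US * cycles, 1)


# Upper bound on instruction rate if every instruction took one cycle.
max_instructions_per_second = int(1_000_000 / CYCLE_US)
```

For instance, a routine of ten 10.8-microsecond instructions and five 21.6-microsecond instructions would take about 216 microseconds, and the processor cannot exceed roughly 92,500 instructions per second.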

To make the hardware checkout phase of an MCS-4 system more convenient, it is customary to interface 1701 programmable read-only memory to the MCS-4 central processor as a substitute for the 4001 mask-programmed read-only memory. Using this interface, programs can be quickly encoded into the ROM, checked out and corrected without the tedious and expensive procedure of having the programs committed to masks. While it is customary for production systems to use the 4001 mask-programmed ROMs because of their economy, the decision was made to continue to use the 1701 type of read-only memory for the production ROM programmer units to make future equipment revisions more convenient.

Figure 1 shows a photograph of the ROM programmer, which was designated the 7600C. The mechanical construction of the unit is quite conventional. The main chassis is divided into two parts with the power supplies and paper tape reader located on the left side and a printed-circuit card cage located on the right. The card cage holds some five printed circuit boards with provision made for an additional interface board. Five basic boards are used in this system: a power supply board which contains the power supply regulators and over-voltage protection for the main system; an interface board which shifts voltage levels as necessary for ROMs being programmed; and three boards which make up the computer system itself. Of these three boards, one contains the central processor, memory and clock generator circuits. The second contains the read-only memory array for the computer program and the third provides data bus buffering and circuitry to simulate the ports associated with the 4001's (that are being replaced by 1701's). In addition to those boards in the card cage, there is also a display panel board which includes an adapter for the device test sockets, as well as an address display, two data displays and an eight-bit status display.

The front console of the unit includes a power switch and a tape winder switch, both of which are independent of the computer system, an eight position mode selector switch, and six push-button momentary contact switches. Although the mode switch and the push-button switches are labelled, these switches are merely inputs read by the processor so their functions are determined exclusively by the program in the processor's read-only memory. Similarly, while the address, data and status displays are labelled, these displays are determined exclusively by the program. The control of the tape reader and the interpretation of the data read from the tape is also determined by the microprocessor program. Initial versions of the microprocessor program accept the tapes only in Intel's standard 1601 format although a number of alternate formats have been proposed. Two mode switch positions have been reserved on the front panel for loading tape. These are presently labelled "load TELEX" and "load TWX" for five-wide and eight-wide tapes respectively. However, if the two widths are not required, one of these positions might be used to handle an alternate tape format or the microprocessor might be programmed to recognize one or more special characters which would signal the use of an alternate format for the tape data.

Figure 2 shows a block diagram of the microcomputer system and its interface to the device under test, system displays, and the panel controls. The 4004 central processor is wired directly to an array of sixteen 4002 read-write memory units. In addition, a buffer circuit is provided so that the 4004 can fetch its program from the array of 1701 programmable read-only memories. Output signals from the 4004 are delivered at the output ports of the 4002 read-write memories. The buffer circuit logic also includes provisions for a number of input ports. These input ports are used to read the status of the front-panel switches and to read signals from the device under test. The interface for signals to the device under test and the display interface both use the 4003 output expander. The multiplexing inherent in the use of the 4003 reduces the number of wires which must be routed within the chassis. For example, although the display includes 32 individual light emitting diodes and four digits of seven-segment display, only twelve signal leads run from the microprocessor to the display. Not all timing signals for the device under test are generated by 4002 memory ports, because the 4004 executes instructions no faster than one per 10.8 microseconds. Where faster timing signals must be generated for the device under test, they are generated on the interface board for the device. As was described above, this interface board also includes level shifters and any special regulators required to power the device under test. It should be noted that while the interface board does implement some of the critical timing functions for testing the device, most of the timing functions are implemented directly under computer control. For example, when programming 1701's, the control of the power supply and program pulses is done by the microprocessor.

At the time of this writing, the program implements eight modes of operation. Two of the modes are used to load teletype tapes into the (4002) buffer memory; one mode for five track TELEX tape and the other mode for eight track tapes in ASCII format. A third mode permits the loading of the buffer memory from a device in the test socket. (This operation is much more rapid than that of loading paper tape). A fourth mode programs the test device and the fifth mode is used to read and check the test device. The sixth and seventh modes are intended for use with an external teletype. They allow listing the contents of the memory or the contents of the test device. If the teletype used has a paper tape punch, either of these modes can produce a tape which is readable in TWX mode. The last mode provides for manually altering the contents of the buffer memory and/or manually programming the device plugged into the test socket.

The six momentary contact push-buttons are used to control operation within a mode. Three of them are active in all modes of operation. They are the reset, stop/increment, and start/continue buttons. Three other push-buttons labelled "1", "0", and "write" are used primarily in manual mode. Pressing the "0" or "1" button shifts the corresponding datum into a data input buffer, while the write push-button causes this data to be programmed into the test device and to be deposited into the internal buffer memory.
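The manual-mode data entry described above amounts to a shift-in buffer followed by a write. The Python sketch below is a loose illustration and not the 7600C program: the class name `ManualEntry`, the eight-bit buffer width, the left-shift direction, and the use of Python lists for the buffer memory and the test device are all assumptions.

```python
class ManualEntry:
    """Model of the "1", "0", and "write" push-buttons in manual mode."""

    def __init__(self, width=8):
        self.width = width  # assumed word width of the device being programmed
        self.buffer = 0     # data input buffer

    def press_bit(self, bit):
        """Shift a 0 or 1 into the data input buffer (assumed left shift)."""
        mask = (1 << self.width) - 1
        self.buffer = ((self.buffer << 1) | (bit & 1)) & mask

    def press_write(self, memory, device, address):
        """Program the test device and deposit the data in buffer memory."""
        device[address] = self.buffer
        memory[address] = self.buffer
```

Keying in the bits 1, 0, 1, 1, 0, 0, 1, 0 and then pressing write would, in this model, store the same eight-bit pattern in both the device and the buffer memory at the current address.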

The program which implements these eight modes of operation, with full push-button control, for the two programmable ROMs occupies eight 1701 ROMs, or about 2,000 bytes of read-only memory. The read-only memory card for the microprocessor has provision for up to sixteen 1701's; i.e., at this stage of the program approximately one-half of the capacity of the system is used.
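The capacity figures follow directly from the 2048-bit size of the 1701 given earlier. A quick check in Python, with variable names chosen for the sketch and an eight-bit byte assumed:

```python
BITS_PER_1701 = 2048      # from the description of the 1701
BITS_PER_BYTE = 8         # assumed eight-bit words
ROMS_USED = 8
ROM_SOCKETS = 16

bytes_per_rom = BITS_PER_1701 // BITS_PER_BYTE
program_bytes = ROMS_USED * bytes_per_rom
card_capacity_bytes = ROM_SOCKETS * bytes_per_rom
fraction_used = program_bytes / card_capacity_bytes

print(bytes_per_rom, program_bytes, card_capacity_bytes, fraction_used)
```

Eight ROMs of 256 bytes each give 2048 bytes, which is the "about 2,000 bytes" quoted, and exactly half of the 4096 bytes the card can hold.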

The use of a microprocessor has had some interesting effects on the 7600C system. In a few cases, a change which would have meant moving only one or two wires in conventional hardware has required several hours for reassembly. (Assembly of programs is done on an in-house PDP-10 timesharing system at Intel and requires approximately 2 to 3 minutes of computer time. Editing changes required for minor alterations usually take but a few minutes to perform. However, once the assembly is complete, tapes for reprogramming the 1701's must be generated and these programmable ROMs must have the new data encoded into them.) However, many changes which would have represented several days of rewiring and extensive alterations to the printed circuit layout have taken only the same few hours to implement. For example, an early revision of the manual mode of operation did not display the original contents of the buffer memory. Adding this display capability modified only two locations of one of the 1701 ROMs. Total turnaround time for this change was perhaps one hour. Had it been necessary to perform this change in hardware, it would have required extensive wiring changes within the chassis and the alteration of one or more of the printed circuit boards. Another feature that was added by a very minor change in the program was the ability to set the memory to all 1's or to all 0's. This was implemented as follows:

With the ROM programmer in the load DUT mode, pressing the start button still loads the contents of the test device into the buffer memory. However, if the 1 or the 0 button is pressed, the entire buffer memory will be loaded to that value.
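The button handling just described can be sketched as a single dispatch function. This is a hypothetical Python illustration, not the actual 7600C program: the function name, the string button labels, the list representation of the buffer memory and test device, and the eight-bit word width are all assumptions.

```python
ALL_ONES = 0xFF   # assumed eight-bit words
ALL_ZEROS = 0x00


def load_dut_mode(button, buffer, device):
    """Handle one button press while the mode switch is at "load DUT"."""
    if button == "start":
        buffer[:] = device                      # copy the test device
    elif button == "1":
        buffer[:] = [ALL_ONES] * len(buffer)    # fill memory with 1's
    elif button == "0":
        buffer[:] = [ALL_ZEROS] * len(buffer)   # fill memory with 0's
    # Any other button is ignored in this sketch.
```

The point of the example is that the added feature is two extra branches in the program; no wiring or front-panel change is involved.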

The turnaround times mentioned for these changes refer to the time elapsed from the "design" of the change to its first implementation. Once the change is debugged, any field model of the ROM programmer could be changed merely by replacing one or more of the eight 1701 ROMs on the ROM printed circuit board.

While this model of the ROM programmer uses the 1701 for all of its program memory, other systems might use combinations of 4001 mask-programmed ROM and 1701 field-programmable ROM for greater economy. One technique is to utilize the subroutine capability of the MCS-4 by "hardening" many of the subroutines into conventional ROM while leaving the main-line program in programmable ROM.

## CONCLUSIONS

The use of a microprocessor in a piece of test equipment for use with field-programmable ROMs simplified the internal design of the unit and greatly increased the flexibility and capability for future modifications. The turnaround time for changes in the prototype is usually on the order of a few hours, relatively independent of the complexity of the changes designed. However, when such a change is made, units in the field may be altered by merely replacing the ROMs containing the program for the microcomputer. Extensive use has been made of the subroutine capability of the MCS-4 microcomputer set. As a result, most changes usually involve only one or two ROM chips. The use of these products does require new design approaches, but the advantages far outweigh the disadvantages associated with learning these new techniques.


Fig. 1. Photograph of ROM Programmer Using MCS-4


Fig. 2. Block Diagram of ROM Programmer using MCS-4

# MICRO COMPUTER APPLICATIONS OF ELECTRICALLY ALTERABLE ROMs 

by H. FEENEY<br>INTEL CORP. 3065 BOWERS AVE. SANTA CLARA, CALIFORNIA 95051

## SUMMARY

Micro computer systems (combination of LSI micro processors, program storage, ROM, and data storage, RAM) provide a new tool for the system designer. At the present time, both mask charges and turn-around time for mask programmable ROMs are a barrier to rapid system development. The electrically programmable and ultra violet erasable ROM provides the solution to this problem. This paper describes the application of alterable read-only-memories to micro computer systems.

## INTRODUCTION

In 1971, a fully-decoded 2048-bit electrically programmable and ultra violet erasable read only memory was introduced by Intel Corporation. This memory used a novel floating-gate avalanche-injection MOS (FAMOS) charge storage device as the basic memory element. Later, during the same year, Intel introduced LSI micro processors. Now, complete micro computer systems can be developed which offer system flexibility, design expediency and manufacturing economy. It is in the areas of both flexibility and expediency where the electrically programmable ROMs (PROMs) play an important part.

## FAMOS MEMORY

The FAMOS programmable and erasable memory was first explained by Dr. D. Frohman-Bentchkowsky at the 1971 IEEE International Solid State Circuits Conference:

"The operation of the FAMOS memory structure depends on charge transport to the floating gate by avalanche injection of electrons. Charge can be transferred to the floating silicon gate if an avalanche injection condition is reached in either the source or drain junction... In a P-channel FAMOS transistor with a 1000 Å thick oxide, an applied junction voltage of -30 V is required for the onset of avalanche injection. The gate charging current is of the order of $10^{-7}\ \mathrm{A/cm^{2}}$. Since the silicon gate is floating, the avalanche injected current results in the accumulation of a negative electron charge on the gate. For a P-channel FAMOS transistor, this negative charge will induce a conductive inversion layer connecting source and drain. The amount of charge transferred to the floating gate is a function of the amplitude and duration of the applied junction voltage. Once the applied junction voltage is removed, no discharge path is available for the accumulated electrons since the gate is surrounded by silicon oxide which is a very low conductivity dielectric...since the gate electrode is not electrically accessible, the charge cannot be removed by an electrical pulse. However, the initial equilibrium condition (no electronic charge on the gate) can be restored by illuminating the unpackaged device with ultra violet light or by exposure of the packaged device to X-ray radiation."


Figure 1. Cross-section of a Floating-gate Avalanche-injection MOS (FAMOS) device.

When this PROM is used as a system development tool, the ability to erase the memory is of primary importance. The FAMOS memory may be erased and completely reprogrammed in less than one half hour. Figure 2 shows the simple erasing procedure using a high intensity short-wave ultra violet light. The FAMOS memory is affected neither by sunlight nor by fluorescent lighting.


Figure 2. Erasing of FAMOS PROM (Quartz Lid Package).

## MICRO COMPUTERS

LSI technology now provides additional new components for the system designer -- micro processors. These are complete computer central processing units which include an arithmetic unit, index register storage, program counter, address stack, instruction decoding, and system timing and control on a single silicon substrate. When combined with RAMs for data storage, complete micro computer systems can be assembled.

These components provide the same flexibility as the minicomputer, but they extend the concept of the dedicated control computer into applications where the minimization of cost and physical size are important, but where processing speed is not a major factor.

The power of a general purpose computing system is now available to every system designer as an alternative to the conventional design of random logic systems. Since micro computer systems are custom tailored by programming of ROMs, they ${ }^{2}$ are also an alternative to custom LSI.

Using as few as two micro computer components (CPU and ROM), many of the same control and computing functions of the minicomputer are available at a cost of nearly two orders of magnitude less.

Two micro processors ${ }^{4}$ currently available are the 4004, ${ }^{3}$ a four-bit parallel CPU designed for control and decimal arithmetic applications, and the 8008, ${ }^{5}$ an eight-bit parallel processor for data handling. The four-bit CPU (4004) interfaces directly with special ROMs (mask programmable 4001) and special RAMs (4002). On the other hand, the eight-bit CPU (8008) may be used with standard ROMs (mask programmable 1301) and standard RAMs (1101, 1103), but it requires some TTL interface circuitry.

In both instances, the final systems will be formed by the appropriate combination of standard LSI building blocks. For the system development phase, electrically programmable and ultra violet erasable ROMs should be used instead of the metal mask programmable ROMs.

## MICRO COMPUTER SYSTEM DEVELOPMENT

System development is always a tedious, iterative task. The use of micro computers moves much of the design from hardware to firmware. Most iterations in design then become ROM program modifications, not component and board changes. Many systems using micro computers have been designed and ready for production in less than three months.

A flowchart for the development of a typical system is shown in Figure 3. The general complexity of a system will be established by both the I/O requirement for data and the number of I/O lines required for peripheral controls. The RAM data storage requirement must also be determined, and sample programs must be written to determine ROM program storage requirements. At this time, the basic hardware is fixed and it can be developed concurrently with the firmware.

When the system control programs are complete, they can be assembled using a general purpose computer, and tapes for the actual programming of FAMOS PROMs can be generated. After the PROMs have been programmed, the system can then be debugged. Subsequent iterations of the system program may be required.

After the final iteration, the program can be committed to metal mask ROMs for the high volume production run. Concurrently, systems using PROMs can be built for initial deliveries. In the case where only low volume or one-of-a-kind customized systems are required, the system can be delivered using PROMs.

Figure 3. Development of a Micro Computer System.

A typical prototyping system is shown in Figure 4. This board uses the four-bit CPU (4004) and associated RAM memory (4002). FAMOS PROMs and TTL interface logic are used to simulate the mask programmed ROMs (4001). This system forms a complete micro computer. In addition to its use as a prototype system, this board may be used to provide the control for a PROM programming system. Using three special control ROMs, this micro computer can read data from a teletype, examine data format, present addresses and data to a FAMOS PROM programmer, and activate the program sequence (Figure 5). A similar micro computer prototype system is used with the eight-bit micro processor.


Figure 4. Micro Computer System using PROMs.


Figure 5. PROM Programming System.

## RELIABILITY

One of the most important performance factors in using both electrically programmable ROMs and micro computers is long term reliability. Both devices are fabricated using Intel's standard p-channel silicon gate technology ${ }^{6}$ of proven reliability. The only additional question posed by the use of the FAMOS PROM is the stability of the "1" memory state and the stability of the "0" memory state. The extrapolation of the charge decay results based on Dr. Frohman-Bentchkowsky's initial measurements ${ }^{1}$ indicates that $70 \%$ of the initial induced charge is retained in excess of ten years at $125^{\circ} \mathrm{C}$.

Subsequent Intel testing ${ }^{8}$ of discrete devices, 16-bit memory elements, and the fully decoded 2048-bit memories confirms the reliability of this device. The worst case condition for the "0" state occurs when the device is subjected to high temperature storage at $125^{\circ} \mathrm{C}$. There have been no failures in over 1.5 million unit hours (67 million bit hours) of testing. The worst case condition for the "1" state occurs when voltage is applied to the charge-storage device. There have been no failures in over 1.7 million unit hours (3.4 billion bit hours) of life testing.

FAMOS memory devices have also been exposed to sunlight and to fluorescent light to determine the effect on data retention. In neither case was there a loss of stored data.

## CONCLUSION

PROMs are currently being used in hundreds of micro computer development programs. Systems in production or undergoing field testing which use PROMs for system control are in the areas of process control, inventory control, data acquisition, point of sale terminals, intelligent terminals, and many more.

Much of the success and acceptance of the micro computer system is due to the availability of the electrically programmable and ultra violet erasable PROMs.

## REFERENCES

1. D. Frohman-Bentchkowsky, "A Fully-Decoded 2048-Bit Electrically-Programmable MOS-ROM," ISSCC Digest of Technical Papers, Feb. 1971, pp. 80-82.
2. F. Faggin, M.E. Hoff, "Standard Parts and Custom Design Merge in Four-Chip Processor Kit," Electronics, April 24, 1972, pp. 112-116.
3. MCS-4 Micro Computer Set User's Manual, Intel Corporation, July 1972.
4. F. Faggin, M. Shima, M.E. Hoff, H. Feeney, S. Mazor, "The MCS-4 -- An LSI Micro Computer System," IEEE 1972 Region Six Conference, pp. 1-6.
5. 8008, 8-Bit Parallel Central Processor Unit, Intel Corporation, June 1972.
6. L.L. Vadasz, A.S. Grove, T.A. Rowe, G.E. Moore, "Silicon-gate Technology," IEEE Spectrum, October 1969, pp. 28-35.
7. D.J. Fitzgerald, G.H. Parker, P. Spiegel, "Reliability Studies of MOS Si-gate Arrays," 1971 IEEE Reliability Physics Symposium, March 31-April 2, 1971.
8. "Product Reliability: 1601/1701 and 1602/1702," Intel Corporation, Aug. 1972.
9. "Two Control Computers Cost Under \$1000," Control Engineering, May 1972, p. 38.

# IMPACT OF LSI ON MICRO COMPUTER AND CALCULATOR CHIPS 

by H. SMITH

INTEL CORP. 3065 BOWERS AVE. SANTA CLARA, CALIFORNIA 95051

## I. INTRODUCTION

The complexity of the integrated circuit has been doubling every year, reaching a level today where it is possible to integrate complete computer central processors and calculators on a single chip. The availability of these tiny, low cost chips is already producing a revolution in system design and rapidly changing the architecture of many systems.

These LSI micro computer and calculator chips are classified as standard random logic arrays which do not have the regularity of LSI memory chips (where a single cell is repeated a number of times to form an array). This poses some unique design problems in both the area of product definition and product realization. First, the chip must be defined and partitioned in such a way that it will be sufficiently universal to have wide usage; and secondly, the chip size must be kept small enough to be able to produce it in large volume economically. In a random logic array, the chip size will be more a function of the method of partitioning and the amount of interconnection, rather than of cell size (as is the case with LSI memory).

The Si-gate MOS process has had a tremendous impact on the design of random logic chips because it allows the high component and interconnect densities that are required to economically produce these chips. With this technology, a complete central processor has already been put on a single chip, and a complete computer has been implemented with a few LSI packages.

This paper will briefly discuss the Si-gate MOS process and the impact that it is having on the high volume production of low cost, random logic LSI chips. It will then look specifically at several micro computer and calculator chips that are being produced using this process. These chips will be compared with equivalent chips that have been manufactured with the metal-gate process.

## II. MOS SILICON-GATE TECHNOLOGY

Briefly, the process steps are as follows:

1. The starting material is N-type silicon for P-channel or P-type for N-channel.
2. The wafer is first placed in an oxidizing atmosphere at high temperature and a relatively thick layer (about $1 \mu \mathrm{~m}$ ) of $\mathrm{SiO}_{2}$ is grown on the surface.
3. The region for the source, drain and channel is then defined by photo masking and etching.
4. The wafer is again placed in an oxidizing atmosphere and a thin layer of $\mathrm{SiO}_{2}$ ($0.1 \mu \mathrm{m}$) is formed. This thin layer of $\mathrm{SiO}_{2}$ will serve as the gate dielectric.
5. Next, a thin layer of polycrystalline silicon is deposited over the entire wafer.
6. The wafer is then returned to photo masking for removal of the silicon layer except where the gate regions are defined or where the silicon film will be used as an interconnection. The thin oxide is then removed by exposing the wafer to an oxide etch.
7. The wafer is then placed in a diffusion furnace where boron (for P-channel) or phosphorus (for N-channel) is diffused into the gate, interconnect, source and drain regions.
8. Next, a thick layer of oxide is deposited over the entire wafer and openings are etched for contacts between the subsequent metalization and underlying diffused regions or the polycrystalline silicon interconnect level.
9. Aluminum is then evaporated over the entire surface and is etched to define the metal pattern.
10. The process is then completed with the coating of a layer of glass over the entire chip and the etching of the contact holes for the external interconnect.

The silicon-gate process has many inherent advantages over the conventional metal-gate MOS process. These advantages have made possible the fabrication of low cost random logic LSI devices. The most dramatic evidence of this has been the introduction of several computer-on-a-chip devices by Intel in late 1971 and early 1972. The advantages of the Si-gate process are:

1. Smaller and faster devices resulting from (a) the self aligned gate, and (b) elimination of metal-to-metal separations between the gate and the source or drain.
2. Circuit interconnection flexibility resulting from the use of deposited polycrystalline silicon for interconnection with additional use of diffused conductors in the substrate, as in other MOS technologies. This gives flexibility approaching that of three layers for interconnecting complex functions efficiently. These three interconnections are illustrated in a magnified view of a portion of the 8008 8-bit CPU chip shown in Fig. 1. THE AREA SAVED OVER CONVENTIONAL METAL GATE MOS TECHNOLOGY IS APPROXIMATELY $50 \%$.

The impact of this Si-gate LSI technology on the production of low cost micro computers and calculator chips will now be illustrated.

Figure 1. Photomicrograph of a small region of Intel's 8008 8-bit CPU chip showing the three possible modes of interconnection: (a) diffused regions within the wafer, (b) deposited silicon, and (c) metal interconnecting lines.

## III. USING THE SI-GATE PROCESS TO BUILD MICRO COMPUTER CHIPS

In November, 1971, Intel ushered in a new era of integrated electronics by introducing its first micro computer on a chip -- the 4004 4-bit CPU. In May, 1972, Intel introduced its second micro computer on a chip -- the 8008 8-bit CPU. Both of these devices are fabricated and made economically possible with the p-channel Si-gate process. The announcement of these chips brought the power of a general purpose computer to every systems designer as an alternative to conventional designs of random and custom logic systems.

4004 4-Bit CPU Chip
The 4004 is the heart of the MCS-4 general purpose, micro programmable computer set. When used with other members of the micro computer set (4001 -- 256x8 bit ROM and 4-bit I/O port, 4002 -- 320 bit RAM and 4-bit output port, and 4003 -- I/O expander), an almost unlimited number of types of systems may be built.

The 4004 CPU chip consists of a 4-bit adder and accumulator, a 64-bit index register, a memory stack containing a 12-bit program counter and three 12-bit words used to store subroutine addresses, an address incrementer, an 8-bit instruction register and decoder, and control logic. (The block diagram is shown in Figure 2.) The characteristics of the chip are:

Chip Size -- $117 \times 159$ mils, approx. 18,500 square mils
Number of transistors -- 2248
Package -- 16 pin DIP


Figure 2. Block Diagram of Intel 4004 4-bit CPU Chip.

8008 8-Bit CPU Chip
The 8008 CPU, when combined with standard memory devices, forms the MCS-8, general purpose micro computer set.

The 8008 CPU chip consists of an 8-bit parallel binary arithmetic unit and accumulator, six 8-bit data registers, two 8-bit temporary registers, four flag flip-flops, a memory stack containing a 14-bit program counter and seven 14-bit words used to store subroutine addresses, instruction register and decoder, and control logic. (The block diagram is shown in Figure 3.) The characteristics of the chip are:

Chip Size -- $124 \times 173$ mils, approx. 21,400 square mils
Number of transistors -- 3098
Package -- 18 pin DIP

If we compare this CPU chip with an equivalent CPU chip that performs exactly the same functions and is fabricated in the metal-gate process, the advantages of the Si-gate LSI process can be clearly seen.

| CHIP | CHIP SIZE (Mils) | CHIP AREA (Sq. Mils) |
| :--- | :--- | :--- |
| Intel's Si-gate 8008 CPU Chip | $124 \times 173$ | 21,400 |
| Company T Equivalent Metal Gate CPU Chip | | |

THE METAL-GATE CHIP IS MORE THAN 2.2 TIMES THE SIZE OF THE SI-GATE CHIP.


Figure 3. Block Diagram of Intel 8008 8-Bit CPU Chip.

## IV. USING THE SI-GATE PROCESS TO BUILD CALCULATOR CHIPS

The last example that will be used to illustrate the impact of the Si-gate MOS process on random logic LSI chips is the single chip calculator. Intel has developed and has in production a single chip calculator that is being fabricated with the P-channel Si-gate process. The functions and features of the chip are:

Operations performed: addition, subtraction, multiplication, division, chain multiplication, chain division, constant multiplication, constant division, percentage calculation
Display: 10 or 12 digits
Features: floating point, overflow indication, credit balance, zero suppression

The size of this chip is $132 \times 128$ mils, or approximately 17,000 square mils. A comparison with equivalent single chip calculators is shown below.

| CHIP | CHIP SIZE (Mils) | CHIP AREA (Sq. Mils) |
| :--- | :--- | :--- |
| Intel single chip calculator (12 digit floating point) | $132 \times 128$ | 17,000 |
| Company T single chip calculator (conventional metal-gate) (8 digit floating pt.) | $230 \times 230$ | 53,000 |
| Company M single chip calculator (ion implanted metal-gate) (10 digit floating pt.) | $180 \times 180$ | 32,400 |

THE CONVENTIONAL METAL-GATE CHIP IS 3 TIMES THE SIZE OF THE SI-GATE CHIP AND THE ION-IMPLANTED METAL-GATE CHIP IS NEARLY 2 TIMES THE SIZE OF THE SI-GATE CHIP.
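The size ratios quoted above follow directly from the chip dimensions in the table. The short Python sketch below recomputes the areas and ratios as a check; the dictionary keys and function name are illustrative labels chosen here, not names from the original paper.

```python
# Chip dimensions in mils, taken from the comparison table.
chips = {
    "Intel Si-gate": (132, 128),
    "Company T metal-gate": (230, 230),
    "Company M ion-implanted metal-gate": (180, 180),
}


def area_sq_mils(width, height):
    """Area of a rectangular chip in square mils."""
    return width * height


areas = {name: area_sq_mils(w, h) for name, (w, h) in chips.items()}
intel_area = areas["Intel Si-gate"]

# Ratio of each chip's area to the Si-gate chip's area.
ratios = {name: area / intel_area for name, area in areas.items()}

for name in chips:
    print(f"{name}: {areas[name]} sq. mils, {ratios[name]:.2f}x the Si-gate chip")
```

The exact products (16,896, 52,900 and 32,400 square mils) round to the table values, and the ratios come out at roughly 3.1 and 1.9, in line with the "3 times" and "nearly 2 times" statements.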

## V. CONCLUSION

Si-gate LSI has made possible the production of micro computers and calculators on a single chip. The availability of these tiny, low cost computer chips is producing a revolution in system design similar to that produced by the mini computer. The range and applications for the micro computer are very broad and the architecture of many systems is being changed to make effective use of these devices. Already, a large number of cash registers, point of sale terminals, traffic control systems, small accounting machines, process controllers, intelligent typewriters, digital scales, digital instruments, test equipment, etc. have been built using these new processors.

The future will bring higher speed and more sophisticated LSI computer chips and we will see an even greater proliferation in the use of these devices. It is not unreasonable to project that computer chips will be routinely used in future design as TTL MSI is used today. The impact on system design will be truly phenomenal.

## VI. REFERENCES

1. Les Vadasz, et al., "Silicon Gate Technology," IEEE Spectrum, Vol. 6, No. 10, October 1969, pp. 28-35.
2. MCS-4 Micro Computer Set User's Manual, Intel Corporation, July 1972.
3. F. Faggin, M.E. Hoff, "Standard Parts and Custom Design Merge in Four Chip Processor Kit," Electronics, April 24, 1972, pp. 112-116.
4. F. Faggin, M. Shima, M.E. Hoff, H. Feeney, S. Mazor, "The MCS-4 -- An LSI Micro Computer System," IEEE 1972 Region Six Conference, pp. 1-6.
5. MCS-4 Data Sheet, Intel Corporation, April 1972.
6. 8008, 8-Bit Parallel Central Processor Unit, Intel Corporation, June 1972.

# THE NEW LSI COMPONENTS 

by M. E. HOFF - MANAGER, APPLICATIONS RESEARCH

INTEL CORP. 3065 BOWERS AVE. SANTA CLARA, CALIFORNIA 95051

For several years, many authors have discussed techniques for partitioning logic designs to make them compatible for large scale integration. In the meantime, the complexity of the most economically optimum integrated circuit has been doubling every year. With this rapidly growing complexity, it has now become possible to integrate many systems completely on a single chip. A large number of single-chip control circuits and single-chip desk calculators are already in service.

The high complexity of the modern integrated circuit results in high development costs and raises the question of how to define the circuit in such a way that it will be sufficiently universal to have a wide usage. Wide usage is necessary or the development cost, even when amortized over the total number of devices produced, may make the circuit uneconomical when compared with more conventional logic circuits. Unfortunately, not all circuits have the wide usage or universality of function that is characteristic of the desk calculator. However, the high complexity available with modern integrated circuits makes possible the construction of a relatively universal component, the general purpose digital computer. At the present state of the technology, a computer central processor somewhat like that of a minicomputer can be integrated on a single silicon chip. While the performance of these computers is not yet equal to that of the minicomputer constructed with more conventional design, there are many applications where the capabilities of these single-chip computers are quite adequate.

The availability of these tiny, low cost microprocessors can be expected to produce a revolution in system design greater than that produced by the minicomputer. With the advent of the minicomputer, it became possible to use a digital computer as a component in larger systems. Using a minicomputer as the heart of a system offers a degree of flexibility that no ordinary hard-wired system could have. Because of the still significant size and cost of the typical minicomputer, however, its use as a component has been limited to relatively large and expensive systems. The new LSI microprocessors, however, remove most of the cost and size restrictions associated with the use of a digital computer as the heart of a system.

The range and application for the new components is very broad. Already, a large number of cash registers, point of sale terminals, computer terminals, traffic control systems, small accounting machines, intelligent typewriters, process controllers, numerical machine tool controllers, etc., have been built using the new microprocessors. In some cases, the use of these microprocessors greatly simplifies the overhead associated with a design. For example, many organizations were already designing simplified low-performance minicomputers for use in small systems after having recognized the advantages of a programmable system rather than a hard-wired one. However, any processor design has a significant cost associated with the development of the software for the system. Minimum requirements for the support software usually include an assembler to permit programming in a symbolic language and a simulator to allow programs to be checked out on a larger computer. In some cases, a microprocessor has been chosen because a wide variety of functions can be implemented with a single unit of hardware. Modifying equipment for a given customer's needs is also easier with the programmed structure.

Many users of microprocessors have chosen the approach because the system is simplified as well as increased in flexibility. While not always true, there are many functions which are quite difficult to realize with hardware, but are quite easily implemented with a microprocessor. However, it should be noted that some functions which are fairly easy to perform with hardware can be more difficult to perform with a microprocessor. The most common difficulties arise when the designer converts from an essentially parallel hardware network to the sequential microprocessor implementation.

A potential difficulty faced by the manufacturer who would use a microprocessor to implement his system is caused by the requirement for programming. Although programming experience is much more common among engineers today than it was a few years ago, a significant number of engineers are still unfamiliar with programming techniques, particularly in machine or assembly language. For the available microprocessors, however, the instruction sets are relatively simple. As a result, they are easily learned and quickly remembered. Some engineers have found it helpful to think of the instructions as building blocks of the system much as MSI elements are building blocks of a hard-wired logic system. Most of the designers who investigate these microprocessors feel it well worth it to make this investment, for not only does the microprocessor make it possible to build a more powerful, more capable system but it also offers a number of other advantages. For example, because so much of the system's characteristics are determined by the program for the microprocessor, it is usually quite convenient to make changes in the system by changing this program. While it is customary when using these microprocessors to commit most of the program to read-only memory so that the program does not have to be loaded each time the system is turned on, it is still quite a bit more convenient to make changes by replacing or altering the contents of the read-only memory than it is to have to cut traces on printed circuit boards, move and add wires and back plane wiring, etc. It is also usually easier to provide for future expansion of a system when this expansion will take the form of enhancement or additions to a program than it is when the expansion will involve changes of hardware.

As in the case of any new component, the designer must learn to use the item efficiently. Because of the low cost of the processor itself, it becomes most desirable to minimize the cost of the peripheral circuitry around the processor. For example, in one application using the Intel MCS-4 single-chip computer family, it was necessary for the processor to be able to determine the value of an analog voltage. While it was possible to use the conventional approach of interfacing an analog to digital convertor to the microprocessor, a cost saving was achieved by having the microprocessor execute a program which enabled a digital to analog convertor and a comparator to perform the analog to digital convertor function. Figure 1 shows how the conversion was achieved. The MCS-4 uses a "port" for input/output communication. A four-wire port is associated with each read-only memory or read-write memory chip. (It should be noted that the MCS-4 microcomputer set is a family of four parts consisting of the 4004 central processor unit; 4001 read-only memory; 4002 read-write memory; 4003 shift register/output expander.) In Figure 1, two of these output ports have been used to drive the inputs of a digital to analog convertor (DAC). The DAC is wired to a comparator which allows the output of the DAC to be compared with the analog input signal. The output of the comparator is in turn wired to the test input of the 4004 central processor. This test input line is interrogated when the central processor executes a certain conditional jump instruction. Whereas the normal instruction execution flow within the MCS-4 system is sequential through program memory, when the conditional jump is executed and its condition is met, the processor jumps to a new location in memory, starting a new instruction sequence.

Figure 2 lists the program for the analog to digital convertor in MCS-4 assembly language (readers desiring more information about the instruction set of the MCS-4 are referred to reference 1). The program implements a successive approximation conversion technique. Starting with the highest order bit, each bit in turn is turned on and the output of the comparator tested. If turning on the bit results in a signal from the DAC that is larger than the analog input, the bit is turned off and the next bit in turn tested. However, if turning on a bit leaves the output of the digital to analog convertor still smaller than the analog input signal, then that bit will be left turned on. The coding for the program consists of testing each of the lines of one port in turn using in-line coding, then repeating the sequence for the next set of port lines by looping back. Setting a bit is accomplished by loading the accumulator with a load immediate instruction (LDM) and then writing the contents of the accumulator to the output port. The output port is selected at the beginning of the program by the combination of fetch immediate (FIM) and send register control (SRC) instructions. Register \#4 (R4) is used to contain the current estimate of the value for the 4 bits being tested. A bit under test is retained or cleared by updating or not updating the contents of register 4. At the end of the basic 4-line test sequence of instructions, the contents of register 4 are saved in an alternate location by a series of exchange (XCH) instructions and the instruction increment and skip on zero (ISZ) is used to perform the function of counting the number of passes through the loop and jumping back to the loop start. The loop selects the next port in turn by the increment (INC) instruction which modifies register R0 so that when the next SRC instruction is executed, it will select the next port in sequence. This basic program can be easily modified to handle 12 bit binary or 2 or 3 digit decimal conversions. Execution of the sequence of instructions takes less than one millisecond and as can be seen from the listing, occupies some words of read-only memory.
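The successive approximation procedure described above can be sketched in a few lines of Python. This is an illustrative model only, not the MCS-4 program itself: the idealized DAC transfer function, the `full_scale` parameter, and the function names are assumptions made for the sketch, and the comparator is modeled as a simple "DAC output exceeds input" test standing in for the 4004 test input.

```python
def successive_approximation(analog_input, n_bits=8, full_scale=1.0):
    """Return the n_bits code found by trying each bit from the top down."""

    def dac(code):
        # Idealized DAC (assumption): output is code / 2**n_bits of full scale.
        return full_scale * code / (1 << n_bits)

    def result_too_big(code):
        # Stands in for the comparator driving the test input:
        # true when the DAC output exceeds the analog input.
        return dac(code) > analog_input

    estimate = 0
    for bit in reversed(range(n_bits)):   # highest order bit first
        trial = estimate | (1 << bit)     # turn the bit on
        if not result_too_big(trial):     # keep it unless the DAC overshoots
            estimate = trial
    return estimate


code = successive_approximation(0.30)
# Split into two 4-bit groups, as the two 4-line ports do in the text.
high_bits, low_bits = code >> 4, code & 0xF
print(code, high_bits, low_bits)
```

For an input of 0.30 of full scale the sketch settles on code 76 (binary 0100 1100). Each pass of the loop corresponds to one load, write, test, and conditional save sequence in the assembly listing.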

A multiplexer for multiple analog inputs can be added quite easily by providing a separate comparator for each analog input and performing digital multiplexing at the input to the test terminal of the 4004 central processor. An alternate use of the structure shown in Figure 1 permits determining which if any of the several signals is above or below some predetermined analog threshold value. The analog threshold value is deposited at the output ports driving the DAC and the outputs of the comparators are then read into the MCS-4 system at an input port or at the test terminal of the CPU.
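The threshold-monitoring use of the same structure can be sketched as follows. As before, this is a hedged model rather than a description of actual hardware: the ideal DAC scaling and the function name `inputs_above_threshold` are assumptions, and each list element plays the role of one comparator output read at an input port.

```python
def inputs_above_threshold(analog_inputs, threshold_code, n_bits=8, full_scale=1.0):
    """Model one comparator per analog input, all sharing a single DAC threshold."""
    # The threshold code is deposited at the output ports driving the DAC.
    threshold = full_scale * threshold_code / (1 << n_bits)
    # Each comparator reports whether its own input exceeds the DAC output.
    return [value > threshold for value in analog_inputs]


flags = inputs_above_threshold([0.10, 0.60, 0.50], threshold_code=128)
print(flags)
```

With a threshold code of 128 (half of full scale under the assumed scaling), only the second input is reported as above threshold; an input exactly equal to the threshold is treated as not above it.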

This example illustrates a typical application for some of the new single-chip microprocessors. At the time of this writing, there are two such microprocessors available, the Intel MCS-4 4-bit processor system and the Intel MCS-8 8-bit processor system. Although both of these processors are produced by one manufacturer, several other semiconductor manufacturers have announced their intention to produce microprocessor-like structures. With the continuing growth in the complexity of integrated circuits, the potential user of these devices can expect that faster, more powerful microprocessors will become available in the future. However, two families of these processors are already available and are now finding wide acceptance. A designer who becomes familiar with their characteristics probably will find it easier to adapt to new microprocessors and will be in a better position to influence the design of the next generation of microprocessors.

1. MCS-4 Microcomputer Set -- Users Manual, Intel Corp., Mar 1972
2. 8008 8-bit Parallel Central Processor Unit -- MCS-8 Microcomputer Set, Intel Corp., April 1972


| Address | Code | Label | Instruction | Comment |
| :--- | :--- | :--- | :--- | :--- |
|  |  |  |  | /SET UP FOR SELECTION OF ROM OUTPUT PORT (R0, R1 = P0), USING R1 AS A LOOP COUNTER -- VALUES IN BINARY |
| 0000 | 00032 |  | FIM P0 00001111B |  |
|  | 00015 |  |  |  |
|  |  |  |  | /CLEAR REGISTERS R4, R5. (THESE TWO REGISTERS ARE DESIGNATED PAIR 2 OR P2 BY THE FIM INSTRUCTION.) R4 AND R5 WILL BE USED TO RECEIVE THE RESULT OF THE CONVERSION |
| 0002 | 00036 |  | FIM P2 0 |  |
|  | 00000 |  |  |  |
|  |  |  |  | /START OF MAIN LOOP |
| 0004 | 00033 | ADLP, | SRC P0 | /SELECT PORT USING CONTENTS OF R0, R1 |
| 0005 | 00240 |  | CLB | /CLEAR ACCUMULATOR AND CARRY FLIP-FLOP |
| 0006 | 00216 |  | LDM 8 | /LOAD ACCUMULATOR WITH 1000 (LDM 8 SETS THE HIGH ORDER BIT OF THE ACCUMULATOR) |
| 0007 | 00226 |  | WRR | /WRITE ACCUMULATOR TO ROM OUTPUT PORT |
| 0008 | 00025 |  | JCN TI *+3 | /JUMP PAST XCH IF RESULT TOO BIG |
|  | 00011 |  |  |  |
| 0010 | 00180 |  | XCH R4 | /SAVE RESULT IF NOT TOO BIG |
|  |  |  |  | /NOW REPEAT FOR 2ND HIGHEST BIT |
| 0011 | 00212 |  | LDM 4 | /LOAD ACCUMULATOR WITH 0100 |
| 0012 | 00132 |  | ADD R4 | /ADD RESULT OF PREVIOUS TEST |
| 0013 | 00226 |  | WRR | /WRITE TO ROM OUTPUT PORT |
| 0014 | 00025 |  | JCN TI *+3 | /JUMP PAST XCH IF RESULT TOO BIG |
|  | 00017 |  |  |  |
| 0016 | 00180 |  | XCH R4 | /SAVE CURRENT RESULT IF NOT TOO BIG |
|  |  |  |  | /REPEAT PROCEDURE FOR LAST TWO BITS OF THIS PORT |
| 0017 | 00210 |  | LDM 2 | /LOAD ACCUMULATOR WITH 0010 |
| 0018 | 00132 |  | ADD R4 |  |
| 0019 | 00226 |  | WRR |  |
| 0020 | 00025 |  | JCN TI *+3 |  |
|  | 00023 |  |  |  |
| 0022 | 00180 |  | XCH R4 |  |
| 0023 | 00209 |  | LDM 1 | /LOAD ACCUMULATOR WITH 0001 |
| 0024 | 00132 |  | ADD R4 |  |
| 0025 | 00226 |  | WRR |  |
| 0026 | 00025 |  | JCN TI *+3 |  |
|  | 00029 |  |  |  |
| 0028 | 00180 |  | XCH R4 |  |
|  |  |  |  | /NOW WRITE FINAL RESULT TO ROM PORT |
| 0029 | 00164 |  | LD R4 | /LOAD FINAL RESULT TO ACCUMULATOR |
| 0030 | 00226 |  | WRR |  |
|  |  |  |  | /NEXT MOVE THESE 4 BITS TO R5 AND CLEAR R4 FOR NEXT PASS. NOTE R5 INITIALLY CONTAINED ZERO |
| 0031 | 00181 |  | XCH R5 | /ACCUMULATOR TO R5, R5 TO ACCUMULATOR |
| 0032 | 00180 |  | XCH R4 | /CLEARS R4 IF AT END OF FIRST PASS |
| 0033 | 00096 |  | INC R0 | /PREPARE FOR SELECTION OF NEXT ROM PORT |
| 0034 | 00113 |  | ISZ R1 ADLP | /RETURN FOR SECOND PASS AFTER PASS 1 |
|  | 00004 |  |  |  |
|  |  |  |  | /AFTER PASS 2, PROGRAM CONTINUES PAST THIS POINT. HIGH ORDER BITS OF RESULT WILL BE IN R4, LOW ORDER BITS IN R5. |
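
The listing above tests one bit at a time, from the most significant bit down, and keeps each bit only if the comparator does not report the trial value as too big. A minimal Python sketch of the same successive-approximation idea follows. It is not a translation of the MCS-4 program: the comparator is modeled as a hypothetical callable `too_big`, and all 8 bits are handled in one loop rather than in two 4-bit passes.

```python
def successive_approximation(too_big, bits=8):
    """Return the largest code for which too_big(code) is False.

    too_big is a stand-in for the comparator read at the test terminal:
    it returns True when the DAC output for the trial code exceeds the
    analog input. (Hypothetical interface, for illustration only.)
    """
    result = 0
    for bit in reversed(range(bits)):      # most significant bit first
        trial = result | (1 << bit)        # tentatively set this bit
        if not too_big(trial):             # keep the bit if not too big
            result = trial
    return result


def make_comparator(analog_level):
    """Model an ideal comparator against a fixed analog level."""
    return lambda trial: trial > analog_level


converted = successive_approximation(make_comparator(173))
```

With an ideal comparator the loop needs exactly one comparison per bit, which is why the assembly program repeats the same load, add, write, and test pattern four times per output port.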

# A MULTIMILLION BYTE CORE REPLACEMENT UTILIZING MOS DYNAMIC MEMORY 

T.J. DeFranco, H.F. Bodio, W.F. Jordan Intel Corp. Santa Clara, California

Reprinted from the IEEE Intercon Technical Papers, 1973.

## SUMMARY

MOS memory arrays are being employed in some of the largest commercially available computers. This paper describes the hardware and system design of a million byte mainframe memory module for use as an add-on to a 370-155 or 165. Design features which exploit the computer's error correction capability are presented. Included also is a description of the off-line self tester.

Each module is available as a self-contained enclosure closely matching IBM construction and appearance, as shown in Fig. 1.


Fig. 1 in-70 Memory System

The in-70 can replace all or part of the IBM memory up to one megabyte per cabinet enclosure, with expansion of main memory capability to 4 million bytes on both the System 370/155 and 165.

The in-70 memory system has four times the effective density of IBM's 3360 processor storage (Fig. 2) and dissipates $50 \%$ less heat. Because of this dramatic reduction in size and power, the System 370/155 and 370/165 user can realize significant cost savings.


Fig. 2 Comparison of IBM 3360 and in-70 Memories

The electronic technology used in the in-70 memory is intrinsically more reliable than electromagnetic core memory. As a result the user will experience greater availability and improved cost performance from his total system. Reliability is further enhanced by utilizing the Error Correction Code electronics in the CPU. Single bit errors, the major source of previous core memory failures, are detected and automatically corrected.

## ARCHITECTURE

The in-70 memory is organized with a single bit per memory card. The 370-155/165 processors communicate simultaneously with four storage modules of 36 data lines each, so that a minimum of 144 memory boards, with additional control cards, are required for the implementation of this architecture. Two $6.75^{\prime \prime} \times 8^{\prime \prime} \times 0.4^{\prime \prime}$ memory cards have been developed, with 1 bit $\times 16 \mathrm{~K}$ words and 1 bit $\times 32 \mathrm{~K}$ words, corresponding to memory expansion modules of $1 / 4$ megabyte and $1 / 2$ megabyte, respectively. The basic in-70 module block diagram is shown in Fig. 3.


Fig. 3. in-70 Block Diagram (signals shown: control, 15 address lines, BSM0-BSM3 select)
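
The card and module sizes quoted above can be cross-checked with a few lines of arithmetic. The sketch below assumes that each group of nine lines carries one byte (8 data bits plus one parity bit), so that 36 lines hold four bytes; the constant and function names are illustrative only.

```python
CARDS_PER_MODULE = 144             # one card per processor data/parity line
BITS_PER_FULL_CARD = 32 * 1024     # 1 bit x 32K words
BITS_PER_HALF_CARD = 16 * 1024     # 1 bit x 16K words
BITS_PER_1103 = 1024               # capacity of one 1103 device
LINES_PER_BYTE = 9                 # assumption: 8 data bits + 1 parity bit


def module_bytes(bits_per_card):
    """Bytes of user storage in a module built from 144 one-bit cards."""
    return CARDS_PER_MODULE * bits_per_card // LINES_PER_BYTE


full_module = module_bytes(BITS_PER_FULL_CARD)    # 1/2 megabyte
half_module = module_bytes(BITS_PER_HALF_CARD)    # 1/4 megabyte
devices_per_full_card = BITS_PER_FULL_CARD // BITS_PER_1103
devices_per_module = CARDS_PER_MODULE * devices_per_full_card
```

Under these assumptions a fully populated module holds 524,288 bytes and uses 32 devices on each card.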

## MOS Storage Element

The semiconductor storage device used in the memory cards is the Intel 1103, a 1024 bit low threshold P-channel, dynamic silicon gate MOS LSI RAM (Random Access Memory). The 1103 is organized as a 32 row $\times 32$ column array, with five address bits selecting the row, and five address bits selecting the column. The 1103 is packaged in an 18 pin dual-in-line package: ten pins for the address inputs, three pins for power, three pins for the control signals PRECHARGE, CENABLE, and READ/WRITE, and two pins for DATA IN and DATA OUT. Each memory element comprises three P-channel enhancement mode field effect transistors (FETs) (Fig. 4).


Fig. 4. Dynamic MOS Memory Cell

In the dynamic cell of Fig. 4, data is stored as charge on the parasitic capacitance $C_{S}$ of the gate of Q2. Data on the WDATA line is written into the cell by enabling Q1 with WSEL. To read from the cell, the RDATA line (with capacitance $C_{R}$) is first discharged to $V_{DD}$ (enabled by signal $\phi$). When the signal RSEL enables Q3, the RDATA line will be charged to $V_{SS}$ through inverter Q2 if and only if $C_{S}$ contains a low, and will remain low if and only if $C_{S}$ contains a high. Therefore, the RDATA line then contains the logical complement of the cell data.

Although the read out operation from the cell is non-destructive, the leakage associated with the junction of Q1 may eventually result in the loss of the charge stored in $C_{S}$. To maintain the data stored in the cell, the data must be periodically regenerated. This regeneration is accomplished by reading the contents of the cell out onto the read data line, inverting and amplifying the signal and applying it to the WDATA line, and rewriting it back into the cell by activating the WSEL line. A circuit which performs the inversion and amplification function is called a refresh amplifier.

In the 1103, the dynamic cells are laid out in a two dimensional array. One entire row of cells is refreshed (or accessed) at one time, one refresh amplifier being provided for each column of cells in the array. To refresh the entire memory, each row of cells must be individually refreshed. A block diagram of the 1103 is shown in Fig. 5.
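
The read and refresh behavior described above can be summarized as a small logical model. The following Python sketch is a purely digital abstraction (no charge, leakage, or timing), with hypothetical class and function names; `True` stands for a high level near $V_{SS}$ and `False` for a low level near $V_{DD}$.

```python
class ThreeTransistorCell:
    """Logic-level model of the dynamic cell: one stored bit on C_S."""

    def __init__(self):
        self.stored = False            # level held on the gate of Q2

    def write(self, wdata):
        """WSEL enables Q1 and copies the WDATA level onto C_S."""
        self.stored = wdata

    def read(self):
        """RDATA is precharged low; Q2 (P-channel) conducts only when
        the stored level is low, pulling RDATA high. The read line thus
        carries the complement of the stored data and C_S is unchanged."""
        return not self.stored


def refresh(cell):
    """Refresh amplifier: invert RDATA and rewrite it through WDATA."""
    rdata = cell.read()
    cell.write(not rdata)


def refresh_row(row):
    """One refresh amplifier per column refreshes a whole row at once."""
    for cell in row:
        refresh(cell)
```

Because the read line is inverted once by the cell and once more by the refresh amplifier, a refresh leaves the stored level unchanged.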


Fig. 5 Block Diagram of 1103

Five address lines, $A_{0}$ through $A_{4}$, are decoded to select one row of cells. When accessed, the contents of this row are transferred to a row of 32 refresh amplifiers. In the course of a memory cycle, whether read or write, the data is regenerated and written back into the selected row of cells. Address bits $A_{5}$ through $A_{9}$ are decoded to select one refresh amplifier for communication with the data input and output terminals. Data output is sensed as a current. Activation of the "write" clock effectively disconnects the refresh amplifier outputs from the write data lines and permits the signal on the data input line to override the signal at the output of the selected refresh amplifier.

Figure 6 shows the basic timing of the 1103 memory cycle. The cycle timing is established by the three clock signals: precharge, cenable, and write. Initially (prior to execution of a memory cycle) all clocks are at their high state, at a voltage approximately equal to $V_{SS}$. To begin a cycle, precharge is first brought low, to approximately $V_{DD}$ potential. Referring to Figures 5 and 6, this operation activates the row and column decoders, and also charges all read and write data lines negatively. In the discussion which follows, clocks, etc., are considered "on" at $V_{DD}$ level, and "off" at $V_{SS}$ level.
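
The split of the ten address bits into a row field and a column field can be sketched as follows. The function name is hypothetical, and the sketch assumes $A_{0}$ is the least significant bit of the address word, which the text does not state.

```python
def decode_1103_address(address):
    """Split a 10-bit 1103 address into (row, column).

    A0..A4 select one of 32 rows; A5..A9 select one of the 32 refresh
    amplifiers (columns). A0 is assumed to be the least significant bit.
    """
    if not 0 <= address < 1024:
        raise ValueError("the 1103 holds 1024 bits, addresses 0..1023")
    row = address & 0x1F              # bits A0..A4
    column = (address >> 5) & 0x1F    # bits A5..A9
    return row, column
```

Since a full row is regenerated on every cycle, stepping only the row field through its 32 values is enough to refresh the entire device.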


Fig. 6. 1103 Timing

After precharge and addresses have been active long enough for the data lines to discharge and the decoders to stabilize, the cenable clock is turned on, the decoded read-select line is turned on, the read-data line precharge circuits are turned off, and the data lines are charged to the complements of the data stored in the selected row of cells. As the read-data lines are selectively charged, the precharge signal is turned off, removing the precharge signal from the write-data lines and closing the path which enables WSEL, restoring the contents of the memory cells.

When the write line is activated, the read data lines are discharged, disconnecting the refresh amplifiers from the write data lines, and enabling a path from the data input line to the selected cell. The signal on the data input will then overwrite the contents of the cell.

The non-destructive read-out capability of the 1103 is utilized in the "Refresh Abort" mode of operation of the in-70 system. If precharge and cenable are turned off simultaneously during a memory cycle interrupt, the Intel 1103 memory contents are not altered, no matter where in the memory cycle the interrupt occurs. In the 370-155/165 application it is necessary to stop (abort) any refresh cycle which may be in process when the computer demands a memory access. The subsequent period normally allocated for a magnetic memory restore cycle is used to service the 1103 refresh cycle at the previously interrupted row address.

## SYSTEM IMPLEMENTATION

The processor interface timing is shown in Fig. 7. A memory cycle is initiated when a "BSM Select" line goes TRUE at time T0, which enables a 36 bit slice of the 144 bit data bus. The address lines must be stable within 100 nanoseconds of T0, and stay stable until T0 + 1300 nanoseconds. Read data is stable well within the specified access time of 800 nanoseconds, and write data is strobed into the memory at approximately T0 + 1200 nanoseconds if write control is true.


Fig. 7 Processor Interface Timing

Each $1 / 2$ megabyte memory module contains refresh logic which refreshes $1 / 128$ th of the memory at approximately 10 microsecond intervals.

The refresh logic consists of a 7 bit counter, to sequentially address the 128 segments of memory, a timing circuit, and a multiplexer to steer the row addresses from the 7-bit counter to the memory chips. Standard READ cycle timing circuits are used to execute a refresh cycle, which requires approximately 500 nanoseconds to complete.
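
The refresh figures above imply a sweep time and a refresh overhead that are easy to compute. The sketch below uses only the numbers quoted in this section, plus the 2 msec refresh time cited for the 1103 in the reliability report elsewhere in this handbook; the variable names are illustrative.

```python
SEGMENTS = 2 ** 7              # 7-bit counter -> 128 memory segments
INTERVAL_US = 10.0             # one segment refreshed about every 10 microseconds
REFRESH_CYCLE_NS = 500.0       # duration of one refresh cycle
REFRESH_LIMIT_MS = 2.0         # 1103 refresh time quoted elsewhere in the handbook

# Time to refresh every segment once.
full_sweep_ms = SEGMENTS * INTERVAL_US / 1000.0

# Fraction of each interval occupied by a refresh cycle.
refresh_duty = (REFRESH_CYCLE_NS / 1000.0) / INTERVAL_US

within_limit = full_sweep_ms < REFRESH_LIMIT_MS
```

On these approximate figures a complete sweep takes about 1.28 milliseconds, and refresh cycles occupy roughly 5% of the time.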

If a memory cycle is in process when the refresh initiate logic times out, the refresh logic will wait until the memory cycle is complete before the refresh cycle is initiated. If a memory access occurs during a refresh cycle, the refresh cycle is aborted, the standard memory cycle is completed, and the refresh cycle is then restarted. Consequently, the through-put of the processor is not degraded because of the refresh requirements of the dynamic semiconductor memory. All refresh operations occur when the processor is not accessing memory, and the memory is never "BUSY" while a refresh cycle is in process.
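
The arbitration rules in the preceding paragraph can be expressed as a small state model. This is a hedged sketch of the described behavior only, not of the actual control card logic; the class and method names are invented for illustration.

```python
class RefreshController:
    """Toy model of refresh arbitration: the processor always wins."""

    SEGMENTS = 128                     # 7-bit segment counter

    def __init__(self):
        self.segment = 0               # segment to be refreshed next
        self.memory_busy = False       # processor memory cycle in process
        self.refresh_pending = False   # timer expired, refresh not yet done
        self.refreshing = False        # refresh cycle currently running

    def refresh_timeout(self):
        """Refresh-initiate timer expired; wait if memory is busy."""
        self.refresh_pending = True
        self._try_start()

    def memory_access(self):
        """Processor access: abort any refresh cycle in process."""
        if self.refreshing:
            self.refreshing = False    # counter is not advanced on abort
        self.memory_busy = True

    def memory_done(self):
        """Memory cycle complete: start or restart a waiting refresh."""
        self.memory_busy = False
        self._try_start()

    def refresh_done(self):
        """Refresh ran to completion: advance to the next segment."""
        if self.refreshing:
            self.refreshing = False
            self.refresh_pending = False
            self.segment = (self.segment + 1) % self.SEGMENTS

    def _try_start(self):
        if self.refresh_pending and not self.memory_busy:
            self.refreshing = True
```

An aborted refresh leaves the segment counter untouched, so the same segment is refreshed when the cycle is restarted.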

## BASIC EXPANSION MODULE

The basic expansion module of the in-70 is $1 / 2$ megabyte, consisting of 144 memory cards of 32K bit capacity each. The 144 cards correspond one-to-one with the 144 IBM data and byte-parity lines in a standard IBM storage of four parallel access modules of 36 bits/word. If expansion in $1 / 4$ megabyte (256K byte) increments is required, half-populated boards (16K words $\times$ 1 bit) are provided. The bit-per-card organization, coupled with single-bit failure error correction capability, allows a faulty memory card to be replaced on-line, without interrupting the processor. The land patterns on the edge power connections are recessed relative to signal connections in order to ensure proper logical biasing when plugging cards in and out. An expansion module, with power supplies, occupies $1 / 2$ of a $31^{\prime \prime} \times 31^{\prime \prime} \times 60^{\prime \prime}$ cabinet.

The basic 32K word $\times$ 1 bit MU-37 memory card is shown in Fig. 8. The memory card is TTL compatible, and contains address decoders, clock driver, sense amplifiers, and control to read and write data at any memory word location.


Fig. 8 MU-37 Memory Card

The CU-37 control cards generate the timing signals required to read, write and refresh data in the MU-37 memory cards. Two CU-37 control cards are required for each $1 / 2$ megabyte module.

## Buffer Cards

Three types of buffer cards are used in the in-70: the BU-37 address and control buffer, the DI-37 write data buffer, and the DO-37 read data buffer.

One BU-37 card is required for each 18 memory cards, so a total of 8 are required for a basic expansion module. Four DI-37 and four DO-37 cards are required for each basic expansion module.

## Self-Test

An off-line self-test module is available for the in-70 memory, which can rapidly locate faulty or marginal devices. The self-test module simply attaches to the door of the in-70 cabinet, as shown in Fig. 9.


Fig. 9 Self-Tester Packaging

The tester is electrically connected to the memory by removing the shorting plugs and inserting "paddlecard" connectors. A number of modes of operation are possible.

The operation and versatility of the tester is best understood by describing the controls and indicators as shown in Fig. 10.


Fig. 10 Self-Tester Panel Layout

1. Data Select - The two rotary 16-position "DATA BITS" switches are hexadecimal encoded, and used to load each data byte in each selected module with the two 4-bit hexadecimal characters selected on the switches and parity. All four modules, BSM0 through BSM3, and the "BUMP" memory are loaded if the individual BSM "ERROR CHECK/BYPASS" toggle switches are in the CHECK position, and not loaded if in the BYPASS position.
2. Single Address Test - To test a single memory location, the desired address is set on the "ADDRESS FORCING" toggle switches, and the "MANUAL LOAD" toggle switch is set first to the "W/DATA" position, and then to the "R/DATA" position. The "DATA IN DISPLAY" and the "DATA OUT DISPLAY" will indicate the encoded data words selected on the DATA BITS switches.
3. Dynamic Memory Test - To test all memory locations, one or more of the three "MODE CONTROL" toggle switches ( $R / W, R / \bar{W}, W / R$ ) must be in the "AUTO" position. When the "CLOCK CONTROL" switch is set to the "RUN" position, the memory will sequentially step through all modes except those with the "BYPASS" position set on the "MODE CONTROL" switch. To single step through the memory, the "CLOCK CONTROL" switch is momentarily set to the "STEP" position for each address increment. The dynamic memory test patterns and test sequences are selected to ensure that no address or data lines are held low or high, and that no address or data lines are open or shorted together without being detected.
4. Fault Isolation - One of the major fault isolation aids is the ability to either "CHECK" or "BYPASS" any data bit or BSM, with the "ERROR DETECTION" and "ERROR" switches respectively. Any address bit may be set to "1" or "0" (ADDRESS FORCING) to address any desired segment of memory repetitively. Two BNC outputs are available for sync inputs of oscilloscope-aided tests, i.e., "SCOPE SYNC" and "ERROR OUT". "SCOPE SYNC" is programmed by setting "ADDRESS BIT SYNC" and/or "MODE SYNC" switches to a "1" or "0", which enables an output "AND" gate whenever programmed bits coincide with the switch settings. The "ERROR OUT" sync pulse occurs whenever an unmasked error is detected.
5. Unique Features - The self-test unit has several features which are helpful in locating random intermittent errors caused by marginal devices. The clock may be varied from the normal 2.0 to 2.1 µsec by setting the "CLOCK RATE" switch to the "FAST" or "SLOW" position. Typical variations are $5 \%$ to $10 \%$, although the tester may be adjusted for wider or narrower perturbations as desired. The power supplies may also be varied by setting the "VOLTAGE MARGIN" switches to the "+" or "-" position, which will vary the selected supply or supplies by $\pm 5 \%$. Perhaps the most valuable troubleshooting aids, in combination with the marginal clock and supply switches, are the "BURST" and "DISTURB" functions. When the "BURST" switch is "ON", the memory cycles are interrupted for an interval of time from 4 to 50 milliseconds determined by the "BURST FREQ" potentiometer. During the "BURST" interval, if the "DISTURB" switch is "ON", although no BSM is selected, the address lines are sequenced through addresses unrelated to the memory address at the time of the "BURST" interrupt.
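
The "SCOPE SYNC" function in item 4 is in effect a masked comparison of the current address against the switch settings. A minimal Python sketch of that comparison follows. It assumes that a switch left unprogrammed does not take part in the match, which the "and/or" wording suggests but the text does not spell out; the function name and the dictionary representation of the switches are illustrative.

```python
def scope_sync(address, switches):
    """Return True when every programmed address bit matches.

    switches maps a bit position to its programmed level (0 or 1).
    Bit positions absent from the mapping are treated as unprogrammed
    (an assumption) and are ignored, like unused inputs of the AND gate.
    """
    return all(((address >> bit) & 1) == level
               for bit, level in switches.items())


# Trigger whenever address bits 3 and 1 are 1 and bit 0 is 0.
trigger = scope_sync(0b1010, {3: 1, 1: 1, 0: 0})
```

Programming more switches narrows the trigger to fewer addresses, down to a single address when every bit is set.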

## RELIABILITY AND MAINTAINABILITY

In addition to the economics of lower initial cost and reduced size, power, and weight, the in-70 has the additional advantages of increased reliability and maintainability.

The improvement in reliability may be attributed to several factors:

1. Reduced number of interconnections: The major savings in interconnections is in the 1103 memory chip, with 1024 bits stored in a single 18 pin package, and addressed by 10 common address lines. Partially decoding 5 higher order address lines to select one of 32 devices per memory card in an $8 \times 4$ array further reduces interconnections.
2. Improved Noise Environment: All MOS memory devices are buffered with TTL compatible interface devices such that there is no mixing of low level sense lines, TTL lines, and high-level MOS clock drivers on the memory back panel. On the memory card itself, the high level (16 V) MOS clocks are on the opposite side of the board and at right angles to the low level (400 mV) sense line outputs to minimize coupling. Only four 1103 device outputs are common to a single sense line prior to buffering and translating the output to the TTL backpanel level. However, these four sense lines can access 4096 bits.
3. Silicon Gate Technology: The silicon gate technology eliminates the majority of MOS surface-related failure effects. Since the gate is not only buried in protective dielectric but is also protected by an overall oxide barrier, no contamination from the external environment can introduce surface changes. In other MOS technologies contamination has caused increases in threshold voltages and leakage currents, or field inversion effects, depending on the location and nature of the contamination.
4. System Burn-In: Because of the large number of semiconductor devices in the memory, the small percentage of infant mortality failures inherent in any semiconductor device can cause unsatisfactory performance if allowed to occur in the field. Consequently, each system is operated at elevated temperatures (typically $50^{\circ} \mathrm{C}$ to $70^{\circ} \mathrm{C}$ depending on application) under dynamic conditions for a minimum of twelve hours. Failed devices are replaced with devices which have been burned in to the same or more stringent criteria than the memory system.
5. System Qualification: Final acceptance testing of the burned-in memory system is at elevated temperature and with $\pm 5 \%$ variations in the $V_{s s}$ and $V_{c c}$ supplies. These are the conditions most likely to result in "dropping a bit." If an error is detected, the marginal device is replaced, and the testing repeated until no errors occur.
6. Power Reduction: Since the life of semiconductor devices correlates inversely with power dissipation and heat, a significantly longer life may be expected relative to an otherwise comparable magnetic or bipolar memory.

## MAINTAINABILITY FACTORS

## On-Line Repair

The single error correction capability of the IBM 370-155/165 processor is fully exploited by the in-70 architecture. When a hard error occurs, the card is readily identified by the bit number of the failure and BSM address, and may be replaced on-line, with the missing bit supplied in the interim by the processor Error Correction Code (ECC) and ECC logic.

## Rapid Off-Line Configuration and Off-Line Test

All input and output lines between the memory and the processor are wired through shorting plugs. Removing eight of these plugs interrupts all signals to a $1 / 2$ megabyte memory module. The memory self-test unit may then be plugged in for rapid fault isolation in an off-line mode, as described previously.

## Memory Architecture and Internal Buffering

Referring again to Fig. 3, the memory module is fully buffered at the processor interface, and again at each memory card. Consequently, the memory system backpanel provides an excellent intermediate starting point for troubleshooting and fault isolation. A line held low or high can be quickly found by removing the memory or control cards on the bus. Since the memory subassemblies are very orderly in their implementation, the modular structure of the memory permits the technique of transposition of memory cards for rapid fault location. Back panel wiring of address coded card positions allows double bit errors (detected by CPU diagnostics) to be translated into independent single bit errors.

## CONCLUSION

The semiconductor memory system described in this paper is not only more reliable than core memories of comparable size, but much easier to troubleshoot and repair when failures or data degradation do occur. Single bit memory errors can be repaired on-line, and fault-isolation of marginal conditions can be accomplished off-line during scheduled main-frame down time.

Refresh operations are accomplished during normal machine cycles without interfering with memory access.

## ACKNOWLEDGEMENT

The authors wish to acknowledge the self-tester design by R. Blanding, the MU card design by D. Grove, the system test efforts of N. Massa, and the support of W.M. Regitz of the Intel Components Division in developing the refresh abort capability in the 1103.

## BIBLIOGRAPHY

1. L.L. Vadasz, A.S. Grove, T.A. Rowe, and G.E. Moore, "Silicon Gate Technology" IEEE SPECTRUM 6, No. 10, 28 (1969).
2. W.M. Regitz and J. Karp, "Three-Transistor-Cell 1024 Bit 500 ns MOS RAM", IEEE JOURNAL SC-5 No. 5, 181 (1970).
3. W.M. Regitz and H.F. Bodio, "A MOS Main Memory System", HONEYWELL COMPUTER JOURNAL, VOLUME FIVE, NUMBER TWO (1971).
4. D.J. Fitzgerald, G.H. Parker, and P. Spiegel, "Reliability Studies of MOS Silicon-Gate Arrays", IEEE RELIABILITY PHYSICS SYMPOSIUM, LAS VEGAS, NEVADA March 31 to April 2, 1971.

## INTRODUCTION

The 1601 is a fully decoded 2048-bit electrically programmable MOS read-only memory whose basic nonvolatile memory element is a novel MOS charge-storage device ${ }^{1}$. In this bulletin, the results of reliability studies on this element and the 1601 itself are given. In the course of high temperature and high voltage lifetesting there have been

$$
\begin{array}{ll}
0 \text { failures in } & \sim 2,250,000 \text { unit-hours } \\
 & \sim 100,000,000 \text { bit-hours }
\end{array}
$$

## FAILURE MODES

When a 1601 memory bit is in the "0" state, charge is stored in the MOS charge-storage device; in the "1" state, no charge is stored. The device is fabricated using Intel's standard p-channel silicon-gate technology ${ }^{2}$ of proven reliability. Thus, the only reliability questions posed by use of this new MOS charge-storage device are

How stable is the " 0 " memory state?
How stable is the " 1 " memory state?

## LIFETEST RESULTS

To investigate the stability of the "0" state, discrete devices, 16-bit memory elements, and 1601's have been subjected to high temperature storage, which is the worst case condition for this state ${ }^{1}$. The results obtained to date are given in Table I.

0 FAILURES IN MORE THAN 1,500,000 UNIT-HOURS

Time-to-failure versus temperature is shown in Fig. 1, where failure is taken as a $30 \%$ change in the logic level. It is evident from this figure that for the maximum operating temperature the time-to-failure extrapolates to a value well in excess of 500 years.

Figure 1: Time-to-Failure Versus Temperature

| Unit | Temp. | No. of Units | No. of Hours | No. of Unit-Hours | No. of Failures |
| :--- | :--- | ---: | ---: | ---: | :---: |
| Discrete device | $125^{\circ} \mathrm{C}$ | 149 | 5215 | 777,035 | 0 |
| Discrete device | $125^{\circ} \mathrm{C}$ | 100 | 3835 | 383,500 | 0 |
| 16-bit Memory | $125^{\circ} \mathrm{C}$ | 100 | 3835 | 383,500 | 0 |
| 1601 | $200^{\circ} \mathrm{C}$ | 10 | 500 | 5,000 | 0 |
| 1601 | $125^{\circ} \mathrm{C}$ | 5 | 2415 | 12,075 | 0 |
| 1601 | $125^{\circ} \mathrm{C}$ | 5 | 1440 | 7,200 | 0 |

TABLE I: Summary of $125^{\circ} \mathrm{C}$ and $200^{\circ} \mathrm{C}$ Storage Lifetest results.
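
The unit-hour figures in Table I are simply the number of units multiplied by the hours on test. The short Python sketch below recomputes them from the table rows as a consistency check; the variable and function names are illustrative.

```python
# (units, hours) pairs copied from the rows of Table I, in order.
TABLE_I_ROWS = [
    (149, 5215),   # discrete device, 125 C
    (100, 3835),   # discrete device, 125 C
    (100, 3835),   # 16-bit memory, 125 C
    (10, 500),     # 1601, 200 C
    (5, 2415),     # 1601, 125 C
    (5, 1440),     # 1601, 125 C
]


def unit_hours(rows):
    """Unit-hours for each row: number of units times hours on test."""
    return [units * hours for units, hours in rows]


per_row = unit_hours(TABLE_I_ROWS)
total_unit_hours = sum(per_row)
```

The rows sum to 1,568,310 unit-hours, consistent with the headline figure of more than 1,500,000 unit-hours.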

The worst case condition for the " 1 " state is when voltage is applied to the charge-storage device. Table II summarizes the results obtained to date.

0 FAILURES IN ~650,000 UNIT-HOURS

Time-to-failure versus applied voltage is shown in Fig. 2, based upon experiments performed at voltages well in excess of the maximum allowed voltage. At the maximum allowed voltage a time-to-failure well in excess of 100 years is indicated.


Figure 2: Time-to-Failure Versus Applied Voltage

## CONCLUSION

The 1601 does not involve any changes to Intel's standard p-channel silicon-gate technology. However, it does involve a novel charge-storage device. The two possible failure modes associated with this transistor are

$$
\begin{aligned}
& \text { loss of " } 0 \text { " level } \\
& \text { loss of " } 1 \text { " level }
\end{aligned}
$$

In this bulletin, results of lifetests designed for the worst case conditions for the above failure modes ... high temperature for loss of " 0 " level, and high voltage for loss of " 1 " level . . . are given. In all there have been 0 failures in almost 100 million bit-hours.

## REFERENCES

1. D. Frohman-Bentchkowsky, "A Fully Decoded 2048 Bit Electrically Programmable MOS Read Only Memory," paper to be presented at the 1971 ISSCC, Philadelphia, Feb. 17-19, 1971.
2. L. L. Vadasz, A. S. Grove, T. A. Rowe, and G. E. Moore, "Silicon-Gate Technology," IEEE Spectrum 6, 28, 1969.

| Unit | No. of <br> Units | No. of <br> Hours | No. of <br> Unit-Hours | No. of <br> Failures |
| :--- | ---: | ---: | ---: | ---: |
| Discrete device | 10 | 3,835 | 38,350 | 0 |
| 12-bit Memory | 102 | 6,000 | 612,000 | 0 |
| 1601 | 10 | 2,000 | 20,000 | 0 |
| 1601 | 4 | 1,000 | 4,000 | 0 |
| 1601 | 6 | 500 | 3,000 | 0 |

TABLE II: Summary of applied bias Lifetest results.

# RELIABILITY REPORT - 1103 SILICON GATE MOS LSI RAM

## I. INTRODUCTION

The Intel 1103, 1024 bit dynamic, P channel silicon gate MOS random access memory has become one of the most important semiconductor components to be introduced in many years. Offering significant advantages over cores in cost, system flexibility and performance, this product is fast becoming the industry standard. In addition to all of the design features, the 1103 is also an extremely reliable device. This report was prepared to present the data to substantiate the reliability of the 1103.

To date, based on the equivalent of $8,663,000$ unit hours of system lifetests, a best estimate failure rate of $0.008 \%$ per 1000 hours at $55^{\circ} \mathrm{C}$ has been calculated for the 1103.

## II. ELECTRICAL TESTING

## A. Lifetesting

In order to assure the electrical reliability of the 1103 die itself, the following three categories of lifetests were carried out. ${ }^{(1)}$

## 1. System Lifetests

## a. Continuous System Lifetests

Four operating 1103 systems, each built around a 4096 word $\times 18$ bit memory board, are presently on lifetest. Each system consists of a memory board, with 72 plastic 1103's, assorted drivers, refresh control and clock generator, and an exerciser board which provides the addressing, data generation, and error detection. The exerciser board generates an essentially random data pattern of 13 bits plus 5 parity bits. The read or write mode is also controlled randomly. The output from the memory is read through the parity tree and errors are decoded to identify a failed unit.

Table I summarizes the system lifetest results to date.

All systems are operating at $\mathrm{V}_{\mathrm{SS}}=+16$ volts, $\mathrm{V}_{\mathrm{BB}}-\mathrm{V}_{\mathrm{SS}}=+3.5$ volts, with a 650 nsec cycle time. Periodically the system clocking is slowed to verify the 2 msec refresh time. Figure 1 shows the voltage margin plot on System \#2 initially and at present.

## b. Rotating System Lifetest

As a means of assuring the continuing reliability of the 1103, a fifth system is in operation on a rotating system basis. The system is also a 4096 word by 18 bit system operating as above. Twelve units are added weekly and allowed to remain for 6 weeks or 1000 hours. The initial results as of Work Week \#47 are reported in Table 2. The one degradation failure was an increase in $\mathrm{I}_{\mathrm{BB}}$ current.

Figure 1. 1103 System Voltage Margin Plot $@ T_{A}=25^{\circ} \mathrm{C}$

|  | $\mathrm{T}_{\mathrm{A}}$ | UNITS | HOURS AT Nov. 22, 1971 | EQUIVALENT UNIT HRS. @ $55^{\circ} \mathrm{C}$ (1 eV) | FAILURES ${ }^{(2)}$ (DC, AC, FUNCTIONALITY): CATASTROPHIC | FAILURES ${ }^{(2)}$ (DC, AC, FUNCTIONALITY): DEGRADATION |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| SYSTEM \#1 | $70^{\circ} \mathrm{C}$ | 72 | 8000 | 5,750,000 | 0 | 0 |
| SYSTEM \#2 | $55^{\circ} \mathrm{C}$ | 72 | 1000 | 72,000 | 0 | 0 |
|  | $70^{\circ} \mathrm{C}$ | 72 | 1110 | 799,000 | 0 | 0 |
| SYSTEM \#3 | $55^{\circ} \mathrm{C}$ | 72 | 1250 | 90,000 | 0 | 0 |
|  | $70^{\circ} \mathrm{C}$ | 72 | 1100 | 792,000 | 0 | 0 |
| SYSTEM \#4 | $70^{\circ} \mathrm{C}$ | 72 | 500 | 360,000 | 0 | 0 |
|  |  |  |  | 7,763,000 |  |  |

Table 1.

|  | $\mathrm{T}_{\mathrm{A}}$ | UNITS | HOURS | EQUIVALENT UNIT HRS. @ $55^{\circ} \mathrm{C}$ (1.0 eV) | FAILURES ${ }^{(2)}$ (DC, AC, FUNCTIONALITY): CATASTROPHIC | FAILURES ${ }^{(2)}$ (DC, AC, FUNCTIONALITY): DEGRADATION |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| WW 37 | $70^{\circ} \mathrm{C}$ | 12 | 1000 | 120,000 | 0 | 0 |
| WW 38 | $70^{\circ} \mathrm{C}$ | 12 | 1000 | 120,000 | 0 | 0 |
| WW 39 | $70^{\circ} \mathrm{C}$ | 12 | 1000 | 120,000 | 0 | 1 |
| WW 40 | $70^{\circ} \mathrm{C}$ | 12 | 1000 | 120,000 | 0 | 0 |
| WW 41 | $70^{\circ} \mathrm{C}$ | 12 | 1000 | 120,000 | 0 | 0 |
| WW 42 | $70^{\circ} \mathrm{C}$ | 12 | 1000 | 120,000 | 0 | 0 |
| WW 43 | $70^{\circ} \mathrm{C}$ | 12 | 1000 | 120,000 | 0 | 0 |
| WW 44 | $70^{\circ} \mathrm{C}$ | 12 | 1000 | 120,000 | 0 | 0 |
| WW 45 | $70^{\circ} \mathrm{C}$ | 12 | 1000 | 120,000 | 0 | 0 |
| WW 46 | $70^{\circ} \mathrm{C}$ | 12 | 1000 | 120,000 | 0 | 0 |

Table 2.

## c. Summary of System Lifetests

The data in Tables 1 and 2 represents the equivalent of $8,663,000$ unit hours of system operation giving a best estimate failure rate of $0.008 \%$ per 1000 hours at $55^{\circ} \mathrm{C}$.
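
The report does not state how the best estimate failure rate was derived. One common convention that reproduces the quoted figure is the zero-failure estimate at a 50% confidence level under a constant failure rate, $\lambda = -\ln(1 - C) / T$. The Python sketch below applies that convention as an assumption, with illustrative names; it is not presented as the method actually used for this report.

```python
import math


def zero_failure_rate_percent_per_1000h(unit_hours, confidence=0.5):
    """Upper bound on a constant failure rate after zero observed failures.

    Assumes exponentially distributed lifetimes, so that the bound at the
    given confidence level C is lambda = -ln(1 - C) / T per hour.
    The result is scaled to percent per 1000 hours.
    """
    per_hour = -math.log(1.0 - confidence) / unit_hours
    return per_hour * 1000.0 * 100.0


rate = zero_failure_rate_percent_per_1000h(8_663_000)
```

With 8,663,000 equivalent unit hours this evaluates to roughly 0.008% per 1000 hours, matching the figure quoted above.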

## 2. HTRB (High Temperature Reverse Bias)

A second category of lifetesting, High Temperature Reverse Bias, consists of DC biasing the chip for extended periods of time at elevated temperature. It is the most common form of lifetesting and is used to detect the presence of thermally activated failure modes. Figure 2 shows the bias configuration used for the 1103 HTRB lifetests.

In Table 3 we present HTRB data on the 1103. The units reported cover samples from the past 6 months of production material. The one degradation failure was an increase in input leakage current.

Figure 2. Bias Configuration for Intel 1103 HTRB

## 3. Storage

A third common lifetest is extended high temperature storage. This lifetest is generally aimed at mechanical reliability and process stability.

|  | $\mathrm{T}_{\mathrm{A}}$ | UNITS | HOURS | EQUIVALENT UNIT HRS. @ $55^{\circ} \mathrm{C}$ (1 eV) | FAILURES ${ }^{(2)}$ (DC, AC, FUNCTIONALITY) |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| HTRB | $70^{\circ} \mathrm{C}$ | 28 | 7350 | 2,060,000 | 0 | 0 |
| HTRB | $125^{\circ} \mathrm{C}$ | 20 | 1000 | 2,000,000 | 0 | 0 |
| HTRB ${ }^{(1)}$ | $125^{\circ} \mathrm{C}$ | 27 | 6300 | 17,000,000 | 0 | 0 |
| HTRB ${ }^{(1)}$ | $125^{\circ} \mathrm{C}$ | 105 | 1000 | 10,500,000 | 0 | 1 |

Table 3.

|  | $\mathrm{T}_{\mathrm{A}}$ | UNITS | HOURS | EQUIVALENT UNIT HRS. @ $55^{\circ} \mathrm{C}$ (1 eV) | FAILURES (DC, AC, FUNCTIONALITY) |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Storage | $70^{\circ} \mathrm{C}$ | 15 | 7000 | 1,050,000 | 0 | 0 |
| Storage | $125^{\circ} \mathrm{C}$ | 31 | 5700 | 17,700,000 | 0 | 0 |
| Storage ${ }^{(1)}$ | $160^{\circ} \mathrm{C}$ | 34 | 1000 | 34,000,000 | 0 | 0 |

(1) Ceramic package.

Table 4.

Table 4 summarizes storage data on the 1103.
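The equivalent unit hours in Tables 3 and 4 can be related to the raw test exposure with simple arithmetic. The sketch below multiplies units by hours for each row and divides the tabulated equivalent figure by that product to recover the time multiplier implied by the tables. The row data are copied from the tables; the variable and function names are illustrative.

```python
# (test, ambient temp in deg C, units, hours, tabulated equivalent unit hours at 55 deg C)
ROWS = [
    ("HTRB", 70, 28, 7350, 2_060_000),
    ("HTRB", 125, 20, 1000, 2_000_000),
    ("HTRB (ceramic)", 125, 27, 6300, 17_000_000),
    ("HTRB (ceramic)", 125, 105, 1000, 10_500_000),
    ("Storage", 70, 15, 7000, 1_050_000),
    ("Storage", 125, 31, 5700, 17_700_000),
    ("Storage (ceramic)", 160, 34, 1000, 34_000_000),
]


def implied_multiplier(units, hours, equivalent_unit_hours):
    """Ratio of tabulated equivalent unit hours to actual unit hours on test."""
    return equivalent_unit_hours / (units * hours)


multipliers = []
for test, temp_c, units, hours, equivalent in ROWS:
    m = implied_multiplier(units, hours, equivalent)
    multipliers.append(m)
    print(f"{test:18s} {temp_c:3d} C  {units * hours:9,d} unit h  x{m:7.1f}")
```

The tabulated values correspond to multipliers of roughly 10 at $70^{\circ} \mathrm{C}$, 100 at $125^{\circ} \mathrm{C}$, and 1000 at $160^{\circ} \mathrm{C}$, each relative to $55^{\circ} \mathrm{C}$.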

## B. End-of-Life Data

Virtually all active semiconductors tend to undergo some change in their basic parameters during life; however, the basic reliability of the silicon gate process has been well established through numerous lifetests ${ }^{(a)}$. For example, ionic contamination can result in degradation of p-n junction characteristics, which will be evidenced by increased input leakage or major changes in refresh times. Likewise, surface charging can lead to leakage increases and loss of isolation between nodes. The effects of surface charging or ionic contamination are generally catastrophic to the operation of a dynamic array and are thus readily detectable. The primary effect on end-of-life data comes from the positive charge accumulation at the oxide-silicon interface. This inherent phenomenon results in slight increases in device threshold voltages. Its effect on MOS devices and arrays has been well documented ${ }^{(a, b, c, d, e)}$.

Figure 3. Plot of Time vs Temperature for Threshold Drift

These charge migration failure modes are thermally activated; that is, one can trade time for temperature. Figure 3 shows a plot of time vs temperature for threshold drift ( 1 eV ) and ion migration ( 1.4 eV ) activation energies. Note that 1000 hours at $125^{\circ} \mathrm{C}$ is equivalent to 100,000 or $1,000,000$ hours at $55^{\circ} \mathrm{C}$, depending on the failure mechanism.

A group of 20 devices was subjected to a 1000 hour HTRB lifetest. The units were on the static bias configuration illustrated in Figure 2. They were connected in parallel and stressed at $125^{\circ} \mathrm{C}$. Table 5 presents the initial and final average leakage current readings by pin as well as deviations and worst case values.

Table 6 shows the initial and final power supply current values as well as standard deviation and worst case values.

The results show that both power drain and leakage currents are stable with only one pin showing an increase to 130 nA (spec. limit is $1 \mu \mathrm{~A}$ ) after 1000 hours.
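As a quick check of the leakage claim, the sketch below compares the final worst-case (MAX) reading for each pin in Table 5 against the $1 \mu \mathrm{A}$ spec limit. The readings are copied from the FINAL MAX column of Table 5; the $\mathrm{I}_{\mathrm{BB}}$ row is left out because it is a supply current, not a pin leakage. The names used are illustrative.

```python
SPEC_LIMIT_NA = 1000.0  # 1 microamp spec limit, expressed in nanoamps

# Final worst-case leakage per pin after 1000 hours of HTRB (Table 5), in nanoamps
FINAL_MAX_NA = {
    "A0": 23, "A1": 0, "A2": 12, "A3": 13, "A4": 0,
    "A5": 0, "A6": 0, "A7": 0, "A8": 130, "A9": 0,
    "R/W": 15, "DI": 0, "DO": 16, "PRECH": 12, "CENBL": 11,
}

# Pin with the largest final reading, and its margin to the spec limit
worst_pin = max(FINAL_MAX_NA, key=FINAL_MAX_NA.get)
worst_na = FINAL_MAX_NA[worst_pin]
margin = SPEC_LIMIT_NA / worst_na

all_within_spec = all(value < SPEC_LIMIT_NA for value in FINAL_MAX_NA.values())

print(f"worst pin: {worst_pin} at {worst_na} nA, {margin:.1f}x below the limit")
print(f"all pins within spec: {all_within_spec}")
```

The worst reading, on pin $\mathrm{A}_{8}$, is still well below the limit.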

The output current or $\mathrm{I}_{\mathrm{OH}}$ results for the lifetest are included in Table 6 and prove to be stable.

All readings were taken with a Teradyne J259 system.

|  | INITIAL (in nanoamps) |  |  | FINAL (in nanoamps) |  |  | MAX $\Delta$ <br> (in nanoamps) |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  | AVG | STD DEV | MAX | AVG | STD DEV | MAX |  |
| $\mathrm{A}_{0}$ | 0 | 0 | 0 | 16.5 | 4.0 | 23 | +23 |
| $\mathrm{A}_{1}$ | 3.4 | 4.2 | 15 | 0 | 0 | 0 | -15 |
| $\mathrm{A}_{2}$ | 0 | 0 | 0 | 0.6 | 2.2 | 12 | +12 |
| $\mathrm{A}_{3}$ | 5.3 | 6.8 | 16 | 3.4 | 4.7 | 13 | -14 |
| $\mathrm{A}_{4}$ | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| $\mathrm{A}_{5}$ | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| $\mathrm{A}_{6}$ | 3.8 | 4.1 | 15 | 0 | 0 | 0 | -15 |
| $\mathrm{A}_{7}$ | 0.5 | 1.5 | 10 | 0 | 0 | 0 | -10 |
| $\mathrm{A}_{8}$ | 0 | 0 | 0 | 5.2 | 28.0 | 130 | +130 |
| $\mathrm{A}_{9}$ | 1.2 | 3.0 | 14 | 0 | 0 | 0 | -14 |
| R/W | 5.4 | 3.0 | 16 | 7.7 | 4.0 | 15 | +13 |
| $\mathrm{D}_{\mathrm{I}}$ | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| $\mathrm{D}_{\mathrm{O}}$ | 0 | 0 | 0 | 4.8 | 5.5 | 16 | +16 |
| PRECH | 3.6 | 5.1 | 12 | 7.9 | 6.2 | 12 | +13 |
| CENBL | 0.5 | 1.2 | 1 | 1.1 | 1.8 | 11 | +11 |
| $\mathrm{I}_{\mathrm{BB}}$ | 0.5 | 0.2 | 3 | 1.35 | 0.4 | 7 | +.7 |

Table 5. Leakage Current and $\mathrm{I}_{\mathrm{BB}}$ Readings Before and After 1000 Hour HTRB

|  | INITIAL (in milliamps) |  |  | FINAL (in milliamps) |  |  | MAX $\Delta$ <br> (in milliamps) |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  | AVG | STD DEV | MAX | AVG | STD DEV | MAX |  |
| $\mathrm{I}_{\mathrm{DD1}}$ | 33.0 | 1.5 | 36.6 | 32.6 | 1.4 | 36.2 | -1.5 |
| $\mathrm{I}_{\mathrm{DD2}}$ | 36.5 | 2.2 | 40.7 | 36.1 | 2.1 | 40.2 | -0.9 |
| $\mathrm{I}_{\mathrm{DD3}}$ | 5.98 | 0.6 | 6.6 | 5.92 | 0.6 | 6.5 | -0.2 |
| $\mathrm{I}_{\mathrm{DD4}}$ | 2.01 | 0.2 | 2.15 | 2.01 | 0.2 | 2.16 | -.03 |
| $\mathrm{I}_{\mathrm{OH}}$ | 1.19 | 0.2 | 1.09 (Min.) | 1.19 | 0.2 | 1.08 (Min.) | -.02 |

Table 6. Power Supply Current and $\mathrm{I}_{\mathrm{OH}}$ Readings Before and After 1000 Hour HTRB

## C. Refresh

The basic operation of the 1103 depends on the retention of charge on a storage node for a certain length of time. Figure 4 shows the cell structure indicating the location of the storage node.

The primary charge loss mechanism from the storage node is the leakage current of the p-n junction associated with that node. The length of time the charge can be retained at a useful level is termed the refresh time. Figures 5 and 6 show the refresh times for the first fail and second fail bits before and after a 50 hour $160^{\circ} \mathrm{C}$ (equivalent to 1000 hours at $125^{\circ} \mathrm{C}$) system burn-in.
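The dependence of refresh time on junction leakage can be illustrated with a first-order estimate: if a constant leakage current $I$ discharges a node of capacitance $C$, the time to lose a voltage $\Delta V$ is $t = C \, \Delta V / I$. The sketch below is only an illustration of that relation. The capacitance, allowable voltage loss, and leakage values are hypothetical and are not taken from the handbook or from 1103 data.

```python
def retention_time_s(c_node_farads, delta_v_volts, i_leak_amps):
    """Time for a constant leakage current to discharge a node by delta_v.

    First-order model only: t = C * dV / I, with the leakage treated as constant.
    """
    if i_leak_amps <= 0:
        raise ValueError("leakage current must be positive")
    return c_node_farads * delta_v_volts / i_leak_amps


# Hypothetical illustration values (NOT from the handbook)
C_NODE = 0.1e-12   # storage node capacitance, farads
DELTA_V = 2.0      # voltage loss tolerated before the stored level is unusable, volts
I_LEAK = 10e-12    # junction leakage current, amps

t_nominal = retention_time_s(C_NODE, DELTA_V, I_LEAK)
t_doubled_leakage = retention_time_s(C_NODE, DELTA_V, 2 * I_LEAK)

print(f"nominal retention: {t_nominal * 1e3:.1f} ms")
print(f"with doubled leakage: {t_doubled_leakage * 1e3:.1f} ms")
```

In this model the retention time is inversely proportional to the leakage current, which is why an increase in junction leakage shows up directly as a shorter refresh time.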

## D. Voltage Margin

In the operation of the 1103, two supply voltages are involved: $\mathrm{V}_{\mathrm{DD}}$, the voltage to operate the device, and $\mathrm{V}_{\mathrm{BB}}$, the substrate potential. Functionality is guaranteed for $\mathrm{V}_{\mathrm{DD}} \pm 5 \%$ and $\mathrm{V}_{\mathrm{BB}} = 3$ to 4 volts.

The range of supply voltages for operation is defined as the voltage margins of the device. Rather than attempting to present individual devices, the voltage margin plot of a 4096 word by 8 bit system was presented in Figure 1. The final reading is after 1000 hours of operation at $55^{\circ} \mathrm{C}$ plus 1000 hours at $70^{\circ} \mathrm{C}$. The system plot defines the worst case device at each measurement point. Other system tests and discrete devices also show the same stability.

Figure 4. Basic RAM Dynamic Cell

Figure 5. First Fail

## III. PLASTIC MECHANICAL

## A. Moisture Failure Modes

One of the primary functions of an integrated circuit package is to prevent moisture penetration to the die itself. Such penetration leads to eventual corrosion of the aluminum metallization of the array. In the case of packages with internal cavities, screens such as Fine and Gross Leak tests have been developed to demonstrate a hermetic seal which guarantees against moisture penetration. Unfortunately, non-destructive screens as simple as these have not been developed for plastic packages. The most popular test is the so-called "pressure cooker" test, i.e., exposure to 30 psia steam. For example, one large computer manufacturer who relies very heavily on plastic encapsulated devices specifies that the devices must be capable of surviving 6 hours of exposure to 30 psia steam. Intel's standard silicone lot qualification also includes 30 psia steam testing. Lots failing the initial qualification or subsequent monitors are rejected and new qualified material is put into production.

## 1. Autoradiographic Analysis: 0 Fail/10 Devices

A procedure has recently become available which detects minute amounts of moisture penetration, below the levels necessary to cause corrosion, as well as the entry path of moisture. The technique involves the use of radioactive Zn as a tracer in an aqueous solution. After exposure to this environment at 30 psia, the units are sanded to remove excess plastic and placed on X-ray film. Moisture paths into the chip will then show on the film. Figure 7 shows two such film plates from a group of ten 1103's which were subjected to 24 hours exposure at 30 psia to the aqueous solution. The autoradiogram clearly shows no evidence of moisture penetration. This was true for all 10 devices.

Figure 6. Second Fail


Figure 7. Results of Autoradiographic Analysis on 2 1103's

## 2. Mil Std 883 Method 1004: 0 Fail/50 Devices

A commonly utilized test, taking somewhat longer than the pressure cooker, is Method 1004 of Mil Std 883. This moisture test involves 10 cycles of 1 day each at various temperature/humidity conditions with bias. Figure 8 outlines one such cycle.


Figure 8. Graphical Representation of Moisture-Resistance Test-MIL STD-883, Method 1004-2

A group of 50 1103's representing six fabrication lots was subjected to the Method 1004 test and resulted in no functional, AC, DC, or mechanical failures.

## B. Bond-Related Failure Modes

Bond failures (opens and thermal intermittence) can result from the differences in thermal expansion coefficients between the plastic, the leads, and the lead frame/die system. These differences set up stresses as shown schematically in Figure 9. The magnitude of these stresses depends on the design of the lead frame, the type of lead bonding employed, the type of plastic, and the conditions under which it was molded and cured.

## 1. Temp Cycling ($-55^{\circ} \mathrm{C}$ to $150^{\circ} \mathrm{C}$): 1 Fail/26,000 Unit Cycles

Intel uses temperature cycling as a continuing process monitor on all products encapsulated in plastic. Units are selected from current production lots and are subjected to temperature extremes of $-55^{\circ} \mathrm{C}$ to $150^{\circ} \mathrm{C}$. For example, during the past 3 months, 260 P1103's from 26 production lots were each subjected to 100 temperature cycles with only one bond failure.

Figure 9. Schematic Representation of Stresses Set Up in a Plastic Package

## 2. Thermal Shock ($-65^{\circ} \mathrm{C}$ to $+125^{\circ} \mathrm{C}$): 0 Fail/6250 Unit Cycles

Another commonly utilized test of bond integrity is thermal shock. The units are placed in a $125^{\circ} \mathrm{C}$ bath of FC40, soaked 5 minutes, transferred (less than 10 seconds) to an alcohol dry ice mixture, and soaked 5 minutes. A group of 25 P1103's was subjected to 250 such cycles. The test resulted in no failures.

## 3. Power Cycling ($\mathrm{T}_{\mathrm{J}} = 25^{\circ} \mathrm{C}$ to $110^{\circ} \mathrm{C}$): 0 Fail/2,112,000 Unit Cycles

A less severe but more realistic test of bond integrity is power cycling, where the unit is alternately powered and cooled. The bias voltages are adjusted to bring the chip to about $110^{\circ} \mathrm{C}$ within 2.5 minutes, and then a fan is used to return the unit to room temperature in the next 2.5 minutes. Table 7 shows the results of extended power cycling for two groups of 1103's.

|  | $\mathrm{T}_{\mathrm{J}}$ | UNITS | HOURS | CYCLES | FAILURES <br> (FUNCTIONAL, AC, DC, <br> CONTINUITY OVER TEMP.) |
| :---: | :---: | :---: | :---: | :---: | :---: |
| Group A | $25^{\circ} \mathrm{C}-110^{\circ} \mathrm{C}$ | 16 | 6000 | 72,000 | 0 |
| Group B | $25^{\circ} \mathrm{C}-110^{\circ} \mathrm{C}$ | 16 | 7000 | 84,000 | 0 |

Table 7.

## 4. Storage ($125^{\circ} \mathrm{C}$: 0 Fail/790,000 Unit Hours; $160^{\circ} \mathrm{C}$: 0 Fail/390,000 Unit Hours)

Another test of bond integrity is extended high temperature storage. 243 non-functional units were stored at elevated temperature as summarized in Table 8. Note that only continuity of bonds over the range $0^{\circ} \mathrm{C}$ to $90^{\circ} \mathrm{C}$ was measured at each readout.

|  | STORAGE TEMPERATURE | UNITS | HOURS | FAILURES <br> (CONTINUITY) |
| :--- | :---: | :---: | :---: | :---: |
| Group A | $125^{\circ} \mathrm{C}$ | 158 | 5000 | 0 |
| Group B | $160^{\circ} \mathrm{C}$ | 78 | 5000 | 0 |

Table 8.
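The exposure totals quoted in the bond-integrity headings follow from simple products of units and cycles (or hours) per unit. The sketch below reproduces the temperature cycling, thermal shock, and storage totals, and the per-unit cycle counts of Table 7 from the 5 minute power cycle described in the text. Function and variable names are illustrative.

```python
def exposure_total(units, per_unit):
    """Total exposure: number of units times cycles (or hours) seen by each unit."""
    return units * per_unit


def power_cycles(hours, minutes_per_cycle=5.0):
    """Cycles completed per unit: 2.5 min heating plus 2.5 min cooling per cycle."""
    return round(hours * 60.0 / minutes_per_cycle)


temp_cycling_unit_cycles = exposure_total(260, 100)   # 260 P1103's, 100 cycles each
thermal_shock_unit_cycles = exposure_total(25, 250)   # 25 P1103's, 250 cycles each
storage_125c_unit_hours = exposure_total(158, 5000)   # Table 8, Group A
storage_160c_unit_hours = exposure_total(78, 5000)    # Table 8, Group B

group_a_cycles = power_cycles(6000)  # Table 7, Group A hours on test
group_b_cycles = power_cycles(7000)  # Table 7, Group B hours on test

print(temp_cycling_unit_cycles, thermal_shock_unit_cycles)
print(storage_125c_unit_hours, storage_160c_unit_hours)
print(group_a_cycles, group_b_cycles)
```

These reproduce the 26,000 and 6250 unit cycles, the 790,000 and 390,000 unit hours, and the 72,000 and 84,000 cycles per unit listed in Table 7.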

## REFERENCES

a. D.J. Fitzgerald, G.H. Parker, and P. Spiegel, "Reliability Studies of MOS Si-Gate Arrays", 1971 IEEE Reliability Physics Symposium, Las Vegas, Nev., March 31 - April 2, 1971.
b. A. Goetzberger, Recent News Paper, Electrochemical Society Meeting, May 1966.
c. B.E. Deal, M. Sklar, A.S. Grove, and E.H. Snow, "Characteristics of the Surface-State Charge ($Q_{SS}$) of Thermally Oxidized Silicon", J. Electrochem. Soc., Solid State Science, 114, 266 (1967).
d. S.R. Hofstein, "Stabilization of MOS Devices", Solid State Electronics, 10, 657 (1967).
e. R.H. Reynolds, "The Response of the Threshold Voltages of the Transistors in Simple MOS Circuits to Tests at Elevated Temperatures", 1971 IEEE Reliability Physics Symposium, Las Vegas, Nev., March 31 - April 2, 1971.

## Random Access Memories

|  | Type | No. of Bits | Description | Organization | Electrical Characteristics over Temperature |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  |  |  | Access Time Max. | Cycle Time Max. | Power Dissipation Max. | Supplies [V] |
|  | 1101A | 256 | Static Fully Decoded | $256 \times 1$ | $1.5\ \mu\mathrm{s}$ | $1.5\ \mu\mathrm{s}$ | 685 mW | +5, -9 |
|  | 1101A1 | 256 | Hi-speed Static Fully Decoded | $256 \times 1$ | $1.0\ \mu\mathrm{s}$ | $1.0\ \mu\mathrm{s}$ | 685 mW | +5, -9 |
|  | 1103 | 1024 | Dynamic Fully Decoded | $1024 \times 1$ | 300 ns | 580 ns | 400 mW | +16, +19 |
|  | 1103-1 | 1024 | Dynamic Fully Decoded | $1024 \times 1$ | 150 ns | 340 ns | 400 mW | +19, +22 |
|  | 2102 | 1024 | Static Fully Decoded | $1024 \times 1$ | $1.0\ \mu\mathrm{s}$ | $1.0\ \mu\mathrm{s}$ | 350 mW | +5 |
|  | 3101 | 64 | Fully Decoded | $16 \times 4$ | 60 ns | 60 ns | 525 mW | +5 |
|  | 3101A | 64 | Hi-speed Fully Decoded | $16 \times 4$ | 36 ns | 35 ns | 525 mW | +5 |
|  | 3106 | 256 | Hi-speed Fully Decoded (With 3-state Output) | $256 \times 1$ | 80 ns | 80 ns | 650 mW | +5 |
|  | 3106A | 256 | Hi-speed Fully Decoded (With 3-state Output) | $256 \times 1$ | 60 ns | 70 ns | 650 mW | +5 |
|  | 3107 | 256 | Hi-speed Fully Decoded (With Open Collector Output) | $256 \times 1$ | 80 ns | 80 ns | 650 mW | +5 |
|  | 3107A | 256 | Hi-speed Fully Decoded (With Open Collector Output) | $256 \times 1$ | 60 ns | 70 ns | 650 mW | +5 |
|  | 3104 | 16 | Content Addressable Memory | $4 \times 4$ | 30 ns | 40 ns | 625 mW | +5 |

## Read Only Memories

|  | Type | No. of Bits | Description | Organization | Electrical Characteristics over Temperature |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  |  |  | Access Time Max. | Power Dissipation* Max. | Supplies [V] |
|  | 1602A | 2048 | Electrically Programmable (Static) | $256 \times 8$ | $1.0\ \mu\mathrm{s}$ | 700 mW | +5, -9 |
|  | 1702A | 2048 | Erasable Electrically Programmable (Static) | $256 \times 8$ | $1.0\ \mu\mathrm{s}$ | 700 mW | +5, -9 |
|  | 1302 | 2048 | Mask Programmable (Static) | $256 \times 8$ | $1.0\ \mu\mathrm{s}$ | 700 mW | +5, -9 |
|  | 3301A | 1024 | High Speed, Mask Programmable | $256 \times 4$ | 45 ns | 625 mW | +5 |
|  | 3304 | 4096 | High Speed, High Density | $1024 \times 4$ or $512 \times 8$ | 65 ns | 875 mW | +5 |
|  | 3601 | 1024 | High Speed Electrically Programmable | $256 \times 4$ | 70 ns | 650 mW | +5 |

## Shift Registers

|  | Type | No. of Bits | Description | Electrical Characteristics over Temperature |  |  |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  |  | Data Rep. Rate |  | Power Dissipation* Max. | Input Output Levels | Clock Levels | Supplies [V] |
|  |  |  |  | Min. | Max. |  |  |  |  |
|  | 1402A | 1024 | Quad 256 Bit Dynamic | 10 kHz | 5 MHz | 500 mW | TTL | MOS/TTL | 5, -5 or 5, -9 |
|  | 1403A | 1024 | Dual 512 Bit Dynamic | 10 kHz | 5 MHz | 500 mW | TTL | MOS/TTL | 5, -5 or 5, -9 |
|  | 1404A | 1024 | 1024 Bit Dynamic | 10 kHz | 5 MHz | 500 mW | TTL | MOS/TTL | 5, -5 or 5, -9 |
|  | 1405A | 512 | Dynamic Recirculating | 10 kHz | 2 MHz | 400 mW | TTL | MOS/TTL | 5, -5 or 5, -9 |
|  | 1506** | 200 | Dual 100 Bit Dynamic | 6 kHz | 2 MHz | 110 mW | TTL | MOS | +5, -5 |
|  | 1507** | 200 | Dual 100 Bit Dynamic (20 k$\Omega$ output) | 6 kHz | 2 MHz | 110 mW | TTL | MOS | +5, -5 |
|  | 2401 | 2048 | Dual 1024 Bit Dynamic Recirculating | 25 kHz | 1 MHz | 350 mW | TTL | TTL | +5 |
|  | 2405 | 1024 | 1024 Bit Dynamic Recirculating | 25 kHz | 1 MHz | 350 mW | TTL | TTL | +5 |

** The 1506 and 1507 are also available in military temperature range ($-55^{\circ} \mathrm{C}$ to $+125^{\circ} \mathrm{C}$). To order specify 1406 or 1407, respectively.

## Memory Peripherals

|  | Type | Description | Electrical Characteristics over Temp. |  | Supplies [V] |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  | Input to Output Delay Max. | Power Dissipation* Max. |  |
|  | 3205 | 1 of 8 Binary Decoder | 18 ns | 350 mW | +5 |
|  | 3207A | Quad Bipolar to MOS Level Shifter and Driver | 25 ns | 925 mW | +5 |
|  | 3208A | Hex Sense Amp for MOS Memories | 20 ns | 600 mW | +5 |
|  | 3404 | High Speed 6 Bit Latch | 12 ns | 375 mW | +5 |
|  | 3408A | Hex Sense Amp and Latch for MOS Memories | 25 ns | 625 mW | +5 |



Intel wishes to give acknowledgement to the following publications in which certain sections of this handbook originally appeared.

Computer Design, June 1970
Electronic Design, January 20, 1972
Electronic Design, February 17, 1972
Electronic Design, March, 1972

EDN, January 15, 1971
EDN, August 1973
Electronic Engineer, June 1970
Electronics, August 3, 1970

## Coming...

Several significant new products were announced by Intel in the third quarter of 1973. The announced products included RAMs which ranged in size from 1024 to 4096 bits. The 1024 bit RAMs were the 1103A and 2105. The 1103A is electrically pin compatible with the 1103; however, it does not require a precharge clock. The 2105 is a high speed N-channel dynamic RAM with an access time of less than 100 nsec. A low cost, high bit density RAM (4096 bits), the 2107, was announced in July. Like the 2105, the 2107 is fabricated using N-channel silicon gate technology.

The above devices, along with their peripheral circuits, will be some of the new devices discussed in the supplement to this handbook. If you have any comments or applications which you would like included in the handbook supplement, please fill in and return the attached card to Intel.

## READERS COMMENTS - INTEL MEMORY DESIGN HANDBOOK

I. Primary Product Interest
$\square$ Microcomputers
$\square$ RAMs
$\square$ ROMs/PROMs
$\square$ Shift Registers
$\square$ Others $\qquad$
II. I would like to see the following included in the next handbook:
$\qquad$
$\qquad$
$\qquad$
$\qquad$
III. My comments on your handbook are:
$\qquad$
$\qquad$
IV. Name (optional)

Company (optional)

## BUSINESS REPLY CARD

No postage stamp necessary if mailed in the United States

Postage will be paid by:

3065 Bowers Avenue Santa Clara, California 95051

Attn: Product Marketing


EUROPEAN MARKETING HEADQUARTERS

## BELGIUM

Jens Paulsen
Intel Office
216 Avenue Louise
492003, Telex: 846-21060
*Bruxelles 1050

EUROPEAN MARKETING OFFICES

| FRANCE | ENGLAND |
| :--- | :--- |
| Bernard Giroud | Keith Chapple |
| Intel Office | Intel Office |
| Cidex R-141 | Broadfield House |
| (1) 677-60-75, Telex: 842-27475 | 4 Between Towns Road |
| *94-534 Rungis | 771431, Telex: 851-837203 |
|  | *Cowley, Oxford |

## GERMANY

Erling Holst
Intel Office
Wolfratshauserstrasse 169
798923, Telex: 841-212870
*D8 Munchen 71

## INTERNATIONAL DISTRIBUTORS

| ORIENT MARKETING | AUSTRALIA | FINLAND | ISRAEL | SOUTH AFRICA |
| :---: | :---: | :---: | :---: | :---: |
| HEADQUARTERS | A.J. Ferguson (Adelaide) PTY. Ltd. 125 Wright Street | Havulinna Oy P.O. Box 468 | Telsys Ltd. <br> 54, Jabotinsky Road | Electronic Building Elements <br> P.O. Box 4609 |
| JAPAN | 51.6895 | 90.61451. Telex: 12426 | 2528 39, Telex: TSEE-1L 333192 | 78.9221 Telex: 44.0181 SA Pretoria |
| Y. Magami |  |  |  | SWEDEN |
| Intel Japan Corporation | AUSTRIA | France | italy |  |
| Han-Ei 2nd Building | Bacher Elektronische Gerate GmbH | Tekelec Airtronic | Eledra 3 S | Nordisk Elektronik AB |
| 1-1. Shinjuku, Shinjuku-Ku | Meidlinger Haupstrasse 78 | Cite des Bruyeres | Via Ludovico da Viadana 9 | Fack 08.24 .8340 Telex: 10547 |
| 03-354.8251, Telex: 781.28426 | 0222-93 0143 , Telex: (01) 1532 | Rue Carle Vernet | (02) 86.03 .07 | 08.24.83-40, Telex: 10547 |
| Tokyo 160 | A. 1120 Vienna | 626-02-35, Telex: 25997 | 20122 Milano | S-103 Stockholm 7 |
|  | BELGIUM | 92 Sevres | NETHERLANDS | SWIT ZERLAND |
|  | Inelco Belgium S.A. | GERMANY | Inelco N.V. | Industrade AG |
| ORIENT DISTRIBUTORS | Avenue Val Duchesse, 3 | Alfred Neye Enatachnik GmbH | Weerdestein 205 | Gemenstrasse 2 |
| ORIENT DISTRIBUTORS | (02) 60.00.12, Telex: 25441 | Schillerstrasse 14 | Postbus 7815 | Postcheck $80 \cdot 21190$ |
|  | B-1160 Bruxelles | 041 06/612.1. Telex: 02.13590 2085 Ouickborn-Hamburg | 020441666 , Telex: 12534 Amsterdam 1011 | 01-60-22-30, Telex: 56788 8021 Zurich |
| JAPAN | DENMARK |  |  |  |
| Pan Electron inc. |  | Ing Erich Sommer | NORWAY | UNITED KINGDOM |
| No. 1 Higashikata-Machi | Scandinavian Semiconductor Supply A/S | Elektronic Gmbr | Nordisk Elektronik (Norge) A/S | Walsnore Electronics Ltd. |
| 045-471.8321. Telex. 781.4773 | Telex. 19037 | Jahnstrasse 43 | Mustads Vei 1 | 11-15 Betterton Street |
| Midori-Ku, Yokohama 226 |  | O611-55-02.89, Telex: 414069 | 602590, Telex: 16963 | Lane |
|  |  | 6 FrankfurtMaın 1 | Oslo 2 | 01-836.0201, Telex: 28752 |

CANADIAN SALES OFFICES - MEMORY SYSTEMS ONLY

|  |  |
| :--- | :--- |
|  | ALBERTA |
|  | Datagraphics $L$ td |
|  | 2912 Palisade Drive S/W |
|  | $403 / 281.1636$ |
| -Direct Intel Sales Office | Caigary |


| BRITISH COLUMBIA | ONTARIO |  |
| :---: | :--- | :--- |
| Datagraphics Ltd. | Datagraphics Lid. | Datagraphics Ltd. |
| 164 West Second Ave. | 834 Clydf Ave. | 65 Adelaide St. E |
| $604 / 732.5033$ | $613 / 7223409$ TWX: $610-562-1953$ | $416 / 366-6646$ |
| Vancouver 9 | Ottawa KIZ $5 A 1$ | Toronto |




[^0]:    NOTICE: The circuits contained herein are suggested applications only. Intel Corporation makes no warranties whatsoever with respect to the completeness, accuracy, patent or copyright status, or applicability of the circuits to a user's requirements. The user is cautioned to check these circuits for applicability to his specific situation prior to use. The user is further cautioned that in the event a patent or copyright claim is made against him as a result of the use of these circuits, Intel shall have no liability to user with respect to any such claim.

[^1]:    *This parameter is periodically sampled and is not $100 \%$ tested. They are measured at worst case operating conditions.

[^2]:    ${ }^{1}$ Regitz, W. M., Karp, J. A., "Three-Transistor Cell 1024-Bit 500-ns MOS RAM", IEEE Journal of Solid-State Circuits, Vol. SC-5, No. 5; Oct., 1970.
    2 "Field-Effect Transistor Capacitor Storage Cell," United States Patent 3,585,613.

[^3]:    Manuscript received March 16, 1971. This paper was presented at the ISSCC, Philadelphia, Pa., February, 1971.
    The author is with the Intel Corporation, Santa Clara, Ca.

[^4]:    1. Read-only memory. This array of 256 eight-bit words can be used from one to 16 times in a computer made from the 4 -chip microcomputer set MCS-4. Chip also contains multiplexing and demultiplexing circuits to permit it to transmit and receive words through the processor's data bus, four bits at a time, and buffers for transferring data from the bus to the outside world.
[^5]:    (1) All data was taken with plastic packages with the exception of several groups of devices on HTRB which were ceramic.
    ${ }^{(2)}$ In system operation functionality (continuous monitoring), refresh and system voltage margin are routinely checked.

[^6]:    (1) Ceramic package.
    (2) In system operation functionality (continuous monitoring), refresh and system voltage margin are routinely checked.

