Fault Tolerance Design for Discrete Cosine Transforms

Introduction

Discrete cosine transform (DCT) is one of the basic building blocks for JPEG. The discrete cosine transform was first applied to image compression in Ahmed, Natarajan and Rao’s pioneering work, in which they showed that this particular transform was very close to the Karhunen-Loeve transform (KLT) , a transform that produces un-correlated coefficients. Another important aspect of the DCT is the ability to quantize the DCT coefficients using usually-weighted quantization values, as will be discussed in this report. To detect a fault in DCT networks, concurrent error detection (CED) design have been proposed. In this report, the error due to the finite bit-precision is for the DCT is investigated. In order to simplify the problem, the 32 bits integer is used to implement the 2D-DCT. Reconstruction error is measured in terms of total gray-level error. This evaluation is completed in the case no computer failure errors injected into the transform system. When the fault tolerance design, an efficient concurrent error detection scheme for DCT networks is applied. By employing a check-sum scheme with effective checksum weighting factors, the design allows high throughput as well as high fault coverage. The overflow probability is also minimized.

A Fast Algorithm for DCT Based on the DFT

There are a number of relationships between the DFT on real inputs and the DCT. Vetterli and Nussbaumer showed that the N-point DCT can be expressed in terms of the real and imaginary parts of an N-point DFT and rotation of these DFT outputs. Harelick showed that the first N coefficients of a 2N-point DFT with appropriate symmetry of input values can be used to compute an N-point DCT. This property is used to build an efficient scaled DCT structure as shown in Figure 1.


Figure 1. Flowgraph for 8-point DCT adapted from Arai, Agui, and Nakajima.
a1 = 0:707; a2 = 0:541; a3 = 0:707; a4 = 1:307; and a5 = 0:383

The DCT maps integers from the intensity domain to the frequency domain. The DCT theoretically is an exactly invertible mapping between the spatial and frequency domains. Unfortunately, as commonly used the DCT/IDCT does introduce error. The question is how much error can be tolerated for a particular application, and how can the fault tolerance system distinguish between finite resolution errors and the computer failure errors. To get some intuition about error introduced by DCT, the equation for the one-dimensional DCT can be written as:

c=Ax

From the above equation, the relationship between the spatial domain pixels x and the frequency-domain values c is linear. Since A is orthonormal, then inv(A) = A'. In many image processing system, the output of the DCT is a frequency domain image with integer pixels. This requires that the values of c be rounded to the nearest integer and introduces error into the construction, even if infinite precision was used to calculate Ax. In particular, if a 1D signal is converted into the frequency domain by the DC, has its values rounded to the nearest integer, and then converted back to into the spatial domain using the inverse DCT. There is worst-case error in term x[i] of

For N=8, this implies an error of up to 1.32 gray levels for one dimensional data

In two-dimensions, the situation is similar. Once again, the frequency-space values are rounded to the nearest integer. Since rounding introduces an error of up to +0.5 or -0.5 in each frequency term, the maximum error E[m,n] for a pixel in the reconstructed image is.

Proposed Error Detection Scheme for 8-point DCT

The upper limit of the round-off errors as described in the previous section allows the fault tolerance system to know if a computer failure error occurred. However, by using this error detection scheme, the inverse discrete cosine transform is needed to compute the input data from the coefficients output. Doing so could cost a double of time and hardware resources.

An error detection scheme so called concurrent error detection for the DCT is proposed. The protection design methodology uses integer number parity values, similar to previous integer FFT fault tolerance techniques. Figure 2 describes the structure of the concurrent error detection scheme for 8-point DCT.


Figure 2. Concurrent Error Detection Structure for 8-point DCT

Two comparable parity values are calculated, one from the input data vector f and the other from the output transform coefficients in vector F. The parity values are based on input and output weighting vectors b and w. The input parity value is computed by taking the inner product between f and b. Similar for the output parity value. The parity checking operation involves a totally self-checking checker adapted for real number comparisons.The decision threshold is selceted based on the gap between the round off errors and the computer failure errors. Assume the output weighted factor vector w is known, then the input weighted factor vector b can be computed. For example, the weight factor at the output coefficient F(k) is 1/(k+1), then the input weight factors are found as shown below.

Experiment Results

The following figures show the results of a computer program implementing the proposed concurrent error detection scheme.



Figure 3. Performance curves of the concurrent error detection design for 8-point DCT. The mean
square error versus the noise level (left), and detection performance vs. the desision threshold (right)

The fault coverage is evaluated as shown in Figure 3. In this simulation, a functional error is an additive white noise occurring at the link of the transform stages. Only one fault is injected at random in one of the links of a module. The error threshold was decided based on the machine dependent constant, the word length and the input checksum. If the difference of the two checksums is greater than the error threshold, it is assumed that a fault is detected.

Fault Coverage

From the detection performance curves, given a noise level, the smaller the threshold to be selected, the higher fault detection coverage. However, when the threshold is in the round off error region, false alarm may occur. Therefore, the decision threshold is selected based on the level of corrupted output due to the system failure and the fault detection coverage requirement. Fault coverage, by intuitive definition, is a measure of system's ability to perform fault detection, fault location, fault containment, and fault recovery. Fault detection coverage is a measure of a system's ability to detect faults. A system requirement may be at a certain fraction of all faults be detected. Fault coverage can be calculated as the ration of the number of faults detected to the number of faults that occur. In this design, if the output noise grater than a selected threshold T, it is assumed that fault exists in the network.

Throughput

The thoughput of a DCT network is th number of transforms performed per cycle. In this design, like some other chechsum designs, a high throughput can be achieved since the second try is required only when the total noise value (the noise due to the round off error at the DCT values) of the network is over a certain round-off error bound.

Report papers

More information on DCT, particular for JPEG can be found in the following reports

Apr. 12, 2003 Fault Detection Implementation for the Discrete Cosine Transform(1.6M)

Error Detection/Correction Package

Apr. 12, 2003 Fault Tolerance Implementation for DCT(358k)


[Home]