DotCrawl Comb Filter

Removes composite video artifacts: false colors and hanging dots

DotCrawl.vdf

10 December 2003, Version 0.9.3 adds “Multi-path minimum delta” spatial filter.  Promising but still needs tweaking.

30 November 2003, Version 0.9.2 adds better choices for temporal discrimination, removes the less-useful ones.  Everything works, but some features are relatively slow.

22 November 2003, Version 0.9.1 adds MMX optimizations to the spatial filters.  DotCrawl Comb now supports batch operation (job control).

Summary

PAL users:  Sorry guys, this filter doesn’t fix PAL rainbows and dots.  I just got my hands on some PAL specifications, and my algorithms all require some adjustments before they’ll work.  Eventually.  Soon, I hope!

The DotCrawl Comb filter fixes defects caused by composite video (NTSC).  Most capture cards leave some of these defects in the picture, particularly the rainbow colors.  Two filtering steps can be performed:

Temporal filtering works in non-changing parts of the picture, using data from previous frames to cancel out the dot patterns and erroneous colors, and to recover latent picture details.  DotCrawl Comb currently tests up to 3 frames to detect if they are related to the current frame, but it still might erroneously use unrelated frames.

Spatial filtering works on pixels within a single frame, and is applied to parts of the picture that have changed since the previous frame (ie: where there’s motion).  While it does not increase actual picture detail, it can reveal details that would otherwise be masked by the dot patterns.

Composite video defects are most common when capturing at large frame sizes (like 480x480 and 720x480).  Capture cards that use Bt878 chips (PCTV Studio, WinTV, et al) rely solely on a notch filter when capturing large frames, so they show dots around the edges of colored objects.  Older Zoran-based capture cards (Pinnacle DRX, DC10, Matrox Marvel G200/G400) don’t have any built-in filtering at large frame sizes, so dot-patterns appear over the surface of all colored areas.  Disgusting!

DotCrawl Comb does not significantly affect non-chroma picture details; if you use it to filter a photograph it will be substantially unchanged.

Examples:

Examples of composite defects:

Dot patterns on colored objects:

[image]

Extra colors:

[image]

Before:

[image]

Crawling dots around contrasting colors.

After temporal & spatial filtering:

[image]

Dots reduced around spinning ‘U’, totally eliminated around still ‘R’.

Before:

[image]

False colors around fine details.  Colors change from frame to frame.

After temporal filtering:

[image]

Filter used data from previous frames to eliminate colors and enhance details.

Before:

[image]

Blue-yellow checkerboard obscures fine detail between letters.

After temporal filtering:

[image]

Checkerboard is eliminated, letters are better isolated.

Detail view of the above.  Note how the dots have created little bridges between the “M” and “y”, but the ends of the “y” are a bit spotty.

[image]

The dots have been filtered out, yet detail has actually been increased.  The “M” is now separate from the “y”, and the ends of the “y” are now connected.

[image]

Before:

[image]

After temporal & spatial filtering:

[image]

Before:

[image]

Enlarged to show detail, this was captured by a Bt878 at 320x240.  Its built-in comb filter softened the text but permitted a rainbow of false colors to appear.

After temporal filtering:

[image]

The DotCrawl Comb filter could not recover the detail lost by the Bt878’s comb filter, but it accumulated data from successive frames to eliminate the distracting false colors.

Using DotCrawl Comb:

The DotCrawl Comb filter is intended to be easy to configure.  In general, you can probably just use the default settings.

[image]

Temporal Comb Settings

The temporal comb filter settings control how the temporal kernel distinguishes between moving and non-moving images.  That’s not as trivial as it sounds because luma-chroma crosstalk (which is the defect the filter is designed to remove) causes pixel values to change between frames even when the underlying image has not changed.  The choices are:

·         None: No temporal filtering.  In this mode, all pixels are passed directly into the chosen spatial filter.

·         3-frame dead-reckoning: This mode enables temporal filtering whenever there is apparent correlation among 3 prior frames of ‘filtered’ pixel values.  Each filtered pixel requires samples from at least two frames, so this effectively activates the temporal filter on the fourth frame after the image has changed.  This setting is adequate for most material.

·         4-frame dead-reckoning: Same as above, but the discriminator requires 4 prior frames of correlation, so the temporal filter is activated on the fifth frame after the image has changed.

·         Trellis correlation: Uses a box-trellis equation to create two additional correlation channels.  This can activate the temporal filter much earlier, often on the first filterable frame, if correlation is sufficiently strong.  This mode rapidly detects and removes artifacts, yet resists erroneous filtering as well as 4-frame dead-reckoning does.  The disadvantage is that it requires much more computation.

Spatial Comb Settings

The spatial comb is applied to pixels that have not been processed by the temporal filter because the image was changing.  The spatial comb filter removes dot artifacts but not colored fringes.  The choices are:

·         None: No spatial filtering.  This is appropriate for small frame sizes (like 352x240) on Bt878 capture cards, since Bt878 drivers already use a spatial comb filter at low frame sizes.  (The Bt878 uses a ‘1-line’ comb filter, which reduces sharpness somewhat.)

·         4-way 5-tap with saturation: Filters each sample against four others that are as near as possible to 180-degrees out-of-phase, individually pre-clamping each sample to within 12% of the target value (this avoids softening the image).  This mode is adequate at most frame sizes; at 720x480, for example, the residual phase error is about 1%, which is visually indistinguishable.

·         Precise 4-way 7-tap with saturation:  Filters each sample against four theoretical samples that are exactly 180-degrees out-of-phase, using linear interpolation and additional taps to generate the theoretical samples.   This method takes longer, but the image isn’t any better at most frame sizes because the gain in precision is usually less than the noise floor.  I would recommend it only if you’re using really peculiar frame sizes like 536x480.  (But who does that?)

·         Multi-path minimum delta: This is a very promising experiment, which achieves very good dot removal while preserving sharpness.  It varies the number of active filter taps (from 0 to 8) by sampling triads of pixels and attempting to extrapolate the deviation caused by chroma noise.  Where pixel differences are due to something other than chroma, the filter either rejects that filter tap or clamps it to a lower value (the minimum delta).
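The ‘theoretical samples’ used by the precise 7-tap mode can be generated by plain linear interpolation.  The sketch below is only my illustration of that idea (the function name and data values are invented, and the real filter works in optimized code on packed pixels, not like this):

```python
def sample_at(row, x):
    """Linearly interpolate a scanline 'row' at fractional position x."""
    i = int(x)
    frac = x - i
    return row[i] * (1.0 - frac) + row[i + 1] * frac

# A sample 180 degrees out of subcarrier phase lies about width/376 pixels
# away, which is rarely a whole number of pixels.  For the 536x480 case:
width = 536
offset = width / 376.0           # about 1.43 pixels
row = [16, 40, 80, 120, 90, 50]  # made-up luma values along one scanline
left = sample_at(row, 3 - offset)
right = sample_at(row, 3 + offset)
```

When the offset happens to land nearly on a whole pixel (eg: 352/376 ≈ 0.94), the interpolated tap almost coincides with a real pixel and the cheaper 5-tap mode gives practically the same result.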

Usage Notes

·         Apply DotCrawl Comb to a raw capture before any other filtering or conversion.  The temporal filter won’t work after 3:2 pulldown-removal (‘inverse telecine’) because that alters the phase relationship between frames.  Cuts & edits are okay.

·         DotCrawl Comb might cause ghosts or splotches in VirtualDub’s preview pane when you’re skipping through a video file and editing.  That should not occur in the video when you filter it and save it.  Let me know if it does!


Q&A

Excellent!  Someone has already sent me a couple of questions.  Please send me more questions or suggestions!  Here are the questions I’ve been asked so far.

Q: Does your filter work on other stuff besides cartoons?

A: Yes, it works on anything that’s been converted from composite video.  I used mostly cartoons in my examples because the defects are more conspicuous in smooth, solid areas.  ‘Natural’ sources are more affected by chroma-luma crosstalk than cartoons, but the textures of natural surfaces (faces, trees, etc) tend to conceal the defects.  You can observe this by comparing two adjacent frames of a photographic scene.

DotCrawl Comb might improve the picture, even though the defects might not be conspicuous.  There’s no harm in trying it out to see.  More importantly, it can improve the quality of video you’re compressing (eg: for DVDs or VCDs) because it removes substantial amounts of extraneous noise between frames, leaving more bytes available to encode the relevant parts of the picture.


Principle of Operation

Numerous resources explain the operation of NTSC and the principles of Y/C separation.  I can’t improve on their descriptions, so I will concentrate mainly on the principles I used in DotCrawl Comb.  My discussion assumes you’re familiar with NTSC signaling and other methods of comb filtering.  That background is optional if you just want to know how the filter functions but don’t care about all the details of ‘why’.  (Admittedly, I had not consulted any outside sources when I started the project.)

I started with just a temporal comb filter.  An NTSC composite signal is the sum of the luma and chroma signals.  The luma is a black-and-white representation of the picture, while the chroma signal is a quadrature-amplitude-modulation of a vector, which itself is the sum of two orthogonal vectors (the I and Q subchannels).  Hence:

                Eq 1.      s = Y + C

                Eq 2.      C = I·sin θ + Q·cos θ

Where s is the composite signal, I is the orange-white-turquoise tint channel, Q is the purple-white-green tint channel, and θ is the color clock.  Eq 2 implies that when there’s any tint in the picture (ie: I and Q are not both zero), then the C (chroma) signal is a periodic wave that bears little resemblance to the picture.  The composite signal is a sum (Eq 1) of the C signal and the Y signal, which is the detail of the picture without any tint.  A receiver separates the Y and C signals in order to display the picture (from the Y signal) and apply the correct tints to it (from the C signal).  The inputs Y, I, and Q are continuously changing and their spectra overlap, so the Y and C channels can generate signals that mimic each other.  Crosstalk from the C channel into the Y channel produces waves of dots in the picture detail, while crosstalk from the Y channel to the C channel produces spurious tints.

Temporal Filter Kernel

The color clock runs 227½ cycles per scan line, or 119,437½ cycles per 525-line frame.  The extra half-cycle ensures that the chroma clock is always 180 degrees out-of-phase from the previous frame.  In other words, θ_n = θ_(n-1) + 180° at any particular pixel in frame n.  If samples from two consecutive frames are highly correlated (ie: the picture didn’t move), then Y_n ≈ Y_(n-1), I_n ≈ I_(n-1), and Q_n ≈ Q_(n-1).  Any differences between the two samples will be due to the change in sin θ_n vs sin θ_(n-1), and cos θ_n vs cos θ_(n-1).  But since we have established that θ_n = θ_(n-1) + 180°, we can exploit the fact that sin θ = -sin(θ-180°) for all values of θ, and likewise cos θ = -cos(θ-180°).
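These phase relationships are easy to verify numerically.  Here is a throwaway sketch of mine (not part of the filter) that checks the half-cycle-per-frame arithmetic and the sine/cosine identities:

```python
import math

CYCLES_PER_LINE = 227.5   # NTSC color subcarrier cycles per scan line
LINES_PER_FRAME = 525

cycles_per_frame = CYCLES_PER_LINE * LINES_PER_FRAME  # 119437.5

# The leftover half-cycle per frame flips the subcarrier phase 180 degrees
# at every pixel from one frame to the next.
assert cycles_per_frame % 1 == 0.5

# sin(theta) = -sin(theta - 180 deg), and likewise for cosine, so summing
# co-sited samples from consecutive frames cancels the subcarrier terms.
for deg in range(0, 360, 15):
    t = math.radians(deg)
    assert abs(math.sin(t) + math.sin(t - math.pi)) < 1e-12
    assert abs(math.cos(t) + math.cos(t - math.pi)) < 1e-12
```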

Hence, if we sum samples s_n and s_(n-1) for any frame n we have:

                Eq. 3      s_n + s_(n-1) = Y_n + Y_(n-1) + I_n·sin θ_n + Q_n·cos θ_n + I_(n-1)·sin θ_(n-1) + Q_(n-1)·cos θ_(n-1)

Substitute -sin θ_n for sin θ_(n-1) and substitute -cos θ_n for cos θ_(n-1) and the formula simplifies to:

                Eq. 4      s_n + s_(n-1) = Y_n + Y_(n-1) + (I_n - I_(n-1))·sin θ_n + (Q_n - Q_(n-1))·cos θ_n

Now assume I_n - I_(n-1) ≈ 0 and Q_n - Q_(n-1) ≈ 0 and we discover:

                Eq. 5      s_n + s_(n-1) ≈ Y_n + Y_(n-1) + 0·sin θ_n + 0·cos θ_n

Given adequately-high correlation, the coefficients of sin θ_n and cos θ_n approach 0 so closely that we may omit these terms entirely.  Likewise, Y_n + Y_(n-1) approaches 2Y_n, so:

                Eq. 6      s_n + s_(n-1) ≈ 2Y_n,  or equivalently  Y_n ≈ (s_n + s_(n-1)) / 2

If the two frames were, indeed, highly correlated then the true value of Y for each frame can therefore be isolated from the chroma interference (I and Q) by just summing the samples together and dividing by 2.  Once Y has been established, it could be subtracted from the signal in order to establish the accurate values of I and Q.  Intuitively, this wouldn’t seem useful because a software filter receives decoded pixels, not the composite signal.  But all the decoding steps entail simple linear convolution except the sine functions, which cancel themselves out.  Hence the interference from the sine functions can be cancelled by simple addition, even after the signal is decoded and the samples converted to pixels.  The summation removes from the pixels any error caused by interference in the decoded Y channel; essentially it subtracts erroneous signal energy out of Y, which implicitly reinstates any energy missing from the I and Q signals.  In short, both the brightness and tint are corrected.
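The derivation above can be demonstrated end-to-end on synthetic samples.  This sketch is purely illustrative (the values are arbitrary, and the real filter of course works on decoded pixels rather than on a signal it constructs itself):

```python
import math

def composite(Y, I, Q, theta):
    # Eq 1 and Eq 2: s = Y + C, where C = I*sin(theta) + Q*cos(theta)
    return Y + I * math.sin(theta) + Q * math.cos(theta)

Y, I, Q = 120.0, 30.0, -18.0        # an unchanging pixel's true components
theta = math.radians(73.0)          # arbitrary subcarrier phase in frame n-1

s_prev = composite(Y, I, Q, theta)            # sample from frame n-1
s_curr = composite(Y, I, Q, theta + math.pi)  # frame n: phase shifted 180 deg

# Eq. 6: averaging the two samples cancels the chroma term and recovers Y.
Y_recovered = (s_prev + s_curr) / 2
assert abs(Y_recovered - Y) < 1e-9

# Subtracting the recovered luma leaves the clean chroma term of frame n.
C_curr = s_curr - Y_recovered
```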

Since summation techniques will succeed only when the intended values are highly correlated, we need to apply the filter only when the pixels represent highly correlated samples.  In practice, the input pixels might not be well correlated even for identical images due to the influence of those sine functions.  Likewise unrelated images can produce highly correlated input pixels; when the picture switches from one person’s face to another many pixels will be indistinguishably similar, yet the pixels from one cannot be used to filter the other.  The filter needs to be gated (switched on-and-off) but clearly it cannot use simple pixel correlation as a gate condition.

The current build of DotCrawl Comb tests the correlation of processed pixels as a gate condition.  If the filtered pixels don’t correlate, it rejects the filtered value and uses the unfiltered pixel value.  That introduces a lag, so pixels won’t be filtered in a region that has recently changed or which contains movement.  There is some risk that pixels will be erroneously filtered when their filtered values correlate by random chance, and this may cause blotches or ghosts.
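In rough terms, the gate described in this paragraph might look like the following sketch.  The function, names, and the threshold value are all mine, for illustration only; the shipping filter’s actual test is more elaborate:

```python
def gated_temporal_filter(curr, prev, prev_filtered, threshold=6):
    """Temporally filter one pixel, gated on filtered-value correlation.

    curr, prev:     raw pixel values from frames n and n-1
    prev_filtered:  the value this pixel ultimately took in frame n-1
    Returns (output value, whether the temporal filter was applied).
    """
    candidate = (curr + prev) / 2   # Eq. 6: cancels the chroma interference
    # Gate: accept the filtered value only if it correlates with the value
    # previously accepted for this pixel; otherwise assume motion and pass
    # the raw pixel through (to be handled by the spatial filter instead).
    if abs(candidate - prev_filtered) <= threshold:
        return candidate, True
    return curr, False
```

This also shows where the lag mentioned above comes from: the gate can only open once a run of consecutive candidate values agrees, and a chance agreement opens it erroneously, producing the blotches or ghosts described.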

I expect to implement a better gate condition when I re-write the algorithm in MMX instructions.  I intend to test correlation among pairs of input pixels, using those which are in-phase with the candidate pixel but spatially displaced.  Another test would compute luminance correlations among similar groups.  Essentially, I hope to detect cross-luminance and cross-chroma interference effects separately.

Spatial Filter Kernel

Because some pixels cannot be filtered using a temporal kernel, as described above, I implemented a spatial comb filter too.  DotCrawl Comb applies the spatial kernel only to pixels rejected by the temporal kernel (or when temporal filtering is switched off).  Because it operates only on data within a single frame, spatial filtering can’t recover true detail or tints, but it can ‘reveal’ additional detail by reducing the masking effect of chroma noise.

Unfortunately, most spatial comb filters reduce picture detail, sometimes severely.  To combat this, I devised a 4-way comb filter on the principle that nearby pixels will be 180-degrees out-of-phase, but tend to have highly correlated values.  The correlation is typically not as strong as with the same pixel in a previous frame, but the average of nearby pixels in four directions will be adequately-well correlated to permit filtering even if one or two of them are unrelated.  To preserve pixel integrity I clamp each input pixel to within +/-32 of the target pixel before feeding it into the filter kernel.  This ensures the filter operates at full strength against moderate dot artifacts without permitting aberrant pixels to soften the image.

Assuming P_(x,y) identifies a sample for the pixel at location x,y, the formula for a filtered pixel P’_(x,y) is:

                Eq. 7      P’_(x,y) = ( 4·P_(x,y) + P_(x-k,y) + P_(x+k,y) + P_(x,y-2) + P_(x,y+2) ) / 8,  where k is the frame width divided by 376, rounded to the nearest pixel

That looks a bit ugly, but for a 352x480 image it reduces to this relatively intuitive convolution matrix centered around the current pixel:

                1/8 ×  | 0  1  0 |
                       | 0  0  0 |
                       | 1  4  1 |
                       | 0  0  0 |
                       | 0  1  0 |

For other frame sizes, the matrix just has additional columns of zeroes inserted between the inner and outer terms.  Note, as mentioned above, the four outer terms are clamped to within +/-32 of the inner term before being input to this matrix.  My ‘Precise 4-way 7-tap’ filter works similarly except that it uses linear interpolation to create the left and right terms in their ‘theoretical’ location, about one-376th of the frame width away from the center of the current pixel.  Since most files aren’t captured at 376 pixels per line, this location may lie between two pixels.
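Under my reading of the description, the ‘4-way 5-tap with saturation’ kernel for a 352-wide frame can be sketched as follows.  This is a simplified single-channel illustration with invented names, not the filter’s MMX code; I have assumed immediate horizontal neighbours and vertical taps two rows away:

```python
def clamp_to(value, center, limit=32):
    """Clamp value to within +/-limit of center (the saturation step)."""
    return max(center - limit, min(center + limit, value))

def spatial_comb(img, x, y):
    """Filter one pixel of a 352-wide frame with the 4-way 5-tap comb."""
    p = img[y][x]
    taps = [
        clamp_to(img[y][x - 1], p),   # left neighbour
        clamp_to(img[y][x + 1], p),   # right neighbour
        clamp_to(img[y - 2][x], p),   # two rows up
        clamp_to(img[y + 2][x], p),   # two rows down
    ]
    # Average the pixel with the mean of its four out-of-phase taps.
    return (4 * p + sum(taps)) / 8

# A dot pattern of amplitude +/-15 around a flat level of 100:
img = [[ 85,  85,  85],
       [100, 100, 100],
       [ 85, 115,  85],
       [100, 100, 100],
       [ 85,  85,  85]]
print(spatial_comb(img, 1, 2))   # the dot at the center flattens to 100.0
```

Because each tap is clamped to within +/-32 of the center pixel, a genuine edge (a tap differing by, say, 100) contributes at most 32 of difference, which is how the filter avoids softening real detail.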