
Television Support in Mozilla/XFree86/Linux
by Glenn Adams

1   Introduction

This white paper describes certain challenges that present themselves in the process of considering how to integrate television support with the Mozilla browser on a typical Linux platform employing the XFree86 X Window System. In addition, detailed information is provided that describes certain subsystems available in Linux and XFree86 that may be utilized to overcome these challenges.

2   Challenges

The primary task of integrating television support with Mozilla is to enable the display of a video stream obtained from an appropriate television capture device. Such display may occur in one of two modes: (1) foreground mode, where the video is presented in the foreground of a display region, and (2) background mode, where video is presented as the background of a display region. When operating in foreground mode, the video layer would typically be treated as opaque [1] and without any overlying graphics. In background mode, the video layer would also be opaque, but would be composited (blended) with an overlying graphics layer.

The task of supporting foreground video display in Mozilla is relatively straightforward. It can be accommodated in a number of ways, such as using the Mozilla Plug-In system, based upon the earlier Netscape Plug-In Architecture. Alternatively, it can be accommodated as a new built-in image format. The overall display architecture of Mozilla need not be modified to provide foreground mode support.

In contrast, adding support for background video display is more complex. This complexity is precipitated by the design of the rendering architecture of the Gecko layout engine employed by Mozilla. This rendering architecture is designed such that all compositing between foreground and background graphics occurs within the Mozilla application itself rather than in lower-level hardware. While this does not present a significant problem when the background is a static RGB image, it does create a potential performance bottleneck when the background is a 30fps interlaced YUV encoded video stream, such as one would receive from an NTSC capture device.

Ideally, a rendering architecture that would support television would enable the use of a hardware pipeline that employed direct memory access (DMA) transfers between a video capture device and an off-screen buffer in a graphics controller device, whereupon the graphics controller would perform hardware accelerated de-interlacing, color space conversion (YUV to RGB), compositing with an RGB off-screen buffer containing the graphics (subpicture) overlay, and, finally, copying the resulting composite to the active frame buffer.

Aside from the complexities of understanding and using the mechanisms provided by Linux and the XFree86 X Window System to accommodate these modes of television integration, the task of modifying Mozilla's rendering architecture is hampered by the substantial learning curve needed to perform architectural modifications of this nature to the large and complex Mozilla application itself.

While this white paper does not provide a formula for solving these challenges, it does discuss in some detail the various mechanisms and extensions provided in Linux and XFree86 for working with video and hardware-accelerated rendering features.

3   Potential Solution Mechanisms

3.1   Linux Video Capture Support

This section describes Linux support for video capture related functionality. Two subsystems are described: the original V4L API and its successor, V4L2.


3.1.1   V4L API and Device Drivers

The V4L (video for linux) API and device drivers provide a low-level API for controlling video and audio capture facilities associated with video cameras, television tuners, and radio tuners. The following devices are available when V4L is configured and operating:

- /dev/video  video capture devices
- /dev/radio  radio capture devices
- /dev/vtx    video teletext devices
- /dev/vbi    video data service devices

In most Linux kernels, V4L is organized as a set of installable modules, where loading of a module occurs upon the first request by an application to access a major/minor device associated with that module. In most cases, additional dependent modules are loaded indirectly as a side-effect of loading the initial modules. For this to work, it is necessary that the module configuration file, /etc/modules.conf, be correctly specified. In particular, the following information should appear in this file:

# i2c
alias char-major-89 i2c-dev
options i2c-core i2c_debug=1
options i2c-algo-bit bit_test=1
# bttv
alias char-major-81 videodev
alias char-major-81-0 bttv
options bttv card=10 radio=1
options tuner debug=1

The options specified for the bttv module above are appropriate for Hauppauge WinTV (BT828 based) television tuner cards. If another manufacturer's card is used, then the card=N option would need to be changed to an appropriate code that corresponds with the card. Information about card numbers and other v4l module options can be found in the Linux kernel source directory, e.g., /usr/src/linux-2.4/Documentation/video4linux.

The V4L API makes use of the generic ioctl(2) system call in order to perform most V4L services. In addition, the read(2) system call is used to read captured data. A set of request codes are declared in /usr/include/linux/videodev.h as follows:

Request Description
VIDIOCGCAP Get video/radio device capabilities.
VIDIOCGCHAN Get source properties.
VIDIOCSCHAN Select source and set properties.
VIDIOCGTUNER Get tuner properties.
VIDIOCSTUNER Select tuner and set properties.
VIDIOCGPICT Get video image (picture) properties.
VIDIOCSPICT Set video image (picture) properties.
VIDIOCCAPTURE Enable or disable video capturing.
VIDIOCGWIN Get video output window properties.
VIDIOCSWIN Set video output window properties.
VIDIOCGFBUF Get direct video output frame buffer properties.
VIDIOCSFBUF Set direct video output frame buffer properties.
VIDIOCKEY Not documented or implemented.
VIDIOCGFREQ Get tuner frequency property.
VIDIOCSFREQ Set tuner frequency property (i.e., tune to new frequency).
VIDIOCGAUDIO Get audio properties.
VIDIOCSAUDIO Set audio properties.
VIDIOCSYNC Synchronize with memory mapped capture.
VIDIOCMCAPTURE Initiate memory mapped capture.
VIDIOCGMBUF Get memory mapped buffer properties.
VIDIOCGUNIT Get unit numbers of related devices.
VIDIOCGCAPTURE Get subfield capture properties.
VIDIOCSCAPTURE Set subfield capture properties.
VIDIOCSPLAYMODE Set video play mode (Stradis MPEG Decoder Only).
VIDIOCSWRITEMODE Set write mode (Stradis MPEG Decoder Only).
VIDIOCGPLAYINFO Not documented or implemented.
VIDIOCSMICROCODE Download microcode (Stradis MPEG Decoder Only).
VIDIOCGVBIFMT Get VBI format properties (Zoran 3612X FrameGrabber Only).
VIDIOCSVBIFMT Set VBI format properties (Zoran 3612X FrameGrabber Only).

Use of the V4L subsystem (or its more recent extension V4L2, described below) is an essential requirement for the support of the device independent Television Browser Enhancements on the Linux platform. Unfortunately, V4L is, at present, one of the more poorly documented features, with no clear ownership within the open source community. This situation may potentially be ameliorated through the future adoption of V4L2 by Linux platform suppliers.

3.1.2   V4L2 API and Device Drivers

The V4L2 (video for linux, version two) API and device drivers are designed to replace and extend the V4L API. The same devices of V4L are supported as described above, and a number of new devices are added:

- /dev/vfx    video effect devices
- /dev/codec  video and audio codec devices
- /dev/vout   video output devices

Although V4L2 appears to have more active (and recent) development work, it is not yet installed by default on Linux based systems. Its use requires that the V4L2 package be downloaded and compiled into the kernel, thus requiring a custom kernel build. Applications previously built to work with V4L are generally supported by V4L2 through a compatibility layer that maintains (most of) the V4L request functions. Drivers written to work with V4L must be modified to operate with the new V4L2 APIs. In addition, application programs wishing to operate with new V4L2 features must be modified to take advantage of those features.

The V4L2 system is significantly better documented than its V4L predecessor; a number of specifications of its APIs and subsystems are publicly available.

3.2   XFree86 Video Output Support

This section describes XFree86 support for video output related functionality. Two video extensions are described: Xv and XvMC.


3.2.1   Xv Extension

The Xv extension provides support for video adaptors attached to an X display. It takes the approach that a display may have one or more video adaptors, each of which has one or more ports through which independent video streams pass.

An adaptor may be able to display video in a drawable, capture video from a drawable, or both. It translates between video encoding (NTSC, PAL, SECAM, etc.) and drawable format (depth and visual-id pair). An adaptor may support multiple video encodings and/or multiple drawable formats.

Clients use Xv to gain access and manage sharing of a display's video resources. Typically, a client will use XvQueryExtension to determine the status of the extension, XvQueryAdaptors to get a description of what video adaptors exist, and XvQueryEncodings to get a description of what video encodings an adaptor supports.

Once a client has determined what video resources are available, it is free to place video into a drawable or get video from a drawable, according to the capabilities supported. Clients can elect to receive events when video activity changes in a drawable and when port attributes change.

The presence of support for the Xv extension as well as information about supported adaptors, ports, attributes, operations, and image formats may be obtained by running the xvinfo utility program (which takes an optional -display <display> argument).

A fundamental notion in Xv is that of a video port, an abstraction of an independent video input (capture) or output (display) unit. A port is associated with a set of capabilities according to the hardware mechanisms provided by the adaptor with which the port is associated. These capabilities include:

- display video stream in drawable (XvPutVideo)
- display video still in drawable (XvPutStill)
- capture video stream from drawable (XvGetVideo)
- capture video still from drawable (XvGetStill)
- display video image from XvImage (XvPutImage)

Most graphics drivers in XFree86 support only the XvPutImage capability. A special generic capture driver "v4l" is also provided with XFree86 that supports the XvPutVideo function. Support for this driver requires that /etc/X11/XF86Config contain the following statements:

Section "Module"
	Load "v4l"
EndSection

When this driver is installed and configured properly, then an X11 client application can use the XvPutVideo function to cause a captured video stream to be directly written to the applicable screen's frame buffer or video overlay buffer. This permits the use of hardware DMA capabilities to transfer the captured video directly from a PCI card via the PCI bus to the graphics adaptor's frame buffer memory. In contrast, if XvPutImage is used instead of XvPutVideo, then all video frames must first pass through system memory, placing a heavier burden on system memory throughput.

Function Description
XvCreateImage Create client image for drawing to video port.
XvGetPortAttribute Obtain specific port attribute value.
XvGetStill Capture single video frame from drawable.
XvGetVideo Capture video from drawable.
XvGrabPort Lock port for exclusive use.
XvListImageFormats Obtain collection of supported image formats.
XvPutImage Write client image to video frame.
XvPortNotify Generate port notify event when port attribute changes.
XvPutStill Write single video frame to drawable.
XvPutVideo Write video to drawable.
XvQueryAdaptors Return adaptor information for screen.
XvQueryBestSize Determine optimal drawable region size.
XvQueryEncodings Obtain list of video encodings supported by adaptor.
XvQueryExtension Determine if extension is present.
XvSelectPortNotify Enable or disable port notify events.
XvQueryPortAttributes Obtain collection of port attributes.
XvSelectVideoNotify Enable or disable video notify events.
XvSetPortAttribute Set port attribute value.
XvShmCreateImage Create shared memory image for drawing to video port.
XvShmPutImage Write shared memory client image to video frame.
XvStopVideo Stop active video.
XvUngrabPort Release grabbed port.
XvVideoNotify Generate video notify event.

3.2.2   XvMC Extension

The XvMC extension adds functionality to the Xv extension in order to support motion compensated video formats, e.g., those based on DCT (discrete cosine transform) intraframe compression and motion compensated interframe compression, such as MPEG. In addition, this extension provides support for hardware compositing (blending) of both background and foreground subpictures, i.e., additional graphics layers that are situated behind and in front of a primary video layer.

The XvMC extension was originally developed for use with the Intel 810 Chipset and its successors. In particular, the Intel 82810 Graphics and Memory Controller Hub (GMCH) contains an integrated Graphics Controller that supports hardware motion compensation assistance for MPEG-2 decode functions as well as a hardware overlay engine.

The motion compensation process consists of reconstructing a new picture by predicting (either forward, backward or bidirectionally) the resulting pixel colors from one or more reference pictures. The GMCH intercepts the DVD (or other MPEG-2) pipeline at motion compensation and implements motion compensation and subsequent steps in hardware. Performing motion compensation in hardware reduces the processor demand of software-based MPEG-2 decoding and, thus, improves system performance.

The hardware motion compensation supports a motion smoothing algorithm. When the system processor is not able to process the MPEG decoding stream in a timely manner (as can happen in software DVD implementations), the GMCH supports downsampled MPEG decoding. Downsampling allows for reduced spatial resolution in the MPEG picture while maintaining a full frame rate; this reduces processor load while maintaining the best video quality possible given the processor constraints.

The hardware overlay engine provides a method of merging either video capture data (from an external PCI Video Capture Adapter) or data delivered by the processor, with the graphics data on the screen. Supported data formats include YUV 4:2:2, YUV 4:2:0, YUV 4:1:0, YUV 4:1:1, RGB15, and RGB16. The source data can be mirrored horizontally or vertically or both. Overlay data comes from a buffer located in system memory. Additionally, the overlay engine can be quadruple buffered to support flipping between different overlay images. Data can either be transferred into the overlay buffer from the host or from an external PCI adapter (e.g., DVD hardware or video capture hardware). Buffer swaps can be done by the host and internally synchronized with the display's vertical blanking.

In addition to being supported by the internal graphics controller of the Intel 81X architecture, the XvMC extension is also supported by the NVIDIA Linux drivers for the GeForce4 MX AGP-based graphics accelerator. See the NVIDIA Accelerated Linux Driver release notes for further information.

Function Description
XvMCBlendSubpicture Associate subpicture with surface for background layer blend.
XvMCBlendSubpicture2 Associate subpicture with surface for foreground layer blend.
XvMCClearSubpicture Fill subpicture surface with specified color.
XvMCCompositeSubpicture Copy image to subpicture surface.
XvMCCreateBlocks Create array of DCT blocks.
XvMCCreateContext Create motion compensation processing pipeline context.
XvMCCreateMacroBlocks Create array of MPEG macro blocks.
XvMCCreateSubpicture Create blendable subpicture surface.
XvMCCreateSurface Create motion compensation surface rendering buffer.
XvMCDestroyBlocks Destroy array of DCT blocks.
XvMCDestroyContext Destroy motion compensation processing pipeline context.
XvMCDestroyMacroBlocks Destroy array of MPEG macro blocks.
XvMCDestroySubpicture Destroy blendable subpicture surface.
XvMCDestroySurface Destroy motion compensation surface rendering buffer.
XvMCFlushSubpicture Commit all outstanding clear/composite operations on subpicture.
XvMCFlushSurface Commit all outstanding rendering operations for surface.
XvMCGetAttribute Obtain value of specific attribute for motion compensation context.
XvMCGetSubpictureStatus Obtain subpicture display/render status.
XvMCGetSurfaceStatus Obtain surface display/render status.
XvMCHideSurface Stop display of a surface.
XvMCListSubpictureTypes Obtain collection of subpicture image types supported by surface.
XvMCListSurfaceTypes Obtain collection of surface types supported by port.
XvMCPutSurface Display sub-region of surface onto a drawable.
XvMCQueryAttributes Obtain collection of attributes for motion compensation context.
XvMCQueryExtension Determine if extension is present.
XvMCQueryVersion Obtain version information for extension.
XvMCRenderSurface Render array of macroblocks to surface.
XvMCSetAttribute Set value of specific attribute for motion compensation context.
XvMCSetSubpicturePalette Set subpicture color palette.
XvMCSyncSubpicture Synchronize all outstanding clear/composite operations on subpicture.
XvMCSyncSurface Synchronize all outstanding rendering operations on surface.

3.3   XFree86 Direct Rendering Support

This section describes XFree86 support for direct rendering functionality. Two extensions are described: XF86DRI and XF86DGA.


Use of direct rendering extensions may be useful for accelerating video format conversion, video frame and image compositing, and composited frame drawing operations when these operations are performed directly by X11 client application code (e.g., Mozilla).

3.3.1   XF86DRI Extension

The XFree86 DRI (Direct Rendering Interface) extension is a framework for allowing direct access to 3D graphics hardware in a safe and efficient manner. It includes changes to the X server, to several client libraries, and to the kernel. The first major use for the DRI is to create fast OpenGL implementations. See the DRI project home page for further information.

To determine if the DRI extension is configured and available, run the glxinfo utility. If the output includes the line "direct rendering: yes", then DRI support is available. See the DRI beginner's guide for information on configuring and debugging this extension.

Use of the OpenGL features supported by this extension requires that the application be linked with the libGL shared library. This library contains a GLX protocol encoder for indirect/remote rendering as well as support for DRI-based direct rendering.

A potential use of this extension is to provide hardware accelerated blending (compositing) of images using the glDrawPixels and glReadPixels OpenGL functions in conjunction with an appropriate blending function established by glBlendFunc. Use of these features requires that source and destination image data use the RGBA color space. According to the DRI user's guide, a Matrox G400 drawing directly from AGP memory is able to achieve a throughput of 1GB/sec (or greater) for image blending operations. This represents greater than three times the capacity required to composite HDTV encoded in RGBA 32bpp with 1280x720 resolution at 30fps (progressive).

3.3.2   XF86DGA Extension

The XFree86 DGA (Direct Graphics Access) extension provides a mechanism by means of which an X11 client application may directly control the frame buffer. When this extension is active, the Xserver relinquishes control of the frame buffer entirely, allowing the client to perform all drawing operations, using low-level driver support if available. As a consequence, use of this extension is generally mutually exclusive with the use of X11 for multiple windowed application interaction; however, some limited support is provided for concurrent access by Xserver in order to make use of certain Xserver functions (e.g., copy area, fill rectangle, etc.).

The following interfaces compose the second version of this extension; these interfaces are declared in /usr/include/X11/extensions/xf86dga.h:

Function Description
XDGAChangePixmapMode Change pixmap mode access parameters.
XDGACloseFramebuffer Close framebuffer.
XDGACopyArea Use Xlib to copy area when concurrent access mode is enabled.
XDGACopyTransparentArea Use Xlib to copy transparent area when concurrent access mode is enabled.
XDGACreateColormap Create colormap for use with framebuffer.
XDGAFillRectangle Use Xlib to fill rectangle when concurrent access mode is enabled.
XDGAGetViewportStatus Get status of viewport change request.
XDGAInstallColormap Install colormap for use with framebuffer.
XDGAKeyEventToXKeyEvent Convert DGA key event to standard XKeyEvent.
XDGAOpenFramebuffer Map to framebuffer.
XDGAQueryExtension Determine if extension is present.
XDGAQueryModes Obtain rendering modes and capabilities.
XDGAQueryVersion Obtain version information for extension.
XDGASelectInput Select input events to receive.
XDGASetClientVersion Undocumented
XDGASetMode Establish rendering mode.
XDGASetViewport Set area of framebuffer to be displayed on screen.
XDGASync Synchronize with outstanding requests.

4   Conclusion

While the Mozilla browser was not designed to directly integrate television content, such support can be added in two stages: (1) foreground video, where television content would display in a specific part of an (X)HTML page without overlying graphics; and (2) background video, where television content would display as a background with overlying graphics, either as part of an (X)HTML page or in full-screen mode, with the entire page overlying the video content.

The task of integrating foreground video support is relatively straightforward and does not require internal modification of, or detailed knowledge of, Mozilla. In contrast, integrating background video support requires detailed modifications to the rendering architecture of Mozilla's Gecko NGLayout subsystem. Although such modifications are relatively complex in their own right, e.g., in that they require detailed knowledge of the various types of video support described above, a perhaps more complicated aspect is acquiring the knowledge needed to modify Mozilla itself, possibly even making substantial architectural changes.

It is the estimate of this author that adding support for both foreground and background video to Mozilla in order to integrate television support would require a full-time engineer for approximately 12 months, where most of that time would be spent in the learning of the details of Linux/XFree86 Video support and Mozilla rendering subsystem architecture. This time would be substantially reduced if the engineer already possessed such detailed knowledge.

[1] It is conceivable that one might also want to assign a non-opaque translucency value to the video layer; such a requirement is not explicitly considered by this white paper.

Revised Monday, 28-Apr-2003 15:27:47 CDT. © 2000-2003 Local Enhancement Collaborative & CPB.