Using FPGAs for HDTV design (FPGA=field-programmable gate array)
Posted on 01/23/2008 2:19:44 PM PST by Las Vegas Dave
Designers have begun to apply dynamic image-processing algorithms in FPGAs to convert and map digital-video signals onto display panels. Multiple video-processing techniques and building blocks exist to handle, process, and display clean, smooth pictures on flat-panel HDTVs.
Today's rapid proliferation of large-screen HDTVs (high-definition televisions) requires the use of highly complex video-processing algorithms to achieve high resolution, which in turn necessitates faster data rates to address the many artifacts that viewers would not typically notice on smaller screens. To overcome these challenges, designers are now beginning to apply dynamic image-processing algorithms in FPGAs to convert and map digital-video signals onto display panels. Designers are using many approaches for handling, processing, and displaying clean, smooth images on flat-panel HDTVs.
Design challenges The consumer market for HDTVs presents a significant challenge for product designers. To compete for the best-selling brand names, designers must consider both cost and video-performance quality. Although many ASSPs (application-specific standard products) target video-display applications, using only a standard IC makes it difficult to differentiate one product from another. Using available low-cost FPGAs to design a proprietary video-enhancement algorithm improves the probability of product success. FPGAs are also more effective than ASICs in shortening the design cycle. Most design groups today design an HDTV system either with a stand-alone FPGA or by coupling an FPGA with an ASSP as a coprocessor. Most FPGAs include hard-coded DSP blocks and internal memories, which form the basic elements for video and image processing.
Basic building blocks From an architectural point of view, an HDTV usually receives a digital-TV signal from either a terrestrial or a cable/satellite set-top box (Figure 1). The front-end tuner demodulates the RF signal to baseband for video decoding. Typically, the decoder is either MPEG-2 or MPEG-4/H.264. HDTVs must also receive other video-signal sources, such as external-component and composite signals. The internal microcontroller multiplexes and selects all of these video-signal streams for further video processing and enhancement before the video processor maps the video pixels onto the flat-panel display.
One of the components of a video processor is a deinterlacer, which converts interlaced video to progressive video using a variety of algorithms. Television standards, such as PAL (phase-alternation line) and NTSC (National Television System Committee), commonly use interlaced video, but LCDs require progressive video, which is often more useful for subsequent image-processing functions. The basic deinterlacing algorithms are bob and weave. Weave deinterlacing creates an output frame by filling all of the missing lines in the current input field with lines from the previous input field. This option provides adequate results for the still parts of an image but unpleasant artifacts for the moving parts. Weave deinterlacing also requires frame-buffer storage in either on- or off-chip memory, depending on the device, to allow the weaving together of lines from different fields. For this reason, the weave deinterlacer requires a built-in triple-buffering function. Bob deinterlacing vertically scales up input fields by a factor of two. The two types of scaling for bob deinterlacing are scan-line duplication and scan-line interpolation. Scan-line duplication simply scales by repeating each scan line in Input Field 0 twice to create the output frame and discarding Input Field 1. Scan-line interpolation re-creates the lines missing from Input Field 0 by taking an unweighted mean of the lines above and below them, again discarding Input Field 1. At the bottom of Field 0, only one line is available; the interpolator merely replicates this line, as in scan-line duplication.
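The weave, scan-line-duplication, and scan-line-interpolation options above can be sketched in a few lines of Python. This is a minimal illustration operating on fields represented as lists of scan lines; the field-parity convention (even output lines from the current field) is an assumption for the example, not something the article specifies.

```python
def weave(field0, field1):
    # Weave: interleave lines from the current field (field0) with lines
    # woven in from the previous field (field1).
    frame = []
    for cur, prev in zip(field0, field1):
        frame.append(cur)
        frame.append(prev)
    return frame

def bob_duplicate(field0):
    # Bob by scan-line duplication: repeat each line of Field 0 twice.
    frame = []
    for line in field0:
        frame.append(line)
        frame.append(list(line))
    return frame

def bob_interpolate(field0):
    # Bob by scan-line interpolation: each missing line is the unweighted
    # mean of the lines above and below; the bottom line, which has no
    # line below it, is simply replicated, as the article describes.
    frame = []
    for i, line in enumerate(field0):
        frame.append(line)
        if i + 1 < len(field0):
            below = field0[i + 1]
            frame.append([(a + b) / 2 for a, b in zip(line, below)])
        else:
            frame.append(list(line))  # no line below: duplicate
    return frame
```

Note the cost difference the article mentions: weave needs a second field's worth of storage, whereas both bob variants work from the current field alone.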
Another enhancement algorithm, motion-adaptive deinterlacing, treats a progressive-scan-video sequence as a 3-D array of data in the horizontal, vertical, and temporal dimensions (Figure 2). In the absence of motion, the deinterlacer reconstructs the interlaced sequence exactly. In a fast-motion-video sequence, the deinterlacer uses a motion detector to separate the stationary and moving areas within the same video frame. The motion-detection output then triggers the deinterlacer to select either spatial interpolation or temporal interpolation to generate progressive video.
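The per-pixel selection at the heart of motion-adaptive deinterlacing can be sketched as follows. The motion measure and threshold here are hypothetical stand-ins (a real detector would compare fields over time and often blends the two results rather than hard-switching):

```python
def motion_adaptive_pixel(spatial, temporal, motion, threshold=16):
    # Hard per-pixel switch: where the motion measure exceeds a
    # (hypothetical) threshold, use the spatially interpolated value
    # (bob); otherwise use the temporally woven value (weave).
    return spatial if motion > threshold else temporal

def deinterlace_line(spatial_line, woven_line, motion_line, threshold=16):
    # Reconstruct one missing scan line, choosing per pixel between
    # spatial and temporal interpolation based on detected motion.
    return [motion_adaptive_pixel(s, t, m, threshold)
            for s, t, m in zip(spatial_line, woven_line, motion_line)]
```

The point of the structure: still regions get the full vertical detail of weave, while moving regions fall back to bob and avoid the combing artifacts weave would produce.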
Another building block in video processing is a scaler for converting video signals between arbitrary resolutions. The processor usually uses the scaler to convert low-resolution interlaced SD (standard-definition) signals to the high-resolution, noninterlaced signals that HDTV uses. A scaler's basic function is similar to that of a line doubler with extra video-signal processing and optimizing. The system designer can also use various tools and IP (intellectual property) to implement a scaler.
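The core operation of a scaler, resampling a line of pixels to a new width, can be illustrated with simple linear interpolation. This is a minimal sketch only; a production scaler of the kind the article describes would use multi-tap polyphase FIR filters in both axes:

```python
def scale_line(line, out_width):
    # Resample one scan line to out_width pixels by linear interpolation
    # between the two nearest source pixels.
    if out_width == 1:
        return [line[0]]
    out = []
    step = (len(line) - 1) / (out_width - 1)  # source step per output pixel
    for i in range(out_width):
        pos = i * step
        left = int(pos)
        frac = pos - left
        right = min(left + 1, len(line) - 1)
        out.append(line[left] * (1 - frac) + line[right] * frac)
    return out
```

Applying the same routine vertically (to columns) yields a full 2-D scaler, e.g. for the 720-to-1920-pixel horizontal upconversion in an SD-to-HD path.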
The human eye is less sensitive to color than to luminance. You can improve video-transmission bandwidth by storing more data affecting luminance than data affecting color. A viewer notices no perceptible loss when viewing a smaller TV panel that samples the color detail at a lower rate. Video systems achieve this feature through the use of color-difference components: a video camera divides the signal into a Y (luma) component and two color-difference (chroma) components. Larger panels require further chroma resampling to enhance the video quality of the display.
Chroma resampling deviates from color science in that the luma and the chroma components form naturally as a weighted sum of gamma-corrected RGB (red/green/blue) components instead of linear-RGB components. As a result, luminance and color detail are not independent of one another; some bleeding of luminance and color information occurs between the luma and the chroma components. The error is greatest for highly saturated colors and can be somewhat noticeable between the magenta and the green bars of a color-bars test pattern. This approximation allows a designer to easily implement color resampling. Using this fact, video transmitted in the YCbCr (luminance/chroma-blue/chroma-red) color space often subsamples the color components, Cb and Cr, to save data bandwidth. The video-sampling format is YCbCr 4:4:4, 4:2:2, or 4:2:0; the 4:2:2 and 4:2:0 formats are subsamples of the 4:4:4 format. How the chroma resampler performs this subsampling depends on the type of application, whether in the professional-broadcasting arena or in the consumer market. These sampling formats are parts of the MPEG-1, MPEG-2, MPEG-4, and other standards. You might wonder why standards don't use the full 4:4:4 sampling. The following video-resolution arithmetic provides the answer: 720×486 resolution=349,920 pixels/frame; 349,920 pixels×10 bits/sample×3 samples/pixel=10,497,600 bits/frame; 10,497,600 bits/frame×29.97 frames/sec=314,613,072 bps; and 314,613,072 bps×3600 sec≈141.58 Gbytes/hour. For a 1920×1080-pixel HDTV, that figure would reach approximately 840 Gbytes/hour. Therefore, using 4:4:4 sampling for a consumer HDTV is neither feasible nor cost-effective. A typical consumer HDTV processes a 4:2:0 video format.
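The bandwidth arithmetic above generalizes to a one-line formula. A small routine reproducing the article's 4:4:4 numbers (using decimal gigabytes, 1 Gbyte = 10^9 bytes, which is what makes the figures come out as quoted):

```python
def uncompressed_gbytes_per_hour(width, height, bits_per_sample=10,
                                 samples_per_pixel=3, fps=29.97):
    # pixels/frame -> bits/frame -> bits/sec -> Gbytes/hour
    bits_per_frame = width * height * bits_per_sample * samples_per_pixel
    bits_per_second = bits_per_frame * fps
    return bits_per_second * 3600 / 8 / 1e9
```

Running it for 720×486 gives roughly 141.6 Gbytes/hour, and for 1920×1080 roughly 839 Gbytes/hour, matching the article's ≈141.58 and ≈840 figures and making the case against 4:4:4 in consumer gear concrete.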
A nonlinear relationship always exists between a pixel value and its displayed intensity on an HDTV monitor. This nonlinear relationship is roughly a power function, L=V^γ, where L is the displayed intensity, V is the pixel value, and γ is the display's gamma exponent. Gamma correction is a nonlinear operation that designers use to code and decode luminance values in a video-image-processing system. The gamma-correction function allows designers to modify video streams to match the physical properties of display devices. For example, CRT and LCD monitors display a brightness that has a nonlinear response to the voltage of a video signal. To account for this response, designers program the gamma-correction function with a look-up table that models the nonlinear function and then use that function to transform the video data and produce the best image on the display.
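The look-up-table approach can be sketched as follows. The γ=2.2 value is a common assumption for illustration, not a figure from the article; an FPGA would hold the precomputed table in on-chip memory and apply it per pixel.

```python
def build_gamma_lut(gamma=2.2, bits=8):
    # Precompute an N-bit gamma-correction table: each code V maps to
    # round(max * (V/max)**(1/gamma)), the inverse of the display's
    # L = V**gamma response, so the displayed intensity comes out linear.
    max_code = (1 << bits) - 1
    return [round(max_code * (v / max_code) ** (1.0 / gamma))
            for v in range(max_code + 1)]

def gamma_correct(pixels, lut):
    # Per-pixel table look-up, the operation the hardware performs.
    return [lut[p] for p in pixels]
```

Because 1/γ < 1, the table boosts midtones (e.g., code 128 maps well above 128) while leaving black and full white fixed.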
FIR and median filtering One of the most common video-enhancement blocks is the FIR (finite-impulse-response) filter. A FIR filter multiplies and sums a sequence of received video-data samples, creating a 2-D convolution process. A 2-D FIR filter can perform 2-D convolution using matrices of 3×3, 5×5, or 7×7 coefficients. A 2-D FIR filter's key uses are sharpening, smoothing, and edge detection of a video image. By designing the proper coefficients and applying the correct matrix, you can produce a crystal-clear video output. However, the electrical system can introduce video noise into a video stream during transmission in any channel. A median filter provides a simple and effective noise-filtering process. The median value of all the pixels in a population (that is, a selected neighborhood block) determines each video pixel. The median value of a population is the value for which one-half of the population has smaller values and the other half has larger values.
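Both filters can be sketched for the 3×3 case. This is an illustration only (border pixels are left untouched for brevity, and the FIR loop computes a correlation, which equals convolution for the symmetric kernels typically used here):

```python
def fir3x3(image, kernel):
    # 2-D FIR filtering with a 3x3 coefficient matrix: each interior
    # output pixel is the sum of its 3x3 neighborhood weighted by the
    # kernel coefficients.
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]  # borders passed through unchanged
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    acc += kernel[ky][kx] * image[y + ky - 1][x + kx - 1]
            out[y][x] = acc
    return out

def median3x3(image):
    # Median filtering: each interior pixel becomes the median of its
    # 3x3 neighborhood, which suppresses impulse ("salt") noise that a
    # linear FIR filter would only smear.
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            block = sorted(image[y + dy][x + dx]
                           for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = block[4]  # middle of the 9 sorted values
    return out
```

A smoothing kernel would use nine 1/9 coefficients; a sharpening kernel would use a positive center with negative neighbors.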
You can combine an OSD (on-screen display) and an alpha-blending mixer to provide a method to layer streams of video onto a display screen with text and other graphics. One of the most common applications is the EPG (electronic-program guide) for an HDTV display. A typical valid range of alpha coefficients is [0, 1], where 1 represents full translucence and 0 represents full opaqueness. For N-bit RGBA (red/green/blue/alpha) values, the range is [0, 2^N-1]. This range interprets 2^N-1 as 1 and all other values as the alpha value divided by 2^N. For example, for an 8-bit alpha value, 255 maps to 1, 254 maps to 254÷256, 253 maps to 253÷256, and so on. Implementing OSD in an FPGA is fairly straightforward. You use a local microcontroller either within or outside the FPGA to generate the graphics or text characters, and you use a video-line buffer to mix the generated text with the video.
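The alpha-code mapping and the blend itself can be sketched directly from the rules above. The blend follows the article's stated convention (alpha 1 = fully translucent OSD, 0 = fully opaque OSD), which is the reverse of the convention some graphics APIs use:

```python
def alpha_from_code(code, bits=8):
    # Map an N-bit alpha code to [0, 1] per the article's rule: the
    # maximum code (2^N - 1) means 1.0; every other code means
    # code / 2^N, so 254 -> 254/256, 253 -> 253/256, and so on.
    max_code = (1 << bits) - 1
    return 1.0 if code == max_code else code / (1 << bits)

def blend(osd_pixel, video_pixel, alpha):
    # Mix one OSD pixel over video: with alpha = 1 (full translucence)
    # only the video shows; with alpha = 0 (full opaqueness) only the
    # OSD graphic shows.
    return osd_pixel * (1 - alpha) + video_pixel * alpha
```

In hardware, this per-pixel multiply-add maps naturally onto the FPGA's DSP blocks, one blend per color channel.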
A color-space converter provides a flexible and efficient means of converting image data from one color space to another and provides a method for precisely specifying the display of color using a 3-D-coordinate system (Figure 3). Different color spaces are appropriate for different display devices, such as RGB for computer monitors or YCbCr for HDTV. Color-space conversion is often necessary when transferring data between devices that use different color-space models. For example, to transfer a TV image to a computer monitor, you must convert the image from the YCbCr color space to the RGB color space. Conversely, transferring an image from a computer display to a TV may require a transformation from the RGB color space to YCbCr. Video displays for SDTV and HDTV require different conversions, such as to or from the YIQ (perceived-luminance/color-luminance-information) model for NTSC systems or the YUV (luminance-bandwidth-chrominance) color model for PAL systems.
You achieve conversions between color spaces by providing an array of nine constant coefficients and three constant summands that relate the color spaces. Given the constant coefficients A0, A1, A2, B0, B1, B2, C0, C1, and C2 and the constant summands S0, S1, and S2, you calculate the output values on channels 0, 1, and 2 (dOUT0, dOUT1, and dOUT2): dOUT0=(A0×dIN0)+(B0×dIN1)+(C0×dIN2)+S0; dOUT1=(A1×dIN0)+(B1×dIN1)+(C1×dIN2)+S1; and dOUT2=(A2×dIN0)+(B2×dIN1)+(C2×dIN2)+S2.
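The nine-coefficient, three-summand calculation translates directly into code. The BT.601 studio-range RGB-to-YCbCr coefficients in the example are illustrative values for one common conversion, not figures taken from the article:

```python
def csc(pixel, A, B, C, S):
    # General color-space conversion:
    # dOUT_i = (A_i * dIN0) + (B_i * dIN1) + (C_i * dIN2) + S_i
    d0, d1, d2 = pixel
    return tuple(A[i] * d0 + B[i] * d1 + C[i] * d2 + S[i] for i in range(3))

# Illustrative 8-bit studio-range RGB-to-YCbCr (BT.601) constants:
A601 = (0.257, -0.148, 0.439)   # coefficients applied to R (for Y, Cb, Cr)
B601 = (0.504, -0.291, -0.368)  # coefficients applied to G
C601 = (0.098, 0.439, -0.071)   # coefficients applied to B
S601 = (16, 128, 128)           # constant summands
```

For example, full-scale white (255, 255, 255) lands at Y≈235, Cb≈128, Cr≈128, the studio-range white point. Swapping in a different coefficient set retargets the same datapath to YUV, YIQ, or the reverse conversion, which is why one FPGA block serves all of them.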
Controller and interface A timing controller is a key element of an HDTV-LCD module (Figure 4). It is the last building block you implement after the video processor has performed all video processing and signal enhancement and before interfacing to the display panel. The controller must correctly map all video color pixels on the display with precise timing without incurring video-quality degradation. Designers have traditionally implemented these controllers using fully customized ASIC devices or ASSPs. Neither of these types of devices provides the full scalability of an FPGA, which allows you to scale a design from a small display to a large display. FPGAs also provide a built-in LVDS (low-voltage-differential-signal) or RSDS (reduced-swing-differential-signal) interface format that the timing controller typically uses to directly drive the HDTV-display panel. You can program FPGA-based timing controllers to support nonstandard resolutions and different LCD-panel configurations from various LCD manufacturers.
Designers have over the last two years made significant improvements in the quality of image processing for HDTV flat-panel displays. In addition to a variety of standard available ASSPs, designers are now using low-cost FPGAs with built-in features, such as DSP blocks, memories, microcontrollers, and differential interfaces, to leverage the rapid design cycle. FPGAs can serve as stand-alone units or as complements to an ASSP-system design to provide superior video enhancement for the digital-TV-display market.
Pinged subjects will include HDTV technology; satellite/cable HD; OTA (over-the-air, with various rooftop and indoor antennas) HD reception; broadcast specials; Blu-ray/HD-DVD; and any and all subjects relating to HD.
Las Vegas Dave
But.........is it true that the new Mitsubishi Lazer TV, that is supposed to be out within the year, will make all the HDTV’s obsolete?
Who ever thinks in Gbytes/hour, unless you're storing movies on a DVD? In more typical units, this is about 250 megabytes/second, which doesn't sound like a huge data flow. Surely, a small number of state-of-the-art DSP chips could easily handle this. I guess the FPGAs are used to provide a custom controller for said set of DSPs.
Interesting article. Let’s you know what the problems are.
My concern is that some sub-standard video quality level will become the de facto standard. (Think iTunes.) And all this great engineering work will be for naught.
We've been doing this for a while now ...
Maybe some are ... we are not. FPGAs have a lot of unused stuff in them. We re-use TVP, DDR2-3 DLL, PLL and memory controller modules with just a few tweaks and put them in a small-footprint LV ASIC ... 4 months to tapeout ... much better
Gbytes/hour is commonly used for measuring transfer rate in high bandwidth or long execution type applications. The units go back to server tape and HDD systems where it was important to be able to predict failure of system components and efficiency of data transfers.
Compare that to the 250 Mbytes/sec (2 Gbps) uncompressed 4:4:4 HDTV.
Things get nice and blurry like that when you’re about 50 or so.
“My concern is that some sub-standard video quality level will become the de facto standard.”
It already has - the ATSC 1080i60 standard. While there were seemingly good reasons (mostly surrounding the predominance of CRT displays) at the time the standards were being developed, we are now saddled with an interlaced display format that is a poor match for today’s fixed-pixel displays, which are inherently progressive display devices. 1080p30 could have been done within the same bandwidth limits, but few if any displays or processing devices capable of handling it existed at the time - it would have been much preferable. It’s a trivial exercise to turn 1080p30 into 1080i60, but not the other way around.
Already here, IMO. With a full 1080p display viewed from a reasonable distance the individual pixels are pretty much indistinguishable.
The problem is that the larger FPGAs that are able to implement “interesting” functionality usually cost $50-$2000. So only the smallest devices are practical for things that get massed produced.
Application Specific ICs (ASICs) are usually a better answer for competitive applications like consumer electronics.
Heck, I don't know. After reading the article I think my brand new plasma is already obsolete...........