Fellow sushi-lover MacSlow blogged some time ago about various cool things that can be done with OpenGL and video. Mirco writes:

“The remaining things to implement are: using fragment-shaders for the colorspace-conversion too, hooking up some implicit-animation love for switching between different videos.”

I’d like to pick up on the first part of his todo (hardware-accelerated colorspace conversion).

RGB vs. YUV

Computer graphics is an RGB world. Every point/pixel on the screen is represented by an intensity of red, green and blue, and any color the screen can show is coded as a combination of those three values. RGB is the way to specify colors in various drawing APIs, HTML color coding, etc. However, RGB does not model the way the human eye works very well. Our perception has certain characteristics that are not well expressed in the RGB universe. For example, the human eye is very sensitive to changes in lightness (intensity) but is not very good at noticing differences between dark shades of blue. This is where the YUV colorspace kicks in. YUV (just like RGB) can be used to represent any color, but the representation is more interesting from the video compression point of view, which is mostly about benefiting from the imperfections in our sight.

In YUV, colors are represented by luminance (Y) and two chrominance components (U and V). For example, in RGB the white color is represented by the [1.0; 1.0; 1.0] triple while in YUV it would be [1.0; 0.0; 0.0]. In a way YUV predates RGB and computers, as it’s the format used in analog TV (the cable essentially carries the YUV signals in different bands).
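To make the white example concrete, here is a minimal sketch of an RGB to YUV conversion. It assumes the BT.601 luma weights and chroma scaled to the range -0.5..0.5; the article does not name a particular standard, so both the constants and the function name `rgb_to_yuv` are illustrative choices.

```python
# BT.601 luma weights (an assumption; other standards use other weights).
KR, KG, KB = 0.299, 0.587, 0.114


def rgb_to_yuv(r, g, b):
    """Convert normalized RGB (0..1) to Y in 0..1 and U, V centered on 0."""
    y = KR * r + KG * g + KB * b          # weighted sum: perceived lightness
    u = 0.5 * (b - y) / (1.0 - KB)        # scaled blue difference, -0.5..0.5
    v = 0.5 * (r - y) / (1.0 - KR)        # scaled red difference, -0.5..0.5
    return y, u, v


if __name__ == "__main__":
    print(rgb_to_yuv(1.0, 1.0, 1.0))      # white: luma near 1, chroma near 0
    print(rgb_to_yuv(0.0, 0.0, 1.0))      # pure blue: low luma, large U
```

Any gray (r = g = b) ends up with both chroma components at zero, which is why the Y plane alone already looks like a black-and-white version of the picture.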

Conversion

YUV is important because it’s used as the native format in video compression. The raw (fast) output we get from a modern video decoder is some kind of YUV buffer. YUV can be fairly easily converted to RGB (and vice versa) but it comes at a price. Since it’s a per-pixel operation, the processing cost grows quickly. With high-resolution DVD-quality video we’re talking about ~10 million points per second. With numbers like that any operation becomes a bottleneck. Since in the end we somehow need to get the RGB representation, the only thing we can do is delegate the conversion from the CPU to the graphics hardware.
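As a rough illustration of the per-pixel work and of where the ~10 million figure comes from, here is a small sketch. The conversion constants assume BT.601 with U and V centered on zero, and the 720x576 at 25 fps frame size is an assumed DVD (PAL) resolution; neither is stated in the text above, and the function names are made up for this example.

```python
def yuv_to_rgb(y, u, v):
    """Convert Y in 0..1 and U, V in -0.5..0.5 to clamped RGB (BT.601 assumed)."""
    r = y + 1.402 * v
    g = y - 0.344136 * u - 0.714136 * v
    b = y + 1.772 * u
    # Clamp, since some YUV triples fall outside the RGB cube.
    return tuple(min(1.0, max(0.0, c)) for c in (r, g, b))


def pixels_per_second(width, height, fps):
    """Number of pixels that must be converted every second."""
    return width * height * fps


if __name__ == "__main__":
    print(yuv_to_rgb(1.0, 0.0, 0.0))           # white
    print(pixels_per_second(720, 576, 25))     # assumed PAL DVD frame size
```

A few multiplications and additions per pixel is cheap on its own; it is the repetition over every pixel of every frame that makes it worth moving off the CPU.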

Overlays to the rescue

The traditional way of dealing with this problem was to use the overlay capabilities of the graphics board. Overlays have been around for a long time (way longer than 3D acceleration) and are fairly well established. Overlays, being a hardware capability, allow us to “take over” a certain (more or less rectangular) area of the screen and dump some pixel data there, bypassing the traditional drawing pipeline. The data pushed can be in YUV format. Modern graphics hardware supports all popular YUV formats and the conversion is handled by the hardware.

The limitation of this approach is that the video (overlay) is not really a first-class citizen in the UI pipeline. It’s something that is (simplification here) “burnt over” other elements of the UI. We can’t transform it, we can’t use it in the 3D/2D effects pipeline and it’s problematic (slow) to draw over it (think transparent playback controls drawn over playing video). Overlays are more than enough for implementing standard desktop players but are useless when we want to do fancier stuff.

For the fancy effects we want to use video as a native texture/source image while still delegating the colorspace conversion to the hardware. The OpenGL API/pipeline does not support YUV formats natively, but we can easily fix that with custom GPU code.

YUV formats

One problem with YUV is that it comes in different flavors (formats) and there are quite a lot of them. The FourCC website has a good overview. The good thing is that only a couple of formats are popular in practice and the rest are mostly exotic or legacy.

Let’s take a quick look at the popular IYUV/I420 format we get from a DivX decoder. It’s a planar format which means that (unlike most RGB formats) the components are not interleaved. We can graphically represent an I420 buffer:

The buffer contains the full Y plane followed by the U and V planes. And here comes the rub: the U and V planes are sub-sampled at half the resolution in each dimension. So, assuming we’re dealing with a 400x240 video we first get the luminance (Y) plane at full resolution (400x240) followed by U/V planes at half the resolution (200x120), which means each of them holds a quarter as many samples as the Y plane. Again, this is because the information about the lightness of the picture (Y) is more important than the information about the chrominance (“colors”) of the video. In other YUV formats it’s common to assign fewer bits to U and V.
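The layout described above can be sketched as a few lines of arithmetic. This assumes a tightly packed buffer (no row padding) with one byte per sample; real decoders may add per-row stride, and the function name `i420_layout` is only illustrative.

```python
def i420_layout(width, height):
    """Return ({plane: (byte_offset, plane_width, plane_height)}, total_bytes)."""
    # Chroma planes are halved in each dimension (rounded up for odd sizes).
    cw, ch = (width + 1) // 2, (height + 1) // 2
    y_size = width * height
    c_size = cw * ch
    planes = {
        "Y": (0, width, height),          # full-resolution luminance first
        "U": (y_size, cw, ch),            # then the U plane
        "V": (y_size + c_size, cw, ch),   # then the V plane
    }
    return planes, y_size + 2 * c_size


if __name__ == "__main__":
    planes, total = i420_layout(400, 240)
    for name, (offset, w, h) in planes.items():
        print(name, "offset", offset, "size", w, "x", h)
    print("total bytes:", total)
```

For the 400x240 example the Y plane takes 96000 bytes and each chroma plane 24000 bytes, so the whole frame averages 1.5 bytes per pixel instead of the 3 needed for packed RGB.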

GL implementation

In the GL implementation we particularly want to:

  • Avoid any data processing on the software side
  • Avoid extra mem copies/unpacking of the data
  • Benefit from the hw-accelerated scaling/filtering (linear, cubic, etc.)
  • Get a high-quality image

To achieve this we need to use three GL features which are not part of the base GL 1.1 standard (multitexturing only became core in 1.3) but are commonly available as extensions: multitexturing, fragment programs and rectangle textures.

Multi-texturing allows us to use three different textures (the Y, U and V planes respectively) as the source for the output image. A custom fragment shader executes the proper blending function to create the RGB data out of the YUV source. A rectangle texture is necessary to be able to use a source with a non-power-of-two resolution as a texture.

For the textures/planes we use the 1-byte GL_LUMINANCE texture format. We also need to use a separate set of texture coordinates for each plane due to the resolution differences. The texture-filtering step (i.e. during scaling) happens before the shading step, so in the shader we automatically get properly filtered data (each texture filtered separately).
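What the shader has to compute for each output pixel can be mimicked on the CPU. The sketch below is a reference for the math only, not the shader shipped with the example program: it assumes a tightly packed I420 buffer with even dimensions, full-range 8-bit samples with chroma centered on 128, BT.601 constants and nearest-neighbor sampling (the GPU would filter each plane for us). The function name is hypothetical.

```python
def i420_pixel_to_rgb(buf, width, height, x, y):
    """Fetch one pixel from an I420 buffer and return it as 8-bit (R, G, B)."""
    cw, ch = width // 2, height // 2      # chroma plane size (even dims assumed)
    u_off = width * height                # U plane starts after the Y plane
    v_off = u_off + cw * ch               # V plane starts after the U plane

    # Separate coordinates per plane: full resolution for Y, halved for U/V.
    luma = buf[y * width + x] / 255.0
    cx, cy = x // 2, y // 2
    u = (buf[u_off + cy * cw + cx] - 128) / 255.0
    v = (buf[v_off + cy * cw + cx] - 128) / 255.0

    # The "blending function": a fixed linear combination of the three samples.
    r = luma + 1.402 * v
    g = luma - 0.344136 * u - 0.714136 * v
    b = luma + 1.772 * u
    return tuple(int(round(255 * min(1.0, max(0.0, c)))) for c in (r, g, b))


if __name__ == "__main__":
    # A 4x2 frame: 8 Y samples, then 2 U samples, then 2 V samples.
    gray = bytes([128] * 8 + [128] * 2 + [128] * 2)
    print(i420_pixel_to_rgb(gray, 4, 2, 3, 1))
```

In the GL version the three fetches become three texture lookups, one per texture unit, and the linear combination runs in the fragment program.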

For other YUV formats (i.e. the interleaved ones) we need to do a bit more work. As the U and V components are usually scattered across many triples, automatic GL scaling/filtering will destroy our data before it reaches the shader. To counter that we need to first draw (with hw-accelerated conversion) to an off-screen FBO/texture and then reuse that data as a native RGB texture for further rendering in the UI/scene. Alternatively one can use pbuffers (less optimal performance-wise).
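To see why filtering hurts here, take YUY2 as an example of an interleaved format (my choice for illustration; the text above does not name one). It stores two pixels in four bytes as Y0 U Y1 V, so neighboring bytes belong to different components. A minimal sketch of unpacking one row, with a hypothetical function name:

```python
def unpack_yuy2_row(row):
    """Expand one YUY2 row (Y0 U Y1 V ...) into per-pixel (Y, U, V) tuples."""
    pixels = []
    for i in range(0, len(row) - 3, 4):
        y0, u, y1, v = row[i:i + 4]
        # Both pixels of the pair share the same chroma samples.
        pixels.append((y0, u, v))
        pixels.append((y1, u, v))
    return pixels


if __name__ == "__main__":
    row = bytes([10, 100, 20, 200])
    print(unpack_yuy2_row(row))
    # A filter that blends adjacent bytes would average a Y with a U:
    print((row[0] + row[1]) / 2)
```

The last line shows the problem in miniature: blending adjacent bytes mixes a luma sample with a chroma sample, so the values have to be pulled apart at the original resolution before any scaling happens.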

Source code

Here is an example program + source which renders a sample video (Nokia N810 ad) using GStreamer + hardware-accelerated colorspace conversion and some effects. The example uses a rather primitive way of syncing video using timers. The proper way would be to write a decent GStreamer video sink or extend the existing GL sink to use fragment programs. That would probably be the right way to handle video in e.g. Clutter.

A rendering of the program output just for reference (might not show up in RSS, full resolution video can be downloaded here):

15 Comments

Man, this is like one week old.

gl-video$ ./gl-video Mesa 7.0.1 implementation error: i915programerror: Exceeded max ALU instructions Please report at bugzilla.freedesktop.org Segmenteringsfel (core dumped)

kudos for the nice and clear post! Very interesting code too.

Nice! I think if you managed to produce a patch for Clutter to use this if the required extensions were present, I’m sure most of OH would buy you beer. :)

That’s a very nice explanation. I’m wondering how Compiz does it currently, because it does it very fast, and transformation works just fine, even with 720p (1280x720) content.

How does the above implementation perform with proper high resolution (720p, 1080p) content? What kind of graphics card would that require, to process 50 million pixels in a second with fragment shaders?

Since the RGB version can be obtained as a linear combination of the YUV values, wouldn’t it be possible to do this by just carefully constructing a bunch of alpha-blended textures from the YUV planes, and re-coloring or illuminating the textures so that they write the appropriate weights to the right RGB planes? Working with whole textures rather than with individual pixels should be faster and more generally compatible than something requiring pixel shader extensions.

glimagesink already supports GL_MESA_ycbcr_texture textures, which already handles packed YUV. Of course, fragment shaders are more versatile.

Nice work. A few other options:

  • Several GL extensions for YUV textures exist.

  • You could use a color matrix to do the conversion. This has the advantage of not requiring quite as new of an OpenGL implementation; 1.2 suffices.

@foo: I think it’s the crappy if’ed implementation of the rounded rectangles in the shader. Can you try http://www.mdk.org.pl/assets/2007/11/18/yuv-no-borders.pso this shader? (save as yuv.pso). I’m interested how it performs on i915.

@richard: You can easily check the perf. The current implementation does the processing for every output pixel (instead of the input one) so you can easily check the throughput by making ie. the window larger. On my hw (ati x300 mobile in t43 laptop) it works fine for 1600x1200 res. The GPU hardware is very very fast at doing this kind of calculations.

@luke: Been there, done that (I’ve got an old post about it somewhere in the opengl tag). It requires around 8 textures drawing + some blending extensions (MIN/MAX if I remember correctly). The performance is much slower and has certain clamping issues (quality) since you need to use the fb as the intermediate storage. Generally it’s pretty bad.

@david: It’s true, there are a couple of OpenGL YUV extensions. I haven’t yet seen hardware that would support them with acceleration though (not in Linux drivers). About the color matrix stuff: if you’re talking about color transfer operations, again, it’s a client-side operation and I haven’t seen hardware that would do this in an accelerated manner.

You’re posts are a great read everytime. I think these continuous examples of using the GPU for processing image/video is the way it should be done. Just a couple of questions: In the current version of macslow’s gl-gst-player, there’s a file called yuv2rgb-rect.frag in the shaders directory. I believe that is some kind of shader assembly or something like that (I don’t know really)… It’s in here: http://gitweb.freedesktop.org/?p=users/macslow/gl-gst-player;a=blob;h=352e344dfc3b587c874c2438bdb5781f0a7f2fef;hb=8ac3cffe27f4e809c12cc071730eebe1e76905f3;f=shaders/yuv2rgb-rect.frag So, if that’s a shader for YUV to RGB conversion, isn’t it the same thing you’ve presented here? Except your version is in Cg shading language? And another question: Why not use GLSL? Not that I’ve ever gotten it working on Linux, but I think it’s bit more like a standard than Cg… Are there some issues with it, or was Cg just a choice for you because that’s what you knew better?

Anyway, I really appreciate the stuff you’re doing and writing detailed blog posts about. It’s great for learning and understanding what could be done nowadays. Maybe someday we’ll see Diva with GPU video and Graff-like UI? :)

Version without rounded corners runs on my i945 on Core Duo 2.0 Ghz with Compiz. “top” shows 11-18% CPU used. What about adding some benchmark mode?

foo: I’ve reported it: https://bugs.freedesktop.org/show_bug.cgi?id=13299

hello I am interested in your program. I compiled it for ubuntu 7.10 amd64 but I get a “Segmentation fault (core dumped)”. Do you have any suggestions? PS. I am a n00b.

<quote> the U and V planes are sub-sampled at half the resolution. So, assuming we’re dealing with a 400x240 video we first get the luminance (Y) plane at full resolution (400x240) followed by U/V planes at half the resolution (200x120). </quote>

I think you mean “one fourth”, not “half”. “200x120” is one fourth the size of “400x240”.

Great article… Thanks. Ziyad.

Great example code. I can’t seem to find any open source library for getting video directly into opengl like this where it can be easily manipulated. I have a need for distorting video in opengl for use in planetariums. This seems like a smart approach.

Have I missed something? I can’t believe that I am the only one that needs something like this.

Does anyone want to start a project around this approach, but to handle most YUV formats? My company is even willing to sponsor the initial work!

rob@DigitalisEducation.com

Hi! I searched the web for “opengl video texture” and found your article! Everything reads fine and I got your code compiled. What I’m interested in is an alternative for the shaders as proposed by “anonymous”. It would be nice to have a color conversion matrix because it is cheaper for the mobile industry.

“Nice work. A few other options: Several GL extensions for YUV textures exist. You could use a color matrix to do the conversion. This has the advantage of not requiring quite as new of an OpenGL implementation; 1.2 suffices.”

Do you know any of such extensions? Or can you demonstrate them?

