Tutorial 6 for SDK 1.0

Shaders Part 1
by Trevor Hogan

What are Shaders?

Let me start with a quick disclaimer - I'm pretty new to vertex and pixel shaders myself so I'm learning this stuff as I go. If I make any errors I'll try to fix them as soon as possible but I make no guarantees.

So you've heard that Doom 3 uses this cool new technology called vertex and pixel shaders but you're not sure what they are or how to use them. Don't worry, I'll start with the basics and explain everything as I go. It's probably easiest to explain what a shader is by looking at an example. In Doom 3 shaders are used to create the heat haze effect you see near fires, explosions, and steam pipes. The same shader also powers the refraction effect you see when you look through glass windows. In a slightly more mundane example Doom 3 also has a main shader which generates all the cool global effects like normal and specular mapping.

There are two types of shaders, vertex shaders and pixel shaders. Vertex shaders are also known as vertex programs and pixel shaders are also called fragment programs. In most cases you'll write shaders in pairs - only rarely does a shader stand by itself (although we'll be doing just that today). In a nutshell, shaders are tiny programs written in an assembly-like language that run on your video card and process individual vertices (vertex shaders) or pixels (pixel shaders). Shaders are a fairly new innovation - just a few years ago all vertex and pixel data on the video card moved through the "fixed function pipeline" whether you wanted it to or not. The fixed function pipeline runs the same way all the time so games that use it tend to have a very similar look (e.g. old Quake engines and old Unreal engines). Now you can replace the fixed function pipeline with your own shaders and exert greater control over the rendering process!

A word of warning before we begin. Many advanced shaders are extremely mathematical in nature so you'll need to know your linear algebra if you're going to write something like a cel shader. I'll try to explain what I'm doing as I go but I'll be assuming you know how to work with vectors and matrices, the dot product, the cross product, and all that junk. Let's begin!

Shaders in Doom 3

Doom 3 shaders are stored in ".vp" and ".vfp" files in the glprogs directory in pak000.pk4. VP files store vertex programs and VFP files store (surprise) vertex and fragment programs. VFP files generally store a pair of shaders which are designed to work together. Doom 3 shaders are written according to the ARB (that's the OpenGL Architecture Review Board) specifications. These specs are extremely technical and hard to read, but you should know where to find them: look up the ARB_vertex_program and ARB_fragment_program extension specifications in the OpenGL extension registry.

Let's take a quick look at one of the Doom 3 shaders, interaction.vfp. This is the main shader I mentioned earlier. Here's an excerpt.

# perform a dependent table read for the specular falloff
TEX   R1, specular, texture[6], 2D;

# modulate by the constant specular factor
MUL   R1, R1, program.env[1];

# modulate by the specular map * 2
TEX   R2, fragment.texcoord[5], texture[5], 2D;
ADD   R2, R2, R2;
MAD   color, R1, R2, color;

This code is from the end of the file and is actually part of the pixel shader, not the vertex shader. Without actually analyzing the code we can still make a few observations. First, comments start with a pound sign, #, and cover the rest of the line. Second, each instruction starts with a short three or four letter sequence in capitals, followed by some more text separated by commas. Finally, each instruction ends in a semicolon (you WILL forget this at least once).

Since shaders are actually computer programs it should come as no surprise that they take a set of inputs and produce a set of outputs. For a vertex shader the input is a vertex and the outputs can be just about anything (this will be covered in another tutorial). The entire vertex shader is run once per frame on every single vertex to be rendered. It has no notion of triangles or objects; all it knows about is a single vertex. Let's do some quick calculations - if Doom 3 is rendering 30000 vertices at 30 frames per second then your vertex shader will run 900 thousand times per second! Wow!

Okay, for a pixel shader the input is a single pixel and the output is a colour value. The entire pixel shader is run once per frame on every single pixel to be rendered. This is important! Pixel shaders do not run once for every pixel on the screen, they run once for every pixel to be rendered! A single screen pixel can be influenced by more than one pixel shader (imagine a player standing behind a glass window) or none at all. Now let's redo those calculations for a pixel shader - say Doom 3 is running at 800x600 at 30 frames per second and your pixel shader is influencing every single pixel (a big assumption). In this case it will run at least 14.4 million times per second! This should indicate how important it is to write fast shaders without wasting cycles. Keep in mind that 800x600 is only a convenient reference point: the real count depends on how many pixels your shader's surfaces cover and how often they overlap, so it can be higher or lower than the screen resolution suggests. Got that?
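
Spelled out, the two estimates above work out as follows. The vertex count, resolution, and frame rate are the same illustrative figures used in the text, not measurements.

```latex
\begin{align*}
\text{vertex shader runs per second} &= 30000 \times 30 = 900\,000 \\
\text{pixel shader runs per second}  &= 800 \times 600 \times 30 = 480\,000 \times 30 = 14\,400\,000
\end{align*}
```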

Whew. Let's take a look at "post process pixel shaders", the easiest type of shader to write.

Post Process Pixel Shaders

I'm going to skip vertex shaders for now and talk about pixel shaders; specifically, post process pixel shaders. A post process pixel shader runs after everything else has been rendered and modifies the already rendered pixels on the screen. In Doom 3, a pixel shader that specifies _currentRender as a fragment map is designated as a post process shader and, unfortunately, will be rendered translucently even if you don't want it to be. However, with a few tweaks to the SDK you can get around this limitation by rendering the model twice. I'll go over how to do this in another tutorial.

Hold on a second! All of a sudden I'm talking about fragment maps and something called _currentRender? Sorry, I was getting ahead of myself. Remember when I said that shaders take a specific set of inputs? Well, pixel shaders generally take textures called fragment maps as inputs. These textures can contain anything you want and don't actually have to be graphical textures in the traditional sense (although this is usually the case). For example, you could pregenerate a texture representing a nontrivial function, say sin(x). The (x,y)th pixel of this texture could contain sin(x) in the red component (scaled and shifted to fit the 0 to 255 range, since sin(x) can be negative). In Doom 3, pixel shader inputs are specified in a material file and _currentRender just happens to represent the current render, i.e. the pixels already on the screen.
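
To make the lookup texture idea a little more concrete, here's a rough sketch of how a pixel shader might read such a table. Everything specific in it is an assumption made for illustration: that the table is bound as texture[1], that the vertex shader supplies coordinates in fragment.texcoord[0], and that the table was stored as sin(x) * 0.5 + 0.5. It uses a few commands and inputs that are explained further down.

```
!!ARBfp1.0

# hypothetical constants used to undo the assumed storage format
PARAM scaleTwo = { 2.0, 2.0, 2.0, 2.0 };
PARAM subOne   = { -1.0, -1.0, -1.0, -1.0 };

TEMP lookup;

# read the table (assumed to be bound as texture[1]) at this pixel's texture coordinate
TEX  lookup, fragment.texcoord[0], texture[1], 2D;

# the table is assumed to hold sin(x) * 0.5 + 0.5
# so multiply by 2 and subtract 1 to get back a value from -1.0 to 1.0
MAD  lookup, lookup, scaleTwo, subOne;

# for illustration only, write the recovered red component to every output component
MOV  result.color, lookup.x;

END
```

A real shader would feed the recovered value into further math rather than writing it straight to the screen.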

I know I've done a lot of talking so far but now we need to go over some pixel shader commands before we can write one.

Simple Pixel Shader Commands

As I said before, shaders are written in an assembly-like language. This means that each line represents one instruction and the first token in the line (i.e. the first word) is the name of the instruction. The other tokens specify a list of parameters separated by commas. Here's a list of simple pixel shader commands from the ARB specification. Vertex shaders use a different set of commands although there is some overlap.

3.11.5.18  MOV:  Move

The MOV instruction copies the value of the operand to yield a 
result vector.

MOV  a, b;

copies the value of b onto a.

3.11.5.2  ADD:  Add

The ADD instruction performs a component-wise add of the two 
operands to yield a result vector.

ADD  a, b, c;

adds b and c together (component-wise) and stores the result in a.

3.11.5.19  MUL:  Multiply

The MUL instruction performs a component-wise multiply of the two 
operands to yield a result vector.

MUL  a, b, c;

multiplies b and c together (component-wise) and stores the result in a.

3.11.5.15  MAD:  Multiply and Add

The MAD instruction performs a component-wise multiply of the first two
operands, and then does a component-wise add of the product to the 
third operand to yield a result vector.

MAD  a, b, c, d;

multiplies b and c together then adds d (component-wise) and stores the result in a.

3.11.6.1  TEX: Map coordinate to color

The TEX instruction takes the first three components of 
its source vector, and maps them to s, t, and r.  These coordinates 
are used to sample from the specified texture target on the 
specified texture image unit in a manner consistent with its 
parameters.  The resulting sample is mapped to RGBA as described in 
table 3.21 and written to the result vector.

TEX  a, b, c, TARGET;

samples texture c at texture coordinates b using texture target
TARGET and stores the result in a.

That was a very short list but it's all we'll need for today. Many more commands are listed in the ARB spec in case you're interested. Note that in all the instructions the destination variable (result) is the first parameter and the other variables (operands) follow after. You might be wondering why there's a Multiply and Add command since both MUL and ADD already exist; so what's the point? As a rough rule of thumb you can treat every shader instruction as costing about the same to execute (the exact cost depends on the hardware), so the MAD command performs both operations for the price of one! Shader speed is closely related to the number of instructions in the shader so replacing a multiply and an add with a MAD can save you a lot of processing time.
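
Here's a small sketch of that replacement. The constants scale and offset are made up for the example, and it assumes the vertex shader supplies texture coordinates in fragment.texcoord[0]. The two commented-out lines show the long way and the MAD line produces the same value in one instruction.

```
!!ARBfp1.0

# hypothetical constants chosen only for illustration
PARAM scale  = { 0.5, 0.5, 0.5, 1.0 };
PARAM offset = { 0.25, 0.25, 0.25, 0.0 };

TEMP colour;

# sample a colour to work with
TEX  colour, fragment.texcoord[0], texture[0], 2D;

# the long way takes two instructions: multiply, then add
# MUL  colour, colour, scale;
# ADD  colour, colour, offset;

# the short way takes one: result = colour * scale + offset
MAD  result.color, colour, scale, offset;

END
```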

Shaders operate exclusively on vectors so every command expects vector inputs and outputs. Every variable in a shader is a four component vector and these vectors can represent coordinates (x, y, z, w) or colours (r, g, b, a). Shaders can also define constants and temporary variables to be used during execution. Finally, shaders access inputs and outputs in a particular format. Let's go over pixel shader inputs and outputs now.

======
INPUTS
======

program.env[0..n]:
  - program environment variables
  - specified by the engine so you cannot change these
  - same for every pixel shader in the game

program.env[0] (specific to Doom 3):
  - the 1.0 to _currentRender conversion
  - multiply a value from 0.0 to 1.0 by this to get the real screen coordinate

program.env[1] (specific to Doom 3):
  - the fragment.position to 0.0 - 1.0 conversion
  - multiply fragment.position by this to get a value from 0.0 to 1.0

fragment.position:
  - the pixel's coordinate (not necessarily screen coordinate)

fragment.texcoord[0..n]:
  - texture coordinates
  - generally set by the vertex shader and read by the pixel shader
  - used with the TEX command
  - advanced vertex shaders may store unrelated data in fragment.texcoord

texture[0..n]:
  - textures
  - specified with the fragmentMap command in the material file
  - used with the TEX command

=======
OUTPUTS
=======

result.color:
  - the pixel's colour

Alright, there's only one more thing to learn: swizzling and write masking. Swizzling allows you to duplicate, mix, and match vector components in your operands. Let's take a look at an example.

# this multiplies each component of b by the first component of c

MUL  a, b, c.x;

# or

MUL  a, b, c.xxxx;

# this multiplies the first two components of b by the first component of c
# and the last two components of b by the second component of c

MUL  a, b, c.xxyy;

# this performs a normal multiplication

MUL  a, b, c.xyzw;

Swizzling is very powerful and can save you an instruction or two if you're smart about it. You can also prevent an instruction from writing to one or more of the result vector's components by masking. Here's another example.

# this copies the first component of b onto a

MOV  a.x, b;

# this copies the first and third components of b onto a

MOV  a.xz, b;

# here's a combination swizzle and mask
# this copies the first component of b onto the second component of a

MOV  a.y, b.x;

In both examples you could have interchanged x, y, z, and w with r, g, b, and a.
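
As a quick illustration of the colour names, here's a sketch that uses r, g, b, and a for both a swizzle and a mask in a pixel shader. It assumes the vertex shader supplies texture coordinates in fragment.texcoord[0]. One thing to watch for: a single suffix has to stick to one naming set, so something like .xgba won't be accepted.

```
!!ARBfp1.0

TEMP colour;

# sample a colour to work with
TEX  colour, fragment.texcoord[0], texture[0], 2D;

# swizzle: write the channels as blue, green, red and keep alpha where it is
MOV  result.color, colour.bgra;

# mask: overwrite only the green component, using the source's red value
MOV  result.color.g, colour.r;

# result.color now holds (blue, red, red, alpha) of the sampled colour

END
```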

A Colour Stripper Shader

Finally, a real example! Let's write a post process pixel shader to strip the green and blue components from the screen leaving only the red component. First we need to create a material so add this code to one of your material files.

textures/postprocess/redshader
{
      {
            vertexProgram     redshader.vfp
            fragmentProgram   redshader.vfp
            fragmentMap       0                 _currentRender
      }
}

This specifies a new material with a single stage which uses the vertex shader in redshader.vfp and the pixel shader also in redshader.vfp. The pixel shader will take one texture as a parameter and that texture is the current screen render. Note that by using _currentRender this material is marked as translucent (as if the "translucent" key was present) and that this stage will be rendered last. In this case there is only one stage so it doesn't matter but it would make a difference in materials with multiple stages. Okay, now create a new text file named redshader.vfp in the glprogs directory. Then read this code over, understand it, and paste it inside the file.

#################
# vertex shader #
#################

# the first line specifies the beginning of the vertex shader
# it also specifies the vertex shader version
# we're writing a version 1.0 ARB vertex program
# the OPTION line specifies vertex shader options
# in this case ARB_position_invariant means the fixed function pipeline will handle vertex transformations

!!ARBvp1.0
OPTION ARB_position_invariant;

# this vertex shader does nothing

END

################
# pixel shader #
################

# the first line specifies the beginning of the pixel shader
# it also specifies the pixel shader version
# we're writing a version 1.0 ARB fragment program
# the OPTION line specifies pixel shader options
# in this case ARB_precision_hint_fastest means that we don't require high precision math

!!ARBfp1.0
OPTION ARB_precision_hint_fastest;

# temporary variables
# these are four component vectors

TEMP temp1;
TEMP temp2;

# first we need to convert the fragment's position to the real screen coordinate
# remember from the pixel shader inputs that Doom 3 uses program.env[0] and [1] to do this
# so we multiply fragment.position by program.env[1] and store the result in temp1

MUL  temp1, fragment.position, program.env[1];

# temp1 now stores the pixel's position as a fraction from 0.0 to 1.0
# now we multiply temp1 by program.env[0] to get the real screen coordinate

MUL  temp1, temp1, program.env[0];

# temp1 now stores the pixel's real screen coordinate
# now we need to sample the "texture" for the colour value at the pixel's position
# in this case texture[0] is actually the screen!
# 2D means we're sampling in two dimensional coordinates

TEX  temp2, temp1, texture[0], 2D;

# temp2 now stores the colour value of the current pixel on the screen
# all we have to do is strip the green and blue values
# we can do this with masking
# first we zero the result.color vector since it's undefined right now

MOV  result.color, 0;

# now we copy the current pixel's red component onto the result.color vector

MOV  result.color.x, temp2;

# done!

END

Now you can apply this material to an object in the game and it will strip the screen of green and blue values wherever it's rendered. I applied this material to the player by adding "shader" "textures/postprocess/redshader" to player_doommarine in def/player.def. Take a look!

image1

A Greyscale Shader

Let's make a small change to the colour stripper shader to get a greyscale shader. Converting to greyscale is easy - just take the average of the red, green, and blue values for the pixel. Remove the last two MOV statements in the above shader and replace them with this code.

# we'll accumulate the sum in temp2.x for efficiency
# add the red and green values and store the result in temp2.x

ADD  temp2.x, temp2.x, temp2.y;

# add the blue value to our accumulated sum and store the result in temp2.x

ADD  temp2.x, temp2.x, temp2.z;

# divide by three since we're averaging three values

MUL  temp2, temp2.x, 0.3333333333;

# copy the final colour to result.color

MOV  result.color, temp2;

image2
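
For reference, here's roughly what the pixel shader half of the file looks like after that replacement. It's assembled purely from the pieces above (the vertex shader half stays the same), and it deliberately keeps the extra instruction mentioned below.

```
!!ARBfp1.0
OPTION ARB_precision_hint_fastest;

TEMP temp1;
TEMP temp2;

# convert the fragment's position to the real screen coordinate
MUL  temp1, fragment.position, program.env[1];
MUL  temp1, temp1, program.env[0];

# sample the current render at that coordinate
TEX  temp2, temp1, texture[0], 2D;

# accumulate red + green + blue in temp2.x
ADD  temp2.x, temp2.x, temp2.y;
ADD  temp2.x, temp2.x, temp2.z;

# divide by three and spread the grey value across every component
MUL  temp2, temp2.x, 0.3333333333;

# copy the final colour to result.color
MOV  result.color, temp2;

END
```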

Can you spot the inefficiency in the above shader? I could have saved an instruction but I didn't. See if you can figure it out.

Well, that's it for today. In the next tutorial I'll go over how to render a post process pixel shader on an opaque object by forcing the engine to render it twice. Hope you enjoyed this tutorial as much as I enjoyed writing it!

November 5, 2004