Vertex Data and Vertex Descriptors

When I was writing the first articles for this site (over four years ago), I didn’t have a firm grasp on the purpose of vertex descriptors, and since their use is optional, this omission persisted for far too long. The sample code for most articles has been updated periodically, and most samples now use vertex descriptors, so it seemed fitting to write about them.

In order to discuss vertex descriptors, we need to go back to the fundamentals of data and functions.

Functions and Data

Generally speaking, the purpose of a function is to transform data. Shader functions are no different. For example, a vertex function transforms vertices from whatever space they originate in (often model space) into clip space. A fragment function transforms rasterized data into the final color of a fragment.

Since most functions operate on data supplied externally (as opposed to data generated procedurally), we need a way to get data into our functions. For this reason, they take parameters. Metal shader functions likewise take parameters, but since these functions are “called” by the GPU while executing drawing commands, we don’t pass arguments directly to them. This difference is a source of confusion for many newcomers to Metal.

So how does data get into shader functions? To answer that, we first need to ask what the data we want to use in our shader functions looks like.

The Shape of Data

Suppose we have a structure in our Metal shader file that represents the data associated with a single vertex:

```
struct Vertex {
    float3 position;
    float3 color;
};
```

In this struct, we’re packaging a (model-space) position and color together. This is simply an example; as in regular application code, a function is written to take whatever data it needs to do its job.

In our application code, we might have an array of these structures, one for each vertex. Before we draw this data, we need to copy it into a Metal buffer, which is a block of memory that can be read by the GPU. This is the first step toward being able to use the data from inside a Metal shader function.

Getting Data into Buffers

In Objective-C, we’d probably use `memcpy` to perform this copy:

```
memcpy(buffer.contents, vertices, sizeof(Vertex) * vertexCount);
```

In Swift, we have the option of being slightly safer by first binding the buffer’s contents to our structure’s type before copying into it:

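```
// Reinterpret the buffer's raw bytes as an array of Vertex values.
let bufferPoints = buffer.contents().bindMemory(to: Vertex.self,
                                                capacity: vertices.count)
// Copy the vertices into the buffer (newer Swift versions spell this update(from:count:)).
bufferPoints.assign(from: &vertices, count: vertices.count)
```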

Note that this is still risky, since as of this writing Swift does not provide guarantees about how struct members are laid out in memory. Struct members can be arbitrarily reordered or padded by Swift, breaking the expectation that the Swift struct has the same layout as its corresponding Metal struct. This can be mitigated by declaring structs in C or Objective-C instead and importing them into Swift via a bridging header, which affords stronger guarantees about member layout.
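To make the layout concern concrete, here is a minimal host-side sketch in plain C++. It assumes a hypothetical `Float3` stand-in that mimics Metal’s `float3` (16 bytes in size and alignment) and uses a byte vector in place of a real Metal buffer; the `copyToBuffer` helper is likewise invented for illustration. The compile-time checks are the point: if the struct’s layout ever drifts from what the shader expects, the build fails instead of the GPU silently reading garbage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for Metal's float3, which occupies 16 bytes
// (three floats plus padding) and is 16-byte aligned.
struct alignas(16) Float3 {
    float x, y, z;
};

// Host-side mirror of the shader's Vertex struct.
struct Vertex {
    Float3 position;
    Float3 color;
};

// Compile-time checks that the host layout matches what the shader expects.
static_assert(sizeof(Float3) == 16, "Float3 must occupy 16 bytes");
static_assert(offsetof(Vertex, color) == 16, "color must start at byte 16");
static_assert(sizeof(Vertex) == 32, "Vertex must occupy 32 bytes");

// Copies vertices into a byte array standing in for a Metal buffer's contents.
std::vector<unsigned char> copyToBuffer(const std::vector<Vertex>& vertices) {
    std::vector<unsigned char> buffer(vertices.size() * sizeof(Vertex));
    if (!vertices.empty()) {
        std::memcpy(buffer.data(), vertices.data(), buffer.size());
    }
    return buffer;
}
```

With a real buffer, the destination would be the pointer returned by the buffer’s `contents`, but the layout checks would stay the same.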

If our application code is one side of the application-shader bridge, the shader function is the other side. Before we talk about the bridge between them, let’s talk about how data is consumed in a shader function.

Using Data in Functions

When we aren’t using a vertex descriptor, we’re obligated to look up vertex data in our buffers manually. We do this by writing our vertex function to take pointers to one or more buffers, as well as a special parameter attributed with the `vertex_id` attribute. This parameter is populated with the current vertex index whenever the function is invoked on the GPU. Here’s the signature of a simple vertex function:

```
vertex VertexOut vertex_main(
    device Vertex *vertices     [[buffer(0)]],
    constant Uniforms &uniforms [[buffer(1)]], // transform data used in the function body
    uint vid                    [[vertex_id]])
```

In the body of the function, we then manually fetch the current vertex data from the provided index:

```
Vertex in = vertices[vid];
```

Determining the vertex data to pass on to the fragment shader is a matter of accessing each piece of vertex data, transforming it, and returning a structure from the function. For example, we might multiply the position by a transformation matrix to move it into clip space, and also pass through the vertex color directly:

```
VertexOut out;
out.position = uniforms.modelViewProjectionMatrix * float4(in.position, 1);
out.color = in.color;
return out;
```
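To see the manual-fetch pattern end to end, it can help to emulate it on the CPU. The following is only a sketch in plain C++, not shader code: the `Float3`, `Float4`, and `Float4x4` types, the `multiply` helper, and the `vertexMain` function are all hypothetical stand-ins, and the explicit index argument plays the role of the `vertex_id` the GPU would supply.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct Float3 { float x, y, z; };
struct Float4 { float x, y, z, w; };

// Column-major 4x4 matrix, following the convention of Metal's float4x4.
struct Float4x4 { std::array<Float4, 4> columns; };

struct Vertex    { Float3 position; Float3 color; };
struct VertexOut { Float4 position; Float3 color; };
struct Uniforms  { Float4x4 modelViewProjectionMatrix; };

// Matrix-vector product: a weighted sum of the matrix columns.
Float4 multiply(const Float4x4& m, const Float4& v) {
    const std::array<Float4, 4>& c = m.columns;
    return {
        c[0].x * v.x + c[1].x * v.y + c[2].x * v.z + c[3].x * v.w,
        c[0].y * v.x + c[1].y * v.y + c[2].y * v.z + c[3].y * v.w,
        c[0].z * v.x + c[1].z * v.y + c[2].z * v.z + c[3].z * v.w,
        c[0].w * v.x + c[1].w * v.y + c[2].w * v.z + c[3].w * v.w,
    };
}

// CPU emulation of the vertex function: look up the vertex by index,
// transform its position, and pass its color through unchanged.
VertexOut vertexMain(const std::vector<Vertex>& vertices,
                     const Uniforms& uniforms,
                     std::size_t vid) {
    assert(vid < vertices.size());
    const Vertex& in = vertices[vid];
    VertexOut out;
    out.position = multiply(uniforms.modelViewProjectionMatrix,
                            Float4{in.position.x, in.position.y, in.position.z, 1.0f});
    out.color = in.color;
    return out;
}
```

Calling `vertexMain` once for each index from 0 up to the vertex count approximates what a draw call asks the GPU to do, minus the parallelism.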

So now we know how to retrieve and operate on vertex data in a vertex function, but how do we get data from our buffers into functions in the first place? To cross that bridge, we need to talk about an abstraction that’s unique to Metal: argument tables.

Argument Tables

You can think of an argument table as a list of resources. A command encoder has an argument table for each type of resource that you can supply to a shader function: buffers, textures, and samplers.

The number of entries (slots) in each list depends on the device, but you can generally assume you have at least 31 buffer and texture entries, and 16 sampler entries.

Setting Argument Table Entries

Rather than an actual data structure, an argument table is more a way of conceptualizing the collection of resources that are used by a particular draw call.

Each type of command encoder has methods for setting entries in its argument tables. In the case of the render command encoder, we have separate sets of argument tables for the vertex and fragment functions, since they often operate on different data.

For example, if we want to set a buffer as the first entry in the vertex function’s argument table (index 0), we would do the following:

```
renderCommandEncoder.setVertexBuffer(vertexBuffer, offset: 0, index: 0)
```

In addition to telling the command encoder which argument table slot to set, you can provide an offset: the number of bytes from the beginning of the buffer at which reading should start.
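Since an argument table is a mental model rather than a type we can inspect, a toy version may help fix the idea. The sketch below is plain C++ and purely illustrative: `ArgumentTable`, `BufferBinding`, `setBuffer`, and `resolve` are invented names, only the buffer table is modeled, and the slot count of 31 is simply the figure mentioned above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical binding: which memory occupies a slot and where reads begin.
struct BufferBinding {
    const unsigned char* contents;
    std::size_t offset;
};

// Toy model of one stage's buffer argument table.
struct ArgumentTable {
    static constexpr std::size_t slotCount = 31;
    std::array<BufferBinding, slotCount> buffers{};  // all slots start empty

    // Analogous in spirit to setVertexBuffer(_:offset:index:).
    void setBuffer(const unsigned char* contents, std::size_t offset, std::size_t index) {
        assert(index < slotCount);
        buffers[index].contents = contents;
        buffers[index].offset = offset;
    }

    // The address a shader parameter declared with [[buffer(index)]] would see,
    // or nullptr if nothing has been bound to that slot.
    const unsigned char* resolve(std::size_t index) const {
        assert(index < slotCount);
        const BufferBinding& binding = buffers[index];
        return binding.contents ? binding.contents + binding.offset : nullptr;
    }
};
```

The takeaway is that the index passed when binding a buffer and the index in the shader’s `buffer` attribute are the two ends of the same slot.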

Interleaved and Non-interleaved Data

Due to the popularity of the object-oriented paradigm, it is common and natural to want to think of a vertex as one thing, an object that contains all the data relevant to it. But the GPU doesn’t have any concept of what a vertex is. All it cares about is that a clip-space position is somehow returned from the vertex function.

From a different perspective, then, we might imagine the attributes of our vertices as coming from separate streams, and keep the data for each attribute contiguous in memory: one area for positions, one for normals, one for colors, and so on.

This perspective has some notable benefits. For one, it means that we can access vertex data from different vertex functions without wasting bandwidth or cache space. If a vertex is processed by a pipeline that generates a shadow map, that vertex function may only need the vertex’s position, and not any other data. If we store all of the data for a vertex contiguously, the GPU has to stride farther in memory to get to the next bit of data it needs to operate on. Furthermore, it may be difficult to pack data together in a struct in a way that optimizes storage or read performance, owing to both cache effects and alignment requirements.

When the attributes for a single vertex are stored contiguously, we say that the data is interleaved, while if the data for a particular attribute of all vertices are stored contiguously (whether in one buffer or several), the data is non-interleaved.
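The difference between the two arrangements is easy to show in code. Here is a small sketch in plain C++ that converts between them; the `VertexStreams` type and the `deinterleave` and `interleave` helpers are hypothetical names chosen for illustration, and `Float3` is a simplified, tightly packed stand-in for the shader type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Float3 { float x, y, z; };

// Interleaved: all attributes of one vertex sit next to each other.
struct Vertex { Float3 position; Float3 color; };

// Non-interleaved: each attribute gets its own contiguous stream.
struct VertexStreams {
    std::vector<Float3> positions;
    std::vector<Float3> colors;
};

// Splits an interleaved vertex array into one stream per attribute.
VertexStreams deinterleave(const std::vector<Vertex>& vertices) {
    VertexStreams streams;
    streams.positions.reserve(vertices.size());
    streams.colors.reserve(vertices.size());
    for (const Vertex& v : vertices) {
        streams.positions.push_back(v.position);
        streams.colors.push_back(v.color);
    }
    return streams;
}

// Zips the per-attribute streams back into an interleaved array.
std::vector<Vertex> interleave(const VertexStreams& streams) {
    assert(streams.positions.size() == streams.colors.size());
    std::vector<Vertex> vertices;
    vertices.reserve(streams.positions.size());
    for (std::size_t i = 0; i < streams.positions.size(); ++i) {
        vertices.push_back(Vertex{streams.positions[i], streams.colors[i]});
    }
    return vertices;
}
```

A shadow-map pass, for instance, would only ever need to touch `positions`, which is exactly the bandwidth argument made above.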

Reading Vertex Data with Automatic Fetch

Fortunately, Metal gives us a lot of flexibility when it comes to how we arrange our data in memory. It achieves this by abstracting how vertices are represented in shaders versus how they are laid out in buffers. To do this, we apply attributes to the members of our vertex struct, providing a unique attribute index for each:

```
struct Vertex {
    float3 position [[attribute(0)]];
    float3 color    [[attribute(1)]];
};
```

These attribute attributes (confusing, I know) allow us to refer to each piece of data by index, rather than caring about exactly where it resides in memory.

Rather than taking a pointer to a particular buffer in the argument table, our vertex function can now take a parameter attributed with the `stage_in` attribute:

```
vertex VertexOut vertex_main(Vertex in [[stage_in]])
```

What effect does this attribute have, and how does the incoming vertex struct get filled in? The answer is that your vertex function is patched by the shader compiler with instructions that tell the GPU where each attribute should be fetched from. This feature is called vertex fetch, and is enabled through the use of vertex descriptors, which are the topic of the next section.

Vertex Descriptors

A vertex descriptor (`MTLVertexDescriptor`) is the mapping between buffers and vertex function parameters. A vertex descriptor consists of a number of attributes and one or more layouts. In essence, an attribute describes the size and location of a single vertex property (position, texture coordinates, etc.), while a layout describes a single buffer. Most particularly, a layout has a stride that indicates the distance in bytes between vertices.

Interleaved and Non-interleaved Vertex Descriptors

The way in which vertex data is read by the GPU is entirely based on the vertex descriptor associated with the render pipeline state.

For example, consider the case where we want to keep all of our vertex data interleaved in a single buffer. Then, for the sample vertex struct above, we might construct our vertex descriptor as follows:

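```
let vertexDescriptor = MTLVertexDescriptor()
// Attribute 0: position, at the start of each vertex in buffer 0
vertexDescriptor.attributes[0].format = .float3
vertexDescriptor.attributes[0].bufferIndex = 0
vertexDescriptor.attributes[0].offset = 0
// Attribute 1: color, immediately after the position in the same buffer
vertexDescriptor.attributes[1].format = .float3
vertexDescriptor.attributes[1].bufferIndex = 0
vertexDescriptor.attributes[1].offset = MemoryLayout<float3>.stride
// One layout for buffer 0; consecutive vertices are two float3 strides apart
vertexDescriptor.layouts[0].stride = MemoryLayout<float3>.stride * 2
```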

We have two attributes, and their indices match the indices we specified in the vertex struct. We indicate that the second attribute is at an offset of the size of the first member from the beginning of the struct, since they are laid out next to one another in memory.

The buffer index in each attribute indicates to which argument table slot its corresponding buffer will be assigned; in this case, both attributes are in buffer 0. Since both attributes are in the same buffer, we have just one layout, and we set its stride to the sum of the sizes of the vertex struct members to indicate that our vertices are tightly packed together in the buffer.

What if we wanted to de-interleave our data and provide it in two separate buffers? In this case, we’d still have two attributes at the same indices, but we’d change the offset of the second attribute to 0 to indicate that the first vertex’s data starts at the beginning of the buffer. Then, we’d add a second layout and set the stride of both layouts to be the width of a single struct member, since the data for each attribute has now been made independent:

```
let vertexDescriptor = MTLVertexDescriptor()
vertexDescriptor.attributes[0].format = .float3
vertexDescriptor.attributes[0].bufferIndex = 0
vertexDescriptor.attributes[0].offset = 0
vertexDescriptor.attributes[1].format = .float3
vertexDescriptor.attributes[1].bufferIndex = 1
vertexDescriptor.attributes[1].offset = 0
vertexDescriptor.layouts[0].stride = MemoryLayout<float3>.stride
vertexDescriptor.layouts[1].stride = MemoryLayout<float3>.stride
```
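What the GPU does with these numbers can be approximated with a few lines of address arithmetic. The sketch below is plain C++ and only models the idea: `Attribute`, `Layout`, `VertexDescriptor`, and `fetchFloat3` are hypothetical stand-ins rather than Metal types, every attribute is assumed to be three floats, and buffers are plain byte vectors indexed by their argument table slot. For vertex `vid`, the data for an attribute is read from byte `vid * stride + offset` of the buffer the attribute points at.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Three tightly packed floats (12 bytes): the bytes actually read per attribute.
struct Float3 { float x, y, z; };

// Simplified mirrors of a vertex descriptor's attributes and layouts.
struct Attribute { std::size_t bufferIndex; std::size_t offset; };
struct Layout    { std::size_t stride; };

struct VertexDescriptor {
    std::vector<Attribute> attributes;  // indexed by attribute index
    std::vector<Layout> layouts;        // indexed by buffer index
};

// Emulates fetching one attribute of one vertex:
// byte offset = vid * (stride of the attribute's buffer) + attribute offset.
Float3 fetchFloat3(const VertexDescriptor& descriptor,
                   const std::vector<std::vector<unsigned char>>& buffers,
                   std::size_t attributeIndex,
                   std::size_t vid) {
    assert(attributeIndex < descriptor.attributes.size());
    const Attribute& attribute = descriptor.attributes[attributeIndex];
    assert(attribute.bufferIndex < descriptor.layouts.size());
    assert(attribute.bufferIndex < buffers.size());

    const std::vector<unsigned char>& buffer = buffers[attribute.bufferIndex];
    const std::size_t byteOffset =
        vid * descriptor.layouts[attribute.bufferIndex].stride + attribute.offset;
    assert(byteOffset + sizeof(Float3) <= buffer.size());

    Float3 value;
    std::memcpy(&value, buffer.data() + byteOffset, sizeof(Float3));
    return value;
}
```

With the interleaved descriptor (one layout, stride 32, color at offset 16) and the de-interleaved one (two layouts, stride 16, both offsets 0), the same call returns the same values, which is the decoupling that lets the shader’s `Vertex` struct stay unchanged.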

Step Functions and Step Rates

Attributes and layouts are only part of the vertex descriptor story. So far, we’ve been considering the case where we want to fetch new data each time our vertex function is called. Under some circumstances, we want to fetch data less often. For example, if we’re doing instanced rendering, we might want a particular attribute to remain the same across all vertices for an instance. Similarly, if we’re tessellating, some attributes will vary on a per-control-point or per-patch basis.

Vertex descriptors allow for this. Vertex descriptor layouts have two properties in addition to their stride that indicate the frequency with which data should be fetched: the step function and step rate.

Consult the documentation for step functions and step rate for more details on how to use these properties. They are somewhat specialized, but I didn’t want to neglect mentioning them, lest you get the impression that vertex descriptors are less powerful than they are.
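As a rough illustration of how these two properties change which element gets fetched, here is a sketch in plain C++. It reflects my reading of the documented behavior rather than anything authoritative: the `StepFunction` enum, `Layout` struct, and `fetchIndex` and `fetchOffset` helpers are hypothetical, base vertex and base instance offsets are ignored, and the tessellation-specific step functions are left out entirely.

```cpp
#include <cassert>
#include <cstddef>

// Simplified mirror of the non-tessellation step functions.
enum class StepFunction { Constant, PerVertex, PerInstance };

struct Layout {
    std::size_t stride;
    StepFunction stepFunction;
    std::size_t stepRate;
};

// Which element of the buffer is fetched for a given vertex and instance.
std::size_t fetchIndex(const Layout& layout, std::size_t vertexID, std::size_t instanceID) {
    switch (layout.stepFunction) {
        case StepFunction::Constant:
            return 0;                             // same data for every vertex
        case StepFunction::PerVertex:
            return vertexID;                      // new data for each vertex
        case StepFunction::PerInstance:
            assert(layout.stepRate > 0);
            return instanceID / layout.stepRate;  // new data every stepRate instances
    }
    return 0;
}

// Byte offset of that element from the start of the bound data.
std::size_t fetchOffset(const Layout& layout, std::size_t vertexID, std::size_t instanceID) {
    return fetchIndex(layout, vertexID, instanceID) * layout.stride;
}
```

For instanced rendering, a per-instance layout with a step rate of 1 would hand every vertex of instance `n` the `n`th element of its buffer, such as a per-instance transform.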

Conclusion

In this article, we’ve covered a topic that has long been neglected on this blog: vertex descriptors. You’ve seen how vertex descriptors can be used to decouple the layout of data in your application from how that data is loaded in your shader functions. You’ve also seen that, due to this decoupling, you can separate vertex data into multiple buffers in order to make it more efficient to use the same data with different render pipelines. Your Metal apps should use vertex descriptors whenever possible, since vertex descriptors are the rare kind of abstraction that aids both performance and code maintenance.