Apr 1

Bindless rendering — Templates

In this continuation of bindless rendering — setup, we now shift our focus to using bindless resources in shaders. Previously we were using RenderResourceHandles to index the descriptor heap, but this still forces the user to write unnecessary code that can be neatly wrapped.

Ideally we would like to access our resources directly from a buffer, without the user explicitly having to call ResourceDescriptorHeap or even touch resource bindings in general.

This is currently not possible in HLSL, but we can get close by writing templated wrappers around the resource types, and this post covers the methods used to do so. The approach relies on ResourceDescriptorHeap indexing, which is currently not supported when compiling to SPIR-V, so we will also cover a few tricks that emulate roughly the same behavior there.

Note that the techniques we’re going to discuss require the latest DXC build.

Relevant topics that were covered in the previous part: we covered how registers can be overlapped on Vulkan in order to match our resource indices between Dx12 and Vulkan. We also introduced the RenderResourceHandle, which represents our descriptor index within the descriptor heap.

Wrapping resources

To achieve our goal of being able to load any resource from an arbitrary piece of memory, we need to wrap our resources while keeping to a specific set of rules. First off, we must make sure we are not adding any data on top of the RenderResourceHandle that we are wrapping, as we rely heavily on our 32-bit constants to communicate our resource index buffer to the GPU.
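As a minimal sketch of the "no extra data" rule, assume the handle boils down to a single 32-bit index. The member name `index` and the `ResourceIndices` layout are hypothetical and only serve as illustration:

```hlsl
// Assumption: the handle from the previous part is one 32-bit descriptor index.
struct RenderResourceHandle
{
    uint index;
};

// Each wrapper holds exactly one handle and nothing else, so it stays
// 4 bytes wide and can take the handle's place inside a buffer.
struct RawBuffer
{
    RenderResourceHandle handle;
};

struct Texture
{
    RenderResourceHandle handle;
};

// Hypothetical resource indices buffer. Swapping two RenderResourceHandle
// members for wrappers leaves every byte offset unchanged.
struct ResourceIndices
{
    RawBuffer vertices; // byte offset 0
    Texture albedo;     // byte offset 4
};
```

Because the layout is identical, the CPU side can keep writing plain descriptor indices into the buffer.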

We then need to ensure that we wrap all resource types that the user may want to use, which is quite a big list of resources and functionality. Thankfully, since the recent addition of templates to HLSL, we can greatly reduce the amount of code needed to add support for all these resources. This is still a big task, as we need to support all functionality within each wrapped resource.

Finally, we need to make the interface as intuitive as possible for the user. The whole point of these resource wrappers is to make the user’s life easier!

We start off by defining the types we want to wrap. For our case we decided to create the following types, which replace the RenderResourceHandles in our resource indices buffer:

AccelerationStructure (Represents Tlas)
RawBuffer (Represents ByteAddressBuffer)
RWRawBuffer (Represents RWByteAddressBuffer)
ArrayBuffer (Represents StructuredBuffer)
RWArrayBuffer (Represents RWStructuredBuffer)
Texture (Represents all Texture types)
RWTexture (Represents all RWTexture types)
SimpleBuffer (Represents a single element buffer)

You could choose to wrap textures separately based on dimension and type instead, but this would increase the amount of boilerplate code. We weighed the pros and cons of separately wrapped texture types against a single wrapper, and decided to go for a single wrapper. Both options are perfectly viable, but we preferred the option that required the least amount of maintenance.

We had to create a macro to minimize code divergence between Dx12 and Vulkan when accessing resources. In short, DESCRIPTOR_HEAP maps directly to ResourceDescriptorHeap in Dx12, but maps to our own ResourceDescriptorHeap emulation in Vulkan.
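A sketch of what such a macro could look like is given here. It assumes DXC's predefined `__spirv__` macro to detect the SPIR-V path; the `handleType` argument is a hypothetical dummy type that tells the Vulkan emulation which kind of resource is requested, and `VkResourceDescriptorHeap` is assumed to be declared elsewhere in the helper file:

```hlsl
#ifdef __spirv__
    // SPIR-V: forward to the emulation object and convert the raw index into
    // a typed dummy handle so the matching overload can be selected.
    #define DESCRIPTOR_HEAP(handleType, index) VkResourceDescriptorHeap[(handleType)(index)]
#else
    // DXIL: Shader Model 6.6 dynamic resources. The handle type is not needed
    // because the type of the assigned variable decides what is fetched.
    #define DESCRIPTOR_HEAP(handleType, index) ResourceDescriptorHeap[(index)]
#endif

// Intended use inside a wrapper (hypothetical names):
//   ByteAddressBuffer buffer = DESCRIPTOR_HEAP(ByteBufferHandle, handle.index);
```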

As for the implementation, all of the associated functions need to be wrapped for each of the types. For buffer types this is very straightforward: we just implement the Load function and, for the RW variants, the Store function.
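The buffer wrappers could be sketched as follows. This assumes HLSL 2021 templates and Shader Model 6.6, and for brevity it indexes ResourceDescriptorHeap directly (the Dx12 path) instead of going through DESCRIPTOR_HEAP. The function names are our own choice for illustration:

```hlsl
struct RenderResourceHandle
{
    uint index; // assumption: one 32-bit descriptor index
};

struct RawBuffer
{
    RenderResourceHandle handle;

    // Fetch the buffer from the heap and load a value of type T.
    template <typename T>
    T Load(uint byteOffset)
    {
        ByteAddressBuffer buffer = ResourceDescriptorHeap[handle.index];
        return buffer.Load<T>(byteOffset);
    }
};

struct RWRawBuffer
{
    RenderResourceHandle handle;

    template <typename T>
    T Load(uint byteOffset)
    {
        RWByteAddressBuffer buffer = ResourceDescriptorHeap[handle.index];
        return buffer.Load<T>(byteOffset);
    }

    // Only the RW variant exposes Store.
    template <typename T>
    void Store(uint byteOffset, T value)
    {
        RWByteAddressBuffer buffer = ResourceDescriptorHeap[handle.index];
        buffer.Store<T>(byteOffset, value);
    }
};
```

A call then reads as `float v = myBuffer.Load<float>(16);`, with no resource declaration in sight.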

However, for texture types this becomes a bit more complicated as we need to take resource dimensions and type into account. So we need to implement a Load, Store, Sample etc. for all of the possible resource dimensions, along with adding support for any other texture types, for example a TextureCube.
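To give an idea of how a single texture wrapper can cover several dimensions, here is a sketch with one function per dimension. The function names (`Load2D`, `Load3D`, `SampleLevelCube`) are hypothetical, the value type is passed as a template argument, and the Dx12 path is used directly:

```hlsl
struct RenderResourceHandle
{
    uint index; // assumption: one 32-bit descriptor index
};

struct Texture
{
    RenderResourceHandle handle;

    // T is the value type of the texture, for example float4 or uint.
    template <typename T>
    T Load2D(int2 texel, int mip)
    {
        Texture2D<T> tex = ResourceDescriptorHeap[handle.index];
        return tex.Load(int3(texel, mip));
    }

    template <typename T>
    T Load3D(int3 texel, int mip)
    {
        Texture3D<T> tex = ResourceDescriptorHeap[handle.index];
        return tex.Load(int4(texel, mip));
    }

    // Other texture types, such as cubes, get their own functions.
    template <typename T>
    T SampleLevelCube(SamplerState samplerState, float3 direction, float mip)
    {
        TextureCube<T> tex = ResourceDescriptorHeap[handle.index];
        return tex.SampleLevel(samplerState, direction, mip);
    }
};
```

The same pattern would have to be repeated for every remaining dimension and function, which is where most of the helper file's size comes from.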

An important thing to note here is that some functions might not be available in certain pipeline stages. You can choose to explicitly hide certain function implementations by checking which shader stage is currently being compiled. These shader stage defines are exposed by default in HLSL, both for DXIL and SPIR-V.
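As an illustration, the sketch below hides implicit-derivative sampling from every stage except the pixel stage, using DXC's predefined `__SHADER_TARGET_STAGE` and `__SHADER_STAGE_PIXEL` macros. Which functions to hide for which stages is a design choice; the function names are hypothetical:

```hlsl
struct RenderResourceHandle
{
    uint index; // assumption: one 32-bit descriptor index
};

struct Texture
{
    RenderResourceHandle handle;

    // Explicit-LOD sampling does not need derivatives, so it is always exposed.
    template <typename T>
    T SampleLevel2D(SamplerState samplerState, float2 uv, float mip)
    {
        Texture2D<T> tex = ResourceDescriptorHeap[handle.index];
        return tex.SampleLevel(samplerState, uv, mip);
    }

#if __SHADER_TARGET_STAGE == __SHADER_STAGE_PIXEL
    // Sample relies on screen-space derivatives, so this sketch only
    // exposes it when compiling a pixel shader.
    template <typename T>
    T Sample2D(SamplerState samplerState, float2 uv)
    {
        Texture2D<T> tex = ResourceDescriptorHeap[handle.index];
        return tex.Sample(samplerState, uv);
    }
#endif
};
```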

For RW textures we cannot directly overload the array operator to store data, so instead we wrap this functionality in a function. Again, we specialize per texture dimension here.
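A sketch of such store functions, again with hypothetical names and the Dx12 path only. A user-defined index operator returns a value rather than something assignable, so the write goes through a named function instead:

```hlsl
struct RenderResourceHandle
{
    uint index; // assumption: one 32-bit descriptor index
};

struct RWTexture
{
    RenderResourceHandle handle;

    template <typename T>
    T Load2D(uint2 texel)
    {
        RWTexture2D<T> tex = ResourceDescriptorHeap[handle.index];
        return tex[texel];
    }

    // One store function per dimension replaces `tex[texel] = value`.
    template <typename T>
    void Store2D(uint2 texel, T value)
    {
        RWTexture2D<T> tex = ResourceDescriptorHeap[handle.index];
        tex[texel] = value;
    }

    template <typename T>
    void Store3D(uint3 texel, T value)
    {
        RWTexture3D<T> tex = ResourceDescriptorHeap[handle.index];
        tex[texel] = value;
    }
};
```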

ResourceDescriptorHeap in Vulkan

At this time Vulkan doesn’t support the recently introduced ResourceDescriptorHeap yet. Since we are running the same shaders in both Dx12 and Vulkan, we need to find a way to emulate the ResourceDescriptorHeap in such a way that we have minimal divergence in code paths between DXIL and SPIR-V resource accesses.

We found three viable options:

Write a static wrapper that emulates ResourceDescriptorHeap as closely as possible with index operator overloading.
Create a define that accesses the resources according to some explicit arguments provided.
Wait for the official ResourceDescriptorHeap support to drop for Vulkan/SPIR-V.

Waiting for the official Vulkan extension to be released was out of the question for us, so we decided to try the two other options for the time being. Both of these options have their pros and cons with regard to code quantity and emulation accuracy.

Important note: the techniques that we are going to cover to emulate ResourceDescriptorHeap behavior in SPIR-V are NOT meant to be used as a long-term solution. We’re just showing that certain paths are possible within SPIR-V to roughly achieve this functionality. Once ResourceDescriptorHeap is officially supported, all of the SPIR-V specific boilerplate code can be thrown away for a much more lightweight implementation.

First off, there is a set of texture types that we need to support in our framework, each with value types based on uint, int and float. Remember that registers can be overlapped in Vulkan. We abuse this behavior to create macros that declare all of our texture resources, including their value types.

Let’s assume we call this with DEFINE_TEXTURE_TYPES_AND_FORMATS_SLOTS(Texture2D, 0, 0). This would be expanded to the following:
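As a sketch of how such a macro and its expansion could look, assume that the second and third arguments are the Vulkan binding and descriptor set, and that the resource names follow a `g_<textureType>_<valueType>` pattern. Both the argument meaning and the naming scheme are assumptions made for illustration:

```hlsl
// Hypothetical definition: one unbounded array per value type, all placed on
// the same binding and set so that they overlap. Only a few value types are
// filled in here.
#define DEFINE_TEXTURE_TYPES_AND_FORMATS_SLOTS(textureType, bindingIndex, setIndex) \
    [[vk::binding(bindingIndex, setIndex)]] textureType<float4> g_##textureType##_float4[]; \
    [[vk::binding(bindingIndex, setIndex)]] textureType<uint4> g_##textureType##_uint4[]; \
    [[vk::binding(bindingIndex, setIndex)]] textureType<int4> g_##textureType##_int4[];

DEFINE_TEXTURE_TYPES_AND_FORMATS_SLOTS(Texture2D, 0, 0)

// With the definition above, the call expands to:
//   [[vk::binding(0, 0)]] Texture2D<float4> g_Texture2D_float4[];
//   [[vk::binding(0, 0)]] Texture2D<uint4> g_Texture2D_uint4[];
//   [[vk::binding(0, 0)]] Texture2D<int4> g_Texture2D_int4[];
```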

If we fill out all the value type implementations in the macro, we can declare all of our texture registers without manually typing out hundreds of lines of code. Setting up the registers using the texture types and their value types is especially convenient later on, as it lets us match each resource access to the corresponding register in the index operator overloads of the VkResourceDescriptorHeap emulation. Now that we have a define that declares all value types of a texture type, we can simply call this macro for all of the texture types.

Operator overloading method

The operator overloading method is by far the most complex method that we found, but it seems to be the option that comes closest to emulating ResourceDescriptorHeap behavior in SPIR-V.

For the index operator overload method, we simply create a new struct and implement all the operator overloads based on what resource type is being retrieved. There is a big problem with this though: operator overloading does not differentiate based on return type, only on arguments. This forces us to write a dummy identifier type for each kind of resource access. For buffers we do not need to specify a template argument, as this type does not rely on one for any kind of resource access. Textures are a different story, as they can contain different value types, so we need to add a template type to each of the texture handles.
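The buffer part of this idea could be sketched as follows, assuming HLSL 2021 operator overloading. The dummy handle types, the resource array names and the struct name `VkResourceDescriptorHeapEmulation` are all hypothetical:

```hlsl
// Overlapping Vulkan bindings for the two buffer flavors.
[[vk::binding(0, 0)]] ByteAddressBuffer g_byteAddressBuffers[];
[[vk::binding(0, 0)]] RWByteAddressBuffer g_rwByteAddressBuffers[];

// Dummy identifier types: they carry the same index, but their distinct
// types let overload resolution pick the right operator.
struct ByteBufferHandle
{
    uint index;
};

struct RWByteBufferHandle
{
    uint index;
};

struct VkResourceDescriptorHeapEmulation
{
    ByteAddressBuffer operator[](ByteBufferHandle h)
    {
        return g_byteAddressBuffers[NonUniformResourceIndex(h.index)];
    }

    RWByteAddressBuffer operator[](RWByteBufferHandle h)
    {
        return g_rwByteAddressBuffers[NonUniformResourceIndex(h.index)];
    }
};

// A single static instance stands in for ResourceDescriptorHeap.
static VkResourceDescriptorHeapEmulation VkResourceDescriptorHeap;

// Example access: the argument type decides which buffer array is indexed.
uint LoadFirstWord(uint descriptorIndex)
{
    ByteBufferHandle h;
    h.index = descriptorIndex;
    ByteAddressBuffer buffer = VkResourceDescriptorHeap[h];
    return buffer.Load(0);
}
```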

Again we abuse macros here to reduce the amount of code until ResourceDescriptorHeap is officially supported. We define an operator overload specialization based on the resource type and its value type.

After defining how each operator overload should be implemented, we can create another macro that implements all index operator overloads for all texture types, and their texture value type permutations.
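A sketch of these two macro layers for Texture2D is shown below. The handle template, macro names and resource names are hypothetical, and only three value types are filled in:

```hlsl
// Overlapping registers, one per value type.
[[vk::binding(0, 0)]] Texture2D<float4> g_Texture2D_float4[];
[[vk::binding(0, 0)]] Texture2D<uint4> g_Texture2D_uint4[];
[[vk::binding(0, 0)]] Texture2D<int4> g_Texture2D_int4[];

// Templated dummy handle: T only selects the overload, it stores no data.
template <typename T>
struct Texture2DHandle
{
    uint index;
};

// One index operator overload for a single texture type and value type.
#define DEFINE_TEXTURE_INDEX_OPERATOR(textureType, valueType) \
    textureType<valueType> operator[](textureType##Handle<valueType> h) \
    { \
        return g_##textureType##_##valueType[NonUniformResourceIndex(h.index)]; \
    }

// All value type permutations for one texture type.
#define DEFINE_TEXTURE_INDEX_OPERATORS(textureType) \
    DEFINE_TEXTURE_INDEX_OPERATOR(textureType, float4) \
    DEFINE_TEXTURE_INDEX_OPERATOR(textureType, uint4) \
    DEFINE_TEXTURE_INDEX_OPERATOR(textureType, int4)

struct VkResourceDescriptorHeapEmulation
{
    DEFINE_TEXTURE_INDEX_OPERATORS(Texture2D)
};

static VkResourceDescriptorHeapEmulation VkResourceDescriptorHeap;
```

Supporting another texture type then comes down to declaring its registers, adding a matching handle template, and adding one more DEFINE_TEXTURE_INDEX_OPERATORS line.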

Now that we have a way to implement the index operator overloading, we can easily start adding support for types we want to support in our ResourceDescriptorHeap emulation. Note that we assume (RW)ByteAddressBuffers for all of our buffer accesses. At the time of writing this blog, there are still a few issues with getting StructuredBuffers working correctly using this implementation. Once these issues have been solved, other buffer types can be added to this implementation as well.

Now that we have roughly the same behavior with resource heap indexing in Vulkan compared to Dx12, there is one last step to make this fully work. We need to explicitly state what kind of a resource the VkResourceDescriptorHeap is going to index. This is far from ideal and could potentially be worked around with more template magic, but we decided this would not be worth the effort.

Pure define approach

Instead of mimicking ResourceDescriptorHeap indexing, you can also take a much more lightweight approach that needs far less code at the cost of a split in the resource heap. It still uses the resource register declarations discussed earlier in this post. This technique is significantly more lightweight than the index operator overloading approach, but it is still far from ideal.
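One way this could look is sketched below, assuming DXC's `__spirv__` macro and the same hypothetical register naming as before. The macro names are our own, and the split shows up as separate access macros per resource category:

```hlsl
#ifdef __spirv__
    [[vk::binding(0, 0)]] ByteAddressBuffer g_ByteAddressBuffer[];
    [[vk::binding(1, 0)]] Texture2D<float4> g_Texture2D_float4[];

    // The caller spells out what is being accessed, and the macro pastes
    // those arguments together into the name of the matching array.
    #define DESCRIPTOR_HEAP_BUFFER(index) \
        g_ByteAddressBuffer[NonUniformResourceIndex(index)]
    #define DESCRIPTOR_HEAP_TEXTURE(textureType, valueType, index) \
        g_##textureType##_##valueType[NonUniformResourceIndex(index)]
#else
    // On Dx12 the extra arguments are ignored.
    #define DESCRIPTOR_HEAP_BUFFER(index) ResourceDescriptorHeap[index]
    #define DESCRIPTOR_HEAP_TEXTURE(textureType, valueType, index) ResourceDescriptorHeap[index]
#endif

float4 LoadTexel(uint textureIndex, int2 texel)
{
    Texture2D<float4> tex = DESCRIPTOR_HEAP_TEXTURE(Texture2D, float4, textureIndex);
    return tex.Load(int3(texel, 0));
}
```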

Resulting shader code

The templated resource wrappers lead to quite a big helper file, but they open up paths to writing very clean and intuitive shaders. Instead of having the user always write the same resource declaration and indexing logic for any resources they may need, we just have a buffer containing our resources, which are directly accessible without forcing the user to worry about registers and declarations. This is a very sharp double-edged sword though: the technique allows the user to load entire resources from any memory, which may not contain a valid resource, resulting in a crash when it is accessed. For our framework Breda, we simply discourage this behavior unless absolutely necessary. Luckily there is a reliable way to validate these resource access patterns, which will be covered in the last part of this series.
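To illustrate the kind of shader this enables, here is a small compute shader sketch for the Dx12 path. The `Bindings` struct, the root constant layout and the entry point are hypothetical, and the wrappers are reduced to the functions this example needs:

```hlsl
struct RenderResourceHandle
{
    uint index; // assumption: one 32-bit descriptor index
};

struct RawBuffer
{
    RenderResourceHandle handle;

    template <typename T>
    T Load(uint byteOffset)
    {
        ByteAddressBuffer buffer = ResourceDescriptorHeap[handle.index];
        return buffer.Load<T>(byteOffset);
    }
};

struct RWRawBuffer
{
    RenderResourceHandle handle;

    template <typename T>
    void Store(uint byteOffset, T value)
    {
        RWByteAddressBuffer buffer = ResourceDescriptorHeap[handle.index];
        buffer.Store<T>(byteOffset, value);
    }
};

// The resources this pass needs, stored as wrappers inside a buffer.
struct Bindings
{
    RawBuffer source;
    RWRawBuffer destination;
};

// Assumption: the index of the bindings buffer arrives as a 32-bit constant.
struct RootConstants
{
    uint bindingsBufferIndex;
};
ConstantBuffer<RootConstants> g_constants : register(b0);

[numthreads(64, 1, 1)]
void main(uint3 threadId : SV_DispatchThreadID)
{
    // Load all wrapped resources in one go. No per-resource declarations
    // or register assignments are needed beyond this point.
    ByteAddressBuffer bindingsBuffer = ResourceDescriptorHeap[g_constants.bindingsBufferIndex];
    Bindings bindings = bindingsBuffer.Load<Bindings>(0);

    uint byteOffset = threadId.x * 4;
    float value = bindings.source.Load<float>(byteOffset);
    bindings.destination.Store<float>(byteOffset, value * 2.0f);
}
```

Note that nothing here checks whether the loaded indices refer to live resources or whether the offsets are in bounds, which is exactly the risk described above.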

Next blog

In the final part of the series we will be covering bindless GPU validation built on top of everything we’ve covered so far. The GPU validation should cover the risks newly introduced by this templated approach. This includes detecting invalid resource usage, ensuring a handle actually points to a resource, and easily verifying that a handle’s lifetime is valid.

Darius Bouma
