
Rendering related threads

A place to store rendering-related threads I find (or create) on Twitter for posterity.

Scalarisation and wave intrinsics (to allow direct intra-warp communication) are what all the cool kids seem to do nowadays. Here's a list of resources on the topic.

8 replies 61 Retweets 171 likes

Alright! Below I'm gonna post my collection of links related to occlusion culling. If I missed good links, you folks are welcome to post them below.

3 replies 73 Retweets 222 likes

Frame graphs seem like a good idea to help handle the complexity of modern rendering engines/low level graphics APIs. Good collection of resources on them with some code projects.

1 reply 43 Retweets 160 likes

Someone at work asked me today where I find all those presentations about graphics techniques, and it made me realise that this might not be common knowledge for people just starting gfx programming. Thread of links.

9 replies 317 Retweets 978 likes

This is a prime example of why one should do algorithmic improvements first instead of micro-optimisations: Shadow pass started at 16ms, using a custom buffer layout to reduce mem bandwidth took it to 13ms, using SAH to reduce traversal steps took it down to 3.6ms (GTX 970).

4 replies 24 Retweets 79 likes
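
For readers who have not met SAH (the surface area heuristic) before, here is a minimal Python sketch of the cost function it is usually built on. It is not the code behind the numbers in the tweet: the `AABB` class, the `sah_cost` name and the unit traversal/intersection costs are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AABB:
    lo: tuple  # (x, y, z) minimum corner
    hi: tuple  # (x, y, z) maximum corner

    def surface_area(self):
        dx, dy, dz = (h - l for l, h in zip(self.lo, self.hi))
        return 2.0 * (dx * dy + dy * dz + dz * dx)


def sah_cost(parent, left, n_left, right, n_right, c_trav=1.0, c_isect=1.0):
    """Estimated cost of splitting `parent` into `left` and `right`.

    The chance that a ray hitting the parent also hits a child is
    approximated by the ratio of their surface areas. The cost constants
    are placeholder assumptions, not measured values.
    """
    sp = parent.surface_area()
    p_left = left.surface_area() / sp
    p_right = right.surface_area() / sp
    return c_trav + c_isect * (p_left * n_left + p_right * n_right)


# Hypothetical node: 10 primitives, 8 of them clustered in the first quarter.
parent = AABB((0, 0, 0), (4, 1, 1))
left = AABB((0, 0, 0), (1, 1, 1))
right = AABB((1, 0, 0), (4, 1, 1))

split_cost = sah_cost(parent, left, 8, right, 2)
leaf_cost = 1.0 * 10  # intersect all 10 primitives without splitting
```

A builder would evaluate this cost for several candidate split planes and keep the cheapest one, or make a leaf if no split beats `leaf_cost`.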

Programming with compute shaders (efficiently), balancing workloads with resources and thinking in parallel, gives many opportunities to learn how GPUs really work (well, pretty close at least). A few links to get you started. (1/N)

8 replies 92 Retweets 343 likes

A common theme in the questions I received so far is that beginners feel intimidated by graphics programming and do not know how to start. They need not be though as graphics programming can be approached in different ways and at many levels of complexity (1/5).

3 replies 20 Retweets 56 likes

Early prototyping work for a previous game. This is combining volumetric lightshafts and low fog in one screenspace pass.

1 reply 7 Retweets 57 likes

So below, I compiled a list of awesome people that you should follow if you're interested in computer graphics! Please retweet for visibility, and do tell if I missed someone!

16 replies 120 Retweets 297 likes

Daily Pathtracing, Part 1. Initial simple C++ (Win/Mac) implementation & walkthrough. Next part: fixing a gross perf embarrassment in it.

25 replies 112 Retweets 585 likes

Alright, below I'm gonna list some professors/researchers in Computer Graphics whose research and papers I absolutely love. You folks are also very much welcome to post your own suggestions below!

8 replies 40 Retweets 205 likes

Awesome uniform load optimization for loops. Beats AMD scalar optimizations in performance and is as fast as constant buffer loads on Nvidia, but without any downsides. Supports typed/raw/structured buffer AND textures:

9 replies 50 Retweets 178 likes

Good example of float precision issues if world space is used in rendering. I personally prefer camera-centered world space (camera is 0,0,0). This is often better than view space, because: A) faster & less lossy xform, B) normals stay in world space (easy to sample cubemaps).

3 replies 10 Retweets 79 likes
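
The precision issue is easy to reproduce on the CPU. The sketch below is an illustration, not the tweet author's code: it rounds values to 32-bit floats with the standard `struct` module and compares subtracting the camera position after that rounding against subtracting it in double precision first. The coordinates (a camera 100 km from the origin, a vertex 1 mm away from it) are made-up assumptions.

```python
import struct


def to_f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


camera_x = 100_000.0    # hypothetical camera position, in metres
vertex_x = 100_000.001  # hypothetical vertex 1 mm away from the camera

# World space in float32: near 100 000 the spacing between representable
# values is 2**-7 (about 7.8 mm), so the 1 mm offset is lost on conversion.
naive = to_f32(vertex_x) - to_f32(camera_x)

# Camera-centered world space: subtract in double precision on the CPU,
# then hand the small relative value to the GPU as float32.
relative = to_f32(vertex_x - camera_x)

print(naive, relative)
```

Since the offset from the camera is only a translation, directions and normals are unchanged, which matches point B in the tweet.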

Just learned that HLSL considers two structures to be equivalent for function overloading if they're the same size. This feels like a legacy behavior that should be fixed in future DirectX versions! See example here:

6 replies 13 Retweets 43 likes

New blog post: Mesh Shader Possibilities! Or why graphics programmers have been yelling about mesh shaders over the last couple weeks.

7 replies 83 Retweets 244 likes

Finished my Claybook raw->typed buffer port. Here's a thread containing some performance analysis numbers on AMD GPU. TLDR: Use raw buffers if your platform supports them!

2 replies 8 Retweets 47 likes

I published a new blog post with an overview of the new OpenGL and Vulkan extensions for the NVIDIA Turing architecture.

5 replies 92 Retweets 175 likes

This is an important slide. Scenes behind the 10 Gigarays/s number. Each scene has a single high poly mesh (no background). Primary rays = fully coherent.

3 replies 20 Retweets 55 likes

The most curious GPU bottleneck: ROP exports apparently retire in submission order. If your PS has an early-out fast path and a very slow generic path, exports of fast pixels will stall (waiting for previous slow pixels). Moved shadow cone trace PS->CS = 50% perf gain (both Nvidia and AMD).

11 replies 40 Retweets 174 likes

Indexed skinning with 3x fewer indexed bone matrix memory loads. Supports up to 64 matrices on GCN (up to 32 on NVidia). Needs Vulkan 1.1 subgroupShuffle. Doesn't work in DX12, because SM 6.0 only exposes GCN2 equivalent wave intrinsics:

4 replies 9 Retweets 44 likes

Idea: sample-reusing reconstruction for area lighting (similar to reconstruction of reflections). Pick one light per pixel via single-slot weighted reservoir sampling, reuse spatially with approximate occlusion. 1spp, 1spp+16 tap filter:

5 replies 12 Retweets 63 likes
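
The "single-slot weighted reservoir sampling" step can be sketched in a few lines of Python. This covers only the light selection, not the spatial reuse or the occlusion approximation, and the `pick_light` name and the example weights are assumptions for illustration rather than anything taken from the tweet.

```python
import random


def pick_light(weights, rng):
    """Stream over light weights once and keep a single candidate.

    Each light i ends up chosen with probability weights[i] / sum(weights),
    without storing the list or knowing the total in advance.
    Returns (chosen_index, weight_sum); chosen_index is -1 if no light
    has a positive weight.
    """
    chosen = -1
    w_sum = 0.0
    for i, w in enumerate(weights):
        if w <= 0.0:
            continue  # lights with no contribution are never selected
        w_sum += w
        # Replace the current candidate with probability w / w_sum.
        if rng.random() * w_sum < w:
            chosen = i
    return chosen, w_sum


# Hypothetical per-pixel light weights (e.g. unshadowed contribution estimates).
weights = [1.0, 3.0, 0.0, 4.0]
rng = random.Random(1234)
index, total = pick_light(weights, rng)
# An estimator would scale the chosen light's contribution by total / weights[index].
```

Only one slot and a running sum are kept, so the state per pixel stays small.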

A bunch of people are asking what resources I recommend to start learning graphics programming. So you get a thread on it!

44 replies 690 Retweets 2,506 likes

How to become an advanced graphics programmer: some general advice and tips from me, an expert graphics programmer. Huge thread below.

14 replies 203 Retweets 699 likes

Hey and (or other folks), what's the basic idea for dithering when you have a floating point target? There's no natural dither size, e.g. 1/255, to add ...

4 replies 3 likes

Another small volumetric path tracer logical step: I added light sampling (sun on top of the sky, still single scattering only).

2 replies 5 Retweets 74 likes

The most elegant UE4 RHI hack... Apparently I need to ship this because Turing (RTX) drivers have the same UINT clear bug that Intel and AMD fixed 1+ year ago. Spec allows using UINT clear to clear unorm/float (bitwise fill), apparently nobody reads DX11 spec "Remarks" sections.

2 replies 2 Retweets 18 likes
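
To make "bitwise fill" concrete: if a UINT clear copies raw bits into a float target, the value to pass is the IEEE-754 bit pattern of the desired float. The Python sketch below only shows that reinterpretation. It is not the UE4 RHI hack from the tweet, and the helper name `float_clear_bits` is hypothetical.

```python
import struct


def float_clear_bits(value):
    """Return the 32-bit IEEE-754 bit pattern of `value` as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


# Bit patterns one would pass to a bitwise clear of a 32-bit float target.
print(hex(float_clear_bits(1.0)))   # 0x3f800000
print(hex(float_clear_bits(0.5)))   # 0x3f000000
print(hex(float_clear_bits(0.0)))   # 0x0
```

A unorm target would need its own encoding (for example 255 for 1.0 in an 8-bit channel), which this helper does not cover.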

I was going to call it "why geometry shaders are slow" but Intel had to go and be different :)

1 reply 2 likes

Optimization for forward shading pixel shaders with light tile/froxel load: Instead of loading the same light with each pixel, load a different light with each pixel inside a 2x2 quad and use quad swizzle to broadcast light data in light loop. This reduces number of loads by 4x.

8 replies 11 Retweets 102 likes
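
The load-count argument can be checked with a small CPU simulation of one 2x2 quad. This is a sketch under stated assumptions, not shader code: `CountingLightBuffer`, `shade`, `quad_naive` and `quad_broadcast` are hypothetical names, lights are reduced to a single float, and all four pixels are assumed to share the same tile/froxel light list. On a GPU the inner broadcast would be a quad intrinsic (HLSL SM 6.0 has `QuadReadLaneAt`, for example) rather than a Python list read.

```python
class CountingLightBuffer:
    """Stand-in for a GPU light buffer that counts memory loads."""

    def __init__(self, lights):
        self.lights = lights
        self.loads = 0

    def load(self, i):
        self.loads += 1
        return self.lights[i]


def shade(pixel, light):
    # Placeholder shading function; real lighting math would go here.
    return light / (1.0 + pixel)


def quad_naive(pixels, buf, n):
    """Every lane loads every light itself: 4 * n loads per quad."""
    out = [0.0] * 4
    for lane in range(4):
        for i in range(n):
            out[lane] += shade(pixels[lane], buf.load(i))
    return out


def quad_broadcast(pixels, buf, n):
    """Each lane loads a different light, then the quad shares them: n loads."""
    out = [0.0] * 4
    for base in range(0, n, 4):
        # Lane k loads light base + k (or nothing past the end of the list).
        regs = [buf.load(base + lane) if base + lane < n else None
                for lane in range(4)]
        for src in range(4):
            light = regs[src]  # simulated broadcast from lane `src`
            if light is None:
                continue
            for lane in range(4):
                out[lane] += shade(pixels[lane], light)
    return out


pixels = [0.0, 1.0, 2.0, 3.0]               # hypothetical per-pixel inputs
lights = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]     # hypothetical tile light list

naive_buf = CountingLightBuffer(lights)
bcast_buf = CountingLightBuffer(lights)
naive_out = quad_naive(pixels, naive_buf, len(lights))
bcast_out = quad_broadcast(pixels, bcast_buf, len(lights))
```

Both versions visit the lights in the same order for every pixel, so the results match while the broadcast version issues a quarter of the loads.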