Copied from the original Reddit post as a precaution, since Reddit has been unstable recently. I do prefer Reddit, though, as it enables conversation.

Dear graphics programming community,

I'm finally back to introduce you to the 3rd version of my game engine: HighOmega v3.0: King Mackerel.
Check it out here: there's a link in the description for an offline video download since YouTube compression is what it is. Fair warning: you're gonna need an RTX 2080 Ti or a Radeon VII for a decent frame rate. It'll be kind of obvious in a second why.

What is this?:

This is a game engine with the craziest rendering pipeline you've ever seen! The latest iteration -- HighOmega v3.0 -- was under full-time development for the past 9 months and it's finally here! If RTX is present, it can do triangle-based path-tracing along with spatio-temporal de-noising to conduct real-time light transport. If RTX is not present, it can drop to an alternative path mid-pipeline which path-traces against voxels and conducts the exact same de-noising and post-processing effects. Here's the mind-blower: instead of Next Event Estimation, it uses a real-time, coarse-ish 3D radiosity map for explicit luminaire information. This map can in fact be used to entirely eliminate variance in smooth gloss and transmission when the BSDF path lands on a mostly Lambertian (diffuse) surface afterwards (e.g. without the need for virtual positions or anything extra like what is needed here: ). There are other benefits too. For more details, keep on reading :).
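As a rough sketch of that branching, here's how the two tracing backends feed the same back half of the pipeline. This is purely illustrative; every name below is made up for the sketch, not taken from the engine:

```python
def build_pipeline(rtx_available):
    """Sketch of the two tracing backends sharing one pipeline tail:
    hardware triangle path-tracing when RTX is available, voxel
    path-tracing otherwise, with identical de-noising and
    post-processing either way."""
    tracer = "triangle_path_trace" if rtx_available else "voxel_path_trace"
    # The radiosity-map lookup, de-noising and post-processing stages
    # are shared regardless of which tracer produced the samples.
    return [tracer, "radiosity_map_lookup", "spatiotemporal_denoise", "postprocess"]
```

The point of the sketch is just that only the first stage differs between the two hardware paths.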

Some background:

So I posted this about 4 years ago and decided that it was time to finally make a game with the engine (v2.0). As I was progressing, I felt that the rendering pipeline was a bit busy. Truth is, the engine had been dragged from OpenGL 2.1 all the way to 4.2, and many old and new lighting models (BRDFs) and material definitions were mixed together. Also, everything was kind of trampled on top of everything else (e.g. indirect diffuse on top of an ambient term). To top it off, there were 2 built-in experimental shader-based path-tracers with completely different notions of geometry that I didn't know what to do with. In short: it was a mess. So I decided it needed a revamp. Since Vulkan was just around the corner, I decided to get a head-start writing a DX12 wrapper that I would extend to cover Vulkan as well. Once Vulkan came out, I basically abandoned the whole thing since the abstractions were not a perfect match, and re-abstracted Vulkan from scratch. Once that was done, I decided to take an off-the-beaten-path approach and find something crazy to do with the voxel-based path tracer. By this point it was late 2017 and the two NVIDIA papers on de-noising were out, so I decided to combine the two, which resulted in a technique called Voxel-based Hybrid Path-tracing with Spatial (and later Spatio-temporal) De-noising. This was showcased live on my GTX 970M laptop at i3D 2018. You can find the abstract (outdated) here: and the poster itself (less outdated) here: with some photos to boot:

So what's new? (since i3D 2018):

So what I've been up to was mainly finding a nice way to reduce variance. The i3D version had Next Event Estimation, and since you more or less pick a random luminaire surface hint to sample, its variance-reduction capabilities aren't that phenomenal... and that's assuming the hint doesn't fall behind the object from the BSDF vertex's point of view. NEE makes a difference at, let's say, 25 SPP compared to no explicit luminaire sampling, but at temporally accumulated 1 SPP (over the last 10 frames) it's nothing to write home about. Plus, if you're not using RTX, you have to basically generate luminaire surface hints on the CPU and upload them... and that's a CPU-side dependency I didn't like. So I decided to combine something that had been on my mind -- cone-tracing -- and path-tracing. And voilà! I decided to cone-trace (potentially diffuse) voxels against themselves (in a decoupled-shading-from-visibility manner via compute) and capture radiosity. This surprisingly wasn't that slow, even though it's not very scalable. It'll work for small environments if you strike the right balance between radiosity-map voxel coarseness and environment size. Since I was already re-voxelizing dynamic parts of the scene along with re-distance-transforming a margin around them, doing this in compute was a breeze. The next step was combining the two. Now if you think about it, there's an unbiased way to do it (duh!): use it only for the diffuse term and recursively raytrace the gloss term... and surprise: people have known this since the dawn of time (see ). I opted for a prettier but more biased approach (since my voxels are a bit coarse-ish): simply scale it down at each BSDF vertex with the vertex diffuseness and the dot-chain up to that point, modulate with vertex albedo and add it as explicit lighting. As double-dip protection isn't possible (CORRECTION 12/03/2019: come to think of it, it probably is, it would just look a little dull), this will make things brighter (add bias)...
but the enemy is variance and not bias, really. This surprisingly looks pretty good... and there's a huge benefit as mentioned above: if you start at smooth gloss or transmission and end up at a perfectly diffuse surface, you can sample the radiosity map (preferably in a filtered manner) and be done in a radiometrically correct manner (coarseness considered). Which is exactly what I do in that case. There is one edge case though: hitting/going-through rough gloss/transmission past a first smooth bounce (they will -- erroneously -- be considered smooth). No free lunch, I guess. But here's another mind-blower: since compute (on most GPUs) roughly starts at lower workgroup items (representing one side of the environment) and works its way up to higher workgroup items (representing another side of the environment), you can get the GPU to give you many, many radiosity iterations for FREE as it progresses along the map. If you hit another diffuse voxel (instead of a luminaire or the sky or something), proceed to fetch the stored radiance, modulate with the albedo at said voxel, and you have your irradiance from that voxel! This freebie just blew my mind... I wasn't expecting it. Of course this is inconsistent across the entire radiosity map, but hey, who's counting? :) I suspect that with the right kind of orientation of the environment, you can probably get blurry-as-hell SDS (as in specular-diffuse-specular) caustics, which would be interesting/funny to look at. Why, you ask? SDS caustics have been an interesting problem in and of themselves in offline path-tracing... since the dawn of time (read and see ). I have yet to set up an explicit test-case for this though. Again, bear in mind, the way I'm doing it is not bias-free... but heck, de-noising itself is a boatload of bias.
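Those free iterations are essentially a Gauss-Seidel-style sweep: because the gather reads and writes the same radiosity storage, voxels processed later in the dispatch see radiance that earlier voxels just stored. Here's a minimal 1-D, single-channel CPU sketch of that effect (hypothetical; the real thing is a compute shader over a 3D map, and the transfer factors stand in for the cone-traced visibility):

```python
def gather_radiosity_in_place(emission, transfer):
    """One in-place gather sweep over a row of voxels.

    transfer[i][j] is the (assumed precomputed) fraction of voxel j's
    radiance arriving at voxel i. Because voxel i reads values that
    voxels < i wrote earlier in this same sweep, light can hop several
    voxels in a single pass -- the "free iterations" -- at the cost of
    the result depending on traversal order.
    """
    radiance = list(emission)
    n = len(radiance)
    for i in range(n):
        gathered = sum(transfer[i][j] * radiance[j] for j in range(n) if j != i)
        radiance[i] = emission[i] + gathered
    return radiance
```

With an emitter in voxel 0 and each voxel receiving half of its left neighbor's radiance, a single sweep already propagates light two hops; a double-buffered (Jacobi-style) sweep would only reach one hop per pass.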

Phew, that was a mouthful.
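Since that was a lot, here's a tiny CPU-side sketch of the biased combination described above: at each BSDF vertex, take a filtered radiosity-map sample, scale it by the vertex's diffuseness and by the dot-chain (path throughput) accumulated so far, modulate by albedo, and add it as explicit lighting. All names here are mine and the radiance is a single scalar for simplicity; this is not engine code:

```python
from dataclasses import dataclass

@dataclass
class PathVertex:
    position: tuple     # world-space hit point
    diffuseness: float  # 0 = pure specular, 1 = pure Lambertian
    albedo: float       # single-channel for simplicity
    cos_theta: float    # cosine factor this vertex adds to the dot-chain

def explicit_radiosity_term(path_vertices, sample_radiosity):
    """Sum the explicit radiosity contribution along one path.

    Each vertex adds its filtered radiosity-map sample, scaled by how
    diffuse the vertex is and by the throughput accumulated up to it,
    then modulated by the vertex albedo. With no double-dip protection
    this brightens the image (bias) rather than adding variance.
    """
    total = 0.0
    throughput = 1.0  # the "dot-chain" of cosine factors so far
    for v in path_vertices:
        total += sample_radiosity(v.position) * v.diffuseness * throughput * v.albedo
        throughput *= v.cos_theta  # extend the chain past this vertex
    return total
```

Note how a perfectly specular vertex (diffuseness 0) contributes nothing itself but still carries the chain forward, matching the smooth-gloss/transmission case above.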

Having said all this, large luminaires in the sky (as in sun/moon) do NEE with double-dip protection, though using direct lighting (no sampling on the surface). If RTX is present, the first BSDF vertex uses filtered raytraced shadows. If not, a shadow map is used. Past the first BSDF vertex, a shadow map is used regardless. Stained-glass caustics going through SSR'd transmission (see below) need and use the shadow map.
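That selection logic is simple enough to pin down in a few lines. A minimal sketch (function and return values are mine, purely illustrative):

```python
def sun_moon_shadow_technique(bsdf_vertex_index, rtx_available):
    """Pick the visibility test for large sky luminaires as described
    above: filtered ray-traced shadows only at the first BSDF vertex
    when RTX is present; a shadow map in every other case."""
    if bsdf_vertex_index == 0 and rtx_available:
        return "filtered_raytraced"
    return "shadow_map"
```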

What else is new since i3D 2018?:

Another interesting thing: if you're doing upsampling (which my pipeline does), following it up with spatial de-noising and then temporal accumulation works a LOT better than the exact reverse... which is what I was doing at i3D 2018. Simply having more lighting information in the fragment neighborhood works better to reduce low-frequency shimmer. Funny how that works.
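With the stages treated as opaque functions, the better ordering can be sketched like this (names are hypothetical, not from the engine):

```python
def resolve_frame(noisy_low_res, upsample, spatial_denoise, temporal_accumulate):
    """The post-i3D-2018 order: upsample first so the spatial filter
    sees the full-resolution fragment neighborhood, then temporally
    accumulate the already-smoothed result. The reverse order
    (accumulating and de-noising before upsampling) shimmered more."""
    return temporal_accumulate(spatial_denoise(upsample(noisy_low_res)))
```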

I'm confused as all heck:

Don't worry, I'm making a presentation with all this stuff. It'll all be clear soon. Well, hopefully soon :).

What isn't path-traced?:

If you liked this post, stay in touch through one or more of the following: (re-follow if you had before; there was a bit of a fiasco with this account.)

Thanks for reading!
And I'm here for any questions! :) (hit me up on Social Media!)

Yours truly,