
Assets HXGI Realtime Dynamic GI

Discussion in 'Works In Progress' started by Lexie, May 24, 2017.

  1. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    4,190
You could have released the old version as an asset and had this as an update :D
(that's what the people who want it are thinking)
     
  2. Ninlilizi

    Ninlilizi

    Joined:
    Sep 19, 2016
    Posts:
    271
    This motivates me to stop sucking at life and get mine ready to show off.

It's not perfect either... But it does this fast enough to be usable for VR

    (This is 8 samples per pixel)

     
    Gerard_Slee, jcarpay, cerrec and 5 others like this.
  3. GuitarBro

    GuitarBro

    Joined:
    Oct 9, 2014
    Posts:
    50
    This thread went from dead to hype in like, one day lol.
     
  4. Ninlilizi

    Ninlilizi

    Joined:
    Sep 19, 2016
    Posts:
    271
    Nerd Party!

    Bring your own hype!
     
    GuitarBro likes this.
  5. Ninlilizi

    Ninlilizi

    Joined:
    Sep 19, 2016
    Posts:
    271
    Interesting... I'd also fallen into having a vaguely similar system of flagged layers for performance reasons.

I've got a kinda semi-realtime bake in place-ish... and a fast update-for-emitters-only process; they run simultaneously. The slow update layer has some higher quality logic going on... and the fast update layer, while conversely more computationally expensive, balances out by only dealing with the super important mobile or animated emitters in a scene, so they update in just a few frames... whereas the slow refresh progressively updates only every half second.

Also playing with a somewhat hare-brained, it's-stupid-but-seems-to-work system of light decay over time and progressive propagation strategies... Prob not likely to develop into anything I'd show to the world... but it's a fun experiment if nothing else.

Both processes feed directly into the same 3D volume... I did initially have some extra steps to mux them together as a separate process, but eventually simplified the whole thing, cos extra complexity for such small gains.

But I'm still pretty much a GI noob compared to your experience... so I've gotten a little reclusive, because I'm spending more time learning than doing, and "I read a book today" isn't terribly exciting for anyone else.

Do you have any recommendations for learning about writing denoising algos... or alternate data storage paradigms beyond "this is my 3D texture, it is amazing"... with a specific focus on efficiency of samples and operation counts... mildly I'm-a-ditz friendly?

Most of the hare-brained things I've tried out... which usually sound smart before trying them... end up as neither gains nor losses... They work, but no better than the simpler solution. And I'm sure half of that is my own inexperience with a lot of the concepts I'm playing with...

I understand if you have more important things to do than recommend reading lists... I'll figure it out eventually :D

(I should add, my craziness likely won't ever appeal to the majority of users... as I'm targeting VR use alongside a procedural framework that exposes zero of the world geometry to Unity itself... so for my own project I need a solution that does realtime GI fast with access to nothing more than the render buffers themselves for most of the scene.)
     
    Last edited: Aug 8, 2019
  6. jefferytitan

    jefferytitan

    Joined:
    Jul 19, 2012
    Posts:
    84
Out of curiosity, when you say you only have access to the render buffers, do you mean only the main render buffer? Or do you have the luxury of multiple cameras, hence your orthographic camera?
     
  7. Ninlilizi

    Ninlilizi

    Joined:
    Sep 19, 2016
    Posts:
    271
    It's a bit more complex than that.
Though, yeah... the fast update does an orthographic shader-replacement pass over emitters only... which solves the "It's behind you!" problem.

The thing with my awesome, but a bitch-to-work-with, content creation tool... while it limits the way I can interact with the engine outside of physics, I do have access to its pixel shader... so I can make it write emission to a compute buffer and grab that with the fast update for the same effect.

Basically... tracing is only half the problem performance-wise... voxelising and light injection have room for lots of improvement too.

One of the reasons SEGI is so slow... is that it does, I think, about 3(?) complete scene renders before it has everything it needs to start tracing... The geometry-shader approach to making voxels is slow on top of that... Then there's a render to generate the shadow map... which it also does orthographically over the entire scene within its current cascade.

I've been watching and learning cool tricks from everyone else for the obvious drawing-the-GI side of things... but mostly I've been asking "is there a way I don't have to do this at all?" about all the various things that happen before you start tracing.

    So hopefully.... When I eventually dump this all on GitHub... There's some interesting material that even the actual geniuses can take some ideas from :D

    (Anyway, ask me questions on Discord... I'm already pushing my luck by crashing Lexie's thread asking questions.... Would be rude to completely take it over, like I have that other place!)
     
    Last edited: Aug 8, 2019
    jefferytitan and AFrisby like this.
  8. Gerard_Slee

    Gerard_Slee

    Joined:
    Apr 3, 2017
    Posts:
    7

    Let us at it!! I am tired of pathtracing.
     
  9. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    643
http://www.realtimerendering.com/ has some good references to look through. I mostly just google and read through papers. Intel has some open source denoisers as well. Good work on your GI. I'm guessing you're injecting the lighting into a voxel cell and then bleeding that out. It doesn't look like your lighting is anisotropic or handles occlusion, though. If this is how you are doing it, then you could look into Light Propagation Volumes; they support storing the lighting using Spherical Harmonics and can handle occlusion as well.
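For reference, the L1 Spherical Harmonics storage that Light Propagation Volumes use can be sketched like this — a minimal Python illustration of projecting directional light into 4 SH coefficients per cell (the constants are the standard L1 basis; this is a generic sketch, not anyone's actual implementation from this thread):

```python
import math

# Standard L1 spherical-harmonic basis constants used by Light Propagation Volumes.
SH_C0 = 0.5 * math.sqrt(1.0 / math.pi)   # Y(0,0)
SH_C1 = 0.5 * math.sqrt(3.0 / math.pi)   # Y(1,-1), Y(1,0), Y(1,1)

def sh_basis(direction):
    """Evaluate the 4 L1 SH basis functions for a unit direction."""
    x, y, z = direction
    return (SH_C0, -SH_C1 * y, SH_C1 * z, -SH_C1 * x)

def sh_add_light(coeffs, direction, intensity):
    """Accumulate a directional light sample into a 4-coefficient SH vector."""
    basis = sh_basis(direction)
    return tuple(c + b * intensity for c, b in zip(coeffs, basis))

def sh_evaluate(coeffs, direction):
    """Reconstruct the radiance stored toward `direction` from the coefficients."""
    basis = sh_basis(direction)
    return sum(c * b for c, b in zip(coeffs, basis))

# Inject one light sample arriving from +Z into an empty cell.
cell = sh_add_light((0.0, 0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 1.0)
front = sh_evaluate(cell, (0.0, 0.0, 1.0))   # strong toward the light
back = sh_evaluate(cell, (0.0, 0.0, -1.0))   # weak/negative away from it
```

The anisotropy is exactly what the directionless inject-and-blur approach loses: the cell remembers which way the light was going.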



Probably can't reach the quality of a path tracer :) but I have some ideas on how to get nice contact lighting at least.

Threw together a proof of concept for some ray traced AO. I think once the data structure is complete it would be viable to do 1 sample per pixel plus a denoiser pass. The idea would be to only trace rays a set distance, say 1 meter, to keep the cost down, and then use this data to add a bit more detail back into the lighting pass. It wouldn't be totally accurate but would help keep detail.
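A minimal CPU-side sketch of that distance-limited AO idea, assuming a `trace` callback standing in for the real triangle-data trace (this is an illustration of the concept, not Lexie's implementation):

```python
import math
import random

MAX_AO_DISTANCE = 1.0  # only trace rays out to ~1 meter, as described above

def random_hemisphere(normal):
    """Pick a uniform random direction in the hemisphere around `normal`
    (simple rejection sampling; fine for a sketch)."""
    while True:
        d = [random.uniform(-1.0, 1.0) for _ in range(3)]
        length = math.sqrt(sum(c * c for c in d))
        if 0.0 < length <= 1.0:
            d = [c / length for c in d]
            if sum(c * n for c, n in zip(d, normal)) > 0.0:
                return d

def ambient_occlusion(position, normal, trace, samples=1):
    """Average visibility over short rays. `trace(origin, dir, max_dist)`
    returns a hit distance or None. With samples=1 the result is noisy
    and would rely on the denoiser pass mentioned above."""
    occlusion = 0.0
    for _ in range(samples):
        direction = random_hemisphere(normal)
        hit = trace(position, direction, MAX_AO_DISTANCE)
        if hit is not None:
            # Closer hits occlude more; fade to zero at the max trace distance.
            occlusion += 1.0 - hit / MAX_AO_DISTANCE
    return 1.0 - occlusion / samples

# Toy scene: a flat ceiling 0.25m above the sample point.
def ceiling_trace(origin, direction, max_dist):
    if direction[2] <= 0.0:
        return None
    t = 0.25 / direction[2]
    return t if t <= max_dist else None

random.seed(0)
ao = ambient_occlusion((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), ceiling_trace, samples=64)
```

Capping the ray length is what keeps this cheap: rays terminate early, and misses beyond a meter simply count as unoccluded.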


    Ray Traced AO On/Off

I have a lot of different methods I want to test out. I'll start by recreating DDGI again. So far it seems to be the best trade-off between performance and quality. It really didn't like using a voxel grid though; it requires an accurate representation of the world data to generate the probes' depth maps.

I still need to finish the triangle storage, but for now it seems robust enough to at least try out some GI methods using ray traces.
     
    Last edited: Aug 9, 2019
  10. Gerard_Slee

    Gerard_Slee

    Joined:
    Apr 3, 2017
    Posts:
    7
    That ao looks fantastic.

I have a compiled, easy-to-integrate Nvidia AI denoiser for you
     
  11. Ninlilizi

    Ninlilizi

    Joined:
    Sep 19, 2016
    Posts:
    271
You're dead on the money.

The occlusion issue is the one I've been scratching my head over how to approach most recently, tbh... I'm also trying to understand how to make the transition from the wasteful mess that is a 3D texture to something like a BVH in a way that relates to what I've learnt already... and finding it a conceptually tricky chasm to jump.
     
  12. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    643
Trying to stay away from anything that would restrict the platforms/hardware I can support. But I'm sure it would be fun to mess around with. Does it run on 10-series cards, or is it restricted to RTX only?
     
  13. Gerard_Slee

    Gerard_Slee

    Joined:
    Apr 3, 2017
    Posts:
    7
    I'm running on a 1080 so RTX not required.
     
  14. DragonmoN

    DragonmoN

    Joined:
    Nov 27, 2016
    Posts:
    10
As far as I understand it... will it reduce light leaks and light bleeding?

     
  15. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    643
    Yes. Here is an example of how being able to trace the triangle data helps me reduce light bleeding.


Left: ray traced GI; there is a single-sided quad at that red line. Right: standard light volume using bilinear sampling.

    Before I work on the DDGI method I wanted to try out my own concept for realtime GI.

The general concept is using a standard light volume that is updated by ray tracing the scene per light probe.


Sampling the nearest light probes in a light volume. Sampling directly looks bad and will introduce light bleeding.

Instead of sampling the light volume directly, you trace through the scene a short distance (1-2m) and sample the light probe at that position instead. This results in no light bleeding, as the probe you are lighting the pixel from has unobstructed vision of the pixel.



This results in actual world occlusion, as it's no longer limited to the detail of the light volume. The average distance to a triangle is also tracked and then used to add some world-space ambient occlusion, bringing a little more detail into the scene for free. Only sending out short rays speeds up the tracing step and also means that the final result is way less noisy than tracing the whole scene.
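The gather described above can be sketched like this, with hypothetical `sample_volume` and `trace` callbacks standing in for the real light volume and triangle trace (a concept sketch, not the actual implementation):

```python
import math
import random

TRACE_DISTANCE = 1.5  # sample the volume 1-2m away from the surface, as described

def random_hemisphere(normal):
    """Uniform random direction in the hemisphere around `normal` (rejection sampling)."""
    while True:
        d = [random.uniform(-1.0, 1.0) for _ in range(3)]
        length = math.sqrt(sum(c * c for c in d))
        if 0.0 < length <= 1.0:
            d = [c / length for c in d]
            if sum(c * n for c, n in zip(d, normal)) > 0.0:
                return d

def shade_pixel(position, normal, sample_volume, trace, rays=4):
    """Gather indirect light by tracing short rays and sampling the light
    volume where each ray ends, instead of at the surface itself."""
    radiance = 0.0
    total_distance = 0.0
    for _ in range(rays):
        direction = random_hemisphere(normal)
        hit = trace(position, direction, TRACE_DISTANCE)
        end_t = hit if hit is not None else TRACE_DISTANCE
        end_point = [p + d * end_t for p, d in zip(position, direction)]
        # The probe at end_point has an unobstructed view of the pixel,
        # so sampling there avoids bleeding light through thin walls.
        radiance += sample_volume(end_point)
        total_distance += end_t
    # The average hit distance doubles as world-space AO: closer hits darken.
    ao = total_distance / (rays * TRACE_DISTANCE)
    return radiance / rays * ao

# Toy setup: the volume is lit on the +x side; the scene is open (no hits).
random.seed(1)
lit = shade_pixel((0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                  lambda p: 1.0 if p[0] > 0.0 else 0.0,
                  lambda o, d, m: None)
```

The key move is that the volume is never read at the surface position itself, which is what kills the bilinear light bleeding shown in the left/right comparison above.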



    Emissive mesh. Light sources can be any shape.

     
    Last edited: Aug 14, 2019
  16. GuitarBro

    GuitarBro

    Joined:
    Oct 9, 2014
    Posts:
    50
    Those are some tasty GI pictures! ;)
     
  17. forestrf

    forestrf

    Joined:
    Aug 28, 2010
    Posts:
    64
    What is the performance of this? This is just too awesome for it to be fast, it can't be! (I hope it is though)
     
  18. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    643
REALLY hard to say. Right now my triangle data structure is only halfway done and I still need to optimize the work I've done so far. I should be able to get nearly twice the performance once it's finished, so any part of the code that uses traces should see a 2x-3x speed improvement once that is done.

Currently it's around 6ms for the whole thing at 1080p on a GTX 1070 (rendering at half res with upscale). I should be able to cut that in half with all the optimizations I have planned. There are steps that can be spread over multiple frames or turned off completely to get more performance at the cost of light responsiveness.

    This doesn't scale to large scenes though. Either cascades or a larger light volume would be needed to handle large areas.
     
    Last edited: Aug 14, 2019
  19. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    4,190
Of course, I guess using simpler proxy geometry to update the light volume would speed it up further too (fewer triangles to trace through)?
     
  20. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    643
Fixed a bunch of bugs and got the system all working correctly. It's a lot more responsive to light/scene changes now while also running faster. The next step would be adding a denoiser and then cascades. It looks like I've created a very similar system to Enlisted's GI, but instead of putting all the effort into generating a good light volume, I create a noisy one and trace the world to remove noise, add per-pixel near occlusion, and also remove light bleeding without resorting to baking extra data into the models!


The above image was taken with the settings cranked up to something very similar to a path tracer. This was taken using 32 long rays. This number could be dropped a lot with a good denoiser though. I'm using this more to confirm the lighting is correct.

Maybe once I improve my triangle data structure and add a denoiser I'll be able to use 1-4 long rays per pixel while still keeping the performance fast enough for realtime use.


Low-end settings using only 1 sample while casting short rays. A lot of the detailed occlusion is missing, but the scene isn't as noisy and a general concept of the lighting is still retained. With a denoiser this should be enough for video games IMHO.



This is what the noisy light volume looks like. It's surprising how much tracing per pixel improves the quality over sampling the light volume directly.


Here is what Unity's progressive lightmapper looks like for reference. I'm not sure if the over-darkening is due to a limited bounce count or a bug in my code somewhere. (This took about 1 min to generate on the GPU.)
     
    Last edited: Aug 15, 2019
    RockSPb, Detniess, Tzan and 4 others like this.
  21. GuitarBro

    GuitarBro

    Joined:
    Oct 9, 2014
    Posts:
    50
Out of curiosity, is your volume over the whole scene, or camera-relative (or other)? Curious how well it would perform in an open world game, where other GI is a nightmare.
     
  22. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    643
Size is very relative in this method of GI. Although the volume is about twice the size of this Cornell box and is stored in a 64^3 volume, the precision of the lighting isn't bound to it the way it is in other GI methods.


That tiny bookshelf is around 12cm tall, and each cell of the lighting volume is 25cm. This illustrates how first bounce and occlusion are still present on details smaller than a cell of the light volume, due to the fact that the scene is traced using triangle data rather than a coarse data set like voxels.

Cascaded light volumes would be the way to go for large open worlds though. 3 cascades of 64x32x64, starting at 0.5m cells and scaling 4x per cascade, would give you a view distance of 512 units; anything outside that would use some basic environmental lighting.
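Those cascade numbers check out; a quick sketch, assuming each cascade is 64 cells across and 4x coarser than the last:

```python
def cascade_extents(base_cell_size=0.5, cells=64, scale=4, cascades=3):
    """Width in world units covered by each cascade of the light volume."""
    extents = []
    cell = base_cell_size
    for _ in range(cascades):
        extents.append(cell * cells)  # cells across * cell size
        cell *= scale                 # next cascade is 4x coarser
    return extents

print(cascade_extents())  # [32.0, 128.0, 512.0] -- the outer cascade reaches 512 units
```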


Bookshelves all the way down. This is pretty crippling to my tracer though; it can't handle this kind of concentration of triangle data just yet. I need to finish up the data structure so it can handle things like this.
     
    Last edited: Aug 16, 2019
    hopeful and GuitarBro like this.
  23. GuitarBro

    GuitarBro

    Joined:
    Oct 9, 2014
    Posts:
    50
    Sounds pretty nice to me!
     
  24. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    4,190
So it's basically bucketing triangles into voxels, right? And then using the bounds of the voxels as an acceleration structure to traverse the buckets? And the voxels store per-cell directional lighting data to help sampling?

I'm not sure where the rays start. Do you start from the probe and capture the scene lighting for that probe? Or start from the surface, then project onto the nearest probe...
     
    Last edited: Aug 16, 2019
  25. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    643
Generally yes, but it's a bit more complicated than that. The resolution of the grid that stores the triangles is automatically generated based on the number of triangles in the area (this is as far as I've gotten). Then each cell in that grid is subdivided into an octree of triangles; the depth of the octree is modulated by the number of triangles inside the cell. Then octree cells are merged if it's more cost-efficient to trace fewer cells. Then cell bounds are expanded to skip over large areas of cells that contain the same data set, which allows rays to skip large areas.

The idea is to create chunks of data sets like this; that allows me to update a region if the triangle data changes without needing to re-render the whole scene. The triangle data is captured by rendering an orthographic camera with the bounds of a chunk. Once it's finished it should have similar speeds to a BVH tree without needing all the tracking/management/limitations that go with a BVH.
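The "octree depth modulated by the triangle count" idea might be sketched like this (the budget numbers are hypothetical, not the actual values used here):

```python
import math

MAX_TRIS_PER_LEAF = 32  # hypothetical per-leaf triangle budget
MAX_OCTREE_DEPTH = 5    # hypothetical depth cap

def octree_depth_for(tri_count):
    """Pick a subdivision depth so leaves hold roughly MAX_TRIS_PER_LEAF
    triangles. Each level splits a cell into 8 children, so the required
    depth grows with log base 8 of the triangle count."""
    if tri_count <= MAX_TRIS_PER_LEAF:
        return 0
    needed = math.ceil(math.log(tri_count / MAX_TRIS_PER_LEAF, 8))
    return min(needed, MAX_OCTREE_DEPTH)

# A sparse cell stays flat; a dense cell (the bookshelf scene) subdivides deeper.
print(octree_depth_for(10), octree_depth_for(10_000))
```

Capping the depth is what keeps degenerate cases (like the bookshelves-all-the-way-down scene) from exploding memory, at the cost of overfull leaves.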
     
    Last edited: Aug 16, 2019
    neoshaman likes this.
  26. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    4,190
That's interesting, so voxel and triangle counts are automatically balanced to keep performance stable!

But I'm still not sure where the rays start. Do you start from the probe and capture the scene lighting for that probe? Or start from the surface, then project onto the nearest probe... You still talk about probes, so I guess they are still there in some way? As a light cache?

Another thing I keep thinking about with GI: basically it's a light graph, where each point is linked to others and light simply propagates each pass through gathering over the links. But dynamic ray traversal is the most expensive part. Given the sparseness of realtime rays, wouldn't it be better to spend time finding a structure to cache them for the static parts?

I mean, we could just inject light at a node, and then it would update more quickly by querying nodes from the list of links (storing the indices of the other points). It could be updated in realtime using all the techniques we already use, and with each pass we would get more and more precision. It would still be compatible with dynamic objects, because then we would have a list of cached rays (starting node position with one linked node position) we could test intersection with. The complexity would be just averaging the nodes per light update. And we could probably stream the cache.

It would benefit the low end, as the light update would be faster while we delay the construction of the node list with realtime updates (or just load them, or even rotate ray data bundles to fit the compute/memory budget, while improving coverage by accumulating the bundle results into the nodes).
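The light-graph idea above is essentially a radiosity-style gather over precomputed links; a toy sketch (all names, weights, and the albedo value are hypothetical, and this is the general pattern rather than any shipping system):

```python
def propagate(radiance, links, emission, passes=1, albedo=0.7):
    """One or more gather passes: every node collects weighted light from the
    nodes it is linked to, so each pass carries the light one bounce further.
    `links[i]` is a list of (source_index, weight) pairs found by earlier
    (cached) ray casts between static surfaces."""
    for _ in range(passes):
        radiance = [
            emission[i] + albedo * sum(radiance[j] * w for j, w in node_links)
            for i, node_links in enumerate(links)
        ]
    return radiance

# Toy graph: node 0 emits; node 1 sees node 0; node 2 only sees node 1.
emission = [1.0, 0.0, 0.0]
links = [[], [(0, 0.5)], [(1, 0.5)]]
radiance = propagate(emission[:], links, emission, passes=3)
# Light reaches node 2 only indirectly, via node 1, after multiple passes.
```

No rays are traced during the update itself; all the visibility lives in the cached link list, which is exactly the trade-off being discussed.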
     
  27. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    643
What you are describing is exactly how Enlighten works. The reason Enlighten takes so long to bake is that it's trying to prune the list of connections down to the most significant ones that will have the largest impact on the scene, while balancing performance and memory consumption.
     
  28. RockSPb

    RockSPb

    Joined:
    Feb 6, 2015
    Posts:
    99
Unity's progressive lightmapper isn't a good reference. It over-darkens scenes and in general looks like non-physical Unity lights plus GI from them. You'd be better off using your own path tracer or Bakery as a reference ))
     
  29. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    643
Last time I used Bakery it didn't support emissive surfaces, so I wasn't doing comparisons. Looks like they've added support for it, so here are some tests!


These are the settings turned up to get quality similar to a path tracer. This is more a test to make sure the light is calculating correctly.


Lightmaps baked in Bakery for comparison. Looks like my skybox contribution might be off, or maybe Bakery is using a mipmapped version of the skybox to help with convergence.


This is more the quality I'm aiming for. This is 1 sample per pixel, but only sending out short rays. It's still able to capture a general concept of the lighting while being fast enough for realtime rendering. This would be paired with a denoiser to achieve better results. Maybe once the triangle data is optimized I'll be able to send out more, longer rays per pixel.

    Bonus slide.

    Realtime reflections by tracing the world per pixel.
     
    Last edited: Aug 16, 2019
  30. GuitarBro

    GuitarBro

    Joined:
    Oct 9, 2014
    Posts:
    50
Ok, now those realtime reflections just got me 1000% more hyped. Even if it only has a distance of 1 unit or so, that would be way better for contact reflections in cases where SSR falls apart.
     
  31. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    4,190
Oh, I didn't mean to emulate Enlighten, especially since it's an offline renderer.

Just caching previous (lonely, sparse RT) rays at run time to get more (virtual) rays each frame without having to raytrace again. Like a circular buffer or something; it would be a kind of temporal reprojection.

BUT, thinking more about it, it would mostly be useful only with rapidly changing lighting. If the lights are static it has the same effect, as it would converge the same; old nodes won't change their state.

Anyway, your new technique is kinda interesting and fires my brain up again. It's kinda close to what I'm trying now.

I'm trying to make a crappier version of GI too, for OpenGL 2.0 (sparse convex space partition, no parallax), but using the lightmap as support for the geometry structure (I think you said Enlighten does that too), given I want to try UV cubemap (atlas) box projection. That's only 2 texture fetches to emulate a ray approximation (on a geometry approximation): using the normal on the cubemap to get the UV of the lightmap point to sample, with light being accumulated in the lightmap. The right cubemap atlas index is stored per pixel in the lightmap.

The fact that you use a box partition of the scene, which is the problem I'm trying to solve as cheaply as possible to place probes, intrigues me.

I want to see if I can achieve this cheaply enough at run time. I have been experimenting with bitfield voxelization (8 textures of 256 to get a cube, with the bitfield written with "add" at rasterization by hashing depth), but a voxel volume is too much for the level of hardware I target (you need to raymarch). I'm investigating Morton-order encoding with mipmaps (which would allow a ray march in a fixed 8 instructions), but I haven't figured out how to convert the bitfield efficiently yet, nor whether it will actually be useful.
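For what it's worth, the bit interleaving behind a 3D Morton (Z-order) index looks like this — the standard magic-constant spread, not tied to any particular bitfield layout:

```python
def part1by2(n):
    """Spread the low 10 bits of n so there are two zero bits between each bit."""
    n &= 0x3FF
    n = (n | (n << 16)) & 0x030000FF
    n = (n | (n << 8)) & 0x0300F00F
    n = (n | (n << 4)) & 0x030C30C3
    n = (n | (n << 2)) & 0x09249249
    return n

def morton3d(x, y, z):
    """Interleave x, y, z bits into one Z-order index: spatially neighbouring
    cells stay close in memory, which is what makes mip-style traversal cheap."""
    return part1by2(x) | (part1by2(y) << 1) | (part1by2(z) << 2)

# The 8 children of any cell occupy 8 consecutive Morton indices.
print(morton3d(1, 0, 0), morton3d(0, 1, 0), morton3d(1, 1, 1))  # 1 2 7
```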