Graphics HXGI Realtime Dynamic GI

Discussion in 'Tools In Progress' started by Lexie, May 24, 2017.

  1. LennartJohansen

    LennartJohansen

    Joined:
    Dec 1, 2014
    Posts:
    2,394
    When it's finished, would it be possible to add an interface where other components could feed meshes to the voxel system? I am thinking of meshes that are rendered instanced and are not in the scene as GameObjects.

    Lennart
     
  2. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    646
    Hmm, the way the voxelization works is that it gathers all the objects in a chunk using an orthographic camera render call with a replacement shader. You could add DrawMeshInstanced calls to that camera with a callback. The voxelization shader handles instanced meshes already, so in theory it should work.
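    Something along these lines is what I have in mind (just a sketch; the OnVoxelizeChunk hook and the way the voxel camera gets passed in are placeholders for whatever callback the voxel system would expose, not an actual API):

    using UnityEngine;

    // Sketch: feed instanced meshes to the voxelization camera through a callback.
    // "OnVoxelizeChunk" and how the camera is handed over are hypothetical.
    public class InstancedVoxelFeeder : MonoBehaviour
    {
        public Mesh mesh;
        public Material material;      // instancing-enabled material
        public Matrix4x4[] matrices;   // one transform per instance (max 1023 per call)

        // Called by the voxel system right before it renders a chunk with its
        // orthographic voxelization camera.
        public void OnVoxelizeChunk(Camera voxelCamera)
        {
            // Passing voxelCamera restricts the draw to that camera only,
            // so these instances stay out of the normal scene rendering.
            Graphics.DrawMeshInstanced(
                mesh, 0, material, matrices, matrices.Length, null,
                UnityEngine.Rendering.ShadowCastingMode.Off, false,
                gameObject.layer, voxelCamera);
        }
    }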
     
  3. LennartJohansen

    LennartJohansen

    Joined:
    Dec 1, 2014
    Posts:
    2,394
    That should work well. It would work with both instanced and instanced indirect. In my case it is vegetation. Are there any requirements on the materials? _MainTex as the name of the diffuse texture, etc.?

    Lennart
     
  4. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    646
    You don't have to voxelize everything in the scene. In fact, it's recommended not to, for speed reasons. Just do the large objects that are most likely to contribute to lighting. If you are using a custom buffer + instanced indirect to render things, you'd need to make a modified version of the voxelization shader and use that for the callback event when you add your DrawMeshInstancedIndirect.

    It might be best to make a custom command buffer to add these draw calls to the camera; I'm not sure if Unity replaces the shader if you just do a normal Graphics.DrawMeshInstancedIndirect(), since the camera has a replacement shader set.
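    As a rough sketch of the command buffer route (the voxelCamera reference and the modified material are placeholders, and the camera event is arbitrary; it just needs to land inside the voxelization camera's render):

    using UnityEngine;
    using UnityEngine.Rendering;

    // Sketch: attach an indirect instanced draw to the voxelization camera via a
    // command buffer, using a modified copy of the voxelization shader (command
    // buffer draws are not affected by the camera's replacement shader).
    public class IndirectVoxelDraw : MonoBehaviour
    {
        public Camera voxelCamera;            // the orthographic voxelization camera (hypothetical hook)
        public Mesh mesh;
        public Material voxelizeIndirectMat;  // modified voxelization shader reading your instance buffer
        public ComputeBuffer argsBuffer;      // standard 5-uint indirect args buffer

        CommandBuffer cb;

        void OnEnable()
        {
            cb = new CommandBuffer { name = "Voxelize instanced indirect" };
            cb.DrawMeshInstancedIndirect(mesh, 0, voxelizeIndirectMat, 0, argsBuffer);
            voxelCamera.AddCommandBuffer(CameraEvent.AfterForwardOpaque, cb);
        }

        void OnDisable()
        {
            if (voxelCamera != null)
                voxelCamera.RemoveCommandBuffer(CameraEvent.AfterForwardOpaque, cb);
            cb.Release();
        }
    }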

    The voxelization shader uses these property names with the replacement shader:
    _Color("Main Color", Color) = (1,1,1,1)
    _MainTex("Base (RGB)", 2D) = "white" {}
    _EmissionColor("Color", Color) = (0,0,0,0)
    _EmissionMap("Emission", 2D) = "white" {}

    But if you write your own to do custom draw calls, you can use whatever names you like.
     
  5. LennartJohansen

    LennartJohansen

    Joined:
    Dec 1, 2014
    Posts:
    2,394
    Unity will replace the shader for instanced indirect draws too when a replacement shader is set.

    This sounds good, and I agree. Larger rocks, maybe some of the bigger trees, etc. would be nice to add.

    Lennart
     
  6. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    646
    I've added some cone tracing functions as well as the first pass on mipmapping the octree data.



    This is an example of cone tracing from the perspective of the camera. By tracing through the Sparse Octree I'm able to skip large areas of open space. I got it running as fast as I could with the octree format I'm using. I looked into other formats that are faster at tracing, but the space requirements and the extra cost to generate the octree data didn't seem worth it.

    It's still pretty fast and the rays are able to travel really long distances with only a few texture lookups. Big performance gains can be had by increasing the angle of the cone traced at the cost of precision.

    The mipmapping is done using a compute shader. It is very fast, as it's able to split the work up over many threads. I'm going to add anisotropic voxels to the LOD data so each face can have its own data. This will remove most of the light bleeding caused by cone tracing. I would love to add anisotropic filtering to the first layer of voxels, but the VRAM requirements will likely be too large. Will have to wait and see.

    I want to try out a few methods for generating the lighting data using cone traces instead of light propagation. Right now the world is broken up into a sparse data set of light probes of different radii. The next step is getting the GI data stored in those probes.

    The old method I used to generate that data was Light Propagation Volumes. The way they work is by injecting lighting into these light probes for every voxel face touching them; the probes then spread their lighting data out to other nearby probes as long as there are no voxels between them. This works pretty well but loses a lot of quality, as the propagation step causes most of the directional data of the light to be lost.

    What I want to try instead is cone tracing 9-16 cones per light probe to get a good indication of the incoming light for that cell. Kind of like rendering a reflection probe in realtime in Unity, except it uses cone traces to render the probe rather than a camera, so it's really fast to render. The system would constantly update the probes near the camera to make sure the lighting stays current and responsive, and also slowly update the distant data as well.

    I'm also going to look into doing cone tracing per pixel (similar to SEGI); I think the data structure solves most of the issues I've seen with per-pixel cone tracing. I'll do some more research on all three methods before I pick a road to explore.

    Edit: I'm leaning more towards a light probe version. The reason is that it makes transparency support possible, as well as forward rendering (FXAA is a big plus). It also means the lighting data is stored volumetrically, so it's possible to use it with volumetric lighting. Also, the cost of rendering is mostly independent of the screen resolution, so things like VR would be feasible.
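    To give an idea of what the probe gather could look like (just a sketch; TraceCone stands in for the actual octree cone trace, and the Fibonacci sphere is only one reasonable way to spread 9-16 cone directions):

    using UnityEngine;

    // Sketch: gather incoming light at a probe from a fixed set of cones.
    // TraceCone is a stand-in for the actual octree cone trace.
    public static class ProbeGather
    {
        public delegate Color ConeTrace(Vector3 origin, Vector3 dir, float halfAngle);

        public static Color Gather(Vector3 probePos, int coneCount, float coneHalfAngle, ConeTrace trace)
        {
            float goldenAngle = Mathf.PI * (3f - Mathf.Sqrt(5f));
            Color sum = Color.black;
            for (int i = 0; i < coneCount; i++)
            {
                // Fibonacci sphere: roughly uniform directions over the full sphere.
                float y = 1f - 2f * (i + 0.5f) / coneCount;
                float radius = Mathf.Sqrt(1f - y * y);
                float theta = goldenAngle * i;
                Vector3 dir = new Vector3(Mathf.Cos(theta) * radius, y, Mathf.Sin(theta) * radius);

                sum += trace(probePos, dir, coneHalfAngle);
            }
            // Each cone covers an equal share of the sphere, so a plain average works.
            return sum / coneCount;
        }
    }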
     
    Last edited: Jan 18, 2018
  7. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    6,493
    You are progressing so well ...
    I still don't understand how to bake arbitrary data into SH at all ... :oops:
     
    Lexie likes this.
  8. Tzan

    Tzan

    Joined:
    Apr 5, 2009
    Posts:
    736
    All excellent reasons in your edit; I think we all want those things.

    Are you using Forward or Deferred in your project?
     
  9. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    646
    Here is an example of cone tracing each pixel to generate one light bounce. Each pixel sends off 9 cones and weights them to estimate the incoming diffuse lighting. This method is able to generate pretty nice soft shadows and calculates fairly accurate skybox occlusion.

    Normally you would gather a few samples around each pixel to remove all the grain, as well as add some temporal sampling to increase the quality even more; it might be possible to drop it down to 4 cones. One more cone would be traced to generate specular by modulating the angle of the cone based on the roughness of the surface.

    Half-res rendering with good upsampling can speed this process up significantly. But even then, this type of rendering is only really possible on high-end GPUs at acceptable settings and frame rates.

    I quickly put this together because it's the simplest method to try out. I really need to add alpha support to the tracing, as right now the trace returns a hit even if it just clips the corner of a voxel. At least I know the cones are able to capture a pretty good result.
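    For reference, the per-pixel weighting is along these lines (a sketch only; TraceCone again stands in for the real octree trace, and the fixed 9-cone pattern with a 60 degree tilt is an assumption, not the exact layout used here):

    using UnityEngine;

    // Sketch: cosine-weighted combination of a few cones around a surface normal
    // to estimate incoming diffuse light for a pixel.
    public static class PixelConeGather
    {
        public delegate Color ConeTrace(Vector3 origin, Vector3 dir, float halfAngle);

        // One cone straight up the normal plus a ring of tilted cones.
        public static Color Diffuse(Vector3 pos, Vector3 normal, float halfAngle, ConeTrace trace)
        {
            Vector3 tangent = Vector3.Normalize(Vector3.Cross(normal,
                Mathf.Abs(normal.y) < 0.99f ? Vector3.up : Vector3.right));
            Vector3 bitangent = Vector3.Cross(normal, tangent);

            Color sum = Color.black;
            float weightSum = 0f;
            for (int i = 0; i < 9; i++)
            {
                // i == 0: along the normal; otherwise a ring tilted ~60 degrees.
                float tilt = i == 0 ? 0f : 60f * Mathf.Deg2Rad;
                float azimuth = i == 0 ? 0f : (i - 1) * Mathf.PI * 2f / 8f;
                Vector3 dir = Mathf.Cos(tilt) * normal
                            + Mathf.Sin(tilt) * (Mathf.Cos(azimuth) * tangent + Mathf.Sin(azimuth) * bitangent);

                float w = Vector3.Dot(normal, dir);   // cosine weight
                sum += trace(pos, dir, halfAngle) * w;
                weightSum += w;
            }
            return sum / weightSum;
        }
    }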

     
  10. arnoob

    arnoob

    Joined:
    May 16, 2014
    Posts:
    155
    That looks really great, it definitely has a sense of diffuse lighting coming from the windows. Congratulations!

    That said, I am a bit surprised by the direction you are taking this project. When you started it, it was really different from what SEGI was doing (cone tracing, which can be inaccurate sometimes, relying heavily on the GPU, etc.). At first it looked like a "basic" method relying mainly on the CPU and brute-forcing GI by spreading the computation over multiple frames, making it a really interesting method for procedurally generated levels (even in VR!).

    Anyway, I was thinking about your old method, which I thought was really interesting (and with a lot of potential artistic applications!): how did you do the light propagation in your old setup (the one where the light spreads with a delay)? I am simply curious because it looks like a basic orthogonal spread of the light, using 6 textures for the lighting and shifting one texel per frame, but you still manage to get lighting that isn't completely bound to a single direction. Could you give us some insight into how you managed it? Or maybe a link to a paper that inspired you?

    Anyway, keep up the good work! Your showcases and this thread overall are really my go-to every morning before going to work! :)
     
    Shinyclef likes this.
  11. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    646
    The old method is called Light Propagation Volumes. All the work was still done on the GPU, not the CPU. I talk about it in the post above the last one. The lighting is stored using spherical harmonics for each voxel; that's how the light was able to "retain" its direction. I say "retain" because it really didn't handle diagonal light very well and would creep around corners.

    I just tried out per-pixel cone tracing as it's only a few hours' work to get it up and running. Cone tracing can be very accurate if you use anisotropic voxels, small cones, and actually cone trace the data structure. You're probably thinking of how inaccurate SEGI is. SEGI doesn't cone trace, it cone marches: it skips large areas to speed up the calculation, which results in a lot of light bleeding as walls get skipped. The data structure I'm using can skip large areas of open space, so it's able to accurately sample the data structure while still being fast.

    In the post above I talk about the three directions I could go to generate the GI data. One is still using Light Propagation Volumes, but I think cone traced light volumes is the one I'm going to go with. It's kind of a mix between cone traced GI and Light Propagation Volumes; that way the light transport can be instant and the directional data can be retained. The lighting is also stored volumetrically, so transparency and volumetric lighting are possible. The downside is that the lighting is only as accurate as your voxel size, but screen space lighting can fill in those gaps.

    Edit: Another reason I made per-pixel cone tracing is so I can make sure the cone traces are acting how I need them to. It's hard to debug this stuff, so you've got to start somewhere. This was the easiest way to visualize it.
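    For anyone wondering how a direction gets stored per voxel, this is the standard 2-band spherical harmonic encode/decode that LPV-style systems use (a generic sketch with a single channel for brevity, not HXGI's actual layout):

    using UnityEngine;

    // Sketch: 4-coefficient (L1) spherical harmonics. One Vector4 per color
    // channel keeps a rough sense of which direction the stored light came from.
    public static class SH1
    {
        // Evaluate the 4 L1 basis functions for a unit direction.
        public static Vector4 Basis(Vector3 d)
        {
            return new Vector4(
                0.2820948f,           // Y0,0
                0.4886025f * d.y,     // Y1,-1
                0.4886025f * d.z,     // Y1,0
                0.4886025f * d.x);    // Y1,1
        }

        // Accumulate light arriving from direction 'dir' into the coefficients.
        public static Vector4 Project(Vector4 coeffs, Vector3 dir, float intensity)
        {
            return coeffs + Basis(dir) * intensity;
        }

        // Reconstruct the intensity arriving from (roughly) direction 'dir'.
        public static float Evaluate(Vector4 coeffs, Vector3 dir)
        {
            return Mathf.Max(0f, Vector4.Dot(coeffs, Basis(dir)));
        }
    }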
     
    Last edited: Jan 19, 2018
    arnoob and Shinyclef like this.
  12. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    6,493
    That looks amazing!

    Now that you have a data structure that covers long distances, is light baking still off the table? :p
     
    Shinyclef likes this.
  13. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    646
    Cone traced light volumes are probably better than lightmaps, as they can light dynamic objects using the same system. You would be able to bake it if you wanted to, but it's not stored in lightmaps; it's stored as sparse light volumes. The system I'm aiming towards is something similar to Quantum Break's lighting, but it can be updated/generated in realtime/at runtime by using cone tracing rather than raytracing to fill the light volumes.
     
    buttmatrix, Shinyclef and neoshaman like this.
  14. arnoob

    arnoob

    Joined:
    May 16, 2014
    Posts:
    155
    Thank you a lot, LexieD! It was really interesting to read. Even if I am not a shader wizard like you, I can understand a lot of things that give me ideas for other domains! :)
     
  15. macdude2

    macdude2

    Joined:
    Sep 22, 2010
    Posts:
    686
    This looks really awesome! Honestly, the only criticism I have is that it's slightly noisy. It's probably not an issue, but do you think that could be fixed in the long run, or is it a necessary artifact of this technique?
     
  16. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    646
    If I wanted to expand this method, I would normally do a pass to remove all the grain by using samples around each pixel. As I said in that post, cone tracing is generally too expensive for current-gen hardware, so it's not a method I intend on using; it was just a quick test I did to make sure the data structure was working with traces.
     
    Shinyclef likes this.
  17. macdude2

    macdude2

    Joined:
    Sep 22, 2010
    Posts:
    686
    Interesting, OK. So your actual plan is to use some sort of mixed propagation/cone tracing method like you were describing in the previous posts?
     
  18. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    646
    Yes, to generate the realtime light probe data.
     
    macdude2 likes this.
  19. mykillk

    mykillk

    Joined:
    Feb 13, 2017
    Posts:
    60
    Really impressive. Excellent range, and the resolution on the close-up voxels is pretty high. Looks much better than SEGI's volume cascade implementation. I remember one of the big drawbacks of the volume cascade solution was very uneven frame render times, depending on which level of the cascade is getting rendered. Does your solution avoid the same issue?
     
  20. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    646
    Yes, it only updates X chunks per frame, so it's only actually voxelizing a 64^3 * voxel scale area, rather than voxelizing the whole volume/scene like SEGI does.

    Edit: Note there will still be differences in the time it takes to voxelize each chunk, as some areas will have a higher poly count than others. But the area getting revoxelized is small, so the render times should be fairly close.

    It's pretty different from cascades. My system stores the whole scene at the smallest voxel resolution in a sparse data structure, whereas SEGI only stores the scene close to the player at this small voxel scale. This means you get crazy light bleeding with faraway cone traces, as the scene isn't stored at the lowest voxel resolution that far out. This is why I moved away from using cascades; they introduced too many issues with light bleeding.
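    The per-frame budgeting is conceptually just a work queue (sketch only; VoxelizeChunk is a placeholder for the actual chunk render, not HXGI's code):

    using System.Collections.Generic;
    using UnityEngine;

    // Sketch: spread revoxelization over frames by processing a fixed number of
    // dirty chunks per frame.
    public class ChunkUpdateQueue : MonoBehaviour
    {
        public int chunksPerFrame = 2;
        readonly Queue<Vector3Int> dirty = new Queue<Vector3Int>();

        public void MarkDirty(Vector3Int chunk) => dirty.Enqueue(chunk);

        void Update()
        {
            for (int i = 0; i < chunksPerFrame && dirty.Count > 0; i++)
                VoxelizeChunk(dirty.Dequeue());
        }

        void VoxelizeChunk(Vector3Int chunk)
        {
            // Placeholder: render this chunk's 64^3 region with the voxelization camera.
        }
    }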
     
    Last edited: Jan 31, 2018
    Yasunokun, Shinyclef, hopeful and 2 others like this.
  21. buttmatrix

    buttmatrix

    Joined:
    Mar 23, 2015
    Posts:
    609
    ^thank you for that
     
    Shinyclef likes this.
  22. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    646
    I had a little time to work more on this concept of a hybrid system. I put together a proof of concept using a sparse isotropic radiance volume. Each probe sends out a lot of rays to test for incoming light from nearby surfaces. If a ray doesn't find any surface within X steps, the probe samples the radiance data at that point and injects some of that lighting into its own cell. This causes a feedback loop that propagates the light around the scene. The speed of the light propagation depends on how far you want to raytrace and the density of the sparse data set around that location. Longer rays cost more to calculate but give a more accurate result.

    The G-buffer rendered by the camera is ray traced in a few directions to capture the incoming light from the sparse radiance volume and calculate the incoming light for each pixel.

    Right now the radiance volume is stored as a single color; it lacks any directional data. The next step is to switch it to spherical harmonics so the lighting data is encoded with a direction. That will stop a lot of the light creeping around corners and allow me to base the math on something a little more realistic.


    (proof of concept render, needs more work to get occlusion calculating correctly)

    There is still a lot of functionality to add. This is just a proof of concept to see if it's possible to do with the sparse data set. Overall it was pretty successful, so I'm going to keep moving forward with this idea. I was blown away by how much effect a simple ray/cone trace as a lookup had; it makes it look as though there is directional data stored even though there isn't.

    Specular is done using a single ray for now; either cone traces or multiple rays with stochastic sampling would be used to support rough surfaces. I think this method will successfully merge the benefits of Light Propagation Volumes and sparse voxel octree cone traced GI while removing the negative effects each system has individually.
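    The feedback loop, written out naively on a dense grid just to show the idea (sketch only; the real system is sparse and runs on the GPU, and the ray march here is a placeholder):

    using UnityEngine;

    // Sketch: each cell marches a few rays. A ray that hits a surface injects that
    // surface's lit color; a ray that stays in open space injects the radiance
    // already stored where it stopped, so light leapfrogs across the volume over
    // successive updates.
    public class RadianceFeedback
    {
        public Color[,,] radiance;   // current cell colors
        // Hypothetical ray march: returns whether a surface was hit, the lit color
        // at the hit (if any), and where the ray stopped.
        public System.Func<Vector3, Vector3, int, (bool hit, Color value, Vector3 end)> march;

        public void UpdateCell(Vector3Int cell, Vector3[] rayDirs, int maxSteps, float blend)
        {
            Color gathered = Color.black;
            foreach (var dir in rayDirs)
            {
                var result = march(cell, dir, maxSteps);
                if (result.hit)
                    gathered += result.value;        // light bouncing off a nearby surface
                else
                    gathered += Sample(result.end);  // open space: pull radiance from where the ray stopped
            }
            gathered /= rayDirs.Length;

            // Blend instead of overwrite so the loop converges over a few frames.
            var old = radiance[cell.x, cell.y, cell.z];
            radiance[cell.x, cell.y, cell.z] = Color.Lerp(old, gathered, blend);
        }

        Color Sample(Vector3 p)
        {
            var c = Vector3Int.FloorToInt(p);
            return radiance[c.x, c.y, c.z];
        }
    }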
     
    Last edited: Feb 8, 2018
    ftejada, Detniess, DasBiot and 9 others like this.
  23. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    6,493
    Just probing a bit:
    So we could have LOD rendering where:
    - the level is initialized with baked data for a "fast start"
    - close objects have a higher rendering refresh with more detailed tracing (i.e. more bounces)?
     
  24. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    646
    A system to grab tiles out from the GPU and send them back to the CPU for storage could be possible. They would end up taking a large amount of space even with compression. This feature is at the end of a long list.

    This new method is able to generate light bounces pretty fast as it traverses the sparse data structure, so light can skip over large areas rather than traveling 1 voxel per update like the old system did. The above scene looks pretty complete after around 4 frames have passed; the old system would have needed 2+ seconds.

    It might be faster to just have a small warmup phase to get the light moving at the start of a level load, rather than uncompressing and sending all that data to the GPU.

    The idea is that the quality of the lighting is the same regardless of distance; any quality shift in the way it's generated tends to lead to flicker/artifacts. I'm working on some optimizations now that I know the direction I'm heading. Once that's done I can test with larger volumes. It might be a good idea to use fewer rays to generate the lighting data at faraway nodes, but I'd prefer not to.

    Dynamic updates to the voxelization will probably be done within a smaller area around the player, 128^3. There would be a way to manually flag an area to be updated, though. The system would prioritize voxelizing the whole volume, then switch to updating the area near the player. I still have to figure out a good way to manage all this.
     
    Last edited: Feb 8, 2018
    Shinyclef, DMeville and neoshaman like this.
  25. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    6,493
    Wow! That's impressive.
     
  26. Shinyclef

    Shinyclef

    Joined:
    Nov 20, 2013
    Posts:
    505
    Sounds awesome. Are you still feeling as though the end result will be feasible on relatively modern cards like NVIDIA 770s? Does your recent work make you more confident about performance, less, or about the same? I ask because it looks great, and I'm accustomed to pretty things being infeasible, haha.
     
  27. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    646
    I do all my work on a GTX 970M (a laptop GPU); even though the number is high, it's actually around the speed of a GTX 660. I'm treating this as my baseline for acceptable/good settings. Better cards will be able to update the lighting data faster; slower cards will update the lighting more slowly. The whole system is designed to scale regardless of GPU speed. I'll have a better idea of performance once I'm further along; right now there are too many unknowns.
     
  28. ekergraphics

    ekergraphics

    Joined:
    Feb 22, 2017
    Posts:
    257
    That's just crazy (in a good way), because due to work, we have been forced to look at Unreal Engine, and Nvidia's own VXGI implementation can't muster more than 45fps on a 1080 in an empty room.
     
    Lexie and buttmatrix like this.
  29. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    6,493
    I have a GT 705, can't wait to see what fractional fps I will get, lol. I'll be your lowest possible floor (also got a laptop with a 920M).
     
  30. yohami

    yohami

    Joined:
    Apr 19, 2009
    Posts:
    124
    Is this out yet?
     
  31. Shinyclef

    Shinyclef

    Joined:
    Nov 20, 2013
    Posts:
    505
    Nope.
     
    WillNode and chiapet1021 like this.
  32. Kusras

    Kusras

    Joined:
    Jul 9, 2015
    Posts:
    134
    expected release date?
     
  33. chiapet1021

    chiapet1021

    Joined:
    Jun 5, 2013
    Posts:
    605
    Considering Lexie is still experimenting with techniques, I would venture a guess of NotSoon(TM).
     
    buttmatrix likes this.
  34. Shinyclef

    Shinyclef

    Joined:
    Nov 20, 2013
    Posts:
    505
    Also keep in mind that a release hasn't been confirmed at all, but it seems like it may happen ...
     
    blitzvb and chiapet1021 like this.
  35. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    6,493
  36. Galfaroth

    Galfaroth

    Joined:
    Sep 20, 2009
    Posts:
    19
    Hi! Will it work on mobile? It can be non-realtime or run at very low FPS.
     
  37. Howard-Day

    Howard-Day

    Joined:
    Oct 25, 2013
    Posts:
    137
    Doubtful? I believe it requires compute shaders.
     
  38. zenGarden

    zenGarden

    Joined:
    Mar 30, 2013
    Posts:
    4,538
    Godot is open source; you can take a look at how they did realtime GI :rolleyes:
     
  39. elbows

    elbows

    Joined:
    Nov 28, 2009
    Posts:
    2,502
    Some mobile platforms support compute shaders these days, e.g. certain Android devices and iOS devices with Metal support. Performance is clearly a big consideration when using these for tasks more associated with desktop power, but compute shaders in general are coming of age and are starting to be used more at the heart of various engine systems now.
     
    pcg likes this.
  40. elbows

    elbows

    Joined:
    Nov 28, 2009
    Posts:
    2,502
    Actually, that may have conveyed a slightly stronger message about compute and mobile than I was intending.

    The start of the draft Unity HD Pipeline docs is probably as good a place as any to convey what I meant. On several fronts I think this is about the right time to consider the split between low- and high-end devices as being about those that have compute capabilities (along with a certain amount of grunt) and those that do not.

    (from https://docs.google.com/document/d/1e2jkr_-v5iaZRuHdnMrSv978LuJKYZhsIYnrDkNAuvQ/edit# )
     
  41. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    646
    Godot uses cone tracing similar to SEGI.
     
  42. zenGarden

    zenGarden

    Joined:
    Mar 30, 2013
    Posts:
    4,538
    I don't know, but it works. For a low-poly style game the performance is good, better than SEGI.
    The Godot Sponza GI demo is a bad example, with many spot lights and intensive post-process settings like realtime reflections pushed to the max. Anyway, it works well out of the box and support is ongoing, unlike SEGI.
    Together with CryEngine, those are two viable solutions for getting realtime GI without any baking and with enough performance.
     
  43. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    646
    "While the scene needs a quick pre-bake for the static objects that will be used, lights can be added, changed or removed and this will be updated in real-time. Dynamic objects that move within one of these probes will also receive indirect lighting from the scene automatically."

    Looks like it doesn't support dynamic voxelization.
     
  44. zenGarden

    zenGarden

    Joined:
    Mar 30, 2013
    Posts:
    4,538
    Perhaps, but the bake is not noticeable, it works with any emissive material, the result is great, and it runs fast enough to be usable in a real game.
    Someone made some modifications to get it fully realtime at some cost, but I really like it as it is, and it runs well.

    Anyway, after a long SEGI wait that ended with no more support, I'm still waiting to see if someone will bring something that doesn't need Unity lightmap baking. :rolleyes:
     
    MaximKom likes this.
  45. Mauri

    Mauri

    Joined:
    Dec 9, 2010
    Posts:
    2,664
    So, you came up with a solution that still requires pre-baking - thus making it NOT fully realtime at all. I thought we all wanted to have something like SEGI, but with better performance and not just another Enlighten catastrophe...
     
  46. zenGarden

    zenGarden

    Joined:
    Mar 30, 2013
    Posts:
    4,538
    It depends on your game's needs.
    For example, with procedural generation the GI baking takes less than one second on a small level, so it works.

    For open worlds, I think this will improve with different options like no pre-baking, similar to CryEngine's SVOGI, but you lose performance.
    http://docs.cryengine.com/display/SDKDOC2/Voxel-Based+Global+Illumination

    CryEngine 5.5 will also get a new option to do offline voxelization for people needing more performance when there is no procedural level generation; like lightmapping, you'll have to wait for the bake.
     
  47. deltamish

    deltamish

    Joined:
    Nov 1, 2012
    Posts:
    58
    Hey,

    I've been following your work, man. I have to say it's fantastic. But since the beta release of Unity 2018 they have support for the HD Render Pipeline, which may be the go-to for many. Will HXGI support shaders from the HD Render Pipeline as well?

    Thanks
     
  48. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    646
    I made this custom rendering pipeline that fixes most of the issues with Unity's lighting, and their response was that they have their own in the works that should be coming out soon, so I stopped working on it. That was 2 years ago (they still haven't fixed these issues).

    I have seen a lot of features promised by Unity that have come and gone in betas or been left in an unfinished state. I will not be touching Unity's HD pipeline for some time. They can't even settle on a standard for their screen space effects.

    The scriptable render loop is a nice addition, but I wouldn't count on it being a finished feature. Until they prove that it won't be another dropped feature, I won't be touching it.
     
    Last edited: Mar 9, 2018
  49. chiapet1021

    chiapet1021

    Joined:
    Jun 5, 2013
    Posts:
    605
    I think SRPs are going to stick around, but the HD pipeline is still months out, perhaps a year or more, before it'll be in a v1.0 state.
     
  50. Lexie

    Lexie

    Joined:
    Dec 7, 2012
    Posts:
    646
    It took them 2 years to get command buffers up to the same functionality as the Graphics.* functions. I think it was only recently that they finally added all the function calls. My confidence in a usable, stable scriptable render loop is very low.
     
    StaffanEk and hopeful like this.