Any Runtime/Dynamic Baking Options - or are any coming?

Discussion in 'Global Illumination' started by Arthur-LVGameDev, Feb 4, 2020.

  1. Arthur-LVGameDev


    Mar 14, 2016

    I apologize in advance for starting a thread with such a seemingly broad question, but I've searched high & low trying to find good (even just workable) solutions and I'm at the point where I feel like I might just be missing something.

    Some Context
    We're building a fully-procedural management/tycoon style game in 3D. Players are able to freely build and destroy structures, change flooring and wall paint colors, build multiple stories, place objects, and the like -- and AI agents/characters then interact with the built structure. Our camera is relatively "freely controlled" by the player and it can tilt / orbit / translate. During gameplay, the player will typically be viewing around and into their structure from an angle; the structures have roofs but we hide them for the "current floor", so that the player may see into the structure to build / tweak their structures to be more efficient and better suited to the AI, etc.

    The problem(s):
    Most everything is good -- except for lighting. We can render thousands of high-variation agents and objects via DrawMeshInstancedIndirect and ComputeBuffers, can allow the player to build massive structures, etc... But the lighting, it continues to be a struggle!

    Our setup as of right now is described below, but I can't help but feel like there's got to be some better way to do this, and to achieve a better end-result as far as lighting/GI goes, though I'm at a loss because of our inability to "bake the scene" -- everything is done procedurally (structures are runtime-created meshes/quads, etc; objects are placed by the user; etc).

    Our Lighting Setup
    • Unity "Built-in Renderer" (legacy) via 2019.2.x latest.
    • Deferred rendering path.
    • 1 Directional Light shadow caster; provides time-of-day & exterior sun shadows.
    • Ambient mode set to 'Color' & medium gray; avoids over-exposing the outdoors at night & avoids always over-exposing the indoors.
    • 1 Directional Light, shadow caster, cull-masked to ONLY impact the indoors [objects, floors], aimed directly "down". Provides shadows for indoor objects and basic light (does not hit roof, does hit floor).
    • Structure [wall] meshes receive only exterior light (interiors, including walls, are blocked/shadowed by roof); since the interior walls receive no light from the "directly-down" facing interior Directional light we have added a custom shader that takes an HDR "_InteriorAmbientColor" / essentially an intensity multiplier so that we can make the interior wall brightness roughly match the floor/object/agent light intensity.
    • Post-effects: AO and slight tweaks to Color Grade.
    This all works. And it works OK. But it somehow feels somewhere between deficient and suboptimal. We have user-placeable lights for indoors as well (point lights placed just under the roof with relatively low range & intensity), though they're pretty much exclusively for mood/aesthetic/effect; that works OK too on mid-tier hardware, and fine on high-end hardware / can place quite a few of them thanks to Deferred rendering.

    Must be a Better Way?!
    Still, I can't help but feel that there must be a better way! Our geometry is nearly static, at least for long stretches of time; the player may build 20 new things, and then nothing again for 10 minutes or more. Surely there's some way for us to effectively bake/'cache' shadowmap/lightmap data at runtime.

    I've looked into this quite a bit, have checked out and played with SEGI and a few others, but really haven't come across anything that seems geared toward solving this problem. To be clear, there are two overall classes of problem that I feel we face:
    1. Seemingly no support/solutions for fully procedurally-generated games / runtime lighting support -- at least beyond simply using purely dynamic lights or hacking in extra "ambient" colors via custom shaders. It seems that all of the "lighting goodies" (and the 'goodness' of which seems debatable at times) are limited to "in the scene" projects, or at least projects where multiple prefabs are assembled to create the "procedural geometry" (ex via lightmap stitching).

    2. Inability to have "multiple different" lighting setups simultaneously -- for instance, with a camera positioned above & angled-towards an open-roof structure, while the outdoor areas remain visible as well. We're able to make it work, but man would it save a ton of time if we could do something like have two distinct lighting setups (via two scenes, for instance), and then use LoadAdditive() to bring them in. Then, instead of using a custom surface shader to handle a "second ambient" setup we'd be able to natively have each scene's ambient configured appropriately ('indoor' scene and 'outdoor' scene) and use the scenes 'separately'. This may not be the optimal solution & definitely isn't the only one, but is just an idea.

      Bonus points: Within a single DrawMeshInstancedIndirect() call [that we might use to render 1k agents, for instance] -- is there a way to change (via compute or surface shader) which of the 4 culling bits are flipped, or is it too late at that point? I've read through the Deferred shading code, though it is admittedly difficult to follow; if we could dynamically change agents between "indoor" vs "outdoor" lighting without a second DrawMeshInstancedIndirect() call though, it'd save us a lot of CPU time and halve our draw calls; worst case we'll likely hack around it with a custom shader and little-bit-brighter actual ambient color so they don't stand out too much.
    All that novel written -- am I missing something? Better ideas or solutions? Would be incredibly appreciative of any ideas / thoughts / guidance on either of these, and really even just overall on lighting these types of procedural / fully runtime-generated games.

    Thank you!!
  2. rasmusn


    Unity Technologies

    Nov 23, 2017
    Thanks for your comprehensive description. The fully procedural use-case is something that has been on our radar for some time but currently we have no good solution for it. The reason for this is 1) that we have to balance our resources on many other tasks and 2) this is a difficult problem.

    This is not to say that it cannot be solved - partially or completely. As you say, there "must be a better way". Theoretically, it is of course possible to bake progressively at runtime and fade in the results. Originally, we wanted the progressive lightmapper to be able to do just that. But as is often the case, the devil is in the details, and during development we had to make the difficult decision to limit the progressive lightmapper to editor-only (i.e. not available at runtime). We still want to enable runtime bakes eventually, but this will not be possible anytime soon I'm afraid.

    We are currently reworking our light probe solution. One thing we are considering in this regard is to enable a volume of probes to have several "lighting treatments" or "lighting setups". Then at runtime, the user can programmatically interpolate (think day/night cycle) or switch between these treatments. This sounds a bit similar to your use-case, but this would probably only be one piece of your puzzle.

    We are also working on a replacement for Enlighten that has traditionally been our realtime GI offering. This new solution will probably (I cannot promise anything currently) support some limited form of dynamic geometry but probably won't scale to your case of a complete procedural level. (It will of course support dynamic lights, just like Enlighten). But at least something to be aware of.

    So summing up, I am afraid I cannot give you the answer you hoped for. Realtime GI in a completely dynamic scene is a difficult problem. There are solutions (such as SEGI) that are acceptable for some use-cases, but usually these solutions come with trade-offs. Maybe they only support dynamic light and no dynamic geometry, or maybe they do not scale beyond small scenes. For Unity we need something that is general because of the wide variety of use-cases that we must support. This requirement for generality makes the problem particularly difficult (compared to solving a more specific use-case).

    By the way, you should check out the game Claybook if you haven't already. The people behind it have published some technical videos about their solution.
    Last edited: Feb 5, 2020
  3. Arthur-LVGameDev


    Mar 14, 2016
    Thank you very much -- really appreciate the super prompt and thorough response. While it's not the answer I perhaps want to hear, it at least confirms that I'm not blatantly missing anything.

    I definitely understand, and especially regarding the need to develop with a "generalist" mindset / audience in mind. That said, I'd far prefer to handle "higher-level" stuff like NavMesh ourselves (is just an example, not even a great one, though is one in which we inevitably handle it ourselves regardless) and be able to rely on the engine for some of this "lower-level" (read: hard) stuff more so. I definitely understand though, and I'm sure the overall split of user/customer profiles likely quite easily justifies the approaches being taken on the whole. :)

    Regarding Enlighten:
    I've read up on it a decent bit, and with the understanding that it's being deprecated in favor of the in-house progressive stuff you guys are working on. That said, my understanding is that even if it wasn't being deprecated, that it's still only baking indirect light/GI but that all [possibly only non Directional] direct lighting continues to be fully dynamic -- is that correct?

    It'd be understandable if it was only baking the indirect, and that's probably the majority use-case, though (and as you probably fully grok), what I'm really after is more along the lines of [low-ish quality] runtime baking of all lighting / even direct light, to basically open the flood-gates on user/player "built" (placed) lighting. The theory being we'd render it as a dynamic light "additively" and then, as you said, once the baking was finished fade it in -- and for our game we could probably even get away with lights simply not "flicking on" for 10+ seconds or so of real-time even, etc... :)

    To summarize:
    I'm not missing anything, and it sounds like our best bet is basically to use >=1 "workarounds" to pseudo-fake it (ex the 1-2 directional lights we're using now and/or potentially go all out with secondary "_IndoorAmbienceColor" and look towards fake/blob-style shadows even), and beyond that just allow players to place/build low-range non-casting dynamic lights.

    I watched the video BTW, and while I'm intrigued I also think that it probably is at least a bit beyond our time/budget constraints (not to be confused with performance/frame-time budgets, which it appears to do exceptionally well on). Especially since it primarily, in our situation, amounts to environment/aesthetic value, at least once we've achieved a minimal level of "light" capability. Not to discount aesthetics too much, obviously! ;) . I've also seen the lightmap stitching stuff that some larger studios have used for their procedurally-generated games, though it doesn't really suit our needs all that well from what I can tell / seems more well-suited to automated procedural generation vs player/user-controlled.

    Anyways, thank you again for taking the time to respond -- it's actually quite helpful just to know we should continue down our current path without continually looking over our shoulder wondering if we're missing something blatantly obvious. =D

    PS -- Bonus Question
    I probably should start a new topic for this, it's more technical/specific, but if you happen to know: is there a way for me to "switch layers" [flip the bits of a DrawMeshInstancedIndirect()] instance within the shader [compute or surface or vert/frag ], for purposes of switching which "lighting scheme" it's receiving? Or is it "too late" in the rendering pipeline, by the time I'd be able to flip those 4 bits to be as we need them/based on data from ComputeBuffer?

    Situation Context: Our characters use ~12 models (so 12 DrawMeshInstancedIndirect calls) and we use buffers to animate them, swap textures, and colorize them; if an agent walks from "indoors" to "outdoors", I need the agent to begin getting outdoor light/shadows. InstancedIndirect takes the layer as an arg for all instances drawn IIRC -- so it means I have to double my calls, if I can't change it at "render-time". Doubling the calls isn't the worst part really, it's updating / "moving" the agent's C#-side render data from one of our InstancedIndirect "structures" over to another, and doing so in a way that isn't poorly-performing. If I could flip the bit in a shader, it'd be a relatively massive win, all things considered. =D
  4. rasmusn


    Unity Technologies

    Nov 23, 2017
    I think there may be a few misunderstandings here:
    1. In Unity you can mark a light as Realtime, Mixed, or Baked. "Realtime" means that the light is not included in a bake. "Baked" means that all light, i.e. direct and indirect, is baked. "Mixed" means that only indirect light is baked and that direct light is added in the shader at runtime. This concept of Lighting Modes is orthogonal to whether you are using Enlighten or Progressive Lightmapper.
    2. In Unity you can use Enlighten either for baking light or for realtime GI. The new Progressive Lightmapper is only for baking lights. In the Lighting window you can choose either Progressive Lightmapper or Enlighten as your "Bake Backend".
    In light of these clarifications I hope you see that whether or not a bake is limited to indirect light only does not depend on whether or not you use Enlighten. It only depends on which light modes you have set on your lights.

    Yes, of course. I think the video gives you an idea of how difficult this is to realize.

    Hmm, I don't know that. To me it doesn't sound that expensive to move an agent from one struct to another. This of course assumes, that you only have a few (say, less than ~100) agents switching structs each frame. But maybe this is because I don't understand the full picture.
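The cost of moving an agent between two instance "structures" can be kept constant per move if each structure is a densely packed array with swap-with-last removal. The following is a minimal, engine-agnostic sketch in Python, not Unity API; the names `InstanceBatch` and `move_agent` are hypothetical, and uploading the packed data to a ComputeBuffer is left out.

```python
class InstanceBatch:
    """Densely packed per-instance data for one indirect draw call (hypothetical)."""

    def __init__(self):
        self.data = []      # per-instance payload, kept contiguous for upload
        self.ids = []       # agent id stored in each slot
        self.slot_of = {}   # agent id -> slot index

    def add(self, agent_id, payload):
        self.slot_of[agent_id] = len(self.data)
        self.ids.append(agent_id)
        self.data.append(payload)

    def remove(self, agent_id):
        """Remove in O(1) by moving the last instance into the vacated slot."""
        slot = self.slot_of.pop(agent_id)
        payload = self.data[slot]
        last_id = self.ids[-1]
        self.data[slot] = self.data[-1]
        self.ids[slot] = last_id
        self.data.pop()
        self.ids.pop()
        if last_id != agent_id:
            self.slot_of[last_id] = slot
        return payload


def move_agent(agent_id, source, destination):
    """Move one agent's render data between batches, e.g. indoor -> outdoor."""
    destination.add(agent_id, source.remove(agent_id))
```

Only the two touched slots change per move, so a dirty-range upload could be limited to those slots. The extra draw calls and the sparser batches discussed in this thread are a separate cost that this does not address.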

    For a more in-depth answer I recommend you check out the forum General Graphics.
  5. Arthur-LVGameDev


    Mar 14, 2016
    @rasmusn -- Thank you again.

    I think that I mostly understood this, but honestly that is far-and-away the most succinct way of putting it, and I now fully understand it is indeed orthogonal. The overall terminology used, which I'm admittedly fairly new to after working with 2D exclusively for my first ~5 years with Unity, can be difficult to decipher at times -- even just deciding whether to post this question in the "global illumination" forum (vs Graphics) was a question-mark in my mind. ;)

    I fully understand Mixed vs Baked now (thank you), though it's really somewhat tangential to our "issue" in that what we really are seeking is the ability to use "Baked" lights -- albeit with very low quality & on technically non-static geometry, and controlled/updated via API at runtime. Probably quite the ask, likely worthy of a chuckle somewhere, I'm sure! =D

    To clarify on one point, even if tangential to our project but for the sake of my understanding/for the future -- the 'Realtime GI' (which is powered via the Enlighten backend), it's still only taking into consideration lights that are set to mode 'Mixed' and presumably the 'realtime' aspect of it only evaluates 'against' (ie catches light bounces off of) geometry that is marked as "Lightmap Static" -- is that correct?

    It's not that expensive -- but it isn't free, either. It's not a ton of data to "move" but what ends up happening is that it not only doubles the count of InstancedIndirect calls (and some of the buffers, etc) but it also means that each InstancedIndirect call is then handling ~half of the instances that it did previously / each call becomes less dense & more sparse. Our "split" between indoor & outdoor agents is not actually half, so in practice we end up with 10+ extra InstancedIndirect calls that are only shipping 10% of the instance-count versus the "indoor" call.

    It's just quite a bit of extra overhead (ComputeShader driven animations, etc.) given that the sole purpose is to flip a bit, and really it probably makes more sense for us to just ship an extra float in and "darken/lighten" the agents if they walk outdoors & not worry about the structure's shadow, even if it looks a little bit off.
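The "extra float" idea amounts to a per-instance blend factor between an indoor and an outdoor ambient color, eased over time so agents do not pop when they cross a doorway. Below is a minimal Python sketch of the arithmetic only; in practice the blend would live in the instanced shader, and the names `agent_ambient`, `step_outdoor_factor` and the 0.5 second fade are assumptions for illustration.

```python
def clamp01(t):
    return max(0.0, min(1.0, t))


def agent_ambient(indoor_ambient, outdoor_ambient, outdoor_factor):
    """Linear blend of two RGB ambient colors by a per-instance factor in [0, 1]."""
    t = clamp01(outdoor_factor)
    return tuple(i + (o - i) * t for i, o in zip(indoor_ambient, outdoor_ambient))


def step_outdoor_factor(current, is_outdoors, dt, fade_seconds=0.5):
    """Move the factor toward 0 (indoors) or 1 (outdoors) at a fixed rate."""
    target = 1.0 if is_outdoors else 0.0
    max_step = dt / fade_seconds
    if current < target:
        return min(target, current + max_step)
    return max(target, current - max_step)
```

This keeps every agent of a given model in a single draw call, at the price of ignoring the structure's shadow on agents, which is the trade-off described above.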

    I really appreciate your taking the time to provide super knowledgeable & insightful answers -- I've noticed it seems like more time/attention being paid in general to user questions on the forums, and I definitely sincerely appreciate that. I do try pretty hard to avoid [ab]using it, by asking dumb/obvious questions, so will do bit more research before posting it over there but will do so if I can't find a better route in the meantime. :)

    Thank you again!! :)
  6. rasmusn


    Unity Technologies

    Nov 23, 2017
    You are almost correct. When you set light mode to "Mixed", Unity will assume that the light cannot move and change. Therefore, there's no need to do any realtime GI (thus no need to use Realtime Enlighten). Instead, Unity will bake the indirect portion of the light (into lightmaps and/or light probes). It may perform this baking using Enlighten or Progressive Lightmapper depending on your project's settings.

    Remember that Enlighten can be used for realtime GI and for baking (depending on your project's settings). Only for lights with mode "Realtime" will Enlighten be used for realtime GI. I understand why this can be confusing. I personally think all these settings are part of the downside of trying to support many different use-cases. We could provide simpler settings, but then we would have to give up supporting a wide variety of use-cases. It's a trade-off.

    Thanks for mentioning this. I will forward your message to our team :)
  7. uy3d


    Unity Technologies

    Aug 16, 2016
    There could be something worth considering, depending on the type of game you're building. What Unity allows you to do is update the SH coefficients of light probes at run time. However, you cannot update the positions of probes. What this means is that if you know beforehand the size of a scene, you could flood fill it with probes in the editor and bake them, so the positions and tetrahedralization are precalculated and saved with the scene. By default you could just bake the ambient probe into all probes.

    Now while the game is running, you could for example pick a probe position, render a lowres cubemap from that probe, convert it into an SH representation and then update the probe with the new values. The logic on which probes to pick, drawing the cubemap, time slicing etc. would all have to be handled by you. But once the probe is updated, all dynamic objects would pick up indirect lighting from the updated probes. You'd still potentially have to deal with the usual issues like light leakage that you wouldn't be able to fix so easily, as the probe positions cannot be modified at runtime anymore.
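The "convert it into an SH representation" step can be sketched as a weighted sum over cubemap texels. The Python below is an engine-agnostic illustration, not Unity API: the face ordering, the `radiance(face, row, col)` callback and the function names are assumptions, and the basis uses the common real spherical-harmonic constants up to band 2. Unity's SphericalHarmonicsL2 may order or scale its coefficients differently, so the mapping onto it would need to be checked separately.

```python
import math


def sh9_basis(x, y, z):
    """Real spherical-harmonic basis (bands 0-2) for a unit direction."""
    return [
        0.282095,
        0.488603 * y,
        0.488603 * z,
        0.488603 * x,
        1.092548 * x * y,
        1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z,
        0.546274 * (x * x - y * y),
    ]


def face_direction(face, u, v):
    """Unnormalized direction for (u, v) in [-1, 1] on a face.
    Assumed face order: +X, -X, +Y, -Y, +Z, -Z."""
    return [
        (1.0, -v, -u), (-1.0, -v, u),
        (u, 1.0, v), (u, -1.0, -v),
        (u, -v, 1.0), (-u, -v, -1.0),
    ][face]


def project_cubemap_to_sh9(radiance, size):
    """radiance(face, row, col) -> (r, g, b). Returns 9 RGB coefficient triples."""
    coeffs = [[0.0, 0.0, 0.0] for _ in range(9)]
    weight_sum = 0.0
    for face in range(6):
        for row in range(size):
            for col in range(size):
                u = 2.0 * (col + 0.5) / size - 1.0
                v = 2.0 * (row + 0.5) / size - 1.0
                dx, dy, dz = face_direction(face, u, v)
                length = math.sqrt(dx * dx + dy * dy + dz * dz)
                # A texel's solid angle is proportional to 1 / |d|^3.
                weight = 1.0 / (length ** 3)
                basis = sh9_basis(dx / length, dy / length, dz / length)
                rgb = radiance(face, row, col)
                for i in range(9):
                    for c in range(3):
                        coeffs[i][c] += rgb[c] * basis[i] * weight
                weight_sum += weight
    # Normalize so the weights integrate to the full sphere (4 * pi).
    scale = 4.0 * math.pi / weight_sum
    return [[value * scale for value in coeff] for coeff in coeffs]
```

Because the result is only nine RGB triples, very low cubemap resolutions (for example 8x8 or 16x16 per face) are usually enough for this kind of projection, which keeps both the render and the readback small.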
  8. Arthur-LVGameDev


    Mar 14, 2016
    Thank you @uy3d!

    I put together a super-quick test scene of this, obviously skipping over some (read: the entirety of) the SH math, my goal being primarily just to quickly see code have an immediate impact on indirect light, and I was able to get something going that is pretty intriguing and could potentially be the long term solution.

    I've got to brush up on light probes, but it does sound like this fits our use-case -- we do indeed know what "area" the player will have in front of them/be building upon. It looks like probes can be re-tetrahedralized at runtime, which accounts for walls coming/going and similar occasional changes to light-blocking & without needing to move probes at all. Going to look closer at this route for sure, it really does seem to fit our use-case incredibly well given that we know precisely when/where changes occur.

    Out of curiosity, why isn't this the "default" (or even standard/built-in) approach? Is there any issue or scalability concern, or some similar gotcha perhaps? Beyond the math, it almost seems... "too easy" heh. I may quickly know the answer to that when I try to throw 30-60k of them down though, hah. =D

    Again, not very familiar with probes so maybe extracting the output for use by DrawMeshInstancedIndirect is more difficult than it appears, though presumably it is (or can be) wired up the same as the standard shader is.

    Thank you again -- while our "extra ambients" is an okay solution in appearance, it's not awesome or overly impressive as far as modern expectations are concerned. This could very well change that, or at least make a dent in it! :)

    Edit: The "re-tetrahedralization" I mentioned above isn't actually needed I think, though does appear to be available at runtime but I doubt it would acknowledge updated connective geometry. We'd just need to "cut-off the light-flow" between two walls via setting the colors appropriately on either side of a newly-placed wall, as best I can tell. Still actively exploring this option, it appears to be spot-on though -- it's essentially "fully procedural localized indirect light" as far as I can tell so far. Still learning & testing, though! Ty!!!
    Last edited: Feb 13, 2020
  9. uy3d


    Unity Technologies

    Aug 16, 2016
    Be aware that re-tetrahedralization is used to merge probesets from additively loaded scenes. It does not re-evaluate the position of objects around the probes. If your probe density is too low and someone puts a wall between two probes, you'll get light leaking in from the outside. The current implementation of probes doesn't have any mechanism to handle proper occlusion of probes that would address this issue.
  10. Arthur-LVGameDev


    Mar 14, 2016
    Yeah, have realized this to be the main obstacle; and then figuring out if it can be scalable enough to cope with the higher-density that will be required as a result.

    From your mention, it sounded like probe positions could previously be altered at runtime? Now that would truly have made it too easy! ;)

    I haven't had a chance to tackle this directly yet, we had to make some changes to be able to bake probe positions & not be moving things around -- inverting the parts that we "do vs don't move" during scene setup essentially. Commit just went up which addresses that, so we'll be diving more directly into this the next few days -- will report back and update you on how it goes (and hopefully not to ask dumb questions). Fingers crossed!

    Tyvm -- am fairly optimistic!!
  11. Arthur-LVGameDev


    Mar 14, 2016
    Alright, quick update and a couple of quick questions if I may. :)

    The Good News: I've got a working and "full" proof-of-concept going for this! May be some quirks with the SH calculations, but is fully functional and works pretty much exactly as you said. It's pretty clear that it would allow us to really control the lighting, and likely has the highest "ceiling" as far as how great & "alive" we could make things look with it.

    The Bad News: It'll need to get a bit faster to truly be viable. Will need to shed about 80ms from each cubemap capture haha (I laughed, though the humor went over my wife's head). ;)

    Brief Additional Context: Haven't optimized the cubemap capture at all beyond dropping resolution super low. Probably is substantial gain to be had via some very basic simple culling for the cubemap render and similar "easy" wins. Will have to look at Frame Debugger to see what other "big" wins may exist though.

    Also, quick/rough diagram of my general plan WRT probe placement (note: outside corners are suspect/may not be able to share). I think that'll work and reduce worst-case time complexity substantially; less clear are the internal implications of "4X" probes, though. ;)

    Couple Qs [towards gaining ~80ms :D]:
    Note: Questions updated/edited on Feb 16th:

    1) When setting directly into SphericalHarmonicsL2 with the [,] setter -- what's happening after that / what's picking the data up? Is it being pushed to the GPU in a buffer or similar, or is it generally being used/eval'd on the CPU?

    I'm wondering if I can keep the entirety of the operation on the GPU: CubeMap render => compute shader to sample & calc SH => data "set". (I did check the C# source and couldn't find an answer, though didn't dig through the light shaders.)

    Worst case I can probably calc the SH coefficients in a ComputeShader & readback from a buffer; but if it's getting shipped to the GPU immediately anyways then it'd be great to not handle/readback the data on the CPU.

    2) If the data is needed back on the CPU side, then what's my best bet for cubemap render/sync -- am kind of guessing I should look towards AsyncGPUReadback, which I've some limited experience with (for a prior 'timelapse' feature). Is there similar existing API specifically for cubemaps, or perhaps a better solution altogether? I'm sure I can speed it up quite a bit as-is, though I suspect there are far faster ways to capture cubemaps than the standard 'RenderCubemap' methods, and async readback likely makes a lot of sense if it does indeed have to end up on the CPU side.

    Simple cull-layer selection and switching to usage of the RT-argument variant of cam.RenderToCubemap() largely mitigates my issue alongside a Graphics.Copy, and I can additionally amortize the faces if needed. Async readback would probably still make sense, though ideal world would be to compute the SH coefficients entirely on the GPU.
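Amortizing the faces could look like a small per-frame budget: dirty probes queue up, a fixed number of faces is captured each frame, and a probe is handed on for SH projection once all six faces are done. This is a minimal Python sketch of the scheduling logic only; `ProbeRefreshScheduler` and the `render_face` callback are hypothetical names, and the actual capture (for instance via the render-texture variant of RenderToCubemap with a face mask) is outside its scope.

```python
from collections import deque


class ProbeRefreshScheduler:
    """Spreads cubemap face captures for dirty probes over several frames."""

    FACES_PER_PROBE = 6

    def __init__(self, faces_per_frame):
        self.faces_per_frame = faces_per_frame
        self.queue = deque()   # probe ids waiting for capture, oldest first
        self.queued = set()    # avoids queueing the same probe twice
        self.next_face = 0     # next face to capture for the front probe

    def mark_dirty(self, probe_id):
        if probe_id not in self.queued:
            self.queued.add(probe_id)
            self.queue.append(probe_id)

    def tick(self, render_face):
        """Call once per frame; returns the probes whose six faces completed."""
        finished = []
        budget = self.faces_per_frame
        while budget > 0 and self.queue:
            probe_id = self.queue[0]
            render_face(probe_id, self.next_face)
            self.next_face += 1
            budget -= 1
            if self.next_face == self.FACES_PER_PROBE:
                self.queue.popleft()
                self.queued.discard(probe_id)
                self.next_face = 0
                finished.append(probe_id)
        return finished
```

With a budget of two faces per frame, one probe completes every three frames. Since the lights are allowed to take seconds to "settle" in this use-case, the budget can be tuned purely against frame time.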

    3) More generally, I'm wondering if your original intent/recommendation was to use the existing "AddLight" API vs calculating the SH coefficients ourselves. I didn't profile the AddLight API, just assumed I'd need to calculate the coefficients though it now looks like I probably could just use existing API.
    I think this was a dumb question. ;)

    4) Bonus: Is runtime probe API on the roadmap at all for the future? Especially (or only even!) runtime-friendly API for managing probe adjacencies, that'd be pure gold! =D

    Unsolicited 2-cents:
    This is actually a pretty awesome system overall, assuming the scalability unknowns are surmountable as they appear to be so far. Part of me doesn't want to say this, as it's a competitive field where you need to "earn" it -- but I'm also always happy when I can find answers online so, IMHO... This capability would warrant documenting in a "how-to"/use-case style, and in an ideal world Unity might consider adding 1 method to really increase developer usability/UX -- LightProbe.CalculateCoefficients(RT cubemap, v3[] results) -- I sincerely think doing so would mean a very-usable "fully dynamic runtime GI" solution would likely become immediately more accessible to substantially more developers/teams.

    Thank you! :)

    Seriously, really appreciate the idea and the guidance, will continue to report back (and will try to sleep on potentially-dumb Qs before posting them) -- but I think it has the potential to turn out really great, far exceeding our original [and fairly-low] expectations for our game's lighting! Thank you!!
    Last edited: Feb 16, 2020
  12. uy3d


    Unity Technologies

    Aug 16, 2016
    1) Unfortunately there's no path that's solely on the GPU. The probes reside in CPU memory, and the tetrahedralization is used to interpolate a probe per object (or update the light probe proxy volume) on the CPU, before the interpolated data is sent to the GPU. You probably won't be able to avoid async readback, as you don't want to sync CPU and GPU in the middle of a frame.

    2 (kinda)) For drawing the cubemaps, as you only want some form of indirect lighting, rendering only the lowest LOD, using a simple shader that just gives you (surface normal dot light) * albedo would probably be sufficient, possibly even just doing vertex lighting and multiplying it with the albedo. There's the option of trying to use a geometry shader to render into all 6 cubemap faces at the same time with the same drawcall, but depending on target platform this might not be any faster actually.
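The "(surface normal dot light) * albedo" term is plain diffuse (Lambert) shading. A minimal Python sketch of the per-vertex or per-pixel arithmetic follows; it illustrates the formula only, is not shader code, and the function names are hypothetical.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def lambert(normal, to_light, light_color, albedo):
    """Diffuse-only shading: max(N . L, 0) * light_color * albedo."""
    n = normalize(normal)
    l = normalize(to_light)
    n_dot_l = max(0.0, sum(a * b for a, b in zip(n, l)))
    return tuple(n_dot_l * lc * a for lc, a in zip(light_color, albedo))
```

Clamping the dot product at zero keeps surfaces facing away from the light black instead of negative, which matters because these values are later summed into SH coefficients.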

    4) There's quite a bit of work being done on providing a more current solution to lighting a world with light probes. You generally have the issue of
    - probe placement
    - light leaking (this is linked to placement)
    - orthogonal to that the problem of updating the probes.

    Placement currently is arbitrary. This requires a costly tetrahedralization step so a probe can be efficiently interpolated from an existing arbitrary grid. That is also the reason why you cannot simply add or remove probes at runtime - although we do provide this now when additively loading scenes.
    An alternative approach to probe placement is to simply confine them to a grid, which may also be sparse using virtual textures or indirections. The advantage is that as long as you know the grid origin and your position relative to it (can easily be done in the shader) you can directly sample the probes from a texture and have the GPU even do the interpolation for you. We use this for example in the Light Probe Proxy Volumes (LPPVs). The downside is that a grid doesn't necessarily match your scene that well, and grid positions are usually rigid, e.g. all probes within an object are wasted memory. There are approaches that allow you to add offsets to individual probe positions, although once a scene is flood filled with probes you might not want to go in and tinker with individual probes. Still, this is generally the direction we're heading in.
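For a regular grid, the lookup described here reduces to converting a world position into grid coordinates and blending the eight surrounding probes, which is what texture hardware does for a 3D texture. The Python below is a CPU-side sketch of that trilinear lookup under assumed conventions: `probes[ix][iy][iz]` holds a flat list of floats per probe (for example SH coefficients), cells are cubes of `cell_size`, and positions outside the grid are clamped to its border.

```python
import math


def sample_probe_grid(probes, origin, cell_size, position):
    """Trilinearly interpolate per-probe values stored on a regular grid."""
    dims = (len(probes), len(probes[0]), len(probes[0][0]))
    base, frac = [], []
    for axis in range(3):
        # Position in grid units, clamped to the grid's extent.
        g = (position[axis] - origin[axis]) / cell_size
        g = min(max(g, 0.0), dims[axis] - 1.0)
        i = max(0, min(int(math.floor(g)), dims[axis] - 2))
        base.append(i)
        frac.append(g - i)

    count = len(probes[0][0][0])
    result = [0.0] * count
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                weight = ((frac[0] if dx else 1.0 - frac[0]) *
                          (frac[1] if dy else 1.0 - frac[1]) *
                          (frac[2] if dz else 1.0 - frac[2]))
                ix = min(base[0] + dx, dims[0] - 1)
                iy = min(base[1] + dy, dims[1] - 1)
                iz = min(base[2] + dz, dims[2] - 1)
                probe = probes[ix][iy][iz]
                for k in range(count):
                    result[k] += weight * probe[k]
    return result
```

No tetrahedralization or neighbor search is needed, which is why a grid suits a tile-based game where probe positions can follow the build grid directly.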

    Leaking occurs due to interpolation before evaluation of the SH coefficients. This is especially difficult to deal with when the entire object is represented by one probe (LPPVs address this issue to a degree). Initially the idea for probes was to provide indirect lighting for dynamic objects. Later we added static scene dressing, so you could have small objects be lit by probes instead of having to lightmap them, but still have them take part in the lightmapping step. For larger objects, this is unfortunately still difficult.
    The next step is to add some form of visibility information to probes that can be used to determine how much influence a probe should have. As the visibility needs to be evaluated before interpolation, this means we lose hardware interpolation of the SH coefficients, at least from the TMUs (it's similar to the PCF issue for shadowmapping, although we eventually got a hardware solution to that from the GPU vendors). There are also specialized approaches like precalculating the transfer function for SH probes. You basically bake the visibility of the static scene instead of lighting into a probe. Then you multiply this with a probe to get the actual SH coefficients. This is often used to precalculate sky occlusion. You can have a dynamic time of day, updating the global light probe and "folding" it with the visibility. Our Book of the Dead demo does something similar to light all the trees in the forest. Generally speaking, there are various approaches to providing visibility, all having their pros and cons, as usual. Unfortunately, we don't provide a solution to augment probes with visibility information, yet, other than what was used in the aforementioned demo.
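The idea of applying visibility before interpolation can be illustrated by scaling each probe's interpolation weight by a visibility term and renormalizing. This is a hedged Python sketch, not something Unity provides: how the visibility value in [0, 1] is obtained (a wall lookup on the build grid, a ray test, baked data) is left open, and `blend_probes` is a hypothetical name.

```python
def blend_probes(samples):
    """Blend probe coefficients with visibility-scaled weights.

    samples: list of (interp_weight, visibility, coefficients), where
    visibility is 0 for a fully occluded probe and 1 for a fully visible one.
    """
    count = len(samples[0][2])
    weights = [w * vis for w, vis, _ in samples]
    total = sum(weights)
    if total <= 1e-8:
        # Every probe is occluded: fall back to plain interpolation.
        weights = [w for w, _, _ in samples]
        total = sum(weights)
    return [
        sum(wt * coeffs[k] for wt, (_, _, coeffs) in zip(weights, samples)) / total
        for k in range(count)
    ]
```

With one dark indoor probe and one bright outdoor probe on either side of a new wall, giving the outdoor probe a visibility of 0 for indoor receivers removes the leak. The cost is that the weights now depend on the receiver, so the blend can no longer be left to fixed-function texture filtering.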

    Updating the probes is orthogonal to the issues mentioned above, as we only need to generate the SH coefficients, and the visibility later down the line. In order to do this you need to evaluate the scene. This can be done by ray tracing the scene (this is what the progressive lightmappers are doing) or precomputing visibility form factors (Enlighten). Both steps are costly and thus done offline. This also limits them to static objects. The recent developments in the field of hardware accelerated ray tracing (DXR) are changing this to a degree, but require the latest hardware, so it'll take some time before these can be considered the standard way of doing things.
    Also, as probes so far have been a CPU thing, we haven't provided a GPU only path to modifying them. Once all evaluation is purely GPU based, having a fast update path that never leaves the GPU is certainly desirable. An API that batch updates probes using the fastest method available (e.g. ray tracing or cube maps) would be nice to have, but I can't make any promises there. I'm not sure to which degree we'll be able to provide a solution that works everywhere, keeping in mind the different render pipelines we have today and their different feature sets.
  13. Arthur-LVGameDev


    Mar 14, 2016
    Thank you, that was both extremely informative & quite helpful.

    I’m still working through it currently, perhaps feeling *slightly* less optimistic as I'm nearing a week into my "lighting" foray though I suppose that's not a ton of time in the grand scheme & is to be expected. I’m probably also toeing the line, as far as being in a bit over my head here -- though doing my best to keep my focus and continue productively moving ahead on it & staying hopeful overall. :)

    There's a good bit more I'd be interested in (and enjoy) discussing more generally on this topic, but in the interest of time & efficiency (and deadlines), I'll try to be brief & more tactical.

    I think you’re speaking more generally here, but this is almost exactly what I’m looking to do; our game is entirely grid-based and my goal is to just have some form of dynamic indirect lighting, and to have ‘control’ over it / everything is procedurally driven so it’s likely most efficient for us to “pipe in” map-state changes in batches.

    Based on my understanding of your description of their workings, LPPV is probably the way to go -- after some hands-on time with it, it looks like it's well-suited and generally even allows for (though yet untested) high-performance usage via InstancedIndirect draws -- and all via sharing just a few LPPVs. I was a bit concerned about the resolution, but it looks like I can override the resolution settings via code, and since we're controlling the refresh-rate it seems performant from my work with the paradigm thus far.

    I'm including some context below because I suspect that this is “piece-of-cake” territory for someone like yourself, and I have a feeling that you may be able to use it to provide a few sentences of "directional guidance" that could take this from "toeing-the-line" / barely attainable to quite doable -- simply because I lack what I'd call "whole-picture" familiarity/understanding (though I'm doing my best to rapidly improve on that).

    Game Context -- Grid Based:
    It's very similar to our first title, SimAirport (Web / Steam), which is an agent-based simulation & management sandbox-style game. Unlike SA, the project is 3D and has a different theme/setting, but otherwise quite similar. The entirety of the game is "grid-based" and all objects and walls are "placed" by the player on the logical grid; objects occupy 'cells' while players can freely construct walls which are always grid-aligned with cell edges (unlike in SA, where walls occupied the entirety of a grid cell). The camera is player-controlled, but generally played from an angled-above/overhead camera.

    FWIW, so far I've got most of this working -- though I'm currently using a 'constant' outdoor ambient (set on all LPs) and only varying the indoor ambient.

    Some Practical / Purely Tactical Questions
    1. Qs About LightProbeProxyVolumes:
      A) Runtime creation & manipulation? Appears to be "yes".

      B) Can set to higher-resolution than the inspector UI allows for (32x32x32)? Also appears to be "yes" though the Gizmos stop drawing, the resolution appears to be higher I think...?

      C) Effectively these are higher-resolution light-probe samplers -- meaning they still DO require existing [baked] LightProbes as their source of data -- or am I missing something here? Is there a way to use LPPVs instead of LightProbes, which is what you kind of implied (when talking about 'purely grid based' systems), or are they "in addition", and simply to provide more optimized & efficient Probe usage/interpolation?

      AFAICT, it's basically just improved resolution for large objects -- plus, and in our case the main benefit, the ability to "runtime bake" [via only occasional triggering of LPPV.Update()] the interpolated LP data to a texture for direct sampling on the GPU -- effectively a performance optimization to 'cache' what would otherwise be more continuous CPU-side interpolation.

    2. Qs About Light Probes
      A) You mentioned that additive LPs are now supported (as of 2019.3.x?) via additive scene loading. Can it be invoked more directly, or does it specifically require an 'additive scene load'...? What happens if I spawn a prefab that has a LPG on it & call Tetrahedralize? Or if I simply add additional probe data directly? Probably not worth exploring this too far, but it'd sure make life easier and allow "JIT" memory usage vs current trajectory of pre-baking probes to cover all ~ 256x256x50 units of each [6x] "buildable area" which just seems bad...

      B) If using a single LPG for all of our LPs, can I rely upon their order within LightmapSettings.lightProbes[] or is there any case where they may be re-ordered? The order appears to be maintained (in the order of their addition), but worth asking as it'd wreck my setup if it isn't guaranteed.
      PS -- I haven't taken the time to isolate it, but it feels like there's a memory leak around here somewhere, I've crashed Unity several times after 2-3 subsequent 'probe-only' bakes (300k+ probes; macOS 2019.2x, 64GB RAM). First one or two is fine, and then suddenly it locks up. No lightmapping is occurring, this is solely with probe positions in play (well, 1 static object).
      Edit: Interestingly, the 'Bake->Clear Baked Data' is extremely slow with 300k+ baked probes; it takes approx. as long (maybe longer) as the 'bake' itself.

      C) There's no way to run 'probe-only' bake, or to initiate it from code?
      Or to do so without requiring 1 static object + an ambient set? Now that I've got the workflow down it's not so bad, but the first day or two I struggled a bit+ on this 'editor side' stuff -- basically figuring out the 'minimum checkboxes required' to get probes to bake without other stuff happening, and while leaving control of the SH data to me (ie tetrahedralize but nothing else).

    3. [An Amateur] Q about CubeMap Renders
      A) How do I avoid over-exposing cubemap renders, ending up with increasingly-bright cubemaps?
      Ex: If CubeRenderA catches some light, and CubeRenderB is adjacent to it, how do I prevent the outcome of CubeRenderA's light (ie colorizing & brightening a nearby object) from impacting CubeRender B? Is the theory to render the cubemaps unlit? This seems like a dumb question but it's one that I still can't answer, and is why I've got the system running at the moment minus the cubemap aspect -- though it's soon to be the main thing left for me to solve. Worst case, I can programmatically define "objectX=>colorY" and 'splash' some ambient that way, but that seems cheesy since I have the cubemap stuff working -- it just over-exposes / the "order" of cubemap renders impacts the outcome.

      B) Best way to render CubeMap with a distinct "light" / shading setup is.... camera.ReplacementShader?
      And then we'd need to manually make sure that all our custom shaders are 'compatible' with the replacement shader? Is there a better or less-tedious way? Certainly is doable, but we have a decent number of shaders that we'd need to "mainstream" / consolidate.

    Tangential Rant
    Apologies in advance if this comes off overly harsh, it's not my intent, it's really just some general feedback; I know this is truly challenging stuff, and the use-cases & platforms you're targeting are extremely varied & generalistic in nature; if anything just providing my perspective, which likely has a different "tunnel of vision" vs someone who works on the internals day-to-day, and so figured it might be helpful, even if minutely [or ignorantly even]. ;)

    I find the lighting & shader ecosystem to be difficult to trace & follow and to debug overall. I understand it's still fairly "new" and still evolving (I didn't know until recently that the 'surface shader' concept is only ~10yrs old), and I certainly don't envy the "legacy" & platform support workload that I'm sure is no fun.

    Perhaps it's just the nature of GPUs, but tracing/following the Unity shading 'stack' reminds me very much of early-days Perl & PHP code: it feels extremely "global", with loads of implicit/by-convention calls, method-body introspection that impacts call semantics, etc.

    Overall it's just difficult to work with, at least from the perspective of an experienced logic/CPU dev who isn't specialized in the graphics pipeline. It doesn't help that the "CPU side" is fairly opaque; if it was more visible it would at least allow for interrogating & understanding the state more -- and would probably make life much easier. As is, the best resources are truly the docs (which are great but hit/miss), semi-dated Unity blog posts, and forum posts and/or finding reverse-engineered understandings (ie Google "SphericalHarmonicsL2"). It can be frustrating, and if it wasn't for some of the instant-gratification (ie live-reload) it might be close to too frustrating. ;)

    Thank you again -- really, appreciate it tremendously, and hopefully will be able to soon post some screenshots of what you've been able to guide me towards being able to achieve! :)
    Last edited: Feb 20, 2020
  14. uy3d


    Unity Technologies

    Aug 16, 2016
    1a) You can move them around, and creating a new component should also just work.

    1b) LPPVs are limited to 32 in each dimension when set through the UI. While setting a higher value via script will work, as soon as you change anything through the UI an internal validation step will clamp them back to max 32, so that's something to be aware of.

    1c) They are a way to have more detailed lighting for an object, where a single interpolated probe would not be enough. So your assumption is correct, they are NOT a replacement of baked probes. Unfortunately we haven't exposed a public API for you to just provide the volume texture directly.

    2a) It can't. Before, all probes had to be in your main scene. Allowing to have probes in scenes that are additionally loaded and then merged into the global probe volume via tetrahedralize is a recent addition. But the current CPU probes have to be one global probeset, which causes all sorts of limitations.

    2b) Are you baking using the CPU or GPU lightmapper?

    2c) There are no probe only bakes, yet, although we've had that request a couple of times now. Historically probes have been used to augment lightmapped scenes. Only recently have we started considering them a solution on their own. Also, without any static object to actually reflect indirect light, it doesn't really make sense to bake probes. Your specific use case of priming only positions through the bake is not something that was taken into account when designing the system.

    3a) If I understand you correctly, your problem is that you update one probe via a cubemap. Then when you render the next cubemap, objects are picking up the previously updated probe when rendered, meaning that in the first case they didn't have indirect lighting, but in the second case they now do. If that's what you mean, you should be able to disable probe lighting on the mesh renderer during the cubemap pass. Then only direct lighting should be calculated on the objects. lightProbeUsage on the mesh renderer should allow you to do this. Alternatively you could provide a custom probe by setting the usage to custom and providing SH coefficients via the material. This will only allow you to provide one probe for the whole object, though.

    3b) A few more details on what you're trying to do would be helpful.

    A lot of the issues you're running into are caused by trying to use the current system for a use case that it hasn't been designed for. And you can only bend it so much to what you need it to do. We're aware of the shortcomings and are working on providing solutions, but that doesn't help you right now. If you want total flexibility, you could consider running your own version of a global LPPV: keep a 3D texture around at the resolution you need, update it as needed, bind it as a global texture to shaders, and add the sampling code to your custom shaders (it sounds like you already have most of this running, anyway). This gives you the most control, and you won't have to fight the existing system anymore. You should still use all the SH related code for reference, even augment it by some rough culling to deal with leaking (we're not doing that, yet, as it can lead to lighting artifacts, but your title might be able to get away with it).
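To make the "own global LPPV" idea concrete, here is a minimal CPU-side sketch of the lookup a custom shader would otherwise get from hardware 3D-texture filtering: map a world position to grid coordinates, then blend the eight surrounding cells. The `ProbeGrid` type, its field names, and `sampleTrilinear` are hypothetical and not part of any Unity API; the sketch assumes one RGBA texel per grid cell, with texel centres in the middle of each cell.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU mirror of a custom "global LPPV": one RGBA texel per grid cell.
using Texel = std::array<float, 4>;

struct ProbeGrid {
    int nx, ny, nz;                    // resolution in cells
    float originX, originY, originZ;   // world-space minimum corner of the volume
    float cellSize;                    // world units per cell
    std::vector<Texel> texels;         // nx * ny * nz entries, x varies fastest

    ProbeGrid(int x, int y, int z, float ox, float oy, float oz, float cs)
        : nx(x), ny(y), nz(z), originX(ox), originY(oy), originZ(oz), cellSize(cs),
          texels(static_cast<std::size_t>(x) * y * z, Texel{0.0f, 0.0f, 0.0f, 0.0f}) {
        assert(x > 0 && y > 0 && z > 0 && cs > 0.0f);
    }

    std::size_t index(int x, int y, int z) const {
        return (static_cast<std::size_t>(z) * ny + y) * nx + x;
    }
    Texel& at(int x, int y, int z) { return texels[index(x, y, z)]; }
    const Texel& at(int x, int y, int z) const { return texels[index(x, y, z)]; }
};

// World coordinate -> continuous cell coordinate, clamped so that positions
// outside the volume reuse the nearest edge cell (clamp addressing).
static float toCellSpace(float world, float origin, float cellSize, int n) {
    const float c = (world - origin) / cellSize - 0.5f;  // texel centres sit at +0.5
    return std::min(std::max(c, 0.0f), static_cast<float>(n - 1));
}

// Trilinear blend of the 8 cells around a world position.
Texel sampleTrilinear(const ProbeGrid& g, float wx, float wy, float wz) {
    const float cx = toCellSpace(wx, g.originX, g.cellSize, g.nx);
    const float cy = toCellSpace(wy, g.originY, g.cellSize, g.ny);
    const float cz = toCellSpace(wz, g.originZ, g.cellSize, g.nz);
    const int x0 = static_cast<int>(std::floor(cx)), x1 = std::min(x0 + 1, g.nx - 1);
    const int y0 = static_cast<int>(std::floor(cy)), y1 = std::min(y0 + 1, g.ny - 1);
    const int z0 = static_cast<int>(std::floor(cz)), z1 = std::min(z0 + 1, g.nz - 1);
    const float tx = cx - static_cast<float>(x0);
    const float ty = cy - static_cast<float>(y0);
    const float tz = cz - static_cast<float>(z0);
    auto lerp = [](float a, float b, float t) { return a + (b - a) * t; };

    Texel out{0.0f, 0.0f, 0.0f, 0.0f};
    for (int c = 0; c < 4; ++c) {
        const float x00 = lerp(g.at(x0, y0, z0)[c], g.at(x1, y0, z0)[c], tx);
        const float x10 = lerp(g.at(x0, y1, z0)[c], g.at(x1, y1, z0)[c], tx);
        const float x01 = lerp(g.at(x0, y0, z1)[c], g.at(x1, y0, z1)[c], tx);
        const float x11 = lerp(g.at(x0, y1, z1)[c], g.at(x1, y1, z1)[c], tx);
        out[c] = lerp(lerp(x00, x10, ty), lerp(x01, x11, ty), tz);
    }
    return out;
}
```

On the GPU the same mapping collapses to a single filtered 3D-texture fetch. The "rough culling to deal with leaking" mentioned above would amount to dropping or re-weighting some of the eight taps before blending, which is also why it gives up hardware filtering.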
  15. Arthur-LVGameDev


    Mar 14, 2016
    Thank you very, very much. Extremely helpful and it's going to have a material [hah!] impact on our game. Really appreciate it, thank you.

    Re: 1C -- Yeah, if we could supply the texture ourselves that would be helpful. Either that or, and preferable really, would be some kind of async/frame-splitting for the LPPV update perhaps. LPPVs are fantastic tools, and I didn't understand what they actually were until this thread. I've got to see how it does on a bit better hardware, but I'm definitely very pleased with them functionally -- my only issue is the update spike, I'm already 'batching' changes but I may end up having to chunk them or similar, to keep LPPV.Update() calls reasonably fast/not have spikes.

    Re: 2A -- During build, you're collecting probes from all scenes in the build, or something like that? They just retain their world position in whatever scene they're in? It may be worth consideration actually, would mean >= resolution we've got now (+/- about 2 "cells") with lower RAM requirements and bit easier development, visually better outcome. Hm, may give 2019.3x a quick test with the project tonight & see if it's viable.

    Re: 2B - I had to check, but it's currently set to CPU Progressive.
    The Gizmos definitely come-and-go, I'm not sure what causes them to go away though, but I've just got some GOs following the mouse to debug it for now and we aren't actually lightmapping anything right now/I'm actively iterating on the LP/dynamic GI stuff so we have everything non-static (+1 single static object somewhere) to force it to bake the probes. Had that working initially, next day after a merge I struggled to figure out "min-checkboxes-required" to get the probes to bake which was frustrating, but since then have just not touched it & doing everything procedurally, has generally been fine except for the day when I was positioning/crashing probes. Next week or so, we'll actually want to bake the "surrounding" static area/scenery, but at the moment there's no reason for it to be on CPU vs GPU (though I'm on a Mac w/ a somewhat-dated GPU).

    FWIW, the probe bake currently freezes Unity up for ~5min or so, but does eventually recover; it's ~750k probes total, fairly dense in each of 6 distinct locations. They're all on 1 LPG because otherwise order appears to not be maintained, or at least it isn't FIFO anymore. Hitting 'Clear Baked Data' takes approx. the same amount of time as the bake, maybe longer; I had force-quit a few times which could be to blame, could potentially need to clear Library/*. We're still on 2019.2.x latest btw, though we'll update soon barring any major obstacles.

    Re: 3A (and 3B) -- Yep, you nailed it. Here's what I've got right now:
    • Area of map changes: enqueue relevant LPs for update.
    • Cubemap Render
    • AsyncGPUReadback
    • Calculate SH coefficients
    • Push SH back to LightmapSettings static
    • LPPV.Update() called in batches
    What you said is exactly right. It's a bit less of an issue after making LPPV.Update() calls be delayed/batched, but even that really is just (I think?) effectively prolonging the inevitable issue: that the outcome (SH coeffs) for any given map-state at a single probe aren't deterministic (due to indirect lighting), and so the actual outcome/SH values are impacted by the order in which the cubes render.
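The order dependence described above can be reproduced with a toy model. This is not the fix suggested earlier in the thread (disabling probe lighting during the cubemap pass); it is a different, commonly used alternative: hold back the write-back until every capture in the batch has finished, so all captures read the same snapshot. Everything in the sketch is hypothetical: `captureProbe` stands in for "render cubemap + project to SH", and the 0.5 bounce factor is arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for "render a cubemap at probe i and project it to SH": the result
// is the probe's own direct light plus a fraction of whatever indirect value its
// neighbours hold at the moment of the capture.
float captureProbe(const std::vector<float>& direct,
                   const std::vector<float>& indirectSeenByRenderer,
                   std::size_t i) {
    assert(direct.size() == indirectSeenByRenderer.size());
    float neighbours = 0.0f;
    int count = 0;
    if (i > 0) { neighbours += indirectSeenByRenderer[i - 1]; ++count; }
    if (i + 1 < direct.size()) { neighbours += indirectSeenByRenderer[i + 1]; ++count; }
    const float bounce = count > 0 ? 0.5f * neighbours / static_cast<float>(count) : 0.0f;
    return direct[i] + bounce;
}

// In-place: each capture is pushed back immediately, so later captures see it.
std::vector<float> updateInPlace(const std::vector<float>& direct,
                                 std::vector<float> probes,
                                 const std::vector<std::size_t>& order) {
    for (std::size_t i : order) probes[i] = captureProbe(direct, probes, i);
    return probes;
}

// Deferred commit: every capture in the batch reads the same snapshot, and the
// results only become visible once the whole batch is done.
std::vector<float> updateDeferred(const std::vector<float>& direct,
                                  const std::vector<float>& snapshot,
                                  const std::vector<std::size_t>& order) {
    std::vector<float> next = snapshot;
    for (std::size_t i : order) next[i] = captureProbe(direct, snapshot, i);
    return next;
}
```

With the deferred variant the result of a batch no longer depends on capture order, at the cost of bounced light taking one extra batch to propagate. It does not by itself stop light from accumulating across batches; for that, the captures still need to exclude (or attenuate) the probe contribution, as suggested above.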

    Was thinking there may be a better solution than "disable Probes momentarily" per object (and it's unclear what overhead, if any, that might incur), but I just wasn't sure -- it seems like a dumb problem, "too much indirect light" hah. But yeah, wasn't sure if there might be an easier solution that I'm just not aware of.

    3B is really a similar or same question -- basically are there any tricks to disabling lighting, shadows, essentially rendering only albedo (may not work/look right, not sure) but without having to modify all of our custom shaders, and without having to manually find+toggle individual objects. It's doable, but games like this we'll inevitably see players who decide to build 300 couches next to each-other; if there's a lower-complexity way to render "simple" it would certainly be attractive. After searching the docs further & connecting the dots a bit more, I think what you mentioned earlier is probably the right idea -- rendering it as vertex lit. I tried it earlier, I think it worked but we've got a couple custom shaders without a pass so it was hard to tell. I guess the idea is indeed, do whatever it takes to get a render pass that doesn't have indirect lighting applied -- though kind of was wondering if there is research/concepts that I'm unaware of & that might yield "easy way" style, hah!

    EXTRAS / NEW Qs =D

    LPPV.Update() -- Mentioned above, but would love a way to update it with time-slicing and/or run partial updates defined by coordinates, or update it in a thread (unlikely I know), or worst case to just push in our own texture.

    LPPV vs LP: Performance / Scale -- It seems to me that it's likely to be more performant to have all objects use a single (or a few 'main') LPPV if they're already located "on-the-grid" [that the LPPV covers] spatially, vs each object interpolating LP individually. True? At least, as long as the resolution is good enough & am able to bear the Update() cost/load... That's generally what I'm doing right now, is a single LPPV -- a "big grid" -- and having all procedural meshes use it. I haven't profiled yet, but based on how it works I'm anticipating that it's likely faster to have everything use that grid/the LPPV vs the interpolations, ie for user-placed object models, too. The downside is reduced resolution, but if the CPU gain is big enough it'd be worth it (especially if 2019.3x pans out & we can increase density/resolution).

    Thank you -- will probably include some screenshots in next post. :)
  16. Arthur-LVGameDev


    Mar 14, 2016
    Two for one tonight! =D

    Yeah, quite unsurprisingly, I think you were probably right... :)

    As much as I really hate reinventing the wheel and diverging from the well-traveled path (and potentially missing out on upstream improvements) I think that it, unfortunately, probably does make the most sense for us to try building & maintaining/pushing our own 3D texture for the SH data. We faced this same type of 'fork' often working with Unity in 2D, though thus far haven't had that experience in 3D at least. Argh, [yet another] "new" facet of this project, and time is definitely of the essence.

    FWIW, I tried 2019.3 and the additive LP loading does appear to work (and the Editor doesn't freeze as much during the bake), and the only "issue" that I saw really was that the array indices then are unknown/out-of-order & would need to resolve them once (guessing they likely *are* in some specified order, not sure what/which order though). That said, we're kind of precluded from making the jump to 2019.3x due to overall editor stability issues which is unfortunate as it otherwise does appear it would be a very workable (even if a bit suboptimal) approach.

    Before getting too far down the path of building the 3D textures ourselves, I'd like to confirm my findings on 2019.2x though, basically WRT the overall scalability of the LP/LPPV systems: how many probes does it take to get to the center of the proverbial Unity stack?

    What I've been doing thus far is simply baking a ton of probes in 1 scene -- "a ton" being defined as ~700k, which works though is a bit touch-and-go until they're baked, and even then is a little "finicky" / obviously is not optimal. Further, what I'd really like to do is bake ~700k probes per buildable area -- instead of the ~120k per buildable area that I'm laying down right now. I'm guessing that it's probably just not realistic to attempt baking 4.2M probes and, at best, I certainly wouldn't want to accidentally click that LPG's GO while Gizmos are enabled... ;)

    More game context: There's 1 scene that is 'the world'. Each buildable plot is 256x256 units (equates to our 'cells' in game logic). Players play/build on only one plot at a time. Players can build on "multiple floors" of their 256x256 area with each floor being 3 world-space units in height, and currently we impose an arbitrary max of 5 floors (though we wish to increase that to 11+). This means the total world-space size of a buildable plot area is currently 256x256x15 (swizzle varies but is unimportant).

    Worth confirming, though late-night/low-sleep testing implied it true, just to make sure:
    The baked probe positions [in 2019.2x & 2019.3x] -- they are baked in world space?

    The reason I ask is that, if they were baked in local space, then I could simply bake a single 256x256x11 "plot" in an arbitrary area, and then move the LPG to the actual correct location at runtime (theoretically the inverse could work too, moving the "world" to be in the right location, or via using multiple cameras, etc -- but this quickly becomes a messy workaround that breaks other things like static batching, lightmapping, baked occlusion culling, etc). =|

    So, based on all this, would your recommendation still be to pursue the creation of the Texture3Ds ourselves? Assuming so, could I get some more information about the following -- it's kind of jumbled sorry, but the gist should make sense I think/hope...

    A) The creation/data layout for 'unity_ProbeVolumeSH' itself.
    I've read through the shaders multiple times now, and I generally kind of follow, but any example and/or more explicit example would be super helpful to get me going in the right direction. For instance, if you could tell me roughly that the layout is of the 3D texture, ex:
    foreach(texture_px) {  SetPixel( Color( sh[0,1].x, sh[0,1].y)); } 

    Basically, some guidance to help me translate the piece I can't see, the texture creation side.

    B) How do I tell the shader(s) what SH coefficients to use?
    I've got the SH coefficients, I can correctly create them/they work well; and I can even determine what coefficients to use for a given world position (on the CPU/in C#), that part's good too! BUT: How do I tell the shader(s) what SH coefficients to use [for a given world-position]? I'm aware of LightProbeUsage.CustomProvided, but that appears to only allow me to send in a single set of SH coefficients. Which is fine for some cases, but not for others -- which means I basically need [and want, for performance reasons] to be looking specifically at the LPPV code/sampling structure instead, which leads me to the next question, C....

    C) Do I have to basically "copy/paste" the same shader code or can I override the data vars directly?
    Essentially it's ShadeSHPerPixel, and really everything downstream -- ie SHEvalLinearL0L1_SampleProbeVolume, etc.? The method bodies would essentially be the exact same, except they'd be switched to use my own uniforms (I think that's correct term) / different data variables, using the data that I create/format & provide.

    This is where my understanding of the render paths side breaks down; it's unclear to me when/what sets the data currently, if I can simply override it, if it's being set differently per-batch, etc... It'd be great if I could simply make use of the existing shader code directly, and simply swap/push in the appropriate data instead -- but again, I'm not sure if that's something I can do, even 'forcefully'...


    Thank you -- somewhat-anxiously looking forward to additional guidance. Unfortunately no screenshots yet, maybe another day or two away from them given the updated trajectory. Really trying my best to not "settle" for a poor solution though, it does make a big impact on the scene to have the light looking solid; not sure I can afford another whole week on it though, but hopefully I'm on the brink of seeing the light shining at the end of this slightly-dark tunnel!! =D


    BTW -- Perhaps I'm a bit ignorant or lazy or something, I don't know -- but FWIW I have read through all of the content in the docs here, including all of the linked content (technical paper, the presentation, and several other papers as well); I generally understand conceptually, but practically speaking the application of it is a little tougher -- and being able to follow along with concepts "in-use" via the existing stuff helps a ton. Hell, the linked paper damn near required me to use a magnifying glass, hah! ;)
    Last edited: Feb 22, 2020
  17. ldlework


    Jan 22, 2014
    Please make it so we can bake light probes into prefabs and tetrahedralize a randomly generated level at runtime!

    When searching the Unity forums and the wider internet, this has been discussed with Unity staff for *years*. I honestly cannot wrap my head around why an API for this isn't available.

    **Why** is LP only available through scene loading. **Why** not simply create an API for this? This goes for light-mapping too where one needs a bunch of hard to understand code in order to get lightmapped prefabs.

    With all the progress being made regarding lighting, seriously, how is it possible, procedural content isn't being considered at all? It's so obvious from all the communication between desperate and hopeless users and Unity engineers that there is absolutely nothing technical preventing Unity from just having a nice API for managing LP/LM and that the current situation is the way it is because that's just how it is. It hurts my face.

    Please say something to give hope.
  18. Arthur-LVGameDev


    Mar 14, 2016
    Alright, so I'm getting closer at least -- here's my bare-bones test bed:

    1 set of them is via "BlendProbes", another is via "CustomProvided" and a MaterialPropertyBlock, and the last one (and the only one that isn't working) is via a custom shader that's reading an input Texture3D which has been encoded with the same data that the MPB is receiving + running the logic/math from SH9.

    I have a feeling it's probably something simpler that I'm missing -- Texture filtering, or something like that maybe...? Any ideas, anyone? :) FWIW -- I've got the texture set as a "R32G32B32A32_SFloat" currently, though I tried a few; am not sure it's the issue though.

    Also, I see that Unity puts the L1s all into a single texture and then just offsets +0.25 and 0.50, to fetch the additional coefficients. Is that a certain GraphicsFormat, or are you just writing the data into the texture like that/in bunches?

    Thank you -- hopefully another day or two and will have some solid looking lights going on -- though they look solid as-is too, just not quite as flexible as this way will be. :) . Ty!
  19. Arthur-LVGameDev


    Mar 14, 2016
    Alright, I've verified my data -- believe the issue was the texture clamping, which I haven't solved yet but will probably convert it to use a ComputeBuffer instead, is probably faster & cleaner route anyways.

    WRT Rendering the "Custom SH" -- How to do so cleanly?
    What options exist for "overriding" the GI/indirect light, without having to duplicate basically everything else that's in the Standard shader's deferred render path? Is there a way to do this that is fairly clean? Or am I looking at having to duplicate much of the render path code/files into the project, just to modify the 3-5 methods related to SH? Even then, I'm guessing I'd need to rename the "new copy" of each and update includes throughout?

    To be clear, my goal is to "override" any/all relevant SH methods -- they'll remain functionally identical, I just need to change their data-source to use my custom data/buffer, instead of the default data sources (ie lppv texture & _sha/_shb/_shc).

    The only other semi-clean alternatives that I see right now are...
    A) Start with a new "Standard" shader, compile it to a big vert/frag shader, and then use that on everything; i.e. not overly clean & lacks long-term flexibility. My ideal would be to not break things that use the Standard shader, so that we're not creating a problem/steep learning curve for the future.
    B) Create a "custom light model" and then call into the base, and add my GI afterwards. I'm currently trying this approach, but I can't tell how clean it's actually going to be -- because the SH methods are called from all over the place and it's not easy to tell what the actual call-chain/path is. I'm having a hard time getting it to work cleanly thus far. =|

    TBH, as much as "monkey-patching" tends to lead to problems, it'd be fantastic in this kind of stuff (as would more leveraging/containment of call-scope overall).

  20. INedelcu


    Unity Technologies

    Jul 14, 2015
    The LPPV was originally designed to provide GI for large dynamic objects (like a character for example) and not to be a replacement for real time or baked GI systems.

    Some technical information about the SH 3D texture:

    LPPV component will encode into the 3D texture only L1 SH coefficients for each cell. Basically only the first (constant) band and the second SH band.

    [L00: DC]
    [L1-1: y] [L10: z] [L11: x]

    This is because the texture has 4 channels, so 1 channel for each of the components above.

    The 3D texture is actually a 3D atlas and encodes the information from above for each of the RGB channels.

    Some comments from our code:

    // The SH coefficients textures are packed into 1 atlas. Only power of 2 textures allowed. Occ part of the texture will contain probe occlusion.
    // For RGB parts, each texel contains 4 SH coefficients for L1 lighting.
    // ---------------------------------
    // | ShR | ShG | ShB | Occ |
    // ---------------------------------

    We used a 3D texture atlas to decrease the amount of texture slots required by shaders. Using a texture will provide builtin texture filtering between neighbour cells (or texels). If you plan to use a ComputeBuffer you have to manually interpolate data. If you look into SHEvalLinearL0L1_SampleProbeVolume (UnityCG.cginc) you'll also notice this line:

    float texCoordX = clamp(texCoord.x, 0.5f * texelSizeX, 0.25f - 0.5f * texelSizeX);

    It's used in order to not get bleeding between the color channels ShR, ShG, ShB that are part of the 3D atlas.
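As a sketch of how that clamp and the atlas layout fit together, the function below computes the U coordinate for each quarter of the atlas. It assumes that `texelSizeX` is the size of one texel in atlas U units (so 1 / (4 * cellsX) for a volume with `cellsX` cells along x) and that the parts are laid out in the ShR | ShG | ShB | Occ order from the code comment above. `atlasU` and `AtlasPart` are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Quarter of the atlas to read; order follows the "| ShR | ShG | ShB | Occ |" comment.
enum AtlasPart { kShR = 0, kShG = 1, kShB = 2, kOcc = 3 };

// u01:    normalized x position inside the volume, in [0, 1].
// cellsX: number of cells along x in ONE part (the atlas is 4 * cellsX texels wide).
float atlasU(float u01, int cellsX, AtlasPart part) {
    assert(cellsX > 0);
    const float texelSizeX = 1.0f / (4.0f * static_cast<float>(cellsX));
    const float u = u01 * 0.25f;  // squeeze the volume into the first quarter
    // Keep the sample at least half a texel away from the quarter's edges, so that
    // bilinear filtering never blends in texels of the neighbouring part.
    const float clamped =
        std::min(std::max(u, 0.5f * texelSizeX), 0.25f - 0.5f * texelSizeX);
    return clamped + 0.25f * static_cast<float>(part);  // +0.25 per part
}
```

For a volume 4 cells wide, `atlasU(1.0f, 4, kShR)` stops at 0.21875, half a texel short of the ShG quarter that starts at 0.25. This is the "+0.25 and 0.50" offset pattern asked about earlier: it comes from how the data is written into the atlas, not from a special texture format.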

    The C++ code to get the texel information for one cell looks something like this:

    void GetShaderSHL1CoeffsFromNormalizedSH(const SphericalHarmonicsL2& probe, Vector4f outCoefficients[SphericalHarmonics::kColorChannelCount])
    {
        for (SphericalHarmonics::ColorChannel ch = SphericalHarmonics::kColorChannelRed; ch < SphericalHarmonics::kColorChannelCount; ch++)
        {
            outCoefficients[ch].x = probe.GetCoefficient(ch, 3);
            outCoefficients[ch].y = probe.GetCoefficient(ch, 1);
            outCoefficients[ch].z = probe.GetCoefficient(ch, 2);
            outCoefficients[ch].w = probe.GetCoefficient(ch, 0) - probe.GetCoefficient(ch, 6);
        }
    }

    Here, the input probe structure is similar to its C# counterpart (SphericalHarmonicsL2).

    The outCoefficients represents the texel data for each of the 3D atlas parts:
    outCoefficients[0] is 4 floats for the corresponding ShR part.
    outCoefficients[1] is 4 floats for the corresponding ShG part.
    outCoefficients[2] is 4 floats for the corresponding ShB part.
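
    As a self-contained illustration of that swizzle for a single color channel, here is a small C++ sketch. Float4 and PackL1Texel are stand-ins invented for this example (the engine types above are not available outside Unity), and the input is assumed to hold the 9 coefficients in the same index order as probe.GetCoefficient(ch, i).

```cpp
#include <cassert>
#include <vector>

// Stand-in for the engine's Vector4f; for illustration only.
struct Float4 { float x, y, z, w; };

// c holds the 9 SH L2 coefficients of one color channel, indexed like
// probe.GetCoefficient(ch, i) in the snippet above (assumption).
// Output is in { x, y, z, DC } order, with coefficient 6 subtracted
// from the DC term exactly as the snippet above does.
Float4 PackL1Texel(const std::vector<float>& c)
{
    assert(c.size() == 9);
    return Float4{ c[3], c[1], c[2], c[0] - c[6] };
}
```

    Calling it once per channel gives the three texels that go into the ShR, ShG and ShB parts.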

    The remaining SH L2 band (the base of the pyramid from here) comes from normal Light Probe interpolation, which means it will not vary spatially inside the 3D volume. The decision to evaluate the third band is based on this option (Quality option in the LPPV inspector).

    I hope this makes the contents of the unity_ProbeVolumeSH texture clearer.
    Last edited: Feb 24, 2020
  21. Arthur-LVGameDev


    Mar 14, 2016
    Thank you!

    For anyone else in the future, this method may come in handy -- I've seen it used in a few spots, a few of them marked with "we should expose this" or similar comment. ;)

    Code (CSharp):
    // Method found via Unity's "Book of the Dead" project + also found in their GH repo for SRP/URP/HDRP etc:
    //
    public static void GetShaderConstantsFromNormalizedSH(ref SphericalHarmonicsL2 ambientProbe, Vector4[] outCoefficients) {
        for (int channelIdx = 0; channelIdx < 3; ++channelIdx) {
            // Constant + Linear
            // In the shader the normal is not swizzled (it's used as normal.xyz), so
            // swizzle the coefficients to be in { x, y, z, DC } order.
            outCoefficients[channelIdx].x = ambientProbe[channelIdx, 3];
            outCoefficients[channelIdx].y = ambientProbe[channelIdx, 1];
            outCoefficients[channelIdx].z = ambientProbe[channelIdx, 2];
            outCoefficients[channelIdx].w = ambientProbe[channelIdx, 0] - ambientProbe[channelIdx, 6];

            // Quadratic polynomials
            outCoefficients[channelIdx + 3].x = ambientProbe[channelIdx, 4];
            outCoefficients[channelIdx + 3].y = ambientProbe[channelIdx, 5];
            outCoefficients[channelIdx + 3].z = ambientProbe[channelIdx, 6] * 3.0f;
            outCoefficients[channelIdx + 3].w = ambientProbe[channelIdx, 7];
        }
        // Final quadratic polynomial
        outCoefficients[6].x = ambientProbe[0, 8];
        outCoefficients[6].y = ambientProbe[1, 8];
        outCoefficients[6].z = ambientProbe[2, 8];
        outCoefficients[6].w = 1.0f;
    }

    I've got it all working via ComputeBuffers and, thus far at least, it's quite fast. My only remaining stumbling block is how to cleanly get it running, ideally with the Standard shader as that would ensure we don't have any major ongoing cost to deal with as far as keeping it running while still upgrading versions, etc...

    I've got a very basic shader setup and working with it, but I've had issues getting further; I should perhaps open a topic in the Shaders forum for it, as it's probably a bit more 'generic' knowledge, but if you happen to know a good way to do this I'd be all ears. What I've got for it right now is a surface shader & "custom light model" implementation:

    Code (CSharp):
    #pragma surface surf CustomSH

    inline half4 LightingCustomSH_Deferred(SurfaceOutputStandard s, float3 viewDir, UnityGI gi, out half4 outGBuffer0, out half4 outGBuffer1, out half4 outGBuffer2) {
        return LightingStandard_Deferred (s, viewDir, gi, outGBuffer0, outGBuffer1, outGBuffer2);
    }

    void LightingCustomSH_GI (SurfaceOutputStandard s, UnityGIInput data, inout UnityGI gi) {
        LightingStandard_GI(s, data, gi);
        gi.indirect.diffuse = CUSTOMDIRECT_ShadeSH9(half4(s.Normal, 1), float4(0,0,0,0)).rgb;
    }

    This seems to work, though I'm not getting any Albedo (ie the texture in MainTex) to come through, only seeing the indirect light via the SH -- and my first hour attacking it didn't yield any results or reason that it's missing. It's wired up the same as others and just has a *very* basic surface shader that takes in a texture and a normal map... I don't know if it's an issue related to our use (and my lack of knowledge) of the Deferred pipeline, or if it's something else, though will begin debugging it first thing tomorrow.

    Ideal world, I'd just be able to replace the existing SH sampling code (both for probes & LPPVs) with my "new version" of the ShadeSH9. Any good ideas on how to accomplish this aspect in the cleanest & least-intrusive way perhaps? =D

    Thank you!!!
    Last edited: Feb 24, 2020
  22. uy3d


    Unity Technologies

    Aug 16, 2016
    It's getting a bit difficult to keep track of your questions, so I might miss a few of them.

    The light probes are in world space. Unfortunately you can't use a template scene that you then instance into the world with a transform.

    About the lighting calculations, have you tried doing them all in a surface shader and writing the result to the emissive output? You can get the worldspace position as input; using that you should be able to calculate the UVW coordinates to fetch an interpolated probe. The world normal can then be used to evaluate the probe. You would have to multiply the light with your albedo yourself, before writing the result to the emissive output.
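
    The evaluate-then-multiply step described above could look roughly like this, written as plain C++ rather than shader code so it can be run in isolation. It only covers the L0+L1 terms, assumes the three per-channel coefficient vectors were already fetched for the pixel, and uses hypothetical names (Float3, Float4, EmissiveFromProbe) that are not Unity API.

```cpp
#include <cassert>

struct Float3 { float x, y, z; };
struct Float4 { float x, y, z, w; };

// One color channel of L0+L1 lighting: dot(sh, float4(normal, 1)).
float EvalL0L1(const Float4& sh, const Float3& n)
{
    return sh.x * n.x + sh.y * n.y + sh.z * n.z + sh.w;
}

// shR/shG/shB: coefficients already fetched for this pixel (hypothetical
// inputs; in a shader they would come from the interpolated probe lookup).
Float3 EmissiveFromProbe(const Float4& shR, const Float4& shG, const Float4& shB,
                         const Float3& worldNormal, const Float3& albedo)
{
    // The normal is expected to be (roughly) unit length.
    const float lenSq = worldNormal.x * worldNormal.x
                      + worldNormal.y * worldNormal.y
                      + worldNormal.z * worldNormal.z;
    assert(lenSq > 0.99f && lenSq < 1.01f);

    const Float3 light = {
        EvalL0L1(shR, worldNormal),
        EvalL0L1(shG, worldNormal),
        EvalL0L1(shB, worldNormal)
    };
    // Nothing multiplies the emissive output by albedo for us, so do it here.
    return Float3{ light.x * albedo.x, light.y * albedo.y, light.z * albedo.z };
}
```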
  23. Arthur-LVGameDev


    Mar 14, 2016
    Sorry for the disorganized barrage of questions -- though really appreciate you putting up with me, and thank you for all the guidance you've been able to provide. It's been extremely helpful.

    Here's a screenshot of the outcome thus far:
    The walls are procedural and can be re-painted at runtime, same with the floor, and the light "bounce" will update & react appropriately. The "bar" itself is a single model & is/was placed at runtime. The only light in the scene is a directional light pointing straight down, for shadows.

    It's not perfect yet but it's getting there & is pretty solid overall; we did end up moving to forward rendering, since with the reliance on SH we just weren't gaining much with deferred. The main issue that remains is interpolation/blending: there is no interpolation at the moment. We also need to avoid taking cubemaps from inside object geometry & generally get 'smarter' placement of the pseudo-probes and/or the cubemaps.

    On the positive side, it's all runtime generated -- the pseudo-probes could be placed/moved anywhere, can add or remove them, etc. -- at some point though, you obviously either have to directly convey the SH data to the GPU or do so via convention (ie a grid, which is what's used here). No light leaks either, though that goes hand-in-hand with no interpolation yet too. ;)

    Thank you again -- really appreciate it. If you spot anything that looks off or that you know the solutions to, am definitely all ears. Thank you!! :)
  24. Arthur-LVGameDev


    Mar 14, 2016
    Sorry to dig this up again, I'm currently circling back to this project & working towards getting it production-ready. Really just have 1 question.

    My single(!) question is: How are you guys doing probe interpolation, specifically for LPPVs? And/or where is interpolation occurring -- presumably, it's during Texture3D creation (LPPV.Update), meaning it's calculated mostly on the CPU.

    My 'Disconnect': BUT my lack of understanding stems from the fact that the "sample points" (ie grid-based LPPV 'probes') appear to have a lower density than would be required for the resulting image/outcome. I'm not sure how this is happening, or how to performantly achieve the same result (ie without a bunch of lerping & extra data).

    Concrete Illustrative Example: If I create a Unity/standard LPPV and set it to have a large bounds-size & large texture resolution -- both being the same (ie 256x256x5 for both the resolution & the bounds). When the bounds-size & the resolution match, I would expect that I'd have overall "single-unit" density/resolution -- and I assume is what's happening! BUT THEN: How does Unity achieve the interpolation between the units?

    Edit: Maybe more succinct... "How does Unity avoid 'light banding' & achieve completely gradual per-pixel interpolation" in cases where, for instance, the LPPV is configured for "0.25 units" of probe density? What prevents 'seams / edges' from being visible every 0.25 units?

    Maybe I'm blind, but I don't see any 'magic' occurring in the ShadeSHPerPixel() method which would result in 'smooth' interpolation between two data-points [in the Texture3D]. My only guesses at this point are that either the Texture3D is actually created at a higher resolution [a multiple of] the resolution specified, but that would be quite a bit of extra data to generate & transfer -- else maybe the Texture3D uses a texture filter mode which gives the interpolation "for free" perhaps...?

    MISC NOTES - THOUGHTS from 'trenches', of low importance, though potentially is low-hanging fruit:
    A) SphericalHarmonicsL2 -- Not a big deal, though be nice if the SHL2 struct's methods were whitelisted for non-main-thread calls. My specific wishlist: Clear() / AddAmbientLight() / AddDirectionalLight() but the whole struct could probably be allowed without much risk... Anyone that finds/considers touching SphericalHarmonicsL2 itself probably has some idea what they're doing, and presumably, you can still "protect" the LightmapSettings.probes[] data via the same main-thread-only check -- to prevent anyone that doesn't know what they're doing from causing problems with the built-in stuff. I understand Unity's mentality on threading (and the headache-potential that exists), but from what I can see it actually looks like allowing the struct to be mutated from non-main-thread would be benign as long as the Unity accessors are protected (which I'm assuming they already are, though I didn't actually check).
    B) Unsure how to best convey, so grain-of-salt/top-of-my-head, but... the "terminology" with LightProbes seems almost like a misnomer & when working with it [for me, especially initially] it caused some confusion, especially when working with interpolated "probes" and comparing to some literature / when using them a bit more like "cached light emitters" vs purely as a 'LightSamplePoint', the latter of which is perhaps more directly-fitting with the Unity standard implementation/usage of them. I'm aware of the literature/academia and the common verbiage used therein, and that's probably how it came to be named in Unity too, though even the academic [and general lighting] space has some slightly varied naming/definitions/distinctions in use for the 'LightProbe' term. It's somewhat small potatoes but figured I'd voice the thought if nothing else -- as they say: naming things & cache invalidation. ;) =D

    Last edited: Mar 11, 2020
  25. uy3d


    Unity Technologies

    Aug 16, 2016
    The individual probes stored in the 3d volume map are interpolated based on the existing light probe grid on the CPU. The per pixel interpolation is done by the GPU when doing a bilinear texture fetch into the volume texture. The SH coefficients are linearly interpolated by the texture mapping units for us (the fact that you can just linearly blend SH coefficients is one of the nice aspects of them).
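
    The "you can just linearly blend SH coefficients" point can be checked with a few lines of C++. This is only a sketch for the L0+L1 terms of one color channel, with made-up type and function names (SH4, EvalL0L1, LerpSH): because the evaluation is linear in the coefficients, blending the coefficients first and evaluating afterwards gives the same result as evaluating both probes and blending the colors.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// { x, y, z, DC } coefficients for one color channel (illustrative type).
using SH4 = std::array<float, 4>;

float EvalL0L1(const SH4& sh, float nx, float ny, float nz)
{
    return sh[0] * nx + sh[1] * ny + sh[2] * nz + sh[3];
}

// What a linear texture filter does per component between two neighbouring texels.
SH4 LerpSH(const SH4& a, const SH4& b, float t)
{
    assert(t >= 0.0f && t <= 1.0f);
    SH4 result{};
    for (std::size_t i = 0; i < result.size(); ++i)
        result[i] = a[i] + (b[i] - a[i]) * t;
    return result;
}
```

    For example, with a = {1, 0, 0, 0.5}, b = {0, 2, 0, 1.5}, t = 0.5 and the normal (0, 1, 0), both orders give 2.0.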

    A) We'd have to check whether we can provide the functionality on threads other than the main thread, but with the general push towards DOTS it's more likely that any multi-thread support would come in that form.

    B) Naming can be a bit confusing indeed. It doesn't help that every engine seems to choose a different name for these things as well, and they do tend to differ in certain details. Unity for example has light probes, reflection probes, occlusion probes and the ambient probe which all capture a potential subset of incoming light at a location and encode it in various forms.
  26. Arthur-LVGameDev


    Mar 14, 2016
    Thank you, that makes a ton of sense WRT interpolation and was what I had suspected. Indeed, the ability to interpolate SH so easily / "natively" and for 'free' on the GPU is great.

    Right now I'm unfortunately using ComputeBuffers to pass the SH data in -- it's actually extremely fast & it has the advantage of allowing [very small] 'partial updates' to be very cheap -- but at the expense of not having interpolation, which is definitely required here in this case. That means I'm going to have to switch to Texture-based data transfer to take advantage of the interpolation, even if it may be a bit slower for partial updates [can always chunk it, etc].
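
    For comparison, this is roughly what doing the interpolation by hand on buffer data would involve. The C++ sketch below bilinearly interpolates a single float per cell from a flat row-major grid; the same blend would have to be applied to every component of every coefficient vector. SampleGridBilinear, the cell-center convention and the border clamping are assumptions made for this example, not how Unity or the code in this thread does it.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Bilinear lookup into a row-major width x height grid stored in a flat buffer.
// x and y are in cell units, with cell (i, j) centered at (i + 0.5, j + 0.5).
float SampleGridBilinear(const std::vector<float>& cells, int width, int height,
                         float x, float y)
{
    assert(width > 0 && height > 0);
    assert(cells.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));

    const auto at = [&](int cx, int cy) {
        return cells[static_cast<std::size_t>(cy) * static_cast<std::size_t>(width)
                     + static_cast<std::size_t>(cx)];
    };

    // Shift so that integer positions land on cell centers.
    const float px = x - 0.5f;
    const float py = y - 0.5f;
    const int bx = static_cast<int>(std::floor(px));
    const int by = static_cast<int>(std::floor(py));
    const float tx = px - static_cast<float>(bx);
    const float ty = py - static_cast<float>(by);

    // Clamp neighbour indices at the grid border.
    const int x0 = std::min(std::max(bx, 0), width - 1);
    const int x1 = std::min(std::max(bx + 1, 0), width - 1);
    const int y0 = std::min(std::max(by, 0), height - 1);
    const int y1 = std::min(std::max(by + 1, 0), height - 1);

    // Four reads and three lerps per value.
    const float top = at(x0, y0) + (at(x1, y0) - at(x0, y0)) * tx;
    const float bottom = at(x0, y1) + (at(x1, y1) - at(x0, y1)) * tx;
    return top + (bottom - top) * ty;
}
```

    That is four buffer reads per value in 2D (eight in 3D), which is the work a filtered texture fetch does in hardware.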

    I actually originally did attempt to implement it with Textures -- basically tried to implement it exactly the same way that Unity does for LPPVs -- but I had trouble (and IIRC may have mentioned in my above jumble of posts above), perhaps you could provide some guidance; I think I'm likely ~4 LoC away from having it working, and likely just missing something small/dumb (I ended up originally getting it working via simply swapping the samplers out for data buffers, with no underlying data changes beyond texture-vs-buffer, and it then worked as expected).

    RE: Switching Data Xfer to Textures [from buffers], to gain 'Free Interpolation'
    I think my data written into the Texture [colors/floats] was being truncated/clamped to 0..1f values -- though I wasn't able to conclusively verify it (unsure best way to debug that kind of thing, esp. w/ macOS dev/editor platform). I believe the texture had expected values when logged out on the CPU, but the results weren't right once the texture3D was pushed to the GPU; I'm confident my data was right, though it's possible [even if somewhat unlikely] that I mis-swizzled it -- but the visuals implied truncation/normalization or similar more than corruption / outrightly incorrect data.

    If there's documentation on doing this, I'm more than capable of consuming it -- would appreciate if can point me towards it though as I've looked / Googled / researched / read the default shaders / etc. -- quite extensively at this point and really can't find anything; did find an old blog from Aras wrt packing floats in textures, seemed likely to be out-of-date info though [and I'm not seeing any shader-side Unity code to 'decode'/denormalize packed floats] as I believe it was 4+ yrs old (few other research/notes I've found are below, too -- but haven't solved the issue yet).

    QUESTIONS! Might be the last of them, if we're lucky! =D

    1) Is there a specific TextureFormat that needs to be used to preserve the SH data correctly (ie signed float4s)?
    I thought I tried all of the signed-float formats I saw in the enum, but maybe I missed one -- or possibly it requires using one of the 'new' ones under the Experimental.Rendering namespace or something?

    2) Are any special semantics required to declare the texture in the shader to get back preserved data (ie unclamped signed float4s)?
    As in, are there semantics needed to preserve the float4s / get back signed high-precision floats that are interpolated (via texture filtering) but that are not clamped? I understand & can trivially sample a texture in a shader obviously -- my issue is [I *think*] just getting back signed floats that aren't clamped to 0..1f...

    3) Is there anything else happening during the texture creation / texture c'tor / data filling -- anything potentially 'notable' / anything considered semantically "atypical"?
    Is there some normalization happening to the float values that is specific to the texture? Or perhaps there's a specific TextureCreationFlag, or really is there anything unusual / anything worth calling out about the texture creation / data-packing process [beyond the swizzle]?

    My numbers look correct "going into" the texture; they work correctly when passed from CPU-to-GPU via buffers (ie they're swizzled correctly, the SH vals are as expected, etc.) -- but only when passed directly via buffers (and via manually passing multiple explicit float4s, which I used as a 'control' test). Basically I've got this all working -- I'm just missing something specific to the Texture3D create and/or the Texture3D sample syntax, which is basically required to achieve smooth interpolation... =|

    The texture creation & data-packing is pretty trivial code to whip up, and I recall being frustrated at it not working with my initial Texture3D approach -- then 30 minutes later I had it working fine, with the exact same data passed in, but passed in via buffers instead (and it's great, just unfortunately lacks 'free interpolation'; and if I do get float4 reading working with a texture, I may try writing the texture itself in a compute shader via buffers or similar -- but I gotta get signed precision float4s working without clamping/truncation first obviously).

    Basically, I suspect that I was simply missing *some* kind of 'small' semantic -- though I've searched far-and-wide and not seeing anything on it. I've done a *lot* of searching on this part, and I've come across a number of different results & various semantics -- and I actually tried several (if not all that I found) of them when doing the original implementation work. Unfortunately none of them yielded working results; a couple of them wouldn't compile on Metal/macOS, and others were equal results to declaring a standard Texture3D sampler.

    FWIW -- Some links & notes from debugging / researching this specific "T3D / float4 clamp / signed float reads":
    I obviously haven't solved the problem yet [hence the above questions] -- but posting this list to "show my work" + may help others searching + might help with documentation improvements, etc. Worst case, to show I'm not trying to be lazy and am sincerely looking hard for the solution -- most definitely not trying to waste anyone's time! =D

    • SL Docs: SamplerStates -- 'float' / 'float4' is not on this page anywhere.
    • Aras Forum re: "Texture3D<float4>" syntax -- thought it was the solution, but not sure I ever even got it to compile on macOS. I can't find the "Texture3D<float4>" syntax anywhere in the documentation today though, and there are very few results on google for it. Much of the results around T3D/float4 topics pertain to reading and writing to the T3D (which makes me feel dumb since I can't even get reads working [accurately] with signed float4s -- much less writes, hah).
    • Forum re: "RWTexture2D issues" -- OP followed up on Reddit -- seems eerily similar, though I don't believe it's related; and I'm running AMD GPUs on macOS, though the post is 2+ years old so maybe is similar issue that impacts Metal? Idk, seemed like a reach (esp. since we obv. know LPPVs work fine on Metal, syntax seems a more likely culprit).
    • SL Docs: Platform Differences -- This is the only place I can find the "RWTexture2D<float4>" syntax in the docs, and in this case it's within the "#ifdef SHADER_API_D3D11" section; I actually did try the RWTexture2D<float4> syntax but conditional'd it on '#ifdef SHADER_API_METAL || SHADER_API_D3D11' but I believe it did not compile (I don't recall for certain but fairly sure) -- this is actually a common [and annoying] thing I've encountered since developing from macOS for last few years. Even with '#pragma target/require' set to an appropriately-high-level for the contained code, I still have to conditional the code (and all usages of it + the 'else' must [usually?] exist otherwise the shader compilation may think it's always a no-op and compile a logically-unexpected shader result).
    • Forum re: "RWTexture<float4> Truncation" -- also on SO -- this could well be the issue I'm seeing, and I can pack the data if needed, but I don't see where is Unity unpacking it at so am not sure that it is the issue. If so, where's the unpacking/de-normalization -- aka what packing scheme is Unity using & where is that occurring?

    Ty so much btw -- this really has been a great experience; even if you're probably pretty tired of me by now, I really appreciate the patience & skilled guidance/insight. It's actually something I've thought about quite a bit lately and may bring up at a different time/place/post, but I've learned/observed some things (purely good [great] things via this thread/aspect of our project), and think the introspection is interesting & perhaps useful for others. For another time, but the gist is:

    I appreciate the guidance, patience, and overall highly-skilled & knowledgeable responses tremendously.
    Thank you!!
    Last edited: Mar 12, 2020
  27. uy3d


    Unity Technologies

    Aug 16, 2016
    First things first: after you called SetData on the texture, did you also call Apply() on it to actually upload the data? Buffers don't need that additional Apply call.

    There's nothing special about the format, internally we're using R32G32B32A32 for the lppvs. As long as it's floats, the engine shouldn't touch the data when copying to the texture. It would be helpful if you could c/p the code snippets on how you create the texture, how you update it and how you declare and sample it in the shader. You can try to use the shader includes in the unity editor's install folder under Data as reference on how to declare and sample them. Searching for unity_ProbeVolumeSH should provide the relevant code pieces as reference.

    If you want to go the compute shader route, be aware that you can only write to a RWTexture<float4>, but you cannot read from it, as you can only read from single channel RW textures. They relaxed this with DX12. You can bind the texture as an ordinary Texture3D in a pixel shader to read from it as usual, though.
  28. Arthur-LVGameDev


    Mar 14, 2016
    Alrighty, so went through this pretty closely and while I do think it's generally setup correctly (and interpolating!), as best I can tell it's clamping -- my guess is that it's clamping on the "set" side.

    I've got it setup so that I can toggle between the data transport being via either texture or buffer, and all else remains the same, so unless I've flip-flopped something dumb (though I've checked & re-checked pretty closely) then I think there's some root semantic / obvious thing that I'm probably missing and that is, best guess, causing the values to get clamped.

    Only other things maybe worth mentioning before getting into the code: I'm not limiting the "custom" SH to L, and; we use V3I as our 'game coordinates' and so the texture/data is setup as though XY plane, with Z being used for our "level" (7 per level for the higher-precision SH). The StructuredBuffer variant is handling the data exactly the same way (ie is using 7 buffers) / exact same data-structure layout, just with buffers so no "free interpolation".

    Anyways, here's the code. I've tried to only include the important bits, but it gets verbose somewhat quickly, heh! :)

    Simple 'Test Method' / Functional Usage Example

    Code (CSharp):
    var tex3D = new AmbientTexture3D(256, 256, 1);
    var sh = new SphericalHarmonicsL2();
    sh.Clear();
    sh.AddAmbientLight(color);
    tex3D.SetAll(ref sh, 0);
    Shader.SetGlobalTexture("CUSTOM_ProbeVolumeSH", tex3D.GetTexture(0));

    AmbientTexture3D.cs -- Note c'tor ~L15 + SetAll() method ~L28.
    Code (CSharp):
    2. public class AmbientTexture3D {
    3.         private const int SH_COEFFICIENTS_COUNT = 7;
    4.         private static Vector4[] sharedData = new Vector4[SH_COEFFICIENTS_COUNT];
    6.         private Vector3Int MapSize;
    7.         private Texture3D[] Texture_ByFloorLevel;
    9.         public AmbientTexture3D(int MapSize_X, int MapSize_Y, int Levels = 1) {
    10.             MapSize = new Vector3Int(MapSize_X, MapSize_Y, Levels);
    12.             // Create 1 T3D per 'Floor'
    13.             Texture_ByFloorLevel = new Texture3D[Levels];
    14.             for (int i = 0; i < Levels; i++) {
    15.                 Texture_ByFloorLevel[i] = new Texture3D(MapSize_X, MapSize_Y, SH_COEFFICIENTS_COUNT, TextureFormat.ARGB32, false);
    16.                 Texture_ByFloorLevel[i].filterMode = FilterMode.Bilinear;
    17.             }
    18.         }
    20.         public Texture3D GetTexture(int layer) => Texture_ByFloorLevel[layer];
    22.         private void ClearV4() {
    23.             var zero = Vector4.zero;
    24.             for (int i = 0; i < SH_COEFFICIENTS_COUNT; i++)
    25.                 sharedData[i] = zero;
    26.         }
    28.         public void SetAll(ref SphericalHarmonicsL2 sh, int floorLevel = 0) {
    29.             ClearV4();
    30.             GetShaderConstantsFromNormalizedSH(ref sh, sharedData);
    32.             Texture3D tex = Texture_ByFloorLevel[floorLevel];
    33.             for (int x = 0; x < MapSize.x; x++) {
    34.                 for (int y = 0; y < MapSize.y; y++) {
    35.                     for (int i = 0; i < SH_COEFFICIENTS_COUNT; i++) {
    36.                         tex.SetPixel(x, y, i, sharedData[i]);
    37.                     }
    38.                 }
    39.             }
    40.             tex.Apply();
    41.         }
    43.         // Method is via Unity's "Book of the Dead" project + can also be found in their GH repo for SRP/URP/HDRP etc:
    44.         //  
    45.         // Second param expects ... array.Length == 7
    46.         public static void GetShaderConstantsFromNormalizedSH(ref SphericalHarmonicsL2 ambientProbe, Vector4[] outCoefficients) {
    47.             for (int channelIdx = 0; channelIdx < 3; ++channelIdx) {
    48.                 // Constant + Linear
    49.                 // In the shader the normal is not swizzled (it's used as normal.xyz), so
    50.                 // swizzle the coefficients to be in { x, y, z, DC } order.
    51.                 outCoefficients[channelIdx].x = ambientProbe[channelIdx, 3];
    52.                 outCoefficients[channelIdx].y = ambientProbe[channelIdx, 1];
    53.                 outCoefficients[channelIdx].z = ambientProbe[channelIdx, 2];
    54.                 outCoefficients[channelIdx].w = ambientProbe[channelIdx, 0] - ambientProbe[channelIdx, 6];
    56.                 // Quadratic polynomials
    57.                 outCoefficients[channelIdx + 3].x = ambientProbe[channelIdx, 4];
    58.                 outCoefficients[channelIdx + 3].y = ambientProbe[channelIdx, 5];
    59.                 outCoefficients[channelIdx + 3].z = ambientProbe[channelIdx, 6] * 3.0f;
    60.                 outCoefficients[channelIdx + 3].w = ambientProbe[channelIdx, 7];
    61.             }
    62.             // Final quadratic polynomial
    63.             outCoefficients[6].x = ambientProbe[0, 8];
    64.             outCoefficients[6].y = ambientProbe[1, 8];
    65.             outCoefficients[6].z = ambientProbe[2, 8];
    66.             outCoefficients[6].w = 1.0f;
    67.         }
    68.     }
    Relevant Bits of Shader / CustomSH.cginc
    Code (CSharp):
    2.     // Map Localization Parameters - Size & MIN().  These are in 'GAME COORDS' / V3I-style notation (ie Z=int level)
    3.     int _AMBIENT_MAPSIZE_XY;
    4.     float3 _AMBIENT_MAPBOUNDS_MIN;
    5.     float3 localizedPosition(float3 worldPos) {
    6.         _AMBIENT_MAPSIZE_XY = 256;                              // TODO: UN-HARDCODE THIS!
    7.         _AMBIENT_MAPBOUNDS_MIN = float3(-82.0, 223.05, 68.0);   // TODO: UN-HARDCODE THIS!
    8.         // Localize it
    9.         float3 localized = worldPos - _AMBIENT_MAPBOUNDS_MIN;
    10.         // Re-Swizzle to XY, Z=(int)Level
    11.         localized = float3(localized.x, localized.z, 0.0);  // TODO: UN-HARDCODE 'Level' to be dynamic based on world Y[render]/Z[game/v3i]
    12.         return localized;
    13.     }
    15.     half3 CUSTOM_TEXTURE_SHEvalLinearL2 (half4 normal, float3 worldPos) {
    16.         float3 texCoord = localizedPosition(worldPos);
    17.         texCoord.z = (texCoord.z * 7);
    18.         half3 x1, x2;
    19.         texCoord.z += 3.0;  // 3
    20.         half4 _SHBr = UNITY_SAMPLE_TEX3D_SAMPLER(CUSTOM_ProbeVolumeSH, CUSTOM_ProbeVolumeSH, texCoord);
    21.         texCoord.z += 1.0;  // 4
    22.         half4 _SHBg = UNITY_SAMPLE_TEX3D_SAMPLER(CUSTOM_ProbeVolumeSH, CUSTOM_ProbeVolumeSH, texCoord);
    23.         texCoord.z += 1.0;  // 5
    24.         half4 _SHBb = UNITY_SAMPLE_TEX3D_SAMPLER(CUSTOM_ProbeVolumeSH, CUSTOM_ProbeVolumeSH, texCoord);
    25.         texCoord.z += 1.0;  // 6
    26.         half4 _SHC = UNITY_SAMPLE_TEX3D_SAMPLER(CUSTOM_ProbeVolumeSH, CUSTOM_ProbeVolumeSH, texCoord);
    27.         // 4 of the quadratic (L2) polynomials
    28.         half4 vB = normal.xyzz * normal.yzzx;
    29.         x1.r = dot(_SHBr, vB);
    30.         x1.g = dot(_SHBg, vB);
    31.         x1.b = dot(_SHBb, vB);
    32.         // Final (5th) quadratic (L2) polynomial
    33.         half vC = normal.x * normal.x - normal.y * normal.y;
    34.         x2 = _SHC.rgb * vC;
    35.         return x1 + x2;
    36.     }
    37.     half3 CUSTOM_TEXTURE_SHEvalLinearL0L1_SampleProbeVolume (half4 normal, float3 worldPos) {
    38.         float3 texCoord = localizedPosition(worldPos);
    39.         texCoord.z = texCoord.z * 7;  // each level has 7 units on Z (is hardcoded to 0 atm)
    40.         // sampler state comes from SHr (all SH textures share the same sampler)
    41.         texCoord.z += 0.0;  // 0
    42.         half4 SHAr = UNITY_SAMPLE_TEX3D_SAMPLER(CUSTOM_ProbeVolumeSH, CUSTOM_ProbeVolumeSH, texCoord);
    43.         texCoord.z += 1.0;  // 1
    44.         half4 SHAg = UNITY_SAMPLE_TEX3D_SAMPLER(CUSTOM_ProbeVolumeSH, CUSTOM_ProbeVolumeSH, texCoord);
    45.         texCoord.z += 1.0;  // 2
    46.         half4 SHAb = UNITY_SAMPLE_TEX3D_SAMPLER(CUSTOM_ProbeVolumeSH, CUSTOM_ProbeVolumeSH, texCoord);
    47.         // Linear + constant polynomial terms
    48.         half3 x1;
    49.         x1.r = dot(SHAr, normal);
    50.         x1.g = dot(SHAg, normal);
    51.         x1.b = dot(SHAb, normal);
    52.         return x1;
    53.     }
    55.     // VIA STRUCTURED BUFFERS -- this works (called-methods are omitted), but
    56.     // they're identical only via buffer[index] access instead of Sampler
    57.     half3 CUSTOM_COMPUTEBUFFER_ShadeSH9(float3 worldPos, half4 worldNormal) {
    58.         int index = LocalizedWorldPositionIndex(worldPos);
    59.         // Linear + constant polynomial terms;  + Quadratic polynomials
    60.         half3 res = CUSTOM_COMPUTEBUFFER_SHEvalLinearL0L1_SampleProbeVolume (worldNormal, index);
    61.         res += CUSTOM_COMPUTEBUFFER_SHEvalLinearL2 (worldNormal, index);
    62.     #   ifdef UNITY_COLORSPACE_GAMMA
    63.             res = LinearToGammaSpace (res);
    64.     #   endif
    65.         return res;
    66.     }
    67.     // VIA TEXTURE SAMPLER
    68.     half3 CUSTOM_TEXTURE_ShadeSH9 (float3 worldPosition, half4 worldNormal) {
    69.         // Linear + constant polynomial terms;  + Quadratic polynomials
    70.         half3 ambient = CUSTOM_TEXTURE_SHEvalLinearL0L1_SampleProbeVolume (worldNormal, worldPosition);
    71.         ambient += CUSTOM_TEXTURE_SHEvalLinearL2 (worldNormal, worldPosition);
    72.         #   ifdef UNITY_COLORSPACE_GAMMA
    73.             ambient = LinearToGammaSpace (ambient);
    74.         #   endif
    75.         return ambient;
    76.     }
    77.     half4 LightingCustomSH (SurfaceOutputStandard s, half3 viewDir, UnityGI gi) {
    78.         return LightingStandard (s, viewDir, gi);
    79.     }
    80.     void LightingCustomSH_GI (SurfaceOutputStandard s, UnityGIInput data, inout UnityGI gi) {
    81.         float3 worldPos = data.worldPos;
    82.         LightingStandard_GI(s, data, gi);
    83.         gi.indirect.diffuse = CUSTOM_TEXTURE_ShadeSH9(worldPos, half4(s.Normal, 1));
    84.     }
  29. uy3d


    Unity Technologies

    Aug 16, 2016
    You're using ARGB32 as your texture format. That's only 8 bits per channel, with [0.0;1.0] mapped to the range [0;255]. In the shader you won't be able to read values > 1.0f, which is what you're seeing. Try using RGBAHalf instead.
  30. Arthur-LVGameDev


    Mar 14, 2016
    I'd tried all the formats, as mentioned earlier, but I gave RGBAHalf a really quick shot just now and it still appeared to be clamped. I'll throw some "if(val > 1)" logic into the shader tomorrow to see whether it's definitely still clamping. Unless there's some background caching going on or something like that, the results appeared unchanged using RGBAHalf.

    Will debug a bit further after some sleep and report back though. Ty -- I think it's *very* close at least!
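    A minimal sketch of the "if(val > 1)" check mentioned above, written as a debug helper for the same Cg/HLSL shader. The function name and the debug colors are hypothetical and not part of the thread's shader. It is only meaningful if the data uploaded from the CPU side is known to contain values outside [0;1]; an 8-bit normalized format can store neither values above 1 nor negative values, so neither branch would ever trigger with such a texture.

```hlsl
// Hypothetical debug helper (illustrative name, not from the original shader).
// sampledSH: a raw coefficient vector read from the probe volume texture.
// shaded:    the normally computed ambient color, returned unchanged when
//            no out-of-range value is found.
half3 CUSTOM_DebugFlagRange(half4 sampledSH, half3 shaded)
{
    // Any component above 1.0 proves the format keeps values beyond [0;1].
    if (any(sampledSH > 1.0))
        return half3(1, 0, 1); // magenta

    // Negative coefficients are likewise impossible in an 8-bit UNORM texture.
    if (any(sampledSH < 0.0))
        return half3(0, 1, 1); // cyan

    return shaded;
}
```

    If the surface never turns magenta or cyan even though the source data contains such values, the texture is most likely still being stored or sampled in a clamped format.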
  31. uy3d


    Unity Technologies

    Aug 16, 2016
    Aren't you forgetting to divide your localizedPosition by _AMBIENT_MAPSIZE_XY? Your UV coordinates must be within the [0.0;1.0] range for all three dimensions. You may also want to set the wrap mode for the texture to clamp; repeat doesn't make much sense here.
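    As a rough illustration of that normalization, here is a sketch in the same Cg/HLSL style. The helper name and the volumeSize parameter are assumptions: the thread only names _AMBIENT_MAPSIZE_XY, so the Z extent is passed in explicitly here. The half-texel offset assumes localizedPosition holds integer cell indices; if it is already a continuous position in texel units, drop the + 0.5.

```hlsl
// Hypothetical helper (not from the original shader): converts a position
// expressed in texels inside the volume into normalized [0;1] coordinates
// suitable for a 3D texture lookup.
// volumeSize: texture resolution in texels along x, y and z (assumed input).
float3 CUSTOM_VolumeTexelToUVW(float3 localizedPosition, float3 volumeSize)
{
    // + 0.5 aims at texel centers, assuming integer cell indices.
    float3 uvw = (localizedPosition + 0.5) / volumeSize;

    // Redundant when the texture's wrap mode is clamp, but it keeps the
    // result well defined if the wrap mode is left at repeat.
    return saturate(uvw);
}
```

    The result would then be passed as the coordinate argument of the existing UNITY_SAMPLE_TEX3D_SAMPLER calls in place of the unnormalized position.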
  32. Arthur-LVGameDev


    Mar 14, 2016
    That was exactly right -- on both counts. On the UVs it was just a dumb mistake on the XY, and on the Z dimension I had incorrectly assumed that it worked like texture arrays and was expecting an integer index. It makes sense that it doesn't, though, and I guess that's probably a key distinction between the two (though they may also be distinct in hardware, unsure). Once that was fixed and TextureWrapMode was set to clamp, it appears to be functioning correctly, 1-to-1 with LPPVs. =D W00t.
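    For reference, the distinction between the two lookups can be sketched with Unity's built-in texture macros. The texture names and the function below are hypothetical; the snippet assumes it lives inside a CGPROGRAM block, where these macros are available.

```hlsl
// Hypothetical resources, declared only for this illustration.
UNITY_DECLARE_TEX3D(_ExampleVolume);      // Texture3D
UNITY_DECLARE_TEX2DARRAY(_ExampleArray);  // Texture2DArray

half4 CUSTOM_SampleVolumeVsArray(float2 uv, float sliceIndex, float sliceCount)
{
    // 3D texture: all three coordinates are normalized to [0;1], and the
    // hardware can filter between neighbouring slices along z.
    float3 uvw = float3(uv, (sliceIndex + 0.5) / sliceCount);
    half4 fromVolume = UNITY_SAMPLE_TEX3D(_ExampleVolume, uvw);

    // Texture array: z is the slice index itself, and no filtering happens
    // between slices.
    half4 fromArray = UNITY_SAMPLE_TEX2DARRAY(_ExampleArray, float3(uv, sliceIndex));

    // Averaged only so that both lookups are used in this sketch.
    return 0.5 * (fromVolume + fromArray);
}
```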

    Thank you!

    Am about to start looking at the same type of stuff for planar reflections, and am hoping it may 'just work' relatively easily, as it seems to be a similar paradigm overall. Hopefully I won't need to reinvent any more wheels at least -- the paths you guys have taken on this stuff seem quite sound; if anything, my only gripe is that I wish it were more extensible and not quite as "all-or-nothing". I suppose the different SRPs move towards solving that now, but I digress -- this is nothing short of a massive win, directly thanks to you and your team, for sure. =)

    I owe you guys copious numbers of beers/coffees if you're in the Las Vegas area at some point in the future. Thank you, immensely. :)