
Bakery - GPU Lightmapper (v1.6) [RELEASED]

Discussion in 'Assets and Asset Store' started by guycalledfrank, Jun 14, 2018.

  1. keeponshading

    keeponshading

    Joined:
    Sep 6, 2018
    Posts:
    344
Hey. Do you think it's possible to assign the compute devices used by the ftrace raytracers and denoisers, e.g. CUDA_0 up to CUDA_3?
We have some old BakeQueens with 2 to 4 GTX Titan V cards each.
So just a split across 2 or 4 GPUs would be enough.
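For reference, a generic way to pin a CUDA process to specific devices, independent of anything Bakery exposes, is the standard CUDA_VISIBLE_DEVICES environment variable. A minimal launcher sketch (`bake_tool.exe` is a hypothetical executable, not a real Bakery binary):

```python
import os
import subprocess

def gpu_env(base_env, gpu_id):
    """Copy of the environment that exposes exactly one CUDA device.
    CUDA_VISIBLE_DEVICES is a standard CUDA runtime variable: the process
    only sees the listed devices, renumbered starting from 0."""
    env = dict(base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env

def run_on_gpus(cmd, gpu_ids):
    """Launch one copy of `cmd` per GPU, each pinned to a single device."""
    procs = [subprocess.Popen(cmd, env=gpu_env(os.environ, g)) for g in gpu_ids]
    for p in procs:
        p.wait()

# Hypothetical usage, splitting a bake tool across CUDA_0..CUDA_3:
# run_on_gpus(["bake_tool.exe", "--chunk"], [0, 1, 2, 3])
```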
     
  2. bcoyle

    bcoyle

    Joined:
    May 22, 2013
    Posts:
    57
Not sure how I stumbled upon this asset, but it has me tempted to snap it up. One question though -- my current workflow is archviz option previewing in VR (e.g. flipping between 2 different furniture layouts by hiding/unhiding objects). I'll either bake the lighting in 3ds Max/V-Ray or use a simple ambient occlusion screen effect in Unity. Bakery looks sexier though!

    Would it be possible to isolate meshes to bake, then hide, and then bake another set of geometry without affecting the previous set?
     
    guycalledfrank likes this.
  3. guycalledfrank

    guycalledfrank

    Joined:
    May 13, 2013
    Posts:
    1,099
Yes, I can add an option to specify which CUDA devices are used (I was under the impression that all devices are utilized when left unspecified). Do you want to use different devices for raytracing and denoising? Or just to be able to ignore some devices?

    I recommend baking two separate scenes or using lightmapped prefabs for that. Basically each scene has a storage object linked to it, containing lightmap->mesh mapping. If you rebake the scene, it's refreshed, and hidden objects will lose their lightmaps. However, if you bake two scenes and then load them together at runtime, it'll work; similarly, baking two lightmapped prefabs in one scene will work too, because every prefab has its own storage object, separated from the scene.
    (I'd go with multi-scene workflow, since Unity complicated the prefabs a bit too much after 2018.3; I no longer fully understand how they work)
     
  4. keeponshading

    keeponshading

    Joined:
    Sep 6, 2018
    Posts:
    344


Simply: for everything where a CUDA compute device can be set.

Yes, all devices are utilized, but they seem to do the same calculation, because there is no speedup despite full GPU usage.
I tried to show it in 4 cases. See the times and screenshots that follow.

An ignore option should not be needed.
Here is the full run, with the full showcase at the end.


This is version 3, from the weekend.

Because of Bakery it is finally possible to do lightmap benchmarks without getting old.
Thank you.
Here is a quick, roughly 30-hour benchmark with some nice results.

Done with a real-world archviz scenario from ruggero:

    https://assetstore.unity.com/packages/3d/environments/urban/archvizpro-interior-vol-6-120489

    version ArchVizPRO_Interior_Vol.6_HDRP_v.5
    in
    Unity 2019.1.0f2
    with
    Bakery 1.6
    HDRP

    View attachment 449537

with these real-world standard scene settings.
Only RTX was toggled on/off.

    View attachment 449543

resulting in
16 x 4K (703 MB)
high-quality lightmaps.
Only the areas under the table and couch could use more light?

    View attachment 449546

    View attachment 449549

    View attachment 449564

with some surprises.

After every run I:

- did a reboot
- cleaned the Bakery cache folder
- removed the lighting calculation
- created a new scene
- started a mouse jiggler to avoid the company desktop lock

For the multi/single-GPU comparisons I:
- removed the second card physically, or
- deactivated the PCI slot in the BIOS when possible

Some results:



    GEFORCE GTX 1080

    2x GeForce GTX 1080

    Driver 425.31
    SLI
    enabled

1st 00h53m45s
2nd 00h51m12s
3rd 00h52m17s


    2x GeForce GTX 1080 (v2)

    Driver 425.31
    SLI
    disabled

1st 00h52m20s
    2nd 00h53m18s

! On some CUDA raytracers there is only a speedup when SLI is disabled. Not here.


    1x GeForce GTX 1080

    Driver 425.31

No success: it's a laptop setup with 2x desktop GTX 1080 cards, and after disabling one in the system console there was no way to run the test.


    Titan RTX

    2x Titan RTX
    Driver 430.64
    NV Link Bridge

    RTX mode off
    1st 00h55m40s
    2nd 00h53m56s

    RTX mode on
    1st 00h41m49s
    2nd 00h43m14s


2x Titan RTX
Driver 430.64
NV Link bridge removed

RTX mode off
1st 00h34m20s

RTX mode on
1st 00h33m18s


1x Titan RTX (second physically removed)
Driver 430.64
NV Link bridge removed

RTX mode off
1st 00h54m47s
2nd 00h55m12s

RTX mode on
1st 00h34m18s
2nd 00h32m07s


• ! NV Link slows things down a lot. Remove it.
• ! 1x Titan RTX is faster than 2x Titan RTX
Quadro RTX 6000

    1x Quadro RTX 6000
    Driver 425.51

    RTX mode off
    1st 00h48m59s

RTX mode on
1st 00h29m54s
2nd 00h29m19s


    2x Quadro RTX 6000
    Driver 425.51

    RTX mode off
    1st 00h44m50s

RTX mode on
1st 00h32m14s
2nd 00h31m43s
• ! 1x Quadro RTX 6000 is faster than 2x Quadro RTX 6000
    GTX Titan V

    4x GTX Titan V
    Driver 425.51
    SLI off

    1st 00h39m37s


    4x GTX Titan V
    SLI off
    Driver 431.36

    1st 00h38m23s
    2nd 00h37m47s


    4x GTX Titan V
    Driver 431.36
    SLI on

    1st 00h39m43s


Result for now, Bakery 1.6:

See:
! 1x Quadro RTX 6000: 00h29m54s
is slightly faster than
2x Quadro RTX 6000: 00h31m43s

But both cards render. See the 100% spikes during the ftrace and denoising calculations.
It seems they do similar calculations?



    1xQuadro6000rendering_07_RTX_on.JPG



    2xQuadro6000_rendering_03_SLI_off.JPG


! 1x Titan RTX: 00h32m07s
is slightly faster than
2x Titan RTX: 00h33m18s

But both cards render at full load. See the 100% spikes during the ftrace and denoising passes.
It seems they do similar calculations.

    2xTitan_NVLINKOFF_srendering.JPG



    ! 4x GTX Titan V
    are rendering

    4xGTX_Titan_V_02.JPG


Low-hanging fruit:

The logical result after these tests:

By delegating the ftracers and denoisers to compute devices cuda_0 and cuda_1,
we should get, for
2x GTX 1080: 00h51m12s,
around
2x GTX 1080: 00h28m00s.


By delegating the ftracers and denoisers to compute devices cuda_0 and cuda_1,

    we should get for

    2x TitanX RTX 00h33m18s
    around
    2x TitanX RTX 00h17m00s
    because
    1x TitanX RTX can do 00h32m07s



By delegating the ftracers and denoisers to compute devices cuda_0 and cuda_1,

    we should get for
    2x Quadro 6000 RTX 00h31m43s
    around
    2x Quadro 6000 RTX 00h15m00s
    because
    1x Quadro 6000 RTX can do 00h29m54s


By delegating the ftracers and denoisers to compute devices cuda_0 and cuda_1, cuda_2 and cuda_3,

    we should get for
    4x GTX Titan V 00h37m47s
    around
    4x GTX Titan V 00h09m00s


This linear scaling is a rough assumption, but a realistic one, because Blender Cycles (CUDA) for lightmap baking, and D-NOISE denoising (OptiX), behave the same way when fractions of the jobs are delegated to the available compute devices.
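The linear-scaling assumption behind these estimates can be written down explicitly. A sketch of the reasoning only, not anything Bakery actually does (the overhead term is my own addition):

```python
def ideal_multi_gpu_time(single_gpu_seconds, num_gpus, overhead_seconds=0.0):
    """Ideal linear scaling: each GPU traces/denoises its own fraction of
    the job, plus a fixed per-bake overhead (scene upload etc.)."""
    return single_gpu_seconds / num_gpus + overhead_seconds

# Numbers from the benchmark above (Titan RTX, RTX mode on):
one_card = 32 * 60 + 7                  # 1x Titan RTX: 00h32m07s
two_cards = ideal_multi_gpu_time(one_card, 2)
print(round(two_cards / 60))            # ~16 minutes, close to the 00h17m00s guess
```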

Let's do it. Let's test. It should be easy because of your clean architecture.

It also shows that with 5-year-old GTX Titan V cards, or with 2x GTX 1080, you can reach
fantastic results. For me that's the most impressive thing.


    Further possibilities:


• On the screenshots from high-end cards like the
  Quadro RTX 6000 / Quadro RTX 8000
  you can see that more than
  20 GB / 45 GB
  of VRAM currently stays unused during the complete bake.
  The cards have
  24 GB / 48 GB.
  So there is potential for a little(?) speedup by giving up your divide-and-conquer approach and pushing more to VRAM at once.
  You could save some writing and loading time.
  But it might not be worth the effort. Who knows? Could be time for specular GI. :)
• By using the OIDN denoiser you could probably get another 10 to 20%, because you can denoise in parallel on the CPU instead of at the end of the complete bake process on the GPU, like now.
  All the other stuff keeps running on the GPU. You probably know better.

With comparable settings the Unity Progressive Lightmapper crashes completely or switches to CPU.
With lower settings and lots of trying I got around 6 hours for far less quality, at a special star constellation I could not capture.

    I gave up after some time.
Sorry for putting this together so quickly.

And ArchViz 6 has big, big windows......
     
    Last edited: Jul 20, 2019
    guycalledfrank likes this.
  5. keeponshading

    keeponshading

    Joined:
    Sep 6, 2018
    Posts:
    344
.....because of that I am preparing, in my free time, a more real-world scenario for the next run,
based on free data.

• The Cornell Box is outdated, but it did a fantastic job.
• Sponza, too.
• ArchViz 6 is great, but its windows are too big.

So I chose the Paris Bistro scene.

It is quite some work, because of testing for PBR consistency. It needs new HDRP materials, and there are 600 textures.
I will also add a big tree left of the gate, like in the original scenario, with lots of leaves.

It will allow next-level tests, because it's 2019. :)

The scene also has harder access to the IBL, because the houses are high.
It was quite some fun to test different skies, from clear to overcast.

    01_.JPG

In most sun/sky constellations there is no direct access to the sun;
e.g. see the red area in the false-color visualization.


    04_.JPG

So that's easy, but we mostly need pure, nice indirect light inside the bistro, too.
As a future challenge there will be add-ons like caustics for static glasses and bottles on the tables. Just for fun.

    05_.JPG

There could be multiple separate time-of-day bakes, or a complete day/night cycle via an IBL timelapse.
If someone would enjoy helping with runtime switching or interpolating the calculated textures and light probes, send me a message.

The challenge should be to bake it all at once, with one IBL.

    06_.JPG


However, I think this is a nice benchmark scenario.
     
    Last edited: Jul 20, 2019
    guycalledfrank likes this.
  6. nsxdavid

    nsxdavid

    Joined:
    Apr 6, 2009
    Posts:
    341
    Just picked up Bakery. Working in Unity 2019.1.10f1, I get the following script errors on import:


    Assets\Editor\x64\Bakery\scripts\ftRenderLightmap.cs(2191,55): error CS0619: 'LightmapEditorSettings.Lightmapper.Radiosity' is obsolete: 'Use Lightmapper.Enlighten instead. (UnityUpgradable) -> UnityEditor.LightmapEditorSettings/Lightmapper.Enlighten'

    Assets\Editor\x64\Bakery\scripts\ftRenderLightmap.cs(2195,58): error CS0619: 'LightmapEditorSettings.Lightmapper.PathTracer' is obsolete: 'Use Lightmapper.ProgressiveCPU instead. (UnityUpgradable) -> UnityEditor.LightmapEditorSettings/Lightmapper.ProgressiveCPU'


    Assets\Editor\x64\Bakery\scripts\ftRenderLightmap.cs(3704,55): error CS0619: 'LightmapEditorSettings.Lightmapper.Radiosity' is obsolete: 'Use Lightmapper.Enlighten instead. (UnityUpgradable) -> UnityEditor.LightmapEditorSettings/Lightmapper.Enlighten'


Think you can #ifdef around those? It doesn't seem simple to update, since Lightmapper.Radiosity has no fields or properties of its own. :/
     
  7. guycalledfrank

    guycalledfrank

    Joined:
    May 13, 2013
    Posts:
    1,099
  8. nsxdavid

    nsxdavid

    Joined:
    Apr 6, 2009
    Posts:
    341
I'll try that, thanks
     
  9. atomicjoe

    atomicjoe

    Joined:
    Apr 10, 2013
    Posts:
    452
Hi there Frank!
I just picked up a Gigabyte RTX 2070 OC Gaming to replace my Gigabyte GTX 1080 G1 Gaming and, guess what?
It's actually SLOWER! LOL
I went from a 23-minute render to a 28-minute render for exactly the same scene and settings!
(Of course I enabled RTX mode. I have played with all the settings and even overclocked my RTX 2070, but there is no way to catch the GTX 1080 here!)

    Actually, it's more complicated than that: actual rendering is faster, but denoising is WAY slower with the RTX2070.

I don't know if it's even possible, but it would be awesome to have the denoiser use the additional RT and Tensor cores of the RTX series, because right now it's really stalling on the denoise pass.
     
  10. Homicide

    Homicide

    Joined:
    Oct 11, 2012
    Posts:
    243
Hi. I'm really pretty new to all the artsy, fancy side of visual stuff. My question is not so much about this asset, but about lightmapping itself.

Is there no way to bake lightmaps at runtime? Ever?

I haven't been able to come to a conclusion on this subject. Thanks
     
  11. keeponshading

    keeponshading

    Joined:
    Sep 6, 2018
    Posts:
    344
The GTX 1080 has 10% more CUDA cores + around 8% more clock speed.
The CUDA cores do all the work during rendering and denoising, at a higher clock.
That matches your slowdown fairly exactly.

So it is really hard to catch up at the end of the bake, where the denoisers are running.

Have you measured the difference between RTX mode on and off on the 2070?

I did some fast tests these days, running denoising on the CPU with the Intel denoiser (OIDN) instead of the OptiX denoiser (D-NOISE) on another CUDA path tracer (Cycles). With this you can get around 20% faster, because denoising no longer has to run on the GPU: it runs in parallel on the CPU after each bake atlas has finished rendering.
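The overlap can be sketched generically (a toy model, nothing from Bakery's or Cycles' internals; `render_on_gpu` and `denoise_on_cpu` are placeholder callables):

```python
from concurrent.futures import ThreadPoolExecutor

def bake_with_cpu_denoise(atlases, render_on_gpu, denoise_on_cpu):
    """Render atlases one after another (the serial GPU part) and hand each
    finished atlas to a CPU thread pool for denoising, so denoising overlaps
    the next render instead of running after the whole bake."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(denoise_on_cpu, render_on_gpu(a)) for a in atlases]
        return [f.result() for f in futures]

# Toy stand-ins: "render" doubles the value, "denoise" adds 1.
out = bake_with_cpu_denoise([1, 2, 3], lambda a: a * 2, lambda a: a + 1)
print(out)  # [3, 5, 7]
```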

It's hard these days. With the RTX generation you pay more for fewer cores, lower clocks, and even less RAM on the higher models, compared with the same segment of the previous generation.

Not completely correct, but we did such comparisons all week while building up our bake infrastructure.

For now it's like this: 2 people run a marathon (the bake).
The GTX runner has a distance of 38 km (10% less).
The RTX runner has a distance of 42 km.
The GTX runner also runs around 8% faster, over the shorter distance.

At the end of the long-distance race the RTX runner gets 64 bananas (RT cores) and 288 pieces of dextro energy (Tensor cores). Nice. But it really doesn't help him win, because the GTX runner has already finished the race.


    Screenshot_20190720-193650.png
     
    Last edited: Jul 20, 2019
    guycalledfrank likes this.
  12. atomicjoe

    atomicjoe

    Joined:
    Apr 10, 2013
    Posts:
    452
Actually, it's more complicated than that: the actual rendering is faster, but denoising is WAY slower with the RTX 2070.

Also, my RTX 2070 is overclocked to 1845 MHz on boost and my GTX 1080 wasn't overclocked (outside of the little factory OC it came with), so the denoiser's problem is definitely the lower CUDA core count compared to the GTX. But maybe there is some version of the denoiser that can take advantage of the RT and Tensor cores, which are otherwise just sitting idle in the RTX?
     
  13. atomicjoe

    atomicjoe

    Joined:
    Apr 10, 2013
    Posts:
    452
I don't know of any lightmapper that can bake at runtime.
There was an Asset Store plugin for baking procedurally generated meshes at runtime, but it was deprecated.
So currently, no, there is no way that I know of.
(Edit: there are several ambient-occlusion-only bakers in the Asset Store that work at runtime, though.)
But it wouldn't be of much use anyway, since baking is a slow process you normally don't want to bother the end user with.
Your best bet is to use realtime lighting if you are doing procedural stuff.
If your scene is premade, just bake it in the editor.
     
    Homicide likes this.
  14. QuantumTheory

    QuantumTheory

    Joined:
    Jan 19, 2012
    Posts:
    1,041
    Hi @guycalledfrank

The bicubic interpolation on the lightmaps is excellent. For small dynamic objects like grass and rocks, I'm doing a top-down render of the scene's lightmaps and saving that into an image, then using that image in a custom shader. The scene is outdoors, so this works pretty well for our purposes.

    Obviously there is a difference in texel size between the world's lightmaps and this new one, but I was wondering if I can run bicubic interpolation on those captured images in the custom shader. If so, could you shed some light on what functions I need to filter the images?

    Thanks!
     
    guycalledfrank and keeponshading like this.
  15. keeponshading

    keeponshading

    Joined:
    Sep 6, 2018
    Posts:
    344
Sure, there will be (and already are) some RT-core optimizations for OptiX, especially for the denoiser. I did not include those in my explanation; only what lands on my SSD today. :)

This week there will be some more tests,
also with the 2nd generation of the Titan, the Pascal one.
By benchmarking all 3 generations of the Titan you see interesting stuff,
especially the doubling of the price while getting less CUDA compute power and RAM.
So the GTX Titan V, the oldest one, around 4 years or more, is a pretty good CUDA bake beast for the price at the moment.

    I played around with this last month....

    https://github.com/ROCm-Developer-Tools/HIP

Really impressive, because you can convert CUDA to portable C++.
You get lots of "Ah"s and "Oooh"s, because it allows tests that are not limited to NVIDIA's playground.
Anyway, I hope there will be more competition in the future.
     
    Last edited: Jul 22, 2019
    guycalledfrank likes this.
  16. keeponshading

    keeponshading

    Joined:
    Sep 6, 2018
    Posts:
    344
FYI,
I tried to rebuild this method from Frostbite, already linked in the Bakery documentation:

https://www.gdcvault.com/play/1025434/Precomputed-Global-Illumination-in

A nice presentation that showed me that Bakery has reached
a best-in-class baking level.

With the latest additions like
• Directional SH mode
• Non-linear L1 SH
and, the cool part, by additionally using several very light layers of HBAO as shown in the presentation (3 to 5 should be enough) plus Distance Shadowmask, this GDC presentation could be used as a Bakery tutorial now. Anyway, it works.
     
    Last edited: Jul 22, 2019
    guycalledfrank likes this.
  17. Krubbs

    Krubbs

    Joined:
    Mar 15, 2013
    Posts:
    30
Is it really possible to reduce rendering time on 2x GPUs?
I have two RTX 2080s and I feel betrayed now.
Such powerful hardware simply stays idle.

If you say yes, when can we expect a new release?

I see small spikes of GPU 1 & 2 load. Neither GPU even reaches 30% load. If it's possible to use 100% of them and really speed things up by splitting tasks across the 2 GPUs... wow... such a wonderful future.
     
    Last edited: Jul 22, 2019
  18. maart

    maart

    Joined:
    Aug 3, 2010
    Posts:
    76
  19. guycalledfrank

    guycalledfrank

    Joined:
    May 13, 2013
    Posts:
    1,099
In non-RTX mode it's expected, given the CUDA core counts, BUT did you try enabling the RTX checkbox (in advanced settings)? It should really beat the 1080 with dedicated RT hardware.

The current denoiser uses a rather outdated (pre-RTX) version of the library, but I have a new one now. I sent you a PM; test it. It should use your Tensor cores.
Also note that the first denoiser launch is always slower than subsequent launches (the drivers cache something, I believe).

    Such asset can be implemented, but Bakery is not suited for that.

Sure, you can. Take a look at the ftrace.cginc file, the ftLightmapBicubic() function. I did not invent it, but copied it from open Nvidia code somewhere: https://github.com/zchee/cuda-sampl...bicubicTexture/bicubicTexture_kernel.cuh#L116
I've also seen the same code used in Mirror's Edge.
As you can see, one limitation of this technique is that you need to know the texel size (1/resolution) in advance; that's why Bakery's shader tweak is limited to DX11, as I use the GetDimensions() function for that. However, I only do it to avoid sending more data from the CPU to the GPU or altering existing shaders. For most textures (but not lightmaps) Unity already gives you a texel size variable (see TexelSize: https://docs.unity3d.com/Manual/SL-PropertiesInPrograms.html)
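For illustration, the cubic B-spline weights behind that style of fast bicubic filtering can be sketched in plain code. This is a generic illustration of the math, not the actual ftLightmapBicubic() shader; the 1D sampler and half-texel handling are my simplifications:

```python
import math

def bspline_weights(a):
    """Cubic B-spline weights for fractional position a in [0,1),
    as used in fast bicubic texture filtering (4 taps per axis)."""
    w0 = (1.0 / 6.0) * (-a**3 + 3*a**2 - 3*a + 1)
    w1 = (1.0 / 6.0) * (3*a**3 - 6*a**2 + 4)
    w2 = (1.0 / 6.0) * (-3*a**3 + 3*a**2 + 3*a + 1)
    w3 = (1.0 / 6.0) * a**3
    return w0, w1, w2, w3

def bicubic_sample_1d(samples, x):
    """Filter the 4 texels around x with B-spline weights. `samples` maps
    integer texel index -> value; note this needs the texel size
    (1/resolution) already resolved, as discussed above."""
    i = math.floor(x - 0.5)           # leftmost of the 4 taps is i - 1
    a = (x - 0.5) - i                 # fractional offset inside the texel
    w = bspline_weights(a)
    return sum(w[k] * samples[i - 1 + k] for k in range(4))

# Weights always sum to 1, so flat data stays flat (~5.0 here):
print(bicubic_sample_1d({j: 5.0 for j in range(-2, 6)}, 1.7))
```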

    Nice. Too bad I can't decompile dependencies like OptiX though.

    It's pretty cool how they bake lowres proxy geometry and then project lighting on detailed meshes. Fast to bake and SH gives you proper XYZ directionality for details. Something like that can be scripted on top of Bakery. AFAIK UV projection code was a part of Enlighten, but Unity never used/exposed it.

    First check: make sure you have the RTX checkbox enabled.

Now, given the multi-GPU benchmark results, it does indeed look like it's not parallelizing the job well. I think I'm gonna make a small benchmark program that traces some random rays and debug it. It will help a lot if you're able to run some tests with it. I only have one GPU at the moment and can only guess what works.

    You don't really have to use them (unless you need SH lightmaps). Anyway, what errors do you get? HDRP is being updated rapidly and even minor updates break my HDRP shader over and over again.
     
  20. Krubbs

    Krubbs

    Joined:
    Mar 15, 2013
    Posts:
    30
    true

    I'm ready
     
  21. keeponshading

    keeponshading

    Joined:
    Sep 6, 2018
    Posts:
    344
It's a shame how Enlighten was bashed back then. It never got a full integration into Unity.

However: this low-res proxy projection from the Frostbite slides, together with this slide,

PrecomputedFormFactors.JPG

finally gave me a good understanding of how Enlighten was able
to update lightmaps and probes so fast, by using these simplified models and multi-threading.
     
    Last edited: Jul 22, 2019
  22. atomicjoe

    atomicjoe

    Joined:
    Apr 10, 2013
    Posts:
    452
    THIS!
    THANK YOU!
     
    guycalledfrank likes this.
  23. atomicjoe

    atomicjoe

    Joined:
    Apr 10, 2013
    Posts:
    452
    Don't. Frank is super responsive.
    If it can be fixed, it will. :)
     
    guycalledfrank likes this.
  24. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    4,337
They used to do it on the CPU, but moved to compute shaders recently. But my guess is that you can do it in a vanilla shader too. I had a post in this thread that stumbled on a similar idea for OpenGL ES 2.0 and up, but I have now moved to a probe-based solution.

You can try this:
- Put the "surfaces" in a texture, that is, encode area size, normal, position, albedo.
- Apply the direct lighting computation using the normal and albedo; compute shadows from the shadowmap by comparing positions.
- Store the result in a "direct lighting" dynamic texture.
- For every pixel in the lightmap, associate a bigger texture tile that stores the UV addresses of the surfaces visible from that pixel (looked up in the "surfaces" texture), plus weight data based on visibility. That texture is lightmap size * number of samples. You would probably store those addresses in a tile configuration, so that you can easily hash the lightmap pixel position and get to the proper tile. Let's call this map of addresses the indirection tilemap.
- At each iteration, sample the "direct lighting" texture through the indirection tiles and accumulate the result into the lightmap. It's the computational cost of a naive blur of 'tile size'.

YOU NEED BAKERY to bake the indirection tilemap; the author said he could probably provide an API for custom bakes like that (i.e. for each point, return data from the hit impact, sort the data yourself and bake it where you want).

Some observations for optimization:
- The surfaces texture can be organized as a lightmap, so you can just use the lightmap; the data is basically the same as a G-buffer. So you only need the lightmap data plus an accumulation lighting map (just a new layer of the lightmap) that first gets the direct light and then adds the indirect light at each iteration.
- The tiles are basically a representation of the hemisphere above the lighted points: small hemicubemaps, an atlas of hemicubemaps above each point, but kind of importance-sampled. I used that observation to move to another approximation where I use box-projected light probes as the indirection texture.
- If you map the lightmap to itself with the indirection texture, you have a baked lightfield of all the rays. Since the lightmap has the position of each point, you can infer each ray's beginning and end, and therefore use that to inject intersections with primitives. It also means you can go bidirectional for the light calculation: you can start on the tile and find which point all the rays focus on (by hashing the position).
- With OpenGL ES 2.0 you are stuck with 8 texture samples, and the direct lighting costs a few (normal, albedo, position). This limits the ray gathering from the tilemap to a few rays per iteration, but you can use mipmap sampling to simulate either multiple rays at once (a power of two per mip) or cones (deeper mips as distance grows); be careful of edge artifacts, as it's a lightmap representation.
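The gather iteration described above can be sketched as plain code (a toy CPU model of the shader pass; `direct`, `indirection`, and `weights` are my own names, not any actual API):

```python
def gather_iteration(direct, indirection, weights):
    """One bounce of the gather step: for every lightmap texel, sum the
    'direct lighting' texture at the precomputed UV addresses, weighted by
    visibility. `direct` maps UV -> radiance; `indirection[texel]` is that
    texel's tile of UV addresses; `weights[texel]` the matching weights."""
    bounce = {}
    for texel, uvs in indirection.items():
        bounce[texel] = sum(w * direct[uv] for uv, w in zip(uvs, weights[texel]))
    return bounce

# Two texels, each seeing two surfaces with equal weight:
direct = {(0, 0): 1.0, (1, 0): 3.0}
indirection = {"a": [(0, 0), (1, 0)], "b": [(1, 0), (1, 0)]}
weights = {"a": [0.5, 0.5], "b": [0.5, 0.5]}
print(gather_iteration(direct, indirection, weights))  # {'a': 2.0, 'b': 3.0}
```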
     
    guycalledfrank and keeponshading like this.
  25. XRA

    XRA

    Joined:
    Aug 26, 2010
    Posts:
    188
    Hey just curious if this is a known limitation or bug, noticed that Subtractive light mode doesn't bake correctly when using RNM or SH directional modes
    (left incorrect result, right correct result w/ directional mode none)
    subtractiveShBug.PNG subtractiveCorrect.PNG
     
    Last edited: Jul 23, 2019
  26. RockSPb

    RockSPb

    Joined:
    Feb 6, 2015
    Posts:
    102
May I ask: why do you need real-time GI on GLES2-level hardware? All that dependent texture fetching will be incredibly slow.
     
  27. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    4,337
Yes, it will be slow, but not framerate-dependent; it's basically cheap real-time baking.

Also, this is an approximation, so it's probably full of inaccuracies and artifacts anyway (same as Enlighten); you need to design around them. I stumbled on it: I didn't explicitly set out to design a GI solution, I was looking for time-of-day shadow baking. I want to make an open world on the Mali 400; they are widespread here (and I'm poor, that's the hardware I could afford at <100€). I haven't implemented it yet, so I don't know the actual performance, but even 2 s of latency would be okay (I wouldn't inject dynamic lights).

But to fully answer the question:
- Production-wise, I don't have to bake maps offline; that's one step less.
- Even without the GI, it decouples lighting from static geometry via the direct-lighting lightmap G-buffer (though the precision of 8 bits has to be managed for position), and therefore from framerate, which would allow for more complex lighting shaders. You SHOULD probably separate dynamic effects and specular lighting from the constant diffuse environment lighting.
- Memory-wise, it makes a mobile game smaller; lightmaps take a lot of memory.
- Once the compute is done, it's a single texture fetch for all objects, and you can even discard the extra textures.
- It covers many permutations of lighting that you probably don't want to bake on a strict memory budget, provided you stay within the precision budget.
- Like Enlighten, it probably works with small texture areas of less than 1 pixel per meter (as recommended by Enlighten).
- If you do procedural packing of procedural objects' UVs, it's compatible with procedural generation.
- It's just another option for cases where you need it; more tools are always better. It gives an extra touch.
- GI is low-frequency, so it matters less if it's slow, and you can easily spread the compute to whatever update frequency you need; time-of-day changes are slow too, so you can tune the duration to the update. You wouldn't use it to gather dynamic objects either; it's static-to-static GI, environment only.
- It's still probably faster than any other method for low-end hardware; you won't do BVH tracing or ray marching past a point.
- It's still probably useful for higher-end machines, or even VR.
- It's actually very flexible: you can mix and match techniques, change data precision, spread the compute as much as you want, and even use prebaked lighting to jumpstart the compute.
- It's basically a shader version of Enlighten, so it covers the same use cases.

And the extended probe version can update changes in scenery and project onto dynamic objects (dynamic objects don't contribute unless you use shadowmap injection), for the same cost of a single texture fetch for static objects and a single cubemap fetch for dynamic ones, but with less geometric precision on the GI (the box-projection approximation). It would also allow updating not just a lightmap but a 3D probe volume too (which is 2 fetches and one lerp for all objects on OpenGL ES 2.0).

I'm going to try to implement it soon for OpenGL ES 2.0 and I'll report the results. I'll just need to have the setup done, ask @guycalledfrank for the promised API, and try it myself.
     
    Last edited: Jul 24, 2019
  28. mikerz1985

    mikerz1985

    Joined:
    Oct 23, 2014
    Posts:
    53
    Hi -- I've used Bakery in the past and found it awesome.

    I have a theoretical use case I want to run by you; for the sake of argument let's say I have a stadium 3d model, and 30,000 seats. Each seat has a piece of data associated with it, and the user can choose to sit in that seat. Currently, I bake the seats into groups of meshes; let's say 100 groups of 300 seats. The meshes can be marked as static, and then baking can happen as normal.

    What I would like to do instead, is have a special ECS based, GPU instanced seat rendering system which can also support LOD. I would like not to have to use a realtime screen space AO effect, but rather bake the lighting ahead of time.

    Does this sound doable to you?

    Some immediate questions I'm thinking about:
    I'm not sure how to relate a particular seat's uv to a particular lightmap's UV.
    I'm not sure how to relate a different LOD level's uv to the lightmap UV. I imagine they would have to be generated in a spatially normalized way.
    I'm not sure how to bake based on a ECS system, and not using static geometry. This one is perhaps more an optimization; I could still bake static geometry purely for the purposes of being able to bake lightmaps.
     
  29. atomicjoe

    atomicjoe

    Joined:
    Apr 10, 2013
    Posts:
    452
Yes, just render the seats separately using Lightmap Groups plus the prefab component, and make them a prefab.
Not an issue with lightmap groups.
A good LOD generator automatically generates the LODs with matching UVs.
I use MantisLOD and it's fantastic.
About static: you can mark a seat static for baking, then mark it as not static for management, and then, once you have placed it in the scene by code, reset it to static again.
I can't speak about ECS since I haven't used it, but IIRC Bakery needs to manually assign its lightmaps by code each time the mesh is loaded, so it may be more complex than expected.
     
    guycalledfrank likes this.
  30. guycalledfrank

    guycalledfrank

    Joined:
    May 13, 2013
    Posts:
    1,099
    @keeponshading @Krubbs
I made a little benchmark program. Basically it's a small part of the RTX lightmapper that traces an HDRI for a somewhat complex scene.
https://drive.google.com/open?id=1YhImIhF-ESttzOZKJ-W-ldRt_OCt1WuK
Launch run_benchmark.bat, then read .benchmark.txt; it should tell you the measured render time.
Please confirm that there is no benefit in a multi-GPU config. I'll try to modify some bits and send you an updated version then.

    One thing that bothers me: if you put a bunch of visible-point UVs at ray hits in a tile, adjacent values are going to be very far from each other in UV space, which means a cache miss at every read.
    This is assuming I get your idea correctly, i.e. a small tile of UV values at each texel.
    Alternatively... you can atlas multiple copies of the complete image in the tile map, each lightmap-sized, but each storing a different UV value. This way you can sort of blend each sub-image over the final lightmap using linear texture access (I can probably draw a visual explanation if my wording is hard to understand).

    I should!

    Huh. Check the example_subtractive scene and make sure your setup is similar. Maybe switch the render mode to Shadowmask so you can see the "baked contribution" option on the light, switch it to "Direct and Indirect", then enable the Subtractive render mode again.

    I heard Unity are looking for a lighting engineer ;)

    I have some doubts about the tile map size in VRAM. If you have a 1024x1024 lightmap (that's modest) and at least 4x4 tiles (rather noisy) and given UVs need to be at least half2, that's 4096x4096x4 = 64 MB. It's not uncommon to have only <= 256 MB of available VRAM on mobile devices, especially older ones.
    Plus having a lightmap GBuffer will add to this. You can compress normal and albedo, but I have no idea what to do with position, it's still 3 (or usually 4, because most GPUs don't support 3) floats per texel. For small scenes you may be able to get away with half3(4).
    Pre-baked lightmaps, on the other hand, can at least use compression (ASTC, ETC, PVR).
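    To make the budget above concrete, here is the same arithmetic as a quick Python check (the resolutions and formats are the ones assumed in this post, not fixed Bakery values):

```python
# Tile-map VRAM estimate from the post: a 1024x1024 lightmap with
# 4x4 tiles per texel, each entry a half2 (two 16-bit floats = 4 bytes).
lightmap_res = 1024
tile = 4
bytes_per_texel = 4                    # half2 UV value
side = lightmap_res * tile             # 4096
total = side * side * bytes_per_texel  # bytes for the whole tile map
print(total // (1024 * 1024))          # -> 64 (MB)

# A float4-per-texel position buffer for the lightmap G-buffer adds:
pos = lightmap_res * lightmap_res * 4 * 4
print(pos // (1024 * 1024))            # -> 16 (MB)
```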

    If lighting doesn't differ much between the seats, you can use Lightmap Prefab component to bake a few unique seat lightmaps and reuse them.
    If it does, I suggest putting them in a Lightmap Group with "pack atlas" mode (default). You'll then get one lightmap texture for all seats, plus renderer.lightmapScaleOffset will be assigned a scale/offset vector to transform each seat UVs with. If you also use "override resolution" checkbox on the Lightmap Group Selector component, you can force all seats to have the same exact size (in pixels) inside the atlas; this will allow you to also skip the "scale" part of the lightmapScaleOffset, as it'll become a known constant.
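    For illustration, the remap that `renderer.lightmapScaleOffset` describes can be sketched in a few lines of Python: scale lives in the vector's xy components and offset in zw, per Unity's convention (the concrete numbers below are made up):

```python
# Sketch of how a lightmapScaleOffset vector (sx, sy, ox, oy) remaps a
# mesh's lightmap UVs into its region of a packed atlas:
#   atlas_uv = mesh_uv * (sx, sy) + (ox, oy)
def atlas_uv(mesh_uv, scale_offset):
    u, v = mesh_uv
    sx, sy, ox, oy = scale_offset
    return (u * sx + ox, v * sy + oy)

# A seat whose lightmap occupies a quarter-size tile placed at (0.5, 0.0):
so = (0.25, 0.25, 0.5, 0.0)
print(atlas_uv((0.0, 0.0), so))  # -> (0.5, 0.0), the tile's lower-left corner
print(atlas_uv((1.0, 1.0), so))  # -> (0.75, 0.25), the tile's upper-right corner
```

    With the "override resolution" trick described above, `(sx, sy)` becomes the same known constant for every seat, leaving only the offset to vary per instance.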

    Bakery currently can't bake multiple LOD levels into one texture, so you'll need a separate Group for each LOD level.

    I've never used ECS so far, but you can bake in a "normal" project and then grab the lightmaps/UV offsets and use them however you want.
     
    Last edited: Jul 24, 2019
  31. keeponshading

    keeponshading

    Joined:
    Sep 6, 2018
    Posts:
    344
    Hi. I can't access the Titan RTX and Quadro RTX setups before next week.

    So if someone could support this task earlier, we would get results sooner.

    I really recommend physically removing the second card from power and slot. Deactivating it in the system control panel or BIOS did not work properly in all cases.

    @guycalledfrank
    Does your benchmark program only work with RTX cards, or with the GTX generation too?
     
    Last edited: Jul 24, 2019
  32. atomicjoe

    atomicjoe

    Joined:
    Apr 10, 2013
    Posts:
    452
    We need at least 2x RTX cards to test this, right?
    I'm afraid it's not very common :p
     
  33. keeponshading

    keeponshading

    Joined:
    Sep 6, 2018
    Posts:
    344
    It's standard now in our company's high-end visualisation PCs, but you are right, for now it makes no sense.

    Bakery could be the first software to fully support RTX on/off and multi-GPU.)

    That's reason enough for NVIDIA to send some test samples to @guycalledfrank
    Ask them. You will get some.

    The GTX series would be important too.

    Right now only Blender Cycles gives me close-to-linear scaling with GTX and RTX cards, because I can delegate the bake jobs to the available compute devices by hand.
    So 50 jobs to do:
    25 for cuda_0,
    25 for cuda_1.
    There is no RTX on/off benefit for now.

    In the 2.8 version I could also delegate 50 jobs per console to:
    22 for cuda_0,
    22 for cuda_1,
    6 for intel xxxx.
    It's faster again, but your PC becomes absolutely unresponsive.
    But there is the good feeling of using your hardware the right way.
     
    Last edited: Jul 24, 2019
    guycalledfrank and atomicjoe like this.
  34. guycalledfrank

    guycalledfrank

    Joined:
    May 13, 2013
    Posts:
    1,099
    It's like baking with "RTX mode" on. Should run on both GTX/RTX, but may be slower on GTX. Anyway, in this case the interesting part is to compare one GPU vs many.

    Hmmmm...
     
  35. atomicjoe

    atomicjoe

    Joined:
    Apr 10, 2013
    Posts:
    452
    Yeah man, just ask them nicely and make the point that it's very good promo for their OptiX technology in Unity, which has already made several users switch to new RTX hardware :)
     
    keeponshading likes this.
  36. Krubbs

    Krubbs

    Joined:
    Mar 15, 2013
    Posts:
    30
    Very strange results.
    Several launches:

    2xRTX 2080 (no NVLink)
    1. 1.717 s
    2. 1.310 s
    3. 1.166 s
    4. 1.162 s
    5. 1.171 s

    1xRTX 2080
    1. 2.002 s
    2. 0.828 s
    3. 0.846 s
    4. 0.850 s
    5. 0.842 s

    Only the 1st run shows 2x being faster than 1x. From the 2nd run on, we can see that 1x 2080 is faster than 2x 2080. Hmm...
    But I don't have an NVLink bridge.
     
    Last edited: Jul 25, 2019
    keeponshading and guycalledfrank like this.
  37. keeponshading

    keeponshading

    Joined:
    Sep 6, 2018
    Posts:
    344
    Fyi, in addition to testing and using multi-GPU setups for production bakes:
    when you have finished your biggest multi-scene setup, all multi-GPU optimisation is done and your scenes are waterproof, you don't need to buy GPUs yourself to get a fast, productive bake done.
    E.g. the Amazon cloud has some great setups to bake the biggest stuff down in hours/days.
    https://aws.amazon.com/de/blogs/aws/in-the-works-ec2-instances-g4-with-nvidia-t4-gpus/

    Current pricing:
    https://aws.amazon.com/de/ec2/instance-types/p2/
    8x GPU: 7.20 USD per hour
    16x GPU: 14.40 USD per hour
    which is pretty fair.

    That's the biggest argument for the per-job compute device split shown in my benchmark post before. It makes this possible too, without any hassle.

    Applied to a time-of-day scenario for the Archviz 6 example.
    And it's only an example, but possible with 1 Unity and 1 Bakery license.


    4x GTX Titan V: 00h37m47s at the moment
    around
    4x GTX Titan V: 00h09m00s after compute device optimisation, where possible

    16x GPU AWS cloud: around 00h02m00s for one lighting scenario
    For 14.40 USD you can calculate 30 different lighting scenarios in 1 hour.
    The IBL switch times and save times are not factored in; these are only rough estimates meant to show a possible trend. If you don't like multi-lit scenarios, it scales similarly for a level 16 times bigger than the Archviz example.
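    As a quick sanity check of the cost estimate above (prices and bake times are this post's rough assumptions, not quotes):

```python
# 16-GPU instance at 14.40 USD/hour, ~2 minutes per lighting scenario.
rate_per_hour = 14.40
minutes_per_scenario = 2
scenarios_per_hour = 60 // minutes_per_scenario
print(scenarios_per_hour)                  # -> 30 scenarios in 1 hour
print(rate_per_hour / scenarios_per_hour)  # -> 0.48 USD per scenario
```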

    So computation costs for multiple dynamic time-of-day bakes for lightmaps and probes could become realistic very fast, given some nice lightfield-like storage and compression for them.
    All this with the mostly physically correct data that Bakery already delivers for all bake modes and holy-moly tweaks.
    Not some "I approximate here and there" finger painting.
     
    Last edited: Jul 25, 2019
    guycalledfrank likes this.
  38. guycalledfrank

    guycalledfrank

    Joined:
    May 13, 2013
    Posts:
    1,099
    Sent you an updated benchmark.
     
  39. keeponshading

    keeponshading

    Joined:
    Sep 6, 2018
    Posts:
    344
    eDmitriy and guycalledfrank like this.
  40. keeponshading

    keeponshading

    Joined:
    Sep 6, 2018
    Posts:
    344
    Hi,
    on a deep dive around probes I played around with
    https://github.com/TheRealMJP/BakingLab

    (by browsing the code you find also some hints for Directional in Enlighten, https://github.com/TheRealMJP/BakingLab/commit/157a4e6fdb81a2e5999f231543316ea4ee9c0a2e )

    because it allows some nice comparisons of ground-truth path tracing against:

    • Diffuse – a single RGB value containing the result of applying a standard diffuse BRDF to the incoming lighting, with an albedo of 1.0
    • Half-Life 2 – directional irradiance projected onto the Half-Life 2 basis[6], making for a total of 3 sets of RGB coefficients (9 floats total)
    • L1 SH – radiance projected onto the first two orders of spherical harmonics, making for a total of 4 sets of RGB coefficients (12 floats total). Supports environment specular via a 3D lookup texture.
    • L2 SH – radiance projected on the first three orders of spherical harmonics, making for a total of 9 sets of RGB coefficients (27 floats total). Supports environment specular via a 3D lookup texture.
    • L1 H-basis – irradiance projected onto the first two orders of H-basis[7], making for a total of 4 sets of RGB coefficients (12 floats total).
    • L2 H-basis – irradiance projected onto the first three orders of H-basis, making for a total of 6 sets of RGB coefficients (18 floats total).
    • SG5 – radiance represented by the sum of 5 SG lobes with fixed directions and sharpness, making for a total of 5 sets of RGB coefficients (15 floats total). Supports environment specular via an approximate evaluation of per-lobe specular contribution.
    • SG6 – radiance represented by the sum of 6 SG lobes with fixed directions and sharpness, making for a total of 6 sets of RGB coefficients (18 floats total). Supports environment specular via an approximate evaluation of per-lobe specular contribution.
    • SG9 – radiance represented by the sum of 9 SG lobes with fixed directions and sharpness, making for a total of 9 sets of RGB coefficients (27 floats total). Supports environment specular via an approximate evaluation of per-lobe specular contribution.
    • SG12 – radiance represented by the sum of 12 SG lobes with fixed directions and sharpness, making for a total of 12 sets of RGB coefficients (36 floats total). Supports environment specular via an approximate evaluation of per-lobe specular contribution.
    The complete blog here
    https://mynameismjp.wordpress.com/2016/10/09/sg-series-part-6-step-into-the-baking-lab/
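    Since the float counts in the list above translate directly into texture storage, a quick sketch shows the cost spread between the bases (the 1024x1024 resolution and 16-bit precision are illustrative assumptions, not values from the Baking Lab):

```python
# Per-texel float counts taken from the Baking Lab basis list above.
floats_per_texel = {
    "Diffuse": 3, "HL2": 9, "L1 SH": 12, "L2 SH": 27,
    "SG5": 15, "SG9": 27, "SG12": 36,
}
res, bytes_per_float = 1024, 2  # 1024x1024 lightmap, half-precision floats
for name, n in floats_per_texel.items():
    mb = res * res * n * bytes_per_float / (1024 * 1024)
    print(f"{name}: {mb:.0f} MB")
# e.g. SG9 needs 9x the memory of a plain diffuse RGB lightmap: 54 MB vs 6 MB.
```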

    the results are somewhat striking, like here, especially for bright light sources
    probe_basis_comparison.png

    @guycalledfrank
    Do you have an opinion on SGs?
    In particular on the environment specular from SG lightmaps.


    theorder_sgcomparison_00.png
     
    Last edited: Jul 27, 2019
    ftejada, m4d, RockSPb and 1 other person like this.
  41. feranti

    feranti

    Joined:
    Apr 7, 2014
    Posts:
    8
    Thank you for pointing me there (and sorry for the delay of my answer).
    To be sure: you're saying that all the code necessary to run the baking is in an external DLL that we can use ourselves (the same way you use it through your own editor scripts, I assume)?
    I also assume it will tie us to a specific version of Bakery and will annoy us if the DLL changes in a later version, but it is definitely a useful hint! Thanks a lot!
     
  42. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    4,337
    What do you call the complete image? The lightmap that accumulates light? I'm not sure I get it lol. I guess you are proposing many maps, but shifted in UV, to reduce the sampling cost?

    However, the UV samples should be preprocessed during baking; there is maybe a way to reduce this at that stage by sorting the samples correctly. Maybe a smart pass that reassembles the lightmap layout to optimize rays (even using generated proxy geometry to optimize further)? Who knows. I just offered a vanilla basic implementation idea. I'm pretty sure there is endless iteration to do on top; that's just a starting idea. If the vanilla implementation works at its own level, there is probably more to think about from there!

    I think I get it (maybe not, but here is an idea): you have the lightmap, each pixel queries its tile, and each tile queries another tile that holds the "light data" to sample. If that's true you wouldn't need the UV map at all; you could just light directly into the "light tile" (done in a different process), then use a mipmap to sample the average of all rays for a single bounce. There would be no way to propagate the light data to a new pass (as the target and source are decoupled). Which is a valid idea too (to test).

    But then you lose bidirectionality, as I was mapping the lightmap to itself, which is needed for multiple bounces: since every point can hash its UV position to the indirection texture, and since the indirection texture holds the positions of other points, we effectively have a structure that stores the entire lightfield data, basically a light graph we can traverse however we need (at the cost of texture cache misses).

    The multiple bounces work by mapping the data to itself because we use a gather strategy. Each pass, a point gathers light from above and updates itself; then next pass, other points that see it gather back that light data. It's recursive and propagates along the graph. The hardest part becomes optimizing the gather pass, which is the most costly, and that's all GI in a nutshell IMHO. Whether it's texture fetches or BVH traversal, it's one GI hell.

    That's kind of the whole problem with any GI solution, and why hardware-accelerated BVH is kind of a big deal. The data over the hemisphere of a point is as good as (potentially) random; I think all GPU solutions suffer from this as far as I know. It's also what motivated me to use mipmaps as an optimization to "bundle rays", BEFORE realizing it could be used to simulate cones a la voxel tracing (given we can avoid edge problems). GI stays expensive whatever solution we go for. There is also the option to "jitter" around the address to get neighbor samples, but it works best if we store a reference ID for the geometry and only keep jittered samples that have the same ID, to prevent edge artifacts (but it cannot guarantee visibility; it's an approximation. Maybe add even further data, like a jitter range?)

    This is especially true for low-end mobile, which uses tile-based GPU rendering, meaning that random texture access is expensive when fetching from main memory. I think on my target hardware the GPU tile is 16² px (but that's rasterization); I don't know (yet) what the size of the cache is and whether dynamic textures are held to the same standard (okay, the spec sheet says 8 KB to 256 KB of L2 cache, so a single raw 256² RGBA in the best case).

    The value I see is that it's geometry-independent, framerate-independent and embarrassingly parallel, that is, highly flexible, which means we can spread compute any way we want, down to computing pixel per pixel, ray per ray. What if we put the CPU to work creating small bundles of data as textures and reconstruct the final texture later? That's a possible option.

    Okay, that makes me think: if I come back to your idea above (at least how I understood it) that gives a single bounce, maybe we can unpack the data back from the lightmap? We can hash the light tile to a pixel, so we can gather its data and bounce it back to the light tile pixels. That might work? It could be the superior-quality solution! But it seems to trade off memory, as the light G-buffer is bigger due to duplicated points in tiles, while the UV tile is lighter (fewer channels, and the G-buffer lightmap is smaller in pixels).

    It's a trade-off really, IMHO. My experience with cheap low-end phones is that the people who have them won't download a game bigger than 100 MB, which is already too much; I worked with someone whose application was 50 MB and users complained it was too heavy. Even compressed maps are still huge matrices of numbers. Also, I was anticipating 256²-sized lightmaps, so no detailed or complex (only low-frequency) environment lighting! I think the graphics memory target is more along the lines of 32 MB :confused: anyway.

    An RT lightmap, on the other hand, can be created on the fly, since the data needed is actually within the vertices; the TILE version would only need the indirection texture (max 2048 on low end). But the big win is that this method is geometry-agnostic, so it could (potentially) work as a generative pass (replacing a texture load, for example, or even happening before any level loading at all): you precompute before starting the level (during building/loading). It's kind of a compression method in itself. Of course it wouldn't be as great as proper tracing.

    A less geometry-independent, more approximate version can do away with storing any prebaked indirection map and generate an atlas of UV box-projected probes at runtime during level building/loading. Also, any artifacts are probably fine with some aesthetics, and it's also probably (test pending) the only way to get RT GI on hardware like the Mali-400 MP2, which is more or less a souped-up PS2 in power level :eek:.

    You probably need to design the entire geometry with that in mind, so it's not as generic as other solutions (at least at this level of hardware cheapness). I don't expect the quality of other solutions at all lol. It's fillrate-dependent. The Mali-400 has a lower-bound fillrate of 210 Mpix/s; that's 3.5 Mpix/frame at 60 fps. That's 53.4 textures of 256² px per frame (13 x 1024², 3 x 2048²) assuming best conditions with the cheapest possible shader. So the quality is a factor of the number of rays, memory accesses and shader complexity.
    https://developer.arm.com/ip-products/graphics-and-multimedia/mali-gpus/mali-400-gpu
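    The fillrate budget above can be verified with a couple of lines (numbers taken from the post; the Mali figure is the stated lower bound):

```python
# Mali-400 fillrate budget: 210 Mpix/s lower bound at 60 fps.
fillrate = 210e6                # pixels per second
fps = 60
pix_per_frame = fillrate / fps
print(pix_per_frame / 1e6)      # -> 3.5 (Mpix per frame)
print(pix_per_frame / 256**2)   # -> ~53.4 textures of 256^2 px per frame
```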

    Thanks, that's quite the compliment :D seeing as you are much more competent than me, but I don't think I'm actually competent enough; I'm basically indulging my inner Dunning-Kruger to learn and experiment right now lol. If I was competent enough, I would have already made the asset and sold it as an extension of Bakery for RT light :p. Right now I'm just piecing data together, and I use the forum as a scratch pad to organize those ideas and bounce them back to anybody who wants to hear them (like we just did above lol).

    Okay, now I think I have enough to try an implementation anyway, thanks for the talk! I need to start learning about proper practical implementations of dynamic textures now :rolleyes: let's see where the gotchas are!
     
    Last edited: Jul 27, 2019
    guycalledfrank and keeponshading like this.
  43. keeponshading

    keeponshading

    Joined:
    Sep 6, 2018
    Posts:
    344
    The more I dig around, the more I see
    Bakery as Unity's ScriptableBakePipeline.

    You did all the hard work to demystify baking.

    So I would do a Spherical Gaussian lightmaps addon for Bakery, with Cycles as ground-truth preview and solver.)
     
    Last edited: Jul 27, 2019
    guycalledfrank likes this.
  44. ephraimmiah

    ephraimmiah

    Joined:
    Jul 17, 2019
    Posts:
    2
    Heyhey, I got a bug report of sorts for you:

    When using the Bakery Lightmapped Prefab script/component on a prefab and then baking it with the Bakery mesh light script inside said prefab, Bakery will throw an error.

    While the first prefab still gets its lightmaps properly applied, any additional prefabs in the scene won't have theirs applied properly. The error message is:

    ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection. Parameter name: index

    Bakery's progress bar then gets stuck at "Finished rendering" with percentages of 103% or higher.

    This gets fixed by simply moving all lights out of the prefabs, which is a usable workaround after all.
     
    guycalledfrank likes this.
  45. Softelectro

    Softelectro

    Joined:
    Oct 9, 2013
    Posts:
    19
    Hello,

    We have been using Bakery for a few months now with success, and we have two questions/remarks:

    Baking event
    Since we split our project into several scenes, we created an editor script to bake all scenes, one by one, during the night.
    With Unity baking, we get an event when the baking is finished (Lightmapping.completed) to start baking the next scene, but we couldn't find the same thing in Bakery, so we added it ourselves.
    Do you think it's possible to add such a thing to your script?​

    Rebake one mesh
    For the record, in Unity 4 it was possible to bake only selected game objects. With your LightmapGroup script we can do the same thing, but the result isn't the same with and without the LightmapGroup script (UV size in the lightmap).
    Do you think it's possible to add such a function to Bakery?​

    Thanks in advance
     
    keeponshading likes this.
  46. guycalledfrank

    guycalledfrank

    Joined:
    May 13, 2013
    Posts:
    1,099
    I know that demo :)
    You should also try Probulator for the L1 Geomerics trick and more: https://github.com/kayru/Probulator

    SGs are nice, but unlike SHs they also need fitting: https://mynameismjp.wordpress.com/2...proximating-radiance-and-irradiance-with-sgs/
    Fitting is slow and sad, but you can avoid it with a running average, which gives me hope: https://torust.me/rendering/irradiance-caching/spherical-gaussians/2018/09/21/spherical-gaussians
    Running average is really cool, but I haven't tested it yet.
    For lightmaps I also don't like the storage requirement. The number of textures they used for The Order (was it 9?) is too much to pay for a rather blurry specular (Directional/SH specular can give some middle ground). I can implement SG, but I'm not sure if it's worth it. Probably another reason for me to implement a scriptable API.
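    For reference, an SG lobe is just a scaled exponential of a dot product, so reconstructing radiance means summing every lobe. A minimal Python sketch (not Bakery or Baking Lab code; the lobe set is made up) also shows why runtime cost grows with the lobe count:

```python
import math

# Evaluate a set of spherical Gaussians in a given unit direction.
# Each lobe is amplitude * exp(sharpness * (dot(dir, axis) - 1)):
# the value peaks at `amplitude` when dir == axis and falls off with angle.
def eval_sg(direction, lobes):
    total = 0.0
    for axis, sharpness, amplitude in lobes:
        cos_theta = sum(d * a for d, a in zip(direction, axis))
        total += amplitude * math.exp(sharpness * (cos_theta - 1.0))
    return total  # cost is linear in the number of lobes (SG5 vs SG9 vs SG12)

# Two lobes, on +Z and +X, both with sharpness 4:
lobes = [((0, 0, 1), 4.0, 1.0), ((1, 0, 0), 4.0, 0.5)]
print(eval_sg((0, 0, 1), lobes))  # ~1.0 plus a small tail from the +X lobe
```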

    It's a bunch of exes/DLLs that the scripts use. You can try following my script logic, but it's huge, as it tries to support all Unity features and work around limitations. I also have a somewhat-working single wrapper DLL around all this stuff; it's not part of Unity Bakery, but I can send it to you.

    Basically this:

    upload_2019-7-30_12-21-6.png

    Clustering data like this will likely reduce cache misses. Even though "UV value" represents a particular ray direction, the hits will happen in a more continuous fashion.

    Ouch. That sounds very familiar. Are you on the latest 1.6? Pretty sure I fixed it before.

    Ah yes I added that recently.
    Get github access: https://geom.io/bakery/wiki/index.php?title=Github_access
    Download these branches:
    https://github.com/guycalledfrank/bakery-csharp/tree/more_changes
    https://github.com/guycalledfrank/bakery-compiled/tree/more_changes
    Ongoing changelog is here: https://github.com/guycalledfrank/bakery-csharp/pull/5
    Take a look at OnFinishedFullRender.

    This is rather hard, as selective baking is in general. The problem is that an object needs to know the lighting around it, and to get that... you need to bake the rest too. So the selective render feature is rather limited at the moment, and I'm not sure how to work around it.
     
    Tudor and neoshaman like this.
  47. keeponshading

    keeponshading

    Joined:
    Sep 6, 2018
    Posts:
    344
    They used SG9 lightmaps for static geometry and SG9 probes for moving stuff.
    Then additional blending of the SG9 lightmaps with manually placed reflection probes, based on roughness.

    sg_cubemap_transition_00.png
    sg_cubemap_transition_01.png
    The top image shows a scene from The Order: 1886. The bottom image shows the same scene with a color coding applied to show the environment specular source for each pixel.

    Also nice

    Screenshot_20190730-121238.png

    and

    Screenshot_20190730-121159~2.png

    The complete Baking Lab blog is such a fantastic read. Starting to really like it.

    Thanks for the Probulator links. Great stuff.

    I would prefer fitting SGs over fitting a complete level by hand.)

    Let's dig deeper. They fixed some parameters to make it easier, but I haven't completely understood this yet.

    Some more details like in the Blog
    https://present5.com/advanced-lighting-r-d-at-ready-at-dawn-studios/


    Scriptable API. Yes. Yes. Yes.

    Back on the hardware tomorrow to do further multi-GPU tests.
    Could you send me the optimized RTX version too?
     
    Last edited: Jul 30, 2019
    ftejada and guycalledfrank like this.
  48. BattleAngelAlita

    BattleAngelAlita

    Joined:
    Nov 20, 2016
    Posts:
    58
    One main disadvantage of SGs is that you need to evaluate all lobes to get the final lighting result. But yeah, it would be great to get SGs at least for light probes. Maybe I'll try to implement them in Bakery.
     
    guycalledfrank and keeponshading like this.
  49. feranti

    feranti

    Joined:
    Apr 7, 2014
    Posts:
    8
    That would be very nice of you! Thanks.
    Do you want me to PM you my email address?
     
  50. keeponshading

    keeponshading

    Joined:
    Sep 6, 2018
    Posts:
    344
    Fyi.
    In case you are looking for another/additional potential solver that would allow real-time ground-truth preview and very fast baking of all full-GI passes:
    yesterday another milestone was reached in Cycles.
    (Actually they successfully landed 3 rockets in one week and are close to settling a community on Mars.)
    NVIDIA contributed an additional OptiX RTX integration to the Blender source code.
    https://developer.blender.org/D5363
    Since OptiX is now part of the driver, it can be integrated into open source in accordance with the GPL.

    So you can now solve and preview with
    CPU,
    CUDA (GPU and/or CPU)
    or
    OptiX RTX
    per click,
    and choose single or multiple GPUs as compute devices.
    cycles-rtx-preferences (1).png
    at a hell of a speed

    cycles-rtx-performance-1.png

    The OptiX denoiser is also getting an official integration, both for denoising the output and, soon, for real-time ray-traced preview.

    For now, OptiX and OIDN denoising are available via free addons.

    So that's pretty cool.

    Among lots of other reasons, it helps explain why
    Epic gave 1.2 million to the Blender Foundation, and Ubisoft (not so much), in recent weeks.)

    So for now Unity seems to have no interest.
    GPU-to-CPU parity in PLM first.)
    They'll probably start looking into it once the competition has done an integration.

    I could be horribly wrong. But from everything I've seen of how you've built up Bakery, you could integrate it in a few weeks, going platform-independent and gaining real-time preview (Cycles standalone build window).
     
    Last edited: Jul 30, 2019
    guycalledfrank likes this.