Bakery - GPU Lightmapper (v1.96) + RTPreview [RELEASED]

keeponshading · Jul 17, 2019

Hey. Dou you think it s possible to set compute devices from the ftracers and denoisers to CUDA_0 up to to CUDA_3?
We have some old BakeQuenns with 2 to 4 GTX Titan V.
So only a split for 2 and 4GPU s.

bcoyle · Jul 18, 2019

Not sure how I stumbled upon this asset, but it has me tempted to snap it up. One question though -- My current workflow is for archviz option previewing in VR (ex: flipping between 2 different furniture layouts via hide/unhiding of objects). I'll either bake the lighting in 3dsmax/vray or use a simple ambient occlusion screen effect in unity. Bakery looks sexier though!

Would it be possible to isolate meshes to bake, then hide, and then bake another set of geometry without affecting the previous set?

guycalledfrank · Jul 19, 2019

keeponshading said: ↑

Dou you think it s possible to set compute devices from the ftracers and denoisers to CUDA_0 up to to CUDA_3?
Click to expand...

Yes. I can add an option to specify CUDA devices used (I was under impression that all devices are utilized, when left unspecified). Do you want to use different devices for raytracing and denoising? Or just being able to ignore some devices?

bcoyle said: ↑

Would it be possible to isolate meshes to bake, then hide, and then bake another set of geometry without affecting the previous set?
Click to expand...

I recommend baking two separate scenes or using lightmapped prefabs for that. Basically each scene has a storage object linked to it, containing lightmap->mesh mapping. If you rebake the scene, it's refreshed, and hidden objects will lose their lightmaps. However, if you bake two scenes and then load them together at runtime, it'll work; similarly, baking two lightmapped prefabs in one scene will work too, because every prefab has its own storage object, separated from the scene.
(I'd go with multi-scene workflow, since Unity complicated the prefabs a bit too much after 2018.3; I no longer fully understand how they work)

keeponshading · Jul 20, 2019

guycalledfrank said: ↑

Yes. I can add an option to specify CUDA devices used (I was under impression that all devices are utilized, when left unspecified). Do you want to use different devices for raytracing and denoising? Or just being able to ignore some devices?
Click to expand...

Simply, For everthing where you can set the cuda_compute_device_device.

Yes. All devices are utilized, but they seems to do the same calculation because no speedup but full gpu use.
I tried to show it off in 4 cases. See times and screenshots who follow.

An ignore should be not needed.
Here the full run with the full showcase at the end.#

This is Version 3. Weekend.

Because of Bakery it is finally possible to do Lightmap Benchmarks without getting old.
Thank you.
Here a fast around 30 hour Benchmark with some nice results.

Done with an real world Archviz scenario from ruggero
https://assetstore.unity.com/packages/3d/environments/urban/archvizpro-interior-vol-6-120489

version ArchVizPRO_Interior_Vol.6_HDRP_v.5
in
Unity 2019.1.0f2
with
Bakery 1.6
HDRP

View attachment 449537

with these real world standard scene settings.
Only toggled RTX on\off

View attachment 449543

resulting in
16 x 4K (703MB)
high quality lightmaps.
Only the areas under table and couch could get more light?

View attachment 449546

View attachment 449549

View attachment 449564

with some surprises.

After ever run i

- did an reboot
- cleaned Bakery cache folder
- removed lighting calculaltion
- and created an new scene
- started mouse jiggle to avoid company desktop lock

for Multi/Single GPU
- removed second card physically
or
- deactivated PCI Slot in Bios when possible

some results

GEFORCE GTX 1080

2x GeForce GTX 1080

Driver 425.31
SLI enabled

1st 00h53m45s
2nd 00h51m12s
3nd 00h52m17s

2x GeForce GTX 1080 (v2)

Driver 425.31
SLI disabled

1st 00h52min20s
2nd 00h53m18s

! On some Cuda Raytracers there is only a speedup when SLI disabled. Not here.

1x GeForce GTX 1080
Driver 425.31

no sucess, because Laptop with 2x Desktop GTX 1080 cards.
After disabling in system console no chance to do the test.

Titan RTX

2x Titan RTX
Driver 430.64
NV Link Bridge

RTX mode off
1st 00h55m40s
2nd 00h53m56s

RTX mode on
1st 00h41m49s
2nd 00h43m14s

2x Titan RTX
Driver 430.64
remove NV Link bridge

RTX mode off
1st 00h34m20s

RTX mode on
1nd 00h33m18s

1x Titan RTX (second physically removed)
Driver 430.64
remove NV Link bridge

RTX mode off
1st 00h54m47s
2nd 00h55m12s

RTX mode on
1st 00h34m18s
1st 00h32m07s

! NV Link slows down a lot. Remove it.

! 1x TitanX RTX is faster than 2x TitanX RTX

Quadro 6000 RTX

1x Quadro RTX 6000
Driver 425.51

RTX mode off
1st 00h48m59s

RTX mode on
1nd 00h29m54s
2nd 00h29m19s

2x Quadro RTX 6000
Driver 425.51

RTX mode off
1st 00h44m50s

RTX mode on
1nd 00h32m14s
2nd 00h31m43s

! 1x Quadro 6000 RTX is faster than 2x Quadro FX 6000

GTX Titan V

4x GTX Titan V
Driver 425.51
SLI off

1st 00h39m37s

4x GTX Titan V
SLI off
Driver 431.36

1st 00h38m23s
2nd 00h37m47s

4x GTX Titan V
Driver 431.36
SLI on

1st 00h39m43s

Result for now.
Bakery 1.6

see
! 1x Quadro 6000 RTX 00h29m54s
is faster slightly than
2x Quadro FX 6000 00h31m43s

But both cards render. See 100% spikes during ftracers and denoising calculations.
Seems they do similar calculation?

! 1x TitanX RTX 00h32m07s
is slightly faster than
2x TitanX RTX 00h33m18s

But both cards render full.See 100% spikes during ftrace and denoising render.
Seems they do similar calculation.

! 4x GTX Titan V
are rendering

Low hanging fruit:

Logical result after these tests.

By delgegating the ftracers and denoisers to compute device cuda_0 and cuda_1
we should get for

2x GTX 1080 00h51m12s
around
2x GTX 1080 00h28m00s

By delgegating the ftracers and denoisers to compute device cuda_0 and cuda_1
we should get for

2x TitanX RTX 00h33m18s
around
2x TitanX RTX 00h17m00s
because
1x TitanX RTX can do 00h32m07s

By delgegating the ftracers and denoisers to compute device cuda_0 and cuda_1

we should get for
2x Quadro 6000 RTX 00h31m43s
around
2x Quadro 6000 RTX 00h15m00s
because
1x Quadro 6000 RTX can do 00h29m54s

By delgegating the ftracers and denoisers to compute device cuda_0 and cuda_1, cuda_2 and cuda_3

we should get for
4x GTX Titan V 00h37m47s
around
4x GTX Titan V 00h09m00s

This linear scale is an rough assumption whenever an realistic one
because it behaves in Blender Cycles (Cuda) for LightMap baking
and DNoise Denoising (Optix)
the same by delegating fraction of the jobs to available compute devices.

Let´s do. Let´s test. Should be easy because of your clean architecture.

It also shows that with 5 years old GTX Titan V cards and with an 2x GTX 1080 you can reach
phantastic results. For me the most impressive thing.

Further possibilities:

On the Screesnhots from High-End cards like
Quadro 6000 RTX/Quadro 8000 RTX
you see there is currently more then
20GB/45GB
of VRAM unused during complete bake.
One has
24GB/48GB.
So there is potential for an little? speedup by giving up your divide and conquer and by pushing more to vram at once.
You could save some writing and loading times.
But could be not worth the effort. Who knows? Could be time for specular GI.)

By using the OIDN denoiser you could probably get another 10 to 20% because you can denoise in paralell on CPU, instead at the end of the complete bake process on GPU like now.
All other stuff runs on GPU like now. You probably know better.

With comparable settings Unity Progressive Lightmapper crashes complete or switches to CPU.
With lower settings and lots of trying i got around 6 hours for far less quality at a special star constellation i could not capture.

I gave up after some time.
Sorry for fast putting this together.

And the Archviz 6 has big big windows......

keeponshading · Jul 20, 2019

.....because of that i am preparing in free time for the next run an more real world scenario
based on free data.

The CornellBox is outdated but did an fantastic job.

Sponza, too

ArchViz 6 is great but has too big windows

So i choosed the Paris Bistro scene.

It is quite some work to do because of testing PBR consistency. It need new HDRP materials and there are 600 textures.
I will also add an big tree left from the gate like in the original scenario with lots of leaves.

It will allow next level tests because we have 2019.)

So it has harder access to the IBL because of houses are high.
Was quite some fun to test different skys from clear to overcast.

In most sun sky constelations there is no direct acess to the sun.
e.g see red area on false color vis

So that s easy but we need mostly pure nice indirect in the Bistro, too.
For future challenge there will be addons like caustics for static glasses and bottles on the table. Only for fun.

There could be multiple seperate time of day bakes or an complete day/night cycle via ibl timelapse.
If there is someone who has fun to help by runtime switching or interpolating the calculated textures and lightprobes. Give me a message.

Challenge should be to bake it at once with one IBL.

However i think this is an nice benchmark scenario.

nsxdavid · Jul 19, 2019

Just picked up Bakery. Working in Unity 2019.1.10f1, I get the following script errors on import:

Assets\Editor\x64\Bakery\scripts\ftRenderLightmap.cs(2191,55): error CS0619: 'LightmapEditorSettings.Lightmapper.Radiosity' is obsolete: 'Use Lightmapper.Enlighten instead. (UnityUpgradable) -> UnityEditor.LightmapEditorSettings/Lightmapper.Enlighten'

Assets\Editor\x64\Bakery\scripts\ftRenderLightmap.cs(2195,58): error CS0619: 'LightmapEditorSettings.Lightmapper.PathTracer' is obsolete: 'Use Lightmapper.ProgressiveCPU instead. (UnityUpgradable) -> UnityEditor.LightmapEditorSettings/Lightmapper.ProgressiveCPU'

Assets\Editor\x64\Bakery\scripts\ftRenderLightmap.cs(3704,55): error CS0619: 'LightmapEditorSettings.Lightmapper.Radiosity' is obsolete: 'Use Lightmapper.Enlighten instead. (UnityUpgradable) -> UnityEditor.LightmapEditorSettings/Lightmapper.Enlighten'

Think you can #ifdef around those? Doesn't seem simple to update since Lightmapper.Radiosity has no fields or properties of it's own. :/

guycalledfrank · Jul 20, 2019

nsxdavid said: ↑

Think you can #ifdef around those? Doesn't seem simple to update since Lightmapper.Radiosity has no fields or properties of it's own. :/
Click to expand...

That seems like a bug related to Unity Hub (try updating it?): https://forum.unity.com/threads/bakery-gpu-lightmapper-v1-6-released.536008/page-63#post-4668227
Unity should normally automatically patch your scripts.
More info here: https://geom.io/bakery/wiki/index.p...2.80.9D_and_.E2.80.9CUnityUpgradable.E2.80.9D

nsxdavid · Jul 20, 2019

I'll try that , thanks

atomicjoe · Jul 20, 2019

Hi there Frank!
I just picked a Gigabyte RTX 2070 OC Gaming to replace my Gigabyte GTX1080 G1 Gaming and, guess what?
It's actually SLOWER! LOL
I went from a 23min render to a 28min render for exactly the same scene and settings!
(of course I enabled the RTX mode. I have played with all the settings and even overclocked my RTX2070 but there is no way to catch the GTX1080 here!)

Actually, it's more complicated than that: actual rendering is faster, but denoising is WAY slower with the RTX2070.

I don't know if it's even possible, but it would be awesome to have the denoiser to use the additional RT and Tensor cores of the RTX series, because right now it's really stalling on the denoise process.

Homicide · Jul 20, 2019

Hi. Really pretty new to all the artsy fancy side of visual stuff. My question is not so much about this asset, but lightmapping itself.

Is there no way to bake lightmaps at runtime? Ever?

I havent been able to come to conclusion on this subject. Thanks

keeponshading · Jul 20, 2019

atomicjoe said: ↑

Hi there Frank!
I just picked a Gigabyte RTX 2070 OC Gaming to replace my Gigabyte GTX1080 G1 Gaming and, guess what?
It's actually SLOWER! LOL
I went from a 23min render to a 28min render for exactly the same scene and settings!
(of course I enabled the RTX mode. I have played with all the settings and even overclocked my RTX2070 but there is no way to catch the GTX1080 here!)

Actually, it's more complicated than that: actual rendering is faster, but denoising is WAY slower with the RTX2070.

I don't know if it's even possible, but it would be awesome to have the denoiser to use the additional RT and Tensor cores of the RTX series, because right now it's really stalling on the denoise process.
Click to expand...

the GTX 1080 has 10% more Cuda Cores + around 8% more clock speed.
The CudaCores do all the work during Rendering and Denoising at an higher clock.
That matches relativly exact with your slowdown.

So it really hard to catch this up at the end of the bake where the denoisers are running.

Do you have messured the difference between RTX mode on or off on the 2070 ?

I did some fast tests these days by running denoising on cpu with the intel denoiser (oidn) instead the optix denoiser (dnoise)on another cuda path tracer (cycles). With this you can get around 20% faster because denoising must not run on gpu. It runs in parallel on the cpu after every bake atlas has finished rendering.

It s hard these day. With the RTX generation you pay more for less cores, clock even ram on higher models at the comparable segment from generation before.

Not complete correct but we did such comparisions the whole week during building up from our bake infrastructure
.
For now is. 2 people run a marathon (bake).
The GTX runner has a distance from 38km (10% less)
The RTX runner has a distance from 42km.
The GTX runner runs also around 8% faster on a shorter distance

At the end from the race of the long distance the RTX runner gets 64 bananas (RT cores) and 288 pieces of dextro energy (tensor cores) . Nice. But it really doesn t helps to win because the GTX runner has already finished the race.

atomicjoe · Jul 20, 2019

Actually, it's more complicated than that: actual rendering is faster, but denoising is WAY slower with the RTX2070.

Also, my RTX2070 is overclocked to 1845mhz on boost and my GTX1080 wasn't overclocked (outside of the factory little OC it came with) so definitively the problem with the denoiser is the lack of CUDA cores compared to the GTX, but maybe there is some version of the denoiser than can take advantage of the RT and Tensor cores that are just doing nothing otherwise in the RTX?

atomicjoe · Jul 20, 2019

Homicide said: ↑

Is there no way to bake lightmaps at runtime? Ever?
Click to expand...

I don't know of any lightmapper that can bake at runtime.
There was an assetstore plugin for baking procedurally generated meshes at runtime, but it was deprecated.
So currently, no, there is no way that I know.
(edit: there are several Ambient Occlusion only bakers in the assetstore that work at runtime, though)
But it wouldn't be of much use anyway, since baking is a slow process you normally don't want to bother the end user with.
Your best bet is to use realtime lighting if you are doing procedural stuff.
If your scene premade, just bake it in the editor.

QuantumTheory · Jul 20, 2019

Hi @guycalledfrank

The bicubic interpolation on the lightmaps is excellent. For small dynamic objects like grass and rocks, I'm doing a top-down render of the scene's lightmaps and saving that into a image, then using that image in a custom shader. The scene is outdoors so this works pretty well for our purposes.

Obviously there is a difference in texel size between the world's lightmaps and this new one, but I was wondering if I can run bicubic interpolation on those captured images in the custom shader. If so, could you shed some light on what functions I need to filter the images?

Thanks!

keeponshading · Jul 22, 2019

atomicjoe said: ↑

Actually, it's more complicated than that: actual rendering is faster, but denoising is WAY slower with the RTX2070.

Also, my RTX2070 is overclocked to 1845mhz on boost and my GTX1080 wasn't overclocked (outside of the factory little OC it came with) so definitively the problem with the denoiser is the lack of CUDA cores compared to the GTX, but maybe there is some version of the denoiser than can take advantage of the RT and Tensor cores that are just doing nothing otherwise in the RTX?
Click to expand...

Sure they will be and are some RT cores optimisations for Optix in special for the denoiser. In my explanation i did not
included these. Only that what lands on my SSD totay.)

This week will be some more test.
Also with the 2nd generation of the Titan Pascal.
By benchmarking all 3 generations of the Titan you will see interesting stuff.
In special the doubeling of the price by getting less cuda compute power and ram.
So the GTX Titan V , the oldest one, around 4 years or more, is an pretty cuda bake beast at the moment for the price.

I played around with this last month....

https://github.com/ROCm-Developer-Tools/HIP

Really impressive because you can convert Cuda to portable C++.
You get a lots of A,s and Ooooh,s because it allows tests not only limited to NVidia s playground.
However. Hope there will be more competition in the future.

keeponshading · Jul 22, 2019

fyi,
i tried to rebuild this method from Frostbite, already linked in bakery documentation

https://www.gdcvault.com/play/1025434/Precomputed-Global-Illumination-in

An nice presentation that showed me that Bakery reached
an ,best in class, baking level.

latest additions like

Directional SH mode

Non linear L1 SH

and the cool thing is by using in addition several very light layer of HBAO, like shown in presentation, 3 to 5 should be enough and Distance Shadowmask this GDC presentation could be used as Bakery tutorial now. However it works.

Krubbs · Jul 22, 2019

Is it really possible to reduce rendering time on 2xGPU's?
I have two RTX 2080 and i feel betrayed now.
Such powerful hardware simply stay idle.

If you say yes, when we can wait new release?

I see small spikes of GPU 1 & 2 load. It doesnt load at least 30%. Both GPU's. If its possible to use 100% of them and really speed up by splitting tasks for 2 gpu... wow... such wonderful future.

maart · Jul 22, 2019

I'm checking out this hdrp project downloaded from here https://www.patreon.com/posts/proje...ial&utm_source=twitter&utm_campaign=postshare trying to see if I can get it to work with bakery. But I can't get it to work with the bakery-lit-hdrp shaders from the zip file in the bakery package. is there someone who can help?

guycalledfrank · Jul 22, 2019

atomicjoe said: ↑

It's actually SLOWER! LOL
Click to expand...

In non-RTX mode it's expected given the CUDA cores count, BUT did you try enabling the the RTX checkbox (in advanced settings)? It should really beat 1080 with dedicated RT hardware.

atomicjoe said: ↑

but denoising is WAY slower with the RTX2070.
Click to expand...

Current denoiser uses a rather outdated (pre-RTX) version of the library, but I have a new one now. Sent you a PM, test it. Should use your tensor cores.
Also note that first time denoiser launch is always slower than subsequent launches (drivers cache something I believe).

Homicide said: ↑

s there no way to bake lightmaps at runtime? Ever?

I havent been able to come to conclusion on this subject. Thanks
Click to expand...

Such asset can be implemented, but Bakery is not suited for that.

QuantumTheory said: ↑

I can run bicubic interpolation on those captured images in the custom shader
Click to expand...

Sure, you can. Take a look at ftrace.cginc file, ftLightmapBicubic() function. I did not invent it, but copied somewhere from open Nvidia code: https://github.com/zchee/cuda-sampl...bicubicTexture/bicubicTexture_kernel.cuh#L116
I've also seen the same code used in Mirror's Edge.
As you can see, one limitation of this technique is that you need to know texel size (1/resolution) in advance, that's why Bakery's shader tweak is limited to DX11, as I use GetDimensions() function for that. However, I only do it to avoid sending more data from CPU to GPU or altering existing shaders. For most textures (but not lightmaps) Unity already gives you a texel size variable (see TexelSize: https://docs.unity3d.com/Manual/SL-PropertiesInPrograms.html)

keeponshading said: ↑

Really impressive because you can convert Cuda to portable C++.
You get a lots of A,s and Ooooh,s because it allows tests not only limited to NVidia s playground.
Click to expand...

Nice. Too bad I can't decompile dependencies like OptiX though.

keeponshading said: ↑

and the cool thing is by using in addition several very light layer of HBAO
Click to expand...

It's pretty cool how they bake lowres proxy geometry and then project lighting on detailed meshes. Fast to bake and SH gives you proper XYZ directionality for details. Something like that can be scripted on top of Bakery. AFAIK UV projection code was a part of Enlighten, but Unity never used/exposed it.

Krubbs said: ↑

Is it really possible to reduce rendering time on 2xGPU's?
I have two RTX 2080 and i feel betrayed now.
Click to expand...

First check: make sure you have the RTX checkbox enabled.

Now given the multiGPU benchmark results it does indeed look like it's not parallelizing the job well. I think I'm gonna make a small benchmark program that traces some random rays and debug it. It will help a lot if you'll be able to run some tests with it. I only have one GPU at the moment and can only guess what works.

maart said: ↑

But I can't get it to work with the bakery-lit-hdrp shaders from the zip file in the bakery package. is there someone who can help?
Click to expand...

You don't really have to use them (unless you need SH lightmaps). Anyway, what errors do you get? HDRP is being updated rapidly and even minor updates break my HDRP shader over and over again.

Krubbs · Jul 22, 2019

guycalledfrank said: ↑

RTX checkbox enabled
Click to expand...

true

guycalledfrank said: ↑

It will help a lot if you'll be able to run some tests with it.
Click to expand...

I'm ready

keeponshading · Jul 22, 2019

guycalledfrank said: ↑

It's pretty cool how they bake lowres proxy geometry and then project lighting on detailed meshes. Fast to bake and SH gives you proper XYZ directionality for details. Something like that can be scripted on top of Bakery. AFAIK UV projection code was a part of Enlighten, but Unity never used/exposed it.
Click to expand...

It s a shame how Enlighten was bashed these days. It never got an fully integration into Unity.

However. These lowres proxy projection from the frostbite slides and with the help from this slide

i finally got an good understanding how Enlighten was able
to update light maps and probes so fast by using these simplified models and multi-threading.

atomicjoe · Jul 22, 2019

keeponshading said: ↑

It s a shame how Enlighten was bashed these days. It never got an fully integration into Unity.
Click to expand...

THIS!
THANK YOU!

atomicjoe · Jul 22, 2019

Krubbs said: ↑

I have two RTX 2080 and i feel betrayed now.
Click to expand...

Don't. Frank is super responsive.
If it can be fixed, it will.

neoshaman · Jul 22, 2019

keeponshading said: ↑

i finally got an good understanding how Enlighten was able
to update light maps and probes so fast by using these simplified models and multi-threading.
Click to expand...

They used to do in cpu, but move to compute recently. But my guess is that you can do it in vanilla shader too. I had a post in this thread that stumble on a similar idea for open gl es 2.0 and +, but I have now moved to a probe based solution.

You can try this:
- put the "surface" in a texture, that, is encode area size, normal, position, albedo
- apply direct lighting computation using the normal and albedo, compute shadow from shadowmap by comparing the position.
- store the result in a "direct lighting" dynamic texture
- for every pixel in a lightmap, associate a bigger tile texture that store UV address of the visible surface from the "surfaces texture" + weight data based on visibility. Such as that texture is actually lightmap size * number of sample. You would probably store those addresses in a tile configuration, so that you can easily hash the lightmap pixel position and get to the proper tile. Let's call this map of adresses, the indirection tilemap.
- at each iteration, sample the "direct lighting" texture with the indirection tiles, accumulate the result into the lightmap. It's the computational cost of a naive blur of 'tile size'.

YOU NEED BAKERY to bake the indirection tilemap, the author said he could probably provide an API to make custom bake like that (ie for each point, return data from the hit impact, sort the data yourself and bake it where you want)

Some observation for optimization:
- the surface texture can be organize as a lightmap, so you can just use the lightmap, the data is basically teh same as a Gbuffer. So you only need the lightmap data, an accumulation lighting map (just new layer of the lightmap) that first get direct light then add the indirect light at each iteration.
- the tile are basically a representation of the hemisphere above the lighted points, they are basically small hemi cubemap, it's an atlas of hemicubemap above each point, but kinda like importance sampled. I use that observation to move to another approximation where I use box projected lightprobe as indirection texture.
- If you basically map the lightmap to itself with indirection texture, you have a baked lightfield of all the rays, since the lightmap has the position of each point, you can infer the ray beginning and end, therefore use that to inject intersections with primitive. Also it mean you can go bidirectional for the light calculation, you can start on the tile and find which point all the ray focus on (hashing the position).
- with open gl es 2.0 you are stuck with 8 texture samples, the direct lighting cost a few (normal, albedo, position). This limit the ray gathering from the tilemap to a few rays per iteration, but you can use mipmap sampling to simulate either multiple rays at once (power of two per mip) or cone (deeper mip when distance grow), be careful of edge artefact as it's a lightmap representation.

XRA · Jul 23, 2019

Hey just curious if this is a known limitation or bug, noticed that Subtractive light mode doesn't bake correctly when using RNM or SH directional modes
(left incorrect result, right correct result w/ directional mode none)

RockSPb · Jul 23, 2019

neoshaman said: ↑

They used to do in cpu, but move to compute recently. But my guess is that you can do it in vanilla shader too. I had a post in this thread that stumble on a similar idea for open gl es 2.0 and +, but I have now moved to a probe based solution.

You can try this:
- put the "surface" in a texture, that, is encode area size, normal, position, albedo
- apply direct lighting computation using the normal and albedo, compute shadow from shadowmap by comparing the position.
- store the result in a "direct lighting" dynamic texture
- for every pixel in a lightmap, associate a bigger tile texture that store UV address of the visible surface from the "surfaces texture" + weight data based on visibility. Such as that texture is actually lightmap size * number of sample. You would probably store those addresses in a tile configuration, so that you can easily hash the lightmap pixel position and get to the proper tile. Let's call this map of adresses, the indirection tilemap.
- at each iteration, sample the "direct lighting" texture with the indirection tiles, accumulate the result into the lightmap. It's the computational cost of a naive blur of 'tile size'.
Click to expand...

May I ask? Why do you need a real-time gi on the hardware level gles2? That all dependent texture fetching will be incredibly slow.

neoshaman · Jul 24, 2019

Yes it will be slow, but not framerate dependent, it's basically like cheap real time baking.

Also this is approximation, so it's probably full of inaccuracy and artifact anyway (same as enlighten), you need to design around them. I stumble on it, I didn't explicitly design a GI solution, I was looking for time of day shadow baking. I want to make an open world on mali 400, they are widespread here (and I'm poor that's the hardware I could afford at <100€). I haven't yet implemented it, so I don't know actual performance, but even 2s latency would be okay (I wouldn't inject dynamic light).

But to fully answer the question:
- Production wise, I don't have to bake map off line, that's one step less.
- Even without the GI, it decouple lighting, from static geometry, with the direct lightmap Gbuffer (though managing the precision of 8bit has to be done for position), therefore from framerate, which would allow for more complex lighting shader. You SHOULD probably separate dynamic effect and specular lighting with constant diffuse environment lighting.
- Memory wise, it makes the mobile game smaller, lightmaps take a lot of memory.
- Once the compute is done, it's a single texture fetch for all objects, you can even discard the extra textures once done.
- It cover many permutation of lighting, you probably don't want to bake all of these with strict memory budget, providing you stay within the precision budget.
- Like enlighten it probably work with small texture area with like less than 1 pixel per meter (recommended by enlighten).
- If you do procedural packing of procedural object's UV, it's compatible with procedural generation.
- It's just another option for case where you need it, more tools is always better. It give an extra touch.
- GI is low frequency, it matter less if it's slow, and you can easily spread the compute to whatever frequency you need, also Time of day change are slow so you can tune it's duration to the update. You won't use it to gather dynamic object either, it's only static to static GI, environment only.
- It's still probably faster to any other method for low end hardware, you won't do a bvh tracing or ray marching past a point.
- It's still probably useful for higher end machine or even VR.
- It's actually very flexible, you can mix and match technique, change data precision, spread compute as much as you want, you can even use prebake lighting to jumpstart the compute.
- It's basically a shader version of enlighten so it cover the same usecase.

And the extended probe version can update change in scenery and project on dynamic object (dynamic don't contribute unless you use shadowmap injection), for the same cost of a single texture fetch for static and a single cubemap fetch for dynamic, but with less geometric precision on the gi (box projection approximation). And it would also allow for updating not just a lightmap but 3d probe volume too (which are 2 fetch and one lerp for all objects on ogles 2.0).

But I'm going to go and try to implement it soon for open GL es 2.0 I'll report the result. I'll just need to have the set up done, ask for @guycalledfrank for the promise API and try myself.

mikerz1985 · Jul 24, 2019

Hi -- I've used Bakery in the past and found it awesome.

I have a theoretical use case I want to run by you; for the sake of argument let's say I have a stadium 3d model, and 30,000 seats. Each seat has a piece of data associated with it, and the user can choose to sit in that seat. Currently, I bake the seats into groups of meshes; let's say 100 groups of 300 seats. The meshes can be marked as static, and then baking can happen as normal.

What I would like to do instead, is have a special ECS based, GPU instanced seat rendering system which can also support LOD. I would like not to have to use a realtime screen space AO effect, but rather bake the lighting ahead of time.

Does this sound doable to you?

Some immediate questions I'm thinking about:
I'm not sure how to relate a particular seat's uv to a particular lightmap's UV.
I'm not sure how to relate a different LOD level's uv to the lightmap UV. I imagine they would have to be generated in a spatially normalized way.
I'm not sure how to bake based on a ECS system, and not using static geometry. This one is perhaps more an optimization; I could still bake static geometry purely for the purposes of being able to bake lightmaps.

atomicjoe · Jul 24, 2019

mikerz1985 said: ↑

Does this sound doable to you?
Click to expand...

Yes, just render the seats separately using LightmapGroups plus the prefab component and make them a prefab.

mikerz1985 said: ↑

I'm not sure how to relate a particular seat's uv to a particular lightmap's UV.
Click to expand...

Not an issue with lightmap groups.

mikerz1985 said: ↑

I'm not sure how to relate a different LOD level's uv to the lightmap UV. I imagine they would have to be generated in a spatially normalized way.
Click to expand...

A good LOD generator automatically generates the LODs with matching UVs.
I use MantisLOD and it's fantastic.

mikerz1985 said: ↑

I'm not sure how to bake based on a ECS system, and not using static geometry.
Click to expand...

About static: you can mark it static for baking, then mark it as not static for management and then, once you have placed it in the scene by code reset it to static again.
I can't talk about ECS since I haven't used it though, but iirc Bakery needs to manually assign it's lightmaps by code each time the mesh is loaded, so it may be more complex than expected.

guycalledfrank · Jul 24, 2019

@keeponshading @Krubbs
I made a little benchmark program. Basically it's a small part of the RTX lightmapper that traces a HDRI for a somewhat complex scene.
https://drive.google.com/open?id=1YhImIhF-ESttzOZKJ-W-ldRt_OCt1WuK
Launch run_benchmark.bat, then read .benchmark.txt, it should tell you measured render time.
Please confirm that there is no benefit in a multiGPU config. I'll try to modify some bits and send you an updated version then.

neoshaman said: ↑

- for every pixel in a lightmap, associate a bigger tile texture that store UV address of the visible surface from the "surfaces texture" + weight data based on visibility.
Click to expand...

One thing that bothers me is that if you put a bunch of visible UVs at ray hits in a tile, adjacent values are going to be very far from each other in UV space = cache miss at every read.
This is in case I get your idea correctly and think of a small tile of UV values at each texel.
Alternatively... you can atlas multiple copies of the complete image in the tile map, each lightmap-sized, but storing a different UV value. This way you can sort of blend each sub-image over the final lightmap, using linear texture access (I can probably draw some visual explanation, if my wording is hard to understand).

neoshaman said: ↑

the author said he could probably provide an API
Click to expand...

I should!

XRA said: ↑

Hey just curious if this is a known limitation or bug, noticed that Subtractive light mode doesn't bake correctly when using RNM or SH directional modes
Click to expand...

Huh. Check example_subtractive scene, make sure your setup is similar. MAYBE switch render mode to shadowmask so you can see the "baked contribution" option on the light, switch it to "Direct and Indirect", enable Subtractive render mode back.

neoshaman said: ↑

and I'm poor
Click to expand...

I heard Unity are looking for a lighting engineer

neoshaman said: ↑

Memory wise, it makes the mobile game smaller, lightmaps take a lot of memory.
Click to expand...

I have some doubts about the tile map size in VRAM. If you have a 1024x1024 lightmap (that's modest) and at least 4x4 tiles (rather noisy) and given UVs need to be at least half2, that's 4096x4096x4 = 64 MB. It's not uncommon to have only <= 256 MB of available VRAM on mobile devices, especially older ones.
Plus having a lightmap GBuffer will add to this. You can compress normal and albedo, but I have no idea what to do with position, it's still 3 (or usually 4, because most GPUs don't support 3) floats per texel. For small scenes you may be able to get away with half3(4).
Pre-baked lightmaps, on the other hand, can at least use compression (ASTC, ETC, PVR).

mikerz1985 said: ↑

Does this sound doable to you?
Click to expand...

If lighting doesn't differ much between the seats, you can use Lightmap Prefab component to bake a few unique seat lightmaps and reuse them.
If it does, I suggest putting them in a Lightmap Group with "pack atlas" mode (default). You'll then get one lightmap texture for all seats, plus renderer.lightmapScaleOffset will be assigned a scale/offset vector to transform each seat UVs with. If you also use "override resolution" checkbox on the Lightmap Group Selector component, you can force all seats to have the same exact size (in pixels) inside the atlas; this will allow you to also skip the "scale" part of the lightmapScaleOffset, as it'll become a known constant.

mikerz1985 said: ↑

I'm not sure how to relate a different LOD level's uv to the lightmap UV. I imagine they would have to be generated in a spatially normalized way.
Click to expand...

Bakery current can't bake multiple LOD levels in one texture, so you'll need a separate Group for each LOD level.

mikerz1985 said: ↑

I'm not sure how to bake based on a ECS system, and not using static geometry.
Click to expand...

I never used ECS so far, but you can bake it in a "normal" project and then grab the lightmaps/UV offsets and use them the way you want.

keeponshading · Jul 24, 2019

guycalledfrank said: ↑

@keeponshading @Krubbs
I made a little benchmark program. Basically it's a small part of the RTX lightmapper that traces a HDRI for a somewhat complex scene.
https://drive.google.com/open?id=1YhImIhF-ESttzOZKJ-W-ldRt_OCt1WuK
Launch run_benchmark.bat, then read .benchmark.txt, it should tell you measured render time.
Please confirm that there is no benefit in a multiGPU config. I'll try to modify some bits and send you an updated version then.
Click to expand...

Hi. I cannot access the Titan RTX and Quadro RTX setups before next week.

So if there someone who could support this task earlier we would get some earlier results.

I really recommend to remove second card physically from power and slot. Deactivating in system control or Bios did not worked properly for all cases.

@guycalledfrank
is your benchmark program only working with RTX cards or GTX generation too?

atomicjoe · Jul 24, 2019

keeponshading said: ↑

Hi. I cannot access the Titan RTX and Quadro RTX setups before next week.

So if there someone who could support this task earlier we would get some earlier results.

I really recommend to remove second card physically from power and slot. Deactivating in system control or Bios did not worked properly.
Click to expand...

We need at least 2x RTX cards to test this, right?
I'm afraid it's not very common

keeponshading · Jul 24, 2019

It s standard now in our companys high-end visualisation pc's but you are right, for now it makes no sense.

Bakery could be the first software to fully support rtx on/off and multi gpu.)

Thats reason enough for NVidia to send some test samples to @guycalledfrank
Ask them. You will get some.

The GTX series would be important too.

There is only blender cycles now who gives me close to linear scale with gtx and rtx cards because i can deligate the bake jobs to available compute devices by hand
So 50 jobs to do.
25 for cuda_0.
25 for cuda_1.
There is no rtx on/off benefit for now.

I could also, in 2.8 version,for
50 jobs, delgate per console to
22 for cuda_0.
22 for cuda_1.
6 for intel xxxx
It s again faster but your PC becomes absolut unresponsive.
But there is the good feeling to use your hardware the right way.

guycalledfrank · Jul 24, 2019

keeponshading said: ↑

is your benchmark program only working with RTX cards or GTX generation too?
Click to expand...

It's like baking with "RTX mode" on. Should run on both GTX/RTX, but may be slower on GTX. Anyway, in this case the interesting part is to compare one GPU vs many.

keeponshading said: ↑

Ask them. You will get some.
Click to expand...

Hmmmm...

atomicjoe · Jul 24, 2019

guycalledfrank said: ↑

Hmmmm...
Click to expand...

Yeah man, just ask them nicely and make the point it's a very good promo of their OptiX technology on Unity who has already made several users switch to new RTX hardware

Krubbs · Jul 25, 2019

Very strange results.
Several launches:

2xRTX 2080 (no NVLink)
1. 1.717 s
2. 1.310 s
3. 1.166 s
4. 1.162 s
5. 1.171 s

1xRTX 2080
1. 2.002 s
2. 0.828 s
3. 0.846 s
4. 0.850 s
5. 0.842 s

Only 1st start show that 2x faster than 1x. But after 2nd test we can see that 1x2080 faster than 2x2080. Hmm...
But i haven't NVLink bridge.

keeponshading · Jul 25, 2019

fyi. In addition to test and use multi gpu setups for production bakes.
When you have finished your bigest multi scene setup and all multi gpu optimisation is done and your scenes are waterproof you don t need to buy gpu s for yourself to get fast an productive bake done
e.g. Amazon Cloud has some great setups to bake the biggest stuff down in hours/days.
https://aws.amazon.com/de/blogs/aws/in-the-works-ec2-instances-g4-with-nvidia-t4-gpus/

current pricing
https://aws.amazon.com/de/ec2/instance-types/p2/
8x GPU 7,2 USD per hour
16xGPU 14,4 USD per hour
what s pretty fair

that s the biggest argument for the split for compute devices per job shown in my benchmark post before. It makes this possible too without any hassle.

applied to an time of day scenario for the archviz 6 example.
And it s only an example but possible with 1 Unity and 1 Bakery license.

4x GTX Titan V 00h37m47s at the moment
around
4x GTX Titan V 00h09m00s after compute device optimisation when possible

16x GPU AWS Cloud around 00h02m00s for one lighting scenario
for 14,4 USD you can calculate 30 different lighting scenarios in 1 hour
not calculated the ibl switch times and save times in. only rough estimates. Should only show a possible trend. When you don t like multi lit scenarios so it scales similar for having 16 times bigger levels as the archviz example.

So computation costs for multiple dynamic day bakes for lightmaps and probes if there is some nice lightfield kind of storage and compression for them could become real very fast.
This all with mostly physical correct data Bakery delievers already for all bake modes and holly molly tweaks.
Not some i approximate here and there finger painting.

guycalledfrank · Jul 26, 2019

Krubbs said: ↑

Very strange results.
Several launches:
Click to expand...

Sent you an updated benchmark.

keeponshading · Jul 26, 2019

fyi. With low sampled and denoised bakes some dreamy art styles are possible too. Like the idea.

https://www.creativeshrimp.com/denoise-blender-tutorial.html

keeponshading · Jul 27, 2019

Hi,
on some deep dive around probes i played around with
https://github.com/TheRealMJP/BakingLab

(by browsing the code you find also some hints for Directional in Enlighten, https://github.com/TheRealMJP/BakingLab/commit/157a4e6fdb81a2e5999f231543316ea4ee9c0a2e )

because it allows some nice comparisons from ground truth path traced with

Diffuse – a single RGB value containing the result of applying a standard diffuse BRDF to the incoming lighting, with an albedo of 1.0

Half-Life 2 – directional irradiance projected onto the Half-Life 2 basis[6], making for a total of 3 sets of RGB coefficients (9 floats total)

L1 SH – radiance projected onto the first two orders of spherical harmonics, making for a total of 4 sets of RGB coefficients (12 floats total). Supports environment specular via a 3D lookup texture.

L2 SH – radiance projected on the first three orders of spherical harmonics, making for a total of 9 sets of RGB coefficients (27 floats total). Supports environment specular via a 3D lookup texture.

L1 H-basis – irradiance projected onto the first two orders of H-basis[7], making for a total of 4 sets of RGB coefficients (12 floats total).

L2 H-basis – irradiance projected onto the first three orders of H-basis, making for a total of 6 sets of RGB coefficients (18 floats total).

SG5 – radiance represented by the sum of 5 SG lobes with fixed directions and sharpness, making for a total of 5 sets of RGB coefficients (15 floats total). Supports environment specular via an approximate evaluation of per-lobe specular contribution.

SG6 – radiance represented by the sum of 6 SG lobes with fixed directions and sharpness, making for a total of 6 sets of RGB coefficients (18 floats total). Supports environment specular via an approximate evaluation of per-lobe specular contribution.

SG9 – radiance represented by the sum of 9 SG lobes with fixed directions and sharpness, making for a total of 9 sets of RGB coefficients (27 floats total). Supports environment specular via an approximate evaluation of per-lobe specular contribution.

SG12 – radiance represented by the sum of 12 SG lobes with fixed directions and sharpness, making for a total of 12 sets of RGB coefficients (36 floats total). Supports environment specular via an approximate evaluation of per-lobe specular contribution.

The complete blog here
https://mynameismjp.wordpress.com/2016/10/09/sg-series-part-6-step-into-the-baking-lab/

the results are somehow flashy like here, in special for bright light sources

@guycalledfrank
Do you have an opinion on SG's.
In special on the environment specular from SG lightmaps.

feranti · Jul 27, 2019

guycalledfrank said: ↑

Not easiliy, because of many UnityEditor dependencies. You can't import/convert assets at runtime; There are no StaticFlags on GameObjects (if I'm not mistaken); Batched static geometry is rebuilt and doesn't match original meshes anymore; etc.

Not really, because that would impose limits on player devices. Currently Bakery requires Windows/Nvidia, which is an okay-ish requirement for a developer machine, but not for the player. So I'm not sure if it would be too useful.

That's cool, but I don't think it's a common case.

There is a way though. You can use standalone non-Unity version of Bakery (as DLL) at runtime and manage scene exporting and asset importing yourself. In fact I'm finishing the API right now, currently it looks like that:

View attachment 445994
Click to expand...

Thank you for pointing me there (and sorry for the delay oy my answer).
To be sure, you say that all the code necessary to run the baking is in an external DLL that we can use ourselves (the same way you use it through your own editor scripts, I assume) ?
I also assume it will stick us with a specific version of Bakery and will annoy us if the DLL changes in a later version of Bakery, but it is definitely useful hint ! Thanks a lot !

neoshaman · Jul 27, 2019

guycalledfrank said: ↑

- Alternatively... you can atlas multiple copies of the complete image in the tile map,
- each lightmap-sized, but storing a different UV value.
- This way you can sort of blend each sub-image over the final lightmap,
- using linear texture access (I can probably draw some visual explanation, if my wording is hard to understand).
Click to expand...

What do you call the complete image? the lightmap that accumulate light? I'm not sure I get it lol, I guess you are like proposing many maps but shifted in UV to reduce the sampling cost?

However The UV samples should be preprocessed during baking, there is maybe a way to reduce this at this stage by sorting the samples correctly. Maybe a smart pass that reassemble the lightmap layout to optimized ray (even using generated proxy geometry to optimize further)? who know. I just offered a vanilla basic implementation idea. I'm pretty sure there is endless iteration to do on top. That's just a starting idea. If the vanilla implementation works at its own level, there is probably more to think from there!

I think I get it (maybe not but here is an idea), you have the lightmap, each pixel query it's tile and each tile query another tile that hold the "light data" to sample. If that's true you wouldn't need the UV map at all, you could just light directly into the "light tile" (done in a different process), then use a mipmap to sample the average of all rays for a single bounce. There would be no way to propagate the light data to new pass (as the target and source are decoupled). Which is a valid idea too (to test).

But then you lose bidirection, as I was mapping the lightmap to itself, needed for multibounces, since every points can hash it's UV position to the indirection texture, and since the indirection texture is the position of other points, we effectively have a structure that store the entire lightfield data, basically a light graph we can traverse however we need (at the cost of texture cache miss).

The multi bounces work with mapping the data to itself because we use a gather strategy. Each pass a point gather light from above and update itself, then next pass other points that see it gather back that light data, it's recursive and propagate along the graph. The hardest part become about optimizing the gather pass which is the most costly, and that's all GI in a nutshell IMHO. Wether it's texture fetch or BVH traversal, it's one GI hell.

guycalledfrank said: ↑

One thing that bothers me is that if you put a bunch of visible UVs at ray hits in a tile, adjacent values are going to be very far from each other in UV space = cache miss at every read.
This is in case I get your idea correctly and think of a small tile of UV values at each texel.
Click to expand...

That's kinda the whole problem with any GI solution and why hardware BVH accelerated solution is kind of a big deal. The data over the hemisphere of a point is as good as (potentially) random, I think all gpu solution suffered from this as far as I know. It's also what motivate me to use mipmap as an optimization to "bundle rays" BEFORE realizing it could be used to simulate cone ala voxel tracing (given we can avoid edges problem). GI stay expensive whatever solution we try to go for. There is also the solution to "jitter" the around the address to get neighbor sample, but it work best if we store a reference ID for the geometry and only sample "jitter" that have the same ID to prevent edges (but cannot guarantee visibility, it's an approximation, maybe adding even further data like a jitter range?)

This is especially true for low end mobile, who use tile based GPU rendering, which mean that random texture access is expensive when fetching from the main memory. I think in my target hardware the gpu tile is 16²px (but that's rasterization), I don't know (yet) what is the size of the cache and if dynamic textures are hold up to the same standard (okay the spec sheet says 8kb to 256kb l2 cache, so a single 256² rgba raw in the best case).

The value I see is that it's geometry independent, framerate independent and embarrassingly parallel, that is highly flexible, which mean we can spread compute any way we want, as much as computing pixel per pixel, ray per ray. What if we put the cpu to contribution to create small bundle of data as texture and reconstruct the final texture later? That's a possible option.

Okay that makes think, If I come back to your idea (at least how I understood it) above that give a single bounce, may be if we unpack back the data from the lightmap? we can hash the light tile to a pixel, so we can gather its data and bounce it back to the light tile pixels. That might work? It could be the superior quality solution! But it seems to trade off memory as the light Gbuffer is bigger due to duplicated point in tile, UV tile being lighter (less channel, GB lightmap smaller in pixel).

guycalledfrank said: ↑

I have some doubts about the tile map size in VRAM. If you have a 1024x1024 lightmap (that's modest) and at least 4x4 tiles (rather noisy) and given UVs need to be at least half2, that's 4096x4096x4 = 64 MB. It's not uncommon to have only <= 256 MB of available VRAM on mobile devices, especially older ones.
Plus having a lightmap GBuffer will add to this. You can compress normal and albedo, but I have no idea what to do with position, it's still 3 (or usually 4, because most GPUs don't support 3) floats per texel. For small scenes you may be able to get away with half3(4).
Pre-baked lightmaps, on the other hand, can at least use compression (ASTC, ETC, PVR).
Click to expand...

It's a trade off really, IMHO, my experience with low end cheap phone is that people who have them cannot dl game with size of more than 100mo which is like already too much, I worked with someone who has an application of 50mo and user complain was that's too heavy. Even compressed map are still huge matrices of number. Also I was anticipating 256² sized lightmap, so not detailed or complex (low frequency) environment lighting! I think the graphic memory target is more along 32mo anyway.

A RT lightmap on the other hand can be created on the fly, since the data needed is actually within the vertices, the TILE version would only need the indirection texture (max 2048 on low end). But the big win is that this method is geometry agnostic, so it could (potentially) work as a generative pass (replacing a texture loading for example or even happen before any level loading at all), you pre compute before starting the level (during building/loading), it's kinda a method compression into itself. Of course it wouldn't be as great as proper tracing.

A less geometric independent, and more approximated version, can do away with storing any prebaked indirection map, and generate an atlas of UV box projected probes at runtime during level building/loading. Also any artefacts are probably fine with some aesthetics, and it's also probably (test pending) the only way to get RTGI on hardware like the mali 400 mp2, which is more or less a souped PS2 in power level .

You probably need to design the entire geometry with that in mind, so it's not as generic as other solution (at least as this level of hardware cheapness). I don't expect the quality of other solution at all lol. It's fillrate dependent. The mali 400 has a low bound fillrates of 210Mpix/s that's 3,5Mpix/frames at 60fps. That's 53,4 textures of 256²px per frames (13 x 1024², 3 x 2048²) assuming best conditions with the cheapest possible shader. So the quality rate is a factor of number of rays, memory access and shader complexity.
https://developer.arm.com/ip-products/graphics-and-multimedia/mali-gpus/mali-400-gpu

guycalledfrank said: ↑

I heard Unity are looking for a lighting engineer
Click to expand...

Thanks that's quite the compliment seeing you are much more competent than me, but I don't think I'm actually competent enough, I basically indulging my inner Dunning Kruger to learn and experiment right now lol, if I was competent enough, I would have already made the asset and sold it as an extension of Bakery for RT light , right now I'm just piecing data together, and I use the forum as a scratch pad to organize those idea and bouncing them back to anybody who want to hear them (like we just did above lol).

Okay now I think I have enough to try an implementation anyway, thanks for the talk! I need to start learning about proper practical implementation of dynamic texture now let's see where are the gotcha!

keeponshading · Jul 27, 2019

As more as i digging around i see
Bakery as Unitys ScriptableBakePipeline.

You did all the hard work to demystify baking.

So i would do an SphericalGaussian Lightmaps addon for Bakery with Cycles as GroundTruth preview and solver.)

ephraimmiah · Jul 29, 2019

Heyhey, I got a bug report of sorts for you:

When using the bakery lightmapped prefab script/component on a prefab and then baking the prefab while having the bakery mesh light script inside of said prefab, bakery will throw an error.

While the first prefab still gets its lightmaps properly applied, any additional prefabs in the scene won´t have theirs applied properly. Error message is:

ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection. Parameter name: index

Bakerys Progressbar then gets stuck in "Finished rendering" with loading percentages between 103% or higher values.

This gets fixed by simply placing all lights out of the prefabs, which is a usable solution after all.

Softelectro · Jul 29, 2019

Hello,

We use Bakery since few month now with succes. And we have two questions/remarks:

Baking event

Like we split our project in several scene, we have created an editor script to bake all scenes, one by one, during the night.
With Unity baking, we get an event when the baking is finished (Lightmapping.completed) to bake next scene, but we don't find same thing in Bakery, so we add it.
Do you think it's possible to add such thing in your script?

Rebake one mesh

For memories, in Unity 4, it was possible to bake only selected gameobjects. With your lightmapgroup script we can make the same things but the result isn't the same between with and without lightmapgroup script (UV size in lightmap).
Do you thing it's possible to add such function in Bakery?

Thanks in advance

guycalledfrank · Jul 30, 2019

keeponshading said: ↑

on some deep dive around probes i played around with
https://github.com/TheRealMJP/BakingLab
Click to expand...

I know that demo
You should also try Probulator for L1 Geomerics trick and more: https://github.com/kayru/Probulator

keeponshading said: ↑

Do you have an opinion on SG's.
In special on the environment specular from SG lightmaps.
Click to expand...

SGs are nice, but unlike SHs they also need fitting: https://mynameismjp.wordpress.com/2...proximating-radiance-and-irradiance-with-sgs/
Fitting is slow and sad, but you can avoid it with running average, which gives me hope: https://torust.me/rendering/irradiance-caching/spherical-gaussians/2018/09/21/spherical-gaussians
Running average is really cool but I didn't test it yet.
For lightmaps I also don't like the storage requirement. The amount of textures they used for The Order (was it 9?) is too much to pay for a rather blurry specular (Directional/SH specular can give some middle ground). I can implement SG, but I'm not sure if it's worth it. Probably another reason for me to implement scriptable API.

feranti said: ↑

To be sure, you say that all the code necessary to run the baking is in an external DLL that we can use ourselves (the same way you use it through your own editor scripts, I assume) ?
Click to expand...

It's a bunch of multiple exes/dlls that scripts use. You can try following my script logic, but it's huge, as it tries to support all Unity features and work around limitations. I also have a somewhat working single wrapper DLL around all this stuff, it's not a part of Unity Bakery, but I can send it to you.

neoshaman said: ↑

What do you call the complete image? the lightmap that accumulate light? I'm not sure I get it lol, I guess you are like proposing many maps but shifted in UV to reduce the sampling cost?
Click to expand...

Basically this:

Clustering data like this will likely reduce cache misses. Even though "UV value" represents a particular ray direction, the hits will happen in a more continuous fashion.

ephraimmiah said: ↑

Heyhey, I got a bug report of sorts for you:
Click to expand...

Ouch. That sounds very familiar. Are you on the latest 1.6? Pretty sure I fixed it before.

Softelectro said: ↑

get an event when the baking is finished
Click to expand...

Ah yes I added that recently.
Get github access: https://geom.io/bakery/wiki/index.php?title=Github_access
Download these branches:
https://github.com/guycalledfrank/bakery-csharp/tree/more_changes
https://github.com/guycalledfrank/bakery-compiled/tree/more_changes
Ongoing changelog is here: https://github.com/guycalledfrank/bakery-csharp/pull/5
Take a look at OnFinishedFullRender.

Softelectro said: ↑

Do you thing it's possible to add such function in Bakery?
Click to expand...

This is rather hard, as selective baking in general. Problem is, object needs to know the lighting around it, and to get it... you need to bake the rest too. So the selective render feature is rather limited at the moment and I'm not sure how to work around it.

keeponshading · Jul 30, 2019

guycalledfrank said: ↑

I know that demo
You should also try Probulator for L1 Geomerics trick and more: https://github.com/kayru/Probulator

SGs are nice, but unlike SHs they also need fitting: https://mynameismjp.wordpress.com/2...proximating-radiance-and-irradiance-with-sgs/
Fitting is slow and sad, but you can avoid it with running average, which gives me hope: https://torust.me/rendering/irradiance-caching/spherical-gaussians/2018/09/21/spherical-gaussians
Running average is really cool but I didn't test it yet.
For lightmaps I also don't like the storage requirement. The amount of textures they used for The Order (was it 9?) is too much to pay for a rather blurry specular (Directional/SH specular can give some middle ground). I can implement SG, but I'm not sure if it's worth it. Probably another reason for me to implement scriptable API.
Click to expand...

They used SG9 Lightmaps for static and SG9 Probes for moving stuff.
Then additional blending from SG 9 Lightmaps with manual placed Reflection Probes based on roughness.

The top image shows a scene from The Order: 1886. The bottom image shows the same scene with a color coding applied to show the environment specular source for each pixel.

Also nice

and

The complete Blog from BakeLab is such an fantastic read. Starting to really like it.

Thanks for the Probulator links. Great stuff.

I would prefer fit SG s over fitting complete level by hand.)

Let s dig deeper. They fixed some parameters to make it more easy. But i did not completly understand this for now.

Some more details like in the Blog
https://present5.com/advanced-lighting-r-d-at-ready-at-dawn-studios/

Sciptable API. Yes. Yes. Yes.

Back on Hardware tomorrow for doing further MultiGPU tests.
Could you send me optimized RTX version too.

BattleAngelAlita · Jul 30, 2019

One main disadvantage of SG is that they need to evaluate all lobes to get the final lighting result. But yeah, will be great to get SG at least for lightprobes. Maybe i try to implement it into bakery.

feranti · Jul 30, 2019

guycalledfrank said: ↑

It's a bunch of multiple exes/dlls that scripts use. You can try following my script logic, but it's huge, as it tries to support all Unity features and work around limitations. I also have a somewhat working single wrapper DLL around all this stuff, it's not a part of Unity Bakery, but I can send it to you.
Click to expand...

It will be very nice of you! Thanks
Do you want me to PM you my mail address?

keeponshading · Jul 30, 2019

Fyi.
In case you are searching another/additional potential solver who would allow realtime ground truth preview and very fast bake of all FullGI passes.
Yesterday there was another milestone reached on Cycles.
(actually the sucessfully landed 3 rockets in one week and are close to spread an community on mars)
NVidia spent an additional Optix RTX integration to Blender Source Code.
https://developer.blender.org/D5363
Since now Optix is part of the Driver and could be integrated in OpenSource according to the GPL.

So you can solve and preview now with
CPU
Cuda(GPU and/or CPU)
or
Optix RTX
per click
and choose single or multi gpu as compute devices.

with the hell of an speed

The Optix Denoiser gets also an official integration among denoising of output,for realtime raytraced preview in short.

For now Optix and OIDN Denoising via free addons.

So thats pretty cool.

Among lots of other reasons it helps to understand why
Epic spend 1.2 Mio to Blender Foundation and UBISoft (not so much) last weeks.)

So for now Unity seems to have no interest.
GPU parity to CPU parity in PLM first.)
Probably they start looking into it when the competition has done an integration.

I could be horribly wrong. But according to all i have seen in how you build up Bakery you could integrate it in a few weeks by going platform independent and having realtime preview (cycles standalone build window)

Search Unity

Unity ID

Useful Searches

Bakery - GPU Lightmapper (v1.96) + RTPreview [RELEASED]