[Solved] Reconstruct World Position from Depth Texture in Single-Pass Stereo VR

Discussion in 'General Graphics' started by equalsequals, Jun 28, 2017.

1. equalsequals

Joined:
Sep 27, 2010
Posts:
154
We are attempting to use the Depth Texture in a Shader which is executed from CommandBuffer.Blit during the CameraEvent.BeforeForwardOpaque, in Single-Pass Stereo. We are using Unity 5.6.0f3, but are open to upgrading if necessary.

I've done a fair amount of reading and come across a few good, slightly different, methods for doing this outside of Single-Pass:
I've mainly been looking at Keijiro's example as it is the simplest. It works in Multi-Pass VR, but does not account for Single-Pass.

Where I see these methods going wrong with Single-Pass are:
1. UV Coordinates do not account for a double-wide Frame Buffer
2. Camera's Projection Matrix != Eye Projection Matrix
3. World Space Camera Pos != World Space Eye Position
4. 'CameraToWorld' Matrix != 'EyeToWorld' Matrix
Possible resolutions are:
1. Calculate these values CPU-side myself as late as possible.
2. UnityShaderVariables.cginc appears to have these variables (provided they are correctly populated)
Here is what I am doing currently; for simplicity I will focus only on the Left Eye:
Code (CSharp):
void OnPreRender()
{
    mat.SetVector("_MyProjectionParams", new Vector4(
        1f, // x is 1.0 (or -1.0 if currently rendering with a flipped projection matrix)
        cam.nearClipPlane, // y is the camera's near plane
        cam.farClipPlane, // z is the camera's far plane
        1f / cam.farClipPlane // w is 1/FarPlane
        ));

    Matrix4x4 leftEye = cam.GetStereoProjectionMatrix(Camera.StereoscopicEye.Left);
    mat.SetMatrix("_LeftEyeProjection", leftEye);

    Matrix4x4 leftToWorld = cam.GetStereoViewMatrix(Camera.StereoscopicEye.Left).inverse;
    mat.SetMatrix("_LeftEyeToWorld", leftToWorld);
}
Note: The reason why I am calculating the projection parameters myself is that it seems that the projection is not populated correctly at the time of Blit.

Code (CSharp):
CGPROGRAM
#pragma vertex vert
#pragma fragment frag
#include "UnityCG.cginc"

struct v2f
{
    float4 vertex : SV_POSITION;
    float2 uv : TEXCOORD0;
};

v2f vert (appdata_img v)
{
    v2f o;
    o.vertex = UnityObjectToClipPos(v.vertex);
    o.uv = v.texcoord;

    return o;
}

sampler2D_float _CameraDepthTexture; // Assumed declaration; required by SAMPLE_DEPTH_TEXTURE below
float4x4 _LeftEyeProjection; // Left Eye's Projection Matrix
float4x4 _LeftEyeToWorld; // Left Eye's "cameraToWorldMatrix"
float4 _MyProjectionParams; // Same as _ProjectionParams, but calculated myself

fixed4 frag(v2f i) : SV_Target
{
    float d = SAMPLE_DEPTH_TEXTURE(_CameraDepthTexture, i.uv);
#if defined(UNITY_REVERSED_Z)
    d = 1 - d;
#endif
    const float near = _MyProjectionParams.y;
    const float far = _MyProjectionParams.z;

    // Convert the non-linear depth sample to a linear eye-space distance
    const float zPers = near * far / lerp(far, near, d);
    float3 worldPos = 0;
    if (i.uv.x < .5) // Left Eye Only
    {
        float2 uv = i.uv;
        uv.x = saturate(uv.x * 2); // 0..1 for left side of buffer

        const float2 p11_22 = float2(_LeftEyeProjection._11, _LeftEyeProjection._22);
        // float4 constructor (the posted listing used float3 with four components)
        float4 vpos = float4((uv * 2 - 1) / p11_22 * zPers, -zPers, 1);
        worldPos = mul(_LeftEyeToWorld, vpos).xyz;
    }
    else
    {
        return 0; // TODO: Right Eye
    }

    half3 color = pow(abs(cos(worldPos * UNITY_PI * 4)), 20);
    return fixed4(color, 1);
}

ENDCG
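As a sanity check on the depth linearization used in the fragment shader above, here is a minimal Python sketch of the same formula. It assumes a Direct3D-style depth buffer where 0 maps to the near plane and 1 to the far plane (after undoing reversed Z); the function names and the inverse mapping `eye_z_to_depth` are illustrative and not part of the original shader.

```python
def lerp(a, b, t):
    """Same as the HLSL/Cg lerp intrinsic."""
    return a + (b - a) * t


def depth_to_eye_z(d, near, far, reversed_z=False):
    """Non-linear depth sample in [0, 1] -> linear eye-space distance."""
    if reversed_z:
        d = 1.0 - d  # mirrors the UNITY_REVERSED_Z branch in the shader
    return near * far / lerp(far, near, d)


def eye_z_to_depth(z, near, far):
    """Assumed forward mapping (0 at near, 1 at far), used only to round-trip."""
    return far * (z - near) / (z * (far - near))


near, far = 0.3, 1000.0
for z in (near, 10.0, far):
    d = eye_z_to_depth(z, near, far)
    print(z, d, depth_to_eye_z(d, near, far))
```

The round trip returns the original distance, which suggests the `near * far / lerp(far, near, d)` step is not where the reconstruction goes wrong.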
What I see in the RenderTexture is this:

[gifv here]

I feel that this is close, as the X and Z axes seem to be mostly static. My guess is that where this is going wrong is actually the Eye-To-World step as, from what I can tell, the View Position looks correct when I visualize it. Of course, I could be wrong in that observation.

[gifv here]

Upon changing some values out with Unity-provided ones in 'UnityShaderVariables.cginc', I don't see much of a difference. What I see available to me is:

Code (CSharp):
#if defined(USING_STEREO_MATRICES)
CBUFFER_START(UnityStereoGlobals)
    float4x4 unity_StereoMatrixP[2];
    float4x4 unity_StereoMatrixV[2];
    float4x4 unity_StereoMatrixInvV[2];
    float4x4 unity_StereoMatrixVP[2];

    float4x4 unity_StereoCameraProjection[2];
    float4x4 unity_StereoCameraInvProjection[2];
    float4x4 unity_StereoWorldToCamera[2];
    float4x4 unity_StereoCameraToWorld[2];

    float3 unity_StereoWorldSpaceCameraPos[2];
    float4 unity_StereoScaleOffset[2];
CBUFFER_END
#endif
This leads me to believe that what I'm calculating myself is correct, but I am simply just missing a step that is unique to Single-Pass.

If anyone has any additional insight, it would be greatly appreciated.

==

Last edited: Jun 28, 2017
2. equalsequals

Upon further investigation and assembling an example project (see: attached), I have discovered that I am seeing the intended behavior under the Oculus SDK, but the effect remains broken in OpenVR.

Oculus SDK result (working):

OpenVR result (broken):

I believe that this has to do with the asymmetrical, off-axis projection OpenVR uses, but I don't know which matrix to use to solve it. I have tried passing the following values, to no avail:

Code (CSharp):
// OpenVR API
OpenVR.System.GetProjectionMatrix(EVREye.Eye_Left, cam.nearClipPlane, cam.farClipPlane);

// SteamVR API
SteamVR.instance.hmd.GetProjectionMatrix(eye, cam.nearClipPlane, cam.farClipPlane);

// Unity API
cam.GetStereoProjectionMatrix(Camera.StereoscopicEye.Left);

3. equalsequals

I have resolved the issue. For posterity, I'll try to outline the root issues I have encountered and the workarounds I implemented to solve them. A big thanks to Jean-Francois at Unity for the suggestions and sanity-checking during this process.

To recap the original issue, I had been attempting to reconstruct the World-Space Position of a given fragment in a Shader using the Depth Texture. This happens at the time of a CommandBuffer.Blit at the CameraEvent.BeforeForwardOpaque, but the principle should be the same for Post-Effects in OnPostRender as well. It is important to note that there is the special case of this being Single-Pass Stereo (SPS) VR. Depth-To-World conversion is a fairly straightforward calculation to make, and there are plenty of resources for implementing it, as I linked above in my initial post. The method I used as a starting point can be found here.

My starting method works fine in both Non-VR and Multi-Pass Stereo VR, but it has some problems in SPS, both in 5.6.x (my test case) and 2017.1.

1. CommandBuffer.Blit is not aware of Single-Pass Stereo and does not draw a Quad per eye, but instead one Quad for the entire double-wide Frame Buffer. The resulting UVs in this case need to be adjusted before clip-space conversion can be applied, but this can be worked around fairly easily.
2. The "scaffolding" in the CBuffer is not there at this point so the Unity CBuffers are not correctly populated. This can be worked around by passing in my own variables as I had already been doing at the time of my initial post.
3. The implementation I have linked above uses a math shortcut to reverse the projection that does work in the Oculus SDK once the previous 2 points are circumvented. In OpenVR, however, this is broken, which I can only assume is due to the way the matrices are assembled. The workaround here is to do a proper UV to clip-space conversion and multiply by the inverse of the Projection Matrix to derive the Eye-Space position.
4. Finally, and possibly most importantly, the Projection Matrix received from querying Camera.GetStereoProjectionMatrix is incorrect and does not correctly reverse the projection in #3. I verified this by using RenderDoc and peeking at Unity's CBuffer during the ForwardOpaque queue.
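Point 3 can be illustrated numerically outside Unity. The Python sketch below is an illustration under stated assumptions, not the shader from this thread: it uses an OpenGL-style frustum matrix (NDC z in [-1, 1]) rather than Unity's GPU projection, and the helper names (`frustum`, `invert`, `unproject_full`, `unproject_shortcut`) are made up for the example. It projects a known eye-space point and then recovers it two ways: with the shortcut that divides by the diagonal terms only, and with a full multiply by the inverse projection.

```python
def frustum(left, right, bottom, top, near, far):
    """OpenGL-style perspective matrix (row-major, column vectors)."""
    return [
        [2 * near / (right - left), 0.0, (right + left) / (right - left), 0.0],
        [0.0, 2 * near / (top - bottom), (top + bottom) / (top - bottom), 0.0],
        [0.0, 0.0, -(far + near) / (far - near), -2 * far * near / (far - near)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def invert(m):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(m)
    aug = [list(map(float, row)) + [1.0 if r == c else 0.0 for c in range(n)]
           for r, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def transform(m, v):
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def project(proj, view_pos):
    clip = transform(proj, list(view_pos) + [1.0])
    return [c / clip[3] for c in clip[:3]]  # perspective division -> NDC


def unproject_full(proj, ndc):
    v = transform(invert(proj), list(ndc) + [1.0])
    return [c / v[3] for c in v[:3]]


def unproject_shortcut(proj, ndc, eye_depth):
    # Divides by the diagonal terms only; ignores any off-axis offset terms.
    return [ndc[0] / proj[0][0] * eye_depth,
            ndc[1] / proj[1][1] * eye_depth,
            -eye_depth]


point = [0.5, -0.25, -3.0]  # eye space, looking down -Z
symmetric = frustum(-0.1, 0.1, -0.1, 0.1, 0.1, 100.0)
off_axis = frustum(-0.12, 0.08, -0.1, 0.1, 0.1, 100.0)

for name, proj in (("symmetric", symmetric), ("off-axis", off_axis)):
    ndc = project(proj, point)
    print(name,
          "full:", unproject_full(proj, ndc),
          "shortcut:", unproject_shortcut(proj, ndc, -point[2]))
```

With the symmetric frustum both methods recover the point. With the off-axis frustum only the full inverse does, because the shortcut drops the third-column offset terms of the projection. That is consistent with the shortcut working under one SDK and failing under another, although the thread does not confirm that this is the exact cause.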
Breaking this down:

1. Working around the Stretched Quad Issue
This is fairly straightforward, and I already had this accomplished by the time of my initial post.

Code (CSharp):
fixed4 frag (v2f i) : SV_Target
{
    float2 uv = i.uv; // Take the UV from the Blit Quad...

    if (uv.x < .5) // Left Eye
    {
        uv.x = saturate(uv.x * 2); // 0..1 for left side of buffer
    }
    else // Right Eye
    {
        uv.x = saturate((uv.x - 0.5) * 2); // 0..1 for right side of buffer
    }

    // ...More to follow...
}
This accounts for the UV being stretched across both eyes, and normalizes the UV from 0..1 for each eye, preparing it for Clip-Space conversion later on.
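The same remapping can be checked in a few lines of Python. This is only a sketch of the arithmetic; the function names are hypothetical, and the inverse mapping (`to_double_wide_uv`) is added purely to show that the per-eye remap round-trips.

```python
def to_eye_uv(u, v):
    """Map a UV on the double-wide buffer to (eye_index, per-eye UV)."""
    if u < 0.5:
        eye, eye_u = 0, u * 2.0          # left half of the buffer
    else:
        eye, eye_u = 1, (u - 0.5) * 2.0  # right half of the buffer
    eye_u = min(max(eye_u, 0.0), 1.0)    # same role as saturate()
    return eye, (eye_u, v)


def to_double_wide_uv(eye, eye_u, v):
    """Inverse mapping: per-eye UV back to the double-wide buffer."""
    return (eye_u * 0.5 + 0.5 * eye, v)


for u in (0.0, 0.25, 0.5, 0.75, 1.0):
    eye, (eye_u, v) = to_eye_uv(u, 0.7)
    print(u, "-> eye", eye, "uv", (eye_u, v), "->", to_double_wide_uv(eye, eye_u, v))
```

Note that the seam at u = 0.5 is assigned to the right eye, matching the `uv.x < .5` test in the shader.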

We need 2 sets of 2 matrices here, the Inverse Projection for both Left and Right eyes, and the equivalent of Camera.cameraToWorldMatrix for each eye.

Code (CSharp):
private void OnPreRender() // This is just a Later-Than-Late Update
{

    if (cam.stereoEnabled)
    {
        // Left and Right Eye inverse View Matrices
        leftToWorld = cam.GetStereoViewMatrix(Camera.StereoscopicEye.Left).inverse;
        rightToWorld = cam.GetStereoViewMatrix(Camera.StereoscopicEye.Right).inverse;
        mat.SetMatrix("_LeftEyeToWorld", leftToWorld);
        mat.SetMatrix("_RightEyeToWorld", rightToWorld);

        leftEye = cam.GetStereoProjectionMatrix(Camera.StereoscopicEye.Left);
        rightEye = cam.GetStereoProjectionMatrix(Camera.StereoscopicEye.Right);

        // Compensate for RenderTexture...
        leftEye = GL.GetGPUProjectionMatrix(leftEye, true).inverse;
        rightEye = GL.GetGPUProjectionMatrix(rightEye, true).inverse;

        mat.SetMatrix("_LeftEyeProjection", leftEye);
        mat.SetMatrix("_RightEyeProjection", rightEye);
    }
}
At this point it is important to reiterate that the Projection Matrices are wrong here.

The result here will not be correct, but the code will not need to be changed in the Fragment past this point.

Code (CSharp):
fixed4 frag (v2f i) : SV_Target
{
    float d = SAMPLE_DEPTH_TEXTURE(_CameraDepthTexture, i.uv); // non-linear Z

    float2 uv = i.uv;

    float4x4 proj, eyeToWorld;

    if (uv.x < .5) // Left Eye
    {
        uv.x = saturate(uv.x * 2); // 0..1 for left side of buffer
        proj = _LeftEyeProjection;
        eyeToWorld = _LeftEyeToWorld;
    }
    else // Right Eye
    {
        uv.x = saturate((uv.x - 0.5) * 2); // 0..1 for right side of buffer
        proj = _RightEyeProjection;
        eyeToWorld = _RightEyeToWorld;
    }

    float2 uvClip = uv * 2.0 - 1.0;
    float4 clipPos = float4(uvClip, d, 1.0);
    float4 viewPos = mul(proj, clipPos); // inverse projection by clip position
    viewPos /= viewPos.w; // perspective division
    float3 worldPos = mul(eyeToWorld, viewPos).xyz;

    fixed3 color = pow(abs(cos(worldPos * UNITY_PI * 4)), 20); // visualize grid
    return fixed4(color, 1);
}
At this point, the grid should still be broken because, as I mentioned above, our Projection Matrix is not, in fact, correct. It should look similar to this:

Now, diving into the FrameDebugger I see what my Projection Matrix is:

Upon loading up RenderDoc and inspecting the Unity CBuffer, I notice the difference:

Very specifically, our Matrix[1,1] is negative when passing from the API and positive in the CBuffer. [3,2] and [3,3] also have discrepancies, but do not seem to need adjustment.

The simple fix we apply to our C# script is:

Code (CSharp):
        // Compensate for RenderTexture...
        leftEye = GL.GetGPUProjectionMatrix(leftEye, true).inverse;
        rightEye = GL.GetGPUProjectionMatrix(rightEye, true).inverse;
        // Negate [1,1] to reflect Unity's CBuffer state
        leftEye[1, 1] *= -1;
        rightEye[1, 1] *= -1;
At this point, we should have the correct inverse projection:

I'm not entirely sure why the value from the Unity API is negative, but logically it stands to reason that the inverse of the matrix at the time of render needs to match the inverse of the matrix when I attempt to reverse the projection.
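One possible reading, which this thread does not confirm, is that the two projections differ by a vertical flip (a negated second row, as is commonly applied when a projection is adjusted for rendering into a texture). Under that assumption, the Python sketch below shows that the inverses of a perspective matrix and of its Y-flipped twin differ in exactly one element, [1,1], so negating that element of the inverse is equivalent to undoing the flip. The matrix values and helper names are made up for illustration, and the closed-form inverse only applies to matrices with this perspective layout.

```python
def perspective_inverse(p):
    """Closed-form inverse for a matrix shaped like
    [[a,0,b,0],[0,c,d,0],[0,0,e,f],[0,0,-1,0]]."""
    a, b = p[0][0], p[0][2]
    c, d = p[1][1], p[1][2]
    e, f = p[2][2], p[2][3]
    return [
        [1 / a, 0.0, 0.0, b / a],
        [0.0, 1 / c, 0.0, d / c],
        [0.0, 0.0, 0.0, -1.0],
        [0.0, 0.0, 1 / f, e / f],
    ]


def flip_y(p):
    """Negate the second row, i.e. flip clip-space Y."""
    return [[-x for x in row] if r == 1 else list(row) for r, row in enumerate(p)]


def matmul(m, n):
    return [[sum(m[r][k] * n[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


# Hypothetical off-axis projection, values chosen only for the example
proj = [
    [1.0, 0.0, -0.2, 0.0],
    [0.0, 1.1, 0.05, 0.0],
    [0.0, 0.0, -1.002, -0.2002],
    [0.0, 0.0, -1.0, 0.0],
]

inv = perspective_inverse(proj)
inv_flipped = perspective_inverse(flip_y(proj))

changed = [(r, c) for r in range(4) for c in range(4)
           if abs(inv[r][c] - inv_flipped[r][c]) > 1e-12]
print("elements that differ:", changed)
print("inv[1][1] =", inv[1][1], " inv_flipped[1][1] =", inv_flipped[1][1])
```

The off-axis term in the same row of the inverse is unchanged because the flip negates both its numerator and its denominator.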

In any case, I hope this breakdown serves as a good resource to anyone struggling with this issue moving forward.

Cheers

==

Last edited: Jul 13, 2017
4. hrgchris

Joined:
Oct 26, 2016
Posts:
19
Hey - just thought I'd add to this. Thanks to the amazing work by the guys above I got it all working eventually, though I had to tweak some things to get it working in the latest Unity, and also to support running with/without VR and with/without single pass. So, here's a git repo with all of that in it: https://github.com/chriscummings100/worldspaceposteffect. Again - thanks for all your help guys!

5. equalsequals

Glad to hear I could help in some way. In 2017.x the introduction of RenderTextureDescriptor made things work almost out of the box, provided you're using some intermediate render target. You can actually get the main framebuffer's VR (or XR, depending on the 2017.x version) descriptor via VRSettings/XRSettings.

Code (csharp):
RenderTextureDescriptor desc;
if (XRSettings.enabled)
    desc = XRSettings.eyeTextureDesc;
else
    desc = new RenderTextureDescriptor(Screen.width, Screen.height); // Not XR
With this context, single/multi pass is irrelevant because the Blit actually happens in 2 separate draw calls (this was one of the bugs that caused a need for this post in the first place). With unity_StereoEyeIndex populated correctly, you should be able to do more of the work in the vertex Shader for better performance.

Cheers

==

Last edited: Mar 15, 2018
6. Ceyl

Joined:
Sep 9, 2015
Posts:
9
Hi there,

Thanks for this post, it helped me a lot in converting one of my post-processing effects to VR Single-Pass. But @equalsequals, I don't really understand how to use the RenderTextureDescriptor you describe. Should I create a new texture with this descriptor and blit with it in OnRenderImage? What does it change in the shader implementation?

Cheers,

7. equalsequals

RenderTextureDescriptor can be used as the parameter for a RenderTexture's constructor. In the case of Blit and Single-Pass Stereo, the vrUsage will dictate whether or not the Blit is issued correctly. This property is what Unity uses to provide context to the renderer. If a RenderTexture is meant to be for VR output, it should be created this way.

My initial solve for this was before the introduction of RenderTextureDescriptor in 2017, so what was happening was that the Blit was done in a single quad that spanned both sides of the 'double-wide' RenderTexture, when it should be 2 quads. With 2 quads, the majority of the calculations I was forced into doing per-fragment can be done on a per-vertex basis instead, which is much better for performance.

8. neitron

Joined:
Sep 14, 2014
Posts:
14
Hi! I really appreciate your work, which is great. I have a somewhat similar issue.

I'm working on an FX that shows a hot edge on Unity terrain where the terrain intersects a lava mesh. I wrote a custom shader for the terrain material:
1. It takes a depth render texture of the lava, which is rendered by an extra camera.
2. It uses the built-in _CameraDepthTexture to look for matching depth values in _LavaDepthRenderTexture.
3. It draws the terrain plus an emissive edge at the intersection.

Roughly, the main cginc part is below:
Code (CSharp):
fixed4 _EdgeTint;
fixed _EdgeTintPower;
fixed _EdgePow;
fixed _EdgeThickness;


fixed4 lavaEdge(float4 screenPos, float3 worldPos)
{
    float2 screenUV = screenPos.xy / screenPos.w;

    fixed screenDepth = DecodeFloatRG(tex2D(_CameraDepthTexture, screenUV));

#if UNITY_SINGLE_PASS_STEREO
    // If Single-Pass Stereo mode is active, transform the
    // coordinates to get the correct output UV for the current eye.
    float4 scaleOffset = unity_StereoScaleOffset[unity_StereoEyeIndex];
    screenUV = (screenUV - scaleOffset.zw) / scaleOffset.xy;

#endif

    fixed diff = screenDepth - lavaDepth;
    fixed intersect = 0.0f;

    if (diff > 0.0f)
    {
        fixed factor = diff / _EdgeThickness / lavaDepth * 0.01;
        intersect = 1.0f - smoothstep(0.0f, _ProjectionParams.w, factor);
    }

    _EdgeTint.rgb *= _EdgeTintPower;

    return lerp(fixed4(0.0, 0.0, 0.0, 0.0), _EdgeTint, pow(intersect, _EdgePow));
}

It works well, but not in VR (Stereo Single-Pass) (((
The main visible issue appears when I roll my head (around the z axis), but not when I pitch (around the x axis) or yaw (around the y axis). I feel that I'm doing something wrong with the conversion or projection; could you help me, please? And sorry for my terrible English - I hope there is no misunderstanding.

9. BoltScripts

Joined:
Feb 12, 2015
Posts:
20
Was going to post here asking for help on moving the expensive stuff out of the fragment stage and into the vertex stage, but I figured it out while typing my response.
I did try using unity_CameraInvProjection instead of uploading a separate matrix, but it's just not quite right, even though in theory Unity now selects the matrix per eye as shown here:
Code (csharp):
#define unity_CameraInvProjection unity_StereoCameraInvProjection[unity_StereoEyeIndex]
It's pretty close but just gets offset slightly from where it should be based on camera position. I don't really feel like diving into everything trying to figure out why the matrices aren't quite right, and tbh I don't really know enough to do that anyway, but if anyone knows why that doesn't work and wants to share then I would like to know.

Anyway, I did make some other smaller optimizations by using a matrix array rather than separate variables and indexing into it using unity_StereoEyeIndex. I used the concept of the direction vector, or ray: multiply it by depth and add the result to _WorldSpaceCameraPos in the fragment shader.
The direction vector is calculated using the same math as mentioned originally, but substituting the camera's near plane for the depth value. Not entirely sure why the depth needs to be the near plane, but it makes sense in a vague kind of way because of how the depth is calculated later in the fragment stage.
Then just taking the world position from that and subtracting _WorldSpaceCameraPos gives the (miraculously) correct direction vector.

Code (CSharp):
float4x4 stereoMatrices[4];

#define IVIEW_MATRIX stereoMatrices[unity_StereoEyeIndex * 2]
#define IPROJ_MATRIX stereoMatrices[unity_StereoEyeIndex * 2 + 1]

v2f vert(appdata v) {
    v2f o;

    o.pos = v.vertex * float4(2, 2, 1, 1) - float4(1, 1, 0, 0);

    //_ProjectionParams.y is camera near plane
    float4 viewPos = mul(IPROJ_MATRIX, float4(v.uv * 2.0 - 1.0, _ProjectionParams.y, 1));
    viewPos /= viewPos.w;
    float3 wPos = mul(IVIEW_MATRIX, viewPos).xyz;

    o.ray = wPos - _WorldSpaceCameraPos;

    return o;
}

fixed4 frag(v2f i) : SV_Target {

    float zdepth = DECODE_EYEDEPTH(UNITY_SAMPLE_DEPTH(tex2D(_CameraDepthTexture, i.uv)));
    float3 wPos = i.ray * zdepth + _WorldSpaceCameraPos;

    ...

}
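For reference, the ray-times-depth idea can be checked in isolation. The Python sketch below works in eye space only and is a common formulation rather than a transcription of the shader above: it assumes an OpenGL-style perspective matrix, normalizes each ray so that its eye-space Z component is -1 (one unit of eye depth), and then scales by the linear eye depth. The matrix values and the names `view_ray`, `project` and `reconstruct` are hypothetical.

```python
# Hypothetical off-axis perspective matrix (row-major, column vectors)
PROJ = [
    [1.0, 0.0, -0.2, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, -1.002, -0.2002],
    [0.0, 0.0, -1.0, 0.0],
]


def project(p, view_pos):
    """Eye-space point -> NDC x, y (the z term is not needed here)."""
    x, y, z = view_pos
    clip_x = p[0][0] * x + p[0][2] * z
    clip_y = p[1][1] * y + p[1][2] * z
    clip_w = -z
    return clip_x / clip_w, clip_y / clip_w


def view_ray(p, ndc_x, ndc_y):
    """Per-vertex work: eye-space ray through the pixel at unit eye depth."""
    return [(ndc_x + p[0][2]) / p[0][0],
            (ndc_y + p[1][2]) / p[1][1],
            -1.0]


def reconstruct(ray, eye_depth):
    """Per-fragment work: scale the interpolated ray by linear eye depth."""
    return [r * eye_depth for r in ray]


point = [0.5, -0.25, -3.0]
ndc_x, ndc_y = project(PROJ, point)
ray = view_ray(PROJ, ndc_x, ndc_y)
print(reconstruct(ray, -point[2]))  # should match `point` up to rounding
```

Because the ray is an affine function of the NDC coordinates, it interpolates linearly across the quad, which is what makes computing it per vertex valid. Transforming the result to world space would still require the per-eye inverse view matrix.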