[Help] Render Streaming AR

Discussion in 'Unity Render Streaming' started by jwgsbbc, Aug 9, 2021.

  1. jwgsbbc

    jwgsbbc

    Joined:
    Mar 17, 2021
    Posts:
    7
    We're currently using the Render Streaming package in a prototype AR application, but have resorted to chroma-key to handle the transparency (so the user can see the camera feed behind the streamed render).

I am looking for a solution that allows the alpha channel of the frame buffer to be transmitted from a Windows machine (with an NVIDIA card) to the client iPhone.

I can see that the WebRTC package currently uses NVIDIA's codec SDK v9.1, which I don't think supports alpha encoding (whereas v11.1 seems to, via HEVC). I suspect this may be a blocker to using the NVIDIA card's hardware encoder for translucent pixels.

    Is there a solution that I'm missing?

    Thanks.
     
  2. kazuki_unity729

    kazuki_unity729

    Unity Technologies

    Joined:
    Aug 2, 2018
    Posts:
    803
It is an interesting challenge for me.
My idea is to combine the alpha channel into the color texture before streaming, and to composite the received texture using a shader.

    This idea is based on the AR demo made by keijiro.
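
For example, the packing step on the sender side could be as simple as a Blit into a double-wide target with a packing material. This is just a rough sketch, not code from keijiro's demo; the material and render textures are placeholders:

Code (CSharp):

using UnityEngine;

// Rough sketch only: pack colour + alpha into one opaque frame before streaming.
// "packMaterial" is a placeholder material whose shader writes colour to the
// left half and alpha-as-greyscale to the right half of the target.
public class AlphaPackBlit : MonoBehaviour
{
    public RenderTexture cameraOutput;   // render target with a valid alpha channel
    public RenderTexture streamTarget;   // double-wide texture handed to Render Streaming
    public Material packMaterial;        // placeholder packing material

    void LateUpdate()
    {
        Graphics.Blit(cameraOutput, streamTarget, packMaterial);
    }
}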
     
    hippocoder and jwgsbbc like this.
  3. cloudending

    cloudending

    Joined:
    Mar 25, 2019
    Posts:
    15
Do you have a workaround for this?
     
    jwgsbbc likes this.
  4. jwgsbbc

    jwgsbbc

    Joined:
    Mar 17, 2021
    Posts:
    7
Our workaround was as described by @kazuki_unity729, although I've only just seen their response today.

    We split the rendered frame in two - left half for colour, right half for alpha stored as greyscale.

To allow the existing render buffer's alpha channel to be stored in one half of the output, a custom post-process was implemented.

    https://docs.unity3d.com/Manual/PostProcessingOverview.html

The Unity tutorial on custom post-processing for the HDRP (High Definition Render Pipeline) was followed to get an example set up:

    https://docs.unity3d.com/Packages/c...efinition@7.1/manual/Custom-Post-Process.html
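
For context, the C# side from that tutorial pattern looks roughly like the sketch below (the class name, shader path and menu entry are placeholders, not our actual code); the fragment shader further down is what this component ends up running:

Code (CSharp):

using System;
using UnityEngine;
using UnityEngine.Rendering;
using UnityEngine.Rendering.HighDefinition;

// Sketch of an HDRP custom post-process component following the tutorial pattern.
// Names here ("ColourAlphaPack", the shader path) are placeholders.
[Serializable, VolumeComponentMenu("Post-processing/Custom/ColourAlphaPack")]
public sealed class ColourAlphaPack : CustomPostProcessVolumeComponent, IPostProcessComponent
{
    public BoolParameter enable = new BoolParameter(false);

    Material m_Material;

    public bool IsActive() => m_Material != null && enable.value;

    // Run after the built-in post-processing so we pack the final image.
    public override CustomPostProcessInjectionPoint injectionPoint =>
        CustomPostProcessInjectionPoint.AfterPostProcess;

    public override void Setup()
    {
        var shader = Shader.Find("Hidden/Shader/ColourAlphaPack");
        if (shader != null)
            m_Material = new Material(shader);
    }

    public override void Render(CommandBuffer cmd, HDCamera camera, RTHandle source, RTHandle destination)
    {
        if (m_Material == null)
            return;
        // Bind the source so the fragment shader can LOAD_TEXTURE2D_X(_InputTexture, ...).
        m_Material.SetTexture("_InputTexture", source);
        HDUtils.DrawFullScreen(cmd, m_Material, destination);
    }

    public override void Cleanup() => CoreUtils.Destroy(m_Material);
}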

By default the HD render pipeline uses a compressed 32-bit frame buffer for post-processing, with 11 bits each for the red and green channels and 10 bits for the blue channel. Unfortunately it has no alpha channel, which meant that reading or writing the alpha channel during the post-processing pass had no effect on the result.

The solution was to change the post-processing buffer format to an alpha-supporting (but twice-the-size, 64-bit) format. Obviously this means the renderer uses more memory, but that seems like a reasonable trade-off.

    See this forum thread for more details:

    https://forum.unity.com/threads/alpha-in-render-texture-must-be-alpha.746405/
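
For illustration of the two formats involved (this sketch is not the change itself, which is described in the thread above): the default buffer is 32-bit R11G11B10 with no alpha, while a 64-bit RGBA half-float target keeps the alpha channel.

Code (CSharp):

using UnityEngine;

public static class AlphaCapableTarget
{
    // Illustration only: ARGBHalf is 16 bits per channel (RGBA), 64 bits per
    // pixel, so the alpha written by the renderer survives, unlike the default
    // R11G11B10 post-processing buffer which has no alpha channel at all.
    public static RenderTexture Create(int width, int height)
    {
        var rt = new RenderTexture(width, height, 24, RenderTextureFormat.ARGBHalf);
        rt.Create();
        return rt;
    }
}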



Once the alpha channel was accessible, the existing greyscale post-processing effect (from the Unity tutorial) was rewritten to squeeze the colour into the left side of the image and write the alpha into the right side. Below is the relevant fragment shader code:

Code (CSharp):

float4 CustomPostProcess(Varyings input) : SV_Target
{
    UNITY_SETUP_STEREO_EYE_INDEX_POST_VERTEX(input);

    // Stretch the coordinates horizontally so each output half samples the full source.
    float2 squishedTexC = input.texcoord * float2(2.0f, 1.0f);
    uint2 positionSS = fmod(squishedTexC, float2(1.0f, 1.0f)) * _ScreenSize.xy;
    float4 srcColour = LOAD_TEXTURE2D_X(_InputTexture, positionSS).rgba;

    float4 colour = float4(srcColour.rgb, 1.0f);
    float4 alpha = float4(srcColour.a, srcColour.a, srcColour.a, 1.0f); // alpha as greyscale

    // avoiding branching:
    // h = 0 when x < 1 (colour half)
    // h = 1 when x >= 1 (alpha half)
    float h = clamp((float)sign(squishedTexC.x - 0.9999f), 0.0f, 1.0f);
    return lerp(colour, alpha, h);
}
    And for decoding on the client side:

Code (CSharp):

fixed4 frag(v2f IN) : SV_Target
{
    // Read the colour from the left side
    float2 colorTexCoord = IN.texcoord * float2(0.5f, 1.0f);
    float3 feedColour = tex2D(_MainTex, colorTexCoord).rgb;

    // ... and the alpha from the right side.
    // Use the luminance of the sample for the alpha, since that should be the
    // highest-resolution component if the video compression used chroma
    // sub-sampling, e.g. YUV 4:2:2.
    float2 alphaTexCoord = colorTexCoord + float2(0.5f, 0.0f);
    float feedAlpha = Luminance(tex2D(_MainTex, alphaTexCoord).rgb);

    // The buffer from the renderer contains the blended colour
    //     (1 - feedAlpha) * c0 + feedAlpha * c1
    // whereas we actually want c1 (to blend using feedAlpha with the
    // feed from the camera).
    // If we presume that the transparent areas (c0) are BLACK, then the
    // colour from the feed is feedAlpha * c1, so c1 = feedColour / feedAlpha.
    // (max() guards against divide-by-zero in fully transparent areas.)
    float3 colour = feedColour / max(feedAlpha, 1e-5f);
    return float4(colour, feedAlpha);
}
    A more efficient way of doing this could be to use a blend mode on the renderer that doesn't do any alpha blending, but this would have implications for the rendering of translucent objects overlapping other objects.
     
  5. jwgsbbc

    jwgsbbc

    Joined:
    Mar 17, 2021
    Posts:
    7
    Sorry, I've only just seen your reply to my question, many thanks.
     
  6. cloudending

    cloudending

    Joined:
    Mar 25, 2019
    Posts:
    15
Thanks a lot. And I have another question: how do you synchronize the camera image from the client with the rendered image from the server?
     
  7. jwgsbbc

    jwgsbbc

    Joined:
    Mar 17, 2021
    Posts:
    7
    Currently, we do not - latency affects the quality of experience, potentially drastically.
There are some latency-mitigation strategies that we've briefly looked at, similar to those used in HMDs: Timewarp, Spacewarp, etc. Some of these require more info from the renderer though: depth, motion vectors.
A naive approach, if reducing latency isn't the priority, is to delay the camera feed. This looks possible, but since there's currently no way to add metadata to the render stream (see: https://github.com/Unity-Technologies/com.unity.webrtc/issues/305) we're looking at encoding the sync time in the pixels, i.e. sending the pose time from the client to the renderer, encoding that in the rendered image, and decoding it on the client, then choosing the frame from the camera that corresponds to that time. Currently, for some reason, it doesn't work though :(
     
    Last edited: Nov 17, 2021
  8. cloudending

    cloudending

    Joined:
    Mar 25, 2019
    Posts:
    15
See https://groups.google.com/g/discuss-webrtc/c/npYIyxSBOLI. Francesco Pretto added sender_ntp_time_ms to VideoFrame. It may help with synchronization. But as you say, network latency affects the quality of experience a lot.
     
  9. jwgsbbc

    jwgsbbc

    Joined:
    Mar 17, 2021
    Posts:
    7
I didn't follow all of that thread, and it's perhaps less relevant to us (currently) since we're dealing with a Unity->Unity client/server setup. If I understand correctly though, it sounds like there's some work to allow the rendered-video-source (renderer) wall-clock time to be added to the frame metadata, which could be useful. But that would still only give an estimate of the round-trip, "motion to photons" time, since one could only estimate the difference between the wall-clock time of the renderer and that of the client. For our use case at least, I think we need to be able to send the user-input time (motion time) from the client to the renderer and then back to the client, so the client can measure the round-trip time itself. This presumably means we need to add arbitrary data into the frames.
     
  10. cloudending

    cloudending

    Joined:
    Mar 25, 2019
    Posts:
    15
Modifying the WebRTC source code would be incompatible with the web, but insertable streams can be used to add custom data. I think this is a way to achieve synchronization.
     
  11. jwgsbbc

    jwgsbbc

    Joined:
    Mar 17, 2021
    Posts:
    7
We got an encode-motion-time-in-pixels proof of concept working, and it works surprisingly well. In detail: we took the client frameCount (actually just the least significant 8 bits), sent it through to the renderer, and encoded it as 8 black/white (0/1) blocks along the bottom of the streamed frame. The corresponding camera frame is then shown when the tagged render frame reaches the client. We're seeing ~150ms of latency, but it's not tooooooo bad with the camera feed synced up.
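
In case it's useful to anyone, the client-side decode of those blocks is roughly the following (a sketch rather than our exact code; it assumes you have a CPU-readable copy of the received frame):

Code (CSharp):

using UnityEngine;

public static class FrameTagDecoder
{
    // Sketch: read the 8 black/white blocks along the bottom edge of the received
    // frame and reconstruct the least-significant 8 bits of the client frameCount
    // that were encoded on the renderer.
    public static int DecodeFrameCountBits(Texture2D frame, int blockCount = 8)
    {
        int value = 0;
        int blockWidth = frame.width / blockCount;
        int y = 1; // sample just above the bottom edge of the frame

        for (int i = 0; i < blockCount; i++)
        {
            // Sample the centre of each block; treat bright as a 1 bit, dark as 0.
            int x = i * blockWidth + blockWidth / 2;
            Color c = frame.GetPixel(x, y);
            float luminance = 0.2126f * c.r + 0.7152f * c.g + 0.0722f * c.b;
            if (luminance > 0.5f)
                value |= 1 << i;
        }

        // Compare against the client's recent (Time.frameCount & 0xFF) history to
        // pick the matching camera frame.
        return value;
    }
}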
     
  12. kazuki_unity729

    kazuki_unity729

    Unity Technologies

    Joined:
    Aug 2, 2018
    Posts:
    803
    cloudending and jwgsbbc like this.
  13. jwgsbbc

    jwgsbbc

    Joined:
    Mar 17, 2021
    Posts:
    7
Our current solution, which is simpler, is working pretty well, but this looks really interesting. Thanks.
     
  14. cloudending

    cloudending

    Joined:
    Mar 25, 2019
    Posts:
    15
This project seems like the solution jwgsbbc proposed. It is a good way to sync the rendered frame and the camera frame.