how much does a tex2D/texCUBE cost?

Discussion in 'Shaders' started by Lulucifer, May 16, 2013.

  1. Lulucifer

    Joined:
    Jul 8, 2012
    Posts:
    358
    Does anyone have a rough idea?
     
  2. RC-1290

    Joined:
    Jul 2, 2012
    Posts:
    639
    It really depends on what kind of GPU you're targeting, and on whether the texture coordinates are modified in the shader based on input data.

    For example, page 36 of these slides might be interesting to you if you're developing for the ImgTec SGX GPU.
     
    Last edited: May 16, 2013
  3. Aras

    Unity Technologies

    Joined:
    Nov 7, 2005
    Posts:
    4,770
    What RC said.

    On mobile, a texture sample probably costs about as much as "several to several dozen simple math operations".
    On PC, the ratio is higher: for one texture sample you can afford dozens to hundreds of math operations.

    And then there's funky hardware like PowerVR (e.g. everything in iOS), where several "non-dependent" texture samples can be thought of as free, but dependent ones cost quite a lot more.
     
  4. Lulucifer

    Joined:
    Jul 8, 2012
    Posts:
    358
    On iOS (PowerVR), "non-dependent" texture samples are nearly free and dependent ones cost a lot more.

    What does a "dependent" texture sample mean?
    Something like this: tex2D(_MainTex, uv + float2(0.1, 0.1))?
     
  5. VIC20

    Joined:
    Jan 19, 2008
    Posts:
    2,688
    Yes, this is a dependent one. tex2D(_MainTex, uv.zw) is also a dependent one.

    A dependent texture read is a texture read in which the texture coordinates depend on some calculation within the (fragment) shader instead of on a varying. As the values of this calculation cannot be known ahead of time it is not possible to pre-fetch texture data and so stalls in shader processing occur.
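
    As a rough illustration of that distinction, here is a minimal Cg sketch meant to sit inside a CGPROGRAM ... ENDCG block. The struct names, the uvOffset interpolator and the two fragment function names are assumptions made for this example, and UnityObjectToClipPos assumes a reasonably recent UnityCG.cginc (older versions would use mul(UNITY_MATRIX_MVP, v.vertex) instead). The idea is simply to move the coordinate math into the vertex shader so that the fragment shader samples with an unmodified varying.

    ```hlsl
    // Sketch only: names below are illustrative, not from the thread.
    #pragma vertex vert
    #pragma fragment fragNonDependent   // swap in fragDependent to compare
    #include "UnityCG.cginc"

    sampler2D _MainTex;

    struct appdata {
        float4 vertex   : POSITION;
        float2 texcoord : TEXCOORD0;
    };

    struct v2f {
        float4 pos      : SV_POSITION;
        float2 uv       : TEXCOORD0; // plain UV
        float2 uvOffset : TEXCOORD1; // offset UV, computed once per vertex (assumed name)
    };

    v2f vert(appdata v) {
        v2f o;
        o.pos = UnityObjectToClipPos(v.vertex);
        o.uv = v.texcoord;
        // The offset is applied here, before interpolation.
        o.uvOffset = v.texcoord + float2(0.1, 0.1);
        return o;
    }

    // Dependent read: the coordinate is computed inside the fragment shader.
    fixed4 fragDependent(v2f i) : SV_Target {
        return tex2D(_MainTex, i.uv + float2(0.1, 0.1));
    }

    // Non-dependent read: the coordinate arrives unmodified from the interpolator.
    fixed4 fragNonDependent(v2f i) : SV_Target {
        return tex2D(_MainTex, i.uvOffset);
    }
    ```

    Both fragment functions sample the same texel for a constant offset like this one; whether the second form is actually cheaper depends on the target GPU, so it is worth profiling on the device.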


    If your shader requires more than one dependent texture read:

    When a dependent texture read occurs the current thread running on the USSE within the SGX is suspended until the comparatively slow read is completed, and another thread is swapped in. By re-arranging the code within the shader so that the two dependent texture reads were back-to-back, another performance improvement was gained. This works because the compiler can now batch these dependent texture reads together, whereas previously the shader looked as follows:

    • Mathematics
    • Texture Read
    • Wait
    • Mathematics
    • Texture Read
    • Wait

    It now looks as follows:

    • Mathematics
    • Texture Read
    • Texture Read
    • Wait

    This effectively hides some of the wait time of the first texture read behind the execution time of the second texture read. Unfortunately a wait of some description is inevitable due to the limitations of dependent texture reads. Nonetheless, the strong parallelism of the PowerVR SGX hardware ensures that these waits can be used effectively by other threads.



    ^ But don't forget to check the code in the compiled shader and/or use #pragma glsl_no_optimize when you use more than one dependent texture read.
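
    The reordering described in the quoted passage might look roughly like the Cg sketch below. The helper function names, the _DetailTex sampler and the particular coordinate math are all hypothetical, chosen only to show the two orderings side by side. The shader compiler is free to reorder these instructions itself, so the source order is a hint rather than a guarantee.

    ```hlsl
    // Sketch only: hypothetical helpers showing two orderings of the same work.
    sampler2D _MainTex;
    sampler2D _DetailTex; // assumed second texture, not from the thread

    // Interleaved: math, read, (wait), math, read, (wait)
    fixed4 SampleInterleaved(float2 uv) {
        float2 uvA = uv * 2.0 + float2(0.1, 0.1);
        fixed4 a = tex2D(_MainTex, uvA);      // first dependent read
        float2 uvB = uv * 4.0 - float2(0.2, 0.2);
        fixed4 b = tex2D(_DetailTex, uvB);    // second dependent read
        return a * b;
    }

    // Batched: all math first, then both reads back-to-back, then one wait
    fixed4 SampleBatched(float2 uv) {
        float2 uvA = uv * 2.0 + float2(0.1, 0.1);
        float2 uvB = uv * 4.0 - float2(0.2, 0.2);
        fixed4 a = tex2D(_MainTex, uvA);      // dependent reads issued together
        fixed4 b = tex2D(_DetailTex, uvB);
        return a * b;
    }
    ```

    Both helpers return the same value; only the order in which the reads are issued differs, which is why inspecting the compiled shader is the way to confirm what actually reaches the GPU.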
     
    Last edited: May 19, 2013
  6. Lulucifer

    Joined:
    Jul 8, 2012
    Posts:
    358
    Thanks, all of you guys :D