Help Wanted Runtime texture compression using compute shader

Discussion in 'General Graphics' started by joshuacwilde, Dec 31, 2020.

  1. joshuacwilde

    joshuacwilde

    Joined:
    Feb 4, 2018
    Posts:
    269
    I am doing some runtime texture generation, and I've been trying to think about how I can cut down memory costs (doing virtual terrain texturing). This is for mobile btw. What I am thinking is that if I could somehow compress the render textures to ETC2 in a compute shader, I could keep everything on the GPU and save on memory. Has anyone had any experience with this, or anything related? Just trying to think of any solutions to save on memory, as it would probably save us at least 50 MB if we could compress to ETC2 at runtime.
     
  2. bgolus

    bgolus

    Joined:
    Dec 7, 2012
    Posts:
    10,103
    ETC2 is fairly slow to compress. Several seconds to minutes per texture, and that’s on desktop GPUs. The fastest compute shader based compressors for it are ... faster ... but can still take seconds per texture, again, on desktop GPUs.

    There was a paper released just a few months ago, QuickETC2, that proposed a method to bring ETC2 compression down to potentially less than a millisecond. It’s plausible you could implement that in C# yourself.
    https://github.com/wolfpld/etcpak/commit/da85020e690890f4356d42ab5802e4f957f220fd

    Or try implementing the techniques there in a compute shader yourself.
    https://github.com/darksylinc/betsy

    Alternatively there are existing ETC1 real time compression assets.
    https://assetstore.unity.com/packages/tools/realtime-texture-compression-for-android-etc1-7724
     
    Noisecrime likes this.
  3. Aras

    Aras

    Unity Technologies

    Joined:
    Nov 7, 2005
    Posts:
    4,699
    I'm playing with compute shader compression myself right now (though with the BC7 format, not ETC2), and so far it's in the "not really that much faster than doing it on the CPU, but a world of pain in shader compilers, precision mismatches, GPU driver issues, etc." category.

    Runtime compression to ETC/ETC2 is actually there in latest Unity versions (i.e. via Texture2D.Compress), but that's only starting with 2021.1. Underneath, that runtime compression is the same compressor that bgolus linked to above ("etcpak", just not the QuickETC2 branch yet).
     
    Noisecrime, ekakiya and bgolus like this.
  4. joshuacwilde

    joshuacwilde

    Joined:
    Feb 4, 2018
    Posts:
    269
    I was actually looking at Betsy. It's like 5 passes just for ETC2. I think ETC2 has the most passes. Mostly I want to do ETC2 because it is supported on both iOS and Android, and getting one type of compression to work would be enough of a headache on its own for me. I actually hadn't seen QuickETC2, I will definitely take a look at that. Yeah I saw the ETC1 asset, but I would need ETC2, unless I wanted to use 3 textures instead of 2. I am packing color, normals, and roughness into 2 textures right now.

    Ideally I can just get texture size in memory down at least a bit. I am using 50 512x512 tiles as well as a 4k basemap for the terrain in a quadtree right now, at 2 textures per tile and basemap. I am thinking that's about the minimum I am gonna get away with. I will definitely need a bit more than that, maybe for additional tiles, definitely for more terrain textures. Uncompressed, and together with the other textures I am using for the terrain, it's putting me at about 200 MB in runtime memory :/ Especially for bandwidth-limited mobile devices, it isn't great, although I can probably make it happen if I make sacrifices elsewhere. It's too bad that even on 3 GB iOS devices, only about 1.2 GB of that is usable in game.
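The budget described above can be roughed out with a short calculation. This is only a sketch under stated assumptions: the uncompressed format is taken to be 32-bit RGBA, "4k" is taken to mean 4096x4096, and mipmaps are ignored, none of which the post confirms. The ETC2 sizes follow from the block layout (64 bits per 4x4 block for RGB, 128 bits per 4x4 block for RGBA8). The function and parameter names are made up for illustration.

```python
# Rough terrain texture budget. Formats, sizes and counts are assumptions,
# not figures confirmed by the thread.
BYTES_PER_PIXEL = {
    "RGBA32": 4.0,      # uncompressed, 8 bits per channel
    "ETC2_RGBA8": 1.0,  # 128 bits per 4x4 block
    "ETC2_RGB": 0.5,    # 64 bits per 4x4 block
}

MIB = 1024 * 1024


def texture_bytes(width, height, fmt):
    """Size of a single mip level 0 texture in bytes (no mip chain)."""
    return width * height * BYTES_PER_PIXEL[fmt]


def terrain_bytes(fmt, tiles=50, tile_size=512, basemap_size=4096,
                  textures_per_surface=2):
    """Total for all tiles plus the basemap, each stored as N textures."""
    tile_total = tiles * textures_per_surface * texture_bytes(
        tile_size, tile_size, fmt)
    basemap_total = textures_per_surface * texture_bytes(
        basemap_size, basemap_size, fmt)
    return tile_total + basemap_total


for fmt in BYTES_PER_PIXEL:
    print(f"{fmt}: {terrain_bytes(fmt) / MIB:.1f} MiB")
```

Under these assumptions the uncompressed set comes to 228 MiB, against 57 MiB for ETC2 RGBA8 and 28.5 MiB for ETC2 RGB. The uncompressed figure is in the same range as the roughly 200 MB quoted in the post, and the gap is expected, since the real formats and basemap size may differ from the assumed ones.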
     
  5. joshuacwilde

    joshuacwilde

    Joined:
    Feb 4, 2018
    Posts:
    269
    Ahh interesting!! I remember that before, Texture2D.Compress() only worked with DXT. So are there additional options now, or how does that work? And do you have any speed benchmarks or any notes there? The only downside there is having to pull the texture from the GPU to the CPU, compress it, then send it back. And I imagine the texture compression blocks the calling thread?
     
  6. joshuacwilde

    joshuacwilde

    Joined:
    Feb 4, 2018
    Posts:
    269
    EDIT :
    Oh wow, QuickETC2 is realllly new!
     
  7. joshuacwilde

    joshuacwilde

    Joined:
    Feb 4, 2018
    Posts:
    269
    Are Texture2D.LoadImage() and ImageConversion.LoadImage() both blocking? Since this is for the terrain texture updating, I really need to move as much work as possible off the main thread.
     
  8. joshuacwilde

    joshuacwilde

    Joined:
    Feb 4, 2018
    Posts:
    269
    Also, terrain virtual texturing for reference. Still some bugs to fix there...
     

    Attached Files:

  9. bgolus

    bgolus

    Joined:
    Dec 7, 2012
    Posts:
    10,103
    When I said a few months, I really did mean a few months. September is when I think it was first published, and it's not yet been merged into the main branch of etcpak or any other ETC2 encoders that I know of. I'm assuming the quality isn't great compared to more traditional ETC2 encoders, similar to how the real time DXT1/5 encoder is nowhere near the quality of the editor encoder. Plus you're going to be recompressing already compressed assets, so that's a whole extra level of badness.
     
  10. wolfpld

    wolfpld

    Joined:
    Jan 4, 2021
    Posts:
    1
    The QuickETC2 etcpak branch achieves two things:

    1. Vanilla etcpak performs ETC2 compression by encoding each 4x4 block using both ETC1 and planar compression. The error metrics are then compared and the better one is used. QuickETC2 introduces a heuristic selector, so that only one selected encoding is performed for each block. This makes the encoding ~2x quicker at generally the same image quality. Obviously, there will be cases where the heuristic is wrong.

    2. QuickETC2 also introduces the missing T and H block encoding modes, which greatly improve the resulting image quality, at generally the same speed as vanilla etcpak ETC2 mode (often faster).

    The image quality comparison is presented in figures and tables at https://www.semanticscholar.org/pap...-Nah/e33e9f0eb2b4033e6097c0096902f163de80cb2e

    The changes on the branch are provided as a reference implementation by the paper's author and will eventually be used as a base for the mainline scalar/SSE/AVX/NEON implementation.
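The difference in control flow described in point 1 above can be illustrated with a toy model. This is not etcpak code and not the selector from the paper: the two "encoders" are simplified stand-ins working on a single-channel 4x4 block (a mean fill in place of ETC1, a least-squares plane in place of planar mode), and the range test with its threshold is a hypothetical heuristic chosen only to show the idea.

```python
def flat_fit(block):
    """Stand-in for the ETC1 path: approximate the block by its mean."""
    mean = sum(block) / 16
    return [mean] * 16


def planar_fit(block):
    """Stand-in for planar mode: least-squares plane over the 4x4 grid."""
    mean = sum(block) / 16
    # Centred coordinates run from -1.5 to 1.5; their squares sum to 20.
    gx = sum((i % 4 - 1.5) * v for i, v in enumerate(block)) / 20
    gy = sum((i // 4 - 1.5) * v for i, v in enumerate(block)) / 20
    return [mean + gx * (i % 4 - 1.5) + gy * (i // 4 - 1.5)
            for i in range(16)]


def error(block, approx):
    """Sum of squared differences between the block and its approximation."""
    return sum((a - b) ** 2 for a, b in zip(block, approx))


def encode_exhaustive(block):
    """Vanilla-style flow: run both encoders, keep the lower-error result."""
    candidates = [("flat", flat_fit(block)), ("planar", planar_fit(block))]
    return min(candidates, key=lambda c: error(block, c[1]))


def encode_heuristic(block, threshold=64):
    """Selector-style flow: choose one mode up front and encode only once.

    The value-range test and threshold are hypothetical, for illustration.
    """
    if max(block) - min(block) <= threshold:
        return ("planar", planar_fit(block))
    return ("flat", flat_fit(block))


gentle = [8 * (i % 4) + 4 * (i // 4) for i in range(16)]  # smooth gradient
steep = [40 * (i % 4) for i in range(16)]                 # high-range gradient

for name, block in (("gentle", gentle), ("steep", steep)):
    best_mode, best = encode_exhaustive(block)
    quick_mode, quick = encode_heuristic(block)
    print(name, best_mode, error(block, best), quick_mode, error(block, quick))
```

On the gentle gradient both paths pick the plane fit, but the heuristic path does half the encoding work. On the steep gradient the hypothetical range test picks the wrong mode and the error goes up, which is the kind of miss the post acknowledges.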
     
    Aras, joshuacwilde and bgolus like this.
  11. joshuacwilde

    joshuacwilde

    Joined:
    Feb 4, 2018
    Posts:
    269
    @Aras ?
     
  12. joshuacwilde

    joshuacwilde

    Joined:
    Feb 4, 2018
    Posts:
    269
    Really appreciate the detailed response here. Very helpful.
    Do you have a rough idea of how etcpak for ETC2 with NEON would compare to QuickETC2 with plain C++? Given that QuickETC2 doesn't have a NEON implementation yet.

    I am looking for a solution that I could implement asap, and trying to figure out what makes the most sense right now.
     
  13. Aras

    Aras

    Unity Technologies

    Joined:
    Nov 7, 2005
    Posts:
    4,699
    I don't actually know, just know that runtime compression recently got ETC support, since I remember seeing that work being done.
     
  14. Noisecrime

    Noisecrime

    Joined:
    Apr 7, 2010
    Posts:
    1,604
    This all sounds pretty cool. My reply here is mostly to remind myself to investigate this further (especially to test the 2021 feature) and to keep up with the thread.

    Sadly I can't offer anything more in terms of implementing this on the GPU, but it's something I would be interested in, since I have an app that saves user designs and has to create a thumbnail (512x512) for each of them. When viewing all of a user's saved designs, these can easily eat up memory when uncompressed.

    For that app, many years ago, I did write (or rather converted) a PVRTC compression algorithm using C#. The source was particularly badly implemented (maybe for ease of learning), taking hundreds of MB and tens of seconds to convert just a 512 square texture. I was able to massively reduce both, but it was still too slow to use, as it had to save the design and thumbnail when the user switched apps, for which the OS gives you limited time. In the end it was mostly a 'for fun' project, since its use would be limited to iOS, so I never did anything more with it.

    Your post reminded me of this, and made me wonder whether, with the advancements in mobile devices, a new common format would be viable. From my quick research, ETC on iOS is supported from the A8 core and up, and I assume on Android by any OpenGL ES 3.0 device. So that should cover most devices from the last 4 years at least, which is around my cut-off for supporting the apps I make for clients, so a good fit.
     
  15. joshuacwilde

    joshuacwilde

    Joined:
    Feb 4, 2018
    Posts:
    269
    Last edited: Feb 26, 2021
  16. florianpenzkofer

    florianpenzkofer

    Unity Technologies

    Joined:
    Sep 2, 2014
    Posts:
    335