Question: Why doesn't writing to a RWStructuredBuffer this way work?

Discussion in 'Shaders' started by adamski0402, Sep 4, 2022.

  1. adamski0402

    adamski0402

    Joined:
    Jul 2, 2022
    Posts:
    1
    Hi,
    I am new to HLSL. I've recently created a Mandelbrot set explorer using compute shaders.
    I wanted to be able to zoom indefinitely, so I implemented fixed-point numbers with arbitrary precision (I don't think floating point would benefit me, and fixed point is easier to code). Each number is represented by an array of ints.
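
    The shader's arithmetic is not shown in this thread, so the following is only a minimal sketch of how a number stored as an array of ints can be added limb by limb. It assumes unsigned 32-bit limbs stored least-significant first and a hypothetical compile-time constant FP_PRE standing in for fpPre; the function name FpAddInPlace is also made up for illustration.

    ```hlsl
    // Hypothetical limb count; the post calls this value fpPre.
    #define FP_PRE 4

    // a += b for two fixed-point numbers with FP_PRE limbs each.
    // Assumption: limbs are stored least-significant first, and uint is
    // used so that wrap-around on overflow can be detected by comparison.
    void FpAddInPlace(inout uint a[FP_PRE], uint b[FP_PRE])
    {
        uint carry = 0;
        for (int i = 0; i < FP_PRE; i++)
        {
            uint old = a[i];
            uint partial = old + b[i];                   // may wrap around 2^32
            uint carryOut = (partial < old) ? 1u : 0u;   // wrapped in the first add
            uint sum = partial + carry;
            carryOut |= (sum < partial) ? 1u : 0u;       // wrapped when adding the carry
            a[i] = sum;
            carry = carryOut;                            // feeds the next, more significant limb
        }
    }
    ```

    The carry out of the last limb is dropped here, which is only acceptable if the integer part of the number has enough headroom.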
    To allow viewing the images in real time I wanted each frame to look like this:
    1. read the previous data, if there is any
    2. do a few iterations
    3. generate an image to display
    4. save the data for the next frame

    I am using a RWStructuredBuffer<int> of length screen width * screen height * ints needed per pixel
    to store and retrieve the data.

    To store the data I used a loop:

    for (int l = 0; l < fpPre; l++) {
        _Buffer[idx + CurX + l] = zX[l];
        _Buffer[idx + CurY + l] = zY[l];
    }

    zX and zY are the arrays representing the numbers.
    fpPre is the number of ints per number.
    idx is the base index for a pixel: idx = (id.x + id.y * width) * ints per pixel.
    CurX and CurY are helper variables that mark where the x and y values start within a pixel's slots in the buffer (CurX = 0, CurY = fpPre).

    This single-loop version didn't work: zX wasn't written to the buffer for some of the pixels.
    After some trial and error I changed the code to look like this:

    for (int l = 0; l < fpPre; l++) {
        _Buffer[idx + CurX + l] = zX[l];
    }
    for (int k = 0; k < fpPre; k++) {
        _Buffer[idx + CurY + k] = zY[k];
    }

    I was sure it wouldn't work, since I thought the code does exactly the same thing, but for some reason it does.
    Do you know what the problem with the first version is?
    Have you ever encountered something similar?
    Since everything works as intended in the end, I am only asking because I find this strange and interesting.
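
    For reference, the buffer layout described in this post can be sketched as a small kernel. This is an illustration under stated assumptions, not the original shader: the kernel name SaveState, the uniforms _Width and _Height, the thread group size, and the constant FP_PRE (standing in for fpPre) are all hypothetical. It uses the two-loop store that the post reports as working.

    ```hlsl
    #pragma kernel SaveState

    #define FP_PRE 4                      // ints per number (fpPre); value assumed
    #define INTS_PER_PIXEL (2 * FP_PRE)   // zX followed by zY

    RWStructuredBuffer<int> _Buffer;      // width * height * INTS_PER_PIXEL ints
    uint _Width;                          // hypothetical uniforms set from C#
    uint _Height;

    [numthreads(8, 8, 1)]
    void SaveState(uint3 id : SV_DispatchThreadID)
    {
        // Threads outside the image return early so they cannot touch
        // slots that belong to other pixels.
        if (id.x >= _Width || id.y >= _Height)
            return;

        uint idx  = (id.x + id.y * _Width) * INTS_PER_PIXEL; // base index of this pixel
        uint curX = 0;        // zX occupies [idx, idx + FP_PRE)
        uint curY = FP_PRE;   // zY occupies [idx + FP_PRE, idx + 2 * FP_PRE)

        int zX[FP_PRE];
        int zY[FP_PRE];

        // Step 1: read the state saved by the previous frame.
        for (uint i = 0; i < FP_PRE; i++)
        {
            zX[i] = _Buffer[idx + curX + i];
            zY[i] = _Buffer[idx + curY + i];
        }

        // Steps 2 and 3 (iterating and producing the image) are omitted here.

        // Step 4: save the state for the next frame, one loop per component.
        for (uint l = 0; l < FP_PRE; l++)
        {
            _Buffer[idx + curX + l] = zX[l];
        }
        for (uint k = 0; k < FP_PRE; k++)
        {
            _Buffer[idx + curY + k] = zY[k];
        }
    }
    ```

    With this layout the slots of different pixels never overlap as long as every thread's id lies inside the image, which is why the sketch includes the bounds check.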
     
  2. Invertex

    Invertex

    Joined:
    Nov 7, 2013
    Posts:
    1,546
    From what I can tell, there's no reason the first one should be an issue. This may very well be a compiler bug worth reporting.
     
  3. Trindenberg

    Trindenberg

    Joined:
    Dec 3, 2017
    Posts:
    396
    Either way, the second version is more optimised. In the first one you are accessing MemA 0, MemB 0, MemA 1, MemB 1, etc. In the second you are accessing MemA 0..1..2..3..4, then MemB 0..1..2..3..4, which streamlines memory access. I'm not sure whether GPUs cache the same way CPUs do, though, and I don't know what causes the issue either.
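
    On the caching question: GPUs typically combine the memory accesses that neighbouring threads issue at the same instruction, so the layout across threads tends to matter more than the order of writes within one thread. A sketch of an alternative "limb-major" layout follows, in which the values of all pixels for one limb sit next to each other. The helper name LimbMajorIndex, the uniforms, and FP_PRE are assumptions, and whether this is actually faster would have to be measured.

    ```hlsl
    #define FP_PRE 4                      // ints per number (fpPre); value assumed

    RWStructuredBuffer<int> _Buffer;      // same total length as the per-pixel layout
    uint _Width;                          // hypothetical uniforms set from C#
    uint _Height;

    // Index of limb l of component c (0 = zX, 1 = zY) for a given pixel.
    // Consecutive pixels map to consecutive buffer elements for a fixed (c, l).
    uint LimbMajorIndex(uint pixel, uint c, uint l)
    {
        uint pixelCount = _Width * _Height;
        return (c * FP_PRE + l) * pixelCount + pixel;
    }
    ```

    A store would then read, for example, _Buffer[LimbMajorIndex(id.x + id.y * _Width, 0, l)] = zX[l];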