Audio How does outputSampleRate relate to the frequency range covered by GetSpectrumData?

Discussion in 'Audio & Video' started by sbsmith, Jan 22, 2024.

  1. sbsmith

    I'm starting to play with AudioSource.GetSpectrumData and wanted to make sure that I understand it correctly. I am not an audio person and just have a basic understanding of how things are supposed to work. My main question is about how AudioSettings.outputSampleRate relates to the frequencies covered by AudioSource.GetSpectrumData.

    I understand that human hearing is about 20Hz to 20,000Hz.
    I understand that the output of AudioSettings.outputSampleRate is 48,000Hz on Windows and can be lower on different platforms.

    If the sample rate is 48kHz, does that mean the contents of the samples buffer represent 0-24kHz?

    If I wanted to analyze data in the different audible frequency bands (sub-bass to brilliance), can I grab ranges of indices that correspond to these bands, and are they linearly distributed in the range of 0 to AudioSettings.outputSampleRate / 2?

    ex. If AudioSettings.outputSampleRate = 48kHz, and my sample buffer is size 64, does that mean sample[ 0 ] represents the frequencies of 0-375Hz? And if my sample buffer size is 512, does that mean sample[ 0 ] represents the frequencies of 0-47Hz?

    If I want to get more resolution on the lower frequencies, am I forced to have big buffers collecting unwanted data on the higher frequencies too?

    Thanks for your help!
     
  2. SeventhString

    Unity Technologies

    That's how I understand it, yes. I came to the same realization while toying with projects that use FFTs.

    Something interesting I found (though it doesn't directly answer your question) is the projection of the FFT spectrum onto Mel frequency bands, which appear to track human hearing more closely. https://learn.flucoma.org/reference/melbands/
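For readers who want to see what Mel spacing looks like in numbers, here is a minimal sketch in plain Python (not Unity code). It assumes the widely used conversion `mel = 2595 * log10(1 + hz / 700)`; whether the linked page uses exactly this variant is an assumption, and the function names are hypothetical.

```python
import math

def hz_to_mel(hz):
    # One common Hz -> Mel conversion; other variants exist.
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    # Inverse of hz_to_mel above.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges_hz(low_hz, high_hz, num_bands):
    # Edges are evenly spaced on the Mel scale, then converted back to Hz,
    # so the resulting bands get wider as frequency rises.
    lo, hi = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (hi - lo) / num_bands
    return [mel_to_hz(lo + i * step) for i in range(num_bands + 1)]

if __name__ == "__main__":
    for edge in mel_band_edges_hz(20.0, 20000.0, 8):
        print(round(edge, 1))
```

The bands are narrow at the low end and wide at the high end, which is the opposite of giving every FFT bin the same width in Hz.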
     
  3. sbsmith

    Thanks! Are you able to provide any confirmation on how outputSampleRate correlates to the frequency range covered by GetSpectrumData samples? And is the distribution in the samples array linear?

    (Edit: I just finished reading that Mel band link, and that will come in pretty handy. I didn't know if I was going to average my data, just get the max amplitude, or try some kind of convolution. I'll probably start with this instead. Thanks again.)
     
    Last edited: Jan 23, 2024
  4. SeventhString

    Unity Technologies

    Theoretically, your sampling rate will define the highest frequency you can represent, based on the Nyquist theorem (maxFreq = samplingFreq / 2). So with 48kHz, you should be able to properly represent the hearing range [20Hz, 20kHz].

    Now, when you're using `GetSpectrumData`, the `numSamples` you specify is the "number of FFT bins" (IMHO `numSamples` is a poor choice of variable name). Each bin covers an identical range in Hz (`samplingFreq / 2 / numSamples`). So you're getting a linear representation of a spectrum that we hear non-linearly... but that's how FFT goes!

    48kHz with 1024 bins gives you (48000 / 2 / 1024) ~23.4 Hz per bin. This is a relatively coarse resolution for analyzing low frequencies. The simplest thing you can do is to increase the number of bins. The highest value supported is 8192, which would give a resolution of ~2.9 Hz. While this is significantly better, bear in mind that it will take longer to process. The complexity of the FFT algorithm is O(n log n), which gives roughly a 10x factor between 1024 and 8192.
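The arithmetic above can be checked with a short sketch in plain Python (not Unity code). It assumes, as described in this thread, that the bins evenly cover 0 Hz to half the sample rate, and it uses n * log2(n) only as a rough cost estimate; the function names are hypothetical.

```python
import math

def bin_width_hz(sample_rate_hz, num_bins):
    # Width of one bin, assuming bins evenly cover 0..sample_rate/2.
    return sample_rate_hz / 2 / num_bins

def fft_cost_ratio(small_n, large_n):
    # Relative cost of two FFT sizes under the n*log2(n) estimate.
    return (large_n * math.log2(large_n)) / (small_n * math.log2(small_n))

if __name__ == "__main__":
    # Bin counts mentioned in this thread, at a 48 kHz output sample rate.
    for bins in (64, 512, 1024, 8192):
        print(bins, "bins ->", bin_width_hz(48_000, bins), "Hz per bin")
    print("cost ratio 1024 -> 8192:", fft_cost_ratio(1024, 8192))
```

This reproduces the 375 Hz and ~47 Hz figures from the original question, and the ratio for 1024 versus 8192 bins comes out at about 10.4.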

    Circling back to the question...
    • the sampling rate does determine the FFT range, via Nyquist: the bins cover 0 Hz up to samplingFreq / 2
    • the result of the FFT is an array of evenly (linearly) spaced frequency bins, each covering the same width in Hz
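Since the bins are evenly spaced, picking out an audible band comes down to computing an index range and ignoring the bins outside it. Below is a minimal sketch in plain Python (not Unity code). The band edges are commonly quoted values that differ between sources, so treat them as illustrative. The sketch also follows this thread's simplification that bin i covers i * width to (i + 1) * width, and all names are hypothetical.

```python
import math

# Illustrative band edges in Hz; published values vary.
BANDS = [
    ("sub-bass", 20, 60),
    ("bass", 60, 250),
    ("low mids", 250, 500),
    ("mids", 500, 2000),
    ("upper mids", 2000, 4000),
    ("presence", 4000, 6000),
    ("brilliance", 6000, 20000),
]

def band_bin_range(low_hz, high_hz, sample_rate_hz, num_bins):
    # Return (first, last_exclusive) indices of bins overlapping the band.
    width = sample_rate_hz / 2 / num_bins
    first = min(num_bins - 1, int(low_hz // width))
    last = min(num_bins, math.ceil(high_hz / width))
    return first, max(last, first + 1)

def band_levels(spectrum, sample_rate_hz):
    # Sum the bin values in each band; 'spectrum' stands in for the
    # array a spectrum query would fill.
    levels = {}
    for name, low_hz, high_hz in BANDS:
        a, b = band_bin_range(low_hz, high_hz, sample_rate_hz, len(spectrum))
        levels[name] = sum(spectrum[a:b])
    return levels

if __name__ == "__main__":
    for bins in (1024, 8192):
        print(bins, band_bin_range(20, 60, 48_000, bins))
```

With 1024 bins the sub-bass band spans only three bins, and the last of them is shared with the bass band. With 8192 bins it spans fifteen, which is why low-frequency detail needs the larger array even though most of its bins sit in the higher bands.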
     
  5. sbsmith

    That's exactly what I was looking for! Thanks, @SeventhString! They really should provide this extra bit of info in the documentation. It would probably help more people take advantage of what is a pretty cool feature.
     