
Question OnAudioFilterRead metronome example

Discussion in 'Audio & Video' started by Verne33, Jul 5, 2023.

  1. Verne33

    Joined: Apr 12, 2020
    Posts: 30
    Planning to implement the metronome in the example here, but there is one thing that is confusing me. I'm sure it's completely obvious so forgive my beginner's ignorance.

    What's stumping me is that they're using what I understand to be the multiplication '*' operator where I feel it should normally be addition.

    It's used here to multiply AudioSettings.dspTime (or a double based upon it) by the sampleRate, which is 48000 by default.

    Since AudioSettings.dspTime is constantly increasing, I feel multiplying by 48000 would lead to an arbitrarily large number rather than anything useful. Hence my confusion.

    Examples:

    Code (CSharp):
    double startTick = AudioSettings.dspTime;
    double sampleRate = AudioSettings.outputSampleRate;
    double nextTick = startTick * sampleRate;  // Here
    Code (CSharp):
    double sampleRate = AudioSettings.outputSampleRate;

    void OnAudioFilterRead(float[] data, int channels)
    {
        double sample = AudioSettings.dspTime * sampleRate; // Here
    }

    If anyone can clarify this I'd greatly appreciate it.
     
  2. SeventhString

    Unity Technologies

    Joined: Jan 12, 2023
    Posts: 290
    I understand your confusion, and I admit that the example could have better names for its variables :p When nextTick is initialized, it's probably because a tick happens exactly at the start, but then the variable is reused.

    So you are totally right that the next tick should involve an addition, and it actually does happen in the example:
    Code (CSharp):
    void OnAudioFilterRead(float[] data, int channels)
    {
        if (!running)
            return;

        double samplesPerTick = sampleRate * 60.0F / bpm * 4.0F / signatureLo;
        double sample = AudioSettings.dspTime * sampleRate;
        int dataLen = data.Length / channels;

        int n = 0;
        while (n < dataLen)
        {
            float x = gain * amp * Mathf.Sin(phase);
            int i = 0;
            while (i < channels)
            {
                data[n * channels + i] += x;
                i++;
            }
            while (sample + n >= nextTick)
            {
                nextTick += samplesPerTick;  // <------HERE!!
                amp = 1.0F;
                if (++accent > signatureHi)
                {
                    accent = 1;
                    amp *= 2.0F;
                }
                Debug.Log("Tick: " + accent + "/" + signatureHi);
            }
            phase += amp * 0.3F;
            amp *= 0.993F;
            n++;
        }
    }
    The reason why you see a multiplication is because AudioSettings.dspTime returns the current time as a double, in seconds, so there we're multiplying the number of seconds by the number of samples per second to find the index of the sample where the next tick occurs.
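
    To make the unit conversion concrete, here is a minimal sketch in the style of your snippet (variable names are mine, not from the example):
    Code (CSharp):
    // dspTime is in seconds; multiplying by the sample rate converts
    // seconds to samples. E.g. at 48000 Hz, dspTime = 2.5 s maps to
    // sample index 2.5 * 48000 = 120000.
    double dspTime = AudioSettings.dspTime;          // seconds since the audio engine started
    double sampleRate = AudioSettings.outputSampleRate;
    double currentSample = dspTime * sampleRate;     // absolute sample index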
     
  3. Verne33

    Joined: Apr 12, 2020
    Posts: 30
    Thank you @SeventhString, it makes sense now :D

    If I may ask a follow-up question - if trying to create an audio sequencer with as close to perfect on-beat timing as possible, would you recommend an approach like the one below, or is there another best practice?

    - Have a "silent" looping "metronome audio source" playing
    - Set DSP Buffer Size to "Best Latency", then in
    OnAudioFilterRead
    frequently update a reading of
    AudioSettings.dspTime
    via a double, example:
    _DspTimeReference

    - Use method you explained above to know sample accurate timing of where audio system is at, particularly when a tick/beat has been reached and/or is approaching.

    Here is where I become unsure - since you can only call PlayScheduled from the main thread, you can't use that to schedule an upcoming clip with truly perfect timing, right?

    I know there must be a way to "write" my audio clips' sample data (extracted via AudioClip.GetData) directly into Unity's audio buffer, at a specific sample time (ideally in OAFR), but I am just not sure exactly how to go about it. Any pointers in the right direction would be immensely helpful!
     
  4. SeventhString

    Unity Technologies

    Joined: Jan 12, 2023
    Posts: 290
    So if you're really trying to get something accurate and dedicated to music (you're not going to like this), you should probably not use Unity and rather go for JUCE; in my humble opinion that's the most relevant music framework out there. Now that my conscience is clean, we can talk about Unity....

    What you're suggesting makes perfect sense to me. PlayScheduled does offer sample-accurate precision, even if it's called from the main thread. While you can certainly assume some latency between the main and audio threads, you can consider that PlayScheduled will "post" your request to the audio engine, which will then do its work accurately.

    If you want to go hardcore and enjoy juggling samples, something you could do is schedule and mix the audio yourself, all in OAFR like you mentioned. You would need a few things, I imagine (rough sketch after the list)...
    • A sound bank: something to keep all your sample data, immutable, loaded and uncompressed for quick and safe reading
    • I'd probably keep a List<SequencerTrack> in a Sequencer, where each track would store a reference to its sound and which beat indices are activated, and hold the mechanics for calculating the next sample trigger(s)
    • OAFR would use a pattern similar to the metronome (convert dspTime to sample time and mix audio)
    This all sounds really basic... but it's a start I guess ¯\_(ツ)_/¯ it really depends on the flavor of your project.
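
    A very rough sketch of what that could look like; SequencerTrack, its fields, and the triggering policy are all made up for illustration, and thread-safety between the audio thread and the main thread is left out:
    Code (CSharp):
    using System.Collections.Generic;
    using UnityEngine;

    // Hypothetical track data: immutable sample data plus trigger bookkeeping.
    public class SequencerTrack
    {
        public float[] samples;            // uncompressed mono data, e.g. from AudioClip.GetData
        public long nextTriggerSample;     // absolute sample index of the next hit
        public long playStartSample = -1;  // start of the current hit, -1 = silent
    }

    public class Sequencer : MonoBehaviour
    {
        public List<SequencerTrack> tracks = new List<SequencerTrack>();
        double sampleRate;

        void Start() { sampleRate = AudioSettings.outputSampleRate; }

        void OnAudioFilterRead(float[] data, int channels)
        {
            long blockStart = (long)(AudioSettings.dspTime * sampleRate);
            int frames = data.Length / channels;

            foreach (var track in tracks)
            {
                for (int n = 0; n < frames; n++)
                {
                    long now = blockStart + n;
                    if (track.playStartSample < 0 && now >= track.nextTriggerSample)
                    {
                        track.playStartSample = now;             // start the hit
                        track.nextTriggerSample = long.MaxValue; // real code would compute the next trigger here
                    }
                    if (track.playStartSample < 0) continue;

                    long pos = now - track.playStartSample;
                    if (pos >= track.samples.Length) { track.playStartSample = -1; continue; }

                    float s = track.samples[pos];
                    for (int c = 0; c < channels; c++)
                        data[n * channels + c] += s;             // mix into the output
                }
            }
        }
    }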

    But if you're willing to go down that path, I'd really recommend you have a look at JUCE! It might be more work to set up the UI, but in terms of audio quality and features, you'd be better off.

    Happy coding!

    Share a prototype if you want/can :D
     
    Verne33 likes this.
  5. Verne33

    Joined: Apr 12, 2020
    Posts: 30
    You're referring to JUCE as an alternative to Unity, correct? As opposed to a plugin.

    I've already put a lot of work into this project in Unity/C#, and the final thing to nail down is the accuracy of the metronome, so I'm hoping to just cross the finish line :) With the help of "Best Latency" and OAFR, I am hopeful I can get a "good enough" solution after a code refactor of my system.

    On the off chance you're not aware, some folks worked on making sequencers in Unity based on the OAFR example years ago. Here is one example on GitHub. Unfortunately it is not documented in much detail, so I haven't had a chance to study it much yet.
     
  6. Unifikation

    Joined: Jan 4, 2023
    Posts: 1,046

    You won't be able to get anything that's both accurate and timely, in terms of a sequencer, out of OAFR.

    You can have accurate or timely. Not both.

    And you'll get glitches, regardless.

    @SeventhString is suggesting making a plugin with JUCE. Though, these days, I'd suggest using CLAP. It's a fundamentally better system design, and much more modern and strict, and much more focused on just audio and threading, which is what you're going to want to really focus on, as you don't want your plugin to compete with Unity's insistence that it overload the main thread.

    I presume your Unity app is the interface, so all you're doing is making a plugin that manages MIDI and audio launching, in CLAP.

    The real issues will be getting all things hooked up in terms of an IDE for C/C++ that runs alongside Unity and your C# IDE whilst you get the two things to work... together. This will be extremely painful. Torturous, even.
     
  7. Unifikation

    Joined: Jan 4, 2023
    Posts: 1,046

    This example is fundamentally a complete mess. Way past contrived. It's trickery within tricks within loops. Nobody should ever be tasked with understanding how this thing works, let alone trying to learn from it.

    It's also specifically and seemingly deliberately misleading, as it gives people like the OP false hope that Unity is somehow somewhat suitable to running music apps and rhythm games, and all they need do is untangle this ridiculous mess in order to see how they can have a stable and performant time base.

    Aren't we past this kind of disingenuousness in examples, now that Unity's gone public?
     
  8. Verne33

    Joined: Apr 12, 2020
    Posts: 30
    Yeah, this is above my current skill level, so I hope to find out soon whether you're right about Unity being unsuitable for music apps. For what it's worth, I'm not planning to use MIDI; rather, I made a system around stitching together audio clips. Furthermore, the target platform is VR, hence the need to use a game engine. Feeling a bit boxed in. Maybe I need to consider switching to Unreal :D

    Edit: Also, I'm not 100% sure what you mean by "You can have timely or accurate. Not both".
    Why are they mutually exclusive? The main obvious issue to me is the inability to call PlayScheduled/set samples in OAFR.

    However, couldn't you get very close to perfect with this method:
    - In OAFR, when your 'sample countdown' shows a loop's end is approaching, inform the main thread.
    - On the next Update, the main thread calls PlayScheduled(time), where "time" is whatever is left on the countdown OAFR manages.

    That way your "time" value will never be "off" by more than ~5.3ms, assuming you've set DSP buffer size to "Best Latency". I haven't tested this, but I would hope that small of latency would be effectively un-noticeable?
     
    Last edited: Jul 6, 2023
  9. Hikiko66

    Joined: May 5, 2013
    Posts: 1,302
    What issues did you have with PlayScheduled? Describe these inaccuracies, and exactly what you are trying to fix.

    PlayScheduled is the easiest solution, considering it basically schedules audio on the other thread for you, which is what you'd have to do anyway, including all of that ugly maths, and it includes spatialization, which I believe OAFR does not.

    Your metronome doesn't have to be framerate-independent, because PlayScheduled is framerate-independent. Any inaccuracy is not noticeable to your ears.
    The only way it becomes noticeable is through drift, which is basically a build-up of tiny inaccuracies over time into something significant enough to hear, but if you are using a metronome or a single source of timing, I don't see how you are introducing a build-up of inaccuracies.
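
    To illustrate the drift point with a sketch (names are mine): as long as every event time is derived from one absolute start time rather than re-measured from "now", the tiny errors have nothing to accumulate in.
    Code (CSharp):
    using UnityEngine;

    public class BeatClock : MonoBehaviour
    {
        public double bpm = 120.0;   // example tempo
        double startTime;

        void Start() { startTime = AudioSettings.dspTime; }

        // Drift-free: the nth beat is always startTime + n * beatDuration,
        // derived from one absolute timeline, so rounding and frame jitter
        // never accumulate.
        public double BeatTime(int n) { return startTime + n * (60.0 / bpm); }

        // Drift-prone alternative (shown for contrast, not recommended):
        // chaining each beat from the previously *measured* time lets tiny
        // errors pile up: nextTime = AudioSettings.dspTime + 60.0 / bpm;
    }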
     
    Last edited: Jul 7, 2023
  10. Verne33

    Joined: Apr 12, 2020
    Posts: 30
    Yeah, that's a very valid question. My issues arose from trying to make a metronome that detects 16ths of a measure, which at higher BPMs are very short intervals, so over time the metronome was losing synchronization with the actually-playing audio clips.
    The reason I was tracking 16ths was that I intended to make a recording system based on those markers. Any sound 'recorded' would be assigned a value of 0-15 within a bar, and then in future loops playback could be scheduled by adding the time until the 'next playable interval' to the metronome time. But it's completely broken as-is, so I'm starting to think of alternatives to achieve 'recording/scheduling' using sample time measured in OAFR.
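
    For what it's worth, the slot math itself stays simple if it's anchored to an absolute bar start time; here is a hypothetical quantizer, assuming 4/4 and a barStartTime tracked elsewhere:
    Code (CSharp):
    public static class StepQuantizer
    {
        // Map a captured dspTime to a 16th-note slot (0-15), assuming 4/4.
        // barStartTime and bpm would be tracked by the metronome.
        public static int Slot(double capturedDspTime, double barStartTime, double bpm)
        {
            double secondsPerBar = 4.0 * 60.0 / bpm;
            double sixteenth = secondsPerBar / 16.0;
            double offset = capturedDspTime - barStartTime;
            return (int)(offset / sixteenth) % 16;
        }

        // Absolute dspTime at which slot n of a given bar should play back.
        public static double SlotTime(double barStartTime, int n, double bpm)
        {
            return barStartTime + n * (4.0 * 60.0 / bpm) / 16.0;
        }
    }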
     
  11. Unifikation

    Joined: Jan 4, 2023
    Posts: 1,046
    I'm assuming this is the goal.

    Within the above context, what I mean by "timely" is low latency, such that the player is unaware of any delay. With OAFR and Best Latency you are still governed by the full time it takes to receive the input, plus more, because you'll have to read input outside of OAFR and set flags for OAFR to digest. Then, in OAFR, having gotten those flags, you schedule in accordance with whatever else you've got in mind, and consider the length of the DSP chunk (256 "frames" at Best Latency, or 256/48000ths of a second, about 5.3 ms, as we've previously discussed) if you want to respect that, or you can jump in early, as soon as you've got that flag, and brutalise your sample reading to get in early.

    By "accurate" I mean on the beat, and such that you don't create any phasing, cancellation, or sync issues, etc.

    You give up some timeliness when going for that beat, as you've now got to add the above latency to the NEXT beat that you can reasonably expect to hit that matches what you're wanting to drop in. And this isn't vaguely fun to code in C#. And the trial and error is a nightmare: if you've got different tempos (some tracks at 128 BPM, some at 140, some at 123, some at 90, etc.) then this gets to be really unfunny.

    Keep in mind, there is a super ugly hack available: you can set the DSP buffer length, which is 256 at "Best Latency", to any power-of-2 number you like. 32 is about the lowest you can reasonably go, but even on a great system this is going to create issues sometimes. You'll likely pop speakers trying this out. I don't recommend going below a DSP buffer length of 64. But that's still 4x smaller than a buffer of 256, so it can provide a meaningful latency improvement.
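
    If you go that route, the non-deprecated way is AudioSettings.GetConfiguration / AudioSettings.Reset rather than the old SetDSPBufferSize; a sketch (64 is an aggressive value, per the warning above, and Reset reinitializes the audio output, interrupting any playing sounds):
    Code (CSharp):
    using UnityEngine;

    public class BufferTweak : MonoBehaviour
    {
        void Start()
        {
            // Read the current audio configuration, shrink the DSP buffer,
            // and apply it before any playback starts.
            AudioConfiguration config = AudioSettings.GetConfiguration();
            config.dspBufferSize = 64;   // power of two; very aggressive
            AudioSettings.Reset(config);

            Debug.Log("DSP buffer size: " + AudioSettings.GetConfiguration().dspBufferSize);
        }
    }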

    I've long considered much the same problem, as I was, for a while, thinking of using Unity to make a VJ thing that also had MIDI record and playback (and editing). It's not something Unity's suited for, UNLESS you can be bothered making a VST-style plugin work with it, and use JUCE or CLAP or pure C to do your own timing system, kicking off of audio/samples, and playback.

    Nobody has done it for quite some time. The guy that made the Vital synth, a gem of a programmer, even ported his prior synth to work with Unity and found it so unfunny that he gave up on it.

    Unity needs better time management, reporting, and responsiveness just for faster game input and screens, as it's not up to scratch on VR gear, let alone fast phones and Nvidia screens etc. Audio timings are way beyond its timing system.
     
  12. Verne33

    Joined: Apr 12, 2020
    Posts: 30
    How? AudioSettings.SetDSPBufferSize (deprecated)? That is certainly intriguing.
     
  13. Unifikation

    Joined: Jan 4, 2023
    Posts: 1,046
    Verne33 likes this.
  14. SeventhString

    Unity Technologies

    Joined: Jan 12, 2023
    Posts: 290
    Yes, that's what I meant. JUCE is a C++ framework dedicated to the creation of audio applications. I believe it has a more professional, almost industrial, quality in that regard. If you intend to make an application targeting live performers or seasoned music producers, it is really your best bet. Also, considering the good momentum this group has, I would not be surprised if they added CLAP support eventually.

    And just a nuance if you're not familiar with CLAP: understand that it's an alternative to VST, i.e. a plugin format that consumes MIDI events to produce audio signals. It's not a general music-composing toolkit that could replace Unity or JUCE by itself.

    Now, if you're going for an entertaining/gamified music experience, Unity is still in this race in my opinion. Unifikation is totally right about Unity not being the number one choice for audio experiences, but as a serious developer and investigator, he might have higher standards than your target audience.

    I want to be clear that I'm not saying "dump Unity for JUCE absolutely". If you have a lot of time already invested in this and your life and salary don't depend on it, I'd recommend you stay motivated and go as far as you can to reach something that YOU consider fun, great, and everything. Even if you don't reach your original goal, you will still have earned great knowledge, built intuitions, brought music to the world and potentially made new friends.

    It's more about the journey than the destination!

    ...and even if you're an entrepreneur depending on this for a living, Unity might be a good way to prototype, learn the ropes of audio, get feedback from potential users and maybe recruit collaborators. But for a real pro-level audio app, I really, really recommend you go for something like JUCE or AudioKit or any specialized audio framework.

    Cheers!
     
    Verne33 likes this.
  15. Verne33

    Joined: Apr 12, 2020
    Posts: 30
    Thanks for your reply. My project is for a game/entertainment music-making app, but I naturally wanted to see how far you can push Unity's audio system to achieve a legit sequencer.
     
  16. Unifikation

    Joined: Jan 4, 2023
    Posts: 1,046
    There's another option I've seen discussed in theory several times, but never implemented:

    Create your own thread in C#, set its affinity to a core/thread that you know is otherwise unused, and then use it as you like to poll at rates you like, doing whatever you like.

    As I understand it, one of the problems with this is truly getting the calls into this thread off the main thread. I'm not sure if this is really an issue. If it can be alleviated, then this is the best solution, as you'll be able to code in C#, do whatever you like at whatever rate you like, and build up all the content and sequencing for subsequent absorption and rendering by the audio thread, without taxing it at all, so you can run the audio thread as fast as you like (e.g. a 64-sample buffer) without any fear of overloading it.

    Your sequencer could/should then be entirely self-contained on that other core, in its own thread.
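
    For what it's worth, a bare-bones sketch of such a worker thread follows. Managed System.Threading doesn't expose core affinity directly (pinning would need platform-specific calls), so this only shows the dedicated-thread half:
    Code (CSharp):
    using System.Threading;
    using UnityEngine;

    public class SequencerThread : MonoBehaviour
    {
        Thread worker;
        volatile bool running;

        void OnEnable()
        {
            running = true;
            worker = new Thread(Loop)
            {
                IsBackground = true,
                Priority = System.Threading.ThreadPriority.Highest
            };
            worker.Start();
        }

        void OnDisable()
        {
            running = false;
            worker.Join();
        }

        void Loop()
        {
            while (running)
            {
                // Compute upcoming events here and push them into a queue
                // for OnAudioFilterRead (or Update) to consume. Most
                // UnityEngine APIs are off-limits on this thread, so only
                // exchange plain data.
                Thread.Sleep(1);   // poll at roughly 1 kHz
            }
        }
    }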

    Unity should have provided us with this kind of thing when they built Jobs, so that we could make the most of Burst ourselves, and for themselves, so they could poll input and render outputs other than graphics and audio (data etc.) at rates far faster than frames, for all manner of things that need that kind of stuff. MIDI is my favourite example, but also audio sequencing like you're doing.
     
    Verne33 and SeventhString like this.