Question Microphone

Discussion in 'Multiplayer' started by ep1s0de, Apr 25, 2021.

  1. ep1s0de


    Dec 24, 2015
    Does anyone have an example of recording -> sending -> receiving voice data over the network?
    Working with the microphone in Unity is so awkward that I doubt anyone can implement voice chat in their game and play the received data back through an AudioSource.

    I found a couple of example implementations, but they are so hacky and clumsy that you can immediately say "goodbye, optimization".
  2. luke-unity


    Sep 30, 2020
    If you are getting bad microphone quality, that is most likely because you are sampling at too low a frequency, or at a frequency your microphone device does not support. You can list the devices via Microphone.devices and get the frequency range a device supports via Microphone.GetDeviceCaps. Everything is explained in the docs. If that does not help, we might need more detail about what makes using microphones 'terrible' in Unity.
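
    A minimal sketch of picking a sample rate the device reports as supported. The helper name MicRate.Choose and the preferred rate of 48000 are assumptions for illustration only. The Unity calls are left as comments so the helper itself does not depend on the engine.

```csharp
using System;

public static class MicRate
{
    // Hypothetical helper: clamp a preferred rate into the range a device reports.
    // Unity's Microphone.GetDeviceCaps reports minFreq == 0 and maxFreq == 0
    // when the device accepts any frequency, so the preferred rate is kept as-is.
    public static int Choose(int minFreq, int maxFreq, int preferred)
    {
        if (minFreq == 0 && maxFreq == 0) return preferred;
        return Math.Max(minFreq, Math.Min(maxFreq, preferred));
    }
}

// Possible use inside a MonoBehaviour (Unity only, null selects the default device):
//   Microphone.GetDeviceCaps(null, out int minFreq, out int maxFreq);
//   int rate = MicRate.Choose(minFreq, maxFreq, 48000);
//   AudioClip clip = Microphone.Start(null, true, 10, rate);
```

    Playing the data back at the same rate it was recorded at avoids a pitch or speed mismatch on the receiving side.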

    As for sending the data over the network: if you just have a single recording that you want to send and convert back into an audio clip, there should be plenty of resources on how to do that losslessly. You can start with the docs on AudioClip.Create.
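
    One possible way to do the lossless part, as a sketch: copy the raw float samples into a byte array and back. The class name PcmBytes is made up for this example, and it assumes sender and receiver use the same byte order. The Unity calls are shown as comments only.

```csharp
using System;

public static class PcmBytes
{
    // Copies the raw 32-bit float samples into a byte array (native byte order).
    public static byte[] ToBytes(float[] samples)
    {
        var bytes = new byte[samples.Length * sizeof(float)];
        Buffer.BlockCopy(samples, 0, bytes, 0, bytes.Length);
        return bytes;
    }

    // Inverse of ToBytes; any trailing bytes that do not form a full float are ignored.
    public static float[] ToFloats(byte[] bytes)
    {
        var samples = new float[bytes.Length / sizeof(float)];
        Buffer.BlockCopy(bytes, 0, samples, 0, samples.Length * sizeof(float));
        return samples;
    }
}

// Possible use in Unity for a finished mono recording:
//   var samples = new float[recorded.samples * recorded.channels];
//   recorded.GetData(samples, 0);
//   byte[] payload = PcmBytes.ToBytes(samples);          // send this
//   ...
//   float[] received = PcmBytes.ToFloats(payload);       // on the other side
//   AudioClip copy = AudioClip.Create("Received", received.Length, 1, frequency, false);
//   copy.SetData(received, 0);
```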

    Real-time voice communication is a whole different story. It is no simple feat: delivering high-quality, real-time voice encoding involves a variety of advanced algorithms. I suggest starting with one of the open source voice chat solutions such as Mumble or, if you want a Unity solution, an asset like Dissonance, which comes with source code access when you purchase it.
  3. ep1s0de


    Dec 24, 2015
    I have no desire to spend money... "if you want it done well, do it yourself". But the engine does not provide this feature, because it gives no access to sound updates, i.e. a way to check the current number of samples recorded from the microphone. None of the Unity update functions is suitable for that.
  4. luke-unity


    Sep 30, 2020
    Unity does provide access to sound updates: Microphone.GetPosition does that. Even better, Unity has built-in functionality to map samples perfectly onto the player loop. If you use Microphone.Start, the resulting AudioClip is perfectly aligned with Time.time. If you need more control, you can also use Microphone.GetPosition to manually control your latency by delaying the playback of your clip. Note that you can do this in any Unity player loop function, such as Update or FixedUpdate.
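
    As a rough sketch of polling Microphone.GetPosition from the player loop: the microphone clip loops, so the position can wrap back to the start between two polls. The helper MicCursor.NewSamples is a made-up name for this example and assumes a mono clip. The Unity side is shown as comments.

```csharp
public static class MicCursor
{
    // Number of samples recorded between two polls of a looping clip.
    // If the position wrapped past the end of the clip, add the clip length back.
    public static int NewSamples(int lastPos, int currentPos, int clipLength)
    {
        int count = currentPos - lastPos;
        return count >= 0 ? count : count + clipLength;
    }
}

// Possible use in Update or FixedUpdate (Unity only):
//   int pos = Microphone.GetPosition(null);
//   int count = MicCursor.NewSamples(lastPos, pos, micClip.samples);
//   if (count > 0)
//   {
//       var chunk = new float[count];
//       // The AudioClip.GetData docs describe reads past the end of the clip
//       // as wrapping around to the start, which is what a looping recording needs.
//       micClip.GetData(chunk, lastPos);
//       lastPos = pos;
//   }
```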

    I don't really understand what more you want in terms of a sound update. Running a player loop function like Update at the sample rate wouldn't make much sense.

    It would be interesting to hear what you want Unity to do differently in terms of microphone features and which APIs it exposes.
    Last edited: Apr 26, 2021
  5. ep1s0de


    Dec 24, 2015
    How to get rid of jitter when playing parts of the sound from the buffer?

    Code (CSharp):
    using System;
    using System.Collections.Generic;
    using UnityEngine;

    public class MicTest : MonoBehaviour
    {
        const int FREQUENCY = 22100;
        public AudioClip mic_clip;
        public int mic_lastsample = 0;
        public AudioSource Speaker;
        public bool sending;
        public bool replay;
        float[] samples;
        public Queue<byte[]> voice_buffered = new Queue<byte[]>(2048);

        private void Start()
        {
            mic_clip = Microphone.Start(null, true, 100, FREQUENCY);

            while (Microphone.GetPosition(null) < 0) { }
        }

        private void Update()
        {
            sending = Input.GetKey(KeyCode.V);
            replay = Input.GetKey(KeyCode.R);
        }

        string SpeakerClipName;

        void FixedUpdate()
        {
            if (sending)
            {
                int pos = Microphone.GetPosition(null);
                int chunk_size = pos - mic_lastsample;

                if (chunk_size > 0)
                {
                    samples = new float[chunk_size];
                    mic_clip.GetData(samples, mic_lastsample);

                    // Put it in the queue for playback
                    voice_buffered.Enqueue(ToByteArray(samples));
                }
                mic_lastsample = pos;
            }

            // If there is data in the queue, we play it back
            if (voice_buffered.Count > 0 && replay && !Speaker.isPlaying)
            {
                SpeakerClipName = "Voice_" + Environment.TickCount;
                float[] array = ToFloatArray(voice_buffered.Dequeue());

                // Here I noticed that if you create a clip with the same name, it may have old samples...
                Speaker.clip = AudioClip.Create(SpeakerClipName, array.Length, 1, FREQUENCY, false);
                Speaker.clip.SetData(array, 0);
                Speaker.Play();

                print("remain " + voice_buffered.Count);
            }
        }

        public byte[] ToByteArray(float[] floatArray)
        {
            int len = floatArray.Length * 4;
            byte[] byteArray = new byte[len];
            int pos = 0;
            foreach (float f in floatArray)
            {
                byte[] data = System.BitConverter.GetBytes(f);
                System.Array.Copy(data, 0, byteArray, pos, 4);
                pos += 4;
            }
            return byteArray;
        }

        public float[] ToFloatArray(byte[] byteArray)
        {
            int len = byteArray.Length / 4;
            float[] floatArray = new float[len];
            for (int i = 0; i < byteArray.Length; i += 4)
            {
                floatArray[i / 4] = System.BitConverter.ToSingle(byteArray, i);
            }
            return floatArray;
        }
    }
  6. luke-unity


    Sep 30, 2020
    The problem with your implementation is that you always create an audio clip with very few samples to replay. This results in very jittery audio because the AudioSource won't be able to interpolate them correctly. Your method will also have problems if the network itself has jitter, because you are not buffering the received data. What you should do on the receiving side instead is:

    1. When receiving samples don't keep them as individual arrays. Put them into a single large float array.
    2. Wait for a bit until your buffer holds maybe ~1 second of samples.
    3. Start the audio by copying all samples over to an AudioClip and playing it. Then clear your float array.
    4. Wait until the current clip has finished playing, then repeat.

    This is still not 100% ideal: you will still get a very small hitch at the seams when loading a new AudioClip. Fixing that is not entirely trivial, so I'm not sure I could explain it here.
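
    The four steps above could be sketched roughly like this. ReceiveBuffer and its members are hypothetical names, and the seam hitch mentioned above is not addressed. The Unity playback part is shown as comments.

```csharp
using System.Collections.Generic;

public class ReceiveBuffer
{
    private readonly List<float> pending = new List<float>();
    private readonly int threshold;

    // seconds is how much audio to collect before playback may start (e.g. ~1 second).
    public ReceiveBuffer(int sampleRate, float seconds)
    {
        threshold = (int)(sampleRate * seconds);
    }

    // Step 1: append every received chunk to one large buffer.
    public void Add(float[] chunk) => pending.AddRange(chunk);

    // Step 2: true once enough samples have accumulated.
    public bool Ready => pending.Count >= threshold;

    // Step 3: hand over all buffered samples and clear the buffer.
    public float[] Drain()
    {
        float[] all = pending.ToArray();
        pending.Clear();
        return all;
    }
}

// Possible use in Update (Unity only); step 4 is the isPlaying check:
//   if (!speaker.isPlaying && buffer.Ready)
//   {
//       float[] data = buffer.Drain();
//       speaker.clip = AudioClip.Create("Voice", data.Length, 1, sampleRate, false);
//       speaker.clip.SetData(data, 0);
//       speaker.Play();
//   }
```

    A larger threshold tolerates more network jitter at the cost of more latency.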
  7. ep1s0de


    Dec 24, 2015
    I found another way to feed audio data from the queue into the audio clip, using AudioClip.PCMReaderCallback, so there is no need to create a new clip. But there is a new problem: the delegate is called after a new clip playback cycle, and the engine does not give access to changing AudioSource.timeSamples there, because the audio processing takes place on a different thread.

    I have already realized that playing short "chunks" of audio data is a bad idea, so now I will accumulate it in a half-second buffer. This also solves the problem with AudioSource.timeSamples (I will not need to use it).
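
    A rough sketch of the PCMReaderCallback idea, not my exact code: a sample queue whose Read method has the void(float[]) shape that AudioClip.PCMReaderCallback expects. SampleFifo is a made-up name. The lock is there on the assumption that the callback may run outside the main thread, and missing data is filled with silence.

```csharp
using System.Collections.Generic;

public class SampleFifo
{
    private readonly Queue<float> queue = new Queue<float>();
    private readonly object gate = new object();

    // Called when voice data arrives (e.g. from the network code).
    public void Write(float[] samples)
    {
        lock (gate)
        {
            foreach (float s in samples) queue.Enqueue(s);
        }
    }

    // Fills the requested block; outputs silence (0) when the queue runs dry.
    public void Read(float[] data)
    {
        lock (gate)
        {
            for (int i = 0; i < data.Length; i++)
                data[i] = queue.Count > 0 ? queue.Dequeue() : 0f;
        }
    }
}

// Possible use in Unity: a one-second, mono, streamed clip that loops forever
// and pulls its samples from the queue.
//   var fifo = new SampleFifo();
//   speaker.clip = AudioClip.Create("VoiceStream", sampleRate, 1, sampleRate, true, fifo.Read);
//   speaker.loop = true;
//   speaker.Play();
```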