easy speech synthesis on a Mac

JoeStrout · Mar 29, 2018

I needed speech synthesis for a recent project. I started out using Watson's text-to-speech service, but in less than a week I hit the limit of their free tier (10,000 characters). Since I'm on a Mac, I decided to try Apple's speech instead, and I love it. The voice quality is at least as good, if not better; the performance is great, and it's free.

Here's the code:

Code (CSharp):

using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.Events;

// Speaks text on macOS by launching the built-in /usr/bin/say command.
public class AppleSpeechSynth : MonoBehaviour {
    public string voice = "Samantha";
    public int outputChannel = 48;    // audio device ID; see say -a '?'
    public UnityEvent onStartedSpeaking;
    public UnityEvent onStoppedSpeaking;

    System.Diagnostics.Process speechProcess;
    bool wasSpeaking;

    void Update() {
        // Poll the say process and fire an event whenever its running state changes.
        bool isSpeaking = (speechProcess != null && !speechProcess.HasExited);
        if (isSpeaking != wasSpeaking) {
            if (isSpeaking) onStartedSpeaking.Invoke();
            else onStoppedSpeaking.Invoke();
            wasSpeaking = isSpeaking;
        }
    }

    public void Speak(string text) {
        // Double quotes in the text are replaced with commas so they cannot end the quoted argument early.
        // The voice name is quoted so that names containing a space are passed as one argument.
        string cmdArgs = string.Format("-a {2} -v \"{0}\" \"{1}\"", voice, text.Replace("\"", ","), outputChannel);
        speechProcess = System.Diagnostics.Process.Start("/usr/bin/say", cmdArgs);
    }
}

Just call the Speak method, and bask in the sultry (or manly, as you prefer) sounds of speech.

Note that I needed the outputChannel parameter in order to redirect the output (through SoundFlower) to QuickTime when recording this demo video. That was a PITA, because then I couldn't hear it while recording... but anyway, if you have any trouble hearing the speech, run say -a '?' on the command line, and check that the output channel you have selected is the correct number for "Built-in Output".
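
A possible variation on the Speak method above, for text that contains double quotes: rather than replacing the quotes, the text can be handed to say through standard input. This is only a sketch. It assumes say reads its text from stdin when no message is given on the command line (that is what the man page describes), and the class and method names are made up for illustration.

```csharp
using System.Diagnostics;

// Hypothetical helper, not part of the original script.
public static class SayViaStdin {
    // Describes the say process without starting it.
    public static ProcessStartInfo BuildStartInfo(string voice, int outputChannel) {
        return new ProcessStartInfo {
            FileName = "/usr/bin/say",
            // The voice is quoted so names containing a space stay one argument.
            Arguments = string.Format("-a {0} -v \"{1}\"", outputChannel, voice),
            UseShellExecute = false,       // required for stream redirection
            RedirectStandardInput = true,  // the text goes in through stdin
            CreateNoWindow = true
        };
    }

    // Starts say, writes the text to its stdin, and returns the running process.
    public static Process Speak(string text, string voice = "Samantha", int outputChannel = 48) {
        Process process = Process.Start(BuildStartInfo(voice, outputChannel));
        process.StandardInput.Write(text);
        process.StandardInput.Close();     // end of input tells say there is no more text
        return process;
    }
}
```

Because the text never passes through the argument string, quotes in it need no special handling. The returned Process can be polled with HasExited the same way Update does above.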

unity_49fCeZpEbCk40A · Jul 20, 2019

Hi Joe, I want to use this script in my Unity project. How can I do that? Should I attach it to my main camera or to another game object?

JoeStrout · Jul 21, 2019

It's just a MonoBehaviour. Attach it to whatever you want.

estherifitae · May 1, 2020

Hello! May I ask how you went about using Watson's service? I was trying out their speech-to-text service, but I got errors when compiling the script (an error shown in the Inspector).

dfarjoun · Jun 9, 2020

Hi,
This is VERY interesting!!!!
How do you implement the voice recognition on the mac?
Can you use Apple also for the recognition?

JoeStrout · Jun 9, 2020

Sorry, I have no idea about that.

bricefr · Jun 10, 2020

dfarjoun said: ↑

Hi,
This is VERY interesting!!!!
How do you implement the voice recognition on the mac?
Can you use Apple also for the recognition?

I tried everything I could; it's not possible using the native macOS binaries. Also note that for the TTS feature - the say command - there are some legal restrictions...

djackson_unity · Apr 10, 2021

Hey! I'm a beginning coder and I need some text to speech in my (mac based) project. Can you explain what the two UnityEvents:

Code (CSharp):

public UnityEvent onStartedSpeaking;
public UnityEvent onStoppedSpeaking;

do in the script? Do I need to use them in some way, or just call the Speak method?

Thank you so much for this code, btw, it's exactly what I needed.

JoeStrout · Apr 12, 2021

Those simply provide events other code can use if they need them. If you don't need them, you don't need them.

If you're not familiar with Unity events and all the cool ways they let you decouple your code, check out this tutorial (old but still applies today).
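
As a rough illustration of using them from code, a second component could subscribe like this. It is a minimal sketch: the SpeechLogger class, its field name, and the log messages are invented for the example, and it assumes an AppleSpeechSynth has been assigned in the Inspector. You can also hook the events up in the Inspector with no code at all.

```csharp
using UnityEngine;

// Hypothetical listener component, not part of the original script.
public class SpeechLogger : MonoBehaviour {
    public AppleSpeechSynth synth;  // assign in the Inspector

    void OnEnable() {
        // Subscribe while this component is active.
        synth.onStartedSpeaking.AddListener(HandleStarted);
        synth.onStoppedSpeaking.AddListener(HandleStopped);
    }

    void OnDisable() {
        // Unsubscribe so the events don't call into a disabled component.
        synth.onStartedSpeaking.RemoveListener(HandleStarted);
        synth.onStoppedSpeaking.RemoveListener(HandleStopped);
    }

    void HandleStarted() { Debug.Log("Speech started"); }
    void HandleStopped() { Debug.Log("Speech stopped"); }
}
```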

djackson_unity · Apr 12, 2021

Thanks! I'm not familiar with Unity events so I appreciate the link to the tutorial.

paulshaquille · Apr 24, 2021

Can someone explain what this code is doing exactly?
I need to implement voice recognition in my (iOS) game and I feel like this actually works, but I don't know what it's doing. I've placed it into my game but I also don't know if it's working. Am I supposed to have downloaded something else, or should this work no matter what?

JoeStrout · Apr 24, 2021

It's just invoking the /usr/bin/say command (a built-in command-line app on macOS) via the shell. This is speech synthesis, not voice recognition. I wouldn't expect it to work on iOS.

paulshaquille · Apr 25, 2021

JoeStrout said: ↑

It's just invoking the /usr/bin/say command (a built-in command-line app on macOS) via the shell. This is speech synthesis, not voice recognition. I wouldn't expect it to work on iOS.

Thank you. This definitely helped me save some time

stfunity · Feb 1, 2022

Great stuff, thank you. Missed hearing Fred's voice

stfunity · Feb 2, 2022

I like this command argument for say, and I wanted to make a second one so I could poll the local macOS and get a list of all available SpeechSynthesis voices.

I found a command that runs like this in a bash terminal and tried to work it into the same format as your say command above, but I couldn't figure it out all the way because I wasn't sure if I needed a path declaration like you had at first.

This is the command I want to process as a bash/terminal argument out of Unity C#:
ls /System/Library/Speech/Voices | sed 's/.SpeechVoice$//'
I tried setting up

Code (CSharp):

public void ListAvailable() {
    string cmdArgs = "ls /System/Library/Speech/Voices | sed 's/.SpeechVoice$//'";
    speechProcess = System.Diagnostics.Process.Start(cmdArgs);
}

Wasn't sure if I -needed- to add the first part of the other command format from your original Speak function where it says "/usr/bin/say", or if I could just pass one string as a command argument

I also don't know how to capture the console's response to that, I know it would return text but I am no expert on talking to bash indirectly.
----------------------------------------------------------------------------------

Meanwhile I have another question:
Is it possible to access some kind of phoneme or viseme stream on the Mac side of SpeechSynthesis, or another library, for timing the mouth poses on an avatar? Obviously the Speak function engages synthesis. I'm not sure if there's any kind of timing system exposed; I can see where people have set the speed parameter on the speech on the Mac side, so I guess there's something back there.

Discussion threads on this issue are surprisingly scant given the amount of time all of these systems have been coexisting. Thanks so much for any expertise you can offer.

@ippdev and I are trying to crack this nut so we can get universal speech support

JoeStrout · Feb 2, 2022

Hmm, you're trying to execute a compound command — where output of one command is piped into another. I'm not certain that System.Diagnostics.Process.Start can do that.

An alternative would be to use just the "ls" command (which is actually "/bin/ls") as the first argument to Process.Start, with "/System/Library/Speech/Voices" as the cmdArgs (second argument).

But then you will need to process the returned text. That is a little tricky, but it is doable; Process.Start returns a Process object, which has a StandardOutput stream you can read from. See these answers for some examples. Once your code is reading the results of the ls command, you can search it yourself for SpeechVoice entries.
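
For what it's worth, reading the output could look roughly like this. It is a sketch under a few assumptions: the ProcessCapture class and its Run method are names made up here, the call blocks until the command finishes, and I have only reasoned through it rather than tested it inside Unity.

```csharp
using System.Diagnostics;

// Hypothetical helper for running a command and collecting its output.
public static class ProcessCapture {
    // Runs a program, waits for it to exit, and returns everything it wrote to stdout.
    public static string Run(string fileName, string arguments) {
        var startInfo = new ProcessStartInfo {
            FileName = fileName,
            Arguments = arguments,
            UseShellExecute = false,        // required for stream redirection
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };
        using (Process process = Process.Start(startInfo)) {
            // Read before waiting, so a large output cannot fill the pipe and stall the child.
            string output = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return output;
        }
    }
}
```

With that, `ProcessCapture.Run("/bin/ls", "/System/Library/Speech/Voices")` should return the directory listing as one string. The piped command could probably be run by handing the whole line to bash, e.g. `ProcessCapture.Run("/bin/bash", "-c \"ls /System/Library/Speech/Voices | sed 's/.SpeechVoice$//'\"")`, since bash then does the piping itself. The same helper should also work for `say -a '?'` if you want the list of audio devices.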

Or, better yet: why are we going to all this work to run `ls` in a shell to get a list of files the hard way? C# has built-in methods to get files in a directory. Just use one of those instead.
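
Something along these lines, as a minimal sketch. The VoiceLister name is made up, and the directory is just the one from the ls command above; voices installed elsewhere on the system would not show up here.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper that lists voice names without launching any process.
public static class VoiceLister {
    // Assumed location, taken from the ls command earlier in the thread.
    public const string DefaultVoicesDirectory = "/System/Library/Speech/Voices";

    // Returns entry names with the ".SpeechVoice" suffix stripped, sorted by ordinal comparison.
    public static List<string> ListVoices(string directory = DefaultVoicesDirectory) {
        var names = new List<string>();
        if (!Directory.Exists(directory)) return names;  // e.g. not running on macOS

        // GetFileSystemEntries returns both files and folders that match the pattern.
        foreach (string entry in Directory.GetFileSystemEntries(directory, "*.SpeechVoice")) {
            names.Add(Path.GetFileNameWithoutExtension(entry));
        }
        names.Sort(string.CompareOrdinal);
        return names;
    }
}
```

Calling `VoiceLister.ListVoices()` returns an empty list when the directory does not exist, so the caller does not need a try/catch just to run on another platform.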

Matloob · Feb 4, 2024

Hi, is there any way to save the audio generated by the speech synthesis? I'm trying to display the audio generated visually.

JoeStrout · Feb 5, 2024

Yes. Type "man say" in Terminal for details.
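
The short version is the -o option, which writes the speech to an audio file instead of playing it. A rough sketch of calling it the same way as the Speak method in the first post; the SayToFile class, its method names, and the example path are made up for illustration, and loading the file back into Unity is a separate step not shown here.

```csharp
using System.Diagnostics;

// Hypothetical helper, modeled on the Speak method from the first post.
public static class SayToFile {
    // Builds the argument string for: say -v <voice> -o <outputPath> "<text>"
    public static string BuildArguments(string text, string voice, string outputPath) {
        // Same workaround as the original Speak method: double quotes in the text become commas.
        string safeText = text.Replace("\"", ",");
        return string.Format("-v \"{0}\" -o \"{1}\" \"{2}\"", voice, outputPath, safeText);
    }

    // Starts say and returns the process; wait for it to exit before reading the file.
    public static Process Save(string text, string outputPath, string voice = "Samantha") {
        return Process.Start("/usr/bin/say", BuildArguments(text, voice, outputPath));
    }
}
```

For example, `SayToFile.Save("Hello there", "/tmp/hello.aiff")` should leave an AIFF file at that path once the process has exited.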
