RT-Voice - Run-time text-to-speech solution

Discussion in 'Assets and Asset Store' started by Stefan-Laubenberger, Jul 10, 2015.

  1. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    We created a short tutorial video:



    Have a nice weekend!
     
  2. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    A simple demo for a Kids game with RTVoice:
     
  3. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    An advanced tutorial for RTVoice:



    Enjoy!
     
  4. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    Linux-support is under evaluation :rolleyes:

    Theoretically, it looks promising - the only downside is the lack of a standard TTS-system under Linux...
    This means that e.g. eSpeak must be installed on the target system and we're not sure if it's too much work to create a custom installer for every game... We never did this before for Linux.
    Any suggestions or experience with TTS under Linux?

    Cheers
    Stefan


    Edit:
    Linux is supported via eSpeak since 2.9.6
     
    Last edited: Jun 25, 2018
  5. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
  6. wetcircuit

    wetcircuit

    Joined:
    Jul 17, 2012
    Posts:
    1,409
    Thank you for PlayMaker actions!

    I have a couple of questions….

    Screen Shot 2016-05-03 at 10.19.06 AM.png

    I don't see an equivalent to "List<Voice> Voices()", instead we get a number representing the alphabetical list(?). Randomly typing in numbers until I hit what I recognize as "Zarvox", I have 72 voices loaded on my Mac :eek: … higher numbers seem to default to Agnes (Voice 0)

    I can deal with that. Once I find the voice(s) I want to use I can write them down…, but:

    1. if I add more voices these numbers will change.

    2. I can't use a PlayMaker variable (int) to assign the Voice globally. I would have to update each instance of the Speak action by hand.

    Is there something I've missed? Could the Voice input be changed to a FsmInt variable?

    Also could the Text input be changed to a FsmString variable? I think this is going to be necessary for anything complex….

    **edit: I tried to sit with a PlayMaker tutorial to learn how to change the actions to PM variables, but scripting is over my head, tbh…. I think essentially we'll need all the inputs to be PM fsm variables... I can think of reasons to control volume and rate, even audiosource game object through PlayMaker….
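
    One way around the shifting numbers would be to remember a voice by its name and resolve the index at run time. The snippet below is only a rough sketch of that idea in plain C#; `VoiceLookup` and `IndexOfVoice` are made-up names for illustration and are not part of RT-Voice or PlayMaker, and the list of names is assumed to come from wherever the installed voices are enumerated.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of RT-Voice): maps a voice name to its current index.
public static class VoiceLookup
{
    // Returns the index of the first voice whose name matches wantedName
    // (ignoring case), or fallbackIndex if no voice with that name is installed.
    public static int IndexOfVoice(IList<string> voiceNames, string wantedName, int fallbackIndex)
    {
        for (int i = 0; i < voiceNames.Count; i++)
        {
            if (string.Equals(voiceNames[i], wantedName, StringComparison.OrdinalIgnoreCase))
            {
                return i;
            }
        }

        return fallbackIndex;
    }
}
```

    With something like this, the stored name keeps pointing at "Zarvox" even after more voices are installed, and an unknown name falls back to a default index instead of a random voice.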
     
    Last edited: May 3, 2016
  7. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981

    Hi wetcircuit

    Thank you very much for your valuable feedback!

    The current Playmaker-actions are in an early stage and we're aware that they aren't complete.
    We don't use Playmaker in our own projects, so we have to learn how you and others will use them...

    I have already made an improvement: you can now "debug" all available voices, which should help you find your desired voice.
    Please send me an email with your invoice and I will send you the new actions.

    Currently, I'm on holiday until 16th of May and my actions are fairly limited during this time. The Internet in South Africa isn't that great :)


    So long,
    Stefan
     
    wetcircuit likes this.
  8. wetcircuit

    wetcircuit

    Joined:
    Jul 17, 2012
    Posts:
    1,409
    Thank you! sorry to sound so "grabby". :oops: I have a ton of TTS voices so I bought RT-Voice a while ago...
    I'm excited to use it. :cool:
     
  9. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    No worries, you aren't "grabby". It helps to improve the product.
    I've got something ready for you - as mentioned, please send me an email.

    Thank you!
     
    wetcircuit likes this.
  10. wetcircuit

    wetcircuit

    Joined:
    Jul 17, 2012
    Posts:
    1,409
    Oh WOW! This is great! I can set the voice as a global variable….
    And text can be updated/randomized by PlayMaker… volume and rate too! This is perfect! thank you!

    Screen Shot 2016-05-06 at 10.16.05 AM.png
     
  11. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    I'm glad it's useful for you!

    For further questions or suggestions, don't hesitate to contact me.

    So long,
    Stefan
     
  12. mdrajeske

    mdrajeske

    Joined:
    May 8, 2016
    Posts:
    11
    Hello. I'm having trouble running the latest RT-Voice PRO on Win10 (Unity 5.3.4)... Your demos all indicate 'No OS voices found - TTS not possible', but if I run an app like Balabolka I get voices enumerated for both SAPI 5 (David and Zira) and Microsoft Speech Platform (ZiraPro).
     
  13. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    Hi mdrajeske

    RT-Voice should work just fine with your setup - we're also using Windows 10 Pro and Unity 5.3.4 as our development setup.

    Please make sure that your project (build) is set to "Windows Standalone".
    You can also verify your pc with our demo-build - it should display all available voices.

    If it's still not working, please send me an email with some additional information (like console-log, screenshots etc.) and we will find a solution together.


    So long,
    Stefan
     
  14. mdrajeske

    mdrajeske

    Joined:
    May 8, 2016
    Posts:
    11
    Ah, working on a shared machine and someone had reinstalled Unity and had not included the Windows target support. All better. So now I'm seeing two voices 'Microsoft David Desktop' and 'Microsoft Zira Desktop' which would be consistent with your docs mentioning SAPI 5 support.

    I have found that the newer Microsoft Speech Platform's ZiraPro is a much better quality voice compared to Zira Desktop. Any plans to support Microsoft Speech Platform in addition to SAPI 5?
     
  15. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    I'm happy it's working now :)

    About ZiraPro... I don't have this voice on my system, but I think it should work as long as it's available under "Speech Properties".

    Btw, there are plenty of other providers like IVONA, CereProc, Nuance etc. which sell very good voices.
     
  16. mdrajeske

    mdrajeske

    Joined:
    May 8, 2016
    Posts:
    11
    To use Microsoft Speech Platform RT-Voice would need to integrate the Microsoft Speech Platform (MSSP) SDK found here: https://www.microsoft.com/en-us/download/details.aspx?id=27226

    Unity app developers would then need to ensure they install/deploy the MSSP runtime and voices (not installed by default with the OS), which are also available at that same link. The benefits are 1) much better voices and 2) staying in step with MS for Win10 and beyond while still being able to run on older Win7 systems. You would likely need to add an API method for selecting SAPI 5 vs. MSSP for the Windows use case. I have some users still on Windows 7, and the only SAPI 5 voice installed with the OS by default is Anna, which is very robotic/awful.

    I understand there are other commercial voices out there, but I am currently trying to stay with voices provided freely via Microsoft OSes to simplify licensing concerns.
     
  17. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    I will take a look at MSSP but I won't promise anything. It looks like a lot of customization and work...
    I'm also on holiday until next week, so I can't do much research until then.

    so long,
    Stefan
     
  18. laynardo

    laynardo

    Joined:
    Mar 21, 2015
    Posts:
    10
    Hi, I just downloaded the software into Unity 5 on my Mac. When I play any demo scene, I just get the spinning color wheel. It won't build. Any ideas?
     
  19. crafTDev

    crafTDev

    Joined:
    Nov 5, 2008
    Posts:
    1,820
    Hey,

    I have been having an issue where the SpeakComplete event fires even when the dialog hasn't finished being spoken. What should I do to fix this?

    Thanks,
    jrDev
     
  20. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    Hi jrDev

    It depends on how you're using RT-Voice. Did you set the variable "speakImmediately" to true?
    If that's the case, "SpeakComplete" should be called once the AudioSource has stopped playing...

    Otherwise, there is a problem -> please send me some details about your setup via email.


    Thank you!
     
  21. crafTDev

    crafTDev

    Joined:
    Nov 5, 2008
    Posts:
    1,820
    Hey,

    Never mind what I said above. It was my fault. I am using the PlayMaker actions, which apparently get overridden by the SendEvent action that calls another event with RTVoice on it, so it seemed like it was not waiting for RT-Voice to complete. I ended up putting the SendEvent action on another state that runs after the RTVoice state is called.

    Thanks,
    jrDev
     
  22. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    Hi Laynardo

    I wrote you an email. Please let me know if it helped.

    Thx!
     
  23. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    I'm happy it's working now!
    If you have any further questions or suggestions, just write me an email.


    Cheers
    Stefan
     
  24. mdrajeske

    mdrajeske

    Joined:
    May 8, 2016
    Posts:
    11
    I see there is a Silence method ... what about a pause/resume functionality?
    I am finding that the Silence method does not work as expected. Tracing it through, it looks like the list of audio sources held within Speaker is empty even though I am passing a non-null audio source to the Speak method. Therefore source.Stop() and Destroy(source) are never called.
     
  25. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    The "Silence"-method is designed to stop all current voices from speaking and it works like a charm:
    Code (CSharp):
    using UnityEngine;
    using System.Collections;
    using Crosstales.RTVoice;

    public class SpeakAndSilence : MonoBehaviour {

        void Start () {
            // Start speaking, then schedule silence() to run after 2 seconds.
            Speaker.Speak("This text is cancelled after 2 seconds.");

            Invoke("silence", 2f);
        }

        // Stops all voices that are currently speaking.
        private void silence() {
            Speaker.Silence();
        }
    }
    We won't provide a Pause or Resume function - the result of a Speak() call is an AudioSource, and developers are free to use it as they like (e.g. applying audio filters, pausing etc.).


    About MSSP: the "SpeechSynthesizer"-class is used for TTS (which we also use):

    https://msdn.microsoft.com/en-us/library/microsoft.speech.synthesis.speechsynthesizer(v=office.14).aspx

    This means that if you have MSSP installed, you should see all (new) voices.

    I hope this answers your questions.
     
    Last edited: May 17, 2016
  26. mdrajeske

    mdrajeske

    Joined:
    May 8, 2016
    Posts:
    11
    Thank you for your responses!
    1. I did catch on that pause and resume should be done using the Pause and Unpause methods on the audio source. Still learning my way around Unity.
    2. MSSP must enumerate/track the list of available voices differently from SAPI as I do not see the MSSP voices listed when I use your VoicesForCulture method. I am not surprised that the underlying speech synthesizer is the same.
    3. I am getting a different behavior with regard to Silence. I suspect that within Speaker.cs, line 317, sources is an empty list even though I am passing in a valid audio source as part of my call to Speak. Therefore, source.Stop() and Destroy(source) never get called and the voice continues to be heard even after a call to Silence.
     
  27. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    Hi,
    1. No problem :)
    2. I will check if I find a solution for this. Meanwhile, you can instantiate your desired voice with the exact name (important!) and try to let RT-Voice speak with it ;)
    3. I created a fix for your request -> provided AudioSources are now supported! Please send me your invoice via email and you will receive the new files.
     
  28. mdrajeske

    mdrajeske

    Joined:
    May 8, 2016
    Posts:
    11
    Stefan,
    Does RT-Voice support 'text markup' that allows for the customization of the rendered audio? On Windows using the Speech API (SAPI), this would be done via SAPI's support for the Speech Synthesis Markup Language (SSML). It would involve accepting an XML-based text string that includes the text to be heard along with the supported markups within the XML.
     
  29. mdrajeske

    mdrajeske

    Joined:
    May 8, 2016
    Posts:
    11
    I am finding that calling audioSource.Pause() trips the SpeakCompleteEvent callback, which is not desirable. Simply pausing the audio source does not mean that the speaking is complete, as it will likely be unpaused at some point. I reviewed the information being passed to the callback and I see nothing within the arguments that distinguishes between an audioSource pause and a natural ending to the speech. How can I know that speech is 'actually' complete and not just paused?
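
    A possible workaround on the caller's side, as a rough sketch only: keep your own "paused" flag and ignore the completion callback while it is set. `SpeechPlaybackTracker` below is a hypothetical helper in plain C#, not part of RT-Voice. It also assumes the callback is raised again when the speech really ends after an unpause, which is not confirmed anywhere in this thread; if it is not, the audio source would have to be polled instead.

```csharp
using System;

// Hypothetical helper (not part of RT-Voice): remembers whether the caller
// paused playback, so a completion callback can be ignored while paused.
public sealed class SpeechPlaybackTracker
{
    private bool paused;

    public bool IsPaused
    {
        get { return paused; }
    }

    // Call this right before pausing the audio source.
    public void MarkPaused()
    {
        paused = true;
    }

    // Call this right after unpausing the audio source.
    public void MarkResumed()
    {
        paused = false;
    }

    // Call this from the completion callback. Returns true only if the
    // callback was not triggered while the caller had playback paused.
    public bool IsRealCompletion()
    {
        return !paused;
    }
}
```

    The idea is to call MarkPaused() before audioSource.Pause(), MarkResumed() after unpausing, and to return early from the callback whenever IsRealCompletion() is false.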
     
  30. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    RT-Voice supports simple "text-markups" with pre- and postfix strings for rich-text, like " <color=green>".
    This is done by the "MarkSpokenText"-function inside "Helper.cs". You can also take a look at the demo-scenes "Simple" and "SimpleNative".

    At the moment, there are no plans to add support for XML, SSML etc.
     
  31. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    I found a solution and sent you an email with the new files.

    Thank you for your valuable suggestions!


    Cheers
    Stefan
     
  32. ElevenGame

    ElevenGame

    Joined:
    Jun 13, 2016
    Posts:
    146
    Hello Stefan,

    My virus scanner just flagged the new RTVoiceTTSWrapper.exe (from version 2.3.1) as containing something named "TR/Dropper.MSIL.ciqv". That did not happen with the last version. Do you have any idea where that could come from? Is it just a random similarity to some malware?

    Greetings
     
  33. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    Hi

    Which scanner do you use: Kaspersky, ESET, F-Secure, BitDefender, Lavasoft or which one?

    We absolutely didn't add any malware to the exe, but we did make some changes to our build process.
    Please send me an email and I can give you a new version of the program.

    Thank you!


    So long,
    Stefan
     
    Last edited: Jun 13, 2016
  34. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    We don't see any problems with the current version from the store...
    Here are the scan results from Metadefender:

    https://www.metadefender.com/#!/results/file/4abfc4098a114db38d9f9893fa27abc8/regular/

    Capture.PNG

    Only two engines have a problem... Did you update your scanner?


    Cheers
    Stefan
     
  35. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
  36. WhiteLotus

    WhiteLotus

    Joined:
    Mar 4, 2015
    Posts:
    9
    Hi stefan,
    I just wanted to know whether this plug-in will support the iOS and Android platforms in the future.
     
  37. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    Hi WhiteLotus

    We're currently re-evaluating this option, but it needs further testing etc.
    Don't expect RT-Voice to support mobile in the next few months.


    So long,
    Stefan


    Edit:
    RTV works now on all platforms!
     
    Last edited: Mar 13, 2017
  38. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    Last edited: Mar 13, 2017
    Hormic likes this.
  39. unitydevist

    unitydevist

    Joined:
    Feb 3, 2009
    Posts:
    45
    Glad to see the support for audio file saving was added! Does this support playing the audio files if they're cached and otherwise generating the audio for un-cached dialogue lines with the same voice or a closest match?
     
  40. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    Hi Ixpk

    So far, no caching mechanism has been implemented.
    But you can generate 1-n AudioSources and play/loop them whenever you want to.

    The "file save" function was added at the request of various mobile developers. It's simple but it works :)

    If there is a real (performance) need, we will probably think about adding some more advanced technologies for caching.


    I hope this answers your question.
     
  41. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    Hi

    I'll try to answer your questions:
    1. Yeah, for sure. The easiest way is like this: Speaker.Speak("Hi there");
    2. We have a lot of plans :) Seriously, we're evaluating the mobile option, but as I said, don't expect it too soon (if ever).
    3. That's also possible, like this: Speaker.Speak("A B C");

    Does this answer your questions?
     
  42. unitydevist

    unitydevist

    Joined:
    Feb 3, 2009
    Posts:
    45
    Sort of? All I mean by caching is to have audio files that were previously Speak()-generated files be played using the already generated files. That way, after playing the game for a while on one platform, when you move the game to the other OS the audio files already generated in the project will play and keep the game's voice consistent unless a line of new dialog is spoken in which case it would generate from the new OS. In that case, it might be good to be able to specify which OS/voice combo is the desired one for caching so as not to get files cached from the wrong OS voice.
     
  43. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    I will think about it, but it's not that easy...
    My first thoughts are to create a hash from the given text and save the files named as the generated hash. Then it should be possible to switch between existing (audio files) and new texts.

    But as I said, that's a lot of work - so don't expect it to be implemented in the near future.
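
    To make the hashing idea a bit more concrete, here is a minimal sketch in plain C#. It is not how RT-Voice does (or will do) it; `SpeechCacheKey`, `FileNameFor` and `FindCached` are names invented for this example, and the ".wav" extension is an assumption. The voice name is mixed into the hash so that files generated with a different OS voice are not picked up by accident.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper (not part of RT-Voice): derives cache file names from text.
public static class SpeechCacheKey
{
    // Builds a deterministic file name from the voice name and the text to speak.
    public static string FileNameFor(string voiceName, string text)
    {
        // The unit separator keeps ("ab", "c") and ("a", "bc") from colliding.
        string key = voiceName + "\u001F" + text;

        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(key));

            // Convert the 32 hash bytes to a 64-character lowercase hex string.
            StringBuilder hex = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash)
            {
                hex.Append(b.ToString("x2"));
            }

            return hex.ToString() + ".wav"; // extension is an assumption
        }
    }

    // Returns the path of an already generated file, or null if this
    // voice/text combination still has to be generated.
    public static string FindCached(string cacheDirectory, string voiceName, string text)
    {
        string path = Path.Combine(cacheDirectory, FileNameFor(voiceName, text));
        return File.Exists(path) ? path : null;
    }
}
```

    If FindCached returns a path, the existing audio file can be played; otherwise the text is new and would be generated and saved under that file name.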
     
  44. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    We added a demo scene for LDC:

    LDC.jpg


    Here are the downloads:

    Windows
    Mac

    There is more to come, like support for SLATE, LipSync & uSequencer :)
     
    Last edited: Jul 23, 2016
    Tinjaw likes this.
  45. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    LipSync is now also supported:
    Capture.PNG


    Check out the demo-app :)


    The upcoming version 2.4.0 has also demo scenes for:

    We will submit it next week to the store!


    So long,
    Stefan
     
    Last edited: Mar 13, 2017
    Tinjaw and TonyLi like this.
  46. wetcircuit

    wetcircuit

    Joined:
    Jul 17, 2012
    Posts:
    1,409
    I have a request for the Playmaker actions: is there a way to know when RT-Voice has finished speaking? Like maybe a Finish Event as part of the Speak Action...?

    Or maybe a way to test if a speaker is currently speaking?
     
  47. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    Hi wetcircuit

    I will take a look into your request and find a solution :)


    So long,
    Stefan
     
    wetcircuit likes this.
  48. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    We will release version 2.4.0 today - here is some news.

    The "Speaker" component has been improved and now shows all available voices. You can also preview them inside the Editor:

    Capture.PNG



    We also added a new component called "SpeechText". These are game objects containing the definition of text, voices and various options. You can also preview the whole setup inside the Editor:

    SpeechText.PNG




    Happy time :)
     
  49. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    Hi wetcircuit

    I sent you the newest version of RT-Voice with a solution for your request.


    Cheers
    Stefan
     
    wetcircuit likes this.
  50. wetcircuit

    wetcircuit

    Joined:
    Jul 17, 2012
    Posts:
    1,409
    Ahh! Thank you! Send event in the Playmaker action is perfect!

    Those standalone SpeechText objects will be useful too!
     
    Stefan-Laubenberger likes this.