Talking Avatars with Azure Speech-To-Text and Text-To-Speech

Discussion in 'General Discussion' started by Doug_Danforth, Jul 19, 2022.

  1. Doug_Danforth (joined May 20, 2013; 4 posts)

    I am wondering if anyone has created and/or is selling an avatar/project in Unity that you can speak to and that will respond using Azure STT and TTS via WebGL. I would like this to be a reasonably high-fidelity character with good lip sync capabilities. We have an old virtual character, but it uses facial bones and the lip syncing is very poor. The STT and TTS are also spotty. We manage the dialogue using a web service running pattern-matching software (ChatScript).

    I am happy to pay for this.
     
  2. MadeFromPolygons (joined Oct 5, 2013; 3,883 posts)

    This is such a specific set of requirements (Azure STT, TTS, WebGL) that you are simply not going to find someone who has this and is willing to share or sell it.

    You also definitely are not going to find it in the General Discussion section of the forums, that's for sure.
     
  3. Doug_Danforth (joined May 20, 2013; 4 posts)

    Thanks for responding. I don't actually need Azure, but that service is purported to work with WebGL, whereas I don't think Watson does. Basically, I am looking for an avatar that can communicate via a web page.

    In my world (academia) this is a very hot topic: avatars that can communicate. Practicing conversations, taking medical histories, providing instructions to patients, etc. Lots of research is happening on NLP systems to manage dialogue, but a realistic, easily deployable (thus WebGL) avatar solution for the front end seems to be lacking.

    Where would be a better place to post this?
     
  4. MadeFromPolygons (joined Oct 5, 2013; 3,883 posts)

    I work in this world too, at a company whose apps provide things like this to a wide range of sectors, including education. What I am saying is that this is very specific and niche, so nobody is going to just share it. We have this working in our product, but we would not share our work for obvious reasons: that would basically be taking work we have paid for and spent time on and giving it out to competitors.

    It's not that hard to integrate Azure TTS and STT into a character with lip sync. I suggest you look at SALSA LipSync, integrate that, and then integrate Azure yourself, rather than relying on someone giving you a project to work with, which is very unlikely.

    As you say, this is a hot topic, so expect to need to put in the work yourself, as very few people will share this willingly. Anything high quality will be made by a company, and as a result they are unlikely to share it.

    A better place would be the WebGL subforum, as this is WebGL related.
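
    For anyone attempting the integration described above, here is a minimal sketch of one piece of it: turning timed viseme events into a lookup the character's mouth can be driven from. It assumes each event carries an audio offset in 100-nanosecond ticks plus a numeric viseme ID with 0 meaning silence, which is how Azure's viseme events are documented as far as I know; verify against the SDK version you use. `VisemeEvent`, `toTimeline` and `visemeAt` are hypothetical names, and no SDK call is made here. SALSA works from the audio itself, so this is an alternative route, not how SALSA does it.

```typescript
// Hypothetical shapes: these are illustrative, not types from any SDK.
interface VisemeEvent {
  audioOffsetTicks: number; // assumed unit: 100 ns ticks from start of audio
  visemeId: number;         // assumed: 0 is silence
}

interface VisemeKey {
  timeMs: number;
  visemeId: number;
}

const TICKS_PER_MS = 10000; // 100 ns ticks in one millisecond

// Convert raw events into a timeline sorted by time in milliseconds.
function toTimeline(events: VisemeEvent[]): VisemeKey[] {
  return events
    .map((e) => ({
      timeMs: e.audioOffsetTicks / TICKS_PER_MS,
      visemeId: e.visemeId,
    }))
    .sort((a, b) => a.timeMs - b.timeMs);
}

// Return the viseme that should be showing at the given playback time.
// Falls back to silenceId before the first key or on an empty timeline.
function visemeAt(
  timeline: VisemeKey[],
  playbackMs: number,
  silenceId: number = 0
): number {
  let current = silenceId;
  for (const key of timeline) {
    if (key.timeMs > playbackMs) break;
    current = key.visemeId;
  }
  return current;
}
```

    Each frame, the game side would ask `visemeAt` for the current audio playback position and blend toward whatever mouth shape that ID is mapped to. The mapping from IDs to blendshapes or bones is project-specific and left out.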
     
  5. Doug_Danforth (joined May 20, 2013; 4 posts)

    We actually have Azure TTS and STT working for iOS and Windows. It is the WebGL part that is apparently harder. We have a nasty TTS bug: on the third or fourth response the TTS skips, and then the skipped audio plays together with the next response during the next turn. We have been trying to fix this for a few months now, and my current developer is convinced there is a bug in the Azure TTS JavaScript code. We are currently working with Microsoft to try to crack it.

    Because of this, and the fact that our characters are built on old technology, I thought I would try to see if some enterprising Unity developer had taken an iClone character or something similar and hooked up the lip sync and STT/TTS pieces.

    I will post in the WebGL forum. Thanks for the advice.
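
    The symptom described above (one response skipped, then two played in the next turn) is the kind of thing a strictly serialized playback queue on the page side can help isolate. This is only a sketch of that general pattern, not a diagnosis of the bug and not Azure SDK code: `SpeechTurnQueue`, `Utterance` and the `startPlayback` callback are all hypothetical names, and it assumes the page can tell when one response's audio has finished.

```typescript
// Hypothetical sketch: none of these names come from the Azure Speech SDK.
interface Utterance {
  turn: number;
  text: string;
}

class SpeechTurnQueue {
  private pending: Utterance[] = [];
  private playing: Utterance | null = null;

  // startPlayback is supplied by the caller and begins audio for one utterance.
  constructor(private readonly startPlayback: (u: Utterance) => void) {}

  // Add a response; it starts immediately only if nothing else is playing.
  enqueue(u: Utterance): void {
    this.pending.push(u);
    this.pump();
  }

  // Call this when the current audio has ended (or failed),
  // so the next queued response can start.
  playbackFinished(): void {
    this.playing = null;
    this.pump();
  }

  // Start the next pending utterance if the channel is free.
  private pump(): void {
    if (this.playing !== null) return;
    const next = this.pending.shift();
    if (next === undefined) return;
    this.playing = next;
    this.startPlayback(next);
  }
}
```

    With this in place, responses play in the order they were enqueued and never overlap. If a turn still goes missing, logging what `startPlayback` received for each turn shows whether the audio was ever requested.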
     
  6. ReadSpeakerTom (joined Aug 31, 2022; 1 post)

    Doug, for the TTS side of your question, this is something our team at ReadSpeaker is working on.

    We have a Game Engine Plugin that offers runtime TTS on the device, and we are adding a lip sync feature to it. It is still early for lip syncing, but we have a PoC that we can share with you.

    We wouldn't handle the STT side of the conversation.

    readspeaker.ai/unity-unreal-game-engine-plugin-free-trial/
     
    Last edited: Aug 31, 2022
  7. ippdev (joined Feb 7, 2010; 3,799 posts)

    ReadSpeaker was very spotty on support. It has 50+ DLLs named 11, 118, 24, etc. It was a SOB to uninstall. We installed MaryTTS as a server to get 33 languages via a web interface, and used Vosk for ASR.