[RELEASED] Unity Google Cloud Speech Recognition [VR\AR\Mobile\Desktop]

Discussion in 'Assets and Asset Store' started by FrostweepGames, Aug 6, 2015.

  1. 4wall

    4wall

    Joined:
    Sep 16, 2014
    Posts:
    69
    Hi,
    I'm currently evaluating various cloud-based ML services, in particular speech to text. In your demos there seems to be a huge amount of latency with Google's Speech to Text service. Have you done a comparison with other cloud-based services like IBM Watson?
     
  2. eco_bach

    eco_bach

    Joined:
    Jul 8, 2013
    Posts:
    1,412
    OK, for starters, this needs better and up-to-date documentation! I tried following the instructions but have so many questions. Part of the problem is the complexity of Google Cloud services for novices. I tried creating a key for a new Service Account and then plugged this key into the 'Api Key' field of the prefab, but whenever I run the Example scene I get the following error:

    Code (csharp):
    HTTP/1.1 403 Forbidden
    {
      "error": {
        "code": 403,
        "message": "The request is missing a valid API key.",
        "status": "PERMISSION_DENIED"
      }
    }
    Right now my key begins with
    -----BEGIN PRIVATE KEY-----\n

    and has several line breaks, i.e. \n

    Does this seem correct? Or should I be using the Key ID instead?

    Could you possibly provide an up-to-date tutorial with proper links on setting up ONLY speech recognition and generating the key for Google Cloud?

    IBM Watson for speech recognition is MUCH easier to use and run!
    Included a screen grab of my prefab Inspector settings
     

    Last edited: Nov 29, 2018
  3. noahx

    noahx

    Joined:
    Nov 22, 2010
    Posts:
    50
    Hi,
    Does this tool have a feature to start recording when a voice (or other noise) is present, and to stop when there is no voice anymore? What I mean is that I need a plugin that doesn't require calling a method to start recording. I need it to automatically start recording that piece of audio when someone speaks into the microphone, without pushing any key. I looked at the tutorial and didn't find anything related to this, so I'm guessing there's no such feature and recording is only activated by explicitly calling the method on user action.

    Thanks.
     
  4. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
    Hello,

    It looks like you created an invalid API key.

    Take a look at this video:


    Best
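
The 403 quoted earlier in the thread says the request is missing a valid API key, and the value being pasted starts with `-----BEGIN PRIVATE KEY-----`, which is the private key of a service account rather than an API key. A minimal sketch of a sanity check for the pasted value follows. `ApiKeyCheck` and its methods are hypothetical names, not part of the asset, and the rules are only heuristics (an API key is assumed to be a single token with no whitespace and no PEM header).

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the asset: a rough sanity check for the
// value pasted into the prefab's 'Api Key' field.
public static class ApiKeyCheck
{
    // True when the value carries a PEM private key header, i.e. it is a
    // service account private key and not an API key.
    public static bool LooksLikePrivateKey(string value)
    {
        return value != null && value.Contains("-----BEGIN PRIVATE KEY-----");
    }

    // True when the value is at least plausible as an API key (assumption):
    // one token, no whitespace, no escaped line breaks, no PEM header.
    public static bool LooksLikeApiKey(string value)
    {
        if (string.IsNullOrWhiteSpace(value)) return false;
        if (LooksLikePrivateKey(value)) return false;
        if (value.Contains("\\n")) return false; // literal backslash-n
        return !value.Any(char.IsWhiteSpace);
    }
}
```

A check like this only catches the mix-up described above. It cannot tell whether a well-formed key is actually enabled for the Speech API.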
     
  5. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
    Hello,

    Actually, we have functionality called 'voice detection' that will help you implement the feature you want. It detects when the user starts/stops talking and sends requests directly.

    Best
     
  6. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
    Hello,

    We didn't do much comparison, but in general, for us, the Google Cloud service is more scalable and workable.

    So it depends on what you need in your project.

    We have a lot of experience making plugins based on Google services, and we don't have any plugins for IBM Watson. I would say we could make one, but I don't know how popular it would be.

    Best
     
  7. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
    Hello,

    Actually, it works in a parallel thread, so it should not be a problem.
    But it depends on the device, I guess.

    Best
     
  8. wintermuute

    wintermuute

    Joined:
    May 12, 2017
    Posts:
    13
    Hi,
    I'm looking at purchasing your Google Cloud Machine Learning Kit and am wondering if it would work with Google's Cloud AutoML? If it does not, is this something you are planning on adding support for?
     
  9. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
    Hello,

    For now, no, we don't support it.

    It's a big API, so it's not easy to implement, but we are looking at new products on Google Cloud and planning to support them.

    Best
     
  10. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
    Hello,

    check out our Android Demo on Google Play: click here

    Use your own API Key or our demo key.

    You can check how it works before buying.

    Best Regards
     
  11. peteRunner

    peteRunner

    Joined:
    Oct 7, 2014
    Posts:
    7
    Hi all,
    I wrote before, but the problem still persists.

    When used in mobile VR, the speech recognition causes a lag of 1 - 2 seconds, which kills the whole VR experience. See the video here, seconds 7 - 12. My guess is that it's because of audio encoding. Is there anyone who does not have similar problems in mobile VR?
     

  12. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
    Hi,

    I guess in newer Unity versions you can use .NET 4.x, and async\await functions can help you remove the lag there.

    Best
     
  13. domdev

    domdev

    Joined:
    Feb 2, 2015
    Posts:
    337
    I have a problem at runtime: when I start recording, stop, and then start recording again, I get this error:
    ArgumentException: AudioClip.SetData failed; invalid data
    FrostweepGames.Plugins.GoogleCloud.SpeechRecognition.MediaManager.MakeAudioClipFromSamples (System.Single[] samples) (at Assets/FrostweepGames/GCSpeechRecognition/Scripts/Core/Managers/MediaManager.cs:186)
    FrostweepGames.Plugins.GoogleCloud.SpeechRecognition.MediaManager.Update () (at Assets/FrostweepGames/GCSpeechRecognition/Scripts/Core/Managers/MediaManager.cs:130)
    FrostweepGames.Plugins.GoogleCloud.SpeechRecognition.ServiceLocator.Update () (at Assets/FrostweepGames/GCSpeechRecognition/Scripts/Core/Utilites/ServiceLocator.cs:27)
    FrostweepGames.Plugins.GoogleCloud.SpeechRecognition.GCSpeechRecognition.Update () (at Assets/FrostweepGames/GCSpeechRecognition/Scripts/GCSpeechRecognition.cs:93)
     
  14. domdev

    domdev

    Joined:
    Feb 2, 2015
    Posts:
    337
    Also, speech recognition fails on iOS, but it's okay on Android.
     
  15. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
    Hello,

    Which error do you see in Xcode?

    Best
     
  16. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
    Regarding this issue: when does it appear? Does it reproduce every time or randomly?
    It can appear if there are problems with the microphone.

    How can we reproduce it?

    Best
     
  17. domdev

    domdev

    Joined:
    Feb 2, 2015
    Posts:
    337
    To reproduce, set it to runtime voice detection, then start recording, then stop and start recording again. The error will show up. In runtime voice detection, I want to add a stop-recording option.
     
  18. domdev

    domdev

    Joined:
    Feb 2, 2015
    Posts:
    337
    Also, in runtime voice detection, why is it so hard to get a callback from Google Speech on the phone, when in the Unity Editor the callback arrives immediately? Could it be an internet problem? The phone uses the same connection, and the log doesn't even show an error like "Speech not recognized".
     
  19. domdev

    domdev

    Joined:
    Feb 2, 2015
    Posts:
    337
    It doesn't work on iOS anymore. I always get "Speech Recognition failed with error {}".
     
  20. domdev

    domdev

    Joined:
    Feb 2, 2015
    Posts:
    337
    I'm building it in Unity Cloud, and my Unity version is 2018.0.3f2.
    It doesn't work on iOS anymore. I always get "Speech Recognition failed with error {}".
     
    Last edited: Feb 12, 2019
  21. ShahSoft

    ShahSoft

    Joined:
    Jan 13, 2013
    Posts:
    12
    Hey FrostweepGames,

    Will this work in WebGL build? perhaps along with your WebGL Microphone package since Microphone is missing in Unity WebGL.
     
  22. Kuin1982

    Kuin1982

    Joined:
    Dec 11, 2014
    Posts:
    1
    Hey FrostweepGames,
    Is it possible to define some keywords in order to get more accurate detection, or to detect only those keywords and not the whole grammar?
    I have purchased the asset but I didn't see that feature.

    In the documentation it seems these are called phrases. Maybe this parameter can be included:
    "config": {
      "encoding": "LINEAR16",
      "sampleRateHertz": 8000,
      "languageCode": "en-US",
      "speechContexts": [{
        "phrases": ["Chromecast", "Chromecast model"]
      }]
    },

    Thanks in advance,

    Pedro.
     
    Last edited: Feb 20, 2019
  23. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
    Hello,

    Yes, it can work in WebGL with MicrophoneLibrary added to the project.

    Best
     
  24. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219

    Hello,

    Yes, you can use Speech Context as a phrases list.
    Our asset provides full access to the REST API for this service.

    Best
     
  25. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
    Oh I see.

    will check it.
     
  26. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
  27. ColtonKadlecik_VitruviusVR

    ColtonKadlecik_VitruviusVR

    Joined:
    Nov 27, 2015
    Posts:
    168
    Hey @FrostweepGames

    We're still looking for streaming support for desktop (Windows). Any progress?

    Cheers,
    Colton
     
  28. shwa

    shwa

    Joined:
    Apr 9, 2012
    Posts:
    447
    Hi,
    I'm looking for a relatively simple voice recognition solution.
    For Mac and Win desktop standalone.

    1. Menu : User can speak 1 word, or just a few words, and it opens a specific scene.

    2. Single word or short words trigger an action/response within a scene.

    Can this Frostweep voice recognition asset do this right out of the box?

    If not, is it relatively straightforward to use, so I can implement the above?

    To start, I'd prefer to buy just this asset, and not the Google Cloud Machine Learning Kit.

    Does this asset have all the same voice recognition functionality/demos/uses as the more expensive kit?
    Is there an upgrade path to the Cloud Machine kit, after i buy the voice recognition asset?

    thanks!
     
  29. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
    Hello,

    You can use it however you want.

    As a result you will get the recognized text, so you just need to compare it with your list of commands and then change the scene or do whatever you want^^.
    Yes, you can buy only the speech recognition asset, not the whole kit.
    And yes, you will get a discount from the Asset Store if you later buy the whole kit.


    Best regards
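
The "compare it with your list of commands" step described above could look roughly like the sketch below. `CommandMatcher` is a hypothetical helper, not part of the asset, and how the recognized text reaches it depends on the asset's result callback, which is not shown in this thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: maps recognized text to one of the known commands.
public static class CommandMatcher
{
    // Returns the first command contained in the transcript
    // (case-insensitive), or null when nothing matches.
    public static string Match(string transcript, IEnumerable<string> commands)
    {
        if (string.IsNullOrWhiteSpace(transcript)) return null;
        string normalized = transcript.Trim().ToLowerInvariant();
        foreach (string command in commands)
        {
            if (normalized.Contains(command.ToLowerInvariant()))
                return command;
        }
        return null;
    }
}
```

In a Unity project the returned command could then drive something like `SceneManager.LoadScene`. Substring matching is deliberately loose, so a short command such as "art" would also match "start". Comparing whole words is safer if the commands overlap.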
     
  30. Sabre-Runner

    Sabre-Runner

    Joined:
    Mar 28, 2013
    Posts:
    2
    The MediaManager has a RecordFailedEvent field which GCSpeechRecognition registers for.
    But nothing in MediaManager invokes it. Is it redundant or is there a line missing?
     
  31. unity_-WNfY6QzzZV4nA

    unity_-WNfY6QzzZV4nA

    Joined:
    Dec 22, 2018
    Posts:
    2
     
  32. unity_-WNfY6QzzZV4nA

    unity_-WNfY6QzzZV4nA

    Joined:
    Dec 22, 2018
    Posts:
    2
    Hi, the documentation shows a "Speech Context Phrases" list in the configs for the component, but I don't see that. Can you please advise on how to set the context?
     
  33. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
    Hello,

    You can set context phrases by getting an instance of GCSpeechRecognition and calling this method:

    Code (csharp):
    void SetContext(List<string[]> contexts)
    This method takes a list of arrays of context phrases.

    Best Regards
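
A small sketch of building an argument in the shape that signature expects, a list in which each entry is one array of phrases. `SpeechContextExample` and the "open menu" phrases are made up for illustration, and the thread does not show how the GCSpeechRecognition instance is obtained, so the call itself appears only as a comment.

```csharp
using System.Collections.Generic;

public static class SpeechContextExample
{
    // Builds a list where each entry is one array of context phrases.
    public static List<string[]> BuildContexts()
    {
        return new List<string[]>
        {
            new[] { "open menu", "close menu" },        // hypothetical game commands
            new[] { "Chromecast", "Chromecast model" }  // phrases quoted earlier in the thread
        };
    }
}

// Assumed usage; `speechRecognition` is a placeholder for the
// GCSpeechRecognition instance:
// speechRecognition.SetContext(SpeechContextExample.BuildContexts());
```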
     
  34. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
    Hello,

    Strange, but it could be a missing line.

    We will check and fix it.

    Best
     
  35. blamejane

    blamejane

    Joined:
    Jul 8, 2013
    Posts:
    228
    I purchased this asset and I have the Example demo working for the Start/Stop button. However, I can't seem to get Voice Detection working. When I follow the steps in your 2nd video (original post for this thread), I click Voice Detection and click Start Recording, but whenever I speak, nothing triggers BeginTalkingEvent unless I set Threshold to a negative number. Once I do that it detects words, but it never calls EndTalking until I slide Threshold to a positive number.

    I am testing in the Unity Editor on my Mac.

    I simply want to detect 1-word and 2-word commands, and then execute the commands if the result response text matches my commands list. The only thing not working at this point is that I can't get the begin/end talk voice detection to process my speech.

    What should I do?

    Thanks.
     
  36. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
    Hello,

    The accuracy of this function depends directly on microphone noise and background noise.
    To improve detection quality, you can change VoiceDetectionThreshold.

    We are currently working on version 4.0 of our asset, which has a function that sets the threshold dynamically depending on the recorded noise.

    You can do the same by recording a few seconds of audio, calculating the average of the samples array, and inserting that value into the VoiceDetectionThreshold parameter in the config.

    let me know if it helps.

    Best
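
The averaging step suggested above might be sketched as follows. One assumption to flag: the sketch averages absolute values, because a plain mean of signed audio samples tends toward zero. The post only says "average", so the asset may interpret it differently. `NoiseFloor` is a hypothetical name.

```csharp
using System;

public static class NoiseFloor
{
    // Mean absolute amplitude of the samples (values expected in [-1, 1]).
    // Absolute values are an assumption; see the note above.
    public static float AverageLevel(float[] samples)
    {
        if (samples == null || samples.Length == 0) return 0f;
        double sum = 0;
        foreach (float s in samples) sum += Math.Abs(s);
        return (float)(sum / samples.Length);
    }
}
```

In Unity the samples for a few seconds of background noise could be read with `AudioClip.GetData`, and the result used as a starting value for VoiceDetectionThreshold, possibly with some margin added on top.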
     
  37. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
    Hello everyone!

    We are happy to announce a new update of Google Cloud Speech Recognition for Unity - Version 4.0 .

    - Updated Core
    - Updated Networking
    - Improved Functionality
    - New Features
    - New Examples
    - Improved stability
    and much.. much more!

    few screenshots of new update:





    Thanks for your support!

    Best Regards

    Your Frostweep Games team
     
  38. Turtwiggy

    Turtwiggy

    Joined:
    Jun 18, 2017
    Posts:
    7
    +1 to this issue. ~2 seconds of stutter is not good in VR, so it basically makes this plugin unusable for me. You can't wrap the StopRecord() method in async/await, as there's code that needs to run on the Unity main thread (Microphone.End, AudioClip.Create...), so async/await is not a quick fix.

    Code (CSharp):
    _speechRecognitionManager.Recognize(clip, _currentSpeechContexts, _speechRecognitionManager.CurrentConfig.defaultLanguage);
    This is also taking roughly 400ms for 8000 / Linear16, which is causing stutter. Any suggestions?
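
One possible direction, sketched under assumptions: keep the Unity calls (Microphone.End, reading samples out of the AudioClip) on the main thread, and move only the pure CPU work, such as converting float samples to 16-bit PCM, onto a thread-pool thread. Whether the 400 ms in Recognize is actually spent in encoding is not confirmed in this thread, and `PcmEncoder` is a hypothetical helper, not the asset's code.

```csharp
using System;
using System.Threading.Tasks;

public static class PcmEncoder
{
    // Converts float samples in [-1, 1] to little-endian 16-bit PCM bytes.
    public static byte[] ToLinear16(float[] samples)
    {
        var bytes = new byte[samples.Length * 2];
        for (int i = 0; i < samples.Length; i++)
        {
            float clamped = Math.Max(-1f, Math.Min(1f, samples[i]));
            short value = (short)(clamped * short.MaxValue);
            bytes[2 * i] = (byte)(value & 0xFF);            // low byte
            bytes[2 * i + 1] = (byte)((value >> 8) & 0xFF); // high byte
        }
        return bytes;
    }

    // Runs the conversion on a thread-pool thread. The samples must already
    // have been copied out of the AudioClip on the main thread.
    public static Task<byte[]> ToLinear16Async(float[] samples)
    {
        return Task.Run(() => ToLinear16(samples));
    }
}
```

From a MonoBehaviour, the result could be awaited in an `async` method after the samples are fetched, so the frame is not blocked while the bytes are produced.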
     
  39. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
    Hello, a new version of the asset is coming, and it includes some improvements for the issue you mentioned. We reorganized the core so it works much better.

    Best
     
  40. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
    Hello.

    We are happy to tell you that version 4.0 of the asset is live now.
    Check it out right now!

    Thank you all.

    Best Regards
     
  41. han1108th

    han1108th

    Joined:
    Feb 24, 2019
    Posts:
    4
    Thank you for your nice asset!!
    But I'm having trouble with Detect Voice.

    When I toggle on Detect Voice, toggle off Directly Recognize and Long Running Recognize,
    and push Start Record, it doesn't work at all and the red recordState image is always on.

    I really need the detect voice function, without pushing the start and stop buttons.

    Is there any other way to run it?
    Thank you for your help.
     
  42. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219

    Hello,

    It looks like an issue in the example code.
    I'll check it and let you know.

    You can also email us at frostweep@gmail.com or join our Discord.

    Best
     
  43. han1108th

    han1108th

    Joined:
    Feb 24, 2019
    Posts:
    4
    I really hope this function will work soon!!!
    I'll check this thread more than ten times a day!!!!:)
     
  44. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
    Hello,

    I guess you shouldn't toggle off the Directly Recognize checkbox, because that disables recognition entirely except when you call Recognize Last Record. Even then you probably won't get a good result, because I'm not sure you can get a clean recording from runtime voice detection in real time.

    By the way, we have implemented a few improvements for runtime voice detection.

    Best
     
  45. al57

    al57

    Joined:
    Feb 13, 2014
    Posts:
    9
    Hi, can it "autodetect" the language people are using? I mean, if I say "bonjour", will it autodetect French?
     
  46. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    219
    Hello,

    This feature is handled by the Google service.

    Best
     
  47. LumoKvin

    LumoKvin

    Joined:
    Sep 28, 2019
    Posts:
    195
    "The plugin does not cover the cost of Google Cloud Service"

    Who pays the cost?

    If I make a game with this plugin, would I have to set up a Google Cloud Service account, and would I then get billed based on how much the players use it?

    Or would the player have to set up an account and get billed?
     
  48. jeromeWork

    jeromeWork

    Joined:
    Sep 1, 2015
    Posts:
    343
    Sorry, but you do. Either you get your money back in ad revenue, or in-game purchases, or you charge for the game (careful you get that pricing right), or a combination of all three
     
  49. domdev

    domdev

    Joined:
    Feb 2, 2015
    Posts:
    337
    Hi, does the latest version support WebGL?
     
  50. domdev

    domdev

    Joined:
    Feb 2, 2015
    Posts:
    337
    We just got MicrophoneWebGL. How can I plug it into speech to text?
     