[RELEASED]Unity3D Google Cloud Speech Recognition [VR\AR\Mobile\Desktop]

Discussion in 'Assets and Asset Store' started by FrostweepGames, Aug 6, 2015.

  1. 4wall

    4wall

    Joined:
    Sep 16, 2014
    Posts:
    69
    Hi,
    I'm currently evaluating various cloud-based ML services, in particular speech-to-text. In your demos there seems to be a huge amount of latency with Google's Speech-to-Text service. Have you done a comparison with other cloud-based services such as IBM Watson?
     
  2. eco_bach

    eco_bach

    Joined:
    Jul 8, 2013
    Posts:
    1,352
    OK, for starters, this needs better and up-to-date documentation! I tried following the instructions but have so many questions. Part of the problem is the complexity of Google Cloud services for novices. I tried creating a key for a new Service Account and then plugged this key into the 'Api Key' field of the prefab, but whenever I run the Example scene I get the following error:

    Code (csharp):
    HTTP/1.1 403 Forbidden
    {
      "error": {
        "code": 403,
        "message": "The request is missing a valid API key.",
        "status": "PERMISSION_DENIED"
      }
    }
    Right now my key begins with
    -----BEGIN PRIVATE KEY-----\n

    and has several line breaks, i.e. \n

    Does this seem correct? Or should I be using the Key ID instead?

    Could you possibly provide an up-to-date tutorial with proper links on setting up ONLY speech recognition and generating the key for Google Cloud?

    IBM Watson for speech recognition is MUCH easier to use and run!
    I've included a screen grab of my prefab Inspector settings.
     

    Last edited: Nov 29, 2018
  3. noahx

    noahx

    Joined:
    Nov 22, 2010
    Posts:
    47
    Hi,
    Does this tool have a feature to start recording when a voice (or other noise) is present, and stop when there is no voice anymore? What I mean is that I need a plugin that doesn't require calling a method to start recording; I need it to automatically start recording a piece of audio when someone speaks into the microphone, without pushing any key. I looked at the tutorial and didn't find anything related to this, so I'm guessing there is no such feature and recording is only activated by explicitly calling the method on user action.

    Thanks.
     
  4. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    208
    Hello,

    It looks like you created an invalid API key.
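
    For anyone hitting the same 403: an API key is a single short string created under APIs & Services > Credentials in the Google Cloud console, not the PEM private key from a service-account JSON file. Below is a rough sketch of a sanity check, assuming the asset sends the 'Api Key' field as the key query parameter of the Speech-to-Text REST endpoint. That is an assumption about the asset, and the ApiKeyCheck class and its method names are made up for illustration.

```csharp
using System;

public static class ApiKeyCheck
{
    // Public REST endpoint of Google Cloud Speech-to-Text v1.
    const string RecognizeEndpoint = "https://speech.googleapis.com/v1/speech:recognize";

    // True if the value looks like PEM key material from a service-account
    // JSON file rather than a plain API key string.
    public static bool LooksLikePrivateKey(string value)
    {
        return value != null && value.Contains("-----BEGIN");
    }

    // Builds the request URL with the API key in the "key" query parameter.
    // Assumption: the asset calls the REST API this way.
    public static string BuildRecognizeUrl(string apiKey)
    {
        if (string.IsNullOrWhiteSpace(apiKey) || LooksLikePrivateKey(apiKey))
            throw new ArgumentException("Expected an API key, not a service-account private key.");
        return RecognizeEndpoint + "?key=" + Uri.EscapeDataString(apiKey.Trim());
    }
}
```

    If LooksLikePrivateKey returns true for the value pasted into the prefab, the wrong kind of credential was generated.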

    Take a look at this video:


    Best
     
  5. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    208
    Hello,

    Actually, we have functionality called 'voice detection' that will help you implement the feature you want. It detects when the user starts and stops talking and sends the requests directly.

    Best
     
  6. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    208
    Hello,

    We haven't done much comparison, but in general, for us, the Google Cloud service is more scalable and workable.

    So it depends on what you need in your project.

    We have a lot of experience making plugins based on Google services, and we don't have any plugins for IBM Watson. I would say we could make one, but I don't know how popular it would be.

    Best
     
  7. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    208
    Hello,

    Actually, it works in a parallel thread, so it should not be a problem,
    but I guess it depends on the device.

    Best
     
  8. wintermuute

    wintermuute

    Joined:
    May 12, 2017
    Posts:
    4
    Hi,
    I'm looking at purchasing your Google Cloud Machine Learning Kit and am wondering if it would work with Google's Cloud AutoML? If it does not, is this something you are planning on adding support for?
     
  9. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    208
    Hello,

    For now, no, we don't support it.

    It's a big API, so it's not easy to implement, but we are looking at new products on Google Cloud and plan to add support for them.

    Best
     
  10. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    208
    Hello,

    check out our Android Demo on Google Play: click here

    Use your own API Key or our demo key.

    You can check how it works before buying.

    Best Regards
     
  11. peteRunner

    peteRunner

    Joined:
    Oct 7, 2014
    Posts:
    7
    Hi all,
    I wrote before, but the problem still persists.

    When used in mobile VR, the speech recognition causes a lag of 1-2 seconds, and it kills the whole VR experience. See the video here, seconds 7-12. My guess is that it is caused by audio encoding. Is there anyone who does not have similar problems in mobile VR?
     

  12. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    208
    Hi,

    I guess in newer Unity versions you can use .NET 4.x, and async\await functions can help you remove the lag there.
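
    A minimal sketch of that idea, assuming the lag comes from converting recorded float samples to 16-bit PCM (LINEAR16) on the main thread. PcmEncoder is a hypothetical helper and not part of the asset; it only shows how the conversion could be moved to a thread-pool thread with Task.Run.

```csharp
using System;
using System.Threading.Tasks;

public static class PcmEncoder
{
    // Converts float samples in [-1, 1] to 16-bit little-endian PCM (LINEAR16).
    public static byte[] ToLinear16(float[] samples)
    {
        var bytes = new byte[samples.Length * 2];
        for (int i = 0; i < samples.Length; i++)
        {
            float clamped = Math.Max(-1f, Math.Min(1f, samples[i]));
            short value = (short)(clamped * short.MaxValue);
            bytes[2 * i] = (byte)(value & 0xFF);            // low byte first
            bytes[2 * i + 1] = (byte)((value >> 8) & 0xFF); // then high byte
        }
        return bytes;
    }

    // Runs the conversion on a thread-pool thread so the calling thread
    // (the main thread in a Unity app) is not blocked while it runs.
    public static Task<byte[]> ToLinear16Async(float[] samples)
    {
        return Task.Run(() => ToLinear16(samples));
    }
}
```

    From an async method this would be used as byte[] pcm = await PcmEncoder.ToLinear16Async(samples); with the samples array copied out of the AudioClip on the main thread beforehand.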

    Best
     
  13. domdev

    domdev

    Joined:
    Feb 2, 2015
    Posts:
    274
    I have a problem at runtime: when I start recording, stop, and then go back to recording, I get this error:
    ArgumentException: AudioClip.SetData failed; invalid data
    FrostweepGames.Plugins.GoogleCloud.SpeechRecognition.MediaManager.MakeAudioClipFromSamples (System.Single[] samples) (at Assets/FrostweepGames/GCSpeechRecognition/Scripts/Core/Managers/MediaManager.cs:186)
    FrostweepGames.Plugins.GoogleCloud.SpeechRecognition.MediaManager.Update () (at Assets/FrostweepGames/GCSpeechRecognition/Scripts/Core/Managers/MediaManager.cs:130)
    FrostweepGames.Plugins.GoogleCloud.SpeechRecognition.ServiceLocator.Update () (at Assets/FrostweepGames/GCSpeechRecognition/Scripts/Core/Utilites/ServiceLocator.cs:27)
    FrostweepGames.Plugins.GoogleCloud.SpeechRecognition.GCSpeechRecognition.Update () (at Assets/FrostweepGames/GCSpeechRecognition/Scripts/GCSpeechRecognition.cs:93)
     
  14. domdev

    domdev

    Joined:
    Feb 2, 2015
    Posts:
    274
    Also, speech recognition fails for me on iOS, but it works fine on Android.
     
  15. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    208
    Hello,

    Which error do you see in Xcode?

    Best
     
  16. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    208
    Regarding this issue: when does it appear? Does it reproduce every time or randomly?
    It can appear if there are problems with the microphone.

    How can we reproduce it?

    Best
     
  17. domdev

    domdev

    Joined:
    Feb 2, 2015
    Posts:
    274
    To reproduce it, set it to runtime voice detection, start recording, then stop and start recording again. The error will show up. I do this because in runtime voice detection I want to add a stop-recording option.
     
  18. domdev

    domdev

    Joined:
    Feb 2, 2015
    Posts:
    274
    Also, in runtime voice detection, why is it so hard to get a callback from Google Speech on the phone, while in the Unity Editor the callback arrives immediately? Could it be an internet problem? The phone is on the same connection, and the log doesn't even show an error such as "Speech not recognized" or similar.
     
  19. domdev

    domdev

    Joined:
    Feb 2, 2015
    Posts:
    274
    It doesn't work on iOS anymore; I always get Speech Recognition failed with error {}
     
  20. domdev

    domdev

    Joined:
    Feb 2, 2015
    Posts:
    274
    I'm building it in Unity Cloud; my Unity version was 2018.0.3f2.
    It doesn't work on iOS anymore; I always get Speech Recognition failed with error {}
     
    Last edited: Feb 12, 2019
  21. ShahSoft

    ShahSoft

    Joined:
    Jan 13, 2013
    Posts:
    12
    Hey FrostweepGames,

    Will this work in a WebGL build? Perhaps along with your WebGL Microphone package, since Microphone is missing in Unity WebGL.
     
  22. Kuin1982

    Kuin1982

    Joined:
    Dec 11, 2014
    Posts:
    1
    Hey FrostweepGames,
    Is it possible to define some keywords in order to get more accurate detection, or to detect only those keywords rather than the whole grammar?
    I have purchased the asset but I didn't see that feature.

    In the documentation it seems that there are phrases; maybe this parameter can be included:
    "config": {
      "encoding": "LINEAR16",
      "sampleRateHertz": 8000,
      "languageCode": "en-US",
      "speechContexts": [{
        "phrases": ["Chromecast", "Chromecast model"]
      }]
    },

    Thanks in advance,

    Pedro.
     
    Last edited: Feb 20, 2019
  23. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    208
    Hello,

    Yes, it can work in WebGL with MicrophoneLibrary added to the project.

    Best
     
  24. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    208

    Hello,

    Yes, you can use Speech Context as a phrases list.
    Our asset provides full access to the REST API for this service.

    Best
     
  25. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    208
    Oh I see.

    will check it.
     
  26. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    208
  27. ColtonK_VitruviusVR

    ColtonK_VitruviusVR

    Joined:
    Nov 27, 2015
    Posts:
    146
    Hey @FrostweepGames

    We're still looking for streaming support for desktop (Windows). Any progress?

    Cheers,
    Colton
     
  28. shwa

    shwa

    Joined:
    Apr 9, 2012
    Posts:
    435
    Hi,
    I'm looking for a relatively simple voice recognition solution.
    For Mac and Win desktop standalone.

    1. Menu : User can speak 1 word, or just a few words, and it opens a specific scene.

    2. Single word or short words trigger an action/response within a scene.

    Can this Frostweep voice recognition asset do this right out of the box?

    If not, is it relatively straightforward to use, so I can implement the above?

    To start, I'd prefer to buy just this asset, and not the Google Cloud Machine Learning Kit.

    Does this asset have all the same voice recognition functionality/demos/uses as the more expensive kit?
    Is there an upgrade path to the Cloud Machine kit after I buy the voice recognition asset?

    thanks!
     
  29. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    208
    Hello,

    You can use it however you want.

    As a result you will get the recognized text, so you just need to compare it with your list of commands and then change the scene or do whatever you want ^^.
    Yes, you can buy only the speech recognition asset, not the whole kit.
    And yes, you will get a discount from the Asset Store when buying the whole kit.
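
    Comparing the recognized text with a command list could look roughly like this. It is only a sketch: CommandMatcher, the phrases, and the scene names are hypothetical, and how the recognized text is obtained from the asset is not shown.

```csharp
using System;
using System.Collections.Generic;

public static class CommandMatcher
{
    // Hypothetical command list: spoken phrase -> scene name to open.
    static readonly Dictionary<string, string> Commands =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "open menu", "MenuScene" },
            { "start game", "GameScene" },
        };

    // Returns the scene mapped to the recognized text, or null if nothing matches.
    public static string Match(string recognizedText)
    {
        if (string.IsNullOrWhiteSpace(recognizedText)) return null;

        // Drop surrounding whitespace and trailing punctuation before the lookup.
        string key = recognizedText.Trim().TrimEnd('.', '!', '?');

        string scene;
        return Commands.TryGetValue(key, out scene) ? scene : null;
    }
}
```

    A non-null result can then be passed to SceneManager.LoadScene, or used to trigger an action inside the current scene.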


    Best regards
     
  30. Sabre-Runner

    Sabre-Runner

    Joined:
    Mar 28, 2013
    Posts:
    2
    The MediaManager has a RecordFailedEvent field which the GCSpeechRecognition registers for.
    But nothing in MediaManager calls for it. Is it redundant or is there a line missing?
     
  31. unity_-WNfY6QzzZV4nA

    unity_-WNfY6QzzZV4nA

    Joined:
    Dec 22, 2018
    Posts:
    2
     
  32. unity_-WNfY6QzzZV4nA

    unity_-WNfY6QzzZV4nA

    Joined:
    Dec 22, 2018
    Posts:
    2
    Hi, the documentation shows a "Speech Context Phrases" list in the configs for the component, but I don't see that. Can you please advise on how to set the context?
     
  33. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    208
    Hello,

    You can set context phrases by getting an instance of GCSpeechRecognition and calling this method:

    Code (csharp):
    void SetContext(List<string[]> contexts)
    The method takes a list of arrays of context phrases.
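
    A small sketch of building that argument. The SpeechContexts helper and the menu phrases are hypothetical, and treating each array as one group of phrases is an assumption about how the asset maps them to the REST speechContexts field.

```csharp
using System.Collections.Generic;

public static class SpeechContexts
{
    // Builds the argument for SetContext(List<string[]> contexts).
    // Assumption: each array is one group of phrases to favor.
    public static List<string[]> Build()
    {
        return new List<string[]>
        {
            new[] { "open menu", "start game" },        // hypothetical menu commands
            new[] { "Chromecast", "Chromecast model" }, // phrases from the question above
        };
    }
}

// Assumed usage, given a GCSpeechRecognition instance obtained as the asset documents:
// speechRecognition.SetContext(SpeechContexts.Build());
```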

    Best Regards
     
  34. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    208
    Hello,

    Strange, but it could be a missing line.

    We will check and fix it.

    Best
     
  35. blamejane

    blamejane

    Joined:
    Jul 8, 2013
    Posts:
    209
    I purchased this asset and I have the Example demo working for the Start/Stop button. However, I can't seem to get Voice Detection working. When I follow the steps in your 2nd video (original post for this thread), I click Voice Detection and click Start Recording, but whenever I speak, nothing triggers BeginTalkingEvent unless I set Threshold to a negative number. Once I do that it detects words, but it never calls EndTalking until I slide Threshold to a positive number.

    I am testing in the Unity Editor on my Mac.

    I simply want to detect 1-word and 2-word commands, and then execute the commands if the result response text matches my commands list. The only thing not working at this point is that I can't get the begin/end talk voice detection to process my speech.

    What should I do?

    Thanks.
     
  36. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    208
    Hello,

    The accuracy of this function depends directly on microphone noise and background noise.
    To improve detection quality you can change VoiceDetectionThreshold.

    We are currently working on version 4.0 of our asset, which has a function that dynamically sets the threshold depending on the recorded noise.

    You could do the same by recording a few seconds of audio, calculating the average of the samples array, and inserting this value into the VoiceDetectionThreshold parameter in the config.
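
    A rough sketch of that calibration. It uses the mean absolute amplitude rather than a plain average, since signed audio samples average out to roughly zero. NoiseCalibration and the margin factor are hypothetical, and whether VoiceDetectionThreshold uses the same scale as raw sample amplitudes is an assumption to verify against the asset.

```csharp
using System;

public static class NoiseCalibration
{
    // Mean absolute amplitude of the samples: a rough estimate of the noise floor.
    public static float AverageLevel(float[] samples)
    {
        if (samples == null || samples.Length == 0) return 0f;
        double sum = 0;
        foreach (float s in samples) sum += Math.Abs(s);
        return (float)(sum / samples.Length);
    }

    // Suggests a threshold somewhat above the noise floor.
    // margin is a hypothetical tuning factor, not an asset setting.
    public static float SuggestThreshold(float[] silenceSamples, float margin = 1.5f)
    {
        return AverageLevel(silenceSamples) * margin;
    }
}
```

    The samples would come from a few seconds recorded while nobody is speaking, and the suggested value would then be tried as VoiceDetectionThreshold.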

    let me know if it helps.

    Best