[RELEASED] Unity Google Cloud Speech Recognition [VR\AR\Mobile\Desktop]

Discussion in 'Assets and Asset Store' started by FrostweepGames, Aug 6, 2015.

  1. L13

    L13

    Joined:
    Jan 18, 2016
    Posts:
    2
    Hello,
    I am developing an app that uses speech-to-text and translation.
    Can I call the "StopRecord" function while voice detection is on?
    I tried it, but when I restart recording with voice detection enabled, I get
    "ArgumentException: Length of created clip must be larger than 0"
    Am I switching between the start and stop functions too quickly, or is it something else?

    Thanks.
     
  2. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
    Hello,

    - Yes, it's possible. When you click Start, voice detection is enabled; when you click Stop Record, voice detection is disabled.

    - We need to check that; it may be a bug in the code.
    Is it reproducible in the editor?

    Thanks
     
  3. L13

    L13

    Joined:
    Jan 18, 2016
    Posts:
    2
    Sometimes, when I stop recording while voice detection is on and Google doesn't send back a result, the error appears the next time I restart recording.

    Could the missing result from Google be the cause?

    Thanks.

    [Attached image: 001.png]
     
  4. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
    Hello,

    We will check it and fix it in the next update.

    Thanks for the feedback.
     
  5. djmilk

    djmilk

    Joined:
    Jul 16, 2013
    Posts:
    4
    StartRecord(bool isEnabledVoiceDetection)

    It works with isEnabledVoiceDetection = false, but it doesn't work with isEnabledVoiceDetection = true.
     
  6. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
    hello.

    we will check it. thanks
     
  7. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
  8. Elecman

    Elecman

    Joined:
    May 5, 2011
    Posts:
    1,371
    Is it possible to provide context-aware phrase hints dynamically?
     
  9. ColtonKadlecik_VitruviusVR

    ColtonKadlecik_VitruviusVR

    Joined:
    Nov 27, 2015
    Posts:
    197
    Hey @FrostweepGames

    In the video and in the description of the asset it says runtime voice detection. Does that mean the plugin will only send data to Google when the player is actually speaking? (i.e. I am wondering if I will be charged when the player isn't actually speaking)

    Also, when will streaming be available for desktop games (Oculus/Vive)?

    Cheers,
    Colton
     
    Last edited: Jan 22, 2018
  10. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
    hello,
    You can supply the context phrases before the request is sent to the Google service, so it is possible.
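
    For reference, here is a minimal sketch of what per-request phrase hints look like at the REST level, assuming the audio is LINEAR16 and already Base64-encoded. The speechContexts / phrases fields belong to Google's speech:recognize request body; the PhraseHintRequest helper and its parameters are made up for illustration and are not part of the plugin.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Text;

    // Hypothetical helper (not part of the plugin): builds the JSON body for
    // Google's REST method speech:recognize with phrase hints attached.
    public static class PhraseHintRequest
    {
        // Escapes the two characters that would break a JSON string literal here.
        // Control characters are not handled in this sketch.
        static string Quote(string s)
        {
            return "\"" + s.Replace("\\", "\\\\").Replace("\"", "\\\"") + "\"";
        }

        // The phrase list is an argument, so it can differ from one request to the next.
        public static string Build(string base64Audio, string languageCode,
                                   int sampleRateHertz, IEnumerable<string> phrases)
        {
            var quoted = new List<string>();
            foreach (var phrase in phrases) quoted.Add(Quote(phrase));

            var sb = new StringBuilder();
            sb.Append("{\"config\":{");
            sb.Append("\"encoding\":\"LINEAR16\",");
            sb.Append("\"sampleRateHertz\":").Append(sampleRateHertz).Append(',');
            sb.Append("\"languageCode\":").Append(Quote(languageCode)).Append(',');
            sb.Append("\"speechContexts\":[{\"phrases\":[")
              .Append(string.Join(",", quoted)).Append("]}]");
            sb.Append("},\"audio\":{\"content\":").Append(Quote(base64Audio)).Append("}}");
            return sb.ToString();
        }
    }
    ```

    Because the phrase list is just an argument, it can be rebuilt before every request, e.g. PhraseHintRequest.Build(audio, "en-US", 16000, new[] { "open the door", "close the door" }).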

    thanks
     
  11. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
    Hello,

    Voice detection actually works like this:
    1. Voice detection starts.
    2. Speech is detected.
    3. Recording starts.
    4. Speech is no longer detected.
    5. Recording stops.
    6. The recording is sent to the Google service.
    7. The result is received.

    It can work in parallel (meaning you can talk again while the previous recording is being recognized by the Google service).
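
    The steps above can be sketched roughly as follows. This is not the plugin's code: VoiceGate, the RMS threshold, and the silence-frame count are assumptions chosen only to illustrate the detect, record, stop, send cycle.

    ```csharp
    using System;
    using System.Collections.Generic;

    // Hypothetical stand-in for the plugin's voice detection; not its actual code.
    public class VoiceGate
    {
        readonly float _threshold;    // assumed RMS level that counts as speech
        readonly int _silenceFrames;  // assumed number of quiet frames that end an utterance
        readonly List<float> _buffer = new List<float>();
        int _quiet;
        bool _recording;

        // Raised with the buffered samples when an utterance ends (step 6: send).
        public event Action<float[]> UtteranceReady;

        public VoiceGate(float threshold = 0.02f, int silenceFrames = 10)
        {
            _threshold = threshold;
            _silenceFrames = silenceFrames;
        }

        static float Rms(float[] frame)
        {
            if (frame.Length == 0) return 0f;
            double sum = 0;
            foreach (var s in frame) sum += s * s;
            return (float)Math.Sqrt(sum / frame.Length);
        }

        // Feed one microphone frame at a time (step 1: detection is running).
        public void Process(float[] frame)
        {
            bool speech = Rms(frame) >= _threshold;

            if (!_recording)
            {
                if (!speech) return;  // still waiting for speech
                _recording = true;    // steps 2-3: speech detected, start recording
                _quiet = 0;
            }

            _buffer.AddRange(frame);
            _quiet = speech ? 0 : _quiet + 1;

            if (_quiet >= _silenceFrames)  // steps 4-5: silence, stop recording
            {
                var utterance = _buffer.ToArray();
                _buffer.Clear();
                _recording = false;
                if (UtteranceReady != null) UtteranceReady(utterance);  // step 6
            }
        }
    }
    ```

    Since the gate goes back to listening as soon as UtteranceReady fires, a handler that uploads asynchronously would allow the next utterance to be captured while the previous one is still being recognized.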

    Streaming speech recognition is still in development because it needs a lot of functionality that Unity doesn't support, so we are trying to implement our own bridge for it.

    Thanks for understanding. I will let you know when it is available for public release.

    best
     
  12. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
  13. Thomas_DK

    Thomas_DK

    Joined:
    Feb 1, 2018
    Posts:
    3
    Hi there at Frostweep,

    My name is Thomas and I'm relatively new to Unity.

    I have a question about a minimal speech detection setup. So far I can run the example fine with my own API key, but I'm completely lost in the documentation on how to implement my own speech detection.

    So my question is, what are the steps?
    I'm guessing:
    1) Start recording
    2) How do I send the result?
    3) How do I get the result?
    4) How do I "print" the result to text?
    5) Stop recording?

    A minimal example with just a keypress (send, get the result, and print the result to text) would be greatly appreciated.
    Best Wishes
    /Thomas_DK
     
  14. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
    Hello.

    An example was attached.

    1) Call the StartRecord method.
    2) Subscribe to the events from the recognition service, as in the example.
    3) You can use Debug.Log(%somevariable%); to print the result.
    4) Call the StopRecord method.
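
    A minimal sketch of those four steps, for readers without the attachment. StartRecord and StopRecord are the method names used in this thread; the ISpeechRecognizer interface, its event names, and the PushToTalk class are assumptions for illustration and do not mirror the plugin's real API.

    ```csharp
    using System;

    // Hypothetical interface: only StartRecord/StopRecord come from this thread;
    // the event names and signatures are assumptions.
    public interface ISpeechRecognizer
    {
        event Action<string> RecognitionSuccess;
        event Action<string> RecognitionFailed;
        void StartRecord(bool isEnabledVoiceDetection);
        void StopRecord();
    }

    public class PushToTalk
    {
        readonly ISpeechRecognizer _recognizer;
        readonly Action<string> _log;

        public PushToTalk(ISpeechRecognizer recognizer, Action<string> log)
        {
            _recognizer = recognizer;
            _log = log;  // in Unity this could be s => Debug.Log(s)

            // Step 2: subscribe before recording so no result is missed.
            // Step 3: print whatever comes back.
            _recognizer.RecognitionSuccess += text => _log("Recognized: " + text);
            _recognizer.RecognitionFailed += error => _log("Failed: " + error);
        }

        // Step 1: call when the key goes down (voice detection off in this sketch).
        public void OnKeyDown() { _recognizer.StartRecord(false); }

        // Step 4: call when the key is released.
        public void OnKeyUp() { _recognizer.StopRecord(); }
    }
    ```

    In a MonoBehaviour, OnKeyDown and OnKeyUp would typically be called from Update() on Input.GetKeyDown and Input.GetKeyUp.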

    best
     

    Attached Files:

  15. dizzymediainc

    dizzymediainc

    Joined:
    Apr 6, 2014
    Posts:
    433
    Well after A LOT of tinkering I managed to get it working ^_^ I feel like the process to get everything working as I wanted took way too long and should have been a lot simpler to get going, something to consider when updating the documentation.

    Also, I noticed that LongRecognition does not have a failed value, so if I use that in the prefab settings and don't say anything while it tries to detect speech, it'll throw an error and stop the game scene. I looked into the code for those calls (the script on the provided prefab) and I can clearly see there's no failed event for LongRecognition. This definitely should be there; please look into it.

    I attached my demo script for anyone who wants to get started quickly with detecting words. I use a simple string array that you can fill manually (dict) as a word dictionary; on start, the system creates the new Dictionary values (myWords = key and action) for each value in dict.

    I use Odin so I commented out some things for those who don't and am currently working on getting an array of actions that will add based on the dict value (not done).


    LAST NOTES: Something that seems very important and took me a while to get right is the right extension to refer to when checking the words, YOU MUST refer to "alternative.transcript" and not "alternative" when comparing the results to the words in your dictionary/library. If you don't use "alternative.transcript" it simply won't compare the words properly and you'll get an error every time.
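
    A minimal sketch of that comparison, for anyone without the attached script. The Alternative class below only mirrors the transcript and confidence fields of one entry in Google's response; the class itself, CommandMatcher, and CreateTable are hypothetical names and not the plugin's types.

    ```csharp
    using System;
    using System.Collections.Generic;

    // Mirrors the shape of one entry in "alternatives" in the recognition response.
    public class Alternative
    {
        public string transcript;
        public float confidence;
    }

    public static class CommandMatcher
    {
        // A case-insensitive table, so "Jump" and "jump" hit the same entry.
        public static Dictionary<string, Action> CreateTable()
        {
            return new Dictionary<string, Action>(StringComparer.OrdinalIgnoreCase);
        }

        // Looks up the recognized text (alternative.transcript, not the Alternative
        // object itself) and runs the matching action. Returns false on no match.
        public static bool TryRun(Alternative alternative,
                                  Dictionary<string, Action> commands)
        {
            if (alternative == null || string.IsNullOrEmpty(alternative.transcript))
                return false;

            string spoken = alternative.transcript.Trim();
            Action action;
            if (commands.TryGetValue(spoken, out action))
            {
                action();
                return true;
            }
            return false;
        }
    }
    ```

    The table is keyed by plain strings, which is why the lookup has to use the transcript string rather than the alternative object.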

    This took way too long to figure out, it does mention the "transcript" method in the documentation but is not exactly clear on the importance of or the proper way to refer to it and use it.

    Anyways Hope this can be helpful ^_^
     

    Attached Files:

  16. superjayman

    superjayman

    Joined:
    May 31, 2013
    Posts:
    185
    When will you support streaming recognition? This pre-recorded stuff is useless.
     
  17. JoeStrout

    JoeStrout

    Joined:
    Jan 14, 2011
    Posts:
    9,859
    I've just bought this plugin, but I can't get it to work. In my Google Cloud Platform dashboard, I have "Cloud Speech API" enabled. Then I clicked on "Credentials", "Create Credentials," and selected API key. It generated a key for me (39 characters of gibberish).

    I loaded the Example scene from the plugin, and pasted this key in for the Api Key property. But when I run, press Record, say something, and then Stop, I get:
    Code (csharp):
    {
      "error": {
        "code": 403,
        "message": "The request is missing a valid API key.",
        "status": "PERMISSION_DENIED"
      }
    }
    What am I doing wrong?
     
  18. JoeStrout

    JoeStrout

    Joined:
    Jan 14, 2011
    Posts:
    9,859
    OK, never mind, I found the "isUseAPIKeyFromPrefab" boolean.

    As a bit of feedback: I think this is poor design. If the prefab has an API key filled in, you should use it. Why would somebody fill in an API key and not want to use it? I bet this is something that trips up just about every new user of your plugin, for no good reason.
     
  19. JoeStrout

    JoeStrout

    Joined:
    Jan 14, 2011
    Posts:
    9,859
    OK, now that I have it working, I find myself a little disappointed in the responsiveness. A simple input like "yes" takes over 2 seconds from the time the recording is stopped, to the RecognitionSuccessEventHandler. A longer input takes even longer to process (typically about 3 seconds in my testing).

    That's a long time to wait for the robot or NPC or whatever to react. Is this really the best we can do?
     
  20. dizzymediainc

    dizzymediainc

    Joined:
    Apr 6, 2014
    Posts:
    433
    I agree, a quicker response time would definitely be more ideal. 2-3 seconds is too long, something around 1 second would be useful.
     
  21. JoeStrout

    JoeStrout

    Joined:
    Jan 14, 2011
    Posts:
    9,859
    For what it's worth, I gave up on this one and switched to the Watson Unity plug-in. It's very performant, and the cost of the service is quite a bit less than Google too (even after you go over the 1000 free minutes/month, which you may never do).
     
  22. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
    Hello,

    We will try to optimize the functionality so that it is more accurate and easier to understand.

    Thanks for the feedback, and sorry for the long delay.
     
  23. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
    Hello,

    For now we are working on streaming speech recognition for Android and iOS.

    I expect it will be available very soon, because it has fewer problems there than on desktop.

    thanks
     
  24. haydenjameslee

    haydenjameslee

    Joined:
    Apr 11, 2014
    Posts:
    18
    I'm running into the same issue as castana1962. Did you manage to fix the problem?
     
  25. haydenjameslee

    haydenjameslee

    Joined:
    Apr 11, 2014
    Posts:
    18
    @castana1962 I fixed it. The problem is that Photon stores the Newtonsoft DLL in an Editor folder, which is compiled after regular scripts. See more info here: https://docs.unity3d.com/Manual/ScriptCompileOrderFolders.html

    To fix:
    1. Delete all instances of Newtonsoft in the Frostweep folders
    2. Move Photon's instance of Newtonsoft DLL into the Plugins folder
    3. Select the Newtonsoft DLL and make sure "any platform" is selected and it excludes no platforms like so:

    [Attached image: upload_2018-3-31_14-27-0.png]
     
  26. haydenjameslee

    haydenjameslee

    Joined:
    Apr 11, 2014
    Posts:
    18
    Hello @FrostweepGames

    Is it possible to "cancel" a recording so that it doesn't send and return the recording to google? We would like to give the user an ability to cancel a recording and record another one a few moments later. Is this already possible to do?
     
  27. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
    Hello,

    Yes, it's possible.

    You can write an override of the stop-record function that does not send the recording to the Google service.
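
    A rough sketch of that idea, assuming the recorder exposes a virtual stop method. Every name here (Recorder, CancellableRecorder, Feed, ReadyToSend, CancelRecord) is hypothetical and only stands in for the plugin's own classes.

    ```csharp
    using System;

    // Hypothetical recorder used only for illustration; the plugin's real class
    // and member names may differ.
    public class Recorder
    {
        protected float[] Captured = new float[0];
        public bool IsRecording { get; protected set; }

        // Raised when a finished recording should be uploaded for recognition.
        public event Action<float[]> ReadyToSend;

        public void StartRecord()
        {
            Captured = new float[0];
            IsRecording = true;
        }

        // Stand-in for microphone capture: stores the latest samples.
        public void Feed(float[] samples)
        {
            if (IsRecording) Captured = samples;
        }

        public virtual void StopRecord()
        {
            if (!IsRecording) return;
            IsRecording = false;
            if (ReadyToSend != null) ReadyToSend(Captured);
        }
    }

    public class CancellableRecorder : Recorder
    {
        bool _discard;

        // Stops recording and drops the audio instead of sending it.
        public void CancelRecord()
        {
            _discard = true;
            StopRecord();
        }

        public override void StopRecord()
        {
            if (_discard)
            {
                _discard = false;
                IsRecording = false;
                Captured = new float[0];  // nothing is sent
                return;
            }
            base.StopRecord();
        }
    }
    ```

    After CancelRecord() the recorder is idle again, so a new StartRecord() a few moments later begins with an empty buffer.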

    If you need more details, let me know.

    Best
     
  28. superjayman

    superjayman

    Joined:
    May 31, 2013
    Posts:
    185
    Useless!! Get it working for the PC. Your current implementation has a 2-second delay; are you guys crazy?
     
  29. haydenjameslee

    haydenjameslee

    Joined:
    Apr 11, 2014
    Posts:
    18
    @superjayman The 2 second delay is most likely caused by Google, not this plugin, unless I'm misunderstanding something. Speech to text is an intensive process and hence Google's servers take a while to process the audio and return it to the plugin. Streaming will also have latency from when you say the word in question and when google responds. I hope that clears things up.
     
  30. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
  31. ColtonKadlecik_VitruviusVR

    ColtonKadlecik_VitruviusVR

    Joined:
    Nov 27, 2015
    Posts:
    197
    Hey @FrostweepGames,

    Just checking in to see if there is any progress on streaming for Windows?

    Cheers,
    Colton
     
  32. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
    Hello.

    Thanks for your interest.

    We actually have a working demo version of the asset.

    For now we are working on optimizing the DLL libraries in Unity.

    Short description of current progress:
    - It currently works on Windows (checked) and should work on OSX and Linux (not tested yet).
    - It uses the official gRPC, Protobuf, and Speech client libraries for streaming speech recognition.
    - Full support for the gRPC functions, including streaming, non-streaming, long, and default speech recognition.

    Problems:
    - Unity throws some errors on the DLLs.
    - Building an app only succeeds on the second attempt (the first build attempt fails with an error; the second is okay).


    Best Regards
     
  33. peteRunner

    peteRunner

    Joined:
    Oct 7, 2014
    Posts:
    7
    Hi Guys,
    is Audio encoding config working? Because only Linear16 works for me. When I check code, in AudioConvert.cs I see only Linear16 support, for others, there is
    Code (CSharp):
    throw new System.NotSupportedException(encoding + " doesn't supported for converting!");
    So the plugin doesn't send the request to GC.

    Peter.
     
  34. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
    Hello.

    Actually, the plugin works only with LINEAR_16 because it is the raw data of the WAV format. If you want to use other codecs, you have to implement the codec, add functionality to convert an AudioClip to your format, and then change the code in MediaManager (as I remember) so that it sends the bytes in your format.
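
    To illustrate what "raw data of the WAV format" means, here is a minimal sketch of turning float samples into 16-bit little-endian PCM. The Pcm16 helper is a made-up name and is not the plugin's AudioConvert code; it assumes the samples are in the -1 to 1 range, as returned by Unity's AudioClip.GetData.

    ```csharp
    using System;

    // Hypothetical helper (not the plugin's AudioConvert): float samples to LINEAR16.
    public static class Pcm16
    {
        // Converts samples in [-1, 1] to 16-bit signed little-endian PCM bytes.
        public static byte[] FromFloats(float[] samples)
        {
            var bytes = new byte[samples.Length * 2];
            for (int i = 0; i < samples.Length; i++)
            {
                // Clamp first so out-of-range input cannot overflow the short.
                float clamped = Math.Max(-1f, Math.Min(1f, samples[i]));
                short value = (short)Math.Round(clamped * short.MaxValue);
                bytes[2 * i] = (byte)(value & 0xFF);             // low byte first
                bytes[2 * i + 1] = (byte)((value >> 8) & 0xFF);  // then high byte
            }
            return bytes;
        }

        // The REST API takes audio content as Base64 text inside the JSON body.
        public static string ToBase64(float[] samples)
        {
            return Convert.ToBase64String(FromFloats(samples));
        }
    }
    ```

    Any other codec would replace FromFloats with its own encoder, which is the part the plugin does not ship.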

    Best
     
  35. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
    Hello guys!

    Good news! We will release a new asset named 'Google Cloud Machine Learning Kit' that includes all of our assets that connect to Google services, such as:

    Speech Recognition API
    Vision API
    Natural Language API
    Translation API

    ~ All in one and optimized to use services together
    ~ Support of both Unity Network Methods: UnityWebRequest and WWW class.
    ~ Optimized network options.
    ~ Updated assets with fixes and improvements.

    Thanks for your support!

    Best Regards,
    your Frostweep Games team
     
  36. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
    Hello guys!

    We are happy to tell you that our new asset has been published on the Asset Store!

    Check it out right now on the Google Cloud Machine Learning Kit page.

    Thanks for your support and we are very happy that you use our assets in your Unity projects!

    Best Regards,

    your Frostweep Games team.
     
  37. icyfrog66

    icyfrog66

    Joined:
    Aug 15, 2017
    Posts:
    8
    I am also getting the error: "ArgumentException: Length of created clip must be larger than 0"
    Has this error been resolved yet?

    I used almost the same setting as the example scene, except the way I am calling the record method is

    public void Update()
    {
        if (Input.GetKeyDown(KeyCode.Space)) {
            Debug.Log("space");
            _speechRecognition.StartRecord(true);
        }
        if (Input.GetKeyUp(KeyCode.Space)) {
            Debug.Log("stopped");
            _speechRecognition.StopRecord();
        }
    }

    The only error I am getting is the error for the length of the created clip.
     
  38. icyfrog66

    icyfrog66

    Joined:
    Aug 15, 2017
    Posts:
    8
    It seems that even if I create buttons to call the StartRecord method, the clip length is still reported as not larger than 0. Do you know the solution to this issue?
    One of the errors on the debug log is just "{", from the NetworkResponseEventHandler method in SpeechRecognitionManager.

    When I made sure I had exactly the same settings as the Example, it worked. It would be good to know what causes this error, but for now things seem to work.
     
    Last edited: Jul 4, 2018
  39. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
    Hello,

    We will check it out.

    Thanks for the report.

    We will get back to you soon.

    Best
     
  40. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
    Hello guys!

    We're happy to say that our new asset was published on the Asset Store!

    Check it out right now on the Google Cloud Text To Speech page.

    Thanks for your support!

    Best Regards,

    your Frostweep Games team.
     
  41. SachinGanesh

    SachinGanesh

    Joined:
    Jun 28, 2015
    Posts:
    20
    Hi, Do you support Streaming speech recognition?
     
  42. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
  43. peteRunner

    peteRunner

    Joined:
    Oct 7, 2014
    Posts:
    7
    Hello,

    Is anybody using this plugin in mobile VR? When I hit stop record, it causes a spike and lags for 1-2 seconds, which is bad in VR.
    After checking the profiler and experimenting with the plugin, it seems to happen during audio encoding or while sending the web request.

    Do you have a similar experience, or is mobile VR running smoothly for you?

    Thanks.
     
  44. Vt-lab

    Vt-lab

    Joined:
    Jun 17, 2015
    Posts:
    4
    I have a problem that I can't solve.
    I have a project working with one API key, but I am trying to do the same thing with my other account (the production account).

    Unity keeps telling me the request is forbidden (403). I don't remember whether I only need to create a project, activate the API, and create the credentials, or whether I need to do something more. (In my project I only changed the credential key, and it doesn't work.)

    Can anyone help me?
     
  45. kilik128

    kilik128

    Joined:
    Jul 15, 2013
    Posts:
    909
    Hi, does it work only with the microphone, or can it use a WAV file?
    Thanks.
     
  46. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264

    Yes, we are using it.

    Best
     
  47. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
    Hello,

    Did you enable billing?

    To activate the Google Cloud service you have to create an API key, enable the API, and enable billing.

    You can also enable the 'FullLogIfError' checkbox to see why the 403 error appears.

    Best
     
  48. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
    Hello,

    Currently it only has functionality for recording from the microphone, but you can also make it work with a WAV file.

    The one thing you need to implement is loading the WAV file into the app from storage via WWW or UnityWebRequest.

    Then it should work well.
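
    Once the file's bytes are loaded, the audio itself still has to be pulled out of the WAV container. The sketch below is one way to do that, assuming an uncompressed PCM file; WavReader is a hypothetical helper, not something the plugin provides, and it does not validate the format chunk.

    ```csharp
    using System;
    using System.Text;

    // Hypothetical helper: extracts the raw PCM bytes from a RIFF/WAVE byte array.
    public static class WavReader
    {
        public static byte[] ExtractPcm(byte[] wav)
        {
            if (wav == null || wav.Length < 12 ||
                Encoding.ASCII.GetString(wav, 0, 4) != "RIFF" ||
                Encoding.ASCII.GetString(wav, 8, 4) != "WAVE")
                throw new ArgumentException("Not a RIFF/WAVE file.");

            int pos = 12;  // first chunk after the 12-byte RIFF header
            while (pos + 8 <= wav.Length)
            {
                string id = Encoding.ASCII.GetString(wav, pos, 4);
                // Chunk size is stored little-endian; read it byte by byte.
                int size = wav[pos + 4] | (wav[pos + 5] << 8) |
                           (wav[pos + 6] << 16) | (wav[pos + 7] << 24);
                int start = pos + 8;
                if (size < 0) throw new ArgumentException("Chunk size out of range.");

                if (id == "data")
                {
                    // Tolerate a truncated file by copying only what is present.
                    int length = Math.Min(size, wav.Length - start);
                    var pcm = new byte[length];
                    Array.Copy(wav, start, pcm, 0, length);
                    return pcm;
                }

                if (size > wav.Length - start) break;  // truncated non-data chunk
                pos = start + size + (size & 1);       // chunks are padded to even sizes
            }
            throw new ArgumentException("No data chunk found.");
        }
    }
    ```

    The input could be the byte array from a finished UnityWebRequest (downloadHandler.data), and the sample rate sent with the request should match the one in the file's header.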

    Best
     
  49. peteRunner

    peteRunner

    Joined:
    Oct 7, 2014
    Posts:
    7
    And mobile VR is running smoothly??? No lag for a few seconds when you hit stop recording?? Even a 2-second lag is killing my VR experience.
     
  50. kilik128

    kilik128

    Joined:
    Jul 15, 2013
    Posts:
    909
    Yes, I have tried the Watson API, but it's a big API, so it's hard for my skill level.
    Is there any way to get a sample?