[RELEASED] Google Cloud Streaming Speech Recognition [VR\AR\Mobile\Desktop]

Discussion in 'Assets and Asset Store' started by FrostweepGames, Jun 3, 2018.

  1. itra

    itra

    Joined:
    May 21, 2019
    Posts:
    4
    Yeah, any code you can share would be great. What you've got working in the video is exactly what I'm trying to achieve! I've been playing around with the idea for the last few hours, and so far I've got Unity communicating with my C# .NET server via TcpListener. My plan is to pass the byte array from Unity to the server and then feed it straight into Google's StreamingRecognizeRequest. I'm not sure if this is the right approach. I've now hit a brick wall at the same place it sounds like you did: I'm having trouble capturing the audio on the client and then, once it's on the server, getting Google's API to recognise it. This is the error I keep getting back from Google's API:

    Status(StatusCode=OutOfRange, Detail="Audio Timeout Error: Long duration elapsed without audio. Audio should be sent close to real time.")

    Thanks for the help!
     
  2. Straafe

    Straafe

    Joined:
    Oct 15, 2012
    Posts:
    73
    @itra
    So in the client, it looks like I used an asset called NatMic to get the mic's sample buffer as a byte array, and then I sent those bytes directly to the server over the TCP socket connection. I was recording the audio at a sample rate of 16,000 Hz in a single channel (I can't remember if that mattered for Google's specs, but I also matched that format on the server side when setting up the speech recognition configuration).

    I PM'd you the main server script. It's in Java, but it might give you some insight into dealing with the audio bytes on your server and passing them over to Google. It was based on this.
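
A minimal sketch of the server-side idea described above, not the script that was shared by PM: read the incoming audio bytes in small fixed-size chunks and hand each chunk on as soon as it is full, so audio reaches the recognizer close to real time. It assumes 16-bit PCM samples (the thread only confirms 16,000 Hz and a single channel) and an arbitrary 100 ms chunk length. `AudioChunkReader`, `forward`, and the `sink` callback are hypothetical names; the sink stands in for whatever code passes the bytes to Google's streaming request.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

final class AudioChunkReader {
    // Assumed audio format: 16 kHz, mono, 16-bit PCM (2 bytes per sample).
    static final int SAMPLE_RATE_HZ = 16_000;
    static final int BYTES_PER_SAMPLE = 2;
    // Assumed chunk length; small chunks keep the stream close to real time.
    static final int CHUNK_MILLIS = 100;

    /** Bytes in one chunk: 16000 * 2 * 100 / 1000 = 3200. */
    static int chunkSizeBytes() {
        return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHUNK_MILLIS / 1000;
    }

    /**
     * Reads the stream until it ends and passes each chunk to the sink as
     * soon as it is full. The last chunk may be shorter than the others.
     * Returns the total number of bytes forwarded.
     */
    static long forward(InputStream in, Consumer<byte[]> sink) throws IOException {
        byte[] buffer = new byte[chunkSizeBytes()];
        long total = 0;
        while (true) {
            // Blocks until the buffer is full or the stream ends.
            int n = in.readNBytes(buffer, 0, buffer.length);
            if (n == 0) {
                break; // end of stream, nothing left to forward
            }
            sink.accept(Arrays.copyOf(buffer, n));
            total += n;
            if (n < buffer.length) {
                break; // short read means the stream has ended
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for socket.getInputStream(): 8000 bytes of silence.
        InputStream fakeSocket = new ByteArrayInputStream(new byte[8000]);
        List<byte[]> chunks = new ArrayList<>();
        long total = forward(fakeSocket, chunks::add);
        System.out.println(chunks.size() + " chunks, " + total + " bytes");
        // prints "3 chunks, 8000 bytes"
    }
}
```

With a real connection, the `InputStream` would come from the accepted socket, and the sample rate and channel count would need to match the recognition configuration, as noted above.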
     
    Last edited: Sep 24, 2020
  3. itra

    itra

    Joined:
    May 21, 2019
    Posts:
    4
    @Straafe I've got it working! Thanks for the help and for sharing this workaround in the first place. I never would have thought of it otherwise.
     
    Straafe likes this.
  4. itra

    itra

    Joined:
    May 21, 2019
    Posts:
    4
  5. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
    Hello,

    Our version of Google Cloud Streaming Speech Recognition is now live at http://u3d.as/18RU

    Best Regards
     
  6. lmachado

    lmachado

    Joined:
    Aug 13, 2021
    Posts:
    2
    Hi,
    Is it possible to extend the time after which it detects the end of the recording? I need to record a long speech of 3 paragraphs.
     
  7. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
    Hello,

    I would say it mostly depends on the Google service.

    It's possible to change the size of the chunks that are transferred to the Google service, which will help with it a bit.

    Best Regards
     
  8. ruxrux

    ruxrux

    Joined:
    Aug 26, 2013
    Posts:
    1
    Hi!

    I just bought this add-on and I'm getting some errors right off the bat:

    1. I had to import the add-on without newtonsoft-json, as it creates conflicts.
    2. Running the example doesn't work. It seems to recognize the microphone but doesn't access it. The error message is:
    "Start Record Failed. Please check microphone and try again."

    I would love any help to run this plugin on Android / Oculus!
     
    DeLeT3D likes this.
  9. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264

    Hello,

    Already replied in Discord.

    To fix that issue, you should delete all Newtonsoft.Json libraries that are not part of this asset, because they will conflict. You cannot use another version of the Json library, because it is a dependency of other libraries, which will throw errors without it.

    Best Regards
     
  10. OACB

    OACB

    Joined:
    Sep 14, 2014
    Posts:
    1
    Hello,
    Can this store the recording of the voice and upload it to google drive?
     
  11. FrostweepGames

    FrostweepGames

    Joined:
    Jan 2, 2015
    Posts:
    264
    Hello,

    Our asset doesn't have the ability to upload files anywhere. But you could export the AudioClip as a WAV file and upload it anywhere by implementing your own uploading solution.

    Best
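
A rough sketch of the WAV export step mentioned above, assuming the audio is available as float samples in the range [-1, 1]. In Unity those would come from `AudioClip.GetData`, with the rate and channel count taken from `AudioClip.frequency` and `AudioClip.channels`. It is written in Java to match the server-side code discussed earlier in the thread; the byte layout is the same in any language. `WavEncoder` and `encode` are hypothetical names, not part of the asset.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class WavEncoder {
    /** Encodes float samples in [-1, 1] as a 16-bit PCM WAV file image. */
    static byte[] encode(float[] samples, int sampleRateHz, int channels) {
        int bytesPerSample = 2;
        int dataSize = samples.length * bytesPerSample;
        // 44-byte canonical header followed by the sample data, little-endian.
        ByteBuffer buf = ByteBuffer.allocate(44 + dataSize).order(ByteOrder.LITTLE_ENDIAN);

        buf.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(36 + dataSize);                             // bytes remaining after this field
        buf.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        buf.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(16);                                        // fmt chunk size for PCM
        buf.putShort((short) 1);                               // audio format 1 = PCM
        buf.putShort((short) channels);
        buf.putInt(sampleRateHz);
        buf.putInt(sampleRateHz * channels * bytesPerSample);  // byte rate
        buf.putShort((short) (channels * bytesPerSample));     // block align
        buf.putShort((short) 16);                              // bits per sample
        buf.put("data".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(dataSize);

        for (float s : samples) {
            // Clamp to [-1, 1] so the scaled value fits in 16 bits.
            float clamped = Math.max(-1f, Math.min(1f, s));
            buf.putShort((short) Math.round(clamped * 32767f));
        }
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] wav = encode(new float[] {0f, 1f, -1f}, 16_000, 1);
        System.out.println(wav.length); // prints 50 (44-byte header + 3 samples * 2 bytes)
    }
}
```

The resulting byte array can be written to disk or passed to whatever upload code the project uses.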
     
  12. DeLeT3D

    DeLeT3D

    Joined:
    Apr 2, 2017
    Posts:
    1
    I keep getting an error saying "Start record Failed. Please check microphone device and try again." I've tried multiple microphones and even tried refreshing the Microphones and restarting the machine. I'm able to use the microphones with other applications without any problems though. I've imported the Package in Unity 2020.3.27. Any assistance would be appreciated.
     
  13. JPhilipp

    JPhilipp

    Joined:
    Oct 7, 2014
    Posts:
    56
    A question. When using the Streaming version of the asset, if one always wants to listen for the user's voice throughout the whole session, will this cause calls to the Google Cloud service even when nothing is spoken -- or will it dynamically only call the API when something is said?

    A related question: I already bought the Streaming asset, but I don't need the intermittent speech recognitions of half-sentences. I only need the final result. Do I now need to also buy the non-Streaming asset to achieve this?

    Thanks!
     
  14. fire_crystal

    fire_crystal

    Joined:
    May 11, 2023
    Posts:
    1
    Thank you for providing the wonderful asset "Streaming Speech Recognition using Google Cloud [VR\AR\Mobile\Desktop] Pro" (Version 1.0.3).

    Now, when building with Unity 2021.3.25f1 with Scripting Backend set to IL2CPP, the following error occurs.

    Building Library\Bee\artifacts\WinPlayerBuildProgram\u1oik\GameAssembly.dll failed with output:
    Creating library Library/Bee/artifacts/WinPlayerBuildProgram/u1oik/GameAssembly.dll.lib and object Library/Bee/artifacts/WinPlayerBuildProgram/u1oik/GameAssembly.dll.exp
    3ii0_pc.Core__1.obj : error LNK2019: unresolved external symbol dlopen referenced in function Mono_dlopen_m28A6FCFD6D4175345383F596F0DAA79E26C34070
    3ii0_pc.Core__1.obj : error LNK2019: unresolved external symbol dlerror referenced in function Mono_dlerror_mCE4B2AE1A919E371751AEEAE600318E2470B3E88
    3ii0_pc.Core__1.obj : error LNK2019: unresolved external symbol dlsym referenced in function Mono_dlsym_m7B83E4542E62BE8A07581ABFE015F499C692682E
    Library\Bee\artifacts\WinPlayerBuildProgram\u1oik\GameAssembly.dll : fatal error LNK1120: 3 unresolved externals
    UnityEngine.GUIUtility:processEvent (int,intptr,bool&)


    If you change the Scripting Backend to Mono, the build will succeed, but I would like to use IL2CPP for performance and obfuscation.

    By the way, if I delete "Streaming Speech Recognition using Google Cloud [VR\AR\Mobile\Desktop] Pro" (Version 1.0.3) from the project, no error occurs when building with Scripting Backend set to IL2CPP.

    I would appreciate it if you could tell me the solution.
     
  15. Fangh

    Fangh

    Joined:
    Apr 19, 2013
    Posts:
    274
    A warning is making the build impossible:

    upload_2023-11-16_9-54-31.png


    Assets/FrostweepGames/StreamingSpeechRecognition/Scripts/GCStreamingSpeechRecognition.cs(459,6): warning CS0162: Unreachable code detected


    using Unity 2021.3.32 on iOS
     
  16. Fangh

    Fangh

    Joined:
    Apr 19, 2013
    Posts:
    274
    There is also an error with IL2CPP,
    using Unity 2021.3.32 on iOS

    upload_2023-11-16_10-27-6.png


     
  17. Fangh

    Fangh

    Joined:
    Apr 19, 2013
    Posts:
    274