
Question Simplified voice command recognition

Discussion in 'Audio & Video' started by andrew_pearce_, Oct 9, 2020.

  1. andrew_pearce_

    Hello,

    I searched online for a basic algorithm for simple voice recognition (high accuracy is not required). For OCR there are several, and I remember implementing one that is based on finding corner pixels (one black and three white, or vice versa). However, I cannot find anything comparable for voice.

    I tried recording the user's voice and, to simplify comparing the sample data, I split it into chunks and calculated the average value of each chunk. Then I plotted a graph for visual comparison (so we can ignore normalization for now). The waveforms always had different shapes, even between consecutive sessions.
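
    One possible reason the chunk averages look so alike: audio samples swing around zero, so the plain mean of a chunk tends toward zero whatever was said. Below is a minimal sketch, assuming mono float samples in the range [-1, 1]. The function names and the chunk size are illustrative, and RMS is a swapped-in alternative to the plain average described above, not what was actually used here.

```python
import math


def chunk_mean(samples, chunk_size):
    """Plain average per full chunk (the approach described above)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        sum(samples[start:start + chunk_size]) / chunk_size
        for start in range(0, len(samples) - chunk_size + 1, chunk_size)
    ]


def chunk_rms(samples, chunk_size):
    """Root-mean-square per full chunk: an amplitude envelope (assumed alternative)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    envelope = []
    for start in range(0, len(samples) - chunk_size + 1, chunk_size):
        chunk = samples[start:start + chunk_size]
        # Squaring removes the sign, so positive and negative swings no longer cancel.
        envelope.append(math.sqrt(sum(s * s for s in chunk) / chunk_size))
    return envelope
```

    For a chunk that holds whole periods of a sine wave, `chunk_mean` returns values near 0, while `chunk_rms` returns about 0.707 times the amplitude, so only the second one tracks loudness over time.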

    Then I read more theory about sound and tried calculating the dB level of the samples. I compared the results, but again I only sometimes got graphs that looked similar.
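
    For reference, a common way to express a chunk's level in decibels is 20 * log10 of its RMS amplitude relative to full scale. This is a sketch under the assumption of float samples in [-1, 1]; the name `rms_dbfs` and the `floor_db` parameter are hypothetical, and the calculation used for the screenshots may have differed.

```python
import math


def rms_dbfs(chunk, floor_db=-120.0):
    """Level of one chunk in dB relative to full scale (amplitude 1.0).

    Silence would give log10(0), so the result is clamped to `floor_db`.
    """
    if not chunk:
        return floor_db
    rms = math.sqrt(sum(s * s for s in chunk) / len(chunk))
    if rms <= 0.0:
        return floor_db
    return max(floor_db, 20.0 * math.log10(rms))
```

    Because the scale is logarithmic, halving the amplitude lowers the level by roughly 6 dB, so two recordings of the same word at different distances from the microphone end up offset from each other rather than identical.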

    Finally, I even tried creating a spectrogram, but again without any success. Below I am attaching screenshots, but I am not sure which one is the spectrogram and which one is the dB plot.
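
    To make the spectrogram step concrete, here is a minimal pure-Python sketch: the signal is cut into overlapping frames, each frame is multiplied by a Hann window, and the magnitude of its discrete Fourier transform is kept. The frame size, hop, and function names are assumptions for illustration. The naive DFT loop is slow and only meant to show the idea; a real implementation would use an FFT routine.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n (n >= 2)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames; each frame holds frame_size // 2 + 1 magnitudes.

    Bin k of a frame corresponds to frequency k * sample_rate / frame_size.
    """
    window = hann(frame_size)
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_size)]
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            # Naive DFT of the windowed frame at bin k.
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size)
            )
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames
```

    With an 8000 Hz sample rate and 64-sample frames, a steady 1000 Hz tone peaks in bin 8 of every frame, which is an easy sanity check before feeding in recorded speech.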

    So every approach suggests that either there is a logic error in my conversions or I need to do something else. At the same time, I have seen Android language-learning apps offer this feature, so I believe it should not be that difficult to implement. I am aware of online services, but they are paid and far too expensive for a game.

    Any suggestions are highly appreciated.

    Andrew

    The best match when I compared just audio samples:
    https://i.imgur.com/cRa6OOV.png

    Splitting everything into chunks makes everything look almost the same, even if you record a different voice command:
    https://i.imgur.com/iA7oj3M.png

    This looks very promising, but it is the best case:
    https://i.imgur.com/zeTodR3.png

    As you can see, there is noise. I believe this was the screenshot with the spectrogram:
    https://i.imgur.com/AkagKzJ.png
     
    Last edited: Oct 9, 2020
  2. Ryiah

  3. andrew_pearce_

    Thanks, I will definitely check it out. Since it's offline and comes from a big research project, it should work fine. The only problem is that it's a black box unless I am able to study and understand the source code.

    I was hoping to find a small neural network sample that could be extended. The fact that we only have single-word commands should simplify the challenge.

    UPD: I gave it a quick try and the plugin works. There is a doc on how to use the plugin, but I have not found any information about the acousticModels files yet.
     
    Last edited: Oct 10, 2020