Is it currently possible to implement an audio-processing NN with Barracuda? Specifically, I'm thinking of doing keyword detection intended for mobile devices. Most models use some form of RNN, and I think that is a bit trickier with Barracuda. I'm also wondering if there's any sample project (outside ML-Agents) with a clean implementation of an ONNX model in Barracuda.
For keyword detection alone, you could probably go with spectrograms + CNNs; there's no need for RNNs. Something similar to Nvidia Jasper: https://nvidia.github.io/OpenSeq2Seq/html/speech-recognition/jasper.html
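To make the spectrogram idea concrete, here is a rough sketch of how a short audio clip could be turned into a 2D time-frequency array, which is the kind of image-like input a CNN would consume. It is plain Python with a naive DFT for readability (in practice you would use an FFT), and the frame length, hop size, sample rate, and test tone are all assumed values for illustration, not anything specific to Jasper or Barracuda.

```python
import cmath
import math


def hann(n):
    """Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def log_spectrogram(samples, frame_len=64, hop=32):
    """Return a list of frames; each frame holds log-magnitudes for bins 0..frame_len//2."""
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        # Apply the window to the current frame to reduce spectral leakage.
        chunk = [samples[start + i] * window[i] for i in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT for bin k (an FFT would replace this in real code).
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            # Small epsilon avoids log(0) on silent bins.
            row.append(math.log(abs(acc) + 1e-6))
        frames.append(row)
    return frames


# Synthetic input (assumed values): a 1 kHz sine sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = log_spectrogram(tone)

# Strongest frequency bin in the first frame.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

With these settings the result is a 15 x 33 array (time frames by frequency bins), and the peak lands in bin 8, which matches 1000 Hz at 125 Hz per bin. A convolution-only model trained on arrays like this avoids recurrent layers, which is the point of the suggestion above.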