Could voice recognition make game development easier/faster/better...

Billy4184 · Feb 7, 2020

neoshaman said: ↑

What you are asking is the awareness to redefine the problem when new data is given, which is what I said computer can't do, ie reasoning...
Click to expand...

I don't believe that the ability to redefine a problem is outside of the ability of an AI. The question is, what are you redefining the problem to? Humans don't randomly define problems, we define them in such a way that the solution provides is with greater understanding of the world around us, strengthens us, increases the chances of survival at a more or less abstract level. The system for developing the problems and defining them is simply a question of experimentation and survival/thriving (or not).

There is only one (though very broad) criteria for success in everything that is meaningful, and despite what modern art will tell you, it's not something you can choose at random or invert at will.

neoshaman · Feb 7, 2020

I don't think it's outside what ai can potentially done right now, I just think we don't have it yet, in fact we probably don't even have a proper framework, as human, to think about it beyond vague term.

(written before reading everything, editing and redacting, left for the lulz lol because that's the second part of your argument) It's also possible that we as human overstate that problem and currently not doing any better, ie redefining a problem is probably something that happen because it's a subset of the bigger problem that is probably more unconscious that we conceptualize it, ie "surviving". (so yeah we kind of think alike lol)

Billy4184 said: ↑

and despite what modern art
Click to expand...

Being an art student, I don't think that's what modern art is ACTUALLY saying, unless you remove all context ad absurdum lol

RichardKain · Feb 7, 2020

It's a common misconception to conflate the human mind with computers. Some people just think brains are big-ass organic computers, and that with sufficient sophistication, a computer can do anything a human brain can. This is not the case. Human minds do not function like computers. Brains can perform certain tasks far better than computers, while other processes are far more in the computer's wheelhouse.

One of the prime examples of this is vocal recognition. Human minds are really great at understanding human language. They pick up up languages just by being surrounded by other people. The inference, intuition, and pattern recognition that language requires are all things that the human mind are good at.

Computers absolutely suck at all of those tasks. Computers can quickly process logic that someone else has written, but creating brand-new logic is not their forte. And intuitive leaps are something that have never been properly replicated artificially. Computers simply don't function that way. Ultimately, human language is intended for humans. Using it for pre-defined keywords as interface shortcuts is do-able. Creating computers that properly "understand" human language is the realm of science fiction. We're barely at the point of mechanical recognition, and far off from actual cognition.

neoshaman · Feb 7, 2020

https://ai.googleblog.com/2020/01/towards-conversational-agent-that-can.html
Towards a Conversational Agent that Can Chat About…Anything
Tuesday, January 28, 2020

Murgilod · Feb 7, 2020

neoshaman said: ↑

https://ai.googleblog.com/2020/01/towards-conversational-agent-that-can.html
Towards a Conversational Agent that Can Chat About…Anything
Tuesday, January 28, 2020
Click to expand...

This proves RichardKain's point more than anything. This is just Yet Another Transformer Implementation that requires absolutely massive datasets. On top of that, perplexity isn't a very useful metric outside of the novelty of conversational AI. If anything, if you want voice recognition for gamedev, the last thing you want is perplexity. You want to be able to rely on consistent results.

Conversational AI of this type is a research toy and nothing more.

neoshaman · Feb 7, 2020

Oh I agree, that's why I put on the date, the title is toward, it's not we are there, let's pop a champagne. Also is a thinly veil recuperation of ai dungeon design with a tinge of a new revelation to make it "legit".

Also massive dataset is what a child goes through from being born to uttering word in a sequence that make some sense.

And like I pointed, you can't have that conversational agent from random text anyway, because, even if it can hold a conversation, it's like learning japanese culture and custom through badly written anime fantasy. It's doom to fail in principle. Also these AI are designed with no internal concern for existing in the context of that discussion, language is expression of internal state, there is none except structure of language.

And that's the point I'm trying to make, I saying, with such faulty premise, we get result like that, that's goddamn scary.

Two paper down the line and it will improve anyway, it's like being a slow boiling frog.

RichardKain · Feb 7, 2020

I'm mainly concerned with immediate practical applications. Speculation on future developments are fun to wax poetic about, but don't particularly help with the here and now.

Here and now, your best bet for voice recognition is the mobile libraries being fielded by Apple and Google. Microsoft also has some internal libraries for Windows, but they don't seem to be putting quite as much effort into them. The Apple and Google efforts are partially being fueled by their own smart-speaker initiatives. Those devices make it possible for them to pull down way more data for running comparison and analysis. Most voice recognition libraries require virtual "training," in order to adapt their response to a particular type of voice or style of speech. Having access to a huge data-set makes it easier and faster to adjust a system's expectations for different individuals. (working from similar examples recorded in the past)

I'm actually looking at ways to exploit those libraries for a possible lip-sync animation solution, but it's slow going. And the results always seem to be hit-and-miss.

neoshaman · Feb 7, 2020

RichardKain said: ↑

I'm mainly concerned with immediate practical applications.
Click to expand...

RichardKain said: ↑

Creating computers that properly "understand" human language is the realm of science fiction.
Click to expand...

I was reacting to the second, because for the former, it was yesterday. It's already in used in search engine and many interfaces, and you have the testimony of ZombieGorilla, that for the sound recognition, it's been there for a while, complex semantic is what's missing.

But the main problem is you don't need human level intelligence in most cases, practical use are especially not that deep. Somehow, being able to maintain a full conversation is always held at this proof it can't be use practically ...

I think we should probably define concrete use case, of where it fails and where it wins. My proposition pre AI discussion was voice is cool to navigate a vast array of items intuitively, which already have latency due to overloading and memorization of various hack on top of traditional interfaces. It was much more straightforward than a "conversation" or "social awareness", which aren't interface concept.

RichardKain · Feb 7, 2020

Using voice recognition for inventory retrieval would indeed be a very appropriate application. Certain types of games can get vast inventories stuffed with all manner of things that would be difficult to browse through. Being able to select items from such a system with a simple command could be a fantastic shortcut, instead of rapidly scrolling through huge windowed lists. The main consideration would be needing to keep the names of all of the items as distinct as possible, so as to avoid confusion in the recognition, but that is manageable.

Murgilod · Feb 7, 2020

RichardKain said: ↑

Using voice recognition for inventory retrieval would indeed be a very appropriate application. Certain types of games can get vast inventories stuffed with all manner of things that would be difficult to browse through. Being able to select items from such a system with a simple command could be a fantastic shortcut, instead of rapidly scrolling through huge windowed lists. The main consideration would be needing to keep the names of all of the items as distinct as possible, so as to avoid confusion in the recognition, but that is manageable.
Click to expand...

You can do this now. For instance, it takes me no time at all to find things in Unity because I have a structured name for everything. In the amount of time it would take me to say "open my matcap shader" I could have already typed "shd matc"

neoshaman · Feb 7, 2020

Yeah but you can do it simultaneously with doing something else, my initial use case is game, where you are forced to pause and navigate with d pad, or even mice.

In your case it might make less sense, as the example probably happen in a sequence, but then it's not just about finding the item, you also have to click it, drag in the scene and applied to the object, maybe it's more fair if the commend recognize an object is selected, you ask for a shader and it automatically applied it to the object while also creating a material when needed (though I'm not fond of that because, it will probably put in a crap place, which mean you still have the dragging, and it will encourage beginner to leave crap all other the place, without proper folder structure to order thing, it's happening with hololens industrial apps ... where noob just put random valve they don't need in some random place instead of destroying them, like virtual littering is a thing futurist don't anticipate )

For application it depend on how the overall interface structure and rhythm of a specific work, I don't need it for most actions I do in blender as the keyboard is already parallel to the main action (which use the mouse).

There is no silver bullet, but that's not enough to completely write off a technique on a single point of failure. That's what design is for.

Billy4184 · Feb 7, 2020

RichardKain said: ↑

It's a common misconception to conflate the human mind with computers. Some people just think brains are big-ass organic computers, and that with sufficient sophistication, a computer can do anything a human brain can. This is not the case. Human minds do not function like computers. Brains can perform certain tasks far better than computers, while other processes are far more in the computer's wheelhouse.

One of the prime examples of this is vocal recognition. Human minds are really great at understanding human language. They pick up up languages just by being surrounded by other people. The inference, intuition, and pattern recognition that language requires are all things that the human mind are good at.

Computers absolutely suck at all of those tasks. Computers can quickly process logic that someone else has written, but creating brand-new logic is not their forte. And intuitive leaps are something that have never been properly replicated artificially. Computers simply don't function that way. Ultimately, human language is intended for humans. Using it for pre-defined keywords as interface shortcuts is do-able. Creating computers that properly "understand" human language is the realm of science fiction. We're barely at the point of mechanical recognition, and far off from actual cognition.
Click to expand...

There are of course differences at the most basic level .. but I think it is a common mistake to compare the average computer, which is more or less a general computing device, with brains, which are absolutely not such a thing.

To begin with, you cannot compare a computer to a brain unless the software in the computer has, in some form or another, had the same volume of learning experiences as a human brain.

On top of that, human brains are not general computing devices. They have structures that are built in, hardwired over long periods of evolution. For example, a human language cannot have arbitrary rules, it has to follow a certain pattern and structure that the brain seems to already have an interface built-in for. (I remember a talk by Noam Chomsky on the subject but I can't remember where). That's not even covering the systems that govern the formation and recollection of memories, the abstraction of knowledge ..

That doesn't necessarily mean that computers and brains function very differently at the most basic physical level though. Perhaps a completely unadapted collection of brain cells operating on nothing but a boot process would also be a general computing device (or quite possibly not).

But it's obvious that to compare a computer with a brain, at the very least you have to add something that accounts for the specific adaptation and specialization that a brain features. Otherwise you might as well consider the brain of a newborn baby and the art director at Naughty Dog to be the same thing.

How much of the apparent unique capability of the human brain is just software, or whether it represents a fundamental gap between computers and brains, is hard to say. But they have to at least start from a hypothesized similar point of adaptation for the comparison to be potentially useful.

zombiegorilla · Feb 7, 2020

Billy4184 said: ↑

There are of course differences at the most basic level .. but I think it is a common mistake to compare the average computer, which is more or less a general computing device, with brains, which are absolutely not such a thing.

To begin with, you cannot compare a computer to a brain unless the software in the computer has, in some form or another, had the same volume of learning experiences as a human brain.

On top of that, human brains are not general computing devices. They have structures that are built in, hardwired over long periods of evolution. For example, a human language cannot have arbitrary rules, it has to follow a certain pattern and structure that the brain seems to already have an interface built-in for. (I remember a talk by Noam Chomsky on the subject but I can't remember where). That's not even covering the systems that govern the formation and recollection of memories, the abstraction of knowledge ..

That doesn't necessarily mean that computers and brains function very differently at the most basic physical level though. Perhaps a completely unadapted collection of brain cells operating on nothing but a boot process would also be a general computing device (or quite possibly not).

But it's obvious that to compare a computer with a brain, at the very least you have to add something that accounts for the specific adaptation and specialization that a brain features. Otherwise you might as well consider the brain of a newborn baby and the art director at Naughty Dog to be the same thing.

How much of the apparent unique capability of the human brain is just software, or whether it represents a fundamental gap between computers and brains, is hard to say. But they have to at least start from a hypothesized similar point of adaptation for the comparison to be potentially useful.
Click to expand...

There is a great chapter in Randall Munroe's book "what if?" called Human Computer that does a great job of quantifying the differences between brain and computer. (Randall Munroe of xkcd.com fame)

neoshaman · Feb 7, 2020

Then there is the notion of emulation, that is artificial neural network are crude emulation of actual neurons, and seems to have inherited some property. So while a computer don't work like a brain, the software emulation is close enough. We already have completely emulated a worm brain, though that's less complex than a human one. The human brain have high level structures we haven't parsed and understood yet.

So, in 2014, a collective called the OpenWorm project mapped all the connections between the worm's 302 neurons and managed to simulate them in software, as Marissa Fessenden reports for the Smithsonian.
Click to expand...

https://www.sciencealert.com/scientists-put-worm-brain-in-lego-robot-openworm-connectome

Imho we are on the verge of not calling Neural network to be intelligence anymore (like we barely consider state machine, relational database, expert system, as intelligence anymore). It's more and more obvious that it's a statistical, fuzzy, self organizing database (with great property), the main question is really how do we one shoot insert semantics, and how do we one shot extract emergent self organized semantics (if it's possible at all).

Once we done that, it will lost all the mystic it currently has, just to become another tool of the trade. Most progress in the field seems to be more about the architecture, than the neuron, combined with other techniques that are more supplements or layers to the actual neuron structure (like finding new reward or memory systems). Neural network are basically just one of the component.

In Fact I would not be surprised if some big progress would be made by ditching neuron altogether, in some domain, and have the architecture replace them with a simpler equivalent (like some image recognition where found to be replaceable by the image equivalent of bag of word, which is way more simpler to understand and manipulate).

Murgilod · Feb 7, 2020

neoshaman said: ↑

Then there is the notion of emulation, that is artificial neural network are crude emulation of actual neurons, and seems to have inherited some property. So while a computer don't work like a brain, the software emulation is close enough. We already have completely emulated a worm brain, though that's less complex than a human one. The human brain have high level structures we haven't parsed and understood yet.

https://www.sciencealert.com/scientists-put-worm-brain-in-lego-robot-openworm-connectome

Imho we are on the verge of not calling Neural network to be intelligence anymore (like we barely consider state machine, relational database, expert system, as intelligence anymore). It's more and more obvious that it's a statistical, fuzzy, self organizing database (with great property), the main question is really how do we one shoot insert semantics, and how do we one shot extract emergent self organized semantics (if it's possible at all).

Once we done that, it will lost all the mystic it currently has, just to become another tool of the trade. Most progress in the field seems to be more about the architecture, than the neuron, combined with other techniques that are more supplements or layers to the actual neuron structure (like finding new reward or memory systems). Neural network are basically just one of the component.

In Fact I would not be surprised if some big progress would be made by ditching neuron altogether, in some domain, and have the architecture replace them with a simpler equivalent (like some image recognition where found to be replaceable by the image equivalent of bag of word, which is way more simpler to understand and manipulate).
Click to expand...

Again though, this is so far off that it basically ends up being a pointless hypothetical. The human brain contains 331125827 more neurons than that of a worm, and even OpenWorm dramatically simplifies how its neuron system works to function at all.

neoshaman · Feb 7, 2020

That's literally what I said?

I was just talking about the difference of hardware, and how emulation is possible despite the difference, I'm wasn't saying we will emulate the human brain anywhere soon, everything else is spent dispelling the mystic of neural network.

I probably suck at conveying ideas

Kiwasi · Feb 11, 2020

RichardKain said: ↑

Using voice recognition for inventory retrieval would indeed be a very appropriate application. Certain types of games can get vast inventories stuffed with all manner of things that would be difficult to browse through. Being able to select items from such a system with a simple command could be a fantastic shortcut, instead of rapidly scrolling through huge windowed lists. The main consideration would be needing to keep the names of all of the items as distinct as possible, so as to avoid confusion in the recognition, but that is manageable.
Click to expand...

Replaying Skyrim on the console right now. Voice commands to take use inventory items would certainly be a boon.

However it could be argued this is just an artifact of how poor Skyrims inventory system is to begin with. I've seen plenty of systems that are way better where voice commands would be superfluous.

Ryiah · Feb 11, 2020

Kiwasi said: ↑

Replaying Skyrim on the console right now. Voice commands to take use inventory items would certainly be a boon.
Click to expand...

There is a mod available for that, and it even supports Skyrim VR.

https://www.nexusmods.com/skyrimspecialedition/mods/16514

neoshaman · Feb 11, 2020

Dang, it's just the dialog line not inventory, you got me exciting, that's the most clunky way to do it, I guess it's a proof of concept it can be modded in I guess.

Some more thought on voice, less about defending it

Skyrim on gamepad already have the dpad shortcut anyway and probably the num key on pc), that's a smaller breadth than voices, and you can't move as you must lift the same thumb controlling the movement stick, BUT let's be frank, inventory tend to be accessed in downtime anyway (that's secret counter argument I kept to myself while using inventory as an example).

Command to companion would have been a better use if "precise aiming" wasn't needed (assuming a game like mass effect), cursor pointing remain the superior option when augmented with context sensitive mechanics. Everything that deal with precise space, selection of cloned object (ie enemies), etc ... would be hard, unique items (therefore direct selection) are a much better use case.

Voice probably lack the evolution of complementary ideas like context sensitive action for direct input device, remember when text adventure had you type everything noun verb, then click text to associate to noun on the scene, then just click because object has affordances (yeah you gonna open the door mostly, no need to click open the door).

Human tend to shorten high frequency use of "voice items". I wonder if a visual feedback of the last command would allow to have reference mechanics like we use in plain language, like using pronoun to refer to previously told noun. Just an example though, probably not that explicitly. Also interrupted input is something I have no clue how to actually manage, in a complex set of command. Because the finer you would want to get, the longer and more clunky command would be.

We tend to avoid language and voice for repetitive task, it's more focus and directed into unique state composition in unique situation. It's better if punctual, for example it's best to unleash a super technique when a bar is full in a fighting game (only happen once in a while, tend to be highly meaningful) than regular punch (diluted meaning, happen many time in a minute).

So the ideal use of voice is:
- low frequency use
- high latency situation
- simple short utterance
- highly meaningful to context
- with a breadth of unique items to select
- simultaneous to other inputs

The closest thing I can think off is the surgeon at work lol. Any similar situation. In VR is probably great because input are sparser than other medium, and menu are intrusive.

Ryiah · Feb 11, 2020

neoshaman said: ↑

Dang, it's just the dialog line not inventory, you got me exciting, that's the most clunky way to do it, I guess it's a proof of concept it can be modded in I guess.
Click to expand...

Latest beta release supports mapping voice commands to the keyboard and mouse, console commands, entries in your favorites menu (this one is only Skyrim VR), etc. Mention of it is made around the middle of the mod page.

By the way here is the quick start guide for using the framework they choose in Unity. Supports more than just Windows.

https://docs.microsoft.com/en-us/az...rts/speech-to-text-from-microphone?tabs=unity

juggyruggy · Feb 11, 2020

Oh for sure, can't wait for it to become reality, just imagine, creating objects and making them move all with the power with your voice!

digiross · Feb 11, 2020

I've used both Bixby and Alexa and they don't understand a damn thing. IMHO the technology would have to improve drastically to even be viable. If it worked it could be very cool. Reminiscent of the Star Trek holodeck which is amazing. Current VR is a joke compared to it unfortunately. But we're not in the 23rd century yet. One can hope. LOL

neoshaman · Feb 11, 2020

Even with a perfect VR now, I'm not sure you will solve the moving around in virtual space, and the force feedback associated. Abstraction of space is not just a tech limitation imho.

Kiwasi · Feb 11, 2020

neoshaman said: ↑

BUT let's be frank, inventory tend to be accessed in downtime anyway
Click to expand...

Try out a heavy alchemy build. When you are accessing poisons and potions every few strikes Skyrim's inventory system gets old fast. Even with the quick access options.

Ryiah · Feb 12, 2020

Kiwasi said: ↑

Try out a heavy alchemy build. When you are accessing poisons and potions every few strikes Skyrim's inventory system gets old fast. Even with the quick access options.
Click to expand...

I honestly believe one of the reasons stealth archers are popular with the Elder Scrolls is because you can simply equip a bow, some arrows, and never open your inventory again. Anything else requires you to micromanage and it simply isn't enjoyable without an inventory overhaul mod.

Murgilod · Feb 12, 2020

Ryiah said: ↑

I honestly believe one of the reasons stealth archers are popular with the Elder Scrolls is because you can simply equip a bow, some arrows, and never open your inventory again. Anything else requires you to micromanage and it simply isn't enjoyable without an inventory overhaul mod.
Click to expand...

I mean, there's that and the fact that stealth archery attack bonuses are absolutely ridiculously OP when you consider arrow DPS, and coupling that with stealth's later abilities basically letting you jump back into stealth whenever you want...

Search Unity

Could voice recognition make game development easier/faster/better...

Billy4184

neoshaman

RichardKain

neoshaman

Murgilod

neoshaman

RichardKain

neoshaman

RichardKain

Murgilod

neoshaman

Billy4184

zombiegorilla

Moderator

neoshaman

Murgilod

neoshaman

Kiwasi

Ryiah

neoshaman

Ryiah

juggyruggy

digiross

neoshaman

Kiwasi

Ryiah

Murgilod

Search Unity

Unity ID

Useful Searches

Could voice recognition make game development easier/faster/better...

Moderator