
Could voice recognition make game development easier/faster/better...

Discussion in 'General Discussion' started by Arowx, Feb 5, 2020.

  1. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194


Could the technology exist today to make this '80s advert for the Atari games company a reality?
    • Voice Recognition
    • Augmented/Virtual Reality
    • Simple prototype 3D graphics and game objects (prefabs) linked to instantiation commands.
    Would it be a good challenge for the Unity community?
     
  2. hippocoder

    hippocoder

    Digital Ape

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    I'd be happy to invest the years this topic needs to become more than a tech demo but I would need to be paid for those years. I don't think this subject is a snappy turnaround.

    Would the technology help everyone? Sure would. But isn't that obvious?
     
    Ryiah likes this.
  3. Ryiah

    Ryiah

    Joined:
    Oct 11, 2012
    Posts:
    21,203
    We would need to have the entire toolset built around voice recognition to see any real benefit. Adding a symbol in C#, for example, can be done with two keypresses at most, but the words to represent them (e.g. "left curly brace", "left parenthesis", etc.) are complex and take far longer to speak.
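
    To make the overhead concrete, here's a purely hypothetical phrase-to-token table such an editor would need (the phrasing is my own illustration, not any real tool's grammar):

    ```csharp
    using System.Collections.Generic;

    // Hypothetical mapping a voice-driven C# editor would need.
    // Each entry trades one or two keypresses for several spoken syllables.
    static class SpokenTokens
    {
        public static readonly Dictionary<string, string> Map =
            new Dictionary<string, string>
            {
                { "left curly brace",  "{"  },
                { "right curly brace", "}"  },
                { "left parenthesis",  "("  },
                { "right parenthesis", ")"  },
                { "semicolon",         ";"  },
                { "equals equals",     "==" },
            };
    }
    ```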
     
    Acissathar and hippocoder like this.
  4. Murgilod

    Murgilod

    Joined:
    Nov 12, 2013
    Posts:
    10,160
    Voice recognition is good as an accessibility tool, but that's it. Voice recognition tools have existed for computers in the consumer and more advanced spaces since Dragon NaturallySpeaking came out in 1997. Ultimately, though, it adds nothing. It's not faster, it's not more intuitive, it's awful if you need to deal with an open-plan environment, and it's essentially useless outside of tech demos. Hell, even in the consumer space, voice recognition is only really used when there's no better input system. That's why Siri and whatever Android's assistant is called now are used all the time, but Kinect kinda just languished, with the only real use of it being "Xbox on".
     
    neoshaman and Ryiah like this.
  5. Ryiah

    Ryiah

    Joined:
    Oct 11, 2012
    Posts:
    21,203
    Reading this reminded me that there is an example of someone writing code using voice commands. I've linked the video below with the timestamp, but if it doesn't link correctly the start time is exactly nine minutes in. The author refers to their solution as "verbal vim" because it's literally like working with vim, but with nowhere near the performance of a keyboard.

     
    neoshaman likes this.
  6. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    LOL, couldn't we just say 'within block/if/class/method/function' and have a smart solution place the braces around the key information? Something like the sketch below.
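
    As a rough sketch of what I mean (the command name and behaviour are invented, not a real tool):

    ```csharp
    // Hypothetical "within if" command: the editor hears the scope keyword
    // plus a condition, and places the braces itself around the selection.
    static class SmartBraces
    {
        public static string WrapInIf(string selectedCode, string condition)
        {
            return "if (" + condition + ")\n{\n    " + selectedCode + "\n}";
        }
    }

    // Saying "within if isAlive" over the selection "score++;" would yield:
    // if (isAlive)
    // {
    //     score++;
    // }
    ```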

    However, the concept here would be to work with a higher-level game development language.

    Imagine adding a voice interface that works with the language used to describe assets and will automatically download and add "free for prototype" asset packs that integrate seamlessly.

    It would probably need to use an AI system trained by Unity and the community, as well as assets that are plug-and-play, prototype-ready and compatible.

    It's like Unity nearly has all of the technology to do this but maybe not the drive and vision to do it.

    It's more of an Apple, Jobs-style move of really pushing to make things easier; it could be akin to going from the computer to the iPad/smartphone.

    Also, as a game designer wouldn't you want your workday to be more like a scene from Iron Man than The Simpsons?

     
  7. Murgilod

    Murgilod

    Joined:
    Nov 12, 2013
    Posts:
    10,160
    Once again you are moving the goalposts. Also, Iron Man is fiction. Iron Man is fiction and uses an AI to handle the backend of this that is so complicated that it literally became a superhero in a later movie in the MCU.
     
  8. Joe-Censored

    Joe-Censored

    Joined:
    Mar 26, 2013
    Posts:
    11,847
    Writing code by voice wouldn't be too bad working from top to bottom. Editing code by voice would be a difficult challenge. You could try an experiment. Get someone who has no coding background, call them on the phone, and walk them through fixing a few C# syntax errors spread across several scripts. See if that is faster than just using a mouse and keyboard.
     
    angrypenguin likes this.
  9. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    You know, something as simple as line numbers and word/letter tags could make that quite easy.
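
    A minimal sketch of how that addressing could resolve, assuming the recognizer hands over the literal phrase (the grammar is made up for illustration):

    ```csharp
    using System;

    // Hypothetical resolver for spoken positions like "line 12 word 3".
    static class VoiceAddress
    {
        public static (int Line, int Word) Parse(string phrase)
        {
            // Expected form: "line <n> word <m>"
            string[] parts = phrase.Split(' ');
            if (parts.Length != 4 || parts[0] != "line" || parts[2] != "word")
                throw new ArgumentException("Unrecognized address: " + phrase);
            return (int.Parse(parts[1]), int.Parse(parts[3]));
        }
    }

    // VoiceAddress.Parse("line 12 word 3") => (12, 3)
    ```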
     
  10. Joe-Censored

    Joe-Censored

    Joined:
    Mar 26, 2013
    Posts:
    11,847
    But easier than moving the mouse and left-clicking? It has to be a significant workflow improvement to justify such a change, or the inertia of the existing workflow will continue to dominate.
     
  11. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    And as usual you are being pedantic. However, Iron Man is pretending to work within the much more complex domain of "real world" engineering, science and technology.

    We would just be working within the domain of game prototyping and design, which would probably involve a different genre-specific lexicon.

    For instance, an asset package for an FPS would need words for weapons, armour, vehicles, health, ammunition and compounds, and a fantasy game would need different versions of these as well as words for races, magic, villages etc.

    It could take time to learn to speak and work with each genre, but between different game types there would need to be a common Unity game development lexicon.

    Game types might also need unique lexicon elements, e.g. turn-based, real-time, multiplayer, as well as game mechanics and underlying Unity technologies.

    It could be something like the old text-based adventures, where half the battle is learning the lexicon. However, maybe AI could help bridge the gap between a new user and an experienced user, matching words with similar meanings to the available asset lexicon or offering to bring in assets that include those items.
     
  12. Murgilod

    Murgilod

    Joined:
    Nov 12, 2013
    Posts:
    10,160
    So basically you want nebulous future tech that is hyperaware of context and also has a perfectly accurate recognition system.

    This is like all your other threads. You propose an idea based on some random thing that doesn't exist for multitudes of reasons, then when you get told that the thing doesn't exist for multitudes of reasons, you start getting huffy. These pointless hypotheticals are what get your threads closed again and again.

    "Can this 1024 core processor used for deep learning become a consumer product?"

    "Should we have game engine specific hardware?"

    "Should Unity let you mash up different game genres?"

    "Will AI in games ever need to register their own faction in multiplayer?"

    "What does quantum computing mean for games?"

    And so on and so forth. These things are never based in reality, but rather you taking as much time as possible to extrapolate what something could be used for without understanding why it isn't used for that, or why it isn't technologically feasible in any reasonable time frame.
     
    JoNax97 and MadeFromPolygons like this.
  13. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    For argument's sake, let's say we have eye tracking in AR/VR to cover mouse movement; there you have the simplest and fastest UI for moving to a specific point in the text.

    Now, the question is: is your typing speed faster than your speaking speed?

    A quick Google search finds that on average voice is 3x faster than typing: http://readmultiplex.com/2017/03/27/stanford-university-voice-first-is-3x-faster-than-typing/

    Imagine if you could code 3x faster...

    Also, we already have AR/VR systems that allow hand tracking via devices or inside-out video tracking, which could let you edit your code with your hands as well as your voice, ideal for moving/erasing blocks of text quickly.

    It's like we are so close to this level of technology utilisation, but so far away 'ideologically'. Are our current technologies/concepts holding us back?
     
    Last edited: Feb 5, 2020
  14. Ryiah

    Ryiah

    Joined:
    Oct 11, 2012
    Posts:
    21,203
    Right. Like I said, we'd have to build our entire toolset around the concept.

    We already have this. Windows 10 supports it out of the box.

    https://support.microsoft.com/en-us/help/4043921/windows-10-get-started-eye-control
     
  15. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
  16. Ryiah

    Ryiah

    Joined:
    Oct 11, 2012
    Posts:
    21,203
    No. Just about everything you've listed here has always been available in some way. We've had touch screens for as long as we've had the mouse. Modern tablets are just as accessible with a keyboard and mouse as they are with your fingers, and the reason for this is that it's not that much of a paradigm shift going from one to the other.

    The mouse and the touch screen may have different ways to interact with them, but at the end of the day they're both the same dimensions as the interface they are controlling, which is why they're as easy as they are.
     
    Last edited: Feb 5, 2020
    MadeFromPolygons likes this.
  17. Antypodish

    Antypodish

    Joined:
    Apr 29, 2014
    Posts:
    10,779
    Scenario 1.
    "Move mouse cursor to position x:53.34, y:743"
    - Oh, too far
    "Move mouse cursor right, by 17 pixels"
    - Eh
    "Move mouse cursor left by 3 pixels"
    - now
    "Move cursor down by 6 pixels"
    "Left click"
    "...."

    So who is going to win, a pair of lips or a hand on a mouse?

    Realistically speaking, as an assistant it's fine. As a sole dev tool, it's not.
    As another OP fantasy topic ... no (voice assistant) words.

    Voice controlled submarine

    Any Good? | Cold Waters with VOICEATTACK | FIRST LOOK | SUBMARINE SIMULATOR | Sim UK (PC)



    I remember playing a similar game about 10 years ago or so.
     
    Last edited: Feb 5, 2020
    andrejpetelin likes this.
  18. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    OK, I can understand that if you're younger you will have this impression, but historically it's just incorrect.

    Computers started out with punch cards, then moved to typewriter-style terminals and consoles with commands, then graphical interfaces (1983), then touch screens (2011).

    Now we have voice recognition and AR/VR eye tracking and hand tracking technology.

    Each time, the UI would need to be reworked to accommodate the new interface technology; hence GNOME, a GUI that is only 8 years old, should handle multiple current interface types well.
     
  19. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    You're focusing on voice recognition and missing some important points:
    • We have eye and hand tracking for VR/AR, so no need for a mouse.
    • Voice commands used right could replace the multi-click/step process of adding prefabs to a scene:
      • Visually locate the prefab, move the mouse to it, click to select, drag to the scene view and drop in position.
        • This may also involve searching for and finding the prefab in a list, subfolders or windows.
      • Versus: say the prefab name and its position in the scene (sketched below).
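
    A minimal Unity sketch of that single-utterance flow, assuming a speech layer has already pulled the prefab name and position out of the utterance (the class name and Resources-folder lookup are placeholders, not a prescribed design):

    ```csharp
    using UnityEngine;

    // Hypothetical endpoint for a command like "spawn barrel at 3 0 5".
    // Assumes prefabs live in a Resources folder; a real tool would search
    // the project (or the Asset Store) instead.
    public class VoiceSpawner : MonoBehaviour
    {
        public void OnVoiceCommand(string prefabName, Vector3 position)
        {
            GameObject prefab = Resources.Load<GameObject>(prefabName);
            if (prefab == null)
            {
                Debug.LogWarning("No prefab named '" + prefabName + "' found.");
                return;
            }
            Instantiate(prefab, position, Quaternion.identity);
        }
    }
    ```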
     
  20. Murgilod

    Murgilod

    Joined:
    Nov 12, 2013
    Posts:
    10,160
    Punch card computers were never widely accessible and touch screens have been around in the pro/consumer space since I was a kid in the 90s.
     
  21. Ryiah

    Ryiah

    Joined:
    Oct 11, 2012
    Posts:
    21,203
    You just love moving the bar to make yourself look smarter or to win an argument, don't you? I'm very educated on the history of computer technology, having spent an extensive amount of time looking into everything that has been created, including some of the most obscure technologies.

    None of that is relevant, though, because none of it has anything to do with our discussions. None of it was mentioned in the slightest in the post you wrote that I quoted when I made my post.

    And if we don't just limit ourselves to the pro/consumer space we can go back as far as 1972.

    https://en.wikipedia.org/wiki/PLATO_(computer_system)

    And the oldest machine made for a mouse is from the same year.

    https://en.wikipedia.org/wiki/Xerox_Alto

    Yes, prototypes and patents existed earlier, but none of them are relevant because the concepts behind user interfaces didn't exist, meaning software for them hadn't been written and in some cases the monitors to display the results of the software hadn't been built yet.
     
    Last edited: Feb 5, 2020
    MadeFromPolygons likes this.
  22. Antypodish

    Antypodish

    Joined:
    Apr 29, 2014
    Posts:
    10,779
    Can you imagine yourself waving a "magic wand" for ~8 hours, trying to dev a game, for example?
    Or typing on a virtual floating keyboard, without haptic feedback.
    That's besides voice, which is what the topic was about.

    Using a computer mouse, or a trackpad etc., is in a sense still superior for long hours of work. It requires little effort to do a large amount of work. So please don't VR/AR me, unless we're going to a 3D virtual visualization, game, or museum.

    Stay on topic please.
     
    Ryiah likes this.
  23. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    I am on topic; please read the first post.

    Also, if voice is 3x faster than typing, won't your 8-hour day be done in 2 hours 40 minutes? ;)
     
  24. Murgilod

    Murgilod

    Joined:
    Nov 12, 2013
    Posts:
    10,160
    It's faster for language entry, which does not rely heavily on things like punctuation or formatting. English and Mandarin are not programming languages.
     
  25. AcidArrow

    AcidArrow

    Joined:
    May 20, 2010
    Posts:
    11,794
    I’m pretty sure that doesn’t hold up if you’re writing code.

    Also you’re doing something wrong if the major bottleneck for your coding or your creative writing is the speed of your typing.
     
    angrypenguin and Kiwasi like this.
  26. Joe-Censored

    Joe-Censored

    Joined:
    Mar 26, 2013
    Posts:
    11,847
    I did specifically bring up speed, so I'll admit I'm about to deflect my own original point, but even still: if you get the speed up past the current workflow, is this still a better solution? With eye tracking you now have to maintain full focus on the task. A distraction from a coworker effectively moves your mouse around. You can't work with a YouTube video playing (good luck following a video tutorial). It just doesn't seem better to me.

    Improvements in process generally free the individual of work, allowing you to focus on other things. They automate what used to be manual. They allow you to skip previously tedious steps. This sounds like the reverse, requiring the full focus of your mind, your hands, your eyes, eliminating all opportunities for multitasking. Requiring you to even redesign your workspace rather than improve the task at hand with what everyone already has available. It seems like a step backwards.

    Now, if you can get to a Star Trek TNG holodeck computer level of programming, where you dictate generalities to the computer and the AI does the majority of the actual work, interpreting what you asked for in a simple statement into what would be days or weeks of manual coding, then you've got something.
     
    zombiegorilla likes this.
  27. Ryiah

    Ryiah

    Joined:
    Oct 11, 2012
    Posts:
    21,203
    If you go to your boss and tell him that you can accomplish your work three times as fast he will give you three times the work to complete. :p
     
  28. zombiegorilla

    zombiegorilla

    Moderator

    Joined:
    May 8, 2012
    Posts:
    9,052
    Nope. For so many reasons. First off, the eyes aren't a pointing device; they just don't work like that. When you read, your eyes aren't literally scrolling across the words. Their tracking is variable and uses saccades to gather fragments that your brain assembles into a picture. Your eyes jump around and are controlled by much more subconscious activity. Our predator brains are fantastic at accurately following something with our eyes and predicting, but not leading. They are scanners, not controllers.

    Good input requires feedback for fine-grained accuracy. A mouse or trackpad or touch screen or whatever makes a consistent loop that your brain can gauge and perform at a lower level. Using your eyes or floaty hand things (VR/AR without haptic feedback far beyond what we have now) is massively inefficient and crude. It requires your eyes to provide the control loop with constantly changing feedback. It is a complete waste of resources.

    You can use a keyboard and mouse so efficiently because you are shifting that control loop off to other senses. I don't need my eyes to type and only very minimally to use a mouse (very little if using a trackpad). I can type/mouse and talk at the same time; I can type/mouse and use my eyes to process the tons of things going on in the environment (and on screen) at the same time. There is a reason why "3D" interfaces and VR interfaces and free-air gestures are not ubiquitous: they are insanely inefficient. As a supplemental input, like a confirm or something, it's one thing, but a comprehensive interface is not in the cards for voice/eye tracking; they simply aren't granular enough.

    (all this is completely separate from assistive technologies/tools)
     
    Last edited: Feb 6, 2020
  29. zombiegorilla

    zombiegorilla

    Moderator

    Joined:
    May 8, 2012
    Posts:
    9,052
    None of this is remotely true or accurate. And no one is seriously considering this for development purposes other than for assistive tech. VR still can't find a solid foothold in playing games; it's a joke for developing games (aside from testing VR games).

    This thread is likely going to be short-lived, as you have managed to create another pointless speculative tech discussion in which you seem to be the only one who doesn't understand how the tech works. This needs to stay in the real world, and on the topic of practical game development. Otherwise it will be closed; as @Murgilod pointed out, slapping "how could Unity" onto some random tech thing doesn't make a decent discussion topic, or a valid one here.
     
  30. BIGTIMEMASTER

    BIGTIMEMASTER

    Joined:
    Jun 1, 2017
    Posts:
    5,181
    I could have finished twenty games by now if unity would hurry up and build a mind-reader. I can think waaay faster than I can move a mouse.
     
  31. Ryiah

    Ryiah

    Joined:
    Oct 11, 2012
    Posts:
    21,203
  32. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    6,493
    To put some order in the chaos: the biggest mistake when a new thing is introduced is the silver-bullet syndrome, as in "it will replace/change everything we do" ... except generally not really. It leads to many good ideas getting fit into the wrong hole (cough Kinect cough). But just because something is new doesn't mean we can't have metrics to judge it relative to the old way; there is a whole methodology, no need to run in a circle, you don't have to get experimental.

    For example, let's focus on just a few aspects of interfaces. We can break them into many parameters, but I'll focus on three:
    - Breath (how many items can be accessed instantly)
    - Latency
    - Finesse (granularity of control)

    Mice, for example, have low latency and high finesse but little breath; they are generally supplemented by a secondary interface like a context menu (to increase breath).

    When you look at voice, we can see it has great breath (direct access to the entire dictionary), high latency and low finesse. Probably bad for fighting games, but maybe great for accessing inventory or items of a menu without going into the menu. In general, menus are a supplement to overloaded interfaces like mice and buttons, so they add a lot of complexity: you must pause the game, select the high-level category, and then maybe access the item at the leaf of a complex navigation tree. Voice can just pick it directly. In software we use shortcuts as a supplement, but even they get overloaded, so voice could be a time saver here, letting us shift modes or tools without breaking from the action.

    However, there are more parameters to consider; one of them is resilience to noise. IMHO we aren't there yet in voice recognition: we still have all these voice assistants triggering stuff at the most inopportune moment based on misreading input or getting lost in ambient noise. You don't want your husband talking about exiting something and it closes your current urgent freelance work ...

    Be scientific and there is no magic, only design :cool:
     
    Ryiah likes this.
  33. zombiegorilla

    zombiegorilla

    Moderator

    Joined:
    May 8, 2012
    Posts:
    9,052
    did you mean "breadth"?

    Voice recognition is virtually perfect (depending on your budget for tools). My first job in engineering was designing and building voice input (recognition) solutions in industrial settings (an active, large mechanical facility with multiple audio situations and users). This was in the early 90s. Noise, voice and environment are long solved. Even consumer-level tools are pretty much perfect. The HomePod, for example, is pretty cutting edge; it can easily identify and contextualize the speaker and pick out a whispered command in a noisy room full of people.

    Even professional voice packages aren't really that practical in any real sense for development. Sure, everything in my house is "smart". If I say "Alexa, game time" (I can also say "Siri, game time", and she'll tell Alexa), it turns on my TV, switches to HDMI port two, wakes my PC, launches HueSync and fires up the Epic Games Store (or Steam if I'm in the middle of a Steam game); it then dims the rest of the house lights, turns off a few non-smart lights, and finally puts my phone, watch and laptops in silent mode. Which is fun and amusing, but it isn't the same as development, using a computer as a tool. No amount of voice recognition is even a fraction of the speed of keyboard and mouse. I can type code faster than I can speak it, and I can switch context habitually out of muscle memory.

    Hands are simply the most amazing and versatile tools in our reality. Voice can never be faster; physical laws prevent it. Using it for trivial off-loaded tasks can be useful, though. Hence why Siri, Alexa et al. are successful products for the niche they occupy.

    (again, separate from assistive technologies)
     
    Last edited: Feb 7, 2020
    neoshaman and Ryiah like this.
  34. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    6,493
    My real agenda with voice vs keyboard is that I have to constantly shift from QWERTY to AZERTY depending on the software, when Windows doesn't randomly shift itself instead of STAYING in one mode or the other ... bless the people who live under QWERTY supremacy, who don't know the pain of QWERTY shortcuts randomly shifting on the keyboard just because the computer feels like it. At least Blender is consistent in its conventions; it's less of a hassle.

    :p



    https://github.com/meyertime/descent-glovepie/wiki/Setting-up-GlovePIE
     
  35. RichardKain

    RichardKain

    Joined:
    Oct 1, 2012
    Posts:
    1,261
    Voice input would be much better for player interaction than it would be for game development. I can't create 3D models with my voice, nor could I code effectively. Too many symbols and structural brackets, not enough human-readable language. Most programming looks nothing like how you would typically speak.

    I suppose you could incorporate it using a pre-built and defined structural system with its own specific voice-command syntax, but that would be a specific pre-defined workflow (and would probably benefit in speed from having additional inputs).

    For certain game types, player interaction could be handled well through vocal commands. I think back on some of my favorite point-and-click adventure games, and the fairly simple verb system they used. A similar approach could easily be implemented with modern voice-recognition libraries. One of the problems with this approach is that most major voice recognition software initiatives are focused on mobile platforms as opposed to desktop environments.
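
    For what it's worth, Unity does ship a Windows-only KeywordRecognizer (UnityEngine.Windows.Speech) that is enough to prototype such a verb system; a minimal sketch, with an illustrative verb list:

    ```csharp
    using UnityEngine;
    using UnityEngine.Windows.Speech; // Windows 10 only

    // Minimal verb listener built on Unity's KeywordRecognizer.
    public class VerbListener : MonoBehaviour
    {
        private KeywordRecognizer recognizer;
        private readonly string[] verbs = { "open", "look", "take", "use" };

        void Start()
        {
            recognizer = new KeywordRecognizer(verbs);
            recognizer.OnPhraseRecognized += args =>
                Debug.Log("Heard verb: " + args.text);
            recognizer.Start();
        }

        void OnDestroy()
        {
            if (recognizer != null)
            {
                recognizer.Dispose();
            }
        }
    }
    ```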
     
    zombiegorilla likes this.
  36. Kiwasi

    Kiwasi

    Joined:
    Dec 5, 2013
    Posts:
    16,860
    So I participated in the Global Game Jam last weekend. We built a VR game, which none of us had done before. In order to get a feel for it we built a bunch of precision tests. With current VR hand tracking our user could select between objects that were about 100 mm apart. Any closer than that and it became an exercise in frustration. Contrast that with a mouse, which most users can manipulate down to less than 5 mm.

    All told, this means the object density in a VR workspace is dramatically lower than in a screen workspace. Even though screens are a physically smaller workspace, you can fit far, far more stuff on screen.

    It's likely we will get better hand tracking technology: gloves that track individual finger movements with high accuracy. But even then we are limited by the general precision of the human body, which ultimately means the mouse stays around for a long time as a primary input device for high-density applications.
     
    Tzan, zombiegorilla and Ryiah like this.
  37. zombiegorilla

    zombiegorilla

    Moderator

    Joined:
    May 8, 2012
    Posts:
    9,052
    There was this game that came out a while ago; it launched when the first versions of MBPs with motion sensors came out. It was a simple flying game that you controlled by holding and tilting your MBP. The fun part was that to fire your weapons you had to yell at your computer. I think there were a couple of different commands, like saying "Pew" over and over to fire your guns and yelling "Boom!" to drop bombs. It wasn't a great game, but it was kind of fun yelling "Pew, pew, pew!" to fire commands.

    Yeah, I would love for some games to have rarely used commands activate via voice.
     
    Kiwasi and Ryiah like this.
  38. zombiegorilla

    zombiegorilla

    Moderator

    Joined:
    May 8, 2012
    Posts:
    9,052
    And resistance is a big factor, especially in VR-type games. The resistance/drag on a mouse or trackpad makes it more accurate. Everyone has some natural motion/sway in their body/hands. It is eliminated by the drag of the surface. A lot of VR games are played standing up and moving, and that exaggerates the motion of the hand. A little AI and a lot more sensors on the body could offset it a bit, but even then you have the stop motion of pointing. When you move your mouse to, say, an icon or menu, as you approach it you reduce the force and drag facilitates the stop. When pointing in air, you have to exercise a counterforce to stop your hand (and arm, and shoulder, and torso) in many directions. It is very difficult to get accurate results, and possibly more importantly, you are using a lot more brain power and energy to get there. It's like writing your name... doing it on paper is so easy you can do it without even thinking about it. Writing it in the air (VR/hand tracking, etc.) requires a lot of concentration and yields inconsistent results.

    About 6 or 7 years ago, I got a chance to work on some R&D stuff with a well-known motion tracker hardware team. The ones we were using were prototypes and crazy accurate. But even under the best conditions the input had to be scrubbed of noise, which "softened" the results. One of the guys on our team came up with the idea of putting up a piece of glass to rest your hand on. It wasn't perfect (smudges fuzzed things up), but accuracy and control of motion were dramatically increased. In the end we basically replicated a poor-quality touch screen with the ability to do some fun things with depth.
     
    Kiwasi and Ryiah like this.
  39. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    This keeps coming up: what if there was a dedicated programming lexicon, a vocal version of shorthand for programming and games development?

    Think of how language evolves with new interfaces to describe interaction with them, e.g. swipe, tap, open, click, close.

    What would a game programming or even 3D modelling vocal shorthand sound like?

    Maybe we could use short vocal clicks, whistles, blurts or other unique vocal sounds/words to signify unique aspects of a domain's lexicon. Something it would take time to master.
     
  40. Billy4184

    Billy4184

    Joined:
    Jul 7, 2014
    Posts:
    6,025
    Voice recognition in itself adds nothing. But to conceptualize the possibilities of future game engine tools, you just need to imagine yourself sitting next to a reasonably competent programmer/level designer and explaining what you'd like them to do. It would not take long to communicate a fairly complex idea, as long as you and they are operating on roughly the same page.

    The complex part of the problem is not the voice recognition but 'context recognition'. Game engines already do this to some extent with UI design, hotkeys and context-sensitive controls, but it can be taken a lot further, especially if there were a way to preserve context once built, and to recognize the contextual foundation that underlies a lot of smaller contexts within the scope of a project, so that a small input from the user can be transformed into a complex command.
     
    neoshaman likes this.
  41. angrypenguin

    angrypenguin

    Joined:
    Dec 29, 2011
    Posts:
    15,620
    This seems like a typical case of a solution looking for a problem. "We've got voice recognition, what can we shove it into?"

    I imagine it being really distracting to people at nearby desks, for one. I imagine offices getting a whole lot louder, for another. Though with a little more thought, I don't, because...

    I don't imagine an increase in productivity for common use cases. You're talking about an audience who commonly makes or modifies their own tools, and tech that has been in consumer hands since the late 90s which has seen multiple hard pushes towards normalisation (Dragon, Xbox Kinect, Siri & co...). If adding voice commands somewhere addressed widespread problems for the average developer then it's incredibly likely they would be in place by default already.

    As others have pointed out, accessibility is a consideration, and it's quite possible there's more that could be done there with a variety of tools and approaches.
     
    Kiwasi likes this.
  42. angrypenguin

    angrypenguin

    Joined:
    Dec 29, 2011
    Posts:
    15,620
    Wait... am I misunderstanding something here?

    This is a long way in the future, and basically assumes we somehow build a general intelligence machine.

    The fundamental flaw in that line of thinking is that, currently, computers are stupid. They can follow simple instructions incredibly quickly, but the instructions have to be simple. Computers give the impression of being able to solve complex problems because someone previously gave it the solution in fine detail. That's great for common tasks with well defined processes and rules (I can totally imagine "Siri, do my taxes" working for a lot of people in the not too distant future, and it's pretty close already), but for anything that requires creativity... well, it's a differently scoped version of the "Make MMO" button.

    We need competent programmers and designers because they are good at things computers fundamentally suck at. Understanding problems from human perspectives, breaking them down into smaller, easily solvable problems, applying creativity and divergent thinking to design and/or implement relevant solutions. For the foreseeable future computers are good at applying a general purpose solution to problems, but humans must first develop that solution. Even our most advanced machine learning and such are only useful in very specific, small problem domains they are individually designed and trained for.
     
    Kiwasi and zombiegorilla like this.
  43. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    6,493
    @angrypenguin Have you heard of our lord and savior "ai dungeon 2" (really just a smart use of gpt-2)?

    While it's still dumb, it's a very different kind of dumb from what you are pointing at. It has a wacky sense of context for something that only learns about the world through tons of online text, i.e. it doesn't understand it, but MOSTLY because that isn't embodied knowledge; text is a simulation, and it's doing a simulation of a simulation. However, it does pick up abstract contextual inferences in unexpected ways that weren't taught. It's not intelligent; it can't do reasoning at all. But the more you understand exactly how it works, the more you understand things will change drastically.

    IMHO the main defining difference in the future between an AI and a human will be experiencing what it means to be human; a lot of common sense comes from directly living through stuff and experiencing the meaning first-hand. That's a thing we already struggle with from human to human, I mean a lot of management issues are down to miscommunication. The main thing is that we expect agency from humans and respect their experience's integrity (in some way), which we will never grant to an AI, no matter how advanced it is.

    Anyway https://techcrunch.com/2019/02/12/u...develop-clever-commit-an-ai-coding-assistant/
     
  44. angrypenguin

    angrypenguin

    Joined:
    Dec 29, 2011
    Posts:
    15,620
    It's exactly the kind of stupid I'm pointing at, though. It's a complex solution (set of steps) that a team of people have programmed for a broad, general problem (how to create a lot of text from a little text), which gives the appearance of being intelligent and understanding, but which is not.
     
  45. Billy4184

    Billy4184

    Joined:
    Jul 7, 2014
    Posts:
    6,025
    Well I don't want to try to reduce the concept of AI down to a single problem, because no doubt there are issues that AI research will solve faster than others. Game development cannot be reduced to a single command, and there are parts of it that are easier to transform into logic than others.

    I think you are underestimating the ability of a computer to learn from human beings. If you try to develop the solution for a problem (or the foundation for solving many types of problems) analytically, it's often very difficult to do. And the way that human beings behave (and the way they create things) is not so logical. It is based on, at the very least, a combination of logic and perceptual, emotional and instinctive systems that are more or less adapted to the environments we operate in. So to reproduce them analytically you would have to model history and evolution as well. This is my opinion.

    But a computer that learns by watching human beings work may develop a set of rules that passes for the framework that humans operate in, at least for the most part. How far away that could be is hard to say, but there are computers that are learning to mimic artwork quite well without having solved at more than a superficial level the question of what makes good art.

    As far as my opinion on the value of human work vs computer work, I am only interested really in human beings and what they are capable of. But human beings are limited, and if we are to be able to keep moving ahead, we will have to transfer more and more capability into systems that we don't consciously interact with. I am very interested in possibilities that AI can bring for people to be able to move a lot more information, and transformations of information, out of cognition and into machines that adapt to our way of seeing the world.
     
  46. angrypenguin

    angrypenguin

    Joined:
    Dec 29, 2011
    Posts:
    15,620
    Computers are very good at rapidly applying existing solutions to well studied problems. They aren't good at studying the problems and coming up with new solutions for themselves.

    Your use of the word "mimic" is key here: the computer is once again applying a pre-existing solution. The fact that it observed that solution rather than having it explicitly coded in is just a matter of input method.

    Are you familiar with the concept of a "Chinese Room"? Neither a person nor a computer has to understand a solution* in order to implement it, or even to make variations and compare their results against a target.

    I've followed plenty of recipes, and tweaked to my own taste often enough. I still know bugger all about cooking. ;)

    * Let alone the problem it arose from.
     
    Kiwasi and Billy4184 like this.
  47. Billy4184

    Billy4184

    Joined:
    Jul 7, 2014
    Posts:
    6,025
    Well, as you yourself implied with the last paragraph, neither do humans need to understand the solution before being able to implement it. So the question is, who needs to understand the solution? Probably no one.
     
  48. angrypenguin

    angrypenguin

    Joined:
    Dec 29, 2011
    Posts:
    15,620
    No, the question is: who needs to understand the problem?

    Sticking to the recipe metaphor, being able to follow and tweak a recipe is an almost completely different set of skills and knowledge from being able to make a new recipe from scratch. "Follow these instructions to make a hamburger" vs. "Design and plan a dessert item for a person who likes...".

    As far as I'm concerned, understanding the problem means being able to define a "solved" state, and see the difference between the "current" and "solved" states. Once you understand that you can start seeing if solutions work, and you can even do that by brute force trying random stuff and measuring the difference between the result and the "solved" state, which computers happen to be pretty good at. But if you don't understand the problem in the first place you can't know what the solved state looks like, and therefore can't tune a solution towards it.

    Dragging this aaaall the way back on topic: if the instructions you're giving a computer are the same as explaining what you'd like to a "reasonably competent programmer", you're assuming that the computer can understand your description, work out the "problem" based on that, and start testing solutions. I'm saying that, for the foreseeable future, it's going to have fundamental issues with at least one of those steps in all but narrowly defined problem spaces.
     
    Last edited: Feb 7, 2020
    Kiwasi likes this.
  49. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    6,493
    But that's exactly what NNs do, on a small, not-yet-human scale.

    What you are asking for is the awareness to redefine the problem when new data is given, which is what I said computers can't do, i.e. reasoning; but that assumes the AI itself is designed to have awareness beyond the problem it is asked to solve, and that's mostly "socio-cultural awareness", i.e. what we call a general intelligence (as opposed to a generalized intelligence).

    So you have kind of moved the goalposts from what you were expressing in writing, BUT I understand you moved them closer to what you actually think, so it's clearer.

    It's just that your initial rebuttal wasn't exact.

    Anyway, just in case: the type of AI we are talking about finds solutions not through imitation but through self-determination; most algorithms I talk about are unsupervised (i.e. there is no set of good examples) and just find structure in data. The big shocker is that basically random input of text is enough to infer SOME degree of understanding, i.e. it wasn't asked to discover that countries have capitals, and yet it's able to do it without any human input, and it's able to apply it correctly to a long text. In fact it can pass simple comprehension tests it wasn't designed for. Which was the huge upset in AI.

    see this


    We are in the midst of a revolution
     
  50. Billy4184

    Billy4184

    Joined:
    Jul 7, 2014
    Posts:
    6,025
    Does an extraordinary artist have the capability to define the 'solved' state of an idea they are working on? A computer might develop a set of 10 million little rules, and the relationships between them, that passes for an understanding of what makes good art. Whether that counts as being able to define the solved state of the 'artistic problem' - I suppose it does in a way. Maybe that's what humans do too, subconsciously.

    Also, the computer would not begin solving the problem at the time it is commissioned to create a game or part of it. No human is born a minute before such an enterprise either. Like a human, the computer would have to learn. Perhaps it could learn very fast, but one way or the other it would have to receive many inputs that define a successful solution to similar problems before it would 'know' what that sort of thing looks like.