
Discussion about Gestures in XR

Discussion in 'AR/VR (XR) Discussion' started by MadanyNO, Dec 30, 2019.

  1. MadanyNO (Joined: Apr 18, 2016, Posts: 16)
    hi everyone,

    I wanted to start a little discussion about gestures in XR, as I've started getting into it over the last few weeks. I've seen a few assets focused on this in the Asset Store, but they weren't quite what I was looking for, at least from what I tried.

    A bit of backstory:

    I recently moved into XR development after my workplace got an HTC Vive, along with an old demo project someone had left in pieces in Unity 5.6.x, and I was asked to put the parts back together.

    One part of the project was recognizing gestures the user made with the controllers in hand, then triggering an action from them. The demo used VR Infinite Gesture to show that gestures could be used, but it was basically just the VR Infinite Gesture example project rather than an actual use of it. Since this was an old Unity version, I wanted to try newer things, and I found AirSig's 3D Motion Gesture and Signature Recognition (for HTC Vive), which builds on the SteamVR asset for its tracking.

    Both this asset and Infinite Gesture need a button held down for the full gesture to trigger recognition. For my current project that's good enough, but I was looking for pure gesture recognition, with no buttons whatsoever.

    For example, take the VR Infinite Gesture example from their video, but without pressing any button to start the gesture: just the gesture itself.

    Also, with the Valve Knuckles controllers and the new VRfree gloves by Sensoryx, we can now use more complex hand gestures. For example, making a shape with your hands could activate an action: forming a heart symbol with your hands could spawn a heart balloon, or something like that.

    How would you go about starting something like this?
    Disclaimer: I don't expect anyone to make this for me; it's more about brainstorming the right way to think when working on something like this.

    I quite liked how VR Infinite Gesture recorded actions. I don't even mind a button being used for the recording step, just so I can capture the gestures. But how do I detect that a gesture is being made?

    Do I track the controller positions at all times and compare my whole gesture database against the latest position inputs to look for a match? That sounds a bit heavy when you have a lot of gestures to work with.
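    For concreteness, "keeping track of the positions at all times" could look roughly like this rough sketch; every name here is made up, and the pose source is a placeholder for whatever your SDK provides (SteamVR, XR Input, etc.):

    ```csharp
    // Sketch only: continuously sample the controller position into a
    // fixed-size history that a recognizer could compare against.
    using System.Collections.Generic;
    using UnityEngine;

    public class PoseHistory : MonoBehaviour
    {
        public int maxSamples = 120;              // ~2 seconds at 60 fps
        readonly Queue<Vector3> samples = new Queue<Vector3>();

        void Update()
        {
            samples.Enqueue(GetControllerPosition());
            while (samples.Count > maxSamples)    // drop the oldest samples
                samples.Dequeue();

            // a recognizer could now compare "samples" against each
            // recorded gesture and raise an event on a match
        }

        Vector3 GetControllerPosition()
        {
            return transform.position;            // placeholder pose source
        }
    }
    ```

    The cost question then becomes how often, and against how many gestures, that buffer gets compared.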

    Should user training samples be added to the gesture database, so it becomes more user-specific? That would make the database heavier once there's more than one user, though.


    And as an extra thinking point: can we use the new DOTS to make all this lighter and get better results?

    Again, this is more of a brainstorming discussion about the right way to go about it, so feel free to throw in all your takes; I'd love to hear them all. Sorry in advance for my English, and thanks for any input and ideas.
     
  2. arfish (Joined: Jan 28, 2017, Posts: 782)
    Hi

    I guess it depends on how complex the gestures are. Perhaps they could be chosen to be easy to identify with just a history of a few samples.

    Another path might be to sample the positions of the hands to draw figures in an image, and then use image recognition to identify gestures.
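    A minimal sketch of that second idea, under the assumption that the sampled hand positions are flattened onto a plane and rasterized into a coarse grid that template matching or an image classifier could consume (all names invented):

    ```csharp
    // Sketch: project a sampled 3D hand path onto the XY plane (front view,
    // drop Z) and draw it into a small boolean grid. Purely illustrative.
    using UnityEngine;

    public static class PathRasterizer
    {
        public static bool[,] Rasterize(Vector3[] path, int size)
        {
            // find the 2D bounds of the path
            Vector2 min = new Vector2(float.MaxValue, float.MaxValue);
            Vector2 max = new Vector2(float.MinValue, float.MinValue);
            foreach (Vector3 p in path)
            {
                min = Vector2.Min(min, (Vector2)p);
                max = Vector2.Max(max, (Vector2)p);
            }

            // avoid division by zero for degenerate paths
            Vector2 extent = Vector2.Max(max - min, new Vector2(1e-4f, 1e-4f));
            bool[,] grid = new bool[size, size];

            // mark every cell a sample falls into
            foreach (Vector3 p in path)
            {
                int x = Mathf.Min(size - 1, (int)((p.x - min.x) / extent.x * size));
                int y = Mathf.Min(size - 1, (int)((p.y - min.y) / extent.y * size));
                grid[x, y] = true;
            }
            return grid;
        }
    }
    ```

    A second call with Z instead of X would give a side view, which helps with gestures that are mostly depth movement.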
     
  3. MadanyNO
    Hi @arfish, thanks for joining :D

    I was thinking about handling gestures of all levels of complexity, a general way to deal with all of them.
    I agree there are a few kinds of gestures: there are moving gestures like the ones in VR Infinite Gesture that I linked above, or static ones where you make a shape by positioning the controllers/hands to form signs, like sign language.

    If I had to choose, I'd focus on the moving kind, for two reasons:
    1. Most VR sets don't come with ready-to-use hand recognition, and I'd like to use normal basic controllers for now (since that's what I have to work with at the moment, too XD).
    2. I think it's the more interesting direction, since recognizing a static sign can be handled as part of the moving gestures, and the moving case seems to have more depth to it.
    About using image recognition for identification: that's an interesting take, but I'm wondering about two parts of it:
    1. As can be seen in the VR Infinite Gesture example, gestures can be three-dimensional, so an image of the action's path may lose the 3D part of the movement.
    2. The order in which the shape was drawn is lost this way, so if two gestures trace the same path, their images may overlap and be identified wrongly. For example, drawing a circle in front of you once clockwise and once counter-clockwise will look the same to image recognition, I think. If I'm wrong, please correct me ^^"
    It can still be used as a filtering factor, though: filter out the gestures that don't fit the path image, which leaves us with fewer candidates to compare against.
    Also, we could use images of the path from two directions to capture more of it, say one image from the front and one from the side of the gesture's path.

    As I see it for now, I'm going to keep a buffer array of the pose inputs from the controllers, with a timestamp for each pose so the positions can be ordered correctly when comparing later.
    The array will be checked for any of the recorded gestures the system has, and when one is found, a flag is raised for it.
    The checking process will be: create front and side images of the path the gesture took, compare them to the recorded gestures' path images to filter candidates, and if more than one gesture still fits, compare them by the order the path was taken and see which fits best.

    What do you guys think?
     
  4. arfish
    It's sad that Google decided to discontinue Daydream. Gesture tracking shouldn't be too hard to implement for their handheld controller with a gyro.

    I wonder if there are any handheld controllers available to use with Google Cardboard?
     
  5. Follet (Joined: May 18, 2018, Posts: 38)
  6. MadanyNO
    Hi there @Follet. No, I didn't find any better way to do this... I've been testing options here and there, but the project I was doing this for was put on hold. I guess I'll start thinking about it again soon, as it's been taken off hold.

    For now I've played with https://assetstore.unity.com/packag...-and-signature-recognition-for-htc-vive-95144

    which lets you record both hands together, and also each hand by itself, but I haven't found anything that gives me the motions without "start gesture" and "end gesture" events...

    I've learned how to get the position and rotation of my controllers, but not how to detect that a gesture is being made without start and end events... I'm lost there for now. I'll follow your thread and update there too if I find anything, but for now I have nothing new.
     
  7. Follet
    @MadanyNO Hey, I've tried using colliders, but it's not very accurate. I'm now trying another method, which is creating an array/list of transforms, and it's looking promising so far. It works like this: first, I record a movement with my hand, storing a new position whenever the hand is more than X away from the last stored position. Once I have this list of recorded positions, I build a new one for the user, and if the distances between my recording and the user's are < Y, then it does whatever.
    As I said, so far it's working really well.
     
  8. MadanyNO
    Hi @Follet,
    I also thought about this solution, but I got stuck when thinking about when to stop recording the actions.

    In your idea, it seems like you stop when the action falls outside the error margin Y? But how did you determine Y? In the end, each person is different, so some people may always be outside the range of your Y.

    Also, I didn't get the part about recording your position whenever it's X distance away. How do you distinguish a complex movement with small changes from a simpler one? For example, if I zigzag from point one to point two, and another action covers the same distance between the same points in a straight line, but the distance is just one recording step, how do you tell the difference between them?

    Sorry if my English is messy ^^" but thanks a lot for sharing your idea :D
     
  9. Follet
    @MadanyNO Hello, I'll share part of my code, simplified to be conceptual, with names changed to be easier to understand. These names and calls are meant to make the idea clear; to be efficient you should create parameters for many of the things here:

    Code (CSharp):
    private void RecordUserPositions(int lastIndex)
    {
        if (lastIndex == 0) // when starting to check, save the first position at the hand position
        {
            // a pool of pre-created empty GameObjects used to mark where the hand was
            positionCurrentPathPool[poolCurrentPathUsedIndex].transform.position = GetRightHandPosition();

            // the list where the new positions will be stored
            userPositions.Add(positionCurrentPathPool[poolCurrentPathUsedIndex]);

            poolCurrentPathUsedIndex++;
            lastIndex++;
        }
        else
        {
            // check the distance from the hand position to the last saved position
            currentDistanceFromLastPoint = GetDistance(GetRightHandPosition(), userPositions[lastIndex - 1].transform.position);

            // if the hand moved far enough, add a new user point. distanceRequired is a fixed number set by you
            if (currentDistanceFromLastPoint >= distanceRequired)
            {
                positionCurrentPathPool[poolCurrentPathUsedIndex].transform.position = GetRightHandPosition();
                userPositions.Add(positionCurrentPathPool[poolCurrentPathUsedIndex]);
                currentDistanceFromLastPoint = 0;
                CheckSavedVsUser(lastIndex);
                poolCurrentPathUsedIndex++;
                lastIndex++;
            }
        }
    }

    private void CheckSavedVsUser(int currentIndex)
    {
        // goodPath1 is another list with the recorded positions of ONE gesture, like a zigzag, a circle or whatever
        if (currentIndex <= goodPath1.Count - 1 && currentIndex > 0)
        {
            // distance from the last saved point to the corresponding point of the recorded path
            differenceDistance = GetDistance(userPositions[currentIndex].transform.position, goodPath1[currentIndex].transform.position);
            if (differenceDistance <= differenceMinimum) // differenceMinimum is also a fixed amount set by you
            {
                // user is following the path
                if (currentIndex >= goodPath1.Count - 1)
                {
                    // user reached the end of the path. Do your action here
                }
            }
            else
            {
                // user is too far from the path, so it's discarded
            }
        }
    }
    The first method is called from a button press, or whenever you want.

    This is a simplified part of what I'm doing. You can add extra checks, increase the check distances as you get further from the original point, check multiple paths at the same time and discard the ones that drift too far, etc. But this is basically the core of it.

    The precision of these movements depends on the number of points you set (defined by the distance check, "distanceRequired" in the code above).

    If you have any problems or questions, just ask :) I'd suggest creating a LineRenderer that follows "goodPath1"/the recorded positions to test it. Cheers
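    That LineRenderer suggestion could look roughly like this minimal sketch, assuming goodPath1 is a List<Transform> as in the snippet above (all other names are made up):

    ```csharp
    // Sketch: draw the recorded path with a LineRenderer so you can see
    // the path the user is supposed to follow while testing.
    using System.Collections.Generic;
    using UnityEngine;

    public class PathDebugView : MonoBehaviour
    {
        public List<Transform> goodPath1;   // the recorded gesture path

        void Start()
        {
            var line = gameObject.AddComponent<LineRenderer>();
            line.widthMultiplier = 0.01f;
            line.positionCount = goodPath1.Count;
            for (int i = 0; i < goodPath1.Count; i++)
                line.SetPosition(i, goodPath1[i].position);
        }
    }
    ```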
     
    Last edited: Mar 3, 2020
  10. MadanyNO
    Thanks for sharing your work @Follet. I get what you did there, and it's something I thought of doing too.

    My problem is that, taking your code as an example, I don't want to start the first function with a button click or anything like that.

    I want making a movement to be what triggers the first function, or to call it every fixed update. But if I do that, it's going to be very heavy on my system: after every movement, checking whether I'm following one of the paths in memory.

    Here is something I'm thinking of doing, or kind of already do:

    Code (CSharp):
    public struct GesturePos
    {
        public Vector3 Position;
        public Quaternion Rotation;
    }

    public enum IsMovementGesture
    {
        No,
        PartOf,
        Yes
    }

    public class Gesture
    {
        public GesturePos[] Path;

        public Gesture(GesturePos[] path)
        {
            Path = path;
        }

        public IsMovementGesture CheckMovementOnPath(GesturePos[] movement)
        {
            // if the movement is longer than the path, don't check it
            // (may need a safety check to see if it's still close enough to be in the gesture)
            if (movement.Length > Path.Length)
                return IsMovementGesture.No;

            bool SameLength = movement.Length == Path.Length;

            for (int i = 0; i < movement.Length; i++)
            {
                // compare the GesturePos entries; if they are not close enough, return IsMovementGesture.No;
            }

            return SameLength ? IsMovementGesture.Yes : IsMovementGesture.PartOf;
        }
    }
    Code (CSharp):
    using System.Collections.Generic;
    using UnityEngine;

    public class CheckGestures : MonoBehaviour
    {
        List<Gesture> Gestures = new List<Gesture>();
        List<Gesture> FilteredGestures = new List<Gesture>();
        List<GesturePos> movement = new List<GesturePos>();

        // Update is called once per frame
        void Update()
        {
            movement.Add(GetCurrentGesturePos());

            // filter the gesture list
            FilterGestures();

            // if the filtered list is empty, no gesture fully or partly matches the movement,
            // so we pop the oldest GesturePos from the movement list and filter again,
            // until we get a non-empty list or we have no movement left to check
            if (FilteredGestures.Count == 0)
            {
                do
                {
                    movement.RemoveAt(0);
                    FilteredGestures = new List<Gesture>(Gestures);
                    FilterGestures();
                } while (FilteredGestures.Count == 0 && movement.Count > 0);
            }
        }

        private void FilterGestures()
        {
            // if the movement is empty, the filtered list is empty too
            if (movement.Count == 0)
            {
                FilteredGestures = new List<Gesture>();
                return;
            }

            // make a new list to hold the gestures that survive the filter
            List<Gesture> newFilteredList = new List<Gesture>();
            IsMovementGesture doesFit;

            // filter out the gestures that don't fit the movement
            foreach (Gesture gesture in FilteredGestures)
            {
                doesFit = gesture.CheckMovementOnPath(movement.ToArray());
                if (doesFit == IsMovementGesture.Yes)
                {
                    // fire the "gesture found" function here
                    PlayerDidGesture(gesture);

                    // reset the filter list and the movement list
                    movement = new List<GesturePos>();
                    FilteredGestures = new List<Gesture>(Gestures);
                    return; // stop here so the reset isn't overwritten below
                }
                else if (doesFit != IsMovementGesture.No)
                    newFilteredList.Add(gesture);
            }

            // set the new filtered list to FilteredGestures
            FilteredGestures = newFilteredList;
        }

        private void PlayerDidGesture(Gesture gesture)
        {
            // do the action that fits the gesture the player did
        }

        private GesturePos GetCurrentGesturePos()
        {
            // get the current controller pose here
            return default;
        }
    }
    That's what I've come to think I should do, but I'm not sure about the Update function I wrote.
    I'm worried it would be too heavy or too slow to get the gestures right, so now I'm wondering if there's a way to make this process better.

    I'd be glad to hear opinions on that part, or on any other part if you think there's a better way to do it.
     
  11. Follet
    @MadanyNO I think it will indeed be heavy to check all gestures continually. Your code seems really fine. Right now I think you could build another List<Gesture> with only the first/last two or three poses (position and rotation) set and compare only those in the filter. If this short list "fits", then check the rest using the list with the whole path to see if it's the same movement.

    I don't know how to explain it better, but I'll try with different words. It seems you'll be continually recording and comparing the current movement against the already-saved paths. My suggestion is, instead of checking all positions of the recorded paths, create a list with just the first/last positions of each recorded path, and if that check passes, use the list with the whole path to compare against the current user path, discarding the other paths (using a break inside the foreach, maybe). That is, of course, if I understood correctly what you're doing.

    In any case, trying to get the gestures without pushing a button, checking non-stop, is going to be heavy no matter what, I think.
    Good luck, and thanks to you too for sharing your idea!
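    That cheap pre-filter idea could be sketched like this, using plain positions for simplicity (the class, constants, and tolerance value are all hypothetical, not from the code above):

    ```csharp
    // Sketch: compare only the first few points of the user's movement
    // against each recorded path before doing the full comparison.
    using UnityEngine;

    public static class GesturePreFilter
    {
        const int PrefixPoints = 3;     // how many leading points to compare
        const float Tolerance = 0.05f;  // metres; tune for your gesture set

        public static bool PrefixMatches(Vector3[] movement, Vector3[] path)
        {
            int n = Mathf.Min(PrefixPoints, Mathf.Min(movement.Length, path.Length));
            for (int i = 0; i < n; i++)
                if (Vector3.Distance(movement[i], path[i]) > Tolerance)
                    return false; // diverges early, skip the full comparison
            return true;          // worth running the full path comparison
        }
    }
    ```

    Paths that fail this check can be discarded before the per-point comparison of the whole path ever runs.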
     
  12. MadanyNO
    @Follet yes, you're right, it would be heavy, so I'm trying to think of ways to make it lighter.

    For now, because of my filtered list, I don't go over all the gestures I have, but it will still be heavy when I reset my filtered list after no path fits. But it's something XD

    I think I've found another point to make it lighter, after reading what you said about checking only the first and last parts.

    It made me think about my CheckMovementOnPath function from before:

    Code (CSharp):
    public IsMovementGesture CheckMovementOnPath(GesturePos[] movement)
    {
        // if the movement is longer than the path, don't check it
        // (may need a safety check to see if it's still close enough to be in the gesture)
        if (movement.Length > Path.Length)
            return IsMovementGesture.No;
        bool SameLength = movement.Length == Path.Length;
        for (int i = 0; i < movement.Length; i++)
        {
            // compare the GesturePos entries; if they are not close enough, return IsMovementGesture.No;
        }
        return SameLength ? IsMovementGesture.Yes : IsMovementGesture.PartOf;
    }
    Since I had already checked the whole path in the previous check, I was just rechecking it unnecessarily. So I came up with this as a better way to do it:

    Code (CSharp):
    public IsMovementGesture CheckMovementOnPath(GesturePos[] movement)
    {
        // if the movement is longer than the path, don't check it
        // (may need a safety check to see if it's still close enough to be in the gesture)
        if (movement.Length > Path.Length)
            return IsMovementGesture.No;
        bool SameLength = movement.Length == Path.Length;
        GesturePos newestMove = movement[movement.Length - 1];
        GesturePos moveShouldBe = Path[movement.Length - 1];

        // compare the two GesturePos entries; true means they are close enough
        bool checkPoint = CompareGesturePoints(newestMove, moveShouldBe);

        return checkPoint ? (SameLength ? IsMovementGesture.Yes : IsMovementGesture.PartOf) : IsMovementGesture.No;
    }
    This way I only check the newest point I get, and whether it fits the path or not, which removes most of the heavy checking I was doing. I'll need to change the FilterGestures() function to fit this new way of checking the path, because of what happens now when I get to an empty FilteredGestures list. I'm thinking about making it a recursive check, but I'm still thinking about it.