Problem with getting string of certain pattern to validate.

Discussion in 'Scripting' started by 5c4r3cr0w, Oct 3, 2018.

  1. 5c4r3cr0w

    Joined:
    Mar 8, 2016
    Posts:
    26
    OK, so I am new to regular expressions.
    I am trying to create an input validation that only accepts the first letters of the arrow keys, separated by hyphens, like:

    "U-D-L-R"

    After several attempts on https://regexr.com/ I was able to create a pattern that matches it. It looks like this:
    ([UDRL]*)([-])([UDRL])

    But in the Unity Editor it returns true even when I enter invalid input.

    Below is my code to validate the input:

    Code (CSharp):
    public string Dummy = "U-D-F-C-D-R-N-E";

    private void isStringValid()
    {
        string pattern = "([UDRL]*)([-])([UDRL])";
        bool isMatch = Regex.IsMatch(Dummy, pattern, RegexOptions.None);
        Debug.LogError("isMatch = " + isMatch);
    }
    Am I doing something wrong?
     
  2. Baste

    Joined:
    Jan 24, 2013
    Posts:
    5,334
    Yes. Regex works by looking for something that matches in the input string, so it returns true if any substring of your input matches the regex.

    If you want to check if the entire input matches, you want:

    Code (csharp):
    Match match = Regex.Match(Dummy, pattern);
    // Success guards against the no-match case, where match.Length is 0
    bool matchedEntireString = match.Success && match.Length == Dummy.Length;

    By the way, that regex isn't what you're looking for. Let's break it down:

    [UDRL]* is "any number of either U, D, L or R"
    [-] is "a single - symbol". Note that using a set here (i.e. []) is completely unnecessary, as you only have a single character in the set.
    [UDRL] is "either U, D, L or R"

    So you can check on regexr that e.g. UDLD-R would be a match.
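
    To see which part of the input the original pattern actually matches, it can help to print the match itself. A minimal sketch in plain C#; the class name PatternProbe and the method Describe are made up for illustration, and in Unity the returned string could be passed to Debug.Log:

```csharp
using System.Text.RegularExpressions;

public static class PatternProbe
{
    // Hypothetical helper: reports the first substring that the pattern matches.
    public static string Describe(string input, string pattern)
    {
        Match match = Regex.Match(input, pattern);
        if (!match.Success)
            return "no match";
        return "matched \"" + match.Value + "\" at index " + match.Index;
    }
}
```

    With the original pattern "([UDRL]*)([-])([UDRL])", Describe("U-D-F-C-D-R-N-E", ...) reports that "U-D" matched at index 0, which is why IsMatch returns true for the invalid sample, and Describe("UDLD-R", ...) reports that the whole string matched.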

    What you want is:

    ([UDRL]-)*([UDRL])
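
    Another common way to require that the whole input matches, not mentioned above, is to anchor the pattern itself so that Regex.IsMatch can be used directly. A minimal sketch in plain C#; ArrowPathValidator and IsValid are illustrative names, and \A and \z are used rather than ^ and $ because $ in .NET also matches just before a trailing newline:

```csharp
using System.Text.RegularExpressions;

public static class ArrowPathValidator
{
    // \A and \z anchor the pattern to the very start and very end of the input.
    // (?: ... ) is a non-capturing group, since the captures are not needed for validation.
    private static readonly Regex Pattern = new Regex(@"\A(?:[UDRL]-)*[UDRL]\z");

    // Returns true only when the entire input is letters from UDRL separated by single hyphens.
    public static bool IsValid(string input)
    {
        return input != null && Pattern.IsMatch(input);
    }
}
```

    Note that, like the pattern above, this accepts a single letter such as "U" with no hyphen at all. If at least two letters are required, the * could be changed to +.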
     
  3. Baste

    Joined:
    Jan 24, 2013
    Posts:
    5,334
    Here's a snippet of code you can use to see exactly what's going on. Figured I'd rather post that than try to explain what Captures are:

    Code (csharp):
    // paste me wherever
    // needs: using System.Text; using System.Text.RegularExpressions;
    //        using UnityEditor; using UnityEngine;
    [MenuItem("Foo/Foo")]
    private static void Foo()
    {
        string input = "U-D-L-R";
        string pattern = "([UDRL]-)*([UDRL])";

        StringBuilder debugOutput = new StringBuilder();

        var match = Regex.Match(input, pattern);
        debugOutput.AppendLine($"{input} matched against {pattern}");
        debugOutput.AppendLine($"the length of the match is {match.Length}");
        if (match.Length == input.Length)
            debugOutput.AppendLine("the entire input was matched!");
        else
            debugOutput.AppendLine("the entire input was not matched!");

        debugOutput.AppendLine($"There are {match.Groups.Count} groups");
        for (var i = 0; i < match.Groups.Count; i++) {
            Group group = match.Groups[i];
            if (i == 0) {
                debugOutput.AppendLine("  group 0 is always the entire matched input, rather than a group!\n" +
                                       "    This is strange, but is done since regex indexes are traditionally 1-indexed rather than 0-indexed");
            }
            else {
                debugOutput.AppendLine($"  group {i} has the value: {group.Value}, and contains {group.Captures.Count} captures:");
                for (var j = 0; j < group.Captures.Count; j++) {
                    Capture capture = group.Captures[j];
                    debugOutput.AppendLine($"    capture {j}: {capture.Value}");
                }
            }
        }

        Debug.Log(debugOutput.ToString());
    }
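
    For trying the same idea outside the Unity Editor, here is a condensed sketch in plain C# that collects the captures of each numbered group into a list. The names CaptureDemo and ListCaptures are hypothetical and only serve the illustration:

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Returns one entry per capture of every numbered group (group 0 is skipped).
    public static List<string> ListCaptures(string input, string pattern)
    {
        var result = new List<string>();
        Match match = Regex.Match(input, pattern);
        if (!match.Success)
            return result;

        for (int i = 1; i < match.Groups.Count; i++)
        {
            foreach (Capture capture in match.Groups[i].Captures)
                result.Add("group " + i + ": " + capture.Value);
        }
        return result;
    }
}
```

    For "U-D-L-R" and "([UDRL]-)*([UDRL])" this yields three captures for group 1 ("U-", "D-", "L-") and one for group 2 ("R"). A repeated group keeps every capture, while group.Value only holds the last one.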
     
  4. 5c4r3cr0w

    Joined:
    Mar 8, 2016
    Posts:
    26
    ^ This works...!!! :)

    It took me a little while to understand that Regex.IsMatch() returns true even if only one or two characters match the pattern.

    After that I tried to implement it with MatchCollection and Matches like you suggested. What I found was that only the characters that matched the pattern were included in the collection. So yesterday I came up with the code below:


    Code (CSharp):
    private bool isStringValid(string path)
    {
        bool isValidString = false;

        if (path.Contains('-'))
        {
            string[] array = path.Split('-');
            // Split at every position that precedes U, D, L or R
            string[] hyphens = Regex.Split(path, @"(?=[UDLR])");

            isValidString = array.Length == (hyphens.Length - 1);
        }

        saveBtn.interactable = (NPCMoveDD.value == 1) ? isValidString : true;

        return isValidString;
    }
    This code worked pretty well in the several tests I conducted. I'm still not sure it's very robust, though, so I will definitely try your solution. :)
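
    For comparison, a split-based check can also be written without any regex by validating each hyphen-separated token on its own. This is only a sketch under assumptions, not the code from this thread: SplitValidator and IsValid are illustrative names, and unlike the code above it accepts a single letter with no hyphen.

```csharp
public static class SplitValidator
{
    // Valid when every hyphen-separated token is exactly one of U, D, L or R.
    public static bool IsValid(string path)
    {
        if (string.IsNullOrEmpty(path))
            return false;

        foreach (string token in path.Split('-'))
        {
            // An empty token means a leading, trailing or doubled hyphen.
            if (token.Length != 1 || "UDLR".IndexOf(token[0]) < 0)
                return false;
        }
        return true;
    }
}
```

    Checking each token individually rejects inputs such as "-UD", "U--D" and "U-D-", which are worth including when testing any count-based comparison.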
     