
Two-player turn-based boardgames (TicTacToe, ConnectFour, ...)

Discussion in 'ML-Agents' started by ScrubRasta, Jul 23, 2020.

  1. ScrubRasta

    ScrubRasta

    Joined:
    Jun 18, 2016
    Posts:
    7
    I'm a game dev student working on ML-Agents implementations for simple two-player turn-based board games. After getting no results at all on 7x6 ConnectFour and 4x4 ConnectThree, I'm now training a TicTacToe agent, but I expect bad results there too. I wonder whether I am doing something wrong or whether this is just not really possible (I have found nothing but similar failed projects on GitHub).

    If anyone has a good sample project, that would be cool!

    I have some questions:
    - What are good hyperparameters for something like this?
    - Is it better to train with self-play or just train a starting- and a responding agent?
    - Is it bad to train two agents at the same time?
    - Is it bad to train against a random move agent?
    - Is there any way to use a CNN/RNN for Connect Four instead of the default NN?
     
    Last edited: Jul 28, 2020
  2. TreyK-47

    TreyK-47

    Unity Technologies

    Joined:
    Oct 22, 2019
    Posts:
    558
    I'll kick this over to the team for them to have a look, and forward any insight they share.
     
  3. ScrubRasta

    ScrubRasta

    Joined:
    Jun 18, 2016
    Posts:
    7
    @TreyK-47 Thanks man!
    Btw, there are still some bugs in the timing of the CollectObservations and CollectDiscreteActionMasks functions: sometimes they are called in the wrong order (after OnActionReceived). I saw someone report the problem back in 2018, and someone from Unity said it was going to be low priority, but it feels weird that it is still there two years later, since it makes the framework feel very unfinished.
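For reference, the call order this post expects can be sketched in plain Python. This is not ML-Agents code: `legal_move_mask`, `take_turn` and `random_policy` are hypothetical names, and the board is assumed to be a flat list of nine cells with 0 meaning empty. The only point is that the observation and the mask should both be read from the board before the chosen action is applied.

```python
import random

EMPTY = 0  # assumption: 0 marks an empty cell, 1 and 2 mark the two players


def legal_move_mask(board):
    """Return True for every cell that can still be played."""
    return [cell == EMPTY for cell in board]


def take_turn(board, player, choose_action):
    """Run one turn in the order: observe, mask, act."""
    # Steps 1 and 2: observation and mask come from the same pre-move board.
    observation = list(board)
    mask = legal_move_mask(board)
    # Step 3: only now is an action chosen and applied.
    action = choose_action(observation, mask)
    if not mask[action]:
        raise ValueError(f"illegal move {action}")
    board[action] = player
    return action


def random_policy(observation, mask):
    """Hypothetical stand-in for a trained policy: uniform over legal cells."""
    legal = [i for i, allowed in enumerate(mask) if allowed]
    return random.choice(legal)
```

If the mask were computed after the move had been applied, it would describe the next player's position, which is the kind of mismatch the out-of-order callbacks would produce.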
     
  4. TreyK-47

    TreyK-47

    Unity Technologies

    Joined:
    Oct 22, 2019
    Posts:
    558
    No problem! As for those bugs, could you submit new reports for them? That way we can take another look.
     
  5. ScrubRasta

    ScrubRasta

    Joined:
    Jun 18, 2016
    Posts:
    7
    @TreyK-47 Do I just do a normal bug report or is there a special place for MLAgents bugs?
     
  6. ScrubRasta

    ScrubRasta

    Joined:
    Jun 18, 2016
    Posts:
    7
    Also, I noticed that a trained network doesn't always play the same move in the same situation (most obvious in the varying first moves). How should I interpret this result? Does it mean the network values the different options it plays equally (which seems very unlikely), or is there something else going on?
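One possible explanation, assuming the policy samples its move from a probability distribution over actions instead of always taking the most probable one: the moves can then differ between games even when the network clearly prefers one of them. The sketch below is plain Python, not the ML-Agents inference code, and the logits are made-up values for illustration.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(probs, rng):
    """Stochastic choice: draw a move in proportion to its probability."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def greedy_action(probs):
    """Deterministic choice: always the single most probable move."""
    return max(range(len(probs)), key=lambda i: probs[i])


# Hypothetical logits for the nine opening moves: the centre (index 4) is
# preferred, but every other cell keeps a non-zero probability.
logits = [0.5, 0.1, 0.5, 0.1, 1.0, 0.1, 0.5, 0.1, 0.5]
probs = softmax(logits)

rng = random.Random(0)
sampled_moves = {sample_action(probs, rng) for _ in range(200)}
greedy_move = greedy_action(probs)  # index 4, the centre
```

Under this assumption, varying first moves do not mean the options are valued equally, only that none of the alternatives has a probability of zero.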
     
  7. ScrubRasta

    ScrubRasta

    Joined:
    Jun 18, 2016
    Posts:
    7
    I probably have too many hidden_units. Either way, I just noticed that Unity/Python is not using the hyperparameters from my .yaml file but some other parameters (maybe these look familiar). Does anyone know whether this really means my .yaml hyperparameters are being ignored, and what a fix could be?


    2020-07-25 01:29:01 INFO [stats.py:129] Hyperparameters for behavior name TicTacToeBehaviour-1:
    trainer_type: ppo
    hyperparameters:
      batch_size: 1024
      buffer_size: 10240
      learning_rate: 0.0003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
      learning_rate_schedule: linear
    network_settings:
      normalize: False
      hidden_units: 128
      num_layers: 2
      vis_encode_type: simple
      memory: None
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    init_path: None
    keep_checkpoints: 5
    checkpoint_interval: 500000
    max_steps: 500000
    time_horizon: 64
    summary_freq: 50000
    threaded: True
    self_play: None
    behavioral_cloning: None
     
  8. seboz123

    seboz123

    Joined:
    Mar 7, 2020
    Posts:
    18
    You probably need to rename the behaviour in your .yaml so that it matches the behaviour name of your Unity brain. Just a quick little thing you could try.
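A rough sketch of why a name mismatch would produce the log above, assuming the trainer looks up settings by behaviour name and falls back to defaults when no entry matches. `resolve_settings` and the flat settings dict are hypothetical simplifications, not the ML-Agents implementation; the default values are the ones printed in the log, and "TicTacToe" is a made-up example of a mismatched key.

```python
# Defaults taken from the log output in this thread (flattened for brevity).
DEFAULTS = {"batch_size": 1024, "buffer_size": 10240, "hidden_units": 128}


def resolve_settings(behaviors, behavior_name):
    """Return (settings, matched) for one behaviour name.

    `behaviors` maps behaviour names from the config file to their overrides.
    If the name is missing, the defaults are returned unchanged.
    """
    if behavior_name in behaviors:
        merged = dict(DEFAULTS)
        merged.update(behaviors[behavior_name])
        return merged, True
    return dict(DEFAULTS), False


# Hypothetical config whose key does not match the name shown in the log.
mismatched = {"TicTacToe": {"hidden_units": 32}}
settings, matched = resolve_settings(mismatched, "TicTacToeBehaviour-1")
# matched is False and hidden_units stays at the default of 128.

# After renaming the key to the behaviour name, the override is picked up.
renamed = {"TicTacToeBehaviour-1": {"hidden_units": 32}}
settings, matched = resolve_settings(renamed, "TicTacToeBehaviour-1")
```

A quick check is to compare the name in the "Hyperparameters for behavior name ..." log line with the key in the .yaml file, character for character.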
     
  9. ScrubRasta

    ScrubRasta

    Joined:
    Jun 18, 2016
    Posts:
    7
    @seboz123 Thanks man, that worked out fine!
     
  10. TreyK-47

    TreyK-47

    Unity Technologies

    Joined:
    Oct 22, 2019
    Posts:
    558