Self Play and Action mask problems on Turn Based Game

Nicolas-Vidal-Iscool · Feb 22, 2020

Hey there,

I wanted to give Unity ML Agents another shot for a multi-agent turn-based game and, to start simply, I tried to make a simple Tic-Tac-Toe.

With one Agent versus a random opponent controlled automatically by the environment (the random opponent plays immediately and synchronously after the learner Agent's action), I managed to get a near-optimal policy with SAC. (I used the Decision Requester.)

But then I wanted to replace the random agent with another learner agent, as "Self Play" was recently introduced.

For testing purposes I used a Random Heuristic for both agents, and with some manual calls to Request Decision I managed to have them play a bunch of games in Heuristic mode.
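
For illustration, a random heuristic of this kind has to pick only among legal moves and hand the turn over after each action. The following is a minimal Python sketch of that loop, not the Unity C# code from the project: the 9-cell board encoding and the names `legal_mask`, `random_legal_move`, and `play_random_game` are assumptions made for the example.

```python
import random

EMPTY, X, O = 0, 1, 2

# The eight winning lines of a 3x3 board, indexed 0..8 row by row.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def legal_mask(board):
    """Return one boolean per cell: True if the cell is empty (move allowed)."""
    return [cell == EMPTY for cell in board]


def random_legal_move(board, rng):
    """Pick uniformly among the cells that the mask allows."""
    allowed = [i for i, ok in enumerate(legal_mask(board)) if ok]
    return rng.choice(allowed)


def winner(board):
    """Return X or O if a line is complete, otherwise None."""
    for a, b, c in WIN_LINES:
        if board[a] != EMPTY and board[a] == board[b] == board[c]:
            return board[a]
    return None


def play_random_game(seed=0):
    """Alternate two random players until a win or a full board."""
    rng = random.Random(seed)
    board = [EMPTY] * 9
    player = X
    moves = []
    while winner(board) is None and EMPTY in board:
        move = random_legal_move(board, rng)
        board[move] = player
        moves.append(move)
        player = O if player == X else X  # hand the turn to the other agent
    return board, moves, winner(board)
```

Because every move is drawn from the masked set, no cell is ever played twice, which is the property a trained agent should also show when the mask reaches the trainer.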

Then I tried to let them play in Learn mode (but on the same team and without Self Play hyperparameters in the config). It kind of works (even though I had to sync the RequestDecision calls with the Environment OnAgentStatus callback).

Then I tried to enable self play by setting one of the agents to Team 1 (instead of 0) and setting some self play hyperparameters in the config. And that's where I'm stuck. If I do this, one of the agents completely ignores the action mask and seems to play randomly. The other agent doesn't seem to learn anything, and the reported ELO stays flat at 1200.
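
As background on the flat rating, the textbook Elo update leaves two equally rated players unchanged whenever a game is scored as a draw, so a rating that never leaves its starting value can mean that no decisive results are being recorded. The sketch below shows that arithmetic in Python. It is the standard Elo formula, not necessarily the exact computation ML-Agents performs, and the function names and the `k=16.0` factor are assumptions chosen for the example.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=16.0):
    """Return the new (rating_a, rating_b) after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    """
    change = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + change, rating_b - change


# A draw between equal ratings changes nothing; a win moves both ratings.
print(update_elo(1200.0, 1200.0, 0.5))  # (1200.0, 1200.0)
print(update_elo(1200.0, 1200.0, 1.0))  # (1208.0, 1192.0)
```

Under these assumptions, checking whether episodes actually end with a distinct win or loss signal for each team is one way to narrow down why the rating does not move.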

I tried this on latest release branch and master branch.

Any idea what I should do to be able to run some tests with self play on turn-based multi-agent games?

Thanks a lot.

Cheers.

Nicolas

TreyK-47 · Feb 26, 2020

Hey there @Nicolas-Vidal-Iscool - we'll circulate this for the team to review. Could you tell us which C# and Python versions you're running?

p2b · Feb 27, 2020

Hi @Nicolas-Vidal-Iscool, I'm working on a similar project (as a matter of fact, it's the exact same idea) and I faced the same issue as you did. Sadly, I do not have a solution and am looking for one myself. I was working with ml-agents v0.13.1, and in a similar fashion one of the agents would completely ignore the mask and act randomly. A few Debug statements revealed that the masks were being sent to the Python API correctly for only one agent.
I switched to the v0.14.0 latest release today and tried the same thing, but now I am facing a new issue: the first agent collects observations and makes a move, but when it's the second agent's turn, it collects observations and then stops indefinitely. It seems as if it does not receive actions from the Python API from this point forward for some reason.

I would really appreciate some help and guidance regarding the issue.
The project can be found here: https://github.com/Prabhav2B/Tac-Tic

Nicolas-Vidal-Iscool · Feb 26, 2020

Hi @p2b and @TreyK-47. Indeed, with 0.14 Python and C#, the second agent was stopping indefinitely. Then, when I switched to the master version of C# and Python (commit ff3c5e013ee0b585b1a7713baecaa46baa9a7da4), I ran into the issue I described (the mask is ignored).

I will try to upload the issue to a GitHub repository tomorrow.

Thank you all for looking into this.

p2b · Mar 12, 2020

Hey @TreyK-47, any updates regarding this issue?

bobchalmers · Jul 26, 2020

Unity ml-agents version 1.2, Python 0.18; basically the same issue. Self-play with the correct fields in the config .yaml file, two teams, and a turn-based game, so I don't really want the Decision Requester. Only team 0 acts. No replies to RequestDecision come back for the second player (team 1).

ScrubRasta · Jul 29, 2020

@TreyK-47 Any updates on this? I'm running into the same issues. The action mask is ignored.

christophergoy · Jul 8, 2021

Hi, I've reached out to our research team to see if we can diagnose and solve this.

christophergoy · Jul 8, 2021

Also, are you sure that the agent on gameobject 2 is actually writing to the action mask?

christophergoy · Jul 8, 2021

We did some internal testing and found that action masks on agents in self play were going through.

R0b_g · Aug 24, 2021

Hi there, I bet this is too late and it might not even work for you, but I had a similar problem, and setting
threaded: false
in the .yaml configuration file worked for me.
