ml-agents, team self play, handling dead agents during episode.

m4l4 · Jul 28, 2020

Hi everyone, I've been experimenting with the self-play concept and the idea of a cooperative/competitive environment.
The first version of the env is basically a team-based "food collector" (the one in the ml-agents examples).

2 teams, 3 agents each.
Agents' health decreases over time, there's food all over the ground, and the last team standing wins the match.
Agents can shoot lasers to freeze other agents in place.

The reward is really simple: an agent gets -1 if its health drops to zero, and each agent on the winning team gets +1.
(It's probably a good idea to also give -1 to the losing team.)

My question is about the agents with 0 health. How should I handle them while the episode is still in progress?
Right now, I just use gameObject.SetActive(false), and I reactivate them in OnEpisodeBegin().

How does being deactivated affect the training of the agents? How does an agent perceive what happens while it's inactive? How can it interpret being deactivated for most of the match and then getting rewarded at the end of the episode? I don't even know if AddReward() has any effect while it's in that state.

Should I leave them on the ground, change their tag to "deadAgent", add an isDead bool to the observation space, and just ignore their outputs while isDead (as I do for isFrozen)?

Or put them in a cage until the end of the match, like a hockey penalty box?

awjuliani · Jul 28, 2020

Hello. If you deactivate an agent, then it no longer sends or receives observations, actions, and rewards. As long as you punish the dead agent before deactivating it, the reward should be received. The issue with keeping the agent around is that it might learn the wrong relationships between the observations and actions.
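
A minimal sketch of the "punish before deactivating" order described above. The class name TeamAgent, the health field, and the Die() helper are hypothetical and only illustrate the sequence; AddReward() and SetActive() are the calls already mentioned in this thread.

```csharp
using Unity.MLAgents;
using UnityEngine;

// Hypothetical agent class; TeamAgent, health and Die() are illustrative names.
public class TeamAgent : Agent
{
    public float health = 100f;

    // Assumed to be called by the game logic when health reaches zero.
    void Die()
    {
        // Assign the death penalty first, while the agent is still active...
        AddReward(-1f);

        // ...and only then deactivate the GameObject. After this point the
        // agent no longer exchanges observations, actions, or rewards.
        gameObject.SetActive(false);
    }
}
```

With this ordering, any reward meant for the dead agent has to be assigned before the SetActive(false) call.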

m4l4 · Jul 28, 2020

That's what I was thinking. Punishing the agent before deactivation is obviously not an issue (nor is reactivating it before the final reward).
But the problem remains: how does an agent "interpret" being deactivated?
Any suggestions on how to handle them properly?

Hsgngr · Jul 29, 2020

If you add an enum (or boolean) observation for the dead/alive state, that will solve the problem. Just freeze the agent so it doesn't go anywhere; it can continue to train, but it will know that it's dead. That way it will separate the alive and dead states.
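
A rough sketch of the dead/alive observation suggested above, assuming a vector observation is used. The class name TeamAgent and the isDead field are hypothetical; CollectObservations(VectorSensor) and AddObservation(bool) are part of the ML-Agents Agent API.

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Sensors;
using UnityEngine;

// Hypothetical agent class; TeamAgent and isDead are illustrative names.
public class TeamAgent : Agent
{
    // Assumed to be set to true by the game logic when health reaches zero.
    bool isDead;

    public override void CollectObservations(VectorSensor sensor)
    {
        // A bool is written as a single float: 1 when dead, 0 when alive.
        sensor.AddObservation(isDead);

        // ... the agent's other observations go here ...
    }

    public override void OnEpisodeBegin()
    {
        // Every agent starts the next match alive.
        isDead = false;
    }
}
```

The extra value increases the vector observation size by one, so the space size in the agent's Behavior Parameters has to be increased to match.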

m4l4 · Jul 29, 2020

Thanks, that seems like a good compromise.
By "freezing" the agent, do you mean just ignoring its outputs?

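public override void OnActionReceived(float[] vectorAction)
{
    // Ignore the policy's outputs while the agent is dead.
    if (!alive)
    {
        return;
    }

    // ... normal action handling ...
}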

Hsgngr · Jul 29, 2020

Yes, this, and also deactivate the GameObject's colliders, since you don't want the dead agent colliding with anything in the environment.
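
One way the collider toggling could look, as a sketch only. TeamAgent, alive, and SetAlive() are hypothetical names, and the Rigidbody handling is an assumption about how the agent is built (without it, a non-kinematic body with gravity and no active colliders would fall through the floor).

```csharp
using Unity.MLAgents;
using UnityEngine;

// Hypothetical agent class; TeamAgent, alive and SetAlive() are illustrative names.
public class TeamAgent : Agent
{
    bool alive = true;
    Collider[] colliders;
    Rigidbody body; // may be null if the agent has no Rigidbody

    public override void Initialize()
    {
        // Cache every collider on the agent and its children once.
        colliders = GetComponentsInChildren<Collider>();
        body = GetComponent<Rigidbody>();
    }

    // Assumed to be called with false when health reaches zero.
    void SetAlive(bool value)
    {
        alive = value;

        // Dead agents should not collide with anything in the environment.
        foreach (var c in colliders)
        {
            c.enabled = value;
        }

        // Stop physics from moving the body while its colliders are off.
        if (body != null)
        {
            body.isKinematic = !value;
        }
    }

    public override void OnEpisodeBegin()
    {
        // Restore colliders and physics at the start of each match.
        SetAlive(true);
    }
}
```

Note that with its colliders disabled the dead agent is also invisible to any raycast-based sensing used by the other agents, which may or may not be what you want.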
