
Team learning vs Concurrent learning in Multiagent Environment

Discussion in 'ML-Agents' started by Hsgngr, Aug 25, 2020.

  1. Hsgngr


    Joined:
    Dec 28, 2015
    Posts:
    61
I got a bit confused. If we run multiple agents with the same parameters in the same environment, are we doing team learning or concurrent learning? Which one does ML-Agents use?

    Team learning:

    In team learning, there is a single learner involved: but this learner is discovering a
set of behaviors for a team of agents, rather than a single agent. This lacks the game-theoretic
aspect of multiple learners, but still poses challenges because as agents
    interact with one another, the joint behavior can be unexpected. This notion is often
    dubbed the emergent complexity of the MAS.
Team learning is an easy approach to multi-agent learning because its single
learner can use standard single-agent machine learning techniques. This sidesteps the
    difficulties arising from the co-adaptation of several learners that we will later
    encounter in concurrent learning approaches. Another advantage of a single learner
    is that it is concerned with the performance of the entire team, and not with that of
    individual agents. For this reason, team learning approaches may (and usually do)
    ignore inter-agent credit assignment – discussed later – which is usually difficult to
    compute.
    Team learning has some disadvantages as well. A major problem with team
    learning is the large state space for the learning process. For example, if agent A can
    be in any of 100 states and agent B can be in any of another 100 states, the team
    formed from the two agents can be in as many as 10,000 states. This explosion in the
    state space size can be overwhelming for learning methods that explore the space of
    state utilities (such as RL), but it may not so drastically affect techniques that explore
    the space of behaviors (such as evolutionary computation) [127, 220, 221, 237]. A
    second disadvantage is the centralization of the learning algorithm: all resources
    need to be available in the single place where all computation is performed. This can
    be burdensome in domains where data is inherently distributed.
    Team learning may be divided into two categories: homogeneous and heterogeneous
    team learning. Homogeneous learners develop a single agent behavior which is
    used by every agent on the team. Heterogeneous team learners can develop a unique
    behavior for each agent. Heterogeneous learners must cope with a larger search
    space, but hold the promise of better solutions through agent specialization. There
    exist approaches in the middle-ground between these two categories: for example,
    dividing the team into squads, with squad-mates sharing the same behavior. We will
    refer to these as hybrid team learning methods.

    Concurrent learning:
    The most common alternative to team learning in cooperative MAS is concurrent
    learning, where multiple learning processes attempt to improve parts of the team.
Typically each agent has its own unique learning process to modify its behavior.
    There are other degrees of granularity of course: the team may be divided into
    ‘‘squads’’, each with its own learner, for example. However, we are not aware of any
    concurrent learning literature which assigns learners to anything but individual
    agents.
    Concurrent learning and team learning each have their champions and detractors.
    Bull and Fogarty [40] and Iba [125, 126] present experiments where concurrent
    learning outperforms both homogeneous and heterogeneous team learning, while
Miconi [166] reports that team learning is preferable in certain conditions. When,
then, would each method be preferred over the other? Jansen and Wiegand [130]
    argue that concurrent learning may be preferable in those domains for which some
    decomposition is possible and helpful, and when it is useful to focus on each subproblem
    to some degree independently of the others. The reason for this is that
    concurrent learning projects the large joint team search space onto separate, smaller
    individual search spaces. If the problem can be decomposed such that individual
    agent behaviors are relatively disjoint, then this can result in a dramatic reduction in
    search space and in computational complexity. A second, related advantage is that
    breaking the learning process into smaller chunks permits more flexibility in the use
    of computational resources to learn each process because they may, at least partly,
    be learned independently of one another.

    Source: Cooperative Multi-Agent Learning: The State of the Art
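To make the contrast concrete, here is a minimal Python sketch (the `Learner` class and its `update` method are invented for illustration; this is not the ML-Agents API): team learning builds one learner over the joint state/action space of the whole team, while concurrent learning gives each agent its own learner over its own, smaller space.

```python
class Learner:
    """Stand-in for any single-agent RL algorithm (hypothetical)."""
    def __init__(self, n_states, n_actions):
        # One Q-value row per state, one column per action.
        self.q = [[0.0] * n_actions for _ in range(n_states)]

    def update(self, state, action, reward, alpha=0.1):
        # Simplified bandit-style update, just to mark where learning happens.
        self.q[state][action] += alpha * (reward - self.q[state][action])


N_STATES, N_ACTIONS, N_AGENTS = 100, 4, 2

# Team learning: ONE learner whose state is the joint state of the team.
# Joint state space is 100 ** 2 = 10,000, matching the quoted example.
team_learner = Learner(N_STATES ** N_AGENTS, N_ACTIONS ** N_AGENTS)

# Concurrent learning: one learner PER agent, each over its own 100 states,
# assuming the problem decomposes so agents can learn largely independently.
concurrent_learners = [Learner(N_STATES, N_ACTIONS) for _ in range(N_AGENTS)]
```

Note how the team learner's table has 10,000 rows (the state-space explosion the quote describes), while the two concurrent learners have only 100 rows each.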
     
  2. betike


    Joined:
    May 28, 2019
    Posts:
    18
    I would love the answer to this question as well. Thanks!
     
  3. ShirelJosef


    Joined:
    Nov 11, 2019
    Posts:
    21
    According to
    https://blogs.unity3d.com/2020/02/2...t-adversaries-using-self-play-with-ml-agents/

So it is not a multi-agent algorithm per se.
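If it helps, my understanding (sketched below under assumptions; this is not actual ML-Agents trainer code, and `record` and `buffers` are invented names) is that ML-Agents trains one policy per Behavior Name and pools the experience of every agent using that name. Many identical agents therefore amount to parameter-shared single-agent learning on pooled data, not concurrent learning with separate learners per agent.

```python
from collections import defaultdict

# Hypothetical sketch of experience pooling: one buffer per behavior name
# collects transitions from EVERY agent that uses that behavior, and a
# single shared policy would be updated from that pooled buffer.
buffers = defaultdict(list)

def record(behavior_name, agent_id, transition):
    buffers[behavior_name].append((agent_id, transition))

# Eight copies of the same agent in one environment, one shared behavior:
for agent_id in range(8):
    record("Pusher", agent_id, ("obs", "act", 1.0))
```

With this setup, one update step would consume all eight agents' transitions at once, which is why duplicating agents mainly speeds up data collection rather than creating independent learners.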