
Resolved Same brain, different health/damage: should they still be normalized?

Discussion in 'ML-Agents' started by HQF, Mar 19, 2022.

  1. HQF

    Joined:
    Aug 28, 2015
    Posts:
    40
    Hi! I'm developing a turn-based strategy game, and I have agents with the same Brain and the same type of weapon, but with different settings (health/damage).
    How should I set up the health and damage observations so that my agents take more "strategic" actions? For example, if an agent with 100 damage hits an agent with 25 health, it gets a reward equal to:

    Reward = total damaged health / damage

    So in this situation the reward would be 0.25 (75% of the damage, and of the possible reward, is wasted).
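The reward above can be sketched in a few lines. This is a minimal illustration in Python rather than the game's actual code; the function name `hit_reward` is hypothetical, and the assumption that the health removed is capped at the target's remaining health is inferred from the 100-damage-versus-25-health example.

```python
def hit_reward(attacker_damage: float, target_health: float) -> float:
    """Reward = health actually removed / attacker's damage (hypothetical helper)."""
    if attacker_damage <= 0:
        return 0.0
    # Assumption: a hit cannot remove more health than the target has left.
    damaged_health = min(attacker_damage, max(target_health, 0.0))
    return damaged_health / attacker_damage


print(hit_reward(100, 25))  # 0.25: 75% of the hit is wasted
print(hit_reward(50, 100))  # 1.0: the full hit lands
```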

    But my question is about normalization. I normalized almost all values (including positions) except unit health and damage, because I don't think normalizing all the different damage and health values will give my agent correct knowledge of its own damage and of other agents' damage/health. But maybe without normalization, training my agents will take a lot of time...
    Has anyone already solved this agent health/damage problem?

    P.S. For observations of other agents I use a BufferSensor that stores RelativePosition, Team, Health, and Damage.

    P.P.S. I train agents with the same unit setups in a zero-sum setting, using self-play POCA training, in order to calculate an ELO rating.

    Unit1:
    100 health, 50 damage

    Unit2:
    50 health, 25 damage

    Unit3:
    25 health, 50 damage
     
  2. mbaske

    Joined:
    Dec 31, 2017
    Posts:
    473
    It's hard to say without knowing the specifics of your game, but my intuition would be this:
    1) Normalize health and damage, especially if you've normalized all other values, e.g. current_health / max_health and current_damage / max_damage.
    2) Have your agent observe these normalized values.
    3) Don't reward the agent based on a health-damage ratio, because being healthy is probably not your agent's objective function. I'm assuming it has to achieve some goal for which it needs to be somewhat healthy. Maximizing rewards for health could be counter-productive though, perhaps by making the agent risk-averse in situations where it should sacrifice a bit of health in order to succeed with its main goal. I'd say focus on rewarding for that goal and let the agent figure out by itself that being healthy is merely instrumental.
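As a minimal sketch of points 1 and 2, the per-unit observations could be computed as below. Python is used only for illustration; the `Unit` class, the `observe` function, and the choice of the roster-wide maximum damage as `max_damage` are assumptions, not part of ML-Agents or of the original project. The unit stats are the three setups listed in the first post.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Unit:
    max_health: float
    damage: float
    current_health: float


def observe(unit: Unit, max_damage: float) -> List[float]:
    """Return [health, damage], each scaled into the range [0, 1]."""
    health = unit.current_health / unit.max_health
    damage = unit.damage / max_damage
    return [health, damage]


roster = [Unit(100, 50, 100), Unit(50, 25, 50), Unit(25, 50, 25)]
# Assumption: max_damage is the largest damage value among all unit types.
max_damage = max(u.damage for u in roster)

roster[0].current_health = 40  # Unit1 after taking 60 damage
print(observe(roster[0], max_damage))  # [0.4, 1.0]
print(observe(roster[1], max_damage))  # [1.0, 0.5]
```

One consequence of dividing by each unit's own maximum health is that a full-health Unit3 (25 health) and a full-health Unit1 (100 health) both observe 1.0, so absolute toughness is only visible if maximum health is also observed on a shared scale.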
     
  3. HQF

    Joined:
    Aug 28, 2015
    Posts:
    40
    Thanks for the reply! Currently I only reward an agent when it damages someone, so I don't reward it for being healthy.

    For now I normalize health like this:

    normalizedValue = myCurrentHealth / sum of my own and my teammates' maximum health.

    I also no longer observe my own damage, because it's a constant value for each agent. I just reward an agent when it damages another agent:

    Reward = damageDone / sum of the damaged agent's and its teammates' maximum health.

    But I'm not sure that different rewards for units with different damage will "show" an agent its actual damage...
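The two formulas in this post can be worked through with the unit setups from the first post. This is an illustrative Python sketch only; the team composition (one of each unit per team) and the function names are assumptions made for the example.

```python
from typing import List


def normalized_health(current_health: float, team_max_health: List[float]) -> float:
    """Current health divided by the summed maximum health of the agent's whole team."""
    return current_health / sum(team_max_health)


def damage_reward(damage_done: float, target_team_max_health: List[float]) -> float:
    """Damage dealt divided by the summed maximum health of the damaged agent's team."""
    return damage_done / sum(target_team_max_health)


# Assumption: each team fields Unit1, Unit2 and Unit3 (max health 100, 50, 25).
team_max_health = [100.0, 50.0, 25.0]  # sums to 175

print(normalized_health(100.0, team_max_health))  # Unit1 at full health: 100 / 175
print(damage_reward(50.0, team_max_health))       # a 50-damage hit: 50 / 175
print(damage_reward(25.0, team_max_health))       # a 25-damage hit: 25 / 175
```

Under this scheme a full hit from a 50-damage unit earns twice the reward of a full hit from a 25-damage unit, so the reward scale does differ per unit. Whether the policy can infer its own damage from that signal alone is the open question raised above.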