Question Positive vs negative rewards

Discussion in 'ML-Agents' started by EternalMe, Dec 18, 2022.

  1. EternalMe

    EternalMe

    Joined:
    Sep 12, 2014
    Posts:
    183
    So is there a difference? Or does only the spread between the upper and lower limits count, so that I could just as well use rewards in the range of -2 to -1, with -1 for the good stuff?

    I might be imagining things, but from my personal observations there is a difference. Negative rewards seem to encourage the agent to explore alternative actions more, and make it less likely to repeat the negatively rewarded actions. No?
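
    A minimal sketch of why the sign could matter for exploration, assuming a plain REINFORCE-style update on a softmax policy with no baseline. This is a toy illustration, not how any particular ML-Agents trainer is implemented; the function names and the learning rate are made up for the example. Trainers that subtract a baseline (advantage estimates) would mute this effect.

```python
import math

def softmax(logits):
    """Turn raw action preferences into probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_step(logits, action, reward, lr=0.5):
    """One REINFORCE update without a baseline (hypothetical helper).

    The gradient of log pi(action) w.r.t. the logits is onehot(action) - probs.
    """
    probs = softmax(logits)
    return [
        x + lr * reward * ((1.0 if i == action else 0.0) - p)
        for i, (x, p) in enumerate(zip(logits, probs))
    ]

# Start from a uniform policy over three actions and sample action 0.
start = [0.0, 0.0, 0.0]
after_penalty = softmax(reinforce_step(start, action=0, reward=-1.0))
after_bonus = softmax(reinforce_step(start, action=0, reward=+1.0))

# A penalty moves probability away from action 0 and onto the alternatives;
# a bonus concentrates probability on action 0.
print("after penalty:", [round(p, 3) for p in after_penalty])
print("after bonus:  ", [round(p, 3) for p in after_bonus])
```

    In this toy setting a penalty spreads probability over the actions that were not tried, which looks like extra exploration, while a bonus reinforces the action that was tried.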

    Or is it just the quick termination problem, where the agent decides it's better to fail the whole episode than to keep accumulating negative rewards?
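
    The quick termination effect can be sketched with a little arithmetic. This assumes a hypothetical task where the agent can either end the episode after 5 steps or survive for 50, and gets the same "good" reward on every step; the step counts and the discount factor are made-up values for illustration.

```python
def discounted_return(rewards, gamma=0.99):
    """Sum of gamma**t * r_t over one episode."""
    total = 0.0
    for t, r in enumerate(rewards):
        total += (gamma ** t) * r
    return total

def episode(per_step_reward, steps):
    """Hypothetical episode with the same reward on every step."""
    return [per_step_reward] * steps

# Compare ending early (5 steps) with surviving (50 steps) when the
# best achievable per-step reward is -1 (range -2 to -1) or +1 (range 0 to 1).
for label, good in (("range -2..-1", -1.0), ("range 0..+1", 1.0)):
    short = discounted_return(episode(good, 5))
    long_run = discounted_return(episode(good, 50))
    preferred = "short" if short > long_run else "long"
    print(label, "short:", round(short, 2), "long:", round(long_run, 2),
          "-> prefers", preferred, "episode")
```

    With all-negative rewards the shorter episode has the higher return, so ending early is the better choice even when every step was "good". If episodes always had the same length, shifting every reward by a constant would not change which behavior scores best.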

    The ML-Agents documentation recommends being careful with negative rewards and not using them excessively, as the agent won't learn well. This is also a hint for me. However, I think we need a somewhat deeper explanation of this to design proper reward models.
     
  2. GamerLordMat

    GamerLordMat

    Joined:
    Oct 10, 2019
    Posts:
    185
    Yeah, I don't understand it either, because you would have to understand reinforcement learning in detail (it's on my to-do list).

    But in my experience, negative rewards tend to block the agent. A big negative reward for dropping the ball makes the agent stop moving and just try not to let the ball fall. Sometimes negative rewards are needed to prevent exploits and to push the agent hard toward the expected result.
     
  3. EternalMe

    EternalMe

    Joined:
    Sep 12, 2014
    Posts:
    183
    And this is the relevant part from the docs:

    Positive rewards are often more helpful to shaping the desired behavior of an agent than negative rewards. Excessive negative rewards can result in the agent failing to learn any meaningful behavior.

    I did some general reading on RL, but it seems to depend on the actual implementation. So an answer from the ML-Agents team would be nice.
     