Can I give rewards outside OnActionReceived()?

Discussion in 'ML-Agents' started by ahmmmmmwhy, Mar 26, 2020.

  1. ahmmmmmwhy
    Joined: Aug 20, 2017
    Posts: 11

    I was wondering if my agent fails to learn because I give rewards outside of the OnActionReceived function.

    My agent shoots missiles, which have a custom Missile script attached. In its FixedUpdate, the missile checks for collisions. If it hits a target, I call:

    Code (csharp):
    Agent.AddReward(1f);

    Would this be received and used by the learning framework?

    I am also rewarding a lot of other things in the FixedUpdate methods of various other components.
     
  2. christophergoy
    Unity Technologies
    Joined: Sep 16, 2015
    Posts: 735

    Hi,
    It is recommended that you set your rewards in OnActionReceived to ensure that they are associated with the correct observation/action pair. Otherwise, the rewards might not be associated with the observations/actions you expect.
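
    One way to follow this recommendation when the triggering event lives in another component is to buffer the reward and apply it when the next action arrives. The sketch below is only an illustration, not an official pattern: `MissileAgent`, `ReportReward`, `Missile`, `hitRadius`, and `targetMask` are hypothetical names, and it assumes an ML-Agents version that uses the `Unity.MLAgents` namespace and passes an `ActionBuffers` argument to `OnActionReceived` (older releases pass a `float[]` instead).

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Actuators;
using UnityEngine;

// Hypothetical agent that buffers rewards reported by other components
// and only calls AddReward from inside OnActionReceived.
// (In a Unity project, each MonoBehaviour normally lives in its own file.)
public class MissileAgent : Agent
{
    // Reward reported by other scripts since the last action was received.
    private float pendingReward;

    // Other components call this instead of calling AddReward directly.
    public void ReportReward(float amount)
    {
        pendingReward += amount;
    }

    public override void OnEpisodeBegin()
    {
        // Do not carry buffered rewards over into a new episode.
        pendingReward = 0f;
    }

    public override void OnActionReceived(ActionBuffers actions)
    {
        // ... apply the received actions to the agent here ...

        // Hand the buffered reward to ML-Agents, then clear the buffer.
        AddReward(pendingReward);
        pendingReward = 0f;
    }
}

// Hypothetical missile script: it reports a hit to its owner rather than
// rewarding the agent itself.
public class Missile : MonoBehaviour
{
    public MissileAgent owner;      // agent that fired this missile
    public float hitRadius = 0.5f;  // assumed collision radius
    public LayerMask targetMask;    // assumed layer that contains the targets

    private void FixedUpdate()
    {
        // True if any collider on the target layer overlaps the sphere.
        if (Physics.CheckSphere(transform.position, hitRadius, targetMask))
        {
            owner.ReportReward(1f);
            Destroy(gameObject);
        }
    }
}
```

    With this layout, the only `AddReward` call sits in `OnActionReceived`, so the timing of the missile's own `FixedUpdate` relative to the agent no longer matters.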
     
  3. ahmmmmmwhy
    Thanks for the reply!

    Would it be completely lost, or just attached to the next step observation?
     
  4. christophergoy
    It would potentially be associated with the next step.
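
    To make "associated with the next step" concrete, here is a toy model in plain C#. It is not the ML-Agents implementation: `ToyAgent` and `RequestDecision` are hypothetical, and the model simply assumes that rewards accumulate until the next decision request, are sent along with that step's observation, and are then reset.

```csharp
using System;

// Toy model only. It assumes rewards accumulate between decision requests
// and are sent (then cleared) together with the next observation.
public class ToyAgent
{
    private float stepReward;  // reward accumulated since the last request
    private int stepCount;     // stands in for the observation

    public void AddReward(float amount) => stepReward += amount;

    // Models a decision request: returns the current observation together
    // with the reward accumulated so far, then resets that reward.
    public (int Observation, float Reward) RequestDecision()
    {
        var sent = (Observation: stepCount, Reward: stepReward);
        stepReward = 0f;
        stepCount++;
        return sent;
    }
}

public static class Demo
{
    public static void Main()
    {
        var agent = new ToyAgent();

        var first = agent.RequestDecision();   // nothing rewarded yet
        agent.AddReward(1f);                   // e.g. a hit reported from FixedUpdate
        var second = agent.RequestDecision();  // the reward rides along with this step

        Console.WriteLine($"{first.Reward} {second.Reward}");  // prints "0 1"
    }
}
```

    Under these assumptions the reward is not lost, but which observation/action pair it lands on depends on when `AddReward` runs relative to the decision request.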
     
  5. ChrissCrass
    Joined: Mar 19, 2020
    Posts: 31

    Quick question about this.

    When OnActionReceived() is called (formerly AgentAction()), that means that outputs have been generated from the neural network on that tick, right?

    If so, does that mean rewards defined on a given decision tick actually associate themselves with the last decision tick's action/state pair?

    If not, then wouldn't this depend on whether the physics update has occurred before or after the AddReward() happening in that same tick?

    What I'm guessing is that unless the AddReward() happens before a decision is requested (and therefore before any outputs have a chance to change) on a decision tick, it will just get applied to next-decision-tick's action-state pair.

    I have often wondered about the underlying rules of where rewards can or should be set...
     
  6. christophergoy
    Yes.

    Yes, the reward data is sent along with the observations for a particular step.

    As long as you add your rewards in the OnActionReceived method, you won't need to worry about physics system updates. Currently, the Agent methods are triggered by hooking into the FixedUpdate loop, but this is subject to change between major version updates.

    AddReward should happen in OnActionReceived.