Is there something I'm not understanding or missing about the agent call stack?

Discussion in 'ML-Agents' started by vycanis1801, Jun 5, 2022.

    I am trying to train an agent to play a simple fighting game. Any player, agent or human, has one attack, one "life", and cannot block.
    Currently, an episode ends when an agent's attack hit box collides with the opponent's body.

    The problem I am noticing is that when EndEpisode() is called, the agent's StoredActions buffer should be cleared. After stepping over that call, the buffers do appear to be cleared. However, the next time OnActionReceived(ActionBuffers actions) runs, I see that the previous StoredActions buffer is being processed again. This is a problem for me because I have it set up such that, when the discrete action equals 3 and the agent is not currently performing an attack, it attacks.

    Code (CSharp):
    public override void OnActionReceived(ActionBuffers actions) {
        // Single discrete branch: 1 and 2 move along x, 3 starts an attack, anything else idles
        int input = actions.DiscreteActions[0];
        inputToDirection = 0;

        if (input == 1) { inputToDirection = -1; }
        else if (input == 2) { inputToDirection = 1; }
        else if (input == 3 && !PerformingAttack()) {
            // attackTime >= 0 marks an attack in progress
            attackTime = 0;
        }

        // No movement while an attack is in progress
        speed = PerformingAttack() ? 0 : 2.2f;
        gameObject.transform.localPosition += speed * Time.fixedDeltaTime * new Vector3(inputToDirection, 0, 0);
    }
    Code (CSharp):
    private bool PerformingAttack() {
        return attackTime >= 0;
    }
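
A minimal sketch of the input rule above as a pure function, which makes it possible to check what a repeated action does without running a training session. `AgentInput` and `Interpret` are hypothetical names that exist neither in the project nor in ML-Agents, and the sketch assumes `attackTime` is -1 whenever no attack is running, as PerformingAttack() implies. Movement speed is left out; only the direction and the attack trigger are modelled.

```csharp
using System;

// Hypothetical helper, not part of ML-Agents or the original project.
public static class AgentInput
{
    // Maps one discrete action to a horizontal direction (-1, 0 or 1)
    // and a flag saying whether a new attack should start.
    public static (int Direction, bool StartAttack) Interpret(int action, int attackTime)
    {
        // Same test as PerformingAttack(): a non-negative timer means "attacking".
        bool performingAttack = attackTime >= 0;

        switch (action)
        {
            case 1: return (-1, false);
            case 2: return (1, false);
            case 3: return (0, !performingAttack); // attack only if none is running
            default: return (0, false);            // any other value idles
        }
    }
}
```

Here `Interpret(3, 2)` does not start an attack, while `Interpret(3, -1)` does. So if a stale action of 3 is processed while `attackTime` is back at -1, it would trigger a fresh attack, which matches the behaviour described above.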
    How the attack functions:
    Code (CSharp):
    private void UpdateAttack() {
        if (attackTime >= 0) {
            attackTime++;

            if (!hurtbox.enabled) {
                hurtbox.enabled = true;
                hurtboxRenderer.enabled = true;
            }

            // Activate the hit box after 3 frames
            if (attackTime > 2 && !hitbox.enabled) {
                hitbox.enabled = true;
                hitboxRenderer.enabled = true;
            }

            // Deactivate the hit box after 6 frames (number of frames the attack is active should be 3)
            if (attackTime > 5 && hitbox.enabled) {
                hitbox.enabled = false;
                hitboxRenderer.enabled = false;
            }

            // Reset the attack
            if (attackTime >= attackLength) {
                attackTime = -1;
                hurtbox.enabled = false;
                hurtboxRenderer.enabled = false;
            }
        }
    }
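
The timing in UpdateAttack() can be traced with a plain C# model that needs no Unity types. This is a sketch under assumptions: `AttackTimeline` and `Simulate` are hypothetical names, the booleans stand in for the `enabled` flags of the colliders, and `attackLength` is a parameter because its value is not shown above (10 is only an example).

```csharp
using System.Collections.Generic;

// Hypothetical stand-alone model of UpdateAttack(); not part of the original project.
public static class AttackTimeline
{
    // Returns the hurt box and hit box state at the end of each update call,
    // starting from the moment OnActionReceived sets attackTime to 0.
    public static List<(int Call, bool HurtboxOn, bool HitboxOn)> Simulate(int attackLength)
    {
        var frames = new List<(int Call, bool HurtboxOn, bool HitboxOn)>();
        int attackTime = 0;   // 0 means the attack has just been started
        bool hurtbox = false; // stands in for hurtbox.enabled
        bool hitbox = false;  // stands in for hitbox.enabled
        int call = 0;

        while (attackTime >= 0)
        {
            call++;
            attackTime++;

            if (!hurtbox) { hurtbox = true; }

            // Same conditions, in the same order, as UpdateAttack()
            if (attackTime > 2 && !hitbox) { hitbox = true; }
            if (attackTime > 5 && hitbox) { hitbox = false; }

            if (attackTime >= attackLength)
            {
                attackTime = -1; // ends the loop, like the reset in UpdateAttack()
                hurtbox = false;
            }

            frames.Add((call, hurtbox, hitbox));
        }

        return frames;
    }
}
```

For `Simulate(10)` the hit box is on at the end of calls 3, 4 and 5 only, and the hurt box is on for calls 1 through 9. Stepping through the model also shows that from call 7 onward the hit box is switched on and straight back off within the same call, because `attackTime > 2 && !hitbox` becomes true again once the hit box has been disabled.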
    My understanding of the call stack based on stepping through the code is:
    EndEpisode()
    EndEpisodeAndReset(DoneReason.DoneCalled)
    _AgentReset()
    ResetData()
    m_ActuatorManager?.ResetData()
    StoredActions.Clear()
    OnEpisodeBegin()
    // Run next frame
    OnActionReceived(ActionBuffers actions)

    Can someone explain to me why this is happening, or what I'm missing?