Advice on penalising shots to avoid excessive firing?

Discussion in 'ML-Agents' started by JulesVerny, Sep 12, 2021.

  1. JulesVerny

    Joined: Dec 21, 2015
    Posts: 47
    Hi,

    I am trying to develop an agent for an Asteroids-style base defence simulation. There are two actions: a) rotate the gun clockwise or anticlockwise, and b) fire the gun (a defence missile). The goal is to hit the incoming rocks/missiles before they reach my base. However, my agents expend a lot of shots, and I would like to steer them towards minimising the number fired, ideally learning that only one shot is needed per incoming missile. I am struggling to find a reward scheme that achieves this behaviour.

    So my reward profile:
    Destroy Enemy Missile: Reward: +1.0
    Enemy Missile Reaches my Base: Reward: -1.0
    Missile/ Shot Fired: Reward: -0.1
    An episode consists of my base being attacked by 20 enemy missiles, so the optimum would be only 20 shots fired. But typically I fire about 5 shots for every incoming enemy missile.
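
    To make the reward profile above concrete, here is a minimal Python sketch of the per-episode arithmetic. The helper name `episode_return` is hypothetical, and the second case assumes that all 20 enemy missiles are still destroyed when 5 shots are spent on each, which the post does not state.

```python
# Reward values and episode length taken from the post above.
REWARD_HIT = 1.0          # destroy an enemy missile
REWARD_BASE_HIT = -1.0    # enemy missile reaches the base
REWARD_SHOT = -0.1        # each shot fired
MISSILES_PER_EPISODE = 20


def episode_return(destroyed, shots_fired, shot_penalty=REWARD_SHOT):
    """Total undiscounted reward for one episode (hypothetical helper)."""
    leaked = MISSILES_PER_EPISODE - destroyed
    return (destroyed * REWARD_HIT
            + leaked * REWARD_BASE_HIT
            + shots_fired * shot_penalty)


# One shot per missile versus roughly five shots per missile.
# Assumption: every enemy missile is destroyed in both cases.
optimal = episode_return(destroyed=20, shots_fired=20)
observed = episode_return(destroyed=20, shots_fired=100)
print(round(optimal, 2), round(observed, 2))
```

    Under these assumptions the two policies score about 18 and 10 per episode, so the shot penalty is visible in the return but is small next to the reward for each kill.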

    I note that the Decision Request rate defaults to 5, which typically results in batches of 5 shots being fired. When I reduce the Decision Request rate to 2 or 1, fewer friendly shots are fired, but training makes little to no progress and the agent fails to engage most enemy missiles.

    When I increase the shot-fired penalty to -0.5, training does not converge at all.
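
    One rough way to see why a larger penalty can stall learning is to ask how accurate a single shot must be before firing has positive expected value. The sketch below is a simplification, not a description of the actual environment: it assumes a hit gains the +1.0 kill reward and also avoids the -1.0 base-hit penalty, and that a miss costs only the shot penalty. The function name `break_even_hit_probability` is hypothetical.

```python
def break_even_hit_probability(shot_penalty, hit_reward=1.0, base_hit_reward=-1.0):
    """Hit probability above which firing one shot has positive expected value.

    Simplifying assumptions: a hit gains hit_reward and avoids base_hit_reward;
    a miss costs only the shot penalty; later shots are ignored.
    """
    swing = hit_reward - base_hit_reward  # +1 gained and -1 avoided
    # Firing pays off when p * swing + shot_penalty > 0.
    return abs(shot_penalty) / swing


for penalty in (-0.1, -0.5):
    print(penalty, break_even_hit_probability(penalty))
```

    Under these assumptions the required accuracy per shot rises from 5% at a penalty of -0.1 to 25% at -0.5, which an untrained agent firing more or less at random may not reach.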

    Does anyone have a similar training scenario, or advice on reward shaping for base defence scenarios?
    Any advice appreciated.
     
  2. ruoping_unity

    Unity Technologies
    Joined: Jul 10, 2020
    Posts: 134
    In your decision requester, did you turn on `TakeActionsBetweenDecisions`? That might be one reason that you're seeing 5 shots when the request rate is 5.
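
    The effect can be illustrated with a toy Python model. This is not ML-Agents code: the function `shots_fired` is hypothetical, and it assumes that when actions are taken between decisions, the most recent decision is simply repeated on every step until the next one.

```python
def shots_fired(fire_decisions, decision_period, take_actions_between_decisions):
    """Count shots in a toy model of a decision requester.

    fire_decisions: one bool per decision, True meaning "fire".
    Assumption: between decisions, the last decision is repeated
    on every step when take_actions_between_decisions is True.
    """
    shots = 0
    for fire in fire_decisions:
        for step in range(decision_period):
            is_decision_step = (step == 0)
            if is_decision_step or take_actions_between_decisions:
                shots += int(fire)
    return shots


decisions = [True, False, True]  # the policy chooses "fire" twice
print(shots_fired(decisions, 5, True))   # each "fire" is repeated for 5 steps
print(shots_fired(decisions, 5, False))  # each "fire" is applied once
```

    In this model two "fire" decisions become ten shots with repetition on and two shots with it off, which matches the bursts of 5 described in the first post.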
     
  3. JulesVerny

    Joined: Dec 21, 2015
    Posts: 47
    Yes, I started with the Decision Request period left at its default of 5 and "Take Actions Between Decisions" left checked. I have since reduced the period to 1, with no actions between decisions. I have also switched from continuous to discrete actions (B1: None, Rotate Left, Rotate Right; B2: No Fire, Fire). After various tuning experiments I have managed to reduce the fire rate. The agent now fires around 2 shots at every enemy missile, and saves the ship from being hit around 85% of the time.

    But I guess I was expecting rather better performance for such a simple Asteroids-type game scenario.
     
  4. WaxyMcRivers

    Joined: May 9, 2016
    Posts: 59
    I would give the agent an observation of how many shots are left, then give a penalty for shooting of -1/missile_total, and add a scaling coefficient to that penalty if you need to make the agent more restricted. You want the agent to know that shooting and missing carries an inherent risk: it may not be able to accumulate more reward later.
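
    A minimal Python sketch of that suggestion follows. It reads `missile_total` as the number of shots the agent starts with, which is one interpretation of the post. The function names and the normalisation of the observation to the range 0 to 1 are assumptions for illustration.

```python
def shot_penalty(missile_total, scale=1.0):
    """Per-shot penalty of -scale / missile_total.

    A larger scale restricts firing more strongly.
    """
    if missile_total <= 0:
        raise ValueError("missile_total must be positive")
    return -scale / missile_total


def shots_left_observation(shots_left, missile_total):
    """Fraction of the shot budget remaining (assumed normalisation)."""
    return max(0, shots_left) / missile_total


missile_total = 20
print(shot_penalty(missile_total))             # penalty per shot with scale 1
print(shot_penalty(missile_total, scale=2.0))  # stricter penalty
print(shots_left_observation(15, missile_total))
```

    With a budget of 20 shots and a scale of 1, spending the whole budget costs a total of 1.0, the same magnitude as one destroyed enemy missile, so the scale gives a single knob for trading kills against ammunition.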