Help with FPS agents.

Discussion in 'ML-Agents' started by diificult_, Aug 26, 2022.

  1. diificult_

    diificult_

    Joined:
    Nov 2, 2019
    Posts:
    4
    Hi,


    I'm trying to train some AI in an FPS game using multiple agents. My level is a small square arena.

    I am struggling to get them trained: they either just spin around (mostly this), get stuck on walls, or walk around aimlessly. The agents are rewarded for hitting an enemy and for killing an enemy, and teams are rewarded for winning. At the moment the agents are not punished for dying, so there is no incentive for them to stop and hide behind objects.

    The mean rewards seem quite low. When I've changed things to improve rewards and encourage more hitting, things improved a little, but the agents still don't act as expected. I've tried rewarding only for a win plus an increasing reward for kills, and also doing that plus punishing for a loss. Neither seems to work: the rewards stay very low and don't improve, or if they do improve, they drop straight back down.


    The training environment is five copies of the map being played at once. Am I not rewarding the agents enough? Have I just not left it training for long enough? Could the map be too complicated for the AI? (The agents do manage to shoot the enemy sometimes, and they do sometimes stand near each other.) Or is there something I've missed?
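    For reference, the reward logic is roughly along these lines. This is a simplified sketch, not my exact code; the reward values and the OnHitEnemy/OnKillEnemy hooks are illustrative:

    Code (CSharp):
    using Unity.MLAgents;
    using UnityEngine;

    public class ShooterEnvController : MonoBehaviour
    {
        // Each team is wrapped in a SimpleMultiAgentGroup so POCA can
        // assign the shared win/lose reward to the whole team.
        // (Agents are registered to their group via RegisterAgent elsewhere.)
        SimpleMultiAgentGroup m_TeamA = new SimpleMultiAgentGroup();
        SimpleMultiAgentGroup m_TeamB = new SimpleMultiAgentGroup();

        // Called from the weapon code when an agent lands a hit.
        public void OnHitEnemy(Agent shooter)
        {
            shooter.AddReward(0.1f);  // illustrative value
        }

        // Called when an agent's shot kills an enemy.
        public void OnKillEnemy(Agent shooter)
        {
            shooter.AddReward(0.5f);  // illustrative value
        }

        // Called when one team has eliminated the other.
        public void OnTeamWon(SimpleMultiAgentGroup winner, SimpleMultiAgentGroup loser)
        {
            winner.AddGroupReward(1f);
            loser.AddGroupReward(-1f);  // only in the "punish for a loss" variant
            winner.EndGroupEpisode();
            loser.EndGroupEpisode();
        }
    }

    And here is my trainer config: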


    Code (YAML):
    behaviors:
      Shooter:
        trainer_type: poca
        hyperparameters:
          batch_size: 2048
          buffer_size: 20480
          learning_rate: 0.0003
          beta: 0.005
          epsilon: 0.2
          lambd: 0.95
          num_epoch: 3
          learning_rate_schedule: constant
        network_settings:
          normalize: false
          hidden_units: 512
          num_layers: 3
          vis_encode_type: simple
          goal_conditioning_type: none
        reward_signals:
          extrinsic:
            gamma: 0.999
            strength: 1.0
        keep_checkpoints: 40
        checkpoint_interval: 2000000
        max_steps: 500000000
        time_horizon: 1000
        summary_freq: 50000
        threaded: false
        self_play:
          save_steps: 500000
          team_change: 1000000
          swap_steps: 200000
          window: 100
          play_against_latest_model_ratio: 0.5
          initial_elo: 1200.0
    [Attached screenshots: upload_2022-8-26_13-14-16.png, upload_2022-8-26_13-14-46.png, upload_2022-8-26_13-15-10.png]

    Thank you very much!
     
  2. diificult_

    diificult_

    Joined:
    Nov 2, 2019
    Posts:
    4
    [Attached screenshot: upload_2022-8-26_14-34-17.png]

    Forgot to add this to the main thread.
    Here is the list of observations that I have (attached above); see the sketch below for roughly how they're collected.
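    The exact fields in this sketch are illustrative, not a transcription of the attached list:

    Code (CSharp):
    using Unity.MLAgents;
    using Unity.MLAgents.Sensors;
    using UnityEngine;

    public class ShooterAgent : Agent
    {
        float m_Health = 100f;
        const float k_MaxHealth = 100f;

        public override void CollectObservations(VectorSensor sensor)
        {
            sensor.AddObservation(transform.localPosition);            // where am I (3 values)
            sensor.AddObservation(transform.forward);                  // facing direction (3 values)
            sensor.AddObservation(GetComponent<Rigidbody>().velocity); // movement (3 values)
            sensor.AddObservation(m_Health / k_MaxHealth);             // normalised health (1 value)
            // Walls, teammates and enemies are picked up by a
            // Ray Perception Sensor 3D component, so they aren't added here.
        }
    }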
     
  3. diificult_

    diificult_

    Joined:
    Nov 2, 2019
    Posts:
    4
    Just wanted to see if there's any more information I could provide to help get my AI working as intended :)
     
  4. Qacona

    Qacona

    Joined:
    Apr 16, 2022
    Posts:
    126
    How long have you run the training for? I'm building a model that uses a quasi-realistic physics model to steer an aircraft around a giant cube, and it spends the first 5-6 million steps just getting the 'lay of the land' (i.e., not faceplanting immediately due to gravity).

    If you find your agent experiencing mode collapse (e.g. in my case, just flying around in circles), that might be an indicator that your model isn't complex enough to capture the behaviour you want to replicate (which means either more units or more layers).

    You might consider starting off by getting the model working on simpler tasks, like just getting agents to walk to a random point on the map, with a small penalty each step to make them hurry up (and a hard reset if they get stuck or start looping); see the sketch below. Then you could teach them to use cover by adding static turrets throughout the map, and eventually add agent-vs-agent combat.
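    Something along these lines, say. A rough sketch only; the movement model, arena size and penalty values are placeholders:

    Code (CSharp):
    using Unity.MLAgents;
    using Unity.MLAgents.Actuators;
    using Unity.MLAgents.Sensors;
    using UnityEngine;

    public class WalkToPointAgent : Agent
    {
        Vector3 m_Target;

        public override void OnEpisodeBegin()
        {
            // New random goal somewhere in the arena each episode.
            // Setting MaxStep in the inspector gives you the hard reset
            // if the agent gets stuck or starts looping.
            m_Target = new Vector3(Random.Range(-10f, 10f), 0f, Random.Range(-10f, 10f));
        }

        public override void CollectObservations(VectorSensor sensor)
        {
            sensor.AddObservation(transform.localPosition);
            sensor.AddObservation(m_Target);
        }

        public override void OnActionReceived(ActionBuffers actions)
        {
            var move = new Vector3(actions.ContinuousActions[0], 0f,
                                   actions.ContinuousActions[1]);
            transform.localPosition += move * 5f * Time.fixedDeltaTime;

            AddReward(-1f / MaxStep);  // small per-step penalty to hurry it up

            if (Vector3.Distance(transform.localPosition, m_Target) < 1f)
            {
                AddReward(1f);  // reached the goal
                EndEpisode();
            }
        }
    }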
     
  5. diificult_

    diificult_

    Joined:
    Nov 2, 2019
    Posts:
    4
    Thank you, this has helped. I also realised that, compared to a project like the DodgeBall one, I have only been training for about 1/6 as long, so I am going to let it continue training for much longer.