ML-Agent just ceases learning, doesn't evolve for long periods, new training runs don't change this

Discussion in 'ML-Agents' started by BarShiftGames, Dec 16, 2021.

  1. BarShiftGames

    Joined:
    Jul 31, 2017
    Posts:
    12
    [attached image: upload_2021-12-15_21-18-59.png]

    I'm having issues with my training process just giving up: it literally decides to do nothing for thousands and thousands of steps. I have many inputs (723) and only 1 output, so the likelihood that it does nothing of substance by chance is extremely low. It also seems to remember its past networks, despite me erasing them. For example, it doesn't start out doing random actions, which would normally result in a very low score (-400 and below); instead it seems to already have a basic idea of what to do at the beginning, which should be impossible. I'm not sure if I'm missing something in how to train an agent, or if I've found some sort of weird behavior. Any help is appreciated! My config.yaml is basically the standard one, beyond increasing the hidden units to 1024 and the hidden layers to 3 (snippet below).
    So I'm really not sure what's going on. The first time I ran training it actually slowly learned, but after that first session it got worse and worse every time I tried again, and I have no idea how to wipe its memory entirely; clearing the results folder doesn't seem to do it.
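    For reference, the part of the otherwise-standard config.yaml I changed looks roughly like this (the behavior name here is just a placeholder):

        behaviors:
          MyBehavior:              # placeholder name
            trainer_type: ppo
            network_settings:
              hidden_units: 1024   # default is 128
              num_layers: 3        # default is 2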
     
  2. jrupert-unity

    Unity Technologies

    Joined:
    Oct 20, 2021
    Posts:
    12
    Clearing the results folder will start it fresh, as will the --force option. It is possible that the first run was just lucky. Can you say more about the environment and rewards?
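    For example, to overwrite a previous run entirely (the run ID here is illustrative):

        mlagents-learn config.yaml --run-id=MyRun --force

    Without --force (or --resume), mlagents-learn should refuse to reuse an existing run ID rather than silently continue from old weights.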
     
  3. BarShiftGames

    Joined:
    Jul 31, 2017
    Posts:
    12
    I can't say a ton, but basically any random movements can easily drive the reward down toward -1000 at the extreme, so seeing it vary by only about 9 reward points over its first 10k steps (each round is about 360 steps, and I have 10 agents working) is a bit strange to me; it feels like it already knew that making random motions was bad. On top of that, only on the first run did it start at the absolute minimum and then slowly improve; every run after that just got worse and worse, and never repeated that first upward slope from the absolute minimum. Also, I can only assign the reward at the end of a round, since the system is kind of in flux until I cut it off. This is really a continuous problem, but I had to create an artificial cut-off point somewhere to get it to actually restart and try again (rough sketch below).
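    In Agent-code terms, the end-of-round scoring looks something like this sketch (MyAgent, ComputeFinalScore, and the step cap are stand-ins for my actual logic):

        using Unity.MLAgents;
        using Unity.MLAgents.Actuators;

        public class MyAgent : Agent    // sketch, not the real class
        {
            int stepsThisRound;

            public override void OnActionReceived(ActionBuffers actions)
            {
                // ... apply the single continuous output to the system ...
                stepsThisRound++;
                if (stepsThisRound >= 360)           // artificial cut-off per round
                {
                    SetReward(ComputeFinalScore());  // reward assigned only here, at the end
                    EndEpisode();                    // restart and try again
                    stepsThisRound = 0;
                }
            }

            float ComputeFinalScore() { /* stand-in for the real scoring */ return 0f; }
        }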
     
  4. BarShiftGames

    Joined:
    Jul 31, 2017
    Posts:
    12
    Small update: upping the batch and buffer size to something like 40 and 800 seems to have positively affected its ability to keep learning. There is still a small patch at the very beginning, about 20-30k steps, where it gives up for a bit, but then it resumes learning in some capacity. Will continue to experiment and update if anything significant comes up.
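    In config.yaml terms, that change is just (same placeholder behavior name as above):

        behaviors:
          MyBehavior:
            hyperparameters:
              batch_size: 40
              buffer_size: 800   # the PPO docs suggest keeping this a multiple of batch_size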
     
  5. BarShiftGames

    Joined:
    Jul 31, 2017
    Posts:
    12
    Upping the beta also appears to have positively affected its development when larger, more complex/volatile tasks are involved.
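    For anyone finding this later: beta sits in the same hyperparameters block and controls the strength of entropy regularization, so a higher value keeps the policy exploring for longer. The exact value below is illustrative:

        behaviors:
          MyBehavior:
            hyperparameters:
              beta: 1.0e-2   # default is 5.0e-3; higher = more exploration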