
Official Post Your ML-Agents Project

Discussion in 'ML-Agents' started by jeffrey_unity538, Nov 13, 2020.

  1. jeffrey_unity538

    jeffrey_unity538

    Unity Technologies

    Joined:
    Feb 15, 2018
    Posts:
    59
    Thread on community created ML-Agents projects. Please share any posts, images, videos or project links.
     
  2. mbaske

    mbaske

    Joined:
    Dec 31, 2017
    Posts:
    472
    Hi, here are a couple of my project videos. There are source files available for most of them, but I haven't really kept those up to date; some are using old versions of ML-Agents. Also, since I'm still finding my way around Unity, I would approach some things differently now than I did back when I created these projects...
    You can find more info in the video descriptions and a few other Unity and ML-Agents related videos on my channel.




    https://github.com/mbaske/ml-motorcycles


    https://github.com/mbaske/ml-dogfight


    https://github.com/mbaske/ml-hover-bike-race


    https://github.com/mbaske/robot-ants
     
    mariandev, bb8_1, TulioMMo and 10 others like this.
  3. mbaske

    mbaske

    Joined:
    Dec 31, 2017
    Posts:
    472
  4. mbaske

    mbaske

    Joined:
    Dec 31, 2017
    Posts:
    472
  5. jeffrey_unity538

    jeffrey_unity538

    Unity Technologies

    Joined:
    Feb 15, 2018
    Posts:
    59
  6. CallieChaos

    CallieChaos

    Joined:
    Oct 3, 2019
    Posts:
    4
    I've put together a project that turns the machine learning process into the game itself! I've set up a twitch stream where you can watch creatures learn to navigate their environment to reach an exit and bet game coins on which ones you think will make it the furthest. It's running at https://twitch.tv/orbward
     
    mbaske and akTwelve like this.
  7. akTwelve

    akTwelve

    Joined:
    Oct 7, 2013
    Posts:
    8
    @mbaske Your project videos are always incredible. Also, that boxing match is hilarious!
     
  8. mbaske

    mbaske

    Joined:
    Dec 31, 2017
    Posts:
    472
    Thanks!
     
  9. Adam_Streck

    Adam_Streck

    Joined:
    Jul 31, 2013
    Posts:
    26

    Attached Files:

    Last edited: Jan 2, 2021
  10. kodobolt

    kodobolt

    Joined:
    Jul 19, 2017
    Posts:
    4
    Hi everyone! I've been playing with ML agents again over the holidays and trying to see if I can build a small RTS game where units are trained agents.

    Duel training with self-play:



    Resource gathering:



    I'm experimenting with giving player control over high-level policies (e.g. which resource to gather, whether to prioritize attack or survival), curriculum learning (I've set up a system of lessons and scenarios to learn complex behavior step by step), various perception sensors (e.g. basic "range" sensor with OverlapSphere, "smell map" sensor with decaying influence that could give rough pathfinding data).
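The decaying-influence idea behind a "smell map" sensor can be sketched roughly like this (a minimal Python illustration, not the project's actual code; the grid size, decay rate, and all names are assumptions):

```python
# Minimal "smell map" sketch: sources deposit scent on a grid each step,
# and the whole grid decays over time, so cells visited or occupied
# recently keep higher values. An agent observing nearby cells gets a
# rough "how long since something was here" signal it can use for
# coarse pathfinding. Grid size and decay rate are illustrative.

class SmellMap:
    def __init__(self, width, height, decay=0.95):
        self.decay = decay
        self.grid = [[0.0] * width for _ in range(height)]

    def deposit(self, x, y, strength=1.0):
        # Clamp deposits so repeated visits saturate instead of growing
        # without bound.
        self.grid[y][x] = min(1.0, self.grid[y][x] + strength)

    def step(self):
        # Exponential decay: older traces fade smoothly.
        for row in self.grid:
            for x in range(len(row)):
                row[x] *= self.decay

    def sample(self, x, y):
        # An agent would observe a small patch of cells around itself.
        return self.grid[y][x]
```

A resource or unit would call `deposit` at its cell every simulation step, and `step` would run once per tick; the agent's observation vector is then a sampled patch of the grid.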

    My overall goal is to build a small prototype/proof of concept to understand how RL can be part of a designer's toolbox and learn best practices along the way.

    I started looking into ML for Unity several years ago but had no time to fully flesh out something. I initially tried to set up my own interface to python, then the first version of ML-Agents came out but it was a bit rough around the edges. I'm very impressed with how far it has come and found it quite easy to understand and use. Thanks for making it!
     
  11. aureliantactics

    aureliantactics

    Joined:
    Jan 15, 2021
    Posts:
    1
    I'm working on getting Unity ML to help train the AI for a tactics game (turn-based strategy game). First I did TicTacToe; the code is available on my GitHub.


    Next I implemented a GridWorld mini-game in the game.

    I tried to do a more advanced mini-game (one vs. one duel), but it ran way too slowly for the agent to learn anything. I'm working on speeding up the code. I can't duplicate the game board and have many games run in parallel (like TicTacToe and the Unity examples) without a lot of changes to my code. I can get by without having to render anything, so I'm doing some research on how to go about creating a 'fast' mode for the game.
     
  12. Roboserg

    Roboserg

    Joined:
    Jun 3, 2018
    Posts:
    83
    RoboLeague - a car soccer environment inspired by Rocket League for deep reinforcement learning experiments in an adversarial self-play setting. The project is open source - https://github.com/roboserg/RoboLeague


     
  13. dhyeythumar

    dhyeythumar

    Joined:
    Mar 15, 2020
    Posts:
    7
    Hi everyone! I am experimenting with ML-Agents on Unity's Boat Attack environment. I have reduced the environmental objects for a smooth training process.
    The project is open source; to get the assets and build versions of this environment, check out the following links:
    Github env build repo - https://github.com/Dhyeythumar/BoatAttack-with-ML-Agents-build-versions
    Github env assets repo - https://github.com/Dhyeythumar/BoatAttack-with-ML-Agents

    Check out the following video of the trained agent:



    The following video contains the complete training process streamed from Google Colab to the twitch server:

     
  14. mbaske

    mbaske

    Joined:
    Dec 31, 2017
    Posts:
    472
    Hi, you can now find updated "Angry AI" source files working with the latest (1.7.2-preview) ML-Agents release on Github.
    The project uses imitation learning, multi-stage training, hierarchical/tiered agent control and grid sensor observations.

    https://github.com/mbaske/angry-ai

     
  15. unity_vS2z9d1vWrMH-Q

    unity_vS2z9d1vWrMH-Q

    Joined:
    Jun 29, 2020
    Posts:
    2
    mbaske, Haneferd and christophergoy like this.
  16. AngrySamsquanch

    AngrySamsquanch

    Joined:
    Dec 28, 2014
    Posts:
    24
    I've been experimenting with agent flocking behaviors:



    Demo Link
     
  17. AngrySamsquanch

    AngrySamsquanch

    Joined:
    Dec 28, 2014
    Posts:
    24


    Not the most advanced agents, but it works with raytracing.

    Download Link
     
    mbaske, Haneferd and christophergoy like this.
  18. PinataMostGrim

    PinataMostGrim

    Joined:
    Jun 13, 2015
    Posts:
    1
    There are some really cool projects in this thread :D

    @mbaske I'm a big fan of the exploration drone. It reminds me of the mapping drones in Prometheus, which I thought were the coolest thing when I saw it.

    @aureliantactics A few of your Medium articles were really good references while I was learning to use the ML toolkit. Thanks for writing them!

    Here are a few of the projects I've used ML-Agents for.

    This is an agent that controls a physics-based MKV:


    Here is an agent that tries to use the MKV to intercept incoming projectiles:


    This one is a variation of Boids that tries to maintain a certain distance from a target point:
     
  19. flimflamm

    flimflamm

    Joined:
    Jan 6, 2020
    Posts:
    43
    A happy milestone in my unnamed project: VR interaction with agents (even during live training!)






    Still massively unfinished, but I think this shows extreme promise.
     
    Haneferd likes this.
  20. zeemanj

    zeemanj

    Joined:
    Jun 20, 2019
    Posts:
    1
    My first Unity project using ML-Agents.

    The game is an agility game, where you have to try to get a ball into an arc by moving the board.
    This can be very frustrating so I wanted to give it a try with Unity ML-Agents after following the excellent Hummingbirds tutorial by Immersive Limit LLC (see https://learn.unity.com/course/ml-agents-hummingbirds).

    The agent moves the board, rotating it either to the left, right, forward or backward and gets a reward when a ball enters the arc.

    1 Ball

    After several tries I finally succeeded. ML-Agents is able to play the game, with 1 ball, in an average time of less than 6 seconds. Amazing. See a video of a game at
    .

    I learned to keep the rewards simple; I started with a complicated reward scheme, but the agent failed to learn.
    The key to success was to use the 'Ray Perception Sensor 3D' component. That made a huge difference for the better.

    The training run of 4 million steps takes about 1 hour on my machine.

    2 Balls


    Training (4 million steps) took 147 minutes. It takes the agent an average of 73 seconds to win.
    See a video of a game at
    (first win is at 00:11)

    3 Balls

    Training (12 million steps) took 500 minutes. It takes the agent an average of 1054 seconds to win.
    See a video of a game at
    . The first 00:05:30 are trimmed. The first win is at 00:10 (actual time at 05:40).
    Not bad, but it would be nice to improve this. Still, I can't win the 3-ball game in less time myself; it's too frustrating. To improve the ML-Agents performance in playing the game, it may need a bit of curiosity.

    The Unity project is at https://github.com/jpzpj/Unity-ML-Agility-Game
     
    mbaske, Haneferd and jrupert-unity like this.
  21. teleceratops

    teleceratops

    Joined:
    Oct 27, 2018
    Posts:
    3


    My first reasonable game of Rugby, adapted from the soccer sample. There's some good defending going on in the middle, but once they make a break for the line, they're unstoppable!
    Just for fun, and learning.
     
    mbaske, Haneferd, JulesVerny and 2 others like this.
  22. JulesVerny

    JulesVerny

    Joined:
    Dec 21, 2015
    Posts:
    39
    Well, my Dalek example is a very basic introductory example, hardly a showcase. It is based around the Wall Jump example. The Dalek has to discover how to move the ramp to the wall to get to the Tardis. The only positive reward is given upon encountering the Tardis on the higher ground.



    It's a very slow training, with very slow final reward growth after an initial spurt. So some advice on how to speed up the training would be welcome. I have tried to reduce the Dalek box colliders further, but the agent then fails to learn. Further discussion, project details and downloads at: https://github.com/JulesVerny/DalekSteps
     
    mbaske, Haneferd and Sab_Rango like this.
  23. mbaske

    mbaske

    Joined:
    Dec 31, 2017
    Posts:
    472
    ML-Agents with AR Foundation

    Hi, I recycled my old Spot controller policy and put the bots into an AR environment. Apart from various physics and raycast observations, agent policies receive input values for target speed, direction and body height. There are two heuristics acting on top of this, providing the inputs: "Fetch" steers an agent towards a ball and back to the camera position, "Fight" has agents follow or evade each other, their guns fire automatically if targets are within range.
    For the AR scene, I placed three anchors on the ground, creating a plane which serves as a global reference frame for props and agents. The idea was being able to easily save my setup and load it later, after just putting down the three anchors again. The plane also has a collider and a transparent shadow collector material. A couple of position markers were added for outlining the walkable area. I connected them with a polygon, drew that onto a texture and filled it with a gradient. The steering heuristic then samples pixel colors near an agent, slows it down and turns it around near the walkable area's boundaries.
    I anchored colliders and transparent renderers to the shipping containers. The renderers were used for occlusion, as well as for casting and receiving shadows. However, tracking the associated vertical planes wasn't super reliable. That's why there are a few occlusion glitches and shadows aren't lining up as tightly as I wanted them to.
    Frame drops mostly occurred when I moved the phone around. I guess that max quality settings + agent inference + AR tracking & light estimation + post-processing grain was a little too much for my phone (POCO F3) to handle.

     
    Haneferd, StenCG, lin_sir_96 and 5 others like this.
  24. JulesVerny

    JulesVerny

    Joined:
    Dec 21, 2015
    Posts:
    39
    Last edited: Nov 4, 2021
    Haneferd, WaxyMcRivers and mbaske like this.
  25. AngrySamsquanch

    AngrySamsquanch

    Joined:
    Dec 28, 2014
    Posts:
    24
    Ml Agents controlling a Raspberry Pi powered car:

     
  26. LDERUB

    LDERUB

    Joined:
    Feb 16, 2018
    Posts:
    2
    Hello ML Agents community! Hey AngrySamsquanch great project! I was hoping to find something like this here. I recently did a very similar project. Currently I stream the real observations into Unity and the actions back to the real robot. The next steps would be to run the neural network directly on the robot. I have attached a video of my first use case. I am also working on more complex scenarios.
     
  27. hk1ll3r

    hk1ll3r

    Joined:
    Sep 13, 2018
    Posts:
    88
    Slime Volleyball with ml-agents AI from back in 2019. Could have trained the AI further but then it wouldn't be fair to us humans.
    slime.png
     
    Haneferd, namuh and LDERUB like this.
  28. hk1ll3r

    hk1ll3r

    Joined:
    Sep 13, 2018
    Posts:
    88
    Football IO is another free to play game I'm actively working on. The AI is trained by self play. I'm excited to add 2v2, 3v3 and 4v4 modes to it in the coming weeks.
    screenshot-menu.png
     
    jarek108, Haneferd, tjumma and 2 others like this.
  29. JulesVerny

    JulesVerny

    Joined:
    Dec 21, 2015
    Posts:
    39
    This is the Cyber Man vs Dalek agent learning environment I have been working through in the last couple of weeks.

    But the results are rather disappointing, to my mind.



    The agent fails to gain a sense of how to deal with objects currently outside its field of view, and does not develop any effective search and exploration strategy. The hyperparameter tuning was very frustrating and rather black-art magic, with little to no intuition on how to tune properly for good performance.

    The project and code base is available at: https://github.com/JulesVerny/Cyberman
    Any Suggestions would be appreciated.
     
    mbaske and Haneferd like this.
  30. hk1ll3r

    hk1ll3r

    Joined:
    Sep 13, 2018
    Posts:
    88
    Looks like a fun project, Jules. I'm a hobbyist and have been through a lot while training simple agents using ml-agents. This thread is for showcasing your work, so I'd suggest posting help-wanted notes, feature requests, questions, etc. in other threads, just to keep the forums clean.

    While I feel your frustration, note that this is a fairly new field, and the current package does a lot but is far from handling things automagically. Looking at the OpenAI blog, you can see that a lot of manual engineering goes into training agents with RL. You have to design a reward signal, a proper environment and observations, suitable actions, and a proper brain architecture.

    If you want your agent to have memory and choose its actions based on the previous states as well as the current observations, you can use recurrent neural networks. The hallway example shows how to use the feature.

    For choosing hyper parameters, this doc page has been the most useful for me. But yeah, a lot of dark voodoo magic is required in fine tuning these.
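For context, those hyperparameters live in the YAML trainer configuration file passed to mlagents-learn. A rough sketch of what such a file looks like (values are illustrative, not recommendations; the behavior name must match your agent's Behavior Parameters, and the memory block is what enables the recurrent network used in the Hallway example):

```yaml
behaviors:
  MyAgent:                  # must match the Behavior Parameters name in Unity
    trainer_type: ppo
    hyperparameters:
      batch_size: 1024
      buffer_size: 10240
      learning_rate: 3.0e-4
      beta: 5.0e-3          # entropy regularization strength
      epsilon: 0.2          # PPO clip range
    network_settings:
      hidden_units: 128
      num_layers: 2
      memory:               # optional: recurrent (LSTM) network for memory
        memory_size: 128
        sequence_length: 64
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 5.0e6
```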

    Using the newer Transformer Networks in RL is an area of active research. Hopefully in a few years we will get a much better training algorithm based on them that requires less parameter tuning.
     
  31. ChillX

    ChillX

    Joined:
    Jun 16, 2016
    Posts:
    145
    Realtime AI Dancer

    This is the most extensive AI project I've done in this area. Originally built using TorchSharp (native C# bindings for Pytorch) and a lot of rather computationally heavy custom AI logic.

    I am now working on converting this into a pure ML-Agents-based implementation because, thanks to Barracuda, that would make the engine runtime fully self-contained, lighter, and a lot more portable across platforms. The ML-Agents GAIL reward function is proving to be an invaluable tool for me, as I technically have unlimited training data generated from the current (original) version, but I want the ML-Agents implementation to be better than the current version.

    The input data for the AI includes:
    • FFT spectrum bands grouped into categories, bass, vocal, etc...
    • Some realtime extracted feature data from the raw FFT spectrum data
    • BPM of the song calculated in realtime on a 5 second history.
    • Choice of dance style or combination of styles (fusion)

    Current Version Limitations:
    • I've only completed female dance styles because I don't have enough high quality training data for male dance styles.
    • Current version of the AI is extremely heavy in terms of computation. Therefore to maintain realtime performance, a single agent is controlling all characters (Identical dance movement across all characters). Addressing this in the next version based on ML Agents. Additionally in order to support multiple independent agents the agent needs to be aware of the environment and other agents in the scene as well.
    • The AI needs 96 frames (3.2 seconds) of history in order to function, therefore each dance is pre-initialized using the first 5 seconds of the song. This is needed both for the AI (LSTM) and for the input data (feature extraction from the FFT spectrum data).
    • Realtime dance choreography takes 1 to 3 seconds to react to major changes in the music.
    • For any sections of a song that are almost silent, the AI fails to come up with a viable dance (well, it's almost silence). For these areas heuristics kick in and play some basic generic motions at random. Not sure what else could be done here.
    • Legs Crossing through each other and Arms / hands sometimes moving through the body.
    • Sometimes the AI jerks from one motion to another instead of moving smoothly.
    • AI prefers some types of dance moves over others. Kind of like a goto move that it does a bit too much.

    Non AI heuristics:
    1. Grounding done via simple raycast and lerp.
    2. Agents have a lerp towards a base position to keep them from running into each other or ending up outside the dance area. This makes it look like the character is floating over the ground a little bit. Agents are not aware of other agents or their global position in the overall environment. Addressing this in the next version based on ML Agents
    3. Environment dance lights are controlled using Heuristics
    4. Camera controllers are implemented using Heuristics
    5. DynamicBone (Asset Store) for more humanlike characters

    Current Version example videos across different dance styles and music Genres:

    Pop Song: Starboy the weekend



    Fusion EDM: Stars Align



    Latin Pop: Cancion Bonita



    Hip Hop Rap: 16 Shot



    Reggaeton: Savage Love




    Almost 400 test videos: https://www.youtube.com/c/ChillXStudio

    Note: I've stopped recording further videos as I'm working on porting this to a pure ML-Agents implementation. I've also stopped the realtime live feed on YouTube, as I need the hardware for training ML-Agents.

    New Features I'm working on while porting this across to ML Agents
    • Natural emotions using ML Agents
    • Lip Sync using ML Agents
    • Environment awareness for the agent so it can blend into any environment
    • Other agent awareness so that I can have multiple agents in a scene dancing in Sync or Not in Sync.

    Current Version Example Videos:
    • All background environments are purchased from the Unity asset store along with multiple tools / vfx
    • All character models are purchased from DAZ3D (Genesis 8 models) and exported to Unity via Reallusion CC3 which converts the model to a more game engine optimized rig than the standard Genesis 8 rig plus creates native HDRP materials.
    • Substance 3D was used for basic enhancements to skin materials.

    Unity Asset Store 3D Environments / Textures:

    Unity Asset Store Tools / Systems / VFX:

    @jeffrey_unity538
    ML-Agents is very powerful. However, if the training backend were ported from PyTorch to TorchSharp, it would enable us to make alterations to the neural network structures without having to fiddle with Python. I understand that Barracuda will not run all possible ops, but we can still make useful modifications while staying within the supported types of NN layers. For example, we might want two parallel networks processing different inputs, which are then concatenated into a third network. Maybe we want to add custom autoencoders, etc.

    Additionally, a TorchSharp training backend could easily be multi-threaded for much better training performance, since it is a native C# binding for the Torch library.
     
    Last edited: Feb 5, 2022
    Oyedoyin1, mbaske and Haneferd like this.
  32. ChillX

    ChillX

    Joined:
    Jun 16, 2016
    Posts:
    145
    Some more example videos across different dance styles and music Genres:

    Reggaeton Pop: Mi Gente



    Hip Hop: Saweetie Best Friend



    Fast Dance EDM: Wild



    Long EDM DJ Mix: Various artists




    Initial test of male dancers:

    Not quite right, just a test. The whole thing has to be retrained to support male dancers, but I do not have enough high-quality training data for male dance styles to bootstrap the AI.

     
    mbaske, TulioMMo and Haneferd like this.
  33. JulesVerny

    JulesVerny

    Joined:
    Dec 21, 2015
    Posts:
    39
    Unity ML Agents attempt to learn Duff Ball



    A review of Unity reinforcement learning agents in a simple team-play environment. The intention is to review the training time and performance of Unity ML-Agents playing a soccer-like game, but with tactical-level observations and actions, rather than the typical base movement/rotation-level actions used in many Unity ML examples.
     
  34. TulioMMo

    TulioMMo

    Joined:
    Dec 30, 2020
    Posts:
    29
    I have recently published a journal article in Ocean Engineering. I used Unity ML-Agents to train an agent in a tidal lagoon environment with randomly generated ocean tides.

    Using the Swansea Bay Tidal Lagoon project (UK) as a test case, the agent obtained a performance comparable to state-of-art methods (but in real time, without requiring future ocean predictions).

    50 days free access link with article, supplementary material and video simulation:
    https://lnkd.in/eFnrNtmH
     
    mbaske, JulesVerny, LDERUB and 2 others like this.
  35. JulesVerny

    JulesVerny

    Joined:
    Dec 21, 2015
    Posts:
    39
    Dugby Learns to Escape



    Training this escape sequence required a lot of a priori reward shaping and sub-objectives to make level progress. This would be better classed as machine training rather than machine learning. Fun and interesting to learn the limits of this technology.
     
    Sab_Rango, mbaske, TulioMMo and 2 others like this.
  36. KudryashovV

    KudryashovV

    Joined:
    Apr 5, 2017
    Posts:
    5
    Map Sensor

    The Map Sensor is designed to use non-visual data as visual observations.

    For example, we need the agent to be able to see the heightmap. To solve this problem, you can draw a heightmap on the plane and then get observations using the Camera Sensor. At first I did just that. It works. Then I developed a special Map Sensor similar to the Camera Sensor. Map Sensor has some advantages over Camera Sensor.

    The Map Sensor is faster than the Camera Sensor: no camera rendering, no image-to-array conversion.
    The Map Sensor has an unlimited number of channels; the Camera Sensor has 1 (greyscale) or 3 (RGB) channels.
    The Map Sensor does not use graphics, so we can use the no_graphics engine setting.

    Project link: https://github.com/V-Kudryashov/MapSensor

    1 channel example.
    In this example, Map is a height map.
    float[,] H = terrain.terrainData.GetHeights(0, 0, res, res);



    4 channels example.
    The map contains 4 channels:
    NormalX
    NormalY
    CurvatureMagnitude
    Objects
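The channel-stacking idea can be sketched outside Unity as well. Below is a numpy illustration, not the project's actual code: it derives per-cell x/y slopes and a gradient magnitude from a heightmap, as rough stand-ins for the NormalX/NormalY/Curvature channels above (the real sensor's channel definitions differ):

```python
import numpy as np

def build_map_observation(heights):
    """Stack a heightmap into a multi-channel map observation.

    heights: 2D float array in [0, 1], e.g. what terrainData.GetHeights
    would return on the Unity side. Returns shape (H, W, 4): height,
    x-slope, y-slope, and gradient magnitude (a crude curvature proxy).
    """
    dy, dx = np.gradient(heights)           # per-cell slopes
    grad_mag = np.sqrt(dx ** 2 + dy ** 2)   # rough stand-in for curvature
    return np.stack([heights, dx, dy, grad_mag], axis=-1)
```

The resulting array can be fed to the trainer as a visual-style observation without ever rendering a camera, which is the point of the sensor.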

     
    JonathanCzeck and JulesVerny like this.
  37. Zibelas

    Zibelas

    Joined:
    Aug 1, 2017
    Posts:
    6
    I tried to get the agents working for the game of Splendor, a turn-based card/board game. It is still a work in progress with not all features of the game implemented, but the agent is already playing well enough to play against.

    The agent has to decide between 3 actions
    • taking currency
    • returning currency if owning too much
    • buying a card
    The game is over once a player reaches 15 points. In agent vs agent games, the average turn count was around 30 which is similar to my own game round experience. The minimum winning turns needed are around 14, but this only works if all cards on the board are perfect.

     
    LDERUB likes this.
  38. JulesVerny

    JulesVerny

    Joined:
    Dec 21, 2015
    Posts:
    39
    mbaske likes this.
  39. i-make-robots

    i-make-robots

    Joined:
    Aug 27, 2017
    Posts:
    17
    @mbaske I have been trying to develop quadruped strategies and I have had very little success. Can you share more about your technique? In order to simulate the real motors available to me, I have been using ArticulationBody instead of CharacterJoint. They quickly learn to stand up, but they don't seem to be able to walk to a target point, have no obvious gait styles, and have never learned to roll over from prone (the original goal).

    https://github.com/MarginallyClever/DogML/
     

    Attached Files:

  40. mbaske

    mbaske

    Joined:
    Dec 31, 2017
    Posts:
    472
    Hi, I got mixed results when trying to achieve good gait styles. My starting point was imitation learning. In order to record demonstrations, I would implement an agent heuristic with actions driven by oscillators, just something basic that moves the quadruped forward in a regular manner. For initial training, you can try setting a high GAIL strength of 1 and a lower extrinsic strength, perhaps 0.1, plus behavioral cloning.
    My metrics for evaluating motion are usually forward speed = Vector3.Dot(agent.forward, agent.velocity) and heading = Vector3.SignedAngle(agent.forward, (target_pos - agent_pos), Vector3.up) / 180. Assuming the demo contains the agent moving forward at a given speed, I would then reward it for how closely it matches that speed, e.g. reward = Mathf.Clamp01(1 - Mathf.Abs(target_speed - agent_forward_speed)). Similarly, the agent should learn to minimize its heading value: reward = 1 - Mathf.Abs(heading). The idea here is that target points aren't fed to agent observations directly; instead, you get a relative target direction first and then calculate the heading value. The demo heading is always zero for forward motion. Both speed and heading values should be observed by the agent.
    Optionally, you can reward the agent for its posture (body up angle and height) and spawn it with random body rotations at the beginning of each episode, in order to train it for getting up right away.
    If all this works out and the agent learns how to walk forward (might take a couple of million steps), I would then add at least one more round of training without imitation and with extrinsic strength at 1, varying target speeds and directions and randomly knocking the quadruped over occasionally, so it learns to generalize / optimize its gait.
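In plain code, those reward terms look roughly like this (a Python sketch of the formulas above, restricted to the ground plane; on the Unity side these would be the Mathf/Vector3 calls as written, and the exact sign convention of SignedAngle may differ):

```python
import math

def forward_speed(forward_xz, velocity_xz):
    # Vector3.Dot(agent.forward, agent.velocity), on the (x, z) plane.
    fx, fz = forward_xz
    vx, vz = velocity_xz
    return fx * vx + fz * vz

def heading(forward_xz, target_dir_xz):
    # Vector3.SignedAngle(forward, targetDir, up) / 180: signed angle
    # between forward and the target direction, normalized to [-1, 1].
    fx, fz = forward_xz
    tx, tz = target_dir_xz
    angle = math.atan2(fx * tz - fz * tx, fx * tx + fz * tz)
    return angle / math.pi

def speed_reward(target_speed, agent_speed):
    # Mathf.Clamp01(1 - Mathf.Abs(target_speed - agent_forward_speed))
    return max(0.0, min(1.0, 1.0 - abs(target_speed - agent_speed)))

def heading_reward(h):
    # 1 - Mathf.Abs(heading): maximal when the agent faces the target.
    return 1.0 - abs(h)
```

Both `heading` and the agent's forward speed would also go into the observation vector, so the policy can see the quantities it is being rewarded on.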
     
  41. kokimitsunami

    kokimitsunami

    Joined:
    Sep 2, 2021
    Posts:
    11
    Hi ML Agents community, we developed a playable game demo, in which you can battle against the machine learned knight implemented using ML-Agents.

    image1.png

    If you're interested, please check out the three-part blog series where I describe how the agent was trained and how the game runs on Arm-based devices (Part 1, Part 2, and Part 3).
     
    Sab_Rango and BmanClark like this.