Inference Slowdown

Discussion in 'ML-Agents' started by Xiromtz, Feb 13, 2020.

  1. Xiromtz

    Xiromtz

    Joined:
    Feb 1, 2015
    Posts:
    65
    Hey,
    When using inference, i.e. running a trained model, my game slows down a lot. I checked with the profiler, and "DecideAction" in the behavior update costs me around 20 ms. When training with the same timescale of 1, it only costs around 4 ms.
    This makes no sense to me, since I would have thought inference is much quicker than actual training. As it is, it doesn't seem usable in an actual game.
    I might have missed something, so help would be appreciated.
    Thanks!
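
    (For context: in ML-Agents, "DecideAction" runs once per requested decision, so how often decisions are requested directly scales the inference cost per frame. Below is a minimal, illustrative sketch of requesting decisions only every few physics steps instead of every step; the built-in DecisionRequester component does essentially the same thing, and the period value here is made up.)

    using Unity.MLAgents;
    using UnityEngine;

    // Illustrative only: request a decision (and therefore model inference)
    // every few physics steps instead of every step.
    public class SparseDecisions : MonoBehaviour
    {
        public int decisionPeriod = 5; // made-up value
        Agent agent;
        int stepCount;

        void Awake() => agent = GetComponent<Agent>();

        void FixedUpdate()
        {
            if (stepCount++ % decisionPeriod == 0)
                agent.RequestDecision(); // runs the model for this agent
            else
                agent.RequestAction();   // repeats the last action, no inference
        }
    }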
     
  2. Mantas-Puida

    Mantas-Puida

    Joined:
    Nov 13, 2008
    Posts:
    1,864
    Hi!
    Could you please share your network configuration? Also, which version of TensorFlow do you have installed, CPU or GPU?
     
  3. Xiromtz

    Xiromtz

    Joined:
    Feb 1, 2015
    Posts:
    65
    Hi,
    Not 100% sure what you mean by network configuration; I guess you mean the neural network settings? If that's the case, I'm using the prepackaged mlagents-learn PPO algorithm. I'm using no recurrent network, hidden_units=32, memory_size=256, vis_encode_type=simple, num_layers=1, sequence_length=64, batch_size=32, buffer_size=2560.

    Weirdly enough, I tried it a few more times yesterday and got stable framerates for both training and inference. I tried it again today, and my framerate for inference has dropped to around 30 fps, while my training framerate is at 70 fps. I think I need to do some more testing to be able to reproduce this 100% of the time.

    Another issue I'm having with training is that something is very off with the timescale. If I use a timescale of 1 for training, everything is much slower than running the game normally with a timescale of 1. I checked in the project settings, and the time scale is set to 1 in both cases. My code is definitely framerate-independent, and I can see a very clear slowdown of animations, as if the timescale were lower than 1. This issue does not occur when using inference, even with the lower framerates.

    Thanks for the help!
     
  4. Xiromtz

    Xiromtz

    Joined:
    Feb 1, 2015
    Posts:
    65
    I forgot to add the network inputs I'm using:
    I have a 96*72 px render texture with the color format set to B5G6R5_UNORM_PACK16, to which a second camera renders
    I use an additional RayPerceptionSensor with 5 rays and 14 tags
    I have additional vector observations, where the space size is 8
    I use a discrete action space with 5 branches
    I set my inference device to CPU, since the docs say this is more efficient

    I just tested it using the GPU, and with inference the FPS moves up to nearly 60, though still lower than training.
    Weirdly enough, my CPU usage in Task Manager doesn't go above 40% when using inference, though I honestly don't know enough about this topic.

    I might need to do some more testing using a build instead of the Unity Editor, but I tested both training and inference in the editor, and it still seems weird to me that inference would be slower, since I thought it would have less overhead.
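
    (For reference, the inference device can also be switched from code. This is an illustrative sketch that assumes an ML-Agents version where BehaviorParameters exposes an InferenceDevice property; otherwise it's set on the Behavior Parameters component in the Inspector.)

    using Unity.MLAgents.Policies;
    using UnityEngine;

    // Illustrative only: pick the device the inference graph runs on.
    public class UseGpuInference : MonoBehaviour
    {
        void Awake()
        {
            var behavior = GetComponent<BehaviorParameters>();
            // CPU executes the model on the CPU; GPU executes it via compute shaders.
            behavior.InferenceDevice = InferenceDevice.GPU;
        }
    }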
     
  5. celion_unity

    celion_unity

    Joined:
    Jun 12, 2019
    Posts:
    289
    Would it be possible for you to attach your frozen_graph.pb or the .nn file here? The actual weights don't matter, so if you're concerned about that, you can just run training for a few steps so that the weights are essentially random.
     
  6. Xiromtz

    Xiromtz

    Joined:
    Feb 1, 2015
    Posts:
    65
    Sure, some quick stats after testing the model in a simpler environment:
    I have around 280 fps when running the game normally
    I have around 120 fps when running the game through training
    I have around 90 fps when running the game through inference with this model

    I've added the generated .nn file and the frozen_graph_def.pb inside the same-named folder (not sure if this is the correct one).

    And I forgot to mention my TensorFlow version: it should be 2.0.0.

    I hope this helps, I'm not very savvy when it comes to TensorFlow.
    Thanks!
     

    Attached Files:

  7. Xiromtz

    Xiromtz

    Joined:
    Feb 1, 2015
    Posts:
    65
    I just compared the training behavior when using the simplified environment vs a more complicated one. It seems the slowdown is much worse when training in a more complicated environment.

    Again, nothing is framerate-dependent, since I apply a movement vector to the velocity of a Rigidbody, and the animations are also slowed down. I have moved the EnvironmentStep functionality to Update instead of FixedUpdate after disabling AutomaticStepping, but reversing these changes results in the same strange slowdown.

    I can't really comprehend how this is even possible, but maybe this is somehow intended behavior of the Python interface, i.e. when action decisions take too long.
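
    (For reference, a minimal sketch of the manual-stepping setup described above, using the Academy API from the ML-Agents package; exact behavior may vary between releases.)

    using Unity.MLAgents;
    using UnityEngine;

    public class ManualAcademyStepper : MonoBehaviour
    {
        void Awake()
        {
            // Stop ML-Agents from stepping the environment in FixedUpdate on its own.
            Academy.Instance.AutomaticSteppingEnabled = false;
        }

        void Update()
        {
            // Step once per rendered frame instead; decision requests and actions
            // are processed inside EnvironmentStep().
            Academy.Instance.EnvironmentStep();
        }
    }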
     
  8. Mantas-Puida

    Mantas-Puida

    Joined:
    Nov 13, 2008
    Posts:
    1,864
    @Xiromtz how many agents do you have in your environment?
     
  9. Xiromtz

    Xiromtz

    Joined:
    Feb 1, 2015
    Posts:
    65
    I'm planning on using 2, but for now I'm only using a single agent. The agent is attached to my player and hooks the returned actions up to the normal input functionality. So I don't spawn anything new when training vs. normal gameplay; other than the actual agent code, there is no additional overhead.
     
  10. Xiromtz

    Xiromtz

    Joined:
    Feb 1, 2015
    Posts:
    65
    I also don't use parallel instances like in the ML-Agents examples. Once everything is set up, I plan to use the command-line arguments to run multiple executables.
     
  11. Mantas-Puida

    Mantas-Puida

    Joined:
    Nov 13, 2008
    Posts:
    1,864
    Thanks! And what machine are you testing on? Is it running Windows?
     
  12. Xiromtz

    Xiromtz

    Joined:
    Feb 1, 2015
    Posts:
    65
    Yup, running Windows 10 on my own PC, so it's not really an ML build. If it helps, I'm using an i7-4790K and a GTX 970. I'll do some further testing in actual builds tomorrow to see if anything changes. I also have an Amazon AWS instance running Ubuntu, which I'll use once everything looks functional enough to run a longer training process.
    Thanks for the help!
     
  13. celion_unity

    celion_unity

    Joined:
    Jun 12, 2019
    Posts:
    289
    Thanks, I was able to get the model loading in a simple scene. On my laptop, I'm seeing roughly 1.2 ms for the GenerateTensors section (this doesn't count rendering if you were using a Camera sensor) vs. about 6.5 ms for the Barracuda graph execution on GPU, and 18.5 ms for Barracuda in CPU mode.

    I'll let Mantas take over from here :)
     
  14. Xiromtz

    Xiromtz

    Joined:
    Feb 1, 2015
    Posts:
    65
    Awesome!
    At this point I'm actually very intrigued by how the inference engine itself works, at least in theory. As far as my knowledge of neural networks goes, I would expect inference to be a simple left-to-right pass through the graph via activation functions and weights, while training does the same to decide on actions but must additionally calculate gradients of some error function for the learning process.

    This is why, to my understanding, training should always have a much higher overhead than inference. Is there something I'm missing?
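
    (Purely as an illustration of the point above, not ML-Agents code: inference is just repeated forward passes like the one sketched below, one per layer; training runs the same forward pass and then additionally backpropagates gradients of a loss through the weights, which is the extra work inference skips.)

    public static class TinyNet
    {
        // y = ReLU(W x + b) for a single dense layer.
        public static float[] DenseRelu(float[,] w, float[] b, float[] x)
        {
            int outDim = w.GetLength(0), inDim = w.GetLength(1);
            var y = new float[outDim];
            for (int i = 0; i < outDim; i++)
            {
                float sum = b[i];
                for (int j = 0; j < inDim; j++)
                    sum += w[i, j] * x[j];
                y[i] = sum > 0f ? sum : 0f; // ReLU
            }
            return y;
        }
    }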
     
  15. OmarVector

    OmarVector

    Joined:
    Apr 18, 2018
    Posts:
    130
    Any updates on why inference takes so much power? It's very noticeable on WebGL since it's single-threaded, and I can't find any workaround yet.

    Edit: I fixed tons of GC in the Barracuda package, yet the inference itself is still very slow; it takes up to 20 ms. If I run it on a slow machine on WebGL, like an i5, the game runs at roughly 30 fps; once the agents stop inference, the game jumps back to 60 fps. We're talking about 8 agents with a setup similar to the Dodgeball game.
     
    Last edited: Apr 4, 2023
  16. Thorce

    Thorce

    Joined:
    Jul 3, 2019
    Posts:
    41

    Did you ever find out what the problem is?