
Question: ML-Agents doesn't work well with a large number of observations. Is this use case feasible?

Discussion in 'ML-Agents' started by AFriendlyUnityDeveloper, Mar 3, 2023.

  1. AFriendlyUnityDeveloper


    Joined:
    Dec 26, 2018
    Posts:
    33
    Hello ML Experts,

    I am attempting to train a model using ML-Agents with a very large observation vector (I have tried sizes from 10,000 to 100,000). This is proving problematic for multiple reasons:

    1) I get gRPC errors in Unity when I increase the number of observations.

    Example:
    GRPC Exception: Status(StatusCode=ResourceExhausted, Detail="Received message larger than max (12455075 vs. 4194304)"). Disconnecting from trainer.
    UnityEngine.Debug:LogError (object)
    Unity.MLAgents.RpcCommunicator:Exchange (Unity.MLAgents.CommunicatorObjects.UnityOutputProto) (at Library/PackageCache/com.unity.ml-agents@2.0.1/Runtime/Communicator/RpcCommunicator.cs:493)
    Unity.MLAgents.RpcCommunicator:SendBatchedMessageHelper () (at Library/PackageCache/com.unity.ml-agents@2.0.1/Runtime/Communicator/RpcCommunicator.cs:393)
    Unity.MLAgents.RpcCommunicator:DecideBatch () (at Library/PackageCache/com.unity.ml-agents@2.0.1/Runtime/Communicator/RpcCommunicator.cs:320)
    Unity.MLAgents.Policies.RemotePolicy:DecideAction () (at Library/PackageCache/com.unity.ml-agents@2.0.1/Runtime/Policies/RemotePolicy.cs:67)
    Unity.MLAgents.Agent:DecideAction () (at Library/PackageCache/com.unity.ml-agents@2.0.1/Runtime/Agent.cs:1360)
    Unity.MLAgents.Academy:EnvironmentStep () (at Library/PackageCache/com.unity.ml-agents@2.0.1/Runtime/Academy.cs:578)
    Unity.MLAgents.AcademyFixedUpdateStepper:FixedUpdate () (at Library/PackageCache/com.unity.ml-agents@2.0.1/Runtime/Academy.cs:43)


    It seems that ML-Agents tries to send gRPC messages whose size in bytes is roughly 400x the number of observations, which sails straight past the 4,194,304-byte (4 MB) default limit shown in the error. Since a float observation is only 4 bytes, the rest presumably comes from batching many agents/steps into a single message plus protobuf overhead.

    2) If I reduce the number of observations to the low tens of thousands to get under the gRPC limit, it technically works, but very slowly and with explosive memory usage.

    Training runs at least 100x slower than I would normally expect, and Unity immediately starts consuming nearly all of my 64 GB of RAM. I suspect this is due to excessive allocation and deallocation of extremely large buffers for gRPC (or other internal structures).

    My use case and why I am using such a large number of inputs:

    I am writing a turn-based game played on a grid, where many (hundreds of) different pieces can occupy the same tile. I one-hot encode each tile with an entry for every piece, so the length of the observation vector is roughly "number of pieces" * "number of tiles".
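    To make this concrete, here is a simplified sketch of the kind of CollectObservations override I am using (the class name, sizes, and board representation below are placeholders rather than my real code):

    using Unity.MLAgents;
    using Unity.MLAgents.Sensors;

    public class BoardAgent : Agent
    {
        // Placeholder sizes; the real board and piece counts differ.
        const int NumTiles = 400;       // e.g. a 20x20 grid
        const int NumPieceTypes = 250;  // hundreds of distinct pieces

        // Placeholder board state: true if the given piece is on the given tile.
        readonly bool[,] piecePresent = new bool[NumTiles, NumPieceTypes];

        public override void CollectObservations(VectorSensor sensor)
        {
            // One 0/1 entry per (tile, piece) pair, so the vector length is
            // NumTiles * NumPieceTypes (100,000 floats with the sizes above).
            // The vector observation size in Behavior Parameters is set to match.
            for (int tile = 0; tile < NumTiles; tile++)
            {
                for (int piece = 0; piece < NumPieceTypes; piece++)
                {
                    sensor.AddObservation(piecePresent[tile, piece] ? 1f : 0f);
                }
            }
        }
    }

    With 250 piece types on a 20x20 board (again, made-up numbers) that is already 100,000 entries, which is right where everything falls over.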

    What should I do next?

    Can this be achieved in Unity, or should I give up and fall back to old-fashioned, hand-rolled state machines?

    Any ML-Agents superstars out there have suggestions?
     
  2. hughperkins


    Joined:
    Dec 3, 2022
    Posts:
    191
    Have you considered sending an image, where the value of each pixel encodes the piece at that position? (I haven't tried this in ML-Agents, but it seems like it would let you send the information you want in a much more compact form.)
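    Something along these lines maybe (untested, and every name and size below is made up), writing the board into a small texture that a RenderTextureSensorComponent on the agent could observe:

    using UnityEngine;

    public class BoardImageEncoder : MonoBehaviour
    {
        const int Width = 20;           // hypothetical grid width
        const int Height = 20;          // hypothetical grid height
        const int NumPieceTypes = 250;  // hypothetical number of piece types

        Texture2D boardTexture;
        public RenderTexture observationTexture; // point a RenderTextureSensorComponent at this

        void Awake()
        {
            boardTexture = new Texture2D(Width, Height, TextureFormat.RFloat, false);
            observationTexture = new RenderTexture(Width, Height, 0, RenderTextureFormat.RFloat);
        }

        // Call whenever the board changes; pieceIdAt(x, y) returns -1 for an empty tile.
        public void UpdateObservation(System.Func<int, int, int> pieceIdAt)
        {
            for (int y = 0; y < Height; y++)
            {
                for (int x = 0; x < Width; x++)
                {
                    // Normalize the piece id into [0, 1] so the network sees bounded values.
                    float value = (pieceIdAt(x, y) + 1) / (float)NumPieceTypes;
                    boardTexture.SetPixel(x, y, new Color(value, 0f, 0f));
                }
            }
            boardTexture.Apply();
            Graphics.Blit(boardTexture, observationTexture); // the sensor component reads this texture
        }
    }

    One wrinkle: a single value per pixel can only represent one piece, so if several pieces can share a tile you would probably need multiple channels, or several stacked textures.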
     
  3. AFriendlyUnityDeveloper


    Joined:
    Dec 26, 2018
    Posts:
    33
    I have not tried that. Would it work?

    Does the ML-Agents library process each pixel of an image the same way it would a distinct vector observation? Does it do any compression or anything else that would mangle my one-hot encoding?