Question Pairing agent actions with observed objects

Discussion in 'ML-Agents' started by GeorgGrech, Mar 9, 2023.

  1. GeorgGrech

    Hello,

    I am working on a project using ML-Agents in which an agent collects resources and sells them at its base. Ideally, the agent should favour resources that are closer or of a better type.

    I'm using a BufferSensor component to observe the properties of all the resources in the vicinity.
    Code (CSharp):
        public override void CollectObservations(VectorSensor sensor)
        {
            try
            {
                if (resourcesTrackingList != null && resourcesTrackingList.Count > 0)
                {
                    foreach (ResourceData resourceData in resourcesTrackingList)
                    {
                        // Layout: [0..2] one-hot encoding of type (type assumed to map to 0..2),
                        // [3..4] the 2 distance values, [5] invAmount
                        float[] listObservation = new float[6];

                        listObservation[(int)resourceData.type] = 1f;

                        listObservation[3] = resourceData.distanceFromPlayer / 50f; // Normalize to 50 (limited range)
                        listObservation[4] = resourceData.distanceFromBase / 150f; // Normalize to 150 (greater range)
                        listObservation[5] = (float)resourceData.invAmountLeft / enemyPlayer.maxInventorySize;

                        m_BufferSensor.AppendObservation(listObservation);
                    }
                }

                // Keep track of inventory
                sensor.AddObservation((float)enemyPlayer.inventoryAmountFree / enemyPlayer.maxInventorySize);
            }
            catch (System.Exception e)
            {
                // Include the exception so the cause is not silently hidden
                Debug.LogError("Exception caught in observations: " + e);
            }
        }
    To choose the resource, I'm using discrete actions.

    Code (CSharp):
        public override void OnActionReceived(ActionBuffers actionBuffers)
        {
            int actionIndex = actionBuffers.DiscreteActions[0];

            if (actionIndex == 0) // 0 means return to base
            {
                StartCoroutine(ReturnToBase());
            }
            else // 1 or greater, choose to gather a resource
            {
                int resourceIndex = actionIndex - 1;

                // Guard against actions that point past the end of the tracking list
                if (resourceIndex < resourcesTrackingList.Count)
                {
                    StartCoroutine(GatherResource(resourcesTrackingList[resourceIndex].resourceObject.transform));
                }
            }
        }
    However, with this setup I don't see how the agent can correlate the resource it chooses with the observations, since the action index only refers to a position in my list. Is there a way to make the agent choose among the observed objects based on their properties?
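
    To make the problem concrete, here is a minimal sketch of the mapping as I understand it, written without any Unity or ML-Agents dependencies. `ResourceSlot`, `PairingSketch`, `Encode` and `ActionToListIndex` are hypothetical names used only for this illustration, and the observation is cut down to four values. It shows that none of the values I append say which action index an entry belongs to, so reordering the list changes what an action means without changing the entries.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, stripped-down stand-in for my ResourceData class.
public class ResourceSlot
{
    public int Type;                 // 0, 1 or 2, one-hot encoded below
    public float DistanceFromPlayer; // world units, normalized to 50 below
}

public static class PairingSketch
{
    // Same layout idea as one BufferSensor entry: one-hot type, then distance.
    public static float[] Encode(ResourceSlot r)
    {
        var obs = new float[4];
        obs[r.Type] = 1f;
        obs[3] = r.DistanceFromPlayer / 50f;
        return obs;
    }

    // Same mapping as OnActionReceived: 0 = return to base, k >= 1 = list entry k - 1.
    public static int ActionToListIndex(int actionIndex)
    {
        return actionIndex - 1;
    }

    public static void Main()
    {
        var resources = new List<ResourceSlot>
        {
            new ResourceSlot { Type = 0, DistanceFromPlayer = 10f },
            new ResourceSlot { Type = 2, DistanceFromPlayer = 40f },
        };

        // Action 1 currently selects the type-0 resource.
        ResourceSlot before = resources[ActionToListIndex(1)];

        // Reordering the list changes nothing in the encoded entries themselves...
        resources.Reverse();

        // ...but action 1 now selects the type-2 resource.
        ResourceSlot after = resources[ActionToListIndex(1)];

        Console.WriteLine($"Action 1 before reorder: type {before.Type}");
        Console.WriteLine($"Action 1 after reorder:  type {after.Type}");
    }
}
```

    So what I am really asking is how to give the agent a stable link between each observed entry and the action that selects it.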

    Any help would be greatly appreciated!
    Thanks!