Definition of value during training

Discussion in 'ML-Agents' started by MrOCW, Jul 16, 2021.

  1. MrOCW

    Joined:
    Feb 16, 2021
    Posts:
    51
    Hi, I am digging into how the training backend works in ML-Agents, and I encountered the following while using the Basic example.

    Input to Unity:
    rl_input {
      agent_actions {
        key: "BasicLearning"
        value {
          value {
            vector_actions: 0.0
            value: -0.08162793517112732
          }
        }
      }
    }
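
    For reference, this is roughly how I have been decoding these dumps on the Python side: parsing the text format back into the generated protobuf classes that ship with mlagents_envs. The module and class names below are my best guess for this release and may differ in others:

    from google.protobuf import text_format
    from mlagents_envs.communicator_objects.unity_input_pb2 import UnityInputProto

    dump = """
    rl_input {
      agent_actions {
        key: "BasicLearning"
        value {
          value {
            vector_actions: 0.0
            value: -0.08162793517112732
          }
        }
      }
    }
    """

    msg = text_format.Parse(dump, UnityInputProto())
    for behavior, action_list in msg.rl_input.agent_actions.items():
        # The doubled "value { value { ... } }" is the map entry's value (a list
        # of AgentActionProto) wrapping one AgentActionProto per agent.
        for action in action_list.value:
            print(behavior, list(action.vector_actions), action.value)
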
    Output from Unity:
    agentInfos {
      key: "BasicLearning"
      value {
        value {
          stacked_vector_observation: 0.0
          stacked_vector_observation: 0.0
          stacked_vector_observation: 0.0
          stacked_vector_observation: 0.0
          stacked_vector_observation: 0.0
          stacked_vector_observation: 0.0
          stacked_vector_observation: 0.0
          stacked_vector_observation: 0.0
          stacked_vector_observation: 0.0
          stacked_vector_observation: 1.0
          stacked_vector_observation: 0.0
          stacked_vector_observation: 0.0
          stacked_vector_observation: 0.0
          stacked_vector_observation: 0.0
          stacked_vector_observation: 0.0
          stacked_vector_observation: 0.0
          stacked_vector_observation: 0.0
          stacked_vector_observation: 0.0
          stacked_vector_observation: 0.0
          stacked_vector_observation: 0.0
          stored_vector_actions: 0.0
          reward: -0.009999999776482582
          id: -1078
        }
      }
    }
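
    In case it helps to see how I am reading the output: for the Basic example, the 20 observation floats look like a one-hot encoding of the agent's cell on the track, and the reward matches Basic's -0.01 per-step penalty. A tiny sanity check of that interpretation (mine, not anything official):

    # The 20 stacked_vector_observation values from the dump above.
    obs = [0.0] * 9 + [1.0] + [0.0] * 10

    position = obs.index(1.0)        # one-hot index = the agent's cell (9 here)
    reward = -0.009999999776482582   # float32 rounding of Basic's -0.01 step penalty
    print(f"agent position: {position}, reward: {reward:.2f}")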


    I believe the action is generated from the policy and sent to Unity, and Unity sends back an output describing the vector observations, reward, and agent id.
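
    To check that understanding, here is the same loop written against the low-level Python API in mlagents_envs. This is a rough sketch only: property and action-type details vary across releases (newer ones pass an ActionTuple to set_actions instead of a plain NumPy array), so treat the action handling as approximate:

    import numpy as np
    from mlagents_envs.environment import UnityEnvironment

    env = UnityEnvironment(file_name=None)  # connect to a running editor instance
    env.reset()
    behavior_name = list(env.behavior_specs)[0]  # "BasicLearning" in this scene

    for _ in range(10):
        # Unity -> Python: observations, rewards, and agent ids (the agentInfos above).
        decision_steps, terminal_steps = env.get_steps(behavior_name)
        # Python -> Unity: one action per agent awaiting a decision; a real trainer
        # would sample these from its policy instead of sending no-ops.
        actions = np.zeros((len(decision_steps), 1), dtype=np.int32)
        env.set_actions(behavior_name, actions)
        env.step()

    env.close()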

    Does anyone know what the value term in the input here means/represents?