Agent model and outputs structure

Discussion in 'ML-Agents' started by m4l4, Nov 24, 2021.

  1. m4l4

    Joined:
    Jul 28, 2020
    Posts:
    81
    Hi everyone, after a long time I'm back on another ML project, and I'd like to ask a couple of questions about the models, specifically the output layer.

    As far as I've read, most models use a swish activation function, but it looks like there's no activation function on the output layer (am I correct?).

    Discrete actions return a bool (0-1), while continuous actions return a float between -1 and 1.
    How does the model always output the same range without an activation?

    Also, how does branching of the discrete outputs work?
    To turn outputs into probabilities you usually use softmax, but that distributes the probability across every output node. How does the model confine it to a single branch?
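    To make the branching question concrete, here is a minimal sketch of one common way to do it: slice the flat output vector into one chunk per branch and apply softmax to each chunk on its own. The function names and the `branch_sizes` parameter are hypothetical, and this is not taken from the ML-Agents source.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def branch_probabilities(outputs, branch_sizes):
    """Split a flat output vector into branches and softmax each slice separately.

    `branch_sizes` (hypothetical name) lists how many output nodes belong
    to each branch, e.g. [2, 4] for a 2-way and a 4-way choice.
    """
    if sum(branch_sizes) != len(outputs):
        raise ValueError("branch sizes must add up to the number of outputs")
    probs, start = [], 0
    for size in branch_sizes:
        probs.append(softmax(outputs[start:start + size]))
        start += size
    return probs


def greedy_actions(outputs, branch_sizes):
    """Pick the index of the most likely action inside every branch."""
    return [
        max(range(len(p)), key=p.__getitem__)
        for p in branch_probabilities(outputs, branch_sizes)
    ]
```

    With six output nodes and `branch_sizes=[2, 4]`, each branch gets its own distribution that sums to 1, so one action index per branch comes out.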
     
  2. MrOCW

    Joined:
    Feb 16, 2021
    Posts:
    51
    Hi, not sure about your model, but when I visualize my model in Netron, the final layers are Add, Clip, Div, then the output. If my understanding is correct, Add, Clip, Div is an activation layer converted during ONNX export.
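    As a rough illustration of how a Clip followed by a Div could bound the outputs to [-1, 1] without a classic activation function, here is a small sketch. The `bound` value of 3.0 and the function name are assumptions made for the example, not values read from the exported graph.

```python
def clip_and_scale(raw_outputs, bound=3.0):
    """Clamp each raw output to [-bound, bound], then divide by bound.

    The result always lies in [-1, 1], whatever the raw values were.
    `bound` is an assumed constant chosen for illustration.
    """
    return [max(-bound, min(bound, x)) / bound for x in raw_outputs]
```

    Anything beyond the bound saturates at -1 or 1, and values inside it are scaled linearly.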
     
  3. m4l4

    Joined:
    Jul 28, 2020
    Posts:
    81
    Actually, I've embarked on the quest of writing my own model from scratch. It's a neuroevolution project. Since the nets are evolved by genetic crossover, I don't need backpropagation, policies, or anything else related to training.
    It's just a multilayer perceptron that can evolve with no fixed topology (see the NEAT and HyperNEAT algorithms).
    I've taken care of generating the net, removing dead-end connections, and the data flow.
    Now I'm starting to realize that I still have many doubts about how to manipulate the outputs for my own purposes. Unity ML takes care of that: you specify the output types, branch them if possible, and you always know what you will get from the model. I used it a lot without thinking too deeply about the how and why.
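    For readers unfamiliar with the dead-end removal step mentioned above, this is a minimal sketch of one way it could be done, assuming the net is stored as a list of `(source, target)` node-id pairs. That representation and the function names are hypothetical, not the project's actual code.

```python
from collections import defaultdict


def reachable(starts, adjacency):
    """Return every node that can be reached from `starts` via `adjacency`."""
    seen, stack = set(starts), list(starts)
    while stack:
        node = stack.pop()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def prune_dead_ends(connections, inputs, outputs):
    """Keep only connections that lie on some path from an input to an output."""
    forward, backward = defaultdict(list), defaultdict(list)
    for src, dst in connections:
        forward[src].append(dst)
        backward[dst].append(src)
    from_inputs = reachable(inputs, forward)    # nodes fed by at least one input
    to_outputs = reachable(outputs, backward)   # nodes that can influence an output
    return [(s, d) for s, d in connections if s in from_inputs and d in to_outputs]
```

    A connection survives only if its source is fed by some input and its target can still reach some output.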
     
  4. m4l4

    Joined:
    Jul 28, 2020
    Posts:
    81
    After some research I've come up with a solution to scale my outputs to the range 0f-1f, but I need someone to confirm it's a reasonable idea.

    Combining z-value and min-max scaling.

    Every time the net outputs some data, I add it to a list.
    Iterating over the list, I calculate the mean and standard deviation of the dataset.
    With the mean and standard deviation, I calculate the z-value of every data point.
    I then use the max and min of the z-values list for the min-max scaling of the data.

    Converting the data into z-values takes care of the variance; min-max can then rescale them into a more appropriate range.
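    The steps above can be sketched roughly as follows. The function names are hypothetical, the population standard deviation is an assumed choice, and the fallbacks for a zero spread are arbitrary picks made for the example.

```python
import statistics


def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]  # all values identical: no spread to scale
    return [(v - mean) / std for v in values]


def min_max(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]  # arbitrary midpoint when there is no range
    return [(v - lo) / (hi - lo) for v in values]


def scale_outputs(history):
    """Z-score the recorded outputs, then min-max the z-scores into [0, 1]."""
    return min_max(z_scores(history))
```

    One thing worth checking with this sketch: z-scoring only shifts and rescales the data linearly, so whenever the standard deviation is non-zero, `scale_outputs(history)` gives the same numbers as `min_max(history)` applied to the raw values. The result also depends on the recorded history, so the same raw output can map to a different value as the list grows.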

    Does it sound reasonable?
    Or is the answer "just use a sigmoid on the output layer and be happy"?
    I'm starting to get confused.
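    For comparison, the "just use a sigmoid" option would look something like this minimal sketch. The function names and the `signed` flag are hypothetical; tanh is included only as the usual choice when a -1 to 1 range is wanted instead of 0 to 1.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def squash_outputs(raw_outputs, signed=False):
    """Map raw outputs into [0, 1] with sigmoid, or [-1, 1] with tanh if signed."""
    if signed:
        return [math.tanh(x) for x in raw_outputs]
    return [sigmoid(x) for x in raw_outputs]
```

    Unlike the history-based scaling, this maps each raw value to the same result every time, with no list of past outputs to maintain.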