Search Unity

  1. Welcome to the Unity Forums! Please take the time to read our Code of Conduct to familiarize yourself with the forum rules and how to post constructively.
  2. We have updated the language to the Editor Terms based on feedback from our employees and community. Learn more.
    Dismiss Notice

Convolutional Neural Networks without using an image render

Discussion in 'ML-Agents' started by wwaero, Mar 6, 2020.

  1. wwaero

    wwaero

    Joined:
    Feb 18, 2020
    Posts:
    42
    Is it possible to use the CNN without supplying a rgb image? I have a grid I'm trying to represent with different states and I'm wondering if it's possible to use the CNN without creating an camera render.
     
  2. celion_unity

    celion_unity

    Unity Technologies

    Joined:
    Jun 12, 2019
    Posts:
    289
  3. wwaero

    wwaero

    Joined:
    Feb 18, 2020
    Posts:
    42
    @celion_unity I'm currently writing an array of observations where each spot in the array is represented by a one hot encoded value as vector obs i.e. [0,1,0,2,3,1] -> [0000, 0100, 0000, 0010, 0001, 0100] will this manner not be effective for training? You're saying I should switch to separate array for each value type so I can somehow pass it into the visual observation? [0,1,0,0,0,0][0,0,0,1,0,0][0,0,0,0,1,0]
     
  4. celion_unity

    celion_unity

    Unity Technologies

    Joined:
    Jun 12, 2019
    Posts:
    289
    Sorry, maybe I was assuming too much from your original question :)

    What I meant was if you had a 2D grid representing a tic-tac-toe game with 3 states (empty, X, O), you'd want to represent a board like (I hope this displays correctly)
    Code (CSharp):
    1.  
    2. .|.|X
    3. -----
    4. .|O|.
    5. -----
    6. .|.|X
    7.  
    and you would encode it as something this

    Code (CSharp):
    1. 100|100|010
    2. -----------
    3. 100|001|100
    4. -----------
    5. 100|100|010
    where "100" means a 1 in the first channel and 0 in the others.

    In order for the observations to work well in a CNN, you want the convolution to happen on the "spatial" dimensions, and you want to do the encoding on the "channel" dimension.

    I can't quite tell from your example, but if your data doesn't have some 2D spatial structure, trying to use a CNN probably won't work well for training.