Help with Roguelike using Curiosity

Discussion in 'ML-Agents' started by TheJarmanitor, May 25, 2021.

  1. TheJarmanitor

    Joined:
    Mar 18, 2018
    Posts:
    20
    I'm teaching an agent to beat a Spelunky-style roguelike. The levels are small but require precision and wall jumping.
    I've added curiosity to the parameters, but the agent isn't taking enough risks: it doesn't jump right or even use all of the actions. I'm certain the problem lies in the parameters. This is what I'm using right now:
    Code (YAML):
    behaviors:
      PlayerAgent:
        trainer_type: ppo
        hyperparameters:
          batch_size: 256
          buffer_size: 2048
          learning_rate: 3.0e-5
          beta: 5.0e-1
          epsilon: 0.2
          lambd: 0.9
          num_epoch: 5
          learning_rate_schedule: linear
        network_settings:
          normalize: false
          hidden_units: 1024
          num_layers: 10
        reward_signals:
          extrinsic:
            gamma: 0.99
            strength: 0.4
          curiosity:
            strength: 0.9
            gamma: 0.99
            network_settings:
              hidden_units: 512
              num_layers: 5
            learning_rate: 3e-3
        keep_checkpoints: 3
        max_steps: 10000000
        time_horizon: 256
        summary_freq: 20000
    What should be higher/lower/different? What recommendations can you give me in this scenario?
     
  2. ruoping_unity

    Unity Technologies

    Joined:
    Jul 10, 2020
    Posts:
    134
    Hi,

    It would help us help you if you could describe your game in more detail, including the game mechanics, the task you're trying to solve, your agents' observation/action setup, the reward functions, etc.
     
  3. TheJarmanitor

    Joined:
    Mar 18, 2018
    Posts:
    20
    I'll elaborate on the game.

    It's a roguelike platformer with an emphasis on verticality. It uses the Spelunky approach to dungeon generation (a set of premade rooms with doors, plus an algorithm that creates a main path and then fills in the rest). I have some basic enemies with patrolling AI. The agent needs to learn how to get to the end of the dungeon. At the beginning of each episode a new dungeon is created, and the agent needs precision jumping to move around.
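    To give a rough idea of the generation, here's a simplified sketch of the Spelunky-style main-path walk (not my exact code; the RoomType names and the grid size are just for illustration):
    Code (CSharp):
    // Random-walk a main path through a room grid from a top room to a
    // bottom room; untouched cells stay filler rooms. Premade room
    // templates get stamped into the cells afterwards.
    enum RoomType { Filler, Start, Path, Drop, Exit }

    RoomType[,] GenerateLayout(System.Random rng, int width = 4, int height = 4)
    {
        var grid = new RoomType[width, height]; // every cell defaults to Filler
        int x = rng.Next(width), y = 0;
        grid[x, y] = RoomType.Start;
        while (y < height - 1)
        {
            int dir = rng.Next(5); // 0-1: try left, 2-3: try right, 4: drop down
            if (dir <= 1 && x > 0) x--;
            else if (dir <= 3 && x < width - 1) x++;
            else { y++; grid[x, y] = RoomType.Drop; continue; } // entered from above
            grid[x, y] = RoomType.Path;
        }
        grid[x, y] = RoomType.Exit; // the last room on the path is the exit
        return grid;
    }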

    The agent can move horizontally, start a jump, end a jump midway, attack, wall jump once after touching the ground, and pass through some platforms.

    Each time it moves there's a small penalty, so the agent tries to solve the level as fast as possible, and the same when it jumps, so the agent doesn't hop too much (sketched below, after the combat rewards). The rest of the rewards are:
    Code (CSharp):
    AgentActions(actionBuffers.ContinuousActions);
    if (combat.didDamage) {
        AddReward(0.05f);       // reward for landing a hit
    }
    if (combat.hurt) {
        AddReward(-0.01f);      // penalty for taking damage
    }
    if (combat.isAttacking && !combat.didDamage) {
        AddReward(-0.005f);     // discourage whiffed attacks
    }
    if (combat.died) {
        AddReward(-0.5f);       // dying is the worst outcome
        EndEpisode();
    }
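    The movement/jump penalties mentioned above look roughly like this (illustrative values; isMoving and jumpedThisStep stand in for my actual checks):
    Code (CSharp):
    // Small per-step costs: finish the level quickly and don't hop constantly.
    if (isMoving)       AddReward(-0.0005f); // tiny cost for moving
    if (jumpedThisStep) AddReward(-0.001f);  // slightly larger cost for jumping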
    It's using a Ray Perception Sensor that can detect platform tiles, ground/wall tiles, enemies, and the checkpoint. Besides that, the vector observations I'm collecting are:

    Code (CSharp):
    sensor.AddObservation(player.wallSliding ? 1f : 0f); // check if it's on a wall
    sensor.AddObservation(player.velocity);
    sensor.AddObservation(player.directionalInput);
    For the movement and collisions I'm using raycasts (not sure if you're familiar with Sebastian Lague's platformer tutorial).
     


  4. mbaske

    Joined:
    Dec 31, 2017
    Posts:
    473
    10 layers of 1024 hidden units each seems a bit excessive for the task. Maybe try simplifying the environment first, in order to see if the agent can actually learn walking and jumping with the given observations and rewards. I would remove the enemies for now and concentrate on movement. You can probably parameterize the dungeon difficulty, so the Wall Jump example should be a good fit in terms of config params and curriculum (sketch below). Just keep it simple to see if things work like you intended, before increasing complexity and adding enemies.
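    For reference, a difficulty curriculum would look something like this in the trainer config (the dungeon_difficulty parameter, lesson names and thresholds are hypothetical, modeled on the Wall Jump example):
    Code (YAML):
    environment_parameters:
      dungeon_difficulty:
        curriculum:
          - name: Easy                 # e.g. mostly flat rooms, little climbing
            completion_criteria:
              measure: progress
              behavior: PlayerAgent
              signal_smoothing: true
              min_lesson_length: 100
              threshold: 0.1
            value: 0.0
          - name: Hard                 # full vertical layouts
            value: 1.0
    Your dungeon generator would then read the current value with Academy.Instance.EnvironmentParameters.GetWithDefault("dungeon_difficulty", 0f).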
     
  5. TheJarmanitor

    Joined:
    Mar 18, 2018
    Posts:
    20
    The enemies aren't causing extra problems right now, because the agent isn't exploring enough to even reach them. Sometimes the agent moves by accident, but mostly it stays in the same place, jumping and moving erratically. That's what I want to improve right now. What should I change in the parameters to fix that, on top of what you just said?

    But the curriculum will probably work for me in the future, so I appreciate it greatly. I don't understand the Wall Jump example too well, but I'll do my best. Thank you.
     
  6. TheJarmanitor

    Joined:
    Mar 18, 2018
    Posts:
    20
    An update on this. I have changed my actions from continuous to discrete. Now there is a branch for horizontal movement, one for vertical movement, and one for jumping. I've been getting better results, but they're still really far from good. My agent sometimes finds the end of the level, but when it hits a dead end it doesn't jump back and try another route. The learning process doesn't look consistent.
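    Roughly, the discrete handler now looks like this (simplified sketch; the controller method names follow Sebastian Lague's tutorial rather than my exact code):
    Code (CSharp):
    // ActionBuffers comes from Unity.MLAgents.Actuators.
    bool wasJumpHeld; // tracks the jump "button" edge between decisions

    public override void OnActionReceived(ActionBuffers actionBuffers)
    {
        var act = actionBuffers.DiscreteActions;

        // Branch 0: horizontal movement (0 = idle, 1 = left, 2 = right)
        float h = act[0] == 1 ? -1f : act[0] == 2 ? 1f : 0f;
        // Branch 1: vertical movement (0 = idle, 1 = up, 2 = down/drop through)
        float v = act[1] == 1 ? 1f : act[1] == 2 ? -1f : 0f;
        player.SetDirectionalInput(new Vector2(h, v));

        // Branch 2: jump (0 = released, 1 = held); only fire on state changes
        bool jumpHeld = act[2] == 1;
        if (jumpHeld && !wasJumpHeld) player.OnJumpInputDown();
        if (!jumpHeld && wasJumpHeld) player.OnJumpInputUp();
        wasJumpHeld = jumpHeld;
    }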

    Here are my new parameters:
    Code (YAML):
    behaviors:
      PlayerAgent:
        trainer_type: ppo
        hyperparameters:
          batch_size: 64
          buffer_size: 40960
          learning_rate: 3e-4
          beta: 1e-3
          epsilon: 0.1
          lambd: 0.9
          num_epoch: 5
          learning_rate_schedule: constant
        network_settings:
          normalize: true
          hidden_units: 512
          num_layers: 5
        reward_signals:
          extrinsic:
            gamma: 0.99
            strength: 0.7
          curiosity:
            strength: 0.8
            gamma: 0.8
            encoding_size: 256
            learning_rate: 5e-3
        keep_checkpoints: 3
        max_steps: 10000000
        time_horizon: 1024
        summary_freq: 25000
    I tried using memory, but it hasn't worked very well. I also used bigger sensors, but I'm not sure that was the best idea.
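    The memory attempt was roughly this in network_settings (an LSTM on top of the same network; the sizes are typical defaults, not necessarily what I ended up with):
    Code (YAML):
    network_settings:
      normalize: true
      hidden_units: 512
      num_layers: 5
      memory:
        sequence_length: 64
        memory_size: 128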
     


    Last edited: May 29, 2021