
Cloud TPU opportunities

Discussion in 'ML-Agents' started by Dscvr, Mar 23, 2021.

  1. Dscvr

    Hey, I have a chess-like game.

    I'm pretty inept, but after some time I have it running around 250,000 games a day. It hits max_steps of 50 million after around 1.2 million games, and I use "--initialize-from" to keep it training (on my laptop, without a GPU!).
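For a rough sense of scale, here is the arithmetic behind those numbers. This is only a back-of-envelope sketch: it assumes the rates above stay steady and that the 50 million step cap is spread evenly over the 1.2 million games. The variable names are just for illustration.

```python
# Back-of-envelope figures taken from the numbers quoted in the post.
GAMES_PER_DAY = 250_000      # observed laptop throughput
STEP_CAP = 50_000_000        # step limit reached per training run
GAMES_PER_RUN = 1_200_000    # games played before the cap is hit

# Average number of steps a single game contributes (assumes an even spread).
steps_per_game = STEP_CAP / GAMES_PER_RUN

# Wall-clock days before a run stops and has to be resumed.
days_per_run = GAMES_PER_RUN / GAMES_PER_DAY

print(f"about {steps_per_game:.1f} steps per game")
print(f"about {days_per_run:.1f} days per run before resuming")
```

So under those assumptions each run lasts a little under five days before it has to be continued from the previous one.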

    The hyperparams look good for once: Elo is steadily increasing in self-play.

    The issue is that it's a game with 5.9 billion starting states, plus RNG and hidden information. I've briefly looked at Cloud TPUs but feel like they might take me a long time to get a grasp on. The training speed looks like it would be approximately 400,000 times faster than what I am currently doing, though. (It could theoretically do 45 years' worth of my laptop training in an hour, which might be plenty considering the agent was showing very basic understanding at 1 million games.)

    It also looks like everything referenced is for training on data sets (image recognition, for example), so it might not be built for a self-play scenario.
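To show where the "45 years in an hour" figure and the roughly 400,000x figure meet, here is the conversion worked out. It is a sketch only: the speedup itself is my own estimate and has not been measured, and the games-per-hour number assumes the laptop rate of 250,000 games a day scales linearly.

```python
# Convert "45 years of laptop training in one hour" into a speedup factor.
HOURS_PER_YEAR = 365.25 * 24   # average year length, including leap years
LAPTOP_YEARS = 45              # estimate from the post, not a measurement

# Laptop-hours of work done in a single accelerated hour.
speedup = LAPTOP_YEARS * HOURS_PER_YEAR

# Games per hour at that speedup, assuming linear scaling of the laptop rate.
LAPTOP_GAMES_PER_DAY = 250_000
games_per_hour = LAPTOP_GAMES_PER_DAY / 24 * speedup

print(f"speedup: about {speedup:,.0f}x")
print(f"games per hour: about {games_per_hour:,.0f}")
```

That works out to a factor of just under 400,000, so the two figures are the same estimate stated two ways.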

    My question is: has anyone else converted or used an ML-Agents project to make use of Google's Cloud TPUs? Or could Cloud TPU integration ever be in the pipeline for a future ML-Agents update, for projects that need much more firepower?

    (I did read that ML-Agents was looking at a setup where multiple devices could work on an ML problem concurrently and consolidate updates to the NN across those devices. This might be what I have to rely on to solve this.)

    I did set up a Q-value version with a depth-4 minimax; maybe I should have just stuck with that? The depth-4 decision tree took around 2-10 seconds per decision, so it was still going to be a really slow process, and the agent would then play more like a minimax as well...
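For anyone unfamiliar with the idea, a depth-limited minimax looks roughly like the sketch below. This is not my actual code: the toy game (take 1 to 3 stones, whoever takes the last stone wins) is a hypothetical stand-in with no RNG or hidden information, and the `evaluate` function returns 0 at the depth cutoff where a learned Q-value estimate would go.

```python
def legal_moves(stones):
    """Moves in the toy game: take 1, 2, or 3 stones, if that many remain."""
    return [k for k in (1, 2, 3) if k <= stones]

def evaluate(stones, maximizing):
    """Score a position from the maximizer's point of view."""
    if stones == 0:
        # The player who just moved took the last stone and won.
        return -1 if maximizing else 1
    return 0  # placeholder at the depth cutoff; a learned value would go here

def minimax(stones, depth, maximizing):
    """Depth-limited minimax over the toy game."""
    moves = legal_moves(stones)
    if depth == 0 or not moves:
        return evaluate(stones, maximizing)
    values = [minimax(stones - k, depth - 1, not maximizing) for k in moves]
    return max(values) if maximizing else min(values)

def best_move(stones, depth=4):
    """Pick the move whose resulting position scores best for the mover."""
    return max(legal_moves(stones),
               key=lambda k: minimax(stones - k, depth - 1, False))

print(best_move(5))  # prints 1: leaving 4 stones is a forced win
```

The cost grows with the branching factor raised to the search depth, which is why even four plies can take seconds per decision in a game with many legal moves.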

    Considering how powerful and cost-effective TPUs are through Google, I thought that setting this up at home on high-cost GPU(s) would still be less cost-effective, especially if this project needs a tonne more training than I could ever imagine...

    Sorry for the ramble!

    :)

     
    Last edited: Mar 23, 2021