
Is ml agents dead?

Discussion in 'ML-Agents' started by EternalMe, Jun 18, 2022.

  1. hughperkins

    hughperkins

    Joined:
    Dec 3, 2022
    Posts:
    191
    > It's a good thing. Just not very interesting for me.

    I mostly agree, up to a point. Let's start with how I agree. I have worked in NLP ("natural language processing") for several years. I disliked that NLP never showed any 'spark' of intelligence. It was just copying statistical patterns it had observed. I wanted to work on RL, and I decided to try to combine RL with NLP. I wasn't interested in supervised learning of natural language, but in seeing how far computers could get on their own. "Emergent communication". I wrote a few papers on the subject, but didn't manage to get any of them published: "Compositionality Through Language Transmission, using Artificial Neural Networks", https://arxiv.org/abs/2101.11739; "TexRel: a Green Family of Datasets for Emergent Communications on Relations", https://arxiv.org/abs/2105.12804; "Icy: A benchmark for measuring compositional inductive bias of emergent communication models", https://openreview.net/forum?id=S352vriz3G . This last one was getting close to publication standard, I feel, but then my gf at the time ditched me, and I started becoming interested in YouTube instead...

    So that's how I agree, potentially, with "[imitation learning is] not very interesting for me". Then another way of looking at imitation learning is that ChatGPT is basically just trained to imitate humans. I agree that it's not technically using what we term "imitation learning", but conceptually, it is learning to imitate humans. And yet... it does show a spark of intelligence, I feel. It is able to combine knowledge to create new things.
     
    EternalMe likes this.
  2. EternalMe

    EternalMe

    Joined:
    Sep 12, 2014
    Posts:
    183
    Yes, it seems like ChatGPT, by now, can connect some dots in a sort of linear fashion (not 100% sure though). Or call it kitbashing a problem solver. I tested it with some math word problems and it did well for the most part. Yet it sometimes gave obviously wrong solutions, and did not really bite into the problem even when it was stated clearly, but rather started throwing around theories and calculations that matched the 'query' to some degree. So I would say no to the 'intelligence' of it for now. Rather a super advanced query-processing machine.
     
  3. hughperkins

    hughperkins

    Joined:
    Dec 3, 2022
    Posts:
    191
    If you define intelligence as things like "has free will" or "awareness of self", then sure, it fails. But the fact that it can handle things like tic-tac-toe in ASCII art *at all* is pretty incredible.

    Googling around a little:
    - this guy ran a verbal reasoning IQ test against ChatGPT, and ChatGPT scored 96, https://www.reddit.com/r/cognitiveTesting/comments/znmbq7/what_is_the_iq_of_chatgpt/ . That puts it around par with the average person on the street.
    - this person tried running it through LSAT problems, and it scored 149, putting it in the 40th percentile, https://twitter.com/pythonprimes/status/1599875927625764864?lang=en

    Sure, it's not superhuman yet, but it's showing evidence of no longer being sub-human. Did you try giving those math word problems to a random person on the subway, or in a nightclub, as a baseline of average human performance?
     
  4. EternalMe

    EternalMe

    Joined:
    Sep 12, 2014
    Posts:
    183
    No, I am not speaking about free will (if such a thing exists anyway) or self-awareness. Nor about not understanding how to solve the task (or which methods to use for it), which would be the main problem for humans. I mean not biting into the task and just doing something similar instead; that is what raises my suspicion of an absence of intelligence, by my definition.

    It was a simple task about two cars driving towards each other at different speeds, with the question of where they meet. Instead, it calculated as if the cars were going in the same direction, with the faster one catching up. Plus, the opening solution text did not match the mathematical solution below it. Everything was just off. It's a pity I did not save that conversation; you would see that I stated the task very clearly. But maybe a bit in my own way of putting words together, and probably not the way the NN was fed its datasets :p . You can see where I am going with this.
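    (For reference, the head-on version reduces to: if the cars start a distance d apart with speeds v1 and v2, they meet after t = d / (v1 + v2); the catching-up version it answered instead gives t = d / (v1 - v2). With made-up numbers, d = 300 km, v1 = 60 km/h, v2 = 40 km/h: that's 3 h head-on versus 15 h catching up. Completely different problems.)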

    However, the results are impressive overall, statistically speaking, so it's a good super advanced calculator that can take on text queries. I stick to that.
     
    hughperkins likes this.
  5. EternalMe

    EternalMe

    Joined:
    Sep 12, 2014
    Posts:
    183
    A simple example: I could give an x * y task to a person (or several people), with large numbers. Now, the person would try to use some strategy learned in school (or elsewhere) to handle it. And the person could get the strategy wrong or make a simple mistake, and that's fine. But if the person says, "The solution to this multiplication problem is... y divided by x... it's z", then I really have to question whether the person understands what multiplication is at all in the first place, or is just giving me a best-shot "similar math thingy" memorized at some point. And this is my experience with ChatGPT: not often, but it happens, so I draw my conclusions from this and from what I know about current NNs.
     
  6. hughperkins

    hughperkins

    Joined:
    Dec 3, 2022
    Posts:
    191
    [Screenshot: Screen Shot 2023-02-19 at 16.54.07.png]

    I don't think it does terribly? I feel like if you grabbed the average person on the street, they might make at least one mistake in the calculation too? I reckon that some kind of verbal reasoning questions might be a better fit for what ChatGPT is good at, perhaps, though?
     
  7. hughperkins

    hughperkins

    Joined:
    Dec 3, 2022
    Posts:
    191
    It actually knows how to multiply 7 x 1234, it just never gets the right answer as part of the long multiplication:

    [Screenshot: Screen Shot 2023-02-19 at 16.58.05.png]
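    (For reference, the correct result is 7 x 1234 = 8638, i.e. 7000 + 1400 + 210 + 28.)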
     
  8. hughperkins

    hughperkins

    Joined:
    Dec 3, 2022
    Posts:
    191
    [Screenshots: Screen Shot 2023-02-19 at 17.14.33.png, Screen Shot 2023-02-19 at 17.14.37.png]

    I feel like it gets this task correct?
     
  9. EternalMe

    EternalMe

    Joined:
    Sep 12, 2014
    Posts:
    183
    My example with the multiplication was not specifically about ChatGPT, but rather a very simplified version of the kind of problem I encounter here and there with it. We all make mistakes, and so can the bot. But if the mistake is of the kind in my example, it raises the suspicion of an absence of intellect - of it just querying for something similar.

    When it comes to the car problem: it got it right when I rephrased my task differently. However, the first time (also with clean instructions) it was very off (described above). It's really pointless to prove me wrong by recreating the task in your own way; maybe I can somehow find it in my history (not sure about that), then we can discuss it objectively.

    Currently typing from mobile... AFK
     
  10. JB-AI

    JB-AI

    Joined:
    Jan 9, 2023
    Posts:
    14
    It has to rely on imitation learning. It would be infeasible to expect a model to learn all the positions, velocities, angles, and angular velocities of however many joints a character may have, let alone adding in CNNs for vision sensors and potentially NLP for speech recognition and synthesis, and doing all of that for each different character or character model. It is much easier to give demonstrations of how a game is expected to be played.

    The important bit is thinking about the minimum number of individual motions needed to complete a task or series of tasks. If the goal is soccer, for instance, the agent needs to be able to stand, walk, run, strafe, shoot the ball, etc. It has already been shown that higher-level tasks such as dribbling can be achieved with human-like performance from a minimum of standing, walking, and running, without explicitly training on dribbling. This allows for considerable generalization, and can be achieved with an algorithm called Adversarial Motion Priors, which preceded ASE and could be put together using ML-Agents as is.

    ASE learns an encoding of low-level skills to be used by a high-level controller for completing more complex tasks. The purpose of the encoding is to make it easier to map skills to inputs such as button presses, text, or speech, but it will also be crucial for developing hierarchies of motions specific to individual tasks that can be used by a high-level controller, whether AI or human. This ASE model is crucial for any sports-related game to have physically realistic motions; the current industry standard is the nonsense from the game Madden 25, which effectively ragdolled a character if it made any contact with anything else.

    I caution against the assumption that AI should be used solely for NPC decision-making, and instead insist on considering its use in animation generally.

    I have already come up with numerous games that could not be made without RL for animation, but I again urge caution in assuming that gaming is the only use for these models. If you want to create a game with nearly true-to-life robotic systems, for instance, you may only be able to do that with this method, because of the sheer difficulty of hand-animating each and every potential motion some robotic system could make, in each different context, in each different environment.

    Hugh, regarding your concern about the AI being designed to beat you, or to improve at the rate you improve, it is absolutely possible to save copies of the brain at different points in its training and allow players to select a difficulty, rather than have the difficulty always be dynamically adjusted via training. It is also possible to delay the updating of the network for some number of plays, or even reduce the number of state-action-reward examples fed to the model, to constrain its rate of improvement.

    EternalMe, you don't need separate brains for differing difficulty levels; instead, you would save a history of brain versions after each update of the network: one brain trained, but inferenced as multiple brains of different versions. The versions could be controlled by win rate or some other factor. It is also possible to constrain an agent so that a brain must perform worse. I would not necessarily advise this, but it is possible.
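    On the Unity side, a rough sketch of the checkpoint swapping (untested; the behavior name and checkpoint array are placeholders I made up, but Agent.SetModel is the actual ML-Agents call for swapping a model at runtime):

    Code (CSharp):
    using Unity.Barracuda;
    using Unity.MLAgents;
    using UnityEngine;

    // Sketch: ONNX checkpoints exported at different points in training,
    // assigned in the Inspector (index 0 = earliest/weakest version).
    public class DifficultySelector : MonoBehaviour
    {
        public Agent agent;                    // the enemy agent in the scene
        public string behaviorName = "Enemy";  // placeholder behavior name
        public NNModel[] brainCheckpoints;     // saved brain versions

        public void SetDifficulty(int level)
        {
            // Swaps the inference model at runtime; no retraining involved.
            agent.SetModel(behaviorName, brainCheckpoints[level]);
        }
    }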
     
    Last edited: Feb 20, 2023
  11. JB-AI

    JB-AI

    Joined:
    Jan 9, 2023
    Posts:
    14
    On ChatGPT,

    I have also used it extensively, and it definitely has significant limitations. This is mostly due both to the limitations of NLP, which specifically studies languages that exist rather than the space of all languages, and to the inefficiencies and biases inherent in our use and conception of language. I have seen some areas of research that consider mapping words to images, 3D models, sounds, etc., which is necessary for actually capturing what language is, which can be defined as a universal representational system.

    Any change in some environment can be represented as some other change in a similar or different environment. So, something falling can be represented as sounds, chemicals, motions, etc. or some series and/or combination of the previous.

    The process of communicating language can be thought of as a representational convergence between two or more *intelligent parties*, where each party attempts to communicate some representation by taking some actions to convey some change in their environment (internal or external), through a process that is essentially gradient descent. For example, some chemical reaction (hunger) in one party's stomach (internal environment) is communicated as a series of sounds to another party, with the intent to convey the meaning of hunger (the word as a representation).

    If the parties use unfamiliar representations, they must converge on a common representational system to convey meaning to each other. Take the idea of a baby randomly taking actions to convey to its parent some change that happened to it; the parent then must make somewhat random attempts to determine the baby's intention (meaning), such as "feed me". This can be generalized in innumerable ways, such as the baby attempting to represent the features associated with its parent's motion as the movement of its own extremities. It must continuously make attempts at conveying the intended goal to its own motor system until there is a common electrochemical representation, used to communicate between the baby and the baby's limbs, that has features similar enough to its parent's motions to allow the baby to walk. In this case, imitation learning is a necessity for intelligent systems, as, generally speaking, the space of actions will be so large that a being will be more likely to die than to find the optimal policy on its own.

    *Intelligent parties can be defined as a system capable of perceiving, representing said perception (retaining a representation for internal use), and communicating said representation (external use of the representations).

    This is all a long-winded way of pointing out that meaning isn't about which words, or representations, show up more often next to which other words; meaning is rather what a word is standing in for. So large language models have "knowledge" of the representations, the representational system, and common uses of both, but have no "knowledge" of what is actually being represented. They have no more data on a car than the use of the word in context with other words. It's a shallow understanding.

    The discussion of how ChatGPT performs given a logical prompt from different parties shows the problem with how NLP is currently done. The text OpenAI would prefer to train their model on will be similar to that of a researcher. Researchers tend to use language in similar ways, due to how often they communicate with other researchers, and as such have developed particular language and conventions to convey meaning. It is by no means the most effective (just to note, there is fundamental room for improvement in our general use of language). This means that when a non-researcher attempts to convey something that may have the same meaning as what a researcher conveys, but is written in ways a researcher is unlikely to use, the model (whether a language model, image generation, speech synthesis, etc.) will tend to perform worse at recognizing the intent of the user.

    Without expanding NLP to include any representational system (starting with the most common), and a convergence process between the user and the model, it will be difficult to ensure that a model produces correct output specific to that user's intent. After all, the models are currently trained to produce the next most likely words given what they have been trained on, rather than the next most likely correct words regardless of whether they were in their training data.

    All of the above doesn't even account for self-reward systems. In other words, an intelligent agent that self-determines its rewards rather than the rewards being determined by the creators of some environment.
     
    Last edited: Feb 20, 2023
  12. EternalMe

    EternalMe

    Joined:
    Sep 12, 2014
    Posts:
    183
    > EternalMe, you don't need separate brains for differing difficulty levels; instead, you would save a history of brain versions after each update of the network: one brain trained, but inferenced as multiple brains of different versions. The versions could be controlled by win rate or some other factor. It is also possible to constrain an agent so that a brain must perform worse. I would not necessarily advise this, but it is possible.
    It depends on how you want the difficulty to work. In my case I would want the enemy with lower difficulty to be slower, weaker, etc. - physical properties + abilities, and not just the level of training. And changes like these usually require retraining a new brain. Unless you train it with random physical properties and abilities, but that is an idea I haven't used in practice, so I can't comment objectively (it could mean a larger network).

    Other than that, the idea with brain history versions sounds neat, but I would say in practice it's not so great. Mostly there is a sort of breaking point where the agent becomes usable at all; after that, the continued training is more like polishing the small things.

    Of course, one can plan very smart learning points. Like, first the agent should learn to run slowly - difficulty 1 (copy), then it should learn to run faster - difficulty 2 (copy). Or introducing new challenges at specific points to make it smarter. ML-Agents has functionality for this (curriculum learning). But it takes some intense planning when mixed with physical abilities etc., and probably won't work in all scenarios.
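    If I were to try the random-properties idea, I imagine the C# side would look something like this (untested sketch; the parameter names are made up, but Academy.Instance.EnvironmentParameters is the real ML-Agents API, and the values would come from the environment_parameters section of the trainer YAML):

    Code (CSharp):
    using Unity.MLAgents;

    // Sketch: the trainer samples (or curriculum-steps) these parameters
    // per episode; the second argument is the fallback used in inference.
    public class EnemyAgent : Agent
    {
        float runSpeed;
        float attackStrength;

        public override void OnEpisodeBegin()
        {
            var envParams = Academy.Instance.EnvironmentParameters;
            runSpeed = envParams.GetWithDefault("run_speed", 1f);              // made-up key
            attackStrength = envParams.GetWithDefault("attack_strength", 1f); // made-up key
        }
    }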
     
  13. JB-AI

    JB-AI

    Joined:
    Jan 9, 2023
    Posts:
    14
    What is the usual total training time for training multiple brains for different difficulties? What kind of games do you prefer making with ML-Agents, by the way, Eternal?

    Hugh, how do you use ML-Agents for research? I've been trying to think about a tool that could potentially mass-produce research interactively. For instance, a researcher could have a core version of an experiment that develops without any outside interaction (maybe multiple, as a control), and then numerous copies that could be interacted with - not just by the researcher dynamically interacting with their own experiment, but open to public interaction as well. I think it would be pretty interesting to essentially play with a researcher's experiment in real time or near real time. It could be a useful educational tool as well.
     
  14. EternalMe

    EternalMe

    Joined:
    Sep 12, 2014
    Posts:
    183
    I haven't done anything super serious with ML-Agents. Not commercial, not for a real game. Just hobby-level experiments going on for years in my free time. So I don't do long/fine training runs; 1 million steps or so, to get a proof of my concepts. But my skills are on point, and I also help a lot of newcomers on Discord. I also have cool ideas for ML games and beyond, but time (money) is a big problem, so I usually just keep dreaming about it.

    As for your tool: I did not really understand it. Maybe you can give me real examples or something. But I suggest Discord; we can chat there. This thread is going in all possible directions by now.
     
  15. unikum

    unikum

    Joined:
    Oct 14, 2009
    Posts:
    58
    It's so confusing when Unity devs come in and say it's not dead, and then it's followed by a long silence and no updates to the repo. Is it on ice, or is it abandoned now?
     
    TV4Fun likes this.
  16. DerDicke

    DerDicke

    Joined:
    Jun 30, 2015
    Posts:
    292
    This is the creepy thing. It gets this relatively complex example right, but try:
    "Mark has 3 bananas. Peter has 5 apples. How many apples do they have together?"
    Depending on temperature, you get "They have 8 apples together" or "They have 8 fruits together".

    I also generated Unity code with it. It compiles and runs. But when math is involved, it's usually complete BS. And it *looks* correct until you test it (or understand the math code).

    It also makes up links if you ask for sources - meaning links that do not exist. If you tell it the link isn't there, it will create another one, pretending that this one is the right one. Just funny.
     
  17. DrunkenMastah

    DrunkenMastah

    Joined:
    Sep 26, 2017
    Posts:
    51
    Bug fixed in GPT-4. [Screenshot: upload_2023-5-16_14-37-13.png]
     
  18. DerDicke

    DerDicke

    Joined:
    Jun 30, 2015
    Posts:
    292
  19. Claytonious

    Claytonious

    Joined:
    Feb 16, 2009
    Posts:
    904
    So, this thread is no longer about whether ML-Agents has been abandoned, but while it still was, the consensus was that it has been abandoned, right?
     
    TV4Fun likes this.
  20. DrMadVibe

    DrMadVibe

    Joined:
    Jun 29, 2022
    Posts:
    5
  21. JB-AI

    JB-AI

    Joined:
    Jan 9, 2023
    Posts:
    14
    I think it is either on the back burner, or they are working on integrating it with the DOTS/ECS system, or it truly is dead. It's really hard to tell.
     
  22. GamerLordMat

    GamerLordMat

    Joined:
    Oct 10, 2019
    Posts:
    185
    Not really; as with every tool, you have to know the theoretical basics to use it in the most efficient way.
    ML-Agents is a great tool and will stay that way until there are huge jumps in AI (which haven't happened yet).

    You can learn PPO, deep learning, etc. along the way, but use ML-Agents for projects.
     
    TV4Fun likes this.
  23. KevinWoozy1423

    KevinWoozy1423

    Joined:
    Aug 9, 2021
    Posts:
    3
    a year later
     
    TV4Fun likes this.
  24. JulesVerny

    JulesVerny

    Joined:
    Dec 21, 2015
    Posts:
    47
    I am not sure if ML-Agents is dead.
    Release 21 came out a couple of weeks ago, but there is no clear information on what new features or capabilities this release offers. I guess lots of bug fixes, but no additional features?

    So it feels as though ML-Agents is not being developed any further; it is just in a maintenance phase.

    Unless anyone can give us any information on if and when ML-Agents is going to be progressed any further.
     
  25. WaxyMcRivers

    WaxyMcRivers

    Joined:
    May 9, 2016
    Posts:
    59
    https://github.com/Unity-Technologies/ml-agents/releases

    All of the info about Release 21 is right here. ML-Agents is still in development.
     
  26. JulesVerny

    JulesVerny

    Joined:
    Dec 21, 2015
    Posts:
    47
    Yeah, I saw that. But there is nothing of any real substance in it that progresses ML-Agents.
    So it is pretty dead.
     
    TV4Fun likes this.
  27. yoonitee

    yoonitee

    Joined:
    Jun 27, 2013
    Posts:
    2,363
    Why is it dead? Is it not still useful? It looks like you can train lots of interesting things with it.
     
  28. heartingNinja

    heartingNinja

    Joined:
    Mar 26, 2015
    Posts:
    35
    I just did your hummingbird course for a second time, using ML-Agents 2.0.1. It still works. One name changed, but it is in the comments. I have it working with animations by just switching to discrete actions instead of continuous ones.

    But now I see I don't really understand the training. It will do great one round and get a mean reward of 7, then go back to 0.
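    Roughly what I mean by the animation switch (simplified, untested sketch; the trigger names and branch layout are just how I set mine up, not anything from the course):

    Code (CSharp):
    using Unity.MLAgents;
    using Unity.MLAgents.Actuators;
    using UnityEngine;

    // Sketch: one discrete branch picks an animation state instead of
    // feeding continuous values into the motion directly.
    public class BirdAgent : Agent
    {
        public Animator animator; // trigger names below are placeholders

        public override void OnActionReceived(ActionBuffers actions)
        {
            int move = actions.DiscreteActions[0]; // branch 0: 0=idle, 1=fly, 2=dive
            switch (move)
            {
                case 1: animator.SetTrigger("Fly"); break;
                case 2: animator.SetTrigger("Dive"); break;
                default: animator.SetTrigger("Idle"); break;
            }
        }
    }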
     
  29. Claytonious

    Claytonious

    Joined:
    Feb 16, 2009
    Posts:
    904
    Because of the very high cost of investing limited time. It's unwise to pour hundreds or thousands of your limited hours of life into a tech stack that has no future of its own. That time would be better spent on a healthy, thriving foundation with higher returns on your investment in the future.
     
    TV4Fun likes this.
  30. msrafiyenko

    msrafiyenko

    Joined:
    Sep 25, 2014
    Posts:
    5
    Hey, folks! Are there any new upcoming updates? Or is it dead now for sure?