
Stable Diffusion: Fundamental technological shift

Discussion in 'General Discussion' started by xshadowmintx, Sep 3, 2022.

Thread Status:
Not open for further replies.
  1. xshadowmintx

    xshadowmintx

    Joined:
    Nov 4, 2016
    Posts:
    47
    Hi!

    I don't post much, but I'm a regular reader, so I believe there's a reasonable chance that the mods and/or actual unity developers will read this, so here goes:

    Seriously, stop whatever you are working on, and pay attention to what is happening on https://github.com/CompVis/stable-diffusion

    There are rare times when you see generational shifts in technology taking place, and if you miss the boat on this one, you will be regretting it... literally, forever.

    There is a very brief window here to be a leader in integrating stable diffusion and related technology into unity, even in an experimental manner, that will either make unity a leader or leave it behind.

    I know it's a complex issue, but mm... well, I think the internet is pretty clear that stable diffusion is a fundamental transition in art and asset generation.

    If you are not paying attention, for whatever reason, or not applying yourselves to actively integrating and experimenting with this technology then DO SO NOW.

    Here are some things that have happened in the last 10 days, since this model was released:

    - the original code: https://github.com/CompVis/stable-diffusion

    - the code now runs on any M1 mac: https://github.com/lstein/stable-diffusion

    - the code with support for masking and inpainting, which is highly effective at generating, for example, textures and game assets: https://github.com/hlky/stable-diffusion

    - running on intel cpus using openvino: https://github.com/bes-dev/stable_diffusion.openvino

    I can't say this any more clearly than this:

    This is an absolutely transformative technology.

    Sometimes things pass and it seems like you folks (i.e. Unity, the company) are drifting along doing this and that with all kinds of priorities, and I totally get that.

    If the next blog post I read is about Unreal having an in-engine img2img for SD, you will know that you've missed the boat and are playing catch-up. So... have a look. Have a play. Reach out, work with them. Do something amazing.

    ...but, please: PAY ATTENTION because, I think.. looking back, you might find that this is more important than anything else you're doing right now.

    That is all.
     
    rem_s and johnpccd like this.
  2. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,321
    You're a bit late to the party.

    Here's the thread where it was already discussed:
    https://forum.unity.com/threads/ai-art-for-games.1320627/

    Here are posts in it explaining how to use it.
    https://forum.unity.com/threads/ai-art-for-games.1320627/page-4#post-8406549
    https://forum.unity.com/threads/ai-art-for-games.1320627/page-4#post-8409678
    https://forum.unity.com/threads/ai-art-for-games.1320627/page-4#post-8410107
    https://forum.unity.com/threads/ai-art-for-games.1320627/page-4#post-8411547
    ------
    It is one of those situations where unity actually doesn't need to do anything, because there's no boat to miss.

    It is already available to everybody and the internet is full of sites describing how to build prompts for it.

    It already works as a standalone tool, and trying to integrate it into the engine is kinda pointless.
     
    Ryiah, pachermann, Martin_H and 4 others like this.
  3. CodeSmile

    CodeSmile

    Joined:
    Apr 10, 2014
    Posts:
    4,019
    You forgot to explain WHY. ;)
     
    JoNax97 likes this.
  4. DragonCoder

    DragonCoder

    Joined:
    Jul 3, 2015
    Posts:
    1,459
    For people who want a really easy entry point - someone made a neat little GUI with integrated installer: https://www.reddit.com/r/StableDiff...sytoinstall_windows_gui_for_stable_diffusion/

    For art it's cool. And quite funny: I'm active in an art community and posted some pieces... and got a higher fave/view ratio than some commissioned art I paid money for ._.'

    However, it's hard to see the practical use for game assets aside from concept art yet, because what you cannot do very well is say "draw me the great character you have just drawn, but as a frontal view instead of a side view". It excels at "one-off" things or things it has thousands of samples of in its dataset. Backgrounds and props do work, though, with some manual adjustment afterwards.

    That said, this is just the beginning - aka the first system that is really usable for said one-off use cases. Indeed, see the thread neginfinity linked. There are aims for pixel art etc. too. Maybe the reproducibility issue will be solved as well at some point.


    Yeah good point. Let's not make this into a mere hype like NFTs xP

    While it's of no practical use to me yet, it certainly brought me a lot of joy, and that's something :)
     
  5. xshadowmintx

    xshadowmintx

    Joined:
    Nov 4, 2016
    Posts:
    47
    It's already been effectively integrated into photoshop and krita, proving you are incorrect.

    There is considerable value in having your tooling tightly integrated into your workflow.

    Nevertheless, you're welcome to your opinion; I shall, however, bookmark this post to bring up later, when it is integrated into other engines and people moan and complain that unity doesn't have those features, because 'the community thought it was kinda pointless'.

    At any rate, I'm not interested in what other developers think; the tools are out there. People are already using it to generate game assets.

    What I care about, is that Unity, the company, is paying attention. ...because, bluntly, anyone who is *not* paying attention right now, should be struck off for being asleep at the wheel.
     
  6. xshadowmintx

    xshadowmintx

    Joined:
    Nov 4, 2016
    Posts:
    47
    This took me 15 seconds to generate. You can definitely generate more than simply 'concept art'.

    You decide if you see value in it... but, I think it's fair to say that some people will find it has some value.

    (the latter being the output from taking the output from the first image and looping it back in as input for a second round)
     

    Attached Files:

  7. BIGTIMEMASTER

    BIGTIMEMASTER

    Joined:
    Jun 1, 2017
    Posts:
    5,181
    is it a fundamental technological shift or something that saves production time in certain niches?
     
  8. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,321
    There's a thing called textual inversion, where you give 3 to 5 images to yet ANOTHER neural network, and it produces a gibberish "vector" which describes them within your main network.

    Then you can use the gibberish vector to refer to the thing.
    I've not used it yet, however. The requirements to run inversion are significantly higher.
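
    For the curious, using an already-trained inversion is the easy half; a minimal sketch, assuming the Hugging Face diffusers tooling (the model repo, concept repo and <cat-toy> token below are illustrative examples from its docs, not anything from this thread):

    ```python
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # Loads a concept trained from a handful of images; it binds the learned
    # "gibberish vector" to a placeholder token, here <cat-toy>.
    pipe.load_textual_inversion("sd-concepts-library/cat-toy")

    # The placeholder token can now be used like any other word in a prompt.
    image = pipe("a <cat-toy> sitting on a castle wall").images[0]
    image.save("cat_toy_castle.png")
    ```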
    ---
    Take a look at the posts I linked and you'll be able to greatly improve it.

    Like I said, you're late to the party.
     
  9. Murgilod

    Murgilod

    Joined:
    Nov 12, 2013
    Posts:
    9,744
    Nobody who says something is a fundamentally transformative technology ever seems to bother to explain why, probably because they've only ever bought into the hype without thinking of the practical way these technologies end up disseminating into the ecosystem. AI art and its various implementations do have a place in the industry, but fundamentally it's going to be things like texture generation (we already have this in some capacity, its inroads are proven), blocking out concept art to have more detail (management loves this S***), and filling in background elements and creating other non-hero assets.

    Of course, this all comes with a lot of problems. In the case of the concept art and non-hero assets, I get the feeling we're going to very much see something similar happen in those spaces as we've seen happen from scratch music in film scores, where something will be generated relatively close to what's going to go into the final product and what ends up in the final project is going to be little more than a paintover of that. Management loves this S*** because it means that there's a locked in process that doesn't have to be adapted too much, lots of artists hate it because it fundamentally reduces their job to something that's easily replaced by entry level workers who just need to know their way around image editing.

    More than that, however, there's still the upcoming potential legal issues. Despite what people will tell you, there is a fundamental difference between a human being taking inspiration from art they've engaged with as a person and throwing all the art ever made into a training data set and then drawing from that to create something new. The real issue here is that the training data is fundamentally derivative of work that is entirely unlicensed. On top of that, the AI output suffers from a lot of copyright issues, because copyright generally requires human authorship. This means that additional work would be required to establish provenance on the part of companies using it in any significant capacity.

    Stop buying into the hype train and shiny demos and think about how things actually work when they've gone through widespread adoption.
     
    Saniell and Martin_H like this.
  10. CodeSmile

    CodeSmile

    Joined:
    Apr 10, 2014
    Posts:
    4,019
    Like this one? :D

    upload_2022-9-3_18-1-10.png

    I particularly like the interpretation of "offroad". :cool:
     
  11. CodeSmile

    CodeSmile

    Joined:
    Apr 10, 2014
    Posts:
    4,019
    Which it is. The AI was programmed by humans. The person generating the image defined the parameters.
     
  12. Murgilod

    Murgilod

    Joined:
    Nov 12, 2013
    Posts:
    9,744
    Cool, enjoy being wrong about how this works I guess.

    https://www.smithsonianmag.com/smar...e-rules-ai-art-cant-be-copyrighted-180979808/

     
    Teila and Martin_H like this.
  13. CodeSmile

    CodeSmile

    Joined:
    Apr 10, 2014
    Posts:
    4,019
    => Other countries put less emphasis on the necessity of human authorship for protection.

    I'm in an Other country. And given the appeal, and future legal conflicts expanding the gray areas, I'm sure that eventually even the USCO will change their point of view. I'd say as soon as a tech giant makes such an appeal and there's REAL money on the line ...
     
  14. DragonCoder

    DragonCoder

    Joined:
    Jul 3, 2015
    Posts:
    1,459
    That's the inverse, though. Yes, you cannot copyright a piece generated via AI (per this particular ruling), but you can still use it.
    And if your game pops off with a specific character generated by the AI, you can acquire a trademark for it (for money, of course; Nintendo and co. pay to have Mario etc. protected). For a trademark it does not matter so much how it was created.
     
    Last edited: Sep 3, 2022
  15. Murgilod

    Murgilod

    Joined:
    Nov 12, 2013
    Posts:
    9,744
    Rulings. Plural. Repeated attempts have been made. Also that's really not how trademark works.
     
  16. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,321
    You're not using it correctly.

    And that's not how it works. The AI was not programmed.
     
    Martin_H likes this.
  17. Martin_H

    Martin_H

    Joined:
    Jul 11, 2015
    Posts:
    4,433
    https://www.deutscheranwaltspiegel..../do-ai-generated-works-qualify-for-copyright/

    I wish more people would understand this. And even more I'd wish they'd have more compassion for artists, but I guess that is a lost cause...
    I've played around some with stable diffusion and I'm convinced it isn't nearly as "intelligent" as the marketing hype and apologists want us to believe. I think it's glorified copyright infringement on a massive scale and I think over time (perhaps with help of other trained models) we'll find out the true scale of how aggressively it's ripping parts from training set images to construct its output images. This is the atomic war escalation level of global scale copyright ignorance and I bet it will come back to bite many of the early adopters.
     
  18. DragonCoder

    DragonCoder

    Joined:
    Jul 3, 2015
    Posts:
    1,459
    That can be dismissed on the basis of the data.
    Dall-E 2 used a few hundred million images of training data, yet its model contains only 3.5 billion parameters. That works out to roughly a dozen values per image. Good luck reproducing a training image, or even identifying one, with so little data (it would be an amazing compression feat if it were possible).
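
    A back-of-envelope check of that ratio (the exact image count is an assumption; "a few hundred million" taken as 200-400M):

    ```python
    params = 3.5e9                  # model parameters cited above
    for images in (2.0e8, 4.0e8):   # assumed bounds for "a few hundred million"
        print(params / images)      # 17.5 and 8.75 -> roughly a dozen values per image
    ```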

    The AIs do not copy images. They utilize the "knowledge" that's behind how images are produced. And you'd better not want to copyright knowledge like people did in the middle ages (where only "guilds" were allowed to possess the means to do certain crafts).

    The one thing I could imagine becoming copyrightable are specific art styles if they are recognizable enough.
     
    Last edited: Sep 4, 2022
  19. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,321
    Since you're an artist, you should probably take a look at this video to see how people use it to make their own works from a template.
    https://www.reddit.com/r/StableDiffusion/comments/x0b0rq/img2img_fantasy_art_walkthrough_video/

    Regarding copyright...

    Diffusion networks work in the following way: they start with noise, and then, in turns, apply filters to it that make the noise look more like the desired target.

    https://towardsdatascience.com/stable-diffusion-best-open-source-version-of-dall-e-2-ebcdf1cb64bc

    In the case of stable diffusion, each keyword is a function like that.
    The thing is, a "dog()" function, for example, does not describe individual dogs; it describes all possible dogs. So there is not really a specific dog in the dog function, and no training image either.
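
    Schematically, the sampling loop looks something like this (pseudo-Python: unet, scheduler_step and vae_decode stand in for the real model components; they are not any specific library's API):

    ```python
    import torch

    x = torch.randn(1, 4, 64, 64)      # start: pure latent noise
    for t in timesteps:                # e.g. 50 steps, from very noisy to clean
        # The network predicts which part of x is "noise", conditioned on the
        # prompt embedding -- the keyword-driven "filter" described above.
        eps = unet(x, t, prompt_embedding)
        x = scheduler_step(x, eps, t)  # remove a bit of the predicted noise
    image = vae_decode(x)              # decode the latents into a finished picture
    ```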

    Another issue: a long time ago, when discussing something with specific artists, I asked (I think) about copyright issues with the use of reference images.

    The answer I got was along the lines that if you "steal" from sufficiently large number of sources, the result will no longer resemble any of them, and thus fail substantial similarity test.

    And the neural net did pretty much that: it used the original images only to derive references from them.
     
  20. Murgilod

    Murgilod

    Joined:
    Nov 12, 2013
    Posts:
    9,744
    Except it can't be so easily dismissed. Dall-E and Midjourney both just scraped data from every online resource they could, regardless of whether or not they had the rights to. Midjourney was especially revealing because it would often recreate Shutterstock watermarks. GPT-2 does the same, with some prompts resulting in lines being sampled whole-cloth from writing online, including fanfiction authors putting "if you liked this, please pledge to my Patreon" at the end of stories because the language model saw that as how things just concluded.

    What you are effectively arguing here, poorly at that, is that because they do this with so much data, it'd be impossible to police. That's not just a bad argument but ignores the part where these large-dataset algorithms cannot exist without the use of labour they are not entitled to. These are not things that are just in the public domain, and your argument is basically the exact same as people who (incorrectly) say "well, if I found it on the internet that means it's free!"

    But that's not true, and there's no legal basis to believe that. Jumping whole-hog into the "AI revolution" is already leaving a lot of people burned and that's going to lead to a lot of legal problems, especially if it becomes clear that, say, the training data contains footage from Disney movies.

    That's not true at all, and you're ascribing way too much to how AI works. There is no deep understanding; they simply analyse patterns. That's not "how the images are produced", that's tracing. Your entire argument about "copyrighting knowledge" is also pretty embarrassing because, again, an algorithm is not a person. There is no creative interpretation here; it is code that replicates things based on fuzzied data.
     
    Martin_H likes this.
  21. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,321
    Missed this one.
    I advise you to change your attitude. "I'll record this conversation in the big book and later show you how wrong and shortsighted you were muahahaha!" is not the way to make people subscribe to your opinion.

    It also just happens that I've spent a whole week messing with this tool non-stop, and by now know well what it can do, what its limitations are and what it can produce. I also shared the information and as far as I can tell you haven't bothered to read it.

    This is the sort of stuff I can do with it:
    upload_2022-9-4_4-17-21.png

    The point is, when I'm saying "there's no boat to miss", I'm saying it with all this knowledge in mind.

    There's no point in integrating it into unity in the same way as there's no point in integrating blender and photoshop into unity. It is the wrong sort of task. Specialized software stays separate, and this is specialized software.

    At the moment, it is ill-suited for producing sprites and game assets directly, like you tried. The castle you made looks subpar and is nigh unusable for anything serious. If you start improving it, you'll get something like this,
    upload_2022-9-4_4-27-9.png
    And then it is no longer usable as a sprite, like you wanted.

    Yes, it is an important tool. It will have a big impact on art production, and already did.
    But its existence is almost unrelated to what unity does, because unity is a 3d engine. You can use this stuff to generate concepts, to improve visuals of your existing work, but, here's the thing....

    Those tasks do not affect what you do in unity most of the time, or even at all. It is just an extra filter in the 2d pipeline, and that's all. In unity you'll be doing the same thing as before. Same deal with unreal engine, by the way.

    So, there truly is no boat.
     
    GCatz likes this.
  22. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,321
    Well, here's the thing: Midjourney is not GPT-2, due to the different number of dimensions in the output. AFAIK you also cannot use a diffusion model on text, again due to the way it works. GPT is like a more complex Markov chain.

    Regarding shutterstock watermarks, do you imply that if someone plasters a shutterstock watermark on top of their own work, they'll be sued for copyright infringement? Shutterstock may get upset about trademarks, but the image won't become theirs.

    See art references and transformative use.
    https://en.wikipedia.org/wiki/L.H.O.O.Q.
    Copyright does not work by tracking the original image and all things that were made based on it.

    For the claim to succeed, it has to pass similarity test or similar. See famous Rogers vs Koons.
    https://en.wikipedia.org/wiki/Rogers_v._Koons

    Artists have been copying styles, poses and objects for centuries. By your logic, the traditional artistic exercise of drawing a still life of some statue would involve a copyright breach. Additionally, anything you ever saw is already in your brain; that means you should be sued if you ever visited the shutterstock site. Actually, the images from that site are likely downloaded onto your computer in a cache, so I suppose that makes you a pirate. Finally, claims like "this is tracing" have to be supported by evidence.
     
    DragonCoder likes this.
  23. Murgilod

    Murgilod

    Joined:
    Nov 12, 2013
    Posts:
    9,744
    I never said that Midjourney was GPT-2, and what I'm saying is right there in the post.

    An algorithm is not an artist. I covered this earlier too.

     
  24. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,321
    Actually, I'd argue that this is false.

    If you think about it, an artist is a neural network, and painting/drawing is an algorithm being used by it.

    About that.

    First, you can argue that the output of a neural network is public domain as long as you have the image's "address": the prompt keywords, output resolution, denoiser, number of steps and seed. Anyone with the same "address" will get the same image.
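
    Current tooling makes that "address" concrete; a minimal sketch, assuming the Hugging Face diffusers library (the model name and prompt are illustrative):

    ```python
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5"
    ).to("cuda")

    # The full "address": prompt, resolution, step count, guidance and seed.
    gen = torch.Generator("cuda").manual_seed(1234)
    image = pipe("a battle tank in a city, matte painting",
                 height=512, width=512,
                 num_inference_steps=50, guidance_scale=7.5,
                 generator=gen).images[0]
    # Re-running with identical settings (and the same model and sampler)
    # reproduces the identical image.
    ```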

    But this is where it gets interesting.

    Stable Diffusion has the ability to work from a template.

    Here's a quick example I did earlier.
    The picture on the right is derived from MY drawing on the left, and thus it would belong to me and I would own the rights to it. Something similar is happening here:
    https://www.reddit.com/r/StableDiffusion/comments/x0b0rq/img2img_fantasy_art_walkthrough_video/

    And a similar thing happens when people produce photoshop collages. Transformative use says hi: individual fragments, recombined, create a new whole.

    So to have human authorship, you simply need a human involved in the process. Once somebody combines two or more pictures and makes a collage, you have a human involved, and the result will not be easily found in the database anymore.
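
    In code, the template workflow above looks roughly like this, again assuming diffusers (parameter names may differ between versions; strength controls how far the result may drift from the template):

    ```python
    import torch
    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    sketch = Image.open("my_drawing.png").convert("RGB").resize((512, 512))
    result = pipe(prompt="fantasy landscape, detailed oil painting",
                  image=sketch,            # the human-made template
                  strength=0.6).images[0]  # 0 = keep template, 1 = ignore it
    result.save("painted_over.png")
    ```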
     
  25. Murgilod

    Murgilod

    Joined:
    Nov 12, 2013
    Posts:
    9,744
    If you're not actually going to read my posts, examples, and links, just don't reply.
     
  26. Martin_H

    Martin_H

    Joined:
    Jul 11, 2015
    Posts:
    4,433
    You don't know what you're talking about, I've already done it. And I expect to find more over time, others will too. You're simply wrong. Lossy compression imho is a great way to describe these models.

    I've seen all those videos (as an artist... with horror and anxiety), and I take offense at the notion that the result is supposed to be "their own works" when it is so clearly ripping off and exploiting the entirety of images online.

    Reference images are ideally used to derive an understanding of the concepts that create a certain look. You're not supposed to straight up copy them into the final piece; you're supposed to learn from them and then do your own thing. What this AI does imho is more like auto-photobashing, and for photobashing you still need to obey copyright laws and licenses (at least in my country). I strongly object to the notion that it's supposed to be legal to just feed any image online into the training sets for these models. It's frankly ridiculous on what a scale copyright and licenses are being ignored here. I remember how seriously you took licenses for things like make-human, and I can't reconcile that with the stance you have now regarding stable diffusion and similar models. Maybe you'll want to have a look at the licenses of the stock photo sites that have been crawled to feed these AI models - models that may now make stock photo sites obsolete to a significant degree. I have a hunch they have terms in there about not using their images to create competing products, because pretty much every site selling royalty-free assets does have those terms - and that's IF you purchase a license, which they probably have not, since no one could afford it.

    I don't think any of these AI models actually derives deep understanding from the images the same way an artist does. Try getting it to change the lighting and colors in a complex scene without changing the likeness of faces or other important factors of the scene. As far as I can tell, it's a glorified auto-photobasher that doesn't actually understand a thing it's doing in the visual domain.

    I won't deny it's very fun to play with this; I'm doing it too. Googling for images can also be quite fun, and this is like a supercharged version of that... on drugs... and probably also somewhat unhealthy, like scrolling through tiktok for a week straight. I'm still reading through your findings in the other thread and I enjoy learning how to direct prompts better. Be that for my own amusement, or to show people how they're totally deluding themselves into thinking this is some kind of magic that it simply isn't. I've read a bit about the ideas the main developer has regarding therapeutic use, creation instead of consumption, and giving people who have no access to painting or similar some kind of creative outlet. It's not all bad, but there is still a lot wrong with it, and I think this tool has no place in commercial endeavours. @Murgilod is doing a better job than I at explaining all these problems. You obviously do not agree, but I'm not interested in having a long back and forth argument about that, so let's agree to disagree.

    I think you wouldn't. It's like the cap is stolen from one photo, the shirt from another, the face from another, etc. except it's probably more than one source per element, or interpolated between several, but in the end it is still a picture made out of pieces of other pictures. You own the copyright to the left picture, because you drew it. That's it. The right picture contains mostly the work of other artists, not you. You've done paintings, you know the work needed to get from the sketch to the final is way more than coming up with that sketch in the first place.

    I don't think that applies, but I'm no lawyer and more importantly I'm... so tired of having that debate on every single place where people talk about AI right now...

    Wasn't there a game that got pulled from steam because a part of a rifle looked like part of a rifle in Call of Duty? Small indie fps, 5+ years ago, Jim Sterling covered the incident I think.

    If you're interested, maybe read the thing I linked in the previous post, that was written by actual lawyers and giving some examples of what you own copyright to.
     
    Luxxuor likes this.
  27. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,321
    This topic is interesting to me, but I'm under no obligation to follow your orders or desires. Which means that I go where I please, and act as I please. The cat on the avatar should've given you the hint, to be honest.

    Also, your deutscheranwaltspiegel link from earlier did not address collages or img2img.

    -------------.

    Back to the topic.

    Regarding stable diffusion:

    In essence we got a "make it pretty" filter, that will be used by actual artists to iterate through their work faster.

    And in addition to the "make it pretty" filter, we got a stock image archive with all possible images in existence. Unfortunately, the image archive comes with a gacha machine, so you keep pulling the lever in hopes of producing a better image out of billions of possibilities.

    Thinking about it, journalists and people with concerns are probably focusing on the wrong thing. They think the important part is going to be the gacha library making complete images and people using those as-is, while the actual impact will likely come from the "stock" combined with the "filter". Because both the combination and the selection will involve a human operator, authorship will likely arise in the process.
     
    f1415 likes this.
  28. ippdev

    ippdev

    Joined:
    Feb 7, 2010
    Posts:
    3,792
    By the same token, an artist is not a technique. An algorithm can virtually describe that technique and implement it graphically, or in 3D through CNC machines.
     
    neoshaman and neginfinity like this.
  29. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,321
    @Martin_H, I've written a long explanation but ultimately decided to cut it down to focus on key points. (not sure if that one worked out)

    First, there is understanding within the net, and it is not an auto-photobasher. You CAN change lighting and elements in the scene while retaining subject and composition, although it requires luck. The composition is largely determined in the beginning; it sets the overall shapes, and keywords you add alter details in the scene while keeping the overall shapes the same.

    The reason why I know is because I generated tens of thousands of images looking at effects of different keywords.

    Here's an example. The initial prompt sets up the overall "idea" or shapes, and adding more keywords refines it. You can also alter materials, depict things in a different medium, and so on. For example, you can draw darth vader as origami, though it does not always succeed.

    Original prompt:
    upload_2022-9-4_5-21-6.png
    Adding keywords:
    upload_2022-9-4_5-21-55.png
    More keywords:
    upload_2022-9-4_5-24-13.png
    The net has an idea of limbs, faces, eyes, skin and material textures; it knows that trucks have wheels, the shapes of cities, spaceships, and so on. It does not know them as copy-pasted fragments, but rather as compositions of shapes, lines and so on.

    Now, the thing is, the net works on several candidates, and changing keywords might make it pick a different one. But in general, making alterations while keeping the overall original idea is pretty much what it is already doing, and you can't achieve the same effect by bashing photo fragments together.

    It is also more or less capable of organizing objects in a logical fashion, so every time I hear someone say "photobasher", I feel like they haven't played with it enough. I mentioned this elsewhere, but imagine that somebody took the mona lisa, put it into a blender, pressed a button and used the resulting powder to paint something. Would the result still be a derived work?

    Also, here's a memetic result.
    upload_2022-9-4_5-33-52.png
    Not much luck adding the second woman in this iteration.

    Regarding copyright in makehuman:
    The reason for the different stance is that I do not see those situations as the same; to me, the way makehuman works is vastly different from the way a neural net works. In makehuman we have valuable handcrafted topology, which determines the outcome. But in the case of a NN, the process is largely similar to what an artist does. You live, you look, you see, you remember; everything you saw is stored in your brain for a long time, and you refer to it while drawing. So if you ever opened shutterstock, it is in your mind. Does that mean you're guilty of copyright infringement? It is in your brain already.
    ---------
    Also, the most important thing.

    Regarding the generated painting: it is a template-based workflow, not just text-to-image.
    If I spent a week with my limited skills and arrived at the EXACT same thing, what difference would there be?

    For me, the point of drawing is to bring an image I have in my mind into the medium. The image has already been produced; it is in my head. Because my brain has no USB port, I can't just download it. But the image is still there. I can see it. I know what it looks like, how things stand; I just can't copy it easily, because I lack the skill. What's more, I can't exactly go to a human artist and ask him to make the exact thing I see, because I'd have to go through limited natural-language communication, where information will be lost, and in the end the art will be different, because based on my input the artist will envision a different thing. At which point I can request a new version, and the artist will probably ask an exorbitant price for it.

    Stable Diffusion gives you access to a library of billions of images. Probably not every imaginable thing, but an innumerable amount of them. And somewhere in there is the picture I saw in my mind. I simply need to find it.

    So, how does the process of making the exact image I want go with an artist? I give out a sketch, describe what I want, and I iterate with the painter.
    How does the process of making the exact image I want go with painting by hand? I start with a sketch, I already know what I want, and I iterate, trying to bring it closer to the ideal, step by step.
    And how does it go with img2img? The process is the same: I start with a rough sketch and slowly try to make it look closer and closer to what I had in mind, step by step. (For the record, you can't generate it with one click, even though you might luck out; template-based rendering requires your input, continuously. You iterate with it.)

    In this scenario, what the tool does is accelerate the process by a factor of a thousand, and make it accessible to people who cannot draw or pay an artist, so they can make the pictures in THEIR minds into reality. And of course, an artist will have much more power with it.

    Isn't that what we always wanted? An AI-assisted artistic workflow, where the human works in tandem with the machine, painting in big strokes and letting the machine fill in details or do the grunt work? Well, maybe not "we", but I definitely thought this thing would be nice, and expected it to arrive.

    And here it is. It arrived.

    Obviously, for the sake of the picture you saw, you can study art and improve your skills, or strive to earn money to hire an army of artists, but is it all worth it? Will you do the same with every other thing you want to try? Lifespans are finite, and it is entirely possible to pursue too many skills.
     
    M4R5 and stain2319 like this.
  30. UhOhItsMoving

    UhOhItsMoving

    Joined:
    May 25, 2022
    Posts:
    69
    halo 3 comparison.png halo reach comparison.png call of duty comparison.png
    Someone "defined the parameters" (i.e. made the prompt), but did they create the work? A prompt and a work are two different things. If a person asked a group of people to draw the prompt "a penguin in a suit solving math problems," who would own the works made by that group of people?

    Btw, I'm not bashing the thing at all. I've already generated amazing images just today alone.
     
    angrypenguin and Martin_H like this.
  31. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,321
    Also, regarding "artbashing", here's a small demonstration of how it works.

    Original image:
    upload_2022-9-4_8-18-57.png
    Request:
    "battle tank in a city" (+style keywords).
    Result (the image turning blue is a bug in this workflow; you need to process it with a filter from time to time to combat it):
    upload_2022-9-4_8-19-23.png

    Process (each image is based on the previous one):
    upload_2022-9-4_8-20-5.png
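
    Scripted, the iterative workflow above looks roughly like this; a sketch assuming diffusers, with a crude saturation bump standing in for the colour-drift filter mentioned above (filenames and prompt are placeholders):

    ```python
    import torch
    from PIL import Image, ImageEnhance
    from diffusers import StableDiffusionImg2ImgPipeline

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    img = Image.open("city_street.png").convert("RGB").resize((512, 512))
    for i in range(6):
        # Each pass uses the previous result as the new template.
        img = pipe(prompt="battle tank in a city, detailed matte painting",
                   image=img, strength=0.5).images[0]
        # Periodic colour correction to counter the blue drift noted above.
        if i % 2 == 1:
            img = ImageEnhance.Color(img).enhance(1.15)
        img.save(f"iteration_{i}.png")
    ```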
     
    Noisecrime and Antypodish like this.
  32. GCatz

    GCatz

    Joined:
    Jul 31, 2012
    Posts:
    281
    Stable Diffusion is a game changer for 2D stuff and for replacing crappy fiverr artists,
    but Unity can't do anything with it...

    On the day text-to-image becomes text-to-3D-textured-model (and not the voxel kind),
    then there will be a boat to miss.
     
    xVergilx, neginfinity and DragonCoder like this.
  33. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,321
    If you can generate smooth voxels, then you can generate polygons out of them.

    Actually, I'm reminded again of Nvidia NGP, which can reconstruct a sort of volumetric data out of several photographs using the GPU. Someone called it a 3d polaroid.

    I played with it briefly; it generates a volumetric "cube" which takes about 60 megabytes of disk space. From the cube you can produce geometry.
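
    That conversion step is standard marching cubes; a minimal sketch, assuming the density cube has been exported as a NumPy grid (the filename and threshold are placeholders) and using scikit-image:

    ```python
    import numpy as np
    from skimage import measure

    density = np.load("ngp_density_grid.npy")  # assumed (N, N, N) export of the "cube"
    verts, faces, normals, _ = measure.marching_cubes(density, level=0.5)

    # Write a bare .obj; the topology will be messy and needs cleanup in a DCC tool.
    with open("mesh.obj", "w") as f:
        for v in verts:
            f.write(f"v {v[0]} {v[1]} {v[2]}\n")
        for tri in faces + 1:                  # .obj indices are 1-based
            f.write(f"f {tri[0]} {tri[1]} {tri[2]}\n")
    ```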

    Because there's no topology involved, I could see this being generated by a neural network at some point.

    The issue is that this would require an insane amount of training data. With images we have the web, and all the images we'll ever need (LAION-5B uses 2 billion pictures; people could probably go much higher if they wanted), but that's not the case for NeRF data...
     
  34. xshadowmintx

    xshadowmintx

    Joined:
    Nov 4, 2016
    Posts:
    47
    It seems capable of generating game-ready assets to me.

    _output.jpg
     
    Last edited: Sep 4, 2022
    Deleted User and DragonCoder like this.
  35. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,321
    Uh, are you going to nitpick at a single phrase and ignore everything else said in the thread?
     
    MadeFromPolygons likes this.
  36. xshadowmintx

    xshadowmintx

    Joined:
    Nov 4, 2016
    Posts:
    47
    Your argument boils down to:

    - This technology isn't useful except to generate concept art.

    - There are existing tools that let you use it to generate concept art.

    - Therefore, there is no reason, specifically, for Unity, the company, to really care about it.

    However:

    - It is suitable for uses other than generating concept art. Generating concept art is the most trivial use-case that people are using it for.

    - Using it to generate non-concept art like sprites and textures requires that you use a number of tools side by side because, for example, it can't generate transparent backgrounds for sprites, and support for arbitrary tileable textures is limited to some quite obscure branches.

    - Unity already investigated this space with ArtEngine (https://unity.com/products/unity-artengine) and decided it was not commercially viable, so there's some reasonable expectation that they will have 'burnt their fingers' on AI Art and be unenthusiastic about venturing into that space.

    - Barracuda (their neural inference engine, https://docs.unity3d.com/Packages/com.unity.barracuda@3.0/manual/index.html) has been used for some experimental purposes, though not really anything useful yet; still, they have people working in the ML space.

    - SD now has an onnx port (https://huggingface.co/bes-dev/stable-diffusion-v1-4-onnx), which means it's not beyond the realms of possibility that you could run it in unity (see the sketch after this list).

    - Unity has a market lead over other existing engines specifically with regard to the quality and tooling of their mobile and 2d offering.
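
    Regarding the ONNX point above: a minimal sketch of running the model through onnxruntime, assuming diffusers' ONNX pipeline rather than the bes-dev port specifically (model name and prompt are illustrative):

    ```python
    from diffusers import OnnxStableDiffusionPipeline

    # Runs via onnxruntime on the CPU -- no CUDA/PyTorch stack required, which
    # is what makes an ONNX export interesting for engine-adjacent tooling.
    pipe = OnnxStableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4", revision="onnx",
        provider="CPUExecutionProvider",
    )
    image = pipe("a tileable stone wall texture").images[0]
    image.save("texture.png")
    ```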

    All I can say is: You appear to have a lot of things to say on this topic... well, that's great.

    However, there is, as you pointed out, another thread for unity users to chat about AI generated art.

    What I was trying to do with this post was encourage Unity, the company, to pay attention to what is going on, because there is an opportunity here to integrate with their existing compelling commercial offering for 2d and mobile users.

    If you are a 3d modeller, or building a 3d shooter, perhaps this may seem irrelevant to you, and a waste of time. Commercially, however, it's a very important opportunity.

    I don't want to fight with you; you have strong opinions, and I disagree with them. You are simply wrong about some things. Your fundamental contentions with regard to the use and value of SD appear (to me) to consider only trivial and largely irrelevant use cases. ...but I will bow out of this thread here.

    I can only say this: I hope to see this tooling tightly integrated into unity in the future, and not into other engines.

    ...because it is my belief that its appearance in other engines would continue the unfortunate erosion of the compelling market offering that unity currently presents.
     
    Last edited: Sep 4, 2022
  37. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,321
    It feels like you haven't read a single thing I wrote.

    -----

    The main reason there's no point for unity to care about it is that this is a 2d tool that is part of the 2d pipeline. An engine editor normally assembles already-finished assets; it does not produce them. We don't have photoshop integrated into unity. Or blender. Same goes for this tool.

    What's more, it is a specialist tool and not another midjourney.

    Honestly, I do not see any sane way to integrate it into unity editor that would work in a reasonable way.

    Also, if you really feel strongly about it, you're free to make an asset with it and sell it on the asset store. Or offer it for free. Instead of demanding that someone else do it.

    I can program, produce 3d, 2d, music. The tool simply doesn't fit into the unity workflow well. There's no benefit from tight integration, plus it depends on python and pytorch, neither of which play nice by themselves.

    Opportunity? What opportunity? It is already available, to everybody, for free. Anyone can use it, with any engine.

    You'd have a point if it were proprietary and available on another engine. But it isn't. Or if it were producing rigged, ready-to-use 3d characters. But it doesn't. Even your img2img examples will need plenty of manual processing before they become usable.

    So, what is there to do, exactly? It is already there, available, and usable. Like blender, for example.
     
    Last edited: Sep 4, 2022
    MadeFromPolygons and Martin_H like this.
  38. UhOhItsMoving

    UhOhItsMoving

    Joined:
    May 25, 2022
    Posts:
    69
    Are we talking about implementing it just in-editor, or are we also talking about in-game?

    Nothing's stopping other engines from integrating it, too. In fact, nothing's stopping any program in general from integrating it.

    When you think about it, the prompt feature is kinda like an image search engine, just with generated images instead of existing ones.
     
  39. angrypenguin

    angrypenguin

    Joined:
    Dec 29, 2011
    Posts:
    15,509
    Ok, what specifically do you think that might look like? A window in Unity which generates images that you can save as assets?

    What practical benefits is this going to have over doing that in a purpose-built application and then alt-tabbing back to Unity? That's how the vast majority of source-level content generation is done, and typically for good reason.
     
  40. stain2319

    stain2319

    Joined:
    Mar 2, 2020
    Posts:
    417
    I agree; I do not see any point in putting this technology "inside Unity", just as I don't see any point in putting blender or Photoshop inside Unity.
     
    neginfinity likes this.
  41. sxa

    sxa

    Joined:
    Aug 8, 2014
    Posts:
    741
    Any third party, including the OP, is free to write an SD integration.

    In the meantime, maybe lets have UT devs focus on the engine stuff that only they can provide.

    Isn't it only a couple of months since there were folk wibbling that Unity should support cryptocoins, NFTs and that kinda crap? Same thing.
     
    stain2319 likes this.
  42. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    6,469
    Also, back when unity created the ML division, they had a post stating an interest in creating an AI that would generate any asset; it was paired with an image of a badly generated car as a display of the starting point.

    It's worth mentioning that this AI is rather "trivial" despite its impact on the debate about art (and it's not new either; David Cope's experimentation with AI music led to the same discussion, what's new is that now everybody can do it). By trivial I mean it's basically 3 known AIs put together and being unreasonably effective: an image-to-text captioning AI, an image generator and an upscaler, which are very close in architecture; it's more of a collision than a revelation. The consequence is that people who think these IMPLEMENTATIONS can't do x or y NOW are about to have a rude awakening, because other experiments already do it (like relighting, or 3d, or pbr decomposition, or coherence); it's a matter of time before people put all the working experiments together, with the bigger problem being sorting the training data to get good results. Also, the more multimodal these AIs get, the bigger they seem to need to be. It's not worth debating whether it will ever happen; it's just a matter of sipping tea and waiting. Now that we have a big proof of concept, you can bet people will start working on fixing these issues.

    I wouldn't worry too much about the copyright issue anyway; that's an issue that will take care of itself. These AIs would still be useful even without the contested data. Dall-E is already scrubbing things of a different nature, but it is a proof of concept about copyright, and workarounds will be found, like identifying popular and desirable features and paying an artist farm to reproduce those features for a totally copyright-free dataset, which will lead to the same result. In fact, that's exactly what happened before, when image recognition became useful. The future will be wild.

    SO what's the horizon? What about movies? People seem to think there is less data and therefore it won't be possible. That's not how it works; the true challenge is the inference power. Work is already being done on temporal coherence, and the challenge would be to compress the unique semantic part of a movie down to useful tokens; adjacent data helps an AI be good too, not just the target generations. Images aren't the problem in movies: assuming you have an image generation algorithm, you can decompress back into images; the hard part will be to learn, compress and correlate the tokens long enough to make a coherent story, then decompress into joint audiovisual output. The first proof of concept will probably be bad and temporally incoherent like GPT-2, then a few refinements down the line...
     
    DragonCoder likes this.
  43. yoonitee

    yoonitee

    Joined:
    Jun 27, 2013
    Posts:
    2,362
    I believe in you xshadowmintx :)
    But, why do you want Unity to embed this so that everyone (your competition) can use it more easily?
    Surely, what you want to do is learn all the knowledge about using this new technology and keep that knowledge to yourself. Next step == profit. (That's what I'm doing! Sshhhh )
    Do you have shares in Unity? If so, you should bring this up at the next shareholders' meeting, not on the Unity forum (which is well known for taking every opportunity to crush people's dreams :D)

    What I do predict about games in the future is that game consoles will be called "Dream Machines". Basically, you put on your VR headset and it reads your mind and generates whole 3D fantasy worlds with this kind of technology. Since the technology reads your mind, it will create the best experience, tailored to you, that no game developer could ever compete with. And that will be the end of game developers, because they won't be needed. This may come in less than ten years' time. Then, in eleven years' time, these "Dream Machines" will be banned because they will be more addictive than heroin. THE END.
     
    DanMeyer009 likes this.
  44. stain2319

    stain2319

    Joined:
    Mar 2, 2020
    Posts:
    417
    I'm bookmarking the above post and coming back in 10 years whenever I need a chuckle... ;)
     
  45. kdgalla

    kdgalla

    Joined:
    Mar 15, 2013
    Posts:
    4,355
    Yeah, in 10 years we can look back and reminisce about how people used to use AI for relatively harmless and innocent things. Right before "the Great AI War" nearly destroyed all of civilization. :p
     
  46. Noisecrime

    Noisecrime

    Joined:
    Apr 7, 2010
    Posts:
    2,000
    lol - though let's face it, eventually "Dream Machines" will be a thing, just likely not in our lifespan. However, take out the "mind reading" part and it's clear we're much closer to such a technology than we have ever been in the past 60 years of AI.

    Oh, and my go-to chuckle post is this one from GameDev in 2000: "MP3 beating compression". It's both good for a chuckle and humbling, as it can be so easy to be misinformed or ignorant of knowledge that disproves your own beliefs/assumptions.

    As for AI, I'm pretty convinced 2022 will go down in history as the nexus for everything we see going forward. I'm convinced that in terms of creative output AI will now become a major new force and tool, a disruption no less than that of industrialisation or the digital era. It may well become the starting point of a true "fourth industrial revolution". While AI has been a useful tool in some areas/fields for a decade or more, this feels like something different.

    The funny thing for me is I've never bought into 'The Singularity' concept of runaway AI/super intelligence destroying humanity as its own conscious act.

    However, with recent events I can totally see humans using AI in the "near future" in such a way that it results in another "great war". Just the ability to fake anything could easily be misused to push countries into war. It reminds me of both the Star Trek DS9 episode "In the Pale Moonlight" and Babylon 5's "The Deconstruction of Falling Stars", both dealing with using technology to fake events for the benefit of one side or the other.
     
    Last edited: Jan 3, 2023
  47. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,321
    You know, SMBC had a real good comic about that...

    Uh-huh. And then we merrily beep and laugh in binary while flashing LEDs. As in, biological platforms becoming obsolete was inevitable.
     
    Noisecrime likes this.
  48. GimmyDev

    GimmyDev

    Joined:
    Oct 9, 2021
    Posts:
    157
  49. stain2319

    stain2319

    Joined:
    Mar 2, 2020
    Posts:
    417
    "Reading your mind" isn't the hard part. Translating it to gameplay mechanics and graphics on the fly, much more so.
     
  50. GimmyDev

    GimmyDev

    Joined:
    Oct 9, 2021
    Posts:
    157
    Ask ChatGPT, grandson.
     