#Dalle, #StableDiffusion, etc. are amazing at imagining things like avocado chairs, but they can't draw specific objects.
This is a major blocker for anyone who wants to use this tech to advertise a real product.
But this will change.
A thread to explain how.
A good solution would be to show #StableDiffusion a couple of images of your product (or your pet, as in this example) and get an identifier (a name) for it.
So you can use this new identifier (name) in your prompt.
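Here's what that could look like in practice. A minimal sketch using the diffusers library, assuming you already have fine-tuned weights saved locally; the path and the identifier token "sks" are placeholders I chose for illustration, not anything from the paper.

```python
import torch
from diffusers import StableDiffusionPipeline

# Hypothetical path: wherever your fine-tuned weights live.
pipe = StableDiffusionPipeline.from_pretrained(
    "path/to/your-dreambooth-model",
    torch_dtype=torch.float16,
).to("cuda")

# "sks" stands in for the unique identifier learned for YOUR pet/product.
image = pipe("a photo of sks dog wearing sunglasses on the beach").images[0]
image.save("my_dog.png")
```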
This is precisely what a new paper from @GoogleAI is doing.
It's named DreamBooth: dreambooth.github.io/
It is surprisingly simple to understand how it works.
Let me explain:
First, you need a generic text2image network (in red) and a couple of new images.
The goal is to get a new network (in blue) that will be able to draw everything, including your specific product/pet with a unique identifier.
So you can use this unique identifier whenever you need to draw exactly the thing you want (the product/pet that was in your images).
How do you create this new network (shown in yellow)?
The idea is to start from the generic one and teach it two tasks simultaneously:
1) learn what the generic network was already doing (upper part)
2) learn to reconstruct your specific images (bottom part), as in the sketch below
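To make the two-task idea concrete, here is a toy PyTorch sketch. Everything in it is a stand-in I made up (a tiny linear layer instead of the real denoiser, random tensors instead of images, plain MSE instead of the actual diffusion loss); it only shows how the two objectives get combined into one training step.

```python
import copy
import torch
import torch.nn.functional as F

generic = torch.nn.Linear(64, 64)    # stand-in for the frozen pretrained network (red)
model = copy.deepcopy(generic)       # the new network we fine-tune (blue)
opt = torch.optim.Adam(model.parameters(), lr=1e-5)

subject_images = torch.randn(4, 64)  # dummy encodings of your few pet/product photos
generic_inputs = torch.randn(4, 64)  # dummy generic inputs for task 1
lambda_prior = 1.0                   # balance between the two tasks

for step in range(100):
    # Task 1: keep matching what the frozen generic network does (upper part)
    with torch.no_grad():
        target = generic(generic_inputs)
    loss_prior = F.mse_loss(model(generic_inputs), target)

    # Task 2: reconstruct your specific images (bottom part)
    loss_subject = F.mse_loss(model(subject_images), subject_images)

    loss = loss_subject + lambda_prior * loss_prior
    opt.zero_grad()
    loss.backward()
    opt.step()
```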
And that's it! Remarkably simple.
Here are some results:
While this method is not perfect, given the pace of research, it's a safe bet that these techniques will become mainstream sooner rather than later.
If you are interested, keep an eye on what we are doing at clipdrop.co ;-)