Making Stable Diffusion Safer for Third Parties


✨ So you want to add AI image generation to your website? Rarely, Stable Diffusion will generate NSFW images from regular prompts. The default filter catches it, but your users getting a black image isn't much better. 😅 How can we get around this? With a new model 👇

The default filter Stable Diffusion comes with is pretty good, and usually manages to catch these "exceptions". (it's easy to manually bypass and only filters sexual content, but that's for another thread 👀) When it triggers, SD will replace your image with a black square. ⬛️

This essentially fixes the problem of accidentally showing NSFW content when users don't expect it, but at the cost of UX (you either show an error or return a black square, neither of which users expect). Can we do better? Yes! Let's embed the filter into the model itself 💡

Introducing Safe Latent Diffusion, which does exactly that. It operates at the denoising step of SD, and "pushes" generated images to be farther away from inappropriate content (defined as: hate, harassment, violence, self-harm, sexual, shocking, illegal). arxiv.org/abs/2211.05105
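To make the "pushing" a bit more concrete, here's a rough, element-wise sketch of the idea in plain Python. It's loosely modelled on the guidance term described in the paper, with momentum and warm-up left out. The function name, parameter names and default values are my own placeholders, so treat it as an illustration rather than the actual implementation.

```python
from typing import List


def safe_guided_noise(
    uncond: List[float],   # noise estimate with no prompt
    text: List[float],     # noise estimate conditioned on the user's prompt
    unsafe: List[float],   # noise estimate conditioned on the "unsafe concept" text
    guidance_scale: float = 7.5,   # illustrative default
    safety_scale: float = 1000.0,  # illustrative default
    threshold: float = 0.01,       # illustrative default
) -> List[float]:
    """Simplified sketch: classifier-free guidance minus a safety term."""
    out = []
    for e_u, e_p, e_s in zip(uncond, text, unsafe):
        diff = e_p - e_s
        # Safety strength grows with the gap between the two estimates,
        # is capped at 1, and is switched off once diff reaches the threshold.
        scale = min(abs(diff) * safety_scale, 1.0) if diff < threshold else 0.0
        # Direction of the unsafe concept, relative to the unconditioned estimate.
        safety_term = (e_s - e_u) * scale
        # Usual prompt guidance, with the unsafe direction subtracted.
        guidance = (e_p - e_u) - safety_term
        out.append(e_u + guidance_scale * guidance)
    return out
```

When the safety strength is zero, this reduces to plain classifier-free guidance; otherwise the result is moved against the unsafe-concept direction.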

And the best part: it requires no extra training/tuning! 🎉 This means it'll work with your existing models (so you don't have to redownload those enormous ckpt files). If you're using @huggingface's diffusers lib, it's literally a one line change!
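Here's roughly what that looks like with the Hugging Face library. This is a minimal sketch, assuming a diffusers version that ships `StableDiffusionPipelineSafe` and `SafetyConfig` (import paths may differ between releases). The `generate_safely` helper and its default model id are my own choices for illustration, so swap in whatever checkpoint you already use.

```python
def generate_safely(prompt: str, model_id: str = "AIML-TUDA/stable-diffusion-safe"):
    """Generate one image with Safe Latent Diffusion guidance enabled."""
    # Imported lazily so this file can be loaded without diffusers installed.
    from diffusers import StableDiffusionPipelineSafe
    from diffusers.pipelines.stable_diffusion_safe import SafetyConfig

    # The "one line change": load the safe pipeline class instead of the
    # regular Stable Diffusion pipeline.
    pipe = StableDiffusionPipelineSafe.from_pretrained(model_id)

    # SafetyConfig bundles preset strengths (WEAK, MEDIUM, STRONG, MAX)
    # as keyword arguments for the pipeline call.
    result = pipe(prompt=prompt, **SafetyConfig.MEDIUM)
    return result.images[0]


if __name__ == "__main__":
    # Downloads the model weights on first run.
    generate_safely("a photo of a corgi on a beach").save("corgi.png")
```

A stronger preset filters more aggressively but can change the image more, so it's worth trying a few on your own prompts.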

I've also deployed it on @replicatehq, which provides a super simple API to make integration even easier (and a pretty nice interface to try it online). 🚀 (cog repo lives at github.com/m1guelpf/cog-safe-diffusion) replicate.com/m1guelpf/safe-latent-diffusion

Keep in mind, the Safe Latent Diffusion model above is new, and might produce worse results than the original, mess things up, or miss some stuff. 🚧 (you still have the original content filter backing you up if something goes wrong, so the worst case is still a black square)

If you try out the model, make sure to let me know how it went! And if you're interested in learning more about the default SD content filter, I've got a thread coming on how it works, its flaws, and how to use it for non-AI images, so stay tuned for that soon 👀