
You need a decent GPU, though. I suspect my 6080 MiB of VRAM won't cut it any longer :(


There's a version that's a bit slower but more memory-efficient (https://github.com/basujindal/stable-diffusion) that runs on 6 GB too.


Is Apple M1 support expected soon? Even if Apple's chips are slower, they have plenty of RAM on laptops. I saw some weeks ago that it was coming, but I am not sure where to follow the progress.


Looks like there's a nightly PyTorch release with Apple Silicon support: https://towardsdatascience.com/gpu-acceleration-comes-to-pyt...
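
If you want to check whether your install actually picked up Metal support, something like this should work (a quick sketch; torch.backends.mps.is_available() is the documented check for the mps backend those nightlies add):

    import torch

    # Requires a PyTorch build with MPS (Apple Metal) support,
    # i.e. the nightlies described in the linked article.
    if torch.backends.mps.is_available():
        device = torch.device("mps")
        x = torch.randn(1024, 1024, device=device)
        y = x @ x  # the matmul runs on the Apple GPU
        print(y.device)  # mps:0
    else:
        print("MPS not available in this PyTorch build")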


You're going to need at least 10 GB of VRAM. My SFF PC with 4 GB of VRAM can only run DALL-E Mini / Craiyon :(


Not if you change the precision to float16; it should work on a smaller card. I tried it on a 1080 with 8 GB and it works well.


How would one do that?

-----

Sorry, my bad, I found the answer. One simply adds the following flags to the StableDiffusionPipeline.from_pretrained call in the example: revision="fp16", torch_dtype=torch.float16

Found it in this blogpost: https://huggingface.co/blog/stable_diffusion
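
For anyone else following along, here's roughly what the full snippet looks like with those flags added (a minimal sketch; the CompVis/stable-diffusion-v1-4 model ID is the one from the blog post, and the output field name has changed across diffusers versions):

    import torch
    from diffusers import StableDiffusionPipeline

    # Load the half-precision weights; roughly halves VRAM use at inference.
    # You may need to log in to Hugging Face and accept the model license first.
    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4",
        revision="fp16",
        torch_dtype=torch.float16,
    )
    pipe = pipe.to("cuda")

    prompt = "a photograph of an astronaut riding a horse"
    image = pipe(prompt).images[0]  # older diffusers versions: pipe(prompt)["sample"][0]
    image.save("astronaut.png")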

mempko, thank you for your hint! I was about to drop a not-insignificant amount of money on a new GPU.

What does one lose by using the float16 representation? Does it make the images visually less detailed? How should one reason about this?


Zero loss. All upside. It only causes issues when training. 32-bit ships by default because it is compatible with CPUs and GPUs that might not have native fp16 support.

Edit: Just to be clear, your intuition that it could cause issues is certainly merited, and not _all_ models can be trivially converted from fp32 to fp16 without some new error accumulating during inference. Variational autoencoders like VQGAN, and GANs, are particularly prone to such issues.

But in this case, it's all upside.
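
To make the training/inference distinction concrete, here's a small sketch of where fp16 actually loses precision (toy example, not Stable Diffusion itself):

    import torch

    # fp16 has a 10-bit mantissa, so not every integer above 2048 is
    # representable; tiny gradient updates can round away entirely,
    # which is why training in pure fp16 is fragile.
    print(torch.tensor(2049.0, dtype=torch.float16))  # tensor(2048., dtype=torch.float16)

    # For inference you just cast the trained weights down (toy model here;
    # half-precision matmuls generally want a GPU):
    if torch.cuda.is_available():
        model = torch.nn.Linear(16, 16).cuda().half().eval()
        x = torch.randn(1, 16, device="cuda", dtype=torch.float16)
        with torch.no_grad():
            print(model(x).dtype)  # torch.float16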



