Data Point: I am currently having issues getting Mixtral Q4_K_M running in LMStu...

geococcyxc · on Dec 19, 2023

You just have to allow more than 75% of memory to be allocated to the GPU by running sudo sysctl -w iogpu.wired_limit_mb=30720 (for a 30 GB limit in this case).
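For a different limit, the value is just the size in GB times 1024. A minimal sketch, assuming you want to work it out in the shell; the wired_limit_mb helper name is made up for illustration and is not part of macOS or LM Studio. It only prints the command, so nothing changes until you run the printed line yourself:

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): convert a limit in GB to the
# megabyte value that iogpu.wired_limit_mb expects (1 GB = 1024 MB here).
wired_limit_mb() {
    echo $(( $1 * 1024 ))
}

# Print the command for a 30 GB limit instead of executing it, so this
# sketch is safe to run on any machine.
echo "sudo sysctl -w iogpu.wired_limit_mb=$(wired_limit_mb 30)"
```

With 30 as the argument this prints the same command as above (30720). Leave some headroom for the OS when picking a number.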

cschneid · on Dec 19, 2023

1. That worked after some tweaking.
2. I had to lower the context window size to get LM Studio to load it up.
3. LM Studio has two distinct checkboxes that both say "Apple Metal GPU". No idea if they do the same thing...

Thanks a ton! I'm running on GPU w/ Mixtral 8x Instruct Q4_K_M now. Throughput is about 4x what CPU-only was (now at 26 tok/sec or so).