Thanks for the tip about exllama, I've been on the lookout for a readable python...

		dTal on June 13, 2023 \| parent \| context \| favorite \| on: Llama.cpp: Full CUDA GPU Acceleration Thanks for the tip about exllama, I've been on the lookout for a readable python implementation to play with that is also fast and has support for quantized datasets.