Towards Optimal LLM Quantization (picovoice.ai)
18 points by bejager on June 2, 2024 | 8 comments


How does it compare with AWQ, SqueezeLLM, or newer quantization methods?


How do you integrate with vLLM?


Is there a way for me to compress a custom fine-tuned model of my own?


Not yet, but it's something we have in mind as a future feature.


Decent platform support - any plans for a Rust SDK?


We are continuously working on expanding SDK support; Rust is also on the list.


Any benchmarks with Falcon 2?


We don't support Falcon 2 yet, but new models are always on our radar for addition to the platform.



