Towards Optimal LLM Quantization (picovoice.ai)
18 points by bejager on June 2, 2024 | 8 comments
aa6864aa on June 2, 2024
How does it compare with AWQ, SqueezeLLM, or newer quantization methods?
abcd98 on June 2, 2024
How do you integrate with vLLM?
dynamix on June 2, 2024
Is there a way for me to compress a custom fine-tuned model of my own?
    bejager on June 2, 2024
    Not yet, but it's something we have in mind as a future feature.
eonlav on June 2, 2024
Decent platform support - any plans for a Rust SDK?
    bejager on June 2, 2024
    We continuously work on expanding SDK support; Rust is also on the list.
aviel on June 2, 2024
Any benchmarks with Falcon 2?
    bejager on June 2, 2024
    We don't support Falcon 2 yet, but new models are always on our radar to be added to the platform.