KVarN: Native vLLM backend for KV-cache quantization by Huawei
(github.com)
63 points
by theanonymousone
3 hours ago
7 comments