Cutting inference cold starts by 40x with LP, FUSE, C/R, and CUDA-checkpoint
(modal.com)
78 points by charles_irl 11 hours ago | 18 comments