Attention at Constant Cost per Token via Symmetry-Aware Taylor Approximation
(arxiv.org)
135 points
by fheinsen
8 hours ago
70 comments