MegaTrain: Full Precision Training of 100B+ Parameter LLMs on a Single GPU
(arxiv.org)
247 points by chrsw 11 hours ago | 47 comments