Sophia: A Scalable Second-Order Optimizer for Language Model Pre-Training
3 points | 1 hour ago | 0 comments | arxiv.org | HN