SOAP optimizer #367

Open
jdart1 opened this issue Apr 8, 2025 · 3 comments
Labels
enhancement New feature or request

Comments

@jdart1

jdart1 commented Apr 8, 2025

Ceres chess (https://github.com/dje-dev/Ceres) is using SOAP, a recently published optimization algorithm: https://arxiv.org/abs/2409.11321. There is a Python implementation. It is reportedly faster and more performant than AdamW.

jdart1 added the enhancement label Apr 8, 2025
@jw1912
Owner

jw1912 commented Apr 8, 2025

@jdart1 I am currently in the lead-up to final university exams, so I probably will not have time to implement this for around 2-3 months (it is non-trivial and requires new operations on each backend).
Some thoughts about it though:

  • This seems likely to be an insane training slowdown (in pos/sec terms) for king bucketed networks
  • I can only see evidence for its improved performance on deep neural networks, in particular transformers (as Ceres net is)
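One plausible reading of the slowdown concern, sketched as back-of-the-envelope arithmetic: SOAP-style optimizers keep square preconditioner statistics per weight matrix, so their state grows quadratically with the layer's input dimension, and king bucketing multiplies that dimension. The state layout in `soap_state_floats` is an assumption based on the paper's description, the function names are hypothetical, and the shapes are illustrative, not taken from any particular engine. Implementations may skip preconditioning along very large dimensions, which this sketch ignores.

```python
def adam_state_floats(m: int, n: int) -> int:
    # Adam/AdamW keeps first and second moment buffers,
    # each with the same shape as the m x n weight.
    return 2 * m * n


def soap_state_floats(m: int, n: int) -> int:
    # Assumed SOAP-style state for one m x n weight:
    #   two square statistics (m x m and n x n),
    #   their eigenvector matrices (same shapes),
    #   plus Adam's two moment buffers kept in the rotated basis.
    return 2 * m * m + 2 * n * n + 2 * m * n


# Hypothetical shapes for illustration only.
inputs_per_bucket = 768
hidden = 1024

for buckets in (1, 4, 16):
    m = inputs_per_bucket * buckets
    ratio = soap_state_floats(m, hidden) / adam_state_floats(m, hidden)
    print(f"buckets={buckets:2d} inputs={m:6d} SOAP/Adam state ratio ~ {ratio:.1f}x")
```

Under these assumptions the ratio rises with the bucket count, and the dense m x m products involved would also not benefit from the sparsity of the input features.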

@jdart1
Author

jdart1 commented Apr 8, 2025

It is only a suggestion. If it is less efficient then there is no point in implementing it.

@cosmobobak
Contributor

On the topic of alternate optimisers, there's also Muon.
