Description
Problem & Motivation
In Evo2, using the --max-steps argument to stop training at a specific step also modifies the learning rate schedule. This makes it difficult to test partial convergence training that stops at a given step without altering the intended LR schedule.
File: sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py
Remove the SignalAfterGivenStepCallback from the training script
BioNeMo Framework Version
Category
Model/Training
Proposed Solution
Introduce a new optional argument, e.g. lr_scheduler_steps, which, when passed, sets the number of steps for the learning rate scheduler instead of max_steps. When it is omitted, the scheduler keeps using max_steps.
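A minimal sketch of how the two values could be decoupled, to make the proposal concrete. It is not the actual code in train.py: the flag name `--lr-scheduler-steps`, the helper `resolve_scheduler_steps`, and the plain cosine decay in `cosine_lr` (including its default learning rates) are all assumptions made for illustration.

```python
import argparse
import math
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI: only the two arguments relevant to this issue."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--max-steps", type=int, required=True,
                        help="Step at which training stops.")
    parser.add_argument("--lr-scheduler-steps", type=int, default=None,
                        help="Length of the LR schedule; defaults to --max-steps.")
    return parser


def resolve_scheduler_steps(max_steps: int, lr_scheduler_steps: Optional[int] = None) -> int:
    """Fall back to max_steps so behavior is unchanged when the new flag is omitted."""
    return max_steps if lr_scheduler_steps is None else lr_scheduler_steps


def cosine_lr(step: int, total_steps: int, max_lr: float = 3e-4, min_lr: float = 3e-5) -> float:
    """Illustrative cosine decay from max_lr to min_lr over total_steps (no warmup)."""
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Stop after 1,000 steps while following a schedule designed for 10,000 steps.
    args = build_parser().parse_args(["--max-steps", "1000", "--lr-scheduler-steps", "10000"])
    schedule_steps = resolve_scheduler_steps(args.max_steps, args.lr_scheduler_steps)
    for step in (0, 500, args.max_steps):
        print(step, cosine_lr(step, schedule_steps))
```

With this shape, a partial-convergence run that stops at step 1,000 sees the same learning rate at every step as the full 10,000-step run would, because the schedule length no longer depends on where training stops.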
Expected Benefits
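max_steps can be used to control the length of training, while lr_scheduler_steps defines the LR schedule. This allows a run to stop at a given step for partial convergence testing without changing the intended schedule.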