This repo contains the official PyTorch implementation of the NeurIPS'22 paper
Training Scale-Invariant Neural Networks on the Sphere Can Happen in Three Regimes
Maxim Kodryan*,
Ekaterina Lobacheva*,
Maksim Nakhodnov*,
Dmitry Vetrov
arXiv / openreview / short poster video / long talk (in Russian) / bibtex
Environment
conda env create -f SI_regimes_env.yml
Example usage
To reproduce one of the lines in Figure 1 of the paper:
- Run the script run_train_and_test.py to train the model and compute metrics. As provided, it trains a scale-invariant ConvNet on CIFAR-10 using SGD on the sphere with an effective learning rate (ELR) of 1e-3.
- Use the notebook Plots.ipynb to view the results.
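For intuition about what "SGD on the sphere with a fixed ELR" means, here is a minimal pure-Python sketch. It is not the repository's code. It assumes that training on the sphere amounts to taking a gradient step of size ELR and then projecting the scale-invariant weights back to unit norm. The toy loss, the helper names loss_and_grad and sphere_sgd_step, and the values elr=0.1 and 200 steps are all hypothetical and chosen only for illustration.

```python
import math


def loss_and_grad(w, target):
    """Toy scale-invariant loss: it depends on w only through w / ||w||."""
    norm = math.sqrt(sum(x * x for x in w))
    u = [x / norm for x in w]
    loss = sum((ui - ti) ** 2 for ui, ti in zip(u, target))
    g_u = [2.0 * (ui - ti) for ui, ti in zip(u, target)]
    # Chain rule through the normalisation: grad_w = (I - u u^T) g_u / ||w||,
    # so the gradient is always orthogonal to w.
    radial = sum(gi * ui for gi, ui in zip(g_u, u))
    grad = [(gi - radial * ui) / norm for gi, ui in zip(g_u, u)]
    return loss, grad


def sphere_sgd_step(w, grad, elr):
    """One gradient step of size elr, then projection back onto the unit sphere."""
    stepped = [x - elr * g for x, g in zip(w, grad)]
    norm = math.sqrt(sum(x * x for x in stepped))
    return [x / norm for x in stepped]


target = [0.6, 0.8]  # unit-norm target direction (toy data)
w = [1.0, 0.0]       # start on the unit sphere
initial_loss, _ = loss_and_grad(w, target)
for step in range(200):
    _, grad = loss_and_grad(w, target)
    w = sphere_sgd_step(w, grad, elr=0.1)  # illustrative ELR, not the paper's value
final_loss, _ = loss_and_grad(w, target)
print(initial_loss, final_loss, math.sqrt(sum(x * x for x in w)))
```

Because the weights are renormalised after every step, their norm never changes, so the step size passed to sphere_sgd_step acts directly as the effective learning rate.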
Main parameters
To replicate other results from the paper, vary the parameters in run_train_and_test.py:
- dataset: CIFAR10 or CIFAR100
- to train fully scale-invariant networks, use the models ConvNetSI/ResNet18SI, fix_noninvlr = 0.0 (the learning rate for non-scale-invariant parameters), and initscale = 10. (the norm of the last layer's weight matrix)
- to train all network parameters, use the models ConvNetSIAf/ResNet18SIAf, fix_noninvlr = -1, and initscale = -1
- to train networks on the sphere, use fix_elr = 'fix_elr' and a positive elr value
- to train networks in the whole parameter space, use fix_elr = 'fix_lr' and a positive lr_init value (we also use weight decay wd in this setup)
- to enable momentum, set a non-zero momentum value in params
- to enable data augmentation, remove the noaug option from add_params
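The combinations above can be summarised in a small helper. This is only a sketch: the function make_setup, the flat dictionary layout, the list form of add_params, and the wd default of 0.0 are assumptions made for illustration. The actual variables in run_train_and_test.py may be organised differently, so treat the output as a checklist of values to set rather than something the script consumes directly.

```python
def make_setup(fully_scale_invariant, on_sphere, step_size, wd=0.0,
               arch="ConvNet", dataset="CIFAR10", momentum=0.0, augment=False):
    """Collect the parameter values described above (hypothetical layout)."""
    if dataset not in ("CIFAR10", "CIFAR100"):
        raise ValueError("dataset must be CIFAR10 or CIFAR100")
    if arch not in ("ConvNet", "ResNet18"):
        raise ValueError("arch must be ConvNet or ResNet18")
    if step_size <= 0:
        raise ValueError("elr / lr_init must be positive")

    cfg = {"dataset": dataset, "momentum": momentum}

    if fully_scale_invariant:
        # Freeze non-scale-invariant parameters and fix the last-layer norm.
        cfg.update(model=arch + "SI", fix_noninvlr=0.0, initscale=10.0)
    else:
        # Train all network parameters.
        cfg.update(model=arch + "SIAf", fix_noninvlr=-1, initscale=-1)

    if on_sphere:
        cfg.update(fix_elr="fix_elr", elr=step_size)
    else:
        # Whole parameter space: plain learning rate plus weight decay.
        cfg.update(fix_elr="fix_lr", lr_init=step_size, wd=wd)

    # Data augmentation is off while the noaug option is present.
    cfg["add_params"] = [] if augment else ["noaug"]
    return cfg


# The setup described under "Example usage": scale-invariant ConvNet on
# CIFAR-10, trained on the sphere with ELR 1e-3.
sphere_setup = make_setup(fully_scale_invariant=True, on_sphere=True, step_size=1e-3)
print(sphere_setup)
```

For the whole-parameter-space setup, call make_setup with fully_scale_invariant=False and on_sphere=False, and pass your own lr_init and wd values; none are suggested here because the text above does not state them.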
Parts of this code are based on the following repositories:
- On the Periodic Behavior of Neural Network Training with Batch Normalization and Weight Decay. Ekaterina Lobacheva, Maxim Kodryan, Nadezhda Chirkova, Andrey Malinin, and Dmitry Vetrov.
- Rethinking Parameter Counting: Effective Dimensionality Revisited. Wesley Maddox, Gregory Benton, and Andrew Gordon Wilson.
If you find this code useful, please cite our paper:
@inproceedings{kodryan2022regimes,
  title={Training Scale-Invariant Neural Networks on the Sphere Can Happen in Three Regimes},
  author={Maxim Kodryan and Ekaterina Lobacheva and Maksim Nakhodnov and Dmitry Vetrov},
  booktitle={Advances in Neural Information Processing Systems},
  year={2022},
  url={https://openreview.net/forum?id=edffTbw0Sws}
}