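Add a README for the examples.

Also sets `initial_learning_rate` to `3e-4` in the `lrelunet101_imagenet`
config and `l2_reg` to `1e-5` in the `resnet50_imagenet` config.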

botev · KfacJaxDev · commit 58825af20570 · 2022-04-04T09:54:39.000-07:00
PiperOrigin-RevId: 439331640
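
Note on the `lrelunet101_imagenet` hunk below: it changes the
`initial_learning_rate` of a schedule configured with `name="cosine"` and
`warmup_epochs=5`. The schedule implementation itself is not part of this
diff, so the following is only a minimal sketch of what a cosine schedule
with linear warmup commonly looks like. The function name, the step-based
(rather than epoch-based) warmup, and the decay to zero are all assumptions
made for illustration, not the code used by the examples.

```python
import math


def cosine_schedule_with_warmup(step, total_steps, warmup_steps,
                                initial_learning_rate):
  """Hypothetical helper: linear warmup, then cosine decay to zero."""
  if step < warmup_steps:
    # Linear ramp from 0 up to the peak learning rate.
    return initial_learning_rate * step / warmup_steps
  # Fraction of the post-warmup phase that has elapsed, in [0, 1].
  progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
  # Half-cosine decay from the peak learning rate down to zero.
  return 0.5 * initial_learning_rate * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions the value passed as `initial_learning_rate` is the
peak reached at the end of warmup, which is why lowering it from `0.1` to
`3e-4` scales the whole curve rather than only its starting point.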
diff --git a/examples/README.md b/examples/README.md
@@ -0,0 +1,54 @@
+# KFAC-JAX Examples
+
+To run the examples you will need to install additional dependencies:
+
+```shell
+$ pip install -r examples/requirements.txt
+```
+
+This folder contains code with functionality shared by all examples.
+Each example has the following structure:
+
+* `experiment.py` contains the example-specific code: the model definition,
+the loss definition and the pipeline experiment class.
+* `pipeline.py` contains the example-specific hyper-parameter configuration.
+
+To run an example, execute its `pipeline.py`:
+
+```shell
+$ python ${example_name}/pipeline.py
+```
+
+## Autoencoder on MNIST
+
+This example demonstrates how to use the optimizer on a deterministic
+autoencoder on the MNIST dataset.
+The default configuration uses automatic adaptation of the learning rate,
+momentum and damping.
+
+## Classifier on MNIST
+
+This example demonstrates how to use the optimizer on a very small
+convolutional classifier on the MNIST dataset.
+The default configuration uses automatic adaptation of the learning rate,
+momentum and damping.
+
+## Resnet50 on ImageNet
+
+This example demonstrates how to use the optimizer on Resnet50 on the
+ImageNet dataset.
+Because it is infeasible to run this problem with very large batch sizes, the
+default configuration only adapts the damping.
+The momentum is fixed at `0.9` and the learning rate follows an ad-hoc schedule.
+
+
+## Resnet101 with TAT on ImageNet
+
+This example demonstrates how to use the optimizer on Resnet101 on the
+ImageNet dataset, with no residual connections or normalization layers, as in
+the [TAT paper].
+The damping is fixed at `0.001`, the momentum at `0.9`, and the learning rate
+follows a cosine schedule.
+
+
+[TAT paper]: https://arxiv.org/abs/2203.08120
diff --git a/examples/autoencoder_mnist/pipeline.py b/examples/autoencoder_mnist/pipeline.py
@@ -38,6 +38,7 @@ def get_config() -> config_dict.ConfigDict:
   config.checkpoint_dir = "/tmp/kfac_jax_jaxline/"
   config.train_checkpoint_all_hosts = False
 
+  # Experiment config.
   config.experiment_kwargs = config_dict.ConfigDict(
       dict(
           config=dict(
diff --git a/examples/lrelunet101_imagenet/pipeline.py b/examples/lrelunet101_imagenet/pipeline.py
@@ -77,7 +77,7 @@ def get_config() -> config_dict.ConfigDict:
                       use_adaptive_momentum=False,
                       use_adaptive_damping=False,
                       learning_rate_schedule=dict(
-                          initial_learning_rate=0.1,
+                          initial_learning_rate=3e-4,
                           warmup_epochs=5,
                           name="cosine",
                       ),
diff --git a/examples/resnet50_imagenet/pipeline.py b/examples/resnet50_imagenet/pipeline.py
@@ -42,7 +42,7 @@ def get_config() -> config_dict.ConfigDict:
   config.experiment_kwargs = config_dict.ConfigDict(
       dict(
           config=dict(
-              l2_reg=0.0,
+              l2_reg=1e-5,
               training=dict(
                   steps=200_000,
                   epochs=None,