Is there a way to initialize the nnx layers dynamically? #4365

yCobanoglu · 2024-11-10T02:49:10Z

https://flax.readthedocs.io/en/v0.8.3/experimental/nnx/mnist_tutorial.html

This model is from the tutorial. The input size of its linear1 layer is fixed (3136), which makes it annoying to train the model on a different dataset. Is there a way to lazily initialize the layer somehow?
Setting self.linear1 = None and then creating the layer on the first pass in __call__ with an if-else causes an error.

class CNN(nnx.Module):
    """A simple CNN model."""

    def __init__(self, *, rngs: nnx.Rngs):
        self.conv1 = nnx.Conv(1, 32, kernel_size=(3, 3), rngs=rngs)
        self.conv2 = nnx.Conv(32, 64, kernel_size=(3, 3), rngs=rngs)
        self.avg_pool = partial(nnx.avg_pool, window_shape=(2, 2), strides=(2, 2))
        self.linear1 = nnx.Linear(3136, 256, rngs=rngs)
        self.linear2 = nnx.Linear(256, 10, rngs=rngs)

    def __call__(self, x):
        x = self.avg_pool(nnx.relu(self.conv1(x)))
        x = self.avg_pool(nnx.relu(self.conv2(x)))
        x = x.reshape(x.shape[0], -1)  # flatten
        x = nnx.relu(self.linear1(x))
        x = self.linear2(x)
        return x

cgarciae (Maintainer) · 2024-11-11T19:04:18Z

Hey @yCobanoglu, great question! We get this a lot; it's a downside of having explicit initialization. The nice thing is that you can infer hard-to-compute constants with nnx.eval_shape if you pass some sample input to the constructor. Here's an example:

from functools import partial

import jax.numpy as jnp
from flax import nnx

class CNN(nnx.Module):
  """A simple CNN model."""

  def __init__(self, x, *, rngs: nnx.Rngs):
    self.conv1 = nnx.Conv(1, 32, kernel_size=(3, 3), rngs=rngs)
    self.conv2 = nnx.Conv(32, 64, kernel_size=(3, 3), rngs=rngs)
    self.avg_pool = partial(nnx.avg_pool, window_shape=(2, 2), strides=(2, 2))
    # use `eval_shape` to compute the number of flat features without running the model
    flat_features = nnx.eval_shape(CNN._get_flat_features, self, x).shape[-1]
    self.linear1 = nnx.Linear(flat_features, 256, rngs=rngs)
    self.linear2 = nnx.Linear(256, 10, rngs=rngs)

  def _get_flat_features(self, x):
    x = self.avg_pool(nnx.relu(self.conv1(x)))
    x = self.avg_pool(nnx.relu(self.conv2(x)))
    x = x.reshape(x.shape[0], -1)
    return x

  def __call__(self, x):
    x = self.avg_pool(nnx.relu(self.conv1(x)))
    x = self.avg_pool(nnx.relu(self.conv2(x)))
    x = x.reshape(x.shape[0], -1)  # flatten
    x = nnx.relu(self.linear1(x))
    x = self.linear2(x)
    return x


sample_x = jnp.ones((1, 64, 64, 1))
model = CNN(sample_x, rngs=nnx.Rngs(0))

Here I'm duplicating some of the forward pass, but you could probably refactor the model so that __call__ reuses _get_flat_features.
