Get runtime sharding without doing any FLOPs inside nnx.shard_map
#4797
Unanswered
carlesoctav asked this question in General
Replies: 0 comments
I have a modifier/function that basically converts any module into an FSDP module. In short, it adds `gather_params()` and `scatter_params()` hooks before and after `__call__`. By default, a model has no sharding metadata, but I add runtime metadata when `__call__` gets called.
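Roughly, the wrapper does something like this (`gather_params()` / `scatter_params()` are my own helpers, sketched here only as stubs, not Flax APIs):

```python
from flax import nnx

def gather_params(module: nnx.Module):
    ...  # all-gather the FSDP-sharded params before the forward pass

def scatter_params(module: nnx.Module):
    ...  # re-shard the params and attach runtime sharding metadata afterwards

def fsdp_call(module: nnx.Module, *args, **kwargs):
    gather_params(module)
    out = module(*args, **kwargs)  # the original __call__
    scatter_params(module)
    return out
```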
However, I've run into a problem. With Linen I can run `shape = jax.eval_shape(model.init, param_rng, x)` and get the partitioning via `linen.get_partition_spec()`, which returns the `PartitionSpec` for `nn.Partitioned` values. I think this works because `model.init` calls `__call__` at least once, so `scatter_params()` gets called.
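Concretely, the Linen flow I mean looks roughly like this (the `nn.with_partitioning` annotation is only for illustration; in my case the metadata comes from `scatter_params()`):

```python
import jax
import jax.numpy as jnp
import flax.linen as nn

class MLP(nn.Module):
    @nn.compact
    def __call__(self, x):
        kernel_init = nn.with_partitioning(
            nn.initializers.lecun_normal(), ("data", None)
        )
        return nn.Dense(16, kernel_init=kernel_init)(x)

model = MLP()
x = jnp.ones((4, 8))

# No FLOPs are spent here: eval_shape only traces model.init abstractly,
# but the nn.Partitioned boxes survive the trace.
abstract_vars = jax.eval_shape(model.init, jax.random.key(0), x)
specs = nn.get_partition_spec(abstract_vars)  # PyTree of PartitionSpec
```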
Since NNX creates the params when we create the object in `__init__`, it doesn't even call `__call__`. We could simply run the model once, but that would spend FLOPs just to get the runtime `PartitionSpec`.
I tried running `nnx.eval_shape(lambda x: model(x))` and then `nnx.get_partition_spec(nnx.state(model))` to get the partition spec, but I got an error because `PartitionSpec` is not a valid JAX type.
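For reference, what I tried looks roughly like this (with `nnx.Linear` standing in for the actual FSDP-wrapped module; the real module also mutates its state via the hooks above):

```python
from flax import nnx
import jax.numpy as jnp

model = nnx.Linear(8, 16, rngs=nnx.Rngs(0))  # stand-in for the wrapped module
x = jnp.ones((4, 8))

# Abstract trace of the forward pass: no FLOPs are spent, but writing a
# PartitionSpec into the state during the trace is presumably what triggers
# the "not a valid JAX type" error.
_ = nnx.eval_shape(lambda x: model(x), x)
specs = nnx.get_partition_spec(nnx.state(model))
```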
Please note that I need to run this under `nnx.shard_map` because I need `jax.lax` primitives (for example, to find indices and do other operations).

Any ideas on how I can achieve this? Thanks!