Hardcoded 4096 dimension in DiffusionModelEncoder prevents architectural flexibility #8496

The `DiffusionModelEncoder` class hardcodes an input dimension of 4096 in its final linear layer. Any encoder configuration whose flattened feature size is not 4096 fails with a shape mismatch error.

Steps to reproduce the behavior:

1. Create a `DiffusionModelEncoder` with custom channels or an input size whose flattened features do not total 4096.
2. Run a forward pass through the model.
3. Observe the shape mismatch error at the hardcoded `nn.Linear(4096, 512)` layer.
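
The failure mode in the steps above can be illustrated without MONAI. The sketch below uses a hypothetical `ToyEncoder` (a stand-in written for this issue, not the MONAI implementation) with three stride-2 convolutions followed by the same kind of hardcoded `nn.Linear(4096, 512)` head. The layer sizes and the helper `try_forward` are assumptions chosen so that a 64x64 input happens to flatten to exactly 4096 features, while a 32x32 input does not.

```python
import torch
import torch.nn as nn


class ToyEncoder(nn.Module):
    """Hypothetical stand-in for the encoder; NOT the MONAI implementation."""

    def __init__(self, in_channels: int = 1, channels: tuple = (32, 64, 64)):
        super().__init__()
        layers = []
        prev = in_channels
        for ch in channels:
            # Each stage halves the spatial size (stride-2 convolution).
            layers += [nn.Conv2d(prev, ch, kernel_size=3, stride=2, padding=1), nn.SiLU()]
            prev = ch
        self.features = nn.Sequential(*layers)
        # Hardcoded input size, mirroring the layer described in this issue.
        self.out = nn.Sequential(nn.Linear(4096, 512), nn.ReLU(), nn.Linear(512, 2))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.features(x)
        return self.out(h.flatten(start_dim=1))


def try_forward(size: int) -> str:
    """Run one forward pass on a (2, 1, size, size) batch and report the result."""
    model = ToyEncoder()
    x = torch.randn(2, 1, size, size)
    try:
        return f"ok: {tuple(model(x).shape)}"
    except RuntimeError as err:
        return f"RuntimeError: {err}"


if __name__ == "__main__":
    # 64 -> 32 -> 16 -> 8, so 64 channels * 8 * 8 = 4096 features: works.
    print(try_forward(64))
    # 32 -> 16 -> 8 -> 4, so 64 channels * 4 * 4 = 1024 features: shape mismatch.
    print(try_forward(32))
```

Only the first call succeeds, because the head works solely when channels and input size multiply out to 4096.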

The final linear layer should dynamically adapt to the actual flattened feature size from the encoder blocks.
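
One possible way to get this behavior, shown here as a sketch rather than as the fix adopted by MONAI, is to let PyTorch infer the input size with `nn.LazyLinear`. The class `FlexibleToyEncoder` and the helper `head_in_features` are hypothetical names used only for illustration; the convolutional stack is a toy stand-in for the real encoder blocks.

```python
import torch
import torch.nn as nn


class FlexibleToyEncoder(nn.Module):
    """Hypothetical toy encoder whose head adapts to the flattened feature size."""

    def __init__(self, in_channels: int = 1, channels: tuple = (32, 64, 64), out_channels: int = 2):
        super().__init__()
        layers = []
        prev = in_channels
        for ch in channels:
            # Each stage halves the spatial size (stride-2 convolution).
            layers += [nn.Conv2d(prev, ch, kernel_size=3, stride=2, padding=1), nn.SiLU()]
            prev = ch
        self.features = nn.Sequential(*layers)
        # LazyLinear infers in_features from the first batch it receives.
        self.out = nn.Sequential(nn.LazyLinear(512), nn.ReLU(), nn.Linear(512, out_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.out(self.features(x).flatten(start_dim=1))


def head_in_features(size: int, channels: tuple = (32, 64, 64)) -> int:
    """Run one dry forward pass and return the inferred input size of the head."""
    model = FlexibleToyEncoder(channels=channels)
    with torch.no_grad():
        y = model(torch.randn(2, 1, size, size))
    assert tuple(y.shape) == (2, 2)
    return model.out[0].in_features


if __name__ == "__main__":
    print(head_in_features(64))                     # 64 * 8 * 8
    print(head_in_features(32))                     # 64 * 4 * 4
    print(head_in_features(48, channels=(16, 32)))  # 32 * 12 * 12
```

A trade-off of this approach is that the lazy layer's parameters are uninitialized until the first forward pass, so a dry run is needed before building an optimizer or loading a state dict. An alternative is to compute the flattened size in `__init__` from a known input shape and pass it to a regular `nn.Linear`.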

**Environment**

```
================================
Printing MONAI config...
================================
MONAI version: 1.6.dev2525
Numpy version: 1.26.4
Pytorch version: 2.6.0+cu124
MONAI flags: HAS_EXT = False, USE_COMPILED = False, USE_META_DICT = False
MONAI rev id: 56fe5f014f424ad0e6dbc8345515fc49295dd849
MONAI __file__: /usr/local/lib/python3.11/dist-packages/monai/__init__.py

Optional dependencies:
Pytorch Ignite version: 0.5.2
ITK version: NOT INSTALLED or UNKNOWN VERSION.
Nibabel version: 5.3.2
scikit-image version: 0.25.2
scipy version: 1.15.2
Pillow version: 11.1.0
Tensorboard version: 2.18.0
gdown version: 5.2.0
TorchVision version: 0.21.0+cu124
tqdm version: 4.67.1
lmdb version: NOT INSTALLED or UNKNOWN VERSION.
psutil version: 7.0.0
pandas version: 2.2.3
einops version: 0.8.1
transformers version: 4.51.3
mlflow version: NOT INSTALLED or UNKNOWN VERSION.
pynrrd version: NOT INSTALLED or UNKNOWN VERSION.
clearml version: NOT INSTALLED or UNKNOWN VERSION.

For details about installing the optional dependencies, please visit:
    https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies


================================
Printing system config...
================================
System: Linux
Linux version: Ubuntu 22.04.4 LTS
Platform: Linux-6.6.56+-x86_64-with-glibc2.35
Processor: x86_64
Machine: x86_64
Python version: 3.11.11
Process name: python3
Command: ['/usr/bin/python3', '-m', 'colab_kernel_launcher', '-f', '/root/.local/share/jupyter/runtime/kernel-2d79cbbb-e64e-40c4-b611-bb0a65b9b06d.json']
Open files: [popenfile(path='/root/.ipython/profile_default/history.sqlite', fd=46, position=0, mode='r+', flags=688130), popenfile(path='/root/.ipython/profile_default/history.sqlite', fd=48, position=0, mode='r+', flags=688130), popenfile(path='/root/.ipython/profile_default/history.sqlite-journal', fd=74, position=0, mode='r+', flags=688130)]
Num physical CPUs: 2
Num logical CPUs: 4
Num usable CPUs: 4
CPU usage (%): [2.6, 3.7, 2.9, 2.7]
CPU freq. (MHz): 2000
Load avg. in last 1, 5, 15 mins (%): [33.9, 15.3, 6.8]
Disk usage (%): 28.7
Avg. sensor temp. (Celsius): UNKNOWN for given OS
Total physical memory (GB): 31.4
Available memory (GB): 29.2
Used memory (GB): 1.7

================================
Printing GPU config...
================================
Num GPUs: 1
Has CUDA: True
CUDA version: 12.4
cuDNN enabled: True
NVIDIA_TF32_OVERRIDE: None
TORCH_ALLOW_TF32_CUBLAS_OVERRIDE: None
cuDNN version: 90100
Current device: 0
Library compiled for CUDA architectures: ['sm_50', 'sm_60', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90']
GPU 0 Name: Tesla P100-PCIE-16GB
GPU 0 Is integrated: False
GPU 0 Is multi GPU board: False
GPU 0 Multi processor count: 56
GPU 0 Total memory (GB): 15.9
GPU 0 CUDA capability (maj.min): 6.0
```

