Description
The `DiffusionModelEncoder` class has a hardcoded input dimension of 4096 in its final linear layer, which prevents architectural flexibility and causes shape mismatch errors when different encoder configurations are used.
Steps to reproduce the behavior:
- Create a `DiffusionModelEncoder` with custom channels or input dimensions that do not result in 4096 flattened features
- Run a forward pass through the model
- Encounter a shape mismatch error at the hardcoded `nn.Linear(4096, 512)` layer
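The steps above can be illustrated with a minimal sketch. `ToyEncoder` and `forward_fails` are hypothetical stand-ins written for this example, not MONAI code; the only thing they are assumed to share with `DiffusionModelEncoder` is a flatten step followed by a hardcoded `nn.Linear(4096, 512)`. The channel counts and the 64x64 input size are chosen purely so that the default configuration happens to produce 4096 features.

```python
import torch
from torch import nn


class ToyEncoder(nn.Module):
    """Hypothetical stand-in for an encoder whose head hardcodes 4096 inputs."""

    def __init__(self, in_channels=1, channels=(32, 64, 64)):
        super().__init__()
        layers = []
        prev = in_channels
        for ch in channels:
            # Each stride-2 convolution halves the spatial size.
            layers += [nn.Conv2d(prev, ch, kernel_size=3, stride=2, padding=1), nn.SiLU()]
            prev = ch
        self.features = nn.Sequential(*layers)
        # Hardcoded input size: only valid when channels[-1] * H_out * W_out == 4096.
        self.out = nn.Sequential(nn.Linear(4096, 512), nn.ReLU(), nn.Linear(512, 1))

    def forward(self, x):
        h = self.features(x)
        h = h.reshape(h.shape[0], -1)  # flatten to (batch, features)
        return self.out(h)


def forward_fails(model, x):
    """Return True if the forward pass raises a RuntimeError (e.g. shape mismatch)."""
    try:
        with torch.no_grad():
            model(x)
    except RuntimeError:
        return True
    return False


x = torch.randn(2, 1, 64, 64)

# 64 channels * 8 * 8 = 4096 flattened features: matches the hardcoded layer.
print(forward_fails(ToyEncoder(channels=(32, 64, 64)), x))

# 128 channels * 8 * 8 = 8192 flattened features: the hardcoded layer rejects it.
print(forward_fails(ToyEncoder(channels=(32, 64, 128)), x))
```

Changing the input resolution instead of the channel counts triggers the same failure, since the flattened size depends on both.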
The final linear layer should dynamically adapt to the actual flattened feature size from the encoder blocks.
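One possible way to get this behavior is sketched below using `torch.nn.LazyLinear`, which infers its input size on the first forward pass. This is only an illustration under stated assumptions: `FlexibleToyEncoder` is a hypothetical toy module, not MONAI's `DiffusionModelEncoder`, and it is not a claim about how the library should or does implement the fix.

```python
import torch
from torch import nn


class FlexibleToyEncoder(nn.Module):
    """Hypothetical toy encoder whose head adapts to the flattened feature size."""

    def __init__(self, in_channels=1, channels=(32, 64, 128)):
        super().__init__()
        layers = []
        prev = in_channels
        for ch in channels:
            # Each stride-2 convolution halves the spatial size.
            layers += [nn.Conv2d(prev, ch, kernel_size=3, stride=2, padding=1), nn.SiLU()]
            prev = ch
        self.features = nn.Sequential(*layers)
        # LazyLinear infers in_features from the first input it sees,
        # so no flattened size is hardcoded here.
        self.out = nn.Sequential(nn.LazyLinear(512), nn.ReLU(), nn.Linear(512, 1))

    def forward(self, x):
        h = self.features(x)
        h = h.reshape(h.shape[0], -1)  # flatten to (batch, features)
        return self.out(h)


model = FlexibleToyEncoder(channels=(32, 64, 128))
with torch.no_grad():
    y = model(torch.randn(2, 1, 64, 64))

print(y.shape)                   # output shape for a batch of 2
print(model.out[0].in_features)  # input size inferred on the first forward pass
```

A lazy layer has uninitialized parameters until the first forward pass, so a dummy forward pass is needed before creating an optimizer or loading a state dict. An alternative with the same effect is to compute the flattened size in the constructor from a known input shape and pass it to a regular `nn.Linear`.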
Environment
================================
Printing MONAI config...
================================
MONAI version: 1.6.dev2525
Numpy version: 1.26.4
Pytorch version: 2.6.0+cu124
MONAI flags: HAS_EXT = False, USE_COMPILED = False, USE_META_DICT = False
MONAI rev id: 56fe5f014f424ad0e6dbc8345515fc49295dd849
MONAI __file__: /usr/local/lib/python3.11/dist-packages/monai/__init__.py
Optional dependencies:
Pytorch Ignite version: 0.5.2
ITK version: NOT INSTALLED or UNKNOWN VERSION.
Nibabel version: 5.3.2
scikit-image version: 0.25.2
scipy version: 1.15.2
Pillow version: 11.1.0
Tensorboard version: 2.18.0
gdown version: 5.2.0
TorchVision version: 0.21.0+cu124
tqdm version: 4.67.1
lmdb version: NOT INSTALLED or UNKNOWN VERSION.
psutil version: 7.0.0
pandas version: 2.2.3
einops version: 0.8.1
transformers version: 4.51.3
mlflow version: NOT INSTALLED or UNKNOWN VERSION.
pynrrd version: NOT INSTALLED or UNKNOWN VERSION.
clearml version: NOT INSTALLED or UNKNOWN VERSION.
For details about installing the optional dependencies, please visit:
https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies
================================
Printing system config...
================================
System: Linux
Linux version: Ubuntu 22.04.4 LTS
Platform: Linux-6.6.56+-x86_64-with-glibc2.35
Processor: x86_64
Machine: x86_64
Python version: 3.11.11
Process name: python3
Command: ['/usr/bin/python3', '-m', 'colab_kernel_launcher', '-f', '/root/.local/share/jupyter/runtime/kernel-2d79cbbb-e64e-40c4-b611-bb0a65b9b06d.json']
Open files: [popenfile(path='/root/.ipython/profile_default/history.sqlite', fd=46, position=0, mode='r+', flags=688130), popenfile(path='/root/.ipython/profile_default/history.sqlite', fd=48, position=0, mode='r+', flags=688130), popenfile(path='/root/.ipython/profile_default/history.sqlite-journal', fd=74, position=0, mode='r+', flags=688130)]
Num physical CPUs: 2
Num logical CPUs: 4
Num usable CPUs: 4
CPU usage (%): [2.6, 3.7, 2.9, 2.7]
CPU freq. (MHz): 2000
Load avg. in last 1, 5, 15 mins (%): [33.9, 15.3, 6.8]
Disk usage (%): 28.7
Avg. sensor temp. (Celsius): UNKNOWN for given OS
Total physical memory (GB): 31.4
Available memory (GB): 29.2
Used memory (GB): 1.7
================================
Printing GPU config...
================================
Num GPUs: 1
Has CUDA: True
CUDA version: 12.4
cuDNN enabled: True
NVIDIA_TF32_OVERRIDE: None
TORCH_ALLOW_TF32_CUBLAS_OVERRIDE: None
cuDNN version: 90100
Current device: 0
Library compiled for CUDA architectures: ['sm_50', 'sm_60', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90']
GPU 0 Name: Tesla P100-PCIE-16GB
GPU 0 Is integrated: False
GPU 0 Is multi GPU board: False
GPU 0 Multi processor count: 56
GPU 0 Total memory (GB): 15.9
GPU 0 CUDA capability (maj.min): 6.0