This repository contains the implementation for our MIDL 2025 paper "Visual Prompt Engineering for Vision Language Models in Radiology".
This project evaluates how different visual prompting techniques affect the performance of vision-language models (VLMs) in radiology image analysis. We assess the impact of four visual annotations (bounding boxes, circles, arrows, and crops) on model performance across several common chest X-ray datasets. We demonstrate that visual markers, particularly a red circle, improve AUROC by up to 0.185.
- Evaluation of multiple vision-language models (BiomedCLIP, BMC_CLIP_CF) on radiology datasets
- Implementation of various visual prompting techniques (illustrated in the sketch below):
  - Bounding boxes
  - Circles
  - Arrows
  - Region cropping
- Support for multiple radiology datasets:
  - PadChest-GR
  - VinDr-CXR
  - NIH14 (ChestX-ray14)
  - JSRT (Japanese Society of Radiological Technology)
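Conceptually, visual prompting here means drawing a marker directly onto the image before it is passed to the VLM. Below is a minimal sketch of the idea using Pillow; the helper is illustrative, not the repository's implementation, and assumes the region of interest is given as pixel coordinates:

```python
from PIL import Image, ImageDraw

def add_circle_prompt(image_path, bbox, color="red", width=3):
    """Draw a colored ellipse around a region of interest (x0, y0, x1, y1)."""
    # Convert to RGB so a colored marker can be drawn on a grayscale X-ray.
    image = Image.open(image_path).convert("RGB")
    draw = ImageDraw.Draw(image)
    # Pillow draws the ellipse inscribed in the given box, so the
    # marker tightly encloses the region of interest.
    draw.ellipse(bbox, outline=color, width=width)
    return image

prompted = add_circle_prompt("chest_xray.png", bbox=(120, 80, 260, 210))
prompted.save("chest_xray_circle.png")
```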
- Clone this repository:

  ```bash
  git clone https://github.com/MIC-DKFZ/VPE-in-Radiology.git
  cd VPE-in-Radiology
  ```
- Make sure you have Python 3.12 installed:

  ```bash
  python --version
  ```
- Install the required packages:

  ```bash
  pip install -r requirements.txt
  ```
- Download the required datasets:
  - PadChest-GR: Download from the BIMCV website (requires registration for the grounded reports version)
  - VinDr-CXR: Download from PhysioNet (requires PhysioNet credentialed access)
  - NIH ChestX-ray14: Download from the NIH
  - JSRT Dataset: Download from the JSRT website (requires registration)
- Configure dataset paths in `default_config.json` to match your local environment. Update the following fields for each dataset:

  ```json
  {
    "PadChestGRTrainDataset": {
      "metadata_file": "/path/to/padchest/metadata.csv",
      "images_dir": "/path/to/padchest/images",
      "grounded_reports": "/path/to/padchest/grounded_reports.json"
    },
    "VinDrCXRTrainDataset": {
      "metadata_file": "/path/to/vindrcxr/annotations_train.csv",
      "images_dir": "/path/to/vindrcxr/pngs/train"
    }
  }
  ```
  Alternatively, you can create a custom config file (e.g., `custom_config.json`) and specify it via the `CONFIG` environment variable:

  ```bash
  export CONFIG=/path/to/custom_config.json
  python src/process.py --model_name BiomedCLIPModel --dataset_name PadChestGRTrain
  ```
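How the `CONFIG` variable is resolved internally is not shown here; the sketch below illustrates one plausible pattern, assuming a JSON config with a fallback to `default_config.json` (the `load_config` helper is hypothetical, not the repository's exact code):

```python
import json
import os

def load_config(default_path="default_config.json"):
    """Load dataset paths, preferring the CONFIG environment variable."""
    # Hypothetical helper: fall back to the default config when CONFIG is unset.
    config_path = os.environ.get("CONFIG", default_path)
    with open(config_path) as f:
        return json.load(f)

config = load_config()
print(config["PadChestGRTrainDataset"]["images_dir"])
```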
Run the core processing script with different parameters:

```bash
python src/process.py --model_name BiomedCLIPModel --dataset_name PadChestGRTrain
```
Add visual annotations to images:

```bash
# Add bounding box
python src/process.py --model_name BiomedCLIPModel --dataset_name PadChestGRTrain --image_annotation_type bbox --color red

# Add circle
python src/process.py --model_name BiomedCLIPModel --dataset_name PadChestGRTrain --image_annotation_type circle --color red

# Add arrow
python src/process.py --model_name BiomedCLIPModel --dataset_name PadChestGRTrain --image_annotation_type arrow --color red

# Use cropping
python src/process.py --model_name BiomedCLIPModel --dataset_name PadChestGRTrain --image_annotation_type crop
```
You can add text annotation suffixes that describe the visual prompts:

```bash
python src/process.py --model_name BiomedCLIPModel --dataset_name PadChestGRTrain --image_annotation_type bbox --color red --text_annotation_suffix ' indicated by a red bounding box'
```
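To compare several annotation settings in one go, the calls above can also be scripted. A minimal sketch follows; the CLI flags match the examples above, but the particular list of settings is an illustrative assumption:

```python
import subprocess

# Annotation settings to compare; the suffix describes the marker in text.
settings = [
    ("bbox", "red", " indicated by a red bounding box"),
    ("circle", "red", " indicated by a red circle"),
    ("arrow", "red", " indicated by a red arrow"),
    ("crop", None, None),
]

for annotation, color, suffix in settings:
    cmd = [
        "python", "src/process.py",
        "--model_name", "BiomedCLIPModel",
        "--dataset_name", "PadChestGRTrain",
        "--image_annotation_type", annotation,
    ]
    if color:
        cmd += ["--color", color]
    if suffix:
        cmd += ["--text_annotation_suffix", suffix]
    subprocess.run(cmd, check=True)
```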
To reproduce all experiments from our paper, run:

```bash
bash reproduce_paper.sh
```
- `src/`: Core source code
  - `models.py`: Implementation of vision-language models
  - `process.py`: Main processing script for running experiments
  - `datasets/`: Dataset-specific implementations
    - `base_dataloader.py`: Base class for all datasets
    - `padchestgr/`, `vindrcxr/`, `nih14/`, `jsrt/`: Dataset-specific modules
  - `utils/`: Utility functions
- `experiments/`: Output directory for experiment results
- `default_config.json`: Configuration file for dataset paths
- `requirements.txt`: Required Python packages
- `reproduce_paper.sh`: Script to reproduce all experiments from the paper
Experiment results are saved under the `experiments/` directory with the following structure:

```
experiments/
  {DATASET_NAME}/
    {MODEL_NAME}/
      {ANNOTATION_TYPE}/
        {COLOR}/
          {LINE_WIDTH}/
            {TEXT_ANNOTATION}/
              metrics.csv
              metrics.json
              results.csv
```
Each experiment directory contains:

- `metrics.json`: Detailed performance metrics
- `metrics.csv`: Summary of performance metrics in CSV format
- `results.csv`: Raw prediction results for each image
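Because every run writes to a predictable path, results can be collected across experiments with a short script. Below is a sketch that assumes each `metrics.json` is a flat mapping of metric names to values (the exact keys are not specified here):

```python
import json
from pathlib import Path

# Walk the experiments tree and collect one row per run.
rows = []
for metrics_file in Path("experiments").rglob("metrics.json"):
    with open(metrics_file) as f:
        metrics = json.load(f)  # assumed shape: {"AUROC": 0.87, ...}
    # The parent path encodes dataset, model, and annotation settings,
    # mirroring the directory layout shown above.
    rows.append({"experiment": str(metrics_file.parent), **metrics})

for row in sorted(rows, key=lambda r: r["experiment"]):
    print(row)
```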
If you use this code or our findings in your research, please cite our paper:
```bibtex
@inproceedings{denner2025visual,
    title={Visual Prompt Engineering for Vision Language Models in Radiology},
    author={Denner, Stefan and Bujotzek, Markus Ralf and Bounias, Dimitrios and Zimmerer, David and Stock, Raphael and Maier-Hein, Klaus},
    booktitle={Medical Imaging with Deep Learning},
    year={2025}
}
```