HiLM-D: Enhancing MLLMs with Multi-Scale High-Resolution Details for Autonomous Driving,
Xinpeng Ding, Jianhua Han, Hang Xu, Wei Zhang and Xiaomeng Li, arXiv preprint
In this paper, we extend DRAMA to DRAMA-ROLISP (Risk Object Localization and Intention and Suggestion Prediction), which includes an additional planning description.
You can download the images from DRAMA
You can also download the annotations from Google Drive
If you find our work useful in your research, please consider citing our paper:
@article{ding2025hilm,
title={HiLM-D: Enhancing MLLMs with Multi-Scale High-Resolution Details for Autonomous Driving},
author={Ding, Xinpeng and Han, Jianhua and Xu, Hang and Zhang, Wei and Li, Xiaomeng},
journal={International Journal of Computer Vision},
year={2025}
}
@article{ding2023hilm,
title={Hilm-d: Towards high-resolution understanding in multimodal large language models for autonomous driving},
author={Ding, Xinpeng and Han, Jianhua and Xu, Hang and Zhang, Wei and Li, Xiaomeng},
journal={arXiv preprint arXiv:2309.05186},
year={2023}
}
We thank the open-source projects.