This is the official GitHub repository for the paper:
Sushil Awale, Eric Müller-Budack, Ralph Ewerth: "Patent Figure Classification using Large Vision-language Models". In: European Conference on Information Retrieval (ECIR), Lucca, Italy, 2025, Lecture Notes in Computer Science, vol 15573. Springer, Cham.
- Extended CLEF-IP - https://doi.org/10.5281/zenodo.10019328
- DeepPatent2 - https://doi.org/10.7910/DVN/UG4SBD
For more details, see dataset/README.md.
The dataset can be downloaded directly from Zenodo.
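As a rough illustration of the Zenodo download step, the sketch below turns the Extended CLEF-IP DOI listed above into a Zenodo records API URL and lists the files attached to that record. It is not part of this repository. The helper names (`zenodo_record_id`, `zenodo_record_url`, `list_files`) are hypothetical, and the `https://zenodo.org/api/records/<id>` endpoint layout and the `key` / `links.self` fields of each file entry are assumptions about Zenodo's public REST API that should be checked against its documentation.

```python
import json
import re
import urllib.request

# DOI of the Extended CLEF-IP dataset, as listed in this README.
ZENODO_DOI = "10.5281/zenodo.10019328"


def zenodo_record_id(doi: str) -> str:
    """Extract the numeric record id from a Zenodo DOI (bare or as a doi.org URL)."""
    match = re.fullmatch(
        r"(?:https?://doi\.org/)?10\.5281/zenodo\.(\d+)", doi.strip()
    )
    if match is None:
        raise ValueError(f"not a Zenodo DOI: {doi!r}")
    return match.group(1)


def zenodo_record_url(doi: str) -> str:
    """Build the records API URL for a Zenodo DOI (assumed endpoint layout)."""
    return f"https://zenodo.org/api/records/{zenodo_record_id(doi)}"


def list_files(doi: str) -> list[tuple[str, str]]:
    """Return (file name, download URL) pairs for a record. Needs network access."""
    with urllib.request.urlopen(zenodo_record_url(doi)) as response:
        record = json.load(response)
    # Assumption: each file entry exposes its name under "key" and its
    # download link under "links" -> "self".
    return [(f["key"], f["links"]["self"]) for f in record.get("files", [])]
```

Calling `list_files(ZENODO_DOI)` should yield the archive names and their download links, which can then be fetched with `urllib.request.urlretrieve` or any other download tool. The DeepPatent2 DOI points to a different host, so `zenodo_record_id` rejects it with a `ValueError`.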
For finetuning of
For all CNN-based baselines, see baselines/README.md.
For all LVLM-based classification, see classifier/README.md.
@InProceedings{10.1007/978-3-031-88711-6_2,
author="Awale, Sushil
and M{\"u}ller-Budack, Eric
and Ewerth, Ralph",
title="Patent Figure Classification Using Large Vision-Language Models",
booktitle="Advances in Information Retrieval",
year="2025",
publisher="Springer Nature Switzerland",
address="Cham",
pages="20--37",
isbn="978-3-031-88711-6",
doi="https://doi.org/10.1007/978-3-031-88711-6_2"
}
This work is published under the GNU GENERAL PUBLIC LICENSE Version 3, 29 June 2007. For details please check the LICENSE file in the repository.