How to train a recognition model with multiple boxes per image? #1901

Answered by felixT2K
haimat asked this question in Q&A
Correct: for recognition, one image == one word, so yes, you have to crop the words out of your pages using the detection labels :)

https://github.com/mindee/doctr/tree/main/references/recognition#data-format

You need an images folder with all the word crop images and one labels.json. In the end you have this twice: one images folder + labels.json for train and one images folder + labels.json for val :)

├── images
│   ├── img_1.jpg
│   ├── img_2.jpg
│   ├── img_3.jpg
│   └── ...
└── labels.json

{
    "img_1.jpg": "I",
    "img_2.jpg": "am",
    "img_3.jpg": "a",
    "img_4.jpg": "Jedi",
    "img_5.jpg": "!",
    ...
}

reference datasets can be found here: #1654
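
If it helps, here is a minimal sketch of the cropping step. It is not part of docTR. The `annotations` structure (page file name mapped to a list of `{"box": [xmin, ymin, xmax, ymax], "text": ...}` entries in absolute pixels), the function names and the `<page>_<index>.jpg` naming scheme are all assumptions made for illustration, so adapt them to however your detection labels and transcriptions are actually stored. Pillow is assumed for the cropping itself.

```python
import json
from pathlib import Path


def plan_word_crops(annotations):
    """Turn per-page word annotations into a crop plan and a labels mapping.

    `annotations` is a hypothetical structure, not a docTR format:
    {"page_1.jpg": [{"box": [xmin, ymin, xmax, ymax], "text": "I"}, ...]}
    with boxes given in absolute pixel coordinates.
    """
    plan = []    # (source page name, box, crop file name)
    labels = {}  # crop file name -> transcription, same shape as labels.json
    for page_name, words in annotations.items():
        stem = Path(page_name).stem
        for idx, word in enumerate(words):
            text = word["text"]
            if not text:  # skip boxes that have no transcription
                continue
            crop_name = f"{stem}_{idx}.jpg"
            plan.append((page_name, tuple(word["box"]), crop_name))
            labels[crop_name] = text
    return plan, labels


def write_recognition_set(pages_dir, out_dir, annotations):
    """Crop every word box and write images/ plus labels.json under out_dir."""
    from PIL import Image  # Pillow, imported lazily: only needed for cropping

    plan, labels = plan_word_crops(annotations)
    images_dir = Path(out_dir) / "images"
    images_dir.mkdir(parents=True, exist_ok=True)
    for page_name, box, crop_name in plan:
        with Image.open(Path(pages_dir) / page_name) as page:
            # Image.crop takes (left, upper, right, lower) in pixels
            page.convert("RGB").crop(box).save(images_dir / crop_name)
    with open(Path(out_dir) / "labels.json", "w", encoding="utf-8") as f:
        json.dump(labels, f, ensure_ascii=False, indent=4)
```

You would call `write_recognition_set` once for the train pages and once for the val pages, each with its own output folder, to end up with the two images + labels.json sets.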

Answer selected by haimat