Adjust clamping for rotated bboxes #9112

AntoineSimoulin · 2025-06-20T20:49:46Z

Adjust clamping for Rotated Boxes

This PR is a follow-up to #9104, aiming to address inconsistencies in the clamping function and improve its intuitiveness. The initial approach for clamping rotated bounding boxes focused on finding the largest angle-preserving box enclosed within the original box and the image canvas. However, as illustrated in Figure 2, this method can lead to non-intuitive results where the box does not fully enclose the underlying object. To address this issue, this PR proposes an adjustment to the clamping function. Instead of seeking the largest angle-preserving box, we now aim to find the smallest angle-preserving box that encloses the intersection of the original box and the image canvas. This change ensures that the resulting box is more intuitive.

These adjustments have some key implications. With this new approach, clamped rotated boxes may have vertices outside the canvas. However, the center of the bounding box is guaranteed to remain within the canvas. This PR addresses #8254 by ensuring that rotated bounding boxes SHOULD be clamped (consistent with un-rotated boxes). Crucially, as illustrated in Figure 1, the clamping operation preserves the original box's pixel assignments within the image canvas, ensuring that no information is lost during the process.
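The new rule can be pictured with the following minimal sketch in plain Python. It is not the torchvision implementation: the helper names (`box_corners`, `clip_to_canvas`, `clamp_rotated_box`) are hypothetical, and the angle convention (degrees, rotating the box's width axis from the x-axis toward the y-axis) is an assumption made for illustration. The sketch clips the box to the canvas, then measures the clipped polygon along the box's own axes, which yields the smallest box with the same angle that encloses the intersection.

```python
import math


def box_corners(cx, cy, w, h, angle_deg):
    """Four corners of a rotated box (angle convention is an assumption)."""
    a = math.radians(angle_deg)
    ux, uy = math.cos(a), math.sin(a)  # unit vector along the box width
    vx, vy = -uy, ux                   # unit vector along the box height
    return [
        (cx + sx * w / 2 * ux + sy * h / 2 * vx,
         cy + sx * w / 2 * uy + sy * h / 2 * vy)
        for sx, sy in ((-1, -1), (1, -1), (1, 1), (-1, 1))
    ]


def clip_to_canvas(poly, width, height):
    """Sutherland-Hodgman clipping of a convex polygon to [0, width] x [0, height]."""
    edges = ((0, 0.0, True), (0, float(width), False),
             (1, 0.0, True), (1, float(height), False))
    for axis, bound, keep_ge in edges:
        def inside(p):
            return p[axis] >= bound if keep_ge else p[axis] <= bound

        out = []
        for i, cur in enumerate(poly):
            prev = poly[i - 1]
            if inside(cur) != inside(prev):
                # The edge prev -> cur crosses the canvas border: keep the crossing point.
                t = (bound - prev[axis]) / (cur[axis] - prev[axis])
                out.append((prev[0] + t * (cur[0] - prev[0]),
                            prev[1] + t * (cur[1] - prev[1])))
            if inside(cur):
                out.append(cur)
        poly = out
        if not poly:
            break
    return poly


def clamp_rotated_box(cx, cy, w, h, angle_deg, width, height):
    """Smallest box with the same angle enclosing (box intersected with canvas)."""
    poly = clip_to_canvas(box_corners(cx, cy, w, h, angle_deg), width, height)
    if not poly:
        return None  # box entirely outside the canvas; left open in this sketch
    a = math.radians(angle_deg)
    ux, uy = math.cos(a), math.sin(a)
    vx, vy = -uy, ux
    # Project the clipped polygon onto the box axes and take the extents.
    us = [x * ux + y * uy for x, y in poly]
    vs = [x * vx + y * vy for x, y in poly]
    u_mid, v_mid = (min(us) + max(us)) / 2, (min(vs) + max(vs)) / 2
    return (u_mid * ux + v_mid * vx, u_mid * uy + v_mid * vy,
            max(us) - min(us), max(vs) - min(vs), angle_deg)
```

For a 60×20 box at 45° centered at (10, 50) in a 100×100 canvas, this sketch shortens the width while keeping the height and angle. One vertex of the result still has a negative x coordinate, while the center stays inside the canvas.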

Details of the adjustments

This PR implements in particular the following modifications:

- Modify the conditions in the clamping function so that the resulting box completely encapsulates the portion of the input box that lies within the canvas. The output of the clamping operation is the smallest angle-preserving box that encloses the intersection of the original box and the image canvas.
- Modify `elastic_bounding_boxes` for rotated boxes to use the "CXCYWHR" format instead of "XYXYXYXY". The elastic transform needs the transformed points to be within the canvas size. This holds for the center of rotated boxes but not necessarily for all vertices.
- Fix `_order_bounding_boxes_points` in the case of largest negative values along the y-axis.
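The reason for preferring the center over the vertices can be sketched as follows. This is only an illustration: `lookup_displacement` is a hypothetical helper, the nearest-pixel lookup is an assumption (how torchvision samples the displacement field is not described in this thread), and the coordinates are made-up values for a clamped box whose center is inside a 100×100 canvas while one vertex is not.

```python
def lookup_displacement(field, x, y):
    """Nearest-pixel lookup in a dense displacement field (hypothetical helper)."""
    height, width = len(field), len(field[0])
    col, row = int(round(x)), int(round(y))
    if not (0 <= col < width and 0 <= row < height):
        raise IndexError(f"point ({x}, {y}) lies outside the {width}x{height} canvas")
    return field[row][col]


# A 100x100 field with a constant (dx, dy) displacement, for illustration only.
field = [[(1.5, -2.0)] * 100 for _ in range(100)]

center = (12.07, 52.07)  # box center: inside the canvas
vertex = (-14.14, 40.0)  # one box vertex: outside the canvas

# The center can always be displaced.
dx, dy = lookup_displacement(field, *center)
moved_center = (center[0] + dx, center[1] + dy)

# The vertex has no entry in the field.
try:
    lookup_displacement(field, *vertex)
    vertex_ok = True
except IndexError:
    vertex_ok = False
```

Working in a center-based format means only one point per box has to be looked up, and that point is the one the clamping keeps inside the canvas.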

Illustration of the adjustments

We illustrate the adjustments to the clamping function using this example image. The clamping should be more intuitive and should prevent loss of information.

Figure 1: Illustration of the clamping adjustments (original box in grey and corresponding clamped box in blue).

Figure 2: Illustration of the clamping BEFORE this PR.

Figure 3: Illustration of the clamping AFTER this PR.

Test plan

Please run the following tests:

```bash
pytest test/test_transforms_v2.py -k box -v
...
2372 passed, 1432 skipped, 5025 deselected in 46.08s
```

pytorch-bot · 2025-06-20T20:49:54Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/9112

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Unrelated Failure

As of commit c6b365b with merge base 6bbe010:

NEW FAILURE - The following job has failed:

Tests / unittests-linux (3.9, linux.g5.4xlarge.nvidia.gpu, cuda, 11.8) / linux-job (gh)
RuntimeError: Command docker exec -t da2c47cce5c5e78c4d41ecbf26961f5da1e001c763dc764ec33513e5b0404529 /exec failed with exit code 1

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

CMake / windows (windows.g5.4xlarge.nvidia.gpu, cuda, 11.8) / windows-job (gh) (trunk failure)
Process completed with exit code 1.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

NicolasHug · 2025-06-23T12:25:20Z

Thanks for the PR @AntoineSimoulin, and for the detailed pictures!

It's clear from Figure 2 that our current clamping strategy leads to sub-optimal boxes. Out of curiosity, could you share the transformations that were used in each result? I suspect that the more transforms are used, the more clamping happens, and thus more information is lost.

The clamping strategy proposed in this PR allows for some corners of the box to be outside of the image canvas. That makes me wonder: what do we actually want from a clamping operation? Do we want the corners to be within the canvas, or do we only need the center of the box to be within the canvas?

My current understanding is that there is a spectrum of clamping strategies:

- No clamping at all. This is what retains the most information.
- A strict clamping, where we force all of the box points to be in the canvas, as implemented in main. Potentially, a lot of information is lost.
- A more lenient clamping as in this PR, which seems to be an intermediate strategy between the two strategies above: we lose less information than with strict clamping, but we may still have points outside of the canvas.

I do agree that the clamping in this PR leads to less surprising results than the strict clamping we have in main. Maybe we could expose it as one of multiple clamping strategies. However, since it still results in points outside the canvas and some information loss, I wonder if users wouldn't prefer the no-clamping strategy in general?

AntoineSimoulin · 2025-06-23T14:38:44Z

> Out of curiosity, could you share the transformations that were used in each result?

Figures 2 and 3 were obtained by applying a CenterCrop transformation with sizes 300, 500, 1000, and the original image size.
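As a rough sketch of why smaller crops trigger more clamping: a center crop only translates the box center, so more of the box ends up outside the cropped canvas as the size shrinks. The image size and box below are hypothetical (the thread does not give them), the helpers are illustrative rather than torchvision functions, and the sketch assumes the crop is no larger than the image; the exact rounding in torchvision may differ.

```python
def center_crop_offsets(image_height, image_width, size):
    """Top/left offsets of a centered square crop (assumes size <= image dimensions)."""
    top = int(round((image_height - size) / 2.0))
    left = int(round((image_width - size) / 2.0))
    return top, left


def crop_rotated_box(box, top, left):
    """Translate a (cx, cy, w, h, angle) box into the crop's coordinate frame."""
    cx, cy, w, h, angle = box
    return (cx - left, cy - top, w, h, angle)


image_height, image_width = 800, 1000    # hypothetical image size
box = (400.0, 300.0, 250.0, 80.0, 30.0)  # hypothetical rotated box, CXCYWHR order

for size in (300, 500):
    top, left = center_crop_offsets(image_height, image_width, size)
    # Width, height and angle are unchanged; only the center moves.
    print(size, crop_rotated_box(box, top, left))
```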

> My current understanding is that there is a spectrum of clamping strategies

Yeah I do agree with the proposed breakdown.

> Maybe we could expose it as one of multiple clamping strategies. However, since it still results in points outside the canvas and some information loss, I wonder if users wouldn't prefer the no-clamping strategy in general?

As illustrated in Figure 1, I feel the strategy proposed in this PR offers the best trade-off. For instance, in object detection, it would be very difficult for a model to predict a vertex very far outside the canvas boundaries. This transformation also ensures that the center of the box is within the image canvas, so we should be able to apply any transformation without error. Finally, contrary to stricter clamping, we do not lose information: all pixels within the canvas that are assigned to the object remain within the bounding box.

I would prefer to make this strategy the default and not keep implementations of the other options for now, to keep the codebase simple. Let me know what you think!

AntoineSimoulin added 2 commits June 20, 2025 13:32

Adjust rotated clamping conditions

2a361ef

Test Plan: ```bash pytest test/test_transforms_v2.py -k box -v ```

apply linting

4261ed3

facebook-github-bot added the cla signed label Jun 20, 2025

Merge branch 'main' into rotated-bboxes-transforms

c6b365b
