Commit 4715de9

Update docs to mention rotated boxes and keypoints
1 parent 342eb92 commit 4715de9

3 files changed: +26 -24 lines changed

docs/source/transforms.rst

Lines changed: 14 additions & 8 deletions
@@ -1,14 +1,20 @@
 .. _transforms:
 
-Transforming and augmenting images
-==================================
+Transforming images, videos, boxes and more
+===========================================
 
 .. currentmodule:: torchvision.transforms
 
 Torchvision supports common computer vision transformations in the
-``torchvision.transforms`` and ``torchvision.transforms.v2`` modules. Transforms
-can be used to transform or augment data for training or inference of different
-tasks (image classification, detection, segmentation, video classification).
+``torchvision.transforms.v2`` module. Transforms can be used to transform and
+augment data, for both training or inference. The following objects are
+supported:
+
+- Images as pure tensors, :class:`~torchvision.tv_tensors.Image` or PIL image
+- Videos as :class:`~torchvision.tv_tensors.Video`
+- Aligned and rotated bounding boxes as :class:`~torchvision.tv_tensors.BoundingBoxes`
+- Segmentation and detection masks as :class:`~torchvision.tv_tensors.Mask`
+- KeyPoints as :class:`~torchvision.tv_tensors.KeyPoints`.
 
 .. code:: python
 
@@ -111,9 +117,9 @@ In Torchvision 0.15 (March 2023), we released a new set of transforms available
 in the ``torchvision.transforms.v2`` namespace. These transforms have a lot of
 advantages compared to the v1 ones (in ``torchvision.transforms``):
 
-- They can transform images **but also** bounding boxes, masks, or videos. This
-provides support for tasks beyond image classification: detection, segmentation,
-video classification, etc. See
+- They can transform images **but also** bounding boxes, masks, videos and
+keypoints. This provides support for tasks beyond image classification:
+detection, segmentation, video classification, etc. See
 :ref:`sphx_glr_auto_examples_transforms_plot_transforms_getting_started.py`
 and :ref:`sphx_glr_auto_examples_transforms_plot_transforms_e2e.py`.
 - They support more transforms like :class:`~torchvision.transforms.v2.CutMix`

gallery/transforms/plot_transforms_getting_started.py

Lines changed: 7 additions & 6 deletions
@@ -79,12 +79,12 @@
 # very easy: the v2 transforms are fully compatible with the v1 API, so you
 # only need to change the import!
 #
-# Detection, Segmentation, Videos
+# Videos, boxes, masks, keypoints
 # -------------------------------
 #
-# The new Torchvision transforms in the ``torchvision.transforms.v2`` namespace
-# support tasks beyond image classification: they can also transform bounding
-# boxes, segmentation / detection masks, or videos.
+# The Torchvision transforms in the ``torchvision.transforms.v2`` namespace
+# support tasks beyond image classification: they can also transform rotated or
+# aligned bounding boxes, segmentation / detection masks, videos, and keypoints.
 #
 # Let's briefly look at a detection example with bounding boxes.
 
@@ -129,8 +129,9 @@
 # TVTensors are :class:`torch.Tensor` subclasses. The available TVTensors are
 # :class:`~torchvision.tv_tensors.Image`,
 # :class:`~torchvision.tv_tensors.BoundingBoxes`,
-# :class:`~torchvision.tv_tensors.Mask`, and
-# :class:`~torchvision.tv_tensors.Video`.
+# :class:`~torchvision.tv_tensors.Mask`,
+# :class:`~torchvision.tv_tensors.Video`, and
+# :class:`~torchvision.tv_tensors.KeyPoints`.
 #
 # TVTensors look and feel just like regular tensors - they **are** tensors.
 # Everything that is supported on a plain :class:`torch.Tensor` like ``.sum()``

torchvision/tv_tensors/_keypoints.py

Lines changed: 5 additions & 10 deletions
@@ -13,19 +13,14 @@ class KeyPoints(TVTensor):
 
 Each point is represented by its X and Y coordinates along the width and height dimensions, respectively.
 
-KeyPoints can be converted from :class:`torchvision.tv_tensors.BoundingBoxes`
-by :func:`torchvision.transforms.v2.functional.convert_bounding_boxes_to_points`.
-
 KeyPoints may represent any object that can be represented by sequences of 2D points:
 
 - `Polygonal chains <https://en.wikipedia.org/wiki/Polygonal_chain>`_,
-including polylines, Bézier curves, etc., which should be of shape
-``[N_chains, N_points, 2]``, which is equal to ``[N_chains, N_segments +
-1, 2]``
-- Polygons, which should be of shape ``[N_polygons, N_points, 2]``, which is
-equal to ``[N_polygons, N_sides, 2]``
-- Skeletons, which could be of shape ``[N_skeletons, N_bones, 2, 2]`` for
-pose-estimation models
+including polylines, Bézier curves, etc., which can be of shape
+``[N_chains, N_points, 2]``
+- Polygons, which can be of shape ``[N_polygons, N_points, 2]``
+- Skeletons, which can be of shape ``[N_skeletons, N_bones, 2, 2]`` for
+pose-estimation models.
 
 .. note::
 Like for :class:`torchvision.tv_tensors.BoundingBoxes`, there should
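
The shape conventions listed in the docstring hunk above can be made concrete with plain nested lists. This is only an illustrative sketch: ``nested_shape`` is a hypothetical helper written for this example, not a torchvision function, and the coordinates are invented. Plain lists are used so the sketch does not depend on any particular torchvision version.

```python
def nested_shape(data):
    """Return the shape of a regular (non-ragged) nested list."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        if not data:  # stop at an empty list instead of indexing into it
            break
        data = data[0]
    return tuple(shape)


# Two triangles, following the [N_polygons, N_points, 2] layout.
polygons = [
    [[0, 0], [4, 0], [0, 3]],
    [[1, 1], [5, 1], [1, 4]],
]

# One skeleton with two bones, each bone a (start, end) pair of 2D points,
# following the [N_skeletons, N_bones, 2, 2] layout.
skeletons = [
    [
        [[2, 2], [2, 5]],
        [[2, 5], [4, 7]],
    ],
]

print(nested_shape(polygons))   # (2, 3, 2)
print(nested_shape(skeletons))  # (1, 2, 2, 2)
```

In every layout the last dimension has size 2, holding the X and Y coordinates of a point. In a torchvision release that ships ``KeyPoints``, such data would presumably be wrapped together with a ``canvas_size``, but that call is not exercised here.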
