We now describe how OFASys implements denoising diffusion probabilistic modeling.
Here we assume the dataset is a table, where each sample record contains two fields.
Field "mocap" of modality "MOTION" is a BVH file containing motion capture data,
while field "text" of modality "TEXT" is a text sentence describing the captured motion, e.g., "a person walks four steps backward".
Similarly, we can replace "text" with other modalities such as "[AUDIO:...]", "[IMAGE:...]", "[VIDEO:...]" to implement various kinds of conditional synthesis tasks.
We can also simply replace the text with an empty string to implement the task of unconditional motion synthesis, a.k.a. motion prediction.
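To make the assumed record layout concrete, here is a minimal plain-Python sketch (no OFASys dependency; the file paths are hypothetical placeholders, and only the field names "mocap" and "text" come from the description above):

.. code-block:: python

    # A sketch of one sample record from the assumed tabular dataset.
    # "mocap" holds the MOTION-modality BVH file; "text" holds the
    # TEXT-modality caption. Paths are hypothetical placeholders.
    sample = {
        "mocap": "data/walk_backward.bvh",
        "text": "a person walks four steps backward",
    }

    # Setting the text condition to an empty string corresponds to
    # unconditional motion synthesis (motion prediction).
    unconditional_sample = {
        "mocap": "data/run.bvh",
        "text": "",
    }

Replacing the "text" field with an audio, image, or video field would likewise correspond to the other conditional synthesis variants mentioned above.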
Example 1: Inference without classifier-free guidance. This usage is much simpler and more concise than the classifier-free guided approach described later (see Example 2). However, the generated results tend to correlate poorly with the text prompts. It is thus *NOT* recommended.
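For context, classifier-free guidance (the approach of Example 2) combines a conditional and an unconditional denoising prediction at each sampling step; without it, sampling uses the conditional prediction alone, which is why the condition is followed only weakly. A minimal NumPy sketch of the standard combination rule (the function name and guidance scale ``w`` are illustrative, not OFASys API):

.. code-block:: python

    import numpy as np

    def cfg_combine(eps_cond, eps_uncond, w=2.5):
        """Classifier-free guidance: extrapolate from the unconditional
        prediction toward the conditional one by guidance scale w.
        w = 0 is purely unconditional; w = 1 is purely conditional."""
        return eps_uncond + w * (eps_cond - eps_uncond)

    # Toy example: with w > 1 the guided prediction over-emphasizes
    # the direction suggested by the text condition.
    eps_u = np.zeros(3)
    eps_c = np.ones(3)
    guided = cfg_combine(eps_c, eps_u, w=2.0)  # array of 2.0s

Larger values of ``w`` typically tighten the correlation with the text prompt at some cost in motion diversity, which is the trade-off behind the recommendation above.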
The saved result, "run_then_jump__guided.bvh", is in the `BVH <https://research.cs.wisc.edu/graphics/Courses/cs-838-1999/Jeff/BVH.html>`_ file format and can be imported into 3D animation software such as `Blender <https://www.blender.org/>`_ for rendering: