Commit ae809a0: Add closed caption docs

---
id: closed-captions
title: Closed captions
description: How to add closed captions to your calls
---

The Stream API supports adding real-time closed captions (subtitles for participants) to your calls. This guide shows you how to implement this feature on the client side.

## Call and call type settings

The closed caption feature is controlled by the `closed_caption_mode` setting, which can take the following values:

- `available`: the feature is available for your call and can be enabled.
- `disabled`: the feature is not available for your call. In this case, it's a good idea to hide any UI elements related to closed captions.
- `auto-on`: the feature is available and will be enabled automatically once the user is connected to the call.

This setting can be configured on the call or the call type level.
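
For example, assuming your SDK version exposes `closed_caption_mode` under `settings_override.transcription`, a call-level override might look like the sketch below; call type defaults are typically managed from the dashboard or a server-side SDK:

```typescript
// A sketch: enabling closed captions for a single call via a settings override.
// `closed_caption_mode` under `settings_override.transcription` is assumed to be
// available in your SDK version.
await call.update({
  settings_override: {
    transcription: {
      closed_caption_mode: 'available',
    },
  },
});
```
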
You can check the current value like this:

```typescript
console.log(call.state.settings?.transcription.closed_caption_mode);
```
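
If your UI needs to react to changes of this value (for example, to hide the caption toggle when the feature is disabled), you can use the client's reactive state instead of a one-off snapshot. A minimal sketch, assuming the `call.state.settings$` observable is available in your SDK version; `toggleCaptionsButton` is a hypothetical UI helper:

```typescript
// Hypothetical UI helper, not part of the SDK
declare function toggleCaptionsButton(visible: boolean): void;

// A sketch: reacting to closed caption setting changes
call.state.settings$.subscribe((settings) => {
  const mode = settings?.transcription.closed_caption_mode;
  // Hide any caption-related UI when the feature is disabled
  toggleCaptionsButton(mode !== 'disabled');
});
```
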
## Closed caption events

If closed captions are enabled for a given call, you'll receive the captions in the `call.closed_caption` WebSocket events. Below, you can find an example payload:

```
{
  "type": "call.closed_caption",
  "created_at": "2024-09-25T12:22:25.067005915Z",
  "call_cid": "default:test",
  "closed_caption": {
    "text": "Thank you, guys, for listening.",
    // When the speaker started saying the caption
    "start_time": "2024-09-25T12:22:21.310735726Z",
    // When the speaker finished saying the caption
    "end_time": "2024-09-25T12:22:24.310735726Z",
    "speaker_id": "zitaszuperagetstreamio"
  }
}
```
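
To consume these events, subscribe with `call.on`. A minimal sketch for logging captions as they arrive; it assumes, as is typical for the JS client's event API, that `call.on` returns an unsubscribe function:

```typescript
import { ClosedCaptionEvent } from '@stream-io/video-client';

// Log each caption as it arrives
const off = call.on('call.closed_caption', (event: ClosedCaptionEvent) => {
  const { speaker_id, text } = event.closed_caption;
  console.log(`${speaker_id}: ${text}`);
});

// Later, when captions are no longer needed:
off();
```
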
## Displaying the captions

When displaying closed captions, we should make sure that they are shown in near real time (a sentence from 30 seconds ago has very little use in a conversation) and that they stay visible long enough for participants to read them.

Below is an example implementation:

```typescript
import {
  CallClosedCaption,
  ClosedCaptionEvent,
} from '@stream-io/video-client';

// The captions queue
let captions: (CallClosedCaption & { speaker_name?: string })[] = [];
// The maximum number of captions that can be visible on the screen
const numberOfCaptionsVisible = 2;
// A single caption can stay visible on the screen for this duration
// This is the maximum duration; new captions can push a caption off the screen sooner
const captionTimeoutMs = 2700;

// Subscribe to call.closed_caption events
call.on('call.closed_caption', (event: ClosedCaptionEvent) => {
  const caption = event.closed_caption;
  // It's possible to receive the same caption twice, so make sure to filter out duplicates
  const isDuplicate = captions.find(
    (c) =>
      c.speaker_id === caption.speaker_id &&
      c.start_time === caption.start_time,
  );
  if (!isDuplicate) {
    // Look up the speaker's name based on the user id
    const speaker = call.state.participants.find(
      (p) => p.userId === caption.speaker_id,
    );
    const speakerName = speaker?.name || speaker?.userId;
    // Add the caption to the queue
    captions.push({ ...caption, speaker_name: speakerName });
    // Update the UI
    updateDisplayedCaptions();
    // A caption can stay visible for at most captionTimeoutMs; after that,
    // we remove the oldest caption (unless newer captions have already pushed it out)
    setTimeout(() => {
      captions = captions.slice(1);
      updateDisplayedCaptions();
    }, captionTimeoutMs);
  }
});

const updateDisplayedCaptions = () => {
  // The default implementation shows the last two captions
  const displayedCaptions = captions.slice(-1 * numberOfCaptionsVisible);
  const captionsHTML = displayedCaptions
    .map((c) => `<b>${c.speaker_name}:</b> ${c.text}`)
    .join('<br>');
  // Render captionsHTML in your UI, for example:
  // document.querySelector('.captions')!.innerHTML = captionsHTML;
};
```

:::note
Since the closed caption WebSocket event contains `start_time` and `end_time` fields, you can subtract the two to find out how long it took the speaker to say the caption. You can then use this duration to control how long the text stays visible on the screen. This keeps the captions as real-time as possible, but it might not leave participants enough time to read the text.
:::
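
A sketch of deriving such a duration; the 1500 ms minimum readable duration is an arbitrary assumption:

```typescript
import { CallClosedCaption } from '@stream-io/video-client';

// A sketch: keep a caption visible at least as long as it took to say it,
// clamped to an assumed minimum readable duration (1500 ms is arbitrary)
const displayDurationMs = (caption: CallClosedCaption) => {
  const spokenMs =
    new Date(caption.end_time).getTime() -
    new Date(caption.start_time).getTime();
  return Math.max(spokenMs, 1500);
};
```

You could then pass this value to `setTimeout` instead of the fixed `captionTimeoutMs` used above.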

## See it in action

To see it all in action, check out our TypeScript sample application on [GitHub](https://github.com/GetStream/stream-video-js/tree/main/sample-apps/client/ts-quickstart) or in [Codesandbox](https://codesandbox.io/p/sandbox/eloquent-glitter-99th3v).
