Commit ed47579

docs: closed captions (#1497)
I'll add the docs if the code looks good :)
1 parent 1d26c9b commit ed47579

5 files changed

+251
-26
lines changed
Lines changed: 111 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,111 @@
1+
---
2+
id: closed-captions
3+
title: Closed captions
4+
description: How to add closed captions to your calls
5+
---
6+
7+
The Stream API supports adding real-time closed captions (subtitles for participants) to your calls. This guide shows you how to implement this feature on the client side.
8+
9+
## Call and call type settings
10+
11+
The closed caption feature can be controlled with the following options:
12+
13+
- `available`: the feature is available for your call and can be enabled.
14+
- `disabled`: the feature is not available for your call. In this case, hide any UI elements related to closed captions.
15+
- `auto-on`: the feature is available and will be enabled automatically once the user is connected to the call.
16+
17+
This setting can be configured at the call or call type level.
18+
19+
You can check the current value like this:
20+
21+
```typescript
22+
console.log(call.state.settings?.transcription.closed_caption_mode);
23+
```
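
As a minimal sketch of how the three modes might drive the UI, the helper below maps a mode value to two flags: whether a caption toggle should be shown, and whether captions start enabled. The `CaptionUiState` type and the `captionUiState` function are hypothetical names used for illustration and are not part of the SDK; only the three mode strings come from the list above.

```typescript
// Illustrative result type (an assumption for this sketch, not an SDK type).
interface CaptionUiState {
  showToggle: boolean;
  enabledByDefault: boolean;
}

// Hypothetical helper: translate the closed caption mode into UI flags.
function captionUiState(mode: string | undefined): CaptionUiState {
  switch (mode) {
    case 'available':
      // Captions can be turned on, but are off until the user asks for them.
      return { showToggle: true, enabledByDefault: false };
    case 'auto-on':
      // Captions are enabled automatically once the user is connected.
      return { showToggle: true, enabledByDefault: true };
    default:
      // `disabled`, unknown values, and missing settings all hide the UI here.
      return { showToggle: false, enabledByDefault: false };
  }
}
```

You could then pass `call.state.settings?.transcription.closed_caption_mode` to this helper and render the toggle only when `showToggle` is `true`.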
24+
25+
## Closed caption events
26+
27+
If closed captions are enabled for a given call, you'll receive the captions in the `call.closed_caption` events. Below, you can find an example payload:
28+
29+
```
30+
{
31+
"type": "call.closed_caption",
32+
"created_at": "2024-09-25T12:22:25.067005915Z",
33+
"call_cid": "default:test",
34+
"closed_caption": {
35+
"text": "Thank you, guys, for listening.",
36+
// When did the speaker start speaking
37+
"start_time": "2024-09-25T12:22:21.310735726Z",
38+
// When did the speaker finish saying the caption
39+
"end_time": "2024-09-25T12:22:24.310735726Z",
40+
"speaker_id": "zitaszuperagetstreamio"
41+
}
42+
}
43+
```
44+
45+
## Displaying the captions
46+
47+
When displaying closed captions, make sure they are shown in real time (a sentence from 30 seconds ago is of little use in a conversation) and stay visible long enough for participants to read them.
48+
49+
Below is an example implementation:
50+
51+
```typescript
52+
import {
53+
Call,
54+
CallClosedCaption,
55+
ClosedCaptionEvent,
56+
} from '@stream-io/video-client';
57+
58+
// The captions queue
59+
let captions: (CallClosedCaption & { speaker_name?: string })[] = [];
60+
// The maximum number of captions that can be visible on the screen
61+
const numberOfCaptionsVisible = 2;
62+
// A single caption can stay visible on the screen for this duration
63+
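// This is the maximum duration; new captions can push a caption off the screen sooner
64+
const captionTimeoutMs = 2700;
// Handle of the most recently scheduled removal timer (the event handler below assigns to it)
let captionTimeout: ReturnType<typeof setTimeout> | undefined;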
65+
66+
// Subscribe to call.closed_caption events
67+
call.on('call.closed_caption', (event: ClosedCaptionEvent) => {
68+
const caption = event.closed_caption;
69+
// It's possible to receive the same caption twice, so make sure to filter duplicates
70+
const isDuplicate = captions.find(
71+
(c) =>
72+
c.speaker_id === caption.speaker_id &&
73+
c.start_time === caption.start_time,
74+
);
75+
if (!isDuplicate) {
76+
// Look up the speaker's name based on the user id
77+
const speaker = call.state.participants.find(
78+
(p) => p.userId === caption.speaker_id,
79+
);
80+
const speakerName = speaker?.name || speaker?.userId;
81+
// Add the caption to the queue
82+
captions.push({ ...caption, speaker_name: speakerName });
83+
// Update the UI
84+
updateDisplayedCaptions();
85+
// We specify a maximum amount of time a caption can be visible
86+
// after that, we remove it from the screen (unless a newer caption has already pushed it out)
87+
captionTimeout = setTimeout(() => {
88+
captions = captions.slice(1);
89+
updateDisplayedCaptions();
90+
captionTimeout = undefined;
91+
}, captionTimeoutMs);
92+
}
93+
});
94+
95+
const updateDisplayedCaptions = () => {
96+
// The default implementation shows the last two captions
97+
const displayedCaptions = captions.slice(-1 * numberOfCaptionsVisible);
98+
const captionsHTML = displayedCaptions
99+
.map((c) => `<b>${c.speaker_name}:</b> ${c.text}`)
100+
.join('<br>');
101+
// Update the UI, for example by assigning captionsHTML to a container element's innerHTML
102+
};
103+
```
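
The queueing and de-duplication logic above can also be isolated from timers and DOM updates, which makes it easier to reason about. The following is a minimal sketch under that assumption: `Caption` and `CaptionQueue` are hypothetical names for this example, and the `Caption` shape only mirrors the fields shown in the example payload.

```typescript
// Minimal caption shape for this sketch, mirroring the example payload's fields.
interface Caption {
  speaker_id: string;
  start_time: string;
  end_time: string;
  text: string;
}

// Hypothetical helper class: holds captions and decides which ones are visible.
class CaptionQueue {
  private captions: Caption[] = [];
  private maxVisible: number;

  constructor(maxVisible = 2) {
    this.maxVisible = maxVisible;
  }

  // Adds a caption unless one with the same speaker and start time is queued.
  // Returns false when the caption was a duplicate and was ignored.
  add(caption: Caption): boolean {
    const isDuplicate = this.captions.some(
      (c) =>
        c.speaker_id === caption.speaker_id &&
        c.start_time === caption.start_time,
    );
    if (isDuplicate) return false;
    this.captions.push(caption);
    return true;
  }

  // Drops the oldest caption, for example when its display timer fires.
  removeOldest(): void {
    this.captions = this.captions.slice(1);
  }

  // Returns at most `maxVisible` of the most recent captions.
  visible(): Caption[] {
    return this.maxVisible > 0 ? this.captions.slice(-this.maxVisible) : [];
  }
}
```

With a helper like this, the event handler only needs to call `add`, schedule a timer that calls `removeOldest`, and re-render from `visible()`.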
104+
105+
:::note
106+
Since the closed caption event contains `start_time` and `end_time` fields, you can subtract the two to find out how long it took the speaker to say the caption. You can then use this duration to control how long the text is visible on the screen. This keeps the captions as close to real time as possible, but it might not leave enough time for participants to read the text.
107+
:::
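
One way to act on this note is sketched below: compute the spoken duration from the two timestamps, then clamp it so that a caption is neither dismissed too quickly nor kept too long. The `captionDisplayMs` function and its `minMs` default are assumptions made for this example; the `maxMs` default simply reuses the 2700 ms timeout from the implementation above. The timestamps in the example payload carry nanosecond precision, so the sketch truncates the fraction to milliseconds before parsing rather than relying on how a given JavaScript engine treats the extra digits.

```typescript
// Truncate fractional seconds to three digits so Date.parse sees plain milliseconds.
function toMillisPrecision(timestamp: string): string {
  return timestamp.replace(/(\.\d{3})\d+/, '$1');
}

// Hypothetical helper: how long a caption should stay visible, in milliseconds.
function captionDisplayMs(
  startTime: string,
  endTime: string,
  minMs = 1000, // assumed lower bound so short captions remain readable
  maxMs = 2700, // same value as captionTimeoutMs in the example above
): number {
  const spokenMs =
    Date.parse(toMillisPrecision(endTime)) -
    Date.parse(toMillisPrecision(startTime));
  // Fall back to the maximum when either timestamp cannot be parsed.
  if (Number.isNaN(spokenMs)) return maxMs;
  return Math.min(maxMs, Math.max(minMs, spokenMs));
}
```

The result could replace the fixed `captionTimeoutMs` when scheduling the removal of a caption. The lower bound is the design choice that addresses the readability concern in the note.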
108+
109+
## See it in action
110+
111+
To see it all in action, check out our TypeScript sample application on [GitHub](https://github.com/GetStream/stream-video-js/tree/main/sample-apps/client/ts-quickstart) or in [CodeSandbox](https://codesandbox.io/p/sandbox/eloquent-glitter-99th3v).

sample-apps/client/ts-quickstart/index.html

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
<!DOCTYPE html>
1+
<!doctype html>
22
<html lang="en">
33
<head>
44
<meta charset="UTF-8" />
@@ -10,6 +10,7 @@
1010
<div id="call-controls"></div>
1111
<div id="screenshare"></div>
1212
<div id="participants"></div>
13+
<div id="closed-captions"></div>
1314
<script type="module" src="/src/main.ts"></script>
1415
</body>
1516
</html>

sample-apps/client/ts-quickstart/package.json

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@
44
"version": "0.0.0",
55
"type": "module",
66
"scripts": {
7-
"dev": "vite --host 0.0.0.0 --https",
7+
"dev": "https=1 vite --host 0.0.0.0",
88
"build": "tsc && vite build",
99
"preview": "vite preview"
1010
},
Lines changed: 102 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,102 @@
1+
import {
2+
Call,
3+
CallClosedCaption,
4+
ClosedCaptionEvent,
5+
} from '@stream-io/video-client';
6+
7+
export class ClosedCaptionManager {
8+
status: 'on' | 'off' = 'off';
9+
private unsubscribe?: () => void;
10+
private captionTimeout?: ReturnType<typeof setTimeout>;
11+
private captions: (CallClosedCaption & { speaker_name?: string })[] = [];
12+
private captionContainer?: HTMLElement;
13+
/**
14+
* A single caption can stay visible on the screen for this duration
15+
*
16+
* This is the maximum duration, new captions can push a caption out of the screen sooner
17+
*/
18+
private captionTimeoutMs = 2700;
19+
/**
20+
* The maximum number of captions that can be visible on the screen
21+
*/
22+
private numberOfCaptionsVisible = 2;
23+
24+
constructor(private call: Call) {}
25+
26+
renderToggleElement() {
27+
const button = document.createElement('button');
28+
button.textContent =
29+
this.status === 'on'
30+
? 'Turn off closed captions'
31+
: 'Turn on closed captions';
32+
33+
button.addEventListener('click', async () => {
34+
this.status === 'on' ? this.hideCaptions() : this.showCaptions();
35+
button.textContent =
36+
this.status === 'on'
37+
? 'Turn off closed captions'
38+
: 'Turn on closed captions';
39+
});
40+
41+
return button;
42+
}
43+
44+
renderCaptionContainer() {
45+
this.captionContainer = document.createElement('div');
46+
47+
return this.captionContainer;
48+
}
49+
50+
showCaptions() {
51+
this.status = 'on';
52+
this.unsubscribe = this.call.on(
53+
'call.closed_caption',
54+
(event: ClosedCaptionEvent) => {
55+
const caption = event.closed_caption;
56+
const isDuplicate = this.captions.find(
57+
(c) =>
58+
c.speaker_id === caption.speaker_id &&
59+
c.start_time === caption.start_time,
60+
);
61+
if (!isDuplicate) {
62+
const speaker = this.call.state.participants.find(
63+
(p) => p.userId === caption.speaker_id,
64+
);
65+
const speakerName = speaker?.name || speaker?.userId;
66+
this.captions.push({ ...caption, speaker_name: speakerName });
67+
this.updateDisplayedCaptions();
68+
this.captionTimeout = setTimeout(() => {
69+
this.captions = this.captions.slice(1);
70+
this.updateDisplayedCaptions();
71+
this.captionTimeout = undefined;
72+
}, this.captionTimeoutMs);
73+
}
74+
},
75+
);
76+
}
77+
78+
hideCaptions() {
79+
this.status = 'off';
80+
this.cleanup();
81+
}
82+
83+
cleanup() {
84+
this.unsubscribe?.();
85+
clearTimeout(this.captionTimeout);
86+
}
87+
88+
private updateDisplayedCaptions() {
89+
if (!this.captionContainer) {
90+
console.warn(
91+
'Render caption container before turning on closed captions',
92+
);
93+
return;
94+
}
95+
const displayedCaptions = this.captions.slice(
96+
-1 * this.numberOfCaptionsVisible,
97+
);
98+
this.captionContainer.innerHTML = displayedCaptions
99+
.map((c) => `<b>${c.speaker_name}:</b> ${c.text}`)
100+
.join('<br>');
101+
}
102+
}

sample-apps/client/ts-quickstart/src/main.ts

Lines changed: 35 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,7 @@ import {
1010
renderVolumeControl,
1111
} from './device-selector';
1212
import { isMobile } from './mobile';
13+
import { ClosedCaptionManager } from './closed-captions';
1314

1415
const searchParams = new URLSearchParams(window.location.search);
1516
const extractPayloadFromToken = (token: string) => {
@@ -50,32 +51,42 @@ call.screenShare.setSettings({
5051
maxBitrate: 1500000,
5152
});
5253

53-
call.join({ create: true }).then(async () => {
54-
// render mic and camera controls
55-
const controls = renderControls(call);
56-
const container = document.getElementById('call-controls')!;
57-
container.appendChild(controls.audioButton);
58-
container.appendChild(controls.videoButton);
59-
container.appendChild(controls.screenShareButton);
60-
61-
container.appendChild(renderAudioDeviceSelector(call));
62-
63-
// render device selectors
64-
if (isMobile.any()) {
65-
container.appendChild(controls.flipButton);
66-
} else {
67-
container.appendChild(renderVideoDeviceSelector(call));
68-
}
69-
70-
const audioOutputSelector = renderAudioOutputSelector(call);
71-
if (audioOutputSelector) {
72-
container.appendChild(audioOutputSelector);
73-
}
74-
75-
container.appendChild(renderVolumeControl(call));
76-
});
54+
const container = document.getElementById('call-controls')!;
55+
56+
// render mic and camera controls
57+
const controls = renderControls(call);
58+
container.appendChild(controls.audioButton);
59+
container.appendChild(controls.videoButton);
60+
container.appendChild(controls.screenShareButton);
61+
62+
container.appendChild(renderAudioDeviceSelector(call));
63+
64+
// render device selectors
65+
if (isMobile.any()) {
66+
container.appendChild(controls.flipButton);
67+
} else {
68+
container.appendChild(renderVideoDeviceSelector(call));
69+
}
70+
71+
const audioOutputSelector = renderAudioOutputSelector(call);
72+
if (audioOutputSelector) {
73+
container.appendChild(audioOutputSelector);
74+
}
75+
76+
container.appendChild(renderVolumeControl(call));
77+
78+
// Closed caption controls
79+
const closedCaptionManager = new ClosedCaptionManager(call);
80+
container.appendChild(closedCaptionManager.renderToggleElement());
81+
82+
const captionContainer = document.getElementById('closed-captions');
83+
captionContainer?.appendChild(closedCaptionManager.renderCaptionContainer());
84+
85+
call.join({ create: true });
7786

7887
window.addEventListener('beforeunload', () => {
88+
// Make sure to remove your event listeners when you leave a call
89+
closedCaptionManager?.cleanup();
7990
call.leave();
8091
});
8192
