rules/voiceover.md
---
name: voiceover
description: Adding AI-generated voiceover to Remotion compositions using TTS
metadata:
  tags: voiceover, audio, elevenlabs, tts, speech, calculateMetadata, dynamic duration
---

# Adding AI voiceover to a Remotion composition

Use ElevenLabs TTS to generate speech audio per scene, then use [`calculateMetadata`](./calculate-metadata.md) to dynamically size the composition to match the audio.

## Prerequisites

By default this guide uses **ElevenLabs** as the TTS provider (`ELEVENLABS_API_KEY` environment variable). Users may substitute any TTS service that can produce an audio file.

If the user has not specified a TTS provider, recommend ElevenLabs and ask for their API key.

Ensure the environment variable is available when running the generation script:

```bash
node --strip-types generate-voiceover.ts
```

## Generating audio with ElevenLabs

Create a script that reads the config, calls the ElevenLabs API for each scene, and writes MP3 files to the `public/` directory so Remotion can access them via `staticFile()`.

The core API call for a single scene (`voiceId`, `compositionId`, and `scene` come from the script's config loop):

```ts title="generate-voiceover.ts"
import { writeFileSync } from "node:fs";

const response = await fetch(
  `https://api.elevenlabs.io/v1/text-to-speech/${voiceId}`,
  {
    method: "POST",
    headers: {
      "xi-api-key": process.env.ELEVENLABS_API_KEY!,
      "Content-Type": "application/json",
      Accept: "audio/mpeg",
    },
    body: JSON.stringify({
      text: "Welcome to the show.",
      model_id: "eleven_multilingual_v2",
      voice_settings: {
        stability: 0.5,
        similarity_boost: 0.75,
        style: 0.3,
      },
    }),
  },
);

const audioBuffer = Buffer.from(await response.arrayBuffer());
writeFileSync(`public/voiceover/${compositionId}/${scene.id}.mp3`, audioBuffer);
```

## Dynamic composition duration with calculateMetadata

Use [`calculateMetadata`](./calculate-metadata.md) to measure the [audio durations](./get-audio-duration.md) and set the composition length accordingly.

```tsx
import { CalculateMetadataFunction, staticFile } from "remotion";
import { getAudioDuration } from "./get-audio-duration";

const FPS = 30;

const SCENE_AUDIO_FILES = [
  "voiceover/my-comp/scene-01-intro.mp3",
  "voiceover/my-comp/scene-02-main.mp3",
  "voiceover/my-comp/scene-03-outro.mp3",
];

// `Props` is the composition's props type, defined elsewhere in the project.
export const calculateMetadata: CalculateMetadataFunction<Props> = async ({
  props,
}) => {
  const durations = await Promise.all(
    SCENE_AUDIO_FILES.map((file) => getAudioDuration(staticFile(file))),
  );

  const sceneDurations = durations.map((durationInSeconds) => {
    return durationInSeconds * FPS;
  });

  return {
    durationInFrames: Math.ceil(sceneDurations.reduce((sum, d) => sum + d, 0)),
  };
};
```

The computed `sceneDurations` are passed into the component via a `voiceover` prop so the component knows how long each scene should be.

If the composition uses [`<TransitionSeries>`](./transitions.md), subtract the transition overlap from the total duration; see [Calculating total composition duration](./transitions.md#calculating-total-composition-duration).

## Rendering audio in the component

See [audio.md](./audio.md) for more information on how to render audio in the component.

## Delaying audio start

See [audio.md#delaying](./audio.md#delaying) for more information on how to delay the audio start.
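The per-scene durations computed in `calculateMetadata` also tell the component where each scene's `<Sequence>` (and its `<Audio>`) should start. A minimal sketch of that offset arithmetic, assuming the durations are already converted to frames; the helper name is illustrative, not part of Remotion's API:

```typescript
// Given per-scene durations in frames, compute each scene's start frame:
// the cumulative sum of all preceding scene durations.
export const sceneStartFrames = (sceneDurations: number[]): number[] => {
  const starts: number[] = [];
  let accumulated = 0;
  for (const duration of sceneDurations) {
    starts.push(accumulated);
    accumulated += duration;
  }
  return starts;
};
```

For example, three scenes of 90, 120, and 60 frames start at frames 0, 90, and 210 in a 270-frame composition; each start frame can be passed to `<Sequence from={...}>` alongside the scene's voiceover audio.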