rules/voiceover.md
---
name: voiceover
description: Adding AI-generated voiceover to Remotion compositions using TTS
metadata:
  tags: voiceover, audio, elevenlabs, tts, speech, calculateMetadata, dynamic duration
---

# Adding AI voiceover to a Remotion composition

Use ElevenLabs TTS to generate speech audio per scene, then use [`calculateMetadata`](./calculate-metadata.md) to dynamically size the composition to match the audio.

## Prerequisites

By default this guide uses **ElevenLabs** as the TTS provider (`ELEVENLABS_API_KEY` environment variable). Users may substitute any TTS service that can produce an audio file.

If the user has not specified a TTS provider, recommend ElevenLabs and ask for their API key.

Ensure the environment variable is available when running the generation script:

```bash
node --strip-types generate-voiceover.ts
```

## Generating audio with ElevenLabs

Create a script that reads the config, calls the ElevenLabs API for each scene, and writes MP3 files to the `public/` directory so Remotion can access them via `staticFile()`.

The core API call for a single scene (`voiceId`, `compositionId`, and `scene` come from the script's config loop):

```ts title="generate-voiceover.ts"
import { writeFileSync } from "node:fs";

const response = await fetch(
  `https://api.elevenlabs.io/v1/text-to-speech/${voiceId}`,
  {
    method: "POST",
    headers: {
      "xi-api-key": process.env.ELEVENLABS_API_KEY!,
      "Content-Type": "application/json",
      Accept: "audio/mpeg",
    },
    body: JSON.stringify({
      text: "Welcome to the show.",
      model_id: "eleven_multilingual_v2",
      voice_settings: {
        stability: 0.5,
        similarity_boost: 0.75,
        style: 0.3,
      },
    }),
  },
);

const audioBuffer = Buffer.from(await response.arrayBuffer());
writeFileSync(`public/voiceover/${compositionId}/${scene.id}.mp3`, audioBuffer);
```

## Dynamic composition duration with calculateMetadata

Use [`calculateMetadata`](./calculate-metadata.md) to measure the [audio durations](./get-audio-duration.md) and set the composition length accordingly.

```tsx
import { CalculateMetadataFunction, staticFile } from "remotion";
import { getAudioDuration } from "./get-audio-duration";

const FPS = 30;

const SCENE_AUDIO_FILES = [
  "voiceover/my-comp/scene-01-intro.mp3",
  "voiceover/my-comp/scene-02-main.mp3",
  "voiceover/my-comp/scene-03-outro.mp3",
];

// `Props` is the composition's props type, defined elsewhere in the project.
export const calculateMetadata: CalculateMetadataFunction<Props> = async ({
  props,
}) => {
  const durations = await Promise.all(
    SCENE_AUDIO_FILES.map((file) => getAudioDuration(staticFile(file))),
  );

  const sceneDurations = durations.map((durationInSeconds) => {
    return durationInSeconds * FPS;
  });

  return {
    durationInFrames: Math.ceil(sceneDurations.reduce((sum, d) => sum + d, 0)),
  };
};
```

The computed `sceneDurations` are passed into the component via a `voiceover` prop so the component knows how long each scene should be.

If the composition uses [`<TransitionSeries>`](./transitions.md), subtract the transition overlap from the total duration; see [Calculating total composition duration](./transitions.md#calculating-total-composition-duration).

## Rendering audio in the component

See [audio.md](./audio.md) for more information on how to render audio in the component.

## Delaying audio start

See [audio.md#delaying](./audio.md#delaying) for more information on how to delay the audio start.
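The per-scene durations computed in `calculateMetadata` also tell the component where each scene's `<Sequence>` (and its `<Audio>`) should start. A minimal sketch of that offset arithmetic, assuming the durations are already converted to frames; the helper name is illustrative, not part of Remotion's API:

```typescript
// Given per-scene durations in frames, compute each scene's start frame:
// the cumulative sum of all preceding scene durations.
export const sceneStartFrames = (sceneDurations: number[]): number[] => {
  const starts: number[] = [];
  let accumulated = 0;
  for (const duration of sceneDurations) {
    starts.push(accumulated);
    accumulated += duration;
  }
  return starts;
};
```

For example, three scenes of 90, 120, and 60 frames start at frames 0, 90, and 210 in a 270-frame composition; each start frame can be passed to `<Sequence from={...}>` alongside the scene's voiceover audio.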