Ad Creative

Generates and iterates ad copy at scale for Google Ads, Meta, LinkedIn, TikTok, and Twitter/X with performance data analysis.

Publisher: coreyhaines31 (GitHub)

Skill package: git-repo format, entrypoint SKILL.md, 49.9 KB

File: references/generative-tools.md (Markdown)

# Generative AI Tools for Ad Creative

Reference for using AI image generators, video generators, voice generators, and code-based video tools to produce ad visuals and audio at scale.

---

## When to Use Generative Tools

| Need | Tool Category | Best Fit |
|------|---------------|----------|
| Static ad images (banners, social) | Image generation | ChatGPT Images 2.0, Nano Banana Pro, Flux, Ideogram |
| Ad images with text overlays | Image generation (text-capable) | Ideogram, Nano Banana Pro |
| Short video ads (6-30 sec) | Video generation | Veo, Kling, Runway, Sora, Seedance |
| Video ads with voiceover | Video gen + voice | Veo/Sora (native), or Runway + ElevenLabs |
| Voiceover tracks for ads | Voice generation | ElevenLabs, OpenAI TTS, Cartesia |
| Multi-language ad versions | Voice generation | ElevenLabs, PlayHT |
| Brand voice cloning | Voice generation | ElevenLabs, Resemble AI |
| Product mockups and variations | Image generation + references | Flux (multi-image reference) |
| Templated video ads at scale | Code-based video | Remotion |
| Personalized video (name, data) | Code-based video | Remotion |
| Brand-consistent variations | Image gen + style refs | Flux, Ideogram, Nano Banana Pro |

---

## Image Generation

### Nano Banana Pro (Gemini)

Google DeepMind's image generation model, available through the Gemini API.

**Best for:** High-quality ad images, product visuals, text rendering
**API:** Gemini API (Google AI Studio, Vertex AI)
**Pricing:** ~$0.04/image (Gemini 2.5 Flash Image), ~$0.24/4K image (Nano Banana Pro)

**Strengths:**
- Strong text rendering in images (logos, headlines)
- Native image editing (modify existing images with prompts)
- Available through the same Gemini API used for text generation
- Supports both generation and editing in one model

**Ad creative use cases:**
- Generate social media ad images from text descriptions
- Create product mockup variations
- Edit existing ad images (swap backgrounds, change colors)
- Generate images with headline text baked in

**API example:**
```bash
# Using the Gemini API for image generation.
# This call targets gemini-2.5-flash-image, the lower-priced model in the pricing line above.
# The response is JSON; image data comes back base64-encoded, not as a file.
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -d '{
    "contents": [{"parts": [{"text": "Create a clean, modern social media ad image for a project management tool. Show a laptop with a kanban board interface. Bright, professional, 16:9 ratio."}]}],
    "generationConfig": {"responseModalities": ["TEXT", "IMAGE"]}
  }'
```
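
The response to a call like the one above still has to be unpacked before you have an image file. As a rough sketch, assuming the image arrives as base64 in an `inlineData` part of each candidate (check the current Gemini docs for the exact response shape), the TypeScript below pulls image parts out of an already-parsed response. `GeminiResponse` and `extractImages` are illustrative names, not part of any SDK.

```typescript
// Minimal, assumed shape of a generateContent response (only the fields used here).
interface GeminiPart {
  text?: string;
  inlineData?: { mimeType: string; data: string }; // data is base64-encoded
}
interface GeminiResponse {
  candidates?: { content?: { parts?: GeminiPart[] } }[];
}

// Hypothetical helper: collect every image part across all candidates.
function extractImages(response: GeminiResponse): { mimeType: string; base64: string }[] {
  const images: { mimeType: string; base64: string }[] = [];
  for (const candidate of response.candidates ?? []) {
    for (const part of candidate.content?.parts ?? []) {
      if (part.inlineData) {
        images.push({ mimeType: part.inlineData.mimeType, base64: part.inlineData.data });
      }
    }
  }
  return images;
}

// Usage with a stubbed response (no network call is made here).
const sampleResponse: GeminiResponse = {
  candidates: [
    {
      content: {
        parts: [
          { text: "Here is your ad image." },
          { inlineData: { mimeType: "image/png", data: "iVBORw0KGgo=" } },
        ],
      },
    },
  ],
};
const extractedImages = extractImages(sampleResponse);
console.log(extractedImages.length, extractedImages[0].mimeType);
```

Text parts are skipped on purpose, since a response requested with both `TEXT` and `IMAGE` modalities can mix the two.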

**Docs:** [Gemini Image Generation](https://ai.google.dev/gemini-api/docs/image-generation)

---

### Flux (Black Forest Labs)

Open-weight image generation models with API access through Replicate and BFL's native API.

**Best for:** Photorealistic images, brand-consistent variations, multi-reference generation
**API:** Replicate, BFL API, fal.ai
**Pricing:** ~$0.01-0.06/image depending on model and resolution

**Model variants:**
| Model | Speed | Quality | Cost | Best For |
|-------|-------|---------|------|----------|
| Flux 2 Pro | ~6 sec | Highest | $0.015/MP | Final production assets |
| Flux 2 Flex | ~22 sec | High + editing | $0.06/MP | Iterative editing |
| Flux 2 Dev | ~2.5 sec | Good | $0.012/MP | Rapid prototyping |
| Flux 2 Klein | Fastest | Good | Lowest | High-volume batch generation |
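
Because the variants above are priced per megapixel (MP), per-image cost depends on output size. The sketch below estimates batch cost from the table's prices, assuming billing scales linearly with width × height; providers may round or bill differently. The keys are illustrative labels rather than official model IDs.

```typescript
// Prices per megapixel copied from the table above.
// Klein is omitted because the table gives no figure for it.
const fluxPricePerMegapixel = {
  "flux-2-pro": 0.015,
  "flux-2-flex": 0.06,
  "flux-2-dev": 0.012,
};

type FluxVariant = keyof typeof fluxPricePerMegapixel;

// Estimated cost in USD for `count` images, assuming linear per-megapixel billing.
function estimateFluxCost(variant: FluxVariant, width: number, height: number, count: number = 1): number {
  const megapixels = (width * height) / 1e6;
  return fluxPricePerMegapixel[variant] * megapixels * count;
}

// 100 square 1080x1080 feed images on the Dev variant.
const devBatchCost = estimateFluxCost("flux-2-dev", 1080, 1080, 100);
console.log(devBatchCost.toFixed(2));
```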

**Strengths:**
- Multi-image reference (up to 8 images) for consistent identity across ads
- Product consistency: same product in different contexts
- Style transfer from reference images
- Open-weight Dev model for self-hosting

**Ad creative use cases:**
- Generate 50+ ad variations with consistent product/person identity
- Create product-in-context images (your SaaS on different devices)
- Style-match to existing brand assets using reference images
- Rapid A/B test image variations

**Docs:** [Replicate Flux](https://replicate.com/black-forest-labs/flux-2-pro), [BFL API](https://docs.bfl.ml/)

---

### Ideogram

Specialized in typography and text rendering within images.

**Best for:** Ad banners with text, branded graphics, social ad images with headlines
**API:** Ideogram API, Runware
**Pricing:** ~$0.06/image (API), ~$0.009/image (subscription)

**Strengths:**
- Best-in-class text rendering (~90% accuracy vs ~30% for most tools)
- Style reference system (upload up to 3 reference images)
- 4.3 billion style presets for consistent brand aesthetics
- Strong at logos and branded typography

**Ad creative use cases:**
- Generate ad banners with headline text directly in the image
- Create social media graphics with branded text overlays
- Produce multiple design variations with consistent typography
- Generate promotional materials without needing a designer for each iteration

**Docs:** [Ideogram API](https://developer.ideogram.ai/), [Ideogram](https://ideogram.ai/)

---

### Other Image Tools

| Tool | Best For | API Status | Notes |
|------|----------|------------|-------|
| **DALL-E 3** (OpenAI) | General image generation | Official API | Integrated with ChatGPT, good text rendering |
| **Midjourney** | Artistic, high-aesthetic images | No official public API | Discord-based; unofficial APIs exist but risk bans |
| **Stable Diffusion** | Self-hosted, customizable | Open source | Best for teams with GPU infrastructure |

---

## Video Generation

### Google Veo

Google DeepMind's video generation model, available through the Gemini API and Vertex AI.

**Best for:** High-quality video ads with native audio, vertical video for social
**API:** Gemini API, Vertex AI
**Pricing:** ~$0.15/sec (Veo 3.1 Fast), ~$0.40/sec (Veo 3.1 Standard)

**Capabilities:**
- Up to 60 seconds at 1080p
- Native audio generation (dialogue, sound effects, ambient)
- Vertical 9:16 output for Stories/Reels/Shorts
- Upscale to 4K
- Text-to-video and image-to-video

**Ad creative use cases:**
- Generate short video ads (15-30 sec) from text descriptions
- Create vertical video ads for TikTok, Reels, Shorts
- Produce product demos with voiceover
- Generate multiple video variations from the same prompt with different styles

**Docs:** [Veo on Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/video/overview)

---

### Kling (Kuaishou)

Video generation with simultaneous audio-visual generation and camera controls.

**Best for:** Cinematic video ads, longer-form content, audio-synced video
**API:** Kling API, PiAPI, fal.ai
**Pricing:** ~$0.09/sec (via fal.ai third-party)

**Capabilities:**
- Up to 3 minutes at 1080p/30-48fps
- Simultaneous audio-visual generation (Kling 2.6)
- Text-to-video and image-to-video
- Motion and camera controls

**Ad creative use cases:**
- Longer product explainer videos
- Cinematic brand videos with synchronized audio
- Animate product images into video ads

**Docs:** [Kling AI Developer](https://klingai.com/global/dev/model/video)

---

### Runway

Video generation and editing platform with strong controllability.

**Best for:** Controlled video generation, style-consistent content, editing existing footage
**API:** Runway Developer Portal

**Capabilities:**
- Gen-4: Character/scene consistency across shots
- Motion brush and camera controls
- Image-to-video with reference images
- Video-to-video style transfer

**Ad creative use cases:**
- Generate video ads with consistent characters/products across scenes
- Style-transfer existing footage to match brand aesthetics
- Extend or remix existing video content

**Docs:** [Runway API](https://docs.dev.runwayml.com/)

---

### Sora 2 (OpenAI)

OpenAI's video generation model with synchronized audio.

**Best for:** High-fidelity video with dialogue and sound
**API:** OpenAI API
**Pricing:** Free tier available; Pro from $0.10-0.50/sec depending on resolution

**Capabilities:**
- Up to 60 seconds with synchronized audio
- Dialogue, sound effects, and ambient audio
- sora-2 (fast) and sora-2-pro (quality) variants
- Text-to-video and image-to-video

**Ad creative use cases:**
- Video testimonials and talking-head style ads
- Product demo videos with narration
- Narrative brand videos

**Docs:** [OpenAI Video Generation](https://platform.openai.com/docs/guides/video-generation)

---

### Seedance 2.0 (ByteDance)

ByteDance's video generation model with simultaneous audio-visual generation and multimodal inputs.

**Best for:** Fast, affordable video ads with native audio, multimodal reference inputs
**API:** BytePlus (official), Replicate, WaveSpeedAI, fal.ai (third-party); OpenAI-compatible API format
**Pricing:** ~$0.10-0.80/min depending on resolution (estimated 10-100x cheaper than Sora 2 per clip)

**Capabilities:**
- Up to 20 seconds at up to 2K resolution
- Simultaneous audio-visual generation (Dual-Branch Diffusion Transformer)
- Text-to-video and image-to-video
- Up to 12 reference files for multimodal input
- OpenAI-compatible API structure

**Ad creative use cases:**
- High-volume short video ad production at low cost
- Video ads with synchronized voiceover and sound effects in one pass
- Multi-reference generation (feed product images, brand assets, style references)
- Rapid iteration on video ad concepts

**Docs:** [Seedance](https://seed.bytedance.com/en/seedance2_0)

---

### Higgsfield

Full-stack video creation platform with cinematic camera controls.

**Best for:** Social video ads, cinematic style, mobile-first content
**Platform:** [higgsfield.ai](https://higgsfield.ai/)

**Capabilities:**
- 50+ professional camera movements (zooms, pans, FPV drone shots)
- Image-to-video animation
- Built-in editing, transitions, and keyframing
- All-in-one workflow: image gen, animation, editing

**Ad creative use cases:**
- Social media video ads with cinematic feel
- Animate product images into dynamic video
- Create multiple video variations with different camera styles
- Quick-turn video content for social campaigns

---

### Video Tool Comparison

| Tool | Max Length | Audio | Resolution | API | Best For |
|------|-----------|-------|------------|-----|----------|
| **Veo 3.1** | 60 sec | Native | 1080p/4K | Gemini | Vertical social video |
| **Kling 2.6** | 3 min | Native | 1080p | Official + third-party | Longer cinematic |
| **Runway Gen-4** | 10 sec | No | 1080p | Official | Controlled, consistent |
| **Sora 2** | 60 sec | Native | 1080p | Official | Dialogue-heavy |
| **Seedance 2.0** | 20 sec | Native | 2K | Official + third-party | Affordable high-volume |
| **Higgsfield** | Varies | Yes | 1080p | Web-based | Social, mobile-first |

---

## Voice & Audio Generation

Use these tools to layer realistic voiceovers onto video ads, add narration to product demos, or generate audio for Remotion-rendered videos. They turn ad scripts into natural-sounding voice tracks.

### When to Use Voice Tools

Many video generators (Veo, Kling, Sora, Seedance) now include native audio. Use standalone voice tools when you need:

- **Voiceover on silent video:** Runway Gen-4 and Remotion produce silent output
- **Brand voice consistency:** Clone a specific voice for all ads
- **Multi-language versions:** Same ad script in 20+ languages
- **Script iteration:** Re-record voiceover without reshooting video
- **Precise control:** Exact timing, emotion, and pacing

---

### ElevenLabs

The market leader in realistic voice generation and voice cloning.

**Best for:** Most natural-sounding voiceovers, brand voice cloning, multilingual
**API:** REST API with streaming support
**Pricing:** ~$0.12-0.30 per 1,000 characters depending on plan; starts at $5/month

**Capabilities:**
- 29+ languages with natural accent and intonation
- Voice cloning from short audio clips (instant) or longer recordings (professional)
- Emotion and style control
- Streaming for real-time generation
- Voice library with hundreds of pre-built voices

**Ad creative use cases:**
- Generate voiceover tracks for video ads
- Clone your brand spokesperson's voice for all ad variations
- Produce the same ad in 10+ languages from one script
- A/B test different voice styles (authoritative vs. friendly vs. urgent)

**API example:**
```bash
# Set VOICE_ID to the ID of a voice in your ElevenLabs account.
# (A literal {voice_id} in the URL would be treated by curl as a glob pattern.)
curl -X POST "https://api.elevenlabs.io/v1/text-to-speech/$VOICE_ID" \
  -H "xi-api-key: $ELEVENLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Stop wasting hours on manual reporting. Try DataFlow free for 14 days.",
    "model_id": "eleven_multilingual_v2",
    "voice_settings": {"stability": 0.5, "similarity_boost": 0.75}
  }' --output voiceover.mp3
```
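
When you need the same call for many script variants, it can help to build the requests in code. The sketch below mirrors the endpoint, headers, and body fields of the curl example and nothing more; `buildElevenLabsRequest`, the placeholder voice ID and key, and the second script line are all made up for illustration.

```typescript
// Plain description of one HTTP request; field values mirror the curl example above.
interface TtsRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: build the text-to-speech request for one line of ad copy.
function buildElevenLabsRequest(voiceId: string, apiKey: string, text: string): TtsRequest {
  return {
    url: "https://api.elevenlabs.io/v1/text-to-speech/" + encodeURIComponent(voiceId),
    method: "POST",
    headers: { "xi-api-key": apiKey, "Content-Type": "application/json" },
    body: JSON.stringify({
      text: text,
      model_id: "eleven_multilingual_v2",
      voice_settings: { stability: 0.5, similarity_boost: 0.75 },
    }),
  };
}

// One request per script variant, e.g. for A/B testing different hooks.
const scripts = [
  "Stop wasting hours on manual reporting. Try DataFlow free for 14 days.",
  "Your reports, done in minutes. Try DataFlow free for 14 days.",
];
const ttsRequests = scripts.map((script) =>
  buildElevenLabsRequest("VOICE_ID_PLACEHOLDER", "API_KEY_PLACEHOLDER", script)
);
console.log(ttsRequests.length, ttsRequests[0].url);
```

Each object can be sent with `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. As with the curl example, the response body is the audio itself.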

**Docs:** [ElevenLabs API](https://elevenlabs.io/docs/api-reference/text-to-speech)

---

### OpenAI TTS

Simple, affordable text-to-speech built into the OpenAI API.

**Best for:** Quick voiceovers, cost-effective at scale, simple integration
**API:** OpenAI API (same SDK as GPT/DALL-E)
**Pricing:** $15/million chars (standard), $30/million chars (HD); ~$0.015/min with gpt-4o-mini-tts

**Capabilities:**
- 13 built-in voices (no custom cloning)
- Multiple languages
- Real-time streaming
- HD quality option
- Simple API: same SDK you already use for GPT

**Ad creative use cases:**
- Fast, cheap voiceover for draft/test ad versions
- High-volume narration at low cost
- Prototype ad audio before investing in premium voice

**Docs:** [OpenAI TTS](https://platform.openai.com/docs/guides/text-to-speech)

---

### Cartesia Sonic

Ultra-low latency voice generation built for real-time applications.

**Best for:** Real-time voice, lowest latency, emotional expressiveness
**API:** REST + WebSocket streaming
**Pricing:** Starts at $5/month; pay-as-you-go from $0.03/min

**Capabilities:**
- 40ms time-to-first-audio (fastest in class)
- 15+ languages
- Nonverbal expressiveness: laughter, breathing, emotional inflections
- Sonic Turbo for even lower latency
- Streaming API for real-time generation

**Ad creative use cases:**
- Real-time ad preview during creative iteration
- Interactive demo videos with dynamic narration
- Ads requiring natural laughter, sighs, or emotional reactions

**Docs:** [Cartesia Sonic](https://docs.cartesia.ai/build-with-cartesia/tts-models/latest)

---

### Voicebox (Open Source)

Free, local-first voice synthesis studio powered by Qwen3-TTS. The open-source alternative to ElevenLabs.

**Best for:** Free voice cloning, local/private generation, zero-cost batch production
**API:** Local REST API at `http://localhost:8000`
**Pricing:** Free (MIT license). Runs entirely on your machine.
**Stack:** Tauri (Rust) + React + FastAPI (Python)

**Capabilities:**
- Voice cloning from short audio samples via Qwen3-TTS
- Multi-language support (English, Chinese, more planned)
- Multi-track timeline editor for composing conversations
- 4-5x faster inference on Apple Silicon via MLX Metal acceleration
- Local REST API for programmatic generation
- No cloud dependency: all processing on-device

**Ad creative use cases:**
- Free voice cloning for brand spokesperson across all ad variations
- Batch generate voiceovers without per-character costs
- Private/local generation when ad content is sensitive or pre-launch
- Prototype voice variations before committing to a paid service

**API example:**
```bash
curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"text": "Stop wasting hours on manual reporting.", "profile_id": "abc123", "language": "en"}'
```

**Install:** Desktop apps for macOS and Windows at [voicebox.sh](https://voicebox.sh), or build from source:
```bash
git clone https://github.com/jamiepine/voicebox.git
cd voicebox && make setup && make dev
```

**Docs:** [GitHub](https://github.com/jamiepine/voicebox)

---

### Other Voice Tools

| Tool | Best For | Differentiator | API |
|------|----------|---------------|-----|
| **PlayHT** | Large voice library, low latency | 900+ voices, <300ms latency, ultra-realistic | [play.ht](https://play.ht/) |
| **Resemble AI** | Enterprise voice cloning | On-premise deployment, real-time speech-to-speech | [resemble.ai](https://www.resemble.ai/) |
| **WellSaid Labs** | Ethical, commercial-safe voices | Voices from compensated actors, safe for commercial use | [wellsaid.io](https://www.wellsaid.io/) |
| **Fish Audio** | Budget-friendly, emotion control | ~50-70% cheaper than ElevenLabs, emotion tags | [fish.audio](https://fish.audio/) |
| **Murf AI** | Non-technical teams | Browser-based studio, 200+ voices | [murf.ai](https://murf.ai/) |
| **Google Cloud TTS** | Google ecosystem, scale | 220+ voices, 40+ languages, enterprise SLAs | [Google TTS](https://cloud.google.com/text-to-speech) |
| **Amazon Polly** | AWS ecosystem, cost | Neural voices, SSML control, cheap at volume | [Amazon Polly](https://aws.amazon.com/polly/) |

---

### Voice Tool Comparison

| Tool | Quality | Cloning | Languages | Latency | Price (per 1K chars unless noted) |
|------|---------|---------|-----------|---------|----------------|
| **ElevenLabs** | Best | Yes (instant + pro) | 29+ | ~200ms | $0.12-0.30 |
| **OpenAI TTS** | Good | No | Multiple | ~300ms | $0.015-0.030 |
| **Cartesia Sonic** | Very good | No | 15+ | ~40ms | ~$0.03/min |
| **PlayHT** | Very good | Yes | 140+ | <300ms | ~$0.10-0.20 |
| **Fish Audio** | Good | Yes | 13+ | ~200ms | ~$0.05-0.10 |
| **WellSaid** | Very good | No (actor voices) | English | ~300ms | Custom pricing |
| **Voicebox** | Good | Yes (local) | 2+ | Local | Free (open source) |

### Choosing a Voice Tool

```
Need voiceover for ads?
├── Need to clone a specific brand voice?
│   ├── Best quality → ElevenLabs
│   ├── Enterprise/on-premise → Resemble AI
│   └── Budget-friendly → Fish Audio, PlayHT
├── Need multilingual (same ad, many languages)?
│   ├── Most languages → PlayHT (140+)
│   └── Best quality → ElevenLabs (29+)
├── Need free / open source / local?
│   └── Voicebox (MIT, runs on your machine)
├── Need cheap, fast, good-enough?
│   └── OpenAI TTS ($0.015/min)
├── Need commercially-safe licensing?
│   └── WellSaid Labs (actor-compensated voices)
└── Need real-time/interactive?
    └── Cartesia Sonic (40ms TTFA)
```
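
If tool selection needs to happen inside a pipeline rather than by eye, the tree above can be expressed as a lookup table. This is only a sketch: the `VoiceNeed` labels and `shortlistVoiceTools` are illustrative names, and the recommendations are copied from the tree. When several needs apply, it returns the tools that appear under every one of them.

```typescript
// One label per leaf of the decision tree above (labels are illustrative).
type VoiceNeed =
  | "clone-best-quality"
  | "clone-enterprise"
  | "clone-budget"
  | "multilingual-most-languages"
  | "multilingual-best-quality"
  | "free-local"
  | "cheap-fast"
  | "commercially-safe"
  | "real-time";

// Recommendations copied from the tree.
const voiceToolsByNeed: Record<VoiceNeed, string[]> = {
  "clone-best-quality": ["ElevenLabs"],
  "clone-enterprise": ["Resemble AI"],
  "clone-budget": ["Fish Audio", "PlayHT"],
  "multilingual-most-languages": ["PlayHT"],
  "multilingual-best-quality": ["ElevenLabs"],
  "free-local": ["Voicebox"],
  "cheap-fast": ["OpenAI TTS"],
  "commercially-safe": ["WellSaid Labs"],
  "real-time": ["Cartesia Sonic"],
};

// Tools recommended for every listed need; empty when no single tool covers them all.
function shortlistVoiceTools(needs: VoiceNeed[]): string[] {
  if (needs.length === 0) return [];
  return needs
    .map((need) => voiceToolsByNeed[need])
    .reduce((common, tools) => common.filter((tool) => tools.indexOf(tool) !== -1));
}

console.log(shortlistVoiceTools(["clone-budget", "multilingual-most-languages"]));
```

An empty result means no single tool in the tree covers all the listed needs, so a combination of tools is the next thing to consider.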

### Workflow: Voice + Video

```
1. Write ad script (use ad-creative skill for copy)
2. Generate voiceover with ElevenLabs/OpenAI TTS
3. Generate or render video:
   a. Silent video from Runway/Remotion → layer voice track
   b. Or use Veo/Sora/Seedance with native audio (skip separate VO)
4. Combine with ffmpeg if layering separately:
   ffmpeg -i video.mp4 -i voiceover.mp3 -c:v copy -c:a aac output.mp4
5. Generate variations (different scripts, voices, or languages)
```
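
Steps 4 and 5 combine naturally: once you have several voice tracks, the same ffmpeg invocation can be generated per variation. The sketch below builds the argument list from step 4. The file names are hypothetical, and `-shortest` (an ffmpeg option that stops output at the shorter input) is an optional addition that is not part of the workflow above.

```typescript
// Hypothetical helper: ffmpeg arguments for muxing a voice track onto a silent video.
// The video stream is copied as-is; the audio is encoded to AAC, as in step 4.
function buildMuxArgs(
  videoPath: string,
  voicePath: string,
  outputPath: string,
  trimToShortest: boolean = false
): string[] {
  const args = ["-i", videoPath, "-i", voicePath, "-c:v", "copy", "-c:a", "aac"];
  if (trimToShortest) {
    args.push("-shortest"); // optional: end output when the shorter input ends
  }
  args.push(outputPath);
  return args;
}

// One mux job per language version of the voiceover (file names are made up).
const languages = ["en", "es", "de"];
const muxJobs = languages.map((lang) =>
  buildMuxArgs("video.mp4", "voiceover-" + lang + ".mp3", "ad-" + lang + ".mp4")
);
console.log(muxJobs.map((args) => "ffmpeg " + args.join(" ")).join("\n"));
```

Keeping the arguments as an array rather than one string avoids shell-quoting problems. In Node, the array can be passed directly to `child_process.spawn("ffmpeg", args)`.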

---

## Code-Based Video: Remotion

For templated, data-driven video ads at scale, Remotion is a strong fit. Unlike AI video generators, which produce a different video from each prompt, Remotion uses React code to render deterministic, brand-perfect video from templates and data.

**Best for:** Templated ad variations, personalized video, brand-consistent production
**Stack:** React + TypeScript
**Pricing:** Free for individuals/small teams; commercial license required for 4+ employees
**Docs:** [remotion.dev](https://www.remotion.dev/)

### Why Remotion for Ads

| AI Video Generators | Remotion |
|---------------------|----------|
| Unique output each time | Deterministic, pixel-perfect |
| Prompt-based, less control | Full code control over every frame |
| Hard to match brand exactly | Exact brand colors, fonts, spacing |
| One-at-a-time generation | Batch render hundreds from data |
| No dynamic data insertion | Personalize with names, prices, stats |

### Ad Creative Use Cases

**1. Dynamic product ads**
Feed a JSON array of products and render a unique video ad for each:
```tsx
// Simplified Remotion component for product ads
import React from 'react';
import {AbsoluteFill, Img} from 'remotion';

export const ProductAd: React.FC<{
  productName: string;
  price: string;
  imageUrl: string;
  tagline: string;
}> = ({productName, price, imageUrl, tagline}) => {
  return (
    <AbsoluteFill style={{backgroundColor: '#fff'}}>
      <Img src={imageUrl} style={{width: 400, height: 400}} />
      <h1>{productName}</h1>
      <p>{tagline}</p>
      <div className="price">{price}</div>
      <div className="cta">Shop Now</div>
    </AbsoluteFill>
  );
};
```

**2. A/B test video variations**
Render the same template with different headlines, CTAs, or color schemes:
```tsx
const variations = [
  {headline: "Save 50% Today", cta: "Get the Deal", theme: "urgent"},
  {headline: "Join 10K+ Teams", cta: "Start Free", theme: "social-proof"},
  {headline: "Built for Speed", cta: "Try It Now", theme: "benefit"},
];
// Render all variations programmatically
```
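
One way to render the variations programmatically is to generate one `remotion render` command per entry and pass the variation as JSON through `--props`. The sketch below only builds the command strings. `shellQuote` and `buildRenderCommands` are illustrative helpers, and the entry point, composition ID, and `out/<theme>.mp4` naming are assumptions.

```typescript
interface AdVariation {
  headline: string;
  cta: string;
  theme: string;
}

const adVariations: AdVariation[] = [
  { headline: "Save 50% Today", cta: "Get the Deal", theme: "urgent" },
  { headline: "Join 10K+ Teams", cta: "Start Free", theme: "social-proof" },
  { headline: "Built for Speed", cta: "Try It Now", theme: "benefit" },
];

// Single-quote a value for POSIX shells so the JSON props survive intact.
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Hypothetical helper: one CLI invocation per variation.
// Entry point, composition ID, and output naming are assumptions.
function buildRenderCommands(
  variations: AdVariation[],
  entry: string = "src/index.ts",
  compositionId: string = "MyComposition"
): string[] {
  return variations.map((variation) =>
    [
      "npx", "remotion", "render", entry, compositionId,
      "out/" + variation.theme + ".mp4",
      "--props=" + shellQuote(JSON.stringify(variation)),
    ].join(" ")
  );
}

const renderCommands = buildRenderCommands(adVariations);
console.log(renderCommands.join("\n"));
```

The composition would read `headline`, `cta`, and `theme` from its input props. For large batches, writing each variation to a JSON file and passing the file path to `--props` avoids quoting issues altogether.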

**3. Personalized outreach videos**
Generate videos addressing prospects by name for cold outreach or sales.

**4. Social ad batch production**
Render the same content across different aspect ratios:
- 1:1 for feed
- 9:16 for Stories/Reels
- 16:9 for YouTube

### Remotion Workflow for Ad Creative

```
1. Design template in React (or use AI to generate the component)
2. Define data schema (products, headlines, CTAs, images)
3. Feed data array into template
4. Batch render all variations
5. Upload to ad platform
```

### Getting Started

```bash
# Create a new Remotion project
npx create-video@latest

# Render a single video
npx remotion render src/index.ts MyComposition out/video.mp4

# Render with input props passed as JSON (run once per variation to batch render)
npx remotion render src/index.ts MyComposition --props='{"data": [...]}'
```

---

## Choosing the Right Tool

### Decision Tree

```
Need video ads?
├── Templated, data-driven (same structure, different data)
│   └── Use Remotion
├── Unique creative from prompts (exploratory)
│   ├── Need dialogue/voiceover? → Sora 2, Veo 3.1, Kling 2.6, Seedance 2.0
│   ├── Need consistency across scenes? → Runway Gen-4
│   ├── Need vertical social video? → Veo 3.1 (native 9:16)
│   ├── Need high volume at low cost? → Seedance 2.0
│   └── Need cinematic camera work? → Higgsfield, Kling
└── Both → Use AI gen for hero creative, Remotion for variations

Need image ads?
├── Need text/headlines in image? → Ideogram
├── Need product consistency across variations? → Flux (multi-ref)
├── Need quick iterations on existing images? → Nano Banana Pro
├── Need highest visual quality? → Flux Pro, Midjourney
└── Need high volume at low cost? → Flux Klein, Nano Banana
```

### Cost Comparison for 100 Ad Variations

| Approach | Tool | Approximate Cost |
|----------|------|-----------------|
| 100 static images | Nano Banana Pro | ~$4-24 |
| 100 static images | Flux Dev | ~$1-2 |
| 100 static images | Ideogram API | ~$6 |
| 100 × 15-sec videos | Veo 3.1 Fast | ~$225 |
| 100 × 15-sec videos | Remotion (templated) | ~$0 (self-hosted render) |
| 10 hero videos + 90 templated | Veo + Remotion | ~$22 + render time |
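
The figures in this table follow from the unit prices quoted in the tool sections: count × price per image, or count × seconds × price per second. A short sketch of that arithmetic, using those approximate prices (which will drift over time), makes it easy to re-run for a different batch size or clip length:

```typescript
// Approximate unit prices in USD, copied from the tool sections of this reference.
const unitPrices = {
  nanoBananaImageLow: 0.04, // per image, Gemini 2.5 Flash Image
  nanoBananaImageHigh: 0.24, // per 4K image, Nano Banana Pro
  ideogramImage: 0.06, // per image, Ideogram API
  veoFastPerSecond: 0.15, // per second, Veo 3.1 Fast
};

function imageBatchCost(count: number, pricePerImage: number): number {
  return count * pricePerImage;
}

function videoBatchCost(count: number, secondsEach: number, pricePerSecond: number): number {
  return count * secondsEach * pricePerSecond;
}

const batchCosts = {
  nanoBananaLow: imageBatchCost(100, unitPrices.nanoBananaImageLow),
  nanoBananaHigh: imageBatchCost(100, unitPrices.nanoBananaImageHigh),
  ideogram: imageBatchCost(100, unitPrices.ideogramImage),
  veoAllVideos: videoBatchCost(100, 15, unitPrices.veoFastPerSecond),
  veoHeroVideosOnly: videoBatchCost(10, 15, unitPrices.veoFastPerSecond),
};

console.log(batchCosts);
```

The hybrid row is cheaper because only the 10 hero clips are billed per second. The 90 templated renders cost compute time rather than API fees.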

### Recommended Workflow for Scaled Ad Production

1. **Generate hero creative** with AI (Nano Banana, Flux, Veo): high-quality, exploratory
2. **Build templates** in Remotion based on winning creative patterns
3. **Batch produce variations** with Remotion using data (products, headlines, CTAs)
4. **Iterate**: use AI tools for new angles, Remotion for scale

This hybrid approach gives you the creative exploration of AI generators and the consistency and scale of code-based rendering.

---

## Platform-Specific Image Specs

When generating images for ads, request the correct dimensions:

| Platform | Placement | Aspect Ratio | Recommended Size |
|----------|-----------|-------------|-----------------|
| Meta Feed | Single image | 1:1 | 1080x1080 |
| Meta Stories/Reels | Vertical | 9:16 | 1080x1920 |
| Meta Carousel | Square | 1:1 | 1080x1080 |
| Google Display | Landscape | 1.91:1 | 1200x628 |
| Google Display | Square | 1:1 | 1200x1200 |
| LinkedIn Feed | Landscape | 1.91:1 | 1200x627 |
| LinkedIn Feed | Square | 1:1 | 1200x1200 |
| TikTok Feed | Vertical | 9:16 | 1080x1920 |
| Twitter/X Feed | Landscape | 16:9 | 1200x675 |
| Twitter/X Card | Landscape | 1.91:1 | 800x418 |

Include these dimensions in your generation prompts to reduce the need to crop or resize afterward.
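
To keep prompts and specs in sync, the table can live in code as data. The sketch below stores the rows above and produces a size phrase to append to a generation prompt. `AdSpec`, `adSpecs`, and `sizeHint` are illustrative names, and whether a given model honors exact pixel sizes is something to verify per tool.

```typescript
interface AdSpec {
  platform: string;
  placement: string;
  aspectRatio: string;
  width: number;
  height: number;
}

// Rows copied from the table above.
const adSpecs: AdSpec[] = [
  { platform: "Meta Feed", placement: "Single image", aspectRatio: "1:1", width: 1080, height: 1080 },
  { platform: "Meta Stories/Reels", placement: "Vertical", aspectRatio: "9:16", width: 1080, height: 1920 },
  { platform: "Meta Carousel", placement: "Square", aspectRatio: "1:1", width: 1080, height: 1080 },
  { platform: "Google Display", placement: "Landscape", aspectRatio: "1.91:1", width: 1200, height: 628 },
  { platform: "Google Display", placement: "Square", aspectRatio: "1:1", width: 1200, height: 1200 },
  { platform: "LinkedIn Feed", placement: "Landscape", aspectRatio: "1.91:1", width: 1200, height: 627 },
  { platform: "LinkedIn Feed", placement: "Square", aspectRatio: "1:1", width: 1200, height: 1200 },
  { platform: "TikTok Feed", placement: "Vertical", aspectRatio: "9:16", width: 1080, height: 1920 },
  { platform: "Twitter/X Feed", placement: "Landscape", aspectRatio: "16:9", width: 1200, height: 675 },
  { platform: "Twitter/X Card", placement: "Landscape", aspectRatio: "1.91:1", width: 800, height: 418 },
];

// Hypothetical helper: phrase to append to an image prompt for one placement.
function sizeHint(platform: string, placement: string): string {
  const spec = adSpecs.filter((s) => s.platform === platform && s.placement === placement)[0];
  if (!spec) {
    throw new Error("No spec for " + platform + " / " + placement);
  }
  return spec.width + "x" + spec.height + " pixels, " + spec.aspectRatio + " aspect ratio";
}

const basePrompt = "Clean, modern ad image for a project management tool.";
console.log(basePrompt + " " + sizeHint("Meta Stories/Reels", "Vertical"));
```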

Marketplace

Source from repo

Ad Creative

Generates and iterates ad copy at scale for Google Ads, Meta, LinkedIn, TikTok, and Twitter/X with performance data analysis.

coreyhaines31GitHub coreyhaines31Source repo Original GitHub link Publisher page

Files

Skill

n/a

Size

49.9 KB

Entrypoint

SKILL.md

Format

git-repo

Open file

references/generative-tools.md

Syntax-highlighted preview of this file as included in the skill package.

Rendered Source

markdown638 linesFree

references/generative-tools.md

1# Generative AI Tools for Ad Creative
2 
3Reference for using AI image generators, video generators, and code-based video tools to produce ad visuals at scale.
4 
5---
6 
7## When to Use Generative Tools
8 
9| Need | Tool Category | Best Fit |
10|------|---------------|----------|
11| Static ad images (banners, social) | Image generation | ChatGPT Images 2.0, Nano Banana Pro, Flux, Ideogram |
12| Ad images with text overlays | Image generation (text-capable) | Ideogram, Nano Banana Pro |
13| Short video ads (6-30 sec) | Video generation | Veo, Kling, Runway, Sora, Seedance |
14| Video ads with voiceover | Video gen + voice | Veo/Sora (native), or Runway + ElevenLabs |
15| Voiceover tracks for ads | Voice generation | ElevenLabs, OpenAI TTS, Cartesia |
16| Multi-language ad versions | Voice generation | ElevenLabs, PlayHT |
17| Brand voice cloning | Voice generation | ElevenLabs, Resemble AI |
18| Product mockups and variations | Image generation + references | Flux (multi-image reference) |
19| Templated video ads at scale | Code-based video | Remotion |
20| Personalized video (name, data) | Code-based video | Remotion |
21| Brand-consistent variations | Image gen + style refs | Flux, Ideogram, Nano Banana Pro |
22 
23---
24 
25## Image Generation
26 
27### Nano Banana Pro (Gemini)
28 
29Google DeepMind's image generation model, available through the Gemini API.
30 
31**Best for:** High-quality ad images, product visuals, text rendering
32**API:** Gemini API (Google AI Studio, Vertex AI)
33**Pricing:** ~$0.04/image (Gemini 2.5 Flash Image), ~$0.24/4K image (Nano Banana Pro)
34 
35**Strengths:**
36- Strong text rendering in images (logos, headlines)
37- Native image editing (modify existing images with prompts)
38- Available through the same Gemini API used for text generation
39- Supports both generation and editing in one model
40 
41**Ad creative use cases:**
42- Generate social media ad images from text descriptions
43- Create product mockup variations
44- Edit existing ad images (swap backgrounds, change colors)
45- Generate images with headline text baked in
46 
47**API example:**
48```bash
49# Using the Gemini API for image generation
50curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent" \
51  -H "Content-Type: application/json" \
52  -H "x-goog-api-key: $GEMINI_API_KEY" \
53  -d '{
54    "contents": [{"parts": [{"text": "Create a clean, modern social media ad image for a project management tool. Show a laptop with a kanban board interface. Bright, professional, 16:9 ratio."}]}],
55    "generationConfig": {"responseModalities": ["TEXT", "IMAGE"]}
56  }'
57```
58 
59**Docs:** [Gemini Image Generation](https://ai.google.dev/gemini-api/docs/image-generation)
60 
61---
62 
63### Flux (Black Forest Labs)
64 
65Open-weight image generation models with API access through Replicate and BFL's native API.
66 
67**Best for:** Photorealistic images, brand-consistent variations, multi-reference generation
68**API:** Replicate, BFL API, fal.ai
69**Pricing:** ~$0.01-0.06/image depending on model and resolution
70 
71**Model variants:**
72| Model | Speed | Quality | Cost | Best For |
73|-------|-------|---------|------|----------|
74| Flux 2 Pro | ~6 sec | Highest | $0.015/MP | Final production assets |
75| Flux 2 Flex | ~22 sec | High + editing | $0.06/MP | Iterative editing |
76| Flux 2 Dev | ~2.5 sec | Good | $0.012/MP | Rapid prototyping |
77| Flux 2 Klein | Fastest | Good | Lowest | High-volume batch generation |
78 
79**Strengths:**
80- Multi-image reference (up to 8 images) for consistent identity across ads
81- Product consistency — same product in different contexts
82- Style transfer from reference images
83- Open-weight Dev model for self-hosting
84 
85**Ad creative use cases:**
86- Generate 50+ ad variations with consistent product/person identity
87- Create product-in-context images (your SaaS on different devices)
88- Style-match to existing brand assets using reference images
89- Rapid A/B test image variations
90 
91**Docs:** [Replicate Flux](https://replicate.com/black-forest-labs/flux-2-pro), [BFL API](https://docs.bfl.ml/)
92 
93---
94 
95### Ideogram
96 
97Specialized in typography and text rendering within images.
98 
99**Best for:** Ad banners with text, branded graphics, social ad images with headlines
100**API:** Ideogram API, Runware
101**Pricing:** ~$0.06/image (API), ~$0.009/image (subscription)
102 
103**Strengths:**
104- Best-in-class text rendering (~90% accuracy vs ~30% for most tools)
105- Style reference system (upload up to 3 reference images)
106- 4.3 billion style presets for consistent brand aesthetics
107- Strong at logos and branded typography
108 
109**Ad creative use cases:**
110- Generate ad banners with headline text directly in the image
111- Create social media graphics with branded text overlays
112- Produce multiple design variations with consistent typography
113- Generate promotional materials without needing a designer for each iteration
114 
115**Docs:** [Ideogram API](https://developer.ideogram.ai/), [Ideogram](https://ideogram.ai/)
116 
117---
118 
### Other Image Tools

| Tool | Best For | API Status | Notes |
|------|----------|------------|-------|
| **DALL-E 3** (OpenAI) | General image generation | Official API | Integrated with ChatGPT, good text rendering |
| **Midjourney** | Artistic, high-aesthetic images | No official public API | Discord-based; unofficial APIs exist but risk bans |
| **Stable Diffusion** | Self-hosted, customizable | Open source | Best for teams with GPU infrastructure |

---

## Video Generation

### Google Veo

Google DeepMind's video generation model, available through the Gemini API and Vertex AI.

**Best for:** High-quality video ads with native audio, vertical video for social
**API:** Gemini API, Vertex AI
**Pricing:** ~$0.15/sec (Veo 3.1 Fast), ~$0.40/sec (Veo 3.1 Standard)

**Capabilities:**
- Up to 60 seconds at 1080p
- Native audio generation (dialogue, sound effects, ambient)
- Vertical 9:16 output for Stories/Reels/Shorts
- Upscale to 4K
- Text-to-video and image-to-video

**Ad creative use cases:**
- Generate short video ads (15-30 sec) from text descriptions
- Create vertical video ads for TikTok, Reels, Shorts
- Produce product demos with voiceover
- Generate multiple video variations from the same prompt with different styles

**Docs:** [Veo on Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/video/overview)

---

### Kling (Kuaishou)

Video generation with simultaneous audio-visual generation and camera controls.

**Best for:** Cinematic video ads, longer-form content, audio-synced video
**API:** Kling API, PiAPI, fal.ai
**Pricing:** ~$0.09/sec (via fal.ai third-party)

**Capabilities:**
- Up to 3 minutes at 1080p/30-48fps
- Simultaneous audio-visual generation (Kling 2.6)
- Text-to-video and image-to-video
- Motion and camera controls

**Ad creative use cases:**
- Longer product explainer videos
- Cinematic brand videos with synchronized audio
- Animate product images into video ads

**Docs:** [Kling AI Developer](https://klingai.com/global/dev/model/video)

---

### Runway

Video generation and editing platform with strong controllability.

**Best for:** Controlled video generation, style-consistent content, editing existing footage
**API:** Runway Developer Portal

**Capabilities:**
- Gen-4: Character/scene consistency across shots
- Motion brush and camera controls
- Image-to-video with reference images
- Video-to-video style transfer

**Ad creative use cases:**
- Generate video ads with consistent characters/products across scenes
- Style-transfer existing footage to match brand aesthetics
- Extend or remix existing video content

**Docs:** [Runway API](https://docs.dev.runwayml.com/)

---

### Sora 2 (OpenAI)

OpenAI's video generation model with synchronized audio.

**Best for:** High-fidelity video with dialogue and sound
**API:** OpenAI API
**Pricing:** Free tier available; Pro from $0.10-0.50/sec depending on resolution

**Capabilities:**
- Up to 60 seconds with synchronized audio
- Dialogue, sound effects, and ambient audio
- sora-2 (fast) and sora-2-pro (quality) variants
- Text-to-video and image-to-video

**Ad creative use cases:**
- Video testimonials and talking-head style ads
- Product demo videos with narration
- Narrative brand videos

**Docs:** [OpenAI Video Generation](https://platform.openai.com/docs/guides/video-generation)

---

### Seedance 2.0 (ByteDance)

ByteDance's video generation model with simultaneous audio-visual generation and multimodal inputs.

**Best for:** Fast, affordable video ads with native audio, multimodal reference inputs
**API:** BytePlus (official), Replicate, WaveSpeedAI, fal.ai (third-party); OpenAI-compatible API format
**Pricing:** ~$0.10-0.80/min depending on resolution (estimated 10-100x cheaper than Sora 2 per clip)

**Capabilities:**
- Up to 20 seconds at up to 2K resolution
- Simultaneous audio-visual generation (Dual-Branch Diffusion Transformer)
- Text-to-video and image-to-video
- Up to 12 reference files for multimodal input
- OpenAI-compatible API structure

**Ad creative use cases:**
- High-volume short video ad production at low cost
- Video ads with synchronized voiceover and sound effects in one pass
- Multi-reference generation (feed product images, brand assets, style references)
- Rapid iteration on video ad concepts

**Docs:** [Seedance](https://seed.bytedance.com/en/seedance2_0)

---

### Higgsfield

Full-stack video creation platform with cinematic camera controls.

**Best for:** Social video ads, cinematic style, mobile-first content
**Platform:** [higgsfield.ai](https://higgsfield.ai/)

**Capabilities:**
- 50+ professional camera movements (zooms, pans, FPV drone shots)
- Image-to-video animation
- Built-in editing, transitions, and keyframing
- All-in-one workflow: image gen, animation, editing

**Ad creative use cases:**
- Social media video ads with cinematic feel
- Animate product images into dynamic video
- Create multiple video variations with different camera styles
- Quick-turn video content for social campaigns

---

### Video Tool Comparison

| Tool | Max Length | Audio | Resolution | API | Best For |
|------|-----------|-------|------------|-----|----------|
| **Veo 3.1** | 60 sec | Native | 1080p/4K | Gemini | Vertical social video |
| **Kling 2.6** | 3 min | Native | 1080p | Third-party | Longer cinematic |
| **Runway Gen-4** | 10 sec | No | 1080p | Official | Controlled, consistent |
| **Sora 2** | 60 sec | Native | 1080p | Official | Dialogue-heavy |
| **Seedance 2.0** | 20 sec | Native | 2K | Official + third-party | Affordable high-volume |
| **Higgsfield** | Varies | Yes | 1080p | Web-based | Social, mobile-first |

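The comparison can also be expressed as data and filtered by the requirements of a given ad. The sketch below is illustrative only: the `VideoTool` shape and `shortlistVideoTools` helper are hypothetical, the values are transcribed from the table above, and Higgsfield's "Varies" length is modeled as `null` and kept in the shortlist so it can be checked manually.

```typescript
interface VideoTool {
  name: string;
  maxSeconds: number | null; // null means "Varies" in the table
  hasAudio: boolean;
}

// Values transcribed from the comparison table above.
const VIDEO_TOOLS: VideoTool[] = [
  { name: "Veo 3.1", maxSeconds: 60, hasAudio: true },
  { name: "Kling 2.6", maxSeconds: 180, hasAudio: true },
  { name: "Runway Gen-4", maxSeconds: 10, hasAudio: false },
  { name: "Sora 2", maxSeconds: 60, hasAudio: true },
  { name: "Seedance 2.0", maxSeconds: 20, hasAudio: true },
  { name: "Higgsfield", maxSeconds: null, hasAudio: true },
];

// Return the names of tools that can produce a clip of the requested
// length, optionally requiring built-in audio.
function shortlistVideoTools(durationSeconds: number, needsAudio: boolean): string[] {
  return VIDEO_TOOLS.filter(
    (tool) =>
      (tool.maxSeconds === null || tool.maxSeconds >= durationSeconds) &&
      (!needsAudio || tool.hasAudio)
  ).map((tool) => tool.name);
}

// A 30-second ad that needs audio generated in the same pass.
const thirtySecondWithAudio = shortlistVideoTools(30, true);
```

For the 30-second case this leaves Veo 3.1, Kling 2.6, Sora 2, and Higgsfield; Runway Gen-4 and Seedance 2.0 drop out on length.
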
---

## Voice & Audio Generation

Use these tools to layer realistic voiceovers onto video ads, add narration to product demos, or generate audio for Remotion-rendered videos. They turn ad scripts into natural-sounding voice tracks.

### When to Use Voice Tools

Many video generators (Veo, Kling, Sora, Seedance) now include native audio. Use standalone voice tools when you need:

- **Voiceover on silent video**: Runway Gen-4 and Remotion produce silent output
- **Brand voice consistency**: Clone a specific voice for all ads
- **Multi-language versions**: Same ad script in 20+ languages
- **Script iteration**: Re-record voiceover without reshooting video
- **Precise control**: Exact timing, emotion, and pacing

---

### ElevenLabs

The market leader in realistic voice generation and voice cloning.

**Best for:** Most natural-sounding voiceovers, brand voice cloning, multilingual
**API:** REST API with streaming support
**Pricing:** ~$0.12-0.30 per 1,000 characters depending on plan; starts at $5/month

**Capabilities:**
- 29+ languages with natural accent and intonation
- Voice cloning from short audio clips (instant) or longer recordings (professional)
- Emotion and style control
- Streaming for real-time generation
- Voice library with hundreds of pre-built voices

**Ad creative use cases:**
- Generate voiceover tracks for video ads
- Clone your brand spokesperson's voice for all ad variations
- Produce the same ad in 10+ languages from one script
- A/B test different voice styles (authoritative vs. friendly vs. urgent)

**API example:**
```bash
curl -X POST "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}" \
  -H "xi-api-key: $ELEVENLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Stop wasting hours on manual reporting. Try DataFlow free for 14 days.",
    "model_id": "eleven_multilingual_v2",
    "voice_settings": {"stability": 0.5, "similarity_boost": 0.75}
  }' --output voiceover.mp3
```

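For A/B testing voice styles, the same request can be prepared once per voice. The following is a rough sketch that only builds request descriptors and sends nothing. The endpoint, header, and body fields mirror the curl example above, while `buildTtsRequest`, the `TtsRequest` shape, the voice IDs, and the API key string are hypothetical placeholders.

```typescript
interface TtsRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build one text-to-speech request descriptor, mirroring the curl example.
function buildTtsRequest(voiceId: string, text: string, apiKey: string): TtsRequest {
  return {
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    headers: { "xi-api-key": apiKey, "Content-Type": "application/json" },
    body: JSON.stringify({
      text,
      model_id: "eleven_multilingual_v2",
      voice_settings: { stability: 0.5, similarity_boost: 0.75 },
    }),
  };
}

// Placeholder voice IDs: replace with real IDs from your own voice library.
const voiceIds = ["voice-authoritative", "voice-friendly", "voice-urgent"];
const script = "Stop wasting hours on manual reporting. Try DataFlow free for 14 days.";

// One request per voice style, ready to hand to an HTTP client.
const ttsRequests = voiceIds.map((id) => buildTtsRequest(id, script, "YOUR_API_KEY"));
```

Each descriptor can then be sent as a POST with any HTTP client, saving the response body as the audio file for that variation.
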
**Docs:** [ElevenLabs API](https://elevenlabs.io/docs/api-reference/text-to-speech)

---

### OpenAI TTS

Simple, affordable text-to-speech built into the OpenAI API.

**Best for:** Quick voiceovers, cost-effective at scale, simple integration
**API:** OpenAI API (same SDK as GPT/DALL-E)
**Pricing:** $15/million chars (standard), $30/million chars (HD); ~$0.015/min with gpt-4o-mini-tts

**Capabilities:**
- 13 built-in voices (no custom cloning)
- Multiple languages
- Real-time streaming
- HD quality option
- Simple API: same SDK you already use for GPT

**Ad creative use cases:**
- Fast, cheap voiceover for draft/test ad versions
- High-volume narration at low cost
- Prototype ad audio before investing in premium voice

**Docs:** [OpenAI TTS](https://platform.openai.com/docs/guides/text-to-speech)

---

### Cartesia Sonic

Ultra-low latency voice generation built for real-time applications.

**Best for:** Real-time voice, lowest latency, emotional expressiveness
**API:** REST + WebSocket streaming
**Pricing:** Starts at $5/month; pay-as-you-go from $0.03/min

**Capabilities:**
- 40ms time-to-first-audio (fastest in class)
- 15+ languages
- Nonverbal expressiveness: laughter, breathing, emotional inflections
- Sonic Turbo for even lower latency
- Streaming API for real-time generation

**Ad creative use cases:**
- Real-time ad preview during creative iteration
- Interactive demo videos with dynamic narration
- Ads requiring natural laughter, sighs, or emotional reactions

**Docs:** [Cartesia Sonic](https://docs.cartesia.ai/build-with-cartesia/tts-models/latest)

---

### Voicebox (Open Source)

Free, local-first voice synthesis studio powered by Qwen3-TTS. The open-source alternative to ElevenLabs.

**Best for:** Free voice cloning, local/private generation, zero-cost batch production
**API:** Local REST API at `http://localhost:8000`
**Pricing:** Free (MIT license). Runs entirely on your machine.
**Stack:** Tauri (Rust) + React + FastAPI (Python)

**Capabilities:**
- Voice cloning from short audio samples via Qwen3-TTS
- Multi-language support (English, Chinese, more planned)
- Multi-track timeline editor for composing conversations
- 4-5x faster inference on Apple Silicon via MLX Metal acceleration
- Local REST API for programmatic generation
- No cloud dependency: all processing on-device

**Ad creative use cases:**
- Free voice cloning for brand spokesperson across all ad variations
- Batch generate voiceovers without per-character costs
- Private/local generation when ad content is sensitive or pre-launch
- Prototype voice variations before committing to a paid service

**API example:**
```bash
curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"text": "Stop wasting hours on manual reporting.", "profile_id": "abc123", "language": "en"}'
```

**Install:** Desktop apps for macOS and Windows at [voicebox.sh](https://voicebox.sh), or build from source:
```bash
git clone https://github.com/jamiepine/voicebox.git
cd voicebox && make setup && make dev
```

**Docs:** [GitHub](https://github.com/jamiepine/voicebox)

---

### Other Voice Tools

| Tool | Best For | Differentiator | API |
|------|----------|---------------|-----|
| **PlayHT** | Large voice library, low latency | 900+ voices, <300ms latency, ultra-realistic | [play.ht](https://play.ht/) |
| **Resemble AI** | Enterprise voice cloning | On-premise deployment, real-time speech-to-speech | [resemble.ai](https://www.resemble.ai/) |
| **WellSaid Labs** | Ethical, commercial-safe voices | Voices from compensated actors, safe for commercial use | [wellsaid.io](https://www.wellsaid.io/) |
| **Fish Audio** | Budget-friendly, emotion control | ~50-70% cheaper than ElevenLabs, emotion tags | [fish.audio](https://fish.audio/) |
| **Murf AI** | Non-technical teams | Browser-based studio, 200+ voices | [murf.ai](https://murf.ai/) |
| **Google Cloud TTS** | Google ecosystem, scale | 220+ voices, 40+ languages, enterprise SLAs | [Google TTS](https://cloud.google.com/text-to-speech) |
| **Amazon Polly** | AWS ecosystem, cost | Neural voices, SSML control, cheap at volume | [Amazon Polly](https://aws.amazon.com/polly/) |

---

### Voice Tool Comparison

| Tool | Quality | Cloning | Languages | Latency | Price (per 1K chars unless noted) |
|------|---------|---------|-----------|---------|-----------------------------------|
| **ElevenLabs** | Best | Yes (instant + pro) | 29+ | ~200ms | $0.12-0.30 |
| **OpenAI TTS** | Good | No | 13+ | ~300ms | $0.015-0.030 |
| **Cartesia Sonic** | Very good | No | 15+ | ~40ms | ~$0.03/min |
| **PlayHT** | Very good | Yes | 140+ | <300ms | ~$0.10-0.20 |
| **Fish Audio** | Good | Yes | 13+ | ~200ms | ~$0.05-0.10 |
| **WellSaid** | Very good | No (actor voices) | English | ~300ms | Custom pricing |
| **Voicebox** | Good | Yes (local) | 2+ | Local | Free (open source) |

### Choosing a Voice Tool

```
Need voiceover for ads?
├── Need to clone a specific brand voice?
│   ├── Best quality → ElevenLabs
│   ├── Enterprise/on-premise → Resemble AI
│   └── Budget-friendly → Fish Audio, PlayHT
├── Need multilingual (same ad, many languages)?
│   ├── Most languages → PlayHT (140+)
│   └── Best quality → ElevenLabs (29+)
├── Need free / open source / local?
│   └── Voicebox (MIT, runs on your machine)
├── Need cheap, fast, good-enough?
│   └── OpenAI TTS ($0.015/min)
├── Need commercially-safe licensing?
│   └── WellSaid Labs (actor-compensated voices)
└── Need real-time/interactive?
    └── Cartesia Sonic (40ms TTFA)
```

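If tool selection needs to happen inside a pipeline rather than by hand, the tree above can be encoded directly. This is one possible encoding, not a canonical one: the `VoiceNeed` type and `chooseVoiceTool` function are hypothetical names, and the branches simply restate the tree.

```typescript
// Each variant corresponds to one top-level branch of the tree above.
type VoiceNeed =
  | { kind: "clone"; priority: "quality" | "on-premise" | "budget" }
  | { kind: "multilingual"; priority: "most-languages" | "quality" }
  | { kind: "local" }
  | { kind: "cheap" }
  | { kind: "commercial-safe" }
  | { kind: "real-time" };

// Return the tool(s) the decision tree recommends for a given need.
function chooseVoiceTool(need: VoiceNeed): string[] {
  switch (need.kind) {
    case "clone":
      if (need.priority === "quality") return ["ElevenLabs"];
      if (need.priority === "on-premise") return ["Resemble AI"];
      return ["Fish Audio", "PlayHT"];
    case "multilingual":
      return need.priority === "most-languages" ? ["PlayHT"] : ["ElevenLabs"];
    case "local":
      return ["Voicebox"];
    case "cheap":
      return ["OpenAI TTS"];
    case "commercial-safe":
      return ["WellSaid Labs"];
    case "real-time":
      return ["Cartesia Sonic"];
  }
}

// Budget-friendly brand voice cloning.
const budgetCloneTools = chooseVoiceTool({ kind: "clone", priority: "budget" });
```

Modeling the needs as a discriminated union lets the compiler flag any branch that is added to the type but not handled in the switch.
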
### Workflow: Voice + Video

```
1. Write ad script (use ad-creative skill for copy)
2. Generate voiceover with ElevenLabs/OpenAI TTS
3. Generate or render video:
   a. Silent video from Runway/Remotion → layer voice track
   b. Or use Veo/Sora/Seedance with native audio (skip separate VO)
4. Combine with ffmpeg if layering separately:
   ffmpeg -i video.mp4 -i voiceover.mp3 -c:v copy -c:a aac output.mp4
5. Generate variations (different scripts, voices, or languages)
```

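When step 4 runs for many variations, it helps to build the ffmpeg arguments in code. The sketch below only assembles the argument list and does not run anything. Compared with the command above it adds explicit `-map` selectors and `-shortest` (both standard ffmpeg options) so the output ends with the shorter input; that is a design choice of this example, not something the workflow requires. `buildMuxArgs` is a hypothetical helper name.

```typescript
// Build ffmpeg arguments that copy the video stream from the first input
// and encode the voiceover from the second input as AAC.
function buildMuxArgs(videoPath: string, voicePath: string, outputPath: string): string[] {
  return [
    "-i", videoPath,
    "-i", voicePath,
    "-map", "0:v:0", // first video stream of the first input
    "-map", "1:a:0", // first audio stream of the second input
    "-c:v", "copy",
    "-c:a", "aac",
    "-shortest", // stop at the end of the shorter input
    outputPath,
  ];
}

// One argument list per language version (file names are placeholders).
const languages = ["en", "de", "es"];
const muxJobs = languages.map((lang) =>
  buildMuxArgs("video.mp4", `voiceover-${lang}.mp3`, `output-${lang}.mp4`)
);
```

Each list can be passed to a process runner as the arguments for `ffmpeg`, which avoids shell quoting problems with file names.
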
---

## Code-Based Video: Remotion

For templated, data-driven video ads at scale, Remotion is a strong fit. Unlike AI video generators, which produce a different video from each prompt, Remotion uses React code to render deterministic, brand-perfect video from templates and data.

**Best for:** Templated ad variations, personalized video, brand-consistent production
**Stack:** React + TypeScript
**Pricing:** Free for individuals/small teams; commercial license required for 4+ employees
**Docs:** [remotion.dev](https://www.remotion.dev/)

### Why Remotion for Ads

| AI Video Generators | Remotion |
|---------------------|----------|
| Unique output each time | Deterministic, pixel-perfect |
| Prompt-based, less control | Full code control over every frame |
| Hard to match brand exactly | Exact brand colors, fonts, spacing |
| One-at-a-time generation | Batch render hundreds from data |
| No dynamic data insertion | Personalize with names, prices, stats |

### Ad Creative Use Cases

**1. Dynamic product ads**
Feed a JSON array of products and render a unique video ad for each:
```tsx
// Simplified Remotion component for product ads
import React from 'react';
import {AbsoluteFill, Img} from 'remotion';

export const ProductAd: React.FC<{
  productName: string;
  price: string;
  imageUrl: string;
  tagline: string;
}> = ({productName, price, imageUrl, tagline}) => {
  return (
    <AbsoluteFill style={{backgroundColor: '#fff'}}>
      <Img src={imageUrl} style={{width: 400, height: 400}} />
      <h1>{productName}</h1>
      <p>{tagline}</p>
      <div className="price">{price}</div>
      <div className="cta">Shop Now</div>
    </AbsoluteFill>
  );
};
```

**2. A/B test video variations**
Render the same template with different headlines, CTAs, or color schemes:
```tsx
const variations = [
  {headline: "Save 50% Today", cta: "Get the Deal", theme: "urgent"},
  {headline: "Join 10K+ Teams", cta: "Start Free", theme: "social-proof"},
  {headline: "Built for Speed", cta: "Try It Now", theme: "benefit"},
];
// Render all variations programmatically
```

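One way to render all variations programmatically is to generate one `remotion render` invocation per variation, passing the variation as `--props`. The sketch below only builds the argument lists. `buildRenderArgs`, the `AdVariation` type, and the output naming scheme are hypothetical, and the entry point `src/index.ts` and composition ID `MyComposition` are assumed to match your project.

```typescript
interface AdVariation {
  headline: string;
  cta: string;
  theme: string;
}

const adVariations: AdVariation[] = [
  { headline: "Save 50% Today", cta: "Get the Deal", theme: "urgent" },
  { headline: "Join 10K+ Teams", cta: "Start Free", theme: "social-proof" },
  { headline: "Built for Speed", cta: "Try It Now", theme: "benefit" },
];

// Build the arguments for `npx remotion render ...` for one variation.
// The variation is serialized to JSON and passed as input props.
function buildRenderArgs(
  variation: AdVariation,
  index: number,
  entryPoint: string,
  compositionId: string
): string[] {
  const outputPath = `out/${compositionId}-${index + 1}-${variation.theme}.mp4`;
  return [
    "remotion",
    "render",
    entryPoint,
    compositionId,
    outputPath,
    `--props=${JSON.stringify(variation)}`,
  ];
}

const renderJobs = adVariations.map((variation, index) =>
  buildRenderArgs(variation, index, "src/index.ts", "MyComposition")
);
```

Passing each list as arguments to `npx` from a process runner, rather than through a shell string, sidesteps quoting issues with the JSON props.
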
**3. Personalized outreach videos**
Generate videos addressing prospects by name for cold outreach or sales.

**4. Social ad batch production**
Render the same content across different aspect ratios:
- 1:1 for feed
- 9:16 for Stories/Reels
- 16:9 for YouTube

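The aspect ratios above map to concrete pixel dimensions that a render template needs as width and height. The following sketch lists one plausible set and checks that each pair reduces to its stated ratio. The 1080x1080 and 1080x1920 sizes match the platform spec table in this document; 1920x1080 for 16:9 is an assumption, and the `AdFormat` type and helper names are hypothetical.

```typescript
interface AdFormat {
  placement: string;
  aspectRatio: string;
  width: number;
  height: number;
}

const AD_FORMATS: AdFormat[] = [
  { placement: "feed", aspectRatio: "1:1", width: 1080, height: 1080 },
  { placement: "stories-reels", aspectRatio: "9:16", width: 1080, height: 1920 },
  { placement: "youtube", aspectRatio: "16:9", width: 1920, height: 1080 }, // assumed size
];

// Greatest common divisor via the Euclidean algorithm.
function gcd(a: number, b: number): number {
  return b === 0 ? a : gcd(b, a % b);
}

// Check that width:height reduces to the declared aspect ratio.
function matchesAspectRatio(format: AdFormat): boolean {
  const divisor = gcd(format.width, format.height);
  const reduced = `${format.width / divisor}:${format.height / divisor}`;
  return reduced === format.aspectRatio;
}

const allFormatsValid = AD_FORMATS.every(matchesAspectRatio);
```

A check like this catches transposed dimensions (for example 1920x1080 entered for a 9:16 placement) before a batch render starts.
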
### Remotion Workflow for Ad Creative

```
1. Design template in React (or use AI to generate the component)
2. Define data schema (products, headlines, CTAs, images)
3. Feed data array into template
4. Batch render all variations
5. Upload to ad platform
```

### Getting Started

```bash
# Create a new Remotion project
npx create-video@latest

# Render a single video
npx remotion render src/index.ts MyComposition out/video.mp4

# Batch render from data
npx remotion render src/index.ts MyComposition --props='{"data": [...]}'
```

---

## Choosing the Right Tool

### Decision Tree

```
Need video ads?
├── Templated, data-driven (same structure, different data)
│   └── Use Remotion
├── Unique creative from prompts (exploratory)
│   ├── Need dialogue/voiceover? → Sora 2, Veo 3.1, Kling 2.6, Seedance 2.0
│   ├── Need consistency across scenes? → Runway Gen-4
│   ├── Need vertical social video? → Veo 3.1 (native 9:16)
│   ├── Need high volume at low cost? → Seedance 2.0
│   └── Need cinematic camera work? → Higgsfield, Kling
└── Both → Use AI gen for hero creative, Remotion for variations

Need image ads?
├── Need text/headlines in image? → Ideogram
├── Need product consistency across variations? → Flux (multi-ref)
├── Need quick iterations on existing images? → Nano Banana Pro
├── Need highest visual quality? → Flux Pro, Midjourney
└── Need high volume at low cost? → Flux Klein, Nano Banana
```

### Cost Comparison for 100 Ad Variations

| Approach | Tool | Approximate Cost |
|----------|------|-----------------|
| 100 static images | Nano Banana Pro | ~$4-24 |
| 100 static images | Flux Dev | ~$1-2 |
| 100 static images | Ideogram API | ~$6 |
| 100 × 15-sec videos | Veo 3.1 Fast | ~$225 |
| 100 × 15-sec videos | Remotion (templated) | ~$0 (self-hosted render) |
| 10 hero videos + 90 templated | Veo + Remotion | ~$22.50 + render time |

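The video and Ideogram rows follow from the per-unit prices quoted earlier in this document (~$0.15/sec for Veo 3.1 Fast, ~$0.06/image for the Ideogram API). A minimal sketch of that arithmetic, with hypothetical helper names and results rounded to cents:

```typescript
// Round a dollar amount to the nearest cent.
function roundToCents(amount: number): number {
  return Math.round(amount * 100) / 100;
}

// Cost of a batch of generated clips billed per second of output.
function videoBatchCost(clipCount: number, secondsPerClip: number, pricePerSecond: number): number {
  return roundToCents(clipCount * secondsPerClip * pricePerSecond);
}

// Cost of a batch of images billed per image.
function imageBatchCost(imageCount: number, pricePerImage: number): number {
  return roundToCents(imageCount * pricePerImage);
}

const VEO_FAST_PER_SECOND = 0.15; // from the Veo pricing line
const IDEOGRAM_PER_IMAGE = 0.06; // from the Ideogram pricing line

const allGeneratedVideo = videoBatchCost(100, 15, VEO_FAST_PER_SECOND); // 100 x 15-sec clips
const hybridHeroVideo = videoBatchCost(10, 15, VEO_FAST_PER_SECOND); // 10 hero clips only
const ideogramImages = imageBatchCost(100, IDEOGRAM_PER_IMAGE); // 100 static images
```

The hybrid row costs one tenth of the fully generated row because only the 10 hero clips are billed per second; the 90 templated renders cost compute time rather than API fees.
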
### Recommended Workflow for Scaled Ad Production

1. **Generate hero creative** with AI (Nano Banana, Flux, Veo): high-quality, exploratory
2. **Build templates** in Remotion based on winning creative patterns
3. **Batch produce variations** with Remotion using data (products, headlines, CTAs)
4. **Iterate**: use AI tools for new angles, Remotion for scale

This hybrid approach gives you the creative exploration of AI generators and the consistency and scale of code-based rendering.

---

## Platform-Specific Image Specs

When generating images for ads, request the correct dimensions:

| Platform | Placement | Aspect Ratio | Recommended Size |
|----------|-----------|-------------|-----------------|
| Meta Feed | Single image | 1:1 | 1080x1080 |
| Meta Stories/Reels | Vertical | 9:16 | 1080x1920 |
| Meta Carousel | Square | 1:1 | 1080x1080 |
| Google Display | Landscape | 1.91:1 | 1200x628 |
| Google Display | Square | 1:1 | 1200x1200 |
| LinkedIn Feed | Landscape | 1.91:1 | 1200x627 |
| LinkedIn Feed | Square | 1:1 | 1200x1200 |
| TikTok Feed | Vertical | 9:16 | 1080x1920 |
| Twitter/X Feed | Landscape | 16:9 | 1200x675 |
| Twitter/X Card | Landscape | 1.91:1 | 800x418 |

Include these dimensions in your generation prompts to avoid needing to crop or resize.