UGC Video Turbo

UGC videos with face-reference lip-sync

UGC Video Turbo generates authentic UGC-style videos with real face references, native voice synthesis, and ambient audio, all from a single persona headshot. Drop in a real photo and get back clips where the character speaks in a natural voice with a consistent face across every scene.

Unlike standard video generation tools, Turbo accepts real person photos as the face reference with no blocking. The voice is generated natively in each clip, steered by the persona tone and accent settings in the prompt, so there is no separate TTS pass, no dubbed audio, and no lip-sync mismatch. The result is styled to look and sound like real iPhone UGC footage rather than AI-generated video.

What you can do

  • research_audience: deep audience research covering pain points, emotional triggers, language patterns, and objections
  • plan_bias_stack: optional pre-planning step that maps psychological mechanisms (information gap, empathy gap, Zeigarnik effect, etc.) to each section of the script before writing
  • generate_creative: hooks, scripts, and scenes in one fast call with ugc_style direction
  • generate_frames: first-frame images with real persona face reference and product compositing
  • generate_videos: animate frames via face-reference synthesis with native voice and ambient audio
  • composite_product: insert a real product image into frames before video generation
  • regenerate_frame: fix a single weak frame without re-running the pipeline
  • assemble_final: stitch clips into a finished video with transitions
  • check_video / check_image: poll for pending generation jobs

Who it's for

DTC brands, performance marketers, and agencies who need character-consistent UGC video with real-looking faces and natural-sounding speech — not dubbed AI voices over animated avatars.

How to use it

  1. Start with research_audience — pass company name, product name, and description
  2. Run generate_creative with the full research output and ugc_style — review before proceeding
  3. Call generate_frames with a real persona headshot as persona_image_url for face consistency
  4. Run generate_videos — Turbo passes your headshot as a face reference so the character looks consistent across clips (takes 2–10 minutes)
  5. Finish with assemble_final to produce the completed video
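
The five steps above can be sketched as one function. This is a minimal illustration, not the tool's client: `call_tool` is a hypothetical transport you supply, and every argument key and result key other than `ugc_style` and `persona_image_url` (for example `company_name`, `research`, `scenes`, `frames`, `clips`) is an assumed name. The sketch also runs straight through, whereas step 2 recommends reviewing the creative output before continuing.

```python
from typing import Any, Callable, Dict

# Hypothetical transport: any function that takes a tool name and an argument
# dict and returns the tool's result. How tools are actually invoked depends
# on your client and is not specified here.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_turbo_pipeline(
    call_tool: ToolCaller,
    company: str,
    product: str,
    description: str,
    ugc_style: str,
    persona_image_url: str,
) -> Dict[str, Any]:
    """Run the five documented steps in order and return the final result."""
    # Step 1: audience research (argument names are assumptions).
    research = call_tool("research_audience", {
        "company_name": company,
        "product_name": product,
        "description": description,
    })
    # Step 2: hooks, scripts, and scenes from the full research output.
    creative = call_tool("generate_creative", {
        "research": research,
        "ugc_style": ugc_style,
    })
    # Step 3: first frames, using the headshot for face consistency.
    frames = call_tool("generate_frames", {
        "scenes": creative.get("scenes", []),
        "persona_image_url": persona_image_url,
    })
    # Step 4: animate the frames (documented as taking 2-10 minutes).
    videos = call_tool("generate_videos", {
        "frames": frames.get("frames", []),
        "persona_image_url": persona_image_url,
    })
    # Step 5: stitch the clips into the finished video.
    return call_tool("assemble_final", {"clips": videos.get("clips", [])})
```

Injecting `call_tool` keeps the ordering logic separate from whatever client actually reaches the tool, so the sequence can be exercised with a stub before any real generation is triggered.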

Getting started

Connect your MuAPI account to enable Turbo video generation. Upload persona, scene, product, and outfit assets to your file library for best results.

Research Audience

Step 1: Research audience pain points, language, triggers, and objections. Async (1-3 min). Do NOT skip. Pass full output to generate_creative.

Returns: Structured audience research with pain points, problems, language patterns, emotional triggers, objections, and research summary

Write Hooks

Generate UGC video hooks using the 3-variable framework (Angle + Aesthetic + Action). Uses RAG from the UGC playbook. ⏱ ~5 seconds. Requires research output from step 1.

Returns: Array of hooks, each with text, angle, aesthetic, action, emotional trigger, and target pain point

Write Scripts

Generate authentic UGC scripts using But/Therefore zigzag structure. ⏱ 5-10s per hook. Requires hooks from write_hooks.

Returns: Array of scripts, each with hook_id, variation, hook_text, script_body, CTA, duration estimate, and mini-hooks at drop-off timestamps

Generate Scenes

Generate visual scene descriptions for each script, optimized for AI image/video generation. Set "format" for format-specific direction. Takes ~5s per script.

Returns: Array of scenes, each with scene description, setting, lighting, wardrobe, camera angle, mood, and shot list with visual changes

Plan Bias Stack

Step 1.5 (optional but recommended): Map psychological mechanisms to each video section before writing scripts. Fast (~5s). Pass full research_audience output and receive a bias_stack to pass to generate_creative.

Returns: bias_stack with hook/pain/mechanism_bridge/transformation/cta plans, execution_order, incompatibility_warnings, research_quality_warnings. Pass to generate_creative.

Generate Creative (Fast)

Generate hooks, scripts, and scene descriptions in one call (~8s). Set "format" for format-specific output. Photo formats return compositions instead of hooks/scripts. Pass FULL research output. If no preset fits, use ugc_style='custom' with custom_style.

Returns: Video: hooks, scripts (with written_hook, hook_mechanism, hook_gap_text, vulnerability_moment, loss_statement, zeigarnik_loops), scenes arrays. Photo: compositions + scenes.
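
As a rough sketch of assembling the arguments for generate_creative, the helper below enforces the two rules stated above: the full research output must be present, and ugc_style='custom' needs a custom_style. The three custom_style fields come from the v0.05 changelog entry. The top-level key names `research` and `bias_stack`, and the idea that custom_style is a nested object, are assumptions for illustration.

```python
from typing import Any, Dict, Optional

# Fields a custom style is described as supplying (per the v0.05 changelog).
CUSTOM_STYLE_FIELDS = ("camera_prompt", "scene_direction", "video_prompt")


def build_creative_args(
    research: Dict[str, Any],
    ugc_style: str,
    custom_style: Optional[Dict[str, str]] = None,
    bias_stack: Optional[Dict[str, Any]] = None,
    no_dialogue: bool = False,
) -> Dict[str, Any]:
    """Build a generate_creative argument dict (key names are assumed)."""
    if not research:
        raise ValueError("pass the full research_audience output")
    args: Dict[str, Any] = {"research": research, "ugc_style": ugc_style}
    if ugc_style == "custom":
        supplied = custom_style or {}
        missing = [f for f in CUSTOM_STYLE_FIELDS if not supplied.get(f)]
        if missing:
            raise ValueError("custom_style is missing: " + ", ".join(missing))
        args["custom_style"] = dict(supplied)
    if bias_stack is not None:
        # Optional output of plan_bias_stack.
        args["bias_stack"] = bias_stack
    if no_dialogue:
        args["no_dialogue"] = True
    return args
```

Failing early on an incomplete custom style is cheaper than discovering the problem after frames and videos have been generated.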

Generate First Frames

Generate first-frame images from scenes. ~30s async. Pass existing_frames to skip frames that already have image_url. Pass persona_image_url for face consistency.

Returns: Array of frames with image URLs and scene descriptions, plus asset paths for auto-upload
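
A small helper can separate frames that already have an image from those that still need one, which is the information existing_frames is meant to carry on a retry. This is a sketch that assumes each frame is a dict with an optional `image_url` key; the real frame shape may differ.

```python
from typing import Any, Dict, List, Tuple

Frame = Dict[str, Any]


def split_frames(frames: List[Frame]) -> Tuple[List[Frame], List[int]]:
    """Return (frames that already have an image_url, indices still missing one)."""
    done = [f for f in frames if f.get("image_url")]
    missing = [i for i, f in enumerate(frames) if not f.get("image_url")]
    return done, missing
```

On a retry, `done` would be passed as existing_frames, and `missing` tells you which scenes are still outstanding.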

Generate Videos (Turbo)

Generate UGC video clips via MuAPI Seedance 2.0 Omni-Reference. Accepts real face photos as persona_image_url (no fal.ai face block). Takes 2-10 MINUTES. Seedance generates speech and ambient audio natively; the voice is steered by persona.tone and persona.accent. Use check_video for pending clips.

Returns: Array of video clips with video URLs, script indices, segment indices, and duration. Long scripts auto-split across multiple clips with continuation keyframes.
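
Because long scripts are split across several clips, it can help to group the returned clips by script and order them by segment before assembly. The sketch below assumes each clip is a dict with `script_index`, `segment_index`, and `video_url` keys; those field names are guesses based on the description above, not a documented schema.

```python
from collections import defaultdict
from typing import Any, Dict, List


def clips_by_script(clips: List[Dict[str, Any]]) -> Dict[int, List[str]]:
    """Map each script index to its clip URLs in segment order."""
    grouped: Dict[int, List[Dict[str, Any]]] = defaultdict(list)
    for clip in clips:
        grouped[clip["script_index"]].append(clip)
    return {
        script: [c["video_url"] for c in sorted(group, key=lambda c: c["segment_index"])]
        for script, group in sorted(grouped.items())
    }
```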

Check Video Status

Check on a pending MuAPI video that was still generating when generate_videos returned. Pass the muapi_request_id. Returns the video URL if ready, or current status.

Returns: Video URL if completed, or current status (processing/failed) with instructions
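
Polling for a pending clip might look like the loop below. It is a sketch under assumptions: `check_video` is a hypothetical callable wrapping the tool, the result is assumed to be a dict with a `video_url` key when ready and a `status` of "processing" or "failed" otherwise, and the interval and timeout defaults are arbitrary.

```python
import time
from typing import Any, Callable, Dict, Optional


def wait_for_video(
    check_video: Callable[[str], Dict[str, Any]],
    muapi_request_id: str,
    interval_s: float = 15.0,
    timeout_s: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll until the clip is ready; return its URL, or None on timeout."""
    waited = 0.0
    while True:
        result = check_video(muapi_request_id)
        if result.get("video_url"):
            return result["video_url"]
        if result.get("status") == "failed":
            raise RuntimeError(f"video {muapi_request_id} failed")
        if waited >= timeout_s:
            return None
        # Still processing: wait before asking again.
        sleep(interval_s)
        waited += interval_s
```

The `sleep` parameter exists only so the loop can be tested without real delays.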

Check Image Status

Check on a pending image that was still generating when generate_frames timed out. Pass the fal_request_id from pending_frames.

Returns: Image URL if completed, or current status with instructions

Regenerate Frame

Re-generate a single frame without re-running the full pipeline. Accepts optional revision notes for targeted edits.

Returns: Single frame object with image URL and scene description

Composite Product

Composite a real product image into frames. Inline chat images CANNOT be passed as URLs; direct the user to toolrouter.com/dashboard/files to upload their image.

Returns: Object with "frames" array (composited frames) — pass directly to generate_videos
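
Since inline chat images cannot be passed as URLs, a caller may want a quick check that a product image value is a hosted http(s) URL before calling composite_product. The helper below is an illustrative guard using only the standard library; it does not confirm that the URL resolves or that the tool will accept it.

```python
from urllib.parse import urlparse


def is_hosted_image_url(value: str) -> bool:
    """True if value looks like an http(s) URL with a host."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

A data URI or a bare filename fails this check, which is the cue to ask the user to upload the image to their file library instead.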

Assemble Final Video

Stitch video clips into one final video with text overlays and transitions. Last step after generate_videos. Takes ~30-120s. Async.

Returns: Final assembled video file (auto-uploaded via asset system)

List Models

List available models for this tool, sorted by popularity. Returns provider details and pricing.

Returns: List of available models with pricing and provider info

v0.07 (2026-04-27)
  • Added plan_bias_stack, an optional behavioral-science pre-planning step that maps psychological mechanisms (information gap, empathy gap, Zeigarnik effect, narrative transportation) to each section of the script before writing. Scripts now output a written_hook (a 3–8 word text overlay, readable with the sound muted) separate from the spoken hook_text.
v0.06 (2026-04-27)
  • Added no_dialogue option — generate silent visual-only videos where the creator does not speak. Pass no_dialogue: true to generate_creative and generate_videos.
v0.05 (2026-04-25)
  • Added "custom" ugc_style — supply your own camera_prompt, scene_direction, and video_prompt to film in any style beyond the six presets.
v0.04 (2026-04-24)
  • Added ugc_style — choose how the video is filmed: hand-held selfie, outfit showcase (propped camera, full body, closing shot), mirror selfie, GRWM, unboxing, or walk-and-talk. Style drives camera setup, scene direction, and video motion across the full pipeline.
v0.03 (2026-04-22)
  • Product image now composited into frames and passed as @Image3 reference to Seedance so the real product appears in every clip
  • Tightened iPhone front-camera sensor aesthetic — grain, highlight clipping, compressed dynamic range
v0.02 (2026-04-14)
  • Renamed to UGC Video Turbo — accept scenes, products, and outfits from your file library
v0.01 (2026-04-13)
  • Initial release — Seedance 2.0 Omni-Reference via MuAPI, face-reference persona images, ElevenLabs TTS lip-sync
