Seedance 1.5 Pro: Text-to-Video with Built-in Audio & Sound Effects | RunComfy

bytedance/seedance-v1.5-pro/text-to-video

Turn text prompts into 720p videos with synchronized dialogue, sound effects, and music. The premier text-to-video engine for creating audio-visual narratives from scratch.

Prompt: keep it under 500 characters for better results.
Fixed camera: whether to keep the camera fixed for the whole video.
The rate is $0.012 per second for 480p without audio, $0.024 per second for 480p with audio, $0.026 per second for 720p without audio, and $0.052 per second for 720p with audio.
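
As a rough illustration of how these per-second rates add up, the sketch below estimates the cost of a single clip. The function name `estimate_cost` and the `(resolution, with_audio)` lookup keys are assumptions made for this example and are not part of any RunComfy API; only the four rates come from the pricing above.

```python
from decimal import Decimal

# Per-second rates quoted on this page, keyed by (resolution, with_audio).
# The key layout is an illustrative choice, not an official schema.
RATES_USD_PER_SECOND = {
    ("480p", False): Decimal("0.012"),
    ("480p", True): Decimal("0.024"),
    ("720p", False): Decimal("0.026"),
    ("720p", True): Decimal("0.052"),
}


def estimate_cost(resolution: str, with_audio: bool, seconds: int) -> Decimal:
    """Return the estimated cost in USD for one clip (hypothetical helper)."""
    key = (resolution, with_audio)
    if key not in RATES_USD_PER_SECOND:
        raise ValueError(f"no listed rate for {resolution!r}, audio={with_audio}")
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Decimal avoids binary floating-point rounding on currency values.
    return RATES_USD_PER_SECOND[key] * seconds


if __name__ == "__main__":
    # A 10-second 720p clip with audio versus the same clip at 480p without audio.
    print(estimate_cost("720p", True, 10))
    print(estimate_cost("480p", False, 10))
```

Under these rates, enabling audio doubles the per-second price at either resolution, so drafting at 480p without audio and re-rendering the final version at 720p with audio can keep iteration costs down.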

Introduction to Seedance 1.5 Pro Text-to-Video

Seedance 1.5 Pro Text-to-Video is the ultimate "Director's Engine," capable of turning written scripts into fully realized 720p cinematics with synchronized audio. It constructs entire worlds, characters, and soundscapes purely from text descriptions. It features a breakthrough unified architecture that generates pixels and sound waves simultaneously, allowing creators to describe a scene's visuals, camera movements, dialogue (in multiple dialects), and background score in a single prompt. For developers, Seedance 1.5 Pro Text-to-Video on RunComfy offers a streamlined API to build automated content generation pipelines without needing visual assets.

Key Capabilities


1. Script-to-Screen Audio Synthesis

Seedance 1.5 Pro Text-to-Video is unique because it "reads" your prompt to generate sound.

  • Text-to-Dialogue: Write a line of text, and a generated character will speak it with synchronized lip movement. Supports English, Chinese, and specific dialects (e.g., "Speak in Sichuan dialect").
  • Text-to-SFX: Describe a sound (e.g., "sound of glass breaking," "distant thunder"), and the model generates the audio to match the visual event precisely.
  • Atmosphere Generation: Just describe the mood (e.g., "eerie silence," "bustling market"), and the model fills the audio track with appropriate ambient noise.

2. Prompt-Driven Cinematography

You are the cinematographer. The model follows technical camera terms in your text prompt:

  • Camera Moves: Execute complex shots purely via text commands.
  • Visual Style: Define the film stock, lighting, and color grading without needing a reference image.

3. Character & World Creation

  • Zero-Shot Creation: Create fantasy creatures, specific architectural styles, or historical figures from descriptions alone.
  • Emotional Acting: Direct the actor's performance via text.

Master the Audio-Visual Prompting Guide

In Text-to-Video, your prompt is the script. You must describe Visuals, Camera, and Audio together.


The Scripting Formula

> [Scene & Character] + [Camera Movement] + [Audio/Dialogue Instruction]
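
One way to apply the formula consistently is to assemble the three parts in code and check the result against the recommended 500-character limit. The helper below is a minimal sketch; `build_prompt` and its parameter names are hypothetical, and the example wording is only illustrative.

```python
MAX_PROMPT_CHARS = 500  # this page recommends prompts under 500 characters


def build_prompt(scene: str, camera: str, audio: str) -> str:
    """Join [Scene & Character] + [Camera Movement] + [Audio/Dialogue Instruction]."""
    # Trim whitespace and a trailing period so the parts join cleanly.
    parts = [part.strip().rstrip(".") for part in (scene, camera, audio)]
    if not all(parts):
        raise ValueError("scene, camera, and audio must all be non-empty")
    prompt = ". ".join(parts) + "."
    if len(prompt) >= MAX_PROMPT_CHARS:
        raise ValueError(
            f"prompt is {len(prompt)} characters; keep it under {MAX_PROMPT_CHARS}"
        )
    return prompt


if __name__ == "__main__":
    print(
        build_prompt(
            scene="An elderly fisherman mends a net on a foggy pier at dawn",
            camera="Slow dolly-in from a wide shot to a close-up of his hands",
            audio="Distant thunder and gentle waves; he mutters in Sichuan dialect",
        )
    )
```

Keeping the three parts as separate arguments makes it easy to vary only the camera or audio instruction between generations while the scene stays fixed.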


Explore Related Capabilities


If you already have a character design or product photo you want to bring to life, switch to the Image-to-Video tool.

Seedance 1.5 Pro – Image-to-Video

Frequently Asked Questions

What are the current technical limitations of Seedance 1.5 Pro text-to-video in terms of resolution and duration?

Seedance 1.5 Pro text-to-video currently supports output resolutions up to 720p for clips of 4–12 seconds. As of the latest tests, 4K generation is not available, and the model is optimized primarily for 480p to 720p output.
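
A pipeline can check these limits before submitting a job. The sketch below assumes the two resolutions listed in the pricing (480p and 720p) and the 4–12 second range stated above; `check_request` is a hypothetical helper, not part of the RunComfy API.

```python
SUPPORTED_RESOLUTIONS = ("480p", "720p")  # resolutions listed in the pricing
MIN_SECONDS, MAX_SECONDS = 4, 12          # clip length range stated in this FAQ


def check_request(resolution: str, seconds: float) -> list[str]:
    """Return a list of problems; an empty list means the request is within limits."""
    problems = []
    if resolution not in SUPPORTED_RESOLUTIONS:
        problems.append(
            f"unsupported resolution {resolution!r}; "
            f"choose one of {', '.join(SUPPORTED_RESOLUTIONS)}"
        )
    if not MIN_SECONDS <= seconds <= MAX_SECONDS:
        problems.append(
            f"duration {seconds}s is outside the {MIN_SECONDS}-{MAX_SECONDS}s range"
        )
    return problems


if __name__ == "__main__":
    print(check_request("720p", 8))
    print(check_request("4K", 20))
```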

How do I move from testing Seedance 1.5 Pro text-to-video in RunComfy to production API usage?

You can move from trial to production by first confirming that your Seedance 1.5 Pro text-to-video tests in RunComfy meet your expectations, then obtaining an API key through your RunComfy account. The web interface and the API use the same model version, so you can deploy by replacing manual web generation with automated API requests in your production environment.

What improvements does Seedance 1.5 Pro text-to-video offer compared to Seedance 1.0 Pro?

Seedance 1.5 Pro text-to-video is expected to deliver better motion continuity, faster rendering, and enhanced temporal stability. Its multi-shot narrative consistency and improved understanding of camera directions significantly reduce visual artifacts compared to the Seedance 1.0 Pro model.

What kinds of artistic styles or scenarios does Seedance 1.5 Pro perform best in?

Seedance 1.5 Pro performs particularly well in cinematic, photorealistic, anime, and stylized visual scenarios that require stable subject consistency, controlled camera movement, and clear narrative structure. It is well suited for storytelling-driven videos, character-focused scenes, and short-form creative content where visual continuity, motion coherence, and synchronized audio-visual elements are important.

How does Seedance 1.5 Pro text-to-video compare to competitors like Wan 2.5 or Kling Video 2.6 in motion and realism?

Seedance 1.5 Pro text-to-video generally offers superior audio-visual accuracy when compared to Wan 2.5 and Kling Video 2.6. While Kling may excel in stylized effects, Seedance 1.5 Pro emphasizes built-in audio generation and prompt fidelity.

Can I use the outputs from Seedance 1.5 Pro text-to-video commercially?

Commercial use of Seedance 1.5 Pro text-to-video content is typically allowed on paid RunComfy or affiliated platform plans. However, developers and businesses should verify the specific license terms from ByteDance or the hosting provider to ensure compliance before commercial deployment.

Does Seedance 1.5 Pro text-to-video support audio or lip-sync features?

Yes. Seedance 1.5 Pro supports built-in audio generation as part of its text-to-video pipeline, including dialogue, ambient sound effects, and background music. Audio and video are generated in a synchronized manner, enabling natural audio-visual alignment and accurate lip movement when characters are speaking. These capabilities are native to the model and do not rely on third-party post-processing.

RunComfy
Copyright 2025 RunComfy. All Rights Reserved.

RunComfy is the premier ComfyUI platform, offering ComfyUI online environment and services, along with ComfyUI workflows featuring stunning visuals. RunComfy also provides AI Models, enabling artists to harness the latest AI tools to create incredible art.