Nvidia Cosmos | Text & Image to Video Creation

Use Nvidia's newly released Cosmos models (7B and 14B) for video generation in ComfyUI. This workflow covers both text-to-video generation and image interpolation. For text-to-video, it creates fluid 121-frame videos from detailed text descriptions. For image-to-video, you can set a start_image and an end_image to generate a smooth transition between them. Thanks to the ultra-efficient Cosmos VAE, the workflow can process 1280x704 videos on 12GB GPUs, making it 50x more memory-efficient than alternatives. It suits both realistic and stylized animation, and 121-frame sequences always contain motion.

ComfyUI Nvidia Cosmos Text & Image to Video Workflow

What is the Nvidia Cosmos Workflow?

This workflow runs the newly released Nvidia Cosmos models in ComfyUI and exposes two generation modes: text-to-video and image-to-video. Using the Nvidia Cosmos 7B and 14B models, you can create high-quality videos from either textual descriptions or still images. Much of the quality and speed comes from the efficient video processing in the Cosmos VAE.


Key Features of Nvidia Cosmos

  • Dual Generation Modes: Nvidia Cosmos offers both text-to-video and image-to-video generation
  • Guaranteed Motion: Always generates videos with movement when using 121 frames
  • Effective Negative Prompts: Non-distilled model ensures better control through negative prompts
  • Flexible Image Control: Generate from a start image, from an end image, or create transitions between the two
  • Ultra-Efficient VAE: Nvidia Cosmos employs a refined VAE system for smooth, high-quality video generation
  • High Resolution Support: Create videos at resolutions of 704x704 and above
  • Precise Frame Control: Optimized for 121-frame sequences
  • Smart Image Interpolation: Generate smooth transitions between reference images

How to Use the Nvidia Cosmos Workflow

The Nvidia Cosmos workflow contains two main parts: text-to-video and image-to-video generation. By default, the image-to-video group is bypassed. To switch between the two modes:

  • For text-to-video: Keep the image-to-video group bypassed (default setting)
  • For image-to-video: Right-click the image-to-video group and select "Set Group Nodes to Always"
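
The same switch can be sketched on an exported workflow file. This is a minimal illustration, not part of the workflow itself: it assumes the workflow JSON stores a per-node `mode` field where 0 means "always" and 4 means "bypass" (check this against your own exported file), and the node IDs and the tiny workflow below are hypothetical.

```python
# Assumption: mode values as stored in an exported ComfyUI workflow JSON.
MODE_ALWAYS = 0
MODE_BYPASS = 4


def set_mode(workflow, node_ids, mode):
    """Set the execution mode on the listed nodes; return how many nodes matched."""
    wanted = set(node_ids)
    matched = 0
    for node in workflow.get("nodes", []):
        if node.get("id") in wanted:
            node["mode"] = mode
            matched += 1
    return matched


# Hypothetical miniature workflow: one sampler plus two bypassed image loaders.
workflow = {
    "nodes": [
        {"id": 1, "type": "KSampler", "mode": MODE_ALWAYS},
        {"id": 2, "type": "LoadImage", "mode": MODE_BYPASS},
        {"id": 3, "type": "LoadImage", "mode": MODE_BYPASS},
    ]
}

# Hypothetical IDs of the nodes inside the image-to-video group.
image_to_video_ids = [2, 3]

# Equivalent of "Set Group Nodes to Always" for those nodes.
enabled = set_mode(workflow, image_to_video_ids, MODE_ALWAYS)
print(f"Enabled {enabled} image-to-video nodes")
```

Calling `set_mode(workflow, image_to_video_ids, MODE_BYPASS)` would return the group to its default text-to-video state.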

1. Text to Video Generation with Nvidia Cosmos

Setup and Requirements

  • Choose your preferred Nvidia Cosmos model size (7B is recommended as a starting point)
  • Set resolution (Default 1280x704; minimum 704x704)
  • Frame settings:
    • Length: 121 frames (The model performs optimally with a length of 121; deviating too much from this can result in subpar video quality.)
    • Frame rate: 24.00 (default rate for optimal quality)
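
As a quick sanity check on these settings, the sketch below computes the clip duration and flags values outside the ranges listed above. The function name and the warning messages are illustrative assumptions; only the numbers (704x704 minimum, 1280x704 default, 121 frames, 24 fps) come from this guide.

```python
MIN_WIDTH, MIN_HEIGHT = 704, 704   # minimum resolution from this guide
RECOMMENDED_LENGTH = 121           # frame count the model is tuned for


def check_video_settings(width=1280, height=704, length=121, fps=24.0):
    """Return (duration_in_seconds, warnings) for the given settings."""
    warnings = []
    if width < MIN_WIDTH or height < MIN_HEIGHT:
        warnings.append(f"{width}x{height} is below the 704x704 minimum")
    if length != RECOMMENDED_LENGTH:
        warnings.append(f"length {length} differs from the recommended 121 frames")
    duration = length / fps
    return duration, warnings


duration, warnings = check_video_settings()
print(f"{duration:.2f} s, warnings: {warnings}")
```

With the defaults, 121 frames at 24 fps gives a clip of roughly five seconds.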

Sampling Parameters for Nvidia Cosmos

  • Sampler: res_multistep (Nvidia's recommended sampler for Cosmos)
  • Scheduler: karras (default for stability)
  • Steps: 20 (higher = better quality but slower; lower = faster but less detailed)
  • CFG: 6.5 (prompt guidance strength)
  • Denoise: 1.00 (1.00 = complete transformation; lower values keep more original content)
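
For reference, the values above can be collected in one place. The sketch below assumes the workflow samples with ComfyUI's built-in KSampler node, whose input names (`sampler_name`, `scheduler`, `steps`, `cfg`, `denoise`) are used as keys; the `with_overrides` helper is hypothetical and only guards against misspelled keys when experimenting.

```python
# Default sampling settings from this guide, keyed by KSampler input names
# (assumption: the workflow uses the standard KSampler node).
DEFAULT_SAMPLING = {
    "sampler_name": "res_multistep",
    "scheduler": "karras",
    "steps": 20,
    "cfg": 6.5,
    "denoise": 1.0,
}


def with_overrides(base, **overrides):
    """Return a copy of base with overrides applied; reject unknown keys."""
    unknown = set(overrides) - set(base)
    if unknown:
        raise KeyError(f"unknown sampling keys: {sorted(unknown)}")
    return {**base, **overrides}


# Example: trade speed for detail by raising the step count.
slower_but_finer = with_overrides(DEFAULT_SAMPLING, steps=30)
print(slower_but_finer)
```

Keeping the defaults untouched and deriving variants from them makes it easy to return to the known-good baseline.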

Prompting Tips for Nvidia Cosmos

  • Use detailed, multi-sentence prompts for better results
  • Include comprehensive negative prompts
  • Short prompts may generate coherent videos but might not strictly follow instructions
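
To make "detailed, multi-sentence" concrete, here is an illustrative prompt pair with a rough sentence counter. The prompt text and the `sentence_count` helper are invented for this example and are not taken from the workflow.

```python
import re

# Illustrative prompts (assumption: written for this example only).
positive_prompt = (
    "A red vintage car drives along a coastal road at sunset. "
    "The camera follows from behind at a steady distance. "
    "Waves break against the cliffs below, and warm light reflects off the car's paint."
)
negative_prompt = (
    "blurry, low resolution, static image, distorted motion, "
    "flickering, watermark, text overlay"
)


def sentence_count(text):
    """Roughly count sentences by splitting on '.', '!' or '?' followed by whitespace or end of text."""
    parts = re.split(r"[.!?]+(?:\s+|$)", text.strip())
    return len([p for p in parts if p])


if sentence_count(positive_prompt) < 2:
    print("Consider adding more detail to the positive prompt")
else:
    print(f"Positive prompt has {sentence_count(positive_prompt)} sentences")
```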

2. Image to Video Generation with Nvidia Cosmos

Setup and Requirements

  • Same base requirements as Nvidia Cosmos text-to-video
  • Supports start_image and end_image inputs

Reference Image Options

  • Set a start_image or end_image, or both at the same time
  • Images work best when similar in style and content (for smooth transitions)

Key Parameters

  • Uses the same sampling settings as text-to-video mode
  • Maintains the same video quality standards

Advanced Tips for Nvidia Cosmos

  • For higher quality results with more VRAM, try the Nvidia Cosmos 14B model
  • Ensure prompts are descriptive and detailed for best results
  • Experiment with different image pairs for unique transitions

More Information about Nvidia Cosmos

For more details and updates about Nvidia Cosmos, see Nvidia's official Cosmos resources.

Copyright 2025 RunComfy. All Rights Reserved.
