ComfyUI Node: CosyVoiceDubbingNode

Class Name: CosyVoiceDubbingNode
Category: AIFSH_CosyVoice
Author: AIFSH (Account age: 260 days)
Extension: CosyVoice-ComfyUI
Last Updated: 7/23/2024
GitHub Stars: 0.1K

How to Install CosyVoice-ComfyUI

Install this extension through the ComfyUI Manager:
  1. Click the Manager button in the main menu.
  2. Click the Custom Nodes Manager button.
  3. Enter CosyVoice-ComfyUI in the search bar and install the extension from the results.
After installation, click the Restart button to restart ComfyUI, then manually refresh your browser to clear the cache and load the updated list of nodes.

CosyVoiceDubbingNode Description

Generates dubbed speech with text-to-speech (TTS) technology. Supports multiple languages and inference modes.

CosyVoiceDubbingNode:

The CosyVoiceDubbingNode performs voice dubbing with text-to-speech (TTS) technology. It takes timed subtitle text and an audio prompt as input and generates speech in the selected language. It supports multiple inference modes, including zero-shot, cross-lingual, and instruct-based inference, which makes it usable across different dubbing scenarios. The node is aimed at AI artists creating multilingual voiceovers: the output language can be changed per run, while the voice characteristics are taken from the audio prompt.

CosyVoiceDubbingNode Input Parameters:

tts_srt

This parameter accepts an SRT file, a standard subtitle format that contains the text to speak and the timing for each line. The SRT file tells the node what text to convert to speech and when, so that the generated audio can be aligned with the intended timing. This alignment is what keeps the dubbed voice synchronized with the visual content.
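
As a rough illustration of the text and timing information an SRT file carries, the sketch below parses SRT content into cues with start and end times in seconds. It is a minimal standalone parser, not the node's own code; `Cue` and `parse_srt` are hypothetical names introduced here, and styling or position tags are not handled.

```python
import re
from dataclasses import dataclass

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Cue:
    index: int
    start: float  # seconds
    end: float    # seconds
    text: str


def _seconds(h, m, s, ms):
    """Convert the four captured timestamp fields to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(content: str) -> list[Cue]:
    """Parse SRT text into cues; raise ValueError on a malformed cue."""
    cleaned = content.lstrip("\ufeff").replace("\r\n", "\n").strip()
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", cleaned):
        lines = block.split("\n")
        if len(lines) < 2:
            raise ValueError(f"incomplete cue: {block!r}")
        match = TIMING.fullmatch(lines[1].strip())
        if not lines[0].strip().isdigit() or match is None:
            raise ValueError(f"malformed cue: {block!r}")
        fields = match.groups()
        cues.append(
            Cue(
                index=int(lines[0]),
                start=_seconds(*fields[:4]),
                end=_seconds(*fields[4:]),
                text="\n".join(lines[2:]).strip(),
            )
        )
    return cues
```

Running a subtitle file through a check like this before loading it into the node is one way to catch numbering or timestamp mistakes early.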

prompt_wav

This parameter takes an audio file in WAV format that serves as the voice prompt for the TTS model. The prompt conveys the desired voice characteristics, such as tone, pitch, and speaking style. Providing a sample of the target voice lets the model produce more accurate and personalized dubbing results.
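
Before using a file as the prompt, it can help to confirm that it really is a WAV file and to look at its basic properties. The sketch below uses Python's standard `wave` module; `describe_wav` is a hypothetical helper, and the node's own loading code may behave differently. The `wave` module reads PCM-encoded WAV files, so it may reject other encodings stored in a .wav container.

```python
import wave


def describe_wav(path: str) -> dict:
    """Return basic properties of a PCM WAV file.

    wave.open raises wave.Error if the file is not a readable WAV file.
    """
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": frames / rate,
        }
```

A very short or very long duration reported here is a hint that the prompt may not be a good voice sample, although the source does not state any length limits.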

language

This parameter selects the language of the generated speech. The available options are <|zh|> (Chinese), <|en|> (English), <|jp|> (Japanese), <|yue|> (Cantonese), and <|ko|> (Korean). Choosing the language that matches your subtitle text ensures that the TTS model applies the appropriate phonetic and linguistic rules, resulting in more natural and intelligible speech.
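
If you set this value from a script rather than from the dropdown, a small lookup can guard against typos in the tag. This is only a sketch: the tags are copied from the option list above, while `LANGUAGE_TAGS` and `check_language` are hypothetical names and the language labels are included for readability only.

```python
# Tags as listed in the node's language options; the labels are descriptive only.
LANGUAGE_TAGS = {
    "<|zh|>": "Chinese",
    "<|en|>": "English",
    "<|jp|>": "Japanese",
    "<|yue|>": "Cantonese",
    "<|ko|>": "Korean",
}


def check_language(tag: str) -> str:
    """Return the tag unchanged if it is a listed option, else raise ValueError."""
    if tag not in LANGUAGE_TAGS:
        options = ", ".join(LANGUAGE_TAGS)
        raise ValueError(f"unsupported language {tag!r}; choose one of: {options}")
    return tag
```

Note that the full tag, including the `<|` and `|>` delimiters, is the option value; a bare code such as `en` would be rejected by this check.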

CosyVoiceDubbingNode Output Parameters:

audio

The output is a dictionary containing the generated audio waveform and its sample rate. The waveform is a tensor representing the audio signal, and the sample rate is the number of samples per second. The output can be passed to other audio nodes in the workflow, for example to preview or save the result, or processed further for other applications.
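
To make the relationship between waveform and sample rate concrete, the sketch below derives the duration from such a dictionary. The key names `waveform` and `sample_rate` and the (batch, channels, samples) tensor layout are assumptions not stated in the text above, and `summarize_audio` is a hypothetical helper. A stand-in object with a `shape` attribute replaces the tensor so the example runs without PyTorch.

```python
from types import SimpleNamespace


def summarize_audio(audio: dict) -> dict:
    """Summarize an audio dict of the assumed form {"waveform", "sample_rate"}."""
    # Assumed layout: (batch, channels, samples); only the last two axes are used.
    shape = tuple(audio["waveform"].shape)
    sample_rate = int(audio["sample_rate"])
    return {
        "channels": shape[-2] if len(shape) >= 2 else 1,
        "samples": shape[-1],
        # Duration is the number of samples divided by samples per second.
        "duration_s": shape[-1] / sample_rate,
    }


# Stand-in for a tensor: any object with a .shape attribute works here.
fake_audio = {"waveform": SimpleNamespace(shape=(1, 1, 44100)), "sample_rate": 22050}
print(summarize_audio(fake_audio))
```

The same arithmetic explains why the sample rate must travel with the waveform: without it, the number of samples alone says nothing about how long the audio plays.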

CosyVoiceDubbingNode Usage Tips:

  • Ensure that your SRT file is accurately timed and contains the correct text to achieve precise synchronization between the audio and visual content.
  • Use a high-quality WAV file as the prompt to provide the TTS model with a clear example of the desired voice characteristics.
  • Experiment with different languages and inference modes to find the best settings for your specific dubbing project.
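
The first tip can be partly automated. The sketch below checks a list of subtitle timings for two common problems: cues that end before they start, and cues that begin before the previous one has finished. It is a generic check under the assumption that cues are given as (start, end) pairs in seconds; `find_timing_problems` is a hypothetical helper, and how the node itself handles overlapping cues is not stated here.

```python
def find_timing_problems(cues):
    """Return human-readable problems for (start_s, end_s) cues in file order."""
    problems = []
    for i, (start, end) in enumerate(cues):
        # A cue must last for a positive amount of time.
        if end <= start:
            problems.append(f"cue {i + 1}: end {end} is not after start {start}")
        # A cue should not begin while the previous cue is still active.
        if i > 0 and start < cues[i - 1][1]:
            problems.append(f"cue {i + 1}: starts before cue {i} ends")
    return problems


# Example: the second cue overlaps the first, and the third has zero length.
for problem in find_timing_problems([(0.0, 2.0), (1.5, 3.0), (4.0, 4.0)]):
    print(problem)
```

An empty result does not guarantee good synchronization, since the spoken text still has to fit within each cue's time window.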

CosyVoiceDubbingNode Common Errors and Solutions:

"Invalid SRT file format"

  • Explanation: The provided SRT file does not conform to the standard subtitle format.
  • Solution: Verify that your SRT file is correctly formatted with proper timing and text entries.

"Unsupported audio format"

  • Explanation: The provided audio file is not in WAV format.
  • Solution: Convert your audio file to WAV format before using it as a prompt.

"Language not supported"

  • Explanation: The selected language is not among the supported options.
  • Solution: Choose one of the supported languages: <|zh|>, <|en|>, <|jp|>, <|yue|>, or <|ko|>.

"Inference mode not recognized"

  • Explanation: The specified inference mode is not valid.
  • Solution: Ensure that you are using one of the supported inference modes: zero-shot, cross-lingual, or instruct-based inference.

© Copyright 2024 RunComfy. All Rights Reserved.
