ComfyUI Node: NTCosyVoiceInstruct2Sampler

Class Name

NTCosyVoiceInstruct2Sampler

Category
Nineton Nodes
Author
muxueChen (Account age: 3218 days)
Extension
CosyVoice2 for ComfyUI
Last Updated
2025-02-11
Github Stars
0.1K

How to Install CosyVoice2 for ComfyUI

Install this extension via the ComfyUI Manager:
  1. Click the Manager button in the main menu.
  2. Click the Custom Nodes Manager button.
  3. Enter CosyVoice2 for ComfyUI in the search bar and install the matching result.
After installation, click the Restart button to restart ComfyUI, then manually refresh your browser to clear the cache and load the updated node list.

NTCosyVoiceInstruct2Sampler Description

Converts text into speech, using a separate instruction prompt to steer how the text is spoken.

NTCosyVoiceInstruct2Sampler:

NTCosyVoiceInstruct2Sampler converts text into speech using text-to-speech (TTS) synthesis. In addition to the text to be spoken, it accepts an instruction prompt that steers the style and tone of the result, an input audio clip, and a speed setting. Combining the text with an instruction gives more control over the output than plain TTS. The node belongs to the Nineton Nodes category of the CosyVoice2 for ComfyUI extension.

NTCosyVoiceInstruct2Sampler Input Parameters:

audio

The audio parameter is a required input that supplies an audio waveform together with its sample rate. The node uses it as the base audio for the generated speech, so the output follows the characteristics of the clip you provide. The waveform must be a tensor, and the sample rate must be an integer giving the number of samples per second.
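As a rough illustration of the shape described above, the following sketch checks an audio value before it is passed on. It assumes the common ComfyUI convention of a dict with a `waveform` tensor shaped `[batch, channels, samples]` and an integer `sample_rate`; the helper name `check_audio` is hypothetical, and a small stand-in object replaces a real tensor so the example runs without PyTorch.

```python
from types import SimpleNamespace


def check_audio(audio):
    """Return (channels, samples, sample_rate) or raise ValueError.

    Assumption: audio is a dict with a 3-D "waveform" and an int "sample_rate".
    """
    if not isinstance(audio, dict):
        raise ValueError("audio must be a dict")
    waveform = audio.get("waveform")
    sample_rate = audio.get("sample_rate")
    # Anything exposing a .shape attribute works here (for example a tensor).
    shape = tuple(getattr(waveform, "shape", ()))
    if len(shape) != 3:
        raise ValueError(f"waveform must be [batch, channels, samples], got shape {shape}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(sample_rate, bool) or not isinstance(sample_rate, int) or sample_rate <= 0:
        raise ValueError("sample_rate must be a positive integer")
    return shape[1], shape[2], sample_rate


# Stand-in for a real tensor: only the .shape attribute is needed for the check.
fake_audio = {"waveform": SimpleNamespace(shape=(1, 1, 48000)), "sample_rate": 16000}
channels, samples, rate = check_audio(fake_audio)
```

The sample rate and clip length used here are arbitrary illustration values, not requirements of the node.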

speed

The speed parameter controls the tempo of the generated speech. It is a float with a default of 1.0 and a range of 0.5 to 1.5 in steps of 0.1. Values below 1.0 slow the speech down and values above 1.0 speed it up, which lets you fit the output to a timing requirement or an artistic preference.
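When a speed value is computed in a script rather than set in the UI, it can help to bring it into the documented range first. The sketch below clamps a value to 0.5 through 1.5 and snaps it to the nearest 0.1 step; `snap_speed` is a hypothetical helper, not part of the extension.

```python
def snap_speed(value, lo=0.5, hi=1.5, step=0.1):
    """Clamp value to [lo, hi] and snap it to the nearest multiple of step above lo."""
    clamped = min(max(float(value), lo), hi)
    steps = round((clamped - lo) / step)
    # Round to one decimal to hide floating-point noise such as 1.3000000000000003.
    return round(lo + steps * step, 1)


print(snap_speed(1.26))  # 1.3
print(snap_speed(2.0))   # 1.5
print(snap_speed(0.2))   # 0.5
```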

text

The text parameter is a multiline string that contains the text to be converted into speech. It defines the content of the output: whatever you enter here is what the node speaks. Because the field is multiline, it can hold longer passages as well as single sentences.

instruct

The instruct parameter is a second multiline string that provides instructions or context for the synthesis. Use it to influence the style, tone, or other characteristics of the generated speech, for example by asking for a particular emotion or speaking style. More specific instructions generally give more predictable results.
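The four inputs above could be declared roughly as follows using ComfyUI's usual `INPUT_TYPES` convention. This is a sketch reconstructed from the parameter descriptions, not the extension's actual source: the class name `InstructSamplerSketch` is hypothetical, and treating all four inputs as required is an assumption (only audio is described as required).

```python
class InstructSamplerSketch:
    """Hypothetical stand-in that mirrors the documented inputs and output."""

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                # Base audio: waveform plus sample rate.
                "audio": ("AUDIO",),
                # Tempo of the generated speech, documented range and step.
                "speed": ("FLOAT", {"default": 1.0, "min": 0.5, "max": 1.5, "step": 0.1}),
                # Text to be spoken.
                "text": ("STRING", {"multiline": True}),
                # Style or tone instructions.
                "instruct": ("STRING", {"multiline": True}),
            }
        }

    RETURN_TYPES = ("AUDIO",)
    RETURN_NAMES = ("tts_speech",)
    CATEGORY = "Nineton Nodes"


inputs = InstructSamplerSketch.INPUT_TYPES()["required"]
print(list(inputs))  # ['audio', 'speed', 'text', 'instruct']
```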

NTCosyVoiceInstruct2Sampler Output Parameters:

tts_speech

The tts_speech output is the audio generated by the node, returned as a dictionary containing the waveform and its sample rate. It holds the speech synthesized from the text and instructions at the specified speed. The waveform is a tensor that downstream audio nodes can process further, and the sample rate tells them how to play it back or resample it correctly.
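Because the output carries both the waveform and the sample rate, the clip length follows directly from the two. A minimal sketch, assuming the waveform's last dimension counts samples; `duration_seconds` is a hypothetical helper, and the shape and sample rate below are illustration values with a stand-in object in place of a real tensor.

```python
from types import SimpleNamespace


def duration_seconds(audio):
    """Clip length in seconds: number of samples divided by samples per second."""
    samples = audio["waveform"].shape[-1]
    return samples / audio["sample_rate"]


# Stand-in for the node's output dictionary.
tts_speech = {"waveform": SimpleNamespace(shape=(1, 1, 48000)), "sample_rate": 24000}
print(duration_seconds(tts_speech))  # 2.0
```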

NTCosyVoiceInstruct2Sampler Usage Tips:

  • To achieve the best results, ensure that the audio input is of high quality and matches the desired sample rate for your project. This will help maintain the clarity and fidelity of the generated speech.
  • Experiment with the speed parameter to find the optimal tempo for your speech output. This can significantly impact the delivery and perception of the synthesized audio, especially in artistic or narrative contexts.

NTCosyVoiceInstruct2Sampler Common Errors and Solutions:

"Invalid audio input format"

  • Explanation: This error occurs when the audio input does not conform to the expected format, such as an incorrect waveform tensor or sample rate.
  • Solution: Ensure that the audio input is a valid tensor with the correct dimensions and that the sample rate is an integer representing the number of samples per second.

"Text input is empty"

  • Explanation: This error is triggered when the text parameter is left empty, which prevents the node from generating any speech output.
  • Solution: Provide a valid string in the text parameter to enable the node to process and generate the desired speech.

"Instruct input is empty"

  • Explanation: This error occurs when the instruct parameter is not provided, which may lead to less customized speech output.
  • Solution: Include relevant instructions in the instruct parameter to enhance the customization and context of the generated speech.
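The two text-related errors above can be caught before a workflow is queued. The sketch below is a hypothetical pre-flight check that reuses the error messages listed in this section; it is not the node's own validation logic, and treating whitespace-only input as empty is an assumption.

```python
def preflight(text, instruct):
    """Return a list of problems found in the two string inputs (empty if none)."""
    problems = []
    # Assumption: None, "", and whitespace-only strings all count as empty.
    if not text or not text.strip():
        problems.append("Text input is empty")
    if not instruct or not instruct.strip():
        problems.append("Instruct input is empty")
    return problems


print(preflight("Hello there.", ""))      # ['Instruct input is empty']
print(preflight("Hello there.", "Calm"))  # []
```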
