
ComfyUI Node: SparkTTS Voice Creator

Class Name: SparkTTS_VoiceCreator
Category: 🧪AILab/🔊Audio
Author: 1038lab (Account age: 774 days)
Extension: Comfyui-Spark-TTS
Last Updated: 2025-04-15
GitHub Stars: 0.09K

How to Install Comfyui-Spark-TTS

Install this extension through the ComfyUI Manager:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter Comfyui-Spark-TTS in the search bar and install the extension.
After installation, click the Restart button to restart ComfyUI, then manually refresh your browser to clear the cache and load the updated list of nodes.

SparkTTS Voice Creator Description

Creates synthetic voices with SparkTTS, producing customizable, natural-sounding speech inside ComfyUI.

SparkTTS Voice Creator:

The SparkTTS_VoiceCreator node creates synthetic voices with the SparkTTS text-to-speech system. It is part of the ComfyUI-SparkTTS integration, which generates speech from text inputs. The node is aimed at AI artists and developers who want custom voice synthesis in their projects: it turns the supplied text into speech in a voice that can be tailored to specific needs. The underlying SparkTTS model supports multiple languages and offers fine control over speech characteristics, which allows a high degree of customization and realism in the output.

SparkTTS Voice Creator Input Parameters:

text

The text parameter is a string input containing the text you want to convert into speech. It accepts multiline input, so you can enter longer passages, and its default value is a sample text that demonstrates the node's capabilities. Use double line breaks to separate paragraphs, which helps structure the speech output. This parameter determines the content of the generated speech.
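
As a rough illustration of the paragraph convention described above, the sketch below splits a multiline string on blank lines. The helper name split_paragraphs is hypothetical, and whether the node splits text in exactly this way is an assumption; the sketch only shows what "double line breaks separate paragraphs" means for a typical input.

```python
import re


def split_paragraphs(text: str) -> list[str]:
    """Split text into paragraphs on blank lines (hypothetical helper,
    not taken from the extension's source)."""
    # A paragraph break is a newline, optional whitespace, then another newline.
    parts = re.split(r"\n\s*\n", text.strip())
    # Trim each piece and drop any that are empty.
    return [p.strip() for p in parts if p.strip()]


sample = "First paragraph of the script.\n\nSecond paragraph,\nstill part of it."
for index, paragraph in enumerate(split_paragraphs(sample), start=1):
    print(index, repr(paragraph))
```

A single line break stays inside its paragraph here; only a blank line starts a new one.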

reference_audio

The reference_audio parameter is an audio input that serves as the sample for cloning a voice. The node analyzes this audio to capture voice traits such as tone and accent, so the synthesized voice closely resembles the speaker in the sample.

reference_text

The reference_text parameter is a string input that should contain the exact text spoken in the reference audio. It improves the quality of voice cloning by helping the model learn the speaker's pronunciation patterns. Accurate reference text helps the synthesized voice match the original speaker's style and intonation.

max_tokens

The max_tokens parameter is an integer input that caps the length of the generated speech. It defaults to 3000, with a minimum of 500 and a maximum of 5000. Use it to manage memory: a lower value reduces the risk of out-of-memory errors on longer texts, while a higher value allows longer speech when enough computational resources are available.
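
The inputs above could be declared in ComfyUI's usual INPUT_TYPES style roughly as follows. This is a sketch based only on the parameter descriptions on this page, not the extension's actual source: the class name VoiceCreatorSketch, the placeholder default text, the choice to list reference_text as optional, and the clamp_max_tokens helper are all assumptions.

```python
class VoiceCreatorSketch:
    """Hypothetical stand-in for the node, built from the documented inputs."""

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                # Multiline string; the real default sample text is not shown here.
                "text": ("STRING", {"multiline": True, "default": "Sample text."}),
                # Audio sample used for voice cloning.
                "reference_audio": ("AUDIO",),
                # Documented limits: default 3000, minimum 500, maximum 5000.
                "max_tokens": ("INT", {"default": 3000, "min": 500, "max": 5000}),
            },
            "optional": {
                # Assumed optional; should match the words spoken in the audio.
                "reference_text": ("STRING", {"multiline": True, "default": ""}),
            },
        }


def clamp_max_tokens(value: int) -> int:
    """Keep a requested max_tokens value inside the documented range."""
    limits = VoiceCreatorSketch.INPUT_TYPES()["required"]["max_tokens"][1]
    return max(limits["min"], min(limits["max"], value))


print(clamp_max_tokens(8000))
```

When building workflows programmatically, clamping a value this way keeps it within the range the node accepts.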

SparkTTS Voice Creator Output Parameters:

wav

The wav output provides the generated audio as a waveform. It is the result of the text-to-speech synthesis, in which the input text is rendered as natural-sounding speech. You can use the waveform for voiceovers, virtual assistants, or any other project that needs synthetic speech. Its quality and characteristics depend on the input parameters and the reference audio provided.

SparkTTS Voice Creator Usage Tips:

  • Ensure that the reference_audio is clear and of high quality to achieve the best voice cloning results.
  • Use accurate reference_text to improve the model's understanding of pronunciation patterns, leading to more natural-sounding speech.
  • Adjust the max_tokens parameter based on the length of your text and available memory resources to prevent out-of-memory errors.

SparkTTS Voice Creator Common Errors and Solutions:

"Failed to import from sparktts. Please make sure the sparktts folder exists."

  • Explanation: This error occurs when the necessary SparkTTS modules are not found in the expected directory.
  • Solution: Ensure that the SparkTTS library is correctly installed and that the sparktts folder is present in the specified path.

"huggingface_hub not available, automatic model download disabled"

  • Explanation: This message indicates that the Hugging Face Hub library is not installed, preventing automatic model downloads.
  • Solution: Install the huggingface_hub library with a package manager such as pip (for example, pip install huggingface_hub) to enable automatic model downloads.
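
Both errors above can be checked before launching ComfyUI with a small script such as the one below. It is a minimal sketch: the function name check_environment is hypothetical, and the example path (custom_nodes/Comfyui-Spark-TTS, with the sparktts folder directly inside it) is an assumption about where the extension is installed, so adjust it to your setup.

```python
import importlib.util
from pathlib import Path


def check_environment(extension_dir: str) -> list[str]:
    """Return a list of detected problems (hypothetical helper)."""
    problems = []
    # Assumption: the sparktts folder sits directly inside the extension folder.
    if not (Path(extension_dir) / "sparktts").is_dir():
        problems.append(f"sparktts folder not found in {extension_dir}")
    # find_spec returns None when a top-level package cannot be located.
    if importlib.util.find_spec("huggingface_hub") is None:
        problems.append(
            "huggingface_hub is not installed; try: pip install huggingface_hub"
        )
    return problems


# Example path only; replace it with your own ComfyUI installation path.
for problem in check_environment("custom_nodes/Comfyui-Spark-TTS"):
    print("Problem:", problem)
```

If the script prints nothing, neither of the two conditions was detected.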
