Text-to-speech node using the MARS5-TTS model for high-quality speech synthesis, with voice cloning and customization options.
The MARS5TTS_Node is a powerful tool designed to convert text into speech using advanced deep learning models. This node leverages the MARS5-TTS model, which is pre-trained to generate high-quality, natural-sounding speech. The primary goal of this node is to provide a seamless and efficient way to synthesize speech from text, with the added capability of cloning voices from reference audio files. This makes it an invaluable asset for AI artists looking to create personalized and dynamic audio content. The node supports various customization options, allowing you to fine-tune the speech synthesis process to match specific needs, such as adjusting the temperature for more creative outputs or using deep cloning for more accurate voice replication.
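For readers who want to see how the underlying model is typically driven outside of ComfyUI, the following is a minimal sketch based on the Camb-ai/mars5-tts project's published usage; the file path and example text are placeholders, and the node's internal implementation may differ.

```python
import torch
import librosa

# Load MARS5 via torch.hub (per the Camb-ai/mars5-tts project); config_class
# exposes tunable inference settings such as temperature and top_k.
mars5, config_class = torch.hub.load("Camb-ai/mars5-tts", "mars5_english", trust_repo=True)

# Reference voice to clone, resampled to the model's expected sample rate.
ref_wav, _ = librosa.load("reference_voice.wav", sr=mars5.sr, mono=True)
ref_wav = torch.from_numpy(ref_wav)

# Shallow (non-deep) clone: no reference transcript is needed.
cfg = config_class(deep_clone=False, temperature=0.7)
ar_codes, output_audio = mars5.tts(
    "Hello from the MARS5 text-to-speech model.",  # text to synthesize
    ref_wav,                                       # reference voice waveform
    "",                                            # transcript, only needed for deep cloning
    cfg=cfg,
)
```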
This parameter represents the text that you want to convert into speech. The input should be a string containing the text content. The quality and naturalness of the generated speech will depend on the clarity and structure of the input text.
This parameter is the file path to a reference audio file containing the voice you want to clone. The reference voice helps the model mimic the tone, pitch, and style of the provided audio. The file should be in a format supported by the librosa library, such as WAV.
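As a quick illustration of loading a compatible reference file, here is a small sketch using librosa; the path is a placeholder, and the 24 kHz sample rate is an assumption based on the rate MARS5 models operate at.

```python
import librosa
import torch

ref_path = "voices/speaker_sample.wav"  # hypothetical path to the reference recording

# Load as mono and resample to 24 kHz so the clone input matches the model's format.
ref_wav, sr = librosa.load(ref_path, sr=24000, mono=True)
ref_wav = torch.from_numpy(ref_wav)
print(f"Loaded {ref_wav.shape[0] / sr:.1f} s of reference audio")
```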
This boolean parameter determines whether to use deep cloning for voice replication. When set to True, the model requires a reference transcript to accurately clone the voice. This option is useful for achieving high fidelity in voice replication. The default value is False.
This parameter controls the repetition penalty window size. It helps in reducing repetitive patterns in the generated speech. A larger window size can lead to more varied and natural-sounding speech. The value should be an integer.
This parameter sets the number of top tokens to consider during the sampling process. A higher value allows for more diversity in the generated speech, while a lower value makes the output more deterministic. The value should be an integer.
This parameter adjusts the randomness of the speech generation process. A higher temperature results in more creative and varied outputs, while a lower temperature produces more stable and predictable speech. The value should be a float, typically between 0.7 and 1.5.
This parameter applies a penalty to frequent tokens, encouraging the model to use less common words and phrases. This can help in generating more diverse and interesting speech. The value should be a float.
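Taken together, these sampling controls correspond to fields of the MARS5 inference configuration. The sketch below shows one illustrative combination; the values are examples rather than the node's defaults, and the top_k and freq_penalty field names follow the MARS5-TTS project.

```python
import torch

mars5, config_class = torch.hub.load("Camb-ai/mars5-tts", "mars5_english", trust_repo=True)

# Illustrative sampling settings; tune them per the parameter descriptions above.
cfg = config_class(
    rep_penalty_window=100,  # window over which repeated patterns are penalized
    top_k=100,               # sample only from the 100 most likely tokens
    temperature=0.7,         # lower -> more stable, higher -> more varied speech
    freq_penalty=3,          # discourage overly frequent tokens
)
```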
This optional parameter is the transcript of the reference audio file. It is required if if_deep_clone is set to True. The transcript helps the model better understand and replicate the reference voice.
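A hedged sketch of the deep-clone path is shown below: the reference transcript is passed alongside the reference waveform, and the file path and texts are placeholders.

```python
import torch
import librosa

mars5, config_class = torch.hub.load("Camb-ai/mars5-tts", "mars5_english", trust_repo=True)
ref_wav, _ = librosa.load("voices/speaker_sample.wav", sr=mars5.sr, mono=True)
ref_wav = torch.from_numpy(ref_wav)

# Deep cloning requires the transcript of the reference recording.
cfg = config_class(deep_clone=True, temperature=0.7)
ar_codes, output_audio = mars5.tts(
    "This sentence will be spoken in the cloned voice.",
    ref_wav,
    "Exact words spoken in the reference recording.",  # reference transcript
    cfg=cfg,
)
```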
This parameter is the file path to the generated speech audio file. The output is a WAV file containing the synthesized speech based on the input text and reference voice. The file is saved in the specified output directory with a unique timestamp to avoid overwriting.
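The node's exact saving logic is not documented here; the snippet below is a minimal sketch of the same idea, writing the synthesized waveform to a timestamped WAV file. The soundfile dependency, the directory name, and the output_audio/mars5 variables carried over from the earlier sketches are assumptions.

```python
import os
import time
import soundfile as sf

output_dir = "output/mars5_tts"  # assumed output directory
os.makedirs(output_dir, exist_ok=True)

# A timestamp in the filename keeps successive runs from overwriting each other.
out_path = os.path.join(output_dir, f"mars5_tts_{int(time.time())}.wav")
sf.write(out_path, output_audio.cpu().numpy(), mars5.sr)  # output_audio, mars5 from the sketches above
print(f"Saved synthesized speech to {out_path}")
```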
Adjust the temperature parameter to find the right balance between creativity and stability in the generated speech, and use the rep_penalty_window parameter to reduce repetitive patterns and make the speech sound more natural.

Two common issues to watch for: if if_deep_clone is set to True but no reference transcript is provided, deep cloning cannot proceed, so supply the transcript of the reference audio or set if_deep_clone to False; and if the temperature parameter is set to a value outside the acceptable range, set the temperature parameter to a float value between 0.7 and 1.5.