ComfyUI Node: Mega TTS3 Run

Class Name: MegaTTS3Run
Category: 🎤MW/MW-MegaTTS3
Author: mw (Account age: 2258 days)
Extension: MW-ComfyUI_MegaTTS3
Latest Updated: 2025-05-03
GitHub Stars: 0.08K

Table of Contents

Description
MegaTTS3Run:
MegaTTS3Run Input Parameters:
MegaTTS3Run Output Parameters:
MegaTTS3Run Usage Tips:
MegaTTS3Run Common Errors and Solutions:
Related Nodes

How to Install MW-ComfyUI_MegaTTS3

Install this extension through the ComfyUI Manager:

1. Click the Manager button in the main menu.
2. Click the Custom Nodes Manager button.
3. Enter MW-ComfyUI_MegaTTS3 in the search bar and install the extension from the results.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

Mega TTS3 Run Description

A ComfyUI node that generates high-quality speech audio from text using the MegaTTS3 model.

Mega TTS3 Run:

MegaTTS3Run generates speech audio from text input using the MegaTTS3 model inside ComfyUI. It converts written text into natural-sounding speech and supports English and Chinese. The node exposes controls over the synthesis process, including the number of time steps (time_step) and the weights applied to phoneme and tone predictions (p_w and t_w). The resulting audio can be used in applications ranging from virtual assistants to multimedia content creation.

Mega TTS3 Run Input Parameters:

speaker

The speaker parameter selects the voice used to generate the audio. It is a string that names a speaker profile stored in the system, and it determines the vocal characteristics of the generated speech, such as pitch, tone, and accent. It has no numeric range; the value must match a speaker profile that is available in the system.

text

The text parameter is the input string that you want to convert into speech. It is a required parameter and should be a well-formed sentence or phrase in the language specified by the text_language parameter. The quality and clarity of the generated audio depend significantly on the input text's structure and content.

text_language

The text_language parameter defines the language of the input text. It accepts two options: "en" for English and "zh" for Chinese, with "zh" being the default. This parameter ensures that the text is processed correctly according to the linguistic rules of the specified language, affecting pronunciation and intonation.

time_step

The time_step parameter is an integer that controls the granularity of the speech synthesis process. It has a default value of 32 and a minimum value of 1. Higher values can produce more detailed and nuanced speech but take longer to generate; lower values run faster at some cost in smoothness.

p_w

The p_w parameter is a floating-point value that adjusts the weight of phoneme predictions during synthesis (the upstream MegaTTS3 project describes it as the intelligibility weight). It has a default value of 1.6 and a minimum value of 0.1. Raising it puts more emphasis on phonetic accuracy, which can improve the clarity of the speech.

t_w

The t_w parameter is a floating-point value that modifies the weight of tone predictions (the upstream MegaTTS3 project describes it as the similarity weight). It has a default value of 2.5 and a minimum value of 0.1. By adjusting this parameter, you can control the tonal quality of the speech, that is, how closely it follows the reference voice.

unload_model

The unload_model parameter is a boolean that determines whether the model should be unloaded from memory after processing. It defaults to False. Setting this to True can help manage system resources, especially when running multiple instances or when memory usage is a concern.
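
The documented defaults and minimums can be collected in one place and checked before a workflow is queued. The following is a minimal sketch under stated assumptions: MegaTTS3Settings and validate are hypothetical names used for illustration and are not part of the extension; only the default values, the minimums, and the two language options come from the parameter descriptions above.

```python
from dataclasses import dataclass


@dataclass
class MegaTTS3Settings:
    """Hypothetical container mirroring the documented inputs (not extension code)."""

    text_language: str = "zh"   # documented options: "en" or "zh"
    time_step: int = 32         # documented minimum: 1
    p_w: float = 1.6            # documented minimum: 0.1
    t_w: float = 2.5            # documented minimum: 0.1
    unload_model: bool = False

    def validate(self) -> None:
        """Raise ValueError if a value falls outside the documented range."""
        if self.text_language not in ("en", "zh"):
            raise ValueError(
                f"text_language must be 'en' or 'zh', got {self.text_language!r}"
            )
        if self.time_step < 1:
            raise ValueError("time_step must be at least 1")
        if self.p_w < 0.1:
            raise ValueError("p_w must be at least 0.1")
        if self.t_w < 0.1:
            raise ValueError("t_w must be at least 0.1")


settings = MegaTTS3Settings(text_language="en", time_step=16)
settings.validate()  # passes: every value is within the documented range
```

Checking values up front gives a readable error message instead of a failure partway through synthesis.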

Mega TTS3 Run Output Parameters:

audio

The audio output provides the generated speech as waveform data together with its sample rate, so the result can be used directly in other applications. The quality of the audio depends on the input parameters and on the capabilities of the underlying model.
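
Because the output carries both the waveform and the sample rate, the clip length can be derived from it. The sketch below assumes the audio arrives as a dictionary with "waveform" and "sample_rate" keys, and that the last axis of the waveform is the sample count; those key names and the shape layout are assumptions, and audio_duration_seconds is a hypothetical helper. A stand-in object replaces a real tensor so the example runs without extra dependencies.

```python
from types import SimpleNamespace


def audio_duration_seconds(audio: dict) -> float:
    """Return the clip length in seconds for an audio dictionary.

    Assumes audio["waveform"] exposes a .shape whose last axis is the
    sample count, and audio["sample_rate"] is a positive integer.
    """
    num_samples = audio["waveform"].shape[-1]
    sample_rate = audio["sample_rate"]
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_samples / sample_rate


# Stand-in for a real tensor: only the shape matters for this calculation.
fake_audio = {"waveform": SimpleNamespace(shape=(1, 1, 48000)), "sample_rate": 24000}
print(audio_duration_seconds(fake_audio))  # 2.0
```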

Mega TTS3 Run Usage Tips:

Ensure that the speaker parameter matches a valid speaker profile to achieve the desired vocal characteristics in the output audio.
Experiment with the time_step, p_w, and t_w parameters to find the optimal balance between speed, clarity, and naturalness for your specific application.
Use the unload_model parameter to manage system resources effectively, especially when working with large datasets or running multiple processes.
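
One way to organize the experimentation suggested above is a small grid of candidate settings, each run once with the same text and speaker. This is only a sketch: the candidate values are arbitrary examples above the documented minimums, not recommendations, and how each combination is fed to the node is left to your workflow.

```python
from itertools import product

# Example candidate values; all respect the documented minimums.
time_steps = [16, 32]
p_weights = [1.2, 1.6, 2.0]
t_weights = [2.5, 3.0]

# Every combination of the three parameters, as one dict per run.
grid = [
    {"time_step": ts, "p_w": pw, "t_w": tw}
    for ts, pw, tw in product(time_steps, p_weights, t_weights)
]

for run_settings in grid:
    print(run_settings)
```

Listening to the outputs side by side makes it easier to judge the trade-off between generation time and naturalness than changing one value at a time.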

Mega TTS3 Run Common Errors and Solutions:

FileNotFoundError: Speaker file not found

Explanation: This error occurs when the specified speaker profile does not exist in the system.
Solution: Verify that the speaker parameter is set to a valid and existing speaker profile.
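
A quick check of which speaker files exist can help track down this error. The sketch below assumes speaker profiles are stored as files in a single directory; the directory location, the ".wav" suffix, and the helper names list_speakers and check_speaker are all hypothetical and should be adapted to your installation.

```python
from pathlib import Path


def list_speakers(speaker_dir: Path, suffix: str = ".wav") -> list[str]:
    """Return the sorted file names in speaker_dir that end with suffix."""
    if not speaker_dir.is_dir():
        return []
    return sorted(p.name for p in speaker_dir.iterdir() if p.suffix == suffix)


def check_speaker(speaker: str, speaker_dir: Path) -> Path:
    """Return the speaker file path, or raise FileNotFoundError listing alternatives."""
    path = speaker_dir / speaker
    if not path.is_file():
        available = ", ".join(list_speakers(speaker_dir)) or "none found"
        raise FileNotFoundError(
            f"Speaker file not found: {path} (available: {available})"
        )
    return path
```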

ValueError: Invalid text input

Explanation: This error is raised when the input text is not a valid string or is empty.
Solution: Ensure that the text parameter is a non-empty string and is properly formatted.
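
If the text comes from another node or a file, it can be checked before it reaches the node. The helper below is a hypothetical pre-check, not part of the extension; it rejects non-string and empty input and collapses stray whitespace.

```python
def clean_text(text: object) -> str:
    """Return text with whitespace normalized, or raise ValueError if unusable."""
    if not isinstance(text, str):
        raise ValueError(f"text must be a string, got {type(text).__name__}")
    # Collapse runs of spaces, tabs, and newlines into single spaces.
    cleaned = " ".join(text.split())
    if not cleaned:
        raise ValueError("text must not be empty")
    return cleaned
```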

RuntimeError: CUDA out of memory

Explanation: This error indicates that the system has run out of GPU memory during processing.
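Solution: Set unload_model to True to free memory after processing. The inputs listed above include no batch-size or model-size option, so the other practical remedies are to synthesize shorter text segments or to free GPU memory held by other loaded models.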
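
Outside the node, cached GPU memory can also be released from Python. The sketch below is a generic PyTorch pattern, not the extension's own unload logic, and release_gpu_memory is a hypothetical helper name. It returns False when PyTorch or CUDA is unavailable, so it is safe to call on any machine.

```python
import gc


def release_gpu_memory() -> bool:
    """Collect garbage and clear PyTorch's CUDA cache; return True if the cache was cleared."""
    gc.collect()  # drop unreachable Python objects that may still hold tensors
    try:
        import torch
    except ImportError:
        return False  # PyTorch is not installed, nothing to clear
    if not torch.cuda.is_available():
        return False
    torch.cuda.empty_cache()  # return cached, unused blocks to the GPU driver
    return True
```

Note that this only frees memory that is no longer referenced; a model that is still loaded keeps its memory until it is unloaded.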

Mega TTS3 Run Related Nodes

Go back to the extension to check out more related nodes.

MW-ComfyUI_MegaTTS3
