ComfyUI Node: CLIPTextEncodeHunyuanDiT

Class Name: CLIPTextEncodeHunyuanDiT
Category: advanced/conditioning
Author: ComfyAnonymous (Account age: 833 days)
Extension: ComfyUI
Last Updated: 2025-04-05
GitHub Stars: 73.39K

Table of Contents

Description
CLIPTextEncodeHunyuanDiT:
CLIPTextEncodeHunyuanDiT Input Parameters:
CLIPTextEncodeHunyuanDiT Output Parameters:
CLIPTextEncodeHunyuanDiT Usage Tips:
CLIPTextEncodeHunyuanDiT Common Errors and Solutions:
Related Nodes

How to Install ComfyUI

Install this extension via the ComfyUI Manager by searching for ComfyUI:

1. Click the Manager button in the main menu.
2. Select the Custom Nodes Manager button.
3. Enter ComfyUI in the search bar.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

CLIPTextEncodeHunyuanDiT Description

Encode two text prompts, one for the BERT-based text encoder and one for mT5-XL, into conditioning for HunyuanDiT models.

CLIPTextEncodeHunyuanDiT:

The CLIPTextEncodeHunyuanDiT node encodes text prompts into conditioning for HunyuanDiT models. It takes two prompts: bert, which is tokenized for the BERT-based text encoder, and mt5xl, which is tokenized for the multilingual mT5-XL encoder. The object connected to the clip input tokenizes and encodes both prompts, and the result is a single conditioning output that can guide image generation. Providing the prompts separately lets you give each encoder its own wording, although the same text can be used for both. This node is useful for AI artists who want to use detailed, multiline, or dynamic prompts in their workflows.
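The flow described above can be sketched in plain Python. This is a minimal illustration, not the node's actual source: FakeClip, its tokenize and encode methods, and the "bert" token key are hypothetical stand-ins for whatever the real clip object exposes. Only the idea of tokenizing each prompt and combining the results comes from the description.

```python
class FakeClip:
    """Hypothetical stand-in for the object wired into the clip input."""

    def tokenize(self, text):
        words = text.split()
        # Assumed layout: one token list per text encoder.
        return {"bert": list(words), "mt5xl": list(words)}

    def encode(self, tokens):
        # Stand-in for real encoding: report how many tokens each encoder got.
        return {name: len(toks) for name, toks in tokens.items()}


def encode_prompts(clip, bert, mt5xl):
    # Tokenize the bert prompt, then swap in the mt5xl tokens from the
    # second prompt so each encoder sees its own text.
    tokens = clip.tokenize(bert)
    tokens["mt5xl"] = clip.tokenize(mt5xl)["mt5xl"]
    return clip.encode(tokens)


summary = encode_prompts(FakeClip(), "a red fox", "a red fox in deep snow")
print(summary)  # {'bert': 3, 'mt5xl': 6}
```

The point of the sketch is that the two prompts stay independent until the final encode step, which is why they can differ in length or wording.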

CLIPTextEncodeHunyuanDiT Input Parameters:

clip

The clip parameter is the CLIP model instance used to tokenize and encode the text inputs. Despite the name, for this node it is expected to bundle the text encoders the prompts are written for (BERT and mT5-XL) rather than a single encoder. The parameter is required: without it the node cannot process either prompt.

bert

The bert parameter is a string input holding the prompt for the BERT-based text encoder. It supports multiline text and dynamic prompts, so you can enter long, detailed descriptions. The text is tokenized through the clip input, and the resulting tokens feed the BERT side of the conditioning.

mt5xl

Like bert, the mt5xl parameter is a string input that supports multiline text and dynamic prompts. It holds the prompt for the mT5-XL encoder, a multilingual text-to-text transformer, which lets the node handle prompts in a wide range of languages. The text is tokenized through the clip input, and the resulting tokens feed the mT5-XL side of the conditioning.
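The three inputs above could be declared roughly as follows. This is a sketch that follows the usual ComfyUI custom-node convention of an INPUT_TYPES classmethod; the class name HunyuanDiTTextEncodeSketch is hypothetical, and the option keys multiline and dynamicPrompts are assumed to be how the multiline and dynamic-prompt support is switched on.

```python
class HunyuanDiTTextEncodeSketch:
    """Hypothetical declaration mirroring the inputs documented above."""

    @classmethod
    def INPUT_TYPES(cls):
        # Both prompts are strings with multiline and dynamic-prompt support.
        text_options = {"multiline": True, "dynamicPrompts": True}
        return {
            "required": {
                "clip": ("CLIP",),
                "bert": ("STRING", dict(text_options)),
                "mt5xl": ("STRING", dict(text_options)),
            }
        }

    RETURN_TYPES = ("CONDITIONING",)
    CATEGORY = "advanced/conditioning"


required = HunyuanDiTTextEncodeSketch.INPUT_TYPES()["required"]
print(list(required))
```

All three inputs sit under "required", which matches the description: the node has no optional parameters.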

CLIPTextEncodeHunyuanDiT Output Parameters:

CONDITIONING

The output is a conditioning object containing the encoded representations of both prompts. Connect it to any input that accepts CONDITIONING, such as the positive or negative input of a sampler, to guide image generation with the meaning and context of the input text.
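To make the shape of such an output concrete, here is a small sketch. It assumes the conditioning is a list of [embedding, options] pairs, which is how ComfyUI commonly passes conditioning around. Plain nested lists stand in for tensors, the batch dimension is omitted, and the "attention_mask" entry is purely illustrative.

```python
def describe_conditioning(conditioning):
    """Summarize each [embedding, options] pair without needing tensors."""
    report = []
    for embedding, options in conditioning:
        report.append({
            "tokens": len(embedding),
            "width": len(embedding[0]) if embedding else 0,
            "option_keys": sorted(options),
        })
    return report


# Hypothetical conditioning: 3 token vectors of width 4, plus one extra entry.
fake_conditioning = [[[[0.0] * 4 for _ in range(3)], {"attention_mask": [1, 1, 1]}]]
summary = describe_conditioning(fake_conditioning)
print(summary)
```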

CLIPTextEncodeHunyuanDiT Usage Tips:

For best results, provide detailed, context-rich prompts in both the bert and mt5xl parameters so that each encoder has enough text to work with.
Experiment with different prompts and compare the generated results. This shows how each text input affects the final output and helps you refine your prompts.
Use the multiline and dynamic prompt features to enter complex or varied descriptions. This helps the node capture fine details and context, leading to better conditioning outputs.
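One way to experiment with prompts is to generate several node entries and queue them one by one. The sketch below builds entries in the style of ComfyUI's API-format workflows, where each node has a class_type and an inputs mapping and a link is written as [node_id, output_index]. The helper name hunyuan_text_node, the node id "1", and the output index 1 are assumptions for illustration.

```python
import json


def hunyuan_text_node(bert, mt5xl, clip_link):
    """Build one API-format node entry (hypothetical helper)."""
    return {
        "class_type": "CLIPTextEncodeHunyuanDiT",
        "inputs": {"clip": clip_link, "bert": bert, "mt5xl": mt5xl},
    }


# Assumed link: output 1 of a loader node with id "1" provides the CLIP object.
clip_link = ["1", 1]
prompts = [
    "a lighthouse at dusk",
    "a lighthouse at dusk, storm clouds,\ncrashing waves, oil painting",
]

# Use the same text for both encoders in each variant.
variants = [hunyuan_text_node(p, p, clip_link) for p in prompts]
print(json.dumps(variants[0], indent=2))
```

Each variant can be dropped into a full workflow in place of the text encode node, keeping everything else fixed so that only the prompt changes between runs.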

CLIPTextEncodeHunyuanDiT Common Errors and Solutions:

"Invalid CLIP model instance"

Explanation: This error occurs when the clip parameter does not receive a valid CLIP model instance.
Solution: Ensure that you provide a valid and properly initialized CLIP model instance in the clip parameter.

"Text input is empty"

Explanation: This error occurs when the bert or mt5xl parameters receive empty text inputs.
Solution: Provide non-empty text prompts in the bert and mt5xl parameters to ensure the node can generate meaningful token representations.

"Tokenization failed"

Explanation: This error occurs when the text inputs cannot be tokenized by the CLIP model.
Solution: Verify that the text inputs are in a valid format and compatible with the CLIP model's tokenization process. If the issue persists, try simplifying the text prompts or breaking them into smaller segments.
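The first two checks above can be run before queuing a workflow. The following is a hypothetical pre-flight helper, not part of ComfyUI: it reuses the error names listed in this section as labels and assumes that a usable clip object exposes a tokenize method.

```python
def check_inputs(clip, bert, mt5xl):
    """Return a list of problems found in the node inputs (empty if none)."""
    problems = []
    # Assumption: a valid clip object has a tokenize method.
    if clip is None or not hasattr(clip, "tokenize"):
        problems.append("Invalid CLIP model instance")
    for name, text in (("bert", bert), ("mt5xl", mt5xl)):
        if not isinstance(text, str) or not text.strip():
            problems.append(f"Text input is empty: {name}")
    return problems


class DummyClip:
    """Minimal object that passes the tokenize check."""

    def tokenize(self, text):
        return text.split()


ok = check_inputs(DummyClip(), "a cat", "a cat on a sofa")
bad = check_inputs(None, "a cat", "   ")
print(ok, bad)
```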

CLIPTextEncodeHunyuanDiT Related Nodes

Go back to the extension to check out more related nodes.

ComfyUI
