Convert images into text descriptions using the Florence-2 model, enabling AI artists to generate descriptive text from visual inputs.
The MZ_Florence2CLIPTextEncode node converts images into text descriptions using the Florence-2 model. It encodes visual content into text, making it easier for AI artists to generate descriptive captions from images for applications such as image captioning and content generation. By using this node, you can harness the encoding capabilities of Florence-2 to create rich, meaningful text descriptions that enhance your creative projects and workflows.
The resolution parameter specifies the resolution at which the image is processed. It is an integer with a default of 512, a minimum of 128, and a maximum of 0xffffffffffffffff. Higher resolutions yield more detailed descriptions but may require more computational resources.
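As a rough illustration of how these bounds might be enforced, here is a minimal sketch; the helper name and constants are hypothetical, not part of the node's actual code.

```python
# Hypothetical helper: clamp a requested resolution into the documented range.
MIN_RESOLUTION = 128                  # documented minimum
MAX_RESOLUTION = 0xFFFFFFFFFFFFFFFF  # documented maximum

def clamp_resolution(resolution: int = 512) -> int:
    """Return a resolution guaranteed to fall inside the allowed range."""
    return max(MIN_RESOLUTION, min(int(resolution), MAX_RESOLUTION))
```

A value below 128 is raised to the minimum, while the default of 512 passes through unchanged.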
The keep_device parameter is a boolean that determines whether the model remains on the device after processing. It accepts False (default) or True. Setting it to True can improve performance for repeated operations by avoiding the overhead of reloading the model, at the cost of additional memory.
The seed parameter is an integer used to initialize the random number generator for reproducibility. It has a default of 0, a minimum of 0, and a maximum of 0xffffffffffffffff. Setting a specific seed ensures that the encoding process produces consistent results across different runs.
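The reproducibility guarantee can be illustrated with Python's standard random module as a stand-in for the node's internal generator (the function here is hypothetical):

```python
import random

def seeded_sample(seed: int, candidates):
    """Pick a candidate deterministically for a given seed."""
    rng = random.Random(seed)  # isolated generator, independent of global state
    return rng.choice(list(candidates))
```

Calling the function twice with the same seed always returns the same pick, which is exactly the behavior a fixed seed value buys you across workflow runs.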
The image parameter is an optional input that accepts the image to be encoded. It provides the visual content that the Florence-2 model transforms into a text description.
The clip parameter is an optional input that accepts a CLIP model. It can provide additional context or conditioning for the encoding process, potentially enhancing the quality and relevance of the generated text.
The captioner_config parameter is an optional input that accepts an ImageCaptionerConfig. It lets you customize the configuration of the image captioning process, giving you more control over how the text descriptions are generated.
The text output is a string containing the textual description generated from the input image. It provides a human-readable representation of the visual content for applications such as image captioning and content generation.
The conditioning output provides the context or conditioning information produced during the encoding process. It can be used to further refine or interpret the generated text, enhancing its relevance and accuracy.
- Set the keep_device parameter to True to improve performance by keeping the model loaded in memory.
- Use a fixed seed value to ensure reproducibility of the generated text descriptions across different runs.
- Adjust the captioner_config settings to customize the text generation process according to your specific needs and preferences.
- If the resolution parameter is set to a value outside the allowed range, set it to a value between 128 and 0xffffffffffffffff.
- If the image input is not provided, supply an image to be encoded.
- If the seed parameter is set to a value outside the allowed range, set it to a value between 0 and 0xffffffffffffffff.

© Copyright 2024 RunComfy. All Rights Reserved.