Enhance AI-generated images by blending visual and text elements for more accurate and appealing outputs.
The PhotoMakerEncode node is designed to enhance your AI-generated images by integrating specific visual elements into the text-based prompts used for image generation. It uses an encoding mechanism that fuses image embeddings with text embeddings, allowing for more nuanced and contextually rich outputs. By blending visual cues from an image with a textual description, you can produce more accurate and visually appealing results. This is particularly useful for tasks that require a high degree of visual-textual coherence, such as creating photorealistic images based on detailed descriptions.
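At a high level, the fusion can be pictured as swapping the embeddings of a trigger word in the tokenized prompt for embeddings derived from the input image. The sketch below is a conceptual illustration only, not the actual ComfyUI implementation; the function name, tensor shapes, and variable names are assumptions.

```python
# Conceptual sketch of PhotoMaker-style embedding fusion (not ComfyUI's actual code).
# Assumed shapes: text_embeds [B, T, D], class_tokens_mask [B, T] (bool, True at the
# trigger-word token positions), id_embeds [K, D] where K is the number of True entries.
import torch

def fuse_id_embeddings(text_embeds: torch.Tensor,
                       class_tokens_mask: torch.Tensor,
                       id_embeds: torch.Tensor) -> torch.Tensor:
    fused = text_embeds.clone()
    # Replace the trigger-word embeddings with the projected image (ID) embeddings,
    # leaving the rest of the prompt embeddings untouched.
    fused[class_tokens_mask] = id_embeds.to(fused.dtype)
    return fused
```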
photomaker: This parameter expects a PHOTOMAKER model, a pre-trained model specifically designed for encoding and integrating visual elements into text prompts. The model must be loaded and ready to use; its quality and specificity directly affect the effectiveness of the encoding process.
image: This parameter takes an IMAGE input, the visual element you want to integrate into your text prompt. The image should be in a format compatible with the photomaker model and should be relevant to the text prompt for optimal results, as illustrated in the sketch below.
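As a point of reference, ComfyUI IMAGE inputs are typically batched float tensors with values in [0, 1]. The snippet below shows one way to build such a tensor from an image file; the file name is purely illustrative.

```python
# Minimal sketch, assuming the common ComfyUI IMAGE layout:
# [batch, height, width, channels], float32, values in [0, 1].
import numpy as np
import torch
from PIL import Image

img = Image.open("reference_face.png").convert("RGB")   # illustrative path
arr = np.asarray(img).astype(np.float32) / 255.0         # H x W x C in [0, 1]
image_tensor = torch.from_numpy(arr).unsqueeze(0)        # 1 x H x W x C
```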
clip: This parameter requires a CLIP model, which is used to tokenize and encode the text prompt. The CLIP model generates embeddings that are compatible with the visual embeddings from the photomaker model, allowing the text and image data to be fused seamlessly.
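For context, ComfyUI's standard text-encode nodes follow a tokenize-then-encode pattern on the CLIP object, and PhotoMakerEncode builds on the same kind of text embeddings before injecting the image information. A minimal sketch of that pattern (the helper name and variable names are illustrative):

```python
# Sketch of the tokenize/encode pattern used with ComfyUI CLIP objects.
def encode_prompt(clip, text: str):
    # `clip` is the CLIP object wired into the node; `text` is the prompt string.
    tokens = clip.tokenize(text)
    cond, pooled = clip.encode_from_tokens(tokens, return_pooled=True)
    return cond, pooled
```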
text: This parameter accepts a STRING input, the text prompt you want to enhance with visual elements. The text can span multiple lines and supports dynamic prompts, allowing complex and detailed descriptions. The default value is "photograph of photomaker", but you can customize it to fit your specific needs.
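Note that the word photomaker in the default prompt is commonly treated as the trigger word that marks where the image identity is injected, so it is generally worth keeping in custom prompts. For example (the prompt text itself is illustrative):

```python
# Example prompt string; the word "photomaker" marks where the image identity is injected.
text = "close-up portrait photograph of photomaker, natural light, 85mm lens"
```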
The output of this node is a CONDITIONING parameter, which contains the enhanced text embeddings that now include visual elements from the provided image. This enriched conditioning can be used in subsequent nodes to generate more accurate and visually coherent AI-generated images. The output also includes a pooled output, which provides additional context for the generated embeddings.
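Downstream ComfyUI nodes such as samplers consume conditioning as a list of tensor/metadata pairs, with the pooled output usually carried in the metadata dictionary. A minimal sketch of that common structure, reusing the hypothetical encode_prompt helper from above:

```python
# Common ComfyUI conditioning structure: a list of [cond_tensor, options] pairs.
cond, pooled = encode_prompt(clip, text)                # hypothetical helper from above
conditioning = [[cond, {"pooled_output": pooled}]]      # wire into a sampler's positive input
```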
Troubleshooting: If you see an error stating that id_pixel_values does not match the expected dimensions, check that the input image is supplied in the shape and format the photomaker model expects. If you encounter errors involving class_tokens_mask, verify the class_tokens_mask to ensure it correctly identifies the positions of the image tokens in the text prompt; in practice this usually means confirming that the trigger word "photomaker" appears in the prompt.
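A couple of quick sanity checks along those lines, using the illustrative tensors from the earlier sketches:

```python
# Illustrative sanity checks for the two errors above.
assert image_tensor.ndim == 4 and image_tensor.shape[-1] == 3, \
    "IMAGE input should be a [batch, height, width, 3] tensor"
assert class_tokens_mask.any(), \
    "no trigger-word tokens found; check that 'photomaker' appears in the prompt"
```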