Select best images/masks based on similarity scores using CLIP model embeddings, with threshold and limit filters.
The ZuellniPickScoreSelector node selects the best-matching images, latents, or masks by scoring the similarity between image and text embeddings. It uses the CLIP model to compute these embeddings, then applies a threshold and a limit to filter and rank the results. This makes it easier to pick out the generated content that most closely matches a given text description, which is useful when you need to curate or refine a batch of outputs.
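The scoring idea can be illustrated with a minimal sketch. It is not the node's actual code: it assumes cosine similarity between each image embedding and the text embedding, followed by a softmax across the candidates so that scores fall between 0 and 1. The function names and the `scale` parameter are hypothetical, and plain Python lists stand in for real embedding tensors.

```python
import math

def normalize(vec):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def similarity_scores(image_embeds, text_embed, scale=1.0):
    """Return one score per image; scores are in [0, 1] and sum to 1."""
    text = normalize(text_embed)
    # Cosine similarity of each image embedding with the text embedding,
    # multiplied by a hypothetical scale factor.
    logits = [
        scale * sum(a * b for a, b in zip(normalize(img), text))
        for img in image_embeds
    ]
    # Softmax across candidates, shifted by the max for numerical stability.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy 2-D embeddings: the first image points the same way as the text.
image_embeds = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_embed = [1.0, 0.0]
scores = similarity_scores(image_embeds, text_embed, scale=10.0)
```

With these toy vectors, the first image scores highest, the diagonal one second, and the orthogonal one lowest.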
This parameter expects a pre-trained PickScore model (PS_MODEL) that has already been loaded into memory. The model generates the embeddings and performs the similarity scoring, so make sure it is initialized and loaded before this node runs.
The inputs parameter requires a tuple of image and text inputs (PS_INPUTS) that have been pre-processed by the Processor node. These inputs are used to generate the embeddings that will be compared to determine the similarity scores. Properly formatted and pre-processed inputs are crucial for accurate scoring.
The threshold parameter is a floating-point value that sets the minimum score required for an image, latent, or mask to be considered relevant. It ranges from 0.0 to 1.0, with a default value of 0.0. Adjusting this threshold allows you to control the strictness of the selection process, filtering out less relevant results.
The limit parameter is an integer that sets the maximum number of results to return. It ranges from 1 to 1000, with a default value of 1. Use it to cap how many top-scoring items the node returns.
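How threshold and limit interact can be sketched as follows. This is an illustration under assumptions, not the node's implementation: `select_top` is a hypothetical helper, and the sketch assumes that items scoring exactly at the threshold are kept.

```python
def select_top(scores, threshold=0.0, limit=1):
    """Return (index, score) pairs, best first.

    Items below `threshold` are dropped, then at most `limit` remain.
    """
    # Rank all candidates by score, highest first, remembering their indices.
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    # Keep only candidates that reach the threshold (inclusive by assumption).
    kept = [(i, s) for i, s in ranked if s >= threshold]
    # Cap the number of results.
    return kept[:limit]

selected = select_top([0.1, 0.6, 0.3], threshold=0.2, limit=2)
```

With the defaults (threshold 0.0, limit 1), the sketch returns only the single best-scoring item; raising the threshold above every score returns nothing.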
The images parameter is an optional input that accepts a list of images (IMAGE) to be scored and filtered. If provided, the node will return the top-scoring images based on the computed similarity scores. This parameter is useful when you want to directly work with image data.
The latents parameter is an optional input that accepts a dictionary of latent representations (LATENT) to be scored and filtered. If provided, the node will return the top-scoring latents. This is particularly useful for tasks involving latent space manipulations or generative models.
The masks parameter is an optional input that accepts a list of masks (MASK) to be scored and filtered. If provided, the node will return the top-scoring masks. This parameter is beneficial for tasks that involve segmentation or mask-based operations.
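Because images, latents, and masks are all optional, the same selected indices can be applied to whichever inputs are connected. The sketch below assumes that behavior; `pick` is a hypothetical helper, plain lists stand in for tensors, and the latent dictionary is assumed to follow ComfyUI's usual layout with a `"samples"` entry.

```python
def pick(indices, images=None, latents=None, masks=None):
    """Apply the same selected indices to each input that was provided."""
    picked_images = None
    picked_latents = None
    picked_masks = None
    if images is not None:
        picked_images = [images[i] for i in indices]
    if latents is not None:
        # LATENT is a dictionary; only the "samples" entry is filtered here.
        picked_latents = {"samples": [latents["samples"][i] for i in indices]}
    if masks is not None:
        picked_masks = [masks[i] for i in indices]
    return picked_images, picked_latents, picked_masks

# Stand-in data: strings instead of image or latent tensors.
images_out, latents_out, masks_out = pick(
    [2, 0],
    images=["img0", "img1", "img2"],
    latents={"samples": ["lat0", "lat1", "lat2"]},
)
```

Inputs that are not connected simply come back as `None` in this sketch.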
The SCORES output is a string that contains the similarity scores of the selected items, formatted as a comma-separated list of rounded values. This output helps you understand the relative relevance of each selected item based on the computed scores.
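A string of this kind could be produced as in the sketch below. The number of decimal places and the `", "` separator are assumptions, and `format_scores` is a hypothetical name.

```python
def format_scores(scores, digits=3):
    """Join rounded scores into one comma-separated string."""
    return ", ".join(str(round(s, digits)) for s in scores)

scores_text = format_scores([0.61234, 0.3])
```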
The IMAGES output is a list of the top-scoring images (IMAGE) that meet the specified threshold and limit criteria. This output is useful for visualizing and further processing the most relevant images that match the given text description.
The LATENTS output is a dictionary containing the top-scoring latent representations (LATENT) that meet the specified threshold and limit criteria. This output is essential for tasks that involve further manipulation or analysis of latent spaces.
The MASKS output is a list of the top-scoring masks (MASK) that meet the specified threshold and limit criteria. This output is valuable for segmentation tasks or any operations that require mask data.
InterruptProcessingException
Model not loaded
Invalid input format