Perform object detection and image analysis using pre-trained YOLO models for accurate identification of objects in images.
The UltralyticsInference node is designed to perform object detection and image analysis using pre-trained models from the Ultralytics YOLO series. This node leverages the power of YOLO (You Only Look Once) models to quickly and accurately identify objects within an image, making it an essential tool for AI artists who need to incorporate advanced image recognition capabilities into their projects. By using this node, you can detect various objects in an image, obtain their bounding boxes, and extract other relevant information such as masks, probabilities, and keypoints. The node is highly configurable, allowing you to adjust parameters like confidence threshold, intersection over union (IoU) threshold, image dimensions, and more, to fine-tune the detection process according to your specific needs.
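For orientation, the node's internal implementation is not documented here, but a minimal sketch using the standard Ultralytics Python API shows how its parameters plausibly map onto a YOLO inference call (the model file and image path are placeholders):

```python
from ultralytics import YOLO

# Load a pre-trained YOLO model (in ComfyUI this comes from a loader node).
model = YOLO("yolov8n.pt")

# Run inference with the same knobs the node exposes.
results = model.predict(
    "input.jpg",
    conf=0.25,           # confidence threshold
    iou=0.7,             # IoU threshold for non-maximum suppression
    imgsz=(640, 640),    # inference size as (height, width)
    device="cuda:0",     # or "cpu"
    half=False,          # half-precision inference
    augment=False,       # test-time augmentation
    agnostic_nms=False,  # class-agnostic NMS
)
```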
The model parameter expects an Ultralytics model that has been loaded using the UltralyticsModelLoader or CustomUltralyticsModelLoader nodes. This model is used to perform inference on the provided image.
The image parameter takes the image on which object detection will be performed. The image should be in a format the model can process.
The conf parameter sets the confidence threshold for object detection: the minimum confidence a detection must reach to be considered valid. The value ranges from 0 to 1, with a default of 0.25. Lowering this value yields more detections, including less confident ones, while raising it filters out less certain detections.
The iou parameter sets the intersection over union (IoU) threshold used for non-maximum suppression, which eliminates redundant overlapping boxes. The value ranges from 0 to 1, with a default of 0.7. A higher value lets more overlapping boxes survive suppression, while a lower value suppresses overlaps more aggressively and yields fewer boxes.
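To get a feel for how these two thresholds interact, you can run the same image through the raw Ultralytics API at a permissive and a strict setting and compare the number of surviving boxes (a sketch; the model and image are placeholders):

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# Permissive: low confidence bar, lenient NMS keeps more overlapping boxes.
loose = model.predict("input.jpg", conf=0.1, iou=0.9, verbose=False)[0]

# Strict: only confident detections, aggressive suppression of overlaps.
strict = model.predict("input.jpg", conf=0.6, iou=0.4, verbose=False)[0]

print(f"loose: {len(loose.boxes)} boxes, strict: {len(strict.boxes)} boxes")
```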
The height parameter sets the height of the image used for inference. The value ranges from 64 to 1280 pixels, with a default of 640 pixels. Adjusting this value affects both the accuracy and the speed of detection.
The width parameter sets the width of the image used for inference. The value ranges from 64 to 1280 pixels, with a default of 640 pixels. As with height, changing this value affects detection performance.
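Assuming the node forwards height and width as the Ultralytics imgsz argument (an assumption about the node's internals; imgsz itself is a standard predict argument), the speed/accuracy trade-off can be explored like this:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# Smaller inference size: faster, but small objects may be missed.
fast = model.predict("input.jpg", imgsz=(320, 320), verbose=False)[0]

# Larger inference size: slower, but better at small or distant objects.
accurate = model.predict("input.jpg", imgsz=(1280, 1280), verbose=False)[0]

print(f"320px: {len(fast.boxes)} boxes, 1280px: {len(accurate.boxes)} boxes")
```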
The device parameter specifies the device used for inference, with options including cuda:0 for a GPU and cpu for the CPU. Using a GPU can significantly speed up the inference process.
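A common pattern is to pick the device at runtime and only enable half precision when a GPU is present, since FP16 mainly pays off on CUDA hardware (a sketch using the standard torch and Ultralytics APIs):

```python
import torch
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# Prefer the first GPU when available, otherwise fall back to the CPU.
device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Half precision is only worthwhile on CUDA devices.
results = model.predict("input.jpg", device=device, half=(device != "cpu"))
```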
The half parameter is a boolean that determines whether half-precision floating-point numbers are used during inference. The default value is False. Enabling it can reduce memory usage and increase inference speed on compatible hardware.
The augment parameter is a boolean that indicates whether test-time augmentation is applied during inference. The default value is False. Enabling augmentation can improve detection robustness but may increase inference time.
The agnostic_nms parameter is a boolean that specifies whether class-agnostic non-maximum suppression is used. The default value is False. When enabled, non-maximum suppression is applied across all classes at once, so overlapping boxes of different classes can suppress each other; this is useful when the same object tends to be detected under several class labels.
The classes parameter lets you specify a comma-separated list of class names to filter the detections. The default value is "None", meaning all classes are considered. Providing specific class names limits the detections to those classes only.
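Under the hood, Ultralytics filters by class index rather than by name, so a comma-separated name list presumably gets mapped through the model's class table before prediction (the node's exact parsing is an assumption; model.names and the classes argument are standard API):

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# Map class names to the integer ids Ultralytics expects.
wanted = {name.strip() for name in "person, dog".split(",")}
ids = [i for i, name in model.names.items() if name in wanted]

# Only detections belonging to the listed classes are returned.
results = model.predict("input.jpg", classes=ids)
```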
This output contains the overall results of the inference, including detected objects and their associated data.
This output provides the original image with the detected objects overlaid, allowing you to visualize the results directly.
This output contains the bounding boxes for the detected objects, represented as coordinates in the image.
This output includes the segmentation masks for the detected objects, if available.
This output provides the confidence scores for each detected object, indicating the likelihood of each detection being correct.
This output contains keypoints for the detected objects, which can be useful for tasks requiring detailed object analysis.
This output includes oriented bounding boxes (OBB) for the detected objects, capturing the rotation of each detection in addition to its position and size.
This output provides the class labels for the detected objects, indicating the type of each detected object.
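These outputs correspond to the fields of an Ultralytics Results object; the sketch below shows how the same data is reached through the raw API (the node's exact tensor formats are an assumption):

```python
from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")  # a segmentation variant so masks are populated
result = model.predict("input.jpg", verbose=False)[0]

annotated = result.plot()            # image with detections drawn (BGR array)
boxes = result.boxes.xyxy            # bounding boxes as (x1, y1, x2, y2)
scores = result.boxes.conf           # per-detection confidence scores
labels = [result.names[int(c)] for c in result.boxes.cls]  # class names

# Masks, keypoints, and OBBs are None unless the model variant produces them.
masks = result.masks.data if result.masks is not None else None
```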
Adjust the conf parameter to balance detection sensitivity against precision. Lower values may detect more objects but include false positives, while higher values are more selective.
Use the device parameter to leverage GPU acceleration for faster inference times, especially when processing large images or batches.
Experiment with the height and width parameters to find the optimal image size for your specific use case, balancing detection accuracy against processing speed.
Enable augment if you need more robust detections in varied conditions, but be aware that it may increase inference time.
Set device to cpu if a GPU is not available.
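Putting the first tip into practice, a quick sweep over conf values shows where detections start to drop off for a given image (a sketch with placeholder paths):

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# Sweep the confidence threshold to find a sensitivity/precision balance.
for conf in (0.1, 0.25, 0.5, 0.75):
    boxes = model.predict("input.jpg", conf=conf, verbose=False)[0].boxes
    print(f"conf={conf:.2f}: {len(boxes)} detections")
```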