Official Flux Tools - Flux Fill for Inpainting and Outpainting

Wan 2.1 Control LoRA | Depth and Tile

Advance Wan 2.1 video generation with lightweight depth and tile LoRAs for improved structure and detail.

FLUX Inpainting | Seamless Image Editing

Effortlessly fill, remove, and refine images, seamlessly integrating new content.

AnimateDiff + ControlNet + AutoMask | Comic Style

Effortlessly restyle videos, converting realistic characters into anime while keeping the original backgrounds intact.

ComfyUI > Nodes > ComfyUI-DataSet

ComfyUI Extension: ComfyUI-DataSet

Repo Name

ComfyUI-DataSet

Author
daxcay (Account age: 380 days) Nodes
View all nodes(14) Latest Updated
2025-03-01 Github Stars
0.05K

Github Ask daxcay Current Questions Past Questions

Table of Content

Description
How ComfyUI-DataSet Works
ComfyUI-DataSet Features
Troubleshooting ComfyUI-DataSet
Learn More about ComfyUI-DataSet
Related Nodes

How to Install ComfyUI-DataSet

Install this extension via the ComfyUI Manager by searching for ComfyUI-DataSet

1. Click the Manager button in the main menu
2. Select Custom Nodes Manager button
3. Enter ComfyUI-DataSet in the search bar

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

Visit ComfyUI Online for ready-to-use ComfyUI environment

Free trial available
16GB VRAM to 80GB VRAM GPU machines
400+ preloaded models/nodes
Freedom to upload custom models/nodes
200+ ready-to-run workflows
100% private workspace with up to 200GB storage
Dedicated Support

Run ComfyUI Online

ComfyUI-DataSet Description

ComfyUI-DataSet offers data research, preparation, and manipulation nodes for model trainers, artists, designers, and animators, featuring tools like captions, visualizer, and text manipulator.

ComfyUI-DataSet Introduction

ComfyUI-DataSet is an extension designed to assist AI artists and model trainers in managing and manipulating datasets. This extension provides a variety of nodes that help you visualize, organize, and process your data efficiently. Whether you are preparing data for training models or analyzing existing datasets, ComfyUI-DataSet offers tools to streamline these tasks, making it easier to handle large volumes of data and extract meaningful insights.

How ComfyUI-DataSet Works

ComfyUI-DataSet operates through a series of nodes that you can integrate into your workflow. Each node performs a specific function, such as visualizing data, copying files, or extracting specific information from text files. By connecting these nodes, you can create complex data processing pipelines tailored to your needs. Think of it as building blocks that you can combine in various ways to achieve your desired outcome.

For example, you might use a node to load text files, another to analyze the frequency of words, and a third to visualize this data in a graph. This modular approach allows you to customize your data processing workflow without needing to write any code.

ComfyUI-DataSet Features

DataSet_Visualizer

The DataSet_Visualizer node helps you visualize dataset captions by generating graphs. It includes:

Word Cloud: Shows token frequency with different font sizes.
Network Graph: Illustrates relationships between tokens.
Frequency Graph: Displays how often each token appears.

Inputs

TextFileContents: The text to be processed.
Seperator: Delimiter used to separate tokens (comma, colon, space, pipe).
WordCloudTop: Number of top tokens for the Word Cloud.
NetworkGraphTop: Number of top tokens for the Network Graph.
FrequencyGraphTop: Number of top tokens for the Frequency Graph.

Outputs

GraphsPaths: Paths to the generated visualizations.
GraphsImages: The generated images for the visualizations.

DataSet_CopyFiles

The DataSet_CopyFiles node copies files from a source folder to a destination folder using different modes:

BlindCopy: Copies all files.
CopyByDestinationFiles: Copies files only if a matching file exists in the destination.

Inputs

source_folder: Path to the source folder.
destination_folder: Path to the destination folder.
copy_mode: Mode of copying (BlindCopy, CopyByDestinationFiles).

DataSet_TriggerWords

The DataSet_TriggerWords node extracts trigger words from captions, identifying tokens that contain both letters and numbers.

Inputs

TextFileContents: The text to be processed.
search: Mode of extraction (trigger_word_only, trigger_word_phrase).

Outputs

Words: The extracted trigger words or phrases.

DataSet_TextFilesLoadFromList

This node processes basic attributes of text files, such as filenames and contents, from a list of file paths.

Inputs

TextFilePathsList: List of file paths to the text files.

Outputs

TextFileNames: Names of the text files.
TextFileNamesWithoutExtension: Names without extensions.
TextFilePaths: File paths.
TextFileContents: Contents of the text files.

DataSet_TextFilesLoad

Similar to the above, but uses a directory path to load text files.

Inputs

directory: Directory path where the text files are located.

Outputs

TextFileNames: Names of the text files.
TextFileNamesWithoutExtension: Names without extensions.
TextFilePaths: File paths.
TextFileContents: Contents of the text files.

DataSet_TextFilesSave

This node saves text file contents to a specified directory with various modes like overwriting, merging, and creating new files.

Inputs

TextFileNames: Names of the text files.
TextFileContents: Contents of the text files.
destination: Directory path for saving.
save_mode: Mode of saving (Overwrite, Merge, SaveNew, MergeAndSaveNew).

DataSet_FindAndReplace

The DataSet_FindAndReplace node finds and replaces text patterns within caption text files.

Inputs

TextFileContents: The text to be processed.
SearchFor: The text pattern to search for.
ReplaceWith: The replacement text.

Outputs

TextFileContents: The modified text contents.

DataSet_PathSelector

This node identifies images in a sub-dataset that are missing caption text files from a larger repository.

Inputs

search_in_directory: Directory with missing pairings.
search_for_extensions: Extensions of the orphaned files.
select_from_directory: Repository directory with complete pairings.
select_extensions: Extensions of the required files.

Outputs

SelectedNamesWithExtension: Names with extensions.
SelectedNamesWithoutExtension: Names without extensions.
SelectedPaths: Full paths of the required files.

DataSet_ConceptManager

The DataSet_ConceptManager node adds or removes tokens within caption files and places them at designated positions.

Inputs

TextFileContents: The text to be processed.
Mode: Mode of operation (add, remove).
Concepts: Concepts to add or remove.

Outputs

TextFileContents: The modified text contents.

DataSet_OpenAIChat

This node uses the OpenAI GPT chat to help generate prompts.

Inputs

model: OpenAI model to use.
api_url: API URL.
api_key: API key.
prompt: The query chat.
token_length: Maximum number of tokens.

Outputs

STRING: The generated prompt.

DataSet_LoadImage

Provides essential image file attributes for captioning with the DataSet_OpenAIChat node.

Inputs

image: Name of the image file.

Outputs

IMAGE: The image file.
MASK: The mask associated with the image.
STRING: Name of the image file.
STRING: Name without extension.
STRING: Full path of the image file.
STRING: Directory path of the image file.

DataSet_SaveImage

Batch saves images to a specified directory with optional PNG metadata.

Inputs

Images: List of images to save.
ImageFilePrefix: Prefix for the saved image filenames.
destination: Directory path for saving.

DataSet_OpenAIChatImage

Uses the OpenAI GPTo multi-modal vision API to caption images.

Inputs

image: Image to be processed.
image_detail: Detail level of the image.
prompt: Text prompt for the AI model.
model: OpenAI model to use.
api_url: API URL.
api_key: API key.
token_length: Maximum token length.

Outputs

STRING: Generated captions.

DataSet_OpenAIChatImageBatch

Extends the functionality of DataSet_OpenAIChatImage to process batches of images.

Inputs

images: List of images to be processed.
image_detail: Detail level of the images.
prompt: Text prompt for the AI model.
model: OpenAI model to use.
api_url: API URL.
api_key: API key.
token_length: Maximum token length.

Outputs

STRING: List of generated captions.

Troubleshooting ComfyUI-DataSet

Common Issues and Solutions

Node Not Working as Expected:

Ensure all required inputs are provided.
Check for any error messages in the console.
Restart ComfyUI and try again.

File Not Found Errors:

Verify the file paths are correct.
Ensure the files exist in the specified directories.

API Key Issues:

Double-check the API key for OpenAI nodes.
Ensure the API key has the necessary permissions.

Frequently Asked Questions

Q: How do I update ComfyUI-DataSet? A: Follow the installation instructions to update the extension. Restart ComfyUI after updating.

Q: Can I use ComfyUI-DataSet with other extensions? A: Yes, ComfyUI-DataSet is designed to work alongside other extensions. Ensure there are no conflicts between nodes.

Learn More about ComfyUI-DataSet

For additional resources, tutorials, and community support, visit the following links:

ComfyUI GitHub Repository
ComfyUI-DataSet GitHub Repository
ComfyUI Community Forums These resources provide comprehensive guides, examples, and a platform to ask questions and share your experiences with other AI artists.

ComfyUI-DataSet Related Nodes

DataSet_ConceptManager

DataSet_CopyFiles

DataSet_FindAndReplace

DataSet_LoadImage

DataSet_OpenAIChat

DataSet_OpenAIChatImage

DataSet_OpenAIChatImageBatch

DataSet_PathSelector

DataSet_SaveImage

DataSet_TextFilesLoad

DataSet_TextFilesLoadFromList

DataSet_TextFilesSave

DataSet_TriggerWords

DataSet_Visualizer

Table of Content

Description
How ComfyUI-DataSet Works
ComfyUI-DataSet Features
Troubleshooting ComfyUI-DataSet
Learn More about ComfyUI-DataSet
Related Nodes

FLUX Dev ControlNet | Multi-Condition ControlNet

Controlled FLUX Dev image generation with Pose, Depth, Canny, and ReColor

ACE-Step Music Generation | AI Audio Creation

Generate studio-quality music 15× faster with breakthrough diffusion technology.

Fluxtapoz | RF Inversion and Stylization

Fluxtapoz Nodes for RF Inversion and Stylization - Unsampling and Sampling

PuLID Flux II | Consistent Character Generation

Generate images with precise character control while preserving artistic style.

Support

Resources

Legal

RunComfy

RunComfy is the premier ComfyUI platform, offering ComfyUI online environment and services, along with ComfyUI workflows featuring stunning visuals. RunComfy also provides AI Playground, enabling artists to harness the latest AI tools to create incredible art.