ComfyUI > Nodes > ComfyUI-DataSet

ComfyUI Extension: ComfyUI-DataSet

Repo Name

ComfyUI-DataSet

Author
daxcay (Account age: 134 days)
Nodes
View all nodes(14)
Latest Updated
2024-08-02
Github Stars
0.02K

How to Install ComfyUI-DataSet

Install this extension via the ComfyUI Manager by searching for ComfyUI-DataSet
  • 1. Click the Manager button in the main menu
  • 2. Select Custom Nodes Manager button
  • 3. Enter ComfyUI-DataSet in the search bar
After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

Visit ComfyUI Online for ready-to-use ComfyUI environment

  • Free trial available
  • High-speed GPU machines
  • 200+ preloaded models/nodes
  • Freedom to upload custom models/nodes
  • 50+ ready-to-run workflows
  • 100% private workspace with up to 200GB storage
  • Dedicated Support

Run ComfyUI Online

ComfyUI-DataSet Description

ComfyUI-DataSet offers data research, preparation, and manipulation nodes for model trainers, artists, designers, and animators, featuring tools like captions, visualizer, and text manipulator.

ComfyUI-DataSet Introduction

ComfyUI-DataSet is an extension designed to assist AI artists and model trainers in managing and manipulating datasets. This extension provides a variety of nodes that help you visualize, organize, and process your data efficiently. Whether you are preparing data for training models or analyzing existing datasets, ComfyUI-DataSet offers tools to streamline these tasks, making it easier to handle large volumes of data and extract meaningful insights.

How ComfyUI-DataSet Works

ComfyUI-DataSet operates through a series of nodes that you can integrate into your workflow. Each node performs a specific function, such as visualizing data, copying files, or extracting specific information from text files. By connecting these nodes, you can create complex data processing pipelines tailored to your needs. Think of it as building blocks that you can combine in various ways to achieve your desired outcome.

For example, you might use a node to load text files, another to analyze the frequency of words, and a third to visualize this data in a graph. This modular approach allows you to customize your data processing workflow without needing to write any code.

ComfyUI-DataSet Features

DataSet_Visualizer

The DataSet_Visualizer node helps you visualize dataset captions by generating graphs. It includes:

  • Word Cloud: Shows token frequency with different font sizes.
  • Network Graph: Illustrates relationships between tokens.
  • Frequency Graph: Displays how often each token appears.

Inputs

  • TextFileContents: The text to be processed.
  • Seperator: Delimiter used to separate tokens (comma, colon, space, pipe).
  • WordCloudTop: Number of top tokens for the Word Cloud.
  • NetworkGraphTop: Number of top tokens for the Network Graph.
  • FrequencyGraphTop: Number of top tokens for the Frequency Graph.

Outputs

  • GraphsPaths: Paths to the generated visualizations.
  • GraphsImages: The generated images for the visualizations.

DataSet_CopyFiles

The DataSet_CopyFiles node copies files from a source folder to a destination folder using different modes:

  • BlindCopy: Copies all files.
  • CopyByDestinationFiles: Copies files only if a matching file exists in the destination.

Inputs

  • source_folder: Path to the source folder.
  • destination_folder: Path to the destination folder.
  • copy_mode: Mode of copying (BlindCopy, CopyByDestinationFiles).

DataSet_TriggerWords

The DataSet_TriggerWords node extracts trigger words from captions, identifying tokens that contain both letters and numbers.

Inputs

  • TextFileContents: The text to be processed.
  • search: Mode of extraction (trigger_word_only, trigger_word_phrase).

Outputs

  • Words: The extracted trigger words or phrases.

DataSet_TextFilesLoadFromList

This node processes basic attributes of text files, such as filenames and contents, from a list of file paths.

Inputs

  • TextFilePathsList: List of file paths to the text files.

Outputs

  • TextFileNames: Names of the text files.
  • TextFileNamesWithoutExtension: Names without extensions.
  • TextFilePaths: File paths.
  • TextFileContents: Contents of the text files.

DataSet_TextFilesLoad

Similar to the above, but uses a directory path to load text files.

Inputs

  • directory: Directory path where the text files are located.

Outputs

  • TextFileNames: Names of the text files.
  • TextFileNamesWithoutExtension: Names without extensions.
  • TextFilePaths: File paths.
  • TextFileContents: Contents of the text files.

DataSet_TextFilesSave

This node saves text file contents to a specified directory with various modes like overwriting, merging, and creating new files.

Inputs

  • TextFileNames: Names of the text files.
  • TextFileContents: Contents of the text files.
  • destination: Directory path for saving.
  • save_mode: Mode of saving (Overwrite, Merge, SaveNew, MergeAndSaveNew).

DataSet_FindAndReplace

The DataSet_FindAndReplace node finds and replaces text patterns within caption text files.

Inputs

  • TextFileContents: The text to be processed.
  • SearchFor: The text pattern to search for.
  • ReplaceWith: The replacement text.

Outputs

  • TextFileContents: The modified text contents.

DataSet_PathSelector

This node identifies images in a sub-dataset that are missing caption text files from a larger repository.

Inputs

  • search_in_directory: Directory with missing pairings.
  • search_for_extensions: Extensions of the orphaned files.
  • select_from_directory: Repository directory with complete pairings.
  • select_extensions: Extensions of the required files.

Outputs

  • SelectedNamesWithExtension: Names with extensions.
  • SelectedNamesWithoutExtension: Names without extensions.
  • SelectedPaths: Full paths of the required files.

DataSet_ConceptManager

The DataSet_ConceptManager node adds or removes tokens within caption files and places them at designated positions.

Inputs

  • TextFileContents: The text to be processed.
  • Mode: Mode of operation (add, remove).
  • Concepts: Concepts to add or remove.

Outputs

  • TextFileContents: The modified text contents.

DataSet_OpenAIChat

This node uses the OpenAI GPT chat to help generate prompts.

Inputs

  • model: OpenAI model to use.
  • api_url: API URL.
  • api_key: API key.
  • prompt: The query chat.
  • token_length: Maximum number of tokens.

Outputs

  • STRING: The generated prompt.

DataSet_LoadImage

Provides essential image file attributes for captioning with the DataSet_OpenAIChat node.

Inputs

  • image: Name of the image file.

Outputs

  • IMAGE: The image file.
  • MASK: The mask associated with the image.
  • STRING: Name of the image file.
  • STRING: Name without extension.
  • STRING: Full path of the image file.
  • STRING: Directory path of the image file.

DataSet_SaveImage

Batch saves images to a specified directory with optional PNG metadata.

Inputs

  • Images: List of images to save.
  • ImageFilePrefix: Prefix for the saved image filenames.
  • destination: Directory path for saving.

DataSet_OpenAIChatImage

Uses the OpenAI GPTo multi-modal vision API to caption images.

Inputs

  • image: Image to be processed.
  • image_detail: Detail level of the image.
  • prompt: Text prompt for the AI model.
  • model: OpenAI model to use.
  • api_url: API URL.
  • api_key: API key.
  • token_length: Maximum token length.

Outputs

  • STRING: Generated captions.

DataSet_OpenAIChatImageBatch

Extends the functionality of DataSet_OpenAIChatImage to process batches of images.

Inputs

  • images: List of images to be processed.
  • image_detail: Detail level of the images.
  • prompt: Text prompt for the AI model.
  • model: OpenAI model to use.
  • api_url: API URL.
  • api_key: API key.
  • token_length: Maximum token length.

Outputs

  • STRING: List of generated captions.

Troubleshooting ComfyUI-DataSet

Common Issues and Solutions

  1. Node Not Working as Expected:
  • Ensure all required inputs are provided.
  • Check for any error messages in the console.
  • Restart ComfyUI and try again.
  1. File Not Found Errors:
  • Verify the file paths are correct.
  • Ensure the files exist in the specified directories.
  1. API Key Issues:
  • Double-check the API key for OpenAI nodes.
  • Ensure the API key has the necessary permissions.

Frequently Asked Questions

Q: How do I update ComfyUI-DataSet? A: Follow the installation instructions to update the extension. Restart ComfyUI after updating.

Q: Can I use ComfyUI-DataSet with other extensions? A: Yes, ComfyUI-DataSet is designed to work alongside other extensions. Ensure there are no conflicts between nodes.

Learn More about ComfyUI-DataSet

For additional resources, tutorials, and community support, visit the following links:

ComfyUI-DataSet Related Nodes

RunComfy

© Copyright 2024 RunComfy. All Rights Reserved.

RunComfy is the premier ComfyUI platform, offering ComfyUI online environment and services, along with ComfyUI workflows featuring stunning visuals.