
ComfyUI Node: NNT Text Batch Processor

Class Name

NntTextBatchProcessor

Category
NNT Neural Network Toolkit/Text
Author
inventorado (Account age: 3209 days)
Extension
ComfyUI Neural Network Toolkit NNT
Last Updated
2025-01-08
Github Stars
0.07K

How to Install ComfyUI Neural Network Toolkit NNT

Install this extension via the ComfyUI Manager by searching for ComfyUI Neural Network Toolkit NNT
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter ComfyUI Neural Network Toolkit NNT in the search bar.
After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.


NNT Text Batch Processor Description

Efficiently tokenize and prepare text data in batches for machine learning models using the `transformers` library.

NNT Text Batch Processor:

The NntTextBatchProcessor is a specialized node designed to efficiently handle and process batches of text data, and it is particularly useful for natural language processing tasks. Its primary function is to tokenize and prepare text data for further processing by machine learning models, ensuring that the data is in a suitable format for model consumption. By leveraging the capabilities of the transformers library, this node can handle large volumes of text by splitting them into manageable batches, applying tokenization, and converting them into tensor formats that are compatible with PyTorch.

This process is crucial for optimizing the performance of models that require input data in a specific format, such as sequence-to-sequence models or transformers. The node's ability to process text in batches not only enhances efficiency but also ensures that the text data is uniformly prepared, which is essential for maintaining consistency in model training and inference.
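The split-tokenize-pad-batch flow described above can be sketched in plain Python. This is a hedged illustration, not the node's actual implementation: a toy whitespace tokenizer stands in for a real `transformers` tokenizer, and the function name, defaults, and fake integer IDs are all hypothetical.

```python
# Hedged sketch of the split -> tokenize -> pad -> batch pipeline.
# A toy whitespace tokenizer stands in for a real `transformers` tokenizer;
# names and defaults here are illustrative, not the node's actual code.

def process_text_batches(texts, separator="\n", max_length=8, batch_size=2):
    # Split the raw input string into individual pieces of text.
    pieces = [p.strip() for p in texts.split(separator) if p.strip()]

    # Toy tokenization: map each whitespace token to a fake integer ID.
    vocab = {}
    def encode(text):
        ids = [vocab.setdefault(tok, len(vocab) + 1) for tok in text.split()]
        # Truncate, then pad with 0 to max_length for a uniform shape.
        ids = ids[:max_length]
        return ids + [0] * (max_length - len(ids))

    encoded = [encode(p) for p in pieces]

    # Group the fixed-length rows into batches of batch_size.
    batches = [encoded[i:i + batch_size]
               for i in range(0, len(encoded), batch_size)]
    info = {"num_texts": len(pieces),
            "num_batches": len(batches),
            "max_length": max_length}
    return batches, len(batches), info

batches, num_batches, info = process_text_batches(
    "hello world\nthe quick brown fox\nhello again", separator="\n")
```

In a real workflow the `encode` step would be replaced by a `transformers` tokenizer call and the batches converted to PyTorch tensors, but the batching logic follows the same shape.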

NNT Text Batch Processor Input Parameters:

texts

This parameter represents the raw text data that you want to process. It is expected to be a single string containing multiple pieces of text separated by a specified separator. The function of this parameter is to provide the node with the text data that needs to be tokenized and processed into batches. There are no specific minimum or maximum values for this parameter, but the length and content of the text will impact the number of batches created and the processing time.

separator

The separator is a string used to split the input text into individual pieces. Its function is to delineate where one piece of text ends and another begins within the texts parameter. The choice of separator can significantly impact how the text is divided and subsequently processed. Common separators include spaces, commas, or newline characters, depending on how the input text is structured.
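To make the effect concrete, here is a small, purely illustrative example (not the node's code) showing how the same input splits differently depending on the separator:

```python
# Illustration only: the same raw string yields different pieces
# depending on the chosen separator.
raw = "first text|second text|third text"

by_pipe = [t.strip() for t in raw.split("|")]     # three pieces
by_newline = [t.strip() for t in raw.split("\n")]  # one piece (no newlines present)
```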

max_length

This parameter defines the maximum length of the tokenized sequences. It ensures that each piece of text is truncated or padded to this specified length, which is crucial for maintaining uniform input sizes for models. The max_length parameter directly affects the memory usage and processing time, as longer sequences require more resources. There is no default value provided, but it should be set according to the model's requirements.
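The truncate-or-pad behavior can be sketched as follows. This is a simplified stand-in, not the tokenizer's internal logic; `0` stands in for whatever pad token ID the chosen tokenizer actually uses.

```python
# Hedged sketch of what max_length implies: longer sequences are
# truncated, shorter ones padded (0 stands in for the pad token ID).
def fit_to_length(token_ids, max_length=5, pad_id=0):
    clipped = token_ids[:max_length]
    return clipped + [pad_id] * (max_length - len(clipped))

long_seq = fit_to_length([11, 12, 13, 14, 15, 16, 17])  # truncated to 5
short_seq = fit_to_length([21, 22])                      # padded to 5
```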

batch_size

The batch size determines how many pieces of text are processed together in a single batch. This parameter is critical for optimizing the processing speed and resource utilization. A larger batch size can lead to faster processing but may require more memory, while a smaller batch size is more memory-efficient but may slow down the processing. The choice of batch size should balance these considerations based on the available resources.
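The chunking behavior, including the smaller final batch when the text count is not an exact multiple of `batch_size`, can be illustrated with a short sketch (not the node's code):

```python
# Illustrative only: chunking N sequences into batches of batch_size.
# The last batch may be smaller when N is not a multiple of batch_size.
def make_batches(sequences, batch_size):
    return [sequences[i:i + batch_size]
            for i in range(0, len(sequences), batch_size)]

seqs = list(range(10))          # stand-ins for 10 tokenized sequences
batches = make_batches(seqs, 4)  # two full batches plus a remainder of 2
```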

tokenizer

This parameter specifies the name or path of the tokenizer to be used for processing the text. The tokenizer is responsible for converting the text into token IDs that the model can understand. The choice of tokenizer can affect the quality and compatibility of the tokenized output with the model. It is important to select a tokenizer that matches the model architecture you plan to use.

output_dtype

The output_dtype parameter defines the data type of the output tensor. It ensures that the tokenized data is converted into a format that is compatible with the model's input requirements. The choice of data type can impact the precision and performance of the model, with common options including int32, int64, or float32. Selecting the appropriate data type is crucial for maintaining the model's accuracy and efficiency.
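One plausible way such a string option maps onto a PyTorch dtype is sketched below. This assumes PyTorch is installed; the mapping keys simply mirror the options mentioned above and are not taken from the node's source.

```python
# Hedged sketch: mapping an output_dtype string to a torch dtype.
# Assumes PyTorch is installed; the keys mirror the options mentioned
# above (int32, int64, float32) and are illustrative only.
import torch

DTYPE_MAP = {"int32": torch.int32,
             "int64": torch.int64,
             "float32": torch.float32}

def to_tensor(batch, output_dtype="int64"):
    return torch.tensor(batch, dtype=DTYPE_MAP[output_dtype])
```

Token IDs are normally kept as integer types (`int64` is the usual PyTorch embedding-lookup dtype); a float dtype would only be appropriate for models expecting floating-point inputs.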

NNT Text Batch Processor Output Parameters:

batched_tokens

This output parameter is a tensor containing the tokenized text data, organized into batches. Its function is to provide a structured and model-ready format of the input text, ensuring that each batch is of uniform length and data type. The batched_tokens tensor is essential for feeding the processed text into machine learning models, and its shape and data type are determined by the input parameters such as max_length and output_dtype.

num_batches

The num_batches output indicates the total number of batches created from the input text. This parameter is important for understanding how the text data was divided and processed, providing insights into the batch processing efficiency and resource utilization. It helps in assessing whether the chosen batch size and text length were appropriate for the given task.
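Assuming the node fills batches greedily, the batch count follows directly from the text count and batch size, which makes the output easy to sanity-check:

```python
# Illustration: the expected batch count for N texts and a given
# batch_size, assuming batches are filled greedily.
import math

def expected_num_batches(num_texts, batch_size):
    return math.ceil(num_texts / batch_size)
```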

info

This output provides a detailed summary of the processing operation, including the number of texts processed, the number of batches created, the shape of the tokenized data, the output data type, and the tokenizer used. The info parameter is valuable for debugging and verifying that the text processing was executed as expected, offering a comprehensive overview of the node's operation.

NNT Text Batch Processor Usage Tips:

  • Ensure that the separator parameter is correctly set to match the structure of your input text, as this will affect how the text is split into individual pieces.
  • Choose a max_length that aligns with your model's requirements to avoid unnecessary truncation or padding, which can impact model performance.
  • Adjust the batch_size based on your system's memory capacity to optimize processing speed without exceeding available resources.
  • Select a tokenizer that is compatible with your model architecture to ensure that the tokenized output is correctly interpreted by the model.

NNT Text Batch Processor Common Errors and Solutions:

Tokenizer not found

  • Explanation: This error occurs when the specified tokenizer cannot be located or loaded, possibly due to an incorrect name or path.
  • Solution: Verify that the tokenizer parameter is set to a valid name or path of a pre-trained tokenizer available in the transformers library.
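A defensive loading pattern along these lines can surface the problem early with a clearer message. This is a hedged sketch, not the node's error handling; `AutoTokenizer.from_pretrained` is the standard `transformers` entry point, and the wrapper name is hypothetical.

```python
# Hedged sketch: validating a tokenizer name before batch processing.
# Assumes the `transformers` library is available; the wrapper function
# name is hypothetical and not part of the node.
def load_tokenizer(name):
    try:
        from transformers import AutoTokenizer
        return AutoTokenizer.from_pretrained(name)
    except Exception as err:
        # Re-raise with a message naming the offending identifier.
        raise ValueError(f"Tokenizer not found: {name!r} ({err})")
```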

Text exceeds max_length

  • Explanation: This error arises when the input text is longer than the specified max_length, causing content to be truncated and potentially lost.
  • Solution: Consider increasing the max_length parameter or ensure that the input text is appropriately segmented to fit within the specified length.

Insufficient memory for batch size

  • Explanation: This error occurs when the chosen batch_size exceeds the available memory capacity, causing processing to fail.
  • Solution: Reduce the batch_size to a level that your system can handle, or increase the available memory resources if possible.

NNT Text Batch Processor Related Nodes

Go back to the extension to check out more related nodes.
ComfyUI Neural Network Toolkit NNT
Copyright 2025 RunComfy. All Rights Reserved.