Breaks text into manageable chunks for efficient processing, offering flexibility in chunking methods and word boundary preservation.
The TextChunker node is designed to break down large blocks of text into smaller, more manageable chunks. This is particularly useful for processing lengthy documents or texts where handling the entire content at once is impractical. By dividing the text into chunks, you can perform more efficient and targeted operations on each segment. The TextChunker node offers flexibility in how the text is divided, allowing you to choose between chunking by words or characters. Additionally, it provides an option to respect word boundaries, ensuring that chunks do not split words inappropriately. This node is essential for tasks that require text segmentation, such as natural language processing, document analysis, and data preparation for machine learning models.
The text parameter accepts the text that you want to chunk. It must be a string and may span multiple lines. This is the primary input that the node divides into smaller segments.
The chunk_size parameter determines the size of each chunk. When chunking by words, it specifies the number of words per chunk. When chunking by characters, it specifies the number of characters per chunk. The default value is 1000, with a minimum of 1 and a maximum of 10000. Smaller values produce more, finer-grained chunks; larger values produce fewer, coarser ones.
The chunk_method parameter selects the method of chunking. It can be set to either "words" or "characters". Selecting "words" divides the text by word count, while "characters" divides it by character count. The same chunk_size value therefore yields very different chunk lengths depending on the method.
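The two chunking methods can be illustrated with a minimal sketch. This is not the node's actual implementation: the function names chunk_by_words and chunk_by_characters are hypothetical, and splitting on whitespace with str.split() is an assumption about how words are identified.

```python
def chunk_by_words(text: str, chunk_size: int) -> list[str]:
    # Assumption: a "word" is any run of non-whitespace characters.
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


def chunk_by_characters(text: str, chunk_size: int) -> list[str]:
    # Plain fixed-width slicing; words may be cut in the middle.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


sample = "the quick brown fox jumps over the lazy dog"
word_chunks = chunk_by_words(sample, 4)
char_chunks = chunk_by_characters(sample, 10)
print(word_chunks)
print(char_chunks)
```

With a size of 4, word chunking groups four words per chunk, whereas character chunking with a size of 10 slices every ten characters regardless of where words begin or end.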
The respect_word_boundaries parameter is a boolean that determines whether to respect word boundaries when chunking by characters. If set to true, the node ensures that chunks do not split words, producing cleaner and more readable segments. The default value is true. Use it when you want to keep every word intact within its chunk.
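One way to respect word boundaries when chunking by characters is to pack whole words greedily until the size limit would be exceeded. The sketch below is only an illustration under stated assumptions: the function name is hypothetical, words are assumed to be separated by whitespace, and a single word longer than the chunk size is kept whole, which may differ from what the node actually does.

```python
def chunk_chars_respecting_words(text: str, chunk_size: int) -> list[str]:
    chunks: list[str] = []
    current = ""
    for word in text.split():
        # Try to append the next word to the chunk being built.
        candidate = word if not current else current + " " + word
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # Assumption: an oversized word is kept whole, not split.
            current = word
    if current:
        chunks.append(current)
    return chunks


sample = "the quick brown fox jumps over the lazy dog"
boundary_chunks = chunk_chars_respecting_words(sample, 10)
print(boundary_chunks)
```

Every chunk stays within ten characters here, and no word is cut, at the cost of chunks that are sometimes shorter than the limit.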
This output parameter returns the resulting chunks of text as a list of strings. Each string in the list represents a chunk of the original text, divided according to the specified chunk size and method. This output allows you to easily access and process each segment individually.
Set the respect_word_boundaries parameter to true when chunking by characters so that chunks do not split words.
Adjust the chunk_size parameter based on the length of your text and the desired granularity of the chunks. For shorter texts, a smaller chunk size may be more appropriate.
Use the chunk_method parameter to choose the most suitable chunking method for your task. Chunking by words is often more natural for text analysis, while chunking by characters can be useful for specific technical applications.
If the node reports a problem with its input, check that the text to chunk is supplied through the text parameter.
If the node reports an invalid chunk index (the message refers to <number_of_chunks>), make sure the selected_index parameter is within the valid range of chunk indices. Adjust the index to a value between 0 and the number of available chunks minus one.
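The index range rule for selected_index can be sketched as a small guard. The helper name select_chunk and the error messages are hypothetical and are not the node's own wording; only the valid range, 0 to the number of chunks minus one, comes from the text above.

```python
def select_chunk(chunks: list[str], selected_index: int) -> str:
    if not chunks:
        # Hypothetical message; raised when there is nothing to select.
        raise ValueError("no chunks available")
    if not 0 <= selected_index < len(chunks):
        # Valid indices run from 0 to len(chunks) - 1.
        raise IndexError(
            f"selected_index must be between 0 and {len(chunks) - 1}"
        )
    return chunks[selected_index]


chunks = ["first chunk", "second chunk", "third chunk"]
print(select_chunk(chunks, 2))
```

Checking the index explicitly also rejects negative values, which Python list indexing would otherwise accept silently.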