Generate text using the Molmo model, integrating visual cues for coherent and contextually relevant output.
MolmoGenerateText is a node that generates text with the Molmo model, which processes both visual and textual inputs. You supply a series of images along with a textual prompt, and the model generates text grounded in the visual content provided. Because visual cues feed directly into text generation, the node suits applications that need to interpret image and text data together. It is especially useful for creative projects where you want to describe images or generate narratives informed by visual elements.
This parameter specifies the vision model to be used for text generation. The chosen model determines how the provided images and prompts are interpreted and what text is generated from them.
A list of images that the model will use as input. The number of images should match the number of [IMG] tokens in the prompt. These images provide the visual context necessary for generating relevant text.
A string that serves as an initial prompt or context for the model. It can be multiline and is used to set the stage for the text generation process. The default value is an empty string.
This is the main textual input that guides the text generation. It should include [IMG] tokens corresponding to the images provided. The default prompt is "Describe this image."
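The requirement that the number of images match the number of [IMG] tokens can be checked before running the node. The following is a minimal sketch of such a check; the helper name `check_image_tokens` is hypothetical and is not part of the node itself.

```python
IMG_TOKEN = "[IMG]"  # placeholder token named in the parameter descriptions


def check_image_tokens(prompt: str, images: list) -> int:
    """Return the number of [IMG] tokens, or raise if it differs from len(images).

    Hypothetical helper for illustration; the node's own validation may differ.
    """
    n_tokens = prompt.count(IMG_TOKEN)
    if n_tokens != len(images):
        raise ValueError(
            f"prompt has {n_tokens} {IMG_TOKEN} token(s) but {len(images)} image(s) were given"
        )
    return n_tokens


# One image and one [IMG] token: the check passes.
check_image_tokens("[IMG] Describe this image.", ["image_0"])
```

A prompt with two [IMG] tokens and a single image would raise a `ValueError` here, which surfaces the mismatch before any generation is attempted.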
An integer that sets the maximum number of new tokens the model can generate. This controls the length of the generated text, with a default of 256 and a range from 1 to 4096.
A boolean that determines whether sampling is used during text generation. When set to true, the model will generate more diverse outputs. The default value is true.
A float that influences the randomness of the text generation. Higher values result in more random outputs, while lower values make the output more deterministic. The default is 0.3, with a minimum of 0.
A float that sets the cumulative probability threshold for token selection. Only the most probable tokens whose combined probability reaches this threshold are considered, so lower values reduce the diversity of the generated text. The default value is 0.9, with a range from 0.0 to 1.0.
An integer that limits the number of highest probability tokens to consider during generation. Lower values restrict the choice to the most likely tokens, which focuses the output. The default is 40, with a minimum of 1.
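To illustrate how the sampling switch, the randomness value, the cumulative probability threshold, and the token limit typically interact, here is a minimal pure-Python sketch of one common ordering (temperature scaling, then top-k, then top-p). The function name `sample_token` and its argument names are assumptions made for this example; the Molmo model's actual decoding code may order or implement these steps differently.

```python
import math
import random


def sample_token(logits, temperature=0.3, top_k=40, top_p=0.9,
                 do_sample=True, rng=None):
    """Pick one token index from raw logits (illustrative sketch only)."""
    # Without sampling, or at temperature 0, fall back to the most likely token.
    if not do_sample or temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    rng = rng or random.Random()

    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep only the k most probable tokens, then renormalize.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    kept_mass = sum(probs[i] for i in order)

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i] / kept_mass
        if cumulative >= top_p:
            break

    # Sample among the surviving tokens in proportion to their probability.
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]
```

With `top_k=1` or a very small `top_p`, only the most likely token survives the filters, so the result matches the non-sampling case regardless of the temperature.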
A string that specifies the stopping criteria for text generation. The model will stop generating text when it encounters this string. The default is `
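The stopping behavior described above can be sketched as a simple post-processing step on generated text. The helper `truncate_at_stop` and the `<END>` marker below are hypothetical and chosen only for illustration; whether the node keeps or drops the stop string itself in its output is not stated here, and this sketch drops it.

```python
def truncate_at_stop(text: str, stop: str) -> str:
    """Cut text at the first occurrence of the stop string (illustrative only)."""
    if not stop:
        return text  # an empty stop string disables truncation in this sketch
    index = text.find(stop)
    return text if index == -1 else text[:index]


# "<END>" is a made-up marker for this example, not the node's default.
result = truncate_at_stop("A cat sits on a sofa.<END> unrelated text", "<END>")
```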
© Copyright 2024 RunComfy. All Rights Reserved.