Comfyui_MiniCPMv2_6-prompt-generator by ComfyUI enables single-image captioning, prompt generation from uploaded images, and batch-image prompt generation, enhancing image-to-text capabilities.
The Comfyui_MiniCPMv2_6-prompt-generator is an extension that automatically generates image captions or prompts, which is particularly useful for AI artists preparing LoRA (Low-Rank Adaptation) or DreamBooth training data for Flux-series models. It uses a fine-tuned model to create natural-language descriptions of images. These descriptions can be short or long prompts, which makes it easier to build training data for AI art projects.
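As an illustration of how generated captions can become training data, here is a minimal sketch that writes one caption file next to each image, a layout many LoRA training tools accept. The function name `write_caption_files` and the `caption_fn` callback are hypothetical stand-ins for whatever produces the text (for example, output copied from this extension). They are not part of the extension itself.

```python
from pathlib import Path
from typing import Callable, List

# Image extensions to pick up; adjust to match your dataset (assumption).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def write_caption_files(image_dir: Path, caption_fn: Callable[[Path], str]) -> List[Path]:
    """Write a .txt caption beside every image in image_dir.

    caption_fn is a hypothetical callback that returns the caption or
    prompt text for one image path.
    """
    written = []
    for image_path in sorted(image_dir.iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files
        caption = caption_fn(image_path).strip()
        caption_path = image_path.with_suffix(".txt")
        caption_path.write_text(caption, encoding="utf-8")
        written.append(caption_path)
    return written
```

Passing the captioner in as a callback keeps the file handling separate from the model call, so the same loop works with any source of captions.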
The extension works by using a model that has been fine-tuned on a dataset of MidJourney prompts, so it can generate captions and prompts for images in a natural-language style. You upload an image and select the desired caption method (single-image caption, short prompt, or long prompt). The model then processes the image and generates the corresponding text output.
Single-Image Caption: Upload an image and set caption_method to "caption". The model will generate a descriptive caption for the image.
Short Prompt Generation: Upload an image and set caption_method to "short_prompt". The model will generate a concise prompt for the image.
Long Prompt Generation: Upload an image and set caption_method to "long_prompt". The model will generate a detailed prompt for the image.
Image Regeneration: Use the generated prompt as input to a CLIP node to regenerate the image through a text-to-image (t2i) model.
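The three caption_method values can be summarized in a small lookup. The sketch below is a hypothetical helper, not code from the extension. It only validates a value and returns the kind of output described above.

```python
# The three caption_method values named in the text, mapped to the kind of
# output each one produces. The helper itself is a hypothetical illustration.
CAPTION_METHODS = {
    "caption": "descriptive caption",
    "short_prompt": "concise prompt",
    "long_prompt": "detailed prompt",
}


def describe_caption_method(value: str) -> str:
    """Return the output kind for a caption_method, or raise on a bad value."""
    method = value.strip().lower()
    if method not in CAPTION_METHODS:
        valid = ", ".join(sorted(CAPTION_METHODS))
        raise ValueError(f"unknown caption_method {value!r}; expected one of: {valid}")
    return CAPTION_METHODS[method]
```

Failing early on an unrecognized value makes a typo in caption_method obvious before any image is processed.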
The extension uses a model that is fine-tuned on a MidJourney prompt dataset. This model can generate both short and long prompts for images in a natural-language style. It was trained on more than 3,000 samples, including images and prompts sourced from MidJourney, and the int4 quantized version runs with lower GPU memory usage (about 7GB).
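Because the model is loaded from local files under ComfyUI\models\LLM\, a quick check that this directory exists and is not empty can save a failed run. This is a minimal sketch: the function name is hypothetical, and it does not verify which model files are present or whether they are complete.

```python
from pathlib import Path


def llm_model_dir_ready(comfyui_root: Path) -> bool:
    """Return True if <comfyui_root>/models/LLM exists and contains anything.

    This is only a presence check (assumption: the model lives somewhere
    under models/LLM); it does not validate the model files themselves.
    """
    llm_dir = comfyui_root / "models" / "LLM"
    return llm_dir.is_dir() and any(llm_dir.iterdir())
```

For example, `llm_model_dir_ready(Path("ComfyUI"))` can be called from the folder that contains your ComfyUI installation.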
If the model fails to load, check that it is present in the ComfyUI\models\LLM\ directory. If not, download it manually.
If the output is not what you expect, check which caption_method is set. Experiment with different images to see if the issue persists.
© Copyright 2024 RunComfy. All Rights Reserved.