Comfyui_MiniCPMv2_6-prompt-generator by ComfyUI enables single-image captioning, prompt generation from uploaded images, and batch-image prompt generation, enhancing image-to-text capabilities.
The Comfyui_MiniCPMv2_6-prompt-generator is an extension that automatically generates image captions or prompts, which is particularly useful for AI artists preparing LoRA (Low-Rank Adaptation) or DreamBooth training data for FLUX-series models. The extension uses a fine-tuned model, MiniCPMv2_6-prompt-generator, to write natural-language descriptions of images. The descriptions can be short or long prompts, which makes it easier to build training data for AI art projects.
The extension works by using a fine-tuned version of the MiniCPM-V 2.6 model, which has been trained on a dataset of MidJourney prompts. This model can generate captions and prompts for images in a natural language style. The process involves uploading an image and selecting the desired caption method (single-image caption, short prompt, or long prompt). The model then processes the image and generates the corresponding text output.
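The selection step described above can be sketched as a small dispatch function. This is only an illustration: the three caption_method values come from this page, while describe_image and the model callable are hypothetical names standing in for however the extension actually invokes the model.

```python
from typing import Callable

# The three caption_method values named on this page.
CAPTION_METHODS = ("caption", "short_prompt", "long_prompt")


def describe_image(image: object, caption_method: str,
                   model: Callable[[object, str], str]) -> str:
    """Validate caption_method, then ask the model for text.

    `model` is a hypothetical callable (image, caption_method) -> str;
    it is a stand-in for the fine-tuned model, not the extension's real API.
    """
    if caption_method not in CAPTION_METHODS:
        raise ValueError(
            f"caption_method must be one of {CAPTION_METHODS}, got {caption_method!r}"
        )
    # Strip stray whitespace so the text can be passed on to other nodes as-is.
    return model(image, caption_method).strip()
```

Rejecting unknown values up front means a typo in caption_method fails immediately instead of producing output in an unexpected style.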
Single-Image Caption: Upload an image and set the caption_method to "caption". The model will generate a descriptive caption for the image.
Short Prompt Generation: Upload an image and set the caption_method to "short_prompt". The model will generate a concise prompt for the image.
Long Prompt Generation: Upload an image and set the caption_method to "long_prompt". The model will generate a detailed prompt for the image.
Image Regeneration: Use the generated prompt as input to a CLIP node to regenerate the image through a text-to-image (t2i) model.
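For batch use when preparing training data, one common convention is to store each caption in a text file next to its image. The sketch below assumes that convention. write_sidecar_captions and the captioner callable are hypothetical names, and the captioner stands in for a call to the model in any of the modes listed above.

```python
from pathlib import Path
from typing import Callable, List

# Image extensions to caption; an assumption, adjust to your dataset.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def write_sidecar_captions(image_dir: str,
                           captioner: Callable[[Path], str]) -> List[Path]:
    """Write `<image name>.txt` beside every image in image_dir.

    `captioner` is a hypothetical callable that returns the caption or
    prompt text for one image path.
    """
    written = []
    for image_path in sorted(Path(image_dir).iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files, including existing captions
        text_path = image_path.with_suffix(".txt")
        text_path.write_text(captioner(image_path).strip() + "\n", encoding="utf-8")
        written.append(text_path)
    return written
```

Passing the captioner in as a function keeps the file handling independent of which caption_method is used.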
The extension uses the MiniCPMv2_6-prompt-generator model, which is fine-tuned on a MidJourney prompt dataset. This model can generate both short and long prompts for images in a natural language style. The model is trained with over 3000 samples, including images and prompts sourced from MidJourney, and it operates efficiently with lower GPU memory usage (about 7GB) when using the int4 quantized version.
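As a rough sanity check on the 7GB figure, the memory taken by the weights alone can be estimated from the parameter count and the bits per weight. The count of about 8 billion parameters is an assumption taken from the upstream MiniCPM-V 2.6 description, not from this page, and the estimate ignores activations, the KV cache, and any layers kept at higher precision.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Memory needed to store the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 1024**3


# Assumption: roughly 8 billion parameters (upstream MiniCPM-V 2.6 figure).
PARAMS = 8e9

int4_gib = weight_memory_gib(PARAMS, 4)
fp16_gib = weight_memory_gib(PARAMS, 16)
print(f"int4 weights: {int4_gib:.1f} GiB, fp16 weights: {fp16_gib:.1f} GiB")
```

Under these assumptions the int4 weights come to a little under 4 GiB, so a total of about 7GB is plausible once runtime buffers and unquantized components are included. The same weights in fp16 would need four times as much memory.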
Check that the model is present in the ComfyUI\models\LLM\ directory. If not, download it manually from MiniCPMv2_6-prompt-generator.
Check that caption_method is set as intended. Experiment with different images to see if the issue persists.
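The check for the model under ComfyUI\models\LLM\ can be sketched as follows. The models/LLM location comes from this page. The subfolder name passed as model_folder and the function name model_is_installed are assumptions for illustration.

```python
from pathlib import Path


def model_is_installed(comfyui_root: str, model_folder: str) -> bool:
    """Return True if <comfyui_root>/models/LLM/<model_folder> exists and is not empty.

    `model_folder` is a hypothetical subfolder name; use whatever name the
    downloaded model directory actually has on your machine.
    """
    model_dir = Path(comfyui_root) / "models" / "LLM" / model_folder
    # An empty folder usually means an interrupted or failed download.
    return model_dir.is_dir() and any(model_dir.iterdir())
```

If this returns False, the manual download described above is the next step.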
© Copyright 2024 RunComfy. All Rights Reserved.