Install this extension via the ComfyUI Manager by searching for ComfyUI-Ollama-Describer:
1. Click the Manager button in the main menu
2. Select the Custom Nodes Manager button
3. Enter ComfyUI-Ollama-Describer in the search bar
After installation, click the Restart button to restart ComfyUI, then manually refresh your browser to clear the cache and load the updated list of nodes.
ComfyUI-Ollama-Describer Introduction
ComfyUI-Ollama-Describer is an extension for ComfyUI that allows you to leverage various large language models (LLMs) provided by Ollama. These models include Gemma, Llava (multimodal), Llama2, Llama3, and Mistral. This extension is designed to help AI artists and other users easily integrate advanced language models into their workflows, enabling tasks such as image description, text generation, and more.
ComfyUI-Ollama-Describer works by connecting ComfyUI with the Ollama library, which provides access to various LLMs. These models can process and generate text based on the input provided. For example, you can use an image as input to generate descriptive text or use a text prompt to generate creative content. The extension simplifies the interaction with these models, making it accessible even for users without a strong technical background.
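Under the hood, the extension communicates with an Ollama server over its HTTP API. As a rough sketch (the helper function name here is hypothetical, and it assumes a local Ollama server at the default address with a model such as llava already pulled), a request similar to what the extension issues can be built like this:

```python
import json
import urllib.request

def build_generate_request(api_host, model, prompt, images=None):
    """Build an HTTP request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if images:
        payload["images"] = images  # base64-encoded image data
    return urllib.request.Request(
        f"{api_host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

# Example: ask llava to describe an image (requires a running Ollama server)
req = build_generate_request(
    "http://localhost:11434", "llava", "Describe this image."
)
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["response"])
```

The extension's nodes take care of this plumbing for you; the sketch is only meant to show what "connecting ComfyUI with the Ollama library" amounts to in practice.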
ComfyUI-Ollama-Describer Features
Ollama Image Describer
This feature allows you to use images as input to generate descriptive text. You can add the node via Ollama -> Ollama Image Describer.
images: The image(s) used for extracting or processing information. Some models, like Llava, can accept multiple images.
model: Choose a model variant such as the 7B, 13B, or 34B parameter versions. Larger models (with more parameters) provide more detailed responses but require more memory and processing power.
custom_model: Use a model not listed by default by specifying its name from the Ollama library (https://ollama.com/library).
api_host: Specify the API address for model communication, either locally or remotely.
timeout: Set the maximum response time before the request is canceled.
temperature: Adjust the creativity of the response. Higher values result in more creative but less accurate responses.
top_k: Limits the number of possible next words to consider, reducing the chance of nonsensical outputs.
top_p: Controls the diversity of the generated text. Higher values lead to more diverse outputs.
repeat_penalty: Penalizes repeated phrases to ensure varied responses.
seed_number: Set a specific seed for reproducible results.
max_tokens: Define the maximum length of the response.
keep_model_alive: Determine how long the model stays in memory after generation.
prompt: The text prompt or question for the model.
system_context: Provide additional context to influence the model's responses.
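Most of the sampling parameters above map directly onto option names in Ollama's generate API. The helper below is a hypothetical sketch of that mapping (the function name is mine, not the extension's), using Ollama's actual option keys:

```python
def node_params_to_options(temperature, top_k, top_p, repeat_penalty,
                           seed_number, max_tokens):
    """Map the node's sampling parameters onto Ollama option names."""
    return {
        "temperature": temperature,       # creativity of the output
        "top_k": top_k,                   # candidate-token cutoff
        "top_p": top_p,                   # nucleus-sampling threshold
        "repeat_penalty": repeat_penalty, # discourage repeated phrases
        "seed": seed_number,              # fixed seed -> reproducible output
        "num_predict": max_tokens,        # response length cap, in tokens
    }

# Conservative settings for factual image descriptions
opts = node_params_to_options(temperature=0.4, top_k=40, top_p=0.9,
                              repeat_penalty=1.1, seed_number=42,
                              max_tokens=200)
```

Note that max_tokens corresponds to Ollama's num_predict option, so the limit is counted in tokens rather than characters.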
Ollama Text Describer
This feature allows you to generate text based on a given prompt. Add the node via Ollama -> Ollama Text Describer.
prepend_text: Optional text to add at the beginning.
append_text: Optional text to add at the end.
replace_find_mode: Choose between normal string replacement or regex-based replacement.
replace_find: The string or regex pattern to find in the text.
replace_with: The replacement string for matches found.
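The prepend/append and find/replace parameters above amount to simple text post-processing. A minimal sketch of that behavior (the function name and defaults are illustrative, not the extension's actual implementation):

```python
import re

def transform_text(text, prepend_text="", append_text="",
                   replace_find_mode="normal",
                   replace_find="", replace_with=""):
    """Wrap the text, then apply normal or regex find/replace."""
    result = f"{prepend_text}{text}{append_text}"
    if replace_find:
        if replace_find_mode == "regex":
            result = re.sub(replace_find, replace_with, result)
        else:
            result = result.replace(replace_find, replace_with)
    return result

# Normal string replacement
print(transform_text("a cat on a mat", prepend_text="Photo of ",
                     replace_find="mat", replace_with="rug"))
# -> Photo of a cat on a rug

# Regex replacement: collapse runs of whitespace
print(transform_text("too   many   spaces", replace_find_mode="regex",
                     replace_find=r"\s+", replace_with=" "))
# -> too many spaces
```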
ComfyUI-Ollama-Describer Models
Available Models
Gemma: Suitable for general-purpose text generation.
Llava (multimodal): Can process both text and images.
Llama2: A robust model for various text generation tasks.
Llama3: A newer generation of the Llama family, offering improved response quality over Llama2.
Mistral: A compact, efficient model that delivers strong accuracy for its size.
Custom Models
You can also use custom models by specifying their names from the Ollama library (https://ollama.com/library). This flexibility allows you to choose models that best fit your specific needs.
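A custom model must be present in your local Ollama installation before the node can use it. Assuming the ollama CLI is installed, pulling and verifying a model looks like this (gemma:2b is just an example name from the library):

```shell
# Pull a model by its library name (see https://ollama.com/library)
ollama pull gemma:2b
# List locally available models to confirm the exact name the node expects
ollama list
```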
Troubleshooting ComfyUI-Ollama-Describer
Common Issues and Solutions
Model Not Loading: Ensure that the model name is correctly specified and that the model is available in the Ollama library.
Slow Response Time: Check your hardware capabilities and consider using a smaller model or increasing the timeout parameter.
Unexpected Outputs: Adjust the temperature, top_k, and top_p parameters to fine-tune the model's responses.
Frequently Asked Questions
How do I install the extension?
Follow the installation steps at the top of this page using the ComfyUI Manager.
Can I use multiple images with the Image Describer?
Yes, some models like Llava support multiple images.
What is the best model for my hardware?
For most users, the 7b or 13b models offer a good balance between performance and resource requirements.
Learn More about ComfyUI-Ollama-Describer
For additional resources, tutorials, and community support, visit the following links:
Ollama GPU Installation Guide
These resources provide comprehensive guides and support to help you make the most out of the ComfyUI-Ollama-Describer extension.