ComfyUI ExLlamaV2 Nodes is a local text generator for ComfyUI, leveraging the ExLlamaV2 inference library. It requires manual package installation and provides efficient text generation capabilities within the ComfyUI framework.
ComfyUI-ExLlama-Nodes is an extension designed to enhance the capabilities of ComfyUI by integrating it with ExLlamaV2, a powerful local text generation library. This extension allows AI artists to generate high-quality text locally on their machines, leveraging the advanced features of ExLlamaV2. Whether you're creating stories, dialogues, or any other text-based content, ComfyUI-ExLlama-Nodes provides a seamless and efficient way to produce text with minimal setup.
At its core, ComfyUI-ExLlama-Nodes works by connecting ComfyUI with ExLlamaV2, enabling local text generation on modern consumer GPUs. ExLlamaV2 is an inference library that supports various models and quantization techniques, making it versatile and efficient. The extension provides nodes that load models, generate text based on prompts, and display the generated text within the ComfyUI interface.
The Loader node is responsible for loading models from the models/llm directory. It offers several customization options:

- cache_bits: the precision of the cache; lower values reduce VRAM usage.
- fast_tensors: speeds up model loading.
- flash_attention: reduces VRAM usage on GPUs that support it.
- max_seq_len: the maximum context length; 0 defaults to the model's configuration.

The Generator node generates text based on a given prompt. Key parameters include:

- stop_conditions: a list of strings that end generation; ["\n"] stops generation on a newline.
- max_tokens: the maximum number of tokens to generate; 0 uses the available context.

The Previewer node displays the generated text within the ComfyUI interface, allowing users to review and interact with the output.
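Under the hood, these nodes drive the ExLlamaV2 Python library. The following is a minimal sketch of an equivalent load-and-generate flow, assuming the exllamav2 package's documented example API; the model path is a placeholder, and the extension's actual node code may differ:

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

# Load a model from the llm directory, roughly what the Loader node does.
config = ExLlamaV2Config()
config.model_dir = "models/llm/my-model-exl2"  # placeholder path
config.prepare()
# config.max_seq_len = 4096  # override the context length; omit to keep the model's default

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)  # split layers across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)

# Generate from a prompt, roughly what the Generator node does.
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8

print(generator.generate_simple("Once upon a time", settings, num_tokens=200))
```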
The Replacer node replaces variable names in brackets (e.g., [a]) with their corresponding values, making it easier to manage dynamic content within the generated text.
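As a rough illustration of the idea (not the node's actual implementation), the substitution behaves like a simple regex replacement:

```python
import re

def replace_vars(text: str, values: dict[str, str]) -> str:
    """Replace [name] placeholders with entries from values; unknown names are left intact."""
    return re.sub(r"\[(\w+)\]", lambda m: values.get(m.group(1), m.group(0)), text)

prompt = "Write a story about [a] who lives in [b]."
print(replace_vars(prompt, {"a": "a dragon", "b": "the mountains"}))
# -> Write a story about a dragon who lives in the mountains.
```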
ComfyUI-ExLlama-Nodes supports various models, including EXL2, 4-bit GPTQ, and unquantized models. These models can be found on Hugging Face. To use a model with the extension, download it to the models/llm directory.
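One way to fetch a model is with the huggingface_hub package; in this sketch the repository ID and revision are placeholders, since EXL2 repositories often keep each quantization level on its own branch:

```python
from huggingface_hub import snapshot_download

# Download an EXL2-quantized model into ComfyUI's models/llm directory.
snapshot_download(
    repo_id="user/some-model-exl2",        # placeholder repository ID
    revision="4.0bpw",                     # placeholder branch for the desired quant level
    local_dir="models/llm/some-model-exl2",
)
```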
If you run into out-of-memory errors or slow model loading, a few Loader settings can help:

- Lower the cache_bits value in the Loader node settings to reduce VRAM usage.
- Enable flash_attention if your GPU supports it.
- Enable fast_tensors in the Loader node settings to speed up model loading.
- Lower the max_seq_len value to decrease the context length.
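For reference, these switches map onto ExLlamaV2's own configuration; a hedged sketch assuming recent versions of the exllamav2 API (the node exposes these as widgets, so this is illustrative rather than the extension's code):

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache_Q4

config = ExLlamaV2Config()
config.model_dir = "models/llm/my-model-exl2"  # placeholder path
config.prepare()
config.max_seq_len = 2048      # shorter context -> smaller cache -> less VRAM
config.no_flash_attn = True    # opt out of FlashAttention on unsupported GPUs

model = ExLlamaV2(config)
cache = ExLlamaV2Cache_Q4(model, lazy=True)  # quantized 4-bit cache, akin to a low cache_bits setting
model.load_autosplit(cache)
```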
Can I use my own models? Yes, you can add your own models by placing them in the models/llm directory and updating the extra_model_paths.yaml file.
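A minimal sketch of such an entry, assuming the extension resolves an llm folder type through ComfyUI's standard extra_model_paths.yaml format; the section name and paths below are placeholders:

```yaml
# extra_model_paths.yaml (placeholders throughout)
my_models:
    base_path: /path/to/my/models
    llm: llm    # subfolder holding EXL2 / GPTQ / FP16 model directories
```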
What GPUs are supported? The extension runs on modern consumer GPUs; FlashAttention additionally requires compute capability 8.0 or higher (NVIDIA Ampere or newer).
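A quick way to check the compute capability of your card is through PyTorch, which ComfyUI already depends on:

```python
import torch

if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print(f"Compute capability: {major}.{minor}")
    print("FlashAttention supported." if (major, minor) >= (8, 0) else "FlashAttention unsupported.")
else:
    print("No CUDA device detected.")
```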