Perturbed-Attention Guidance for ComfyUI enhances image generation by manipulating attention maps, allowing for refined control over visual outputs. This extension adjusts attention mechanisms to improve detail and coherence in generated images.
The sd-perturbed-attention extension is designed to enhance the quality of images generated by diffusion models such as Stable Diffusion. It implements the Perturbed-Attention Guidance (PAG) technique from the research paper Self-Rectifying Diffusion Sampling with Perturbed-Attention Guidance by D. Ahn et al. The extension is compatible with both ComfyUI and SD WebUI (Forge), and it works with Stable Diffusion 1.5 and SDXL.
The main goal of this extension is to improve the structural coherence and overall quality of generated images by adjusting the attention mechanism during the diffusion process. This can help AI artists achieve more accurate and visually appealing results, especially when dealing with complex prompts or high levels of detail.
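At its core, the sd-perturbed-attention extension modifies the way the model uses self-attention during the generation process. In the PAG paper, an additional denoising pass is run with the selected self-attention maps replaced by an identity matrix, which degrades the structure of the prediction; the sampler is then steered away from that degraded result. Think of it as a way to guide the model's focus, ensuring that it pays more attention to important details and less to irrelevant noise.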
Here's a simple analogy: Imagine you're painting a picture, and you have a guide who tells you where to focus your brushstrokes to make the image look more coherent and detailed. The Perturbed-Attention Guidance acts like this guide, helping the AI model to focus on the right areas at the right times.
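A minimal sketch of the attention change behind PAG, assuming the formulation in the PAG paper, where the perturbed pass replaces the self-attention map with an identity matrix so that each token attends only to itself. The function names and the list-of-lists layout are illustrative and are not taken from the extension's code.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of attention scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(q, k, v, perturbed=False):
    """Single-head self-attention over lists of token vectors.

    q, k, v hold one vector (list of floats) per token.
    With perturbed=True the attention map is the identity matrix
    (assumption taken from the PAG paper), so the output is a copy of v.
    """
    if perturbed:
        return [list(token) for token in v]
    dim = len(q[0])
    out = []
    for query in q:
        # Scaled dot-product score of this query against every key.
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(dim) for key in k]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([sum(w * token[d] for w, token in zip(weights, v))
                    for d in range(len(v[0]))])
    return out


q = k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
normal = self_attention(q, k, v)                     # tokens mix information
perturbed = self_attention(q, k, v, perturbed=True)  # tokens stay isolated
```

In the normal pass every output row blends both value vectors, while the perturbed pass returns the values unchanged. The difference between the two predictions is what the guidance uses.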
The extension introduces several parameters that control how this guidance is applied, such as the scale of the guidance, the specific layers of the model it affects, and the stages of the diffusion process where it is active. By tweaking these parameters, you can fine-tune the behavior of the model to suit your artistic needs.
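How the guidance scale enters the final prediction can be sketched as follows, assuming the PAG term is added on top of classifier-free guidance (CFG) as in the paper. The function name pag_guidance and the flat-list representation of a prediction are hypothetical; the extension itself works on tensors inside the sampler.

```python
def pag_guidance(cond, uncond, perturbed, cfg_scale, pag_scale):
    """Combine three model predictions into one guided prediction.

    cond      -- prediction with the prompt and normal attention
    uncond    -- prediction without the prompt
    perturbed -- prediction with the prompt and perturbed self-attention
    """
    # Standard classifier-free guidance.
    cfg = [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]
    # PAG term: push the result away from the perturbed prediction.
    return [g + pag_scale * (c - p) for g, c, p in zip(cfg, cond, perturbed)]


guided = pag_guidance(
    cond=[1.0, 2.0],
    uncond=[0.0, 0.0],
    perturbed=[0.5, 1.0],
    cfg_scale=2.0,
    pag_scale=3.0,
)
print(guided)  # [3.5, 7.0]
```

With pag_scale set to 0 the expression reduces to plain CFG, which is why the scale behaves like a second guidance strength.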
The sd-perturbed-attention extension offers a range of features that allow you to customize its behavior:
PAG Scale (scale): This parameter controls the intensity of the Perturbed-Attention Guidance. Higher values can increase the structural coherence of the image but may also lead to oversaturation. Experiment with different values to find the right balance for your artwork.
Adaptive Scale (adaptive_scale): This dampening factor reduces the effect of PAG during the later stages of the denoising process, which speeds up sampling overall. A value of 0.0 means no penalty, while 1.0 completely removes PAG.
U-Net Block (unet_block): Specifies the part of the U-Net model to which PAG is applied. The original paper suggests using the middle block, but you can experiment with other blocks to see how they affect the results.
U-Net Block ID (unet_block_id): Identifies the specific layer within the selected U-Net block where PAG is applied. PAG can only be applied to layers containing self-attention blocks.
Sigma Start / Sigma End (sigma_start / sigma_end): Defines the sigma (noise level) range of the diffusion process in which PAG is active. PAG is applied only between sigma_start and sigma_end. Setting both values to negative numbers disables this restriction.
Rescale PAG (rescale_pag): Similar to the RescaleCFG node, this prevents over-exposure at high scale values. It is based on Algorithm 2 from the paper Common Diffusion Noise Schedules and Sample Steps are Flawed. Set to 0 to disable this feature.
Rescale Mode (rescale_mode): Determines how the rescaling is applied:
  full: Takes into account both CFG and PAG.
  partial: Depends only on PAG.
U-Net Block List (unet_block_list): Allows you to select multiple U-Net layers for PAG application as a comma-separated list. A dot followed by a number selects an index within a layer (e.g., m0,u0.4 applies PAG to middle block 0 and to output block 0 with index 4).
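Two of the parameters above can be illustrated with a short sketch: the sigma window that decides whether PAG runs on a step, and the rescale step that follows Algorithm 2 of the noise-schedule paper. Both helper names are hypothetical, the inclusive boundaries of the window are an assumption, and the sketch does not attempt to reproduce the difference between the full and partial rescale modes.

```python
import statistics


def pag_is_active(sigma, sigma_start, sigma_end):
    """Return True if PAG should run at the current noise level."""
    # Both bounds negative: the range restriction is switched off.
    if sigma_start < 0 and sigma_end < 0:
        return True
    # Sigma decreases during sampling, so sigma_start is the upper bound.
    # Inclusive comparisons are an assumption of this sketch.
    return sigma_end <= sigma <= sigma_start


def rescale_guidance(guided, cond, rescale):
    """Pull the spread of the guided prediction back towards that of cond.

    guided  -- prediction after guidance (flat list of floats)
    cond    -- conditional prediction used as the reference
    rescale -- blend factor in [0, 1]; 0 disables rescaling
    """
    std_guided = statistics.pstdev(guided)
    if rescale == 0 or std_guided == 0:
        return list(guided)
    factor = statistics.pstdev(cond) / std_guided
    rescaled = [g * factor for g in guided]
    # Blend the rescaled and the original guided prediction.
    return [rescale * r + (1 - rescale) * g for r, g in zip(rescaled, guided)]


print(pag_is_active(5.0, sigma_start=-1, sigma_end=-1))    # True
print(pag_is_active(0.5, sigma_start=14.6, sigma_end=0.7))  # False
print(rescale_guidance([0.0, 4.0], [0.0, 2.0], rescale=0.5))  # [0.0, 3.0]
```

The sigma values in the usage lines are arbitrary examples, not recommended settings.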
The sd-perturbed-attention extension works with different models, including SD1.5 and SDXL. Each model has its own set of layers and blocks where PAG can be applied. Here's a quick overview:
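In this notation, d refers to input blocks, m to the middle block, and u to output blocks.
SD1.5 U-Net: layers d0 to d5, m0, and u0 to u8.
SDXL U-Net: layers d0 to d3, m0, and u0 to u5. In addition, each block (except d0 and d1) has indices from 0 to 9 (e.g., m0.7 or u0.4), while d0 and d1 have indices 0 to 1.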
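The block notation can be made concrete with a small parser for unet_block_list values. This is a sketch under stated assumptions: the function name, the tuple output, and the error handling are hypothetical and do not come from the extension, and the layer and index limits are simply the ones listed in this overview (d read as input, m as middle, u as output).

```python
import re

BLOCK_NAMES = {"d": "input", "m": "middle", "u": "output"}

# Highest layer number per block type, as listed in the overview.
MAX_LAYER = {
    "SD1.5": {"d": 5, "m": 0, "u": 8},
    "SDXL": {"d": 3, "m": 0, "u": 5},
}

# One entry: block letter, layer number, optional ".index" suffix.
ENTRY_PATTERN = re.compile(r"([dmu])(\d+)(?:\.(\d+))?")


def max_sdxl_index(letter, layer):
    """SDXL: d0 and d1 have indices 0 to 1, every other layer 0 to 9."""
    return 1 if letter == "d" and layer in (0, 1) else 9


def parse_unet_block_list(text, model="SDXL"):
    """Turn a string such as 'm0,u0.4' into (block, layer, index) tuples."""
    parsed = []
    for raw in text.split(","):
        entry = raw.strip()
        if not entry:
            continue
        match = ENTRY_PATTERN.fullmatch(entry)
        if match is None:
            raise ValueError(f"cannot parse entry {entry!r}")
        letter = match.group(1)
        layer = int(match.group(2))
        index = None if match.group(3) is None else int(match.group(3))
        if layer > MAX_LAYER[model][letter]:
            raise ValueError(f"{model} has no layer {letter}{layer}")
        # Index limits are only stated for SDXL, so only SDXL is checked.
        if model == "SDXL" and index is not None and index > max_sdxl_index(letter, layer):
            raise ValueError(f"index {index} is out of range for {letter}{layer}")
        parsed.append((BLOCK_NAMES[letter], layer, index))
    return parsed


print(parse_unet_block_list("m0,u0.4"))
# [('middle', 0, None), ('output', 0, 4)]
```

An index of None stands for an entry without a dot suffix, that is, no specific index was selected within the layer.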
By selecting different blocks and layers, you can control how the guidance is applied and see how it affects the final output.
Here are some common issues you might encounter while using the sd-perturbed-attention extension and how to solve them:
Set the sigma_end parameter to 0.7 or higher. This can help reduce the striped patterns.
Try lowering the scale parameter or enabling the rescale_pag feature to prevent over-exposure.
Increase the adaptive_scale parameter to speed up the later stages of denoising.
Experiment with the unet_block and unet_block_id settings to find the optimal configuration for your specific use case.
To learn more about the sd-perturbed-attention extension and how to use it effectively, check out the following resources:
© Copyright 2024 RunComfy. All Rights Reserved.