ComfyUI Extension: Perturbed-Attention Guidance

Repo Name: sd-perturbed-attention
Author: pamparamm (Account age: 2415 days)
Nodes: 1
Latest Updated: 2025-02-23
GitHub Stars: 0.24K

Table of Contents

Description
How Perturbed-Attention Guidance Works
Perturbed-Attention Guidance Features
Perturbed-Attention Guidance Models
Troubleshooting Perturbed-Attention Guidance
Learn More about Perturbed-Attention Guidance
Related Nodes

How to Install Perturbed-Attention Guidance

Install this extension through the ComfyUI Manager by searching for Perturbed-Attention Guidance:

1. Click the Manager button in the main menu.
2. Select the Custom Nodes Manager button.
3. Enter Perturbed-Attention Guidance in the search bar.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

Perturbed-Attention Guidance Description

Perturbed-Attention Guidance for ComfyUI adds a guidance signal during sampling that is derived from perturbed self-attention maps. The aim is to improve detail and structural coherence in generated images.

Perturbed-Attention Guidance Introduction

The sd-perturbed-attention extension aims to improve the quality of images generated by diffusion models such as Stable Diffusion. It implements the Perturbed-Attention Guidance (PAG) technique from the research paper Self-Rectifying Diffusion Sampling with Perturbed-Attention Guidance by D. Ahn et al. The extension is compatible with both ComfyUI and SD WebUI (Forge), and it works with Stable Diffusion 1.5 and SDXL.

The main goal of this extension is to improve the structural coherence and overall quality of generated images by adjusting the attention mechanism during the diffusion process. This can help AI artists achieve more accurate and visually appealing results, especially when dealing with complex prompts or high levels of detail.

How Perturbed-Attention Guidance Works

At its core, the sd-perturbed-attention extension changes how the model's self-attention is used during generation. Following the PAG paper, the sampler makes an extra prediction in which selected self-attention maps are deliberately degraded, then steers the final result away from that degraded prediction. The effect is to push the output toward better-formed structure and away from incoherent detail.

Here's a simple analogy: Imagine you're painting a picture, and you have a guide who tells you where to focus your brushstrokes to make the image look more coherent and detailed. The Perturbed-Attention Guidance acts like this guide, helping the AI model to focus on the right areas at the right times.

The extension introduces several parameters that control how this guidance is applied, such as the scale of the guidance, the specific layers of the model it affects, and the stages of the diffusion process where it is active. By tweaking these parameters, you can fine-tune the behavior of the model to suit your artistic needs.
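The mechanism can be sketched in a few lines of plain Python. In the PAG paper, the perturbation replaces a self-attention map with the identity matrix, and the guidance term is added on top of classifier-free guidance (CFG). The snippet below is a toy illustration under those assumptions, not the extension's actual code; `self_attention`, `combine_guidance`, and their arguments are hypothetical names chosen for this example.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(q, k, v, perturbed=False):
    """Toy single-head self-attention over lists of token vectors.

    With perturbed=True the attention map is replaced by the identity
    matrix, so each token attends only to itself and the output equals v.
    """
    n, dim = len(q), len(q[0])
    if perturbed:
        attn = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    else:
        scores = [
            [sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(dim) for j in range(n)]
            for i in range(n)
        ]
        attn = [softmax(row) for row in scores]
    return [
        [sum(attn[i][j] * v[j][d] for j in range(n)) for d in range(len(v[0]))]
        for i in range(n)
    ]


def combine_guidance(uncond, cond, pag, cfg_scale, pag_scale):
    """CFG result plus a PAG term that moves away from the perturbed prediction.

    uncond, cond, pag: flat lists holding the unconditional, conditional,
    and perturbed-attention predictions for the same latent.
    """
    return [
        u + cfg_scale * (c - u) + pag_scale * (c - p)
        for u, c, p in zip(uncond, cond, pag)
    ]
```

For instance, `combine_guidance([0.0], [1.0], [0.5], 7.0, 3.0)` evaluates 0 + 7 × (1 − 0) + 3 × (1 − 0.5) and returns `[8.5]`; setting `pag_scale` to 0 reduces it to ordinary CFG.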

Perturbed-Attention Guidance Features

The sd-perturbed-attention extension offers a range of features that allow you to customize its behavior:

PAG Scale (scale): This parameter controls the intensity of the Perturbed-Attention Guidance. Higher values can increase the structural coherence of the image but may also lead to oversaturation. Experiment with different values to find the right balance for your artwork.
Adaptive Scale (adaptive_scale): This dampening factor reduces the effect of PAG during the later stages of the denoising process, speeding up the overall sampling. A value of 0.0 means no penalty, while 1.0 completely removes PAG.
U-Net Block (unet_block): Specifies the part of the U-Net model to which PAG is applied. The original paper suggests using the middle block, but you can experiment with other blocks to see how it affects the results.
U-Net Block ID (unet_block_id): Identifies the specific layer within the selected U-Net block where PAG is applied. PAG can only be applied to layers containing self-attention blocks.
Sigma Start / Sigma End (sigma_start / sigma_end): Defines the range of noise levels (sigmas) in the diffusion process where PAG is active. Setting both to negative values disables this range restriction.
Rescale PAG (rescale_pag): Similar to the RescaleCFG node, this prevents over-exposure at high scale values. It is based on Algorithm 2 from the paper Common Diffusion Noise Schedules and Sample Steps are Flawed. Set to 0 to disable this feature.
Rescale Mode (rescale_mode): Determines how the rescaling is applied. full takes into account both CFG and PAG; partial depends only on PAG.
U-Net Block List (unet_block_list): Allows you to select multiple U-Net layers for PAG application. You can specify layers using dot notation (e.g., m0,u0.4 for middle block 0 and output block 0 with index 4).
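
Two of the parameters above lend themselves to a short sketch: the sigma window and the rescale step. The code below is an illustration only, not taken from the extension. `pag_is_active` assumes that sigma falls during sampling and that the bounds are inclusive, neither of which is stated above. `rescale_guidance` follows the general shape of Algorithm 2 from Common Diffusion Noise Schedules and Sample Steps are Flawed, applied to flat lists rather than tensors. Both function names are hypothetical.

```python
import statistics


def pag_is_active(sigma, sigma_start, sigma_end):
    """Return True when PAG should run at this noise level.

    Assumption: sigma decreases over the course of sampling, so the active
    window is sigma_end <= sigma <= sigma_start. Two negative bounds turn
    the check off. Mixed-sign bounds are not covered by the description
    above and are simply passed through the comparison.
    """
    if sigma_start < 0 and sigma_end < 0:
        return True
    return sigma_end <= sigma <= sigma_start


def rescale_guidance(cond, guided, rescale):
    """Blend `guided` with a copy rescaled to match the spread of `cond`.

    cond:    conditional prediction (flat list of floats)
    guided:  prediction after guidance has been applied
    rescale: blend factor; 0 returns `guided` unchanged
    Raises ZeroDivisionError if `guided` has zero standard deviation.
    """
    if rescale == 0:
        return list(guided)
    factor = statistics.pstdev(cond) / statistics.pstdev(guided)
    return [rescale * (g * factor) + (1 - rescale) * g for g in guided]
```

With `cond = [1.0, -1.0]` and `guided = [2.0, -2.0]`, a `rescale` of 1.0 shrinks the guided values back to the spread of `cond`, while intermediate values interpolate between the two.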

Perturbed-Attention Guidance Models

The sd-perturbed-attention extension works with different models, including SD1.5 and SDXL. Each model has its own set of layers and blocks where PAG can be applied. Here’s a quick overview:

SD1.5 U-Net Layers:
Input blocks: d0 to d5
Middle block: m0
Output blocks: u0 to u8
SDXL U-Net Layers:
Input blocks: d0 to d3
Middle block: m0
Output blocks: u0 to u5

Each block (except d0 and d1) has indices from 0 to 9 (e.g., m0.7 or u0.4), while d0 and d1 have indices 0 to 1. By selecting different blocks and layers, you can control how the guidance is applied and see how it affects the final output.
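
To make the notation concrete, here is a small hypothetical helper that parses a unet_block_list string and checks it against the ranges listed above. It is a sketch for checking your own settings, not the extension's parser; `LAYOUTS`, `parse_block_list`, and the choice to return `None` for a missing index are assumptions made for this example.

```python
import re

# Highest block number per block type, as listed in the overview above.
LAYOUTS = {
    "SD1.5": {"d": 5, "m": 0, "u": 8},
    "SDXL": {"d": 3, "m": 0, "u": 5},
}

# One entry: block type (d, m, or u), block number, optional ".index".
ENTRY = re.compile(r"([dmu])(\d+)(?:\.(\d+))?")


def parse_block_list(text, model="SDXL"):
    """Parse a string such as "m0,u0.4" into (type, block, index) tuples.

    index is None when the entry has no ".index" suffix.
    Raises ValueError for entries outside the ranges listed above.
    """
    layout = LAYOUTS[model]
    parsed = []
    for raw in text.split(","):
        entry = raw.strip()
        match = ENTRY.fullmatch(entry)
        if match is None:
            raise ValueError(f"unrecognised entry: {entry!r}")
        kind = match.group(1)
        block = int(match.group(2))
        index = None if match.group(3) is None else int(match.group(3))
        if block > layout[kind]:
            raise ValueError(f"{entry!r}: {model} has no block {kind}{block}")
        # d0 and d1 accept indices 0 to 1; every other block accepts 0 to 9.
        max_index = 1 if (kind == "d" and block in (0, 1)) else 9
        if index is not None and index > max_index:
            raise ValueError(f"{entry!r}: index must be between 0 and {max_index}")
        parsed.append((kind, block, index))
    return parsed
```

Calling `parse_block_list("m0,u0.4")` yields `[("m", 0, None), ("u", 0, 4)]`, while an entry such as `d6` with `model="SD1.5"` raises `ValueError`.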

Troubleshooting Perturbed-Attention Guidance

Here are some common issues you might encounter while using the sd-perturbed-attention extension and how to solve them:

Striped Noise in Images: If you notice striped noise in your images, try setting the sigma_end parameter to 0.7 or higher. This can help reduce the striped patterns.
Oversaturation or "Fried" Images: If your images appear oversaturated, try lowering the scale parameter or enabling the rescale_pag feature to prevent over-exposure.
Slow Sampling Speed: If sampling is too slow, consider raising the adaptive_scale parameter, which dampens PAG in the later stages of denoising and speeds up the overall sampling.
Unexpected Results: If the generated images don't look as expected, experiment with different unet_block and unet_block_id settings to find the optimal configuration for your specific use case.

Learn More about Perturbed-Attention Guidance

To learn more about the sd-perturbed-attention extension and how to use it effectively, check out the following resources:

Perturbed-Attention Guidance Paper and Demo
ComfyUI GitHub Repository
SD WebUI (Forge) GitHub Repository

These resources provide detailed information on the underlying principles of Perturbed-Attention Guidance and offer additional tips and examples to help you get the most out of this extension.

Perturbed-Attention Guidance Related Nodes

Perturbed-Attention Guidance (Advanced)
