ComfyUI Node: Create Audio Mask

Class Name: CreateAudioMask
Category: KJNodes/deprecated
Author: kijai (account age: 2467 days)
Extension: KJNodes for ComfyUI
Last Updated: 2025-04-04
GitHub Stars: 1.16K

Table of Contents

Description
Create Audio Mask:
Create Audio Mask Input Parameters:
Create Audio Mask Output Parameters:
Create Audio Mask Usage Tips:
Create Audio Mask Common Errors and Solutions:
Related Nodes

How to Install KJNodes for ComfyUI

Install this extension through the ComfyUI Manager by searching for KJNodes for ComfyUI:

1. Click the Manager button in the main menu.
2. Select the Custom Nodes Manager button.
3. Enter KJNodes for ComfyUI in the search bar.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

Create Audio Mask Description

Generate visual masks from audio amplitude data for creative visualizations.

Create Audio Mask:

The CreateAudioMask node generates a sequence of mask images from the amplitude of an audio file, producing one image per frame of the audio's spectrogram. For each frame, the node draws a circular mask whose size is proportional to the audio's intensity at that frame, so louder passages produce larger circles. The resulting image sequence can drive audio-reactive visualizations and other creative applications that need a visual representation of an audio signal.
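The amplitude-to-circle idea described above can be sketched in plain Python. This is an illustrative approximation, not the node's actual implementation: it uses the mean absolute sample value of each segment as a stand-in for spectrogram intensity, and the names frame_amplitudes, circle_mask, and audio_masks are hypothetical helpers invented for this example.

```python
import math

def frame_amplitudes(samples, frames):
    """Split samples into `frames` equal segments; return the mean |amplitude| of each."""
    if frames < 1:
        raise ValueError("frames must be >= 1")
    seg = max(1, len(samples) // frames)
    amps = []
    for i in range(frames):
        chunk = samples[i * seg:(i + 1) * seg]
        amps.append(sum(abs(s) for s in chunk) / len(chunk) if chunk else 0.0)
    return amps

def circle_mask(width, height, radius):
    """Return a height x width grid of 0.0/1.0 values with a filled, centered circle."""
    cx, cy = width // 2, height // 2
    r2 = radius * radius
    return [[1.0 if radius > 0 and (x - cx) ** 2 + (y - cy) ** 2 <= r2 else 0.0
             for x in range(width)] for y in range(height)]

def audio_masks(samples, frames=16, scale=0.5, width=256, height=256):
    """One mask per frame; radius grows with amplitude (assumed: height * amp * scale)."""
    return [circle_mask(width, height, height * amp * scale)
            for amp in frame_amplitudes(samples, frames)]

# Synthetic input: a 440 Hz tone that fades in linearly over one second.
sr = 8000
samples = [(n / sr) * math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]

masks = audio_masks(samples, frames=4, width=32, height=32)
areas = [sum(map(sum, m)) for m in masks]  # white-pixel count per frame
print(areas)
```

Because the test tone gets louder over time, the white area grows from the first mask to the last, which is the audio-reactive behavior the node is meant to provide.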

Create Audio Mask Input Parameters:

invert

This parameter determines whether the generated masks are inverted. When set to True, areas that would normally be white become black, and vice versa, which is useful when you need the opposite region masked. The default value is False.
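As a minimal sketch of what inversion means for a normalized mask, the snippet below takes the complement of each value. It assumes mask values lie in the range 0.0 to 1.0, and invert_mask is a hypothetical helper, not a function exposed by the node.

```python
def invert_mask(mask):
    """Swap black and white in a mask whose values are normalized to 0.0-1.0."""
    return [[1.0 - v for v in row] for row in mask]

mask = [[0.0, 1.0],
        [1.0, 0.0]]
inverted = invert_mask(mask)
print(inverted)
```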

frames

This parameter specifies the number of frames to generate from the audio file. Each frame corresponds to a segment of the audio, and the node creates one mask per frame. The minimum value is 1, the maximum value is 255, and the default value is 16. Increase this value for a longer or finer-grained mask sequence.

scale

This parameter is a scaling factor applied to the size of the circles in the masks. Higher values produce larger circles for the same audio intensity, and lower values produce smaller ones. The minimum value is 0.0, the maximum value is 2.0, and the default value is 0.5.

audio_path

This parameter specifies the path to the audio file that will be processed. The default value is "audio.wav". Ensure that the audio file is accessible and correctly specified, as this is crucial for the node to function properly.

width

This parameter sets the width of the generated images. The minimum value is 16, the maximum value is 4096, and the default value is 256. Adjusting this parameter allows you to control the resolution of the output images.

height

This parameter sets the height of the generated images. The minimum value is 16, the maximum value is 4096, and the default value is 256. Adjusting this parameter allows you to control the resolution of the output images.
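When preparing inputs in a script rather than in the node's UI, the documented ranges above can be collected in one place and applied before the values are used. The following is a sketch only: RANGES and clamp_params are hypothetical names, and the numbers are copied from the parameter descriptions above.

```python
# (minimum, maximum, default) for each numeric input, as documented above.
RANGES = {
    "frames": (1, 255, 16),
    "scale": (0.0, 2.0, 0.5),
    "width": (16, 4096, 256),
    "height": (16, 4096, 256),
}

def clamp_params(**overrides):
    """Fill in defaults and clamp each supplied value to its documented range."""
    params = {}
    for name, (lo, hi, default) in RANGES.items():
        value = overrides.get(name, default)
        params[name] = min(max(value, lo), hi)
    return params

params = clamp_params(frames=500, width=8)
print(params)
```

Here frames is clamped down to 255 and width is raised to 16, while scale and height fall back to their defaults.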

Create Audio Mask Output Parameters:

IMAGE

The output is a tensor containing the generated images, one per frame. Each image shows a circular mask whose size reflects the amplitude of the audio at that frame. Pixel values are normalized to the range 0.0 to 1.0, which makes the images suitable for further processing or visualization.
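To illustrate the normalization mentioned above, the sketch below maps 8-bit pixel values to the 0.0 to 1.0 range. It assumes the source pixels are 8-bit (0 to 255); normalize_pixels is a hypothetical helper written for this example.

```python
def normalize_pixels(pixels_8bit):
    """Map 8-bit pixel values (0-255) to floats in the range 0.0-1.0."""
    return [v / 255.0 for v in pixels_8bit]

normalized = normalize_pixels([0, 128, 255])
print(normalized)
```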

Create Audio Mask Usage Tips:

Ensure that the audio file specified in the audio_path parameter is accessible and correctly formatted to avoid errors during processing.
Experiment with the frames parameter to find the optimal number of frames for your specific application. More frames provide finer detail but require more processing power.
Use the scale parameter to adjust the size of the circles in the masks to match the visual style you are aiming for.
If you need a different visual effect, try toggling the invert parameter to see how the inverted masks look.

Create Audio Mask Common Errors and Solutions:

"Can not import librosa. Install it with 'pip install librosa'"

Explanation: This error occurs when the librosa library is not installed in your Python environment.
Solution: Install librosa by running pip install librosa in a terminal or command prompt, using the same Python environment that runs ComfyUI, and then restart ComfyUI.
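A quick way to confirm whether librosa is visible to a given Python interpreter is to look the module up without importing it. This is a generic standard-library check, with has_module as a hypothetical helper name; run it with the interpreter that ComfyUI uses.

```python
import importlib.util

def has_module(name):
    """Return True if a top-level module called `name` can be found for import."""
    return importlib.util.find_spec(name) is not None

if has_module("librosa"):
    print("librosa is available")
else:
    print("librosa is missing; install it with: pip install librosa")
```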

"FileNotFoundError: [Errno 2] No such file or directory: 'audio.wav'"

Explanation: This error occurs when the specified audio file cannot be found at the given path.
Solution: Ensure that the audio_path parameter is correctly set to the location of your audio file and that the file exists.
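Relative paths are a common cause of this error, because they are resolved against the current working directory of the process. The sketch below resolves a path to its absolute form and fails early with a clearer message; resolve_audio_path is a hypothetical helper, not part of the node.

```python
from pathlib import Path

def resolve_audio_path(audio_path):
    """Return the absolute path to the audio file, or raise if it does not exist."""
    path = Path(audio_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Audio file not found: {path}")
    return path

try:
    print(resolve_audio_path("audio.wav"))
except FileNotFoundError as err:
    # The message shows the absolute path that was actually checked.
    print(err)
```

Passing the resolved absolute path as audio_path removes any ambiguity about which directory the file is read from.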

"ValueError: Audio file format not supported"

Explanation: This error occurs when the audio file format is not supported by the librosa library.
Solution: Convert your audio file to a supported format, such as WAV, and try again.
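Before suspecting the node, it can help to confirm that a file with a .wav extension really is a readable WAV file. The sketch below uses Python's standard wave module, which handles uncompressed PCM WAV only. It is a rough pre-check under that assumption: passing it does not guarantee that librosa will load the file, and librosa may accept formats this check rejects. The name is_readable_wav is hypothetical.

```python
import wave

def is_readable_wav(path):
    """Return True if `path` opens as a PCM WAV file with at least one audio frame."""
    try:
        with wave.open(str(path), "rb") as wav:
            return wav.getnframes() > 0
    except (wave.Error, EOFError, OSError):
        # Not a WAV file, truncated header, or the file could not be opened at all.
        return False

print(is_readable_wav("audio.wav"))
```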

"RuntimeError: CUDA out of memory"

Explanation: This error occurs when the GPU runs out of memory while processing the audio file.
Solution: Reduce the frames, width, or height parameters to decrease the memory usage, or run the node on a machine with more GPU memory.
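To see how frames, width, and height affect memory, the size of the output batch can be estimated with simple arithmetic. The sketch below assumes 3 channels and 4 bytes per value (32-bit floats), which is an assumption about the output layout rather than something stated on this page; it also ignores any intermediate buffers. The name estimate_output_bytes is hypothetical.

```python
def estimate_output_bytes(frames, width, height, channels=3, bytes_per_value=4):
    """Rough size of the output image batch in bytes (assumed float32 RGB)."""
    return frames * height * width * channels * bytes_per_value

default_mib = estimate_output_bytes(16, 256, 256) / 2**20
max_gib = estimate_output_bytes(255, 4096, 4096) / 2**30

print(f"defaults: {default_mib:.1f} MiB")
print(f"maximum settings: {max_gib:.2f} GiB")
```

Under these assumptions the default settings need about 12 MiB, while the maximum values for all three parameters would need roughly 47.8 GiB, so halving width and height cuts the estimate to a quarter.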

Create Audio Mask Related Nodes

Go back to the extension to check out more related nodes.

KJNodes for ComfyUI
