
ComfyUI Node: Audio Analysis

Class Name

Audio Analysis

Category
👁️ Yvann Nodes/🔊 Audio
Author
yvann-ba (Account age: 1129 days)
Extension
ComfyUI_Yvann-Nodes
Last Updated
2025-01-27
Github Stars
0.35K

How to Install ComfyUI_Yvann-Nodes

Install this extension through the ComfyUI Manager:
  1. Click the Manager button in the main menu.
  2. Click the Custom Nodes Manager button.
  3. Enter ComfyUI_Yvann-Nodes in the search bar.
After installation, click the Restart button to restart ComfyUI. Then manually refresh your browser to clear the cache and load the updated list of nodes.


Audio Analysis Description

Separates audio into elements (drums, vocals, bass, others) and produces reactive weights and a graph from the selected element, using a pre-loaded audio separation model.

Audio Analysis:

The Audio Analysis node separates an audio track into its constituent elements (drums, vocals, bass, and others) and generates reactive weights and a graph from the element you select. It is aimed at AI artists who build audio-reactive visualizations and want to drive an effect from one part of a track rather than the whole mix. Separation is performed by an audio separation model supplied as an input, and the threshold and multiply parameters give you manual control over the resulting weights.

Audio Analysis Input Parameters:

audio_sep_model

A pre-loaded audio separation model. The node cannot run without it. The model isolates components such as drums, vocals, and bass from the input audio, so the quality of the separation depends on the model used.

batch_size

The number of frames that the audio weights are associated with. It must be an integer. It affects both the granularity of the analysis and the processing cost, so choose it to balance the level of detail you need against efficiency.

fps

Frames per second (fps) sets the rate at which audio weights are processed, and therefore the temporal resolution of the analysis: higher values produce more frequent weight updates. It is a float. Choose it according to how smooth and responsive the audio-reactive elements should be.
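To make the relationship between batch_size, fps, and the audio concrete, here is a minimal sketch. It assumes that each frame's weight is the RMS energy of that frame's slice of samples; the source does not state which measure the node uses, and the helper name frame_rms_weights is hypothetical.

```python
import math

def frame_rms_weights(samples, sample_rate, fps, batch_size):
    """Return one RMS value per frame (hypothetical helper, not the node's code)."""
    # Number of audio samples covered by a single frame.
    samples_per_frame = int(sample_rate / fps)
    weights = []
    for i in range(batch_size):
        chunk = samples[i * samples_per_frame:(i + 1) * samples_per_frame]
        if not chunk:
            # The audio is shorter than batch_size / fps seconds: treat as silence.
            weights.append(0.0)
            continue
        weights.append(math.sqrt(sum(s * s for s in chunk) / len(chunk)))
    return weights

# Toy signal at 8 samples per second: silent first half, loud second half.
signal = [0.0] * 4 + [1.0] * 4
weights = frame_rms_weights(signal, sample_rate=8, fps=2.0, batch_size=3)
print(weights)
```

Under these assumptions, batch_size frames at fps frames per second cover batch_size / fps seconds of audio, so a third frame requested from one second of audio at 2 fps falls past the end of the signal.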

audio

The input audio to analyze. It must contain both a waveform and a sample rate; the node cannot process the audio without them. The quality and format of the input affect the results, so prepare the audio before analysis.
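In ComfyUI, an AUDIO value is passed between nodes as a dictionary holding a waveform tensor and an integer sample_rate. The check below is a minimal sketch of the kind of validation this implies. The function validate_audio and its error messages are illustrative assumptions, not the node's actual code.

```python
def validate_audio(audio):
    """Raise ValueError unless `audio` looks like a ComfyUI AUDIO dict (hypothetical helper)."""
    if not isinstance(audio, dict):
        raise ValueError("Invalid audio input: expected a dict")
    missing = [key for key in ("waveform", "sample_rate") if key not in audio]
    if missing:
        raise ValueError(f"Invalid audio input: missing {', '.join(missing)}")
    if audio["sample_rate"] <= 0:
        raise ValueError("Invalid audio input: sample_rate must be positive")
    return audio

# A nested list stands in for the waveform tensor so the sketch runs without torch.
ok = validate_audio({"waveform": [[[0.0, 0.1, -0.1]]], "sample_rate": 44100})

try:
    validate_audio({"waveform": [[[0.0]]]})
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```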

analysis_mode

Selects which audio component to analyze: "Drums Only", "Vocals Only", "Bass Only", "Others Audio", or "Full Audio". The selected mode determines which element of the audio is isolated and processed.
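The mode selection can be pictured as a lookup from the mode label to a separated stem. The sketch below is an assumption about how such a lookup might work: the stem names ("drums", "vocals", "bass", "other") and the function select_audio are hypothetical, and a given separation model may label its outputs differently.

```python
# Hypothetical mapping from the node's mode labels to stem names.
MODE_TO_STEM = {
    "Drums Only": "drums",
    "Vocals Only": "vocals",
    "Bass Only": "bass",
    "Others Audio": "other",
    "Full Audio": None,  # None means: skip stem selection and use the whole mix.
}

def select_audio(analysis_mode, full_mix, stems):
    """Return the audio to analyze for the given mode (illustrative only)."""
    if analysis_mode not in MODE_TO_STEM:
        raise ValueError(f"Unsupported analysis mode: {analysis_mode!r}")
    stem = MODE_TO_STEM[analysis_mode]
    return full_mix if stem is None else stems[stem]

# Strings stand in for audio data to keep the example self-contained.
stems = {"drums": "drums-track", "vocals": "vocals-track",
         "bass": "bass-track", "other": "other-track"}
selected = select_audio("Bass Only", "full-mix", stems)
```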

threshold

The minimum weight value that must be exceeded for an audio component to count as significant. It is a float from 0.0 to 1.0 with a default of 0.5. Raising the threshold filters out less prominent audio elements so that only the strongest ones remain.

multiply

An amplification factor applied to the audio weights before normalization. It is a float from 0.0 to 5.0 with a default of 1.0. Values above 1.0 strengthen the influence of the audio weights on the final output, and values below 1.0 weaken it.
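The interaction between threshold and multiply can be sketched as follows. This is a minimal illustration under stated assumptions, not the node's implementation: it treats "normalization" as clamping to the range 0.0 to 1.0 and zeroes any weight that does not exceed the threshold. The function name shape_weights is hypothetical.

```python
def shape_weights(weights, threshold=0.5, multiply=1.0):
    """Amplify, clamp, then threshold a list of weights (illustrative only)."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    if not 0.0 <= multiply <= 5.0:
        raise ValueError("multiply must be between 0.0 and 5.0")
    shaped = []
    for w in weights:
        # Amplify first, then clamp into [0, 1] (the assumed normalization step).
        scaled = min(max(w * multiply, 0.0), 1.0)
        # Keep only weights that strictly exceed the threshold.
        shaped.append(scaled if scaled > threshold else 0.0)
    return shaped

raw = [0.125, 0.25, 0.5, 0.75]
default = shape_weights(raw)                # only 0.75 exceeds 0.5
boosted = shape_weights(raw, multiply=2.0)  # 0.5 and 0.75 are amplified to 1.0
```

With the defaults, only the last value survives. Doubling the weights pushes the last two past the threshold and clamps them at 1.0, while 0.25 becomes 0.5, which does not exceed the threshold and is dropped.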

Audio Analysis Output Parameters:

processed_audio

The result of the separation: audio containing only the component selected by the analysis mode. Use it to work with one element, such as the drums or vocals, independently of the rest of the track.

original_audio

The original, unmodified input audio, passed through so you can compare it against the processed result.

audio_weights

A list of reactive weight values computed from the processed audio. The weights quantify the audio's dynamics and can drive audio-reactive visualizations or other creative processes.
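As one example of putting the weights to use, the sketch below linearly maps each weight onto a parameter range, such as the strength of an effect. It assumes the weights lie between 0.0 and 1.0; the function weights_to_range and the idea of driving a "strength" value are illustrative and not part of the node.

```python
def weights_to_range(audio_weights, low, high):
    """Linearly map weights in [0, 1] onto [low, high] (hypothetical helper)."""
    return [low + w * (high - low) for w in audio_weights]

# A weight of 0.0 maps to `low`, 1.0 maps to `high`, and 0.5 lands halfway.
strengths = weights_to_range([0.0, 0.5, 1.0], low=1.0, high=3.0)
print(strengths)
```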

graph_audio

An image that plots the audio weights over time, showing how they are distributed across the frames.

Audio Analysis Usage Tips:

  • Use an audio separation model suited to the type of audio you are analyzing.
  • Try different analysis modes to find the audio component most relevant to your project.
  • Adjust the threshold and multiply parameters to tune the sensitivity and strength of the audio weights.

Audio Analysis Common Errors and Solutions:

Invalid audio input

  • Explanation: This error occurs when the input audio does not contain the required waveform or sample rate information.
  • Solution: Verify that the input audio file is correctly formatted and includes both waveform and sample rate data.

Model not loaded

  • Explanation: This error indicates that the audio separation model has not been properly loaded or is missing.
  • Solution: Ensure that the audio separation model is correctly loaded and available before running the node.

Unsupported analysis mode

  • Explanation: This error arises when an invalid or unsupported analysis mode is selected.
  • Solution: Check that the analysis mode is one of the supported options: "Drums Only", "Vocals Only", "Bass Only", "Others Audio", or "Full Audio".

Copyright 2025 RunComfy. All Rights Reserved.

RunComfy is the premier ComfyUI platform, offering ComfyUI online environment and services, along with ComfyUI workflows featuring stunning visuals. RunComfy also provides AI Playground, enabling artists to harness the latest AI tools to create incredible art.