Extracts audio features for visualization and analysis, leveraging advanced processing techniques for tempo, spectrogram, and chroma.
The mbmAudioFeatureCalculator node extracts and calculates audio features from loaded audio files for music visualization and audio analysis. It analyzes the audio's tempo, spectrogram, and chroma features, among others, and generates a set of feature modifiers from them. These modifiers can drive dynamic visualizations or feed further audio analysis. By turning audio data into per-frame values, the node bridges audio content and visual representation, which is particularly useful for AI artists creating music-driven visual art.
The audio parameter is a tuple that contains the audio data and its sample rate. This is the primary input for the node, as it provides the raw audio information that will be analyzed to extract features.
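As a rough illustration of the tuple described above, the sketch below builds a test tone and checks its shape. The element order (samples first, sample rate second), the use of a plain Python list for the samples, and the helper name check_audio_input are assumptions made for this example, not the node's documented layout.

```python
import math


def check_audio_input(audio):
    """Validate a hypothetical (samples, sample_rate) tuple and return its parts."""
    if not isinstance(audio, tuple) or len(audio) != 2:
        raise TypeError("audio must be a tuple of (samples, sample_rate)")
    samples, sample_rate = audio
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return samples, sample_rate


# One second of a 440 Hz sine tone sampled at 22050 Hz (illustrative data only).
sample_rate = 22050
samples = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
audio = (samples, sample_rate)

data, sr = check_audio_input(audio)
duration_seconds = len(data) / sr
```

Passing anything other than a two-element tuple raises immediately, which mirrors the kind of input error the node reports for a malformed audio input.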
The intensity parameter is a float that multiplies the audio features, increasing or decreasing their overall effect. The default value is 1.0, which leaves the features unchanged; larger values amplify their impact and smaller values diminish it.
The hop_length parameter is an integer that sets the number of audio samples between successive frames. It controls the temporal resolution of the analysis and defaults to 512. Smaller values produce more frames per second of audio and finer-grained features; larger values produce fewer, coarser frames.
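The effect of hop_length on temporal resolution follows from simple arithmetic: one frame is produced every hop_length samples. The sketch below shows that relationship; the function name and the 22050 Hz sample rate are illustrative assumptions, not values taken from the node.

```python
def native_frame_rate(sample_rate, hop_length=512):
    """Frames per second produced when one frame is taken every hop_length samples."""
    if hop_length <= 0:
        raise ValueError("hop_length must be a positive integer")
    return sample_rate / hop_length


# Assumed sample rate of 22050 Hz, chosen only for illustration.
default_rate = native_frame_rate(22050, 512)  # about 43.07 frames per second
finer_rate = native_frame_rate(22050, 256)    # about 86.13 frames per second
```

Halving hop_length doubles the number of frames per second, at the cost of more data to process.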
The fps_target parameter is a float that specifies the desired frames per second for the output. It has a default value of 6, with a range from -1 to 10000. A value less than or equal to 0 uses the audio's natural sampling rate, while positive values resample the features to match the specified frame rate.
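One way to picture the resampling step is linear interpolation from the native frame rate to the target rate. The node's actual resampling method is not stated here, so the sketch below is an assumption; resample_features and its arguments are hypothetical names.

```python
def resample_features(features, native_fps, fps_target):
    """Linearly interpolate per-frame features to fps_target.

    A non-positive fps_target returns the features unchanged at the native rate.
    Linear interpolation is an assumption, not the node's confirmed method.
    """
    if fps_target <= 0 or not features:
        return list(features), float(native_fps)
    duration = len(features) / native_fps
    n_out = max(1, round(duration * fps_target))
    last = len(features) - 1
    out = []
    for i in range(n_out):
        pos = i * native_fps / fps_target  # position on the input frame axis
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(features[lo] * (1 - frac) + features[hi] * frac)
    return out, float(fps_target)


downsampled, fps = resample_features([0.0, 1.0, 2.0, 3.0], native_fps=4, fps_target=2)
unchanged, native = resample_features([0.0, 1.0, 2.0, 3.0], native_fps=4, fps_target=-1)
```

Here four frames at 4 fps cover one second, so a target of 2 fps yields two output frames, while a target of -1 leaves the input untouched.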
The feat_mod_max parameter is a float that sets the maximum value for the feature modifier. The default is 10000.0, with a range from -10000.0 to 10000.0. It caps the feature modifiers at an upper bound, which helps keep visual outputs consistent.
The feat_mod_min parameter is a float that sets the minimum value for the feature modifier. The default is -10000.0, with a range from -10000.0 to 10000.0. Like feat_mod_max, it bounds the feature modifiers, in this case from below.
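Taken together, feat_mod_min and feat_mod_max describe a clamp on the feature modifiers. The sketch below shows that idea with plain Python lists; the function name is hypothetical, and treating the bounds as a simple element-wise clamp is an assumption about how the node applies them.

```python
def clamp_feature_mods(mods, feat_mod_min=-10000.0, feat_mod_max=10000.0):
    """Clamp each modifier into [feat_mod_min, feat_mod_max]."""
    # The upper bound must be strictly greater than the lower bound.
    if feat_mod_max <= feat_mod_min:
        raise ValueError("feat_mod_max must be greater than feat_mod_min")
    return [min(max(m, feat_mod_min), feat_mod_max) for m in mods]


clamped = clamp_feature_mods([-5.0, 0.5, 12.0], feat_mod_min=0.0, feat_mod_max=10.0)
```

Values already inside the range pass through unchanged; only out-of-range values are pulled to the nearest bound.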
The feat_mod_normalize parameter is a boolean option that determines whether the feature modifier array is normalized between 0 and the maximum value in the array. This can help standardize the output, especially when comparing different audio files.
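The description of the normalization is brief, so the sketch below shows only one plausible reading: the array is rescaled linearly so that its smallest value maps to 0 while its largest value stays where it is. Both this interpretation and the function name are assumptions; the node may normalize differently.

```python
def normalize_feature_mods(mods):
    """Rescale so the minimum maps to 0 and the maximum is preserved.

    This is one possible reading of "normalized between 0 and the maximum
    value in the array", not a confirmed description of the node's behavior.
    """
    if not mods:
        return []
    lo, hi = min(mods), max(mods)
    if hi == lo:
        return list(mods)  # a constant array has no spread to rescale
    return [(m - lo) / (hi - lo) * hi for m in mods]


normalized = normalize_feature_mods([2.0, 4.0, 6.0])
```

With the input above, the smallest value moves to 0, the midpoint moves to half the maximum, and the maximum is unchanged.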
The FEAT_MODS output is a 1D tensor containing the calculated feature modifiers for each frame of the audio. These modifiers represent the combined effect of various audio features and can be used for visualizing or further processing the audio data.
The FEAT_SECONDS output is a float representing the duration of each frame in seconds. This value is derived from the total duration of the audio and the number of frames, providing a temporal context for the feature modifiers.
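The derivation described above amounts to dividing the audio's total duration by the number of feature frames. A minimal sketch, assuming the duration is computed from the sample count and sample rate (the function name and the example numbers are illustrative only):

```python
def seconds_per_frame(num_samples, sample_rate, num_frames):
    """Duration of one feature frame: total audio duration divided by frame count."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    total_duration = num_samples / sample_rate
    return total_duration / num_frames


# Ten seconds of audio at an assumed 22050 Hz, described by 60 feature frames.
feat_seconds = seconds_per_frame(220500, 22050, 60)
frames_per_second = 1.0 / feat_seconds
```

In this sketch the reciprocal of the per-frame duration gives the frame rate, so 60 frames over ten seconds corresponds to roughly 6 frames per second.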
The FPS output is a float indicating the frames per second of the output data. This value reflects either the target frame rate specified by the fps_target parameter or the natural frame rate of the audio if no target was set.
The CHARTS output is an image that visualizes the various audio features and their modifiers. This visual representation can help in understanding the dynamics of the audio and the impact of different features on the overall analysis.
For more detailed analysis, set hop_length to a smaller value, which increases the temporal resolution of the feature extraction.
Enable the feat_mod_normalize option to keep feature modifier values consistent across different audio files, which is particularly useful when comparing or combining multiple audio sources.
An error occurs when feat_mod_max is set to a value less than or equal to feat_mod_min. Ensure that feat_mod_max is greater than feat_mod_min to avoid this error.
An error occurs when the audio parameter is not provided as a tuple containing the audio data and sample rate. Ensure that the audio input is correctly formatted as a tuple with the necessary components.