ComfyUI Node: AudioLDM-2 Node

Class Name
AudioLDM2Node
Category
VLM Nodes/Audio
Author
gokayfem (Account age: 1058 days)
Extension
VLM_nodes
Last Updated
6/2/2024
GitHub Stars
0.3K

How to Install VLM_nodes

Install this extension through the ComfyUI Manager by searching for VLM_nodes:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter VLM_nodes in the search bar.
After installation, click the Restart button to restart ComfyUI, then manually refresh your browser to clear the cache and load the updated list of nodes.

AudioLDM-2 Node Description

Generate high-quality audio from text prompts using advanced machine learning models for creating unique soundscapes and effects.

AudioLDM-2 Node:

The AudioLDM2Node generates audio from a textual description. You provide a text prompt plus configuration settings that control the characteristics of the result, such as its duration, guidance scale, and sample rate. The node is particularly useful for producing soundscapes and audio effects for multimedia projects.

AudioLDM-2 Node Input Parameters:

text

This parameter takes a string input that describes the audio you want to generate. The text prompt guides the model in creating audio that matches the description. The default value is an empty string, and it is a required input.

negative_prompt

This string input allows you to specify elements that should be avoided in the generated audio. By providing a negative prompt, you can refine the output to better match your desired outcome. The default value is an empty string, and it is a required input.

duration

This integer parameter defines the length of the generated audio in seconds. You can set the duration between 1 and 60 seconds, with a default value of 10 seconds. Adjusting this parameter will directly impact the length of the audio output.

guidance_scale

This float parameter controls how strongly the text prompt influences the generated audio. A higher guidance scale makes the audio follow the text description more closely, typically at the cost of variety. The value can range from 0.1 to 20.0, with a default of 3.5. Adjust it to balance creativity against adherence to the prompt.
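
The effect of guidance_scale can be illustrated with classifier-free guidance, the mechanism diffusion pipelines commonly use for a setting of this name. The following is a minimal sketch, assuming the node follows that convention; the function name and the toy numbers are hypothetical and are not taken from the node's source.

```python
def apply_guidance(uncond, cond, guidance_scale):
    """Blend unconditional and text-conditioned noise predictions.

    Assumed convention (classifier-free guidance):
        guided = uncond + guidance_scale * (cond - uncond)
    """
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]


# Toy predictions, for illustration only.
uncond = [0.0, 1.0, 2.0]
cond = [1.0, 1.0, 0.0]

low = apply_guidance(uncond, cond, 1.0)   # equals the conditioned prediction
high = apply_guidance(uncond, cond, 3.5)  # pushed further in the prompt's direction
```

With a scale of 1.0 the result is just the text-conditioned prediction; larger values extrapolate past it, which is why very high settings can sound exaggerated.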

seed

This integer parameter sets the random seed for audio generation. Using the same seed with the same inputs and the same environment should reproduce the same audio output. The default value is 42.
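
Reproducibility of this kind comes from seeding the random number generator before sampling. The sketch below uses Python's standard library as a stand-in for the model's noise source; `initial_noise` is a hypothetical helper, not part of the node.

```python
import random


def initial_noise(seed, n=4):
    """Return n pseudo-random values drawn from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


a = initial_noise(42)
b = initial_noise(42)
c = initial_noise(43)

same_seed_matches = a == b       # identical seed, identical noise
different_seed_differs = a != c  # a new seed gives a different starting point
```

In practice, identical output also depends on keeping every other input, the model weights, and the software and hardware stack unchanged.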

n_candidates

This integer parameter specifies the number of audio candidates to generate. You can choose between 1 and 10 candidates, with a default value of 3. Generating multiple candidates allows you to select the best match for your needs.
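
One way to use several candidates is to score each one and keep the highest-scoring result. The sketch below assumes a caller-supplied scoring function; `pick_best` and `peak_level` are hypothetical, and how the node itself ranks or returns its candidates is not stated here.

```python
def pick_best(candidates, score):
    """Return the candidate with the highest score."""
    if not candidates:
        raise ValueError("candidates must not be empty")
    return max(candidates, key=score)


# Placeholder candidates: each is (label, waveform samples).
candidates = [
    ("a", [0.1, -0.1, 0.05]),
    ("b", [0.6, -0.7, 0.65]),
    ("c", [0.3, -0.2, 0.25]),
]


def peak_level(candidate):
    """Hypothetical score: the largest absolute sample value."""
    _, samples = candidate
    return max(abs(s) for s in samples)


best = pick_best(candidates, peak_level)
```

Peak level is only a placeholder metric; a real selection would use whatever measure of prompt match or quality matters for the project, including listening to each candidate.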

sample_rate

This integer parameter sets the sample rate of the generated audio, which affects its quality and file size. The sample rate can be set between 8000 and 48000 Hz, with a default value of 16000 Hz. A higher sample rate can represent higher frequencies and produces larger files, but it does not by itself add detail beyond what the model generates.
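
The trade-off between sample rate and file size can be estimated for uncompressed audio. This is a back-of-the-envelope sketch, assuming 16-bit mono PCM (the node's actual bit depth and channel count are not stated here); `pcm_bytes` is a hypothetical helper.

```python
def pcm_bytes(sample_rate, duration_s, channels=1, bytes_per_sample=2):
    """Size in bytes of raw PCM audio data, excluding any file header."""
    return sample_rate * duration_s * channels * bytes_per_sample


default_size = pcm_bytes(16000, 10)   # 320000 bytes at the default rate
max_rate_size = pcm_bytes(48000, 10)  # 960000 bytes at the maximum rate
```

Under these assumptions, tripling the sample rate triples the size of a clip of the same duration. Compressed formats such as mp3 will not scale this simply.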

extension

This parameter allows you to choose the file format for the generated audio. Available options are "wav", "mp3", and "flac", with "wav" as the default. Selecting the appropriate format depends on your specific use case and compatibility requirements.
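
For readers who know ComfyUI's custom node conventions, the inputs described above could be declared roughly as follows. This is an illustrative sketch reconstructed from the documented defaults and ranges, not the extension's source code; the class name `AudioLDM2NodeSketch` is hypothetical, and placing every input under "required" is an assumption.

```python
class AudioLDM2NodeSketch:
    """Illustrative stand-in for the node's input declaration, not the real source."""

    CATEGORY = "VLM Nodes/Audio"

    @classmethod
    def INPUT_TYPES(cls):
        # Defaults and ranges are copied from the parameter descriptions.
        return {
            "required": {
                "text": ("STRING", {"default": ""}),
                "negative_prompt": ("STRING", {"default": ""}),
                "duration": ("INT", {"default": 10, "min": 1, "max": 60}),
                "guidance_scale": ("FLOAT", {"default": 3.5, "min": 0.1, "max": 20.0}),
                "seed": ("INT", {"default": 42}),
                "n_candidates": ("INT", {"default": 3, "min": 1, "max": 10}),
                "sample_rate": ("INT", {"default": 16000, "min": 8000, "max": 48000}),
                # A list in the type position declares a dropdown of choices.
                "extension": (["wav", "mp3", "flac"], {"default": "wav"}),
            }
        }


inputs = AudioLDM2NodeSketch.INPUT_TYPES()["required"]
```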

AudioLDM-2 Node Output Parameters:

wave_form

This output parameter provides the waveform data of the generated audio. The waveform is a numerical representation of the audio signal, which can be used for further processing or playback.

sample_rate

This output parameter returns the sample rate of the generated audio. It indicates the number of samples per second in the audio file, which is crucial for playback and compatibility with other audio processing tools.
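
The two outputs together are enough to save or play the audio. The following is a minimal sketch, assuming the waveform is a single-channel sequence of floats in the range -1.0 to 1.0 (the node's actual output type is not stated here); `write_wav` is a hypothetical helper that writes 16-bit PCM with Python's standard `wave` module.

```python
import io
import math
import struct
import wave


def write_wav(target, wave_form, sample_rate):
    """Write mono float samples in [-1.0, 1.0] as 16-bit PCM WAV."""
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in wave_form
    )
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(frames)


# Stand-in waveform: one second of a 440 Hz tone at 16000 Hz.
sample_rate = 16000
wave_form = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]

buffer = io.BytesIO()  # a file path such as "out.wav" also works
write_wav(buffer, wave_form, sample_rate)

# Read the header back: duration is the frame count divided by the sample rate.
buffer.seek(0)
with wave.open(buffer, "rb") as wav:
    duration_s = wav.getnframes() / wav.getframerate()
```

Playing the same samples at a different sample rate changes both pitch and duration, which is why the two outputs should be passed on together.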

AudioLDM-2 Node Usage Tips:

  • Experiment with different text prompts and guidance scales to achieve the desired audio characteristics.
  • Use the seed parameter to reproduce specific audio outputs for consistency in your projects.
  • Generate multiple candidates to have a variety of options and select the best one for your needs.
  • Adjust the sample rate based on the quality requirements and file size constraints of your project.

AudioLDM-2 Node Common Errors and Solutions:

ValueError: Please provide a text input.

  • Explanation: This error occurs when the text input is missing or None.
  • Solution: Ensure that you provide a valid text prompt to guide the audio generation.

Invalid sample rate value.

  • Explanation: This error occurs when the sample rate is set outside the allowed range of 8000 to 48000 Hz.
  • Solution: Adjust the sample rate parameter to a value within the specified range.

Invalid duration value.

  • Explanation: This error occurs when the duration is set outside the allowed range of 1 to 60 seconds.
  • Solution: Set the duration parameter to a value within the specified range.

Invalid number of candidates.

  • Explanation: This error occurs when the number of candidates is set outside the allowed range of 1 to 10.
  • Solution: Adjust the n_candidates parameter to a value within the specified range.
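
The range checks behind these errors can be reproduced before a workflow is queued. This is a sketch of such a pre-check; `validate_inputs` and `error_for` are hypothetical helpers, treating an empty or whitespace-only prompt as missing is an assumption, and only the first message is quoted from an actual exception (the others reuse the headings listed above).

```python
def validate_inputs(text, duration=10, sample_rate=16000, n_candidates=3):
    """Raise ValueError if an input falls outside the documented range."""
    # Assumption: an empty or whitespace-only prompt counts as missing.
    if text is None or not text.strip():
        raise ValueError("Please provide a text input.")
    if not 8000 <= sample_rate <= 48000:
        raise ValueError("Invalid sample rate value.")
    if not 1 <= duration <= 60:
        raise ValueError("Invalid duration value.")
    if not 1 <= n_candidates <= 10:
        raise ValueError("Invalid number of candidates.")


def error_for(**kwargs):
    """Return the validation message for the given inputs, or None if valid."""
    try:
        validate_inputs(**kwargs)
    except ValueError as exc:
        return str(exc)
    return None


ok = error_for(text="rain on a tin roof")
missing_text = error_for(text="")
bad_rate = error_for(text="rain", sample_rate=96000)
bad_duration = error_for(text="rain", duration=120)
```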

© Copyright 2024 RunComfy. All Rights Reserved.
