ComfyUI Node: Stable Audio

Class Name

StableAudio_

Category
♾️Sound Lab
Author
shadowcz007 (Account age: 3366 days)
Extension
comfyui-sound-lab
Last Updated
2024-07-04
Github Stars
0.08K

How to Install comfyui-sound-lab

Install this extension via the ComfyUI Manager by searching for comfyui-sound-lab:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter comfyui-sound-lab in the search bar.
After installation, click the Restart button to restart ComfyUI. Then manually refresh your browser to clear the cache and load the updated list of nodes.

Stable Audio Description

Powerful audio generation node using textual prompts with advanced diffusion models for creating high-quality audio samples, ideal for AI artists crafting unique soundscapes, music, or effects.

Stable Audio:

StableAudio_ generates audio from textual prompts. It uses a diffusion model to create audio samples that match the given description, which makes it useful for AI artists producing soundscapes, music, or audio effects. The node handles tasks ranging from short sound bites to longer musical pieces. You tune the output through the prompt, the duration, and the sampling settings described below, so no deep knowledge of the underlying model is required.

Stable Audio Input Parameters:

prompt

The prompt parameter is a textual description of the audio you want to generate. This description guides the model in creating audio that matches the given prompt. For example, you could use prompts like "A beautiful orchestral symphony" or "Chill hip-hop beat." The quality and relevance of the generated audio heavily depend on the clarity and specificity of the prompt.

seconds

The seconds parameter specifies the duration of the generated audio in seconds. The minimum value is 0, and the maximum value is 512 seconds. Adjust it to suit the application, from short sound effects to longer musical compositions.

seed

The seed parameter is used to initialize the random number generator for the diffusion process. If set to -1, a random seed will be generated. Using a specific seed value allows for reproducibility, meaning you can generate the same audio output multiple times by using the same seed.
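The seed behaviour described above can be sketched in a few lines. This is a minimal illustration, not the node's actual code: the helper names resolve_seed and noise_preview and the upper bound MAX_SEED are assumptions made for the example.

```python
import random

MAX_SEED = 2**32 - 1  # assumed upper bound; the node's real range may differ


def resolve_seed(seed: int) -> int:
    """Return the seed unchanged, or draw a fresh one when seed is -1."""
    if seed == -1:
        return random.randint(0, MAX_SEED)
    return seed


def noise_preview(seed: int, n: int = 4) -> list:
    """Draw n Gaussian values from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]
```

Calling noise_preview twice with the same seed returns identical values, which is the property that makes a fixed seed reproduce the same audio when all other inputs are unchanged.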

steps

The steps parameter defines the number of diffusion steps to be performed during the audio generation process. More steps generally result in higher quality audio but will take longer to compute. This parameter allows you to balance between quality and computational efficiency.

cfg_scale

The cfg_scale parameter controls the classifier-free guidance scale. This parameter influences the strength of the guidance provided by the prompt. Higher values can lead to more accurate adherence to the prompt but may also introduce artifacts. Finding the right balance is key to achieving the desired audio quality.
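Classifier-free guidance is commonly computed by extrapolating from the model's unconditional prediction toward its prompt-conditioned prediction. The sketch below shows that standard formula on plain Python lists; the function name apply_cfg is hypothetical, and the node's internals may combine the predictions differently.

```python
def apply_cfg(uncond, cond, cfg_scale):
    """Blend two predictions: uncond + cfg_scale * (cond - uncond).

    cfg_scale = 0 ignores the prompt, 1 uses the conditioned prediction
    as-is, and values above 1 push further in the prompt's direction.
    """
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]
```

Because values above 1 extrapolate beyond what the model actually predicted, very high scales can overshoot, which is one way to understand the artifacts mentioned above.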

sigma_min

The sigma_min parameter sets the minimum noise level for the diffusion process. Sampling proceeds from high noise to low noise, so this is the level at which denoising ends rather than the noise initially added. It can influence the texture and clarity of the generated audio.

sigma_max

The sigma_max parameter sets the maximum noise level for the diffusion process, which is the level sampling starts from. Together with sigma_min it defines the range of noise levels used during generation, which affects the overall sound quality.
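To make the relationship between steps, sigma_min, and sigma_max concrete, here is a sketch of a log-spaced noise schedule that starts at sigma_max and ends at sigma_min. The function name log_sigma_schedule is hypothetical, and log spacing is an assumption chosen for illustration; the sampler selected through sampler_type may space its noise levels differently.

```python
import math


def log_sigma_schedule(steps: int, sigma_min: float, sigma_max: float) -> list:
    """Return `steps` noise levels, log-spaced from sigma_max down to sigma_min."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if not 0 < sigma_min < sigma_max:
        raise ValueError("require 0 < sigma_min < sigma_max")
    log_max, log_min = math.log(sigma_max), math.log(sigma_min)
    return [
        math.exp(log_max + (log_min - log_max) * i / (steps - 1))
        for i in range(steps)
    ]
```

With more steps, the same range is covered in smaller decrements, which is why a higher steps value costs more compute per clip.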

sampler_type

The sampler_type parameter specifies the type of sampler to be used in the diffusion process. Different samplers can produce varying results, and selecting the appropriate sampler can help achieve the desired audio characteristics.

device

The device parameter determines the hardware on which the model will run. It can be set to "auto," "cpu," or "cuda." If set to "auto," the node will automatically select the best available device. Using a GPU (cuda) can significantly speed up the audio generation process.
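The "auto" behaviour can be pictured as a simple fallback rule. The sketch below is an assumption about how such a rule might look, not the node's code; resolve_device is a hypothetical helper, and the availability flag is passed in so the example runs without a GPU library installed.

```python
def resolve_device(requested: str, cuda_available: bool) -> str:
    """Map the device setting to a concrete device name."""
    if requested not in ("auto", "cpu", "cuda"):
        raise ValueError(f"unsupported device setting: {requested!r}")
    if requested == "auto":
        # Prefer the GPU when one is usable, otherwise fall back to the CPU.
        return "cuda" if cuda_available else "cpu"
    return requested
```

In a PyTorch environment the flag would typically come from torch.cuda.is_available().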

Stable Audio Output Parameters:

filename

The filename parameter provides the name of the generated audio file. This name is automatically generated and includes a counter to ensure uniqueness. The file is saved in the specified output directory.

subfolder

The subfolder parameter indicates the subfolder within the output directory where the audio file is saved. This helps in organizing generated files, especially when working on multiple projects.

type

The type parameter specifies the type of output, which in this case is "output." This is a standard parameter used to categorize the output files.

prompt

The prompt parameter returns the original prompt used for generating the audio. This is useful for reference and documentation purposes, allowing you to track which prompts were used for specific audio files.
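The four output fields can be viewed as one record per generated file. The sketch below shows one way a counter could keep filenames unique; the helper next_output_record, the five-digit counter, the "audio" prefix, the "sound_lab" subfolder, and the .wav extension are all assumptions for illustration, not the node's actual naming scheme.

```python
import os
import tempfile


def next_output_record(output_dir: str, subfolder: str, prefix: str, prompt: str) -> dict:
    """Pick the first unused counter and describe the file to be written."""
    folder = os.path.join(output_dir, subfolder)
    os.makedirs(folder, exist_ok=True)
    counter = 1
    while os.path.exists(os.path.join(folder, f"{prefix}_{counter:05d}.wav")):
        counter += 1
    return {
        "filename": f"{prefix}_{counter:05d}.wav",
        "subfolder": subfolder,
        "type": "output",
        "prompt": prompt,
    }


# Demonstration in a throwaway directory: the second record skips the taken name.
with tempfile.TemporaryDirectory() as tmp:
    first = next_output_record(tmp, "sound_lab", "audio", "Chill hip-hop beat")
    open(os.path.join(tmp, first["subfolder"], first["filename"]), "wb").close()
    second = next_output_record(tmp, "sound_lab", "audio", "Chill hip-hop beat")
```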

Stable Audio Usage Tips:

  • Use clear and specific prompts to guide the audio generation process effectively.
  • Experiment with different cfg_scale values to find the right balance between adherence to the prompt and audio quality.
  • Utilize the seed parameter to reproduce specific audio outputs for consistency in your projects.
  • Adjust the steps parameter to balance between audio quality and computational time, especially for longer audio samples.
  • Select the appropriate device setting to optimize performance, using a GPU if available for faster processing.
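For scripted use, the inputs documented above can be written as a single node entry in ComfyUI's API workflow format. The class name and input names come from this page; every value below, including the sampler name, is a placeholder chosen for illustration rather than a documented default, and a real workflow would also need a node that consumes the output.

```python
import json

# One node entry keyed by an arbitrary node id ("1").
workflow = {
    "1": {
        "class_type": "StableAudio_",
        "inputs": {
            "prompt": "Chill hip-hop beat",
            "seconds": 30,
            "seed": -1,  # -1 asks for a random seed
            "steps": 100,  # placeholder value
            "cfg_scale": 7.0,  # placeholder value
            "sigma_min": 0.3,  # placeholder value
            "sigma_max": 500.0,  # placeholder value
            "sampler_type": "dpmpp-3m-sde",  # assumed sampler name
            "device": "auto",
        },
    }
}

# ComfyUI's HTTP API expects the workflow under a "prompt" key.
payload = json.dumps({"prompt": workflow})
```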

Stable Audio Common Errors and Solutions:

"CUDA out of memory"

  • Explanation: This error occurs when the GPU does not have enough memory to handle the audio generation process.
  • Solution: Reduce the seconds value to generate a shorter clip (the inputs listed above do not include a sample_size parameter), or switch to the CPU by setting the device parameter to "cpu."
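The CPU fallback can also be automated around whatever function performs the generation. This is a sketch under assumptions: generate_with_fallback is a hypothetical wrapper, `generate` stands for any callable that takes a device name, and the check relies on PyTorch reporting the failure as a RuntimeError whose message contains "out of memory".

```python
def generate_with_fallback(generate, device: str = "cuda"):
    """Try `generate` on the requested device; retry on the CPU after a GPU OOM."""
    try:
        return generate(device)
    except RuntimeError as err:
        # Re-raise anything that is not an out-of-memory failure,
        # and do not retry if we were already on the CPU.
        if device == "cpu" or "out of memory" not in str(err).lower():
            raise
        return generate("cpu")
```

CPU generation is much slower, so this trades speed for the ability to finish the run.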

"Invalid prompt"

  • Explanation: This error occurs when the provided prompt is not in a valid format or is empty.
  • Solution: Ensure that the prompt is a non-empty string and provides a clear description of the desired audio.

"Model not initialized"

  • Explanation: This error occurs when the model has not been properly loaded or initialized.
  • Solution: Check the model loading process and ensure that the model path is correct and the necessary files are available.

"File save error"

  • Explanation: This error occurs when there is an issue saving the generated audio file.
  • Solution: Verify that the output directory exists and has the necessary write permissions. Ensure that there is enough disk space available.
