Converts audio input to text for transcribing spoken words, enhancing interactivity and accessibility.
The SpeechRecognition node converts audio input into text, transcribing spoken words into written form. It processes an audio file and returns the transcription, which is particularly useful for creating text prompts from voice commands or integrating voice input into your AI art projects. Using this node streamlines speech-to-text conversion and makes your creative workflows more interactive and accessible.
This parameter accepts an audio input, which is the source file that the node will process to recognize speech. The audio input should be in a format compatible with the node's processing capabilities.
The optional start_by parameter specifies the starting point in the audio file from which speech recognition should begin. It is an integer with a default of 0, a minimum of 0, and a maximum of 2048. Use it to skip the initial part of the audio or to start recognition from a specific position.
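The documented range for start_by can be checked before a value is handed to the node. The following is a minimal sketch, not the node's own code: the helper name validate_start_by and the two constants are hypothetical, and only the default (0) and the bounds (0 to 2048) come from the description above.

```python
# Bounds taken from the parameter description; the constant names are hypothetical.
START_BY_MIN = 0
START_BY_MAX = 2048


def validate_start_by(start_by: int = 0) -> int:
    """Return start_by unchanged if it is an integer in the documented range."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(start_by, bool) or not isinstance(start_by, int):
        raise TypeError(f"start_by must be an integer, got {type(start_by).__name__}")
    if not START_BY_MIN <= start_by <= START_BY_MAX:
        raise ValueError(
            f"start_by must be between {START_BY_MIN} and {START_BY_MAX}, got {start_by}"
        )
    return start_by
```

Calling validate_start_by(3000) raises a ValueError, while validate_start_by() returns the default of 0.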
The output parameter prompt is a string that contains the transcribed text from the audio input. This text is the result of the speech recognition process and can be used as a prompt or input for other nodes or applications within your AI art projects.
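To show how the inputs and the output described here fit together, the sketch below declares a comparable interface using ComfyUI's usual custom-node conventions (an INPUT_TYPES classmethod plus RETURN_TYPES, RETURN_NAMES, and FUNCTION attributes). It is an illustration under stated assumptions, not the node's actual source: the class name, the "AUDIO" type string, the input name audio, the method names, and the category are all hypothetical, and the transcription backend is deliberately left unimplemented because the page does not name one.

```python
class SpeechRecognitionSketch:
    """Hypothetical interface sketch; not the real SpeechRecognition node."""

    @classmethod
    def INPUT_TYPES(cls):
        return {
            # The "AUDIO" type string and the input name are assumptions.
            "required": {"audio": ("AUDIO",)},
            # Default and bounds follow the start_by description.
            "optional": {
                "start_by": ("INT", {"default": 0, "min": 0, "max": 2048}),
            },
        }

    RETURN_TYPES = ("STRING",)
    RETURN_NAMES = ("prompt",)  # the output described above
    FUNCTION = "recognize"      # hypothetical method name
    CATEGORY = "audio"          # hypothetical menu category

    def recognize(self, audio, start_by=0):
        # ComfyUI node functions return a tuple matching RETURN_TYPES.
        text = self._transcribe(audio, start_by)
        return (text,)

    def _transcribe(self, audio, start_by):
        # Placeholder: a real node would call a speech-to-text backend here.
        raise NotImplementedError("plug in a speech-to-text backend")
```

Because the output is a plain string, it can be wired into any downstream input that accepts text, such as a text-encoding node's prompt field.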
Use the start_by parameter to skip irrelevant parts of the audio and focus on the segment that contains the desired speech.
An error occurs when the start_by parameter is set to a value outside the acceptable range.
To resolve it, set the start_by parameter to a value between 0 and 2048 and try again.
© Copyright 2024 RunComfy. All Rights Reserved.