
ComfyUI Node: OpenAI语音识别(openai_whisper)

Class Name: openai_whisper
Category: 大模型派对(llm_party)/函数(function)
Author: heshengtao (Account age: 2893 days)
Extension: comfyui_LLM_party
Last Updated: 6/22/2024
GitHub Stars: 0.1K

How to Install comfyui_LLM_party

Install this extension via the ComfyUI Manager by searching for comfyui_LLM_party:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter comfyui_LLM_party in the search bar.
After installation, click the Restart button to restart ComfyUI, then manually refresh your browser to clear the cache and load the updated list of nodes.

OpenAI语音识别(openai_whisper) Description

Transcribe audio to text with high accuracy using OpenAI's Whisper model for subtitles, interviews, and voice notes.

OpenAI语音识别(openai_whisper):

The openai_whisper node transcribes audio files into text using OpenAI's Whisper model, accessed through the OpenAI API. It is useful wherever spoken language needs to be turned into written text, such as creating subtitles, transcribing interviews, or capturing voice notes. The node returns the transcription as a plain string, so it plugs directly into other LLM Party nodes and lets you automate transcription steps within your ComfyUI workflows.

OpenAI语音识别(openai_whisper) Input Parameters:

is_enable

This parameter is a boolean that determines whether the node is active or not. When set to True, the node will process the audio input and generate a transcription. If set to False, the node will not perform any action. The default value is True.

audio

This parameter accepts an audio file that you want to transcribe. The audio file should be in a format that is compatible with the Whisper model, such as WAV or MP3. The node will read this file and use it as the input for the transcription process.

base_url

This optional parameter allows you to specify the base URL for the OpenAI API. If not provided, the node will use the default URL https://api.openai.com/v1/. This parameter is useful if you need to point to a different API endpoint.

api_key

This optional parameter is used to provide your OpenAI API key. If not provided, the node will attempt to load the API key from the environment variables. This key is essential for authenticating your requests to the OpenAI API.
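
For orientation, the following is a minimal sketch of the kind of call this node wraps, written with the official openai Python client. The file name, model name, and client setup are illustrative assumptions, not the node's exact implementation.

    # Minimal sketch (assumption: the node issues a call similar to this via the openai SDK).
    import os
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ.get("OPENAI_API_KEY"),   # corresponds to the api_key input
        base_url="https://api.openai.com/v1/",      # corresponds to the base_url input
    )

    # "speech.mp3" is a placeholder for the audio file passed to the node.
    with open("speech.mp3", "rb") as audio_file:
        transcription = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
        )

    print(transcription.text)  # corresponds to the node's text output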

OpenAI语音识别(openai_whisper) Output Parameters:

text

The output parameter text contains the transcribed text from the provided audio file. This text is generated by the Whisper model and represents the spoken content of the audio input in written form. The output is a string that you can use for further processing or display.

OpenAI语音识别(openai_whisper) Usage Tips:

  • Ensure that your audio files are clear and free from excessive background noise to improve transcription accuracy.
  • Use the base_url parameter if you need to connect to a custom OpenAI API endpoint.
  • Always provide a valid api_key to authenticate your requests and avoid errors related to missing or invalid API keys.
  • Test with different audio formats to find the one that works best with the Whisper model for your specific use case; a format-conversion sketch follows this list.
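
If you need to normalize inputs before transcription, a small helper like the one below can convert arbitrary audio into 16 kHz mono WAV. This is a hedged sketch, not part of the node: it assumes the pydub package and ffmpeg are installed, and the function name and paths are hypothetical.

    # Hypothetical pre-processing helper (assumes pydub + ffmpeg are available).
    from pydub import AudioSegment

    def to_whisper_wav(src_path: str, dst_path: str = "converted.wav") -> str:
        audio = AudioSegment.from_file(src_path)             # input format is auto-detected
        audio = audio.set_frame_rate(16000).set_channels(1)  # 16 kHz mono is a safe target
        audio.export(dst_path, format="wav")
        return dst_path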

OpenAI语音识别(openai_whisper) Common Errors and Solutions:

请输入API_KEY ("Please enter the API_KEY")

  • Explanation: This error occurs when the API key is not provided or is empty.
  • Solution: Ensure that you have provided a valid API key either through the api_key parameter or by setting the appropriate environment variable.

Invalid audio file

  • Explanation: This error occurs when the provided audio file is not in a supported format or is corrupted.
  • Solution: Check the audio file format and ensure it is compatible with the Whisper model. Supported formats typically include WAV and MP3.

API request failed

  • Explanation: This error occurs when the request to the OpenAI API fails, possibly due to network issues or an incorrect API endpoint.
  • Solution: Verify your network connection and ensure that the base_url parameter is correctly set to a valid OpenAI API endpoint.

Transcription failed

  • Explanation: This error occurs when the Whisper model is unable to transcribe the provided audio file.
  • Solution: Ensure that the audio file is clear and of good quality. Try using a different audio file to see if the issue persists.
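
When debugging these failures outside ComfyUI, it can help to separate authentication, connection, and request errors. The sketch below uses exception classes from the openai Python SDK (version 1.x); the file name and printed messages are illustrative assumptions.

    # Illustrative error handling around a Whisper transcription call (openai>=1.0).
    import openai
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    try:
        with open("speech.mp3", "rb") as f:
            text = client.audio.transcriptions.create(model="whisper-1", file=f).text
    except openai.AuthenticationError:
        print("Missing or invalid API key (matches the 请输入API_KEY case).")
    except openai.APIConnectionError:
        print("Could not reach the API endpoint; check the network and base_url.")
    except openai.BadRequestError:
        print("Request rejected, often due to an unsupported or corrupted audio file.")
    except openai.APIStatusError as e:
        print(f"API request failed with status {e.status_code}.")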

OpenAI语音识别(openai_whisper) Related Nodes

Go back to the extension to check out more related nodes.
comfyui_LLM_party