ComfyUI Node: Kokoro Generator

Class Name: KokoroGenerator
Category: kokoro
Author: stavsap (Account age: 4341 days)
Extension: comfyui-kokoro
Last Updated: 2025-02-14
GitHub Stars: 0.03K

How to Install comfyui-kokoro

Install this extension via the ComfyUI Manager by searching for comfyui-kokoro:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter comfyui-kokoro in the search bar.
After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

Kokoro Generator Description

Generate lifelike speech from text input, with a choice of speaker styles and languages, using the Kokoro ONNX model.

Kokoro Generator:

The KokoroGenerator node synthesizes speech audio from text using the Kokoro ONNX model. You set the speaker's voice, the speech speed, and the language, and the node returns the generated audio for playback or further processing in your workflow. It is useful for adding voiceovers or narration to AI-generated content.

Kokoro Generator Input Parameters:

text

The text parameter is the string you want to convert into speech. It accepts multiline input, so longer passages can be synthesized in one pass. The default value is "I am a synthesized robot". The spoken output follows this text exactly as written.

speaker

The speaker parameter selects the voice used for audio generation. Its type is KOKORO_SPEAKER, a custom type that represents a speaker profile. As a custom type, it is supplied through a node connection rather than typed in as a value. Each pre-defined voice has its own characteristics, so you can match the tone and style of your project.

speed

The speed parameter is a float that controls the rate of speech in the generated audio. It ranges from 0.1 to 4, with a default value of 1. Values above 1 produce faster speech and values below 1 produce slower speech, so you can fit the pacing to your content.

lang

The lang parameter is a single-line string that sets the language of the synthesized speech. It defaults to "en-us". Set it to match the language of the text so that pronunciation and intonation are correct.
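
The four inputs above can be summarized in the form ComfyUI custom nodes usually use to declare them. The following is a minimal sketch, not the extension's actual source: the class name KokoroGeneratorSketch is hypothetical, the method name "generate" is only inferred from the error message documented further down, and the option keys follow the common ComfyUI convention.

```python
class KokoroGeneratorSketch:
    """Illustrative stand-in for the node's declaration (hypothetical, not the real source)."""

    @classmethod
    def INPUT_TYPES(cls):
        # Types, defaults and ranges are taken from the parameter descriptions above.
        return {
            "required": {
                "text": ("STRING", {"multiline": True, "default": "I am a synthesized robot"}),
                "speaker": ("KOKORO_SPEAKER",),  # custom type, connected from another node
                "speed": ("FLOAT", {"default": 1.0, "min": 0.1, "max": 4.0}),
                "lang": ("STRING", {"multiline": False, "default": "en-us"}),
            }
        }

    RETURN_TYPES = ("AUDIO",)
    RETURN_NAMES = ("audio",)
    FUNCTION = "generate"  # assumed entry-point name
    CATEGORY = "kokoro"

    def generate(self, text, speaker, speed, lang):
        # Synthesis itself is out of scope for this sketch.
        raise NotImplementedError("illustrative declaration only")
```

Reading the declaration this way makes it clear which inputs are plain widget values (text, speed, lang) and which one must arrive over a connection (speaker).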

Kokoro Generator Output Parameters:

audio

The audio output is a dictionary containing the generated waveform and its sample rate. The waveform is a tensor, a multi-dimensional array that holds the audio samples. The sample rate is the number of samples per second; downstream nodes need it to play back or save the audio at the correct speed and pitch. Connect this output to any node that accepts audio for playback, saving, or further processing.
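
To show how the waveform and sample rate work together, here is a small sketch that encodes samples as a WAV file using only the Python standard library. It assumes the waveform has already been flattened to a list of mono floats in the range -1.0 to 1.0 (for example from audio["waveform"][0, 0].tolist(), assuming the usual ComfyUI [batch, channels, samples] layout). The helper names are hypothetical and are not part of the extension.

```python
import io
import struct
import wave


def samples_to_wav_bytes(samples, sample_rate):
    """Encode mono float samples in [-1.0, 1.0] as 16-bit PCM WAV bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)  # must match the rate reported by the node
        frames = bytearray()
        for s in samples:
            clipped = max(-1.0, min(1.0, s))  # guard against out-of-range values
            frames += struct.pack("<h", int(clipped * 32767))
        wav.writeframes(bytes(frames))
    return buf.getvalue()


def duration_seconds(num_samples, sample_rate):
    """Length of the clip implied by the sample count and the sample rate."""
    return num_samples / sample_rate
```

Writing the file with a different sample rate than the one in the dictionary would change the playback speed and pitch, which is why the two values travel together.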

Kokoro Generator Usage Tips:

  • Ensure that the text input is clear and free of errors to achieve the best audio quality, as the generated speech will directly reflect the provided text.
  • Experiment with different speaker profiles to find the voice that best suits your project's tone and style, enhancing the overall impact of your audio content.
  • Adjust the speed parameter to match the desired pacing of your audio, keeping in mind that extreme values may affect the naturalness of the speech.

Kokoro Generator Common Errors and Solutions:

ERROR: could not load kokoro-onnx in generate

  • Explanation: This error occurs when the Kokoro ONNX model cannot be loaded, possibly due to missing or corrupted model files.
  • Solution: Ensure that the model and voices files are correctly downloaded and located in the specified directory. If necessary, re-download the files using the provided URLs.

no audio is generated

  • Explanation: This error indicates that the audio generation process failed, resulting in no output.
  • Solution: Check the input parameters for any inconsistencies or errors. Ensure that the text is valid and that the speaker profile is correctly specified. If the issue persists, verify that the model and voices files are intact and accessible.
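
Both errors above lead back to the model and voices files, so a quick check before running the workflow can save time. The sketch below is a hypothetical helper, not part of the extension: it takes the two file paths as arguments because the actual file names and directory depend on your installation.

```python
from pathlib import Path


def check_kokoro_files(model_path, voices_path):
    """Return a list of human-readable problems; an empty list means both files look usable."""
    problems = []
    for label, raw in (("model", model_path), ("voices", voices_path)):
        path = Path(raw)
        if not path.is_file():
            problems.append(f"{label} file not found: {path}")
        elif path.stat().st_size == 0:
            # A zero-byte file usually means an interrupted download.
            problems.append(f"{label} file is empty: {path}")
    return problems
```

If the list is not empty, re-download the reported file before looking at the node's input parameters.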

Copyright 2025 RunComfy. All Rights Reserved.