Ideogram 4 ComfyUI workflow | Structured Text-to-Image Generator



Ideogram 4 ComfyUI workflow: structured text-to-image with precise layout and typography

This Ideogram 4 ComfyUI workflow is a compact, RunComfy-ready template for Ideogram 4.0, an open-weight, non-commercially licensed text-to-image model built for design, layout control, and reliable in‑image text. It turns structured JSON captions into images with scene summaries, style blocks, normalized bounding boxes, and hex color palettes, making it ideal for posters, brand comps, typography‑heavy graphics, and layout‑aware illustration.

The graph delivers a clean, single‑path text‑to‑image pipeline plus an optional on‑graph JSON prompt builder. If you already write JSON prompts, paste them and render immediately; if you prefer to start from a short idea, the LLM helper can draft a schema‑correct caption you can preview and paste into the generator. Under the hood, the workflow follows Ideogram 4’s flow‑matching DiT sampling with asymmetric classifier‑free guidance.

Key models in the Ideogram 4 ComfyUI workflow

Ideogram 4 (FP8). The 9.3B‑parameter Diffusion Transformer trained with flow matching, designed for JSON‑guided generation, strong text rendering, and explicit layout control. Official model card: ideogram-ai/ideogram-4-fp8. Inference code: ideogram-oss/ideogram4.
Ideogram 4 Unconditional branch. A paired unconditional checkpoint used for asymmetric classifier‑free guidance during sampling; packaged for ComfyUI alongside the main model: Comfy-Org/Ideogram-4.
Qwen3‑VL‑8B‑Instruct (FP8). A vision‑language encoder used as the text encoder, providing multi‑scale semantic features from the prompt: Qwen/Qwen3-VL-8B-Instruct-FP8 (ComfyUI repack: Comfy-Org/Qwen3-VL).
FLUX.2 VAE. The decoder used to turn sampled latents into final images, packaged for ComfyUI: Comfy-Org/flux2-dev.

How to use the Ideogram 4 ComfyUI workflow

Overall logic: choose a canvas, provide a prompt (ideally structured JSON), pick a sampler preset (Default, Quality, Turbo), then render. The main “Text to Image (Ideogram v4)” subgraph performs encoding, guidance, sampling, and decoding in one pass; an optional “LLM Prompt Builder” group can draft JSON for you.

Canvas and aspect ratio: ResolutionSelector (#37)
- Pick a preset like 1:1, 16:9, or 9:16. The workflow computes valid dimensions for Ideogram 4 (multiples of 16 with sensible minimums) and propagates them to the sampler and VAE, so you can target anything from square thumbnails to tall posters without manual math. You can change the preset at any time; the scheduler adapts to the chosen resolution.
Prompt and JSON caption: CLIP Text Encode (Positive Prompt) (#24)
- Paste natural language or, for best results, a structured JSON caption following Ideogram 4’s schema. Use high_level_description, a style_description block (with color_palette as uppercase hex codes), and a compositional_deconstruction section. Bounding boxes are normalized on a 0–1000 grid with the order [y_min, x_min, y_max, x_max] and origin at the top‑left; include type: "text" items to render literal text in the image. The model is sensitive to key order; see the official guide in docs/prompting.md.
Preset mode (speed vs quality): “Preset” group inside the subgraph
- Choose a mode in the subgraph’s mode input: Default (balanced), Quality (more steps and fidelity), or Turbo (fewer steps and fastest feedback). The workflow parses a small internal preset table and routes the matching step count and schedule parameters to the scheduler. Switch presets to iterate quickly, then finish at higher quality.
Sampling and guidance: “Sampling” group inside the subgraph
- The pipeline uses flow‑matching sampling with a paired unconditional branch for asymmetric classifier‑free guidance. The DualModelGuider blends conditional and unconditional predictions, while Ideogram4Scheduler shapes the noise schedule for your chosen size and preset. KSamplerSelect picks the algorithm and SamplerCustomAdvanced runs the denoising pass before decoding.
Models (prewired): “Models” group inside the subgraph
- The graph loads the main Ideogram 4 model, its unconditional partner, the Qwen3‑VL text encoder, and the FLUX.2 VAE. These are wired into the guider, sampler, and decoder. You normally do not need to change these, but swapping models is possible if you are experimenting with variants packaged for ComfyUI.
Optional: on‑graph JSON drafting: JSON Prompt Builder (Gemma4) (#134)
- Select the “LLM Prompt Builder (Select and Ctrl+B to enable)” group and press Ctrl+B to enable it. Enter a short idea in the user_prompt field; the node drafts a schema‑correct JSON caption that you can preview with PreviewAny (#111). Copy the generated JSON into the main prompt input for the image subgraph.
Output: SaveImage (#158)
- Images are written under a folder named for the model version. Rename the prefix if you want to keep outputs from different presets or aspect ratios separate.
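
The canvas and caption steps above can be sketched in a few lines of Python. This is a minimal illustration and not the official schema: the top-level keys, the `type: "text"` marker, and the bbox order come from the description above, while the `description` and `text` field names, the `#RRGGBB` palette format, the list shape of compositional_deconstruction, and the helper names `snap_to_16`, `pixel_box_to_bbox`, and `build_caption` are assumptions made for the example. Check docs/prompting.md for the authoritative key names and order.

```python
import json
import re

GRID = 1000  # bbox coordinates live on a 0-1000 grid, origin at the top-left
HEX_RE = re.compile(r"^#[0-9A-F]{6}$")  # assumption: palette entries look like "#RRGGBB"


def snap_to_16(pixels: int) -> int:
    """Round a pixel dimension to the nearest multiple of 16 (never below 16)."""
    return max(16, (pixels + 8) // 16 * 16)


def pixel_box_to_bbox(top, left, bottom, right, width, height):
    """Convert a pixel rectangle to [y_min, x_min, y_max, x_max] on the 0-1000 grid."""
    def norm(value, size):
        return max(0, min(GRID, round(value * GRID / size)))
    return [norm(top, height), norm(left, width), norm(bottom, height), norm(right, width)]


def build_caption(summary, style_text, palette, elements):
    """Assemble a caption dict; dict insertion order is preserved when serialized."""
    bad = [c for c in palette if not HEX_RE.match(c)]
    if bad:
        raise ValueError(f"palette entries must be uppercase hex codes: {bad}")
    return {
        "high_level_description": summary,
        "style_description": {
            "description": style_text,  # hypothetical field name
            "color_palette": list(palette),
        },
        "compositional_deconstruction": list(elements),  # assumed to be a list
    }


width, height = snap_to_16(1080), snap_to_16(1350)  # -> 1088 x 1344
title = {
    "type": "text",
    "text": "SUMMER JAZZ NIGHT",  # hypothetical field name for the literal string
    "bbox": pixel_box_to_bbox(80, 96, 320, 992, width, height),
}
caption = build_caption(
    "A minimalist concert poster with a bold headline",
    "flat vector illustration, high contrast",
    ["#0B1F3A", "#F2C14E", "#FFFFFF"],
    [title],
)
print(json.dumps(caption, indent=2))
```

The printed JSON can be pasted into the prompt input. Building the dict in a fixed order and serializing it with `json.dumps` keeps the key order stable between runs, which matters because the model is described as sensitive to key order.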

Key nodes in the Ideogram 4 ComfyUI workflow

CLIP Text Encode (Positive Prompt) (#24)
- Encodes the prompt with Qwen3‑VL for Ideogram 4. Use structured JSON for layout control, explicit in‑image text, and palette steering. Keep key order stable and use [y_min, x_min, y_max, x_max] with values on a 0–1000 grid for bbox entries; this matches the model’s documented schema in docs/prompting.md.
UNETLoader (#23)
- Loads the main Ideogram 4 checkpoint that performs conditional denoising. This is the backbone that translates your encoded caption into images; leave it as the official release for the most consistent results: ideogram-ai/ideogram-4-fp8.
UNETLoader (#154)
- Loads the unconditional Ideogram 4 checkpoint used for asymmetric classifier‑free guidance. Pairing this with the main model lets the guider control prompt adherence and overall image quality separately: Comfy-Org/Ideogram-4.
DualModelGuider (#155)
- Combines conditional and unconditional predictions to implement asymmetric classifier‑free guidance. Adjust the guidance strength only if you understand the trade‑off: too little weakens prompt fidelity; too much can oversharpen or distort. When changing presets, revisit guidance to maintain a similar “feel.”
Ideogram4Scheduler (#17)
- Produces the noise schedule and step count specialized for Ideogram 4 at your chosen width and height. The “Preset” group feeds it the matching steps and schedule parameters; use Quality for final renders, Turbo for drafts, and Default for everyday work.
SamplerCustomAdvanced (#12)
- Runs the denoising pass using the selected sampler and the scheduler’s sigmas. Leave this unchanged unless you are intentionally comparing sampler families; if you do swap samplers, keep resolution and preset fixed to make A/Bs meaningful.
CFGOverride (#157)
- Provides a fine‑grained knob over how conditioning is applied during sampling. Most users can ignore this and rely on the presets; if you tweak it, make small changes and re‑evaluate on several prompts so you do not tune sampler behavior to a single scene.
VAELoader (#9) and VAEDecode (#13)
- Load and apply the FLUX.2 VAE to decode sampled latents into final images. Keep the official VAE to preserve colorimetry and detail balance unless you are testing alternatives: Comfy-Org/flux2-dev.
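
The guidance trade-off described for DualModelGuider can be pictured with the standard classifier-free guidance blend. The sketch below is an assumption, not the node's actual implementation: it uses the textbook formula `uncond + scale * (cond - uncond)` on plain lists of floats, and the name `guided_prediction` and its `scale` parameter are hypothetical. In this workflow the two predictions would come from two separate checkpoints (the main model and its unconditional partner), and the real node may weight or schedule the blend differently.

```python
from typing import List, Sequence


def guided_prediction(cond: Sequence[float], uncond: Sequence[float], scale: float) -> List[float]:
    """Blend two model predictions with the standard classifier-free guidance formula.

    cond   -- prediction from the prompt-conditioned model
    uncond -- prediction from the unconditional model
    scale  -- guidance strength (hypothetical parameter name)
    """
    if len(cond) != len(uncond):
        raise ValueError("predictions must have the same length")
    # Move from the unconditional prediction toward (and past) the conditional one.
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


cond = [1.0, 2.0]
uncond = [0.0, 1.0]
for scale in (0.0, 1.0, 2.0):
    print(scale, guided_prediction(cond, uncond, scale))
```

With `scale` at 0 the result equals the unconditional prediction, at 1 it equals the conditional prediction, and above 1 it extrapolates past it. That extrapolation is one way to read the warning that too much guidance can oversharpen or distort.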

Optional extras

Use type: "text" elements in your JSON to render exact wording in‑image; keep strings concise and place them with a dedicated bbox.
Start with 3–6 colors in style_description.color_palette (uppercase hex) and add per‑element palettes only when you need local overrides.
For layout, think in thirds: vary bbox sizes and positions to create depth; non‑overlapping boxes reduce collisions.
Lock the noise seed to reproduce a composition; change it to explore variations without altering your JSON.
If you see “Image blocked by safety filter,” that response comes from the model itself; adjust content toward safe, schema‑consistent prompts. For full details, see the model card: ideogram-ai/ideogram-4-fp8.
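
The palette and layout tips above can be checked mechanically before rendering. The following is a small sketch under stated assumptions: `lint_layout` and `boxes_overlap` are hypothetical helper names, palette entries are assumed to look like `#RRGGBB`, and boxes use the [y_min, x_min, y_max, x_max] order on the 0–1000 grid. Overlaps are reported as warnings only, since the tip says non-overlapping boxes reduce collisions and does not say overlaps are forbidden.

```python
import re
from itertools import combinations

HEX_RE = re.compile(r"^#[0-9A-F]{6}$")  # assumption: "#RRGGBB", uppercase


def boxes_overlap(a, b):
    """True if two [y_min, x_min, y_max, x_max] boxes share interior area."""
    ay0, ax0, ay1, ax1 = a
    by0, bx0, by1, bx1 = b
    return ay0 < by1 and by0 < ay1 and ax0 < bx1 and bx0 < ax1


def lint_layout(palette, boxes):
    """Return a list of human-readable warnings; an empty list means no findings."""
    warnings = []
    if not 3 <= len(palette) <= 6:
        warnings.append(f"palette has {len(palette)} colors; 3-6 is the suggested start")
    for color in palette:
        if not HEX_RE.match(color):
            warnings.append(f"{color!r} is not an uppercase hex code")
    for i, (y0, x0, y1, x1) in enumerate(boxes):
        if not (0 <= y0 < y1 <= 1000 and 0 <= x0 < x1 <= 1000):
            warnings.append(f"box {i} is outside the 0-1000 grid or has no area")
    for (i, a), (j, b) in combinations(enumerate(boxes), 2):
        if boxes_overlap(a, b):
            warnings.append(f"boxes {i} and {j} overlap")
    return warnings


palette = ["#0B1F3A", "#F2C14E", "#FFFFFF"]
boxes = [[0, 0, 600, 1000], [500, 0, 1000, 1000]]
for message in lint_layout(palette, boxes):
    print(message)
```

Boxes that only touch along an edge are not counted as overlapping, so a layout split into clean halves or thirds passes without warnings.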

Acknowledgements

This workflow implements and builds upon the following works and resources. We gratefully acknowledge Comfy-Org for the ComfyUI Day 0 support announcement, the Ideogram 4 workflow template, and the Ideogram-4 model card, and ideogram-oss for the official Ideogram 4 inference-code repository, along with their ongoing contributions and maintenance. For authoritative details, please refer to the original documentation and repositories listed below.

Resources

Comfy-Org/Comfy blog announcement
- Docs / Release Notes: Ideogram 4 Day 0 support in ComfyUI
Comfy-Org/Comfy workflow template
- GitHub: Comfy-Org/workflow_templates — image_ideogram4_t2i.json
Comfy-Org/Ideogram 4 ComfyUI model card
- Hugging Face: Comfy-Org/Ideogram-4
ideogram-oss/Ideogram 4 inference-code repository
- GitHub: ideogram-oss/ideogram4

Note: Use of the referenced models, datasets, and code is subject to the respective licenses and terms provided by their authors and maintainers.
