
IDM-VTON | Virtual Try-on

Workflow Name: RunComfy/IDM-VTON
Workflow ID: 0000...1135
IDM-VTON, or Improving Diffusion Models for Authentic Virtual Try-on in the Wild, is a groundbreaking diffusion model that allows for realistic virtual garment try-on. By preserving the unique details and identity of garments, IDM-VTON generates incredibly authentic results. The model utilizes an image prompt adapter (IP-Adapter) to extract high-level garment semantics and a parallel UNet (GarmentNet) to encode low-level features. In ComfyUI, the IDM-VTON node powers the virtual try-on process, requiring inputs such as a human image, pose representation, clothing mask, and garment image.

What sets IDM-VTON apart from earlier try-on methods is its ability to preserve the unique details and identity of a garment while generating try-on results that look remarkably authentic, all from just a few inputs.

1. Understanding IDM-VTON

At its core, IDM-VTON is a diffusion model that's been specifically engineered for virtual try-on. To use it, you simply need a representation of a person and a garment you want to try on. IDM-VTON then works its magic, rendering a result that looks like the person is actually wearing the garment. It achieves a level of garment fidelity and authenticity that surpasses previous diffusion-based virtual try-on methods.

2. The Inner Workings of IDM-VTON

So, how does IDM-VTON pull off such realistic virtual try-on? The secret lies in its two main modules that work together to encode the semantics of the garment input:

  1. The first is an image prompt adapter, or IP-Adapter for short. This clever component extracts the high-level semantics of the garment - essentially, the key characteristics that define its appearance. It then fuses this information into the cross-attention layer of the main UNet diffusion model.
  2. The second module is a parallel UNet called GarmentNet. Its job is to encode the low-level features of the garment - the nitty-gritty details that make it unique. These features are then fused into the self-attention layer of the main UNet.
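The two fusion paths above can be sketched with plain attention arithmetic. The following is an illustrative NumPy sketch, not the actual IDM-VTON code: projection matrices are omitted, shapes are toy-sized, and all variable names are made up for clarity.

```python
import numpy as np

def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
d = 8
person_tokens = rng.normal(size=(16, d))      # main UNet hidden states
garment_semantics = rng.normal(size=(4, d))   # high-level IP-Adapter features
garment_details = rng.normal(size=(16, d))    # low-level GarmentNet features

# 1. Cross-attention: person tokens attend to the IP-Adapter's
#    high-level garment semantics.
high_level = attention(person_tokens, garment_semantics, garment_semantics)

# 2. Self-attention: GarmentNet's low-level features are concatenated
#    in as extra keys/values, so person tokens can attend to them.
kv = np.concatenate([person_tokens, garment_details], axis=0)
low_level = attention(person_tokens, kv, kv)

# Both signals end up fused into the main UNet's hidden states.
fused = person_tokens + high_level + low_level
print(fused.shape)  # (16, 8)
```

The key design point this illustrates: high-level semantics arrive through a *cross*-attention path, while low-level details ride along the *self*-attention path as additional key/value tokens.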

But that's not all! IDM-VTON also makes use of detailed textual prompts for both the garment and the person inputs. These prompts provide additional context that enhances the authenticity of the final virtual try-on result.

3. Putting IDM-VTON to Work in ComfyUI

3.1 The Star of the Show: The IDM-VTON Node

In ComfyUI, the "IDM-VTON" node is the powerhouse that runs the IDM-VTON diffusion model and generates the virtual try-on output.

For the IDM-VTON node to work its magic, it needs a few key inputs:

  1. Pipeline: This is the loaded IDM-VTON diffusion pipeline that powers the whole virtual try-on process.
  2. Human Input: An image of the person who will be virtually trying on the garment.
  3. Pose Input: A preprocessed DensePose representation of the human input, which helps IDM-VTON understand the person's pose and body shape.
  4. Mask Input: A binary mask that marks the clothing regions of the human image; the mask must be converted into an image format before it can be passed to the node.
  5. Garment Input: An image of the garment to be virtually tried on.

3.2 Getting Everything Ready

To get the IDM-VTON node up and running, there are a few preparation steps:

  1. Loading the Human Image: A LoadImage node is used to load the image of the person.
  2. Generating the Pose Image: The human image is passed through a DensePosePreprocessor node, which computes the DensePose representation that IDM-VTON needs.
  3. Obtaining the Mask Image: There are two ways to get the clothing mask:

a. Manual Masking (Recommended)

  • Right-click on the loaded human image and choose "Open in Mask Editor."
  • In the mask editor UI, manually mask the clothing regions.

b. Automatic Masking

  • Use a GroundingDinoSAMSegment node to automatically segment the clothing.
  • Prompt the node with a text description of the garment (like "t-shirt").

Whichever method you choose, the obtained mask needs to be converted to an image using a MaskToImage node, which is then connected to the "Mask Image" input of the IDM-VTON node.
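Conceptually, the MaskToImage step just lifts a single-channel binary mask into a regular image tensor. Here is a minimal NumPy sketch of that idea (illustrative only, not the actual ComfyUI implementation):

```python
import numpy as np

def mask_to_image(mask):
    """Turn a single-channel binary mask (H, W) into an RGB-style image
    (H, W, 3) by clamping to [0, 1] and repeating across three channels."""
    mask = np.clip(mask.astype(np.float32), 0.0, 1.0)
    return np.repeat(mask[:, :, None], 3, axis=2)

# Toy 4x4 mask with a 2x2 "clothing" region in the middle.
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0

img = mask_to_image(mask)
print(img.shape)  # (4, 4, 3)
```

Whether the mask came from the Mask Editor or from GroundingDinoSAMSegment, the result after this conversion is an ordinary image that the IDM-VTON node can accept.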

  4. Loading the Garment Image: A second LoadImage node is used to load the image of the garment to be tried on.

For a deeper dive into the IDM-VTON model, don't miss the original paper, "Improving Diffusion Models for Authentic Virtual Try-on in the Wild". And if you're interested in using IDM-VTON in ComfyUI, be sure to check out the dedicated custom nodes. Huge thanks to the researchers and developers behind these incredible resources.

