This workflow is inspired by with some modifications. For more information, please visit his YouTube channel.
This workflow lets you transform standard videos into enchanting Japanese anime creations using AnimateDiff, ControlNet, and IPAdapter. Feel free to experiment with various checkpoints, LoRA settings, and reference images for the IPAdapter to craft your unique style. It's a fun and creative way to bring your videos to life in the anime world!
ControlNet brings a new level of spatial control to text-to-image diffusion models. This neural network architecture works hand in hand with large models like Stable Diffusion, drawing on the knowledge they learned from billions of images to feed spatial conditions directly into image generation. Whether the condition is an edge sketch, a human pose, a depth map, or a segmentation mask, ControlNet lets you shape the imagery in ways that go far beyond what text prompts alone can do.
At its core, ControlNet's design is straightforward. It first freezes the original model's parameters, keeping the base training intact. It then adds a trainable copy of the model's encoding layers, connected back to the base model through "zero convolutions": convolution layers whose weights start at zero. Because these layers initially contribute nothing, new spatial conditions are folded in gradually, and the model's original abilities are preserved even as it learns the new conditioning.
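For intuition, here is a minimal PyTorch-style sketch of the zero-convolution idea. This is not ControlNet's actual source code, just an illustration of how a frozen base block and a trainable copy can be combined through a zero-initialized 1x1 convolution.

```python
import torch
import torch.nn as nn

def zero_conv(channels: int) -> nn.Conv2d:
    """A 1x1 convolution initialized to all zeros, so it contributes nothing
    to the output until training moves its weights away from zero."""
    conv = nn.Conv2d(channels, channels, kernel_size=1)
    nn.init.zeros_(conv.weight)
    nn.init.zeros_(conv.bias)
    return conv

class ControlledBlock(nn.Module):
    """Illustrative only: a frozen base block plus a trainable copy whose
    output is folded back in through a zero convolution."""
    def __init__(self, base_block: nn.Module, trainable_copy: nn.Module, channels: int):
        super().__init__()
        self.base = base_block
        for p in self.base.parameters():
            p.requires_grad = False          # keep the original weights intact
        self.copy = trainable_copy           # learns from the spatial condition
        self.zero = zero_conv(channels)      # outputs zeros at initialization

    def forward(self, x: torch.Tensor, condition: torch.Tensor) -> torch.Tensor:
        # At step zero this reduces to self.base(x); the conditioning path
        # only gains influence as the zero convolution is trained.
        return self.base(x) + self.zero(self.copy(x + condition))
```

Because the added path starts at zero, training begins from the unmodified base model and introduces the spatial conditioning gradually.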
Both ControlNets and T2I-Adapters play crucial roles in the conditioning of image generation, with each offering distinct advantages. T2I-Adapters are recognized for their efficiency, particularly in terms of speeding up the image generation process. Despite this, ControlNets are unparalleled in their ability to intricately guide the generation process, making them a powerful tool for creators.
Considering the overlap in functionalities between many T2I-Adapter and ControlNet models, our discussion will primarily focus on ControlNets. However, it's worth noting that the RunComfy platform has preloaded several T2I-Adapter models for ease of use. For those interested in experimenting with T2I-Adapters, you can seamlessly load these models and integrate them into your projects.
Choosing between ControlNet and T2I-Adapter models in ComfyUI does not affect the use of ControlNet nodes or the consistency of the workflow. This uniformity ensures a streamlined process, allowing you to leverage the unique benefits of each model type according to your project needs.
3.4.1. Loading the “Apply ControlNet” Node
To begin, load the "Apply ControlNet" node into your ComfyUI workflow. This is the first step toward dual-conditioned image generation, blending visual guidance with textual prompts.
3.4.2. Understanding the Inputs of “Apply ControlNet” Node
Positive and Negative Conditioning: These are your tools for shaping the final image—what it should embrace and what it should avoid. Connect these to the "Positive prompt" and "Negative prompt" slots to sync them with the text-based part of your creative direction.
Selecting the ControlNet Model: You'll need to link this input to the "Load ControlNet Model" node's output. This is where you decide whether to use a ControlNet or a T2I-Adapter model based on the specific traits or styles you're aiming for. While we're focusing on ControlNet models, mentioning some sought-after T2I-Adapters is worthwhile for a well-rounded view.
Preprocessing Your Image: Connect your image to a “ControlNet Preprocessor” node, which is vital to ensure your image is ControlNet-ready. It's essential to match the preprocessor to your ControlNet model. This step adjusts your original image to fit the model's needs perfectly—resizing, recoloring, or applying necessary filters—preparing it for use by ControlNet.
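As a rough illustration of what "fitting the model's needs" can involve, the sketch below resizes and converts an input frame before it is handed to a preprocessor. It assumes Pillow is available and that the target resolution should be a multiple of 8, which Stable Diffusion-style latents expect; it is a minimal example, not part of the workflow itself.

```python
from PIL import Image

def prepare_for_controlnet(path: str, target_width: int = 512) -> Image.Image:
    """Minimal illustrative prep: convert to RGB and resize so both sides are
    multiples of 8, matching what Stable Diffusion-style latents expect."""
    img = Image.open(path).convert("RGB")
    scale = target_width / img.width
    width = (target_width // 8) * 8
    height = (int(img.height * scale) // 8) * 8
    return img.resize((width, height), Image.LANCZOS)
```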
3.4.3. Understanding the Outputs of “Apply ControlNet” Node
After processing, the "Apply ControlNet" node presents you with two outputs reflecting the interplay between ControlNet and your creative input: Positive and Negative Conditioning. These outputs guide the diffusion model within ComfyUI and leave you a choice: pass them to the KSampler to refine the image, or stack additional ControlNets for even finer detail and customization.
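To make the wiring concrete, here is a hedged sketch of how these connections might look in a ComfyUI API-format prompt, written as a Python dict. The node class names and input fields follow a typical export of the advanced Apply ControlNet node, but exact names can differ between ComfyUI versions, so treat the fragment as an assumption rather than a canonical listing.

```python
# Fragment of a ComfyUI API-format prompt, expressed as a Python dict.
# Connections are [source_node_id, output_index]; node ids here are arbitrary.
prompt_fragment = {
    "10": {  # Load ControlNet Model
        "class_type": "ControlNetLoader",
        "inputs": {"control_net_name": "control_v11p_sd15_canny.pth"},  # example filename
    },
    "11": {  # Apply ControlNet (Advanced)
        "class_type": "ControlNetApplyAdvanced",
        "inputs": {
            "positive": ["6", 0],        # positive conditioning from the prompt encoder
            "negative": ["7", 0],        # negative conditioning
            "control_net": ["10", 0],    # the loaded ControlNet model
            "image": ["12", 0],          # preprocessed image (e.g. a Canny map)
            "strength": 1.0,
            "start_percent": 0.0,
            "end_percent": 1.0,
        },
    },
    "13": {  # KSampler consumes the two conditioning outputs
        "class_type": "KSampler",
        "inputs": {
            "model": ["4", 0],
            "positive": ["11", 0],       # Apply ControlNet output #0 (positive)
            "negative": ["11", 1],       # Apply ControlNet output #1 (negative)
            "latent_image": ["5", 0],
            "seed": 0, "steps": 25, "cfg": 7.0,
            "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0,
        },
    },
}
```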
3.4.4. Tuning “Apply ControlNet” for Best Results
Determining Strength: This setting controls how much ControlNet sways the resulting image. A full-on 1.0 means ControlNet's input has the reins, while dialing down to 0.0 lets the model run without ControlNet's influence.
Adjusting Start Percent: This tells you when ControlNet starts to pitch in during the diffusion process. For example, a 20% start means that from one-fifth of the way through, ControlNet begins to make its mark.
Setting End Percent: This is the flip side of Start Percent, marking when ControlNet bows out. If you set it to 80%, ControlNet's influence fades away as the image nears its final stages, untouched by ControlNet in the last stretch.
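The three settings interact in a simple way. The sketch below is a back-of-the-envelope model, an assumption rather than ComfyUI's actual implementation, of the effective ControlNet influence at each sampling step: start and end percent act as an on/off window, and strength scales the influence inside it.

```python
def controlnet_influence(step: int, total_steps: int,
                         strength: float = 1.0,
                         start_percent: float = 0.0,
                         end_percent: float = 1.0) -> float:
    """Illustrative only: how much ControlNet guidance applies at `step`."""
    progress = step / max(total_steps - 1, 1)   # 0.0 at the first step, 1.0 at the last
    if start_percent <= progress <= end_percent:
        return strength
    return 0.0

# Example: with start_percent=0.2 and end_percent=0.8, ControlNet is inactive
# during roughly the first and last fifth of a 25-step sampling run.
weights = [controlnet_influence(s, 25, 1.0, 0.2, 0.8) for s in range(25)]
```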
3.5.1. ControlNet Model: Openpose
Preprocessor options include Openpose and DWpose.
3.5.2. ControlNet Model: Depth
Depth models use a 2D image to infer depth, representing it as a grayscale map. Each has its strengths in terms of detail or background focus:
Preprocessors to consider: Depth_Midas, Depth_Leres, Depth_Zoe, Depth_Anything, MeshGraphormer_Hand_Refiner. The Depth ControlNet model is notably robust and also works well with real depth maps exported from rendering engines.
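Whichever preprocessor you pick, what it hands to ControlNet is simply a grayscale image in which brightness encodes distance. As a minimal sketch under that assumption, here is how a raw depth array, whether from a depth estimator or a rendering engine, could be normalized into such a map with NumPy and Pillow.

```python
import numpy as np
from PIL import Image

def depth_to_grayscale(depth: np.ndarray, near_is_bright: bool = True) -> Image.Image:
    """Normalize a raw depth array to an 8-bit grayscale map for ControlNet."""
    d = depth.astype(np.float32)
    d = (d - d.min()) / (d.max() - d.min() + 1e-8)   # scale to [0, 1]
    if near_is_bright:
        d = 1.0 - d                                   # smaller distance -> brighter pixel
    return Image.fromarray((d * 255).astype(np.uint8), mode="L")
```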
3.5.3. ControlNet Model: SoftEdge
ControlNet Soft Edge is crafted to produce images with gentler edges, enhancing detail while maintaining a natural look. It utilizes cutting-edge neural networks for refined image manipulation, offering extensive creative control and flawless integration.
In terms of robustness: SoftEdge_PIDI_safe > SoftEdge_HED_safe >> SoftEdge_PIDI > SoftEdge_HED
For the highest quality results: SoftEdge_HED > SoftEdge_PIDI > SoftEdge_HED_safe > SoftEdge_PIDI_safe
As a general recommendation, SoftEdge_PIDI is the go-to option since it typically delivers excellent results.
Preprocessors include: SoftEdge_PIDI, SoftEdge_PIDI_safe, SoftEdge_HED, SoftEdge_HED_safe.
3.5.4. ControlNet Model: Canny
The Canny model applies Canny edge detection to highlight a wide spectrum of edges within images. It is excellent for maintaining the integrity of structural elements while simplifying the image's overall look, which helps when creating stylized art or preparing images for further manipulation.
Preprocessors available: Canny
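The Canny preprocessor boils down to classic edge detection. Here is a minimal OpenCV sketch of the same idea, assuming `opencv-python` is installed; the threshold values and file names are illustrative, not prescribed by the workflow.

```python
import cv2

def make_canny_map(path: str, low: int = 100, high: int = 200):
    """Produce a white-on-black edge map, like the Canny preprocessor does."""
    image = cv2.imread(path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, low, high)                # white edges on a black background
    return cv2.cvtColor(edges, cv2.COLOR_GRAY2RGB)    # 3-channel image for ControlNet

canny_map = make_canny_map("input_frame.png")
cv2.imwrite("canny_map.png", canny_map)
```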
3.5.5. ControlNet Model: Lineart
Lineart models are your tools for transforming images into stylized line drawings, suitable for a variety of artistic applications:
Preprocessors available: Lineart (detailed lines) and Lineart_Coarse (heavier, more pronounced lines).
3.5.6. ControlNet Model: Tile
The Tile Resample model excels in bringing out details in images. It's especially effective when used in tandem with an upscaler to enhance image resolution and detail, often applied to sharpen and enrich image textures and elements.
Preprocessor recommended: Tile
Incorporating multiple ControlNets or T2I-Adapters allows for the sequential application of different conditioning types to your image generation process. For example, you can combine Lineart and OpenPose ControlNets for enhanced detailing.
Lineart for Object Shape: Start by integrating a Lineart ControlNet to define the shapes and outlines of objects or elements in your imagery. This involves preparing a lineart or canny map for the objects you wish to include.
OpenPose for Pose Control: Following the lineart detailing, utilize the OpenPose ControlNet to dictate the pose of individuals within your image. You will need to generate or acquire an OpenPose map that captures the desired pose.
Sequential Application: To effectively combine these effects, link the output from the Lineart ControlNet into the OpenPose ControlNet. This method ensures that both the pose of the subjects and the shapes of objects are simultaneously guided during the generation process, creating an outcome that harmoniously aligns with all input specifications.
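In API-format terms, "linking the output" simply means the positive and negative outputs of the first Apply ControlNet node feed the matching inputs of the second. Here is a hedged sketch of that chaining, with the same caveat as before that node and field names can vary across ComfyUI versions.

```python
# Two Apply ControlNet nodes chained: Lineart first, OpenPose second.
chained_fragment = {
    "20": {  # Apply ControlNet (Lineart)
        "class_type": "ControlNetApplyAdvanced",
        "inputs": {
            "positive": ["6", 0], "negative": ["7", 0],
            "control_net": ["21", 0],   # loaded Lineart ControlNet
            "image": ["22", 0],         # lineart map of the objects
            "strength": 0.8, "start_percent": 0.0, "end_percent": 1.0,
        },
    },
    "30": {  # Apply ControlNet (OpenPose) consumes the Lineart node's outputs
        "class_type": "ControlNetApplyAdvanced",
        "inputs": {
            "positive": ["20", 0], "negative": ["20", 1],
            "control_net": ["31", 0],   # loaded OpenPose ControlNet
            "image": ["32", 0],         # OpenPose map with the desired pose
            "strength": 1.0, "start_percent": 0.0, "end_percent": 1.0,
        },
    },
    # The KSampler then takes its positive/negative conditioning from node "30".
}
```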