
This ComfyUI tutorial workflow demonstrates how to generate controllable videos with Wan2.2 Fun Control (14B), guided by pose, depth, and edge inputs. At its core, the Wan22FunControlToVideo node fuses your text prompt from CLIPTextEncode with one or more control streams derived from a reference video. The model stack loads via UNETLoader and VAELoader, while CLIPLoader and CLIPTextEncode provide multilingual text conditioning. Sampling runs through ModelSamplingSD3 with KSamplerAdvanced to balance quality, speed, and control strength. Finally, VAEDecode reconstructs frames and SaveVideo exports the result.
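The graph described above can be sketched in ComfyUI's API (JSON) format. This is a minimal, hypothetical fragment: the node IDs, input field names, and checkpoint filenames are illustrative assumptions, not taken from a real export — use "Save (API Format)" in ComfyUI to get the exact field names for your install. A small helper checks that every node-to-node link points at a node that actually exists in the graph.

```python
# Hypothetical sketch of the core Wan2.2 Fun Control graph in ComfyUI's
# API (JSON) format. IDs, input names, and filenames are assumptions.
workflow = {
    "1": {"class_type": "UNETLoader",
          "inputs": {"unet_name": "wan2.2_fun_control_14B.safetensors"}},
    "2": {"class_type": "CLIPLoader",
          "inputs": {"clip_name": "umt5_xxl.safetensors", "type": "wan"}},
    "3": {"class_type": "VAELoader",
          "inputs": {"vae_name": "wan_vae.safetensors"}},
    "4": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["2", 0],  # [source_node_id, output_slot]
                     "text": "a dancer on a rooftop at dusk"}},
    "5": {"class_type": "ModelSamplingSD3",
          "inputs": {"model": ["1", 0], "shift": 8.0}},
    "6": {"class_type": "Wan22FunControlToVideo",
          "inputs": {"positive": ["4", 0], "vae": ["3", 0],
                     "width": 768, "height": 768, "length": 81}},
}

def dangling_links(graph):
    """Return (node_id, input_name) pairs whose [source_id, slot] link
    references a node id that is missing from the graph."""
    bad = []
    for nid, node in graph.items():
        for name, value in node["inputs"].items():
            if isinstance(value, list) and value[0] not in graph:
                bad.append((nid, name))
    return bad

print(dangling_links(workflow))  # → []
```

In a full export, KSamplerAdvanced, VAEDecode, and SaveVideo would continue the chain in the same link style; the validation helper is useful whenever you edit API-format JSON by hand.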
You can preprocess control videos directly in the graph: LoadVideo and GetVideoComponents ingest your footage, Canny extracts edges, and optional custom nodes (comfyui_controlnet_aux and ComfyUI-DepthAnythingV2) add pose and depth maps. CreateVideo sets resolution, frame count, and FPS to match Wan2.2's training regime (multi-resolution 512/768/1024, 81 frames at 16 FPS). The Start_image group lets you anchor the first frame with LoadImage for a consistent subject and look. An optional LoraLoaderModelOnly applies the Wan2.2 Lightning LoRA for faster renders at the cost of reduced motion dynamics. The result is a flexible, reproducible video-to-video pipeline with reliable structure preservation and style control.
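When matching a source clip to that training regime, it helps to compute the CreateVideo settings up front. The helper below is a hypothetical utility (not a ComfyUI node): it snaps the long side to the nearest 512/768/1024 bucket, rounds both sides to a multiple of 16 (a common latent-size constraint, assumed here), and caps the clip at 81 frames for 16 FPS playback.

```python
# Hypothetical helper for choosing CreateVideo settings that match
# Wan2.2's training regime (512/768/1024 buckets, 81 frames, 16 FPS).
def control_video_settings(src_w, src_h, src_frames):
    buckets = (512, 768, 1024)
    long_side = max(src_w, src_h)
    # Snap the long side to the nearest resolution bucket.
    target = min(buckets, key=lambda b: abs(b - long_side))
    scale = target / long_side
    # Round each side to a multiple of 16 (assumed latent constraint).
    width = int(round(src_w * scale / 16)) * 16
    height = int(round(src_h * scale / 16)) * 16
    # 81 frames at 16 FPS is about 5 seconds; trim longer sources.
    return {"width": width, "height": height,
            "fps": 16, "frame_count": min(src_frames, 81)}

print(control_video_settings(1920, 1080, 300))
# → {'width': 1024, 'height': 576, 'fps': 16, 'frame_count': 81}
```

For a 1080p source, the long side snaps to the 1024 bucket and the clip is trimmed to 81 frames; feeding these values into CreateVideo keeps the control stream aligned with what the model saw during training.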