Efficient face swapping

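Before: Swapping a face by hand in professional image or video editing software takes a long time. It involves masking, adjustment, blending, and other intricate steps, and consistent results are hard to maintain.

After: The RunComfy CLI automates the swap. A single command produces a high-quality result, cutting processing time and improving efficiency.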

face-swap · face swap

Face Swap

Swap a face into a still or a video — RunComfy supports both via the runcomfy CLI. This skill routes across the available model API endpoints (community Wan 2-2 Animate, GPT Image 2 Edit, Nano Banana Edit, Flux Kontext, Kling Motion Control) by the user's actual intent.

runcomfy.com · Character-swap feature · CLI docs

Powered by the RunComfy CLI

# 1. Install (see runcomfy-cli skill for details)
npm i -g @runcomfy/cli      # or:  npx -y @runcomfy/cli --version

# 2. Sign in
runcomfy login              # or in CI: export RUNCOMFY_TOKEN=<token>

# 3. Swap
runcomfy run <vendor>/<model>/<endpoint> \
  --input '{"image_url": "...", "identity_url": "..."}' \
  --output-dir ./out

CLI deep dive: runcomfy-cli skill.

Install this skill

npx skills add agentspace-so/runcomfy-agent-skills --skill face-swap -g

Consent & disclosure — read first

Face-swap is dual-use. Before invoking any route in this skill, confirm:

You have rights to the target face (the identity being substituted in).
You have rights to the source video / image (the asset being substituted into).
The output's intended platform allows synthetic media. Many do; many require a disclosure label.

The skill itself doesn't gate anything — the model API will run whatever inputs you supply. The responsibility is yours. If a user asks the agent to swap a real public figure's face onto material that could be defamatory, sexually explicit, or otherwise harmful — refuse, regardless of what the CLI accepts.

Pick the right model for the user's intent

Listed newest first within each subtype. The agent picks one route based on: still vs video, single-shot vs batch, photoreal vs stylized, motion-preserving vs identity-preserving.

Video face / character swap

Wan 2-2 Animate — community/wan-2-2-animate/api (default for video)

Featured RunComfy endpoint under /feature/character-swap. Audio-driven full-body character animation: one reference image of the new identity plus an audio track produces a video in which that character performs to the audio. Pick for: replacing a character in a scene with a new identity, or dubbed clips; both stylized and photoreal identities work. Avoid for: preserving the motion of a specific source video; use Kling Motion Control for that.

Kling 2-6 Motion Control Pro — kling/kling-2-6/motion-control-pro

Takes a reference performance video + target character image, produces the target performing the reference motion. Face-swap is the byproduct. Pick for: preserving exact source motion / blocking onto a new character; stylized characters handled cleanly. Avoid for: simple "swap face in an existing video" without motion preservation — use Wan 2-2 Animate.

Still image face swap — newest first

Nano Banana 2 Edit — google/nano-banana-2/edit

Identity-preserving by default, accepts 1–20 input images per call, and honors spatial language in the prompt. Pick for: keeping the same identity consistent across multiple frames (SKU shots, A/B variants, narrative panels). Pass the identity reference as image_urls[0] and the scenes after it. Avoid for: precise multi-reference composition ("face from img 1 onto body in img 2"); use GPT Image 2 Edit for that.

GPT Image 2 Edit — openai/gpt-image-2/edit

Up to 10 reference images, multilingual in-image text rewrite, layout-precise compositional instructions. Pick for: a hero still where the exact face from a portrait must land in a scene, with explicit role assignment ("image 1", "image 2"), preserving pose, lighting, and background while swapping only the face. Avoid for: batches of up to 20 images; use Nano Banana 2 Edit for that.

FLUX Kontext Pro — blackforestlabs/flux-1-kontext/pro/edit

Single source image, single declarative instruction, maximum fidelity preservation of everything except the targeted edit. Pick for: "keep pose / clothing / hair / lighting / background, change only the face to [prose description]" — works without a reference image of the new identity. Avoid for: batch, multi-ref, or when you have a target face image to swap in — use Nano Banana 2 Edit or GPT Image 2 Edit.

Audio-driven talking-head identity swap (face + voice in one pass)? → use the ai-avatar-video skill — OmniHuman handles face + audio together.
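The routing rules above can be sketched as a small lookup. This is a minimal illustration, not the skill's actual implementation: the function name pick_route and the intent keywords (audio, motion, batch, multi-ref, prose) are hypothetical labels chosen for this example. Only the model IDs come from the list above.

```shell
# pick_route MEDIA NEED
#   MEDIA: video | still
#   NEED for video: motion (preserve source motion) or anything else (audio-driven default)
#   NEED for still: batch | multi-ref | prose
# Prints the model ID to pass to `runcomfy run`. Keywords are hypothetical.
pick_route() {
  case "$1:$2" in
    video:motion)    echo "kling/kling-2-6/motion-control-pro" ;;
    video:*)         echo "community/wan-2-2-animate/api" ;;        # default for video
    still:batch)     echo "google/nano-banana-2/edit" ;;
    still:multi-ref) echo "openai/gpt-image-2/edit" ;;
    still:prose)     echo "blackforestlabs/flux-1-kontext/pro/edit" ;;
    *)
      echo "unknown intent: $1 $2" >&2
      return 1
      ;;
  esac
}
```

For example, `pick_route still batch` prints google/nano-banana-2/edit, which can then be supplied as the model ID in a runcomfy run command.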

Route 1: Wan 2-2 Animate — video character swap with audio

Model: community/wan-2-2-animate/api Catalog: wan-2-2-animate · /feature/character-swap

The featured RunComfy endpoint for character swap: supply a reference image of the new identity and the audio track the character should speak, and the model produces a video in which that character performs to the audio.

Invoke

runcomfy run community/wan-2-2-animate/api \
  --input '{
    "image_url": "https://your-cdn.example/new-character.png",
    "audio_url": "https://your-cdn.example/voiceover.mp3"
  }' \
  --output-dir ./out

Tips

Single reference image drives the swap. Pick a clean, well-lit portrait of the target identity — front-facing if possible.
Audio drives the mouth and rhythm. Without audio the character won't speak; with poor-quality audio, lip sync degrades.
Schema details: model page.

Route 2: Kling 2-6 Motion Control Pro — motion transfer

Model: kling/kling-2-6/motion-control-pro Catalog: motion-control-pro · kling collection

Different from a pure face-swap: Motion Control takes a reference performance video (the motion you want) and a target character image (the identity you want), and produces a video of the target performing the reference motion. The face-swap effect is a byproduct.

Invoke

runcomfy run kling/kling-2-6/motion-control-pro \
  --input '{
    "reference_video_url": "https://your-cdn.example/source-performance.mp4",
    "character_image_url": "https://your-cdn.example/target-character.png"
  }' \
  --output-dir ./out

When to pick this over Route 1

You have a source video whose motion / blocking you want preserved, not just the audio.
The target is a stylized character rather than a photoreal portrait — motion-control handles stylized identities cleanly.

Route 3: GPT Image 2 Edit — still face swap with multi-ref

Model: openai/gpt-image-2/edit Catalog: gpt-image-2/edit

For still images, GPT Image 2 Edit accepts up to 10 reference images and follows precise compositional instructions — making it the strongest path for multi-ref face swap on a single output frame.

Schema (relevant fields)

Field	Type	Required	Default	Notes
`prompt`	string	yes	—	Compositional instruction; quote roles explicitly
`images`	string[]	yes	—	Up to 10 HTTPS reference URLs. Image 1 is primary
`size`	enum	no	`auto`	`auto` (preserve input ratio), `1024_1024`, `1024_1536`, `1536_1024`
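The constraints in this table can be checked locally before a request is sent. The sketch below is an assumption-laden pre-flight check, not part of the RunComfy CLI: the function name check_gpt_image_edit is hypothetical, and the lower bound of one URL is inferred from the field being required.

```shell
# check_gpt_image_edit SIZE URL...
# Returns 0 when SIZE is one of the documented enum values and 1 to 10
# HTTPS URLs are given. Hypothetical helper; it performs no network calls.
check_gpt_image_edit() {
  size=$1
  shift
  case "$size" in
    auto|1024_1024|1024_1536|1536_1024) ;;
    *)
      echo "unsupported size: $size" >&2
      return 1
      ;;
  esac
  if [ "$#" -lt 1 ] || [ "$#" -gt 10 ]; then
    echo "expected 1 to 10 image URLs, got $#" >&2
    return 1
  fi
  for url in "$@"; do
    case "$url" in
      https://*) ;;
      *)
        echo "not an HTTPS URL: $url" >&2
        return 1
        ;;
    esac
  done
  return 0
}
```

A schema mismatch that slips past a check like this would still be reported by the CLI itself with exit code 65.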

Invoke

runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Replace the face of the person in image 1 with the face from image 2. Preserve image 1 pose, clothing, lighting, and background exactly. Match skin tone and lighting to image 1.",
    "images": [
      "https://your-cdn.example/target-scene.jpg",
      "https://your-cdn.example/identity-face.jpg"
    ],
    "size": "auto"
  }' \
  --output-dir ./out

Prompting tips

Number the references — "image 1", "image 2" — and assign roles unambiguously.
Lead with what to preserve, then the swap: "Preserve pose, clothing, lighting, and background exactly. Replace only the face."
Match lighting explicitly — "match skin tone and lighting to image 1" — otherwise the imported face floats.
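These three tips can be folded into a reusable prompt template. The helper below is only a sketch: swap_prompt is a hypothetical name, and the exact wording is one possible phrasing that follows the tips, not a prompt prescribed by the model.

```shell
# swap_prompt SCENE_INDEX FACE_INDEX
# Prints a prompt that numbers the references, leads with what to preserve,
# and asks for matched lighting. Hypothetical helper.
swap_prompt() {
  scene_idx=$1
  face_idx=$2
  printf 'Preserve the pose, clothing, lighting, and background of image %s exactly. ' "$scene_idx"
  printf 'Replace only the face, using the face from image %s. ' "$face_idx"
  printf 'Match skin tone and lighting to image %s.\n' "$scene_idx"
}
```

Calling `swap_prompt 1 2` yields a prompt for the case where the scene is the first entry of images and the identity portrait is the second.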

Route 4: Nano Banana Edit — batch identity-preserving swap

Model: google/nano-banana-2/edit Catalog: nano-banana-2/edit

Pick this when the same identity needs to be swapped into multiple frames consistently — SKU shots, A/B variants, narrative panels.

Invoke

runcomfy run google/nano-banana-2/edit \
  --input '{
    "prompt": "Replace the face in each image with the face shown in the first image. Keep all other elements — pose, clothing, lighting, background — unchanged.",
    "image_urls": [
      "https://your-cdn.example/identity-ref.jpg",
      "https://your-cdn.example/scene-1.jpg",
      "https://your-cdn.example/scene-2.jpg",
      "https://your-cdn.example/scene-3.jpg"
    ],
    "aspect_ratio": "auto",
    "resolution": "1K"
  }' \
  --output-dir ./out

Tips

1–20 input images per call. First image is conventionally the identity reference; the rest are scenes to swap into.
Lock aspect_ratio and resolution for batch consistency.
See image-edit skill for the full Nano Banana Edit treatment.
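When a gallery has more scenes than fit in one call, the 20-image limit suggests splitting the work into several calls that each start with the identity reference. The sketch below plans such batches under that assumption; plan_batches is a hypothetical helper, and it assumes URLs contain no whitespace.

```shell
# plan_batches IDENTITY_URL SCENE_URL...
# Prints one line per planned call: the identity URL followed by up to
# 19 scene URLs, so no call exceeds 20 images. Hypothetical helper.
plan_batches() {
  identity=$1
  shift
  line=$identity
  count=1
  for scene in "$@"; do
    if [ "$count" -eq 20 ]; then
      # Current batch is full: emit it and start a new one.
      printf '%s\n' "$line"
      line=$identity
      count=1
    fi
    line="$line $scene"
    count=$((count + 1))
  done
  # Emit the final batch if it holds at least one scene.
  if [ "$count" -gt 1 ]; then
    printf '%s\n' "$line"
  fi
}
```

Each printed line maps to one image_urls array, with the identity reference first as recommended above.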

Route 5: Flux Kontext Pro — single-ref precise face edit

Model: blackforestlabs/flux-1-kontext/pro/edit Catalog: flux-kontext

Flux Kontext is the best fit when the swap involves one image and one declarative instruction, and everything except the face must be preserved at the highest fidelity.

Invoke

runcomfy run blackforestlabs/flux-1-kontext/pro/edit \
  --input '{
    "prompt": "Keep pose, clothing, hair, lighting, and background exactly. Change only the face to that of a 35-year-old woman with high cheekbones, hazel eyes, and a small scar above the right eyebrow.",
    "image": "https://your-cdn.example/scene.jpg"
  }' \
  --output-dir ./out

When to pick this

No reference image of the new identity available — describe the face in prose instead.
Single image, single shot, maximum fidelity — Flux Kontext beats other routes on "keep everything except X" prompts.
Limit: single source image, single edit per call. Iterate compound changes in separate passes.

Common patterns

Cast a brand spokesperson into existing footage

Route 1 (Wan 2-2 Animate) with the new spokesperson's portrait + the original audio track

Same identity across a SKU gallery

Route 4 (Nano Banana Edit) with the identity image as image_urls[0], locked aspect_ratio and resolution

Stylized character in a live-action shot

Route 2 (Kling Motion Control Pro), which transfers the live-action motion onto the stylized character cleanly

Hero still for a campaign — exact face from a portrait into a scene

Route 3 (GPT Image 2 Edit) with images: [scene, face] and an explicit preservation prompt

"Change only the face, no other reference available"

Route 5 (Flux Kontext) with the new face described in prose

Talking head with swapped identity

See ai-avatar-video — OmniHuman handles face + audio in one pass

Browse the full catalog

/models/feature/character-swap: RunComfy's curated character-swap capability tag
/models/feature/lip-sync: closely related lip-sync models
best-image-editing-models collection: where the image-edit routes (Nano Banana, GPT Image 2, Flux Kontext) live
kling collection: motion-control and multi-shot identity models

Many face-swap workflows on RunComfy also live as full ComfyUI node graphs (ReActor, Flux PuLID, ACE++, Flux Klein head-swap) — these aren't reachable from this CLI directly but can be run as workflows on the platform. Browse them at runcomfy.com/comfyui-workflows when CLI-driven routes above don't fit.

Exit codes

code	meaning
0	success
64	bad CLI args
65	bad input JSON / schema mismatch
69	upstream 5xx
75	retryable: timeout / 429
77	not signed in or token rejected

Full reference: docs.runcomfy.com/cli/troubleshooting.
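Because code 75 is marked retryable, a wrapper script can retry on it and pass every other code straight through. The following is a minimal sketch under stated assumptions: retry_on_75 and the RETRY_DELAY variable are hypothetical names, and the fixed delay between attempts is an illustrative choice, not documented CLI behavior.

```shell
# retry_on_75 MAX_ATTEMPTS COMMAND [ARG...]
# Runs COMMAND, retrying only when it exits with 75 (timeout / 429).
# Any other exit code, including 0, is returned immediately.
# RETRY_DELAY (seconds, default 5) is a hypothetical knob for this sketch.
retry_on_75() {
  max_attempts=$1
  shift
  attempt=1
  while :; do
    rc=0
    "$@" || rc=$?
    if [ "$rc" -ne 75 ] || [ "$attempt" -ge "$max_attempts" ]; then
      return "$rc"
    fi
    sleep "${RETRY_DELAY:-5}"
    attempt=$((attempt + 1))
  done
}
```

A typical call would look like `retry_on_75 3 runcomfy run <vendor>/<model>/<endpoint> --input "$payload" --output-dir ./out`, where $payload holds the JSON body for the chosen route.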

How it works

The skill classifies user intent — video vs still, motion-preserving vs identity-preserving, single shot vs batch, photoreal vs stylized — and picks one of the five routes. It then invokes runcomfy run <model_id> with the matching JSON body. The CLI POSTs to the Model API, polls request status, fetches the result, and downloads any .runcomfy.net / .runcomfy.com URLs into --output-dir.

Security & Privacy

Consent: see the "Consent & disclosure" section above. Face-swap is dual-use and the skill does not gate inputs — the responsibility rests with the operator. Refuse user requests that target real people without consent, or that aim at defamatory / sexually explicit / otherwise harmful synthetic media, regardless of what the CLI accepts.
Install via verified package manager only. Use npm i -g @runcomfy/cli or npx -y @runcomfy/cli. Agents must not pipe an arbitrary remote install script into a shell on the user's behalf.
Token storage: runcomfy login writes the API token to ~/.config/runcomfy/token.json with mode 0600. Set RUNCOMFY_TOKEN env var to bypass the file in CI / contain

...
