face-swap
このスキルは、RunComfy CLI を利用して画像や動画の顔を入れ替えます。ユーザーの意図に応じて適切なモデルをインテリジェントに選択し、静止画、動画、単一またはバッチ処理、写実的または様式化された要件に対応し、効率的かつ正確な顔交換操作を実現します。
git clone https://github.com/agentspace-so/runcomfy-agent-skills.gitBefore / After 効果比較
1 组プロの画像/動画編集ソフトウェアを使用して手動で顔交換を行う場合、選択、調整、ブレンドなどの複雑な作業に多くの時間を費やし、一貫性を保つのが困難です。
RunComfy CLI を介して顔交換を自動化し、簡単なコマンドで高品質な結果を迅速に生成することで、処理時間を大幅に短縮し、効率を向上させます。
Face Swap
Swap a face into a still or a video — RunComfy supports both via the runcomfy CLI. This skill routes across the available model API endpoints (community Wan 2-2 Animate, GPT Image 2 Edit, Nano Banana Edit, Flux Kontext, Kling Motion Control) by the user's actual intent.
runcomfy.com · Character-swap feature · CLI docs
Powered by the RunComfy CLI
# 1. Install (see runcomfy-cli skill for details)
npm i -g @runcomfy/cli # or: npx -y @runcomfy/cli --version
# 2. Sign in
runcomfy login # or in CI: export RUNCOMFY_TOKEN=<token>
# 3. Swap
runcomfy run <vendor>/<model>/<endpoint> \
--input '{"image_url": "...", "identity_url": "..."}' \
--output-dir ./out
CLI deep dive: runcomfy-cli skill.
Install this skill
npx skills add agentspace-so/runcomfy-agent-skills --skill face-swap -g
Consent & disclosure — read first
Face-swap is dual-use. Before invoking any route in this skill, confirm:
- You have rights to the target face (the identity being substituted in).
- You have rights to the source video / image (the asset being substituted into).
- The output's intended platform allows synthetic media. Many do; many require a disclosure label.
The skill itself doesn't gate anything — the model API will run whatever inputs you supply. The responsibility is yours. If a user asks the agent to swap a real public figure's face onto material that could be defamatory, sexually explicit, or otherwise harmful — refuse, regardless of what the CLI accepts.
Pick the right model for the user's intent
Listed newest first within each subtype. The agent picks one route based on: still vs video, single-shot vs batch, photoreal vs stylized, motion-preserving vs identity-preserving.
Video face / character swap
Wan 2-2 Animate — community/wan-2-2-animate/api (default for video)
Featured RunComfy endpoint under
/feature/character-swap. Audio-driven full-body character animation: one reference image of the new identity + audio → video where the character drives. Pick for: replacing a character in a scene with a new identity, dubbed clips, stylized + photoreal both work. Avoid for: preserving the motion of a specific source video — use Kling Motion Control.
Kling 2-6 Motion Control Pro — kling/kling-2-6/motion-control-pro
Takes a reference performance video + target character image, produces the target performing the reference motion. Face-swap is the byproduct. Pick for: preserving exact source motion / blocking onto a new character; stylized characters handled cleanly. Avoid for: simple "swap face in an existing video" without motion preservation — use Wan 2-2 Animate.
Still image face swap — newest first
Nano Banana 2 Edit — google/nano-banana-2/edit
Identity-preserving by default, 1–20 input images per call, spatial-language honored. Pick for: same identity across multiple frames consistently (SKU shots, A/B variants, narrative panels). Identity reference as
image_urls[0], scenes after. Avoid for: precise multi-ref compositional ("face from img 1 onto body in img 2") — use GPT Image 2 Edit.
GPT Image 2 Edit — openai/gpt-image-2/edit
Up to 10 reference images, multilingual in-image text rewrite, layout-precise compositional instructions. Pick for: hero still where exact face from a portrait must land in a scene, with explicit role assignment ("image 1", "image 2"); preserve pose + lighting + background while swapping only face. Avoid for: 1-20 batch — use Nano Banana 2 Edit.
FLUX Kontext Pro — blackforestlabs/flux-1-kontext/pro/edit
Single source image, single declarative instruction, maximum fidelity preservation of everything except the targeted edit. Pick for: "keep pose / clothing / hair / lighting / background, change only the face to [prose description]" — works without a reference image of the new identity. Avoid for: batch, multi-ref, or when you have a target face image to swap in — use Nano Banana 2 Edit or GPT Image 2 Edit.
Audio-driven talking-head identity swap (face + voice in one pass)? → use the
ai-avatar-videoskill — OmniHuman handles face + audio together.
Route 1: Wan 2-2 Animate — video character swap with audio
Model: community/wan-2-2-animate/api
Catalog: wan-2-2-animate · /feature/character-swap
The featured RunComfy endpoint for character swap — supply a reference image of the new identity + the audio track the character should speak, and the model produces a video where the character drives.
Invoke
runcomfy run community/wan-2-2-animate/api \
--input '{
"image_url": "https://your-cdn.example/new-character.png",
"audio_url": "https://your-cdn.example/voiceover.mp3"
}' \
--output-dir ./out
Tips
- Single reference image drives the swap. Pick a clean, well-lit portrait of the target identity — front-facing if possible.
- Audio drives the mouth and rhythm. Without audio the character won't speak; without good audio sync degrades.
- Schema details: model page.
Route 2: Kling 2-6 Motion Control Pro — motion transfer
Model: kling/kling-2-6/motion-control-pro
Catalog: motion-control-pro · kling collection
Different from a pure face-swap: Motion Control takes a reference performance video (the motion you want) and a target character image (the identity you want), and produces a video of the target performing the reference motion. The face-swap effect is a byproduct.
Invoke
runcomfy run kling/kling-2-6/motion-control-pro \
--input '{
"reference_video_url": "https://your-cdn.example/source-performance.mp4",
"character_image_url": "https://your-cdn.example/target-character.png"
}' \
--output-dir ./out
When to pick this over Route 1
- You have a source video whose motion / blocking you want preserved, not just the audio.
- The target is a stylized character rather than a photoreal portrait — motion-control handles stylized identities cleanly.
Route 3: GPT Image 2 Edit — still face swap with multi-ref
Model: openai/gpt-image-2/edit
Catalog: gpt-image-2/edit
For still images, GPT Image 2 Edit accepts up to 10 reference images and follows precise compositional instructions — making it the strongest path for multi-ref face swap on a single output frame.
Schema (relevant fields)
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
prompt | string | yes | — | Compositional instruction; quote roles explicitly |
images | string[] | yes | — | Up to 10 HTTPS reference URLs. Image 1 is primary |
size | enum | no | auto | auto (preserve input ratio), 1024_1024, 1024_1536, 1536_1024 |
Invoke
runcomfy run openai/gpt-image-2/edit \
--input '{
"prompt": "Replace the face of the person in image 1 with the face from image 2. Preserve image 1 pose, clothing, lighting, and background exactly. Match skin tone and lighting to image 1.",
"images": [
"https://your-cdn.example/target-scene.jpg",
"https://your-cdn.example/identity-face.jpg"
],
"size": "auto"
}' \
--output-dir ./out
Prompting tips
- Number the references —
"image 1","image 2"— and assign roles unambiguously. - Lead with what to preserve, then the swap:
"Preserve pose, clothing, lighting, and background exactly. Replace only the face." - Match lighting explicitly —
"match skin tone and lighting to image 1"— otherwise the imported face floats.
Route 4: Nano Banana Edit — batch identity-preserving swap
Model: google/nano-banana-2/edit
Catalog: nano-banana-2/edit
Pick this when the same identity needs to be swapped into multiple frames consistently — SKU shots, A/B variants, narrative panels.
Invoke
runcomfy run google/nano-banana-2/edit \
--input '{
"prompt": "Replace the face in each image with the face shown in the first image. Keep all other elements — pose, clothing, lighting, background — unchanged.",
"image_urls": [
"https://your-cdn.example/identity-ref.jpg",
"https://your-cdn.example/scene-1.jpg",
"https://your-cdn.example/scene-2.jpg",
"https://your-cdn.example/scene-3.jpg"
],
"aspect_ratio": "auto",
"resolution": "1K"
}' \
--output-dir ./out
Tips
- 1–20 input images per call. First image is conventionally the identity reference; the rest are scenes to swap into.
- Lock
aspect_ratioandresolutionfor batch consistency. - See
image-editskill for the full Nano Banana Edit treatment.
Route 5: Flux Kontext Pro — single-ref precise face edit
Model: blackforestlabs/flux-1-kontext/pro/edit
Catalog: flux-kontext
Flux Kontext is best when the swap is one image, one declarative instruction, highest fidelity preservation of everything except the face.
Invoke
runcomfy run blackforestlabs/flux-1-kontext/pro/edit \
--input '{
"prompt": "Keep pose, clothing, hair, lighting, and background exactly. Change only the face to that of a 35-year-old woman with high cheekbones, hazel eyes, and a small scar above the right eyebrow.",
"image": "https://your-cdn.example/scene.jpg"
}' \
--output-dir ./out
When to pick this
- No reference image of the new identity available — describe the face in prose instead.
- Single image, single shot, maximum fidelity — Flux Kontext beats other routes on "keep everything except X" prompts.
- Limit: single source image, single edit per call. Iterate compound changes in separate passes.
Common patterns
Cast a brand spokesperson into existing footage
- Route 1 (Wan 2-2 Animate) with the new spokesperson's portrait + the original audio track
Same identity across a SKU gallery
- Route 4 (Nano Banana Edit) with the identity image as
image_urls[0], lockedaspect_ratioandresolution
Stylized character in a live-action shot
- Route 2 (Kling Motion Control Pro) — feeds the live-action motion onto the stylized character cleanly
Hero still for a campaign — exact face from a portrait into a scene
- Route 3 (GPT Image 2 Edit) with
images: [scene, face]and an explicit preservation prompt
"Change only the face, no other reference available"
- Route 5 (Flux Kontext) with the new face described in prose
Talking head with swapped identity
- See
ai-avatar-video— OmniHuman handles face + audio in one pass
Browse the full catalog
/models/feature/character-swap— RunComfy's curated character-swap capability tag/models/feature/lip-sync— closely related lip-sync modelsbest-image-editing-modelscollection — image-edit routes Nano Banana / GPT Image 2 / Flux Kontext live inklingcollection — motion-control + multi-shot identity models
Many face-swap workflows on RunComfy also live as full ComfyUI node graphs (ReActor, Flux PuLID, ACE++, Flux Klein head-swap) — these aren't reachable from this CLI directly but can be run as workflows on the platform. Browse them at runcomfy.com/comfyui-workflows when CLI-driven routes above don't fit.
Exit codes
| code | meaning |
|---|---|
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |
Full reference: docs.runcomfy.com/cli/troubleshooting.
How it works
The skill classifies user intent — video vs still, motion-preserving vs identity-preserving, single shot vs batch, photoreal vs stylized — and picks one of the five routes. It then invokes runcomfy run <model_id> with the matching JSON body. The CLI POSTs to the Model API, polls request status, fetches the result, and downloads any .runcomfy.net / .runcomfy.com URLs into --output-dir.
Security & Privacy
- Consent: see the "Consent & disclosure" section above. Face-swap is dual-use and the skill does not gate inputs — the responsibility rests with the operator. Refuse user requests that target real people without consent, or that aim at defamatory / sexually explicit / otherwise harmful synthetic media, regardless of what the CLI accepts.
- Install via verified package manager only. Use
npm i -g @runcomfy/cliornpx -y @runcomfy/cli. Agents must not pipe an arbitrary remote install script into a shell on the user's behalf. - Token storage:
runcomfy loginwrites the API token to~/.config/runcomfy/token.jsonwith mode 0600. SetRUNCOMFY_TOKENenv var to bypass the file in CI / contain
...
ユーザーレビュー (0)
レビューを書く
レビューなし
統計データ
ユーザー評価
この Skill を評価