Video content creation efficiency

Name: seedance-v2 AI Agent Skill
Availability: InStock
Rating: 4.8 (6 reviews)
Author: agentspace-so

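Before: Traditional video production involves scriptwriting, shooting, editing, visual effects and other stages; a professional 30-second video takes a team 2-3 days to complete.

After: Enter a text description to generate high-quality video directly, with scenes, motion and effects handled automatically; a professional 30-second video is finished in 5 minutes, with support for batch generation and style customization.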

seedance-v2 · video generation

seedance-v2

Seedance 2.0 Pro — Pro Pack on RunComfy

runcomfy.com · Seedance 2.0 Pro · GitHub

ByteDance Seedance 2.0 Pro — multimodal cinematic video generator with native lip-synced audio — hosted on the RunComfy Model API.

npx skills add agentspace-so/runcomfy-skills --skill seedance-v2 -g

When to pick this model (vs siblings)

Seedance 2.0 Pro's distinct strength is multi-modal cinematic short-form: combine character images + scene videos + reference audio into one coherent shot. Pick it when fidelity to a reference identity / scene matters and you want native lip-sync.

| You want | Use |
| --- | --- |
| Lip-synced spokesperson / dialogue ad | Seedance 2.0 Pro |
| Multi-modal references (image + video + audio) | Seedance 2.0 Pro |
| Brand-consistent multi-language narrative | Seedance 2.0 Pro |
| Currently-#1 blind-vote video quality | HappyHorse 1.0 |
| Audio-driven lip-sync from your own track | Wan 2.7 (audio_url) |
| Motion editing on existing footage | Kling Video O1 |
| Ultra-fast iteration | LTX 2 |

If the user said "Seedance" / "Seedance 2" / "ByteDance video" explicitly, route here regardless.

Prerequisites

RunComfy CLI — npm i -g @runcomfy/cli
RunComfy account — runcomfy login opens a browser device-code flow.
CI / containers — set RUNCOMFY_TOKEN=<token> instead of runcomfy login.
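
The prerequisites above can be checked before a run with a small preflight. This is a minimal sketch, not part of the skill: the `preflight` helper is hypothetical, the token path is the one quoted in the Security & Privacy section, and treating "token file exists" as "signed in" is an assumption.

```python
import os
import shutil
from pathlib import Path

# Path quoted in the "Security & Privacy" section of this page.
TOKEN_FILE = Path.home() / ".config" / "runcomfy" / "token.json"


def preflight(env=None, which=shutil.which, token_file=TOKEN_FILE):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    # The CLI binary must be on PATH (installed via npm i -g @runcomfy/cli).
    if which("runcomfy") is None:
        problems.append("runcomfy CLI not on PATH (npm i -g @runcomfy/cli)")
    # Either the env token (CI / containers) or a saved login is needed.
    # Assumption: an existing token file means `runcomfy login` succeeded.
    if not env.get("RUNCOMFY_TOKEN") and not Path(token_file).exists():
        problems.append("no credentials: run `runcomfy login` or set RUNCOMFY_TOKEN")
    return problems
```

Calling `preflight()` with no arguments inspects the real environment; the `env`, `which` and `token_file` parameters exist only so the check can be exercised without touching the machine.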

Endpoints + input schema

`bytedance/seedance-v2/pro`

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | string | yes | — | CN ≤ 500 chars OR EN ≤ 1000 words. |
| image_url | array | no | [] | 0–9 references (JPEG/PNG/WebP/BMP/TIFF/GIF). |
| video_url | array | no | [] | 0–3 clips (MP4/MOV), 2–15s each. |
| audio_url | array | no | [] | 0–3 audio refs (WAV/MP3), 2–15s, < 15MB each. |
| aspect_ratio | enum | no | adaptive | adaptive, 16:9, 9:16, 4:3, 3:4, 1:1, 21:9. |
| duration | int | no | 5 | 4–15 (whole seconds). |
| resolution | enum | no | 720p | 480p or 720p. |
| generate_audio | bool | no | true | In-pass synchronized speech / SFX / music. |
| seed | int | no | — | Reproducibility. |
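
To catch schema mismatches before a request is sent, the table above can be mirrored in a local check. This is a sketch under stated assumptions: `validate_input` is a hypothetical helper, not something the CLI or API provides, and it does not check the prompt length limits or the media file formats.

```python
# Values copied from the input schema table.
ASPECT_RATIOS = ("adaptive", "16:9", "9:16", "4:3", "3:4", "1:1", "21:9")
RESOLUTIONS = ("480p", "720p")
MAX_REFS = {"image_url": 9, "video_url": 3, "audio_url": 3}


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_input(payload):
    """Return a list of schema violations for an input dict (empty if none)."""
    errors = []
    prompt = payload.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        errors.append("prompt: required non-empty string")
    for field, limit in MAX_REFS.items():
        refs = payload.get(field, [])
        if not isinstance(refs, list):
            errors.append(f"{field}: must be an array")
        elif len(refs) > limit:
            errors.append(f"{field}: at most {limit} entries, got {len(refs)}")
    duration = payload.get("duration", 5)
    if not _is_int(duration) or not 4 <= duration <= 15:
        errors.append("duration: whole seconds in 4-15")
    if payload.get("aspect_ratio", "adaptive") not in ASPECT_RATIOS:
        errors.append("aspect_ratio: unsupported value")
    if payload.get("resolution", "720p") not in RESOLUTIONS:
        errors.append("resolution: must be 480p or 720p")
    if not isinstance(payload.get("generate_audio", True), bool):
        errors.append("generate_audio: must be a boolean")
    seed = payload.get("seed")
    if seed is not None and not _is_int(seed):
        errors.append("seed: must be an integer")
    return errors
```

For example, `validate_input({"prompt": "x", "duration": 20})` reports only the duration, since every other field falls back to its default.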

How to invoke

Default (text only, 5s, 720p with audio):

runcomfy run bytedance/seedance-v2/pro \
  --input '{"prompt": "<user prompt>"}' \
  --output-dir <absolute/path>

Lip-synced ad with character reference (image-stable, text-evolves):

runcomfy run bytedance/seedance-v2/pro \
  --input '{
    "prompt": "Medium close-up. The woman explains today'\''s special in a warm friendly tone, slow push-in, soft window light, gentle cafe ambience.",
    "image_url": ["https://.../barista-headshot.jpg"],
    "duration": 8,
    "aspect_ratio": "9:16"
  }' \
  --output-dir <absolute/path>

Multi-modal (image + video + audio refs):

runcomfy run bytedance/seedance-v2/pro \
  --input '{
    "prompt": "Subject from image 1 walks through the café from video 1, voice tone matches audio 1.",
    "image_url": ["https://.../subject.jpg"],
    "video_url": ["https://.../cafe-locked-shot.mp4"],
    "audio_url": ["https://.../voice-ref.mp3"]
  }' \
  --output-dir <absolute/path>

The CLI submits the request, polls it, fetches the result, and downloads any *.runcomfy.net / *.runcomfy.com URLs into --output-dir.
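
When the command is driven from a script rather than typed into a shell, the quoting used in the examples above (such as `'\''` for an apostrophe) can be avoided by passing an argument list. A minimal sketch, assuming only the `runcomfy run`, `--input` and `--output-dir` usage shown on this page; `build_command` and `run` are hypothetical helper names.

```python
import json
import subprocess

MODEL = "bytedance/seedance-v2/pro"


def build_command(payload, output_dir):
    """Build the argv list for `runcomfy run`; no shell quoting is involved."""
    return [
        "runcomfy", "run", MODEL,
        # json.dumps escapes quotes and newlines inside the prompt.
        "--input", json.dumps(payload, ensure_ascii=False),
        "--output-dir", str(output_dir),
    ]


def run(payload, output_dir):
    """Invoke the CLI without a shell and return its exit code."""
    return subprocess.run(build_command(payload, output_dir)).returncode
```

Because the list is handed to the process directly, a prompt such as `today's special` needs no escaping.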

Prompting — what actually works

Image vs text division. This is the single most important rule. Stable identity (face, costume, brand mark, logo) → put in image_url. Evolving narrative (action, mood, lighting, camera) → put in prompt. Trying to verbally describe a face in detail wastes tokens and produces drift.

Camera + motion in plain language. "Medium close-up", "slow push-in", "handheld follow", "locked-off wide" all work as directives. Combine: "Medium close-up. Slow push-in over 3 seconds. Handheld, slight breathing motion."

Audio direction with generate_audio: true — say the tone: "warm friendly conversational", "calm instructional", "crisp newsroom delivery". For ambient: "gentle cafe chatter, distant traffic, no foreground music".

Reference media specs — videos must be 2–15s; audio must be < 15MB and 2–15s. Out-of-range files are rejected. Match the aspect ratio of your refs to the output to avoid crops.
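
The reference media limits can be checked locally before upload. This is a sketch only: `check_reference` is a hypothetical helper, the caller is expected to measure duration and size itself, and reading "15MB" as 15,000,000 bytes with a strict upper bound (as in the schema table) is an assumption.

```python
MAX_AUDIO_BYTES = 15 * 1000 * 1000  # assumption: "15MB" read as decimal megabytes


def check_reference(kind, duration_s, size_bytes=None):
    """Return None if a reference clip is within the stated limits, else a reason.

    kind is "video" or "audio"; duration_s and size_bytes are supplied by the
    caller, which measures them however it prefers.
    """
    if kind not in ("video", "audio"):
        return f"unknown reference kind: {kind}"
    # Both video and audio references must run 2-15 seconds.
    if not 2 <= duration_s <= 15:
        return f"{kind} duration {duration_s}s is outside 2-15s"
    # Only audio has a stated size limit.
    if kind == "audio" and size_bytes is not None and size_bytes >= MAX_AUDIO_BYTES:
        return "audio file must be under 15MB"
    return None
```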

Anti-patterns:

Mixing radically different aesthetic refs (watercolor + photoreal) → confuses the model.
Conflicting style cues in prompt → simplify by removing contradictions.
Trying to describe stable identity verbally → use image_url instead.
Asking for >15s clips → 422; segment into multiple calls.
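
For the last anti-pattern, one way to plan the segmentation is to split the target length into the fewest calls that each stay within the 4–15s range. A minimal sketch; `split_duration` is a hypothetical helper, and since each segment is an independent generation, continuity between segments is not guaranteed.

```python
import math

MIN_S, MAX_S = 4, 15  # per-call duration range from the input schema


def split_duration(total_s):
    """Split a whole-second duration into the fewest segments of 4-15s each."""
    if isinstance(total_s, bool) or not isinstance(total_s, int) or total_s < MIN_S:
        raise ValueError(f"need a whole number of seconds >= {MIN_S}")
    count = math.ceil(total_s / MAX_S)
    base, extra = divmod(total_s, count)
    # The first `extra` segments get one more second so lengths stay balanced,
    # which also keeps a short remainder from falling below the 4s minimum.
    return [base + 1] * extra + [base] * (count - extra)
```

Balancing matters for totals such as 17s: a naive 15 + 2 split would produce a 2s call that the endpoint rejects, whereas this yields 9 + 8.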

Where it shines

| Use case | Why Seedance 2.0 Pro |
| --- | --- |
| Spokesperson / dialogue ads | Native in-pass lip-sync, no separate TTS step |
| Brand-consistent multi-language narratives | Image refs hold identity; text drives translation |
| Cinematic short-form film previs | Camera-shot grammar + multi-modal refs |
| Ad creatives with reference music / VO tone | Audio refs guide voice / mood without locking lip-sync |
| Reproducible variant testing | Seed control + fixed schema |

Sample prompts (verified to produce strong results)

Default playground example:

Golden hour on a quiet cafe terrace: a barista wipes the counter, then
looks up and explains today's special in a friendly tone, natural
lip-sync. Medium close-up, slow push-in; warm side light, soft bokeh
through glass, gentle cafe ambience and subtle film grain.

Multi-modal lip-sync (text + image):

Same person as image 1 in a softly-lit recording booth, leaning into
the mic, says: "We just shipped the biggest update of the year."
Calm conversational tone. Medium close-up, locked tripod, shallow DOF,
warm key light from camera-left.

Limitations

Duration 4–15s — no longer clips on this endpoint.
Resolution ceiling 720p on the playground variant.
Reference media specs — videos / audio must be 2–15s; audio < 15MB.
Lip-sync quality — depends on prompt clarity; not guaranteed perfect under all conditions.
No @-syntax for character binding — relies on image refs + prompt alignment.

Exit codes

| code | meaning |
| --- | --- |
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |

Full reference: docs.runcomfy.com/cli/troubleshooting.
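
A caller can act on these codes by retrying only the one marked retryable. The sketch below is an illustration, not the CLI's own behavior: `run_with_retry` is a hypothetical wrapper, the attempt count and backoff delays are arbitrary choices, and `invoke` stands for any function that runs the CLI and returns its exit code.

```python
import time

# Copied from the exit code table.
EXIT_MEANINGS = {
    0: "success",
    64: "bad CLI args",
    65: "bad input JSON / schema mismatch",
    69: "upstream 5xx",
    75: "retryable: timeout / 429",
    77: "not signed in or token rejected",
}
RETRYABLE = {75}  # only the code the table labels as retryable


def run_with_retry(invoke, max_attempts=4, base_delay_s=2.0, sleep=time.sleep):
    """Call invoke() until it returns a non-retryable exit code.

    invoke is a zero-argument callable returning the CLI exit code.
    Returns (exit_code, meaning) for the final attempt.
    """
    code = None
    for attempt in range(max_attempts):
        code = invoke()
        if code not in RETRYABLE:
            break
        if attempt < max_attempts - 1:
            # Exponential backoff: base, 2x base, 4x base, ...
            sleep(base_delay_s * 2 ** attempt)
    return code, EXIT_MEANINGS.get(code, "unknown exit code")
```

Codes 64, 65 and 77 are returned immediately, since repeating the same arguments, input or credentials would fail the same way.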

How it works

The skill invokes runcomfy run bytedance/seedance-v2/pro with a JSON body matching the schema. The CLI POSTs to https://model-api.runcomfy.net/v1/models/bytedance/seedance-v2/pro, polls the request, fetches the result, and downloads any .runcomfy.net/.runcomfy.com URL into --output-dir. Ctrl-C cancels the remote request before exit.

Security & Privacy

Token storage: runcomfy login writes the API token to ~/.config/runcomfy/token.json with mode 0600 (owner-only read/write). Set RUNCOMFY_TOKEN env var to bypass the file entirely in CI / containers.
Input boundary: the user prompt is passed as a JSON string to the CLI via --input. The CLI does NOT shell-expand the prompt; it transmits the JSON body directly to the Model API over HTTPS. No shell injection surface from prompt content.
Third-party content: image / video / audio URLs you pass are fetched by the RunComfy model server, not by the CLI on your machine. Treat external URLs as untrusted; image-based prompt injection is a known risk for any image-edit / video-edit model.
Outbound endpoints: only model-api.runcomfy.net (request submission) and *.runcomfy.net / *.runcomfy.com (download whitelist for generated outputs). No telemetry, no callbacks.
Generated-file size cap: the CLI aborts any single download > 2 GiB to prevent disk-fill from a malicious or runaway model output.
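
Two of the properties above can be verified independently from a script: that an output URL falls under the download whitelist, and that the token file is owner-only. This is a sketch of equivalent checks, not the CLI's implementation; both function names are hypothetical, requiring HTTPS for downloads is an assumption, and the permission check assumes a POSIX file system.

```python
import os
import stat
from urllib.parse import urlsplit

# Wildcard domains listed under "Outbound endpoints".
ALLOWED_SUFFIXES = (".runcomfy.net", ".runcomfy.com")


def is_allowed_download(url):
    """True if url is HTTPS and its host is a subdomain of an allowed domain."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Matching on the leading dot rejects look-alikes such as evil-runcomfy.net.
    return parts.scheme == "https" and host.endswith(ALLOWED_SUFFIXES)


def token_file_is_private(path):
    """True if the file grants no permissions to group or others (e.g. 0600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```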

Weekly Installs: 10.5K
Repository: agentspace-so/r…t-skills
First Seen: Today
Security Audits: Gen Agent Trust Hub: Pass, Socket: Pass, Snyk: Warn
