
fal-ai-media

by @affaan-m, v1.0.0
Rating: 4.6 (14 ratings)

Generate images, video, and audio with fal.ai models, covering text-to-image, video generation, and audio creation.

Tags: image-generation, video-generation, text-to-speech, generative-ai, media-creation

Installation
npx skills add affaan-m/everything-claude-code --skill fal-ai-media

Before / After Comparison

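Before

Images, video, and audio are generated by hand in separate tools, so formats are inconsistent and a uniform style is hard to maintain.

After

A single interface generates all media types, styles are matched automatically, and multimedia projects come together quickly.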

SKILL.md

fal-ai-media

fal.ai Media Generation

Generate images, videos, and audio using fal.ai models via MCP.

When to Activate

  • User wants to generate images from text prompts

  • User wants to create videos from text or images

  • User wants to generate speech, music, or sound effects

  • User requests any other media generation task

  • User says "generate image", "create video", "text to speech", "make a thumbnail", or similar

MCP Requirement

The fal.ai MCP server must be configured. Add the following entry under "mcpServers" in ~/.claude.json:

"fal-ai": {
  "command": "npx",
  "args": ["-y", "fal-ai-mcp-server"],
  "env": { "FAL_KEY": "YOUR_FAL_KEY_HERE" }
}

Get an API key at fal.ai.
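
For readers who would rather script this step, here is a minimal sketch that merges the entry above into a settings dictionary. It assumes MCP servers sit under a top-level "mcpServers" key, which may differ in your setup. The helper name add_fal_server is made up for illustration. It works on an in-memory dict and writes nothing to disk.

```python
import copy
import json


def add_fal_server(config: dict, fal_key: str) -> dict:
    """Return a copy of `config` with the fal-ai MCP entry added.

    Assumption: MCP servers are listed under a top-level "mcpServers" key.
    """
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    servers = updated.setdefault("mcpServers", {})
    servers["fal-ai"] = {
        "command": "npx",
        "args": ["-y", "fal-ai-mcp-server"],
        "env": {"FAL_KEY": fal_key},
    }
    return updated


if __name__ == "__main__":
    # Start from a config that already holds one unrelated (hypothetical) server.
    existing = {"mcpServers": {"other": {"command": "node", "args": []}}}
    print(json.dumps(add_fal_server(existing, "YOUR_FAL_KEY_HERE"), indent=2))
```

Returning a copy keeps the original settings intact, so the result can be reviewed before it is saved.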

MCP Tools

The fal.ai MCP provides these tools:

  • search — Find available models by keyword

  • find — Get model details and parameters

  • generate — Run a model with parameters

  • result — Check async generation status

  • status — Check job status

  • cancel — Cancel a running job

  • estimate_cost — Estimate generation cost

  • models — List popular models

  • upload — Upload files for use as inputs
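
The status and cancel tools suggest a poll-then-give-up pattern for long jobs. The sketch below shows only that control flow in Python. get_status and cancel are hypothetical callables standing in for the two tools. The status strings are assumptions too, since the list above does not say what the server returns.

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[], str],
    cancel: Callable[[], None],
    timeout_s: float = 300.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the job finishes; cancel it if the deadline passes.

    `get_status` and `cancel` stand in for the `status` and `cancel` tools.
    The status strings used here are assumed, not documented values.
    """
    deadline = clock() + timeout_s
    while True:
        state = get_status()
        if state in ("COMPLETED", "FAILED"):
            return state
        if clock() >= deadline:
            cancel()  # stop a job that nobody is waiting on any more
            return "CANCELLED"
        sleep(interval_s)
```

Injecting sleep and clock keeps the loop testable without real waiting.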

Image Generation

Nano Banana 2 (Fast)

Best for: quick iterations, drafts, text-to-image, image editing.

generate(
  app_id: "fal-ai/nano-banana-2",
  input_data: {
    "prompt": "a futuristic cityscape at sunset, cyberpunk style",
    "image_size": "landscape_16_9",
    "num_images": 1,
    "seed": 42
  }
)

Nano Banana Pro (High Fidelity)

Best for: production images, realism, typography, detailed prompts.

generate(
  app_id: "fal-ai/nano-banana-pro",
  input_data: {
    "prompt": "professional product photo of wireless headphones on marble surface, studio lighting",
    "image_size": "square",
    "num_images": 1,
    "guidance_scale": 7.5
  }
)

Common Image Parameters

| Param | Type | Options | Notes |
|---|---|---|---|
| prompt | string | required | Describe what you want |
| image_size | string | square, portrait_4_3, landscape_16_9, portrait_16_9, landscape_4_3 | Aspect ratio |
| num_images | number | 1-4 | How many to generate |
| seed | number | any integer | Reproducibility |
| guidance_scale | number | 1-20 | How closely to follow the prompt (higher = more literal) |
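
The ranges in this table can be checked before a request is sent. The sketch below is a small validator built only from the options listed above. The name check_image_input is hypothetical, and individual models may accept more or fewer parameters than the table shows.

```python
from typing import List

ALLOWED_IMAGE_SIZES = {
    "square", "portrait_4_3", "landscape_16_9", "portrait_16_9", "landscape_4_3",
}


def check_image_input(input_data: dict) -> List[str]:
    """Return a list of problems found in `input_data` (empty means OK)."""
    problems = []
    prompt = input_data.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required")
    size = input_data.get("image_size")
    if size is not None and size not in ALLOWED_IMAGE_SIZES:
        problems.append(f"unknown image_size: {size}")
    count = input_data.get("num_images")
    if count is not None and not 1 <= count <= 4:
        problems.append("num_images must be between 1 and 4")
    seed = input_data.get("seed")
    if seed is not None and not isinstance(seed, int):
        problems.append("seed must be an integer")
    scale = input_data.get("guidance_scale")
    if scale is not None and not 1 <= scale <= 20:
        problems.append("guidance_scale must be between 1 and 20")
    return problems
```

Collecting every problem, rather than raising on the first, lets the caller fix a prompt and its parameters in one pass.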

Image Editing

Use Nano Banana 2 with an input image for inpainting, outpainting, or style transfer:

# First upload the source image
upload(file_path: "/path/to/image.png")

# Then generate with image input
generate(
  app_id: "fal-ai/nano-banana-2",
  input_data: {
    "prompt": "same scene but in watercolor style",
    "image_url": "<uploaded_url>",
    "image_size": "landscape_16_9"
  }
)

Video Generation

Seedance 1.0 Pro (ByteDance)

Best for: text-to-video, image-to-video with high motion quality.

generate(
  app_id: "fal-ai/seedance-1-0-pro",
  input_data: {
    "prompt": "a drone flyover of a mountain lake at golden hour, cinematic",
    "duration": "5s",
    "aspect_ratio": "16:9",
    "seed": 42
  }
)

Kling Video v3 Pro

Best for: text/image-to-video with native audio generation.

generate(
  app_id: "fal-ai/kling-video/v3/pro",
  input_data: {
    "prompt": "ocean waves crashing on a rocky coast, dramatic clouds",
    "duration": "5s",
    "aspect_ratio": "16:9"
  }
)

Veo 3 (Google DeepMind)

Best for: video with generated sound, high visual quality.

generate(
  app_id: "fal-ai/veo-3",
  input_data: {
    "prompt": "a bustling Tokyo street market at night, neon signs, crowd noise",
    "aspect_ratio": "16:9"
  }
)

Image-to-Video

Start from an existing image:

generate(
  app_id: "fal-ai/seedance-1-0-pro",
  input_data: {
    "prompt": "camera slowly zooms out, gentle wind moves the trees",
    "image_url": "<uploaded_image_url>",
    "duration": "5s"
  }
)

Video Parameters

| Param | Type | Options | Notes |
|---|---|---|---|
| prompt | string | required | Describe the video |
| duration | string | "5s", "10s" | Video length |
| aspect_ratio | string | "16:9", "9:16", "1:1" | Frame ratio |
| seed | number | any integer | Reproducibility |
| image_url | string | URL | Source image for image-to-video |
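
As an illustration of how these parameters combine, the sketch below builds an input_data dictionary and leaves out optional keys that were not set. The helper video_input is hypothetical. The allowed values come from the table above, and a given model may support only a subset of them.

```python
from typing import Optional

DURATIONS = ("5s", "10s")
ASPECT_RATIOS = ("16:9", "9:16", "1:1")


def video_input(
    prompt: str,
    duration: str = "5s",
    aspect_ratio: str = "16:9",
    seed: Optional[int] = None,
    image_url: Optional[str] = None,
) -> dict:
    """Build `input_data` for a video model, omitting unset optional keys."""
    if duration not in DURATIONS:
        raise ValueError(f"duration must be one of {DURATIONS}")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")
    data = {"prompt": prompt, "duration": duration, "aspect_ratio": aspect_ratio}
    if seed is not None:
        data["seed"] = seed
    if image_url is not None:
        data["image_url"] = image_url  # makes this an image-to-video request
    return data
```

Passing image_url turns a text-to-video request into an image-to-video one, with the rest of the call unchanged.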

Audio Generation

CSM-1B (Conversational Speech)

Text-to-speech with natural, conversational quality.

generate(
  app_id: "fal-ai/csm-1b",
  input_data: {
    "text": "Hello, welcome to the demo. Let me show you how this works.",
    "speaker_id": 0
  }
)

ThinkSound (Video-to-Audio)

Generate matching audio from video content.

generate(
  app_id: "fal-ai/thinksound",
  input_data: {
    "video_url": "<video_url>",
    "prompt": "ambient forest sounds with birds chirping"
  }
)

ElevenLabs (via API, no MCP)

For professional voice synthesis, use ElevenLabs directly:

import os
import requests

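# Replace <voice_id> in the URL with the ID of the voice you want to use
resp = requests.post(
    "https://api.elevenlabs.io/v1/text-to-speech/<voice_id>",
    headers={
        "xi-api-key": os.environ["ELEVENLABS_API_KEY"],
        "Content-Type": "application/json"
    },
    json={
        "text": "Your text here",
        "model_id": "eleven_turbo_v2_5",
        "voice_settings": {"stability": 0.5, "similarity_boost": 0.75}
    },
    timeout=60,  # avoid hanging forever on a stalled connection
)
resp.raise_for_status()  # fail here rather than saving an error body as audio

# The response body is the audio itself
with open("output.mp3", "wb") as f:
    f.write(resp.content)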

VideoDB Generative Audio

If VideoDB is configured, use its generative audio:

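import videodb

# Assumed setup: connect() reads the API key from VIDEO_DB_API_KEY,
# and get_collection() returns the default collection
conn = videodb.connect()
coll = conn.get_collection()

# Voice generation
audio = coll.generate_voice(text="Your narration here", voice="alloy")

# Music generation
music = coll.generate_music(prompt="upbeat electronic background music", duration=30)

# Sound effects
sfx = coll.generate_sound_effect(prompt="thunder crack followed by rain")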

Cost Estimation

Before generating, check estimated cost:

estimate_cost(
  estimate_type: "unit_price",
  endpoints: {
    "fal-ai/nano-banana-pro": {
      "unit_quantity": 1
    }
  }
)
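
When several models are being compared, the same argument shape can be built programmatically. The sketch below assembles the estimate_cost arguments shown above from a mapping of endpoint to quantity. The helper estimate_request is hypothetical. It also assumes the tool accepts more than one endpoint per call, which the plural "endpoints" key suggests but this document does not confirm.

```python
from typing import Dict


def estimate_request(quantities: Dict[str, int]) -> dict:
    """Build `estimate_cost` arguments for one or more endpoints.

    `quantities` maps an endpoint id to the number of units to price.
    """
    for endpoint, qty in quantities.items():
        if qty < 1:
            raise ValueError(f"{endpoint}: unit_quantity must be at least 1")
    return {
        "estimate_type": "unit_price",
        "endpoints": {
            endpoint: {"unit_quantity": qty} for endpoint, qty in quantities.items()
        },
    }
```

For example, estimate_request({"fal-ai/nano-banana-pro": 1}) reproduces the arguments of the call above.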

Model Discovery

Find models for specific tasks:

search(query: "text to video")
find(endpoint_ids: ["fal-ai/seedance-1-0-pro"])
models()

Tips

  • Use seed for reproducible results when iterating on prompts

  • Start with lower-cost models (Nano Banana 2) for prompt iteration, then switch to Pro for finals

  • For video, keep prompts descriptive but concise — focus on motion and scene

  • Image-to-video produces more controlled results than pure text-to-video

  • Check estimate_cost before running expensive video generations
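
The draft-then-final tip can be sketched as a pair of generate argument sets that share a prompt and seed. The helper draft_then_final is hypothetical. Reusing the seed only keeps the inputs identical, and the two models should not be expected to return the same image for it.

```python
def draft_then_final(prompt: str, seed: int, image_size: str = "landscape_16_9") -> list:
    """Return two `generate` argument sets: a cheap draft, then the Pro final.

    Both requests carry the same prompt, size, and seed, so the only thing
    that changes between the two runs is the model.
    """
    shared = {"prompt": prompt, "image_size": image_size, "num_images": 1, "seed": seed}
    return [
        {"app_id": "fal-ai/nano-banana-2", "input_data": dict(shared)},
        {"app_id": "fal-ai/nano-banana-pro", "input_data": dict(shared)},
    ]
```

Iterate on the first request until the prompt works, then send the second one unchanged.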

Related Skills

  • videodb — Video processing, editing, and streaming

  • video-editing — AI-powered video editing workflows

  • content-engine — Content creation for social platforms


Statistics

Installs: 416
Rating: 4.6 / 5.0
Version: 1.0.0
Updated: March 20, 2026
Comparison cases: 1


Compatible Platforms

Claude Code

Timeline

Created: March 20, 2026
Last updated: March 20, 2026