ai-avatar-video
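Create virtual avatar videos with AI, with realistic character animation and lip-synced speech, for use in education, marketing, entertainment, and similar fields.
npx skills add inferen-sh/skills --skill ai-avatar-video
Before / After Comparison
Before: Producing video with real on-camera talent is expensive. It requires actors, a location, equipment rental, and complex post-production, takes a long time, and is hard to revise.
After: With AI avatar video technology, text or audio input is enough to generate a realistic virtual presenter video. This greatly reduces production cost and time, so businesses and individuals can create large volumes of video content quickly and flexibly, personalized and at scale.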
AI Avatar & Talking Head Videos
Create AI avatars and talking head videos via inference.sh CLI.
Quick Start
Requires the inference.sh CLI (infsh). See the install instructions.
infsh login
# Create avatar video from image + audio
infsh app run bytedance/omnihuman-1-5 --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "https://speech.mp3"
}'
Available Models
| Model | App ID | Best For |
|-------|--------|----------|
| OmniHuman 1.5 | bytedance/omnihuman-1-5 | Multi-character, best quality |
| OmniHuman 1.0 | bytedance/omnihuman-1-0 | Single character |
| Fabric 1.0 | falai/fabric-1-0 | Image talks with lipsync |
| PixVerse Lipsync | falai/pixverse-lipsync | Highly realistic |
Search Avatar Apps
infsh app list --search "omnihuman"
infsh app list --search "lipsync"
infsh app list --search "fabric"
Examples
OmniHuman 1.5 (Multi-Character)
infsh app run bytedance/omnihuman-1-5 --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "https://speech.mp3"
}'
Supports specifying which character to drive in multi-person images.
Fabric 1.0 (Image Talks)
infsh app run falai/fabric-1-0 --input '{
  "image_url": "https://face.jpg",
  "audio_url": "https://audio.mp3"
}'
PixVerse Lipsync
infsh app run falai/pixverse-lipsync --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "https://speech.mp3"
}'
Generates highly realistic lipsync from any audio.
Full Workflow: TTS + Avatar
# 1. Generate speech from text
infsh app run infsh/kokoro-tts --input '{
  "prompt": "Welcome to our product demo. Today I will show you..."
}' > speech.json
# 2. Create avatar video with the speech
infsh app run bytedance/omnihuman-1-5 --input '{
  "image_url": "https://presenter-photo.jpg",
  "audio_url": "<audio-url-from-step-1>"
}'
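Step 2 depends on the audio URL produced in step 1. The output schema of infsh/kokoro-tts is not shown in this document, so the sketch below does not try to parse speech.json; it only illustrates one way to build the `--input` payload from shell variables once the URL is known. `avatar_payload` is a hypothetical helper, not part of the infsh CLI, and it assumes the URLs contain no double quotes or backslashes.

```shell
# Hypothetical helper (not part of infsh): build the --input JSON for an avatar app.
# Assumes both URLs contain no double quotes or backslashes; no JSON escaping is done.
avatar_payload() {
  printf '{"image_url": "%s", "audio_url": "%s"}' "$1" "$2"
}

# Example usage, left commented out because it calls the real service:
# AUDIO_URL="<audio-url-from-step-1>"
# infsh app run bytedance/omnihuman-1-5 \
#   --input "$(avatar_payload "https://presenter-photo.jpg" "$AUDIO_URL")"
```

Keeping the payload in a function avoids hand-editing the JSON each time the audio changes.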
Full Workflow: Dub Video in Another Language
# 1. Transcribe original video
infsh app run infsh/fast-whisper-large-v3 --input '{"audio_url": "https://video.mp4"}' > transcript.json
# 2. Translate text (manually or with an LLM)
# 3. Generate speech in new language
infsh app run infsh/kokoro-tts --input '{"text": "<translated-text>"}' > new_speech.json
# 4. Lipsync the original video with new audio
infsh app run infsh/latentsync-1-6 --input '{
  "video_url": "https://original-video.mp4",
  "audio_url": "<new-audio-url>"
}'
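When the same video is dubbed into several languages, step 4 repeats once per translated audio track. As a rough sketch under that assumption, the hypothetical helper `dub_payloads` (not part of the infsh CLI) prints one LatentSync `--input` payload per audio URL, one per line. It assumes the URLs contain no double quotes or backslashes.

```shell
# Hypothetical helper (not part of infsh): print one LatentSync --input payload
# per dubbed audio track. First argument is the video URL, the rest are audio URLs.
# Assumes URLs contain no double quotes or backslashes; no JSON escaping is done.
dub_payloads() {
  video_url="$1"
  shift
  for audio_url in "$@"; do
    printf '{"video_url": "%s", "audio_url": "%s"}\n' "$video_url" "$audio_url"
  done
}

# Example usage, left commented out because it calls the real service:
# dub_payloads "https://original-video.mp4" "$ES_AUDIO_URL" "$FR_AUDIO_URL" |
# while IFS= read -r payload; do
#   infsh app run infsh/latentsync-1-6 --input "$payload"
# done
```

`ES_AUDIO_URL` and `FR_AUDIO_URL` are placeholder variable names standing in for the audio URLs produced in step 3.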
Use Cases
- Marketing: Product demos with AI presenter
- Education: Course videos, explainers
- Localization: Dub content in multiple languages
- Social Media: Consistent virtual influencer
- Corporate: Training videos, announcements
Tips
- Use high-quality portrait photos (front-facing, good lighting)
- Audio should be clear with minimal background noise
- OmniHuman 1.5 supports multiple people in one image
- LatentSync is best for syncing existing videos to new audio
Related Skills
# Full platform skill (all 150+ apps)
npx skills add inference-sh/skills@infsh-cli
# Text-to-speech (generate audio for avatars)
npx skills add inference-sh/skills@text-to-speech
# Speech-to-text (transcribe for dubbing)
npx skills add inference-sh/skills@speech-to-text
# Video generation
npx skills add inference-sh/skills@ai-video-generation
# Image generation (create avatar images)
npx skills add inference-sh/skills@ai-image-generation
Browse all video apps: infsh app list --category video
Documentation
- Running Apps - How to run apps via CLI
- Content Pipeline Example - Building media workflows
- Streaming Results - Real-time progress updates