text-to-speech
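Converts text content into natural, fluent speech output. It can be used across a wide range of multimedia applications and improves both the listening experience and the accessibility of content.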
npx skills add inferen-sh/skills --skill text-to-speech
Before / After comparison
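Before: Recording narration or other audio content requires professional voice actors and recording equipment. This is expensive and hard to revise afterwards: any change to the script means recording again.
After: Text-to-speech (TTS) converts text into natural, fluent speech quickly, with a choice of voices, speaking rates, and languages. This cuts production cost and time substantially and makes revising audio content fast and easy.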
Text-to-Speech
Convert text to natural speech via inference.sh CLI.
Quick Start
Requires the inference.sh CLI (infsh); see the install instructions. Log in before running any app:
infsh login
# Generate speech
infsh app run infsh/kokoro-tts --input '{"text": "Hello, welcome to our product demo."}'
Available Models
| Model | App ID | Best For |
| --- | --- | --- |
| ElevenLabs TTS | elevenlabs/tts | Premium quality, 22+ voices, 32 languages |
| DIA TTS | infsh/dia-tts | Conversational, expressive |
| Kokoro TTS | infsh/kokoro-tts | Fast, natural |
| Chatterbox | infsh/chatterbox | General purpose |
| Higgs Audio | infsh/higgs-audio | Emotional control |
| VibeVoice | infsh/vibevoice | Podcasts, long-form |
Browse All Audio Apps
infsh app list --category audio
Examples
Basic Text-to-Speech
infsh app run infsh/kokoro-tts --input '{"text": "Welcome to our tutorial."}'
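The inline form above works for short fixed strings, but text containing double quotes, backslashes, or line breaks has to be escaped before it is embedded in the JSON passed to --input. The following is a minimal sketch of one way to do that in POSIX shell. The helper names json_escape and build_tts_input are hypothetical and not part of infsh; only the "text" field and the --input flag come from the examples above.

```shell
#!/bin/sh
# Escape a string for use inside a JSON string literal.
# Handles backslashes, double quotes, and line breaks only; tabs and other
# control characters are not handled in this sketch, and a trailing newline
# in the input is dropped.
json_escape() {
  printf '%s' "$1" \
    | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' \
    | awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}

# Wrap the escaped text in the {"text": ...} object used by the examples above.
build_tts_input() {
  printf '{"text": "%s"}' "$(json_escape "$1")"
}

script_text="It's a \"quick\" demo."
input_json=$(build_tts_input "$script_text")
echo "$input_json"

# Run the app only when the CLI is actually installed.
if command -v infsh >/dev/null 2>&1; then
  infsh app run infsh/kokoro-tts --input "$input_json"
fi
```

Because the JSON is passed as a double-quoted shell variable, apostrophes in the script text no longer clash with the single quotes used in the inline form.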
Conversational TTS with DIA
infsh app sample infsh/dia-tts --save input.json
# Edit input.json:
# {
# "text": "Hey! How are you doing today? I'm really excited to share this with you.",
# "voice": "conversational"
# }
infsh app run infsh/dia-tts --input input.json
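For scripted runs, the manual edit step can be replaced by writing input.json directly with a here-document. This is only a sketch: the field names "text" and "voice" are copied from the commented example above, and the file produced by infsh app sample remains the reference for which fields the app actually accepts.

```shell
#!/bin/sh
# Write the input file in one step instead of opening an editor.
# The quoted 'EOF' delimiter stops the shell from expanding anything
# inside the JSON body, so apostrophes and $ signs are kept literally.
cat > input.json <<'EOF'
{
  "text": "Hey! How are you doing today? I'm really excited to share this with you.",
  "voice": "conversational"
}
EOF

# Run the app only when the CLI is actually installed.
if command -v infsh >/dev/null 2>&1; then
  infsh app run infsh/dia-tts --input input.json
fi
```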
Long-form Audio (Podcasts)
infsh app sample infsh/vibevoice --save input.json
# Edit input.json with your podcast script
infsh app run infsh/vibevoice --input input.json
Expressive Speech with Higgs
infsh app sample infsh/higgs-audio --save input.json
# Edit input.json:
# {
# "text": "This is absolutely incredible!",
# "emotion": "excited"
# }
infsh app run infsh/higgs-audio --input input.json
Use Cases
- Voiceovers: Product demos, explainer videos
- Audiobooks: Convert text to spoken word
- Podcasts: Generate podcast episodes
- Accessibility: Make content accessible
- IVR: Phone system voice prompts
- Video Narration: Add narration to videos
Combine with Video
Generate speech, then create a talking head video:
# 1. Generate speech
infsh app run infsh/kokoro-tts --input '{"text": "Your script here"}' > speech.json
# 2. Use the audio URL with OmniHuman for avatar video
infsh app run bytedance/omnihuman-1-5 --input '{
"image_url": "https://portrait.jpg",
"audio_url": "<audio-url-from-step-1>"
}'
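Step 2 needs the audio URL from the result saved in step 1. The output schema of the TTS app is not shown here, so the sketch below does not rely on a field name. It assumes only that speech.json contains the audio URL as a quoted string and takes the first http(s) URL it finds. The helper name first_url is hypothetical.

```shell
#!/bin/sh
# Print the first http(s) URL found in a file.
# Assumption: the saved run result contains the audio URL as a JSON string.
# If the result holds several URLs, inspect the file and pick the right one.
first_url() {
  grep -oE 'https?://[^"[:space:]]+' "$1" | head -n 1
}

# Continue to step 2 only when step 1 has been run and the CLI is installed.
if [ -f speech.json ] && command -v infsh >/dev/null 2>&1; then
  audio_url=$(first_url speech.json)
  step2_input=$(printf '{"image_url": "%s", "audio_url": "%s"}' \
    "https://portrait.jpg" "$audio_url")
  infsh app run bytedance/omnihuman-1-5 --input "$step2_input"
fi
```

Matching by pattern keeps the sketch independent of the output format. Once the actual result structure is known, selecting the field by name is more robust.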
Related Skills
# ElevenLabs TTS (premium, 22+ voices)
npx skills add inference-sh/skills@elevenlabs-tts
# ElevenLabs dialogue (multi-speaker)
npx skills add inference-sh/skills@elevenlabs-dialogue
# Full platform skill (all 150+ apps)
npx skills add inference-sh/skills@infsh-cli
# AI avatars (combine TTS with talking heads)
npx skills add inference-sh/skills@ai-avatar-video
# AI music generation
npx skills add inference-sh/skills@ai-music-generation
# Speech-to-text (transcription)
npx skills add inference-sh/skills@speech-to-text
# Video generation
npx skills add inference-sh/skills@ai-video-generation
Browse all apps: infsh app list
Documentation
- Running Apps: How to run apps via CLI
- Audio Transcription Example: Audio processing workflows
- Apps Overview: Understanding the app ecosystem
Weekly installs: 4.4K
Repository: inferen-sh/skills
GitHub stars: 159
First seen: 6 days ago
Security audits: Gen Agent Trust Hub (Pass), Socket (Warn), Snyk (Warn)
Installed on: claude-code (3.5K), gemini-cli (3.1K), codex (3.1K), amp (3.1K), kimi-cli (3.1K), github-copilot (3.1K)