Video encoding, audio processing, image processing, streaming, podcasts, OCR
by @inferen-sh
Produces "talking head" videos, transforming static images or text into lively explanatory videos using AI technology, widely applied in news broadcasting, education, and marketing.
by @mwguerra
Generate complete FilamentPHP v4 resources, including form schemas, table configurations, relation managers, and custom pages.
by @heygen-com
This skill is deprecated and was originally used to generate video content based on prompts. Please use the `create-video` skill instead for video generation functionality.
by @aahl
A skill for binge-watching TV series and anime that lets the AI search for video playback URLs and play them directly on a Xiaomi TV. Use this skill when users want to look up information about, or update their progress on, movies, anime, short dramas, variety shows, etc.
Provides your agent with hundreds of image-to-video production skills using the inference.sh API.
Provides your agent with hundreds of AI music generation skills using the inference.sh API.
Utilizes AI technology to clone specific human voices, generating highly realistic speech for personalized voice assistants, audiobooks, or film/TV dubbing.
by @daffy0208
Proficient in Three.js and 3D graphics, used to create interactive 3D visualization content and enhance user experience.
by @adithya-s-k
Masters Manim animation best practices, including scripting, scene design, and rendering optimization, for creating high-quality mathematical animations and visualizations.
by @emzod
Equips AI agents with real-time voice conversation capabilities, supporting local text-to-speech, voice cloning, and audio generation, with optimized performance for Apple Silicon devices.
Uses the HeyGen API for AI face swapping, placing faces from source images into target videos, with support for GPU-accelerated processing.
by @op7418
This skill is an intelligent YouTube video clipping tool that helps users efficiently edit YouTube video content.
by @remotion-dev
Handle video playback error reports, download videos, update source files, and then run Remotion for processing.
Provides your agent with hundreds of storyboard creation skills using the inference.sh API.
Converts text content into natural, fluent speech output, suitable for various multimedia applications, enhancing user audio experience and content accessibility.
Automatically generates podcast content using AI technology, including scriptwriting, voice synthesis, and post-production, significantly simplifying the podcast creation process and lowering production barriers.
by @sickn33
Generate AI-driven podcast-style audio narratives using Azure OpenAI's GPT technology.
Understand local video content by extracting frames with ffmpeg and transcribing with Whisper, running completely offline without API keys.
by @mrgoonie
Processes video, audio, and images using FFmpeg and ImageMagick command-line tools for conversion, optimization, and manipulation.
by @sundial-org
Generates FFmpeg commands from natural language video editing requests, supporting operations like cutting, trimming, converting, and compressing.
by @maartenlouis
Generates professional AI voiceovers for Remotion videos using the ElevenLabs API; API key configuration is required.
by @cinience
Perform image editing using the Qwen Image Edit model in Alibaba Cloud AI Model Studio.
by @lwmxiaobei
Provides tools using yt-dlp to download videos and extract audio from multiple platforms, supporting various video sources for efficient media content acquisition.
Uses the Qwen TTS model in Alibaba Cloud AI Model Studio for text-to-speech audio generation.
by @libtv-labs
Supports creating sessions, sending messages to generate or edit images/videos, uploading multimedia files, and querying session records via agent-im's OpenAPI, enabling intelligent multimedia interaction.
by @yizhiyanhua-ai
Intelligent media downloader that finds and downloads relevant images and videos based on user needs. An AI Agent Skill to enhance work efficiency and automation.
Minimum viability test skill for Alibaba Cloud WAN video: verifies the minimum request path and records detailed error information on failure.
Provides expert guidance on creating explainer videos, helping users clearly and effectively communicate complex concepts or product features, enhancing audience understanding.
by @wshuyi
A framework for programmatically creating MP4 videos using React, providing core concepts like video definition, frame animation, interpolation, and physics animation for dynamic video generation.
by @schwepps
Professional music creation using Suno AI V5 and Suno Studio to generate high-quality original songs.
by @davila7
Provides OpenAI's Whisper multilingual speech recognition model for speech-to-text, podcast/video transcription, and meeting minutes across 99 languages.
by @joeseesun
Provide integration features between Qiaomu Music Player and Spotify, enabling music playback and management.
by @jezweb
Master image processing techniques, including editing, format conversion, compression optimization, and special effects, to deliver high-quality visual presentations for multimedia content.
by @hamsterider-m
Extract subtitles from Bilibili videos, supporting AI subtitle detection and ASR transcription fallback.
by @steipete
Initiate voice calls via the OpenClaw voice call plugin to enable multimedia communication.
by @jimliu
Automatically downloads YouTube video subtitles and transcripts, supporting both manual and auto-generated captions. Needs no API key or browser; it calls the YouTube InnerTube API directly.
by @affaan-m
VideoDB provides perception, memory, and action capabilities for videos, live streams, and desktop sessions, enabling intelligent video analysis and interaction.
Generate images, videos, and audio using fal.ai models, supporting text-to-image, video generation, and audio creation.
Uses ffmpeg to precisely extract single frames or short clips of a specified length from videos.
Creates virtual avatar videos using AI technology, achieving realistic character animation and voice synchronization, widely applied in education, marketing, and entertainment.
Helps you build video players, process video streams, and manage video content, providing comprehensive video solutions.
by @enzed
Focuses on React Three Fiber texture handling, including useTexture, texture loading, and environment mapping.
by @shreefentsar
Remotion video toolkit that generates MP4 videos from React components and guides AI agents through video rendering.
by @aradotso
Automatically process local music libraries, separate vocals and accompaniment, and transcribe lyrics to create a personal karaoke system.
Provides AI-assisted video editing focused on editing, optimizing, and post-processing filmed footage, rather than generating video content from prompts.
by @marswaveai
Creates podcasts based on topics, URLs, or text, supporting multiple trigger words for quick audio content generation.
Performs advanced searches using the YouTube Data API v3 and downloads YouTube videos with `yt-dlp`; requires a configured YouTube API key.
Transcribe audio files to text using local speech recognition. Triggers on: "transcribe", "ASR".
Specializes in video content translation and multi-language dubbing, capable of rapidly localizing existing videos into multiple languages to expand global audience reach.
Controls Sonos speakers via a command-line interface, enabling device discovery, status inquiry, playback, volume, and group management.