minimax-image-understanding
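Use MiniMax AI models to analyze image content, extract visual information, generate detailed descriptions, and answer complex questions about images.
npx skills add imsus/pi-extension-minimax-coding-plan-mcp --skill minimax-image-understanding
Before / After Comparison
Before: A team reviews images one by one and labels the content by hand. Processing 500 images a day takes 3 hours, and fatigue leads to missed details.
After: Images are sent to the AI model in batches, which generates detailed descriptions and tags automatically. 500 images are processed in 5 minutes, with a 40% improvement in accuracy.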
MiniMax Image Understanding Skill
Use this skill when you need to analyze, describe, or extract information from images.
How to Use
Call the understand_image tool directly with a prompt and image URL:
understand_image({
  prompt: "Your question about the image",
  image_url: "https://example.com/image.png"
})
When to Use
Use understand_image when:
- Screenshots: Error messages, UI issues, code in screenshots
- Visual content: Photos, diagrams, charts, graphs
- Documents: Extracting text from images (OCR), understanding layouts
- UI/UX analysis: Evaluating designs, identifying components
- Visual debugging: Understanding visual bugs or layout issues
When NOT to Use
Do NOT use understand_image when:
- The image is already described in the conversation
- The image is a simple icon or emoji you recognize
- No image is provided or the image URL is inaccessible
- The analysis would be redundant with existing context (e.g., file contents already visible)
Usage
understand_image({
  prompt: "What do you see in this image?",
  image_url: "https://example.com/screenshot.png"
})
API Details
Endpoint: POST {api_host}/v1/coding_plan/vlm
Request Body:
{
  "prompt": "Your question about the image",
  "image_url": "data:image/jpeg;base64,/9j/4AAQ..."
}
Response Format:
{
  "content": "AI analysis of the image...",
  "base_resp": {
    "status_code": 0,
    "status_msg": "success"
  }
}
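The request and response shapes above can be written down as types. What follows is a minimal sketch, not the tool's actual implementation: `buildVlmRequest` and `readVlmResponse` are hypothetical helper names, authentication headers are omitted because this section does not show them, and treating any non-zero `status_code` as a failure is an assumption based on the success example and the error codes listed under Error Handling.

```typescript
// Shapes copied from the Request Body and Response Format shown above.
interface VlmRequest {
  prompt: string;
  image_url: string;
}

interface VlmResponse {
  content: string;
  base_resp: { status_code: number; status_msg: string };
}

// Build the URL and JSON body for POST {api_host}/v1/coding_plan/vlm.
// Authentication headers are deliberately left out (not documented here).
function buildVlmRequest(apiHost: string, req: VlmRequest): { url: string; body: string } {
  const host = apiHost.replace(/\/+$/, ""); // avoid a double slash in the path
  return { url: `${host}/v1/coding_plan/vlm`, body: JSON.stringify(req) };
}

// Return the analysis text, or throw when base_resp reports a failure.
// Assumption: any non-zero status_code is an error.
function readVlmResponse(res: VlmResponse): string {
  if (res.base_resp.status_code !== 0) {
    throw new Error(
      `VLM request failed (${res.base_resp.status_code}): ${res.base_resp.status_msg}`
    );
  }
  return res.content;
}
```

Keeping the builder and the reader as pure functions means the HTTP client of your choice can sit between them without changing either.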
Image Processing
The tool automatically handles three types of image inputs:
HTTP/HTTPS URLs: Downloads the image and converts to base64
- Example: https://example.com/image.jpg
Local file paths: Reads local files and converts to base64
- Absolute: /Users/username/Documents/image.png
- Relative: images/photo.png
- Removes @ prefix if present
Base64 data URLs: Passes through existing base64 data
- Example: data:image/png;base64,iVBORw0KGgo...
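The three input types can be told apart by their prefix. The sketch below illustrates that classification only; `normalizeImageInput` and the kind names are hypothetical, and the real tool's detection logic may differ. Stripping the `@` only for local paths is an assumption based on where that note appears in the list above.

```typescript
type ImageInputKind = "remote_url" | "data_url" | "local_path";

interface NormalizedInput {
  kind: ImageInputKind;
  value: string;
}

function normalizeImageInput(raw: string): NormalizedInput {
  // Base64 data URLs are passed through unchanged.
  if (/^data:image\/[a-z0-9.+-]+;base64,/i.test(raw)) {
    return { kind: "data_url", value: raw };
  }
  // HTTP/HTTPS URLs would be downloaded and converted to base64.
  if (/^https?:\/\//i.test(raw)) {
    return { kind: "remote_url", value: raw };
  }
  // Everything else is treated as an absolute or relative file path,
  // with a single leading "@" removed if present.
  const value = raw.startsWith("@") ? raw.slice(1) : raw;
  return { kind: "local_path", value };
}
```

For example, `normalizeImageInput("@images/photo.png")` yields a `local_path` with value `images/photo.png`.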
Image Formats
Supported:
- JPEG (.jpg, .jpeg)
- PNG (.png)
- WebP (.webp)
Not supported:
- PDF, GIF, PSD, SVG, and other formats
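Checking the extension before calling the tool avoids a round trip for files that will be rejected. This is a minimal sketch under the assumption that the format is judged by file extension; `mimeTypeFor` is a hypothetical helper, and the tool itself may inspect the file contents instead.

```typescript
// Extensions taken from the Supported list above, mapped to standard MIME types.
const SUPPORTED_MIME_TYPES = new Map<string, string>([
  ["jpg", "image/jpeg"],
  ["jpeg", "image/jpeg"],
  ["png", "image/png"],
  ["webp", "image/webp"],
]);

// Returns the MIME type for a supported file name, or null otherwise.
// Works on plain file names and paths, not on URLs with query strings.
function mimeTypeFor(fileName: string): string | null {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) return null; // no extension at all
  const extension = fileName.slice(dot + 1).toLowerCase();
  return SUPPORTED_MIME_TYPES.get(extension) ?? null;
}
```

A `null` result for names such as `animation.gif` or `report.pdf` signals that the file should be converted to JPEG, PNG, or WebP first.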
Crafting Effective Prompts
For Descriptions
- "Describe what's in this image in detail"
- "What is the main subject of this image?"
- "Describe the visual style and composition"
For Code/Technical
- "What code is shown in this screenshot?"
- "Extract all text from this image"
- "Identify the UI framework/components used"
For Analysis
- "Analyze this UI design. What is working well and what could be improved?"
- "What emotions or mood does this image convey?"
- "Compare this design to Material Design principles"
For OCR/Text Extraction
- "Extract all text from this image"
- "Read the error message in this screenshot"
- "What does the label say in this image?"
Examples
Error Analysis
understand_image({
  prompt: "What is the error message and where is it located in this screenshot?",
  image_url: "./error-screenshot.png"
})
Code Screenshot
understand_image({
  prompt: "What code is shown in this screenshot? Please transcribe it exactly.",
  image_url: "https://example.com/code.png"
})
Design Review
understand_image({
  prompt: "Analyze this UI design. What is working well and what could be improved?",
  image_url: "https://example.com/mockup.png"
})
OCR
understand_image({
  prompt: "Extract all text from this image",
  image_url: "/Users/username/Documents/scan.png"
})
Tips
- Be specific in your prompt about what you want to know
- Mention format if you need structured output (e.g., "list all elements")
- Include context if the image is part of a larger task
- For screenshots, specify if you need full-page or just a specific area
- Complex analysis may trigger a confirmation prompt (analyze, extract, describe, recognize, transcribe, read)
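The first four tips can be folded into a small prompt builder. This is only an illustrative sketch: `composePrompt` and its field names are hypothetical and not part of the skill, and the sentence templates are one possible wording.

```typescript
interface PromptParts {
  question: string;      // what you want to know, stated specifically
  region?: string;       // for screenshots: "the full page" or a specific area
  outputFormat?: string; // e.g. "a bulleted list"
  context?: string;      // how the image fits into the larger task
}

// Join the parts into a single prompt string, skipping anything not provided.
function composePrompt(parts: PromptParts): string {
  const sentences: string[] = [parts.question.trim()];
  if (parts.region) sentences.push(`Focus on ${parts.region}.`);
  if (parts.outputFormat) sentences.push(`Format the answer as ${parts.outputFormat}.`);
  if (parts.context) sentences.push(`Context: ${parts.context}`);
  return sentences.join(" ");
}
```

The resulting string is meant to be passed as the `prompt` argument of `understand_image`.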
Error Handling
- Status code 1004: Authentication error - check API key and region
- Status code 2038: Real-name verification required
- Invalid image: File doesn't exist or URL is inaccessible
- Unsupported format: Image format not in JPEG, PNG, WebP
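The documented status codes can be turned into actionable messages with a small lookup. The sketch below covers only the codes named in this section; `describeStatus` is a hypothetical helper, and any other code falls through to a generic message because its meaning is not documented here.

```typescript
interface BaseResp {
  status_code: number;
  status_msg: string;
}

// Translate base_resp into a human-readable message.
function describeStatus(resp: BaseResp): string {
  switch (resp.status_code) {
    case 0:
      return "success";
    case 1004:
      return "Authentication error: check API key and region";
    case 2038:
      return "Real-name verification required";
    default:
      // Codes not listed in this document are reported as-is.
      return `Unrecognized status ${resp.status_code}: ${resp.status_msg}`;
  }
}
```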
Weekly Installs: 250
Repository: imsus/pi-extens…plan-mcp
GitHub Stars: 2
First Seen: Feb 18, 2026
Security Audits: Gen Agent Trust Hub (Fail), Socket (Pass), Snyk (Fail)
Installed on: opencode (248), gemini-cli (246), github-copilot (246), codex (246), cursor (245), kimi-cli (244)