baoyu-image-gen

by @JimLiuv
Rating: 4.6 (220)

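All-in-one AI image generation tool that integrates seven mainstream platforms, including OpenAI DALL-E, Google Imagen, DashScope, Jimeng (即梦), and Seedream. It supports custom aspect ratios and quality presets for creative design, marketing assets, and visual content generation.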

Tags: ai-image-generation, openai-dall-e, google-imagen, dashscope, replicate, generative-ai

Installation
npx skills add JimLiu/baoyu-skills --skill baoyu-image-gen

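Before / After Comparison

Before

Using several AI image generation platforms means operating each one separately, and managing multiple accounts and configuration parameters is complex. It is hard to manage and use multiple AI capabilities in one place.

After

Integrates multiple AI image generation platforms with aspect-ratio and quality presets for one-stop image generation, which simplifies the workflow and speeds up image creation.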

SKILL.md

Image Generation (AI SDK)

Official API-based image generation. Supports OpenAI, Google, OpenRouter, DashScope (阿里通义万象), Jimeng (即梦), Seedream (豆包) and Replicate providers.

Script Directory

Agent Execution:

  1. {baseDir} = this SKILL.md file's directory
  2. Script path = {baseDir}/scripts/main.ts
  3. Resolve ${BUN_X} runtime: if bun installed → bun; if npx available → npx -y bun; else suggest installing bun
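The runtime fallback in step 3 can be sketched as a small function. This is an illustration only, not the skill's actual code: `resolveBunRunner` and the injected `hasCommand` check are hypothetical names.

```typescript
// Hypothetical helper: returns the argv prefix that stands in for ${BUN_X},
// or null when neither bun nor npx is available (the agent should then
// suggest installing bun).
function resolveBunRunner(hasCommand: (name: string) => boolean): string[] | null {
  if (hasCommand("bun")) return ["bun"];
  if (hasCommand("npx")) return ["npx", "-y", "bun"];
  return null;
}
```

Injecting `hasCommand` keeps the rule itself free of any process-spawning details, so the order of preference is easy to check in isolation.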

Step 0: Load Preferences ⛔ BLOCKING

CRITICAL: This step MUST complete BEFORE any image generation. Do NOT skip or defer.

Check EXTEND.md existence (priority: project → XDG config → user home):

# macOS, Linux, WSL, Git Bash
test -f .baoyu-skills/baoyu-image-gen/EXTEND.md && echo "project"
test -f "${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "xdg"
test -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "user"

# PowerShell (Windows)
if (Test-Path .baoyu-skills/baoyu-image-gen/EXTEND.md) { "project" }
$xdg = if ($env:XDG_CONFIG_HOME) { $env:XDG_CONFIG_HOME } else { "$HOME/.config" }
if (Test-Path "$xdg/baoyu-skills/baoyu-image-gen/EXTEND.md") { "xdg" }
if (Test-Path "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md") { "user" }

| Result | Action |
|---|---|
| Found | Load, parse, apply settings. If default_model.[provider] is null → ask model only (Flow 2) |
| Not found | ⛔ Run first-time setup (references/config/first-time-setup.md) → Save EXTEND.md → Then continue |

CRITICAL: If not found, complete the full setup (provider + model + quality + save location) using AskUserQuestion BEFORE generating any images. Generation is BLOCKED until EXTEND.md is created.

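| Path | Location |
|---|---|
| .baoyu-skills/baoyu-image-gen/EXTEND.md | Project directory |
| ${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-image-gen/EXTEND.md | XDG config directory |
| $HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md | User home |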
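
The lookup order used by the shell checks in this step can be expressed as a first-match search over candidate paths. The sketch below is an assumption-laden illustration: `candidatePaths`, `findPreferences`, and the injected `exists` callback are hypothetical names, and paths are joined with `/` for simplicity.

```typescript
type PrefsLocation = { scope: "project" | "xdg" | "user"; path: string };

// Candidate EXTEND.md paths in priority order: project, XDG config, user home.
function candidatePaths(cwd: string, home: string, xdgConfigHome?: string): PrefsLocation[] {
  const rel = "baoyu-skills/baoyu-image-gen/EXTEND.md";
  // Mirrors ${XDG_CONFIG_HOME:-$HOME/.config}: fall back when unset or empty.
  const xdg = xdgConfigHome && xdgConfigHome.length > 0 ? xdgConfigHome : `${home}/.config`;
  return [
    { scope: "project", path: `${cwd}/.${rel}` },
    { scope: "xdg", path: `${xdg}/${rel}` },
    { scope: "user", path: `${home}/.${rel}` },
  ];
}

// Returns the first candidate that exists, or null when first-time setup is needed.
function findPreferences(
  candidates: PrefsLocation[],
  exists: (path: string) => boolean,
): PrefsLocation | null {
  return candidates.find((c) => exists(c.path)) ?? null;
}
```

A null result corresponds to the "Not found" case, where generation stays blocked until setup has written EXTEND.md.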

EXTEND.md Supports: Default provider | Default quality | Default aspect ratio | Default image size | Default models | Batch worker cap | Provider-specific batch limits

Schema: references/config/preferences-schema.md

Usage

# Basic
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image cat.png

# With aspect ratio
${BUN_X} {baseDir}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9

# High quality
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k

# From prompt files
${BUN_X} {baseDir}/scripts/main.ts --promptfiles system.md content.md --image out.png

# With reference images (Google, OpenAI, OpenRouter, or Replicate)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png

# With reference images (explicit provider/model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider google --model gemini-3-pro-image-preview --ref source.png

# OpenRouter (recommended default model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openrouter

# OpenRouter with reference images
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider openrouter --model google/gemini-3.1-flash-image-preview --ref source.png

# Specific provider
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openai

# DashScope (阿里通义万象)
${BUN_X} {baseDir}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope

# DashScope Qwen-Image 2.0 Pro (recommended for custom sizes and text rendering)
${BUN_X} {baseDir}/scripts/main.ts --prompt "为咖啡品牌设计一张 21:9 横幅海报,包含清晰中文标题" --image out.png --provider dashscope --model qwen-image-2.0-pro --size 2048x872

# DashScope legacy Qwen fixed-size model
${BUN_X} {baseDir}/scripts/main.ts --prompt "一张电影感海报" --image out.png --provider dashscope --model qwen-image-max --size 1664x928

# Replicate (google/nano-banana-pro)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate

# Replicate with specific model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana

# Batch mode with saved prompt files
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json

# Batch mode with explicit worker count
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json --jobs 4 --json

Batch File Format

{
  "jobs": 4,
  "tasks": [
    {
      "id": "hero",
      "promptFiles": ["prompts/hero.md"],
      "image": "out/hero.png",
      "provider": "replicate",
      "model": "google/nano-banana-pro",
      "ar": "16:9",
      "quality": "2k"
    },
    {
      "id": "diagram",
      "promptFiles": ["prompts/diagram.md"],
      "image": "out/diagram.png",
      "ref": ["references/original.png"]
    }
  ]
}

Paths in promptFiles, image, and ref are resolved relative to the batch file's directory. jobs is optional (overridden by CLI --jobs). Top-level array format (without jobs wrapper) is also accepted.
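
The rules in the paragraph above (two accepted shapes, CLI `--jobs` overriding the file, paths relative to the batch file) can be sketched as a normalizer. The types list only the fields seen in the example; which fields are optional is an assumption, and `normalizeBatch` and `resolveFrom` are hypothetical names, not the script's API.

```typescript
interface BatchTask {
  id?: string;
  promptFiles?: string[];
  image: string;
  provider?: string;
  model?: string;
  ar?: string;
  quality?: string;
  ref?: string[];
}

// Either a bare array of tasks or a wrapper object with an optional jobs count.
type BatchFile = BatchTask[] | { jobs?: number; tasks: BatchTask[] };

// Simplified POSIX-style join: absolute paths pass through unchanged.
function resolveFrom(dir: string, p: string): string {
  return p.startsWith("/") ? p : `${dir.replace(/\/+$/, "")}/${p}`;
}

function normalizeBatch(raw: BatchFile, batchDir: string, cliJobs?: number) {
  const tasks = Array.isArray(raw) ? raw : raw.tasks;
  const fileJobs = Array.isArray(raw) ? undefined : raw.jobs;
  return {
    jobs: cliJobs ?? fileJobs, // CLI --jobs overrides the file's jobs value
    tasks: tasks.map((t) => ({
      ...t,
      image: resolveFrom(batchDir, t.image),
      promptFiles: t.promptFiles?.map((p) => resolveFrom(batchDir, p)),
      ref: t.ref?.map((p) => resolveFrom(batchDir, p)),
    })),
  };
}
```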

Options

| Option | Description |
|---|---|
| --prompt <text>, -p | Prompt text |
| --promptfiles <files...> | Read prompt from files (concatenated) |
| --image <path> | Output image path (required in single-image mode) |
| --batchfile <path> | JSON batch file for multi-image generation |
| --jobs <count> | Worker count for batch mode (default: auto, max from config, built-in default 10) |
| --provider google\|openai\|openrouter\|dashscope\|jimeng\|seedream\|replicate | Force provider (default: auto-detect) |
| --model <id>, -m | Model ID (Google: gemini-3-pro-image-preview; OpenAI: gpt-image-1.5; OpenRouter: google/gemini-3.1-flash-image-preview; DashScope: qwen-image-2.0-pro) |
| --ar <ratio> | Aspect ratio (e.g., 16:9, 1:1, 4:3) |
| --size <WxH> | Size (e.g., 1024x1024) |
| --quality normal\|2k | Quality preset (default: 2k) |
| --imageSize 1K\|2K\|4K | Image size for Google/OpenRouter (default: from quality) |
| --ref <files...> | Reference images. Supported by Google multimodal, OpenAI GPT Image edits, OpenRouter multimodal models, and Replicate. Not supported by Jimeng or Seedream |
| --n <count> | Number of images |
| --json | JSON output |

Environment Variables

| Variable | Description |
|---|---|
| OPENAI_API_KEY | OpenAI API key |
| OPENROUTER_API_KEY | OpenRouter API key |
| GOOGLE_API_KEY | Google API key |
| DASHSCOPE_API_KEY | DashScope API key (阿里云) |
| REPLICATE_API_TOKEN | Replicate API token |
| JIMENG_ACCESS_KEY_ID | Jimeng (即梦) Volcengine access key |
| JIMENG_SECRET_ACCESS_KEY | Jimeng (即梦) Volcengine secret key |
| ARK_API_KEY | Seedream (豆包) Volcengine ARK API key |
| OPENAI_IMAGE_MODEL | OpenAI model override |
| OPENROUTER_IMAGE_MODEL | OpenRouter model override (default: google/gemini-3.1-flash-image-preview) |
| GOOGLE_IMAGE_MODEL | Google model override |
| DASHSCOPE_IMAGE_MODEL | DashScope model override (default: qwen-image-2.0-pro) |
| REPLICATE_IMAGE_MODEL | Replicate model override (default: google/nano-banana-pro) |
| JIMENG_IMAGE_MODEL | Jimeng model override (default: jimeng_t2i_v40) |
| SEEDREAM_IMAGE_MODEL | Seedream model override (default: doubao-seedream-5-0-260128) |
| OPENAI_BASE_URL | Custom OpenAI endpoint |
| OPENROUTER_BASE_URL | Custom OpenRouter endpoint (default: https://openrouter.ai/api/v1) |
| OPENROUTER_HTTP_REFERER | Optional app/site URL for OpenRouter attribution |
| OPENROUTER_TITLE | Optional app name for OpenRouter attribution |
| GOOGLE_BASE_URL | Custom Google endpoint |
| DASHSCOPE_BASE_URL | Custom DashScope endpoint |
| REPLICATE_BASE_URL | Custom Replicate endpoint |
| JIMENG_BASE_URL | Custom Jimeng endpoint (default: https://visual.volcengineapi.com) |
| JIMENG_REGION | Jimeng region (default: cn-north-1) |
| SEEDREAM_BASE_URL | Custom Seedream endpoint (default: https://ark.cn-beijing.volces.com/api/v3) |
| BAOYU_IMAGE_GEN_MAX_WORKERS | Override batch worker cap |
| BAOYU_IMAGE_GEN_<PROVIDER>_CONCURRENCY | Override provider concurrency, e.g. BAOYU_IMAGE_GEN_REPLICATE_CONCURRENCY |
| BAOYU_IMAGE_GEN_<PROVIDER>_START_INTERVAL_MS | Override provider start gap, e.g. BAOYU_IMAGE_GEN_REPLICATE_START_INTERVAL_MS |

Load Priority: CLI args > EXTEND.md > env vars > <cwd>/.baoyu-skills/.env > ~/.baoyu-skills/.env

Model Resolution

Model priority (highest → lowest), applies to all providers:

  1. CLI flag: --model <id>
  2. EXTEND.md: default_model.[provider]
  3. Env var: <PROVIDER>_IMAGE_MODEL (e.g., GOOGLE_IMAGE_MODEL)
  4. Built-in default

EXTEND.md overrides env vars. If both EXTEND.md default_model.google: "gemini-3-pro-image-preview" and env var GOOGLE_IMAGE_MODEL=gemini-3.1-flash-image-preview exist, EXTEND.md wins.
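
The four-level priority can be sketched as a first-match lookup. This is a minimal illustration under stated assumptions, not the script's implementation: `resolveModel` and `ModelInputs` are hypothetical names, and the built-in defaults are the model IDs listed in the Options and Environment Variables sections, assumed here to be the actual defaults (the Google and OpenAI entries in particular are only listed as example IDs).

```typescript
type Provider =
  | "google" | "openai" | "openrouter" | "dashscope"
  | "jimeng" | "seedream" | "replicate";

// Assumed defaults, taken from the model IDs named in this document.
const BUILT_IN_DEFAULTS: Record<Provider, string> = {
  google: "gemini-3-pro-image-preview",
  openai: "gpt-image-1.5",
  openrouter: "google/gemini-3.1-flash-image-preview",
  dashscope: "qwen-image-2.0-pro",
  jimeng: "jimeng_t2i_v40",
  seedream: "doubao-seedream-5-0-260128",
  replicate: "google/nano-banana-pro",
};

interface ModelInputs {
  cliModel?: string;                                      // --model <id>
  extendDefaults?: Partial<Record<Provider, string | null>>; // EXTEND.md default_model
  env?: Record<string, string | undefined>;               // environment variables
}

function resolveModel(provider: Provider, inputs: ModelInputs): { model: string; source: string } {
  const envName = `${provider.toUpperCase()}_IMAGE_MODEL`;
  // Note: a null EXTEND.md entry is treated as "unset" in this sketch; Step 0
  // says the agent should ask the user for a model in that case.
  const fromExtend = inputs.extendDefaults?.[provider];
  const fromEnv = inputs.env?.[envName];
  if (inputs.cliModel) return { model: inputs.cliModel, source: "--model" };
  if (fromExtend) return { model: fromExtend, source: "EXTEND.md" };
  if (fromEnv) return { model: fromEnv, source: envName };
  return { model: BUILT_IN_DEFAULTS[provider], source: "built-in default" };
}
```

Returning the source alongside the model makes it straightforward to print the required "Using [provider] / [model]" line and to explain where the choice came from.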

Agent MUST display model info before each generation:

  • Show: Using [provider] / [model]
  • Show switch hint: Switch model: --model <id> | EXTEND.md default_model.[provider] | env <PROVIDER>_IMAGE_MODEL

DashScope Models

Use --model qwen-image-2.0-pro or set default_model.dashscope / DASHSCOPE_IMAGE_MODEL when the user wants official Qwen-Image behavior.

Official DashScope model families:

  • qwen-image-2.0-pro, qwen-image-2.0-pro-2026-03-03, qwen-image-2.0, qwen-image-2.0-2026-03-03
    • Free-form size in width*height (宽*高) format
    • Total pixels must stay between 512*512 and 2048*2048
    • Default size is approximately 1024*1024
    • Best choice for custom ratios such as 21:9 and text-heavy Chinese/English layouts
  • qwen-image-max, qwen-image-max-2025-12-30, qwen-image-plus, qwen-image-plus-2026-01-09, qwen-image
    • Fixed sizes only: 1664*928, 1472*1104, 1328*1328, 1104*1472, 928*1664
    • Default size is 1664*928
    • qwen-image currently has the same capability as qwen-image-plus
  • Legacy DashScope models such as z-image-turbo, z-image-ultra, wanx-v1
    • Keep using them only when the user explicitly asks for legacy behavior or compatibility

When translating CLI args into DashScope behavior:

  • --size wins over --ar
  • For qwen-image-2.0*, prefer explicit --size; otherwise infer from --ar and use the official recommended resolutions below
  • For qwen-image-max/plus/image, only use the five official fixed sizes; if the requested ratio is not covered, switch to qwen-image-2.0-pro
  • --quality is a baoyu-image-gen compatibility preset, not a native DashScope API field. Mapping normal / 2k onto the qwen-image-2.0* table below is an implementation inference, not an official API guarantee

Recommended qwen-image-2.0* sizes for common aspect ratios:

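| Ratio | normal | 2k |
|---|---|---|
| 1:1 | 1024*1024 | 1536*1536 |
| 2:3 | 768*1152 | 1024*1536 |
| 3:2 | 1152*768 | 1536*1024 |
| 3:4 | 960*1280 | 1080*1440 |
| 4:3 | 1280*960 | 1440*1080 |
| 9:16 | 720*1280 | 1080*1920 |
| 16:9 | 1280*720 | 1920*1080 |
| 21:9 | 1344*576 | 2048*872 |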
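
For the qwen-image-2.0* family, the translation rules above (explicit `--size` wins over `--ar`, the pixel range check, and the recommended-size table) can be sketched as follows. `qwen2Size` is a hypothetical helper; returning null when neither flag is given, so the API default applies, is an assumption rather than documented behavior.

```typescript
type Quality = "normal" | "2k";

// Recommended qwen-image-2.0* sizes, copied from the table above.
const QWEN2_SIZES: Record<string, Record<Quality, string>> = {
  "1:1": { normal: "1024*1024", "2k": "1536*1536" },
  "2:3": { normal: "768*1152", "2k": "1024*1536" },
  "3:2": { normal: "1152*768", "2k": "1536*1024" },
  "3:4": { normal: "960*1280", "2k": "1080*1440" },
  "4:3": { normal: "1280*960", "2k": "1440*1080" },
  "9:16": { normal: "720*1280", "2k": "1080*1920" },
  "16:9": { normal: "1280*720", "2k": "1920*1080" },
  "21:9": { normal: "1344*576", "2k": "2048*872" },
};

const MIN_PIXELS = 512 * 512;
const MAX_PIXELS = 2048 * 2048;

function qwen2Size(opts: { size?: string; ar?: string; quality?: Quality }): string | null {
  if (opts.size !== undefined) {
    // --size wins over --ar; accept both WxH (CLI) and W*H (DashScope) spellings.
    const m = /^(\d+)[x*](\d+)$/i.exec(opts.size.trim());
    if (!m) throw new Error(`Invalid --size: ${opts.size}`);
    const w = Number(m[1]);
    const h = Number(m[2]);
    if (w * h < MIN_PIXELS || w * h > MAX_PIXELS) {
      throw new Error(`Total pixels out of range for qwen-image-2.0*: ${w}*${h}`);
    }
    return `${w}*${h}`;
  }
  if (opts.ar !== undefined) {
    const row = QWEN2_SIZES[opts.ar];
    if (!row) throw new Error(`No recommended size for --ar ${opts.ar}`);
    return row[opts.quality ?? "2k"]; // 2k is the documented default preset
  }
  return null; // assumption: leave the size to the API default
}
```

For example, `qwen2Size({ ar: "21:9" })` yields `2048*872`, the same size used in the Qwen-Image 2.0 Pro usage example.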

DashScope official APIs also expose negative_prompt, prompt_extend, and watermark, but baoyu-image-gen does not expose them as dedicated CLI flags today.

OpenRouter Models

Use full OpenRouter model IDs, e.g.:

  • google/gemini-3.1-flash-image-preview (recommended, supports image output and reference-image workflows)
  • google/gemini-2.5-flash-image-preview
  • black-forest-labs/flux.2-pro
  • Other OpenRouter image-capable model IDs

Notes:

  • OpenRouter image generation uses /chat/completions, not the OpenAI /images endpoints
  • If --ref is used, choose a multimodal model that supports image input and image output
  • --imageSize maps to OpenRouter imageGenerationOptions.size; --size <WxH> is converted to the nearest OpenRouter size and inferred aspect ratio when possible

Replicate Models

Supported model formats:

  • owner/name (recommended for official models), e.g. google/nano-banana-pro
  • owner/name:version (community models by version), e.g. stability-ai/sdxl:<version>
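
The two accepted formats can be told apart with a small parser. This is a sketch only: `parseReplicateModel` is a hypothetical name, and the character rules in the pattern are an assumption rather than Replicate's official grammar.

```typescript
interface ReplicateModelRef {
  owner: string;
  name: string;
  version?: string; // present only for the owner/name:version form
}

function parseReplicateModel(id: string): ReplicateModelRef | null {
  // owner/name with an optional :version suffix; no spaces, extra slashes, or colons.
  const m = /^([^\/:\s]+)\/([^\/:\s]+)(?::([^\/:\s]+))?$/.exec(id);
  const owner = m?.[1];
  const name = m?.[2];
  const version = m?.[3];
  if (!owner || !name) return null;
  return version ? { owner, name, version } : { owner, name };
}
```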

Examples:

# Use Replicate default model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate

# Override model explicitly
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana

Provider Selection

  1. --ref provided + no --provider → auto-select Google first, then OpenAI, then OpenRouter, then Replicate (Jimeng and Seedream do not support reference images)
  2. --provider specified → use it (if --ref, must be google, openai, openrouter, or replicate)
  3. Only one API key available → use that provider
  4. Multiple available → default to Google
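
The selection rules above can be sketched as one function. `selectProvider` is a hypothetical name, `available` stands for the providers whose API keys were detected, and the fallback used when several keys exist but Google's is not among them is an assumption, since the rules do not cover that case.

```typescript
type Provider =
  | "google" | "openai" | "openrouter" | "dashscope"
  | "jimeng" | "seedream" | "replicate";

// Providers that accept --ref, in auto-selection order.
const REF_ORDER: Provider[] = ["google", "openai", "openrouter", "replicate"];

function selectProvider(opts: {
  provider?: Provider;
  hasRef: boolean;
  available: Provider[];
}): Provider {
  const { provider, hasRef, available } = opts;
  if (provider) {
    if (hasRef && !REF_ORDER.includes(provider)) {
      throw new Error(`${provider} does not support reference images`);
    }
    return provider;
  }
  if (hasRef) {
    const pick = REF_ORDER.find((p) => available.includes(p));
    if (!pick) throw new Error("No configured provider supports reference images");
    return pick;
  }
  const first = available[0];
  if (first === undefined) throw new Error("No API key configured");
  if (available.length === 1) return first;
  // Multiple keys: default to Google. Falling back to the first available
  // provider when Google is not configured is an assumption of this sketch.
  return available.includes("google") ? "google" : first;
}
```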

Quality Presets

| Preset | Google imageSize | OpenAI Size | OpenRouter size | Replicate resolution | Use Case |
|---|---|---|---|---|---|
| normal | 1K | 1024px | 1K | 1K | Quick previews |
| 2k (default) | 2K | 2048px | 2K | 2K | Covers, illustrations, infographics |

Google/OpenRouter imageSize: Can be overridden with --imageSize 1K|2K|4K

Aspect Ratios

Supported: 1:1, 16:9, 9:16, 4:3, 3:4, 2.35:1

  • Google multimodal: uses imageConfig.aspectRatio
  • OpenAI: maps to closest supported size
  • OpenRouter: sends imageGenerationOptions.aspect_ratio; if only --size <WxH> is given, aspect ratio is inferred automatically
  • Replicate: passes aspect_ratio to model; when --ref is provided without --ar, defaults to match_input_image
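
One plausible way to infer an aspect ratio from `--size <WxH>` is to pick the closest supported ratio. The sketch below assumes a nearest-match rule on a logarithmic scale; the document does not say how the script actually infers the ratio, and `inferAspectRatio` is a hypothetical name.

```typescript
const SUPPORTED_RATIOS = ["1:1", "16:9", "9:16", "4:3", "3:4", "2.35:1"];

function ratioValue(ratio: string): number {
  const parts = ratio.split(":");
  return Number(parts[0]) / Number(parts[1]);
}

// Returns the supported ratio closest to width/height, or null for bad input.
function inferAspectRatio(size: string): string | null {
  const m = /^(\d+)x(\d+)$/i.exec(size.trim());
  if (!m) return null;
  const target = Number(m[1]) / Number(m[2]);
  if (!Number.isFinite(target) || target <= 0) return null;
  let best: string | null = null;
  let bestDiff = Infinity;
  for (const ratio of SUPPORTED_RATIOS) {
    // Compare in log space so 2:1 and 1:2 are equally far from 1:1.
    const diff = Math.abs(Math.log(ratioValue(ratio) / target));
    if (diff < bestDiff) {
      best = ratio;
      bestDiff = diff;
    }
  }
  return best;
}
```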

Generation Mode

Default: Sequential generation.

Batch Parallel Generation: When --batchfile contains 2 or more pending tasks, the script automatically enables parallel generation.

| Mode | When to Use |
|---|---|
| Sequential (default) | Normal usage, single images, small batches |
| Parallel batch | Batch mode with 2+ tasks |
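
Parallel batch execution with a worker cap can be sketched as a small promise pool. This is an illustration, not the script's scheduler: `effectiveWorkers` and `runPool` are hypothetical names, treating "auto" as "use the cap" is an assumption, and per-provider concurrency limits and start intervals are left out.

```typescript
// Sequential below 2 tasks; otherwise never exceed the cap or the task count.
function effectiveWorkers(taskCount: number, requested?: number, cap = 10): number {
  if (taskCount < 2) return 1;
  return Math.max(1, Math.min(requested ?? cap, cap, taskCount));
}

// Runs `run` over all items with at most `workers` calls in flight,
// preserving input order in the results.
async function runPool<T, R>(
  items: T[],
  workers: number,
  run: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++; // claim an index before awaiting
      results[i] = await run(items[i] as T);
    }
  }
  await Promise.all(Array.from({ length: workers }, () => worker()));
  return results;
}
```

Each worker claims the next index synchronously before awaiting, so no task runs twice even though the workers share a single counter.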

Execution choice:

| Situation | Preferred approach | Why |
|---|---|---|
| One image, or 1-2 simple images | Sequential | Lower coordination overhead and easier debugging |
| Multiple images already have saved prompt files | Batch (--batchfile) | Reuses finalized prompts, applies shared throttling/retries, and gives predictable throughput |
| Each image still needs separate reasoning, prompt writing, or style exploration | Subagents | The work is still exploratory, so each image may need independent analysis before generation |
| Output comes from baoyu-article-illustrator with outline.md + prompts/ | Batch (build-batch.ts -> --batchfile) | That workflow already produces prompt files, so direct batch execution is the intended path |

Rule of thumb:

  • Prefe

...

Statistics

Installs: 22.7K
Rating: 4.6 / 5.0
Updated: May 23, 2026
Comparison cases: 1 set

User ratings

5 stars: 63%
4 stars: 26%
3 stars: 7%
2 stars: 2%
1 star: 1%

Compatible platforms

  • Claude Code
  • OpenClaw
  • OpenCode
  • Codex
  • Gemini CLI
  • GitHub Copilot
  • Amp
  • Kimi CLI
Timeline

Created: February 1, 2026
Last updated: May 23, 2026