---
id: daily-gpt-image-edit
name: "gpt-image-edit"
url: https://skills.yangsir.net/skill/daily-gpt-image-edit
author: agentspace-so
domain: multimedia
tags: ["image-editing", "generative-ai", "openai", "image-understanding"]
install_count: 121600
rating: 4.80 (12 reviews)
github: https://github.com/agentspace-so/runcomfy-agent-skills
---

# gpt-image-edit

> OpenAI GPT Image Edit model: intelligent image understanding and editing

**Stats**: 121,600 installs · 4.8/5 (12 reviews)

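## Before / After comparison

### Intelligent image editing

**Before**:

Revising a design from client feedback means interpreting the comments, adjusting elements by hand, and re-rendering. One revision round takes 2-3 hours, and the communication overhead is high.

**After**:

Describe the requested change and the edit is applied automatically. The model reads the context and keeps the design consistent, so one revision round takes about 5 minutes and several versions can be compared quickly.

| Metric | Before | After | Change |
|---|---|---|---|
| Time per revision round | 150 min | 5 min | -97% |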

## Readme

# GPT Image Edit — Pro Pack on RunComfy

[runcomfy.com](https://www.runcomfy.com/?utm_source=skills.sh&utm_medium=skill&utm_campaign=gpt-image-edit) · [Edit endpoint](https://www.runcomfy.com/models/openai/gpt-image-2/edit?utm_source=skills.sh&utm_medium=skill&utm_campaign=gpt-image-edit) · [Text-to-image sibling](https://www.runcomfy.com/models/openai/gpt-image-2/text-to-image?utm_source=skills.sh&utm_medium=skill&utm_campaign=gpt-image-edit) · [GitHub](https://github.com/agentspace-so/runcomfy-skills/tree/main/gpt-image-edit)

OpenAI **GPT Image 2 — `/edit` endpoint** (ChatGPT Images 2.0 image-to-image) on the **RunComfy Model API**. Strongest in its class at preserving identity through targeted edits and rewriting embedded text in any script (Latin, kana, CJK, Cyrillic, Arabic).

```
npx skills add agentspace-so/runcomfy-skills --skill gpt-image-edit -g

```

## When to pick this model (vs siblings)

| You want | Use |
|---|---|
| Edit multilingual / embedded text in image | **GPT Image Edit** |
| Identity preservation through translated headline variants | **GPT Image Edit** |
| Layout-precise edit (move headline, swap CTA, etc.) | **GPT Image Edit** |
| Up to 10 reference images | **GPT Image Edit** |
| Batch up to 20 images consistently | Nano Banana Edit |
| Single-shot precise local edit, source-fidelity-first | Flux Kontext |
| Generate from scratch with GPT Image 2 | sibling [`gpt-image-2`](https://github.com/agentspace-so/runcomfy-agent-skills/blob/HEAD/gpt-image-edit/../gpt-image-2) skill |
| Batch SKU galleries with stable identity | Nano Banana Edit |

## Prerequisites

- **RunComfy CLI** — `npm i -g @runcomfy/cli`

- **RunComfy account** — `runcomfy login` opens a browser device-code flow.

- **CI / containers** — set `RUNCOMFY_TOKEN=<token>` instead of `runcomfy login`.

## Endpoints + input schema

### `openai/gpt-image-2/edit`

| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| `prompt` | string | yes | - | Edit instruction. Lead with preservation, end with the change. |
| `images` | string[] | yes | - | **Up to 10** publicly-fetchable HTTPS URLs. First is primary; rest are auxiliary. |
| `size` | enum | no | `auto` | `auto` (preserve input), `1024_1024` (1:1), `1024_1536` (2:3 portrait), `1536_1024` (3:2 landscape). |

`size=auto` preserves the input ratio — strongly recommended unless the edit explicitly changes framing.
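
The schema above is small enough to check locally before a request is sent. The following is a minimal sketch, not part of the skill or the CLI: `validate_edit_input` is a hypothetical helper name, and treating an empty prompt as invalid is an assumption. It cannot tell whether a URL is publicly fetchable, only whether it looks like an HTTPS URL.

```python
from urllib.parse import urlparse

# Values taken from the schema table; the helper itself is hypothetical.
ALLOWED_SIZES = {"auto", "1024_1024", "1024_1536", "1536_1024"}
MAX_IMAGES = 10


def validate_edit_input(payload):
    """Return a list of problems; an empty list means the payload fits the schema."""
    problems = []

    prompt = payload.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt must be a non-empty string")

    images = payload.get("images")
    if not isinstance(images, list) or not images:
        problems.append("images must be a non-empty list of URLs")
    else:
        if len(images) > MAX_IMAGES:
            problems.append(f"images has {len(images)} entries; the limit is {MAX_IMAGES}")
        for i, url in enumerate(images, start=1):
            parsed = urlparse(url) if isinstance(url, str) else None
            if parsed is None or parsed.scheme != "https" or not parsed.netloc:
                problems.append(f"image {i} is not an HTTPS URL")

    # size is optional and defaults to "auto".
    size = payload.get("size", "auto")
    if not isinstance(size, str) or size not in ALLOWED_SIZES:
        problems.append(f"size {size!r} is not one of {sorted(ALLOWED_SIZES)}")

    return problems
```

Running a check like this first catches an unsupported `size` or an eleventh image before the request leaves the machine.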

## How to invoke

**Single-ref preservation edit:**

```
runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Keep the person'\''s face, pose, and brand mark unchanged. Replace the background with a soft warm-grey studio sweep and a gentle floor shadow.",
    "images": ["https://.../portrait.jpg"]
  }' \
  --output-dir <absolute/path>

```

**Multilingual text rewrite (preserve everything except the headline):**

```
runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Keep the photograph, layout, and brand mark exactly as in the input. Replace only the in-image headline. The new headline reads \"今日のおすすめ\" in bold Japanese (kanji and kana), same position and font weight as before.",
    "images": ["https://.../poster-en.jpg"]
  }' \
  --output-dir <absolute/path>

```

**Multi-ref composition:**

```
runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Compose subject from image 1 into the room from image 2. Match the lighting and color palette of image 2. Keep image 1 subject identity (face, pose, clothing) unchanged.",
    "images": ["https://.../subject.jpg", "https://.../room.jpg"]
  }' \
  --output-dir <absolute/path>

```
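
When these calls are scripted, the shell quoting in the examples above (for instance `'\''` for an apostrophe) is easy to get wrong. One way around it, sketched here under the assumption that the CLI is driven from Python, is to build the argument list directly and let `json.dumps` handle escaping. `build_edit_command` is a hypothetical helper; it uses only the `run`, `--input`, and `--output-dir` arguments shown above.

```python
import json

MODEL = "openai/gpt-image-2/edit"


def build_edit_command(prompt, images, output_dir, size=None):
    """Return an argv list for `runcomfy run`; nothing is executed here."""
    payload = {"prompt": prompt, "images": list(images)}
    if size is not None:
        payload["size"] = size  # omit the field to keep the default ("auto")
    return [
        "runcomfy", "run", MODEL,
        # ensure_ascii=False keeps CJK, Cyrillic, and Arabic text readable in logs.
        "--input", json.dumps(payload, ensure_ascii=False),
        "--output-dir", output_dir,
    ]
```

Passing the resulting list to `subprocess.run(cmd)` without `shell=True` hands the JSON to the CLI as a single argument, so quotes and apostrophes in the prompt need no extra escaping.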

## Prompting — what actually works

**Lead with preservation goals.** Always: `"Keep [face / pose / clothing / brand / framing] unchanged."` Then state the change. The model honors what's stated up front.

**Multilingual text — quote the characters, name the script.** `"the headline reads \"コーヒー\" in bold Japanese kana"`, `"the label says \"АРОМА\" in Cyrillic, white on black"`, `"the right-margin caption reads \"تخفيض\" in Arabic right-to-left"`. Don't paraphrase — quote.

**Directional language for spatial edits.** Concrete spatial scopes work: `"move the headline from top-right to bottom-center"`, `"remove the leftmost object only"`, `"replace the watermark in the bottom-right corner"`.

**Multi-ref numbering.** When passing multiple `images`, refer to them by number: `"subject from image 1, lighting from image 2, color palette from image 3"`. The model routes cues correctly.

**Use `size: "auto"` to preserve input ratio.** Only override when the edit explicitly changes framing (e.g. cropping a 16:9 to 1:1).

**Anti-patterns:**

- Long compound edit instructions ("change A and B and C and D") → drift increases per added scope.

- Missing preservation goals → model subtly rewrites the face / brand / framing.

- Paraphrasing in-image text instead of quoting it → text comes out different.

- Asking for `size` outside the 3 fixed values + `auto` → 422.
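
The prompting rules above can be applied mechanically when prompts are generated from templates. The sketch below is illustrative only: `build_edit_prompt` and `quoted_text_clause` are hypothetical helpers, and the sentence templates are one reasonable phrasing rather than wording the model requires.

```python
def _join(items):
    """Join items as 'a', 'a and b', or 'a, b, and c'."""
    if len(items) <= 2:
        return " and ".join(items)
    return ", ".join(items[:-1]) + ", and " + items[-1]


def quoted_text_clause(target, text, script, style=""):
    """Describe replacement in-image text by quoting it verbatim and naming its script."""
    clause = f'The new {target} reads "{text}" in {script}'
    if style:
        clause += f", {style}"
    return clause + "."


def build_edit_prompt(preserve, change):
    """Put the preservation goals first and the single change last."""
    if not preserve:
        raise ValueError("name at least one thing to keep unchanged")
    return f"Keep {_join(list(preserve))} unchanged. {change.strip()}"
```

For example, `build_edit_prompt(["the photograph", "layout"], quoted_text_clause("headline", "コーヒー", "bold Japanese kana"))` yields a prompt that states what must stay fixed before it states the quoted replacement text.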

## Where it shines

| Use case | Why GPT Image Edit |
|---|---|
| **Multilingual ad localization** | One source asset → many language variants of the same headline |
| **Brand-safe headline / CTA swaps** | Layout precision + preservation language hold the rest stable |
| **Multi-ref composition (subject from one, scene from another)** | Numbered refs route cues correctly |
| **Layout-precise repositioning** | Directional language ("top-right to bottom-center") honored |
| **Identity preservation across signage edits** | Strongest in class for face / brand preservation through targeted edits |

## Sample prompts (verified to produce strong results)

**Background swap with full preservation (page example):**

```
Turn the background into a bright minimal white-to-soft-gray studio
sweep with gentle floor shadow; add a large headline in-image that
reads "OPEN STUDIO" in a bold clean sans-serif, high contrast, centered;
keep the main person or product, pose, and face identity unchanged

```

**Multilingual variant:**

```
Keep the photograph, layout, lighting, and brand mark exactly as in the
input. Replace only the in-image headline.
The new headline reads "コーヒー" in bold Japanese kana, same position
and font weight as before.

```

**Multi-ref composition:**

```
Compose subject from image 1 into the kitchen from image 2.
Match the warm window light and color palette of image 2.
Keep subject identity (face, pose, clothing) from image 1 unchanged.

```

## Limitations

- **`size`: 3 fixed values + `auto`** — anything else 422s.

- **`images`: up to 10** — first is primary, rest are auxiliary cues.

- **Long compound prompts drift** — split into multiple passes when needed.

- **For batch consistency across many SKU images, Nano Banana Edit (up to 20) is better.**

- **Photorealism on portraits** — Nano Banana Pro wins head-to-head.
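
Splitting a compound edit into passes, as the list above suggests, can be sketched as one prompt per change with the same preservation clause repeated each time. `plan_passes` is a hypothetical helper. How the output of one pass becomes an input URL for the next is not covered by this document, so that step is left out.

```python
def plan_passes(preserve_clause, changes):
    """Return one prompt per change, each led by the same preservation clause."""
    prompts = []
    for change in changes:
        change = change.strip()
        if not change:
            continue  # skip empty entries rather than sending a no-op pass
        prompts.append(f"{preserve_clause.strip()} {change}")
    return prompts
```

Each returned prompt carries a single scope, which is the shape the anti-pattern notes recommend over "change A and B and C and D".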

## Exit codes

| Code | Meaning |
|---|---|
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |

Full reference: [docs.runcomfy.com/cli/troubleshooting](https://docs.runcomfy.com/cli/troubleshooting?utm_source=skills.sh&utm_medium=skill&utm_campaign=gpt-image-edit).
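
A wrapper script can branch on these codes. The sketch below retries only code 75, the one the table marks as retryable, with exponential backoff. The helper name `run_with_retry`, the attempt count, and the delays are assumptions, not CLI behavior. `run_once` is any callable that returns an exit code, for example `lambda: subprocess.run(cmd).returncode`.

```python
import time

# Meanings copied from the exit-code table above.
EXIT_MEANINGS = {
    0: "success",
    64: "bad CLI args",
    65: "bad input JSON / schema mismatch",
    69: "upstream 5xx",
    75: "retryable: timeout / 429",
    77: "not signed in or token rejected",
}
RETRYABLE = {75}


def run_with_retry(run_once, max_attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call run_once() until it returns a non-retryable code or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    code = None
    for attempt in range(max_attempts):
        code = run_once()
        if code not in RETRYABLE:
            break
        if attempt + 1 < max_attempts:
            sleep(base_delay * 2 ** attempt)  # waits 2s, then 4s, with the defaults
    return code, EXIT_MEANINGS.get(code, "unknown exit code")
```

Codes 64, 65, and 77 point at the caller's arguments, input, or credentials, so retrying them unchanged is unlikely to help.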

## How it works

The skill invokes `runcomfy run openai/gpt-image-2/edit` with a JSON body matching the schema. The CLI POSTs to `https://model-api.runcomfy.net/v1/models/openai/gpt-image-2/edit`, polls the request, fetches the result, and downloads any `.runcomfy.net`/`.runcomfy.com` URL into `--output-dir`. `Ctrl-C` cancels the remote request before exit.
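
The download rule in that flow amounts to a hostname suffix check. The sketch below shows how such a check might look; it is not the CLI's actual implementation. `is_downloadable` is a hypothetical name, and requiring HTTPS is an assumption added here.

```python
from urllib.parse import urlparse

# Leading dots matter: "evilruncomfy.net" must not match ".runcomfy.net".
ALLOWED_SUFFIXES = (".runcomfy.net", ".runcomfy.com")


def is_downloadable(url):
    """True when the URL is HTTPS and its host is a subdomain of an allowed domain."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    return parsed.scheme == "https" and host.endswith(ALLOWED_SUFFIXES)
```

Matching on the parsed hostname rather than the raw string keeps URLs such as `https://runcomfy.net.example.com/...` from passing.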

## Security & Privacy

- **Token storage**: `runcomfy login` writes the API token to `~/.config/runcomfy/token.json` with mode 0600 (owner-only read/write). Set `RUNCOMFY_TOKEN` env var to bypass the file entirely in CI / containers.

- **Input boundary**: the user prompt is passed as a JSON string to the CLI via `--input`. The CLI does NOT shell-expand the prompt; it transmits the JSON body directly to the Model API over HTTPS. No shell injection surface from prompt content.

- **Third-party content**: image / mask / video URLs you pass are fetched by the RunComfy model server, not by the CLI on your machine. Treat external URLs as untrusted; image-based prompt injection is a known risk for any image-edit / video-edit model.

- **Outbound endpoints**: only `model-api.runcomfy.net` (request submission) and `*.runcomfy.net` / `*.runcomfy.com` (download whitelist for generated outputs). No telemetry, no callbacks.

- **Generated-file size cap**: the CLI aborts any single download > 2 GiB to prevent disk-fill from a malicious or runaway model output.
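
The token rules above can be checked from a setup script. This sketch only inspects where a token would come from and whether the token file has the stated mode. It does not read or parse the file, since its contents are not described here. Both helper names are hypothetical, and the mode check assumes a POSIX filesystem.

```python
import os
import stat
from pathlib import Path

DEFAULT_TOKEN_FILE = Path.home() / ".config" / "runcomfy" / "token.json"


def token_source(env=None, token_file=DEFAULT_TOKEN_FILE):
    """Report where a token would come from: 'env', 'file', or None."""
    env = os.environ if env is None else env
    if env.get("RUNCOMFY_TOKEN"):
        return "env"  # the env var bypasses the token file
    if Path(token_file).is_file():
        return "file"
    return None


def is_owner_only(path):
    """True when the file mode is exactly 0600 (owner read/write, nothing else)."""
    return stat.S_IMODE(os.stat(path).st_mode) == 0o600
```

In CI, a result of `None` from `token_source()` is a cheap early signal that the run would end with exit code 77.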

---
*Source: https://skills.yangsir.net/skill/daily-gpt-image-edit*
*Markdown mirror: https://skills.yangsir.net/api/skill/daily-gpt-image-edit/markdown*