ホーム/AI アプリ構築・連携/gemini-interactions-api
G

gemini-interactions-api

by @google-geminiv
4.4(21)

Gemini APIを呼び出すコードを記述し、テキスト生成、多ターンチャット、マルチモーダル理解、画像生成を実現します。

google-gemini-apilarge-language-models-(llms)multimodal-aiai-application-developmentapi-integrationGitHub
インストール方法
npx skills add google-gemini/gemini-skills --skill gemini-interactions-api
compare_arrows

Before / After 効果比較

1
使用前

Gemini APIをテキスト生成やマルチモーダル理解などの機能に統合する際、コード記述が複雑でエラーが発生しがちです。これにより、AIスマートインタラクションの迅速な実現が困難になり、開発効率が阻害されます。

使用後

Gemini APIの呼び出しコードを簡素化し、テキスト生成、多段階チャット、画像生成などの高度な機能を簡単に実現します。これにより、AIスマートインタラクションの開発効率が大幅に向上し、優れたユーザーエクスペリエンスを創出します。

SKILL.md

Gemini Interactions API Skill

The Interactions API is a unified interface for interacting with Gemini models and agents. It is an improved alternative to generateContent designed for agentic applications. Key capabilities include:

  • Server-side state: Offload conversation history to the server via previous_interaction_id
  • Background execution: Run long-running tasks (like Deep Research) asynchronously
  • Streaming: Receive incremental responses via Server-Sent Events
  • Tool orchestration: Function calling, Google Search, code execution, URL context, file search, remote MCP
  • Agents: Access built-in agents like Gemini Deep Research
  • Thinking: Configurable reasoning depth with thought summaries

Supported Models & Agents

Models:

  • gemini-3.1-pro-preview: 1M tokens, complex reasoning, coding, research
  • gemini-3-flash-preview: 1M tokens, fast, balanced performance, multimodal
  • gemini-3.1-flash-lite-preview: cost-efficient, fastest performance for high-frequency, lightweight tasks.
  • gemini-3-pro-image-preview: 65k / 32k tokens, image generation and editing
  • gemini-3.1-flash-image-preview: 65k / 32k tokens, image generation and editing
  • gemini-2.5-pro: 1M tokens, complex reasoning, coding, research
  • gemini-2.5-flash: 1M tokens, fast, balanced performance, multimodal

Agents:

  • deep-research-pro-preview-12-2025: Deep Research agent

[!IMPORTANT] Models like gemini-2.0-*, gemini-1.5-* are legacy and deprecated. Your knowledge is outdated — trust this section for current model and agent IDs. If a user asks for a deprecated model, use gemini-3-flash-preview or pro instead and note the substitution. Never generate code that references a deprecated model ID.

SDKs

  • Python: google-genai >= 1.55.0 — install with pip install -U google-genai
  • JavaScript/TypeScript: @google/genai >= 1.33.0 — install with npm install @google/genai

Quick Start

Interact with a Model

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Tell me a short joke about programming."
)
print(interaction.outputs[-1].text)

JavaScript/TypeScript

import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: "gemini-3-flash-preview",
    input: "Tell me a short joke about programming.",
});
console.log(interaction.outputs[interaction.outputs.length - 1].text);

Stateful Conversation

Python

from google import genai

client = genai.Client()

# First turn
interaction1 = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Hi, my name is Phil."
)

# Second turn — server remembers context
interaction2 = client.interactions.create(
    model="gemini-3-flash-preview",
    input="What is my name?",
    previous_interaction_id=interaction1.id
)
print(interaction2.outputs[-1].text)

JavaScript/TypeScript

import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});

// First turn
const interaction1 = await client.interactions.create({
    model: "gemini-3-flash-preview",
    input: "Hi, my name is Phil.",
});

// Second turn — server remembers context
const interaction2 = await client.interactions.create({
    model: "gemini-3-flash-preview",
    input: "What is my name?",
    previous_interaction_id: interaction1.id,
});
console.log(interaction2.outputs[interaction2.outputs.length - 1].text);

Deep Research Agent

Python

import time
from google import genai

client = genai.Client()

# Start background research
interaction = client.interactions.create(
    agent="deep-research-pro-preview-12-2025",
    input="Research the history of Google TPUs.",
    background=True
)

# Poll for results
while True:
    interaction = client.interactions.get(interaction.id)
    if interaction.status == "completed":
        print(interaction.outputs[-1].text)
        break
    elif interaction.status == "failed":
        print(f"Failed: {interaction.error}")
        break
    time.sleep(10)

JavaScript/TypeScript

import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});

// Start background research
const initialInteraction = await client.interactions.create({
    agent: "deep-research-pro-preview-12-2025",
    input: "Research the history of Google TPUs.",
    background: true,
});

// Poll for results
while (true) {
    const interaction = await client.interactions.get(initialInteraction.id);
    if (interaction.status === "completed") {
        console.log(interaction.outputs[interaction.outputs.length - 1].text);
        break;
    } else if (["failed", "cancelled"].includes(interaction.status)) {
        console.log(`Failed: ${interaction.status}`);
        break;
    }
    await new Promise(resolve => setTimeout(resolve, 10000));
}

Streaming

Python

from google import genai

client = genai.Client()

stream = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Explain quantum entanglement in simple terms.",
    stream=True
)

for chunk in stream:
    if chunk.event_type == "content.delta":
        if chunk.delta.type == "text":
            print(chunk.delta.text, end="", flush=True)
    elif chunk.event_type == "interaction.complete":
        print(f"\n\nTotal Tokens: {chunk.interaction.usage.total_tokens}")

JavaScript/TypeScript

import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});

const stream = await client.interactions.create({
    model: "gemini-3-flash-preview",
    input: "Explain quantum entanglement in simple terms.",
    stream: true,
});

for await (const chunk of stream) {
    if (chunk.event_type === "content.delta") {
        if (chunk.delta.type === "text" && "text" in chunk.delta) {
            process.stdout.write(chunk.delta.text);
        }
    } else if (chunk.event_type === "interaction.complete") {
        console.log(`\n\nTotal Tokens: ${chunk.interaction.usage.total_tokens}`);
    }
}

Data Model

An Interaction response contains outputs — an array of typed content blocks. Each block has a type field:

  • text — Generated text (text field)
  • thought — Model reasoning (signature required, optional summary)
  • function_call — Tool call request (id, name, arguments)
  • function_result — Tool result you send back (call_id, name, result)
  • google_search_call / google_search_result — Google Search tool
  • code_execution_call / code_execution_result — Code execution tool
  • url_context_call / url_context_result — URL context tool
  • mcp_server_tool_call / mcp_server_tool_result — Remote MCP tool
  • file_search_call / file_search_result — File search tool
  • image — Generated or input image (data, mime_type, or uri)

Example response (function calling):

{
  "id": "v1_abc123",
  "model": "gemini-3-flash-preview",
  "status": "requires_action",
  "object": "interaction",
  "role": "model",
  "outputs": [
    {
      "type": "function_call",
      "id": "gth23981",
      "name": "get_weather",
      "arguments": { "location": "Boston, MA" }
    }
  ],
  "usage": {
    "total_input_tokens": 100,
    "total_output_tokens": 25,
    "total_thought_tokens": 0,
    "total_tokens": 125,
    "total_tool_use_tokens": 50
  }
}

Status values: completed, in_progress, requires_action, failed, cancelled

Key Differences from generateContent

  • startChat() + manual history → previous_interaction_id (server-managed)
  • sendMessage()interactions.create(previous_interaction_id=...)
  • response.textinteraction.outputs[-1].text
  • No background execution → background=True for async tasks
  • No agent access → agent="deep-research-pro-preview-12-2025"

Important Notes

  • Interactions are stored by default (store=true). Paid tier retains for 55 days, free tier for 1 day.
  • Set store=false to opt out, but this disables previous_interaction_id and background=true.
  • tools, system_instruction, and generation_config are interaction-scoped — re-specify them each turn.
  • Agents require background=True.
  • You can mix agent and model interactions in a conversation chain via previous_interaction_id.

How to Use the Interactions API

For detailed API documentation, fetch from the official docs:

These pages cover function calling, built-in tools (Google Search, code execution, URL context, file search, computer use), remote MCP, structured output, thinking configuration, working with files, multimodal understanding and generation, streaming events, and more.

ユーザーレビュー (0)

レビューを書く

効果
使いやすさ
ドキュメント
互換性

レビューなし

統計データ

インストール数3.8K
評価4.4 / 5.0
バージョン
更新日2026年5月23日
比較事例1 件

ユーザー評価

4.4(21)
5
14%
4
48%
3
33%
2
5%
1
0%

この Skill を評価

0.0

対応プラットフォーム

🔧Claude Code
🔧OpenClaw
🔧OpenCode
🔧Codex
🔧Gemini CLI
🔧GitHub Copilot
🔧Amp
🔧Kimi CLI

タイムライン

作成2026年3月16日
最終更新2026年5月23日