---
id: daily-tavily-extract
name: "tavily-extract"
url: https://skills.yangsir.net/skill/daily-tavily-extract
author: tavily-ai
domain: ai-data-management-analysis
tags: ["data-extraction", "content-parsing", "markdown-conversion", "information-retrieval", "text-extraction"]
install_count: 7200
rating: 4.50 (21 reviews)
github: https://github.com/tavily-ai/skills
---

# tavily-extract

> Extracts clean, distraction-free Markdown or plain-text content from specified URLs, making it easier to read, analyze, or process further.

**Stats**: 7,200 installs · 4.5/5 (21 reviews)

## Readme

Extract clean markdown or text content from one or more URLs.

## Before running any command

If `tvly` is not found on PATH, install it first:

```
curl -fsSL https://cli.tavily.com/install.sh | bash && tvly login
```

Do not skip this step or fall back to other tools.

See [tavily-cli](https://github.com/tavily-ai/skills/blob/HEAD/skills/tavily-extract/../tavily-cli/SKILL.md) for alternative install methods and auth options.
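The PATH check described above can be scripted. The following is a minimal sketch, not part of the skill itself: `have_cmd` and `ensure_tvly` are hypothetical helper names, and the only real command involved is the shell built-in `command -v`.

```bash
#!/usr/bin/env bash

# Return 0 if the named command is on PATH, non-zero otherwise.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical guard: stop early with a hint instead of falling back to other tools.
ensure_tvly() {
  if ! have_cmd tvly; then
    echo "tvly not found on PATH; run the install command first, then 'tvly login'" >&2
    return 1
  fi
}

# Usage (commented out so that sourcing this file has no side effects):
# ensure_tvly && tvly extract "https://example.com/article" --json
```

Calling `ensure_tvly` at the top of a script makes a missing install fail loudly rather than partway through a batch.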

## When to use

- You have a specific URL and want its content

- You need text from JavaScript-rendered pages

- Step 2 in the [workflow](https://github.com/tavily-ai/skills/blob/HEAD/skills/tavily-extract/../tavily-cli/SKILL.md): search → **extract** → map → crawl → research

## Quick start

```
# Single URL
tvly extract "https://example.com/article" --json

# Multiple URLs
tvly extract "https://example.com/page1" "https://example.com/page2" --json

# Query-focused extraction (returns relevant chunks only)
tvly extract "https://example.com/docs" --query "authentication API" --chunks-per-source 3 --json

# JS-heavy pages
tvly extract "https://app.example.com" --extract-depth advanced --json

# Save to file
tvly extract "https://example.com/article" -o article.md
```

## Options

| Option | Description |
| --- | --- |
| `--query` | Rerank chunks by relevance to this query |
| `--chunks-per-source` | Chunks per URL (1-5, requires `--query`) |
| `--extract-depth` | `basic` (default) or `advanced` (for JS pages) |
| `--format` | `markdown` (default) or `text` |
| `--include-images` | Include image URLs |
| `--timeout` | Max wait time (1-60 seconds) |
| `-o, --output` | Save output to file |
| `--json` | Structured JSON output |
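Because `--chunks-per-source` only accepts 1-5 and requires `--query`, a small wrapper can enforce both before any request is made. This is a sketch under assumptions: `focused_extract` is a hypothetical function name, and `EXTRACT_CMD` is a hypothetical override (defaulting to `tvly`) that exists only so the wrapper can be dry-run with `echo`. The flags themselves are the ones listed in the table.

```bash
#!/usr/bin/env bash

# Query-focused extraction with a range check on the chunk count.
# Arguments: URL, query string, chunks per source (defaults to 3).
focused_extract() {
  local url=$1 query=$2 chunks=${3:-3}
  local cmd=${EXTRACT_CMD:-tvly}   # hypothetical override for dry runs

  # The documented range for --chunks-per-source is 1-5.
  case $chunks in
    [1-5]) ;;
    *)
      echo "chunks-per-source must be between 1 and 5, got: $chunks" >&2
      return 2
      ;;
  esac

  # --query is always passed, since --chunks-per-source requires it.
  "$cmd" extract "$url" --query "$query" --chunks-per-source "$chunks" --json
}

# Dry run (prints the command line instead of calling the CLI):
# EXTRACT_CMD=echo focused_extract "https://example.com/docs" "authentication API" 3
```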

## Extract depth

| Depth | When to use |
| --- | --- |
| `basic` | Simple pages, fast; try this first |
| `advanced` | JS-rendered SPAs, dynamic content, tables |
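The "try `basic` first" guidance can be sketched as a fallback function. The source does not say how missing content shows up, so this sketch treats a non-zero exit status or an empty output file as the signal; that heuristic, the function name `extract_with_fallback`, and the `EXTRACT_CMD` override are all assumptions.

```bash
#!/usr/bin/env bash

# Try basic depth first, then retry with advanced depth.
# Arguments: URL, output file path.
extract_with_fallback() {
  local url=$1 out=$2
  local cmd=${EXTRACT_CMD:-tvly}   # hypothetical override for testing

  # First attempt: basic depth (the default), which is the faster option.
  # Assumption: failure shows up as a non-zero exit or an empty output file.
  if "$cmd" extract "$url" --extract-depth basic -o "$out" && [ -s "$out" ]; then
    return 0
  fi

  # Fallback: advanced depth, intended for JS-rendered or dynamic pages.
  "$cmd" extract "$url" --extract-depth advanced -o "$out" && [ -s "$out" ]
}

# Usage (commented out so that sourcing this file makes no requests):
# extract_with_fallback "https://app.example.com" page.md
```

An empty-file check will not catch pages that return partial content, so a stricter test (for example, a minimum size) may suit some sites better.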

## Tips

- **Max 20 URLs per request** — batch larger lists into multiple calls.

- **Use `--query` + `--chunks-per-source`** to get only relevant content instead of full pages.

- **Try `basic` first**, fall back to `advanced` if content is missing.

- **Set `--timeout`** for slow pages (up to 60s).

- If search results already contain the content you need (via `--include-raw-content`), skip the extract step.
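The 20-URL limit from the first tip can be handled by slicing a longer list into batches. A minimal sketch, assuming bash arrays are available; `extract_in_batches` is a hypothetical function name and `EXTRACT_CMD` a hypothetical override that defaults to `tvly`.

```bash
#!/usr/bin/env bash

# Call the extract command once per batch of at most 20 URLs.
# Arguments: any number of URLs.
extract_in_batches() {
  local cmd=${EXTRACT_CMD:-tvly}   # hypothetical override for dry runs
  local batch_size=20              # documented per-request maximum
  local -a urls=("$@")
  local i

  for ((i = 0; i < ${#urls[@]}; i += batch_size)); do
    # Slice out the next batch; the last one may hold fewer than 20 URLs.
    "$cmd" extract "${urls[@]:i:batch_size}" --json
  done
}

# Usage, reading one URL per line from a file (mapfile needs bash 4+):
# mapfile -t all_urls < urls.txt
# extract_in_batches "${all_urls[@]}" > results.jsonl
```

Each batch writes its own JSON document to standard output, so downstream tooling should expect several documents rather than one merged result.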

## See also

- [tavily-search](https://github.com/tavily-ai/skills/blob/HEAD/skills/tavily-extract/../tavily-search/SKILL.md) — find pages when you don't have a URL

- [tavily-crawl](https://github.com/tavily-ai/skills/blob/HEAD/skills/tavily-extract/../tavily-crawl/SKILL.md) — extract content from many pages on a site


---
*Source: https://skills.yangsir.net/skill/daily-tavily-extract*
*Markdown mirror: https://skills.yangsir.net/api/skill/daily-tavily-extract/markdown*