tavily-extract

by @tavily-ai
4.5(21)

Extracts clean, distraction-free Markdown or plain-text content from specified URLs, ready for reading, analysis, or further processing.

data-extraction, content-parsing, markdown-conversion, information-retrieval, text-extraction
Installation
npx skills add tavily-ai/skills --skill tavily-extract
Before / After Comparison

Before

Copying content from web pages by hand often pulls in ads, navigation elements, and messy formatting, which then takes significant time to clean up manually.

After

The Tavily Extract skill can intelligently extract clean Markdown or text content from a specified URL, eliminating tedious post-extraction cleanup.

SKILL.md

tavily extract

Extract clean markdown or text content from one or more URLs.

Before running any command

If tvly is not found on PATH, install it first:

curl -fsSL https://cli.tavily.com/install.sh | bash && tvly login

Do not skip this step or fall back to other tools.

See tavily-cli for alternative install methods and auth options.
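
The PATH check described above can be scripted. The following is a minimal sketch, not part of the skill itself: the function name `require_tvly` is hypothetical, and it only reports the install command quoted above rather than running it.

```shell
# Hypothetical helper: succeeds if `tvly` is resolvable, otherwise
# prints the install command from this document and fails.
require_tvly() {
  if command -v tvly >/dev/null 2>&1; then
    return 0
  fi
  echo "tvly not found on PATH; install it first:" >&2
  echo "  curl -fsSL https://cli.tavily.com/install.sh | bash && tvly login" >&2
  return 1
}
```

Calling `require_tvly || exit 1` at the top of a script stops it early instead of falling back to another tool.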

When to use

  • You have a specific URL and want its content

  • You need text from JavaScript-rendered pages

  • Step 2 in the workflow: search → extract → map → crawl → research

Quick start

# Single URL
tvly extract "https://example.com/article" --json

# Multiple URLs
tvly extract "https://example.com/page1" "https://example.com/page2" --json

# Query-focused extraction (returns relevant chunks only)
tvly extract "https://example.com/docs" --query "authentication API" --chunks-per-source 3 --json

# JS-heavy pages
tvly extract "https://app.example.com" --extract-depth advanced --json

# Save to file
tvly extract "https://example.com/article" -o article.md

Options

| Option | Description |
|---|---|
| --query | Rerank chunks by relevance to this query |
| --chunks-per-source | Chunks per URL (1-5, requires --query) |
| --extract-depth | basic (default) or advanced (for JS pages) |
| --format | markdown (default) or text |
| --include-images | Include image URLs |
| --timeout | Max wait time (1-60 seconds) |
| -o, --output | Save output to file |
| --json | Structured JSON output |
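
To illustrate how the documented ranges interact, here is a small sketch of a wrapper that validates `--chunks-per-source` (1-5) and `--timeout` (1-60) before calling the CLI. The function name `extract_relevant`, the `TVLY_BIN` override, and the default values of 3 chunks and 30 seconds are assumptions made for this example, not part of tvly.

```shell
# Hypothetical wrapper around `tvly extract` for query-focused extraction.
# Usage: extract_relevant URL QUERY [CHUNKS] [TIMEOUT_SECONDS]
# TVLY_BIN is an assumed override so the command can be swapped for testing.
extract_relevant() {
  url=$1
  query=$2
  chunks=${3:-3}     # assumed default, not a documented one
  timeout=${4:-30}   # assumed default, not a documented one

  # --chunks-per-source accepts 1-5 and requires --query.
  case $chunks in
    [1-5]) ;;
    *) echo "chunks must be between 1 and 5" >&2; return 2 ;;
  esac

  # --timeout accepts 1-60 seconds; non-numeric input also lands here.
  if ! { [ "$timeout" -ge 1 ] && [ "$timeout" -le 60 ]; } 2>/dev/null; then
    echo "timeout must be between 1 and 60 seconds" >&2
    return 2
  fi

  "${TVLY_BIN:-tvly}" extract "$url" \
    --query "$query" \
    --chunks-per-source "$chunks" \
    --timeout "$timeout" \
    --json
}
```

For example, `extract_relevant "https://example.com/docs" "authentication API" 3 20` would run the query-focused command from the quick start with a 20-second timeout.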

Extract depth

| Depth | When to use |
|---|---|
| basic | Simple pages, fast; try this first |
| advanced | JS-rendered SPAs, dynamic content, tables |
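
The "basic first, then advanced" approach can be sketched as a shell function. This is an illustration under stated assumptions: the name `extract_with_fallback` and the `TVLY_BIN` override are hypothetical, the sketch assumes `tvly extract` writes the extracted content to standard output when `--json` is omitted, and it treats a failed command or whitespace-only output as "content is missing".

```shell
# Hypothetical helper: try basic depth, retry with advanced if nothing came back.
# TVLY_BIN is an assumed override so the command can be swapped for testing.
extract_with_fallback() {
  url=$1

  # First attempt: basic depth (fast). A failure is treated as empty output.
  content=$("${TVLY_BIN:-tvly}" extract "$url" --extract-depth basic) || content=""

  # Assumption: whitespace-only output means the content is missing.
  if [ -z "$(printf '%s' "$content" | tr -d '[:space:]')" ]; then
    content=$("${TVLY_BIN:-tvly}" extract "$url" --extract-depth advanced) || return 1
  fi

  printf '%s\n' "$content"
}
```

A real check for missing content may need to be stricter, for example requiring a minimum length, since a JS-rendered page can return a short placeholder rather than nothing.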

Tips

  • Max 20 URLs per request — batch larger lists into multiple calls.

  • Use --query + --chunks-per-source to get only relevant content instead of full pages.

  • Try basic first, fall back to advanced if content is missing.

  • Set --timeout for slow pages (up to 60s).

  • If search results already contain the content you need (via --include-raw-content), skip the extract step.
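
The 20-URL limit from the first tip can be handled with a small batching loop. The sketch below is one way to do it, assuming URLs arrive one per line on standard input; the function name `extract_in_batches` and the `TVLY_BIN` override are hypothetical and exist only for this example.

```shell
# Hypothetical helper: reads URLs (one per line) from stdin and calls
# `tvly extract` once per batch of at most 20 URLs.
# TVLY_BIN is an assumed override so the command can be swapped for testing.
extract_in_batches() {
  batch_size=20
  set --                              # reuse positional params as the batch
  while IFS= read -r url; do
    [ -n "$url" ] || continue         # skip blank lines
    set -- "$@" "$url"
    if [ "$#" -eq "$batch_size" ]; then
      "${TVLY_BIN:-tvly}" extract "$@" --json
      set --                          # start a new batch
    fi
  done
  # Flush the final partial batch, if any.
  if [ "$#" -gt 0 ]; then
    "${TVLY_BIN:-tvly}" extract "$@" --json
  fi
}
```

Usage would look like `extract_in_batches < urls.txt`, where `urls.txt` is a hypothetical file of URLs. Each batch produces its own JSON document on standard output, so downstream tooling needs to handle several documents rather than one.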

See also

Weekly Installs: 280
Repository: tavily-ai/skills
GitHub Stars: 95
First Seen: 2 days ago
Security Audits: Gen Agent Trust Hub (Fail), Socket (Pass), Snyk (Fail)
Installed on: codex (275), opencode (274), cursor (274), kimi-cli (273), gemini-cli (273), amp (273)

Statistics

Installs: 7.2K
Rating: 4.5 / 5.0
Version
Updated: May 22, 2026
Comparisons: 1

User Rating

4.5 (21)
5 stars: 71%
4 stars: 29%
3 stars: 0%
2 stars: 0%
1 star: 0%

Compatible Platforms

🔧Claude Code

Timeline

Created: March 18, 2026
Last Updated: May 22, 2026