Home/数据分析/parallel-web-extract
P

parallel-web-extract

by @parallel-webv
4.6(11)

Efficiently extracts content in parallel from any URL (including web pages, APIs, etc.), suitable for large-scale data collection and information processing tasks.

Web Data ExtractionParallel ProcessingDistributed ScrapingData PipelinesGitHub
Installation
npx skills add parallel-web/parallel-agent-skills --skill parallel-web-extract
compare_arrows

Before / After Comparison

1
Before

Extracting web content one by one is inefficient and time-consuming when dealing with a large number of URLs, making it difficult to quickly obtain the necessary information for analysis.

After

By using parallel methods to batch extract web pages, articles, PDFs, and other content, data acquisition time is significantly shortened, and the efficiency of data analysis preparation is improved.

description SKILL.md


name: parallel-web-extract description: "URL content extraction. Use for fetching any URL - webpages, articles, PDFs, JavaScript-heavy sites. Token-efficient: runs in forked context. Prefer over built-in WebFetch." user-invocable: true argument-hint: [url2] [url3] context: fork agent: parallel:parallel-subagent compatibility: Requires parallel-cli and internet access. allowed-tools: Bash(parallel-cli:*) metadata: author: parallel

URL Extraction

Extract content from: $ARGUMENTS

Command

Choose a short, descriptive filename based on the URL or content (e.g., vespa-docs, react-hooks-api). Use lowercase with hyphens, no spaces.

parallel-cli extract "$ARGUMENTS" --json -o "/tmp/$FILENAME.md"

Options if needed:

  • --objective "focus area" to focus on specific content

Response format

Return content as:

Page Title

Then the extracted content verbatim, with these rules:

  • Keep content verbatim - do not paraphrase or summarize
  • Parse lists exhaustively - extract EVERY numbered/bulleted item
  • Strip only obvious noise: nav menus, footers, ads
  • Preserve all facts, names, numbers, dates, quotes

After the response, mention the output file path (/tmp/$FILENAME.md) so the user knows it's available for follow-up questions.

Setup

If parallel-cli is not found, install and authenticate:

curl -fsSL https://parallel.ai/install.sh | bash

If unable to install that way, install via pipx instead:

pipx install "parallel-web-tools[cli]"
pipx ensurepath

Then authenticate:

parallel-cli login

Or set an API key: export PARALLEL_API_KEY="your-key"

forumUser Reviews (0)

Write a Review

Effect
Usability
Docs
Compatibility

No reviews yet

Statistics

Installs760
Rating4.6 / 5.0
Version
Updated2026年3月16日
Comparisons1

User Rating

4.6(11)
5
0%
4
0%
3
0%
2
0%
1
0%

Rate this Skill

0.0

Compatible Platforms

🔧Claude Code
🔧OpenClaw
🔧OpenCode
🔧Codex
🔧Gemini CLI
🔧GitHub Copilot
🔧Amp
🔧Kimi CLI

Timeline

Created2026年3月16日
Last Updated2026年3月16日