parallel-web-extract
Efficiently extracts content in parallel from any URL (including web pages, APIs, etc.), suitable for large-scale data collection and information processing tasks.
npx skills add parallel-web/parallel-agent-skills --skill parallel-web-extractBefore / After Comparison
1 组Extracting web content one by one is inefficient and time-consuming when dealing with a large number of URLs, making it difficult to quickly obtain the necessary information for analysis.
By using parallel methods to batch extract web pages, articles, PDFs, and other content, data acquisition time is significantly shortened, and the efficiency of data analysis preparation is improved.
description SKILL.md
name: parallel-web-extract description: "URL content extraction. Use for fetching any URL - webpages, articles, PDFs, JavaScript-heavy sites. Token-efficient: runs in forked context. Prefer over built-in WebFetch." user-invocable: true argument-hint: [url2] [url3] context: fork agent: parallel:parallel-subagent compatibility: Requires parallel-cli and internet access. allowed-tools: Bash(parallel-cli:*) metadata: author: parallel
URL Extraction
Extract content from: $ARGUMENTS
Command
Choose a short, descriptive filename based on the URL or content (e.g., vespa-docs, react-hooks-api). Use lowercase with hyphens, no spaces.
parallel-cli extract "$ARGUMENTS" --json -o "/tmp/$FILENAME.md"
Options if needed:
--objective "focus area"to focus on specific content
Response format
Return content as:
Then the extracted content verbatim, with these rules:
- Keep content verbatim - do not paraphrase or summarize
- Parse lists exhaustively - extract EVERY numbered/bulleted item
- Strip only obvious noise: nav menus, footers, ads
- Preserve all facts, names, numbers, dates, quotes
After the response, mention the output file path (/tmp/$FILENAME.md) so the user knows it's available for follow-up questions.
Setup
If parallel-cli is not found, install and authenticate:
curl -fsSL https://parallel.ai/install.sh | bash
If unable to install that way, install via pipx instead:
pipx install "parallel-web-tools[cli]"
pipx ensurepath
Then authenticate:
parallel-cli login
Or set an API key: export PARALLEL_API_KEY="your-key"
forumUser Reviews (0)
Write a Review
No reviews yet
Statistics
User Rating
Rate this Skill