---
id: gh-defuddle
name: "defuddle"
url: https://skills.yangsir.net/skill/gh-defuddle
author: kepano
domain: ai-data-management-analysis
tags: ["web-scraping", "markdown", "content-extraction", "cli", "ai-optimization"]
install_count: 23700
rating: 4.60 (120 reviews)
github: https://github.com/kepano/obsidian-skills/tree/main/skills/defuddle
---

# defuddle

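> Defuddle CLI extracts clean Markdown from web pages, stripping clutter and navigation to reduce the tokens an AI model has to process. It works well for online documentation, articles, and blog posts.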

**Stats**: 23,700 installs · 4.6/5 (120 reviews)

## Before / After Comparison

### Web content processing efficiency

**Before**:

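Fetching a page directly, or with conventional scraping methods, pulls in a lot of irrelevant material such as ads, navigation, and footers. The result is hard to read and consumes many tokens during AI processing, which raises cost and processing time.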

**After**:

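Defuddle CLI cleans the page automatically, keeps only the core content, and outputs it as Markdown. This makes the content easier to read and cuts the number of tokens the AI has to process, lowering cost and speeding up analysis.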

| Metric | Before | After | Change |
|---|---|---|---|
| AI token consumption | 1,000 tokens | 200 tokens | -80% |

## Readme

# Defuddle

Use Defuddle CLI to extract clean, readable content from web pages. Prefer it over WebFetch for standard web pages: it removes navigation, ads, and clutter, which reduces token usage.

If not installed: `npm install -g defuddle`

## Usage

Always use `--md` for markdown output:

```bash
defuddle parse <url> --md
```

Save to file:

```bash
defuddle parse <url> --md -o content.md
```
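
The save-to-file command can be wrapped in a small shell function that fails early when the CLI is missing. This is a minimal sketch, not part of Defuddle itself: the function name `save_markdown` and the default output path are assumptions made for illustration, and only the `parse`, `--md`, and `-o` options shown above are used.

```bash
# Hypothetical helper (not shipped with defuddle): save a page as Markdown.
save_markdown() {
  local url="$1"
  local out="${2:-content.md}"   # assumed default, mirroring the example above

  # Stop with a hint if the CLI is not on PATH.
  if ! command -v defuddle >/dev/null 2>&1; then
    echo "defuddle not found; install with: npm install -g defuddle" >&2
    return 127
  fi

  # Same invocation as above, with the URL and output path quoted.
  defuddle parse "$url" --md -o "$out"
}
```

For example, `save_markdown "<url>" notes.md` would write the extracted Markdown to `notes.md`.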

Extract specific metadata:

```bash
defuddle parse <url> -p title
defuddle parse <url> -p description
defuddle parse <url> -p domain
```
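
To read several properties in one go, the `-p` calls above can be looped. The sketch below assumes nothing beyond the three property names already shown; `print_metadata` is a hypothetical name, not a Defuddle command.

```bash
# Hypothetical helper: print a few metadata properties for one URL.
print_metadata() {
  local url="$1"
  local prop
  for prop in title description domain; do
    # One defuddle call per property, exactly as in the commands above.
    printf '%s: %s\n' "$prop" "$(defuddle parse "$url" -p "$prop")"
  done
}
```

Each property is requested with a separate invocation, so this likely processes the page once per property; for a single field, the plain `-p` command is simpler.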

## Output formats

| Flag | Format |
|------|--------|
| `--md` | Markdown (recommended) |
| `--json` | JSON with both HTML and markdown |
| (none) | HTML |
| `-p <name>` | Specific metadata property |
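
The table above can be expressed as a small dispatcher that picks the flag from a format name. This is only a sketch: `defuddle_as` and the format names `md`, `json`, and `html` are assumptions chosen for illustration, while the flags themselves come from the table.

```bash
# Hypothetical wrapper: choose the output flag by format name.
defuddle_as() {
  local format="$1"
  local url="$2"
  case "$format" in
    md)   defuddle parse "$url" --md ;;
    json) defuddle parse "$url" --json ;;
    html) defuddle parse "$url" ;;   # no flag: HTML output
    *)    echo "unknown format: $format" >&2; return 2 ;;
  esac
}
```

Calling `defuddle_as md "<url>"` is then equivalent to `defuddle parse <url> --md`.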


---
*Source: https://skills.yangsir.net/skill/gh-defuddle*
*Markdown mirror: https://skills.yangsir.net/api/skill/gh-defuddle/markdown*