---
id: gh-firecrawl-parse
name: "firecrawl-parse"
url: https://skills.yangsir.net/skill/gh-firecrawl-parse
author: firecrawl
domain: data-ai
tags: ["document-parsing", "markdown-conversion", "ai-summary", "data-extraction", "cli-tool"]
install_count: 14800
rating: 4.50 (120 reviews)
github: https://github.com/firecrawl/cli/tree/main/skills/firecrawl-parse
---

# firecrawl-parse

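> firecrawl parse converts local documents (such as PDF, DOCX, and HTML) into clean Markdown. It supports AI summaries and question answering, helping users quickly extract key information from documents or draft content. It suits automated document processing and data analysis workflows.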

**Stats**: 14,800 installs · 4.5/5 (120 reviews)

## Before / After Comparison

### Simplified document processing workflow

**Before**:

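With large numbers of local documents in PDF, DOCX, and similar formats, manually extracting key information or converting them into text suitable for AI analysis is slow and error-prone. The resulting formatting is messy, which severely limits how efficiently the data can be used.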

**After**:

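Local documents are automatically converted to clean Markdown, with AI summaries and question answering on top. This greatly shortens document processing time, improves data quality and usability, and speeds up AI application development.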

| Metric | Before | After | Change |
|---|---|---|---|
| Document preparation time | 120 minutes | 5 minutes | -95.8% |

## Readme

# firecrawl parse

Turn a local document into clean markdown on disk. Supports **PDF, DOCX, DOC, ODT, RTF, XLSX, XLS, HTML/HTM/XHTML**.

## When to use

- You have a file on disk (not a URL) and want its text as markdown
- User drops a PDF/DOCX and asks what it says, or to summarize it
- Use `scrape` instead when the source is a URL
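
The rule of thumb above can be sketched as a small shell helper. `pick_firecrawl_command` is a hypothetical name, not part of the CLI; it only prints which subcommand fits the input and does not call `firecrawl` itself.

```bash
# Hypothetical helper (not part of the firecrawl CLI): print the subcommand
# that matches the input, following the "When to use" rules.
pick_firecrawl_command() {
  case "$1" in
    http://*|https://*)
      echo "scrape"              # remote source: use scrape
      ;;
    *)
      if [ -f "$1" ]; then
        echo "parse"             # existing file on disk: use parse
      else
        echo "error: '$1' is neither a URL nor an existing file" >&2
        return 1
      fi
      ;;
  esac
}

# Usage:
#   pick_firecrawl_command ./paper.pdf               # prints "parse" if the file exists
#   pick_firecrawl_command https://example.com/doc   # prints "scrape"
```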

## Quick start

Always save to `.firecrawl/` with `-o` — parsed docs can be hundreds of KB and blow up context if streamed to stdout. Add `.firecrawl/` to `.gitignore`.

```bash
mkdir -p .firecrawl

# File → markdown
firecrawl parse ./paper.pdf -o .firecrawl/paper.md

# AI summary
firecrawl parse ./paper.pdf -S -o .firecrawl/paper-summary.md

# Ask a question about the doc
firecrawl parse ./paper.pdf -Q "What are the main conclusions?" \
  -o .firecrawl/paper-qa.md
```

Then inspect the output with `head`, `grep`, `rg`, and similar tools, or read the file incrementally. Don't load the whole thing at once.
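
One way to inspect a parsed file without loading it all is sketched below. `peek_markdown` is a hypothetical helper built only from standard tools (`wc`, `grep`, `head`); the 40-line default and the 20-heading cap are arbitrary assumptions.

```bash
# Hypothetical helper: show the size, a heading outline, and the first lines
# of a parsed markdown file instead of printing the whole thing.
peek_markdown() {
  md_file=$1
  max_lines=${2:-40}   # assumed default; tune to your context budget

  [ -f "$md_file" ] || { echo "no such file: $md_file" >&2; return 1; }

  echo "== size: $(wc -c < "$md_file" | tr -d ' ') bytes"
  echo "== headings"
  # Line-numbered ATX headings; may also match '# ' comments inside code fences.
  grep -n '^#\{1,6\} ' "$md_file" | head -n 20
  echo "== first $max_lines lines"
  head -n "$max_lines" "$md_file"
}

# Usage:
#   peek_markdown .firecrawl/paper.md        # outline plus first 40 lines
#   peek_markdown .firecrawl/paper.md 10     # outline plus first 10 lines
```

The line numbers in the outline make it easy to jump to one section afterwards, for example with `sed -n '120,180p' .firecrawl/paper.md`.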

## Options

| Option                 | Description                             |
| ---------------------- | --------------------------------------- |
| `-S, --summary`        | AI-generated summary                    |
| `-Q, --query <prompt>` | Ask a question about the parsed content |
| `-o, --output <path>`  | Output file path — **always use this**  |
| `-f, --format <fmt>`   | `markdown` (default), `html`, `summary` |
| `--timeout <ms>`       | Timeout for the parse job               |
| `--timing`             | Show request duration                   |
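
The options in the table can be combined in a single call. The wrapper below is a sketch: `parse_to_html` is a hypothetical name and the 60000 ms default is an assumption. It uses only the flags listed above.

```bash
# Hypothetical wrapper: parse a local file to HTML with a timeout and
# request timing, writing the result to the given output path.
parse_to_html() {
  src=$1
  out=$2
  timeout_ms=${3:-60000}   # assumed default; --timeout takes milliseconds

  mkdir -p "$(dirname "$out")"
  firecrawl parse "$src" -f html --timeout "$timeout_ms" --timing -o "$out"
}

# Usage:
#   parse_to_html ./report.docx .firecrawl/report.html
#   parse_to_html ./report.docx .firecrawl/report.html 120000
```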

## Tips

- Quote paths with spaces: `firecrawl parse "./My Doc.pdf" -o .firecrawl/mydoc.md`.
- Max upload size: **50 MB** per file.
- Credits: roughly 1 per PDF page; an HTML file costs a flat 1 credit.
- Check `.firecrawl/` before re-parsing the same file.
- Run `firecrawl credit-usage` to check your credit balance, which is worth doing before batch jobs.
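
Two of these tips, the size limit and the check before re-parsing, can be folded into one guard. This is a minimal sketch: `parse_cached` and `MAX_BYTES` are hypothetical names, and reading "50 MB" as 50 × 1024 × 1024 bytes is an assumption.

```bash
# Assumption: the 50 MB limit is read as 50 * 1024 * 1024 bytes.
MAX_BYTES=$((50 * 1024 * 1024))

# Hypothetical helper: parse a file into .firecrawl/ unless a non-empty
# result already exists or the source exceeds the upload limit.
parse_cached() {
  src=$1
  base=$(basename "$src")
  out=".firecrawl/${base%.*}.md"     # ./My Doc.pdf -> .firecrawl/My Doc.md

  if [ -s "$out" ]; then
    echo "cached: $out"              # reuse the earlier result, spend no credits
    return 0
  fi

  size=$(wc -c < "$src" | tr -d ' ')
  if [ "$size" -gt "$MAX_BYTES" ]; then
    echo "skip: $src is $size bytes, over the upload limit" >&2
    return 1
  fi

  mkdir -p .firecrawl
  firecrawl parse "$src" -o "$out" && echo "parsed: $out"
}

# Usage:
#   parse_cached "./My Doc.pdf"      # first call parses, later calls hit the cache
```

Note that the cache key is only the file name, so two different files named `report.pdf` in separate directories would collide. Hashing the path or the content would avoid that at the cost of less readable output names.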

## See also

- [firecrawl-scrape](../firecrawl-scrape/SKILL.md) — same idea for URLs


---
*Source: https://skills.yangsir.net/skill/gh-firecrawl-parse*
*Markdown mirror: https://skills.yangsir.net/api/skill/gh-firecrawl-parse/markdown*