---
id: ssh2-wechat-article-extractor
name: "wechat-article-extractor"
url: https://skills.yangsir.net/skill/ssh2-wechat-article-extractor
author: freestylefly
domain: content-creation
tags: ["wechat-articles", "data-extraction", "content-scraping", "information-retrieval"]
install_count: 3000
rating: 4.40 (29 reviews)
github: https://github.com/freestylefly/wechat-article-extractor-skill
---

# wechat-article-extractor

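> One-click extraction skill for WeChat Official Account articles: given a WeChat article URL, it extracts metadata such as the title, author, content, and cover image.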

**Stats**: 3,000 installs · 4.4/5 (29 reviews)

## Before / After Comparison

### One-click extraction of WeChat Official Account article metadata

## Readme

# WeChat Article Extractor

Extract metadata and content from WeChat Official Account (微信公众号) articles.

## Capabilities

- Parse WeChat article URLs (`mp.weixin.qq.com`)

- Extract article metadata: title, author, description, publish time

- Extract account info: name, avatar, alias, description

- Get article content (HTML)

- Get cover image URL

- Support multiple article types: post, video, image, voice, text, repost

- Handle various error cases: deleted content, expired links, access limits
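As a minimal sketch of the URL-parsing capability, the snippet below reads the query parameters of a long-form article link with the built-in `URL` class. `parseWechatUrl` is a hypothetical helper, not part of the skill, and the assumption that `__biz`, `mid`, `idx`, and `sn` correspond to the `account_biz`, `msg_mid`, `msg_idx`, and `msg_sn` fields of the response is inferred from the field names only.

```javascript
// Hypothetical helper (not part of the skill's documented API).
// Returns null for hosts other than mp.weixin.qq.com.
function parseWechatUrl(input) {
  const url = new URL(input);
  if (url.hostname !== 'mp.weixin.qq.com') {
    return null;
  }

  // Convert a numeric query value to an integer, or null when absent/invalid.
  const toInt = (value) => {
    const n = Number.parseInt(value ?? '', 10);
    return Number.isNaN(n) ? null : n;
  };

  const params = url.searchParams;
  return {
    biz: params.get('__biz'),       // assumed to map to account_biz
    mid: toInt(params.get('mid')),  // assumed to map to msg_mid
    idx: toInt(params.get('idx')),  // assumed to map to msg_idx
    sn: params.get('sn'),           // assumed to map to msg_sn
  };
}

// Example with made-up parameter values.
const parsed = parseWechatUrl(
  'https://mp.weixin.qq.com/s?__biz=MzA5&mid=2650000000&idx=1&sn=abc123'
);
console.log(parsed);
```

Links that lack these query parameters yield `null` for every field, so a caller would still need `extract` itself to resolve them.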

## Usage

### Basic Extraction from URL

```
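const { extract } = require('./scripts/extract.js');

// CommonJS has no top-level await, so call extract() inside an async function
(async () => {
  const result = await extract('https://mp.weixin.qq.com/s?__biz=...');
  console.log(result); // On success: { done: true, code: 0, data: {...} }
})();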

```

### Extraction from HTML

```
// `sourceUrl` is the article URL; run this inside an async function
const html = await fetch(sourceUrl).then(r => r.text());
const result = await extract(html, { url: sourceUrl });

```

### Options

```
const result = await extract(url, {
  shouldReturnContent: true,      // Return HTML content (default: true)
  shouldReturnRawMeta: false,     // Return raw metadata (default: false)
  shouldFollowTransferLink: true, // Follow migrated account links (default: true)
  shouldExtractMpLinks: false,    // Extract embedded mp.weixin links (default: false)
  shouldExtractTags: false,       // Extract article tags (default: false)
  shouldExtractRepostMeta: false  // Extract repost source info (default: false)
});

```
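The defaults listed in the comments above can be collected into one object so that a misspelled option name fails loudly instead of being silently ignored. This is only a sketch: `DEFAULT_OPTIONS` and `buildOptions` are hypothetical names, the default values are copied from the comments in the Options example, and `url` is allowed because the HTML example passes it.

```javascript
// Defaults as documented in the Options example (assumed to be accurate).
const DEFAULT_OPTIONS = Object.freeze({
  shouldReturnContent: true,
  shouldReturnRawMeta: false,
  shouldFollowTransferLink: true,
  shouldExtractMpLinks: false,
  shouldExtractTags: false,
  shouldExtractRepostMeta: false,
});

// Other option names seen in this README (the HTML example passes `url`).
const EXTRA_KEYS = new Set(['url']);

// Hypothetical helper: merge overrides onto the defaults and
// reject option names that this README does not mention.
function buildOptions(overrides = {}) {
  for (const key of Object.keys(overrides)) {
    if (!Object.hasOwn(DEFAULT_OPTIONS, key) && !EXTRA_KEYS.has(key)) {
      throw new TypeError(`Unknown extract option: ${key}`);
    }
  }
  return { ...DEFAULT_OPTIONS, ...overrides };
}

// Metadata-only preset: skip the HTML body, keep every other default.
const metadataOnly = buildOptions({ shouldReturnContent: false });
console.log(metadataOnly);
```

The resulting object would be passed as the second argument, for example `extract(url, metadataOnly)`.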

## Response Format

### Success Response

```
{
  done: true,
  code: 0,
  data: {
    // Account info
    account_name: "Official Account name",
    account_alias: "WeChat ID",
    account_avatar: "avatar URL",
    account_description: "account introduction",
    account_id: "original ID",
    account_biz: "biz parameter",
    account_biz_number: 1234567890,
    account_qr_code: "QR code URL",

    // Article info
    msg_title: "article title",
    msg_desc: "article summary",
    msg_content: "HTML content",
    msg_cover: "cover image URL",
    msg_author: "author",
    msg_type: "post", // post|video|image|voice|text|repost
    msg_has_copyright: true,
    msg_publish_time: Date,
    msg_publish_time_str: "2024/01/15 10:30:00",

    // Link params
    msg_link: "article URL",
    msg_source_url: "\"Read more\" (阅读原文) link",
    msg_sn: "sn parameter",
    msg_mid: 1234567890,
    msg_idx: 1
  }
}

```
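Because some fields can be missing depending on the article type, a consumer will usually want to normalize `data` before using it. The sketch below assumes that `msg_publish_time` is a JavaScript `Date` instance, as the placeholder above suggests; `summarize` is a hypothetical helper, and falling back to the account name when no author is given is a design choice of this example, not behavior of the skill.

```javascript
// Hypothetical helper: build a compact, null-safe summary from `data`.
function summarize(data) {
  // Only call toISOString() when we really have a Date instance.
  const publishedAt = data.msg_publish_time instanceof Date
    ? data.msg_publish_time.toISOString()
    : null;

  return {
    title: data.msg_title ?? '(untitled)',
    // Example-specific choice: use the account name when no author is set.
    author: data.msg_author ?? data.account_name ?? null,
    account: data.account_name ?? null,
    type: data.msg_type ?? null,
    publishedAt,
    cover: data.msg_cover ?? null,
    link: data.msg_link ?? null,
  };
}

// Example with made-up values shaped like the success response.
const summary = summarize({
  account_name: 'Example Account',
  msg_title: 'Example title',
  msg_author: null,
  msg_type: 'post',
  msg_publish_time: new Date('2024-01-15T10:30:00Z'),
});
console.log(summary);
```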

### Error Response

```
{
  done: false,
  code: 1001,
  msg: "无法获取文章信息" // "Unable to retrieve article information"
}

```
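Since both shapes share the `done` and `code` fields, callers can branch on them before touching `data`. A minimal sketch, assuming failures are reported through the returned object as shown above; `ExtractError` and `unwrap` are hypothetical names introduced for illustration.

```javascript
// Hypothetical error type carrying the numeric code from the response.
class ExtractError extends Error {
  constructor(code, msg) {
    super(msg || 'Unknown extraction error');
    this.name = 'ExtractError';
    this.code = code;
  }
}

// Hypothetical helper: return `data` on success, throw on anything else.
function unwrap(result) {
  if (result && result.done === true && result.code === 0) {
    return result.data;
  }
  throw new ExtractError(result?.code ?? null, result?.msg);
}

// Example using the two response shapes documented above.
const data = unwrap({ done: true, code: 0, data: { msg_title: 'Example title' } });
console.log(data.msg_title);

try {
  unwrap({ done: false, code: 1001, msg: '无法获取文章信息' });
} catch (err) {
  console.error(err.name, err.code, err.message);
}
```

Converting the error shape into an exception keeps the success path free of repeated `done` checks, at the cost of requiring a `try`/`catch` around each call.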

## Error Codes

| Code | Message | Description |
| --- | --- | --- |
| 1000 | 文章获取失败 | General failure |
| 1001 | 无法获取文章信息 | Missing title or publish time |
| 1002 | 请求失败 | HTTP request failed |
| 1003 | 响应为空 | Empty response |
| 1004 | 访问过于频繁 | Rate limited |
| 1005 | 脚本解析失败 | Script parsing error |
| 1006 | 公众号已迁移 | Account migrated |
| 2001 | 请提供文章内容或链接 | Missing input |
| 2002 | 链接已过期 | Link expired |
| 2003 | 内容涉嫌侵权 | Content removed (copyright) |
| 2004 | 无法获取迁移后的链接 | Migration link failed |
| 2005 | 内容已被发布者删除 | Content deleted by author |
| 2006 | 内容因违规无法查看 | Content blocked |
| 2007 | 内容发送失败 | Failed to send |
| 2008 | 系统出错 | System error |
| 2009 | 不支持的链接 | Unsupported URL |
| 2010 | 内容获取失败 | Content fetch failed |
| 2011 | 涉嫌过度营销 | Marketing/spam content |
| 2012 | 账号已被屏蔽 | Account blocked |
| 2013 | 账号已自主注销 | Account deleted |
| 2014 | 内容被投诉 | Content reported |
| 2015 | 账号处于迁移流程中 | Account migrating |
| 2016 | 冒名侵权 | Impersonation |
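Some of these codes describe conditions that may clear up on their own, while others are permanent. The sketch below retries with exponential backoff on a small set of codes; treating 1002, 1003, and 1004 as retryable is an assumption of this example, not something the skill documents. `extractWithRetry` is a hypothetical wrapper that takes the `extract` function as an argument so it can be exercised without network access.

```javascript
// Assumption: request failures, empty responses, and rate limiting
// are transient; every other code is treated as permanent.
const RETRYABLE_CODES = new Set([1002, 1003, 1004]);

// Hypothetical wrapper: call extractFn(input) up to retries + 1 times,
// waiting baseDelayMs * 2^attempt between attempts.
async function extractWithRetry(extractFn, input, { retries = 3, baseDelayMs = 1000 } = {}) {
  let result;
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    result = await extractFn(input);
    // Stop on success or on a code that retrying is unlikely to fix.
    if (result.done || !RETRYABLE_CODES.has(result.code)) {
      return result;
    }
    if (attempt < retries) {
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  return result; // last failed result after exhausting retries
}

// Usage (inside an async function), with `extract` imported as in the Usage section:
//   const result = await extractWithRetry(extract, articleUrl, { retries: 2 });
```

Doubling the delay on each attempt is a conservative choice for the rate-limit case (1004); the right base delay depends on how aggressively the upstream site throttles.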

## Dependencies

Required npm packages:

- `cheerio` - HTML parsing

- `dayjs` - Date formatting

- `request-promise` - HTTP requests

- `qs` - Query string parsing

- `lodash.unescape` - HTML entities

## Notes

- Handles various WeChat page structures and anti-scraping measures

- Automatically detects article type from page content

- Supports extracting from Sogou WeChat search results (`weixin.sogou.com`)

- Some fields may be null depending on article type and page structure

**Weekly installs**: 895 · **Repository**: [freestylefly/wechat-article-extractor-skill](https://github.com/freestylefly/wechat-article-extractor-skill) · **GitHub stars**: 12 · **First seen**: Feb 21, 2026

**Security audits**: Gen Agent Trust Hub (Warn) · Socket (Pass) · Snyk (Warn)

**Installed on**: opencode 883 · codex 882 · gemini-cli 881 · amp 881 · github-copilot 881 · kimi-cli 881

---
*Source: https://skills.yangsir.net/skill/ssh2-wechat-article-extractor*
*Markdown mirror: https://skills.yangsir.net/api/skill/ssh2-wechat-article-extractor/markdown*