4. XML external entities
XML entities can be abused to request local data or files.
What it is
XML is a data format used to describe different data elements. As a rich and sophisticated standard, XML also introduces “entities” to help define related data. Entities can access local or remote content, which can be as harmless as pulling a schema definition or the current stock price from a third-party website. Entities can, however, be used maliciously to request data or files, even if that data was never intended for outside access. If the application accepts XML directly or via XML uploads, processes SAML federated identity requests, or uses SOAP prior to version 1.2, it may be vulnerable.
How it works
An attacker submits XML that declares an external entity pointing at a local file, so that the site, device, or app reads the file and displays its contents when it expands the entity. If a developer uses a common or default filename in a common location, an attacker’s job is easy.
Why it’s bad
Attackers can read any local data that the XML processor has permission to access, or can pivot to attack other internal systems.
Countermeasures
Whenever possible, use less complex data formats, such as JSON, and avoid serialization of sensitive data.
Patch or upgrade all XML processors and libraries in use by the application.
Most of the time, you can safely disable XML external entity and DTD processing in an XML parser.
Use the OWASP cheat sheet for prevention options: https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html.
Implement positive (allow list) server-side input validation, filtering, or sanitization to prevent hostile data within XML documents, headers, or nodes.
Verify that XML or XSL file upload functionality validates incoming XML using XSD validation or something similar.
Use SAST tools like SonarQube, which can help detect XXE in the source code.
Utilize manual code reviews in large, complex applications with many integrations.
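The countermeasure of disabling DTD processing can be sketched in Python. This is a minimal illustration rather than a complete defense: it assumes Python's standard-library expat bindings and ElementTree, and the names parse_untrusted_xml and DTDForbidden are hypothetical. It refuses any document that contains a DOCTYPE declaration, so no entity can be declared in the first place.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration (hypothetical name)."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # expat calls this handler when it meets <!DOCTYPE ...>; raising aborts the parse.
    raise DTDForbidden(f"DOCTYPE declaration {name!r} is not allowed")


def parse_untrusted_xml(data: bytes) -> ET.Element:
    """Parse XML from an untrusted source, refusing any document that carries a DTD."""
    # First pass: scan the document with expat and stop at the first DOCTYPE.
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    checker.Parse(data, True)

    # Second pass: the document has no DTD, so it cannot declare entities.
    return ET.fromstring(data)
```

Rejecting the DOCTYPE outright is stricter than turning off external entities alone. An application that legitimately needs DTDs would need a narrower policy, and the parsing twice shown here trades some speed for simplicity.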
XML external entities example
An attacker interferes with an application’s processing of XML or substitutes XML using specially crafted DOCTYPE data to perform denial of service (DoS), server-side request forgery (SSRF), or even remote code execution.
Scenario 1
A hacker attempts to extract data from the server by injecting an XXE ENTITY declaration that reads the contents of file:///etc/passwd on the system, potentially exposing stored secrets that can be used to gain direct access to the host.
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo[
<!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<foo>&xxe;</foo>
Scenario 2
A hacker probes the server’s private network by changing the ENTITY line to point at a specific internal host and path, to see whether it is exposed.
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY>
<!ENTITY xxe SYSTEM "http://192.168.0.1/mypasswords.txt">
]>
<foo>&xxe;</foo>
Scenario 3
A hacker attempts a denial-of-service attack by pointing the XXE ENTITY declaration at a potentially endless file.
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY>
<!ENTITY xxe SYSTEM "file:///dev/urandom">
]>
<foo>&xxe;</foo>
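The three scenarios differ only in the URI that the external entity points to. As a rough sketch, assuming Python's standard-library expat bindings (the function name declared_external_entities is hypothetical), a service could list the external entities a document declares without resolving them, for example to log or triage suspicious uploads.

```python
from urllib.parse import urlsplit
from xml.parsers import expat


def declared_external_entities(data: bytes) -> list[tuple[str, str]]:
    """Return (entity name, system URI) pairs declared in the DTD; nothing is fetched."""
    found = []

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        # Internal entities carry a literal value; external ones carry a system ID.
        if system_id is not None:
            found.append((name, system_id))

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    try:
        parser.Parse(data, True)
    except expat.ExpatError:
        pass  # malformed input: report whatever was declared before the error
    return found


# The payload from Scenario 2, used here only as inert test data.
payload = b'''<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY>
<!ENTITY xxe SYSTEM "http://192.168.0.1/mypasswords.txt">
]>
<foo>&xxe;</foo>'''

for name, uri in declared_external_entities(payload):
    # The URI scheme separates local file reads (file) from network probes (http).
    print(name, urlsplit(uri).scheme, uri)
```

This only reports what a document declares. It is not a substitute for disabling DTD and external entity processing in the parser that actually handles the data.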