Any comments are HIGHLY appreciated; feedback and questions welcome!
Escape and Data Channels
What is escape code? Why do we use it?
Escape code addresses a complex yet foundational problem: how a computer reads and handles a stream of data, whether it is printing it or executing it. To us, a single " quote is just some text, but to a computer it is a command that marks where a piece of data begins or ends. When we pass data, we have to specify what type it is and how the computer should read it.
Escape code example:
'&#60;img src&#61;&#34;x&#34; onerror&#61;&#34;alert&#40;1&#41;&#34;&#62;';
When we give a user an input field, the computer has to take the input and store it somewhere, in what we know as a variable. Later, when we tell the computer to read that value back, we call the variable, and that gives us our assignment/call loop. When we take an input DIRECTLY from a variable and hand it, unescaped, to something that interprets it (such as HTML markup), we are letting the input run as code.
In a smart component, we cannot avoid passing data through these streams, and therefore we MUST use some form of escape code. Escape code helps us interpret strings and inputs properly: it not only lets us write code freely, it also protects users against XSS (cross-site scripting).
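As a small illustration of the quote problem, here is a sketch in plain JavaScript. The variable names and sample strings are made up for this example; the point is only that a quote inside a string has to be escaped so the computer reads it as data and not as the end of the string.

```javascript
// Hypothetical example values, not taken from any real app.
// Without the backslash, the second ' would end the string early
// and the rest of the line would be a syntax error.
const escaped = 'It\'s just some text';

// JSON.stringify escapes characters that would break a JSON string,
// such as the double quote.
const userInput = 'He said "hi"';
const asJson = JSON.stringify(userInput);

console.log(escaped); // It's just some text
console.log(asJson);  // "He said \"hi\""
```

Parsing asJson with JSON.parse gives back the original userInput, which is the whole idea of escaping: the data survives the trip unchanged.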
innerHTML and Injection
wait, what?
First, we need to know how innerHTML works, and then we can move on to why we can inject JS through it.
Second, injection here means getting the page to run JavaScript that was supplied from the client side rather than written by the site's author.
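Now, when our site's JS assigns a string to innerHTML, the browser doesn't just store it as text. It parses the string as HTML and builds real elements from it. This means that if you use
element.innerHTML = '<input />';
the page gets an actual input element. So when user input containing markup, such as <script> </script> tags, ends up in that string, the user is injecting markup directly into the site, like so:
const HACKS = `<script defer> alert("xScriptatK");</script>`;
One caveat: browsers do not execute a <script> element inserted through innerHTML, with or without defer. Event-handler attributes such as onerror do run, which is why the attacks shown later in this post use an <img> tag with an onerror handler.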
This is a common way for attackers to steal site data: injected code can read cookies and localStorage, sometimes make fetch() requests that alter data, and then send what it finds to a remote location.
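To make the injection idea concrete, here is a harmless sketch that only builds strings and never touches the DOM. The helper name buildImageMarkup and the sample inputs are assumptions for illustration. It shows how raw input glued into an HTML string can change the structure of the markup.

```javascript
// Hypothetical helper for illustration: builds markup the unsafe way,
// by gluing raw input into an HTML string.
function buildImageMarkup(src) {
  return '<img src="' + src + '">';
}

// What the developer expects: a plain image path.
const friendly = buildImageMarkup('cat.png');

// What an attacker can send: the first " closes the src attribute,
// so everything after it is read as a new attribute.
const hostile = buildImageMarkup('x" onerror="alert(1)');

console.log(friendly); // <img src="cat.png">
console.log(hostile);  // <img src="x" onerror="alert(1)">
```

If the second string were assigned to innerHTML, the browser would see an image with an onerror handler that the developer never wrote.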
ESCAPING X INNERHTML
All Together Now:
When we want to set web data and content dynamically, we need to make sure any markup characters in it are escaped.
We have a few options:
- textContent
Treats the value as plain text instead of parsing it as markup, so our HACKS var from earlier would render as a string and not execute.
- Sanitize 3rd Party Content
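Let's say we have an input field, and the user fills it with the following: '<img src="x" onerror=HACKS()>'
The image fails to load because x is not a real image, so the browser fires its error event and runs whatever is in onerror, in this case HACKS(). That lets the user inject code through the error handler.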
Instead, we run the content through a sanitizer that escapes it, and it becomes:
app.innerHTML = '&#60;img src&#61;&#34;x&#34; onerror&#61;&#34;alert&#40;1&#41;&#34;&#62;';
This is the encoded syntax of
<img src="x" onerror="alert(1)">
(our hacks). The browser displays the encoded characters as text instead of parsing them as markup, so we get our source back without any injection.
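// Encode every character that is not a word character, a dot, or a space
// as a numeric HTML entity, e.g. " becomes &#34;
var sanitizeHTML = function (str) {
    return str.replace(/[^\w. ]/gi, function (c) {
        return '&#' + c.charCodeAt(0) + ';';
    });
};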
This code takes a user-submitted string (str), encodes (sanitizes) it, and returns a new string: the cleaned-up version.
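// Untrusted value that tries to break out of the src attribute
const thirdPartySrc = '" onerror="alert(\'XSS Attack\')"';
// Inject our dynamic element into the UI, encoding the untrusted value first
app.innerHTML = '<img src="' + sanitizeHTML(thirdPartySrc) + '">';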
This is just an example of encoding a property. If you want to allow some markup, you will need a library with a whitelist of allowed elements/methods.
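Here is a self-contained sketch that runs the encoder from this post on a hostile value without touching the DOM, so you can see what the encoded string looks like. It repeats the sanitizeHTML function so the snippet runs on its own; the sample value is only an illustration.

```javascript
// Same encoder as in the post: every character that is not a word
// character, a dot, or a space becomes a numeric HTML entity.
const sanitizeHTML = function (str) {
  return str.replace(/[^\w. ]/gi, function (c) {
    return '&#' + c.charCodeAt(0) + ';';
  });
};

// Sample untrusted value trying to break out of the src attribute.
const thirdPartySrc = '" onerror="alert(\'XSS Attack\')"';

const markup = '<img src="' + sanitizeHTML(thirdPartySrc) + '">';
console.log(markup);
// <img src="&#34; onerror&#61;&#34;alert&#40;&#39;XSS Attack&#39;&#41;&#34;">
```

Because the quotes in the input are now &#34; entities, they can no longer close the src attribute, so the onerror text stays inside the attribute value as harmless data.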
- Use a framework/library
Use a library like React, Angular, or jQuery. Libraries tend to use the sanitizing approach above under the hood, and wrap it in readable syntax that makes your code easier to understand and write. This is the best option in my opinion, as they are VERY well tested and should be considered reliable by any coder worth their markup.
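To get a feel for what "escaping under the hood" can look like, here is a minimal sketch using a tagged template. The names ENTITIES, escapeHTML, and safeHtml are hypothetical and made up for this example. This is not how React or Angular are implemented internally; it only imitates the general idea that values you interpolate get escaped for you.

```javascript
// Hypothetical helper names, for illustration only.
const ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

function escapeHTML(value) {
  return String(value).replace(/[&<>"']/g, (c) => ENTITIES[c]);
}

// Tagged template: the literal parts written by the developer pass through
// untouched, and every interpolated value is escaped automatically.
function safeHtml(strings, ...values) {
  return strings.reduce(
    (out, part, i) => out + part + (i < values.length ? escapeHTML(values[i]) : ''),
    ''
  );
}

const comment = '<img src="x" onerror="alert(1)">';
const rendered = safeHtml`<p>${comment}</p>`;
console.log(rendered);
// <p>&lt;img src=&quot;x&quot; onerror=&quot;alert(1)&quot;&gt;</p>
```

The developer writes normal-looking markup, and the untrusted comment ends up as text inside the paragraph rather than as a live element.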
Roll Credits
Now, the next time you use a library/framework, you can appreciate all the escaping work it handles for you.
HUGE THANKS TO:
Element.innerHTML
gomakethings
Decoder
Without these resources I would have no idea where to start with this, and they were highly informative in this learning process.
Top comments (4)
+1 for using frameworks like React, I think it's pretty funny they call their innerHtml function dangerouslySetInnerHTML. With a name like that, you better remember to sanitize your inputs
Thanks For The Explanation !
aHR0cHM6Ly93d3cubGlua2VkaW4uY29tL2luL2thbGViLWZyYW5rZW4tMzE1YjUzMjMyLw==
Your linkedin?