Note: The topic for this post was chosen through a poll on my Twitter page. Be sure to follow for updates and input on future posts!
Today I'm going to be discussing three simple security threats that target web applications. You may be familiar with some of them already, but it's very important that, as a dev, you know how to defend against these types of vulnerabilities.
When a browser makes a request to your web server, what exactly is being returned? If your first thought was a "response" or a "web page", you're right, but at a lower level, the data that your server sends back to the client is a string: a big chunk of text that represents either a web page or some more abstract form of data, such as a JSON object. That string can contain things that you specified exactly when constructing the response (e.g. web page structure, hard-coded text, etc.), and it can also contain dynamic data that a user provided somewhere else, say, in an input form.
Here's the kicker: That string is processed by a program that doesn't know which parts are which.
When the web browser gets that big string of content, it simply goes through it and parses the HTML tags, comments, and other markup surrounding the content that will actually be shown on the page. And if a malicious user has decided to include some markup of their own in an input to your site, that markup might be parsed as well.
Let's take an overly simplistic example. Suppose your server takes input from a user and stores it in a variable called x. When the user clicks a link, the new page takes whatever is stored in x and puts it between bold tags as follows:
String pageContent = "<html><body><b>" + x + "</b></body></html>";
So, if the user inputs "Hello world" and clicks the link, the response will be a bolded "Hello world".
But what if the user puts the following in the input:
When your server constructs the page, it will be returned as follows:
This results in the text now being contained within an h1 header. Now, suppose we were a bit more nefarious and put this in the input:
Hello world<script>alert("Ha ha, I am hacking your mainframe!!!1");</script>
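To see why that last input is a problem, here is a minimal sketch of what the server would send back. The class name PageBuilder and the method buildPage are made up for illustration; the concatenation itself is the same one shown earlier.

```java
// Hypothetical helper that mirrors the string concatenation shown above.
final class PageBuilder {
    // Builds the page by pasting the user's input straight into the HTML.
    // Deliberately unsafe: nothing here distinguishes markup from user data.
    static String buildPage(String x) {
        return "<html><body><b>" + x + "</b></body></html>";
    }

    public static void main(String[] args) {
        String x = "Hello world<script>alert(\"Ha ha, I am hacking your mainframe!!!1\");</script>";
        // The printed page contains a real script element supplied by the user.
        System.out.println(buildPage(x));
    }
}
```

Because the script tags end up in the page exactly as typed, a browser that loads this response will run the alert instead of displaying it as text.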
The defense against this is called "input sanitization". In short, you want to programmatically go through the contents of dynamic data and make sure that it does not contain anything that a web browser would parse as markup. In the above example, we could replace all angle brackets with "&lt;" and "&gt;", the HTML character codes for angle brackets. That way, when the malicious input is put back into the page, the browser displays it as plain text instead of parsing it.
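As a rough sketch of that replacement step, the snippet below escapes user input before it goes into the page. HtmlEscaper and escapeHtml are hypothetical names, not part of any framework. It also escapes ampersands and quotes, which goes slightly beyond the angle brackets mentioned above, and it only aims to cover text placed between tags (not input placed inside scripts or URLs).

```java
// Hypothetical utility; the name HtmlEscaper is not from any framework.
final class HtmlEscaper {
    // Replaces the characters that HTML treats specially with their entity codes.
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String x = "Hello world<script>alert(\"hi\");</script>";
        // Same concatenation as before, but the input is escaped first.
        String pageContent = "<html><body><b>" + escapeHtml(x) + "</b></body></html>";
        System.out.println(pageContent);
    }
}
```

With the input escaped, the page no longer contains a script element of the user's making; the browser just shows the typed characters in bold.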
However, a much safer route for defending against XSS attacks is simply utilizing the tools available to you. Web server frameworks such as Django, Flask, Spring, and Spark all have tools included to automatically sanitize input. This typically involves a templating mechanism for building a page dynamically before returning it, which handles the escaping as variables are inserted into the page. Read up on the API for whatever framework you're using for details.
This is a very similar type of threat to the previous one, cross-site scripting. It uses the same principle of manipulating a string that will be passed to some parser, but instead of targeting the web browser, attackers will target a backend SQL database. Suppose you have some SQL building code in your server:
String myQuery = "SELECT * FROM Users WHERE uname = '" + usernameField + "';";
This code takes the input from usernameField and puts it between quotes in the specified place in the query, ending it with a semicolon. In the same way that someone could manipulate the HTML content in the previous example, someone could enter the following into the field:
Derek'; DROP TABLE Users; --
This results in the following query being passed to the database:
SELECT * FROM Users WHERE uname = 'Derek'; DROP TABLE Users; --';
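This query would then do some dummy search, and then (if the database connection allows multiple statements in one call) drop the table "Users". The two hyphens comment out the rest of the string, so the leftover quote and semicolon are ignored.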
This can be guarded against in much the same way as XSS. The idea is to sanitize your input so that the strings being passed into your backend database cannot contain malicious commands. And, as before, the better method is to simply utilize the tools available to you for this same purpose. Most SQL integration frameworks have tools for building specific types of queries that take the query parameters as separate arguments (often called parameterized queries or prepared statements). When you use these, the framework keeps the user's input out of the query text, so it is always treated as data rather than as SQL.
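Here is a minimal sketch of that approach using JDBC's PreparedStatement from the standard java.sql package. The class UserLookup and the method userExists are hypothetical names, and the Connection is assumed to come from whatever database driver you already use; the table and column names are the ones from the example query.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical lookup class for illustration only.
final class UserLookup {
    // The question mark is a placeholder; user input never becomes part of this text.
    static final String FIND_USER_SQL = "SELECT * FROM Users WHERE uname = ?";

    // Returns true if at least one row matches the given username.
    static boolean userExists(Connection conn, String username) throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement(FIND_USER_SQL)) {
            // Bind the input as the first parameter instead of concatenating it.
            stmt.setString(1, username);
            try (ResultSet rows = stmt.executeQuery()) {
                return rows.next();
            }
        }
    }
}
```

If someone enters Derek'; DROP TABLE Users; -- here, the whole string is bound as the value to compare against uname, so the database simply looks for a user with that odd name rather than reading it as a second command.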
Note: This could also theoretically be attempted against NoSQL databases with a query language, but it's much less common since those types of databases are typically operated on only through frameworks as described above.
This one is quite different from the others, as it doesn't rely on any sort of technological limitations, but rather the gullibility of your users. In short, if someone pretends that their site is your site, then they could potentially trick people into giving them credentials for your site. This could be done through setting up a fake site on a URL that is a common typo for yours, or through sending phishing emails, or through any other method of pretending to be you.
So what can you do to prevent your users' information from being stolen? In short, give them information that only the legitimate site owner can give, and tell your users to look for it. This could be:
Having a user pick a secret picture that you associate with their account, and displaying that picture to them on every page
Using SSL/TLS (HTTPS) and telling users to look for the lock icon with your company's name next to it (keep in mind that many current browsers no longer show the company name there, so users should also check the domain itself)
Reminding users that you will never ask for certain pieces of information (e.g. SSN, bank info, passwords for other services, etc.)
or any other method of proving that you are who you say you are.
These are just some of the simplest types of threats that malicious actors may try against your web server. Keep in mind that in no way is this list even close to exhaustive, so you should always be researching new threats to defend against. Thanks for reading, and happy coding!