DEV Community

👻 Do you have any horror stories to share? Spooky bugs, scary data leaks, horrifying code, etc. 🎃

Ben Halpern on October 31, 2017

Ricardo Bánffy

Every time one installs or updates the JDK, there is the message:

"3 billion devices run Java".

Lorenzo Pasqualis

Oh yeah, I remember many years ago when my CTO wanted to clean up a few rows in the DB and ran a DELETE without specifying a WHERE clause.

DELETE FROM users;

The team spent the next week recovering data from printouts ( Hey, it was 1999 :) )

...Opppsssss....

From that moment on, we made him run mysql with the --i-am-a-dummy option.
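
For context, --i-am-a-dummy is an alias for the mysql client's --safe-updates option, which refuses UPDATE and DELETE statements that are not restricted by a key or a LIMIT. The Python sketch below is only a rough illustration of that idea: `execute_guarded` is a hypothetical helper (not part of any library), it runs against an in-memory SQLite database rather than MySQL, and it only checks for the presence of a WHERE keyword.

```python
import re
import sqlite3

# Hypothetical guard, loosely inspired by mysql's --safe-updates:
# refuse UPDATE/DELETE statements that have no WHERE clause at all.
_WRITES = re.compile(r"^\s*(DELETE|UPDATE)\b", re.IGNORECASE)
_HAS_WHERE = re.compile(r"\bWHERE\b", re.IGNORECASE)


def execute_guarded(conn, sql, params=()):
    """Execute sql unless it is an unfiltered UPDATE or DELETE."""
    if _WRITES.match(sql) and not _HAS_WHERE.search(sql):
        raise ValueError("refusing to run UPDATE/DELETE without a WHERE clause")
    return conn.execute(sql, params)


# Throwaway in-memory table standing in for the real users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("ann",), ("bob",)])

try:
    execute_guarded(conn, "DELETE FROM users;")  # the statement from the story
    blocked = False
except ValueError:
    blocked = True

# A filtered delete still goes through.
execute_guarded(conn, "DELETE FROM users WHERE name = ?", ("ann",))
remaining = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

A keyword check like this is easy to fool (for example, `WHERE 1 = 1`), so it is a seat belt rather than a replacement for backups.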

Lluís Josep Martínez

In 1999 there were already DB backups. And many years before too :-)

Lorenzo Pasqualis

Oh, I know... :)

Jess Lee • Edited

Once upon a time, all my work from the previous night disappeared. git reflog showed a SHA but the line was blank and my work never came back. I was spooked and never used that computer again.

Natalie Martin

The "monotable".

I thought it was just a myth. An urban legend people made up for a funny story on The Daily WTF. But then I witnessed it. It happened on my team. Done by people I worked with!

We were working on a system based on a very annoying CMS. We decided to move certain aspects to a database instead of however we were storing those parts in the CMS before. I joined the team near the middle of the migration and helped work out some bugs on one component we were migrating to the database.

I then moved on to another aspect of the system, which had some problems with the SQL. I asked the guy on our team responsible for integrating the database in the first place to see the schemas for the tables. And that he did... he showed me the schema for the table. Yes, our entire database consisted of a single table, with columns upon columns added on for additional features.

Now, I've seen some database table monstrosities before, and heck, I've even made some when I was first learning how to work with databases, but this was coupling completely unrelated aspects of the system together. I'm talking about storing user input and system configuration in the SAME table!

So yes, that was pretty spooky.

Ben Halpern

Just the other day I accidentally committed a ~200mb file to git, which is not allowed in GitHub, but removing it was much more complicated than just deleting the file because it already existed in the git history.

The whole process of removing it was pretty scary as it required rewriting the git history in possibly destructive ways if done wrong. 😱

But it worked out fine. 😄

Juan C. Andreu

Or when you forget to create a .gitignore and you see your connection strings committed xD

Hassan Sani

Guilty of this😄😄😄

Stojan Anastasov

Working with a "REST" API today the error response was

{errorCode:1, error:"Error message"}

For the people who haven't had coffee yet: the quotes around the keys errorCode and error are missing, so it isn't valid JSON. I think it actually took some effort to produce this instead of proper JSON.
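
To see the difference concretely, here is a minimal Python sketch using the standard json module. It is only an illustration (the client in this story is written in Kotlin), and `is_valid_json` is a made-up helper name.

```python
import json


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# The response from the API, with unquoted keys.
broken = '{errorCode:1, error:"Error message"}'
# The same payload with the keys quoted, as JSON requires.
fixed = '{"errorCode": 1, "error": "Error message"}'

print(is_valid_json(broken))  # False
print(is_valid_json(fixed))   # True
```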

Gergő Móricz

Use eval()

Stojan Anastasov • Edited

I am not sure what that does... I am writing the Android client (using Kotlin), the server code is written in PHP :)

Gergő Móricz

Oh, that’s a valid object literal in JS, so if you were working in JS and passed it to eval() (wrapped in parentheses), it’d return the object.

PLEASE FOR THE LOVE OF GOD NEVER USE EVAL. IT’S VERY UNSAFE.

🎃Raúl TorreSpooky🎃

I remember one time I was forced to debug a 7,000-line obfuscated JavaScript file.

I think that says everything. :>

pildit

That must have been hell... sure it was!

Juan C. Andreu • Edited

One time a friend of mine and colleague was "fixing" some purchase orders and he had to delete some of the rows. He opened Management Studio and got some queries running.

After confirming the rows he needed to delete, he started to write the DELETE query. He always commented out the DELETE statements to prevent any accidental data loss, but that night, right after he had written just the table-name part of the statement, he accidentally pressed F5 because he wanted to be sure that the SELECT conditions were correct.

Something like this:

SELECT .. 
FROM [TABLE HERE]
WHERE [LOTS OF CONDITIONS ]

DELETE FROM [SAME TABLE NAME HERE] 
[HE FORGOT TO PLACE THE CONDITIONS HERE]

The best part: when he pressed STOP to cancel the query and the cancel didn't work, he panicked and unplugged the Ethernet cable from his computer. xD

To this day, some people say that some of those lost rows still appear on query results.
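
One common defense against this kind of accident is to run the delete inside a transaction and commit only if it touched the number of rows you expected. The sketch below shows the idea in Python with an in-memory SQLite database as a stand-in for SQL Server; `delete_checked` and the `orders` table are hypothetical names made up for the example.

```python
import sqlite3


def delete_checked(conn, sql, params, expected_rows):
    """Run a DELETE; commit only if it removed exactly expected_rows rows."""
    cur = conn.execute(sql, params)  # sqlite3 opens a transaction implicitly
    if cur.rowcount != expected_rows:
        conn.rollback()  # wrong row count: undo the delete
        return False
    conn.commit()
    return True


# Throwaway table: one bad purchase order and two good ones.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO orders (status) VALUES (?)", [("bad",), ("ok",), ("ok",)]
)
conn.commit()

# The accident: no WHERE clause, so 3 rows match instead of the expected 1.
ok_unfiltered = delete_checked(conn, "DELETE FROM orders", (), expected_rows=1)
count_after_rollback = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

# The intended statement: exactly one row matches, so it is committed.
ok_filtered = delete_checked(
    conn, "DELETE FROM orders WHERE status = ?", ("bad",), expected_rows=1
)
count_after_commit = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The expected count comes from the SELECT you ran first with the same conditions, which is the check the story's hero was trying to do by hand.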

Derek Palmer

Being asked "why would I use source control?"

Cristian Pallarés

Oh, I've been told that, and that code revisions and unit tests are useless...

Derek Palmer

Terrifying

Alyss 💜

I once not only deployed on a Friday...I changed the hosting server and underlying software AND THEN deployed on a Friday. 👻

Daniel J. Summers

Did it work out, or did you learn why the conventional wisdom about that can be summed up with "don't"?

Alyss 💜

It did work out, but I was enabled and supported by the manager/director from technical operations. I'm not ignorant of the conventional wisdom and I'm stability-conscious. There are nuances to the situation which I didn't feel the need to divulge in a light-hearted post.

Daniel J. Summers

I wasn't trying to imply otherwise. :) I just wondered if it all worked out; I'm glad it did.

Adrian B.G.

On one of my first days at work, I opened the project, browsed around, and found two 5,000-LOC God classes.

The good thing is that I later used them to practice all the refactoring techniques from a famous book and learned to recognize most of the OOP anti-patterns.

Wesley Ameling

Which famous book? That could come in handy later for me

Charles D. Villard

The word 'Drupal' will probably already send a shiver down many a developer's spine, but that's what I was working on today: updating a client site's core Drupal libraries and modules with drush. This is something I've gotten down to a mad, mechanical sort of science, but it's not without its flaws. I went through the process only to be met with a styleless page, and what's worse, fixing any issues brought it to a screeching halt in the form of a 500 error! Thankfully this was only a scare, as I had forgotten this was on a dev server, but it's never something to be taken lightly.

Nicholas ―M― • Edited

I didn't do anything to prevent the unnecessary re-rendering of a React component that used the Google Location API. Within a few hours of it going live, we hit our max amount of free API calls and the whole feature was basically useless for 24 hours 🙃

Thankfully this was when we were slowly reskinning / revamping our product, so clients could still use the old version and knew there wouldn't be a seamless transition until everything was finished.

And that's how I learned why performance lifecycle functions are important.

Max Antonucci

CSS ID selectors...nested inside MORE CSS ID SELECTORS!

rhymes

Definitely guilty of this in the past, due to some mess with jQuery and widgets with dynamic content. Not enough of an excuse, I know 🤣

Alex Pliutau

For about a month I had a bug on my Mac where it would sometimes click random parts of the screen: switching to another tab in Slack, opening a tab in the browser, and so on. Then I found out it was my coworker's Bluetooth mouse, which was connected to my laptop. I had used it once, so it was saved in my configuration.

Max Antonucci • Edited

Also, here's a picture I share with my front-end developer friends to scare the crap out of them.

CSS selector executive order

Mac Siri

This is rather terrifying

Ian bradbury • Edited

That time I spent a week deconstructing and documenting a super complex algorithm in C, only to find that the last line was... return 0.5.

Still angry.

Max Antonucci

Understandable. I'm surprised your computer escaped intact after something like that.

Chris

One DB, 4500 stored procedures.

James O'Donnell

In a previous life, I was tasked with trying to scrape an automated migration task together on the DB. It was a fairly organic operation that required me to work in production. There was a lot of back and forth between my target and production and I had to drop the target DB frequently.

The hours got long, coffee ran out and just as I had wrapped up the task, I decided to clean up for one last test run. I started by deleting the target db. But it wasn't the target. It was production.

Olivier “Ölbaum” Scherler

In the third week at my new job, we had to import a multi-gigabyte database into MySQL on our development server, and the /var partition was too small to hold it. “Fortunately,” the machine was set up with LVM, so we could resize it (and /home, to make room). To avoid making silly mistakes with the CLI, we chose to use the GUI tool. After we downsized /home and upsized /var, everything was corrupt. The GUI had resized the partitions, but not the volumes, so /var was overlapping /home.

(Here you have to wonder about the point of having a GUI if it makes even sillier mistakes than those you’d have made in the CLI.)

Fortunately, we could retrieve the exact previous block counts of both partitions from the logs, and resize everything back to how it was before. But wait, it gets better: this time, the GUI chose to resize partition AND volume, so everything was still corrupt, /home was now smaller than the data it contained. So we had to re-re-resize the volumes, and fortunately everything was back to normal and no interesting data was lost.

Did I mention it was my third week on the job that I almost nuked the team’s sole development machine?

From this day on, every time Linux suggests I partition and set up LVM, I smash the Nope button.

John Van Wagenen

That one time when the test credit card server actually processed payments... Turns out you had to use the test credit card on the test payment server or it'd still try to charge the card. Imagine that. Luckily we caught it in time before the processing went through.

And that other time when some small bug (I can't remember what it was anymore) prevented (a small subset of) payments from being processed for a day or so.

Those are probably my two biggest blunders.

As far as scary code goes, I was asked to rewrite some modules that were written years ago by a third party out of the country. Some of the things I saw in there really made me scratch my head and literally facepalm at times.

Cole Diffin

Many years back I was involved in double-charging about 100 people's stored credit cards during a routine scheduled payment process. Turns out someone had inverted the logic in an if clause. I spent well into a Sunday night trying to track that down.

Daniel J. Summers

This was probably 15 years or so ago at this point (and deals with mainframes and COBOL; yes, I know I'm like a living "You're not connected" icon - and get off my lawn while you're at it).

On the UNISYS mainframe, they have a transaction process side that's way more performant than the traditional model (think nginx vs. Apache). But, to really get the throughput for commonly-used programs, you use a concept called RTPS (Resident Transaction Processing System) - basically, instead of your program hitting STOP RUN and terminating, you GO TO a point near the top that reinitializes all your variables, then waits for the next input. The advantage to RTPS is that the operating system doesn't have to actually load the executable from disk; since it's in memory, it just runs it.

Anyway, our current setup didn't need this. But, our "next" setup (my project) needed to clear 100 transactions a second, 100k per day. Loading what was (in effect) the security program 100x per second was crazy I/O; when most of these programs finished, they called a second to display their output, and that's doubling the I/O. So, the security program, the screen-based output program, and the paginated plain-text output programs are prime candidates for RTPS.

As part of a contract with UNISYS to make sure this all went smoothly, my employer actually sent me to work with them, and we established that we could make the security program run as part of RTPS, and it worked - it was really fast! I returned to my office, excited to put this code change on our development box so that everyone could start exercising it; this turned these programs into what we call today "long running processes," and I knew that exercise was crucial to getting a lot of the kinks worked out. When I got ready to activate them, I was really excited; I think my fingers were shaking as I typed the command to launch 5 copies of this program into RTPS. It worked! I listed them, and it showed 5 copies; I ran a transaction, and that worked too! Then, the dreaded

SESSION PATH CLOSED

arrived in my terminal. "Great - what an awesome time for the network to suck!" We gave it a few minutes (I had a small audience at this point), and I was able to get signed back in. I ran the command to put 5 copies of that program back in RTPS, and again, we lost connection within a few seconds.

Time to call the help desk. We dialed the number, and this is literally how they answered the phone...

"WHAT are you DOING?!?!"

Long story sh... er, not quite as long, there were two patches the mainframe needed that they had never bothered to load, because "no one uses that." We chastised them for not keeping us current on patches, and they obtained and loaded them. RTPS worked great after that. This kept the ghouls away, until the same organization provisioned us 25% of what we and UNISYS told them we'd need when we actually went enterprise-wide with this project...

(On the upside, I joined a rather elusive group of programmers who made the mainframe spontaneously reboot from something other than the reboot command (which they wisely never gave any of us) - an elite group known as "real programmers".)

Bart Karalus

It happened to me during the winter of 2016. It was really difficult to get any details around this case but finally I managed to convince some local developers to start talking, fear in their eyes. Turned out there was this old User Story mentioning some Zombie images coming back after deleting them from our system. Regardless of effort spent on removing them, they would always come back, again and again. This foggy day I got delegated to investigate it deeper. What happened next was nothing one could ever expect...

Ben Hemphill

Hadoop 0.20.2 had a bug where fixing missing block replicas would not respect the rack-aware placement policy. Over the course of many years we had lost enough drives to start losing blocks whenever we lost a drive. It took us a while to figure out what was happening. Luckily the data in Hadoop was not the primary source, but we had to recopy data from the origin, then copy all files in HDFS to a new HDFS cluster (almost every file had at least one block affected). Petabyte-scale copies don't happen quickly. :)

Andy Zhao (he/him) • Edited

I can never remember any horror stories, but I'm always a bit on edge whenever I touch production data... 🙃

Juan C. Andreu

I do enjoy the thrill of playing in production :)

Leandro Gomes

We had an async job on Resque that expired passwords at a given interval. No one noticed that this job had stopped working for a long time, and about 8M jobs had piled up in the queue. Someone fixed the queue and sent all the jobs to be executed. We expired passwords for about 40% of our clients, and some of them received more than a thousand password-expiration emails in one day. Shit happens...

Nemanja Petrovic

Once I introduced a bug in our system, and for a whole day invoices on the website were not being created at all. It was D-Day...

rendall • Edited

Not scary, but spooky good.

I once saw a self-contained function written in 10+ year-old Transact-SQL in a banking database, used to calculate if a date was Easter (literally IS_EASTER(DATE)...).

This was High Technomancy.

Per Wikipedia, Easter "falls on the Sunday following the full moon that follows the northern spring equinox". So, you can imagine the loops and mods and leap year and off-leap year calculating.

And, since the method of calculation depends on whether you're using the Gregorian or Julian calendar, of course there was a conditional "IF YEAR < 1752..." enclosing a whole other set of calculations.
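
For a feel of what such a function has to do, here is a Python sketch of the Gregorian half only. It is not the bank's T-SQL: it uses the widely published "anonymous Gregorian" computus, the names `gregorian_easter` and `is_easter` are made up to mirror IS_EASTER(DATE), and the Julian-calendar branch for earlier years is left out.

```python
from datetime import date


def gregorian_easter(year):
    """Date of Easter Sunday in the Gregorian calendar (anonymous algorithm)."""
    a = year % 19                      # position in the 19-year lunar cycle
    b, c = divmod(year, 100)           # century and year within the century
    d, e = divmod(b, 4)                # leap-century corrections
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30  # days to the paschal full moon
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7  # days to the following Sunday
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)


def is_easter(d):
    """True if d is Easter Sunday (Gregorian calendar only)."""
    return d == gregorian_easter(d.year)


print(gregorian_easter(2017))        # 2017-04-16
print(is_easter(date(2024, 3, 31)))  # True
```

Even this compact version is all mods and integer divisions, which gives some idea of what the T-SQL original, with its second branch for the Julian calendar, must have looked like.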

Bill White

I once spent the entire 2003 Christmas break tracking down a bug that turned out to be one character in a printf format specifier.