 
 

Not necessarily what exactly needs to happen to fix it, but how the whole team can be involved in fixing it so that it doesn't happen again. Getting everyone involved:

  • makes everyone realise how much effort it takes to fix if they let it happen again
  • prevents knowledge silos
  • usually produces code that looks less like spaghetti than when a complicated legacy project is tackled individually
 

If the bug is as hairy as you say, I would start at the roots and work back through the whole code, not rewriting it, just double-checking everything. I hope you were using some kind of version control. And just print everything.

 

My thoughts turn to mitigation and isolation.

First things first, how can I make it livable for however long I have to live with it? Can I put a cron in to restart a troublesome service? Can I add more retries to talk to a box with a bad network? Can I wrap a bad library with something that makes it less bad?
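
A minimal sketch of the "add more retries" idea, assuming the flaky call raises `ConnectionError` or `TimeoutError`. The name `call_with_retries` and its parameters are made up for illustration, not taken from any particular library:

```python
import time


def call_with_retries(func, attempts=3, base_delay=0.5,
                      retry_on=(ConnectionError, TimeoutError),
                      sleep=time.sleep):
    """Call func(); on a listed error, wait and try again, up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of retries: let the caller see the real error
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Passing `sleep` in as a parameter is only there so the waits can be swapped out in tests. Something like this buys time; it does not fix the bad network.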

Second, how can I contain the problem so when I want to fix it properly I can do it quickly and without hurting the rest of the system? Interfaces, facades, etc.
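
A rough sketch of what containing a problem behind a facade might look like. `LegacyBilling` is a hypothetical stand-in for the code you can't fix yet, and `BillingFacade` is the one place the rest of the system is allowed to touch it:

```python
class LegacyBilling:
    """Hypothetical awkward legacy code we cannot fix yet."""

    def calc(self, amt_cents, flags):
        # Returns a padded string, or "ERR" on bad input.
        if amt_cents < 0:
            return "ERR"
        rate_percent = 20 if "T" in flags else 0
        return f" {amt_cents * rate_percent // 100} "


class BillingFacade:
    """Clean interface; all the legacy quirks are handled here and nowhere else."""

    def __init__(self, legacy=None):
        self._legacy = legacy or LegacyBilling()

    def tax_cents(self, amount_cents: int, taxable: bool = True) -> int:
        raw = self._legacy.calc(amount_cents, "T" if taxable else "").strip()
        if raw == "ERR":
            raise ValueError(f"invalid amount: {amount_cents}")
        return int(raw)
```

When the proper fix arrives, only the inside of `BillingFacade` has to change; callers keep using `tax_cents`.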

Finally, I like to document how much the issue hurts the team in terms of lost productivity and hours spent messing with it - never hurts to have ammo to convince stakeholders and supervisors that we should make a persistent problem a priority.

 

The greatest part is the users might get used to it and fixing the bug may create churn 😁😆

 

“It's not a bug, it's a feature (now)!” :D

 

Given that I have the privilege of just hacking and making it up as I go, this is half my day I swear.

 

The first thing I usually think about is whether the code is designed correctly. There is usually a design flaw that causes those bugs, at least in my experience. Lay out the steps you need to do in the simplest form and see if there is a better way to do it.

 

Project-stopping bug early in development? Start from nothing and see where things went wrong.

Project-stopping bug later in development? Start breaking the data flow apart to see where things go wrong.

Anything else gets prioritized based on how critical it is (is a workaround available?) and how burnt out I am dealing with that particular issue.
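
One way to picture "breaking the data flow apart": run the stages one at a time and sanity-check what comes out of each. This is only a sketch; the stage names and checks below are invented for the example:

```python
def first_bad_stage(data, stages):
    """Run (name, func, looks_ok) stages in order.

    Returns (name, output) for the first stage whose output fails its check,
    or (None, final_output) if every stage looks fine.
    """
    for name, func, looks_ok in stages:
        data = func(data)
        if not looks_ok(data):
            return name, data
    return None, data


# Hypothetical three-stage flow with a bug hidden in the middle stage.
STAGES = [
    ("strip", lambda lines: [s.strip() for s in lines], lambda out: all(out)),
    ("to_int",
     lambda fields: [int(f) if f.isdigit() else -1 for f in fields],  # hides bad input as -1
     lambda out: all(n >= 0 for n in out)),
    ("total", sum, lambda out: out >= 0),
]

if __name__ == "__main__":
    print(first_bad_stage([" 3", "4 ", "x1"], STAGES))
```

The point is less the helper and more the habit: check the data between stages instead of only at the end.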

 

I document it and break it down into steps.

Sometimes there's a part of your code which works but nobody remembers how, something legacy that nobody wants to touch for fear of breaking it.
Fear of fragility is a major reason these bugs go on so long, so periodically revisiting your code or getting another set of eyes on it will go a long way to preventing these sorts of things in the first place.

I have just finished going through this process for an authentication system on a big, sprawling project which was tripping up in various ways but seemed too big to replace. It took a while, and it took me working on a separate branch and having maintenance tickets deflected to other developers for longer than the client wanted, but it's all better now. I can pat it on the head and give it a lollipop.

Incidentally, this is why my Mac is so out of date. It tells me it needs to reboot every day and I deny it because I have no confidence it'll come back up. macOS has tricked me too many times. At some point I'll wipe it and start from scratch, or get a replacement. But the OS is so fragile that I avoid going anywhere near updates and patches.

 

It's a little open-ended. There are different types of bugs. There are bugs where you see where the problem is but don't know why. There are bugs where you don't even know where the problem is. Assuming that your code is well designed, you will know where the problem is. Get a second pair of eyes on it; pair programming always helps. If the code is spaghetti, this is a perfect opportunity to refactor. If you are stuck for a while, take breaks, switch to doing something else. Get an ice cream :)

 

How long has this bug existed, and is it generating issues for customers?

Can I present a work-around approach to the team to mitigate the bug's impact?

Can I make a business case to management/client to allocate resources to fix this bug? What are the costs to repair the bug? What are the costs to ignore the bug?

 

First, I try to log every step of the buggy method into a log file (especially when it is on a production server).
Then I take an abstract look at the method design, to make sure I don't have any design gotchas.
Then, I retrace the call and check the platform/plugin details (if for example it is a plugin that is used in a mobile app).
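
The first step above, logging every step of the buggy method, could be sketched with a small decorator. The logger name `bughunt` and the decorator `log_steps` are assumptions for the example; pointing the logger at a file is a one-liner such as `logging.basicConfig(filename="bughunt.log", level=logging.INFO)`:

```python
import functools
import logging

logger = logging.getLogger("bughunt")  # hypothetical logger name


def log_steps(func):
    """Log arguments, result, and any exception for each call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.info("enter %s args=%r kwargs=%r", func.__name__, args, kwargs)
        try:
            result = func(*args, **kwargs)
        except Exception:
            # logger.exception also records the traceback
            logger.exception("error in %s", func.__name__)
            raise
        logger.info("exit %s result=%r", func.__name__, result)
        return result
    return wrapper
```

On a production server, watch what ends up in `args`: passwords and tokens should not land in a log file.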

 

I guess this is where IDE capabilities are the most helpful, because nothing debugs like the IDE debugger: breakpoints let you narrow down which function or variable is responsible for doing something but is messing up big time.

 

I just work the problem in the Agile way with a combination of spikes and user stories, maybe contained within an epic if it's that big and hairy.

I've got one right now that I'm working on that's been a bit tough to define. I've been working this spike for the entirety of the current sprint. Users tell me it's a big problem but can't really give me a clear description of what is needed to correct the issue. It's probably part business process/policy with software changes to match the desired policy. Getting to the desired policy though is where there's a lack of agreement. Until product owners agree, it will be difficult to move forward.

In other cases, it's been a bit easier, but getting to the root cause (to use one of my least favorite buzzwords) of the problem is the first thing. I've usually found that it goes back to data, so that tends to be the first place I look. For example, a column in a table can be null but the code expects it never to be null. If good tests are in place, this usually isn't too difficult. But when working with legacy code, there may not be tests available, so it requires really digging into code that might have been written 10+ years ago.
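
To make the nullable-column example concrete, here is a small sketch using Python's built-in `sqlite3`. The table and column names are whatever you pass in, and `display_name` is a hypothetical piece of code that used to assume the value is never null:

```python
import sqlite3


def count_nulls(conn, table, column):
    """Count rows where `column` is NULL.

    Identifiers are interpolated into the SQL, so pass only trusted names.
    """
    sql = f'SELECT COUNT(*) FROM "{table}" WHERE "{column}" IS NULL'
    return conn.execute(sql).fetchone()[0]


def display_name(email):
    # Defensive version: the old code called email.split() unconditionally.
    return email.split("@")[0] if email is not None else "(no email)"


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER, email TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [(1, "a@example.org"), (2, None)])
    print(count_nulls(conn, "users", "email"))
```

Checking the data first tells you whether the bug is even reachable before you go digging through ten-year-old code.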

 

Usually, if a bug is taking a long time to trace, it's better to sleep on it. Most of the time the problem is just a wrong approach to tracing it. It's also better to leave such bugs until the end, if they aren't showstoppers, and spend that valuable time being actually productive.

 

This is how we do it:
a) Try to find a workaround, and mark the workaround with "FIXME".
b) Invest one week per quarter to fix all the FIXMEs (whole team - fixmekathon).
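
For step b), a list of what is waiting helps. A minimal sketch that walks a source tree and collects the FIXME markers; the function name and the `*.py` default are assumptions, so adjust them to your codebase:

```python
import re
from pathlib import Path

FIXME = re.compile(r"\bFIXME\b[:\s]*(.*)")


def find_fixmes(root, pattern="*.py"):
    """Return (path, line_number, note) for every FIXME under root."""
    hits = []
    for path in sorted(Path(root).rglob(pattern)):
        text = path.read_text(errors="replace")
        for lineno, line in enumerate(text.splitlines(), 1):
            match = FIXME.search(line)
            if match:
                hits.append((str(path), lineno, match.group(1).strip()))
    return hits


if __name__ == "__main__":
    for path, lineno, note in find_fixmes("."):
        print(f"{path}:{lineno}: {note}")
```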

And a general thought: If anything (feature, issue, ...) seems to take "very long": invest 15% of "very long" into thinking about the problem and sketching solutions. This will mostly reduce "very long" and result in better solutions.

 

Why fix it when I can make it a feature? 😁

 
 

Can I add telemetry data or traces? Keep gathering data, which over time will give me enough information to formulate a hypothesis. Hopefully.
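
A very small sketch of home-grown telemetry: count calls and errors and keep timings per named span, so there is something to look at later. The `Telemetry` class is invented for this example; a real project would more likely ship these numbers to whatever metrics or tracing system it already has:

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class Telemetry:
    """In-memory counters and timings, keyed by span name."""

    def __init__(self):
        self.counts = defaultdict(int)
        self.durations = defaultdict(list)

    @contextmanager
    def span(self, name):
        start = time.perf_counter()
        try:
            yield
        except Exception:
            self.counts[name + ".error"] += 1
            raise
        finally:
            # Runs on success and on failure alike.
            self.counts[name + ".calls"] += 1
            self.durations[name].append(time.perf_counter() - start)
```

Usage is `with telemetry.span("sync"): do_the_suspicious_thing()`, where `do_the_suspicious_thing` stands for whatever code path you suspect.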

 

It's a feature, leave it like that 😂😂

 

Try to convince the business side to promote it to a feature.

If they say no, I can drop the "refactoring" bomb.

 

I'll start suspecting a race condition or some other concurrency issue. These are usually the hardest to detect and kill.
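
For anyone who hasn't been bitten yet, a sketch of the classic lost-update race and the usual fix with a lock. The `Counter` class and `hammer` helper are made up for illustration. Whether the unsafe version actually loses updates on a given run depends on thread scheduling, which is exactly why these bugs are hard to catch:

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def unsafe_increment(self):
        current = self.value      # read
        self.value = current + 1  # write; another thread may have written in between

    def safe_increment(self):
        with self._lock:          # read-modify-write happens as one unit
            self.value += 1


def hammer(increment, threads=8, per_thread=10_000):
    """Call increment() from several threads at once and wait for them all."""
    def work():
        for _ in range(per_thread):
            increment()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```

With `safe_increment` the final value always equals `threads * per_thread`; with `unsafe_increment` it can come up short.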

 

What do I begin to think about?

How did I just ____ myself?
