Good News and Bad News, Pivot and Turn - Building DDTJ Day 6

#startup #programming #architecture #opensource

Last week I spent 5 days building and designing the DDTJ project. Hopefully, I’ll complete the MVP by this weekend. As we left the series last week, I was stuck. This week there’s a bit of an improvement…

There are a lot of variations on the good news/bad news joke. I’m sure you all know some form of it… Here’s a programming oriented one:

developer: so i have good news and bad news

manager: what's the good news?

developer: i've discovered that the "5 second rule" only applies to food

manager: and the bad news?

developer: i dropped our tables
— Kat Maddox (@ctrlshifti) January 21, 2021

In that spirit, I’ve made some progress and we are now collecting some data… The bad news: I had to throw away a lot of concepts and performance is roughly 700x worse than without our instrumentation. Ugh.

How did I get Here?

Last week I was still modeling the objects stored locally when debugging the remote VM. As you may recall, the debug API needs to copy objects locally so we can later on analyze them. We can’t leave this in the target VM as the data will get lost as the VM continues.

I created a data model and implemented a debugger of sorts that effectively automatically debugs the application and stores all the data. Unfortunately, this isn’t as simple as one would hope.

Initially, I thought I’d use the step over API to review individual lines. This includes some problems:

I don’t know what a line contains without looking at the source/bytecode and I don’t want to get into that complexity
I can’t have two instances of a step over monitor at the same time. This is something I’ll need for nested method calls, threads, etc.

So using the approach of stepping over code isn’t practical for this tool…

Method Enter/Exit

Luckily, pretty much any debugging API supports setting a method enter/exit event. This lets us monitor all the important points and log all the relevant information. Perfect… Almost…

Here’s the problem, say we have code like:

public void myMethod(int x) {
    otherObject.otherMethod(x, y);
}

I’ll get 4 events for this sequence:

Method enter to myMethod
Method enter to otherMethod
Method exit on otherMethod
Method exit on myMethod

So far, so good. But here’s the thing. I need to log the invocation of otherMethod inside myMethod so I can mock that later on.

How would I do that?

We need to analyze the data and see the data related to myMethod’s call hierarchy.

The solution is to somehow know that the enter event and the exit events are related. Initially, I tried to create the exit event dynamically and restrict it to a specific object instance. But that fails with recursion.

I’ve tried several approaches. I now have one that works, but it isn’t ideal…

The Current Approach

I create a uniquely repeatable string when method entry occurs. It’s built with roughly this formula:

threadId + object instance id (if not static) + frame count + callstack

This is a relatively long string, but it’s unique. We place an object in the map with this as a key when entering a method. Then, when exiting the method, we find the object using an identical key.

We can now connect the method entry/exit and as a result log all the data we need.

Performance

Unfortunately, that isn’t enough. 700x is way above a reasonable 10x or bearable 20x reduction in performance for a debugging environment. Normally we need to stay away from premature optimizations, but if there’s a conceptual flaw in the architecture, we need to understand it. 700x isn’t something we can realistically work with.

So my next focus is to get the current code in shape for a pull request and then move on to understand the performance implications and what exactly causes this overhead.

I have a lot of theories on the subject, but experience taught me to ignore them. A lot of times, the things we think are “expensive” turn out to be a tiny bump. The things we discounted end up being the big penalty. A profiler will tell us where time is wasted and I can evaluate based on facts. Right now I don’t want to look. If I do, I won’t be able to stop and it’s important to stop, synchronize with the trunk and move iteratively.

Tomorrow

This has been a short post since I’m still pretty busy getting the PR out and bringing the tests up to speed. Once it’s done, I hope to get to the profiling session.
At the current rate, I think my initial goal of having a tool that can generate tests for spring boot might be high for this week. I’ll lower the expectations to debugging a hello world application and generating a simple test case. After everything I’ve been through with this API, it seems like a pretty ambitious goal too.

If you want to keep up with the latest updates on this series and the many other things I work on then follow me on twitter.