Stop Staring at a Blank Notebook: I Built plan() to Fix That

#ai #datascience #rag #api

We’ve all experienced moments when we procrastinate because we are not sure how to approach a problem. I am not ashamed to admit it still happens to me quite often, but that’s the job of a data scientist. We have to be product experts on a plethora of topics, senior engineers, and, let’s not forget, highly skilled statisticians. The truth is no one is all three, and that’s fine.

We are constantly told that AI will soon make skilled professionals obsolete, doomed to be replaced by for-loops (or agents, if we are being fancy). I decided to bet on us instead and create a suite of AI-powered tools that complement us rather than aim to replace us. I strongly believe that human judgement will continue to be key to arriving at the right answer to a business question. And yes, of course we’ll use agents to automate the mundane, but we’ll ultimately be the ones making the final call.

And making the right call starts with picking the right approach. In practice, constraints make that harder. Experiments are expensive. We often don’t have enough data to run certain tests, and every method comes with tradeoffs that in many cases only become obvious mid-analysis.

So I built plan(): describe your analytical problem and get a structured plan for the right approach, either before you’ve started the analysis or at any point in the process when you need a second opinion. The tool, part of the Bridgekit suite, recommends a method and covers why it fits your specific problem, its key assumptions, common pitfalls, and alternatives. It uses the Anthropic API under the hood.
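To make "structured plan" concrete, here is a rough sketch of the kind of shape such a plan could take. This is not Bridgekit’s actual API or return type: `PlanSketch`, its field names, and the `render()` helper are all hypothetical, and the example content is hand-written rather than produced by the tool. Only the KEY ASSUMPTIONS and WATCH OUT FOR headings are taken from plan() itself; the other headings are my assumption.

```python
from dataclasses import dataclass, field


@dataclass
class PlanSketch:
    """Hypothetical container for a structured plan (NOT Bridgekit's real type)."""

    method: str
    why_it_fits: str
    key_assumptions: list[str] = field(default_factory=list)
    watch_out_for: list[str] = field(default_factory=list)
    alternatives: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Format the plan as plain-text sections, one bullet per item."""
        sections = [
            ("RECOMMENDED METHOD", [self.method]),  # heading name is assumed
            ("WHY IT FITS", [self.why_it_fits]),    # heading name is assumed
            ("KEY ASSUMPTIONS", self.key_assumptions),
            ("WATCH OUT FOR", self.watch_out_for),
            ("ALTERNATIVES", self.alternatives),    # heading name is assumed
        ]
        lines = []
        for title, items in sections:
            lines.append(title)
            lines.extend(f"  - {item}" for item in items)
        return "\n".join(lines)


# Hand-written illustration for a generic two-group rate comparison.
plan_sketch = PlanSketch(
    method="Two-proportion z-test",
    why_it_fits="Binary outcome compared across two independent groups.",
    key_assumptions=[
        "Users are randomly assigned to groups.",
        "Observations are independent of each other.",
    ],
    watch_out_for=[
        "Checking results before the planned sample size is reached.",
    ],
    alternatives=["Chi-squared test", "Fisher's exact test for small samples"],
)
print(plan_sketch.render())
```

Keeping the sections as separate fields, rather than one block of text, makes it easy to pull the assumptions or pitfalls into a notebook cell or an analysis write-up on their own.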

Here’s an example of plan() in action: A product manager wants to know whether a new onboarding flow is increasing upgrade rates.

My favorite sections are KEY ASSUMPTIONS and WATCH OUT FOR. Most data scientists would land on a z-test for this setup, but the nuances and pitfalls often only surface mid-analysis. Having them at the start means you can build the right safeguards in from the beginning, instead of patching them in when someone asks a question you can’t answer.
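For reference, when the outcome is an upgrade rate, the z-test for this setup is usually a two-proportion z-test. Here is a minimal sketch using only the Python standard library. The function name and the counts are made up for illustration, and this is not code from Bridgekit.

```python
import math


def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in proportions, using a pooled SE.

    conv_a / conv_b: number of upgrades in the control / new-flow group.
    n_a / n_b: number of users in each group.
    Returns (z, p_value), where z > 0 means group B has the higher rate.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Under the null hypothesis both groups share a single upgrade rate.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("Pooled rate is 0 or 1; the z-test is undefined.")
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 4,000 users per group, 200 vs 248 upgrades.
z, p = two_proportion_ztest(conv_a=200, n_a=4000, conv_b=248, n_b=4000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

The arithmetic is the easy part. The pooled standard error quietly assumes that users are independent and that the samples are large enough for the normal approximation, which is exactly the kind of thing worth writing down before the analysis starts.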

plan() won’t tell you whether you have data quality issues, whether the A/B test was set up correctly, or whether the business question is worth answering. That’s still on you. What it does is make sure you think through the method, its assumptions, its pitfalls, and the alternatives before you start, or mid-analysis when you are second-guessing your approach. So if a stakeholder asks why you chose a specific method at any point in the process, you can explain the reasoning, walk through the assumptions, and articulate what could go wrong.

Link to repo:

Bridgekit is open source at github.com/getbridgekit/bridgekit. Clone it, and give it a star if you find it useful.