DEV Community

Madhesh T


What Drives User Success?

I Built an ML Dashboard on Zerve to Find Out

A complete walkthrough of the idea, the models, the dashboard, and the finding that changed how I think about user retention.


The Problem No One Talks About Honestly

Every SaaS platform has a churn problem. But most of the time, the conversation around churn is reactive — someone cancels, someone stops logging in, a renewal gets missed. By the time the signal arrives, the user is already gone.

The real question is earlier than that. Not why did they leave — but who is about to leave, and what do we do right now?

That was the challenge Zerve put in front of me. They provided a dataset of user event data — every action a user takes on the platform generates an event — and asked a single, open-ended question.

Which user behaviours are most predictive of long-term success?

No predefined answer. No suggested approach. Just the data and the question.

Here is exactly what I built, how I built it, and what the data actually said.


What Zerve Is and Why It Made This Possible

Zerve is a notebook-based data platform with built-in AI capabilities. Users create canvases, write and run code, call AI agents, use generative AI tools, and collaborate with teammates — all inside one workspace.

What makes Zerve genuinely different as a development environment is the variable() function. Any output computed in a Zerve notebook — a dataframe, a model result, a summary table — can be called directly into a Streamlit application with a single line of code.

```python
archetype_df = variable("kmeans_user_archetypes", "archetype_cluster_df")
```

That means the model and the dashboard are one connected system. No exporting CSVs. No manual uploads. No stale data. Every time the dashboard loads, it pulls live results directly from the notebook. That is not a minor convenience — it is a fundamental architectural advantage that made this entire project possible as a single coherent pipeline.
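To make the wiring concrete, here is a minimal sketch of that bridge. The `variable()` stand-in below just returns canned data so the shape of the call is visible; in a real Zerve app the call pulls live output from the notebook, and the dataframe would be rendered with Streamlit. The block and variable names are taken from the example above, but the stand-in implementation is an illustration, not Zerve's actual API.

```python
# Sketch of the notebook→dashboard bridge. The stand-in variable() below
# returns canned data; on Zerve it would fetch the live notebook output.
import pandas as pd

def variable(block_name: str, var_name: str) -> pd.DataFrame:
    # Stand-in for Zerve's variable(): would return `var_name`
    # as computed in the notebook block `block_name`.
    return pd.DataFrame({"user_id": ["u1", "u2"], "archetype_cluster": [0, 3]})

archetype_df = variable("kmeans_user_archetypes", "archetype_cluster_df")
# In the Streamlit app this dataframe is rendered directly, e.g.:
#   st.dataframe(archetype_df)
print(archetype_df.shape)  # (2, 2)
```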


The Dataset

Each row in the dataset is one user. For each user there are 43 behavioural features — signals describing what that user has done on the platform. The most important ones:

events_first_7d — Total activity in the first seven days. The single most intuitive early signal.

days_active_first_7d — How many of those seven days did they return? Frequency matters as much as volume.

agent_tool_calls — How many times did they call an AI agent? This is the platform's core capability. Usage here is the strongest predictor of everything that follows.

genai_events — Broader generative AI engagement. Prompting, AI-assisted code, model interactions.

credits_used_total — Credits are the platform's primary resource. Zero credit usage means zero real output.

feature_adoption_breadth — Did they explore multiple features or stay in one corner of the platform?

collaboration_index — How much did they work with others? Collaborative users embed themselves in team workflows, which dramatically increases retention.

execution_success_rate — When they ran code, did it work? A proxy for whether they are getting real value or hitting frustration.

target_success — The outcome variable. One or zero. Succeeded or did not. This is what both models are built around.
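With columns like these, even a one-line groupby starts answering the question. The sketch below uses a tiny synthetic frame with the column names above (the real dataset is not public), comparing mean agent usage between successful and unsuccessful users.

```python
# Tiny synthetic frame using the post's column names — the real dataset
# is not shown, so these five rows are invented for illustration.
import pandas as pd

df = pd.DataFrame({
    "events_first_7d":    [42, 3, 18, 0, 55],
    "agent_tool_calls":   [12, 0, 4, 0, 20],
    "credits_used_total": [310, 5, 90, 0, 480],
    "target_success":     [1, 0, 1, 0, 1],
})

# Do successful users engage more with AI agents?
summary = df.groupby("target_success")["agent_tool_calls"].mean()
print(summary)  # success=0 → 0.0, success=1 → 12.0 on this toy data
```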


The Approach — Two Models, One System

I built two completely separate models. Understanding why both are necessary is the core of the whole project.

Model One — XGBoost Gradient Boosting Machine

A supervised learning model trained on all 43 features against the target_success label. The model learns which combinations of behaviour predict success and produces a probability for every user — which I scale into a zero to one hundred success likelihood score.

I chose XGBoost because user behaviour data is non-linear and messy. The relationship between agent calls and success is not a straight line — it interacts with credit usage, with feature breadth, with early engagement patterns. GBM handles these interactions naturally.

I used SHAP values — SHapley Additive exPlanations — to make the model interpretable. SHAP tells you not just what the model predicts, but which features drove each individual prediction up or down. This is how I can tell you with confidence what the top predictive signals are, rather than just reporting an accuracy score.
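A minimal sketch of Model One on synthetic data, with two substitutions worth flagging: scikit-learn's `GradientBoostingClassifier` stands in for XGBoost, and the features and labels are generated rather than real. The probability-to-score rescaling matches the zero-to-one-hundred scheme described above; the SHAP step is shown only in comments.

```python
# Gradient boosting sketch on synthetic data. GradientBoostingClassifier
# stands in for XGBoost; the real model used all 43 features.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.poisson(5, size=(500, 4)).astype(float)   # 4 stand-in behavioural features
# Synthetic label: column 0 ("agent usage") drives success, plus noise.
y = (X[:, 0] + rng.normal(0, 2, 500) > 5).astype(int)

model = GradientBoostingClassifier(random_state=0).fit(X, y)
proba = model.predict_proba(X)[:, 1]
scores = (proba * 100).round().astype(int)        # 0–100 success likelihood

# Interpretability step from the post, roughly:
#   import shap
#   explainer = shap.TreeExplainer(model)
#   shap_values = explainer.shap_values(X)   # per-user, per-feature attributions
print(len(scores), scores.min(), scores.max())
```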

Model Two — KMeans Clustering, k=4

An unsupervised model — no success label involved. It looks purely at behavioural similarity and finds natural groups in the data. The number four was chosen because it produces four genuinely distinct archetypes without over-fragmenting the user base.

Why run clustering separately from prediction?

Because a score alone is not actionable. If a customer success manager sees a user scored 24, the immediate question is — what do I do? The score tells you how urgent the situation is. The cluster tells you what action to take. A Casual Explorer scoring 24 needs a re-engagement tutorial. An At-Risk User scoring 24 needs free credits to try AI features for the first time. Same score. Completely different response. You need both.
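Model Two can be sketched just as briefly. On synthetic data, assuming the standard recipe of standardising features first (so no single scale dominates the distance metric) and then fitting KMeans with k=4:

```python
# KMeans sketch on synthetic data: standardise, then cluster into k=4 groups.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
# Four invented behavioural groups, 100 users each, 3 stand-in features.
X = np.vstack([rng.normal(loc, 1.0, size=(100, 3)) for loc in (0, 4, 8, 12)])

X_scaled = StandardScaler().fit_transform(X)
kmeans = KMeans(n_clusters=4, n_init=10, random_state=1).fit(X_scaled)
labels = kmeans.labels_  # one cluster id per user, later named as archetypes
print(np.bincount(labels))
```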


The Four Archetypes

Power Coders
Heavy agent usage, high credit consumption, scripting and deploying to production. These users found the platform's core value immediately. They have the highest success rate of any group and respond to capability-led upsells — more compute, advanced APIs, dedicated support.

Collaborators
Socially active builders working in shared canvases and team workflows. Their viral potential is enormous — every collaborator is a gateway to additional team seats. The right intervention is a frictionless team upgrade prompt when their collaboration score crosses a threshold.

Casual Explorers
They signed up, explored, created some things — but never formed a consistent habit. There is genuine intent here. It just has not converted. The thirty-day window is critical. A well-timed tutorial showing them a relevant use case reactivates more than twelve percent of this group.

At-Risk Users
The largest group. They browse canvases, perform basic actions, appear active — but they have never touched the AI features that drive real retention. Without a catalyst, they churn silently. The highest-leverage intervention is a time-sensitive free credit grant that removes the cost barrier to first AI use.


The Dashboard

Built in Streamlit, connected directly to Zerve notebook variables, and designed to serve five completely different business audiences from one tool.

Platform Overview — The executive layer. Success rate, score distribution across tiers, cluster breakdown, archetype success rate comparison, and score distribution by cluster. The full picture in one screen.

User Lookup — For the customer success team. Paste any user ID and get their score, cluster, raw probability, and a plain-English recommended action. A comparison chart shows their behavioural profile against the dataset median — you can see exactly which dimensions they are underperforming on and why they scored the way they did.

Cluster Recommendations — For the growth team. Each archetype has a specific recommended intervention with a trigger condition written as an executable rule — ready to drop directly into a CRM or marketing automation workflow. No manual review required.

ROI Table — For leadership and finance. Cluster size multiplied by expected lift gives estimated users impacted. Multiply by average contract value and you have a business case, not just a data science result.

Cluster Narratives — For everyone else. Plain-English portraits of each archetype written for product managers, marketers, and executives who need the story rather than the statistics. This is the layer that makes the intelligence portable — it travels into strategy documents, board decks, and onboarding playbooks.
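Two of these layers are easy to make concrete. The sketch below shows what an "executable rule" for the Cluster Recommendations tab and the ROI arithmetic might look like — every threshold, lift figure, and contract value here is invented for illustration, not taken from the actual dashboard.

```python
# Hypothetical trigger rules: one predicate + action per archetype.
# All thresholds are invented; the real rules live in the dashboard.
RULES = {
    "At-Risk User": (
        lambda u: u["agent_tool_calls"] == 0 and u["events_first_7d"] > 5,
        "Grant time-limited free credits for first AI use",
    ),
    "Casual Explorer": (
        lambda u: u["days_active_first_7d"] <= 2,
        "Send a use-case tutorial within the 30-day window",
    ),
}

def recommend(user: dict, archetype: str):
    trigger, action = RULES.get(archetype, (lambda u: False, None))
    return action if trigger(user) else None

def roi(cluster_size: int, expected_lift: float, avg_contract_value: float) -> float:
    # ROI table logic: size × lift = users impacted, × contract value = dollars.
    return cluster_size * expected_lift * avg_contract_value

user = {"agent_tool_calls": 0, "events_first_7d": 9, "days_active_first_7d": 1}
print(recommend(user, "At-Risk User"))
print(roi(cluster_size=1200, expected_lift=0.05, avg_contract_value=900.0))  # 54000.0
```

Rules in this shape drop straight into a CRM or automation workflow: evaluate the predicate on each user's feature row, fire the action when it returns true.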


The Finding

After all of it — the modelling, the clustering, the SHAP analysis — the answer to Zerve's question is this.

Early, deep engagement with AI features is the single strongest predictor of long-term user success.

Not tenure. Not the number of canvases created. Not account age. Whether a user called an agent, consumed credits, and engaged with generative AI in their first seven days is what separates the users who succeed from the ones who quietly disappear.

The users who stay on the surface — active but never crossing into the AI core — are not disengaged. They are one well-timed intervention away from becoming Power Coders. The platform's job is to make that crossing happen before day thirty.

That is what this dashboard is built to do.


Why Zerve Made This the Right Tool for the Job

I could have built this in any notebook environment. But Zerve's variable() system meant the model outputs flowed directly into the dashboard without any intermediate steps. The clustering results, the feature importances, the scored dataframe — all live in the notebook, all accessible in the app with one line.

That is not a workflow convenience. It is the reason the system works as a single coherent product rather than a collection of disconnected scripts. For anyone building production-grade data applications, that kind of tight integration between computation and presentation is genuinely rare.


#MachineLearning #DataScience #Streamlit #Python #XGBoost #KMeans #SHAP #ProductAnalytics #UserSuccess #ChurnPrediction #Zerve #BuildInPublic #MLEngineering #HackathonProject #DataDriven
