DEV Community: Daniel Stepanian

Server experiments: 2026 Q1-Q2

Daniel Stepanian — Mon, 01 Jun 2026 21:13:46 +0000

One of my personal goals is to have more actively running services on servers. I have been learning new technologies and at the same exploring different infrastructure options: local homelab, VPS, serverless, renting hardware remotely. Here’s what I’ve explored in Q1-Q2 ‘26:

Homelab Experiments:

I set up a home server lab with a Lenovo Thinkstation with 24GB RAM, Intel i5, 256GB SSD, 4GB VRAM - Nvidia Quadro.
I use Proxmox. It’s cool - that I have connected only 2 cables to the computer - power and ethernet - and i connect using browser from my laptop, using ssh.
Ideas that i explored: 10 containers, some of them:

Openclaw. Interesting idea, but for my needs - i wanted it to create simple agentic research workflow - it failed and burned too many openai tokens. Wouldnt give access to my email, which seems most hype where is.
Ollama - installed locally with mainly phi-4-mini
Local home file cloud - also from browser, accessible from any local computer. Automation on top: using local phi-4-mini model - I can upload any new file to source folder, and then model decides based on file name and file type to which folder to move it.
Langchain: using mixture of models: Openai API, local models on ollama - research workflows on any topic of interest
OpenWeb UI - local space for chatting - both openai and local models for chat. Knowledge source - RAG from local cloud integration. Aim: add bielik 11b model. Accept slower working: 4GB VRAM + 7GB RAM model split.

Runpod experiments

Like the idea of renting by hours. 24GB vram 4090 - perfect for many models - I use mostly text - lama, bielik, uncensored models on top of ollama - very easy and fast setup.

ComfyUI - experimented with it. Loved idea of visualized workflows, wanted to try with text, but seems video is the main focus of this
I experimented with Terraform - to rent+set up all with one command - it worked partially - the pod setup was failing, so I dropped it.
Interesting idea to consider: try serverless pods - to have a fav big model activated after sending a request (need to accept high latency, personal use?)
Conversational agent on top of Bielik - with browser UI - speech to text -> agent -> text to speech. Quite nice - maybe I will use this and explore in future.

VPS

OVH: Analytical layer on top of Graphmotivo platform: custom-made GTM integration (extended data than simple GTM tags for analytical purposes) -> server receiver -> Bigquery. Also regular Google analytics data for comparison. I love the idea of custom enhanced measurement on user-level, it helps with bot and fraud detection - legitimate purpose.
GCP VM: graph loading scripts (Bigquery->Neo4j), fraud detection platform frontend (Hackathon teamwork). I started using Claude Code there big time.

How I use Claude Code

I’ve been using Cursor since late 2024, spent a couple of billions tokens since then and had many ups and downs. I tried to learn latest best-practices and tips regularly - they changed big time. Models got so much better.
In January 2026 I tried Github copilot in VScode, but maybe it’s only me, or I had done something wrong with that - but it felt like a transition from a Lamborghini to a tractor.
In May 2026 I installed Claude. I wasn’t convinced of working with AI directly in the terminal, but I’ve got used to it eventually. Got a chance to use it extensively during a Hackathon, and feels like it’s also great, sometimes better than Cursor. It’s a lot clearer to me.

I loved the idea of LLM Wiki - by Andrei Karpathy. That there is a personal wiki that we can build for any topic, and keep documents structured in folders like : /wiki - /entities /ideas /analysis /data /concepts /milestones - custom for every project. User uploads raw documents to /raw folder, then Claude acts like a librarian that reads it, processes it into the wiki. It sticked with me, and there is a lot of space to add more skills, MCPs and optimize for less token usage.

SaaS System Design - Graphmotivo

Daniel Stepanian — Mon, 09 Feb 2026 11:48:45 +0000

Background

In the last months of 2025 I was under great impression of graphs: graph databases, knowledge graphs, graph algorithms, Neo4j, GraphRAGs. I feel like these technologies open another valuable approach of understanding reality through the lens of relations between them. We can visualize relations, train ML models, find patterns and make better predictions, and understand knowledge domains better.
Under such data-science-oriented influence I also have been learning the DevOps side on intense 5-weekend bootcamp, which gave me abilities to set up, run, monitor and move software production deployments.

Idea, project overview

The idea of using knowledge graphs, embeddings to match buyer persona’s treats to product use cases, and matching it all to Google/META ad configs to reach the right people came to me much earlier and during a couple of months, the final idea matured in my head. I started working on Graphmotivo in late November 2025, and devoted much of my free time throughout December and January. By coincidence, we saw a significant increase in the quality of AI programming, which allowed me to create this project in such a short time: 1.5 months.
Graphmotivo is a future marketing intelligence platform that uses Expert AI Agentic workflows and knowledge graphs, which allows marketers and business owners generate buyer personas, use case story journeys, ad targeting inspirations, and explorable identity graphs of simulated buyer personas.

Development Process

My primary tool was Cursor 2.1.49 version with Composer for researching and planning the work, sometimes Auto mode for simpler stuff, and Sonnet 4.5 / Opus 4.5 for hardest parts, and debugging, which often took place for 2-4 hours to find root causes and make fixes. In total, around 500m tokens were used.

The development process was iterative: I started with the design doc, architecture plan, and building every service step by step, starting with the heart: Neo4j graph database + Agentic workflows to fill it in with data, based on user prompts. I tested several agentic workflows frameworks: Google ADK - Agent Development Kit (dropped it because it had a constraint of running up to 10 steps in a workflow), LangGraph (dropped as it was too complex for debugging, too much of important parts what works and how was hidden under abstraction). So I ended up creating a custom Python workflow of agents that: think about persona matching the business description, research of these persona traits, what they do, their problems, what website they visit. Then another set of agents extracts nodes and relations and translates them to a Cypher query according to a specific database schema that I created during a project planning phase. Cypher queries are invoked, graphs are created. Other agents, based on research agents’ data, generate ad targeting configs, user journey story, generate images for the scenario and return data from Neo4j that can be then used in web-based graphs for visualization. These workflows are deployed in a separate container, and are controlled by API.

Once a scenario generation process was working on local docker containers (this was achieved after ~50+ failing 5 minute agentic workflow processes during development) - I’ve finally got json files that could be loaded to the demo UI. Frontend was built in a separate container, for backend I used Supabase, which is an open source Firebase alternative that makes backend setup much faster, as it comes with Postgres database, storage, authorization out-of-the-box. Payment Integration with Paypal, sandbox tests, and when everything worked locally, I went to the next step: Production deployment.

Building a production environment was based on a central plan in a .md file that was preceded with a research & planning phase. The production steps were:

Pre-deployment preparations (env, code verifications, security hardening)
VPS, DNS setups, server hardening, firewall, user permissions
Infrastructure config with Docker compose
Reverse proxy setup, SSL
Docker Compose deployment, database migrations, RLS policies, network security hardening
GCP integrations: Google Oauth authorization, Pub-sub automation for a spend safety cap - Gemini API cost usage to detach the budget account from the project once spend crosses a treshold.
CI/CD with Github Actions
Testing, debugging the layout, agentic workflows (another 50+ failed during this phase before getting it working), testing auth, payments sandbox > production
Grafana + Loki containers for observability
Ansible playbooks for future reproducibility

Results

In mid-January 2026, everything was working as intended, and the platform was hosted at: https://graphmotivo.dstepanian-tech.ovh/

It offers 3 demo persona purchase story explorations, a flexible token-based payment system allowing users to request their own custom persona + user journey presentation with graph explorations.

There are some improvements that could be done in future i.a. with UX optimization, graph database deduplication (e.g. META vs Meta Ads) but for this stage it’s good enough. It was fun building this, but it required a tight focus and patience to work in Cursor. Sometimes it still feels like working with ultra-fast but clueless temps. These also helped through the process: basic technical understanding of how LLMs work, statistics, AI-augmented programming experience and best practices to guide programming models correctly to the goal, ensure they know what’s needed, and use them to identify root causes of errors together.

Failed Machine Learning Experiment: Training XGBoost Classifier with 1.5m signals

Daniel Stepanian — Sun, 08 Feb 2026 19:15:25 +0000

In 2022 I started creating trading strategies in Python, and I had in mind some powerful ML-based strategies, but had neither knowledge, nor abilities to code and test them. Now, although I still have no experience with professional Machine Learning with deep mathematics, I thought that I could use AI to write code for this (Sonnet 4.5), and suggest model parameters (Grok Thinking).

When looking at many market price charts, I was under the impression that there are some patterns that can be utilized and with a right set of trading strategy and position optimization to get automated at least a couple percents of return. It’s clear that this is not true - usually the market presents a distorted picture. Yet, tempted with the ability to check it myself in a quick prototype project, I proceeded with an experiment to verify if a hypothesis based on my earlier impressions could be true. I used 2 Jupyter notebooks: XGBoost model training and Strategy backtest.

First, I downloaded 5y data of 15m timeframe price points for top 30 crypto tokens into parquet files. Then created an algorithm to find all price points, after which there was a price drop bigger than 3% within the 10 next 15 min blocks, and extracted preceding 10 price points with technical analysis indicators as training data for XGBoost classifier - for identification of moments preceding price drops. 500k Drop signals were found, and I added another 1 million with random non-drop preceding samples, in total 1.5m training samples, with 20% from these used for testing.

I’ve also normalized drops, as 3% drop on Bitcoin has a different magnitude than the same drop on Dogecoin. So I’ve chose the drop threshold = -2 expressed with Z-score approach: drop_zscore=drop_pct/volatility. It means that it’s a drop with 2x typical volatility (based on std deviation).
Then feature engineering process based on indicators momentum, volatility, price differences. Data preparation, then XGBoost training with parameters from Grok’s recommendation:

*Recommended hyperparameters:**
- `max_depth`: 3-7 (prevents memorizing noise)
- `learning_rate`: 0.01-0.1 (smaller = better with more trees)
- `n_estimators`: 200-500 (with early stopping)
- `subsample` / `colsample_bytree`: 0.6-0.9 (prevents overfitting)
- `scale_pos_weight`: 3-10 (handles class imbalance)

The model performed very similarly on test predictions and train set:

============================================================
TRAIN SET PERFORMANCE
============================================================
ROC-AUC Score: 0.6899

Classification Report:
              precision    recall  f1-score   support

   No Signal       0.93      0.62      0.74   3149036
      Signal       0.19      0.66      0.29    426220

    accuracy                           0.62   3575256
   macro avg       0.56      0.64      0.52   3575256
weighted avg       0.84      0.62      0.69   3575256

Confusion Matrix:
[[1938267 1210769]
 [ 144995  281225]]

============================================================
TEST SET PERFORMANCE (Unseen Data)
============================================================
ROC-AUC Score: 0.6761

Classification Report:
...
Train AUC: 0.6899
Test AUC:  0.6761
Difference: 0.0138
✓ Good generalization - minimal overfitting

Basic returns turned out to be the most important feature for drop prediction. Yet there are too many false positives, which could hurt a portfolio.

So I thought, maybe a set of position parameters could save this signal and make it usable? So I proceed with the backtesting notebook. I loaded the model, created a backtesting trading simulation environment, and a set of trading position parameters: TP, SL, delay, cooldown. I tried a grid search optimization approach - to test 900 scenarios with parameter combinations and find algorithmically the best one. It took 3 hours on my local computer, and yet… All scenarios resulted in 100% loss! The process failed miserably.

It was nice working on this step-by-step with Cursor + Sonnet 4.5. I’ve read a lot about XGBoost when building this, so just telling the assistant what needs to be done and why, and seeing it creating neat notebooks that work out-of-the-box or after 1-2 debug-fix iterations, felt almost seamless. Working with Jupyter Notebooks in Cursor is not convenient - the notebook needs to be closed, reopened and rerunned manually after changes applied in Agent mode. So I ended up in Ask Mode and pasting the code blocks manually.