Nic Lydon

Building a Skills Updater Pipeline for AI Platforms

I turned 1,870 JSONL files into six new user-level skills for my AI platform in a single session. Here’s how I built a repeatable pipeline for skills-updater.

The Problem

I had a one-off question: "Look through all my Claude Code JSONL files and recommend new skills." Answering it meant walking through ~904 MB of data across 45 project directories, filtering down to 2,752 real user-typed prompts, and cross-referencing them against an existing set of 56 skills (9 user + 47 plugin). The manual deep-dive was expensive, far too expensive to redo from scratch, so I built skills-updater to automate it.
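The filtering step can be sketched in a few lines. This is a minimal illustration, not the repo's actual parser: the field names (`type`, `text`) are my assumptions about the JSONL schema, and the real logs may nest messages differently.

```python
import json
from pathlib import Path

def extract_user_prompts(root):
    """Walk project directories and collect user-typed prompts from JSONL logs.

    The 'type' and 'text' fields are assumed for illustration; the real
    transcript schema may differ.
    """
    prompts = []
    for path in Path(root).rglob("*.jsonl"):
        for line in path.read_text().splitlines():
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines rather than crash the batch
            if entry.get("type") == "user":  # assumed marker for human input
                prompts.append(entry.get("text", ""))
    return prompts
```

The try/except matters more than it looks: with ~1,870 files, a single malformed line shouldn't kill the whole run.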

The Pipeline

The repo lives at the heart of my AI ecosystem, tied to Nexus and ARIA. I wrote scripts to parse the JSONL files, extract meaningful user interactions, and rank skill gaps. The synthesis returned 12 candidates; six shipped immediately:

- `narrative-docs-update`: Captures my policy of documentary-grade writing (147 hits across 30 projects).
- `whats-next`: Briefs me on session restarts (62+ hits).
- Four others targeting specific repetitive tasks.

The pipeline runs on my own server, leveraging local compute to keep costs down. I used Node.js for file parsing and Python for ranking logic, with outputs written back as actionable configs.
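The "written back as actionable configs" step could look something like the following. This is a hedged sketch: the function name, the one-JSON-file-per-skill layout, and the candidate dict shape are all my assumptions, not the repo's actual output format.

```python
import json
from pathlib import Path

def write_skill_configs(candidates, out_dir):
    """Emit one JSON config per candidate skill.

    Each candidate is assumed to be a dict with at least a 'name' key;
    the file-per-skill layout is illustrative, not the repo's real schema.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for candidate in candidates:
        path = out / f"{candidate['name']}.json"
        path.write_text(json.dumps(candidate, indent=2))
        paths.append(path)
    return paths
```

Writing plain JSON keeps the outputs diffable, which is handy when the batch reruns and you want to see what changed.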

The Code

Here’s a simplified snippet from the ranking script:

```python
# skills_ranker.py
def rank_candidates(prompts, existing_skills):
    """Score prompts that no existing skill covers; keep the top 12."""
    gaps = []
    for prompt in prompts:
        if not matches_existing(prompt, existing_skills):
            gaps.append(calculate_relevance(prompt))
    # calculate_relevance returns a dict that includes a 'frequency' count
    return sorted(gaps, key=lambda x: x["frequency"], reverse=True)[:12]
```
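For context, `matches_existing` could be as simple as a token-overlap check. The heuristic below is my guess at one plausible implementation, not the repo's actual logic; the `threshold` parameter is hypothetical.

```python
def matches_existing(prompt, existing_skills, threshold=0.5):
    """Crude coverage check: does any existing skill description share
    enough of its tokens with the prompt?

    Illustrative heuristic only; the real matcher may use embeddings
    or something smarter than bag-of-words overlap.
    """
    words = set(prompt.lower().split())
    for skill in existing_skills:
        skill_words = set(skill.lower().split())
        if skill_words and len(words & skill_words) / len(skill_words) >= threshold:
            return True
    return False
```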

The Tradeoffs

The first run missed edge cases: some prompts were misclassified as noise because of inconsistent formatting, and I ended up manually tweaking the filter logic at 2 a.m. to catch them. The pipeline also isn't real-time; it's a batch process that assumes static data. That's a limitation I'll address in v2.
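To give a sense of why that filter is fiddly, here is the kind of heuristic that tends to misfire. Every rule below is illustrative, assumed by me rather than taken from the repo, and each one has edge cases (for example, a genuine prompt that happens to start with pasted JSON would be dropped).

```python
import re

def is_noise(text):
    """Heuristic noise filter (illustrative, not the actual repo logic):
    drop empty lines, slash commands, pasted JSON blobs, and giant pastes."""
    text = text.strip()
    if not text:
        return True
    if text.startswith("/"):           # slash commands aren't skill signals
        return True
    if re.match(r"^\s*[{\[]", text):   # likely a pasted JSON/tool-output blob
        return True
    if len(text) > 4000:               # likely pasted logs, not a typed prompt
        return True
    return False
```

Inconsistently formatted prompts fall through rules like these, which is exactly the 2 a.m. tweaking described above.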

Why It Matters

This isn’t just about skills. It’s about turning manual grunt work into a system. If you’re building AI tools, you’ve likely faced the same slog: repeating analysis that a script could do. Automating this saved me hours per session, and it scales as my project count grows. Next up: wiring this into Nexus for continuous updates.

*A dimly lit workbench with an open notebook, a terminal displaying code, and six index cards fanned out: a late-night working session blending analog and digital.*

What repetitive tasks are you automating in your builds? Let me know; I’m always hunting for the next pipeline.
