Every October, the developer community buzzes with Hacktoberfest energy. PRs fly, t-shirts are earned, and GitHub turns green. But here's what nobody talks about: what happens in November?
For most contributors, the answer is simple: nothing. The repositories they contributed to become distant memories, the architecture they briefly touched remains unexplored, and the relationships they could have built with maintainers never formed.
I decided to do things differently this year. Let me tell you how one Hacktoberfest discovery turned into a masterclass in production AI systems.
The Problem: October-Only Open Source
Let's be honest about the Hacktoberfest paradox:
The Surface-Level Contribution Trap: Quick documentation fixes and typo corrections are valuable, don't get me wrong. But if that's ALL you do, you're missing the forest for the trees. The real learning happens when you understand why the code is structured a certain way, not just that a semicolon was missing.
Chasing Swag Over Growth: T-shirts and badges are nice, but they don't teach you how to build production-grade systems. The developers who grow fastest are the ones who stick around after October.
Missed Opportunities: The best open source contributions come from contributors who understand the codebase deeply. That takes time, more than one month.
The Discovery: Finding Skyflo.ai
While searching for interesting projects to contribute to, I stumbled upon Skyflo.ai, an open-source AI agent for DevOps and cloud-native operations.
As someone actively learning AI engineering and building agent architectures, this wasn't just another contribution opportunity. It was exactly the stack I wanted to learn:
LangGraph for stateful agent orchestration
MCP (Model Context Protocol) for standardized tool execution
Human-in-the-loop safety patterns
Kubernetes-native deployment
Instead of submitting a quick PR and moving on, I decided to dive deep.
What is Skyflo.ai?
Skyflo.ai is an AI copilot that unifies Kubernetes operations and CI/CD systems behind a natural-language interface. Instead of memorizing CLI commands or clicking through UIs, you just tell Skyflo.ai what you want:
Show me the last 200 lines of logs for checkout in production.
If there are errors, summarize them.
Or:
Progressively canary rollout auth-backend in dev through 10/25/50/100 steps
The magic is in how it does this safely, with human approval required for any mutating operation.
Understanding the Architecture
This is where the real learning happened. Skyflo's architecture is a textbook example of how to build production AI agent systems.
The Three-Layer Design
1. Frontend Layer: Command Center
Built with Next.js, TypeScript, and Tailwind
Real-time streaming: SSE to frontend, Streamable HTTP for MCP
Shows every action the agent takes in real-time
2. Intelligence Layer: The Engine
FastAPI backend with LangGraph workflows
Manages the Plan → Execute → Verify loop
Handles approvals and checkpoints
Real-time SSE updates to UI
3. Tool Layer: MCP Server
FastMCP implementation for tool execution
Standardized tools for kubectl, Helm, Jenkins, Argo Rollouts
Safe, consistent actions across all integrations
Why This Architecture Works
The separation of concerns is beautiful:
UI changes don't affect the agent logic
New tools can be added without touching the intelligence layer
Each component is independently deployable and testable
Key Learnings from Production AI Systems
1. LangGraph for Stateful Agents
Traditional LLM chains are stateless—you send a prompt, get a response, done. But real-world AI agents need to:
Remember context across multiple steps
Handle failures gracefully with checkpoints
Support human intervention at any point
LangGraph provides graph-based orchestration that enables all of this. The agent's workflow is defined as nodes and edges, with state persisted at each step.
Here's how Skyflo.ai implements this workflow in engine/src/api/agent/graph.py:
from langgraph.graph import StateGraph, START, END

def _build_graph(self) -> StateGraph:
    workflow = StateGraph(AgentState)

    # Define the workflow nodes
    workflow.add_node("entry", self._entry_node)
    workflow.add_node("model", self._model_node)
    workflow.add_node("gate", self._gate_node)
    workflow.add_node("final", self._final_node)

    # Define the flow
    workflow.add_edge(START, "entry")
    workflow.add_conditional_edges(
        "entry", route_from_entry,
        {"gate": "gate", "model": "model"},
    )
    workflow.add_conditional_edges(
        "model", route_after_model,
        {"gate": "gate", "model": "model", "final": "final"},
    )
    workflow.add_conditional_edges(
        "gate", route_after_gate,
        {"model": "model", "final": "final"},
    )
    workflow.add_edge("final", END)
    return workflow
This creates a stateful workflow in which the agent loops between the model node (planning and tool calls) and the gate node (approval and execution checkpoints) until it reaches the final node and the task is complete.
2. MCP: The USB-C for AI
Model Context Protocol is becoming the standard for how AI agents interact with tools. Instead of building custom integrations for each tool (the "M x N nightmare"), MCP provides:
A universal interface for tool discovery
Standardized invocation patterns
Clean separation between agent logic and tool implementation
Think of it as "OpenAPI for AI agents."
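To make the "M x N nightmare" concrete, here is a toy sketch in plain Python (not the real MCP SDK; every name here is hypothetical): each tool registers once against a single interface, and any agent can discover and invoke it without a bespoke integration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

@dataclass
class ToolSpec:
    """Describes a tool the same way for every agent (discovery)."""
    name: str
    description: str
    handler: Callable[..., Any]

class ToolRegistry:
    """One shared interface between M agents and N tools, instead of M x N adapters."""

    def __init__(self) -> None:
        self._tools: Dict[str, ToolSpec] = {}

    def register(self, spec: ToolSpec) -> None:
        self._tools[spec.name] = spec

    def list_tools(self) -> List[Tuple[str, str]]:
        # Agents discover capabilities at runtime instead of hard-coding them
        return [(t.name, t.description) for t in self._tools.values()]

    def invoke(self, name: str, **kwargs: Any) -> Any:
        # Standardized invocation: the same call shape for every tool
        return self._tools[name].handler(**kwargs)

registry = ToolRegistry()
registry.register(ToolSpec(
    name="k8s_get",
    description="Fetch Kubernetes resources by kind and namespace",
    handler=lambda kind, namespace: f"{kind} in {namespace}",
))

print(registry.invoke("k8s_get", kind="pods", namespace="production"))
```

The real protocol adds JSON schemas, transports, and capability negotiation, but the core idea is this registry pattern: discovery plus a uniform invocation contract.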
3. Human-in-the-Loop is Non-Negotiable
For DevOps operations, automatic execution without approval is dangerous. Skyflo's pattern:
Agent creates a plan
User reviews and approves
Agent executes
Agent verifies the outcome
Repeat until complete
This Plan → Execute → Verify loop with human approval gates is a pattern every production AI system should adopt.
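The loop above can be sketched in a few lines. This is a simplified illustration, not Skyflo.ai's implementation; all four callables are hypothetical stand-ins.

```python
def run_task(plan_fn, execute_fn, verify_fn, approve_fn, max_iters=5):
    """Plan -> (human approval) -> Execute -> Verify loop.

    plan_fn proposes an action, approve_fn asks a human to confirm it,
    execute_fn performs it, and verify_fn reports whether the goal is met.
    """
    for _ in range(max_iters):
        action = plan_fn()
        if not approve_fn(action):          # human gate before any mutation
            return "aborted by user"
        result = execute_fn(action)
        if verify_fn(result):               # only stop once the outcome is verified
            return "done"
    return "gave up after max_iters"

# Toy usage: verification succeeds on the second iteration
attempts = []
status = run_task(
    plan_fn=lambda: "scale deployment to 3 replicas",
    approve_fn=lambda action: True,          # auto-approve for the demo
    execute_fn=lambda action: attempts.append(action) or len(attempts),
    verify_fn=lambda result: result >= 2,
)
print(status)  # done
```

The key design choice is that approval sits between planning and execution, so nothing mutating runs without a human seeing the plan first.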
4. Real-Time Streaming Builds Trust
Users need to see what the agent is doing. Skyflo.ai streams every action in real-time:
Tool invocations
Intermediate reasoning
Execution results
Verification steps
This transparency is critical when your agent is touching production infrastructure. Skyflo.ai streams events via SSE from the Engine to the UI, while the Engine communicates with the MCP server over Streamable HTTP transport for efficient tool execution.
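For anyone unfamiliar with SSE, the wire format is simple text frames. Here is a minimal sketch of how an engine might emit them (the event names and payloads are illustrative, not Skyflo.ai's actual schema):

```python
import json

def sse_event(event_type: str, payload: dict) -> str:
    """Format one Server-Sent Event frame.

    SSE is just text/event-stream: an 'event:' line, a 'data:' line,
    and a blank line terminating the frame.
    """
    return f"event: {event_type}\ndata: {json.dumps(payload)}\n\n"

def stream_agent_actions(actions):
    # Yield one frame per agent action so the UI can render progress live
    for action in actions:
        yield sse_event("tool_call", action)
    yield sse_event("done", {"status": "complete"})

frames = list(stream_agent_actions([{"tool": "kubectl_logs", "args": {"tail": 200}}]))
print("".join(frames))
```

Because each frame is flushed as it happens, the browser's EventSource API can show tool calls the moment they fire rather than after the whole run finishes.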
5. Safety-First Design
Key safety patterns I observed:
Dry-run by default for Helm operations
Diff-first before any apply
Approval gates for all mutations
Audit logging of every action
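"Dry-run by default" is worth pausing on, because it inverts the usual CLI default. A hypothetical helper (not Skyflo.ai's code) makes the pattern clear: mutations require an explicit opt-in, not an opt-out.

```python
def build_helm_upgrade(release: str, chart: str, *, approved: bool = False) -> list:
    """Build a helm upgrade command that is a dry run unless explicitly approved."""
    cmd = ["helm", "upgrade", release, chart]
    if not approved:
        cmd += ["--dry-run"]   # preview rendered manifests without applying them
    return cmd

print(build_helm_upgrade("auth-backend", "./charts/auth"))
print(build_helm_upgrade("auth-backend", "./charts/auth", approved=True))
```

The safe path is the default path: forgetting a flag gives you a preview, never a mutation.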
Architecture Note: Communication between components uses different protocols, each optimized for its use case:
Engine → UI: Server-Sent Events (SSE) for real-time user feedback
Engine → MCP Server: Streamable HTTP transport for tool execution
My Contributions to Skyflo.ai
Over the past few months, I've contributed multiple features and fixes to Skyflo.ai:
Features Shipped
Jenkins Build Control: Added tools to stop/cancel running builds, enabling full CI/CD lifecycle management
Kubernetes Rollout Management: Implemented rollout history and rollback tools for safer deployments
Helm Template Rendering: Added helm_template tool to preview manifests before deployment
Label Selector for K8s Resources: Enhanced k8s_get tool with label selectors for more precise resource queries
Chat History Search: Implemented debounced server-side search for better conversation management
Bug Fixes & UX Improvements
Message Continuity Fix: Resolved critical issue where chat messages disappeared after tool approval/denial
Approval Flow Refinement: Streamlined approval action handling and message finalization in the UI
Navigation Enhancements: Made logo clickable and added GitHub project link to navbar
Profile Management: Fixed button state management for profile updates
SSE Timeout Fix: Increased Nginx proxy timeouts to prevent 60-second SSE connection cutoffs
Documentation
Fixed architecture guide link in CONTRIBUTING.md to help new contributors
Each contribution taught me something new about production AI systems, from SSE streaming patterns to Kubernetes operations safety.
My Journey: Challenges and Breakthroughs
Contributing to Skyflo.ai wasn't always smooth sailing. Here are the challenges I faced and what I learned from them:
Understanding LangGraph State Management
The Challenge: At first, I didn't understand how state flows through the workflow nodes. The conditional edges and state updates seemed complex.
The Breakthrough: After reading through engine/src/api/agent/graph.py and experimenting locally, I realized that LangGraph's state is additive: each node returns updates that merge with the existing state. This pattern makes it easy to maintain conversation context while keeping nodes independent.
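That additive merge can be imitated in a few lines of plain Python. This mirrors the spirit of LangGraph's reducer mechanism rather than its actual API, and the field names are made up:

```python
import operator

# Per-field reducers: 'messages' accumulates, everything else overwrites
REDUCERS = {"messages": operator.add}

def merge_state(state: dict, update: dict) -> dict:
    """Merge a node's partial update into the current state."""
    merged = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        if reducer and key in merged:
            merged[key] = reducer(merged[key], value)  # append, don't replace
        else:
            merged[key] = value
    return merged

state = {"messages": ["user: show logs"], "phase": "entry"}
state = merge_state(state, {"messages": ["agent: fetching logs"], "phase": "model"})
print(state)
```

Each node only has to return the fields it changed; the framework handles combining them, which is why nodes stay independent.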
Decoding the MCP Abstraction
The Challenge: The abstraction between the Engine, MCP Client, and MCP Server initially confused me. I couldn't understand why we needed three separate components.
The Breakthrough: Once I traced a tool call through the entire flow, it clicked:
Engine receives user intent via LLM
MCP Client (engine/src/api/services/mcp_client.py) acts as a bridge
MCP Server (mcp/tools/) executes actual kubectl/helm commands
This separation means you can swap out tools without touching the agent logic—brilliant architecture.
Grasping the Approval Flow
The Challenge: Understanding when and how approval gates trigger took time. The interaction between approval_decisions state and ApprovalPending exceptions was not immediately obvious.
The Breakthrough: I discovered that the gate node checks if a tool requires approval, then raises ApprovalPending which halts execution. The state is checkpointed, and when the user approves/denies, the workflow resumes from exactly where it stopped. This is production-grade error handling.
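The pause-and-resume mechanics can be sketched with an exception and a saved state dict. This is a deliberately simplified model of the pattern, not Skyflo.ai's gate node:

```python
class ApprovalPending(Exception):
    """Raised by the gate to halt the workflow until a human decides."""

def gate_node(state: dict) -> dict:
    """Hypothetical gate: block mutating tools that lack an approval decision."""
    tool = state["pending_tool"]
    decision = state.get("approval_decisions", {}).get(tool)
    if state["tool_is_mutating"] and decision is None:
        raise ApprovalPending(tool)        # halt; state stays checkpointed
    if decision is False:
        return {**state, "status": "denied"}
    return {**state, "status": "executed"}

checkpoint = {"pending_tool": "kubectl_delete", "tool_is_mutating": True}
try:
    gate_node(checkpoint)
except ApprovalPending:
    # The checkpoint survives; resume later with the user's decision merged in
    checkpoint["approval_decisions"] = {"kubectl_delete": True}

print(gate_node(checkpoint)["status"])  # executed
```

Because the state is checkpointed before the exception propagates, the resumed call re-enters the gate with the decision present and continues from exactly where it stopped.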
Beyond Hacktoberfest: The Year-Round Journey
October: The Starting Point
Hacktoberfest is a fantastic catalyst. It lowers the barrier to entry and introduces you to projects you'd never discover otherwise. Use it as your launchpad, not your destination.
November-December: Go Deeper
This is where the real learning happens:
Read the entire codebase, not just the file you're changing
Join Discord/Slack discussions to understand the roadmap
Pick up complex issues that intimidate you
Ask questions about architectural decisions
Year-Round: Become a Maintainer
Consistent contributions build trust. Over time:
You start reviewing other contributors' PRs
Maintainers ask for your input on design decisions
You shape the project's future direction
You build lasting professional relationships
The Contributor Mindset
Here's what separates occasional contributors from impactful ones:
1. Choose projects you want to learn from
Don't just pick easy projects to farm PRs. Pick projects that use technologies you want to master. My contributions to Skyflo.ai taught me more about production AI systems than any tutorial could.
2. Contribute in multiple ways
Code features and bug fixes
Documentation improvements
Test coverage
Code reviews
Community support
All contributions are valuable. Mix them up.
3. Build relationships
Open source is as much about people as it is about code. The maintainers and contributors you meet today become your professional network tomorrow.
4. Track your growth, not just your PRs
The real metric isn't merged PRs; it's skills gained, patterns learned, and confidence built.
Your Turn: Start Now
Whether it's Skyflo.ai or another project that excites you, the best time to start contributing is now. Not next October, now.
Here's your action plan:
Find a project that aligns with what you want to learn
Read the contributing guidelines and code of conduct
Set up the development environment locally
Start with the codebase, not the issues: understand before you contribute
Join the community (Discord, Slack, Discussions)
Pick your first issue and dive in
Stay around after your first PR merges
Resources
Skyflo.ai GitHub: github.com/skyflo-ai/skyflo
Skyflo.ai Discord: discord.gg/kCFNavMund
LangGraph Docs: docs.langchain.com/langgraph
Model Context Protocol: modelcontextprotocol.io
Connect With Me
I'm sharing my AI engineering journey, open source contributions, and developer productivity tips:
YouTube: @sachin-chaurasiya
LinkedIn: in/sachin-chaurasiya
Dev.to: sachinchaurasiya
What's a project that taught you something unexpected? Drop a comment, I'd love to hear your open source stories.