DEV Community

Sumeet Dugg
Sumeet Dugg

Posted on

Your Local AI Software Engineer

Gemma 4 Challenge: Build With Gemma 4 Submission

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

What I Built

Introduction

An intelligent command-line coding assistant built with C#, Semantic Kernel, and local LLM inference. Get real-time code debugging, Docker assistance, and AI-powered development help - all running locally on your machine.

Download Ready to use? Download the pre-built executable from the Releases page. No installation required - just download, configure, and run!

Download

Ready to use? Download the pre-built executable from the Releases page.

No installation required - just download, configure, and run!

Features

  • Code Debugging & Fixing: Paste broken code, get working solutions

  • Docker Assistant: Generate Dockerfiles, docker-compose configs, and commands

  • Interactive Chat Interface: Beautiful terminal UI with Spectre.Console

  • Streaming Responses: Real-time AI output as it thinks

  • Plugin Architecture: Extensible Semantic Kernel plugins

  • 100% Local: No cloud APIs, complete privacy with llama.cpp

  • Fast: Optimized for local LLM inference

Prerequisites

Required Software

Optional but Recommended

  • Windows Terminal for better console experience

  • Git for version control

Demo

Quick Start (Using Pre-built EXE)

If you just want to run the application without building from source:

  • Download the Release

  • Go to the Releases section of this repository

  • Download the latest AiCodingAgent.zip file

  • Extract the ZIP file to a folder (e.g., C:\AiCodingAgent)

Set Up llama.cpp Server You still need the llama.cpp server and a model

@echo off
set MODEL_PATH= Your_model_path.gguf
C:\llama.cpp\llama-server.exe ^
-m %MODEL_PATH% ^
--ctx-size 2048 ^
--port 8080 ^
--threads 12 ^
--chat-template chatml

pause
Enter fullscreen mode Exit fullscreen mode

Configure the Application

Edit Configuration/appsettings.json in the extracted folder:

{
  "modelId": "gemma 4",
  "apiKey": "dummy-key",
  "endpoint": "http://localhost:8080/",
  "HttpClientConfig": {
    "TimeoutSeconds": 10
  }
}
Enter fullscreen mode Exit fullscreen mode

Run the Application

  • Start the llama.cpp server: start-llama-server.bat

  • Run AiCodingAgent.exe from the extracted folder

  • Start coding!

Installation & Setup (Building from Source)

Step 1:

Clone the Repository

git clone https://github.com/sumeetpypi/AICodingAgnet-SemanticKernel.git
cd AICodingAgnet
Enter fullscreen mode Exit fullscreen mode

Step 2:

Restore NuGet Packages

dotnet restore
Enter fullscreen mode Exit fullscreen mode

This will install:

  • Microsoft.SemanticKernel (v1.75.0)

  • Spectre.Console (v0.55.2)

  • Microsoft.Extensions.Configuration packages

  • Other dependencies from AiCodingAgent.csproj

Step 3:

Configure Application Settings

Edit Configuration/appsettings.json:

{
  "LLM": {
    "Endpoint": "http://localhost:8080",
    "ModelId": "local-model",
    "ApiKey": "not-needed"
  }
}
Enter fullscreen mode Exit fullscreen mode

Configuration Options:

  • Endpoint: URL where your llama.cpp server is running

  • ModelId: Identifier for your model (can be any string for local models)

  • ApiKey: Not required for local llama.cpp server

Step 4:

Set Up llama.cpp Server

Create a batch file start-llama-server.bat in your llama.cpp directory:

@echo off
set MODEL_PATH= Your_model_path.gguf
C:\llama.cpp\llama-server.exe ^
-m %MODEL_PATH% ^
--ctx-size 2048 ^
--port 8080 ^
--threads 12 ^
--chat-template chatml

pause
Enter fullscreen mode Exit fullscreen mode

Parameters Explained:

  • --model: Path to your GGUF model file

  • --port: Port number (must match appsettings.json)

  • --ctx-size: Context window size (tokens)

  • --n-gpu-layers: Number of layers to offload to GPU (0 for CPU-only)

  • --threads: CPU threads to use

  • --chat-template : For formating

Step 5:

Build the Project

In Visual Studio:

  • Open AiCodingAgent.csproj or the solution file

  • Build → Build Solution (or press Ctrl+Shift+B)

Or via command line:

 dotnet build
Enter fullscreen mode Exit fullscreen mode

Usage

Starting the Agent

Method 1: Using Pre-built EXE (Easiest)

# Start llama.cpp server first
start-llama-server.bat  # Windows

or 

# Double-click AiCodingAgent.exe or run from command line
AiCodingAgent.exe
Enter fullscreen mode Exit fullscreen mode

Method 2: Visual Studio

  • Press F5 to run with debugging

  • Or Ctrl+F5 to run without debugging

Method 3:

Command Line (from source)

# Start llama.cpp server first
start-llama-server.bat  # Windows

# In a new terminal, run the agent
dotnet run
Enter fullscreen mode Exit fullscreen mode

or

Using the Agent

Once started, you'll see the main menu:

=== AI Coding Agent ===
Commands:
  1 or Debug Code      - Debug and fix your code
  2 or Docker Commands - Get Docker help
  exit                 - Quit the application

You: 
Enter fullscreen mode Exit fullscreen mode

Option 1: Debug Code

You: 1
Paste your code and press Ctrl+Z then Enter (Windows)
Enter fullscreen mode Exit fullscreen mode
  • Type or paste your code

  • Press Ctrl+Z then Enter (Windows)

  • The AI will analyze and provide fixes/improvements in real-time

Example:

def calculate(x, y):
    result = x + y
    return result

print(calculate(5))  # Missing argument!
Enter fullscreen mode Exit fullscreen mode

The agent will identify the error and provide the corrected code.

Option 2:

Docker Commands

You: 2

Paste your Docker requirement and press Ctrl+Z then Enter
Enter fullscreen mode Exit fullscreen mode

Example requests:

  • "Create a Dockerfile for a Node.js application"

  • "Generate docker-compose for nginx and postgres"

  • "Fix this Docker error: [paste error]"

Exiting the Application

Type exit and press Enter at any prompt.

Project Structure

AiCodingAgent/
│
├── Agent/
│   ├── AgentPrompts.cs       # Display prompts and commands
│   ├── CodingAgent.cs        # Core agent logic with streaming
│   └── UserInput.cs          # Main loop and command handling
│
├── Configuration/
│   ├── AppSettings.cs        # Settings model classes
│   └── appsettings.json      # Configuration file
│
├── Kernel/
│   ├── KernelExtensions.cs   # Extension methods for kernel
│   └── KernelFactory.cs      # Kernel initialization
│
├── Plugins/
│   ├── CodingPlugin/         # Code assistance functions
│   │   ├── Code/
│   │   │   ├── config.json
│   │   │   └── skprompt.txt
│   │   ├── CodePython/
│   │   ├── DOSScript/
│   │   └── Entity/
│   │
│   └── DockerPlugin/         # Docker assistance functions
│       ├── docker/
│       │   ├── config.json
│       │   ├── skprompt.txt
│       │   └── docker_suggest.yaml
│       └── DockerPlugin.cs
│
├── Services/
│   ├── CodeParserService.cs  # Code parsing utilities
│   └── Services.cs           # Service initialization
│
├── Program.cs                # Application entry point
└── AiCodingAgent.csproj      # Project file with dependencies
Enter fullscreen mode Exit fullscreen mode

Customization

Adding New Plugins

  • Create a new folder in Plugins/

  • Add a YourPlugin.cs file

  • Create semantic functions with:

    - config.json: Function metadata

    - skprompt.txt: AI prompt template

Example config.json: json

{
  "schema": 1,
  "description": "Your function description",
  "execution_settings": {
    "default": {
      "max_tokens": 1000,
      "temperature": 0.7
    }
  }
}
Enter fullscreen mode Exit fullscreen mode

Example skprompt.txt:

You are an expert [domain] assistant.

Task: [What the function does]

Rules:
-[Rule 1]
-[Rule 2]

User Input: 
{{$input}}
Enter fullscreen mode Exit fullscreen mode

Modifying Prompts

Edit the skprompt.txt files in plugin folders. Changes take effect on next run - no recompilation needed!

Changing LLM Settings

Modify Configuration/appsettings.json to point to different servers or adjust parameters.

Troubleshooting

Common Issues

"Connection refused" or "Cannot connect to LLM"

  • Ensure llama.cpp server is running (llama-server.exe)

  • Check port matches in both server and appsettings.json

  • Verify endpoint URL is correct

"Model not found" error from llama.cpp

  • Check model path in batch file is correct

  • Ensure GGUF file exists and isn't corrupted

  • Try absolute path instead of relative

Slow responses

  • Reduce --ctx-size in server startup

  • Decrease model size (use quantized versions like Q4_K_M)

  • If using CPU, reduce --threads

  • For GPU, increase --n-gpu-layers

"Invalid command" in agent

  • Type exact commands: 1, 2, exit, or full names

  • Commands are case-insensitive

Build errors about .NET 10.0

  • Install .NET 10.0 SDK from Microsoft

  • Or modify TargetFramework in .csproj to net8.0 or net9.0

Dependencies

NuGet Packages:

  • Microsoft.SemanticKernel (1.75.0) - AI orchestration framework

  • Spectre.Console (0.55.2) - Beautiful terminal UI

  • Microsoft.Extensions.Configuration (7.0.0) - Configuration management

  • Microsoft.Extensions.Configuration.Json (7.0.0) - JSON config support

  • Microsoft.Extensions.VectorData.Abstractions (10.5.2) - Vector data support

External Tools:

  • llama.cpp - Local LLM inference server

  • GGUF Model - Quantized language model

Use Cases

  • Code Review: Get instant feedback on code quality

  • Bug Fixing: Paste error messages and get solutions

  • Docker Setup: Generate container configurations quickly

  • Learning: Understand code patterns and best practices

  • Prototyping: Generate boilerplate code fast. -->

Code

Github link: https://github.com/sumeetpypi/AICodingAgnet-SemanticKernel.git

Top comments (0)