DEV Community

Sangwoo Lee

Integrating AI into a Legacy Broadcasting CMS (Content-Uploading Manager System): Architecture Design

How I designed a sermon AI pipeline that bridges a 20-year-old ASP Classic system with modern STT, LLM structuring, and vector search — without touching the legacy core.

The Problem

I work at a Korean Christian broadcasting company (GOODTV) serving 900,000+ users. Every week, 140 pastors upload sermon MP3s through an internal Windows-based CMS built on ASP Classic — a system that's been running for over two decades.

The ask from leadership: "Make it searchable by AI."

Specifically, they wanted a RAG (Retrieval-Augmented Generation) system where staff could ask natural language questions like "What did Pastor Kim preach about forgiveness last year?" and get accurate, sourced answers.

Simple enough in theory. The challenge was the existing infrastructure.

The Legacy Stack

Before writing a single line of new code, I mapped out exactly what I was dealing with:

Windows CMS Server (192.168.1.228)
  └── ASP Classic (.asp files)
  └── MSSQL database
  └── IIS web server

Linux APM Servers (192.168.1.226 / 192.168.1.227)
  └── PHP transcoding pipeline
  └── C binary for media processing
  └── Apache

My Local PC (192.168.1.138)
  └── Python (Whisper STT, Ollama LLM, Pinecone client)
  └── RTX 3060 12GB GPU

The sermon MP3s were already being processed by a C binary (transcoding_ai) on the Linux servers. This binary handled format conversion and was called via PHP's exec(). The entire chain looked like this:

ASP (228) → HTTP POST → PHP (226/227) → exec() → C binary

The C binary already had the MP3 URL. That was my integration point.

Design Decision: Don't Break What Works

The safest architectural decision was also the most constrained: inject into the existing pipeline without modifying the ASP or C binary core logic.

Why? The CMS had accumulated years of edge cases, quirks, and undocumented business logic. One wrong change could break sermon uploads for 140 pastors at once.

The design I landed on:

┌─────────────────────────────────────────────────┐
│               FULL PIPELINE                     │
│                                                 │
│  [CMS 228] Admin checks "AI Learn" checkbox     │
│       │                                         │
│       ▼  HTTP POST (async, non-blocking)        │
│  [PHP 226/227] transcoding_ai.php               │
│    - Query DB: ai_learn_YN = 'Y'?               │
│    - Is output file .mp3?                       │
│    - If both true → pass to C binary            │
│       │                                         │
│       ▼  exec()                                 │
│  [C Binary] transcoding_ai                      │
│    - Existing transcoding logic (unchanged)     │
│    - NEW: curl POST → Python API                │
│       │                                         │
│       ▼  HTTP POST /process_url                 │
│  [Python 138:8001] api_server.py                │
│    - Download MP3                               │
│    - Whisper large-v3 STT                       │
│    - LLM structuring (Ollama)                   │
│    - Pinecone vector upload                     │
└─────────────────────────────────────────────────┘

The key insight: the C binary already had argv[2] (the MP3 URL) available. Adding a single curl call at the end of its execution was the minimal viable integration — no ASP changes, no core PHP changes, no MSSQL schema changes to the main content table.

The Gate: ai_learn_YN Field

The business logic required that not every sermon upload trigger AI processing — only the ones explicitly flagged by admins. I added a single column, ai_learn_YN CHAR(1) DEFAULT 'N', to the content table and surfaced a checkbox in the CMS UI.

PHP reads this flag before invoking the C binary:

// transcoding_ai.php (simplified)
$result = sqlsrv_query($conn,
    "SELECT ai_learn_YN FROM content_list WHERE idx = ?",
    [[$idx, SQLSRV_PARAM_IN]]
);
$row = sqlsrv_fetch_array($result, SQLSRV_FETCH_ASSOC);

$is_mp3     = substr($output_filename, -4) === '.mp3';
$ai_enabled = is_array($row) && $row['ai_learn_YN'] === 'Y';

if ($is_mp3 && $ai_enabled) {
    // Escape every argument — filenames and URLs originate from uploads
    $cmd = '/var/www/html/transcoding_ai '
         . escapeshellarg($list_idx) . ' '
         . escapeshellarg($mp3_url) . ' '
         . escapeshellarg($output_filename);
    exec($cmd);
}

Two conditions, both required. This prevents non-MP3 content (video sermons, announcements) from hitting the AI pipeline, and gives admins explicit control over what gets learned.

Why a Local PC, Not a Server?

The Whisper large-v3 model needs a GPU. Our Linux APM servers are CPU-only — they handle PHP, not ML workloads. Our organization isn't ready to provision a GPU EC2 instance, so I run the Python API server on my local workstation with an RTX 3060.

This is intentionally a dev/pilot setup. The Python API is accessible on the internal network (192.168.1.138:8001), and the C binary on the Linux servers can reach it via curl. For production, the same code runs on any machine with a GPU — swapping the IP is the only change required.
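Before wiring the curl call into the C binary, it's worth smoke-testing that the Linux servers can actually reach the workstation over the internal network. A stand-in for the C binary's POST, using only the Python standard library — the endpoint and IP come from the article, the payload field names are my assumptions:

```python
# smoke_test.py — stand-in for the C binary's curl POST, for checking
# that the Python API on the workstation is reachable from the APM servers.
# Payload field names (list_idx, mp3_url) are assumptions.
import json
import urllib.request

def notify_ai_server(list_idx: int, mp3_url: str,
                     endpoint: str = "http://192.168.1.138:8001/process_url") -> int:
    body = json.dumps({"list_idx": list_idx, "mp3_url": mp3_url}).encode("utf-8")
    req = urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Short timeout: if the workstation is down, fail fast instead of
    # stalling whatever called us.
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status
```

If this returns 200 from 226/227, the equivalent curl call from the C binary will work too; swapping the default `endpoint` is the only change needed for a production host.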

What's Next

In Part 2, I'll walk through the actual implementation: how the PHP and C binary hand off the MP3 URL, how the Python FastAPI handles async job processing without blocking the CMS, and how the multi-stage AI pipeline (STT → LLM correction → paragraph structuring → vector embedding) was wired together.

The stack:

  • STT: faster-whisper (Whisper large-v3)
  • LLM pre-processing: gemma4:e4b via Ollama (transcription error correction)
  • LLM structuring: llama3.1:8b via Ollama (paragraph segmentation)
  • Vector DB: Pinecone with multilingual-e5-large embeddings
  • QA: exaone3.5:7.8b via Ollama (Korean language RAG answers)
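Part 2 covers the real implementation, but the wiring itself is just a linear pass over stage functions. A stub sketch of that shape — stage bodies here are placeholders, not the actual model calls, and the job fields are my assumptions:

```python
# pipeline.py — the stage order from the stack list above, with stubbed stages.
# Real stages would call faster-whisper, Ollama, and Pinecone respectively.
from dataclasses import dataclass, field

@dataclass
class SermonJob:
    list_idx: int
    mp3_url: str
    transcript: str = ""
    paragraphs: list = field(default_factory=list)

def stt(job: SermonJob) -> SermonJob:
    # faster-whisper (Whisper large-v3) in the real pipeline
    job.transcript = "raw transcript"
    return job

def correct(job: SermonJob) -> SermonJob:
    # LLM pre-processing pass via Ollama
    job.transcript = job.transcript.strip()
    return job

def structure(job: SermonJob) -> SermonJob:
    # llama3.1:8b paragraph segmentation
    job.paragraphs = [p for p in job.transcript.split("\n\n") if p]
    return job

def embed(job: SermonJob) -> SermonJob:
    # Pinecone upsert with multilingual-e5-large vectors
    return job

def run(job: SermonJob) -> SermonJob:
    for stage in (stt, correct, structure, embed):
        job = stage(job)
    return job
```

Keeping each stage as a plain function with the same signature makes it easy to retry or swap a single stage without touching the rest of the chain.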
