matias yoon

Posted on May 26

Building a Terminal AI Agent (v49)

#ai #llm #developers #tutorial


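A Guide to Building a Local Terminal AI Agent for Developers

Developers are increasingly integrating AI into how they write code. Existing tools, however, can suffer from degraded performance, exposure of private data, and slow response times. This hands-on guide shows how to build a fast, secure terminal AI agent that runs locally.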

1. Surveying the CLI AI Agent Ecosystem

Main tools

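# Aider: AI pair programming, similar in purpose to GitHub Copilot
# (the PyPI package is named aider-chat)
pip install aider-chat

# Continue.dev: a terminal version derived from the VS Code extension
# (confirm the package name in the Continue documentation before installing)
npm install -g continue.dev

# OpenCode: an open-source CLI AI agent
# (confirm the repository URL on the project's site before cloning)
git clone https://github.com/opencode/opencode.git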

Aider vs Continue vs Custom

Tool	Strengths	Weaknesses	Performance
Aider	Git integration, code generation	API rate limits	⭐⭐⭐
Continue	VS Code integration	Cloud-based	⭐⭐
Custom	Full control	Requires development time	⭐⭐⭐⭐

2. Setting Up a Local LLM API Endpoint

We set up a local LLM server so that data stays on the machine and responses do not depend on a network round trip.

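# 1. Install Ollama (the simplest option)
curl -fsSL https://ollama.com/install.sh | sh

# 2. Download the models
ollama pull llama3
ollama pull codellama:7b

# 3. Start the API server (runs in the foreground; skip it if Ollama is already running as a service)
ollama serve

# 4. Test the API with curl from another terminal
#    ("stream": false returns one JSON object instead of a stream of chunks)
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Write a factorial function in Python",
  "stream": false
}'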
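
The same request can also be issued from Python. The following is a minimal sketch using only the standard library, assuming the Ollama server above is listening on localhost:11434. The helper names build_generate_request, join_stream_chunks, and generate are illustrative and not part of Ollama.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default address used in the curl test


def build_generate_request(model: str, prompt: str, stream: bool = False) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": stream})
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def join_stream_chunks(lines) -> str:
    """Concatenate the "response" fields of a streamed reply (one JSON object per line)."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank separator lines
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break  # the final chunk is marked with "done": true
    return "".join(parts)


def generate(model: str, prompt: str) -> str:
    """Call the local server and return the generated text (needs a running server)."""
    request = build_generate_request(model, prompt, stream=True)
    with urllib.request.urlopen(request) as response:
        return join_stream_chunks(raw.decode("utf-8") for raw in response)
```

With the server running, generate("llama3", "Write a factorial function in Python") should return the model's full reply as one string.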

3. Building the Python CLI Agent

# smart_agent.py
import json
from typing import Dict, Any

import openai  # third-party: pip install openai

class TerminalAIAgent:
    def __init__(self):
        self.client = openai.OpenAI(
            base_url="http://localhost:11434/v1",
            api_key="ollama"
        )
        self.system_prompt = """
        You are a senior Python developer. Your task is to help with code generation,
        debugging, and analysis. Always respond in JSON format with the following structure:
        {
            "action": "generate|analyze|fix",
            "content": "your response",
            "code_blocks": ["code1", "code2"]
        }
        """

    def get_file_context(self, filepath: str) -> str:
        """Read a file's contents to use as context."""
        try:
            with open(filepath, 'r', encoding='utf-8') as f:
                return f.read()
        except Exception as e:
            return f"Error reading {filepath}: {str(e)}"

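    def call_llm(self, user_prompt: str, context: str = "") -> Dict[str, Any]:
        """Call the LLM and return its reply parsed as JSON."""
        # The system prompt travels in the system message, so it is not repeated here.
        user_message = f"Context:\n{context}\n\nUser: {user_prompt}"

        response = self.client.chat.completions.create(
            model="llama3",
            messages=[
                {"role": "system", "content": self.system_prompt},
                {"role": "user", "content": user_message}
            ],
            temperature=0.3,
            response_format={"type": "json_object"}
        )

        content = response.choices[0].message.content
        try:
            return json.loads(content)
        except (json.JSONDecodeError, TypeError):
            # Local models do not always emit valid JSON; fall back to the raw text.
            return {"action": "generate", "content": content, "code_blocks": []}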

# Example run
if __name__ == "__main__":
    agent = TerminalAIAgent()
    result = agent.call_llm("Write a factorial function in Python")
    print(json.dumps(result, indent=2, ensure_ascii=False))
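
Small local models sometimes wrap their JSON in a Markdown code fence or omit keys, so the reply may not match the structure the system prompt asks for. The sketch below shows one way to normalize a raw reply. The function name parse_agent_reply and the "unknown" fallback action are assumptions made for illustration, not part of the agent above.

```python
import json
import re
from typing import Any, Dict

FENCE = "`" * 3  # a Markdown code fence marker
# Captures the body of a fenced block, with or without a "json" language tag.
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_agent_reply(raw: str) -> Dict[str, Any]:
    """Normalize a model reply into the action/content/code_blocks structure."""
    match = FENCE_RE.search(raw)
    candidate = match.group(1) if match else raw
    try:
        data = json.loads(candidate)
    except json.JSONDecodeError:
        data = None
    if not isinstance(data, dict):
        # Not a JSON object: treat the whole reply as plain text.
        return {"action": "unknown", "content": raw.strip(), "code_blocks": []}
    # Fill in any keys the model left out so callers can rely on them.
    return {
        "action": str(data.get("action", "unknown")),
        "content": str(data.get("content", "")),
        "code_blocks": list(data.get("code_blocks") or []),
    }
```

A caller could pass response.choices[0].message.content through this function instead of calling json.loads directly.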

4. Integrating with tmux

Use a terminal multiplexer to keep the AI agent running in its own session, where it is easy to reach:

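# tmux configuration
cat >> ~/.tmux.conf << 'EOF'
# prefix + C-a and prefix + C-b pass the literal key through to the program in the pane
bind-key C-a send-keys C-a
bind-key C-b send-keys C-b
set -g mouse on
EOF

# Create a detached tmux session and run the AI agent in it
tmux new-session -d -s ai_agent
tmux send-keys -t ai_agent 'python smart_agent.py' Enter

# Attach to the session to see the output
tmux attach -t ai_agent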
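
The same session setup can be driven from Python, which helps when the agent has to start its own helper sessions. This is a minimal sketch: the function names tmux_launch_commands and launch_in_tmux are hypothetical, and the tmux subcommands are the ones used in the listing above.

```python
import shutil
import subprocess
from typing import List


def tmux_launch_commands(session: str, command: str) -> List[List[str]]:
    """Return the tmux argument lists that start a detached session and run a command in it."""
    return [
        ["tmux", "new-session", "-d", "-s", session],
        ["tmux", "send-keys", "-t", session, command, "Enter"],
    ]


def launch_in_tmux(session: str, command: str) -> bool:
    """Run the commands above; return False if tmux is not installed."""
    if shutil.which("tmux") is None:
        return False
    for argv in tmux_launch_commands(session, command):
        # check=True raises CalledProcessError if tmux rejects the command
        subprocess.run(argv, check=True)
    return True
```

Calling launch_in_tmux("ai_agent", "python smart_agent.py") mirrors the two shell commands shown earlier.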

5. Developing Custom Tools

# tools.py - a collection of custom tools
import subprocess
from typing import List, Optional


class CodeTools:
    @staticmethod
    def search_code(pattern: str, directory: str = ".") -> List[str]:
        """List the Python files under directory that contain pattern."""
        try:
            # -e keeps a pattern that starts with "-" from being read as a grep option
            result = subprocess.run([
                'find', directory, '-name', '*.py', '-exec', 'grep', '-l', '-e', pattern, '{}', '+'
            ], capture_output=True, text=True)
            return result.stdout.strip().split('\n') if result.stdout.strip() else []
        except Exception as e:
            return [f"Error: {str(e)}"]

    @staticmethod
    def get_git_status() -> str:
        """Return the output of git status --porcelain."""
        try:
            result = subprocess.run(['git', 'status', '--porcelain'],
                                    capture_output=True, text=True)
            return result.stdout
        except Exception as e:
            return f"Git error: {str(e)}"

    @staticmethod
    def file_operations(operation: str, source: str, target: Optional[str] = None) -> str:
        """Copy, move, or create a file."""
        try:
            if operation == "copy":
                # check=True raises if cp fails, so a failure is never reported as success
                subprocess.run(['cp', source, target], check=True)
                return f"Copied {source} to {target}"
            elif operation == "move":
                subprocess.run(['mv', source, target], check=True)
                return f"Moved {source} to {target}"
            elif operation == "create":
                # Create an empty file at target, or at source when no target is given
                path = target or source
                with open(path, 'w') as f:
                    f.write("")
                return f"Created {path}"
            return f"Unknown operation: {operation}"
        except Exception as e:
            return f"Error: {str(e)}"


# Example usage
if __name__ == "__main__":
    tools = CodeTools()
    print(tools.search_code("def main"))
    print(tools.get_git_status())
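
The search tool above shells out to find and grep, which are not available on every platform. As a portable alternative, the sketch below does the same search with pathlib and re. The name search_code_py is hypothetical, and the pattern is interpreted as a Python regular expression, which differs slightly from grep's default syntax.

```python
import re
from pathlib import Path
from typing import List


def search_code_py(pattern: str, directory: str = ".") -> List[str]:
    """Return sorted paths of .py files under directory whose text matches the regex pattern."""
    regex = re.compile(pattern)
    matches = []
    for path in Path(directory).rglob("*.py"):
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable entry (or a directory named *.py): skip it
        if regex.search(text):
            matches.append(str(path))
    return sorted(matches)
```

Reading every file into memory is acceptable for small projects; a large repository would likely want a line-by-line scan instead.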

6. Managing the Context Window

To avoid performance problems in large codebases, limit how much text goes into the context window:

# context_manager.py
class ContextWindowManager:
    # Rough heuristic, not an exact count: about 4 characters per token.
    CHARS_PER_TOKEN = 4

    def __init__(self, max_tokens: int = 4000):
        self.max_tokens = max_tokens
        self.context = []

    def add_file_context(self, filepath: str, max_lines: int = 50) -> str:
        """Add a file to the context, keeping at most max_lines lines."""
        try:
            with open(filepath, 'r') as f:
                lines = f.readlines()
        except Exception as e:
            return f"Error: {str(e)}"

        # Cap the number of lines and note how many were dropped
        if len(lines) > max_lines:
            lines = lines[:max_lines] + [f"... ({len(lines)-max_lines} more lines)\n"]

        content = ''.join(lines)
        self.context.append(f"File: {filepath}\n{content}")
        self.trim_context()
        return content

    def get_context_str(self) -> str:
        """Build the full context string."""
        return '\n\n'.join(self.context)

    def estimated_tokens(self) -> int:
        """Estimate the token count of the stored context from its character count."""
        return sum(len(ctx) for ctx in self.context) // self.CHARS_PER_TOKEN

    def trim_context(self) -> None:
        """Drop the oldest entries until the estimate fits within max_tokens."""
        while self.estimated_tokens() > self.max_tokens and self.context:
            self.context.pop(0)

# Example usage
context_manager = ContextWindowManager(max_tokens=3000)
context_manager.add_file_context("main.py")
context_manager.add_file_context("utils.py")
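
Keeping only the first lines of a file drops whatever sits at the end, which is often where the entry point lives. One alternative is to keep both ends and elide the middle. The sketch below illustrates that idea; the function truncate_middle and its parameters are hypothetical and not part of the class above.

```python
from typing import List


def truncate_middle(lines: List[str], max_lines: int, head: int) -> List[str]:
    """Keep the first head lines and the last (max_lines - head) lines, eliding the middle."""
    if not 0 <= head <= max_lines:
        raise ValueError("head must be between 0 and max_lines")
    if len(lines) <= max_lines:
        return list(lines)  # short enough: nothing to cut
    tail = max_lines - head
    omitted = len(lines) - max_lines
    marker = f"... ({omitted} lines omitted)\n"
    tail_lines = lines[-tail:] if tail > 0 else []
    return lines[:head] + [marker] + tail_lines
```

For a 10-line file, truncate_middle(lines, 4, 3) keeps lines 1 to 3, a marker reporting 6 omitted lines, and the final line.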

7. Cost and Performance Optimization

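# 1. Model selection: prefer small models
ollama pull llama3:8b
ollama pull codellama:7b

# 2. Inference tuning (set these in the environment of the process that runs ollama serve)
export OLLAMA_NUM_PARALLEL=1
export OLLAMA_MAX_LOADED_MODELS=1

# 3. Local caching to reduce cost
mkdir -p ~/.cache/llm_cache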
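
The cache directory above is only useful once something writes to it. The sketch below is one possible way to do that, keyed by a hash of the model name and prompt. The names cache_key and cached_generate, and the JSON file layout, are assumptions made for illustration. The generate argument stands for any function that takes (model, prompt) and returns text.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable

CACHE_DIR = Path.home() / ".cache" / "llm_cache"  # directory created in step 3


def cache_key(model: str, prompt: str) -> str:
    """Derive a stable file name from the model name and prompt."""
    digest = hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()
    return f"{digest}.json"


def cached_generate(model: str, prompt: str, generate: Callable[[str, str], str],
                    cache_dir: Path = CACHE_DIR) -> str:
    """Return a cached reply if one exists; otherwise call generate and store the result."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / cache_key(model, prompt)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))["response"]
    response = generate(model, prompt)
    record = {"model": model, "prompt": prompt, "response": response}
    path.write_text(json.dumps(record), encoding="utf-8")
    return response
```

Caching like this returns the first answer for every repeat of a prompt, so it suits deterministic or low-temperature requests better than ones where varied output is wanted.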


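# performance_optimizer.py
import time
import psutil  # third-party: pip install psutil
from functools import wraps

def monitor_performance(func):
    """Decorator that reports execution time and memory change."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start_time = time.time()
        start_memory = psutil.Process().memory_info().rss / 1024 / 1024  # MiB

        result = func(*args, **kwargs)

        end_time = time.time()
        end_memory = psutil.Process().memory_info().rss / 1024 / 1024  # MiB

        # The listing is cut off after "Execution time:"; the lines below are an assumed completion.
        print(f"Execution time: {end_time - start_time:.2f}s")
        print(f"Memory change: {end_memory - start_memory:.2f} MiB")
        return result
    return wrapper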

---

📥 **Get the full guide on Gumroad**: https://gumroad.com/l/auto ($5)
