Pete Letkeman

Building an AI-Powered Resume Generator: Architecture & Implementation

Overview

I've been working on a full-stack application that leverages LLMs to generate polished, professional resume content. This post is a technical walkthrough of the architecture, integration points, and key implementation details.

Tech Stack:

  • Backend: Java 21, Spring Boot 3.x, Gradle
  • Frontend: React 19, TypeScript, Vite
  • LLM Integration: OpenAI API / Ollama (OpenAI-compatible endpoint)
  • Data Format: JSON-driven resume model
  • Build Tooling: Gradle (backend), Node (frontend)

Repository: https://github.com/pbaletkeman/java-resumes


Architecture Overview

┌─────────────┐
│   React UI  │
│ (TypeScript)│
└──────┬──────┘
       │ HTTP/REST
       ↓
┌─────────────────────────────────┐
│   Spring Boot REST API          │
│  (Java 21, Gradle 8.10)         │
│                                 │
│  ├─ ResumeController            │
│  ├─ FilesStorageService         │
│  └─ ApiService (LLM gateway)    │
└──────────┬──────────────────────┘
           │
      ┌────┴─────┐
      ↓          ↓
  ┌────────┐  ┌────────────┐
  │ Ollama │  │ OpenAI API │
  │(local) │  │  (cloud)   │
  └────────┘  └────────────┘

Key Components & Design Decisions

1. REST API Layer (Spring Boot)

The backend exposes endpoints for:

  • File uploads (multipart/form-data)
  • Resume optimization (async background processing)
  • File retrieval (results polling)
  • File management (list, download, delete)

Key Endpoint Pattern:

@PostMapping(path = "/api/upload")
public ResponseEntity<ResponseMessage> optimizeResume(
    @RequestParam("optimize") String optimizeJson,
    @RequestParam("resume") MultipartFile resume,
    @RequestParam("job") MultipartFile job) {

    // Validate inputs
    if (resume.isEmpty() || job.isEmpty()) {
        return ResponseEntity.status(HttpStatus.BAD_REQUEST)
            .body(new ResponseMessage("No file/invalid file provided"));
    }

    // Deserialize the optimization options sent alongside the files
    Optimize optimize = new Gson().fromJson(optimizeJson, Optimize.class);

    // Spawn background thread for LLM processing (non-blocking);
    // `root` is the configured upload directory
    Thread thread = new Thread(new BackgroundResume(optimize, root));
    thread.start();

    // Return 202 Accepted immediately
    return ResponseEntity.status(HttpStatus.ACCEPTED)
        .body(new ResponseMessage("generating"));
}

Why this pattern?

  • LLM API calls are slow (2-30+ seconds)
  • HTTP connections can time out if we block while waiting on the LLM
  • 202 Accepted signals async processing to the client
  • The frontend polls /api/files until results appear

2. Async Background Processing

The BackgroundResume class handles long-running operations:

public class BackgroundResume implements Runnable {
    private static final Logger LOGGER =
        LoggerFactory.getLogger(BackgroundResume.class);

    private final Optimize optimize;
    private final String root; // upload directory, used when saving results

    public BackgroundResume(Optimize optimize, String root) {
        this.optimize = optimize;
        this.root = root;
    }

    @Override
    public void run() {
        try {
            // 1. Load LLM configuration
            String configStr = Utility.readFileAsString("config.json");
            Config config = new Gson().fromJson(configStr, Config.class);

            // 2. Call LLM (OpenAI-compatible API); the request body
            //    is assembled inside ApiService
            LLMResponse response = ApiService.produceFiles(
                optimize,
                config.getEndpoint(),
                config.getApikey(),
                config.getModel()
            );

            // 3. Save results (Markdown + PDF)
            FilesStorageService.save(response.getContent());

            LOGGER.info("Resume optimization completed");
        } catch (Exception e) {
            LOGGER.error("Background task failed: {}", e.getMessage());
        }
    }
}

Why plain background threads instead of a reactive or async framework?

  • Simple, synchronous model
  • No need for reactive framework overhead
  • Easy to reason about error handling
  • Works well for moderate concurrency (a bounded-pool variant is sketched below)
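
One caveat with spawning a raw Thread per request: nothing bounds how many LLM calls run at once. A minimal hardening sketch that keeps the same BackgroundResume runnable (the LlmExecutor wrapper and pool size are my additions, not part of the project):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class LlmExecutor {
    // At most 4 LLM calls in flight; further uploads queue behind them
    private static final ExecutorService POOL = Executors.newFixedThreadPool(4);

    public static void submit(Runnable task) {
        POOL.submit(task);
    }
}

// In the controller, replace new Thread(...).start() with:
// LlmExecutor.submit(new BackgroundResume(optimize, root));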

3. LLM Integration (OpenAI-Compatible API)

The ApiService class abstracts LLM provider differences:

public class ApiService {
    public static LLMResponse produceFiles(
        Optimize optimize,
        String endpoint,
        String apiKey,
        String model) throws Exception {

        // Build OpenAI-compatible request
        ChatBody chatBody = new ChatBody();
        chatBody.setModel(model);
        chatBody.setMessages(createPrompt(optimize));
        chatBody.setTemperature(optimize.getTemperature());

        // Send to LLM
        HttpClient client = HttpClient.newHttpClient();
        String jsonRequest = new Gson().toJson(chatBody);

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(endpoint + "/v1/chat/completions"))
            .header("Authorization", "Bearer " + apiKey)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(jsonRequest))
            .build();

        HttpResponse<String> response = client.send(
            request,
            HttpResponse.BodyHandlers.ofString()
        );

        // Parse response
        LLMResponse llmResponse = new Gson().fromJson(
            response.body(),
            LLMResponse.class
        );

        return llmResponse;
    }
}
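The response-side mapping isn't shown above: the OpenAI-compatible API nests the generated text under choices[0].message.content. A minimal Gson mapping sketch (everything beyond the getContent() accessor used elsewhere in the project is an assumption about field names):

import java.util.List;

public class LLMResponse {
    // Mirrors the OpenAI chat-completion response shape:
    // { "choices": [ { "message": { "role": "...", "content": "..." } } ] }
    private List<Choice> choices;

    public String getContent() {
        return choices.get(0).message.content;
    }

    static class Choice {
        Message message;
    }

    static class Message {
        String role;
        String content;
    }
}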

Why OpenAI-compatible format?

  • Works with Ollama (local models)
  • Works with OpenAI (cloud models)
  • Works with Azure OpenAI, Together.ai, etc.
  • Single integration code path
  • Easy to swap providers

Configuration (config.json):

{
  "endpoint": "http://localhost:11434",
  "apikey": "ollama",
  "model": "mistral:7b"
}
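On the Java side this maps onto a small POJO deserialized with Gson; a sketch matching the getters BackgroundResume calls (field names mirror the JSON keys):

public class Config {
    private String endpoint; // e.g. http://localhost:11434 for Ollama
    private String apikey;   // Ollama accepts any dummy key locally
    private String model;    // e.g. mistral:7b

    public String getEndpoint() { return endpoint; }
    public String getApikey()   { return apikey; }
    public String getModel()    { return model; }
}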

Frontend Architecture (React + TypeScript)

Core Hook: useApi

Centralized API communication:

export function useApi() {
  const [loading, setLoading] = useState(false);
  const [error, setError] = useState<string | null>(null);

  const execute = async (fn: () => Promise<any>) => {
    setLoading(true);
    setError(null);
    try {
      await fn();
    } catch (err) {
      setError(err instanceof Error ? err.message : "Unknown error");
      throw err;
    } finally {
      setLoading(false);
    }
  };

  return { execute, loading, error };
}

File Upload & Polling Pattern

function MainContentTab() {
  const { execute, loading } = useApi();
  const [generatedFiles, setGeneratedFiles] = useState<File[]>([]);

  const handleSubmit = async (formData: FormData) => {
    await execute(async () => {
      // 1. Upload resume + job description
      await fileService.uploadForOptimization(formData);

      // 2. Start polling for results
      let attempts = 0;
      while (attempts < 60) { // 5 minutes max
        await new Promise(r => setTimeout(r, 5000)); // Poll every 5s

        const files = await fileService.listFiles();
        const newFiles = files.filter(f =>
          f.name.endsWith('.pdf') &&
          f.timestamp > Number(formData.get('uploadTime')) // uploadTime appended before upload
        );

        if (newFiles.length > 0) {
          setGeneratedFiles(newFiles);
          break;
        }
        attempts++;
      }
    });
  };

  // UI for upload and result display omitted for brevity
  return null;
}

Why polling instead of WebSockets?

  • Simpler client/server contract
  • Works through corporate proxies/firewalls
  • No need for persistent connection
  • Acceptable for batch processing workflows (the backend half of this contract is sketched below)
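
For completeness, the backend half of this polling contract is just a listing endpoint. A minimal sketch (the handler body and the FileInfo record are assumptions, based on the name and timestamp fields the frontend filter reads):

@GetMapping("/api/files")
public ResponseEntity<List<FileInfo>> listFiles() throws IOException {
    // Expose name + last-modified timestamp, which is all the frontend needs
    try (Stream<Path> paths = Files.list(Paths.get("uploads"))) {
        List<FileInfo> files = paths
            .map(p -> new FileInfo(
                p.getFileName().toString(),
                p.toFile().lastModified()))
            .toList();
        return ResponseEntity.ok(files);
    }
}

public record FileInfo(String name, long timestamp) {}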

Data Model: Optimize DTO

public class Optimize {
    private String[] promptType;      // ["Resume", "CoverLetter", "Skills"]
    private double temperature;        // 0.0-1.0 (creativity level)
    private String model;              // Model identifier from config
    private String company;            // Target company name
    private String jobTitle;           // Target job title
    private String jobDescription;     // Full job posting text
    private String resume;             // User's current resume

    // Getters/setters...
}

This DTO drives:

  1. Prompt construction - What content to generate (a sketch of createPrompt follows this list)
  2. LLM parameters - Temperature, model selection
  3. Output filtering - Which sections to include
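
The prompt-construction step (createPrompt in ApiService) isn't shown above. A plausible sketch of how the DTO fields could be folded into OpenAI-style messages (the wording and the Message record are assumptions, not the project's actual prompt):

public static List<Message> createPrompt(Optimize optimize) {
    List<Message> messages = new ArrayList<>();

    // System message frames the task and the requested outputs
    messages.add(new Message("system",
        "You are a professional resume writer. Produce: "
        + String.join(", ", optimize.getPromptType())));

    // User message carries the actual source material
    messages.add(new Message("user",
        "Target role: " + optimize.getJobTitle()
        + " at " + optimize.getCompany() + "\n\n"
        + "Job description:\n" + optimize.getJobDescription() + "\n\n"
        + "Current resume:\n" + optimize.getResume()));

    return messages;
}

// Assumed minimal message shape matching the OpenAI chat format
public record Message(String role, String content) {}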

Key Implementation Challenges & Solutions

Challenge 1: LLM Response Time

Problem: API calls can take 10-30+ seconds
Solution:

  • Return 202 Accepted immediately
  • Process async in background thread
  • Frontend polls for completion

Challenge 2: File Format Conversion

Problem: The LLM returns Markdown/plain text; users need a formatted PDF
Solution:

  • Convert Markdown → HTML (CommonMark parser)
  • Convert HTML → PDF (Flying Saucer library)
  • Save both Markdown + PDF for flexibility (conversion sketch below)
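
Since the post names both libraries, here is a minimal sketch of that pipeline (the wrapper markup and file handling are mine; Flying Saucer's ITextRenderer requires well-formed XHTML, hence the wrapping):

import org.commonmark.node.Node;
import org.commonmark.parser.Parser;
import org.commonmark.renderer.html.HtmlRenderer;
import org.xhtmlrenderer.pdf.ITextRenderer;

import java.io.FileOutputStream;
import java.io.OutputStream;

public class PdfConverter {
    public static void markdownToPdf(String markdown, String pdfPath) throws Exception {
        // 1. Markdown -> HTML via CommonMark
        Node document = Parser.builder().build().parse(markdown);
        String html = HtmlRenderer.builder().build().render(document);

        // 2. Wrap the fragment so the XML parser accepts it
        String xhtml = "<html><head><style>body{font-family:sans-serif;}</style></head>"
            + "<body>" + html + "</body></html>";

        // 3. HTML -> PDF via Flying Saucer
        try (OutputStream os = new FileOutputStream(pdfPath)) {
            ITextRenderer renderer = new ITextRenderer();
            renderer.setDocumentFromString(xhtml);
            renderer.layout();
            renderer.createPDF(os);
        }
    }
}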

Challenge 3: Local vs Cloud LLM

Problem: Different APIs for Ollama vs OpenAI
Solution:

  • Use OpenAI-compatible format (both support it)
  • Config-driven endpoint selection
  • Single integration point

Challenge 4: Test Isolation

Problem: Tests failing due to state dependencies (file existence)
Solution:

@BeforeEach
void setUp() throws IOException {
    Path uploadsPath = Paths.get("uploads");
    Files.createDirectories(uploadsPath);
    // Create dummy files for delete tests, etc.
    Files.write(uploadsPath.resolve("resume.pdf"),
        "dummy".getBytes());
}

Deployment Considerations

Local Development

# Terminal 1: Start Ollama
ollama serve
ollama pull mistral:7b   # run this in a separate shell once the server is up

# Terminal 2: Run backend
./gradlew bootRun  # Listens on :8080

# Terminal 3: Run frontend
cd frontend && npm run dev  # Listens on :5173

Cloud Deployment

# application.properties
server.port=8080
upload.path=/data/uploads
# Spring will detect OpenAI config from environment

Testing Strategy

80%+ Coverage Target:

  1. Controller Tests - HTTP layer with MockMvc
  2. Service Tests - Business logic, mocked LLM
  3. Integration Tests - Full request flow
  4. Model Tests - DTO serialization/validation

./gradlew test                    # Run all tests
./gradlew test --tests ClassName  # Run specific test
./gradlew checkstyleMain          # Code quality (100% compliance)
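
As an example of the controller-test layer, a minimal MockMvc sketch for the upload endpoint (the test class, multipart part names, and the placeholder JSON are mine, mirroring the controller signature above):

// static imports from MockMvcRequestBuilders / MockMvcResultMatchers assumed
@SpringBootTest
@AutoConfigureMockMvc
class ResumeControllerTest {

    @Autowired
    private MockMvc mockMvc;

    @Test
    void uploadReturnsAccepted() throws Exception {
        MockMultipartFile resume = new MockMultipartFile(
            "resume", "resume.pdf", "application/pdf", "dummy".getBytes());
        MockMultipartFile job = new MockMultipartFile(
            "job", "job.txt", "text/plain", "dummy".getBytes());

        mockMvc.perform(multipart("/api/upload")
                .file(resume)
                .file(job)
                .param("optimize", "{}"))
            .andExpect(status().isAccepted());
    }
}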

Performance & Scalability Notes

  1. Horizontal Scaling: Add more backend instances behind load balancer
  2. Rate Limiting: Implement per-user quotas for LLM API costs
  3. Caching: Cache LLM responses for identical inputs (a sketch follows this list)
  4. Async Queue: For high volume, use message queue (RabbitMQ, Kafka)
  5. File Storage: Consider cloud storage (S3, Azure Blob) vs local filesystem
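
As a sketch of item 3, responses can be memoized on a hash of everything that affects the output (the LlmCache class, key format, and unbounded map are my simplifications; production code would add eviction):

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HexFormat;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class LlmCache {
    private static final Map<String, String> CACHE = new ConcurrentHashMap<>();

    // Key on model, temperature, and full prompt text
    public static String key(String model, double temperature, String prompt) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] digest = md.digest(
            (model + "|" + temperature + "|" + prompt).getBytes(StandardCharsets.UTF_8));
        return HexFormat.of().formatHex(digest);
    }

    public static String getOrCompute(String key, Supplier<String> llmCall) {
        return CACHE.computeIfAbsent(key, k -> llmCall.get());
    }
}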

⚠️ Important Considerations

LLM Hallucination Risk

Critical: LLMs can generate plausible-sounding but inaccurate content. This includes:

  • Fabricated job experiences
  • Incorrect technical skills
  • Made-up company names or achievements
  • Dates and timelines that don't align with reality

Mitigation:

  • Always proofread generated content before using it
  • Cross-check facts against source documents
  • Verify every claim in the resume
  • Treat this as an enhancement tool, not a replacement for human review
  • Use it to refine and polish verified information, not to generate unverified content

Processing Time

Important: File generation is NOT instant:

  • Local models (Ollama): 30 seconds to 5+ minutes depending on model size (7B models are faster, 13B+ models take longer)
  • Cloud models (OpenAI): 5-30 seconds typically, but can vary with load
  • Large job descriptions: Processing time increases with input size
  • Network latency: Slower connections add to total time

Frontend Polling:

// Default: polls every 5 seconds for up to 5 minutes (60 attempts)
// For longer processing, increase attempts or polling interval
let attempts = 0;
while (attempts < 60) {
  // Adjust this for longer waits
  await new Promise((r) => setTimeout(r, 5000)); // 5 seconds
  // ... check for files
  attempts++;
}

User Experience:

  • Display a progress indicator during processing
  • Show estimated wait time based on model selection
  • Allow users to check back later via job ID
  • Consider implementing email notifications when complete

Code Quality Standards

  • Checkstyle: 100% compliance (120 char line limit)
  • Test Coverage: 80%+ target
  • Java Version: Java 21 LTS with modern features
  • Spring Boot: Version 3.5.1 with latest practices

What's Next?

Potential improvements:

  • [ ] WebSocket support for real-time updates
  • [ ] Template system for different resume formats
  • [ ] Batch processing for multiple candidates
  • [ ] Integration with LinkedIn/job boards
  • [ ] A/B testing for LLM prompt optimization
  • [ ] Cost analytics for OpenAI usage

Lessons Learned

  1. Async by Default - HTTP endpoints should never block on slow operations
  2. Embrace Standards - OpenAI-compatible API is a superpower
  3. Simple Patterns > Complex Frameworks - Background threads work great for this use case
  4. Test Independence - Always set up required state in @BeforeEach
  5. Config Over Code - Keep LLM provider flexible via configuration

Get Started

Repository: https://github.com/pbaletkeman/java-resumes

Quick Start:

git clone https://github.com/pbaletkeman/java-resumes
cd java-resumes
./gradlew clean build
./gradlew bootRun
# Visit http://localhost:8080/spotlight/index.html

Credits

Special thanks to Shaw Talebi for his excellent tutorial on building resume optimization tools, which served as the inspiration and starter foundation for this project.


Have you built LLM integrations in Java? What patterns did you use? Drop a comment!

Discussion Topics:

  • Async patterns for LLM integrations
  • Local vs cloud LLM trade-offs
  • Resume optimization strategies
  • Full-stack Java + React workflows
