blaycoder

Posted on Mar 16

ScamDetect: Building a Multilingual AI-Powered Phishing Detection Platform

#ai #cybersecurity #nextjs #security

Introduction

Every day, millions of people receive phishing messages, and many fall victim to them. Scammers don't just target people in English-speaking countries; they attack globally, using local languages to make their deceptive messages seem more trustworthy.

This is where ScamDetect comes in.

I built ScamDetect to solve this problem: a free, multilingual, AI-powered platform that detects phishing attempts, scams, and malicious URLs. Whether you speak English, Spanish, French, Arabic, or Chinese, ScamDetect can help you identify and avoid online scams before they harm you.

Why I Built This Project

The motivation came from two key realizations:

1. Phishing is a Global Problem

According to recent statistics, over 3.4 billion phishing emails are sent every day. What's worse, non-English speakers are often underserved by existing security tools. A scammer can win the trust of someone in Indonesia by writing in Indonesian, while government and commercial security tools often focus on English-language threats.

2. A friend of mine was scammed 😪😥

A friend of mine was duped through a phishing attack, and it really made me think about how easily these things can happen. Many times people don’t recognize phishing messages until it’s already too late.

I’ve always had a strong interest in cybersecurity and phishing awareness, and whenever I get the opportunity, I try to educate people about how these scams work.

When the lingo.dev hackathon came up, I saw it as an opportunity to build something that could actually help people beyond just awareness.

Key Features of ScamDetect

1. Multilingual Text Message Analysis

Users can paste a suspicious text message (SMS, WhatsApp, Telegram, etc.) and ScamDetect analyzes it for phishing indicators. The detection is designed to be language-independent, so a message in Spanish, Arabic, or Portuguese goes through the same analysis pipeline as one in English.

2. URL Phishing Scanning

ScamDetect checks suspicious URLs against two powerful threat intelligence databases:

VirusTotal API — checks against 80+ antivirus engines
PhishTank — checks against known phishing databases

This multi-source approach helps reduce false positives while catching more real threats.

3. Screenshot OCR Detection

Many scams come from screenshots of fake banking apps, fake payment screens, or fabricated messages. ScamDetect can:

Extract text from screenshots using Google Vision OCR (I still have trouble enabling billing after several attempts. I also could not log into Microsoft Azure to use its OCR tool, so I stayed with Google Vision OCR, where billing is the only remaining issue.)
Analyze the extracted text for phishing indicators
All without needing the user to manually type anything

4. AI-Powered Scam Classification

Using Ollama with the gpt-oss:120b-cloud model, ScamDetect runs an advanced AI classifier to determine if a message is genuinely a phishing attempt or a legitimate message. This goes beyond simple keyword matching—the AI understands context and language patterns.

5. Results Translated into Your Language

All analysis results are automatically translated into the user's preferred language using the Lingo.dev translation API.

6. User Dashboard

Users can:

View their scan history
Track patterns in scams they've encountered

Architecture Overview

ScamDetect follows a modern full-stack architecture:

┌─────────────────────────────────────┐
│     Frontend (Next.js + React)      │
│   • User Interface                  │
│   • Language Selection              │
│   • Input Handling                  │
└────────────┬────────────────────────┘
             │
             │ HTTP/HTTPS
             │
┌────────────▼────────────────────────┐
│     Backend (Node.js + Express)     │
│   • API Endpoints                   │
│   • Detection Pipeline              │
│   • Service Orchestration           │
└────────────┬────────────────────────┘
             │
    ┌────────┼───────────────┬────────────────┐
    │        │               │                │
    ▼        ▼               ▼                ▼
┌────────┐┌────────┐┌──────────────────┐┌────────────┐
│Database││  OCR   ││AI Model          ││APIs (VT,   │
│        ││        ││                  ││PhishTank)  │
│Supabase││ Google ││Ollama            ││Translation │
│        ││ Vision ││gpt-oss:120b-cloud││Lingo.dev   │
└────────┘└────────┘└──────────────────┘└────────────┘

Data Flow:

User Input → Frontend captures text, URL, or screenshot image
Frontend sends request → Backend API with language preference
Backend processes:
- Extracts keywords and URLs
- Checks domain similarity using the Levenshtein distance method
- Submits URLs to VirusTotal/PhishTank
- Runs AI classification if text is detected
Results are compiled → Risk score calculated
Translation layer → Results translated to user's language
Response sent back → Frontend displays rich visualization
Data persisted → Results stored in Supabase for user's dashboard
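
The domain-similarity step in the flow above can be sketched as follows. This is a minimal illustration, not ScamDetect's actual implementation: the function names, the KNOWN_DOMAINS list, and the distance threshold of 2 are all assumptions made for the example.

```typescript
// Classic dynamic-programming Levenshtein distance (insert, delete, substitute).
function levenshtein(a: string, b: string): number {
  // prev[j] = distance between the first i-1 chars of `a` and the first j chars of `b`
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical list of brands to protect; the real project may use a different one.
const KNOWN_DOMAINS = ["paypal.com", "google.com", "microsoft.com"];

// Returns the known domain that `domain` closely imitates, or null.
// maxDistance = 2 is an illustrative threshold, not a tuned value.
function findLookalike(domain: string, maxDistance = 2): string | null {
  const candidate = domain.toLowerCase();
  for (const known of KNOWN_DOMAINS) {
    const distance = levenshtein(candidate, known);
    // Distance 0 means it IS the real domain, so only 1..maxDistance is suspicious
    if (distance > 0 && distance <= maxDistance) return known;
  }
  return null;
}
```

A lookalike such as paypa1.com is one substitution away from paypal.com, so it would be flagged, while the genuine domain is not.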

Important Code Examples

Let's look at real code from ScamDetect and understand how it works.

1. Sending Text to Backend for Analysis

Frontend Code - how users send text messages for analysis:

// frontend/src/lib/api.ts

export const api = {
  /** POST /api/analyze-message */
  analyzeMessage: async (
    message: string,
    language = "en",
  ): Promise<DetectionResult> =>
    fetch(`${API_URL}/api/analyze-message`, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        ...(await authHeaders()), // Include user auth token
      },
      body: JSON.stringify({ message, language }),
    }).then(handleResponse<DetectionResult>),
};

What this does:

Takes a suspicious message and target language
Sends it to the backend /api/analyze-message endpoint
Includes authentication headers (if user is logged in)
Returns structured analysis results

Why it's important:

Simple, type-safe API using TypeScript
Supports authentication without exposing tokens
Error handling built-in via handleResponse

Backend Code - how the backend processes the message:

// backend/src/controllers/analyze.controller.ts

export async function analyzeMessage(
  req: Request,
  res: Response,
): Promise<void> {
  const { message, language } = req.body;
  try {
    // Run the full detection pipeline
    const result = await runDetectionPipeline(message, language ?? "en");

    // Save result to database asynchronously
    saveDetectionResult(message, result, req.userId).catch(() => {});

    // Return results immediately
    res.json(result);
  } catch (err) {
    res.status(500).json({ error: "Detection pipeline failed" });
  }
}

What this does:

Extracts message and language from the request body (this excerpt does not validate them)
Runs the detection pipeline (keyword matching, URL extraction, AI classification)
Saves results to database for user history
Returns response immediately (doesn't wait for database save)

Why it's important:

Fast response time—doesn't block on database operations
Persistent history for user analysis
Graceful error handling
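
The controller excerpt above reads req.body directly, so a request with a missing or non-string message would reach the pipeline. Below is a rough sketch of a validation step that could run first. The function name, the error messages, and the 5000-character limit are assumptions for illustration and are not taken from the ScamDetect codebase.

```typescript
type ValidationResult =
  | { ok: true; message: string; language: string }
  | { ok: false; error: string };

const MAX_MESSAGE_LENGTH = 5000; // illustrative limit, not taken from the project

// Checks an untrusted request body before it reaches the detection pipeline.
function validateAnalyzeBody(body: unknown): ValidationResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "Request body must be a JSON object" };
  }
  const { message, language } = body as Record<string, unknown>;

  if (typeof message !== "string" || message.trim() === "") {
    return { ok: false, error: "`message` must be a non-empty string" };
  }
  if (message.length > MAX_MESSAGE_LENGTH) {
    return { ok: false, error: "`message` is too long" };
  }

  // Accept codes such as "en" or "pt-BR"; fall back to English when omitted
  const lang = language ?? "en";
  if (typeof lang !== "string" || !/^[a-z]{2}(-[A-Z]{2})?$/.test(lang)) {
    return { ok: false, error: "`language` must be a locale code such as \"es\"" };
  }
  return { ok: true, message: message.trim(), language: lang };
}
```

In a controller, a failed check would typically become a 400 response with the error string, and only an ok result would be passed on to runDetectionPipeline.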

2. Running VirusTotal API for URL Scanning

Backend Service — how we check URLs against VirusTotal:

// backend/src/services/virustotalService.ts

export async function submitUrl(url: string): Promise<string | null> {
  const key = apiKey();
  if (!key) return null;

  try {
    const body = new URLSearchParams();
    body.set("url", url);

    const res = await axios.post<{ data: { id: string } }>(
      `${VT_API_URL}/urls`,
      body,
      {
        headers: {
          "x-apikey": key,
          "Content-Type": "application/x-www-form-urlencoded",
        },
      },
    );

    return res.data.data.id;
  } catch (err) {
    return null;
  }
}

/**
 * Get detailed analysis of a VirusTotal scan
 */
export async function getAnalysis(
  analysisId: string,
): Promise<VTDetailedResult | null> {
  const key = apiKey();
  if (!key) return null;

  try {
    const res = await axios.get<VTAnalysisResponse>(
      `${VT_API_URL}/analyses/${analysisId}`,
      {
        headers: { "x-apikey": key },
      },
    );

    const attrs = res.data.data.attributes;
    return {
      vtAnalysisId: analysisId,
      status: attrs.status,
      maliciousCount: attrs.stats.malicious || 0,
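      // Count engines whose verdict is "phishing" (assumes the per-engine `result` string)
      phishingCount: Object.values(attrs.results || {}).filter(
        (r) => r.result === "phishing",
      ).length,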
      harmlessCount: attrs.stats.harmless || 0,
      suspiciousCount: attrs.stats.suspicious || 0,
      undetectedCount: attrs.stats.undetected || 0,
      engines: Object.entries(attrs.results || {}).map(
        ([name, result]) => ({
          name,
          category: result.category,
          result: result.result,
        }),
      ),
    };
  } catch (err) {
    return null;
  }
}

What this does:

Submits a URL to VirusTotal's API for scanning
Fetches the analysis by ID; the caller can request it again until the status is completed
Returns detailed information including malicious engines and detection counts
Gracefully handles API key missing or API failures

Why it's important:

VirusTotal provides crowd-sourced threat intelligence from 80+ antivirus engines
A single engine flagging a URL might be a false positive, but multiple engines agreeing is a strong signal
Decoupling URL checking from our own detection logic keeps our system modular
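
Since getAnalysis above makes a single request, something has to call it repeatedly while VirusTotal is still working. Here is one possible polling helper, written as a sketch: pollUntilCompleted and its options are hypothetical names, and the 15-second default simply spaces requests to fit a limit of 4 requests per minute.

```typescript
interface AnalysisSnapshot {
  status: string; // VirusTotal reports "queued" or "in-progress" before "completed"
}

// `fetchOnce` stands in for a single-request function such as getAnalysis.
async function pollUntilCompleted<T extends AnalysisSnapshot>(
  fetchOnce: () => Promise<T | null>,
  { maxAttempts = 5, delayMs = 15_000 } = {},
): Promise<T | null> {
  let latest: T | null = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    latest = await fetchOnce();
    // Stop on a request failure (null) or as soon as the scan has finished
    if (latest === null || latest.status === "completed") return latest;
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return latest; // still pending: the caller decides how to treat an unfinished scan
}
```

With the service above, a call could look like pollUntilCompleted(() => getAnalysis(id)), where id is the value returned by submitUrl.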

3. Extracting Text from Screenshots Using OCR

Backend Service — how we extract text from screenshot images:

// backend/src/services/ocrService.ts

import { ImageAnnotatorClient } from "@google-cloud/vision";

export interface OCRResult {
  extractedText: string;
  confidence?: number;
}

/** Extract text from an image using Google Cloud Vision API */
export async function extractTextFromImage(
  imageBase64: string,
): Promise<OCRResult> {
  const client = createClient();

  const request = {
    image: { content: imageBase64 },
  };

  const response = await client.documentTextDetection(request);
  const fullTextAnnotation = response[0].fullTextAnnotation;

  if (!fullTextAnnotation) {
    return { extractedText: "", confidence: 0 };
  }

  // Calculate average confidence from every symbol on the first page
  const annotations = fullTextAnnotation.pages?.[0]?.blocks ?? [];
  const confidences = annotations
    .flatMap((b) => b.paragraphs || [])
    .flatMap((p) => p.words || [])
    .flatMap((w) => w.symbols || [])
    .map((s) => s.confidence || 0);

  const avgConfidence =
    confidences.length > 0
      ? confidences.reduce((a, b) => a + b, 0) / confidences.length
      : 0;

  return {
    extractedText: fullTextAnnotation.text || "",
    confidence: avgConfidence,
  };
}

Backend Controller — how the OCR endpoint works:

// backend/src/controllers/screenshot.controller.ts

export async function scanScreenshot(
  req: Request,
  res: Response,
): Promise<void> {
  const { imageBase64, language } = req.body;

  try {
    // Extract text from image
    const { extractedText } = await extractTextFromImage(imageBase64);

    if (!extractedText.trim()) {
      // Send the response, then return nothing so the function still resolves to void
      res.json({
        riskLevel: "SAFE",
        message: "No text found in image",
      });
      return;
    }

    // Analyze the extracted text
    const result = await runDetectionPipeline(extractedText, language ?? "en");

    // Save for history (store the extracted text rather than the raw base64 image)
    saveDetectionResult(extractedText, result, req.userId).catch(() => {});

    res.json(result);
  } catch (err) {
    res.status(500).json({ error: "Screenshot scanning failed" });
  }
}

What this does:

Takes a base64-encoded image from the frontend
Uses Google Cloud Vision to extract text
Calculates confidence score based on per-character confidence values
Runs the normal detection pipeline on extracted text
Returns phishing analysis results

Why it's important:

OCR allows users to share screenshots instead of typing (better UX)
Confidence score helps users understand if text extraction was reliable
Image-based scams (fake banking screens, fake payment apps) are very common—this feature is critical
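
The confidence calculation can also be pulled out into a small pure function, which makes it easy to unit test without calling the Vision API. The sketch below is a variation rather than the project's code: the *Like interfaces are simplified stand-ins for the Vision response types, it walks every page instead of only the first, and it skips symbols with no reported confidence instead of counting them as 0.

```typescript
// Simplified structural types mirroring the nesting used above:
// pages -> blocks -> paragraphs -> words -> symbols
interface SymbolLike { confidence?: number | null }
interface WordLike { symbols?: SymbolLike[] | null }
interface ParagraphLike { words?: WordLike[] | null }
interface BlockLike { paragraphs?: ParagraphLike[] | null }
interface PageLike { blocks?: BlockLike[] | null }

// Averages per-symbol confidence across all pages; returns 0 when nothing is available.
function averageSymbolConfidence(pages: PageLike[] | null | undefined): number {
  const confidences = (pages ?? [])
    .flatMap((page) => page.blocks ?? [])
    .flatMap((block) => block.paragraphs ?? [])
    .flatMap((paragraph) => paragraph.words ?? [])
    .flatMap((word) => word.symbols ?? [])
    // Keep only symbols that actually report a numeric confidence
    .flatMap((symbol) =>
      typeof symbol.confidence === "number" ? [symbol.confidence] : [],
    );

  if (confidences.length === 0) return 0;
  return confidences.reduce((sum, c) => sum + c, 0) / confidences.length;
}
```

Whether missing values should be skipped or treated as 0 is a design choice: skipping avoids dragging the average down for symbols the API simply did not score.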

4. Translating Results with Lingo.dev

Backend Service — translating analysis results:

// backend/src/services/translationService.ts

import { LingoDotDevEngine } from "lingo.dev/sdk";

const engine = new LingoDotDevEngine({
  apiKey: process.env.LINGODOTDEV_API_KEY,
});

export async function translateText(
  text: string,
  targetLanguage: string,
): Promise<string> {
  // Skip translation if no API key or target is English
  if (!process.env.LINGODOTDEV_API_KEY || targetLanguage === "en") {
    return text;
  }

  const result = await engine.localizeText(text, {
    sourceLocale: "en",
    targetLocale: targetLanguage,
  });

  return result ?? text;
}

Detection Pipeline — integrating translation into results:

export async function runDetectionPipeline(
  text: string,
  language = "en",
): Promise<DetectionResult> {
  // ... analysis code ...

  const riskLevel = calculateRiskLevel(totalScore);
  const recommendation = generateRecommendation(riskLevel);

  // Translate recommendation to user's language (default to the English text)
  let translatedRecommendation = recommendation;
  if (language !== "en") {
    translatedRecommendation = await translateText(recommendation, language);
  }

  return {
    riskLevel,
    score: totalScore,
    flags,
    recommendation: translatedRecommendation,
    extractedUrls,
    language,
  };
}

What this does:

Uses Lingo.dev SDK to translate text between languages
Skips translation if no API key (graceful degradation)
Integrates into the detection pipeline to translate analysis recommendations

Why it's important:

Makes the tool accessible globally
Users aren't forced to read English analysis results
Lingo.dev handles nuanced localization (not just word translation)
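
One gap worth noting: translateText above skips translation when the API key is missing, but a network or API error would still propagate and fail the whole request. A small wrapper can extend the same fallback idea to runtime errors. This is only a sketch, and withEnglishFallback and the Translator type are names invented for the example.

```typescript
type Translator = (text: string, targetLanguage: string) => Promise<string>;

// Wraps any translator so that a thrown error or an empty result
// falls back to the original (English) text.
function withEnglishFallback(translate: Translator): Translator {
  return async (text, targetLanguage) => {
    if (targetLanguage === "en") return text; // nothing to translate
    try {
      const translated = await translate(text, targetLanguage);
      return translated.trim() ? translated : text;
    } catch (err) {
      console.warn("[translation] falling back to English:", err);
      return text;
    }
  };
}
```

Usage would be along the lines of const safeTranslate = withEnglishFallback(translateText), with the pipeline calling safeTranslate instead. The user then gets an English recommendation rather than an error when the translation service is unavailable.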

5. Frontend Component for User Interaction

Frontend Page — scanning URLs with real-time feedback:

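// frontend/src/app/scan-url/page.tsx
"use client"; // hooks require a Client Component in the Next.js App Router

import { useState } from "react";
// Project-local imports (api, useLanguage, ResultPanel, DetectionResult) are omitted in this excerpt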

export default function ScanUrlPage() {
  const [url, setUrl] = useState("");
  const [result, setResult] = useState<DetectionResult | null>(null);
  const [loading, setLoading] = useState(false);
  const [error, setError] = useState<string | null>(null);
  const { language, setLanguage } = useLanguage();

  async function handleScan() {
    if (!url.trim()) return;
    setLoading(true);
    setError(null);
    setResult(null);

    try {
      // Call backend API
      const data = await api.checkUrl(url, language);
      setResult(data);
    } catch (err) {
      setError(err instanceof Error ? err.message : "Scan failed");
    } finally {
      setLoading(false);
    }
  }

  return (
    <div className="min-h-screen px-4 py-16">
      <div className="mx-auto max-w-2xl">
        {/* Header */}
        <h1 className="font-mono text-3xl font-bold text-[#e2e8ff]">
          Scan <span className="text-[#ff00ff]">URL</span>
        </h1>

        {/* Input Panel */}
        <input
          type="url"
          value={url}
          onChange={(e) => setUrl(e.target.value)}
          onKeyDown={(e) => e.key === "Enter" && handleScan()}
          placeholder="https://suspicious-site.com"
          className="w-full rounded border bg-[rgba(255,0,255,0.03)] py-3 px-4 text-[#e2e8ff]"
        />

        {/* Scan Button */}
        <button
          onClick={handleScan}
          disabled={loading}
          className="mt-4 px-6 py-2 bg-[#ff00ff] rounded text-white font-bold disabled:opacity-50"
        >
          {loading ? "Scanning..." : "Scan URL"}
        </button>

        {/* Results */}
        {result && <ResultPanel result={result} />}
        {error && <div className="text-red-500">{error}</div>}
      </div>
    </div>
  );
}

What this does:

Provides user interface for URL scanning
Handles loading states and error display
Calls the backend API with user's language preference
Uses React hooks to manage state
Keyboard support (Enter to scan)

Challenges I Faced

Building ScamDetect involved solving several complex problems:

1. OCR Accuracy & Confidence Level

The Challenge:
Different images have different quality levels. At first I tried a common OCR library (Tesseract), but it was missing characters or skipping parts of the text. That pushed me to look at other options that extract text from screenshots more reliably, such as Google Vision OCR and Microsoft Azure's OCR tool. I went with Google Vision OCR because I have worked with other Google Cloud services before.

After setting up Google Vision OCR, I hit an error because the service requires billing to be enabled, and Google could not verify any of the cards I tried. I then turned to Microsoft Azure. I signed up and filled in every detail, thinking I was close, but I could not log in after signing up. I kept receiving this error message: "interaction_required: AADSTS5000225: This tenant has been blocked due to inactivity. To learn more about tenant lifecycle policies, see https://aka.ms/TenantLifecycle". I kept the Google Vision OCR setup, since billing was its only problem.

The Solution:
No solution yet, because I have not figured out how to fix the billing issue. I have done everything Google asked of me, but the problem persists.

My plan is to subscribe to Hugging Face and use one of its hosted models, such as Zai, for OCR extraction. I could have used the DeepSeek OCR model in Ollama, but the model is large and I might not be able to run it in production; it is better suited to local use.

// When OCR confidence is low, keep only the strongest flags
if (confidence < 0.7) {
  console.warn("Low OCR confidence: reducing detection sensitivity");
  flags = flags.filter((f) => f.score > 30); // drop weaker flags that may come from misread text
}

2. Integrating AI Models Reliably

The Challenge:
Ollama (local AI) and the VirusTotal API can both fail intermittently. We can't let one failing service break the entire detection pipeline.

The Solution:

Made all external service calls optional
If VirusTotal fails, we still return keyword-based detection
If Ollama doesn't respond, we skip AI classification but complete the analysis
Return partial results instead of errors
This is called "graceful degradation"

// Ollama AI classification is optional
try {
  const { classifyWithOllama } = await import("./ollamaService");
  aiClassification = (await classifyWithOllama(text)) ?? undefined;
} catch (err) {
  console.error("[AI] Error:", err);
  // Continue without AI classification
}

// Return results even if external services failed
return {
  riskLevel,
  score: calculateRiskScore(flags), // Built from flags we do have
  flags,
  message: "Detection completed with available services",
};
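
The same idea can be generalised so that every optional service runs in parallel with a time limit, and the pipeline keeps whatever came back. The helper below is a sketch under assumptions: runOptionalChecks, withTimeout, and the 10-second default are illustrative names and values, not part of ScamDetect.

```typescript
// Rejects if `promise` does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

// Runs every optional check in parallel and keeps only the ones that succeeded.
async function runOptionalChecks<T>(
  checks: Record<string, () => Promise<T>>,
  timeoutMs = 10_000,
): Promise<{ results: Record<string, T>; failed: string[] }> {
  const names = Object.keys(checks);
  const settled = await Promise.allSettled(
    // Promise.resolve().then(...) also turns a synchronous throw into a rejection
    names.map((name) =>
      withTimeout(Promise.resolve().then(() => checks[name]()), timeoutMs),
    ),
  );

  const results: Record<string, T> = {};
  const failed: string[] = [];
  settled.forEach((outcome, i) => {
    if (outcome.status === "fulfilled") results[names[i]] = outcome.value;
    else failed.push(names[i]);
  });
  return { results, failed };
}
```

The failed list can be surfaced in the response so users know which checks were skipped, which is more informative than a generic "completed with available services" message.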

3. Managing API Call Costs and Rate Limits

The Challenge:

VirusTotal has rate limits (free tier: 4 requests/min)
Translation API has per-character costs
Scanning screenshots with Google Cloud Vision costs money
We needed to avoid wasting money on repeated scans of the same URLs

The Solution:

Implemented result caching in Supabase
If a URL was scanned recently, return cached result instead of making new API call
Batch translation requests when possible
Use express-rate-limit middleware to protect our backend
Rate limit by user and IP address

// Rate limiting in Express
import rateLimit from "express-rate-limit";

// `app` is the existing Express application
const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // Max 100 requests per IP per 15 min
  message: { error: "Too many requests, please try again later." },
});
app.use(limiter);
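
The result caching mentioned in the solution list can be illustrated with a small time-to-live cache. To keep the example self-contained, an in-memory Map stands in for the Supabase table, so treat this as a sketch of the idea rather than the project's storage code. TtlCache, cacheKey, and the injectable clock are all assumptions made for the example.

```typescript
// In-memory stand-in for a persistent cache of recent scan results.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so tests can control the clock
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it so the caller rescans
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Normalise URLs so trivially different spellings share one cache entry.
function cacheKey(rawUrl: string): string {
  const url = new URL(rawUrl.trim());
  url.hash = ""; // fragments are never sent to the server
  return url.toString();
}
```

Before calling VirusTotal, the backend would look up cache.get(cacheKey(url)) and only submit the URL on a miss, storing the fresh result with cache.set afterwards. A short TTL keeps verdicts from going stale if a site is later taken down or newly flagged.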

Setup Guide for Developers

Want to run ScamDetect yourself? Here's how to get started.

Prerequisites

Before you begin, make sure you have:

Node.js 20+ — Download here
Ollama installed — Download here
Git — Download here

You'll also need API keys from:

Supabase (free) — supabase.com
VirusTotal (free) — virustotal.com
Google Cloud Vision (free tier) — cloud.google.com
Lingo.dev (for translation) — lingo.dev

Step 1: Clone the Repository

git clone https://github.com/blaycoder/ScamDetect-Multilingual
cd ScamDetect-Multilingual

Step 2: Setup Backend

# Navigate to backend directory
cd backend

# Install dependencies
npm install

# Copy environment file
cp .env.example .env

# Edit .env with your API keys
nano .env
# Fill in:
# SUPABASE_URL=your-supabase-url
# SUPABASE_KEY=your-supabase-key
# VIRUSTOTAL_API_KEY=your-virustotal-key
# GOOGLE_CLOUD_CREDENTIALS=your-google-vision-json
# LINGODOTDEV_API_KEY=your-lingo-key

Start Ollama first (in a separate terminal):

ollama serve

Run the backend server:

npm run dev

You should see:

[Express] Server running on port 4000
[Ollama] Connected to local model

Step 3: Setup Frontend

In a new terminal:

# Navigate to frontend directory
cd frontend

# Install dependencies
npm install

# Create environment file
cp .env.example .env.local

# Edit .env.local
nano .env.local
# Fill in:
# NEXT_PUBLIC_API_URL=http://localhost:4000
# NEXT_PUBLIC_SUPABASE_URL=your-supabase-url
# NEXT_PUBLIC_SUPABASE_KEY=your-supabase-key

Run the frontend:

npm run dev

Open your browser to http://localhost:3000

Step 4: Test the Application

Test text scanning: Go to "Analyze Message" and paste a phishing message
Test URL scanning: Go to "Scan URL" and paste a suspicious URL
Test screenshot: Go to "Upload Screenshot" and upload a screenshot

If everything works, you should see detection results with risk scores!

Troubleshooting

Backend won't start:

Make sure Ollama is running (ollama serve)
Check that port 4000 isn't in use: lsof -i :4000

Frontend can't reach backend:

Make sure NEXT_PUBLIC_API_URL points to your backend
Check CORS settings in backend/src/app.ts

OCR not working:

Verify Google Cloud Vision credentials are correct
Check that you have the right JSON file format
Check that billing is enabled on Google Cloud
Confirm that Google Vision OCR is enabled in APIs & Services

Ollama responses are slow:

First response can take 10-30 seconds
Subsequent responses should be faster (model is cached in memory)

Real World Applications

ScamDetect isn't just a technical project. The scenarios below illustrate the kind of impact it is meant to have on people's lives.

Use Case 1: Protecting Vulnerable Communities

A grandmother in rural India receives a WhatsApp message claiming to be from her bank. The message asks her to "verify her account" by clicking a link. She's not tech-savvy and the message looks official.

Instead of losing ₹50,000 to the scammer, she pastes the message into ScamDetect. The system detects phishing keywords and suspicious domain patterns. The result appears in Hindi, her native language. She learns not to click the link.

Impact: One person saved from financial loss.

Use Case 2: Supporting Small Business Owners

A small business owner in Mexico receives an email claiming to be from "PayPal Support" asking him to verify his account. He doesn't speak English well, but he can use ScamDetect to analyze the email in Spanish.

ScamDetect detects:

Phishing keywords
Domain impersonation (fake PayPal domain)
VirusTotal flags from multiple antivirus engines

Impact: Business owner avoids losing business payment data.

Use Case 3: Educational Outreach

Schools and cybersecurity awareness programs can use ScamDetect to teach students about phishing in their native language. Instead of abstract lessons, students can:

Scan real (anonymized) phishing attempts
See how detection works
Understand the techniques scammers use

Impact: Next generation grows up more phishing-aware.

Key Takeaways

Building ScamDetect taught me several important lessons:

Accessibility matters: A tool is only useful if people can actually use it. Multilingual support isn't a nice-to-have; it's essential.
AI + APIs = powerful combinations: Combining local AI (Ollama) with threat intelligence APIs (VirusTotal) creates something stronger than either alone.
Graceful degradation: Don't let one failing service break your whole system. Build systems that work with partial data.
Free and open tools are powerful: Ollama with the gpt-oss model, Google Cloud Vision, and VirusTotal's free tier made this project possible without massive cloud budgets.
Real-world problems inspire good design: Building something that helps people avoid financial loss is more motivating than building for the sake of building.

Get to know me:

GitHub: github.com/blaycoder

LinkedIn: https://www.linkedin.com/in/ayomide-onatola-3180281a5

X: https://x.com/blaycoder

Conclusion

Phishing and scams are among the biggest cybersecurity threats facing everyday people. Language barriers shouldn't make anyone more vulnerable.

ScamDetect is my attempt to make the internet a little bit safer, one multilingual detection at a time.

If you found this interesting, I'd love to hear your thoughts! Feel free to comment, ask questions, or share your own experiences building security tools.

Before I drop my pen, check out this post written by me to understand why your privacy is important: Understanding your privacy is very important

Stay safe out there! 🛡️