Security Checks with Local LLMs

#security #productivity #automation #ai

Continuing the articles "AI-Powered Repository Security Check with Antigravity Workflow" and https://dev.to/gdg/how-to-build-a-custom-ai-quality-gate-on-cloud-run-from-zero-to-production-1odp, I decided to try outsourcing some checks to a local LLM.

This article describes my experiment and its outcomes. I'll be glad to read your questions, proposals, opinions, or advice! 🙌

You can listen to a podcast generated from this publication (thanks, NotebookLM).

Intro

Recent changes in limit management for popular LLM APIs got me thinking about FinOps. Why should I spend expensive cloud tokens on simple tasks? I also had many conversations at recent security and AI events, which led me to start experimenting with local LLMs for code generation and code quality checks.

Hardware

The hardware for the experiments is a MacBook Air M5 with 24 GB of RAM. I bought it specifically to dive into ML topics, but it had been underused until now.

Pains

The first pain was the introduction of new limits for the Antigravity IDE. Along with changes to the list of available models, it led me to think about optimizing my development and security flows, which were designed to use cheaper Antigravity tokens before falling back to more expensive Vertex AI tokens.

The second pain was FOMO about machine learning and MLOps in general.

Solution Track

After some iterations with Ollama and local models, I selected qwen2.5-coder:14b-instruct-q5_K_M as the base model, with a tuned context window:

% cat Modelfile-qwen-32k 
FROM qwen2.5-coder:14b-instruct-q5_K_M
PARAMETER num_ctx 32000

% ollama create qwen-coder-32k -f ./Modelfile-qwen-32k

...

% ollama list
NAME                                 ID              SIZE      MODIFIED     
qwen-coder-32k:latest                dc3c4762d967    10 GB     2 hours ago     
qwen-coder-64k:latest                42f060e717dd    10 GB     2 hours ago     
qwen2.5-coder:14b-instruct-q5_K_M    05d16c5ac1c1    10 GB     2 hours ago     
gemma4:e4b                           c6eb396dbd59    9.6 GB    25 hours ago    
gemma4:e2b                           7fbdbf8f5e45    7.2 GB    25 hours ago

The 32k window gave me reasonably quick execution and an acceptable trade-off between speed and the temperature of my laptop. I expect to experiment further with this configuration in the near future.
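Since the context size is the main knob here, it can help to generate the Modelfiles rather than write them by hand. The following is only a sketch: `write_modelfile` and `OUT_DIR` are hypothetical names, and the 64000 value is an assumption for a 64k variant (only the 32000 value appears in the Modelfile above).

```shell
#!/bin/bash
# Sketch: generate Ollama Modelfiles for several context sizes.
# write_modelfile and OUT_DIR are hypothetical names, not part of the original setup.

# Output directory: first argument, or a fresh temporary directory.
OUT_DIR="${1:-$(mktemp -d)}"
BASE_MODEL="qwen2.5-coder:14b-instruct-q5_K_M"

write_modelfile() {
    local base_model="$1" num_ctx="$2" out_file="$3"
    printf 'FROM %s\nPARAMETER num_ctx %s\n' "$base_model" "$num_ctx" > "$out_file"
}

# 32000 comes from the Modelfile above; 64000 is an assumed value for a 64k variant.
for ctx in 32000 64000; do
    name="qwen-coder-$((ctx / 1000))k"
    write_modelfile "$BASE_MODEL" "$ctx" "$OUT_DIR/Modelfile-$name"
    echo "Wrote $OUT_DIR/Modelfile-$name"
    # To register the variant with a local Ollama install, run:
    # ollama create "$name" -f "$OUT_DIR/Modelfile-$name"
done
```

The `ollama create` call is left commented out so the sketch runs without Ollama installed; each generated file has the same two-line shape as the hand-written Modelfile.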

Then I realized that I had to decompose the tasks and give my hardware some rest between requests. So the unified script was born:

#!/bin/bash

# Default values
OUTPUT_DIR="."
MODEL_NAME="qwen-coder-32k"
COEFF=2
PROMPT_FILE=""

show_help() {
    echo "Usage: $0 -d <directory> -m <file_mask> -p <prompt_file> [OPTIONS]"
    echo ""
    echo "Required parameters:"
    echo "  -d  Directory for searching files"
    echo "  -m  File mask to check"
    echo "  -p  Path to a text file with system prompt (e.g., prompts/strict_table.txt)"
    echo ""
    echo "Optional parameters:"
    echo "  -o  Directory to save the final report (default: current directory)"
    echo "  -e  Exclude directories (comma-separated, e.g., venv,tests,migration)"
    echo "  -f  Exclude file masks (comma-separated, e.g., *test*,__init__.py)"
    echo "  -c  Cooldown delay multiplier (default: 2)"
    exit 1
}

# Argument parsing
while getopts "d:m:o:e:f:c:p:h" opt; do
    case "$opt" in
        d) SRC_DIR="$OPTARG" ;;
        m) FILE_MASK="$OPTARG" ;;
        o) OUTPUT_DIR="$OPTARG" ;;
        e) EXCLUDE_DIRS="$OPTARG" ;;
        f) EXCLUDE_FILES="$OPTARG" ;;
        c) COEFF="$OPTARG" ;;
        p) PROMPT_FILE="$OPTARG" ;;
        h) show_help ;;
        *) show_help ;;
    esac
done

# Check required parameters
if [ -z "$SRC_DIR" ] || [ -z "$FILE_MASK" ] || [ -z "$PROMPT_FILE" ]; then
    echo "❌ Error: Required parameters -d, -m, or -p are missing."
    show_help
fi

# Check if prompt file exists
if [ ! -f "$PROMPT_FILE" ]; then
    echo "❌ Error: Prompt file '$PROMPT_FILE' not found!"
    exit 1
fi

# Check Ollama
if ! pgrep -x "ollama" > /dev/null && ! curl -s http://localhost:11434 > /dev/null; then
    echo "❌ Error: Ollama is not running!"
    exit 1
fi

# Check jq
if ! command -v jq &> /dev/null; then
    echo "❌ Error: 'jq' utility is not installed. Run: brew install jq"
    exit 1
fi

# Initialize report directory
mkdir -p "$OUTPUT_DIR"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
REPORT_FILE="$OUTPUT_DIR/review_report_$TIMESTAMP.md"

# Write report header
{
    echo "# 🛡️ Review Report"
    echo "Generation date: $(date)"
    echo "Used prompt: \`$PROMPT_FILE\`"
    echo -e "\n---\n"
} > "$REPORT_FILE"

echo "=================================================================="
echo "🕵️‍♂️ Starting review..."
echo "📂 Final report will be saved to: $REPORT_FILE"
echo "=================================================================="

# Build find command
FIND_CMD="find \"$SRC_DIR\" -type f -name \"$FILE_MASK\""

if [ -n "$EXCLUDE_DIRS" ]; then
    IFS=',' read -ra DIRS <<< "$EXCLUDE_DIRS"
    FOR_FIND=""
    for dir in "${DIRS[@]}"; do
        if [ -z "$FOR_FIND" ]; then
            FOR_FIND="-path '*/$dir/*'"
        else
            FOR_FIND="$FOR_FIND -o -path '*/$dir/*'"
        fi
    done
    FIND_CMD="find \"$SRC_DIR\" \( $FOR_FIND \) -prune -o -type f -name \"$FILE_MASK\" -print"
fi

# Start main file processing loop
eval "$FIND_CMD" | while IFS= read -r file; do
    if [ ! -f "$file" ]; then continue; fi

    # Check file exclusions
    if [ -n "$EXCLUDE_FILES" ]; then
        IFS=',' read -ra FILE_MASKS <<< "$EXCLUDE_FILES"
        skip_file=false
        for mask in "${FILE_MASKS[@]}"; do
            if [[ "$(basename "$file")" == $mask ]]; then
                skip_file=true
                break
            fi
        done
        if [ "$skip_file" = true ]; then
            echo "⏭️ Skipping file (excluded by mask): $file"
            continue
        fi
    fi

    echo -n "⏳ Analyzing: $file ... "

    # Read code and clear comments/empty lines
    CLEANED_CODE=$(sed -e 's/[[:space:]]*#.*//' -e '/^[[:space:]]*$/d' "$file")
    if [ -z "$CLEANED_CODE" ]; then 
        echo "⚠️ Empty."
        continue
    fi

    # Write file section to report
    {
        echo "## 📁 File: $file"
        echo -e "\n### 🔍 Analysis results:\n"
    } >> "$REPORT_FILE"

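    # Read external prompt and combine with code.
    # $'\n' inserts real newlines; "\n" inside double quotes would stay a literal backslash + n.
    SYSTEM_PROMPT=$(cat "$PROMPT_FILE")
    FULL_PROMPT="${SYSTEM_PROMPT}"$'\n\n'"--- TARGET CODE ---"$'\n'"${CLEANED_CODE}"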

    JSON_PAYLOAD=$(jq -n --arg model "$MODEL_NAME" --arg prompt "$FULL_PROMPT" '{model: $model, prompt: $prompt, stream: false}')

    # Measure time and send API request
    START_TIME=$(date +%s)
    curl -s -X POST http://localhost:11434/api/generate -H "Content-Type: application/json" -d "$JSON_PAYLOAD" | jq -r '.response' >> "$REPORT_FILE"
    END_TIME=$(date +%s)

    ELAPSED=$((END_TIME - START_TIME))
    SLEEP_TIME=$((ELAPSED * COEFF))

    echo -e "\n\n---\n\n" >> "$REPORT_FILE"
    echo "✅ Elapsed: ${ELAPSED}s. Rest: ${SLEEP_TIME}s."

    if [ "$SLEEP_TIME" -gt 0 ]; then 
        sleep "$SLEEP_TIME"
    fi
done

echo "=================================================================="
echo "🎉 Review successfully completed!"
echo "=================================================================="
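The script assembles its find command as a string and runs it through eval, which can misbehave when a directory name or mask contains quotes, `$`, or backticks. A minimal sketch of the same file selection with a bash array is shown here. `list_files` is a hypothetical helper name, not part of the original script, and the pruning logic simply mirrors the original -path/-prune approach.

```shell
#!/bin/bash
# Sketch: list files matching a mask while pruning excluded directories,
# using a bash array instead of an eval'd command string.
# list_files is a hypothetical helper name.
list_files() {
    local src_dir="$1" file_mask="$2" exclude_dirs="$3"
    local -a prune=() dirs=()
    local dir

    if [ -n "$exclude_dirs" ]; then
        IFS=',' read -ra dirs <<< "$exclude_dirs"
        for dir in "${dirs[@]}"; do
            # Join the -path tests with -o, as the original script does
            if [ "${#prune[@]}" -gt 0 ]; then
                prune+=(-o)
            fi
            prune+=(-path "*/$dir/*")
        done
        find "$src_dir" \( "${prune[@]}" \) -prune -o -type f -name "$file_mask" -print
    else
        find "$src_dir" -type f -name "$file_mask"
    fi
}

# Example call (paths are illustrative):
# list_files scripts 'setup*' 'venv,tests' | while IFS= read -r file; do echo "$file"; done
```

Each array element is passed to find as a separate argument, so no extra quoting layer is needed and nothing in the inputs is re-evaluated by the shell.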

The logic of the script:

Get info about which files to check and where they are stored.
Get the file with the prompt content.
Get some optional parameters about filtering, outputs and delays between requests.
For each file:
- Read the file and strip non-essential content such as comments and empty lines.
- Send the file content to the local LLM along with the prompt.
- Receive the result and save it to the report.
- Measure the processing time for the file and sleep for twice that time (by default) to let the hardware cool down.
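The cleaning and prompt-assembly steps from the list above can be tried in isolation, without Ollama. This is a sketch under assumptions: `clean_code` and `build_prompt` are hypothetical helper names, the sed expressions are copied from the script, and the sample file content is made up for illustration.

```shell
#!/bin/bash
# Sketch of the "clean" and "combine" steps in isolation.
# clean_code and build_prompt are hypothetical helper names.

# Strip '#' comments and blank lines, using the same sed expressions as the script.
clean_code() {
    sed -e 's/[[:space:]]*#.*//' -e '/^[[:space:]]*$/d' "$1"
}

# Join the prompt text and the code with real newlines.
build_prompt() {
    printf '%s\n\n--- TARGET CODE ---\n%s' "$1" "$2"
}

# Made-up sample input for illustration only
sample=$(mktemp)
cat > "$sample" <<'EOF'
#!/bin/bash
# create the bucket
BUCKET_NAME="demo"   # hypothetical value

echo "size: ${#BUCKET_NAME}"
EOF

cleaned=$(clean_code "$sample")
build_prompt "Review this code." "$cleaned"
rm -f "$sample"
```

Running it shows a caveat of the simple sed approach: every `#` is treated as the start of a comment, so the shebang disappears and the last line is cut down to `echo "size: ${`. For files that use `#` inside strings or parameter expansions, a stricter pattern such as matching only lines that begin with `#` may be a safer choice.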

Outcomes

Execution Flow

% ./scripts/repo-check-1.sh -d scripts -m 'setup*' -p scripts/prompt-infrasec.txt
==================================================================
🕵️‍♂️ Starting review...
📂 Final report will be saved to: ./review_report_20260521_121530.md
==================================================================
⏳ Analyzing: scripts/setup-quality-gate-iam.sh ... ✅ Elapsed: 6s. Rest: 12s.
⏳ Analyzing: scripts/setup-gcp-details.sh ... ✅ Elapsed: 95s. Rest: 190s.
⏳ Analyzing: scripts/setup-gcp.sh ... ✅ Elapsed: 128s. Rest: 256s.
==================================================================
🎉 Review successfully completed!
==================================================================
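The cooldown has a direct cost in wall-clock time, which is easy to estimate from the per-file timings. A small sketch follows; `total_runtime` is a hypothetical helper that mirrors the script's SLEEP_TIME = ELAPSED * COEFF rule and, like the script, counts the rest after the last file too.

```shell
#!/bin/bash
# Sketch: estimate total wall-clock time for a run with cooldowns.
# total_runtime is a hypothetical helper; it mirrors SLEEP_TIME = ELAPSED * COEFF.
total_runtime() {
    local coeff="$1"; shift
    local total=0 elapsed
    for elapsed in "$@"; do
        total=$((total + elapsed + elapsed * coeff))
    done
    echo "$total"
}

# Per-file times from the run above: 6s, 95s and 128s with the default COEFF=2
total_runtime 2 6 95 128   # prints 687
```

So the three files took 229 seconds of inference but about 687 seconds (roughly eleven and a half minutes) end to end. Because bash arithmetic is integer-only, the multiplier passed with -c has to be a whole number.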

Report

🔍 Analysis results:

Finding / Vulnerability	Recommendation / Fix
Assigning public access (legacyObjectReader) to GCS bucket	Remove the line `gsutil iam ch allUsers:legacyObjectReader "gs://${BUCKET_NAME}"` to prevent making the bucket publicly accessible. Consider using more restrictive permissions based on your security requirements.
Hardcoded service account name in the script	Avoid hardcoding sensitive information like service account names. Instead, retrieve them from a secure source or use environment variables.
Missing encryption settings for GCS bucket	Ensure that the GCS bucket is encrypted by default. Add the `--encryption` flag to the `gsutil mb` command if you want to specify a specific encryption type, such as `--encryption=DEFAULT`.
No logging and monitoring configurations	Implement logging and monitoring for the resources created. Enable Cloud Logging and Monitoring to track access and usage of the secrets and GCS bucket.
Using automatic replication policy for secrets	Consider using a more controlled replication policy for secrets. Automatic replication might not be necessary for all use cases, and you should evaluate whether it aligns with your security and compliance requirements.
Lack of error handling for secret creation	Add proper error handling when creating the secret to ensure that any issues during the creation process are caught and addressed appropriately.
No version control for secrets	Ensure that secrets have a versioning strategy in place. This allows you to manage changes and roll back to previous versions if needed.
Potential for misconfiguration of IAM roles	Double-check the IAM roles being assigned to ensure they align with the principle of least privilege. Avoid assigning broader permissions than necessary for the service account.

Conclusion

Looks extremely interesting:

The elapsed time is quite acceptable for me.
The local LLM's answers are quite similar to those of cloud LLMs, and this was achieved without prompt tuning or additional context manipulation.

Further steps planned:

Experiment with models, context windows, prompts, and additional context.
Check whether this works on a local SOHO server for batch tasks.
