Oleksii Antoniuk

Posted on Jun 11 • Originally published at oleant.dev

Refactoring Laravel Visit Analytics: The Path to Version 2.0.0

#laravel #analytics #webdev #cybersecurity

Remember my recent guest with Firefox 140.0, the one I mentioned in "A Lie Detector for HTTP Requests: Analytics Through Time"? That case proved it: the "digital customs" is working, and the scoring system is diligently building "suspicion dossiers" on every shady visitor.

But here’s the catch: the smarter my "lie detector" got, the harder it became for it to breathe. In version 1.3.0, I added port checks, referrer loop detection, and the infamous Snowball Effect. At some point, I looked at the main Middleware code and was horrified. It started to resemble a Swiss Army knife that some fanatic tried to upgrade with a chainsaw, a microscope, and a fishing net all at once.

The logic branched out, conditions multiplied, and milliseconds began crowding the logs, demanding surgical precision. I realized that if I left it as is, the package itself would turn into a sluggish "bot," spending more resources on self-analysis than on serving actual guests.

Today, I’m going to tell you why my laravel-visit-analytics package suddenly skipped several steps — jumping from version 1.3.0 straight to 2.0.0. Spoiler: we performed full-scale "open-heart surgery" on the project. We didn't just repaint the facade; we completely rewrote the architecture so the system could scale indefinitely without losing a fraction of a second or a shred of common sense.

Jumping the Chasm: Why SemVer is About Honesty

In the dev world, there’s an unwritten code of honor — SemVer (Semantic Versioning). It’s simple but rigid math: MAJOR.MINOR.PATCH. If you just fixed a typo, that’s a patch. If you added a new feature, that’s minor. But if your changes make a user "scratch their head" over a crashed database — you are obligated to bump the first digit.

I could have spent ages patching version 1.x, trying to maintain backward compatibility, but the old foundation simply couldn't support the weight of new ideas. Version 2.0.0 is my way of saying: "We swapped the chassis while driving. Please, check the manual!"

The main stumbling block was in the migrations. Previously, the log table structure was standard and fairly relaxed:

// Version 1.3.0: Standard timestamps, basic indices
Schema::create('visit_logs', function (Blueprint $table) {
    // ...
    $table->timestamps(); // Second-level precision
});

In version 2.0.0, we switched to "High Definition." For the new analyzers to work and for the Snowball Effect to trigger accurately, we needed millisecond precision and additional fields to store "evidence."

// Version 2.0.0: Milliseconds and flexible data structure
Schema::create('visit_logs', function (Blueprint $table) {
    // ...
    // Millisecond precision (3 fractional digits) for Laravel + MySQL/MariaDB
    $table->dateTime('created_at', 3)->index();
});

If I had released this as 1.4.0, an automatic composer update could have turned some developer's morning into a nightmare with Column not found errors or data type mismatches. A major release isn't about hype; it's a bulletproof vest for your data. We consciously chose this break to lay the groundwork for the future.
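The protection comes from how Composer reads a caret constraint: ^1.3 accepts anything from 1.3.0 up to, but not including, 2.0.0. Here is a minimal illustration in plain PHP. It is not Composer's actual resolver; satisfiesCaret is a hypothetical helper that only covers major versions of 1 and above.

```php
<?php

// Hypothetical helper mimicking "^X.Y.Z" for X >= 1:
// the version must be at least the base and below the next major.
function satisfiesCaret(string $version, string $base): bool
{
    $nextMajor = ((int) explode('.', $base)[0] + 1) . '.0.0';

    return version_compare($version, $base, '>=')
        && version_compare($version, $nextMajor, '<');
}

$picksUp140 = satisfiesCaret('1.4.0', '1.3.0'); // true: a minor release arrives automatically
$picksUp200 = satisfiesCaret('2.0.0', '1.3.0'); // false: a major release needs an explicit opt-in
```

In other words, a project pinned to ^1.3 stays on the old schema until someone deliberately edits composer.json.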

The "Vanishing Milliseconds" Problem and Moody SQLite

The first serious blow came from the database world. My stack is pretty standard: powerful MariaDB in production, and lightning-fast in-memory SQLite for testing. And it was exactly at the intersection of these two worlds that the "mystical" bugs began.

Picture this: an elite bot makes a series of requests. In reality, there are fractions of a second between them, say 200 or 300 milliseconds. But the standard $table->timestamps() call in Laravel creates columns with whole-second precision by default.

As a result, for the database, three different visits occurring at 14:00:00.100, 14:00:00.400, and 14:00:00.800 merged into a single blur timestamped 14:00:00. This completely broke the Snowball Effect logic: the system couldn't build a chain of events because, from its perspective, they happened simultaneously.
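The collapse is easy to reproduce in plain PHP, without any database. This is a minimal sketch using the three visit times from the example; the date is arbitrary and the variable names are illustrative.

```php
<?php

// Three visits inside the same second (the date itself is arbitrary).
$visits = [
    new DateTimeImmutable('2025-06-11 14:00:00.100'),
    new DateTimeImmutable('2025-06-11 14:00:00.400'),
    new DateTimeImmutable('2025-06-11 14:00:00.800'),
];

// Whole-second formatting: the fractional part is dropped.
$secondPrecision = array_map(
    fn (DateTimeImmutable $t): string => $t->format('Y-m-d H:i:s'),
    $visits
);

// Millisecond formatting: "v" is PHP's format character for milliseconds.
$millisecondPrecision = array_map(
    fn (DateTimeImmutable $t): string => $t->format('Y-m-d H:i:s.v'),
    $visits
);

$distinctSeconds = count(array_unique($secondPrecision));      // 1
$distinctMillis  = count(array_unique($millisecondPrecision)); // 3
```

With one distinct value there is nothing to sort by; with three, the order of events is unambiguous.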

To restore the analytics' "vision," I had to switch to dateTime('created_at', 3). Those "three decimal places" became my entry ticket to the world of major updates. But SQLite, that moody little engine, started to resist. It simply ignored the precision setting in migrations unless you approached it with "special care."

I had to implement a driver check directly in the migration code to ensure the package remained universal:

$precision = 3;

// For MySQL/MariaDB we explicitly set precision, 
// for SQLite we use the standard as it stores dates as strings
$driverName = Schema::getConnection()->getDriverName();

Schema::create('visit_logs', function (Blueprint $table) use ($precision, $driverName) {
    $table->id();
    // ... other fields
    if ($driverName === 'sqlite') {
        $table->timestamps(); 
    } else {
        $table->dateTime('created_at', $precision)->nullable()->index();
    }
});

Why is this critical? Without this fix, my Pest tests were becoming a lottery. The analyzer saw three records with identical times and couldn't tell which came first. Now, the precision is surgical — we see every "sneeze" from a bot with millisecond accuracy.

Lesson learned: If your analytics can't see milliseconds, it's blind. Bots will appear to you as either supernaturally fast or magically synchronous. In modern web security, a second is an eternity, enough time to hack half a website.

From "Layer Cake" to Analyzer Pipeline

In version 1.3, the entire interrogation logic—from hunting for "traitorous ports" to calculating "referrer loops"—lived inside a single, massive class. It felt like a detective’s office in a low-budget TV show: evidence, protocols, yesterday’s sandwiches, and personal files were all piled onto one desk. Add one new check, and the whole shaky structure threatened to collapse on your head.

In 2.0.0, we did a deep clean and implemented the Chain of Responsibility pattern. Now, each "suspicion zone" is handled by its own highly specialized agent—the Analyzer.

How this "Special Task Force" works:

AnalysisState: This is our "Case File." A special container object that is carefully passed through the entire chain. Each analyzer records its findings there: scores, reasons, and indisputable evidence.
BotAnalysisService: The "Police Chief" managing the process. It takes the request and hands it over to the profile experts one by one.

Meet the "staff" in version 2.0.0:

ExplicitBotsAnalyzer: Deals with those who honestly (or foolishly) admit in the headers that they are bots.
HeaderIntegrityAnalyzer: Checks header integrity. If a browser claims to be Chrome but acts like a leaky bucket, this is the agent who takes the case.
NetworkAnalyzer: The tough guy checking IP "residency" (data centers, clouds, suspicious subnets).
ObsoleteOSAnalyzer & OutdatedBrowserAnalyzer: The "Archeologist" duo. They track down those coming from the "Digital Paleozoic" (Hello, Windows XP!).
RefererAnalyzer: Specialist in "loops" and suspicious ports.
ReputationAnalyzer: Cross-references the guest against blacklists and past sins.
HoneypotAnalyzer: Our expert in "honey traps." If someone pokes around /.env, this analyzer closes the case.

Why is this great? This architecture provides isolation and clarity. Each analyzer is an independent file of a couple hundred lines, easy to test and extend. If tomorrow hackers release a new generation of bots imitating, say, "smart fridges," I won't need to rewrite the system core. I'll just create a SmartFridgeAnalyzer.php, drop it into the chain, and my "Digital Sheriff" will instantly learn to recognize threats from household appliances.
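To make the "drop it into the chain" idea concrete, here is a rough sketch of what such an analyzer could look like. Everything in it is an assumption for illustration: the AnalysisState below is a tiny stand-in rather than the package's class, the interface signature is a guess, the log is modeled as a plain array, and the 'signatures' and 'score' parameters are invented.

```php
<?php

// Minimal stand-in for the package's AnalysisState; the real class may differ.
final class AnalysisState
{
    public int $score = 0;
    /** @var string[] */
    public array $reasons = [];
    /** @var array<string, mixed> */
    public array $evidence = [];

    public function addScore(int $points, string $reason): self
    {
        $this->score += $points;
        $this->reasons[] = $reason;
        return $this;
    }

    public function addEvidence(string $key, mixed $value): self
    {
        $this->evidence[$key] = $value;
        return $this;
    }
}

// Assumed contract: one method that receives the log, the case file, and its own params.
interface BotAnalyzerInterface
{
    public function analyze(array $log, AnalysisState $state, array $params = []): void;
}

// Hypothetical analyzer: flags User-Agents containing configured appliance markers.
final class SmartFridgeAnalyzer implements BotAnalyzerInterface
{
    public function analyze(array $log, AnalysisState $state, array $params = []): void
    {
        $userAgent = $log['user_agent'] ?? '';

        foreach ($params['signatures'] ?? [] as $signature) {
            if (stripos($userAgent, $signature) !== false) {
                $state->addScore($params['score'] ?? 50, 'smart_fridge');
                $state->addEvidence('fridge_signature', $signature);
                return;
            }
        }
    }
}

$state = new AnalysisState();
(new SmartFridgeAnalyzer())->analyze(
    ['user_agent' => 'Mozilla/5.0 (FridgeOS 2.1) FamilyHub'],
    $state,
    ['signatures' => ['FridgeOS'], 'score' => 40]
);
```

The analyzer knows nothing about its neighbors: it reads the log, writes to the case file, and returns.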

Behind the scenes, everything is managed by RetroAnalysisService — our "Cold Case Unit" that implements the Snowball Effect, cleaning up visit history if a guest finally slips up at a later stage.

Control Panel: How the "Engine" Works

Architecture isn't just about beautiful classes; it's about the ease of managing them. In version 2.0.0, I wanted the user to be able to disable heavy checks or add their own "on the fly" without digging into the package source. All control is centered in the configuration file. Each analyzer is an independent block that can be toggled with a single switch:

// config/visit-analytics.php
'analyzers' => [
    'obsolete_os' => [
        'enabled' => true,
        'class'   => \Oleant\VisitAnalytics\Analyzers\ObsoleteOSAnalyzer::class,
        'params'  => [
            'target_os' => ['Windows NT 5.1', 'Windows NT 6.0'],
            // ... other settings
        ],
    ],
    // ...
],

And here is the "magic" inside the BotAnalysisService. It doesn't just iterate through classes; it works like a smart pipeline with built-in fail-safes:

foreach ($analyzers as $settings) {
    // 1. Check if the expert is active
    if (!($settings['enabled'] ?? false)) {
        continue;
    }

    try {
        // 2. Injection via Laravel container (Dependency Injection in action!)
        /** @var BotAnalyzerInterface $analyzer */
        $analyzer = app($settings['class']);

        // 3. Pass only the required parameters (Encapsulation)
        $analyzer->analyze($log, $state, $settings['params'] ?? []);
    } catch (Throwable $e) {
        // 4. "Fail-safe" mode: if one agent fails, the investigation continues
        report($e);
        $state->addEvidence('execution_errors', [
            'analyzer' => $settings['class'],
            'error'    => $e->getMessage(),
        ]);
        continue;
    }
}

Why this matters:

Parameter Isolation: Each analyzer receives only its own portion of settings. It doesn't know about others and can't interfere.
Fault Tolerance: If an error occurs in one analyzer, the main process won't break. The system logs the "evidence" of the failure and moves on with the interrogation.
Clean Code via DI: Using app($settings['class']) allows you to use any other Laravel services in your analyzer constructors.
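The fail-safe behavior can be tried outside Laravel. The sketch below is a simplified stand-in, not the package's code: it uses plain "new" instead of app(), skips report(), models the log as an array, and both analyzers (BrokenAnalyzer, MissingRefererAnalyzer) are hypothetical.

```php
<?php

// Simplified stand-ins for the package's classes.
final class AnalysisState
{
    public int $score = 0;
    public array $evidence = [];
}

interface BotAnalyzerInterface
{
    public function analyze(array $log, AnalysisState $state, array $params = []): void;
}

// Hypothetical analyzer that always crashes.
final class BrokenAnalyzer implements BotAnalyzerInterface
{
    public function analyze(array $log, AnalysisState $state, array $params = []): void
    {
        throw new RuntimeException('GeoIP database is missing');
    }
}

// Hypothetical analyzer that adds points when the Referer is absent.
final class MissingRefererAnalyzer implements BotAnalyzerInterface
{
    public function analyze(array $log, AnalysisState $state, array $params = []): void
    {
        if (empty($log['referer'])) {
            $state->score += $params['score'] ?? 10;
        }
    }
}

function runAnalyzers(array $analyzers, array $log, AnalysisState $state): AnalysisState
{
    foreach ($analyzers as $settings) {
        if (!($settings['enabled'] ?? false)) {
            continue; // disabled analyzers are skipped entirely
        }
        try {
            $class = $settings['class'];
            $analyzer = new $class(); // plain "new" instead of the Laravel container
            $analyzer->analyze($log, $state, $settings['params'] ?? []);
        } catch (Throwable $e) {
            // Record the failure and continue with the remaining analyzers.
            $state->evidence['execution_errors'][] = [
                'analyzer' => $settings['class'],
                'error'    => $e->getMessage(),
            ];
        }
    }
    return $state;
}

$state = runAnalyzers([
    ['enabled' => true,  'class' => BrokenAnalyzer::class],
    ['enabled' => false, 'class' => MissingRefererAnalyzer::class, 'params' => ['score' => 99]],
    ['enabled' => true,  'class' => MissingRefererAnalyzer::class, 'params' => ['score' => 15]],
], ['referer' => null], new AnalysisState());
```

The crash of the first analyzer ends up in the evidence, the disabled entry contributes nothing, and the last analyzer still adds its 15 points.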

Code That Doesn't Cover Its Tracks: Collecting "Digital Evidence"

Remember how in version 1.3 we just recorded suspicions? In 2.0, I realized we needed a full-blown crime scene investigation protocol. But a purely technical problem arose. Previously, the method for adding evidence was too simplistic. If ObsoleteOSAnalyzer found an old Windows version, and then OutdatedBrowserAnalyzer found an ancient Internet Explorer, they might accidentally "fight" over keys in the data array.

As a result, the latest piece of evidence would simply overwrite the previous one. PHP arrays are powerful, but with a careless array_merge, they turn into an eraser that wipes out history. In version 2.0, I rewrote the evidence collection logic in AnalysisState. Now we use "smart merging": when a key already holds data, the new value is appended to it and duplicates are dropped, so nothing gets overwritten.

// AnalysisState.php
public function addEvidence(string $key, mixed $value): self
{
    // If data already exists for this key, we don't overwrite it; 
    // we turn it into a list or supplement the array.
    if (isset($this->evidence[$key])) {
        $existing = (array) $this->evidence[$key];
        $this->evidence[$key] = array_unique(array_merge($existing, (array) $value));
    } else {
        $this->evidence[$key] = $value;
    }
    return $this;
}
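One edge case is worth keeping in mind: array_merge overwrites matching string keys, so two associative payloads stored under the same evidence key (such as two execution_errors entries) would collapse into the last one. Below is a hedged variant that treats an associative array as a single item. It is an illustration, not the package's implementation: EvidenceBag is a hypothetical name, and array_is_list requires PHP 8.1 or newer.

```php
<?php

// Illustrative container (hypothetical name), not the package's AnalysisState.
final class EvidenceBag
{
    /** @var array<string, mixed> */
    private array $evidence = [];

    public function add(string $key, mixed $value): self
    {
        if (!array_key_exists($key, $this->evidence)) {
            $this->evidence[$key] = $value;
            return $this;
        }

        $merged = array_merge(
            $this->asList($this->evidence[$key]),
            $this->asList($value)
        );

        // SORT_REGULAR compares nested arrays by value instead of casting them to strings.
        $this->evidence[$key] = array_values(array_unique($merged, SORT_REGULAR));
        return $this;
    }

    public function all(): array
    {
        return $this->evidence;
    }

    // Scalars and associative arrays count as one item; plain lists pass through.
    private function asList(mixed $value): array
    {
        return is_array($value) && array_is_list($value) ? $value : [$value];
    }
}

$bag = (new EvidenceBag())
    ->add('os_signature', 'Windows NT 5.1')
    ->add('os_signature', 'Windows NT 6.0')
    ->add('os_signature', 'Windows NT 5.1') // duplicate, dropped
    ->add('execution_errors', ['analyzer' => 'A', 'error' => 'boom'])
    ->add('execution_errors', ['analyzer' => 'B', 'error' => 'bang']);
```

Here os_signature ends up as a two-item list, and both execution_errors entries survive side by side.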

A Forensic Report in Your Logs

Thanks to this "thrifty" approach, the database now stores a detailed dossier instead of an abstract status. When you open the visit_logs table, the evidence field (which is now a full-fledged JSON column) tells a whole story:

Bot Score: 85 (Trust threshold exceeded)
Reasons: ['obsolete_os', 'obsolete_browsers', 'missing_referer']
Evidence:

{
  "os_signature": "Windows NT 5.1",
  "browsers_signature": "MSIE 6.0",
  "referer_status": "missing",
  "checked_analyzers": ["ObsoleteOSAnalyzer", "OutdatedBrowserAnalyzer", "RefererAnalyzer"]
}

This approach turns analytics into a tool for evidence-based security. If a client comes to you asking, "Why was I blocked?", you won't mumble about "algorithms"—you'll show concrete facts: "Your browser from 2001 and the lack of a referrer resulted in a suspicion score of 85." This is the transparency I aimed for when moving to version 2.0.0. We are no longer just guessing—we are documenting.
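Turning a stored dossier into that kind of answer takes only a few lines. This is a sketch under assumptions: the row is a hand-written array mirroring the example above, the 'reasons' column name is a guess, and explainBlock is a hypothetical helper rather than part of the package.

```php
<?php

// A dossier shaped like the example above, as it might come back from the database.
$row = [
    'bot_score' => 85,
    'reasons'   => ['obsolete_os', 'obsolete_browsers', 'missing_referer'],
    'evidence'  => '{"os_signature":"Windows NT 5.1","browsers_signature":"MSIE 6.0","referer_status":"missing"}',
];

// Hypothetical helper: turns a dossier into one sentence for a support reply.
function explainBlock(array $row): string
{
    // JSON_THROW_ON_ERROR makes a corrupted dossier fail loudly instead of returning null.
    $evidence = json_decode($row['evidence'], true, 512, JSON_THROW_ON_ERROR);

    $facts = [];
    foreach ($evidence as $key => $value) {
        $facts[] = sprintf('%s=%s', $key, is_array($value) ? implode(', ', $value) : $value);
    }

    return sprintf(
        'Suspicion score %d (%s). Evidence: %s.',
        $row['bot_score'],
        implode(', ', $row['reasons']),
        implode('; ', $facts)
    );
}

$message = explainBlock($row);
```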

Epilogue: Why Did We Do This?

Refactoring is a tricky business. At first glance, it seems like a boring and thankless task: you spend dozens of hours to end up with... exactly the same application, which just works "more correctly" under the hood. No new buttons, no flashy animations.

But in reality, it’s the only way to keep a project from being buried under the weight of its own code. If I had continued "patching" logic into the version 1.3 Middleware, any attempt to add a new check would have soon turned into defusing a bomb. One wrong if—and the whole analytics system goes down.

Version 2.0.0 is my "Engineering Manifesto." By laying down a clean analyzer architecture, I’ve prepared the perfect runway for the most ambitious stage of the project — visualization in Filament. Now that we have detailed evidence in JSON and a clear bot_score, turning these dry numbers into interactive graphs, threat maps, and real-time dashboards is only a matter of time.

But not everything in the world of algorithms obeys the dry logic of ones and zeros. Sometimes, even the most perfect architecture falters before... an ordinary old phone. In the next article, I’ll tell an almost detective-like story of how my own son nearly became an "enemy of the system" by visiting the site from a vintage Sony Ericsson. We’ll explore how to avoid turning your defense into a digital dictator and why sometimes you need to let a "suspicious" guest through.

Stay tuned—the most interesting things are always hidden in the details!
