Darian Vance

Originally published at wp.me

Solved: The new Coca-Cola Christmas ad is a load of AI-generated slop. Is it intentional?

🚀 Executive Summary

TL;DR: The flawed Coca-Cola AI ad exemplifies a broken automation pipeline where processes validate execution but not output quality, leading to ‘AI slop.’ The solution involves implementing robust quality gates, ranging from mandatory human approval steps to advanced automated output validation using tools like perceptual hashes or policy-as-code, to ensure generated content meets standards before deployment.

🎯 Key Takeaways

  • A ‘green checkmark’ in CI/CD pipelines only signifies successful code execution, not the quality or correctness of the generated output, especially when integrating generative AI.
  • Implementing a ‘Manual Gate’ (human approval step) before final deployment is a quick, 100% effective method to prevent low-quality outputs from reaching production, though it introduces a bottleneck.
  • Automated quality gates, such as using Perceptual Hashes (pHash) for image validation or Open Policy Agent (OPA) for Infrastructure as Code, enable ‘shifting left’ quality checks by programmatically validating the output rather than just execution.

The flawed Coca-Cola AI ad isn’t just bad art; it’s a classic symptom of a broken automation pipeline. Learn how to fix your own “AI slop” with proper quality gates and a human-in-the-loop approach.

That AI Christmas Ad is a Symptom of a Broken Pipeline, Not Just a Bad Prompt

I saw that Coke ad and got a cold sweat. It wasn’t because of the uncanny valley polar bears; it was because I’ve lived that failure. It took me back to a 3 AM incident call a few years ago. We’d just rolled out a “brilliant” new service in our CI/CD pipeline that was supposed to auto-optimize all our product images for the main e-commerce platform. The pipeline went green. All checks passed. The deployment to the web farm, from ecomm-prod-web-01 to ecomm-prod-web-12, was flawless. Then the support tickets started rolling in. The script had “optimized” 10,000 product images into a pixelated, artifact-ridden mess. The code ran perfectly; the result was a disaster. That ad is the same thing on a global scale: a process that worked, but a product that failed.

The “Why”: Your Pipeline is a Liar (By Omission)

Here’s the hard truth we often forget in the race to automate everything: a green checkmark in Jenkins or a successful GitHub Actions run doesn’t mean you’ve produced something of quality. It just means your code didn’t crash. Your pipeline is likely testing if the script ran, not if the output was right.

In the case of that ad, an AI model was given a prompt. It executed that prompt and spat out a series of images. The “build” was successful. But no one, or no automated process, was in place to ask the critical questions:

  • Does the polar bear have a consistent look frame-to-frame?
  • Is the art style coherent?
  • Does this look, you know, good?

You can’t just bolt on a new, powerful tool like Generative AI to an old pipeline and expect it to work. You’ve automated the creation of garbage, at scale. The problem isn’t the AI; it’s the lack of a quality gate in your process.
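To make that concrete, here is a minimal sketch (magic numbers and thresholds are illustrative, not from any real pipeline) of the kind of output check that would have caught my 10,000-image disaster. It inspects the artifact itself instead of trusting the exit code of the script that produced it:

```python
# A script exiting 0 proves nothing about the artifact it produced.
# This check looks at the output itself: is it actually a JPEG or PNG,
# and is it above a sanity floor? (Thresholds are illustrative.)
MIN_BYTES = 5_000  # an "optimized" product image smaller than this is suspect

def output_looks_sane(data: bytes) -> bool:
    """Return True if the bytes look like a plausible optimized image."""
    is_jpeg = data[:2] == b"\xff\xd8"          # JPEG SOI marker
    is_png = data[:8] == b"\x89PNG\r\n\x1a\n"  # PNG signature
    return (is_jpeg or is_png) and len(data) >= MIN_BYTES
```

A check this dumb would have failed the build the moment the "optimizer" started emitting near-empty garbage, long before ticket number one.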

The Fixes: How to Stop Shipping Slop

Whether it’s AI-generated video or a Terraform plan, the principle is the same. You need to validate the output, not just the execution. Here are three ways to tackle it, from a quick patch to a permanent architectural change.

1. The Quick Fix: The Manual Gate

This is the “stop the bleeding” approach. You immediately insert a mandatory human approval step right before the final commit to main or the deployment to production. It’s slow and it creates a bottleneck, but it’s 100% effective at stopping a disaster from going live.

In a CI/CD context, this is the “environment” feature in GitHub Actions or a manual “Promote Build” step in Jenkins. You configure your production deployment job to require a specific person or team to click a button before it can proceed.

# .github/workflows/deploy.yml
# Note: the "Production Gate" environment must be configured under the
# repo's Settings > Environments with required reviewers, or this job
# will not actually pause for approval.
jobs:
  build-and-test:
    # ... build and test steps happen here ...

  deploy-to-staging:
    # ... automatically deploys to a review environment ...

  wait-for-approval:
    runs-on: ubuntu-latest
    needs: [deploy-to-staging]
    environment:
      name: Production Gate
      url: https://your-staging-env.example.com
    steps:
      - run: echo "Waiting for manual sign-off to deploy to Production."

  deploy-to-prod:
    runs-on: ubuntu-latest
    needs: [wait-for-approval]
    steps:
      - name: Deploy
        run: ./deploy-to-prod.sh

This is a hack, but it’s a necessary one when quality has taken a nosedive. The Creative Director should have had to approve that ad render before it ever got sent out, and your lead engineer should have to approve that database migration script before it hits prod-db-01.

2. The Permanent Fix: Automate the Eyeball Test

The real goal is to build automated quality checks that approximate what a human would look for. This is about “shifting left” and making quality an automated part of the process, not a manual step at the end.

Instead of just checking if a script ran, you add a new stage to your pipeline that validates the output.

Some examples of automated quality gates, by scenario:

  • Image processing pipeline: Add a step that uses a Perceptual Hash (pHash) to compare the ‘before’ and ‘after’ images. If the distance is too great, fail the build. You can also run the output through a no-reference quality scoring model like BRISQUE.
  • Infrastructure as Code (Terraform): Run terraform plan, then use a policy-as-code tool like Open Policy Agent (OPA) to programmatically check the plan file for undesirable changes (e.g., “fail if this plan tries to destroy a production database”).
  • AI ad generation: This is cutting-edge, but you could add a step that uses another AI model (a vision model) to check for stylistic consistency across frames or to detect common visual artifacts. If the ‘artifact score’ is too high, the build fails and a human is alerted.
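In a real pipeline you’d reach for a library like imagehash for the pHash comparison; as a stdlib-only sketch of the idea (function names and the threshold are mine, purely illustrative), an average hash captures the same principle — hash the coarse brightness structure, then fail the gate if the before/after Hamming distance is too large:

```python
# Sketch of a perceptual-hash quality gate. Real pipelines would use a
# library such as imagehash; this stdlib-only average hash shows the idea.

def average_hash(pixels):
    """pixels: 2D list of grayscale values (0-255). Returns a bit string:
    one bit per pixel, set if that pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)

def hamming(a, b):
    """Number of differing bits between two equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b))

def quality_gate(before, after, max_distance=4):
    """True if the processed image is perceptually close to the original."""
    return hamming(average_hash(before), average_hash(after)) <= max_distance
```

Small pixel-level noise barely moves the hash, while a structural change (the kind my “optimizer” produced) flips many bits and fails the gate.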

Pro Tip: Your quality gate doesn’t have to be perfect. A simple script that catches 80% of common errors is infinitely more valuable than a manual process that people start to rubber-stamp because they’re in a hurry. Start small and iterate.
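As an example of starting small: a short script over the JSON that `terraform show -json plan.out` emits can catch the worst class of mistake. The top-level `resource_changes` key and per-change `address`/`change.actions` fields below follow that JSON format; the “address contains prod” rule is a deliberately crude illustration you’d tighten for your own naming scheme:

```python
# Illustrative plan gate: flag any resource that a Terraform plan would
# delete whose address mentions "prod". Input is the JSON produced by
# `terraform show -json plan.out`.
import json

def destructive_prod_changes(plan_json: str):
    """Return the addresses of production resources slated for deletion."""
    plan = json.loads(plan_json)
    flagged = []
    for change in plan.get("resource_changes", []):
        actions = change.get("change", {}).get("actions", [])
        if "delete" in actions and "prod" in change.get("address", ""):
            flagged.append(change["address"])
    return flagged
```

Wire it into the pipeline as a step that fails the build when the returned list is non-empty, and you have a crude but real gate between `terraform plan` and `terraform apply`.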

3. The ‘Nuclear’ Option: Rip It Out

Sometimes, the tech just isn’t ready. The most senior engineering decision you can make is to admit that a piece of automation is causing more harm than good and to pull it from the critical path entirely.

This means de-integrating the new, shiny tool from your main production pipeline. Let it run in a development environment. Let the team generating the AI assets do it on their local machines. Go back to the old, slow, manual way of uploading and validating assets until the new automated process is truly battle-tested and reliable.

It’s a tough pill to swallow, especially if you’ve invested months into a new system. But shipping a broken product because your process demands it is how you end up as a case study in what not to do. Your job is to protect production and the customer experience, not to blindly follow a flawed pipeline. Sometimes, that means hitting the big red stop button and going back to the drawing board.


👉 Read the original article on TechResolve.blog


☕ Support my work

If this article helped you, you can buy me a coffee:

👉 https://buymeacoffee.com/darianvance
