Ievgenii Gryshkun

Posted on May 22 • Originally published at angeo.dev

The Magento multi-store bug every AI description generator has — and how we fixed it

#magento #ai #opensource #php

A client came to us with 8,000 SKUs across four store views — English, Dutch, German, French. Descriptions were either copied from supplier PDFs or missing entirely. The fix was obviously AI generation. The less obvious problem was that every existing module we evaluated had the same architectural bug.

So we built our own, made it MIT-licensed, and put it on Packagist. This post is about the bug, the fix, and the four-provider abstraction (including a free one) we shipped on top.

🔗 Originally published on https://angeo.dev/magento-2-ai-product-description-generator/

The bug nobody talks about

Most Magento 2 AI content modules call this:

$product = $this->productRepository->get($sku, editMode: true);
$product->setCustomAttribute('description', $generated);
$this->productRepository->save($product);

Looks fine. It isn't. Without an explicit $storeId, this loads and saves in the default scope, Magento's global, store-view-independent fallback. When you save back, you overwrite the value that every store view without its own override inherits. The Dutch store gets English descriptions. The German store gets English descriptions. The French store gets English descriptions.

This is not a configuration problem. The multi-store architecture works correctly — the tooling ignores it. Writing to the default scope is simpler to implement than writing per store. Every commercial module we tested took the simpler path.
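To make the fallback concrete, here is a deliberately simplified in-memory model of store-scoped values. It is a sketch, not Magento's EAV storage: the ScopedValues class, the store IDs, and the sample copy are all assumptions made for illustration.

```php
<?php
declare(strict_types=1);

// Simplified model of store-scoped attribute values.
// Store ID 0 stands for the default scope; this is NOT Magento's real storage layer.
final class ScopedValues
{
    /** @var array<int, string> value keyed by store ID */
    private array $values = [];

    public function set(int $storeId, string $value): void
    {
        $this->values[$storeId] = $value;
    }

    // A store view reads its own value if one exists, otherwise the default.
    public function get(int $storeId): ?string
    {
        return $this->values[$storeId] ?? $this->values[0] ?? null;
    }
}

$description = new ScopedValues();

// Buggy path: the generator writes English copy to the default scope.
$description->set(0, 'Silver ring with a polished finish.');

// Hypothetical store IDs: 1 = EN, 2 = NL, 3 = DE.
echo $description->get(2), PHP_EOL; // the Dutch store shows the English text

// Fixed path: write the Dutch copy to the Dutch store view only.
$description->set(2, 'Zilveren ring met gepolijste afwerking.');
echo $description->get(2), PHP_EOL; // the Dutch store now has its own value
echo $description->get(3), PHP_EOL; // the German store still falls back to the default
```

The point of the model is the read path: a write to scope 0 is visible everywhere no store-level value exists, which is why the bug stays invisible until someone opens a non-English storefront.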

The right way:

// Load the product in the target store scope
$product = $this->productRepository->get($sku, false, $storeId);

// Save with explicit store scope via Magento\Catalog\Model\Product\Action,
// assumed here to be injected as $this->productAction
$this->productAction->updateAttributes(
    [(int) $product->getId()],
    ['description' => $generated],
    $storeId
);

The difference is one parameter. The architectural cost is iterating stores around your generation loop. The data cost of skipping it is silently corrupting your multi-language catalog.

The framework around the fix

The store-scope fix is the boring part. The interesting part is everything you need around it.

┌─────────────────────────────────────────────────────┐
│       Angeo Multi-Store AI Content Framework        │
├──────────────┬──────────────────┬───────────────────┤
│  SKU Source  │ Store Iteration  │   AI Provider     │
│  ──────────  │  ──────────────  │  ──────────────── │
│  Catalog     │  Store 1 (EN)    │  OpenAI           │
│  G.Sheets    │  Store 2 (NL)    │  Claude           │
│  CLI --sku   │  Store 3 (DE)    │  Gemini           │
│              │  Store 4 (FR)    │  Groq (free)      │
├──────────────┴──────────────────┴───────────────────┤
│               Content Pipeline                      │
│  load(sku, storeId) → prompt → generate → save      │
├─────────────────────────────────────────────────────┤
│                    Output                           │
│  Magento DB · Local CSV · Google Sheets API v4      │
└─────────────────────────────────────────────────────┘

Four layers:

Provider Layer — uniform interface across OpenAI, Claude, Gemini, Groq
Store Iteration Layer — resolves all active store views before processing any SKUs
Content Pipeline — for each store × SKU: load in scope → build prompt → generate → save in scope
I/O Layer — reads SKUs from catalog, Google Sheet, or CLI; writes to Magento DB, CSV, and Google Sheets
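The store × SKU loop from the Content Pipeline layer can be sketched roughly as follows. This is not the module's code: runPipeline and its callables are hypothetical stand-ins for the real services, and the prompts are placeholders.

```php
<?php
declare(strict_types=1);

/**
 * Runs load → prompt → generate → save for every store × SKU pair.
 * The callables are hypothetical stand-ins for the module's real services.
 *
 * @param array<int, string> $stores locale code keyed by store ID
 * @param string[]           $skus
 */
function runPipeline(
    array $stores,
    array $skus,
    callable $load,      // fn(string $sku, int $storeId): array
    callable $generate,  // fn(string $system, string $user): string
    callable $save,      // fn(string $sku, int $storeId, string $text): void
    bool $dryRun = false
): int {
    $generated = 0;
    foreach ($stores as $storeId => $locale) {
        foreach ($skus as $sku) {
            // Load in the target store scope so the prompt sees localized data.
            $product = $load($sku, $storeId);
            $system = "Write an HTML product description in locale {$locale}.";
            $user = "Product name: {$product['name']}";
            $text = $generate($system, $user);
            if (!$dryRun) {
                // Save back into the same store scope, never the default.
                $save($sku, $storeId, $text);
            }
            $generated++;
        }
    }
    return $generated;
}

// Usage with in-memory fakes instead of Magento services.
$saved = [];
$count = runPipeline(
    [1 => 'en_US', 2 => 'nl_NL'],
    ['RING-001', 'RING-002'],
    fn (string $sku, int $storeId): array => ['name' => "Product {$sku}"],
    fn (string $system, string $user): string => "<p>{$user}</p>",
    function (string $sku, int $storeId, string $text) use (&$saved): void {
        $saved["{$storeId}:{$sku}"] = $text;
    }
);
echo $count, ' descriptions, ', count($saved), ' saved', PHP_EOL;
```

Passing the store ID through both load and save is the whole design: the loop never has a code path that can reach the default scope.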

The provider abstraction is the part most people will copy. One interface:

interface AiProviderInterface
{
    public function generate(string $system, string $user): string;
}

Wired in di.xml:

<type name="Angeo\AiDescriptionUpdater\Service\AiProviderService">
  <arguments>
    <argument name="providers" xsi:type="array">
      <item name="openai" xsi:type="object">...OpenAiProvider</item>
      <item name="claude" xsi:type="object">...ClaudeProvider</item>
      <item name="gemini" xsi:type="object">...GeminiProvider</item>
      <item name="groq"   xsi:type="object">...GroqProvider</item>
    </argument>
  </arguments>
</type>

Adding a fifth provider is one class + one line. The pipeline, store iteration, and I/O don't change. This is the pattern, not just the code.
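As an illustration of the "one class" half, here is a minimal sketch. EchoProvider is a made-up provider that returns canned HTML instead of calling an API, and the internals of AiProviderService (constructor shape, the get() method) are assumptions; only the interface and the di.xml providers array come from the module.

```php
<?php
declare(strict_types=1);

interface AiProviderInterface
{
    public function generate(string $system, string $user): string;
}

// Hypothetical fifth provider. A real one would call its HTTP API here.
final class EchoProvider implements AiProviderInterface
{
    public function generate(string $system, string $user): string
    {
        return "<p>{$user}</p>";
    }
}

// Sketch of a registry that receives the di.xml "providers" array.
// The method name get() is an assumption, not the module's confirmed API.
final class AiProviderService
{
    /** @param array<string, AiProviderInterface> $providers */
    public function __construct(private array $providers)
    {
    }

    public function get(string $code): AiProviderInterface
    {
        if (!isset($this->providers[$code])) {
            throw new InvalidArgumentException("Unknown AI provider: {$code}");
        }
        return $this->providers[$code];
    }
}

$service = new AiProviderService(['echo' => new EchoProvider()]);
echo $service->get('echo')->generate('You write product copy.', 'Gold necklace'), PHP_EOL;
```

The "one line" half would then be one more item in the providers array, keyed by whatever code the admin configuration stores.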

The four-provider benchmark

200 product descriptions from a real Dutch jewellery store, same system prompt, same product names, all four providers.

Speed (avg per description)

Provider	Model	Avg. time
Groq	llama-3.3-70b-versatile	0.8s
Groq	mixtral-8x7b-32768	0.6s
Google	gemini-2.0-flash	1.2s
Anthropic	claude-haiku-4-5	1.1s
OpenAI	gpt-4.1-mini	1.4s
OpenAI	gpt-4.1	2.1s
Anthropic	claude-sonnet-4-6	2.8s

For 32,000 generations (8,000 SKUs × 4 store views): Groq ≈ 7 hours, GPT-4.1 ≈ 19 hours.
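Those totals are straight multiplication, sketched below with a hypothetical batchHours helper. It assumes strictly sequential calls at the measured average, with no retries, rate-limit pauses, or Magento save overhead.

```php
<?php
declare(strict_types=1);

// Sequential batch duration in hours, from the average seconds per description.
function batchHours(int $skus, int $storeViews, float $avgSeconds): float
{
    return round($skus * $storeViews * $avgSeconds / 3600, 1);
}

echo batchHours(8000, 4, 0.8), PHP_EOL; // Groq llama-3.3-70b-versatile: 7.1
echo batchHours(8000, 4, 2.1), PHP_EOL; // OpenAI gpt-4.1: 18.7
```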

Cost (per 1,000 descriptions, ~200 words each)

Provider	Model	Cost / 1k
Groq	llama-3.3-70b-versatile	$0.00 (free tier)
Google	gemini-2.0-flash	~$0.08
OpenAI	gpt-4.1-mini	~$0.24
Anthropic	claude-haiku-4-5	~$0.32
OpenAI	gpt-4.1	~$1.80
Anthropic	claude-sonnet-4-6	~$2.40
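Scaled to the 32,000-description batch, the per-1,000 figures above work out as in this sketch. The rates are copied from the table and are approximate; batchCost is a hypothetical helper, and real bills depend on token counts rather than description counts.

```php
<?php
declare(strict_types=1);

// Approximate cost per 1,000 descriptions in USD, copied from the table above.
$costPerThousand = [
    'llama-3.3-70b-versatile' => 0.00,
    'gemini-2.0-flash' => 0.08,
    'gpt-4.1-mini' => 0.24,
    'claude-haiku-4-5' => 0.32,
    'gpt-4.1' => 1.80,
    'claude-sonnet-4-6' => 2.40,
];

// Total cost of a batch at a given per-1,000 rate.
function batchCost(int $descriptions, float $ratePerThousand): float
{
    return round($descriptions / 1000 * $ratePerThousand, 2);
}

$descriptions = 8000 * 4; // 8,000 SKUs × 4 store views
foreach ($costPerThousand as $model => $rate) {
    printf("%-26s \$%.2f\n", $model, batchCost($descriptions, $rate));
}
```

At these rates the full batch costs about $7.68 on gpt-4.1-mini versus $57.60 on gpt-4.1.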

Quality (manual review of 200 samples)

Criteria	Groq Llama 3.3	GPT-4.1-mini	GPT-4.1	Claude Sonnet
Factual accuracy	★★★★☆	★★★★☆	★★★★★	★★★★★
Language fluency	★★★★☆	★★★★☆	★★★★★	★★★★★
SEO keyword use	★★★☆☆	★★★★☆	★★★★☆	★★★★☆
HTML formatting	★★★★☆	★★★★☆	★★★★★	★★★★★

Practical recommendation: Validate prompts with Groq first. It's free, fast, and good enough to test the workflow. Run production on GPT-4.1-mini if SEO keyword density matters. Use GPT-4.1 or Claude Sonnet for flagship products if copy quality directly affects conversion.

Why Groq matters

Groq's free tier is 14,400 requests/day. No credit card. Llama 3.3 70B output quality is genuinely good: one star behind GPT-4.1 on every criterion in our review, and behind GPT-4.1-mini only on SEO keyword use.

For 8,000 SKUs × 4 stores = 32,000 calls, you're at ~2.2 days on the free tier (32,000 ÷ 14,400). For most stores under 5,000 SKUs, this is free production-grade AI content generation if you're patient.
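The free-tier arithmetic can be checked with a short sketch. freeTierDays is a hypothetical helper, and it assumes the 14,400 requests/day cap is the only limit that applies and that every call succeeds on the first attempt.

```php
<?php
declare(strict_types=1);

// Days needed to push a batch through a daily request cap.
// Returns the fractional figure and the number of calendar days it spans.
function freeTierDays(int $skus, int $storeViews, int $dailyCap = 14400): array
{
    $calls = $skus * $storeViews;
    return [
        'calls' => $calls,
        'days' => round($calls / $dailyCap, 2),
        'calendarDays' => (int) ceil($calls / $dailyCap),
    ];
}

print_r(freeTierDays(8000, 4)); // the 8,000-SKU catalog from this post
print_r(freeTierDays(5000, 4)); // the ~5,000-SKU threshold
```

In practice a cron job that stops at the cap and resumes the next day spans the calendarDays figure, not the fractional one.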

No other Magento module supports Groq at time of writing. This was the killer feature for our client.

Installation

composer require angeo/module-ai-description-updater
bin/magento setup:upgrade
bin/magento setup:di:compile
bin/magento cache:flush

First run (Groq, ~5 minutes from zero):

# 1. Get a free API key at console.groq.com
# 2. In admin: Stores → Configuration → Angeo → AI Description Updater
#    AI Provider = Groq (Free), paste key, Dry Run = Yes
# 3. Test on one SKU
bin/magento angeo:ai-description:run --sku=MY-SKU-001 --dry-run

# 4. If output looks good, disable dry-run and run the batch
bin/magento angeo:ai-description:run

Other useful flags:

bin/magento angeo:ai-description:run --sku=ABC-123     # single SKU, all stores
bin/magento angeo:ai-description:run --store=2         # single store view
bin/magento angeo:ai-description:run --dry-run         # generate, don't save

CLI-first by design. The 8,000-SKU client doesn't want a UI; they want cron.

Why it's open source

The honest answer: distribution. We sell AEO audits and full-stack Magento services. The modules are how stores find us. The code is free; the expertise applied to a specific store is not.

It's a familiar model — Vercel does it with Next.js, Sentry does it with sentry-php. The artefact is open; the operator is the product.

Key takeaways for Magento devs

Default-scope writes are silent multi-store corruption. Audit any AI/content module before installing on a multi-store catalog. Look for $storeId parameters in productRepository->get() and save calls.
Provider abstraction is one interface plus a di.xml array. Don't hardcode OpenAI. Future you (or future LLM pricing) will thank you.
Groq is the current sweet spot for free Magento AI generation. Llama 3.3 70B output is production-acceptable for most stores; the 14,400/day cap is workable for catalogs under ~5,000 SKUs.
GPT-4.1-mini is the best paid-tier value. Comparable to GPT-4.1 at ~13% of the cost ($0.24 vs $1.80 per 1,000 descriptions).
CLI + cron beats admin UI for catalogs above ~500 SKUs. Click-by-click generation isn't a workflow.

DEV Community