<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ronny Nyabuto</title>
    <description>The latest articles on DEV Community by Ronny Nyabuto (@ronnyabuto).</description>
    <link>https://dev.to/ronnyabuto</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3674535%2F6d5fa311-537e-457f-81d3-4c85b549cf24.jpg</url>
      <title>DEV Community: Ronny Nyabuto</title>
      <link>https://dev.to/ronnyabuto</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ronnyabuto"/>
    <language>en</language>
    <item>
      <title>Safaricom's sandbox STK Query API returns FAILED for successful payments. Here's what's happening.</title>
      <dc:creator>Ronny Nyabuto</dc:creator>
      <pubDate>Mon, 30 Mar 2026 16:17:02 +0000</pubDate>
      <link>https://dev.to/ronnyabuto/safaricoms-sandbox-stk-query-api-returns-failed-for-successful-payments-heres-whats-happening-dio</link>
      <guid>https://dev.to/ronnyabuto/safaricoms-sandbox-stk-query-api-returns-failed-for-successful-payments-heres-whats-happening-dio</guid>
      <description>&lt;p&gt;Running reconciliation against the Daraja sandbox last week, I got this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"checked"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"matched"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"skipped"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"mismatches"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"checkoutRequestId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"ws_CO_26032026133641276708729173"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"storedStatus"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"PENDING"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"mpesaStatus"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"FAILED"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"checkoutRequestId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"ws_CO_26032026111016899708729173"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"storedStatus"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"SUCCESS"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"mpesaStatus"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"FAILED"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"checkoutRequestId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"ws_CO_26032026113146397708729173"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"storedStatus"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"SUCCESS"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"mpesaStatus"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"FAILED"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The last two entries are the problem. Both have confirmed M-Pesa receipts in the database — &lt;code&gt;UCQ5UAQ403&lt;/code&gt; and &lt;code&gt;UCQ5UAPYRY&lt;/code&gt; — with confirmed deductions on the test account. The STK callback delivered &lt;code&gt;ResultCode: 0&lt;/code&gt; for both. Money moved. Safaricom's own callback said so.&lt;/p&gt;

&lt;p&gt;The STK Query API disagrees. It says both payments failed.&lt;/p&gt;

&lt;p&gt;I searched Stack Overflow, the Safaricom GitHub repos, and every community integration I could find. I found no prior documentation of this: not a single issue or comment. It appears to be unreported.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;What's actually happening&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Safaricom's sandbox doesn't fully simulate the USSD network layer. This is documented behavior, and it's why Pesa Playground exists. The sandbox can't reliably generate failure states. What's less documented is the inverse: the sandbox STK Query endpoint apparently cannot reliably confirm success states either. It appears to default to FAILED when it can't definitively resolve a transaction, regardless of what the callback already told you.&lt;/p&gt;

&lt;p&gt;The sandbox callback and the sandbox STK Query are not reading from the same source of truth.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;How &lt;a href="https://www.npmjs.com/package/mpesa-stk" rel="noopener noreferrer"&gt;mpesa-stk@0.1.1&lt;/a&gt; handled it&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The library refused to act on the contradiction. &lt;code&gt;matched:0&lt;/code&gt; — it checked the payments, found that the STK Query response conflicted with an authoritative stored SUCCESS, and did not overwrite. The PENDING record from the orphaned payment stayed PENDING rather than being incorrectly resolved to FAILED.&lt;/p&gt;

&lt;p&gt;That is the correct behavior. A reconciliation system that overwrites &lt;code&gt;SUCCESS&lt;/code&gt; with a contradictory query response would be worse than one that does nothing.&lt;/p&gt;
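
&lt;p&gt;A minimal TypeScript sketch of that compare-without-overwriting pass, under stated assumptions: the field names mirror the report at the top of this post, but &lt;code&gt;buildReport&lt;/code&gt; and its types are hypothetical illustrations, not the library's actual API or internals.&lt;/p&gt;

```typescript
// Hypothetical types: the field names mirror the report shown earlier,
// but this is an illustration, not the mpesa-stk implementation.
type LedgerStatus = "PENDING" | "SUCCESS" | "FAILED";

interface ComparedPayment {
  checkoutRequestId: string;
  storedStatus: LedgerStatus; // what the database holds
  mpesaStatus: LedgerStatus; // what the STK Query returned
}

interface ReconciliationReport {
  checked: number;
  matched: number;
  mismatches: ComparedPayment[];
}

// Compare stored and queried statuses. Nothing is written back:
// disagreements are only collected so a later step can decide what to do.
function buildReport(payments: ComparedPayment[]): ReconciliationReport {
  const mismatches = payments.filter((p) => p.storedStatus !== p.mpesaStatus);
  return {
    checked: payments.length,
    matched: payments.length - mismatches.length,
    mismatches,
  };
}
```

&lt;p&gt;Keeping the comparison separate from any write means a contradictory query response can only ever show up in the report, never in the stored record.&lt;/p&gt;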




&lt;p&gt;&lt;strong&gt;What this means for your reconciliation implementation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Two things need to be true in how you handle STK Query responses:&lt;/p&gt;

&lt;p&gt;Never overwrite a terminal &lt;code&gt;SUCCESS&lt;/code&gt; or confirmed &lt;code&gt;FAILED&lt;/code&gt; record based on a query response alone. The callback is the authoritative source. The query is a fallback for records that never received a callback — &lt;code&gt;PENDING&lt;/code&gt; only.&lt;/p&gt;

&lt;p&gt;Don't trust sandbox reconciliation results. The sandbox STK Query is not a reliable test surface for this code path. Test your reconciliation logic against a production environment, or accept that sandbox results for this specific path are noise.&lt;/p&gt;
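
&lt;p&gt;Both rules fit in one small guard. This is a sketch, not a reference implementation: &lt;code&gt;nextStatus&lt;/code&gt; and the &lt;code&gt;queryIsTrusted&lt;/code&gt; flag are names I made up for illustration, and the flag is assumed to be false whenever the query came from the sandbox.&lt;/p&gt;

```typescript
type PaymentStatus = "PENDING" | "SUCCESS" | "FAILED";

// Decide what a stored record should become after an STK Query response.
// Assumption: the callback is authoritative, so only PENDING may change,
// and only when the query source is trusted (for example, not the sandbox).
function nextStatus(
  stored: PaymentStatus,
  queried: PaymentStatus,
  queryIsTrusted: boolean
): PaymentStatus {
  if (stored !== "PENDING") {
    // Terminal state written by the callback: ignore the query result.
    return stored;
  }
  if (!queryIsTrusted) {
    // Untrusted query result: leave the record PENDING for a later pass.
    return stored;
  }
  return queried;
}
```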




&lt;p&gt;&lt;strong&gt;The production question&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I haven't run this against a live production environment. Safaricom's documentation implies the production STK Query returns accurate results — the sandbox is the broken environment, not production. If you've tested reconciliation in production and can confirm the query API behaves correctly there, I'd like to know. Leave a comment or find me on the Daraja Discord.&lt;/p&gt;

&lt;p&gt;The finding stands regardless: if you're building reconciliation, your implementation needs to handle contradictory query responses. The sandbox will generate them. Production might too, in edge cases nobody has documented yet.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Tested on 2026-03-26, Daraja sandbox, &lt;a href="https://www.npmjs.com/package/mpesa-stk" rel="noopener noreferrer"&gt;mpesa-stk@0.1.1&lt;/a&gt;. Full test log in the &lt;a href="https://github.com/ronnyabuto/flutter-daraja-raw" rel="noopener noreferrer"&gt;flutter-daraja-raw repo&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;




</description>
      <category>mpesa</category>
      <category>node</category>
      <category>javascript</category>
      <category>typescript</category>
    </item>
    <item>
      <title>I measured M-Pesa STK Push polling lag on a real device. The variance will ruin your UX.</title>
      <dc:creator>Ronny Nyabuto</dc:creator>
      <pubDate>Thu, 26 Mar 2026 11:49:01 +0000</pubDate>
      <link>https://dev.to/ronnyabuto/i-measured-m-pesa-stk-push-polling-lag-on-a-real-device-the-variance-will-ruin-your-ux-38j1</link>
      <guid>https://dev.to/ronnyabuto/i-measured-m-pesa-stk-push-polling-lag-on-a-real-device-the-variance-will-ruin-your-ux-38j1</guid>
      <description>&lt;p&gt;Same code. Same device. Same network. Same shortcode.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Test 1: 39 seconds from PIN entry to UI update.&lt;br&gt;
Test 2: 3 seconds.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;13x variance. Not a bug. Not a fluke. Just the math of a fixed polling schedule colliding with a non-deterministic callback.&lt;/p&gt;

&lt;p&gt;When you fire an STK Push, Safaricom returns a &lt;code&gt;CheckoutRequestID&lt;/code&gt; and &lt;code&gt;ResponseCode: 0&lt;/code&gt; almost immediately. Most developers celebrate this. It only means Safaricom received your request. The customer hasn't seen a prompt yet.&lt;/p&gt;

&lt;p&gt;The actual payment outcome arrives later — via a POST to your &lt;code&gt;CallBackURL&lt;/code&gt;. That callback takes 5 seconds or it takes 45. Safaricom doesn't tell you when it's coming. And if your server isn't reachable when it arrives, Safaricom does not retry. The delivery attempt is fire-and-forget.&lt;/p&gt;
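
&lt;p&gt;For reference, here is roughly what handling that POST looks like in TypeScript. Treat it as a sketch: the payload field names are assumed to follow Daraja's documented STK callback body, while &lt;code&gt;parseCallback&lt;/code&gt; and &lt;code&gt;CallbackOutcome&lt;/code&gt; are hypothetical names of my own.&lt;/p&gt;

```typescript
// Assumed shape of the STK callback body delivered to CallBackURL.
interface MetadataItem {
  Name: string;
  Value?: string | number;
}

interface StkCallbackBody {
  Body: {
    stkCallback: {
      MerchantRequestID: string;
      CheckoutRequestID: string;
      ResultCode: number;
      ResultDesc: string;
      CallbackMetadata?: { Item: MetadataItem[] }; // absent on failures
    };
  };
}

// Hypothetical result type for illustration.
interface CallbackOutcome {
  checkoutRequestId: string;
  status: "SUCCESS" | "FAILED";
  receipt: string | null;
}

// Reduce the callback to the three facts the database needs.
function parseCallback(payload: StkCallbackBody): CallbackOutcome {
  const cb = payload.Body.stkCallback;
  const items = cb.CallbackMetadata ? cb.CallbackMetadata.Item : [];
  const first = items.filter((i) => i.Name === "MpesaReceiptNumber")[0];
  const value = first ? first.Value : undefined;
  return {
    checkoutRequestId: cb.CheckoutRequestID,
    status: cb.ResultCode === 0 ? "SUCCESS" : "FAILED", // 0 means paid
    receipt: value === undefined ? null : String(value),
  };
}
```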

&lt;p&gt;So the typical Flutter developer does what makes sense: they poll. Every 10 or 30 seconds, ask the server if anything happened. This works until it doesn't.&lt;/p&gt;



&lt;p&gt;My polling schedule fired at T+10s, T+30s, and T+70s. In Test 1, the callback landed at T+45s — squarely between the T+30 and T+70 windows. The next poll was 25 seconds away. Safaricom completed the payment in 14 seconds. The user waited 39.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Test 1:
  PIN entered:        11:10:48
  Callback processed: 11:11:02  (14s — Safaricom's side)
  UI updated:         11:11:27  (39s — polling lag)

  Polls: T+10 → PENDING, T+30 → PENDING, T+70 → SUCCESS

Test 2:
  PIN entered:        11:31:55
  Callback processed: 11:31:59
  UI updated:         11:31:58  (3s)

  T+10 poll and callback arrived within 1 second of each other.
  Lucky timing. Not better code.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The same polling schedule. The only variable was when Safaricom's callback landed relative to the poll windows.&lt;/p&gt;
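
&lt;p&gt;The arithmetic behind Test 1 can be sketched in a few lines of TypeScript. &lt;code&gt;pollingLag&lt;/code&gt; is a hypothetical helper written for this post, and all offsets are seconds on the same T+ clock as the polling schedule.&lt;/p&gt;

```typescript
// Poll offsets in seconds, matching the schedule used in these tests.
const POLL_SCHEDULE = [10, 30, 70];

// Seconds between the callback being processed and the first poll that can
// observe it. Returns null when no poll remains after the callback.
function pollingLag(callbackAt: number, schedule: number[]): number | null {
  const later = schedule.filter((t) => t >= callbackAt);
  if (later.length === 0) {
    return null;
  }
  return Math.min(...later) - callbackAt;
}
```

&lt;p&gt;With the Test 1 numbers, &lt;code&gt;pollingLag(45, POLL_SCHEDULE)&lt;/code&gt; gives 25: the gap between the callback at T+45 and the poll at T+70.&lt;/p&gt;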




&lt;p&gt;There is one optimisation that actually moves the number.&lt;/p&gt;

&lt;p&gt;The real-world flow for most users: tap "Pay," get the USSD prompt, press home, open M-Pesa to confirm the request or check their balance, enter PIN, return to your app. The app was backgrounded the entire time. The callback arrived and was processed server-side while the user was in a different app. Without &lt;code&gt;WidgetsBindingObserver&lt;/code&gt;, they come back to a spinner and wait for the next scheduled poll.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
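&lt;pre class="highlight dart"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Assumes the enclosing State mixes in WidgetsBindingObserver and registers&lt;/span&gt;
&lt;span class="c1"&gt;// itself with WidgetsBinding.instance.addObserver(this) in initState().&lt;/span&gt;
&lt;span class="nd"&gt;@override&lt;/span&gt;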
&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;didChangeAppLifecycleState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AppLifecycleState&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;AppLifecycleState&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;resumed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ref&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;paymentProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;notifier&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;checkStatusOnResume&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The moment they return to your app, you poll immediately. My Test 7 result: 1–2 seconds from return to PaymentSuccess.&lt;/p&gt;

&lt;p&gt;That is not a polling win. That is knowing when to trigger the poll. Most Flutter M-Pesa implementations do not have this. The USSD flow almost guarantees the user will background the app. The one scenario you should optimize for is the one most developers leave unhandled.&lt;/p&gt;




&lt;p&gt;The failure mode nobody documents is worse.&lt;/p&gt;

&lt;p&gt;Test 3: I killed the ngrok tunnel after the STK Push was sent but before the customer entered their PIN. Customer paid. Balance reduced. Server never received the callback. Safaricom made one delivery attempt, got no response, and moved on.&lt;/p&gt;

&lt;p&gt;DB state after 90 seconds:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;               &lt;span class="s"&gt;PENDING&lt;/span&gt;
&lt;span class="na"&gt;result_code&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;          &lt;span class="kc"&gt;null&lt;/span&gt;
&lt;span class="na"&gt;failure_reason&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;       &lt;span class="kc"&gt;null&lt;/span&gt;
&lt;span class="na"&gt;mpesa_receipt_number&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The app timed out and displayed: &lt;em&gt;"Status unknown. We did not receive a confirmation within the expected window."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That copy is deliberate. Telling a user their payment failed when money has already left their account is not a UX problem. It is a trust problem. The distinction matters more than most developers realize until a customer calls.&lt;/p&gt;
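
&lt;p&gt;One way to keep that distinction honest in code is to give the timed-out case its own state instead of folding it into failure. A TypeScript sketch with hypothetical names: only the "Status unknown" string comes from my app, and the other two messages are placeholders.&lt;/p&gt;

```typescript
// Hypothetical client-side states. A timed-out PENDING is its own case,
// never an alias for FAILED.
type ClientState = "SUCCESS" | "FAILED" | "PENDING_TIMED_OUT";

// Map each state to user-facing copy. The SUCCESS and FAILED strings
// are placeholders for illustration.
function statusMessage(state: ClientState): string {
  switch (state) {
    case "SUCCESS":
      return "Payment received.";
    case "FAILED":
      return "Payment was not completed.";
    case "PENDING_TIMED_OUT":
      return "Status unknown. We did not receive a confirmation within the expected window.";
  }
}
```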

&lt;p&gt;This is not a contrived scenario. It happens when your server restarts, when your laptop sleeps during a demo, when a deployment takes thirty seconds at the wrong moment. Safaricom does not retry. The only recovery is reconciliation — query the STK Push Query endpoint on a schedule and resolve orphaned PENDING records.&lt;/p&gt;

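&lt;p&gt;One caveat: Safaricom's sandbox STK Query API returned FAILED for confirmed SUCCESS payments during testing. That looks like a sandbox limitation. Safaricom's documentation implies production returns accurate results, but I have not verified that against a live environment.&lt;/p&gt;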
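
&lt;p&gt;The first half of that reconciliation pass, choosing which records to query, can be sketched in TypeScript as follows. &lt;code&gt;StoredPayment&lt;/code&gt;, &lt;code&gt;findOrphans&lt;/code&gt; and the grace period are assumptions for illustration, not part of any published package.&lt;/p&gt;

```typescript
// Hypothetical row shape for a stored payment.
interface StoredPayment {
  checkoutRequestId: string;
  status: "PENDING" | "SUCCESS" | "FAILED";
  createdAtMs: number; // when the STK Push was sent, epoch milliseconds
}

// Pick records still PENDING after a grace period. These never received a
// callback and are the only ones worth sending to the STK Query endpoint.
function findOrphans(
  records: StoredPayment[],
  nowMs: number,
  graceMs: number
): StoredPayment[] {
  return records.filter((r) =>
    r.status === "PENDING" ? nowMs - r.createdAtMs >= graceMs : false
  );
}
```

&lt;p&gt;A grace period of 90 seconds would match the window I waited in Test 3, but the right value depends on how long your callbacks take to arrive.&lt;/p&gt;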




&lt;p&gt;The baseline from this session:&lt;/p&gt;

&lt;p&gt;Polling lag: 3–39 seconds, non-deterministic.&lt;br&gt;
Callback delivery: 100% when the server is reachable. 0% when it isn't.&lt;br&gt;
Lifecycle optimisation: 1–2 seconds on resume, which covers the most common real-world flow.&lt;/p&gt;

&lt;p&gt;Every Flutter developer building on M-Pesa either lives with these numbers, reinvents the solution from scratch, or doesn't know the problem exists until a production incident surfaces it.&lt;/p&gt;

&lt;p&gt;No maintained Flutter package handles the full lifecycle — callback receipt, persistence, polling fallback, lifecycle recovery — without requiring a separately managed backend. That is the gap.&lt;/p&gt;

&lt;p&gt;The next post will show what happens when you replace the polling cascade with Appwrite Realtime. The numbers are not subtle.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Tested on Google Pixel 9, Android 15. Daraja sandbox, Flutter 3.41. All timings are from real device logs. Test harness: &lt;a href="https://github.com/ronnyabuto/flutter-daraja-raw" rel="noopener noreferrer"&gt;flutter-daraja-raw&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>flutter</category>
      <category>dart</category>
      <category>mpesa</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
