<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Joel Mendoza</title>
    <description>The latest articles on DEV Community by Joel Mendoza (@joel_mendoza_8a2623998b93).</description>
    <link>https://dev.to/joel_mendoza_8a2623998b93</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3981946%2F70143611-00c6-4c77-9611-4b21a31946ba.png</url>
      <title>DEV Community: Joel Mendoza</title>
      <link>https://dev.to/joel_mendoza_8a2623998b93</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/joel_mendoza_8a2623998b93"/>
    <language>en</language>
    <item>
      <title>What 506K real transactions taught me about how Latin Americans actually save</title>
      <dc:creator>Joel Mendoza</dc:creator>
      <pubDate>Wed, 17 Jun 2026 16:24:12 +0000</pubDate>
      <link>https://dev.to/joel_mendoza_8a2623998b93/what-506k-real-transactions-taught-me-about-how-latin-americans-actually-save-4ke1</link>
      <guid>https://dev.to/joel_mendoza_8a2623998b93/what-506k-real-transactions-taught-me-about-how-latin-americans-actually-save-4ke1</guid>
      <description>&lt;p&gt;For nine years I ran Coinch, a goal-based savings app used by 30,000+ people across Mexico, Colombia, Argentina, Peru and Chile. When it wound down in 2024, it left behind 506,311 records of real saving behavior: 305,808 transactions, 108,570 savings goals, 91,933 users.&lt;/p&gt;

&lt;p&gt;Most of what I found contradicts how savings products are designed. Here are the patterns, with numbers.&lt;/p&gt;

&lt;h2&gt;1. Naming your dream makes you more likely to fail at it&lt;/h2&gt;

&lt;p&gt;This is the finding that still bothers me. We grouped goals by category and measured completion:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Goal type&lt;/th&gt;
&lt;th&gt;Completion rate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Free saving (no specific target)&lt;/td&gt;
&lt;td&gt;7.8%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Travel / vacation&lt;/td&gt;
&lt;td&gt;~4.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Car / vehicle&lt;/td&gt;
&lt;td&gt;~4.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Aspirational, labeled goals (the car, the trip to Cancún) were completed &lt;em&gt;less&lt;/em&gt; often than unlabeled "just saving" goals: roughly 4.5–4.7% versus 7.8%. The standard product playbook says the opposite: make the user visualize the dream, attach a photo, name the goal. Our data says the dream-naming ritual correlates with worse outcomes. One possible explanation is that aspirational goals get set with unrealistic amounts and deadlines (the median goal horizon was just 120 days), while "free saving" grows quietly without a deadline to fail against.&lt;/p&gt;
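
&lt;p&gt;For readers who want to run the same comparison on their own goal table, here is a minimal sketch of a per-category completion rate. The field names &lt;code&gt;category&lt;/code&gt; and &lt;code&gt;status&lt;/code&gt;, the status value &lt;code&gt;achieved&lt;/code&gt;, and the toy records are assumptions for illustration, not the real schema or data.&lt;/p&gt;

```python
from collections import Counter


def completion_rate_by_category(goals):
    """Return {category: share of that category's goals with status 'achieved'}."""
    totals = Counter()
    achieved = Counter()
    for goal in goals:
        totals[goal["category"]] += 1
        if goal["status"] == "achieved":
            achieved[goal["category"]] += 1
    return {category: achieved[category] / totals[category] for category in totals}


# Toy records for illustration only (hypothetical schema, not the real dataset).
sample_goals = [
    {"category": "free_saving", "status": "achieved"},
    {"category": "free_saving", "status": "overdue"},
    {"category": "travel", "status": "overdue"},
    {"category": "travel", "status": "in_progress"},
]
rates = completion_rate_by_category(sample_goals)
```

&lt;p&gt;Counting only &lt;code&gt;achieved&lt;/code&gt; goals as completed treats overdue and in-progress goals the same way, which is one reasonable reading of the table above but not the only one.&lt;/p&gt;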

&lt;h2&gt;2. Saving apps are graveyards of good intentions, and that's the honest baseline&lt;/h2&gt;

&lt;p&gt;Across all 108,570 goals: &lt;strong&gt;73.8% ended overdue, 19.2% in progress, only 7.0% achieved.&lt;/strong&gt; Among users, 93.7% never completed a single goal.&lt;/p&gt;

&lt;p&gt;If you're building or evaluating a savings product, that's the base rate you're fighting. Any pitch deck claiming 40% goal completion is either measuring something else or selecting heavily. The honest question for product design isn't "how do we get everyone to finish" — it's "what distinguishes the 6.3% who do."&lt;/p&gt;

&lt;h2&gt;3. The answer: discipline beats everything else, and it's measurable&lt;/h2&gt;

&lt;p&gt;We computed a savings discipline score (regularity and consistency of deposits) per user. Its correlation with goal completion: &lt;strong&gt;ρ = 0.89&lt;/strong&gt;. Nothing else came close. Not goal size, not income proxies, not demographics.&lt;/p&gt;

&lt;p&gt;The practical implication is uncomfortable for feature roadmaps: the predictive signal isn't in &lt;em&gt;what&lt;/em&gt; people save for, it's in &lt;em&gt;how regularly they show up&lt;/em&gt;. A user depositing $2 every Friday is a fundamentally different (and more promising) customer than one who deposited $200 once.&lt;/p&gt;
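
&lt;p&gt;The post does not spell out how the discipline score is computed or which correlation coefficient ρ refers to. As a rough sketch only, the snippet below assumes a hypothetical score based on how evenly spaced a user's deposits are, and assumes ρ is a Spearman rank correlation. The names &lt;code&gt;discipline_score&lt;/code&gt; and &lt;code&gt;spearman&lt;/code&gt;, the 1 / (1 + CV) formula, and the toy inputs are illustrative choices, not the method behind the 0.89 figure.&lt;/p&gt;

```python
from statistics import mean, pstdev


def discipline_score(deposit_days):
    """Hypothetical regularity score: 1.0 for perfectly even gaps, lower when erratic.

    deposit_days: integer day offsets on which one user made a deposit.
    """
    days = sorted(deposit_days)
    gaps = [later - earlier for earlier, later in zip(days, days[1:])]
    if len(gaps) in (0, 1):  # too few deposits to judge regularity
        return 0.0
    mean_gap = mean(gaps)
    if mean_gap == 0:  # every deposit landed on the same day
        return 0.0
    cv = pstdev(gaps) / mean_gap  # coefficient of variation of the gaps
    return 1.0 / (1.0 + cv)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start != len(order):
        end = start
        while end + 1 != len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    spread_x = sum((a - mx) ** 2 for a in rx) ** 0.5
    spread_y = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (spread_x * spread_y)


# Toy users: a weekly saver, two erratic savers, and a one-off depositor.
scores = [
    discipline_score(days)
    for days in ([0, 7, 14, 21, 28], [0, 1, 30, 31, 90], [0, 3, 9, 40], [5])
]
completed_goals = [3, 0, 1, 0]  # made-up completion counts
rho = spearman(scores, completed_goals)
```

&lt;p&gt;Under this definition the weekly depositor scores 1.0 regardless of amount, while a single large deposit scores 0.0, which matches the $2-every-Friday versus $200-once contrast described above.&lt;/p&gt;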

&lt;h2&gt;4. Saving together works, but almost nobody does it&lt;/h2&gt;

&lt;p&gt;Shared goals (saving with friends or family toward one pot) were only 2.1% of all goals. But they outperformed on every axis: &lt;strong&gt;8.9% completion vs 7.0%&lt;/strong&gt; for individual goals (+27% relative), median target of $10,000 vs $6,000, and a savings rate roughly 3× higher.&lt;/p&gt;
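
&lt;p&gt;The "+27% relative" figure is the ratio of the two completion rates minus one, which is easy to confuse with the 1.9 percentage-point absolute gap. A quick check using only the two rates quoted above:&lt;/p&gt;

```python
shared_rate = 0.089      # completion rate for shared goals, from the post
individual_rate = 0.070  # completion rate for individual goals, from the post

# Relative lift compares the rates as a ratio, not as a difference.
relative_lift = shared_rate / individual_rate - 1
absolute_gap = shared_rate - individual_rate

print(f"{relative_lift:+.0%}")  # prints "+27%"
```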

&lt;p&gt;The feature with the best outcomes had the worst adoption. In our case, the social layer was under-built — the data shows the demand signal we didn't fully act on. If I were building a savings product today, shared goals wouldn't be a feature; they'd be the product.&lt;/p&gt;

&lt;h2&gt;5. January is real. December is brutal.&lt;/h2&gt;

&lt;p&gt;Monthly deposit seasonality across nine years: January peaks at 9.7% of annual volume, against the roughly 8.3% an even twelve-month split would give (New Year's resolutions show up in the data, every year, without fail), with a secondary lift in August–October. December bottoms out at 7.1%: holiday spending doesn't just compete with saving, it wins.&lt;/p&gt;
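
&lt;p&gt;A seasonality profile like this can be sketched as each calendar month's share of total deposit volume. The snippet below assumes "volume" means summed deposit amounts pooled across all years (it could equally mean deposit counts or an average of per-year shares), and the function name &lt;code&gt;monthly_share&lt;/code&gt; and the toy deposits are hypothetical.&lt;/p&gt;

```python
from collections import defaultdict
from datetime import date


def monthly_share(deposits):
    """deposits: iterable of (date, amount). Returns {month: share of total volume}."""
    volume = defaultdict(float)
    for day, amount in deposits:
        volume[day.month] += amount  # pool the same calendar month across years
    total = sum(volume.values())
    return {month: volume[month] / total for month in sorted(volume)}


# Share each month would get if deposits had no seasonality at all.
EVEN_SPLIT = 1 / 12

# Toy deposits for illustration only.
toy_deposits = [
    (date(2020, 1, 3), 300.0),
    (date(2021, 1, 10), 100.0),
    (date(2020, 6, 15), 400.0),
    (date(2020, 12, 20), 200.0),
]
shares = monthly_share(toy_deposits)
```

&lt;p&gt;Comparing each month's share with &lt;code&gt;EVEN_SPLIT&lt;/code&gt; gives the size of the seasonal effect in either direction.&lt;/p&gt;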

&lt;p&gt;If you run a financial product in LatAm: your acquisition budget belongs in the first week of January, and your December churn isn't your fault.&lt;/p&gt;

&lt;h2&gt;6. People deposit round numbers, 69.5% of the time&lt;/h2&gt;

&lt;p&gt;Of all deposits, 69.5% landed on exactly round values: 50, 100, 500, 1,000. Money is psychological before it's numerical. (This one has a practical engineering consequence too — synthetic test data with amounts like $147.23 is instantly recognizable as fake. I wrote about the statistics of faking it properly &lt;a href="https://dev.to/joel_mendoza_8a2623998b93/why-your-synthetic-fintech-data-fails-code-review-and-how-mixture-models-fix-it-fm9"&gt;in a separate post&lt;/a&gt;.)&lt;/p&gt;
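
&lt;p&gt;Measuring this on your own data needs a definition of "round", which the post does not give beyond the examples. As a sketch, the snippet below assumes a deposit is round when it is a whole multiple of 50; the &lt;code&gt;step&lt;/code&gt; parameter and both function names are hypothetical, and a different rule would give a different percentage.&lt;/p&gt;

```python
from decimal import Decimal


def is_round(amount, step=50):
    """True when amount is an exact multiple of step (assumed definition of 'round')."""
    # Going through str() avoids binary floating-point noise in values like 147.23.
    return Decimal(str(amount)) % step == 0


def round_share(amounts, step=50):
    """Fraction of deposits that land on a round value."""
    return sum(is_round(amount, step) for amount in amounts) / len(amounts)


# Toy amounts for illustration only.
share = round_share([50, 100, 147.23, 500])
```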

&lt;h2&gt;Where this data lives now&lt;/h2&gt;

&lt;p&gt;The app is gone, but the behavioral patterns are now the calibration source for a &lt;a href="https://apify.com/active_yardstick/latam-synth" rel="noopener noreferrer"&gt;synthetic data generator on Apify&lt;/a&gt;. It produces unlimited fake-but-statistically-faithful LatAm fintech data (users, goals, transactions) for testing, ML training and demos. The output is 100% synthetic with zero PII, and the real dataset stays private.&lt;/p&gt;

&lt;p&gt;If you're building fintech for Latin America and want to pressure-test your assumptions against measured behavior, that's exactly what it's for. And if there's a pattern here you'd like to see explored deeper, tell me in the comments — the data has more stories than one post can hold.&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>fintech</category>
      <category>analytics</category>
      <category>startup</category>
    </item>
    <item>
      <title>Why your synthetic fintech data fails code review (and how mixture models fix it)</title>
      <dc:creator>Joel Mendoza</dc:creator>
      <pubDate>Fri, 12 Jun 2026 22:01:57 +0000</pubDate>
      <link>https://dev.to/joel_mendoza_8a2623998b93/why-your-synthetic-fintech-data-fails-code-review-and-how-mixture-models-fix-it-fm9</link>
      <guid>https://dev.to/joel_mendoza_8a2623998b93/why-your-synthetic-fintech-data-fails-code-review-and-how-mixture-models-fix-it-fm9</guid>
      <description>&lt;p&gt;Every fintech developer has done this: you need test data, you reach for Faker, you generate ten thousand transactions, and your demo works. Then a data scientist on the buying side opens your dataset, runs one &lt;code&gt;df.describe()&lt;/code&gt;, and the deal-killing question arrives: &lt;em&gt;"Why are your transaction amounts uniformly distributed?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Real financial data has a shape. Synthetic data that ignores that shape is instantly recognizable — and in testing, ML training, or sales demos, instantly discrediting. I spent nine years running a savings app in Latin America (30,000+ users, 2015–2024), and when it wound down I kept something most synthetic data generators never had: 506,311 real records to measure that shape against. This post is about the three statistical properties that separate believable synthetic financial data from Faker output, with the actual numbers.&lt;/p&gt;

&lt;h2&gt;Property 1: Amounts are multimodal, not lognormal&lt;/h2&gt;

&lt;p&gt;The standard "sophisticated" approach is to sample amounts from a lognormal distribution. It's better than uniform — and it still fails. When I fitted a single lognormal to 261,070 real deposits, the body of the distribution looked fine (7–10% deviation between p25 and p90), but the tail fell apart: &lt;strong&gt;35–45% deviation at p95–p99&lt;/strong&gt;.&lt;/p&gt;
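
&lt;p&gt;One way to reproduce this kind of check is to fit a single lognormal by taking the mean and standard deviation of the log-amounts, then compare the model's percentiles with the empirical ones. This is a minimal sketch under that assumption; the function name, the percentile list, and the demo data (drawn from a true lognormal, so it should fit well) are illustrative and not taken from the original analysis.&lt;/p&gt;

```python
import math
import random
from statistics import NormalDist, mean, quantiles, stdev


def lognormal_percentile_deviation(amounts, percentiles=(25, 50, 90, 95, 99)):
    """Relative gap between fitted-lognormal and empirical percentiles."""
    logs = [math.log(amount) for amount in amounts]
    fitted = NormalDist(mean(logs), stdev(logs))  # normal fit in log space
    cuts = quantiles(amounts, n=100)  # cuts[p - 1] is the p-th percentile
    deviation = {}
    for p in percentiles:
        model_value = math.exp(fitted.inv_cdf(p / 100))
        empirical_value = cuts[p - 1]
        deviation[p] = abs(model_value - empirical_value) / empirical_value
    return deviation


# Demo on data that really is lognormal, so every deviation should be small.
random.seed(0)
demo_amounts = [random.lognormvariate(5.0, 1.0) for _ in range(20000)]
demo_deviation = lognormal_percentile_deviation(demo_amounts)
```

&lt;p&gt;On data that mixes several populations, the same function would be expected to show small gaps in the body and much larger ones at the upper percentiles, which is the pattern described above.&lt;/p&gt;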

&lt;p&gt;The reason is that "deposit amount" isn't one population. It's at least three: micro-deposits (the $1–$20 spare-change crowd), typical deposits ($100–$800), and large transfers ($6,000+). Each has its own location and spread. A single lognormal averages across them and gets all of them wrong.&lt;/p&gt;
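
&lt;p&gt;To make the "several populations" idea concrete, here is a small sketch that samples from a three-component lognormal mixture using only the standard library. The weights, medians, and spreads in &lt;code&gt;COMPONENTS&lt;/code&gt; are invented placeholders that loosely echo the three groups named above; they are not fitted to the real deposits.&lt;/p&gt;

```python
import math
import random

# Hypothetical (weight, median amount, sigma) per population. NOT fitted values.
COMPONENTS = [
    (0.30, 5.0, 0.6),     # micro-deposits
    (0.60, 300.0, 0.5),   # typical deposits
    (0.10, 8000.0, 0.4),  # large transfers
]


def sample_amounts(n, seed=0):
    """Draw n amounts: pick a population by weight, then sample its lognormal."""
    rng = random.Random(seed)
    weights = [weight for weight, _, _ in COMPONENTS]
    amounts = []
    for _ in range(n):
        _, median, sigma = rng.choices(COMPONENTS, weights=weights)[0]
        # A lognormal with location mu has median exp(mu), so mu = log(median).
        amounts.append(rng.lognormvariate(math.log(median), sigma))
    return amounts


amounts = sample_amounts(1000)
```

&lt;p&gt;A histogram of &lt;code&gt;log(amount)&lt;/code&gt; for this sample shows three separate bumps, which a single lognormal cannot represent.&lt;/p&gt;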

&lt;p&gt;The fix is a &lt;strong&gt;mixture of lognormals&lt;/strong&gt;. Fit &lt;code&gt;GaussianMixture&lt;/code&gt; from scikit-learn on the log-amounts, select the number of components, sample from the mixture. One non-obvious lesson from doing this on real data: &lt;strong&gt;don't select K with BIC&lt;/strong&gt;. Financial amounts have heavy atoms at round values (more on that below), and BIC reacts to those atoms by under-fitting the number of components. Selecting K by minimizing the Kolmogorov–Smirnov statistic against a held-out sample worked far better: a 6-component mixture brought deposits from KS=0.068 down to &lt;strong&gt;KS=0.032&lt;/strong&gt;, and p99 deviation from ~45% to under 5%.&lt;/p&gt;
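
&lt;p&gt;The procedure in this paragraph can be sketched roughly as follows. This is a minimal illustration, not the author's original code: it assumes a 50/50 train and held-out split, candidate K from 1 to 8, and a two-sample KS statistic between mixture samples and the held-out log-amounts. The names &lt;code&gt;ks_two_sample&lt;/code&gt; and &lt;code&gt;select_k_by_ks&lt;/code&gt; and all defaults are hypothetical. It needs NumPy and scikit-learn.&lt;/p&gt;

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def ks_two_sample(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))


def select_k_by_ks(amounts, k_values=range(1, 9), seed=0):
    """Fit mixtures on log-amounts and keep the K with the lowest held-out KS."""
    rng = np.random.default_rng(seed)
    logs = np.log(np.asarray(amounts, dtype=float))
    rng.shuffle(logs)
    split = logs.size // 2  # assumed 50/50 split
    train, held_out = logs[:split], logs[split:]

    results = []
    for k in k_values:
        gmm = GaussianMixture(n_components=k, random_state=seed)
        gmm.fit(train.reshape(-1, 1))
        synthetic, _ = gmm.sample(held_out.size)
        # KS is unchanged by the log transform, so comparing logs is enough.
        results.append((ks_two_sample(synthetic.ravel(), held_out), k, gmm))

    best_ks, best_k, best_model = min(results, key=lambda r: r[0])
    return best_k, best_ks, best_model
```

&lt;p&gt;To generate amounts from the chosen model, draw from it and exponentiate, for example &lt;code&gt;np.exp(best_model.sample(1000)[0].ravel())&lt;/code&gt;. Because the KS value here depends on one random draw from each fitted mixture, averaging over a few draws would give a steadier selection.&lt;/p&gt;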




</description>
      <category>datascience</category>
      <category>fintech</category>
      <category>testing</category>
      <category>python</category>
    </item>
  </channel>
</rss>
