<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Muhammad Arslan</title>
    <description>The latest articles on DEV Community by Muhammad Arslan (@marslanmustafa).</description>
    <link>https://dev.to/marslanmustafa</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1889950%2Fcd2e77bd-aa55-4402-8d1a-678f619c0861.png</url>
      <title>DEV Community: Muhammad Arslan</title>
      <link>https://dev.to/marslanmustafa</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/marslanmustafa"/>
    <language>en</language>
    <item>
      <title>Why Your Profanity Filter Fails Against Unicode (And How to Fix It)</title>
      <dc:creator>Muhammad Arslan</dc:creator>
      <pubDate>Tue, 24 Feb 2026 18:37:46 +0000</pubDate>
      <link>https://dev.to/marslanmustafa/why-your-profanity-filter-fails-against-unicode-and-how-to-fix-it-40fd</link>
      <guid>https://dev.to/marslanmustafa/why-your-profanity-filter-fails-against-unicode-and-how-to-fix-it-40fd</guid>
      <description>&lt;p&gt;&lt;strong&gt;Many profanity filters check only the raw input.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That’s the problem.&lt;/p&gt;

&lt;p&gt;You can block &lt;code&gt;fuck&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;But what about:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;fu\u0441k&lt;/code&gt; (Cyrillic “с” instead of Latin “c”)&lt;/p&gt;

&lt;p&gt;&lt;code&gt;ｆｕｃｋ&lt;/code&gt; (fullwidth Unicode characters)&lt;/p&gt;

&lt;p&gt;&lt;code&gt;f.u.c.k&lt;/code&gt; (separator bypass)&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Fr33 m0ney&lt;/code&gt; (leet-speak)&lt;/p&gt;

&lt;p&gt;&lt;code&gt;fuuuuck&lt;/code&gt; (character stretching)&lt;/p&gt;

&lt;p&gt;Each of these slips past a typical word-list filter that compares raw input against exact strings.&lt;/p&gt;

&lt;p&gt;The issue isn’t your regex.&lt;br&gt;
It’s the &lt;strong&gt;order of operations&lt;/strong&gt;.&lt;/p&gt;
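&lt;p&gt;To make the gap concrete, here is a minimal sketch of the kind of check these inputs defeat. The &lt;code&gt;blocklist&lt;/code&gt; array and the &lt;code&gt;naiveIsBlocked&lt;/code&gt; function are hypothetical names used only for illustration, not part of any library; the check lowercases the raw input and looks for an exact substring.&lt;/p&gt;

```typescript
// Hypothetical naive filter: a plain substring check against a word list.
const blocklist = ['fuck', 'free money'];

function naiveIsBlocked(input: string): boolean {
  const lowered = input.toLowerCase();
  return blocklist.some((word) => lowered.includes(word));
}

const evasions = [
  'fu\u0441k',                // Cyrillic "с" (U+0441) in place of Latin "c"
  '\uff46\uff55\uff43\uff4b', // fullwidth letters
  'f.u.c.k',                  // separators
  'Fr33 m0ney',               // leet-speak
  'fuuuuck',                  // character stretching
];

console.log(naiveIsBlocked('fuck'));       // true
console.log(evasions.some(naiveIsBlocked)); // false: every variant gets through
```

&lt;p&gt;The plain word is caught, but none of the five variants contains the exact character sequence the list is looking for.&lt;/p&gt;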
&lt;h2&gt;Normalize First. Validate Second.&lt;/h2&gt;

&lt;p&gt;Before checking profanity or spam, input should be normalized:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Unicode NFKC normalization&lt;/li&gt;
&lt;li&gt;Zero-width character removal&lt;/li&gt;
&lt;li&gt;Separator stripping&lt;/li&gt;
&lt;li&gt;Homoglyph mapping&lt;/li&gt;
&lt;li&gt;Leet-speak normalization&lt;/li&gt;
&lt;li&gt;Repetition reduction&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After normalization, each evasion above collapses into the canonical form of the word it disguises.&lt;br&gt;
Only then can your profanity/spam logic match it reliably.&lt;/p&gt;
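&lt;p&gt;The steps listed above can be sketched in TypeScript roughly as follows. This is an illustration under stated assumptions, not the implementation inside input-shield: &lt;code&gt;normalize&lt;/code&gt;, &lt;code&gt;isBlocked&lt;/code&gt;, and the tiny &lt;code&gt;HOMOGLYPHS&lt;/code&gt;, &lt;code&gt;LEET&lt;/code&gt;, and &lt;code&gt;BLOCKLIST&lt;/code&gt; tables are hypothetical, and a real filter would need much larger mappings. The sketch maps homoglyphs and leet characters before stripping separators, so that look-alike letters are not discarded as punctuation.&lt;/p&gt;

```typescript
// Hypothetical lookup tables for illustration; real mappings are far larger.
const HOMOGLYPHS = new Map([
  ['\u0430', 'a'], // Cyrillic а
  ['\u0435', 'e'], // Cyrillic е
  ['\u043e', 'o'], // Cyrillic о
  ['\u0440', 'p'], // Cyrillic р
  ['\u0441', 'c'], // Cyrillic с
]);
const LEET = new Map([
  ['0', 'o'], ['1', 'i'], ['3', 'e'], ['4', 'a'],
  ['5', 's'], ['7', 't'], ['@', 'a'], ['$', 's'],
]);

function normalize(input: string): string {
  const cleaned = input
    .normalize('NFKC') // fold fullwidth and other compatibility forms
    .toLowerCase()
    .replace(/[\u200B-\u200D\u2060\uFEFF]/g, ''); // remove zero-width characters

  // Map look-alike and leet characters to plain Latin letters.
  const mapped = Array.from(
    cleaned,
    (ch) => HOMOGLYPHS.get(ch) ?? LEET.get(ch) ?? ch,
  ).join('');

  return mapped
    .replace(/[^a-z]/g, '')    // strip separators and anything else that is not a-z
    .replace(/(.)\1+/g, '$1'); // collapse runs of a repeated letter
}

// The blocklist goes through the same pipeline as the input.
const BLOCKLIST = ['fuck', 'free money'].map(normalize);

function isBlocked(input: string): boolean {
  const canonical = normalize(input);
  return BLOCKLIST.some((term) => canonical.includes(term));
}

console.log(normalize('fu\u0441k'));  // fuck
console.log(normalize('Fr33 m0ney')); // fremoney
console.log(isBlocked('f.u.c.k'));    // true
```

&lt;p&gt;Because repetition reduction also shortens legitimate double letters (&lt;code&gt;free&lt;/code&gt; becomes &lt;code&gt;fre&lt;/code&gt;), the blocklist is passed through the same function so both sides are compared in canonical form. Stripping every non-letter also removes word boundaries, so substring matching on the result can over-match; the sketch leaves that trade-off aside.&lt;/p&gt;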

&lt;p&gt;&lt;strong&gt;What I Built&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I created &lt;strong&gt;&lt;em&gt;&lt;code&gt;@marslanmustafa/input-shield&lt;/code&gt;, a zero-dependency TypeScript validation package&lt;/em&gt;&lt;/strong&gt; that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Detects Unicode homoglyph attacks&lt;/li&gt;
&lt;li&gt;Catches leet-based spam&lt;/li&gt;
&lt;li&gt;Blocks stretched profanity&lt;/li&gt;
&lt;li&gt;Detects gibberish (e.g. asdfghjkl)&lt;/li&gt;
&lt;li&gt;Supports Zod integration&lt;/li&gt;
&lt;li&gt;Validates HTML email content safely&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import { createValidator } from '@marslanmustafa/input-shield';

const validator = createValidator()
  .field('Message')
  .min(2).max(500)
  .noProfanity()
  .noSpam()
  .noGibberish();

// 'fu\u0441k' uses Cyrillic "с" (U+0441) in place of Latin "c"
validator.validate('fu\u0441k');
// → blocked
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why This Matters&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Unicode homoglyph attacks are not edge cases.&lt;br&gt;
They’re easy, invisible, and widely ignored.&lt;/p&gt;

&lt;p&gt;If you’re validating user input in production, normalization isn’t optional. It’s required.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Links:&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://github.com/marslanmustafa/input-shield" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; · &lt;a href="https://www.npmjs.com/package/@marslanmustafa/input-shield" rel="noopener noreferrer"&gt;npm&lt;/a&gt;&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>security</category>
      <category>typescript</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
