<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Mohamed Hamdy</title>
    <description>The latest articles on DEV Community by Mohamed Hamdy (@maboyadak).</description>
    <link>https://dev.to/maboyadak</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1047808%2Fa5bb96c9-0669-4ab0-9790-2013aff27dea.jpeg</url>
      <title>DEV Community: Mohamed Hamdy</title>
      <link>https://dev.to/maboyadak</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/maboyadak"/>
    <language>en</language>
    <item>
      <title>🧩 How hidden Unicode characters can break your system</title>
      <dc:creator>Mohamed Hamdy</dc:creator>
      <pubDate>Thu, 19 Mar 2026 03:48:31 +0000</pubDate>
      <link>https://dev.to/maboyadak/how-hidden-unicode-characters-can-break-your-system-2fd7</link>
      <guid>https://dev.to/maboyadak/how-hidden-unicode-characters-can-break-your-system-2fd7</guid>
      <description>&lt;h2&gt;
  
  
  &lt;strong&gt;Introduction&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;While working on a feature that sends notifications to phone numbers, I recently faced one of the trickiest bugs I’ve encountered. Everything looked correct: the phone number was valid, the logs seemed fine, and yet the system kept rejecting the request.&lt;/p&gt;

&lt;p&gt;After digging deeper, the culprit turned out to be something almost invisible: a hidden Unicode character inside the phone number.&lt;/p&gt;

&lt;p&gt;In this article, I’ll explain:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What Unicode characters are&lt;/li&gt;
&lt;li&gt;What hidden Unicode characters are&lt;/li&gt;
&lt;li&gt;How they can silently break your systems&lt;/li&gt;
&lt;li&gt;The real bug I faced&lt;/li&gt;
&lt;li&gt;How I solved it&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;What Are Unicode Characters?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Computers represent text using numeric codes. Early systems used ASCII, which uses 7 bits per character and therefore supports only 128 characters: mainly English letters, digits, and basic symbols.&lt;/p&gt;

&lt;p&gt;To support all languages, symbols and emojis, the industry adopted the Unicode standard.&lt;/p&gt;

&lt;p&gt;Unicode assigns a unique code point to every character.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Character&lt;/th&gt;
&lt;th&gt;Unicode Code Point&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;A&lt;/td&gt;
&lt;td&gt;U+0041&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;U+0035&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ع&lt;/td&gt;
&lt;td&gt;U+0639&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;😊&lt;/td&gt;
&lt;td&gt;U+1F60A&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This allows computers to represent text from virtually every language consistently.&lt;/p&gt;
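
&lt;p&gt;As a small illustration, PHP can print the code point of any character. This is only a sketch: it assumes the mbstring extension is enabled, and the helper name &lt;code&gt;codePoint&lt;/code&gt; is made up for this example.&lt;/p&gt;

```php
// Hypothetical helper: format a character's code point as "U+XXXX".
function codePoint(string $char): string
{
    return sprintf('U+%04X', mb_ord($char, 'UTF-8'));
}

echo codePoint('A'), "\n";   // U+0041
echo codePoint('5'), "\n";   // U+0035
echo codePoint('ع'), "\n";   // U+0639
echo codePoint('😊'), "\n";  // U+1F60A

// mb_chr goes the other way: from a code point back to the character.
echo mb_chr(0x0639, 'UTF-8'), "\n";
```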




&lt;h2&gt;
  
  
  Hidden Unicode Characters
&lt;/h2&gt;

&lt;p&gt;Some Unicode characters are not visible when rendered, but they still exist inside the string.&lt;/p&gt;

&lt;p&gt;These characters are often used for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;text direction control&lt;/li&gt;
&lt;li&gt;formatting&lt;/li&gt;
&lt;li&gt;invisible separators&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Examples include:&lt;/p&gt;

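&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Character&lt;/th&gt;
&lt;th&gt;Unicode Code Point&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;LRM&lt;/td&gt;
&lt;td&gt;U+200E&lt;/td&gt;
&lt;td&gt;Left-to-Right Mark&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RLM&lt;/td&gt;
&lt;td&gt;U+200F&lt;/td&gt;
&lt;td&gt;Right-to-Left Mark&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ZWSP&lt;/td&gt;
&lt;td&gt;U+200B&lt;/td&gt;
&lt;td&gt;Zero Width Space&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;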

&lt;p&gt;These characters can appear when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;copying text from messaging apps&lt;/li&gt;
&lt;li&gt;copying from PDFs&lt;/li&gt;
&lt;li&gt;user input from multilingual keyboards&lt;/li&gt;
&lt;li&gt;some mobile devices&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The problem is that &lt;strong&gt;they are invisible&lt;/strong&gt;, but they still affect string processing.&lt;/p&gt;
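
&lt;p&gt;A quick way to see this effect is to compare two strings that render almost identically. The values below are illustrative and assume UTF-8 strings with the mbstring extension available:&lt;/p&gt;

```php
// "\u{202D}" is PHP's escape syntax for U+202D (Left-to-Right Override).
$clean = '+212345678';
$dirty = "+2\u{202D}12345678";

var_dump($clean === $dirty);            // bool(false)
var_dump(strlen($clean));               // int(10)
var_dump(strlen($dirty));               // int(13): the hidden character adds 3 UTF-8 bytes
var_dump(mb_strlen($dirty, 'UTF-8'));   // int(11): one extra code point
```

&lt;p&gt;Comparisons, length checks, and lookups all see the extra character even though a person reading the value does not.&lt;/p&gt;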

&lt;h2&gt;
  
  
  The Bug I Faced
&lt;/h2&gt;

&lt;p&gt;I was sending a request to an external service with a phone number like this: "+2‭12345678"&lt;/p&gt;

&lt;p&gt;Everything looked normal, but the API kept rejecting the number.&lt;/p&gt;

&lt;p&gt;When I logged the request payload, nothing stood out: at first glance, the number seemed valid.&lt;/p&gt;

&lt;p&gt;But there was actually a hidden Unicode character before the digits.&lt;/p&gt;

&lt;p&gt;The real string looked like this internally:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;\u202D123456789&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;U+202D is a Left-to-Right Override (LRO) character.&lt;/p&gt;

&lt;p&gt;It renders nothing on its own, but it forces the text that follows it to be displayed left to right.&lt;/p&gt;

&lt;p&gt;This character was likely introduced when the phone number was copied from a contact list or from a chat message containing Arabic text.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Breaks Systems
&lt;/h2&gt;

&lt;p&gt;Most phone number validations expect digits only.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;/^[0-9]+$/&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;When a hidden Unicode character exists in the string:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;\u202D123456789&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The regex fails because the string does not actually start with a digit.&lt;/p&gt;

&lt;p&gt;So even though the number looks correct, it fails validation.&lt;/p&gt;
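
&lt;p&gt;The failure is easy to reproduce. In this minimal sketch the function name &lt;code&gt;isDigitsOnly&lt;/code&gt; is hypothetical; only the pattern comes from the example above:&lt;/p&gt;

```php
// Hypothetical validator built around the digits-only pattern.
function isDigitsOnly(string $value): bool
{
    return preg_match('/^[0-9]+$/', $value) === 1;
}

var_dump(isDigitsOnly('123456789'));          // bool(true)
var_dump(isDigitsOnly("\u{202D}123456789"));  // bool(false)
```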




&lt;h2&gt;
  
  
  How I Debugged It
&lt;/h2&gt;

&lt;p&gt;To detect the issue, I inspected the string at the byte level.&lt;/p&gt;

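&lt;p&gt;In PHP, &lt;code&gt;var_dump&lt;/code&gt; prints a string together with its length in bytes, so a length larger than the number of visible characters is a sign of hidden content:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;var_dump($phoneNumber);&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;To confirm, you can paste the string into any online Unicode inspector. I used &lt;a href="https://apps.timwhitlock.info/unicode/inspect" rel="noopener noreferrer"&gt;this one&lt;/a&gt;:&lt;/p&gt;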

&lt;p&gt;+2‭12345678 =&amp;gt; the one with hidden LTR unicode&lt;br&gt;
+212345678 =&amp;gt; the normal number&lt;/p&gt;

&lt;p&gt;This revealed the unexpected Unicode character before the digits.&lt;/p&gt;
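
&lt;p&gt;If you prefer to stay inside PHP, two built-in functions can expose the hidden character without any external tool. This is a sketch with a sample value, not the exact payload from my logs:&lt;/p&gt;

```php
$phoneNumber = "+2\u{202D}12345678";

// json_encode escapes non-ASCII characters by default, which exposes the hidden one.
echo json_encode($phoneNumber), "\n";   // "+2\u202d12345678"

// bin2hex shows the raw bytes; e280ad is U+202D encoded as UTF-8.
echo bin2hex($phoneNumber), "\n";       // 2b32e280ad3132333435363738
```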




&lt;h2&gt;
  
  
  The Solution
&lt;/h2&gt;

&lt;p&gt;The fix was to sanitize the input and remove non-digit characters.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

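&lt;p&gt;&lt;code&gt;$phoneNumber = preg_replace('/\D/', '', $phoneNumber);&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This removes everything except digits, including a leading &lt;code&gt;+&lt;/code&gt;, so add the prefix back afterwards if the receiving API expects it.&lt;/p&gt;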

&lt;p&gt;Another safe approach is to explicitly remove invisible Unicode characters:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;$phoneNumber = preg_replace('/[\x{200B}-\x{200F}\x{202A}-\x{202E}]/u', '', $phoneNumber);&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Whichever approach you choose, normalize the phone number before validating it, so the validator only ever sees the cleaned value.&lt;/p&gt;
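
&lt;p&gt;One possible way to combine cleanup and validation is a small helper. This is a sketch rather than the exact code I shipped: the name &lt;code&gt;normalizePhoneNumber&lt;/code&gt; and the accepted format (an optional &lt;code&gt;+&lt;/code&gt; followed by digits) are assumptions, and it uses the Unicode category &lt;code&gt;\p{Cf}&lt;/code&gt; (format characters) in place of the explicit ranges above.&lt;/p&gt;

```php
// Hypothetical helper: strip invisible "format" characters (Unicode category Cf,
// which includes U+200B to U+200F and U+202A to U+202E), then validate.
function normalizePhoneNumber(string $input): ?string
{
    $cleaned = preg_replace('/\p{Cf}+/u', '', trim($input));

    // preg_replace returns null on failure, e.g. when the input is not valid UTF-8.
    if ($cleaned === null) {
        return null;
    }

    // Assumed format: an optional leading "+" followed by digits only.
    return preg_match('/^\+?[0-9]+$/', $cleaned) === 1 ? $cleaned : null;
}

var_dump(normalizePhoneNumber("+2\u{202D}12345678"));  // string(10) "+212345678"
var_dump(normalizePhoneNumber('not a number'));        // NULL
```

&lt;p&gt;Returning &lt;code&gt;null&lt;/code&gt; for anything that still fails validation keeps the cleanup step from silently turning bad input into something that merely looks valid.&lt;/p&gt;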




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;Hidden Unicode characters can silently break systems in ways that are extremely difficult to detect.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Things that can be affected include:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;phone number validation&lt;/li&gt;
&lt;li&gt;authentication tokens&lt;/li&gt;
&lt;li&gt;payment identifiers&lt;/li&gt;
&lt;li&gt;database lookups&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your system accepts user input, always consider sanitizing and normalizing text before processing it.&lt;/p&gt;

&lt;p&gt;Invisible characters might be hiding in plain sight.&lt;/p&gt;

</description>
      <category>unicode</category>
      <category>validation</category>
      <category>sanitization</category>
    </item>
  </channel>
</rss>
