<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Mirko Dimartino</title>
    <description>The latest articles on DEV Community by Mirko Dimartino (@mirko_ddd).</description>
    <link>https://dev.to/mirko_ddd</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1738316%2Fb1bedc66-ad16-41da-8456-62675cac130c.jpg</url>
      <title>DEV Community: Mirko Dimartino</title>
      <link>https://dev.to/mirko_ddd</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mirko_ddd"/>
    <language>en</language>
    <item>
      <title>Your Java Regex Just Silently Broke in Production. Here's How to Make That Impossible.</title>
      <dc:creator>Mirko Dimartino</dc:creator>
      <pubDate>Fri, 13 Mar 2026 15:18:24 +0000</pubDate>
      <link>https://dev.to/mirko_ddd/your-java-regex-just-silently-broke-in-production-heres-how-to-make-that-impossible-55jh</link>
      <guid>https://dev.to/mirko_ddd/your-java-regex-just-silently-broke-in-production-heres-how-to-make-that-impossible-55jh</guid>
      <description>&lt;p&gt;A colleague pushes a fix for a validation bug. The regex looks right. Tests pass. Two weeks later, a user reports that their perfectly valid email address is being rejected — because the fix introduced an unbalanced bracket that the compiler never complained about.&lt;/p&gt;

&lt;p&gt;This is the nature of raw regex in Java. The compiler has no opinion. The mistake lives quietly in a string literal until runtime hands you the bill.&lt;/p&gt;

&lt;p&gt;I built &lt;a href="https://github.com/mirkoddd/Sift" rel="noopener noreferrer"&gt;Sift&lt;/a&gt; to make this class of bug unrepresentable.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Sift actually does
&lt;/h2&gt;

&lt;p&gt;Sift is a fluent DSL for building regular expressions in Java. Its core idea is simple: instead of writing a string, you traverse a &lt;strong&gt;type-state machine&lt;/strong&gt;. Each method returns only the next valid state — so wrong transitions don't exist as methods, and the compiler rejects incomplete or structurally invalid patterns before your code ever runs.&lt;/p&gt;

&lt;p&gt;The before/after speaks for itself:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Before — what does this even do?&lt;/span&gt;
&lt;span class="nc"&gt;Pattern&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Pattern&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;compile&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"^(?=[\\p{Lu}])[\\p{L}\\p{Nd}_]{3,15}+[0-9]?$"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// After — your IDE guides every step&lt;/span&gt;
&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;regex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Sift&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fromStart&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;exactly&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;upperCaseLettersUnicode&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;       &lt;span class="c1"&gt;// Must start with an uppercase letter&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;then&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;between&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;wordCharactersUnicode&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;withoutBacktracking&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;// ReDoS-safe&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;then&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;optional&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;digits&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;                         &lt;span class="c1"&gt;// May end with a digit&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;andNothingElse&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;shake&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Result: ^[\p{Lu}][\p{L}\p{Nd}_]{3,15}+[0-9]?$&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same output. Zero runtime overhead. And if you try to skip the quantifier step and call &lt;code&gt;.digits()&lt;/code&gt; directly, it simply doesn't compile.&lt;/p&gt;




&lt;h2&gt;
  
  
  The LEGO brick approach
&lt;/h2&gt;

&lt;p&gt;The real power emerges when you start composing patterns from named building blocks.&lt;/p&gt;

&lt;p&gt;Every &lt;code&gt;Sift.fromAnywhere()&lt;/code&gt; call returns a &lt;code&gt;SiftPattern&amp;lt;Fragment&amp;gt;&lt;/code&gt; — an unanchored, reusable piece that can be embedded anywhere without carrying unwanted &lt;code&gt;^&lt;/code&gt; anchors. Patterns built with &lt;code&gt;fromStart()&lt;/code&gt; or sealed with &lt;code&gt;andNothingElse()&lt;/code&gt; become &lt;code&gt;SiftPattern&amp;lt;Root&amp;gt;&lt;/code&gt; — they cannot be embedded. Attempting it is a compile-time error.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Define named building blocks&lt;/span&gt;
&lt;span class="nc"&gt;SiftPattern&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Fragment&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;year&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Sift&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fromAnywhere&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;exactly&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;digits&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="nc"&gt;SiftPattern&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Fragment&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;month&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Sift&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fromAnywhere&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;exactly&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;digits&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="nc"&gt;SiftPattern&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Fragment&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;day&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Sift&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fromAnywhere&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;exactly&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;digits&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="nc"&gt;SiftPattern&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Fragment&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;dash&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Sift&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fromAnywhere&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;character&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;'-'&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Compose into a reusable date block&lt;/span&gt;
&lt;span class="nc"&gt;SiftPattern&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Fragment&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;followedBy&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dash&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;month&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dash&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;day&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Embed inside a larger log parser&lt;/span&gt;
&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;logRegex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Sift&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fromStart&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;followedBy&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;'\t'&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;then&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;oneOrMore&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;upperCaseLetters&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;// log level: INFO, WARN, ERROR&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;followedBy&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;'\t'&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;then&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;oneOrMore&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;anyCharacter&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;andNothingElse&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;shake&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Result: ^[0-9]{4}-[0-9]{2}-[0-9]{2}\t[A-Z]+\t.+$&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;date&lt;/code&gt; fragment is independently testable, readable by name, and reusable across your codebase without copy-paste.&lt;/p&gt;




&lt;h2&gt;
  
  
  Patterns are extraction tools, not just validators
&lt;/h2&gt;

&lt;p&gt;This is the part that usually surprises people. In v5.6, every &lt;code&gt;SiftPattern&lt;/code&gt; ships a complete extraction API — no &lt;code&gt;Matcher&lt;/code&gt; boilerplate required.&lt;/p&gt;

&lt;h3&gt;
  
  
  Named group extraction
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;NamedCapture&lt;/span&gt; &lt;span class="n"&gt;yearGroup&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SiftPatterns&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;capture&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"year"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  &lt;span class="nc"&gt;Sift&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;exactly&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;digits&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
&lt;span class="nc"&gt;NamedCapture&lt;/span&gt; &lt;span class="n"&gt;monthGroup&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SiftPatterns&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;capture&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"month"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Sift&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;exactly&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;digits&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
&lt;span class="nc"&gt;NamedCapture&lt;/span&gt; &lt;span class="n"&gt;dayGroup&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SiftPatterns&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;capture&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"day"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;   &lt;span class="nc"&gt;Sift&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;exactly&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;digits&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;

&lt;span class="nc"&gt;SiftPattern&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;?&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;datePattern&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Sift&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fromStart&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;namedCapture&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;yearGroup&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;followedBy&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;'-'&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;then&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;namedCapture&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;monthGroup&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;followedBy&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;'-'&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;then&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;namedCapture&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dayGroup&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;andNothingElse&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;fields&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datePattern&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;extractGroups&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"2026-03-13"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// → { "year": "2026", "month": "03", "day": "13" }&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Extracting all matches across a text
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;prices&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Sift&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fromAnywhere&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;oneOrMore&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;digits&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;extractAll&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Order: 3 items at 25 and 40 euros"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// → ["3", "25", "40"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Extracting all named groups across multiple matches
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;allMatches&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;invoicePattern&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;extractAllGroups&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;largeDocument&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// → [{"id": "INV-001", "amount": "250"}, {"id": "INV-002", "amount": "80"}, ...]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Lazy streaming for large inputs
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;Sift&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fromAnywhere&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;oneOrMore&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;lettersUnicode&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;streamMatches&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;largeText&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;length&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;forEach&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;println&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The full API — all null-safe:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Returns&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;containsMatchIn(input)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;boolean&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Is there at least one match?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;matchesEntire(input)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;boolean&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Does the entire string match?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;extractFirst(input)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Optional&amp;lt;String&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;First match, or empty&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;extractAll(input)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;List&amp;lt;String&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;All matches&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;extractGroups(input)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Map&amp;lt;String, String&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Named groups from first match&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;extractAllGroups(input)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;List&amp;lt;Map&amp;lt;String, String&amp;gt;&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Named groups from all matches&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;replaceFirst(input, replacement)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;String&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Replace first match&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;replaceAll(input, replacement)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;String&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Replace all matches&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;splitBy(input)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;List&amp;lt;String&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Split around matches&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;streamMatches(input)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Stream&amp;lt;String&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Lazy stream of all matches&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  ReDoS mitigation built in
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;withoutBacktracking()&lt;/code&gt; you saw earlier generates a possessive quantifier (&lt;code&gt;\w++&lt;/code&gt;). There are two other tools:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Atomic group — locks a sub-pattern once matched&lt;/span&gt;
&lt;span class="nc"&gt;SiftPattern&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Fragment&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;safe&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Sift&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fromAnywhere&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;oneOrMore&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;digits&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;preventBacktracking&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// wraps in (?&amp;gt;...)&lt;/span&gt;

&lt;span class="c1"&gt;// Lazy quantifier — matches as few characters as possible&lt;/span&gt;
&lt;span class="nc"&gt;Sift&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fromAnywhere&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;oneOrMore&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;anyCharacter&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;asFewAsPossible&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// generates .+?&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Secure patterns become the path of least resistance — you don't have to remember whether it's &lt;code&gt;*+&lt;/code&gt; or &lt;code&gt;*?&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Jakarta Validation — no more duplicated regex
&lt;/h2&gt;

&lt;p&gt;If you use Bean Validation, you've probably written the same &lt;code&gt;@Pattern&lt;/code&gt; across multiple DTOs and then forgotten to sync them when the rule changed. Sift solves this with &lt;code&gt;@SiftMatch&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Define the rule once&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PromoCodeRule&lt;/span&gt; &lt;span class="kd"&gt;implements&lt;/span&gt; &lt;span class="nc"&gt;SiftRegexProvider&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;@Override&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getRegex&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Sift&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fromStart&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;atLeast&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;letters&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;then&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;exactly&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;digits&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;andNothingElse&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;shake&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Reuse it everywhere — compiled once at bootstrap, zero overhead per request&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;ApplyPromoRequest&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
    &lt;span class="nd"&gt;@SiftMatch&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;value&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PromoCodeRule&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;flags&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="nc"&gt;SiftMatchFlag&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;CASE_INSENSITIVE&lt;/span&gt; &lt;span class="o"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Invalid promo code format"&lt;/span&gt;
    &lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;promoCode&lt;/span&gt;
&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Ready-made patterns — SiftCatalog
&lt;/h2&gt;

&lt;p&gt;For common formats, &lt;code&gt;SiftCatalog&lt;/code&gt; provides production-ready, ReDoS-safe patterns. All are &lt;code&gt;Fragment&lt;/code&gt;-typed — they compose cleanly with your own chains.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Standalone validation&lt;/span&gt;
&lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;valid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SiftCatalog&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;email&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;matchesEntire&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"user@example.com"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Embedded in a larger pattern&lt;/span&gt;
&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;regex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Sift&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fromStart&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;SiftCatalog&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;uuid&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;followedBy&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;'/'&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;then&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;SiftCatalog&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isoDate&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;andNothingElse&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;shake&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Available: &lt;code&gt;uuid()&lt;/code&gt;, &lt;code&gt;ipv4()&lt;/code&gt;, &lt;code&gt;macAddress()&lt;/code&gt;, &lt;code&gt;email()&lt;/code&gt;, &lt;code&gt;webUrl()&lt;/code&gt;, &lt;code&gt;isoDate()&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Gradle:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight groovy"&gt;&lt;code&gt;&lt;span class="n"&gt;implementation&lt;/span&gt; &lt;span class="s1"&gt;'com.mirkoddd:sift-core:&amp;lt;latest&amp;gt;'&lt;/span&gt;

&lt;span class="c1"&gt;// Optional: Jakarta Validation integration&lt;/span&gt;
&lt;span class="n"&gt;implementation&lt;/span&gt; &lt;span class="s1"&gt;'com.mirkoddd:sift-annotations:&amp;lt;latest&amp;gt;'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Maven:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;com.mirkoddd&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;sift-core&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;latest&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Java 8 bytecode. Zero runtime dependencies. Tested on JVM 8, 11, 17, and 21.&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://github.com/mirkoddd/Sift" rel="noopener noreferrer"&gt;GitHub — mirkoddd/Sift&lt;/a&gt;&lt;br&gt;
📖 &lt;a href="https://github.com/mirkoddd/Sift/blob/main/COOKBOOK.md" rel="noopener noreferrer"&gt;Sift Cookbook&lt;/a&gt; — real-world recipes: UUID validation, TSV log parsing, lookarounds, conditional patterns, nested structures, and more.&lt;/p&gt;




&lt;p&gt;The compiler is the best test suite you have. Sift puts it to work on your regex too.&lt;/p&gt;

</description>
      <category>java</category>
      <category>programming</category>
      <category>opensource</category>
      <category>regex</category>
    </item>
    <item>
      <title>Stop Writing Raw Regex in Java</title>
      <dc:creator>Mirko Dimartino</dc:creator>
      <pubDate>Thu, 05 Mar 2026 16:41:38 +0000</pubDate>
      <link>https://dev.to/mirko_ddd/stop-writing-raw-regex-in-java-14pk</link>
      <guid>https://dev.to/mirko_ddd/stop-writing-raw-regex-in-java-14pk</guid>
      <description>&lt;p&gt;Let's be honest: writing Regular Expressions in Java is often a painful experience.&lt;/p&gt;

&lt;p&gt;You start with a simple rule, but before you know it, you are staring at an unreadable string of backslashes, brackets, and cryptic symbols. Worse still, if you make a syntax error, the compiler won't help you — you'll only find out at runtime when your application crashes or, even worse, falls victim to a &lt;strong&gt;Catastrophic Backtracking (ReDoS)&lt;/strong&gt; attack.&lt;/p&gt;

&lt;p&gt;I got tired of this, so I built &lt;strong&gt;Sift&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is Sift?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/mirkoddd/Sift" rel="noopener noreferrer"&gt;Sift&lt;/a&gt; is a fluent, state-machine-driven Domain Specific Language (DSL) for building &lt;strong&gt;Type-Safe Regular Expressions&lt;/strong&gt; in Java.&lt;/p&gt;

&lt;p&gt;Instead of guessing the syntax, Sift uses your IDE's auto-completion to guide you. It enforces strict structural guarantees at compile time.&lt;/p&gt;




&lt;h3&gt;
  
  
  The Problem: Raw Regex
&lt;/h3&gt;

&lt;p&gt;Imagine trying to validate an international username and ensuring it doesn't trigger an infinite backtracking loop. You might write something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;^[\p{Lu}][\p{L}\p{Nd}_]{3,15}+[0-9]?$
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It works, but it's hard to read, impossible to maintain, and a nightmare for junior developers to review.&lt;/p&gt;




&lt;h3&gt;
  
  
  The Solution: Sift's Fluent API
&lt;/h3&gt;

&lt;p&gt;With Sift, you write self-documenting code that compiles down to the exact same 100% native Java &lt;code&gt;Pattern&lt;/code&gt; with zero runtime overhead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Sift&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fromStart&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;anywhere&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Sift&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fromAnywhere&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;oneUnicodeUppercaseLetter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;uppercaseLettersUnicode&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;otherUnicodeChar&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anywhere&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;between&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;wordCharactersUnicode&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;withoutBacktracking&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;optionalDigits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anywhere&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;optional&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;digits&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;regex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;oneUnicodeUppercaseLetter&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;followedBy&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;otherUnicodeChar&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;optionalDigits&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;andNothingElse&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;shake&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Modularity: The "LEGO Brick" Approach
&lt;/h3&gt;

&lt;p&gt;Complex patterns usually require long, monolithic strings. Sift introduces modularity: you can build &lt;strong&gt;unanchored intermediate blocks&lt;/strong&gt; and compose them into strict boundaries later.&lt;/p&gt;

&lt;p&gt;Sift also supports &lt;strong&gt;Lazy Validation for Backreferences&lt;/strong&gt;. You can define a Named Capture Group in one block, reference it in another disconnected block, and Sift will safely merge and validate them when you finally call &lt;code&gt;.shake()&lt;/code&gt;.&lt;/p&gt;




&lt;h3&gt;
  
  
  Built for Security
&lt;/h3&gt;

&lt;p&gt;Java's default regex engine is vulnerable to ReDoS if you aren't careful with quantifiers. Sift exposes possessive (&lt;code&gt;.withoutBacktracking()&lt;/code&gt;) and lazy (&lt;code&gt;.asFewAsPossible()&lt;/code&gt;) modifiers directly through the Type-State machine, making secure patterns the path of least resistance.&lt;/p&gt;




&lt;h3&gt;
  
  
  Check out the Cookbook 🧑‍🍳
&lt;/h3&gt;

&lt;p&gt;I recently wrote a comprehensive &lt;code&gt;COOKBOOK.md&lt;/code&gt; demonstrating real-world use cases, such as parsing TSV logs, validating UUIDs/IPs, and extracting HTML tags.&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://github.com/mirkoddd/Sift" rel="noopener noreferrer"&gt;Sift on GitHub&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;I would love to hear your feedback on the DSL design and the state-machine architecture. What do you think about fluent builders for Regex? Let me know in the comments!&lt;/p&gt;

</description>
      <category>java</category>
      <category>programming</category>
      <category>opensource</category>
      <category>regex</category>
    </item>
  </channel>
</rss>
