<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Vasishta Nandipati</title>
    <description>The latest articles on DEV Community by Vasishta Nandipati (@vasishtanandipati).</description>
    <link>https://dev.to/vasishtanandipati</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2967599%2Fd6fa8559-3808-4889-9e04-685a2d1bc541.jpg</url>
      <title>DEV Community: Vasishta Nandipati</title>
      <link>https://dev.to/vasishtanandipati</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/vasishtanandipati"/>
    <language>en</language>
    <item>
      <title>I Built a Secret Scanner That Checks Your Git History, Not Just Your Code</title>
      <dc:creator>Vasishta Nandipati</dc:creator>
      <pubDate>Fri, 29 May 2026 07:07:21 +0000</pubDate>
      <link>https://dev.to/vasishtanandipati/i-built-a-secret-scanner-that-checks-your-git-history-not-just-your-code-3pgo</link>
      <guid>https://dev.to/vasishtanandipati/i-built-a-secret-scanner-that-checks-your-git-history-not-just-your-code-3pgo</guid>
      <description>&lt;p&gt;Most developers know they shouldn't commit API keys. Most secret scanners will catch an AWS key sitting in your current codebase. What they won't catch is the key you deleted three commits ago -- which is still fully recoverable by anyone who clones your repo and runs &lt;code&gt;git log -p&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That gap is what I built &lt;a href="https://github.com/Vasishta03/secret-scanner" rel="noopener noreferrer"&gt;leakscan&lt;/a&gt; to address.&lt;/p&gt;

&lt;h2&gt;The Problem With Current-State-Only Scanners&lt;/h2&gt;

&lt;p&gt;When you delete a secret from a file and commit, the removal is recorded in git history. But the original commit that introduced the secret is still there. Every clone of your repository carries that history. Anyone -- a future contributor, a malicious actor, a job applicant reviewing your public code -- can recover those secrets.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# This recovers secrets you "deleted" months ago&lt;/span&gt;
git log &lt;span class="nt"&gt;-p&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-A2&lt;/span&gt; &lt;span class="s2"&gt;"AKIA&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;sk-&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;ghp_"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Most scanners only look at your working tree. leakscan traverses every commit.&lt;/p&gt;
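
&lt;p&gt;The idea of a history scan can be sketched in a few lines of Python. This is a minimal illustration, not leakscan's actual engine: the function name &lt;code&gt;scan_patch&lt;/code&gt; and the two patterns are assumptions chosen for the example. It walks the text produced by &lt;code&gt;git log -p&lt;/code&gt; and reports every added line that matches a pattern, together with the commit that introduced it.&lt;/p&gt;

```python
import re

# Illustrative patterns only (assumed for this sketch, not leakscan's rule set).
PATTERNS = {
    "aws-access-key-id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "github-pat": re.compile(r"ghp_[A-Za-z0-9]{36}"),
}


def scan_patch(patch_text):
    """Return (commit, rule, line) for every added line that matches a pattern."""
    findings = []
    commit = None
    for line in patch_text.splitlines():
        if line.startswith("commit "):
            # Commit header emitted by git log, e.g. "commit 1a2b3c..."
            commit = line.split()[1]
        elif line.startswith("+") and not line.startswith("+++"):
            # A line added by this commit ("+++" is the file header, not content)
            added = line[1:]
            for rule, pattern in PATTERNS.items():
                if pattern.search(added):
                    findings.append((commit, rule, added.strip()))
    return findings
```

&lt;p&gt;To try it on a real repository, pass it the output of &lt;code&gt;git log -p --all&lt;/code&gt;, for example via &lt;code&gt;subprocess.run(["git", "log", "-p", "--all"], capture_output=True, text=True, check=True).stdout&lt;/code&gt;. Removed lines are skipped on purpose: in a full clone, a secret that was later deleted still shows up as an added line in the commit that introduced it.&lt;/p&gt;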

&lt;h2&gt;What leakscan Does&lt;/h2&gt;

&lt;p&gt;leakscan is a Python CLI that scans for leaked secrets across:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Local file trees (parallel, 8 threads)&lt;/li&gt;
&lt;li&gt;Full git history across any branch&lt;/li&gt;
&lt;li&gt;Public GitHub repos by URL&lt;/li&gt;
&lt;li&gt;All repos and gists for a GitHub user or org&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It ships with 55+ regex patterns covering AWS, GitHub, GitLab, Stripe, OpenAI, Anthropic, Slack, Twilio, Discord, Telegram, npm, PyPI, and more. On top of regex, it runs Shannon entropy scoring on &lt;code&gt;.env&lt;/code&gt;, YAML, and INI files to catch high-entropy values that don't match a known pattern.&lt;/p&gt;

&lt;h2&gt;Shannon Entropy: Catching the Unknowns&lt;/h2&gt;

&lt;p&gt;Not every leaked secret follows a known format. A randomly generated 32-character database password won't match any regex. Shannon entropy measures the randomness of a string -- secrets tend to have high entropy because they're generated to be unpredictable.&lt;/p&gt;

&lt;p&gt;The entropy scorer in leakscan is scoped to value-bearing lines in config files, not general source code, to keep the false positive rate low. You can disable it with &lt;code&gt;--no-entropy&lt;/code&gt; if you're scanning code that has intentionally high-entropy strings (e.g., compiled output).&lt;/p&gt;
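
&lt;p&gt;For reference, here is a minimal sketch of Shannon entropy scoring in Python. The entropy formula is standard. The helper &lt;code&gt;looks_like_secret&lt;/code&gt;, its threshold of 4.0 bits per character, and its minimum length of 16 are assumptions for illustration and may not match what leakscan uses.&lt;/p&gt;

```python
import math
from collections import Counter


def shannon_entropy(value):
    """Shannon entropy of a string, in bits per character."""
    if not value:
        return 0.0
    counts = Counter(value)
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_secret(value, threshold=4.0, min_length=16):
    """Flag long, high-entropy values. Both cutoffs are assumed, not leakscan's."""
    return len(value) >= min_length and shannon_entropy(value) >= threshold
```

&lt;p&gt;A string made of one repeated character scores 0 bits, while a string of 16 distinct characters scores exactly 4 bits per character. Short or repetitive config values therefore fall below the cutoff, and long random tokens land above it.&lt;/p&gt;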

&lt;h2&gt;Live Verification&lt;/h2&gt;

&lt;p&gt;Finding a secret is only half the picture. leakscan can verify whether a found secret is still active by making a live API call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;secrets scan &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--verify&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Live verification currently supports GitHub, GitLab, Stripe, OpenAI, Anthropic, HuggingFace, SendGrid, Slack, npm, and Replicate.&lt;/p&gt;

&lt;p&gt;A revoked or rotated secret shows as INACTIVE in the output. This matters in triage -- you want to know if you have an active exposure or just a historical artifact.&lt;/p&gt;

&lt;h2&gt;CI/CD Integration&lt;/h2&gt;

&lt;p&gt;The tool is built to run in pipelines without manual configuration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# GitHub Actions&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Scan for secrets&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;secrets scan . --severity HIGH --no-entropy --format sarif --output results.sarif&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Upload SARIF&lt;/span&gt;
  &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;github/codeql-action/upload-sarif@v3&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;sarif_file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;results.sarif&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The scan exits with code &lt;code&gt;1&lt;/code&gt; on any CRITICAL or HIGH finding, so the build fails automatically. SARIF output integrates with the GitHub Security tab and GitLab SAST.&lt;/p&gt;

&lt;p&gt;Baseline mode handles the "known findings" problem in CI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# First run: save current state&lt;/span&gt;
secrets scan &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--save-baseline&lt;/span&gt; .secrets.baseline

&lt;span class="c"&gt;# Subsequent runs: only alert on NEW secrets&lt;/span&gt;
secrets scan &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--baseline&lt;/span&gt; .secrets.baseline
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This stops CI from constantly alerting on findings you've already triaged and accepted (test fixtures, example configs with placeholder values, etc.).&lt;/p&gt;
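
&lt;p&gt;One way to implement a baseline like this is to store a fingerprint per finding and filter later scans against that set. The sketch below is an assumption about how such a mechanism could work, not leakscan's actual &lt;code&gt;baseline.py&lt;/code&gt;: the function names, the finding fields (&lt;code&gt;rule&lt;/code&gt;, &lt;code&gt;path&lt;/code&gt;, &lt;code&gt;secret&lt;/code&gt;), and the JSON file layout are all hypothetical.&lt;/p&gt;

```python
import hashlib
import json
from pathlib import Path


def fingerprint(finding):
    """Hash rule, file path, and secret so the baseline never stores the secret itself."""
    key = "\n".join([finding["rule"], finding["path"], finding["secret"]])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def save_baseline(findings, baseline_path):
    """Write the sorted, de-duplicated fingerprints of the current findings."""
    prints = sorted({fingerprint(f) for f in findings})
    Path(baseline_path).write_text(json.dumps(prints, indent=2), encoding="utf-8")


def new_findings(findings, baseline_path):
    """Return only the findings whose fingerprint is not in the baseline file."""
    path = Path(baseline_path)
    known = set(json.loads(path.read_text(encoding="utf-8"))) if path.exists() else set()
    return [f for f in findings if fingerprint(f) not in known]
```

&lt;p&gt;The line number is deliberately left out of the fingerprint in this sketch, so moving an accepted finding within a file does not trigger a fresh alert.&lt;/p&gt;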

&lt;h2&gt;Pre-commit Hook&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;your-git-repo
secrets install-hook
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The hook runs on every commit and uses the baseline automatically if present. Inline suppression is supported: add &lt;code&gt;# nosec&lt;/code&gt;, &lt;code&gt;# gitleaks:allow&lt;/code&gt;, or &lt;code&gt;# secretscanner:allow&lt;/code&gt; to any line to skip it.&lt;/p&gt;
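
&lt;p&gt;Conceptually, inline suppression is a check for one of those markers on the offending line. A minimal sketch, assuming a plain substring match (leakscan's actual matching rules may be stricter); &lt;code&gt;is_suppressed&lt;/code&gt; is a hypothetical name:&lt;/p&gt;

```python
# The three markers named above; the matching strategy is an assumption.
SUPPRESSION_MARKERS = ("# nosec", "# gitleaks:allow", "# secretscanner:allow")


def is_suppressed(line):
    """True when the line carries one of the inline suppression comments."""
    return any(marker in line for marker in SUPPRESSION_MARKERS)
```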

&lt;h2&gt;Output Formats&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Use case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Terminal (default)&lt;/td&gt;
&lt;td&gt;Interactive review with severity colors&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;td&gt;Programmatic consumption, SIEM ingestion&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CSV&lt;/td&gt;
&lt;td&gt;Spreadsheet review, audit exports&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SARIF 2.1.0&lt;/td&gt;
&lt;td&gt;GitHub Security tab, GitLab SAST&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Markdown&lt;/td&gt;
&lt;td&gt;Disclosure reports to security teams&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;Architecture&lt;/h2&gt;

&lt;p&gt;The codebase is intentionally modular:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;scanner/
cli.py        entry point (click)
engine.py     file walker, parallel scanner, git history
patterns.py   55+ regex patterns
entropy.py    Shannon entropy scorer
verifier.py   live API verification (10 services)
baseline.py   save/load/compare baseline fingerprints
reporter.py   terminal/JSON/CSV/SARIF/disclosure output
ignorefile.py .secretignore parser with ** glob support
github/
fetcher.py  GitHub API client: repos, gists, commit history
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each module is independently testable. The full pytest suite is in &lt;code&gt;/tests&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;Installation&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;leakscan
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;Where This Fits vs. Existing Tools&lt;/h2&gt;

&lt;p&gt;Tools like Gitleaks, detect-secrets, and TruffleHog are excellent. leakscan is a Python-native alternative that focuses on git history scanning, live verification, and baseline-aware CI. If your team is already Python-heavy, a pip install is a lower-friction entry point than distributing a Go binary such as Gitleaks or TruffleHog.&lt;/p&gt;

&lt;h2&gt;What's Next&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Expanded verifier coverage (Twilio, Mailchimp, Shopify)&lt;/li&gt;
&lt;li&gt;GitHub Actions marketplace action&lt;/li&gt;
&lt;li&gt;PyPI download metrics and badge&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The repo is at &lt;a href="https://github.com/Vasishta03/secret-scanner" rel="noopener noreferrer"&gt;github.com/Vasishta03/secret-scanner&lt;/a&gt;. Contributions, pattern additions, and feedback are welcome.&lt;/p&gt;

</description>
      <category>security</category>
      <category>python</category>
      <category>opensource</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
