Aman Shekhar

We stopped AI bot spam in our GitHub repo using Git's --author flag

I’ll never forget the day I logged into our GitHub repository, ready to check out the latest contributions from our team, only to be greeted by an avalanche of spam. You know the type—the kind that makes you question humanity and your life choices as a developer. It was like a digital version of a junk mail explosion. Ever had that sinking feeling? You’re not alone.

So, there we were, a small team trying to do our best work, and suddenly we were drowning in AI bot spam. We had to do something about it, and fast. That’s when I stumbled upon Git’s --author flag. I’ve got to tell you, it was a game changer.

Getting to Know the Enemy: The Spam Bots

Let’s talk about the spam we were dealing with. These weren’t just random messages; they were well-crafted pieces of code that, if unchecked, could clutter our repository and confuse our contributors. Ever wondered why GitHub has such a robust community? It’s because it’s a space for collaboration and sharing—something that spam completely undermines.

I spent an entire afternoon researching anti-spam measures, and while I found quite a few approaches, none felt right for our team. That's when the idea of leveraging Git’s --author flag popped into my mind. It felt like discovering a secret weapon in a video game.

The Lightbulb Moment: Utilizing Git’s --author Flag

If you’re not familiar with the --author flag, let me give you the quick and dirty. Passed to git commit, it overrides the author name and email recorded on that commit. Think of it like wearing a name tag at a party: it tells everyone who you are, which is essential for transparency. Keep in mind that Git records whatever is written on the tag and does not verify it.

Here’s a bit of code to illustrate how it works:

git commit --author="Spam Bot <spam@bot.com>" -m "This is a spam commit"

I realized that if we could identify and filter out these spammy authors, we could drastically reduce the clutter in our repo. So, I jumped into action, putting together a plan to parse through our commits and isolate anything that looked suspicious.
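For the filtering side, it may help to know that git log accepts an --author option too. There it limits the history to commits whose author line matches a pattern, instead of setting anything. Here is a minimal sketch of how that could be wrapped in Python. The function names, the output format, and the optional since argument are illustrative assumptions, not a description of our actual tooling.

```python
import subprocess

def author_log_command(pattern, since=None):
    """Build a `git log` argv listing commits whose author matches `pattern`."""
    cmd = [
        "git", "log",
        "--fixed-strings",             # treat the pattern literally, not as a regex
        f"--author={pattern}",         # keep only commits whose author line matches
        "--pretty=format:%h %an <%ae> %s",
    ]
    if since:
        cmd.append(f"--since={since}")  # e.g. "1 week ago"
    return cmd

def commits_by_author(pattern, since=None, repo="."):
    """Run the command in `repo` and return one output line per matching commit."""
    result = subprocess.run(
        author_log_command(pattern, since),
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

Inside a repository, calling commits_by_author("spam@bot.com", since="1 week ago") would return one line per matching commit from the past week, each with the short hash, author, and subject.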

Crafting a Solution: Filtering Out the Spam

The next step was to create a script that would scan through recent commits and identify any that were from known “spammy” authors. This process was akin to cleaning out your closet—there’s a lot of junk, and you’ve got to sift through to find the gems.

Here’s a basic example using Python:

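import subprocess

def get_commits():
    # One line per commit: "Name <email>". No inner quotes are needed because
    # subprocess passes each list item to git as a single argument.
    result = subprocess.run(
        ['git', 'log', '--pretty=format:%an <%ae>'],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()

def filter_spam(commits):
    # Keep author lines that contain any known spam email (substring match).
    known_spammers = ['spam@bot.com', 'fake@user.com']
    return [commit for commit in commits if any(spam in commit for spam in known_spammers)]

commits = get_commits()
spam_commits = filter_spam(commits)
print("Spam commits found:", spam_commits)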

This little script helped us identify the authors we didn’t want in our repo. There’s something so satisfying about writing code that directly solves a problem you’re facing. It’s those “aha” moments that keep me coming back to development.
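One thing to watch with a substring check like the one in the script is that 'spam@bot.com' also appears inside 'notspam@bot.com', so a legitimate author could be flagged by accident. A possible refinement, sketched here on the assumption that each line looks like 'Name <email>', is to pull out the email and compare it exactly. The helper names and the blocklist constant are hypothetical.

```python
import re

# Hypothetical blocklist, using the same example addresses as the script above.
KNOWN_SPAM_EMAILS = {"spam@bot.com", "fake@user.com"}

# Matches the final <...> group at the end of a "Name <email>" line.
_EMAIL_AT_END = re.compile(r"<([^<>]*)>\s*$")

def author_email(author_line):
    """Return the lowercased email from 'Name <email>', or '' if none is found."""
    match = _EMAIL_AT_END.search(author_line)
    return match.group(1).strip().lower() if match else ""

def is_spam_author(author_line, blocklist=KNOWN_SPAM_EMAILS):
    """True only when the whole email equals a blocklisted address."""
    return author_email(author_line) in blocklist
```

With this version, "Real Dev <notspam@bot.com>" is no longer flagged, while "Spam Bot <SPAM@bot.com>" still is because the comparison ignores case.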

The Deployment: Implementing the Fix

Once I had the script ready, it was time to put it into action. I shared my findings with the team, and we decided to set up a routine to run the spam filter weekly. I was pretty nervous about it at first—what if it misidentified a legitimate contributor? But we had to take the plunge.

To ensure we didn’t flag anyone important, we included a review step where team members could check the flagged commits. It was a collective effort, and I loved that about our team. The camaraderie and shared responsibility made the process more enjoyable.
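A review step like this could be supported by a small helper that groups flagged commits by author and skips anyone the team has already cleared. This is only a sketch: the (commit hash, email) input shape, the allowlist, and the report layout are assumptions made for illustration.

```python
from collections import defaultdict

def build_review_queue(flagged, allowlist):
    """Group flagged (commit_hash, email) pairs by email, skipping cleared authors."""
    cleared = {email.lower() for email in allowlist}
    queue = defaultdict(list)
    for commit_hash, email in flagged:
        key = email.lower()
        if key in cleared:
            continue  # a teammate already confirmed this author is legitimate
        queue[key].append(commit_hash)
    return dict(queue)

def format_review_report(queue):
    """Render the queue as plain text, one author per section, for human review."""
    lines = []
    for email in sorted(queue):
        lines.append(f"{email}: {len(queue[email])} commit(s) to review")
        for commit_hash in queue[email]:
            lines.append(f"  - {commit_hash}")
    return "\n".join(lines)
```

The report is meant to be read by a person before anything is removed, so the helper only lists commits and never deletes or rewrites them.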

The Results: A Cleaner GitHub

After a few weeks of running our script, I can honestly say it was like a breath of fresh air. We noticed a significant drop in spam commits. It was thrilling to see contributions from our actual developers thrive without the noise drowning them out.

Of course, there were failures along the way. We did accidentally flag a couple of legitimate commits, but those moments turned into learning experiences. We refined our filter criteria, which not only improved accuracy but also brought us closer as a team.

Reflections: Lessons Learned and Future Thoughts

Now, looking back, I can’t help but feel a sense of pride in what we accomplished. We took a frustrating problem that could’ve derailed our productivity and turned it into an opportunity for improvement. It reminded me that in development, every challenge is an opportunity to learn something new.

So, what's my takeaway? Don’t hesitate to explore the tools at your disposal. Sometimes, the simplest solutions are hiding in plain sight. The Git --author flag may seem trivial, but it opened a world of possibilities for us.

In the future, I’d love to explore more sophisticated solutions, perhaps incorporating machine learning to identify spam patterns. After all, the world of tech is always evolving, and so should we.

I’m genuinely excited about sharing what I’ve learned. If you’ve ever faced similar challenges or have your own tips for combating spam, I’d love to hear them. Let’s keep the conversation going—after all, that’s what community is all about!


