Jean Devaux

Posted on Sep 29, 2020

CrowdSec, an open-source, modernized & collaborative Fail2ban

#security #devops #opensource #linux

Dear esteemed community,

We would like to introduce a new security project, CrowdSec, and collect your feedback & comments.
The solution is available on GitHub and will remain open-source (MIT license) and free of charge.

TL;DR: CrowdSec parses logs from various data sources, then normalizes and enriches them before applying heuristic scenarios to identify aggressive behaviors and protect you from most attack classes. As with fail2ban, things like credential stuffing, web or port scans, ssh / ftp / telnet brute-force, and many others are really easy to defeat with the software, but CrowdSec's modern grammar & architecture give users more possibilities.

Target & goal: CrowdSec is designed to protect servers, services, containers or VMs exposed on the Internet with a server-side agent. It currently runs on Linux (ports to macOS & Windows are on the roadmap).

How it works: The software is written in Go and was designed from day one to run on modern, complex architectures such as cloud environments, lambdas, containers, etc. To achieve this, it is “decoupled”, meaning you can “detect here” (say, in your database logs) and “remedy there” (say, in your firewall or reverse proxy). Internally, the tool uses leaky buckets to allow for tight event control. Scenarios are written in YAML to make them as simple and readable as possible without sacrificing granularity. The inference engine lets you get insights from chained buckets or meta-buckets: for example, if several buckets (web scan, port scan and failed login attempt) overflow into a “meta-bucket”, you can trigger a “targeted attack” remediation.
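To make the leaky-bucket and meta-bucket idea more concrete, here is a minimal Python sketch. It is not CrowdSec's actual implementation (which is written in Go and configured through YAML scenarios); the class names, parameters and scenario names below are all hypothetical and only illustrate the general technique: each event pours into a bucket that leaks at a fixed rate, and a burst that outpaces the leak makes it overflow.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass
class LeakyBucket:
    """Illustrative bucket, e.g. one per (scenario, source IP) pair."""
    capacity: int          # events the bucket can hold before overflowing
    leak_interval: float   # seconds it takes for one event to leak out
    level: float = 0.0
    last_seen: Optional[float] = None

    def pour(self, timestamp: float) -> bool:
        """Add one event at `timestamp`; return True if the bucket overflows."""
        if self.last_seen is not None:
            # Drain whatever leaked out since the previous event.
            leaked = (timestamp - self.last_seen) / self.leak_interval
            self.level = max(0.0, self.level - leaked)
        self.last_seen = timestamp
        self.level += 1
        if self.level > self.capacity:
            self.level = 0.0  # start fresh after an overflow
            return True
        return False


class MetaBucket:
    """Fires once every required scenario has overflowed for the same IP."""

    def __init__(self, required: Iterable[str]) -> None:
        self.required: Set[str] = set(required)
        self.seen: Dict[str, Set[str]] = {}

    def report(self, ip: str, scenario: str) -> bool:
        """Record an overflow; return True when all required scenarios were seen."""
        hits = self.seen.setdefault(ip, set())
        hits.add(scenario)
        return self.required <= hits
```

With a capacity of 5 and one event leaking every 10 seconds, six failed logins arriving one second apart overflow the bucket on the sixth event, while the same events spaced a minute apart never do. A meta-bucket built with, say, `MetaBucket({"web-scan", "port-scan", "login-failed"})` would only return True once all three hypothetical scenarios have overflowed for a given IP, which is where a "targeted attack" remediation could be triggered.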

Aggressive IPs are dealt with by bouncers. The CrowdSec Hub offers ready-to-use data connectors, bouncers (Nginx, PHP, Cloudflare, Netfilter) and scenarios to deter various attack classes. Bouncers will be able to remedy threats in various ways. We are working on bouncers that can present a Captcha, limit application rights, require MFA, throttle queries, or activate Cloudflare attack mode only when needed, etc. You also already get a sense of what’s happening locally (and where from), with a lightweight visualisation interface and strong Prometheus-based observability.

While the software currently looks like a 2020 pimped fail2ban, the endgame is to leverage the power of the crowd to create a very accurate IP reputation database. When CrowdSec bounces a specific IP, the triggered scenario and the timestamp are sent to our API, to be checked and integrated into the global consensus of bad IPs. While we are already redistributing a block list to our community (you can see it with the CLI: cscli ban list --api), we plan to really improve this part as soon as we have dealt with other, prerequisite parts of the code. The network already has sightings of 100K+ IPs (refreshed daily), and is able to redistribute ~10% (10K) of those to our community members. Also of note, the project has been designed to be GDPR compliant and privacy respectful, in both technical and legal terms.

Mid-term vision: Once the CrowdSec community is large enough, we will all generate, in real time, the most accurate IP reputation database. This global reputation engine, coupled with local behavior assessment and remediation, should allow lots of businesses to get tighter security at a very low cost.

Current state: Setup is quick & easy, heavily assisted by the wizard, so that as many people as possible can use it. The project is production-grade and already runs in many places, including hosting companies. As a good example, one of the CrowdSec users was able to stop a botnet attack from 7,000 different IPs in 5 minutes last week thanks to the solution. We are looking for more users, contributors and ambassadors to take the project to the next level. As of today, community members come from 21 countries across 5 different continents.

We would love to hear your feedback and engage in further discussion, so don’t hesitate to comment, reach out through our website, GitHub or Discourse, or give us a shout on Gitter.

We hope you will like it, use it and eventually contribute to improving it. Thanks in advance for sharing your thoughts.

The CrowdSec team

Top comments (8)

Phil Ashby • Sep 29 '20

Hi Jean, this looks interesting, you have a few competitors around (I found a nice list here, which I'm sure the CrowdSec team are aware of, but readers may not be! zeltser.com/malicious-ip-blocklists/).

Can you provide any more information on the business model / premium features (as per the roadmap on the website) your investors are relying on for their profit? I would be concerned about a bait & switch model, or selling user info ;)

Jean Devaux • Sep 30 '20

Hi Phil,

Thanks for your comments and for sending the list over. We do indeed know most of these competitors.

Regarding our business model, we will monetize CrowdSec data in the least aggressive way possible. In short, anyone can use CrowdSec without sharing data, and that’s fine by us. These users do not get the community signals, though. We call them the “free tier”. The “watcher tier” can use the product and the IP reputation service entirely for free.

We are studying two monetization plans. These paid features are basically added value that costs us money to create and operate. Typically, the “premium tier” offers better support, self-monitoring (of your own IPs, to see if any get compromised), “type” filtering, where you can describe what you don’t want to see on your network with more granularity (AS, Geo, IP activity type, etc.), and cold log analysis, which allows you to use the IP reputation DB for forensics. This last activity implies that we keep a history of how an IP behaved in the past and correlate this information with your log timestamps, hence using space on our storage.

The “enterprise tier” offers the same benefits as the premium tier plus fleet management features. Typically this plan is made for companies handling hundreds of exposed endpoints, administration IPs, VPNs, websites, apps, etc. They can centrally define several filtering profiles and enforce them on a large scale, from a single back office. This plan also includes a private consensus, where CrowdSec agents belonging to the same machine group can ban IPs that target only one specific customer and are therefore not visible in the global database, but can still be identified locally.

The “API tier” will simply query the API to get the reputation of a given IP they are dealing with. These users don’t share any signals with us, hence they pay to get access to this data.

Phil Ashby • Sep 30 '20

Brilliant - thanks for the clear explanation Jean!

Jean Devaux • Sep 30 '20

No problem Phil. Do you feel like testing it?

Phil Ashby • Sep 30 '20

I may do - currently using fail2ban on all my public-facing Debian systems.. would be delighted to see the CrowdSec agent in official Debian, or at least a managed .deb repository I can configure into apt. I tend to avoid 'build from source' or unmanaged binary (download) approaches to production systems :)

Jean Devaux • Oct 1 '20 • Edited

Good point, we are working on it at the moment. More specifically, the team is currently working on an important update of the open source software (APIL - github.com/crowdsecurity/crowdsec/...). Also, we have already started work on the packaging (github.com/crowdsecurity/crowdsec/...) and it will be on our roadmap after the next release.

yellow1912 • Sep 29 '20

Very interesting. I hope the setup is simple so that it can become popular and useful for everyone. Quick question: can it be abused to blacklist a good player?

Jean Devaux • Sep 29 '20

Thanks for your comment.

You can set it up and get started in 5 minutes; creating an easy-to-use solution was critical to us.

Indeed, poisoning is the main threat to the integrity of the central IP reputation database. To limit risks, we are creating a "trust factor" mechanism that we use to rate users. When a user's trust is too low, their reports aren't even taken into account (unless confirmed by trusted members). The trust rate will grow based on factors such as the accuracy and consistency of reports. The idea is that we want the trust factor to be as hard as possible to fake or artificially grow. Last but not least, we are mostly relying on our honeypot network as of now to weight decisions. Also, we are distributing whitelists (from the Hub) that will ensure that even poorly configured scenarios aren't going to ban critical actors/partners (e.g. SEO bots).

Does that help?
