DevOps is one of the most challenging fields to be in, and to stay relevant you need to learn constantly.
Check out SigNoz - an open-source application performance monitoring tool.
So today, I want to share 5 amazing GitHub projects that will help you become a better DevOps engineer. They can come in handy for anyone who wants to learn and is looking for good resources to dive into. 🏊‍♀️
So let's get started👊
1. How they SRE
⭐ GitHub stars: 4.8k
This repo is a curated collection of publicly available resources on how technology and tech-savvy organizations around the world practice Site Reliability Engineering (SRE).
upgundecha / howtheysre
A curated collection of publicly available resources on how technology and tech-savvy organizations around the world practice Site Reliability Engineering (SRE)
Introduction
How They SRE is a curated knowledge repository of best practices, tools, techniques, and culture of SRE adopted by the leading technology or tech-savvy organizations.
Many organizations regularly come forward and share their best practices, tools, techniques and offer an insight into engineering culture on various public platforms like engineering blogs, conferences & meetups. The content is curated from these avenues and shared in this repository.
Note to readers: This list refers to some of the articles, posts, videos, tools, and techniques published before 2015. Please use such material with caution as there may be recent advances in technology and practices which offer better alternatives and perspectives.
Topics
- Site Reliability Engineering
- Hiring and Building SRE teams
- SRE Culture
- DevOps
- Monitoring & Observability
- Alerting
- Incident Response…
2. Awesome Scalability
⭐ GitHub stars: 32.5k
This repo has an organized reading list for illustrating the patterns of scalable, reliable, and performant large-scale systems. This is one of the best resources on scalability with real examples from large organizations.
binhnguyennus / awesome-scalability
The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
An updated and organized reading list for illustrating the patterns of scalable, reliable, and performant large-scale systems. Concepts are explained in the articles of prominent engineers and credible references. Case studies are taken from battle-tested systems that serve millions to billions of users.
If your system goes slow
Understand your problem first: is it a scalability problem (fast for a single user but slow under heavy load) or a performance problem (slow for a single user)? Review some design principles and check how scalability and performance problems are solved at tech companies. The section on intelligence is written for those who work with data and machine learning at big (data) and deep (learning) scale.
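To make the distinction above concrete, here is a minimal sketch in Python. It is not taken from the awesome-scalability repo. The function name `classify_slowness`, the use of median latency, and the `target_ms` latency budget are all assumptions made for illustration.

```python
from statistics import median


def classify_slowness(single_user_ms, under_load_ms, target_ms):
    """Label a slowdown as a performance or a scalability problem.

    single_user_ms: latencies (ms) measured with one user at a time
    under_load_ms:  latencies (ms) measured under heavy concurrent load
    target_ms:      hypothetical latency budget for an acceptable request
    """
    single = median(single_user_ms)
    loaded = median(under_load_ms)

    if single > target_ms:
        # Slow even for a single user
        return "performance problem"
    if loaded > target_ms:
        # Fast for a single user, slow under heavy load
        return "scalability problem"
    return "no problem detected"


# Example: quick on its own, slow once traffic ramps up
print(classify_slowness([40, 50, 45], [400, 520, 610], target_ms=200))
```

In practice you would feed this with numbers from a load test rather than hand-written lists, and you would probably look at tail latencies as well as the median.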
If your system goes down
…"Even if you lose all one day, you can build all over again if you retain your calm!" - Thuan Pham, former CTO of Uber. So, keep calm and mind the availability and stability matters!
3. DevOps Exercises
⭐ GitHub stars: 8.6k
This repo contains questions and exercises on technical topics related to DevOps and SRE.
bregman-arie / devops-exercises
Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions
4. Test your sysadmin skills
⭐ GitHub stars: 7.2k
This project contains test questions and answers that can be asked during an interview/exam for positions such as Linux System Administrator.
trimstray / test-your-sysadmin-skills
A collection of Linux Sysadmin Test Questions and Answers. Test your knowledge and skills in different fields with these Q/A.
"A great Admin doesn't need to know everything, but they should be able to come up with amazing solutions to impossible projects." - cwheeler33 (ServerFault)
"My skills are making things work, not knowing a billion facts. [...] If I need to fix a system I’ll identify the problem, check the logs and look up the errors. If I need to implement a solution I’ll research the right solution, implement and document it, the later on only really have a general idea of how it works unless I interact with it frequently... it’s why it’s documented." - Sparcrypt (Reddit)
5. Awesome Site Reliability Engineering
⭐ GitHub stars: 6.5k
This repo has a curated list of awesome Site Reliability and Production Engineering resources.
dastergon / awesome-sre
A curated list of Site Reliability and Production Engineering resources.
What is Site Reliability Engineering?
"Fundamentally, it's what happens when you ask a software engineer to design an operations function." - Ben Treynor Sloss, VP Google Engineering, founder of Google SRE
Contributing
Please take a look at the contribution guidelines first. Contributions are always welcome!
Contents
- Culture
- Education
- Books
- Hiring
- Reliability
- Monitoring & Observability & Alerting
- On-Call
- Post-Mortem
- Capacity Planning
- Service Level Agreement
- Performance
- Programming
- Misc Articles
- Real-time Messaging
- Blogs
- Newsletters
- Conferences & Meetups
- SRE Tools
Culture
- What is Site Reliability Engineering?
- Keys To SRE by Ben Treynor
- Google SRE Resources
- Notes from Production Engineering by Pedro Canahuati
- PostOps: Recovery from Operations
- Love DevOps? Wait 'till you meet SRE [video]
- How Google Does Planet-Scale Engineering for Planet-Scale Infra
- Site Reliability Engineering at Facebook
- A History of Site Reliability Engineering at Uber
- Case Study: Adopting…
I hope you enjoyed this list. I will be coming up with more such amazing resources soon. So, stay tuned! 🙂
Currently building SigNoz - an open-source alternative to DataDog, New Relic, etc. 💙
SigNoz helps developers monitor applications and troubleshoot problems in their deployed applications. Check out our GitHub repo👇
SigNoz / signoz
SigNoz is an open-source Application Performance Monitoring (APM) and observability tool, and an open-source alternative to DataDog, New Relic, etc. It helps developers monitor their applications and troubleshoot problems. 🔥 🖥
SigNoz helps developers monitor applications and troubleshoot problems in their deployed applications. SigNoz uses distributed tracing to gain visibility into your software stack.
Features:
- Application overview metrics like RPS, 50th/90th/99th Percentile latencies, and Error Rate
- Slowest endpoints in your application
- See exact…
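If the overview metrics in the feature list are new to you, the sketch below shows one way to derive them from raw request data. It is only an illustration and makes no claim about how SigNoz computes these numbers. The function name `overview_metrics` and the `(latency_ms, is_error)` input format are assumptions.

```python
from statistics import quantiles


def overview_metrics(requests, window_seconds):
    """Summarize the requests seen in one time window.

    requests:       list of (latency_ms, is_error) tuples (hypothetical format),
                    with at least two entries
    window_seconds: length of the observation window in seconds
    """
    latencies = [latency for latency, _ in requests]
    errors = sum(1 for _, is_error in requests if is_error)

    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile
    cuts = quantiles(latencies, n=100, method="inclusive")

    return {
        "rps": len(requests) / window_seconds,   # requests per second
        "p50_ms": cuts[49],                      # median latency
        "p90_ms": cuts[89],
        "p99_ms": cuts[98],
        "error_rate": errors / len(requests),    # fraction of failed requests
    }


# Example: 100 requests in a 50-second window, every tenth one failing
sample = [(latency, latency % 10 == 0) for latency in range(1, 101)]
print(overview_metrics(sample, window_seconds=50))
```

The percentiles matter because an average can hide a slow tail: the 99th percentile tells you what your unluckiest users experience.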