DEV Community: Samad Yar Khan

Top 10 Tips to Get Started with Open Source and GSoC

Samad Yar Khan — Thu, 22 Aug 2024 16:52:30 +0000

During my sophomore year, I was applying for internships daily but had no luck. Then, I read a blog that recommended contributing to open source to gain real-world experience and mentioned opportunities like Google Summer of Code (GSoC), the MLH Fellowship, and the Linux Foundation Mentorship. Intrigued, I decided to dive in.

However, I quickly felt overwhelmed by the vast number of projects with complex codebases. My initial contributions were minor UI fixes, and I didn't get selected for GSoC that year.

After a break, I returned with a better approach and eventually secured a spot in GSoC the next year.

Whether you’re aiming for GSoC or simply want to start contributing, here are my top 10 tips for getting started in open source and preparing for GSoC.

Choose the Right Organization
Become a User First
Start Small: Pick a Tiny Functionality
Tackle Open Issues
Communicate and Ask Questions
Focus on Consistency, Not Quantity
Learn to Navigate Large Codebases
Don’t Overlook Documentation and Testing
Sharpen Your Git Skills
Read the Code

1. Choose the Right Organization

Begin by browsing through various open-source organizations. Focus on projects that align with your skills and interests. For example, if you’re into web development, look for organizations that work on web-based projects. Similarly, if you’re proficient in languages like Java, Kotlin, or Python, target projects that make extensive use of those languages.

One way to find these projects is by exploring past GSoC organizations and filtering them based on your area of interest.

You can also use this website to view trends of past projects and filter them by the languages used in those projects.

Ensure you’re genuinely interested in the project and that it solves a problem you care about. Aligning your interests, skills, and the organization’s needs will make the contribution process much smoother.

2. Become a User First

Open-source projects often have large and complex codebases, which can feel overwhelming at first. A great way to familiarize yourself with a project is by becoming a user. Build the project locally, explore its features, and use it like any other software. This hands-on experience will help you understand its functionality and spot areas for improvement.

For instance, before contributing to organizations like Joplin, Rocket.Chat, or CircuitVerse, I spent over a week experimenting with their projects. This not only helped me understand the codebase but also gave me ideas on where I could contribute.

3. Start Small: Pick a Tiny Functionality

Before diving into big features or fixing complex issues, start small. Choose a minor feature that interests you—like tweaking logic in a login screen or updating a button’s behaviour—and search the codebase for relevant keywords. Make small changes and see how they affect the project. In short, hack it out! This process will help you get comfortable with the code while boosting your confidence.

4. Tackle Open Issues

Open issues are a great starting point but can be intimidating at first. A helpful strategy is to look for issues tagged good-first-issue.

Websites like Good First Issue and Good First Issues make it easier to find beginner-friendly issues.

A useful trick is to reverse-engineer solved issues. Start by reviewing closed issues similar to the one you’re interested in. Understand how they were resolved, then apply that knowledge to your solution. This approach not only helps you solve current issues but also provides insights into the project’s problem-solving patterns.

5. Communicate and Ask Questions

One of my first open-source contributions to Rocket.Chat involved fixing the login screen. I spent a week working on it and opened a pull request (PR). Moments later, I found out that the internal team was already revamping the entire login screen, so my PR couldn’t be merged. All that effort could have been saved if I had first communicated with the maintainers.

Open-source communities are generally welcoming and supportive. Don’t hesitate to ask questions if you’re stuck or need guidance. Whether it’s understanding the project’s structure, finding relevant files, or figuring out best practices, reach out to the community. Engaging in discussions will also help you build relationships with other contributors and maintainers. Communicating your approach before diving in can save you from unnecessary rework.

6. Focus on Consistency, Not Quantity

Contributing to open source is a marathon, not a sprint. It’s better to be consistent with small contributions than to aim for big changes that might be hard to sustain. As you get more comfortable, you can take on bigger tasks.

For example, one of my first contributions was adding a small brand logo component to Rocket.Chat. A few months later, I worked on adding a GitHub Component Kit, which was a much more complex task. During this period, I focused on learning Next.js and improving my code quality.

Persistence is key—making your first meaningful contribution might take a few weeks, but that’s where the real growth happens.

7. Learn to Navigate Large Codebases

As you progress, you’ll need to get comfortable navigating large codebases. Tools like grep, ripgrep, or your IDE’s search function are invaluable for locating specific functions, variables, or files.

If you’re looking to fix something in the UI, search for the keywords visible on the screen.
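As a rough illustration of that keyword search, here is a minimal Python sketch that does what `grep -rin` or ripgrep would do: it walks a checkout and lists every line containing a piece of text you saw on the screen. The function name `find_text`, the skipped directories, and the default file extensions are all assumptions made for this example, so adjust them to the project you are exploring.

```python
import os
from pathlib import Path

# Directories that rarely contain hand-written source (assumed defaults).
SKIP_DIRS = {".git", "node_modules", "dist", "build"}


def find_text(
    root: str,
    needle: str,
    extensions: tuple[str, ...] = (".js", ".jsx", ".ts", ".tsx", ".html"),
) -> list[tuple[str, int, str]]:
    """Return (path, line_number, line) for every line containing `needle`."""
    hits = []
    wanted = needle.lower()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            if not name.endswith(extensions):
                continue
            path = Path(dirpath) / name
            try:
                text = path.read_text(encoding="utf-8", errors="ignore")
            except OSError:
                continue  # unreadable file, skip it
            for number, line in enumerate(text.splitlines(), start=1):
                if wanted in line.lower():  # case-insensitive match
                    hits.append((str(path), number, line.strip()))
    return hits
```

Calling `find_text(".", "Forgot your password?")` from the repository root would list the files that render that label, which is usually a good entry point into the relevant component.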

If your focus is on data-related issues, examine the network calls happening during the loading process to figure out which call corresponds to which component.

Also, use documentation, GitHub Wikis, and architecture diagrams (if available) to understand how different components interact.

8. Don’t Overlook Documentation and Testing

Many new contributors skip over documentation and testing, but these are areas where help is often needed. Improving documentation, creating tutorials, or adding test cases are valuable contributions.

Writing test cases helps you understand the project’s core logic, while documenting code deepens your own understanding and helps others.

9. Sharpen Your Git Skills

Version control is at the heart of open-source development. Get comfortable with Git commands and workflows, including branching, pull requests, rebasing, and resolving merge conflicts.

Git isn’t always as straightforward as it seems, and you’ll likely encounter tricky situations like merge conflicts or rebasing issues. Learning how to navigate these problems is essential. You can start with this Git Guide or find something more beginner-friendly.

Projects often have specific contribution guidelines, so be sure to follow them and keep your commit history clean.

10. Read the Code

One of the most underrated skills in open source is the ability to dive into a codebase and start contributing, even if documentation is lacking. While documentation is helpful, it’s often outdated or incomplete in many open-source projects.

When I worked on adding a PR review feature during my GSoC period, I needed to integrate a code editor block into the component kit. This required changes across three different codebases, none of which were documented. The only way forward was to read a lot of code and commits, piecing together a solution.

As a newcomer, focus on reading as much code as possible and try to map out how different pieces connect to features you see on the screen.

Conclusion

Your first few contributions might be slow, and that’s perfectly fine. Getting comfortable with a new codebase, understanding the project’s culture, and making meaningful contributions takes time. Enjoy the process and celebrate the small wins. The satisfaction of seeing your code accepted into a project used by thousands (or even millions) of people makes it all worthwhile.

By following these tips and being patient with yourself, you’ll find that contributing to open source is not just a way to level up your skills—it’s also an opportunity to collaborate with like-minded people and create something impactful. Good luck on your open-source journey and in your pursuit of GSoC!

Feel free to check out the open issues in Middleware Open Source👇🏻

middlewarehq / middleware

✨ Open-source DORA metrics platform for engineering teams ✨

Open-source engineering management that unlocks developer potential

Join our Open Source Community

Introduction

Middleware is an open-source tool designed to help engineering leaders measure and analyze the effectiveness of their teams using the DORA metrics. The DORA metrics are a set of four key values that provide insights into software delivery performance and operational efficiency.

They are:

Deployment Frequency: The frequency of code deployments to production or an operational environment.
Lead Time for Changes: The time it takes for a commit to make it into production.
Mean Time to Restore: The time it takes to restore service after an incident or failure.
Change Failure Rate: The percentage of deployments that result in failures or require remediation.


Building a Scalable Notifications and Alerting System 🔥🚀

Samad Yar Khan — Fri, 09 Aug 2024 13:44:09 +0000

Introduction

This article explores the development of a Rule-Based alerting and notification system at Middleware. The system, known as the Playbook, aims to notify customers when certain metrics cross predefined thresholds.

The Idea💡

One of my first tasks after joining Middleware as a full-time software engineer was to build something known as the Playbook.

The Playbook serves as a Rule-Based alerting system, allowing Engineering managers to receive notifications when specific metrics exceed set thresholds. For instance:

Send an email if developers spend more than 50% of their time on bug fixes in the past week.
Send a Slack message if the team’s average PR Rework Time exceeds 6 hours over the last month.

Application Flow

Setting Up Rules and Cadence: Users define alerting rules for their teams and select the time range for rule checks.
Breach Processing: The system reads rules from the database, validates metrics against these rules, and generates notifications for breaches.
Dispatching Notifications: Notifications are sent to users based on their set time zones.

Assumptions

Entities like Users and Teams exist.
Services can provide metric values for Users or Teams within specified time ranges.
A notification dispatcher can send notifications via Slack/Email.
One notification is sent per breach.

Low Level Design

The system employs various models, including Playbook, Playbook Rules, Playbook Rule Breaches, and Notifications. These models handle rule creation, breach generation, and notification dispatching.

Functionalities:

Playbook Core: Create a playbook as an aggregation of Rules. Each rule has some setting data and cadence.
Breach Processor: Identify breaches based on the rules set by the user and the rule cadence.
Notification Processor: Create Notifications based on the breaches.
Notification Dispatcher: Sends Notifications via different channels.

Models:

Playbook and Playbook Rules:

A Playbook is set by a manager for a team.
Each Playbook has a set of Rules, one per metric, each with a set threshold.



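from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from uuid import UUID


class PlaybookRuleType(Enum):
    CYCLE_TIME = "CYCLE_TIME"
    INCIDENT_COUNT = "INCIDENT_COUNT"


@dataclass(eq=False)
class PlaybookRule:
    rule_type: PlaybookRuleType
    rule_data: dict  # rule-specific settings, e.g. the threshold
    alert_cadence: "PlaybookRuleAlertCadence"  # enum described under Alert Cadence
    users_to_notify: set[UUID]
    is_active: bool

    # Rules are compared and hashed by rule type, so a set holds one rule per type
    def __eq__(self, other: object) -> bool:
        return isinstance(other, PlaybookRule) and self.rule_type == other.rule_type

    def __hash__(self) -> int:
        return hash(self.rule_type)


@dataclass
class Playbook:
    team_id: UUID
    created_by: UUID
    created_at: datetime
    updated_at: datetime
    updated_by: UUID
    rules: set[PlaybookRule]  # set hashes based on rule type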

Alert Cadence refers to how often a user would like to receive these notifications.
Daily Cadence: Breaches are calculated daily, and notifications are sent every day according to the user's time zone.
Weekly Cadence: Breaches are calculated based on weekly data, and notifications are sent every Monday.
Two Weeks Cadence: Breaches are calculated over the past two weeks’ data, and notifications are sent every second Monday.
Monthly Cadence: Breaches are calculated over the monthly average, and notifications are sent on the 1st of each month.



class PlaybookRuleAlertCadence(Enum):
    DAILY = "DAILY"
    WEEKLY = "WEEKLY"
    TWO_WEEKS = "TWO_WEEKS"
    MONTHLY = "MONTHLY"

Playbook Breaches:

Triggering Breaches: Whenever a metric exceeds or falls below the set threshold, a PlaybookBreach is generated.
Linkage: Each PlaybookBreach is associated with a playbook and a rule type, providing context for the breach.
Rule Data Inclusion: To accommodate potential rule changes later, each breach includes the rule data as it was at the time of generation. This ensures historical accuracy and consistency despite future rule modifications.



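from dataclasses import dataclass
from uuid import UUID


@dataclass
class PlaybookRuleBreach:
    playbook: UUID  # ID of the playbook that owns the breached rule
    rule_type: PlaybookRuleType
    rule_data: dict  # snapshot of the rule data at breach time
    team_id: UUID
    alert_cadence: PlaybookRuleAlertCadence
    metric_value: float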

Notifications:

Breach Notification: Upon breach creation, a notification can be generated and sent to the user.
Preventing Duplicates: To avoid duplicate notifications, each notification is assigned an idempotency key, ensuring uniqueness in the database.
Notification Model Flexibility: The notification model is designed to be versatile, accommodating other services beyond the Playbook. As such, each notification can be categorized by type to facilitate organization and handling.



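from dataclasses import dataclass
from datetime import datetime
from typing import Optional
from uuid import UUID


@dataclass
class Notification:
    receiver_id: UUID
    idempotency_key: str  # unique in the database
    notification_type: "NotificationTypes"  # enum of notification categories
    due_at: datetime
    queued_at: Optional[datetime] = None  # set when the job is enqueued
    sent_at: Optional[datetime] = None  # set once the notification is dispatched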

High Level Design

In this section we define how to make this system robust:

Ensure breaches are generated reliably at set intervals despite system failures.
Implement measures to prevent the generation of duplicate breaches.
Establish safeguards to avoid sending duplicate notifications.
Develop a retry system for notifications in case of bugs or system failures.

Generating and Processing Breaches

CRONS

We must check and process our playbook rules based on the cadence set by the user.

For this purpose we can simplify the system to use a CRON that runs Daily.

Process rules with a daily cadence each time the CRON job executes.
Check if the current date is the 1st to process rules with a monthly alert cadence.
For rules with a weekly alert cadence, process them on Mondays.
Handle rules with a two-week alert cadence by processing them on the first or third Monday of the month.
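The daily check described above can be sketched as a small function that, given today's date, returns the cadences whose rules should be processed. This is an illustrative sketch and not Middleware's actual code: the function name `cadences_due` is hypothetical, and the enum is repeated here only to keep the example self-contained.

```python
from datetime import date
from enum import Enum


class PlaybookRuleAlertCadence(Enum):
    DAILY = "DAILY"
    WEEKLY = "WEEKLY"
    TWO_WEEKS = "TWO_WEEKS"
    MONTHLY = "MONTHLY"


def cadences_due(today: date) -> set[PlaybookRuleAlertCadence]:
    """Return the cadences whose rules the daily CRON should process today."""
    due = {PlaybookRuleAlertCadence.DAILY}  # daily rules run on every execution
    if today.day == 1:
        due.add(PlaybookRuleAlertCadence.MONTHLY)
    if today.weekday() == 0:  # Monday
        due.add(PlaybookRuleAlertCadence.WEEKLY)
        # Days 1-7 hold the 1st Monday, 8-14 the 2nd, 15-21 the 3rd, and so on.
        monday_index = (today.day - 1) // 7 + 1
        if monday_index in (1, 3):
            due.add(PlaybookRuleAlertCadence.TWO_WEEKS)
    return due
```

One consequence of the first-or-third-Monday rule is that the gap between two-week runs is not always 14 days. For example, the third Monday of July 2024 is the 15th and the first Monday of August 2024 is the 5th, which is three weeks later.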

Jobs and Workers

Processing all this data inside a single CRON process can be a challenge in case one of the rules fails due to incorrect data, an unhandled case in the code, or deleted entities.

Example: If you have processed 10 rules and there is an unhandled case on the 11th rule, the breaches and notifications generated so far are wasted and never stored in the DB. Similarly, once the CRON throws an error, it has to be re-triggered manually; otherwise we have to wait for its next scheduled run.

To avoid this, we use the producer-consumer model.

Treat each Playbook Rule as a single job and enqueue these jobs into a PlaybookRuleQueue.
Use multiple workers that listen to the PlaybookRuleQueue and process one job at a time to generate PlaybookRuleBreach and Notification records.
If a job fails, the Queue receives a 500 status code. We can set up alerts for this, and the Queue can retry the failed job.
Using a non-FIFO queue ensures that failed jobs do not block the queue while fixes are being deployed.
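To make the producer-consumer idea concrete, here is a small in-memory sketch using Python's standard `queue` and `threading` modules. It is only a stand-in for a managed queue and serverless workers: `RuleJob`, `MAX_ATTEMPTS`, `worker`, and `run` are hypothetical names, and the retry budget of three attempts is an assumption. The point it illustrates is that one failing rule is retried on its own and does not discard the work done for the other rules.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass
class RuleJob:
    rule_id: str
    attempts: int = 0


MAX_ATTEMPTS = 3  # assumed retry budget before a job is set aside for alerting


def worker(jobs: "queue.Queue[RuleJob]", process, failed: list) -> None:
    """Consume jobs one at a time until the queue is empty."""
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            return
        try:
            process(job)  # generate breaches and notifications for one rule
        except Exception:
            job.attempts += 1
            if job.attempts < MAX_ATTEMPTS:
                jobs.put(job)  # goes to the back, so other jobs are not blocked
            else:
                failed.append(job)  # give up; this is where an alert would fire
        finally:
            jobs.task_done()


def run(rule_ids, process, worker_count: int = 4) -> list:
    """Producer: enqueue one job per rule, then let the workers drain the queue."""
    jobs: "queue.Queue[RuleJob]" = queue.Queue()
    for rule_id in rule_ids:
        jobs.put(RuleJob(rule_id))
    failed: list = []
    threads = [
        threading.Thread(target=worker, args=(jobs, process, failed))
        for _ in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return failed
```

With twelve rules where only the eleventh raises, `run` returns a list containing that single job after three attempts, while the other eleven rules are processed normally.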

Regarding technology choices:

The system described uses Amazon SQS and Lambdas for its internal systems.
While AWS provides built-in functionality for this setup, similar systems can be built using other service providers or custom solutions.

Queuing Notifications

Once the Notifications are in the database, an hourly CRON checks for any due notifications in the DB and queues them as jobs.

These notification jobs are handled by the notification dispatcher, which decides which channel to use to notify the user and runs any additional logic needed to build the message for that notification type.

Points of Failure

As data travels through distributed systems, there are chances of failure. We handled some cases in our design, but each system has its own shortcomings.

Handled Cases

Processing jobs from the PlaybookRuleQueue:

Failed saving operations for breaches are retried by the Queue.
If saving notifications fails and the job is retried, no duplicate breaches are generated for the same job: breaches are idempotent on playbook_id, rule_type, and the rule-checking interval.

Processing and Saving Notification

When re-queuing playbook rule jobs, we prevent duplicate notifications from being saved in the DB by using an idempotency key based on breach data.
The idempotency key ensures that only one notification can be created from one breach.
In case a dispatched notification fails, it is retried by the queue.
In case a sent notification is re-queued, the worker checks the notification's sent_at field to make sure it is not resent.
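The two safeguards above can be sketched as follows. This is a minimal illustration under stated assumptions, not the production code: the key is derived by hashing the breach's identifying fields plus the receiver (the exact fields and the use of SHA-256 are assumptions), and `NotificationStore` is a hypothetical in-memory stand-in for a table with a unique constraint on `idempotency_key`.

```python
import hashlib
from datetime import datetime, timezone


def idempotency_key(playbook_id: str, rule_type: str,
                    interval_start: str, receiver_id: str) -> str:
    """Derive a stable key: the same breach and receiver always give the same key."""
    raw = "|".join([playbook_id, rule_type, interval_start, receiver_id])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class NotificationStore:
    """In-memory stand-in for a table with a unique index on idempotency_key."""

    def __init__(self) -> None:
        self._rows: dict[str, dict] = {}

    def create_once(self, key: str, receiver_id: str) -> bool:
        """Insert the notification unless one with this key already exists."""
        if key in self._rows:
            return False  # a retried job produced the same notification again
        self._rows[key] = {"receiver_id": receiver_id, "sent_at": None}
        return True

    def dispatch(self, key: str, send) -> bool:
        """Send the notification unless sent_at shows it already went out."""
        row = self._rows[key]
        if row["sent_at"] is not None:
            return False  # re-queued after sending, so do not resend
        send(row["receiver_id"])
        row["sent_at"] = datetime.now(timezone.utc)
        return True
```

In a real database the duplicate check would normally be enforced by the unique constraint itself rather than by a lookup followed by an insert, so that two workers racing on the same key cannot both succeed.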

Shortcomings

Rule Changes

In case a user updates a rule after a notification has been generated and is waiting to be sent, we still send the notification.
This is an edge case that we did not handle internally, simply because of time constraints.
It could be handled by cross-checking the rule data stored in the breach that generated the notification against the rule data in the associated playbook and, if they differ, deleting the notification and re-queueing a PlaybookRule job.
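The cross-check suggested above could be as small as the following sketch. The function name `notification_is_stale` and its parameters are hypothetical, and treating a deleted or deactivated rule as stale is an added assumption that the article does not cover.

```python
from typing import Optional


def notification_is_stale(breach_rule_data: dict,
                          current_rule_data: Optional[dict],
                          rule_is_active: bool = True) -> bool:
    """Return True if the rule changed after the breach snapshot was taken."""
    if current_rule_data is None or not rule_is_active:
        return True  # assumption: a removed or disabled rule also invalidates it
    # The breach stores the rule data as it was at generation time.
    return breach_rule_data != current_rule_data
```

A dispatcher could call this just before sending and, when it returns True, delete the pending notification and re-queue the rule job instead.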

Conclusion

The Rule-Based Notification System developed at Middleware has proven to be robust in the past year at our current scale.

While it may not be the most scalable notifications system out there, I hope this article can get you thinking in the right direction :)


LLAMA 3.1 vs GPT4: Which is smarter for analytics?

Samad Yar Khan — Tue, 30 Jul 2024 11:29:48 +0000

Introduction
Background
Objectives
Implementation
- Data Processing: Middleware to the Rescue
- Model Integration: FireworksAi and OpenAI
Evaluation and Results: GPT4o Vs LLAMA 3.1
- Mathematical Accuracy
- Data Analysis
- Actionability
- Summarisation
Conclusion
Future Work

Introduction

Middleware is a platform that enables engineering leaders to derive actionable insights from data and improve their processes, making dev teams more efficient. With AI moving fast, we have been continuously trying to integrate ML models across the product to derive actionable insights from that data.

After some experimentation, we found that the open-source LLAMA and Mistral models we wanted to use were good, but GPT4o was more reliable on data-centric problems. So we decided to move in the more sophisticated direction of building RAG pipelines and using function calling.

All this changed when Meta released the LLAMA 3.1 models. The 70B and 405B models are among the best open-source models available and compete neck and neck with GPT4o. So, as an experiment, we integrated AI-powered DORA reports to see how GPT4o and LLAMA 3.1 perform at data analysis and reasoning.

Background

DORA metrics provide critical insights into the performance and reliability of software delivery processes.

1) Lead Time for Changes

Lead time consists of First Commit to PR Open time, First Response Time, Rework Time, Merge Time, and Merge to Deploy Time.

2) Deployment Frequency

This metric gauges how frequently code changes are deployed to production.

3) Mean Time to Recover (MTTR)

MTTR measures how swiftly a team can restore service after a failure occurs in production.
A team's MTTR is computed as its average incident resolution time.

4) Change Failure Rate (CFR)

CFR quantifies the percentage of changes that result in a service impairment or outage in production, aiding in the evaluation of deployment process stability and reliability.
CFR is computed by linking incidents to deployments within an interval; each deployment may have several or no incidents.
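As a worked illustration of that linking step, the sketch below computes CFR as the percentage of deployments that have at least one incident linked to them. The linking rule used here, attributing each incident to the most recent deployment at or before it, is an assumption made for the example; the article does not spell out how Middleware links the two.

```python
from bisect import bisect_right
from datetime import datetime


def change_failure_rate(deployments: list[datetime],
                        incidents: list[datetime]) -> float:
    """Percentage of deployments with at least one linked incident."""
    if not deployments:
        return 0.0
    ordered = sorted(deployments)
    failed = set()  # indexes of deployments that caused one or more incidents
    for incident in incidents:
        # Index of the latest deployment that happened at or before the incident.
        index = bisect_right(ordered, incident) - 1
        if index >= 0:  # incidents before the first deployment are not linked
            failed.add(index)
    return 100.0 * len(failed) / len(ordered)
```

With five weekly deployments and three incidents, two of which follow the same deployment, two deployments count as failed and the CFR is 40%. Several incidents on one deployment do not raise the rate further.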

You can learn more about DORA metrics here. By leveraging advanced LLMs, we aim to automate the analysis of these metrics, providing teams with deeper and more actionable insights.

Objectives

To integrate LLMs into Middleware for the analysis of DORA metrics.
To compare the performance of different large language models in terms of:
- Mathematical Accuracy: How well can the model calculate the DORA score?
- Data Analysis: Can the model analyse the input data and derive correct inferences?
- Summarising: How well can the model summarise data?
- Actionability: How well can the model suggest an action plan based on the input data?

Implementation

Data Processing: Middleware to the Rescue

Middleware syncs all your data from different sources and calculates the DORA Metrics for your teams.
Check out middlewarehq/middleware and set up the dev server using Docker.

Model Integration: FireworksAi and OpenAI

We integrated OpenAI GPT4o and LLAMA 3.1 (70B and 405B) models.
The OpenAI models use the official OpenAI API under the hood, while the Fireworks AI APIs have been used to integrate the 70B and 405B LLAMA 3.1 Models.
These AI analytics are powered by the AIAnalyticsService in the analytics server. This service can be extended to use more closed-source models from OpenAI or open-source models via Fireworks AI.
Changes on the front end introduce components and BFF logic allowing users to enter their token, choose a large language model and generate AI Reports for their DORA Metrics.
Whenever the user tries to generate AI analysis, the UI makes a POST request to the BFF API: internal/ai/dora_metrics with all the preprocessed DORA Metrics and trends data.
This BFF API internally calls multiple analytics APIs with the dora metrics and trends data, which in turn generate the analysis based on the processed data and the curated prompts.
Finally, the analysis for each individual metric trend is fed again into the LLM for a summarising effort and all the data is sent to the front-end.

More implementation details can be found in this pull request.
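The two-stage flow described above, one analysis per metric followed by a summary of those analyses, can be sketched as below. This is not taken from the pull request: `build_report`, the `METRICS` tuple, and the prompt strings are placeholders, and `llm` is any callable that takes a prompt and returns text, standing in for the OpenAI or Fireworks AI client.

```python
import json
from typing import Callable

# Hypothetical metric keys; the real payload keys may differ.
METRICS = (
    "lead_time",
    "deployment_frequency",
    "mean_time_to_recovery",
    "change_failure_rate",
)


def build_report(trends: dict[str, dict], llm: Callable[[str], str]) -> dict[str, str]:
    """Analyse each metric trend separately, then summarise all the analyses."""
    analyses: dict[str, str] = {}
    for metric in METRICS:
        if metric not in trends:
            continue  # skip metrics with no trend data
        payload = json.dumps(trends[metric], sort_keys=True)
        analyses[metric] = llm(f"Analyse the weekly trend for {metric}:\n{payload}")
    # Second pass: feed the per-metric analyses back in for a combined summary.
    combined = "\n\n".join(f"{name}: {text}" for name, text in analyses.items())
    summary = llm(f"Summarise these DORA metric analyses for the team:\n{combined}")
    return {**analyses, "summary": summary}
```

Passing the model in as a plain callable keeps the flow independent of the provider, which is one way to swap GPT4o for LLAMA 3.1 without touching the orchestration.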

Evaluation and Results: GPT4o Vs LLAMA 3.1

We ran the DORA AI analysis for July on the following open-source repositories: facebook/react, middlewarehq/middleware, meta-llama/llama and facebookresearch/dora.

Mathematical Accuracy

Middleware generates a DORA Performance Score for the team based on this guide by dora.dev.
To test the computational accuracy of the models, we provided each one with the four key metrics, prompted it to generate a DORA score, and compared the result with Middleware's.

The four keys were passed as JSON in the following format:



{
    "lead_time": 4000,
    "mean_time_to_recovery": 200000,
    "change_failure_rate": 20,
    "weekly_deployment_frequency": 2
}

- The actual DORA score for the repositories was around 5. OpenAI’s GPT4o predicted a score of 4-5 most of the time, while LLAMA 3.1 405B was off by a wider margin.

_DORA Metrics score: 5/10_
![Image description](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/saepp6t4su3j86fm1g3j.png)

_GPT 4o with DORA score 5/10_
![Image description](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/9u7nln407p0rhhqkag71.png)

_LLAMA 3.1 with DORA Score 8/10 (incorrect)_
![Image description](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lwpladuhj66ij2s5j1l7.png)


GPT4o’s DORA score was closer to the actual DORA score than LLAMA 3.1’s in 9 out of 10 cases, so GPT4o was the more accurate model in this scenario.

### Data Analysis
- The trend data for the four key DORA metrics, calculated by Middleware, was fed to the LLMs along with different experimental prompts to ensure a concrete data analysis.
- The trend data is usually a JSON object keyed by date strings, each the start date of a week, mapped to that week's metric data.

{
    "2024-01-01": {
        ...
    },
    "2024-01-08": {
        ...
    }
}


- **Mapping Data**: Both models were on par at extracting data from the JSON and interpreting it correctly. For example, both GPT and LLAMA mapped the correct data to the input weeks without errors or hallucinations.


     _Deployment Trends Summarised: GPT4o_
     ![Image description](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ey4jlh2o1nk5xkvg4tt0.png)


     _Deployment Trends Summarised: LLAMA 3.1 405B_
     ![Image description](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ziiymc6tl0l360uam8hs.png)


- **Extracting Inferences**: Both models were able to derive solid inferences from the data.
  - LLAMA 3.1 identified the week with the maximum lead time along with the reason for the high lead time.![Image description](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/evww5o0tg6bu4m941z6h.png)


  - This inference could be verified by the Middleware Trend Charts.![Image description](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lmu39pip49f0brsbd0ti.png)


  - GPT4o was also able to identify the week with the maximum lead time and the reason for it, which was a high first-response time.![Image description](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/74eepbadfzk24z0i5i80.png)


- **Data Presentation**: Data presentation has been hit or miss with LLMs. There are cases where GPT presents data better but lags behind LLAMA 3.1 in accuracy, and cases like the DORA score where GPT did the math better.
  - LLAMA and GPT were both given the lead time value in seconds. LLAMA rounded the value closer to the actual 16.99 days, while GPT rounded it to 17 days 2 hours but presented it in a more detailed format.

     _GPT4o_![Image description](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/3dpwmlcscgehi47zlx0c.png)


     _LLAMA 3.1 405B_![Image description](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/c3owjv94wjfrtxetf91a.png)
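To see how far apart those two roundings are, here is a short conversion sketch. The exact number of seconds given to the models is not stated in the article, so the input below is simply 16.99 days expressed in seconds, and the helper name `split_duration` is hypothetical.

```python
def split_duration(seconds: int) -> tuple[int, int, int, int]:
    """Split a duration in seconds into (days, hours, minutes, seconds)."""
    days, rest = divmod(seconds, 86_400)   # 86,400 seconds per day
    hours, rest = divmod(rest, 3_600)      # 3,600 seconds per hour
    minutes, secs = divmod(rest, 60)
    return days, hours, minutes, secs


lead_time = 1_467_936  # 16.99 days x 86,400 s/day (illustrative input)
print(split_duration(lead_time))  # (16, 23, 45, 36)

gpt_answer = 17 * 86_400 + 2 * 3_600  # "17 days 2 hours" in seconds
print(gpt_answer - lead_time)  # 8064 seconds, a little over 2 hours too high
```

On these illustrative numbers the true value is 16 days 23 hours 45 minutes, so "17 days 2 hours" overshoots by roughly two and a quarter hours, while "16.99 days" is exact to two decimal places.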



### Actionability
<img width="100%" style="width:100%" src="https://i.giphy.com/media/v1.Y2lkPTc5MGI3NjExZXFmcmM2cno2c3liN3doeXJ6Z282NmxrZDN0ZGd3c2xta2RwOXp5eCZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/jsrfOEfEHkHPFSNlir/giphy.gif">

- The models were able to output similar actionables for improving teams' efficiency based on all the metrics.
- Example: Both models identified first-response time as the reason for the high lead time and suggested that the team use an alerting tool to avoid delayed PR reviews. The models also suggested better planning to avoid rework in weeks where rework was high.

_GPT4o_![Image description](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/tq9uaapz50z3dsom7jhd.png)

_LLAMA 3.1 405B_![Image description](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gbw7ovecvj3rc6fhykz3.png)


### Summarisation
To test the summarisation capabilities of the models, we asked each model to summarise each metric trend individually and then fed the results for all the trends back into the LLM to get a summary, or in Internet slang a *DORA TLDR*, for the team.

Both LLMs summarise large amounts of data about equally well.

_LLAMA 3.1 405B_
![Image description](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ewsg3cgyqp3mikx1pb92.png)

_GPT4o_
![Image description](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/iq6qgz104pacq7nhpyku.png)

## Conclusion
For a long time, LLAMA was trying to catch up with GPT in data processing and analytical abilities. Our earlier experiments with older LLAMA models led us to believe that GPT was way ahead, but the recent LLAMA 3.1 405B model is on par with GPT4o.

If you value your customers' data privacy and want to try the open-source LLAMA 3.1 models instead of GPT4o, go ahead! In our tests the difference in analysis quality was negligible, although GPT4o was more accurate at calculating the DORA score, and with self-hosted models you can ensure data privacy. Open-source LLMs have finally started to compete with their closed-source counterparts.

Both LLAMA 3.1 and GPT4o are super capable of deriving inferences from processed data and making Middleware’s DORA metrics more actionable and digestible for engineering leaders, leading to more efficient teams.

## Future Work
This was an experiment to build an AI-powered DORA solution. In the future we will focus on adding better support for self-hosted or locally running LLMs in Middleware. Our goal for the coming months is enhanced support for AI-powered action plans throughout the product using self-hosted LLMs, while ensuring data privacy.

In the meantime, you can try out the AI DORA summary feature [here](https://github.com/middlewarehq/middleware/tree/ai-beta).


  
    
      
      

Automate the Boring Stuff: How I Built a Code Generator to Save Hours of Redundant Work🧑‍💻

Samad Yar Khan — Sun, 16 Jun 2024 14:20:04 +0000

In this article, I will explain how I got frustrated with writing redundant code needed to extend a service every time a new requirement was raised, and how I automated this with code generation.

You can skip to one of the sections:

0) Backstory / Lore Time 🧙
1) The Redundant Work 🥱
2) Writing a Code Generator 🧑‍💻
3) The Result ⚡

Backstory / Lore Time 🧙

A new customer logged in at Middleware and they had 100 times the data of our previous ones, causing our data pipelines to choke. We had to ship a hot fix ASAP.

After mitigation, we discovered most of their data was irrelevant bot-generated content, which we could filter out during data sync. We implemented a hot fix to filter data at ingestion, which worked well.

Next, I was tasked with adding a new setting so that bot-generated data could be filtered without code changes. This gave us more control over what we sync, but it was a boring 🥱, redundant task that required understanding the Setting Service context 📚.

The Redundant Work 🥱

We have a Setting Service in our codebase that handles all the settings in our product. The code follows a great structure; it's easy to breeze through and handles any type of setting needed by the product.

To add a new setting, a developer has to find a previous PR where someone added one, regain context, make code changes across multiple files (spanning from adapters to validators) that all depend on the new setting's schema, and then verify that the APIs are working.

The code changes are straightforward; they just feel like a lot of manual work with little gain and are heavily dependent on the class schema of the new setting type. It's easily half a day of work for any developer.

When I got the task to add a setting, I thought to myself: 🤔💡 If some work feels redundant and the changes follow a set structure, I should try to automate it.

Writing a Code Generator 🧑‍💻

I had no idea how I could automate code generation based on some rules. I have often heard my friends working in bigger organizations speak about how they have whole services that build out APIs and layouts for them based on certain schemas, so I knew it was possible.

Research 📖

I googled whether a solution already existed. People have built similar things, but none fit my use case.

I asked ChatGPT for a quick solution, but nothing it produced worked. My next idea was to send all the files to GPT using a script and get updated files back, but that had two problems:

1) LLMs are not reliable enough with code generation.
2) Sharing proprietary code would get me fired 💀.

Breaking Down the Problem 🛠️

The next step was to verify if a solution was even possible for our use case, so I did the following:

1) Gained all the context needed to add a new setting to the codebase.
2) Tracked all the files and specific classes and functions where the changes would go.
3) Mapped the nature of changes needed in each file with the new setting schema.

I found that function, variable, and class names were consistently derived from the setting type name, and that the logic for adapters and validators could be generated from the primitive data types used in the new setting type's schema.
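To make the "generated from primitive data types" idea concrete, here is a minimal sketch of a function that emits validator source code from a field-to-type mapping. The function name `generate_validator`, the `validate_<setting>` naming scheme, and the `dict` payload are my assumptions for illustration, not the code from the actual script.

```python
from typing import Dict


def generate_validator(setting_type: str, fields: Dict[str, str]) -> str:
    """Return Python source for a validator that type-checks each field.

    `fields` maps a field name to the name of a primitive type,
    e.g. {"enabled": "bool"}. The naming scheme here is hypothetical.
    """
    lines = [f"def validate_{setting_type.lower()}(data: dict) -> None:"]
    if not fields:
        lines.append("    return")
    for name, type_name in fields.items():
        # One isinstance check per field, driven purely by the schema.
        # Note: bool is a subclass of int in Python, so True passes an int check.
        lines.append(f"    if not isinstance(data.get({name!r}), {type_name}):")
        lines.append(f"        raise ValueError({name + ' must be ' + type_name!r})")
    return "\n".join(lines) + "\n"
```

Calling `generate_validator("BOT_FILTER", {"enabled": "bool"})` returns the text of a `validate_bot_filter` function that can then be written into the validators file.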

The idea: add placeholder comments wherever new code needs to go, have the script work out which code belongs at which comment, and move the comment down each time so it can be reused on the next run. That would solve the problem.

The Solution ⭐

The first challenge was to create pure functions that could generate code based on the new setting name and schema. Generating new enums, classes, and dictionaries was straightforward, but creating handlers, adapters, and validators was more complex and time-consuming. I used GPT and Claude to help develop a generic solution for this.
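As a rough illustration of what a pure generator function can look like, here is a small sketch that builds the source of a settings class from a name and a schema. The `BaseSetting` parent class, the `...Setting` naming scheme, and both function names are assumptions made for this example and may not match the real codebase.

```python
from typing import Dict


def to_class_name(setting_type: str) -> str:
    """Turn 'excluded_bot_users' into 'ExcludedBotUsersSetting' (assumed naming scheme)."""
    parts = setting_type.lower().split("_")
    return "".join(part.capitalize() for part in parts) + "Setting"


def generate_setting_class(setting_type: str, fields: Dict[str, str]) -> str:
    """Return dataclass source for the setting.

    Pure function: the output depends only on the arguments, so it is easy
    to test without touching any files. `BaseSetting` is a hypothetical parent.
    """
    lines = ["@dataclass", f"class {to_class_name(setting_type)}(BaseSetting):"]
    if not fields:
        lines.append("    pass")
    for name, type_name in fields.items():
        lines.append(f"    {name}: {type_name}")
    return "\n".join(lines) + "\n"
```

Because the function only returns a string, inserting that string into the right file stays a separate step.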

The next challenge was locating placeholder comments across files and inserting the generated code while handling Python's indentation issues. This was particularly painful as I had little experience with regex and had to learn it on the go.

Here is how I managed the code population bit, explained with an example because it was new for me and might be for you as well:

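import re

# Capture only spaces/tabs before the placeholder so the indent never includes a newline.
enum_pattern = r"^(?P<indent>[ \t]*)# ADD NEW SETTING TYPE ENUM HERE\n"
match = re.search(enum_pattern, content, flags=re.MULTILINE)
if match:
    indent = match.group('indent')
    new_enum_entry = f'{indent}{setting_type.upper()} = "{setting_type.upper()}"\n'
    # Put the new entry above the placeholder; the lambda avoids backslash-escape processing.
    content = re.sub(enum_pattern, lambda m: new_enum_entry + m.group(0), content, count=1, flags=re.MULTILINE)

enum_pattern: The regex pattern that finds the placeholder comment and captures its leading whitespace.
re.search: Searches for the pattern in the file content.
match.group('indent'): Returns the captured indentation.
re.sub: Replaces the placeholder with the new enum entry followed by the placeholder itself, so the comment moves down and can be reused.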
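The enum example inserts a single line. For multi-line snippets such as classes or handlers, the same idea can be generalized. The sketch below is one way to do it, not the code from the script itself: `insert_above_placeholder` is a hypothetical helper, and it assumes every placeholder is a `# ...` comment on its own line.

```python
import re
import textwrap


def insert_above_placeholder(content: str, placeholder: str, snippet: str) -> str:
    """Insert `snippet` above the first `# <placeholder>` comment, matching its indent."""
    pattern = re.compile(
        rf"^(?P<indent>[ \t]*)# {re.escape(placeholder)}[ \t]*$", re.MULTILINE
    )
    match = pattern.search(content)
    if match is None:
        raise ValueError(f"placeholder not found: {placeholder!r}")
    # Normalize the snippet, then indent every non-blank line to the placeholder's level.
    block = textwrap.dedent(snippet).strip("\n") + "\n"
    block = textwrap.indent(block, match.group("indent"))
    # Splice in front of the placeholder line so the comment survives for the next run.
    return content[: match.start()] + block + content[match.start():]
```

Because the placeholder is left in place below the new code, running the helper again appends the next entry after the previous one.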

And all this effort paid off ✨

The Result ⚡

After hours of struggling to build the code generator script, it paid off 🚀

I was able to build a script that would prompt the user for the new setting name and the required fields along with their types and make changes across files. All the developers would need to do is add the imports (because they were too messy to handle) and handle any complex data types (which would be rather simple as 90% of the code is generated).
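For the prompting step, a small parser for the user's answer is usually enough. The sketch below assumes a `name:type, name:type` input format and a fixed set of supported primitive types. Both are my assumptions for illustration, and the real script may collect fields differently.

```python
from typing import Dict

# Assumed set of primitive types the generator can handle on its own.
SUPPORTED_TYPES = {"str", "int", "float", "bool"}


def parse_fields(spec: str) -> Dict[str, str]:
    """Parse 'name:type, name:type' into an ordered {name: type} mapping."""
    fields: Dict[str, str] = {}
    for chunk in spec.split(","):
        if not chunk.strip():
            continue  # tolerate trailing commas and empty input
        name, sep, type_name = (part.strip() for part in chunk.partition(":"))
        if not sep or not name.isidentifier():
            raise ValueError(f"expected 'name:type', got {chunk.strip()!r}")
        if type_name not in SUPPORTED_TYPES:
            raise ValueError(f"unsupported type {type_name!r} for field {name!r}")
        if name in fields:
            raise ValueError(f"duplicate field {name!r}")
        fields[name] = type_name
    return fields
```

In an interactive script this could be called as `parse_fields(input("Fields (name:type, ...): "))`, with anything outside the supported types left for the developer to handle by hand.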

This is the Pull Request that adds the script to our Middleware Open Source Codebase. I would recommend going through the code if you wish to implement a similar solution for your use case: https://github.com/middlewarehq/middleware/pull/433

This brought down the development time of adding a setting from a few hours to less than 10 minutes and, most importantly, helped me and other developers escape the boring work 😎!

Thanks for sticking till the end 🤝

If you liked the article, please spare some time to star the open-source repositories I maintain:

⭐ https://github.com/middlewarehq/middleware
⭐ https://github.com/RocketChat/Apps.Github22

You can follow me on socials:

GitHub/samad-yar-khan
LinkedIn/samad-yar-khan
X/samadnotyouryar
