A Hackathon, Again?
At this point, I have been to 9 hackathons, one of them international, and have won at 4 of them. So when my juniors Dhruv and Tushar told me about a Golang-specific hackathon, I dragged Harsh along, because why not. And not just Harsh: I dragged along 40+ people from our team, Point Blank, which ended up turning the hackathon into our own internal competition haha.
All of us on team GoGoingGone (lmao) had solid experience working with Golang, but we wanted to do more than just build another tool. We wanted to innovate. That's when the idea struck: let's build a mini-language to define dynamic, configurable data pipelines.
Introduction
I am Akash Singh, a third-year engineering student and open-source contributor from Bangalore.
Here is my LinkedIn, GitHub and Twitter
I go by the name SkySingh04 online.
Introducing Fractal
Fractal started as a data processing tool for seamless migration from legacy systems (like SQL databases and CSV files) to modern platforms such as MongoDB or AWS S3. But we wanted more than just another ETL tool. The idea was to make it highly flexible and user-friendly, allowing users to define validation and transformation rules with a simple, declarative syntax—a mini-language within the tool.
Why a Mini-Language?
We observed that most tools in the data pipeline space rely on rigid configurations or custom scripts. This approach often requires significant programming expertise, which limits accessibility for non-developers. A declarative mini-language provides:
- Simplicity: Users define rules in an intuitive, human-readable format.
- Flexibility: It accommodates a wide range of use cases, from basic validations to complex transformations.
- Scalability: The mini-language can evolve as new requirements arise.
This mini-language wasn’t about reinventing the wheel—it was about providing an abstraction to streamline data transformations and validations.
Combined with a simple YAML configuration file, we felt we had hit the mark: an easy-to-configure data pipeline that can move data from one source to another at scale.
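To make that concrete, a pipeline definition could look something like this. This is an illustrative sketch only; the exact keys in Fractal's YAML schema may differ, and the embedded rules use the mini-language described in the next section:

```yaml
# Illustrative pipeline config; the exact keys in Fractal's YAML may differ.
input:
  type: csv
  path: ./legacy/customers.csv

validation:
  - FIELD("age") TYPE(INT) RANGE(18, 65)
  - FIELD("email") MATCHES(EMAIL_REGEX)

transformation:
  - RENAME("old_field", "new_field")
  - ADD_FIELD("processed_at", CURRENT_TIME())

error_handling: LOG_AND_CONTINUE

output:
  type: mongodb
  uri: mongodb://localhost:27017
  collection: customers
```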
The Core: Validation and Transformation Syntax
We designed the syntax to be simple yet expressive, focusing on two primary operations:
- Validation Rules: These ensure incoming data meets specific quality standards before further processing. For example:
FIELD("age") TYPE(INT) RANGE(18, 65)
FIELD("email") MATCHES(EMAIL_REGEX)
FIELD("status") IN ("active", "inactive")
- Transformation Rules: These enable data enrichment or restructuring. For example:
RENAME("old_field", "new_field")
MAP("status", {"0": "inactive", "1": "active"})
ADD_FIELD("processed_at", CURRENT_TIME())
IF FIELD("age") > 50 THEN ADD_FIELD("senior_discount", TRUE)
This abstraction allowed users to process diverse datasets with minimal effort, enhancing productivity and reducing complexity.
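To give a feel for what the interpreter does under the hood, here is a minimal Go sketch of how a parsed validation rule could be represented and applied. The type names are our illustration, not Fractal's actual internals:

```go
package main

import "fmt"

// Check is one parsed clause of a validation rule, e.g. RANGE(18, 65).
type Check func(value any) error

// FieldRule ties a field name to the checks parsed from a line like
// FIELD("age") TYPE(INT) RANGE(18, 65).
type FieldRule struct {
	Field  string
	Checks []Check
}

// Validate runs every check against the named field of a record.
func (r FieldRule) Validate(record map[string]any) error {
	v, ok := record[r.Field]
	if !ok {
		return fmt.Errorf("field %q missing", r.Field)
	}
	for _, check := range r.Checks {
		if err := check(v); err != nil {
			return fmt.Errorf("field %q: %w", r.Field, err)
		}
	}
	return nil
}

func main() {
	// RANGE(18, 65) compiled into a closure.
	inRange := func(v any) error {
		n, ok := v.(int)
		if !ok {
			return fmt.Errorf("expected INT, got %T", v)
		}
		if n < 18 || n > 65 {
			return fmt.Errorf("%d outside RANGE(18, 65)", n)
		}
		return nil
	}
	rule := FieldRule{Field: "age", Checks: []Check{inRange}}
	fmt.Println(rule.Validate(map[string]any{"age": 70}))
}
```

Compiling each clause down to a closure like this keeps the pipeline engine oblivious to the syntax: it just runs a list of checks per field.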
In the middle of figuring out how to build the language's lexer and parser, the team at GoFr.dev took us all upstairs for a stress-busting session, full of late-night shayaris and jam sessions!
Building Fractal at the Hackathon
The hackathon wasn’t just about creating the mini-language. We also had to build the surrounding infrastructure, ensuring Fractal was:
- Extensible: Supporting multiple input/output formats like JSON, CSV, SQL databases, and message queues.
- Configurable: YAML-based configuration for defining pipeline workflows, integrating the mini-language seamlessly.
- Robust: Handling errors gracefully with options like LOG_AND_CONTINUE or STOP (a minimal sketch follows this list).
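Here is a minimal Go sketch of how such strategies could drive the pipeline loop; the names below are our illustration, mapped onto the config options above:

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// ErrorStrategy mirrors the error-handling options in the pipeline config.
type ErrorStrategy int

const (
	LogAndContinue ErrorStrategy = iota // LOG_AND_CONTINUE
	Stop                                // STOP
)

// runPipeline processes records one by one, either skipping failures with a
// log line or aborting on the first error, depending on the strategy.
func runPipeline(records []map[string]any, process func(map[string]any) error, strategy ErrorStrategy) error {
	for i, rec := range records {
		if err := process(rec); err != nil {
			if strategy == Stop {
				return fmt.Errorf("record %d: %w", i, err)
			}
			log.Printf("record %d skipped: %v", i, err)
		}
	}
	return nil
}

func main() {
	records := []map[string]any{{"age": 30}, {"age": -1}}
	failNegative := func(r map[string]any) error {
		if r["age"].(int) < 0 {
			return errors.New("negative age")
		}
		return nil
	}
	_ = runPipeline(records, failNegative, LogAndContinue)
}
```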
We divided the work into four modules:
- Mini-Language Implementation: Designing the lexer and parser to interpret the custom syntax (see the lexer sketch after this list).
- Data Integrations: Adding support for common data sources and destinations.
- Pipeline Engine: Orchestrating validation, transformation, and error handling.
- CLI Interface: Providing a simple interface for defining and running pipelines.
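Here is the kind of hand-rolled scanner the mini-language module revolves around. This is a simplified sketch, not the actual Fractal lexer:

```go
package main

import (
	"fmt"
	"unicode"
)

type TokenType int

const (
	IDENT  TokenType = iota // keywords like FIELD, TYPE, RANGE
	STRING                  // quoted literals like "age"
	NUMBER                  // numeric literals like 18
	LPAREN
	RPAREN
	COMMA
)

type Token struct {
	Type    TokenType
	Literal string
}

// lex breaks a rule like FIELD("age") TYPE(INT) RANGE(18, 65) into tokens.
func lex(input string) []Token {
	var tokens []Token
	runes := []rune(input)
	for i := 0; i < len(runes); {
		r := runes[i]
		switch {
		case unicode.IsSpace(r):
			i++
		case r == '(':
			tokens = append(tokens, Token{LPAREN, "("})
			i++
		case r == ')':
			tokens = append(tokens, Token{RPAREN, ")"})
			i++
		case r == ',':
			tokens = append(tokens, Token{COMMA, ","})
			i++
		case r == '"':
			// Scan to the closing quote; the literal excludes the quotes.
			j := i + 1
			for j < len(runes) && runes[j] != '"' {
				j++
			}
			tokens = append(tokens, Token{STRING, string(runes[i+1 : j])})
			i = j + 1
		case unicode.IsDigit(r):
			j := i
			for j < len(runes) && unicode.IsDigit(runes[j]) {
				j++
			}
			tokens = append(tokens, Token{NUMBER, string(runes[i:j])})
			i = j
		default:
			// Keywords and identifiers: letters and underscores.
			j := i
			for j < len(runes) && (unicode.IsLetter(runes[j]) || runes[j] == '_') {
				j++
			}
			if j == i { // unknown rune: skip it
				i++
				continue
			}
			tokens = append(tokens, Token{IDENT, string(runes[i:j])})
			i = j
		}
	}
	return tokens
}

func main() {
	for _, t := range lex(`FIELD("age") TYPE(INT) RANGE(18, 65)`) {
		fmt.Printf("%d %q\n", t.Type, t.Literal)
	}
}
```

The parser then walks this token stream and builds the rule objects that the pipeline engine executes.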
Challenges We Faced
- Designing the Syntax: Striking a balance between simplicity and flexibility was a challenge. We iterated multiple times to finalize the syntax.
- Building the Parser: Implementing a custom lexer and parser in Golang was time-consuming but rewarding.
- Real-Time Feedback: Ensuring that the mini-language provided meaningful error messages to guide users was critical for usability (see the sketch below).
- Time Constraints: Building a tool of this scale in a hackathon setting required precise planning and seamless coordination.
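For instance, a position-aware error type goes a long way toward useful feedback. A hypothetical sketch of the kind of message we aimed for:

```go
package main

import "fmt"

// ParseError carries position info so a bad rule like
// FIELD("age") RANGE(18 65) can point at the missing comma.
// The type and fields here are illustrative, not Fractal's exact API.
type ParseError struct {
	Line, Col int
	Msg       string
}

func (e ParseError) Error() string {
	return fmt.Sprintf("parse error at %d:%d: %s", e.Line, e.Col, e.Msg)
}

func main() {
	err := ParseError{Line: 1, Col: 24, Msg: `expected "," between RANGE arguments`}
	fmt.Println(err) // parse error at 1:24: expected "," between RANGE arguments
}
```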
And what happened after that?
Despite our strong showing at the GO for GOFR hackathon, we faced a critical challenge during the final evaluation. The judges requested a live demonstration in addition to our recorded demo, and unfortunately, we encountered an unexpected bug in our parser logic during the live run. Given the complexity of building a robust custom parser within just 24 hours, it was an ambitious feature to develop, and while our recorded demo showcased its functionality, achieving 100% accuracy under time constraints proved difficult. This hiccup ultimately cost us the top prize. However, our efforts were still highly regarded, and our team's clear vision and compelling delivery earned us the honor of "Best Pitch," highlighting our potential and ingenuity.
So, Hackathons, huh?
Hackathons are often about pushing boundaries and exploring uncharted territories. Fractal was our attempt to redefine how data processing tools can work—by making them accessible, modular, and developer-friendly.
I couldn't have asked for a more like-minded set of people to work with on this: the absolute best and hardest-working teammates, without a shadow of a doubt. Looking forward to whatever brings me to my next hackathon. Dare I say, a Rust-based hackathon? xD
Check out Fractal on GitHub: SkySingh04/fractal
Fractal is a flexible, configurable data processing tool built with GoFr and Golang. It is designed to handle data ingestion from multiple sources, apply powerful transformations and validations, and deliver output to a wide range of destinations. With Fractal, you can automate complex data workflows without needing to manage low-level details. The repository's README covers setting up new integrations and documents the custom syntax for validation and transformation rules.
Pitch Deck: Drive Link
Or try it yourself and let us know what you think!