Hi dev.to! I'm new here, good to be part of the community :).
I've been working on a project to make it easier to build serverless applications. In my experience there is a lot of tricky stuff that you have to do to make any non-trivial serverless app, where "non-trivial" is anything more than one function in isolation.
For example, if you're building a very simple data processing pipeline, it might have 3 steps (and a cloud function for each step):
- Split data into N chunks.
- Process each chunk.
- Join the results together to form the final product.
This kind of thing is super common, especially in data-rich companies doing machine learning or business intelligence.
Assuming each chunk can be processed independently, ideally you'd like to process them all in parallel and get the results faster. This makes sense on a compute platform like AWS Lambda where you're billed (mostly) for compute time, so actually processing them in parallel costs about the same as doing them in series. It's just way faster.
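To make the split/process/join shape concrete, here's a minimal local sketch in Python. It's only an illustration: the function names (`split`, `process`, `join`, `run_pipeline`) are made up for this post, the "work" is a stand-in (sum of squares), and a thread pool plays the role that separate cloud function invocations would play in a real deployment.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, n):
    """Split data into at most n chunks of roughly equal size."""
    size = max(1, -(-len(data) // n))  # ceiling division, never zero
    return [data[i:i + size] for i in range(0, len(data), size)]


def process(chunk):
    """Hypothetical per-chunk work: sum of squares."""
    return sum(x * x for x in chunk)


def join(results):
    """Combine the per-chunk results into the final product."""
    return sum(results)


def run_pipeline(data, n_chunks):
    chunks = split(data, n_chunks)
    # Each chunk is independent, so they can all be processed at once.
    # In the cloud, each of these would be its own function invocation.
    with ThreadPoolExecutor(max_workers=max(1, len(chunks))) as pool:
        results = list(pool.map(process, chunks))
    return join(results)


if __name__ == "__main__":
    print(run_pipeline(list(range(10)), 3))
```

Locally this is a dozen lines. The hard part is everything that this sketch hides once each step becomes a separate cloud function: triggering, passing data between steps, and knowing when all the chunks are done.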
As far as I know, there are a few ways to accomplish this.
- Build a bespoke cloud architecture with multiple Lambda functions, SNS/SQS queues between them, etc.
- Use a "workflow orchestrator" (Airflow, Dagster, etc).
- Use a service like AWS Step Functions.
What options have I missed?
Anyway, these aren't that great...
Bespoke cloud architecture
Hello management overhead. Hello complicated testing and deployment.
This always feels like a great idea - surely your application is unique enough to warrant bespoke architecture?! And it probably is the cheapest option to run. But it's definitely not the cheapest to build and maintain.
It's definitely fun - you get the whiteboard out, sketching how data flows between functions, where dependencies are, what the triggers are. You write code, you hook it all up in the cloud (creating a separate environment for testing as you go), and you push it to production.
But maintenance is a nightmare. As the number of functions in your cloud grows, it gets hard to see what anything is connected to. When things break, getting coherent stack traces out can be complicated. Adding new features without breaking everything else is hard. Deployments have to be managed carefully.
Workflow orchestrators
These are pretty great. But there's an inelegance and an inefficiency to having a service permanently running alongside your business logic, even when none of your business logic is running. It requires management, upgrades, etc. And if the orchestrator goes down, everything else does too.
Step Functions
Also great, and widely used. But they're expensive and just not fun! Who wants to write JSON, or even YAML (with the Serverless Framework)? It's not programming, it's not easily testable, and you're now locked into AWS.
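To give a feel for what that looks like, here's a rough sketch of the three-step pipeline as a state machine definition, built as a Python dict and dumped to JSON. Treat it as an assumption-laden illustration rather than a working deployment: the ARNs are placeholders, the state names are made up, and I've used the `Iterator` field for the Map state (as far as I know, newer AWS docs prefer `ItemProcessor`).

```python
import json

# Placeholder ARNs: hypothetical, not real resources.
SPLIT_ARN = "arn:aws:lambda:REGION:ACCOUNT_ID:function:split"
PROCESS_ARN = "arn:aws:lambda:REGION:ACCOUNT_ID:function:process"
JOIN_ARN = "arn:aws:lambda:REGION:ACCOUNT_ID:function:join"

state_machine = {
    "Comment": "Split, process chunks in parallel, join",
    "StartAt": "Split",
    "States": {
        "Split": {
            "Type": "Task",
            "Resource": SPLIT_ARN,
            "ResultPath": "$.chunks",  # assumes split returns a list of chunks
            "Next": "ProcessChunks",
        },
        "ProcessChunks": {
            "Type": "Map",  # runs the inner state machine once per chunk
            "ItemsPath": "$.chunks",
            "Iterator": {
                "StartAt": "ProcessChunk",
                "States": {
                    "ProcessChunk": {
                        "Type": "Task",
                        "Resource": PROCESS_ARN,
                        "End": True,
                    }
                },
            },
            "ResultPath": "$.results",
            "Next": "Join",
        },
        "Join": {
            "Type": "Task",
            "Resource": JOIN_ARN,
            "InputPath": "$.results",
            "End": True,
        },
    },
}

definition = json.dumps(state_machine, indent=2)

if __name__ == "__main__":
    print(definition)
```

That's around 40 lines of configuration to express "split, map, join", and none of the actual business logic is in it.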
Teal
I started Teal to try to make this easier.
It's very early stage. I need your help.
Has anything I've said above made sense, or not? Do you have answers to these questions?
- How do you manage your prod/stag/dev envs?
- How do you go about creating new features?
- If a cloud function breaks, how do you diagnose and fix it?
- How do you test everything?
- How do you do CI?
I really want to make something useful, and would love to hear your thoughts and experiences. And if you have half an hour, let's jump on a Zoom call! Hit me up at ric@condense9.com or on LinkedIn.
–
Ric