DEV Community

JS for ZenStack

Is Code Generation a Bad Idea? 🤔

I'm building ZenStack: a full-stack development toolkit on top of Prisma ORM that simplifies the development of a web app's backend. It uses a schema-first approach to generate access-control-enabled APIs and front-end queries.

When I introduce it to some people, their immediate reaction on hearing the code generation part is this:

No-code-generation

The benefit of code generation is not worth mentioning

The biggest benefit that code generation brings is quite obvious: you write less code. For example, in my last post, you can see how to get a scalable SaaS backend with fewer than 100 lines of schema, thanks to code generation.

I could understand it if, after hearing the whole picture, they said: "I'm not buying in, because the benefit doesn't pay off the debt that code generation brings." But coming across people who simply stick a label on their foreheads, "I never use code generation," leaves me pondering why they hate it so much.

We use code generation every day

It may surprise many people, especially those who hate code generation, that they actually use it every day.

For instance, if you have ever used Svelte, the home page of its official website shows directly how code generation converts simple Svelte code into the JavaScript that actually runs in the browser:

Svelte

Maybe you are a backend developer who has never used any frontend framework. Have you ever used TypeScript? If so, I think you should be familiar with the command tsc and what it does. 😉

Even if you are completely outside the JavaScript ecosystem, unless you are still writing assembly language, code generation is unavoidable.

Some might say I'm conflating concepts, since these are called compilation rather than code generation. So, what exactly distinguishes them?

The bad side of code generation

In the context of ZenStack, the notable differences are as follows:

  • For compilation, code generation applies to all the code, and the generated code is typically not a significant concern for developers.
  • For ZenStack, code generation targets only a specific portion of the code, written in a distinct syntax. The rest of the code then relies on the generated output.
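To make the second point concrete, here is a minimal sketch of a toy generator. It is not how ZenStack works internally: the `ModelDef` shape and the `generateInterface` function are invented purely for illustration. It only shows the general idea of a small declarative input producing code that the rest of the code base would then depend on.

```typescript
// Hypothetical schema shape, made up for this illustration (not ZenStack's format).
type FieldType = "string" | "number" | "boolean";

interface ModelDef {
  name: string;
  fields: Record<string, FieldType>;
}

// Emit the source text of a TypeScript interface for one model.
function generateInterface(model: ModelDef): string {
  const lines = Object.entries(model.fields).map(
    ([field, type]) => `  ${field}: ${type};`
  );
  return [`export interface ${model.name} {`, ...lines, `}`].join("\n");
}

// The "schema": the only part a developer writes by hand.
const userModel: ModelDef = {
  name: "User",
  fields: { id: "number", email: "string", isAdmin: "boolean" },
};

// The generated output: in a real setup this would be written to a file
// that hand-written code imports.
const generated = generateInterface(userModel);
console.log(generated);
```

In a real project the generated text would be saved to a file and imported by hand-written code, which is exactly why the generated part becomes a concern for the developer in a way that compiler output usually is not.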

Given these differences, code generation in ZenStack does raise some practical challenges:

1. Learning a new syntax

This is true. But I guess that's the price you need to pay if you want to write less and generate more.

2. Having to run the generation command whenever the schema changes

This does sound like it hurts the developer experience. However, separating the schema from the code carries an implicit advantage: it enforces a design pattern.

The schema serves as the contract between the different modules of your code base, similar to an interface in object-oriented programming. By adhering to the principle of "programming to an interface", changes should first occur in the interface, or in this case, the schema. For instance, if your team has adopted GraphQL for your API, you are familiar with the efficiency it brings by allowing frontend and backend teams to work independently and in parallel.
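As a small illustration of "programming to an interface", here is a sketch in plain TypeScript. The names (`Post`, `PostApi`, `InMemoryPostApi`, `renderTitles`) are hypothetical and have nothing to do with ZenStack's generated code; the point is only that both sides depend on the contract rather than on each other.

```typescript
// The contract: both sides depend on this, not on each other.
interface Post {
  id: number;
  title: string;
}

interface PostApi {
  listPosts(): Post[];
}

// "Backend" side: one implementation of the contract (in-memory, for illustration).
class InMemoryPostApi implements PostApi {
  private posts: Post[] = [{ id: 1, title: "Hello" }];

  listPosts(): Post[] {
    // Return a copy so callers cannot mutate internal state.
    return [...this.posts];
  }
}

// "Frontend" side: written against the contract only.
function renderTitles(api: PostApi): string[] {
  return api.listPosts().map((post) => post.title);
}

const titles = renderTitles(new InMemoryPostApi());
console.log(titles);
```

If `Post` gains or loses a field, the change starts in the contract and the compiler points at every place that needs to follow, which is the same role the schema plays.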

You might say that even if the schema brings the same value as an interface, changing an interface does not require you to run a generation command.

While it is true that altering an interface does not require running a generation command, the schema typically changes less often than other parts of the codebase. Moreover, the additional step of code generation serves as a reminder to exercise caution when changing the contract, which can be a good mindset to cultivate. However, if it really bothers you that much, a script that watches the schema for changes and runs the generation command automatically in the background could be a viable solution.
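Such a watch script could look roughly like the sketch below, using only Node's built-in `fs.watch` and `child_process.spawn`. The schema path `schema.zmodel`, the command `npx zenstack generate`, and the 200 ms debounce are assumptions; substitute whatever your project actually uses.

```typescript
import { watch } from "node:fs";
import { spawn } from "node:child_process";

// Assumptions for this sketch: adjust the path and command to your project.
const SCHEMA_PATH = "schema.zmodel";
const GENERATE_COMMAND = ["npx", "zenstack", "generate"];
const DEBOUNCE_MS = 200;

// Collapse bursts of calls into one, since editors often emit
// several change events for a single save.
function debounce(fn: () => void, delayMs: number): () => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return () => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(fn, delayMs);
  };
}

// Run the generation command and stream its output to the terminal.
function runGenerate(): void {
  const [cmd, ...args] = GENERATE_COMMAND;
  const child = spawn(cmd, args, { stdio: "inherit" });
  child.on("error", (err) => console.error(`failed to start: ${err.message}`));
  child.on("exit", (code) => console.log(`generate exited with code ${code}`));
}

// Watch the schema file and regenerate after each (debounced) change.
function watchSchema(path: string = SCHEMA_PATH): void {
  watch(path, debounce(runGenerate, DEBOUNCE_MS));
  console.log(`Watching ${path} for changes...`);
}

// Start only when asked, so importing this file has no side effects.
if (process.argv.includes("--watch")) {
  watchSchema();
}
```

Run it with the `--watch` flag in a terminal next to your dev server. On Windows, `spawn` may need the `shell: true` option to find `npx`.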

3. Limited control and customization

Code generation comes with certain predefined rules, which limit the ability to customize, and some developers prefer fine-grained control over everything. For instance, with RESTful APIs, you can add an endpoint anywhere in your codebase and directly include additional data in the response body. In contrast, with GraphQL, any API-related change requires modifying the schema first. However, viewed from a different perspective, these limitations help address the maintenance challenges associated with RESTful APIs and contribute to the efficiency mentioned earlier.

It is essential to acknowledge that there is no one-size-fits-all approach, and it always involves making trade-offs.

Don't hate a concept

Forget about the concept and try to get the whole picture by analyzing the specific pros and cons of an approach. Otherwise, you might miss the good stuff.

Preoccupied with a single leaf, you won't see the tree. Preoccupied with a single tree, you'll miss the entire forest.


If you don't hate code generation, check out ZenStack, the schema-first toolkit we are building. It uses a declarative data model on top of Prisma, adding access policies and validation rules, from which it automatically generates APIs, including OpenAPI, tRPC routes, and hooks for you.

If you do feel it could help you, I would be super happy if you could give it a star so it can help more people move fast! ❤️

https://github.com/zenstackhq/zenstack

Top comments (15)

usegen

Code generation is a really good approach if the generated code looks like hand-written code and can be extended.

I'm gathering such solutions under GeneratedCode.io

I can add ZenStack there as well, just provide me with a pitch line

JS

Thanks for the recognition. Here is a pitch line that follows the existing style of GeneratedCode.io:

Turn Your Prisma Schema into a Secure Fullstack App in Minutes

  • ORM With Access Control
  • Automatic CRUD API
  • Frontend Query Code Generation
 
usegen

Done, it is updated.

cloutierjo

There are so many different use cases for code generation; some are good, some are not. I actually started my first internship by writing a code generator. The goal was to automate all the copy-pasted layers based on the entity definition. While I did enjoy that task, the resulting architecture meant that you had a whole lot of (automatically) copy-pasted code, and I believe most devs agree that code duplication is usually a code smell. We could argue that if it's generated, then you don't really have to maintain it, but I would say that if, as a dev, you have to open those files and debug through them, then it does impact the code/architecture quality. In those cases, there are often better code patterns and architectures to reach your goal.

On the other side, any kickoff code generation is good imo, like Yeoman or Maven archetypes. Those tools will generate all the boilerplate code to start your project, but then it becomes your code: you clean it up as a first step and never use the generator again.

On a similar generate-once pattern, Django migrations are also a good example. It takes a few seconds to generate the DB migration script; it is easy to review and customize, and it becomes your own code afterwards. Bonus point: once deployed, you never go back and modify it in this case.

Then there is what is called a "background code generator". I haven't seen a lot of them. It's code that gets generated but that you will never have to see or debug through. The only example that comes to mind is Project Lombok; even if we might not consider it purely a code generator, that would also be fine with me, as it simplifies boilerplate code that really no one cares about.

In the end, the most important question to me is: after the code is generated, what will it cost me to maintain that code? If the answer is more than writing it myself or using some other pattern, then it's a no-go. The reason is simple: as devs, we will spend way more time maintaining every single line of code than initially writing it.

There is also another factor to take into consideration: what features your language allows. Java is really poor at dynamically creating interfaces or object attributes, which makes it more likely to benefit from code generation. On the other side, Python or JavaScript can create new interfaces and attributes on the fly and are thus way less likely to need code generation, since the language is flexible enough to write generic code that solves the same problem.

FJones

It really depends hugely on the use-case.

Generated code is great for APIs: Making sure that at least one end is generated from a specification makes communication a lot easier.

Generated code is also great for repetitive boilerplate. We wrote a simple PHPDoc-based code generator to expand DTOs and their Builders with optional aspects - child models that may or may not exist on the instance, and thus implement a trait and interface to type check against, for each combination of child models. So you'd have DTO A with aspects B and C, and there'd be resulting A, A+B, A+C, A+B+C DTOs. A beautiful idea for PHP7.4, making it clear on the class which features it supports. But horribly annoying to implement manually when there's six or seven aspects.

But there are so many use cases where code generation is used that are only really worth it if there's an out-of-the-box solution (e.g. project creation or migrations)

Joe Mainwaring

It's hard to argue with processes which reduce the amount of time you spend doing a task.

The only situation where I'd likely criticize code generation is if the output is obfuscated/minified and I have to debug that file, but typically processes can be tweaked to produce the desired result.

usegen

@theaccordance obfuscated output is not of much help, that is basically a low code tool that allows export.

Generated code (hopefully it becomes a concept of its own) means the output looks similar to what you would write yourself. That way one can extend and change anything, which is the only way to achieve a long-run productivity gain.

danrabbit

I am using the Vely framework, which generates C code on the back end. It is a really sophisticated generator though, to the point of being a full-blown declarative language. It also integrates well with C, which makes it fully customizable. I think generators are great if done right. After all, C++ was generated C for quite a while.

ymc9

I think one of them, ORM or SQL databases, is an anti-pattern. You usually don't need an ORM to work with document databases because the data access model matches the programming model well. SQL's way of modeling nesting is bizarre, and ORM is a fix.

fruntend

Congratulations 🥳! Your article hit the top posts for the week - dev.to/fruntend/top-10-posts-for-f...
Keep it up 👍

JS

Thanks again! I will keep trying.

Nathan Cook

One "gotcha" with generating your frontend GraphQL API code for a Prisma backend is that you really don't want to expose the full capability of Prisma to query your data to the public internet, for obvious reasons, and if you are just following tutorials and copying boilerplate, that's what you often end up with.

This isn't the fault of code generation itself, of course.

Acid Coder • Edited

I believe this post is a response to my 'why Prisma code generation sucks?' So here we go again.

To understand why Prisma code generation is unfavorable compared to common cases of code generation, I think the best analogy is: 'If giving birth is more painful than getting kicked in the groin, why don't men ask for a second kick, but women do ask for a second child?'

Take the example of transpiling TypeScript to JavaScript. We gain type safety without changing how the runtime works, which is a huge benefit. Additionally, we transpile modern JavaScript to ES5 so that the majority of browsers can run our code without problems. We also transpile front-end code like Svelte, JSX, and Vue template syntax because these front-end frameworks allow us to write more maintainable code.

The hassles of code generation like these are bearable because they provide us with great benefits. With a library like Drizzle and Kysely, we can declare schemas using TypeScript without the need to learn an extra language.

However, the value of Prisma code generation is simply not as great compared to other common cases of code generation.

Not all code generation provides equal value; some are simply worse/better. This is analogous to why women are willing to give birth to a second child - because the child itself holds inherent value.

Meanwhile, Prisma code generation is like getting kicked in the groin for nothing but pain: the pain of constantly regenerating code, and the pain of learning a low-reusability language.

We have the choice not to suffer the pain, not to get kicked, and not to invite more problems to solve.

JS

Thanks for sharing a different view and also for the elaboration on transpiling. I guess compiling is just a more general and recognized term that includes transpiling in most cases, as you can see in the official docs of TypeScript and Svelte.

ymc9

I think quite a few languages use C as compilation target?