DEV Community

david duymelinck

Other languages than English in code

The reason I write variable, function and object names in English is the same reason I learn at least a few words of the language spoken in the country I'm traveling to. It is respectful and it creates less friction.
But there are also people who don't like that I butcher their language.
Machines don't care about the words, so why am I writing everything in English?

Bounded language and clean architecture

One of the pillars of clean architecture is to write understandable code. For most (all?) people, their native language is the most understandable one.

With domain-driven design, the shared (ubiquitous) language of a bounded context is important, and some of its terms are best expressed in the native language.

Is this enough reason to change the names from English to the native language?

Public versus internal

It seems normal to default to English when the application needs to communicate with other systems, because we are using programming languages and systems with English terms.

This means the English names could be pushed to the edge of the application. Would it be beneficial to have a language divide between the names we give functionality and the terms of the programming language?
Keywords for the language will have no meaning when written in the native language.
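
To make the idea of such a divide concrete, here is a minimal sketch in Python. The Dutch names (`Factuur`, `klantnaam`, `bedrag`) and the English field names are hypothetical and chosen only for illustration; the point is that the keywords (`class`, `def`, `return`) stay English while the domain names do not, and that English reappears only at the edge.

```python
from dataclasses import dataclass


# Internal domain model with Dutch names (hypothetical, for illustration only).
@dataclass
class Factuur:           # "invoice"
    klantnaam: str       # "customer name"
    bedrag: float        # "amount"


def to_public(factuur: Factuur) -> dict:
    """Map the internal names to the English names exposed to other systems."""
    return {"customer_name": factuur.klantnaam, "amount": factuur.bedrag}


print(to_public(Factuur(klantnaam="Jan", bedrag=10.0)))
```

Whether the mapping function is worth the extra translation step is exactly the open question of this section.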

Is it a good idea to use native-language names in code that runs on a user's machine? I'm thinking about JavaScript, because it is the most widely used language whose code is public.
With compiled languages this is less of a problem, because the shipped code is not as easy to read.

Programming languages

As far as I know, only esoteric languages like Brainfuck don't have English terms as keywords.
Even Go, which is built to be as simple as possible, has English terms.

The one thing that could compete with English as the most universally understood language is emoji. There are attempts like Emojicode. The problem with emoji is finding the ones with the most universally understood meaning and binding them to the language functionality.
▶️ 💬 'hello world' ⏹️
I think the most problematic emoji in this example is the speech bubble.

AI

Will code generation and spec-driven development make it easier to use native language names in code?
I even write prompts and specs in English, while an AI should have no problem reading them in any native language.

Your thoughts

Is English holding you back from writing the most understandable code possible?
How would you communicate if English were off the table or only used sparingly?

I know the dev.to community is an international bunch of people, and many don't have English as their native language.
I would love to see comments about cases where you really wanted to use your native language. And please include the native language sentences/terms.

PS: The AI-generated image always came out as a mix of words and just noise; I gave up after ten tries to get a decent result. Maybe AI isn't ready for native language names.

Top comments (41)

Cesar Aguirre

Story time:

At a past Spanish-speaking job, we wrote all names and warning/error/validation messages in English. The problem came when those messages started to appear on users' screens.

So a poor soul (a junior coder) had the long, tedious, boring task of going through the whole codebase looking for error messages to translate, and the QA team had to make the app crash to see them... We didn't have AI or a localization module in those days. Arrggg!

david duymelinck

The warning and validation messages should be translatable. I live in a country with three official languages, so preparing for multiple language output has become a default.

The error messages should not reach the user; most of the time they contain technical information. They should be intercepted and a user-friendly message should be shown instead.

Just translating the messages feels like half the effort needed to get it right, in my opinion.
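
A minimal sketch of both halves in Python, assuming a hand-rolled message catalog: validation messages are looked up per locale, and technical errors are logged and replaced by a friendly, translated text. The names `MESSAGES`, `translate` and `run_safely`, the message keys and the chosen locales are hypothetical; a real project would more likely use gettext or a framework's localization module.

```python
import logging

logger = logging.getLogger("app")

# Hypothetical message catalog; the keys, locales and texts are illustrative only.
MESSAGES = {
    "en": {
        "email_invalid": "Please enter a valid e-mail address.",
        "generic_error": "Something went wrong. Please try again later.",
    },
    "nl": {
        "email_invalid": "Voer een geldig e-mailadres in.",
        "generic_error": "Er is iets misgegaan. Probeer het later opnieuw.",
    },
    "fr": {
        "email_invalid": "Veuillez saisir une adresse e-mail valide.",
        "generic_error": "Une erreur s'est produite. Veuillez réessayer plus tard.",
    },
}


def translate(key, locale):
    """Look up a message, falling back to English for unknown locales or keys."""
    catalog = MESSAGES.get(locale, MESSAGES["en"])
    return catalog.get(key, MESSAGES["en"][key])


def run_safely(action, locale):
    """Run an action; on failure, log the technical detail and return a friendly text."""
    try:
        return action()
    except Exception:
        logger.exception("Unhandled error")  # technical details stay in the log
        return translate("generic_error", locale)
```

The stack trace ends up in the log for developers, while the user only ever sees a text from the catalog in their own language.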

Cesar Aguirre

Yes, we did a lot of things wrong on this project :/

uratmangun

It will be his core memory, and he will pass it down to his own junior in the future lol

Ingo Steinke, web developer • Edited

Valid cases for native language are special, complicated requirements that can't be explained concisely when there is no English equivalent: legal and cultural terms, as well as concepts specific to a brand or product. I used to struggle to translate the German "Inbetriebnahmeprozess", which meant a specific set of things to do shortly before and after going live with a new web project. I guess every company or institution develops something similar, especially when their processes get more complex and involve a lot of people.

I remember Office macro scripts used to be localized (and hopefully stored as agnostic tokens internally) to match the operating system's natural language. I guess we're lucky that several of those words like THEN have been replaced by abstract symbols like colons, brackets and braces in most modern programming languages.

Still, there are enough built-in English words, some with an etymological history dating back to Arabic, Greek, and Latin, that using any other language for variables, functions and file names feels like breaking consistency unless we do it for a very good reason.

Ingo Steinke, web developer

P.S. the 🍌 AI-generated artwork can get boring soon, unless you come up with an unusual creative prompt and a more elaborate description that prevents the AI from putting a vintage desktop computer at the center of the image.

david duymelinck

Yes I'm finding that out now. I agree it is a bad image, but it gave me the opportunity to take a little stab at AI generation.

Ingo Steinke, web developer

Have a look at the second image in last week's Meme Monday thread. There is a vintage computer, but it's kind of a 1980s laptop in a 1950s café setting. The prompt used by the DEV image generator must have been the first image's alt attribute content. dev.to/ingosteinke/comment/32hei

Mitch

Re "Inbetriebnahmeprozess":

  • Onboarding
  • Launch
  • Setup
  • Commissioning

Pick one, depending on domain context, e.g. "commissioning" for an industrial context. The context is most likely not "German". Optionally add "process" to any of these words for flavour.

Sylwia Laskowska

I can share my thoughts on using different languages for variable names. In my projects I use English 100% of the time, because the team is often international - all it takes is one person who doesn’t speak Polish. However, I’ve heard that in government institutions, banks, or projects with very specific and complex domains, developers often use variable names in the local language. The business terminology can be so difficult that coming up with English equivalents would make things even more complicated.

david duymelinck

I agree it is not always possible to use your native language in code.

Federico Moretti

I try to use English everywhere in my code, because Italian has accents, apostrophes, and other diacritical marks that can break parsers and/or compilers.

Using my mother tongue in coding is definitely not a good idea. Not even in prompts: I think Italian requires more tokens than English; LLMs prefer Sumerian, but I only know written Ionic-Attic from high school, and I cannot use it in production! 😅

david duymelinck

Most diacritical marks are just another Unicode character to the compiler. And as long as you use the ASCII apostrophe, a compiler isn't going to bat an eye.

If the language accepts Unicode characters, go nuts with Sumerian, Ancient Greek or Sanskrit.
I hope you find a fellow developer to discuss the variable names with.
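
As a small illustration, here is a sketch in Python, which accepts non-ASCII letters in identifiers (PEP 3131). Python is my choice for the example; other languages and toolchains have their own rules. The Italian names are made up for illustration.

```python
# Italian identifiers with accented letters are valid Python 3 names.
città = "Torino"   # "city"
età = 42           # "age"

# The ASCII apostrophe is a string delimiter, so it can appear in text but not in a name.
messaggio = "L'età di chi vive in questa città"

print("città".isidentifier())   # True
print("l'età".isidentifier())   # False: an apostrophe cannot be part of an identifier
```

So the accents are fine in names; only the apostrophe has to stay inside string content.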

Vincenzo • Edited

Italian here too, and thank fuck I've touched very few codebases written by Italians, but the ones I did were full of variable names in Italian 😢

EmberNoGlow • Edited

In fact, many countries have tried (not always successfully) to create their own programming language based on their native language, and in my opinion, they were very inconvenient to use. For example, the biggest problem is the "{} [] : ;" symbols: to type them, you have to constantly change the language on the keyboard! It's incredibly inconvenient. If I were writing in such languages, I'd have to press Shift+Alt a number of times equivalent to lines of code * 2, and that's a lot of useless work.

david duymelinck

I wasn't thinking about countries or identity when I was writing the post. It is more about the best way to communicate with each other in a code language.

I agree that symbols can be a problem, that is one of the reasons I think languages use keywords.
A solution for your problem can be a keyboard with programmable keys. Maybe not for all the symbols, but at least for the ones that you need the most.

EmberNoGlow

In my opinion, it's better to communicate in English, as it is the most versatile language, even if you have to use a translator 😥.

david duymelinck • Edited

That is one of the things that could cause problems: using a translator and not knowing if the translation is correct. Even if you are good at English, sometimes you miss a subtle meaning that could make a name more accurate. Everyone is more aware of the subtleties in their own language.

Most applications don't need to address those subtleties, so I agree English will do in those cases.

EmberNoGlow

Then you can use an LLM (unlimited) for translation; it is perhaps one of the most accurate translators available today.

david duymelinck

I assumed you meant an LLM or translation software. Not everyone has easy access to a translator. Sorry for the mistake.

Mythorian

Sorry to say this, but you all think writing Spanish and French and ... is hard while coding? Then try Arabic or Persian or ... the whole language is right to left!!! Do you know how fucking hard it is to create a Persian app? :( I have to go to hell and back every single time!! And some editors don't even support right to left!!!!!

david duymelinck

I understand where you are coming from. There are languages that are read right to left, and there are also languages that are read from bottom to top.

I think that it should be possible to have a code language that could accommodate those reading styles, and not only left to right.
The main reason to have a code language is to make logic easier to communicate.

I find it shocking that editors are not capable of supporting right to left.

This is one of the comments I was hoping for. Maybe a little less heated, but I understand your passion.

Mythorian

You know, now that I think about it, top to bottom sounds impossible, even more than right to left. I'm very sorry if my comment was a little heated, but it is the frustration of many years of making Persian and Arabic projects. Honestly, I wish all the users were English.

Jhon Alessandro

Commit yourself to finding the basics and principles for dealing with those problems. I know it can be hard (I've worked with Arabic cases), but once you understand how to tackle those situations, you're set up to do something even bigger than you think you could.

Don't settle.

Art light

Wonderful post! I love how clearly you explained the tension between English conventions and native-language clarity—it genuinely made me interested in trying it myself.

Ben Halpern

PS: The AI-generated image always came out as a mix of words and just noise; I gave up after ten tries to get a decent result. Maybe AI isn't ready for native language names.

Curious — can you expand on this? Would love to see if there's tweaks that can be made but I'm not sure I follow.

david duymelinck

I tried several prompts to make the background more realistic, but all I got were backgrounds like the one in the current image.
In the other tries I asked for a class with the class name, functions and properties in different languages, not English. That got me either names with the language in brackets or the rendition you see in the image.

Maybe I didn't prompt the right way, although I tried to be as clear as I could. For the languages, I wanted to leave the choice up to the AI instead of asking for specific ones. But if it can add the names of the languages, it should be able to return words.

David Sporn

These days I would like to require that most documents in a project contain only US-ASCII characters (exceptions: translation files and localization specifications); in that case, using English helps.

Apart from limiting the risk of injecting non-printable text into LLMs, restricting oneself to US-ASCII characters makes any text file readable as either ASCII or UTF-8.
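
A minimal sketch of how such a rule could be checked in Python, assuming the check runs over text that has already been decoded. The helper names `find_non_ascii` and `find_invisible` are hypothetical; the second one flags Unicode format characters (general category `Cf`, such as zero-width spaces), which is only one possible definition of "non-printable".

```python
import unicodedata


def find_non_ascii(text):
    """Return (index, code point, Unicode name) for every non-ASCII character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]


def find_invisible(text):
    """Return the positions of format characters (category Cf), which do not render."""
    return [i for i, ch in enumerate(text) if unicodedata.category(ch) == "Cf"]


print("plain text".isascii())         # True
print(find_non_ascii("café"))         # the accented letter at index 3
print(find_invisible("ab\u200bc"))    # the zero-width space at index 2
```

The first check enforces the strict US-ASCII rule; the second is a looser variant that still allows diacritics but reports characters a reviewer cannot see.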

david duymelinck • Edited

I knew non-printable characters can trip up compilers, but I never thought about the danger they could pose to the workings of an AI.
I assume an AI is going to strip out characters it deems problematic, so basically adding them is costing you money for no reason.

The question I have is who or what is putting those characters in the code/documentation? I don't know how to type any of those characters.

UTF-8 contains those problematic characters, so according to you people should only use ASCII?

If you are hinting at the fact that some languages spend more tokens: I think the best way to explain something is in your native language, no?

David Sporn

About prompt injection using non-printable/non-rendered Unicode characters: github.com/0x6f677548/unicode-inje...

And the other point I was mentioning: open a UTF-8 encoded text file written in a non-English language, e.g. French with diacritics, in a text editor that uses ANSI/ASCII by default (yes, that's getting old, because nowadays everyone on current software/OSes most likely uses UTF-8 by default), and the characters outside the US-ASCII range turn into a jumble because of the encoding mismatch. I admit that this point was more critical decades ago.
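
The jumble can be reproduced in a few lines of Python. This sketch assumes Latin-1 as a stand-in for the legacy "ANSI" code page; the exact garbage differs per code page.

```python
original = "français"

# Write as UTF-8, then read the same bytes back with a legacy single-byte encoding.
utf8_bytes = original.encode("utf-8")
misread = utf8_bytes.decode("latin-1")

print(misread)  # franÃ§ais

# Pure US-ASCII text survives the same round trip unchanged.
print("english".encode("utf-8").decode("latin-1"))  # english
```

The "ç" takes two bytes in UTF-8, and each byte is shown as its own character when the file is read as Latin-1.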

david duymelinck

Thank you for the link.

I think the danger is more from input by an unknown person, than from code that is under version control. And if there are multiple people that have seen and/or checked the code, the chance that the hidden characters are not detected will be smaller.

While I think it is good to be vigilant, there has to be a balance between security and usability.

John Demian

I'd go for English every day of the week, even if I'm working with a team of non-English speakers.

A few years ago I bought a SaaS product built by a Russian team that had used Russian in most comments and explanations in the code. Even the variables were Russian words, so making sense of the code was a huge challenge. It literally took us months to decipher what was going on, since no one on my team spoke Russian.

Folks, keep it simple for everyone else. Use English!

david duymelinck

A few years ago I bought a SaaS product built by a Russian team that had used Russian in most comments and explanations in the code

Isn't that your fault for not checking the code in advance? Even if they didn't want you to see their full solution, they should have provided a relevant sample of their code.
If they had shown you an all-English sample, you could have broken the contract and demanded your money back.

I'd go for English every day of the week

Like I mentioned in my post, that is how I work too. The question I have in my post is how switching between languages affects communication through a code language.
