DEV Community

Florian Pigorsch

Go: Identifiers vs. Unicode

A recent Reddit post about Unicode characters in Go identifiers prompted me to dive into the Go spec and look things up properly.

According to the spec, the syntax for valid identifiers is

identifier = letter { letter | unicode_digit } .

with

letter         = unicode_letter | "_" .
unicode_letter = /* a Unicode code point classified as "Letter" */ .
unicode_digit  = /* a Unicode code point classified as "Number, decimal digit" */ .

The "Letter" category consists of the Unicode categories Lu (uppercase letters), Ll (lowercase letters), Lt (titlecase letters), Lm (modifier letters), and Lo (other letters), while "Number, decimal digit" refers to the Unicode category Nd.
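To make these categories tangible, here is a minimal sketch that reports which of them a given code point belongs to, using the range tables from Go's standard unicode package. The function name category is my own choice for illustration and not something taken from the spec.

```go
package main

import (
	"fmt"
	"unicode"
)

// category (a hypothetical helper) returns the short name of the Unicode
// general category that makes r usable in a Go identifier, or "" if r is
// neither a letter (Lu, Ll, Lt, Lm, Lo) nor a decimal digit (Nd).
func category(r rune) string {
	switch {
	case unicode.Is(unicode.Lu, r):
		return "Lu"
	case unicode.Is(unicode.Ll, r):
		return "Ll"
	case unicode.Is(unicode.Lt, r):
		return "Lt"
	case unicode.Is(unicode.Lm, r):
		return "Lm"
	case unicode.Is(unicode.Lo, r):
		return "Lo"
	case unicode.Is(unicode.Nd, r):
		return "Nd"
	}
	return ""
}

func main() {
	// Print each sample code point next to its category (empty if none applies).
	for _, r := range "AaΣ٣_😀" {
		fmt.Printf("%q %q\n", r, category(r))
	}
}
```

Note that the underscore falls into none of these categories, which is why the grammar has to list "_" explicitly as a letter.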

So an identifier has to start with either a "letter" or an underscore ("_"), and may contain only "letters", "decimal digits", and underscores, where letters and digits are whatever Unicode classifies as such.
The set of letters is not only the usual A-Z, a-z, but also includes letters from other scripts, such as Greek letters (e.g. Σ) or CJK characters. The same holds for digits: not only 0-9 but also digits from other scripts are allowed, e.g. ٣ (Arabic-Indic digit three).
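The rule can be written down almost literally in Go. The following is only a sketch under my own naming (isLetter and matchesIdentifierGrammar are hypothetical helpers), and it checks the grammar alone: it does not know that keywords such as func are reserved.

```go
package main

import (
	"fmt"
	"unicode"
)

// isLetter mirrors the spec's "letter" production: a Unicode letter or "_".
func isLetter(r rune) bool {
	return r == '_' || unicode.IsLetter(r)
}

// matchesIdentifierGrammar reports whether s matches
//
//	identifier = letter { letter | unicode_digit }
//
// It does not reject keywords; that is a separate rule.
func matchesIdentifierGrammar(s string) bool {
	if s == "" {
		return false
	}
	for i, r := range s {
		if isLetter(r) {
			continue // letters are allowed in every position
		}
		if i > 0 && unicode.IsDigit(r) {
			continue // decimal digits (Nd) are allowed after the first position
		}
		return false
	}
	return true
}

func main() {
	for _, s := range []string{"x1", "_tmp", "Σ", "x٣", "42", "x🌞"} {
		fmt.Printf("%q: %v\n", s, matchesIdentifierGrammar(s))
	}
}
```

unicode.IsLetter covers the whole "Letter" category and unicode.IsDigit covers Nd, so no per-script tables are needed.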

Valid identifiers:

Invalid identifiers:

  • 42 (does not start with a letter)
  • 😀 (not a letter, but So / Symbol, other)
  • (not a letter, but So / Symbol, other)
  • x🌞 (starts with a letter, but contains non-letter/digit characters)
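Lists like these can be cross-checked against the standard library: go/token provides IsIdentifier, which additionally rejects keywords. The sketch below sorts a few candidates into valid and invalid. The helper name partition and the valid sample names are my own assumptions, while 42, 😀 and x🌞 come from the list above.

```go
package main

import (
	"fmt"
	"go/token"
)

// partition (a hypothetical helper) splits names into those that
// token.IsIdentifier accepts and those it rejects.
func partition(names []string) (valid, invalid []string) {
	for _, n := range names {
		if token.IsIdentifier(n) {
			valid = append(valid, n)
		} else {
			invalid = append(invalid, n)
		}
	}
	return valid, invalid
}

func main() {
	// "Σ", "x٣" and "_tmp" are illustrative; the rest are from the article.
	valid, invalid := partition([]string{"Σ", "x٣", "_tmp", "42", "😀", "x🌞"})
	fmt.Println("valid:  ", valid)
	fmt.Println("invalid:", invalid)
}
```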

Although Go accepts identifiers that contain characters other than A-Z, a-z, 0-9, and _, using them is generally not advisable, for reasons of readability and accessibility, and to avoid rendering issues.
