So, template strings are a pretty neat feature in quite a few languages. If you're unfamiliar, template strings allow you to embed code expressions within string literals. For example, in JavaScript, you can write the following code and generate the output "Hello, 2 is the answer!"
console.log(`Hello, ${1 + 1} is the answer!`);
Pretty neat, right? But how does it work under the hood?
I dug around the NodeJS codebase to figure this out and ended up in quite a storm.
I was able to track down the code where the parsing of template literals happened and identified the TemplateObject data structure that was used to store the information. I got a little bit lost on what exactly a cooked_string, which was referenced quite a few times in the parsing code, was. Unfortunately for me, any kind of Googling on this only yielded results for cooked string bean recipes. I was in the realm of linguistic ambiguity, my friends! Maybe I was on the wrong track?
I was finally able to figure this out when I landed in some code in the Node codebase that appeared to be parsing the actual template literal string. From this I could discern the following.
- When the parser encounters a backquote, it invokes the parseTemplate method (code).
- The parseTemplate method iterates through the elements within the template literal. If one of the elements is an expression, meaning that it starts with ${, it first parses that expression and then continues parsing the other elements in the template literal (code).
- There is a parseTemplateElement method that parses the non-expression elements within the template literal. This is where the cooked_string business crops up. Interestingly enough, it appears that the cooked string is the value of a string chunk after its escape sequences have been processed, while the raw string is the text of that chunk exactly as it was written in the source. (code)
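The cooked/raw distinction is easier to see from the outside than from the parser. Here is a small sketch using a tag function, which is standard JavaScript behavior rather than anything specific to the parsing code above. The name `inspect` is just a hypothetical helper made up for this example.

```javascript
// A tag function receives the cooked string chunks as an array, with the
// raw versions of the same chunks attached on that array's `raw` property.
// `inspect` is a hypothetical helper name, not part of any library.
function inspect(strings, ...values) {
  return {
    cooked: Array.from(strings),
    raw: Array.from(strings.raw),
    values,
  };
}

const result = inspect`line1\n${1 + 1}\tend`;

// Cooked: escape sequences are processed, so "\n" is one newline character.
console.log(result.cooked);
// Raw: the source text is preserved, so "\n" is a backslash followed by "n".
console.log(result.raw);
// The embedded expression has already been evaluated by the time the tag runs.
console.log(result.values);
```

The first cooked chunk is six characters long (five letters plus a newline), while the first raw chunk is seven (five letters, a backslash, and an "n").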
The most interesting bit of code in the second step above is the parseExpression method, which is used to parse the contents of the embedded expression within the template literal. This method is used quite liberally in the parsing codebase. For example, it is used to parse the initialization code in a for-loop or the cases within a switch statement.
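One visible consequence of reusing a general-purpose expression parser is that the part between the braces is not limited to simple variables. The snippet below is a small illustration of that from the language side; the `user` and `items` values are made up for the example.

```javascript
// Hypothetical sample data for the illustration.
const items = ["a", "b", "c"];
const user = { name: "Ada" };

// Each ${...} holds an ordinary expression: a property access, a ternary,
// and a method chain whose callback contains a nested template literal.
const message = `${user.name} has ${items.length > 1 ? "several" : "one"} item(s): ${items
  .map((item) => `<${item}>`)
  .join(", ")}`;

console.log(message); // Ada has several item(s): <a>, <b>, <c>
```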
Following this exploration, it appears that the TemplateObject lead from earlier was not exactly the right place to go. Reading the code in the JavaScript-based Acorn parser offered a lot more clarity than spelunking through C++.
So there you have it: most of the magic with template literals happens at parse time, when the abstract syntax tree for the template literal is generated.
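To make the parse-time idea a bit more concrete, here is a toy sketch that splits a template literal's body into string chunks and expression chunks. It is not Acorn's implementation, only an illustration under simplifying assumptions: it ignores escape sequences, nested braces, and strings containing "}". The output shape loosely follows the ESTree TemplateLiteral and TemplateElement node types; the `toyParseTemplate` function and the `ToyExpression` node type are invented for this example.

```javascript
// Toy splitter, NOT a real parser. Assumes no nested braces and no escapes.
function toyParseTemplate(source) {
  const quasis = []; // the literal string chunks
  const expressions = []; // the embedded ${...} chunks
  let pos = 0;

  while (true) {
    const open = source.indexOf("${", pos);
    if (open === -1) {
      // No more expressions: the rest is the final (tail) string chunk.
      quasis.push({ type: "TemplateElement", value: { raw: source.slice(pos) }, tail: true });
      break;
    }
    const close = source.indexOf("}", open);
    if (close === -1) throw new SyntaxError("Unterminated ${ in template");

    quasis.push({ type: "TemplateElement", value: { raw: source.slice(pos, open) }, tail: false });
    // A real parser would build an expression subtree here.
    // The toy version only keeps the source text of the expression.
    expressions.push({ type: "ToyExpression", source: source.slice(open + 2, close) });
    pos = close + 1;
  }

  return { type: "TemplateLiteral", quasis, expressions };
}

// The body of the template literal from the start of the post, passed as a plain string.
const ast = toyParseTemplate("Hello, ${1 + 1} is the answer!");
console.log(JSON.stringify(ast, null, 2));
```

Note that there is always one more string chunk than there are expressions, since the chunks sit before, between, and after the embedded expressions.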
Top comments (6)
This is so cool, Safia! This is one of those things I've wondered about for so long but never really explored too deeply. I wanted to learn how template literals in libraries like styled-components work. Thank you for sharing what you learned!
Thanks for the post!
And for the really brave, you can check out the c++ code:
github.com/nodejs/node/blob/57c708...
Scary stuff
Pretty cool! I would be a little intimidated to delve into a codebase that huge, so major props for that. I could see this being an interesting series that could dig into widely used code, explaining how it works.
I would love to see some code snippets in the posts themselves, too!
I've got other posts in this style as well. You can view them on my profile page. The publication date on most of them is around six months ago if you want to narrow down your search.
Thanks a lot for the digging and explaining it to us with code references!
Amazing work!