<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Zubair Maqsood</title>
    <description>The latest articles on DEV Community by Zubair Maqsood (@zubairmaqsood866).</description>
    <link>https://dev.to/zubairmaqsood866</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2648859%2F11a5c318-3ee7-4a36-83c0-be70d06f4c0d.jpeg</url>
      <title>DEV Community: Zubair Maqsood</title>
      <link>https://dev.to/zubairmaqsood866</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/zubairmaqsood866"/>
    <language>en</language>
    <item>
      <title>The Cracked Engineer - A Journey to Low-Latency Principles</title>
      <dc:creator>Zubair Maqsood</dc:creator>
      <pubDate>Mon, 24 Feb 2025 20:18:22 +0000</pubDate>
      <link>https://dev.to/zubairmaqsood866/the-cracked-engineer-a-journey-to-low-latency-principles-1jnc</link>
      <guid>https://dev.to/zubairmaqsood866/the-cracked-engineer-a-journey-to-low-latency-principles-1jnc</guid>
      <description>&lt;p&gt;&lt;em&gt;For the past couple of months, I have been studying High-Frequency Trading (HFT) systems, how they work and how they are architected. For complete disclosure, I do not have any Quant/HFT experience [even though I'd love to!], I am merely studying its principles on how to scale with ultra-low latency and how these principles can be applied to the Cloud. I'm also using this as a learning opportunity to develop my own simplified version of a &lt;a href="https://github.com/GH05T-97/hft-engine" rel="noopener noreferrer"&gt;HFT Engine&lt;/a&gt;. You can click the link to follow my developments, or if you want fork and clone and make your own adjustments.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9d9cdj26lpl5gev5mn60.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9d9cdj26lpl5gev5mn60.png" alt="Market_chart" width="800" height="422"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Microsecond Mindset: Why Latency Matters Beyond HFT
&lt;/h2&gt;

&lt;p&gt;Latency matters for one core reason: users do not want to wait around indefinitely for an application or service to respond. In fact, research has shown:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;For every second delay in mobile page load, conversions can fall by up to 20%. &lt;br&gt;
&lt;a href="https://www.thinkwithgoogle.com/marketing-strategies/app-and-mobile/mobile-page-speed-conversion-data/#:~:text=For%20every%20second%20delay%20in,fall%20by%20up%20to%2020%25." rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now that's for your regular mobile/SaaS application that delivers at scale, and it doesn't even compare to what HFT engineers have to deal with! But that doesn't make it unimportant: mastering the principles of latency is a craft that affects business performance overall. In HFT, trading firms look to capitalise on market inefficiencies as competitively as possible, so they have adopted techniques to squeeze as much performance from their tech stack as they can. Within microseconds of a news event, such as the US Federal Reserve or the Bank of England raising or cutting interest rates, their algorithms should spare no wasted CPU cycle in capitalising on it.&lt;/p&gt;

&lt;p&gt;While a SaaS application might aim for 100ms response times, HFT systems operate in the microsecond range—that's 1000× faster!&lt;/p&gt;

&lt;p&gt;Now, as regular Software Engineers we may not need to go to the extremes of our genius counterparts in high finance, but the principles are what we can transfer into our working lives to become more productive.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding CPU Cache Hierarchies
&lt;/h2&gt;

&lt;p&gt;In the context of HFT, I mentioned before that this is very compute-heavy work: you're dealing with microseconds' worth of transactions and operations that are ultimately responsible for moving billions of dollars and moving markets. As a result, you'd naturally think the computers these professionals operate on are much more powerful than those of your average tech bro in Shoreditch who is too busy debugging a React app hosted on Vercel.&lt;/p&gt;

&lt;p&gt;Modern CPUs are fast; if they weren't, a lot of companies would go bankrupt. However, there is a bottleneck: memory access. A CPU can execute an instruction in about 1 nanosecond, but fetching data from RAM can take almost 100 nanoseconds. In the world of HFT, quite literally, not just every second but every nanosecond counts!&lt;/p&gt;

&lt;p&gt;Here's where an understanding of the cache hierarchy is crucial to combat this.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwjd1kfcbwf6r0ylzfprj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwjd1kfcbwf6r0ylzfprj.png" alt="cache_coherency" width="686" height="343"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There are three levels of cache memory:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;L1: smallest + fastest - stores data the processor is working on&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;L2: data that the processor may need to access soon&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;L3: data that is less likely to be needed by the processor&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The time it takes a CPU to access L1 is 1-2 nanoseconds; for L2 it's 5-10 nanoseconds; and for L3 it's 10-20 nanoseconds. Quite literally, every nanosecond counts in the world of HFT!&lt;/p&gt;

&lt;p&gt;Why am I telling you this? Because with this knowledge, HFT devs can optimise their code to ensure data is stored in L1/L2 caches to gain a competitive edge.&lt;/p&gt;
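&lt;p&gt;To make this concrete, here is a minimal Rust sketch (illustrative sizes and names, not from any real trading system) of why data layout and access order matter for the cache: the same matrix is summed twice, once walking memory sequentially (row-major, so each 64-byte cache line fetched into L1 is fully used) and once striding across it (column-major, wasting most of every line fetched). Both produce the same answer; on typical hardware the second traversal is noticeably slower.&lt;/p&gt;

```rust
use std::time::Instant;

const ROWS: usize = 1024;
const COLS: usize = 1024;

// Sequential addresses: each cache line loaded is consumed in full.
fn sum_row_major(m: &[u64]) -> u64 {
    let mut total = 0;
    for r in 0..ROWS {
        for c in 0..COLS {
            total += m[r * COLS + c];
        }
    }
    total
}

// Strided addresses: jumps COLS elements per step, cache-hostile.
fn sum_col_major(m: &[u64]) -> u64 {
    let mut total = 0;
    for c in 0..COLS {
        for r in 0..ROWS {
            total += m[r * COLS + c];
        }
    }
    total
}

fn main() {
    let matrix = vec![1u64; ROWS * COLS]; // 8 MB of u64s

    let t = Instant::now();
    let a = sum_row_major(&matrix);
    let row_time = t.elapsed();

    let t = Instant::now();
    let b = sum_col_major(&matrix);
    let col_time = t.elapsed();

    assert_eq!(a, b); // identical result, very different cache behaviour
    println!("row-major: {:?}, col-major: {:?}", row_time, col_time);
}
```

The absolute timings depend on the machine, but the gap between the two traversals is the cache hierarchy made visible.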

&lt;h2&gt;
  
  
  Memory Ordering and Atomics
&lt;/h2&gt;

&lt;p&gt;Understanding memory ordering and atomics is fundamental to low-latency principles.&lt;/p&gt;

&lt;p&gt;Atomic operations are the main building blocks for anything that involves multiple threads. In Rust, atomic operations are available on the standard atomic types that live in the &lt;code&gt;std::sync::atomic&lt;/code&gt; module.&lt;/p&gt;

&lt;p&gt;When multiple threads need to modify a variable, atomics make sure modifications happen in a defined order without data races. These atomics are implemented using CPU-specific instructions that provide thread-safe memory access with various memory ordering guarantees.&lt;/p&gt;

&lt;p&gt;There are several atomic types, and each of them has unique use cases in the world of HFT:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;AtomicBool&lt;/code&gt; - use cases are circuit breakers, emergency stop&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;AtomicU64/AtomicI64&lt;/code&gt; - use cases are position tracking, price update&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;AtomicUsize&lt;/code&gt; - use cases are message counters, queue indices, resource tracking&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of these atomics have similar methods attached to them, such as &lt;code&gt;load()&lt;/code&gt; which atomically reads the value, or &lt;code&gt;store()&lt;/code&gt; which atomically writes a value.&lt;/p&gt;
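&lt;p&gt;Here is a small sketch of those methods in action. The counter and circuit-breaker names are hypothetical, chosen only to mirror the HFT use cases listed above: four threads bump a shared fill counter with &lt;code&gt;fetch_add&lt;/code&gt; (an atomic read-modify-write), and &lt;code&gt;store()&lt;/code&gt;/&lt;code&gt;load()&lt;/code&gt; flip and read an emergency-stop flag.&lt;/p&gt;

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    // Hypothetical names for illustration: a shared fill counter and a
    // circuit-breaker flag, mirroring the use cases listed above.
    let fills = Arc::new(AtomicU64::new(0));
    let halted = Arc::new(AtomicBool::new(false));

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let fills = Arc::clone(&fills);
            thread::spawn(move || {
                for _ in 0..1_000 {
                    // Atomic read-modify-write: no increments are lost,
                    // even with four threads racing.
                    fills.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }

    halted.store(true, Ordering::Relaxed); // atomically write the flag
    assert_eq!(fills.load(Ordering::Relaxed), 4_000); // atomically read: nothing lost
    assert!(halted.load(Ordering::Relaxed));
}
```

With a plain `u64` and no synchronisation this program would not even compile in safe Rust; with atomics it is both safe and lock-free.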

&lt;p&gt;Memory ordering determines how atomic operations are synchronised between threads. In Rust, these are expressed with the &lt;code&gt;Ordering&lt;/code&gt; enum from the &lt;code&gt;std::sync::atomic&lt;/code&gt; module.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pub enum Ordering {&lt;br&gt;
    Relaxed,&lt;br&gt;
    Release,&lt;br&gt;
    Acquire,&lt;br&gt;
    AcqRel,&lt;br&gt;
    SeqCst,&lt;br&gt;
}&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Each of these memory orderings comes with different guarantees. What do I mean by guarantees?&lt;/p&gt;

&lt;p&gt;Memory ordering guarantees control how operations on atomic variables become visible to other threads. Let's break down each ordering option:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Relaxed&lt;/strong&gt;: The weakest ordering - only guarantees that the operation itself is atomic. There are no synchronization guarantees between threads. This is the fastest option but provides minimal safety.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Acquire&lt;/strong&gt;: Used for loads (reading). Ensures that subsequent operations in the same thread cannot be reordered before this load. Essentially saying "any operations after this read must happen after the read."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Release&lt;/strong&gt;: Used for stores (writing). Ensures that preceding operations in the same thread cannot be reordered after this store. Tells other threads "any operations before this write must be visible before the write becomes visible."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AcqRel&lt;/strong&gt;: Combines Acquire and Release semantics. Used for operations that both read and modify, like compare-and-swap. Provides bidirectional ordering guarantees.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SeqCst&lt;/strong&gt;: The strongest guarantee - Sequential Consistency. Ensures a total ordering of operations across all threads. It's the most intuitive but also the most expensive.&lt;/p&gt;

&lt;p&gt;To summarise: atomics are the building blocks, and memory ordering determines the atomics' "behaviour".&lt;/p&gt;
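&lt;p&gt;The classic demonstration of the Acquire/Release pair is publishing data through a flag. A minimal sketch (the price value and variable names are illustrative): the producer writes the data, then does a &lt;code&gt;Release&lt;/code&gt; store to the flag; the consumer spins on an &lt;code&gt;Acquire&lt;/code&gt; load of the same flag, and once it observes &lt;code&gt;true&lt;/code&gt; it is guaranteed to also see the data written before the store.&lt;/p&gt;

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::thread;

static PRICE: AtomicU64 = AtomicU64::new(0);
static READY: AtomicBool = AtomicBool::new(false);

fn main() {
    let producer = thread::spawn(|| {
        PRICE.store(101_250, Ordering::Relaxed); // write the data first...
        READY.store(true, Ordering::Release);    // ...then publish: everything
                                                 // above becomes visible with the flag
    });

    let consumer = thread::spawn(|| {
        // Acquire pairs with the Release store above: once we see `true`,
        // the earlier write to PRICE is guaranteed to be visible too.
        while !READY.load(Ordering::Acquire) {
            std::hint::spin_loop();
        }
        assert_eq!(PRICE.load(Ordering::Relaxed), 101_250);
    });

    producer.join().unwrap();
    consumer.join().unwrap();
}
```

Had both orderings been `Relaxed`, the consumer could legally observe the flag as `true` while still reading a stale `PRICE` of 0.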

&lt;h2&gt;
  
  
  Lock-Free Programming: The Art of Coordination Without Waiting
&lt;/h2&gt;

&lt;p&gt;Lock-Free Programming is a paradigm that allows multiple threads to operate on shared data without traditional "locking mechanisms" like &lt;a href="https://doc.rust-lang.org/std/sync/struct.Mutex.html" rel="noopener noreferrer"&gt;mutexes&lt;/a&gt;. Lock-Free Programming utilises Atomics as building blocks and also careful synchronisation techniques.&lt;/p&gt;

&lt;p&gt;If you want to figure out whether your program is "Lock-Free", then look at the checklist below&lt;/p&gt;

&lt;p&gt;&lt;code&gt;let isLockFree = false;&lt;/code&gt;&lt;br&gt;
&lt;code&gt;if ("Are you programming with multiple threads?") {&lt;/code&gt;&lt;br&gt;
&lt;code&gt;  if ("Do the threads access shared memory?") {&lt;/code&gt;&lt;br&gt;
&lt;code&gt;    if ("Can the threads operate WITHOUT blocking each other?") {&lt;/code&gt;&lt;br&gt;
&lt;code&gt;      isLockFree = true;&lt;/code&gt;&lt;br&gt;
&lt;code&gt;    }&lt;/code&gt;&lt;br&gt;
&lt;code&gt;  }&lt;/code&gt;&lt;br&gt;
&lt;code&gt;}&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;These are the characteristics of a lock-free program:&lt;/p&gt;

&lt;p&gt;1) Non-blocking - at least one thread makes progress even if the others fail&lt;br&gt;
2) Atomicity - operations appear indivisible and are executed without interference from other threads&lt;br&gt;
3) Progress guarantees - ensures the system as a whole makes progress&lt;/p&gt;
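&lt;p&gt;The workhorse of lock-free code is the compare-and-swap (CAS) retry loop. As a minimal sketch (the "maximum price seen" use case is illustrative, not from the queue implementation below): each thread reads the current value, and attempts to swap in its own only if nothing changed in between; if another thread won the race, it simply retries with the fresh value. No thread ever blocks waiting for another.&lt;/p&gt;

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

// Lock-free "record the maximum price observed" across threads.
fn update_max(max: &AtomicU64, observed: u64) {
    let mut current = max.load(Ordering::Relaxed);
    while observed > current {
        match max.compare_exchange_weak(current, observed, Ordering::AcqRel, Ordering::Relaxed) {
            Ok(_) => break,                  // our value won the race
            Err(actual) => current = actual, // someone else updated it; retry
        }
    }
}

fn main() {
    let max = Arc::new(AtomicU64::new(0));
    let handles: Vec<_> = (1u64..=4)
        .map(|i| {
            let max = Arc::clone(&max);
            thread::spawn(move || {
                for p in (i * 100)..(i * 100 + 50) {
                    update_max(&max, p);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(max.load(Ordering::Relaxed), 449); // highest price any thread saw
}
```

Note the three characteristics above in miniature: a failed CAS means some other thread succeeded (system-wide progress), the exchange itself is indivisible (atomicity), and no thread ever parks on a lock (non-blocking).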

&lt;p&gt;Here is an example of a lock-free implementation of a Queue data-structure&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmue868nu1h6h5bs2z6uv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmue868nu1h6h5bs2z6uv.png" alt="lock-free-queue-struct" width="800" height="665"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi765i9qodqvh1eh0ou97.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi765i9qodqvh1eh0ou97.png" alt="lock-free-queue-enqueue" width="800" height="729"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7kgcmcewxstm9lqh2e1y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7kgcmcewxstm9lqh2e1y.png" alt="lock-free-queue-dequeue" width="800" height="729"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Why use them? For the same reason we're discussing this topic in depth: low-latency computation. To be specific, here are the reasons&lt;/p&gt;

&lt;p&gt;1) &lt;strong&gt;Elimination of Context Switching&lt;/strong&gt; - when threads block on locks, the operating system has to context-switch to another thread, a cost lock-free code avoids&lt;br&gt;
2) &lt;strong&gt;Resilience against thread failures&lt;/strong&gt; - if one thread stalls or fails, other threads continue&lt;br&gt;
3) &lt;strong&gt;Cache efficiency&lt;/strong&gt; - Lock-free data structures and algorithms can be designed to minimise cache-line sharing&lt;/p&gt;

&lt;h2&gt;
  
  
  Multithreading Patterns for Performance
&lt;/h2&gt;

&lt;p&gt;When low latency is a concern, it isn't just about spinning up multiple threads/processes and hoping for the best; we also have to consider how these threads will interact with each other. The right pattern makes the difference between a system that breaks down under incoming traffic and one that maintains consistent low latency.&lt;/p&gt;

&lt;p&gt;Let's walk through some multithreading patterns to see what exactly this entails.&lt;/p&gt;

&lt;h3&gt;
  
  
  Readers-Writer Pattern
&lt;/h3&gt;

&lt;p&gt;This pattern optimises for scenarios where reads are much more frequent than writes - in the context of HFT systems, it's perfect for order book data where many strategies need to read the current state, but updates happen less frequently.&lt;/p&gt;

&lt;p&gt;Imagine a popular blog post that thousands of people are reading, but occasionally an editor needs to update it. You wouldn't want to lock out all readers while the editor works!&lt;/p&gt;

&lt;p&gt;This pattern is like having a special system where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multiple readers can view the content simultaneously (like users browsing your website)&lt;/li&gt;
&lt;li&gt;When an editor needs to make changes, they get exclusive access briefly&lt;/li&gt;
&lt;li&gt;Once editing is done, all readers immediately see the new version&lt;/li&gt;
&lt;/ul&gt;
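&lt;p&gt;Rust's standard library ships this pattern as &lt;code&gt;RwLock&lt;/code&gt;. A minimal sketch, where the "best bid" value is an illustrative stand-in for order book state: many reader threads take shared read guards concurrently, while a writer briefly takes exclusive access.&lt;/p&gt;

```rust
use std::sync::{Arc, RwLock};
use std::thread;

fn main() {
    // Illustrative stand-in for order book state: many readers, a rare writer.
    let best_bid = Arc::new(RwLock::new(100.25_f64));

    let readers: Vec<_> = (0..8)
        .map(|_| {
            let book = Arc::clone(&best_bid);
            thread::spawn(move || {
                // Any number of read guards may coexist; readers never
                // block each other, only a concurrent writer.
                let bid = *book.read().unwrap();
                assert!(bid >= 100.25); // sees either the old or new value
            })
        })
        .collect();

    {
        // The writer takes brief exclusive access; once this guard drops,
        // every subsequent reader sees the new value.
        let mut bid = best_bid.write().unwrap();
        *bid = 100.50;
    }

    for r in readers {
        r.join().unwrap();
    }
    assert_eq!(*best_bid.read().unwrap(), 100.50);
}
```

For read-dominated hot paths, crates in the ecosystem offer even cheaper variants, but the standard `RwLock` captures the pattern's essence.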

&lt;h3&gt;
  
  
  Producer-Consumer Pattern
&lt;/h3&gt;

&lt;p&gt;This is a fundamental pattern for building pipeline-style architectures that minimise latency while maintaining clean separation of concerns.&lt;br&gt;
Using channels &lt;code&gt;(mpsc::channel)&lt;/code&gt; provides:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Lock-free communication between components&lt;/li&gt;
&lt;li&gt;Back-pressure handling when a bounded &lt;code&gt;sync_channel&lt;/code&gt; is used&lt;/li&gt;
&lt;li&gt;Clean decoupling between data producers and consumers&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Think of this like the relationship between your API backend and frontend. Your backend produces data that your frontend consumes, but they work at different speeds.&lt;br&gt;
This pattern creates a buffer between components:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Producers (like your backend) can generate data at their own pace&lt;/li&gt;
&lt;li&gt;Consumers (like your frontend) process it when they're ready&lt;/li&gt;
&lt;li&gt;A channel between them handles timing differences&lt;/li&gt;
&lt;/ol&gt;
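&lt;p&gt;A minimal sketch of the pattern with Rust's standard channels (the "market tick" framing is illustrative): the producer pushes values at its own pace into a bounded &lt;code&gt;sync_channel&lt;/code&gt;, whose capacity provides back-pressure - &lt;code&gt;send&lt;/code&gt; blocks once the buffer is full - while the consumer drains them when ready, in FIFO order.&lt;/p&gt;

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    // Bounded channel of capacity 2: if the consumer falls behind,
    // the producer's send() blocks, giving natural back-pressure.
    let (tx, rx) = mpsc::sync_channel::<u64>(2);

    let producer = thread::spawn(move || {
        for tick in 1..=5 {
            tx.send(tick).expect("consumer hung up");
        }
        // tx is dropped here, which closes the channel and
        // ends the consumer's iteration below.
    });

    // iter() blocks until a value arrives and stops when the channel closes.
    let received: Vec<u64> = rx.iter().collect();
    producer.join().unwrap();

    assert_eq!(received, vec![1, 2, 3, 4, 5]); // FIFO order preserved
}
```

Swapping `sync_channel(2)` for the unbounded `mpsc::channel()` removes the back-pressure but never blocks the producer - the same trade-off you make when sizing queues between backend services.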

&lt;h3&gt;
  
  
  Shared Ownership Pattern
&lt;/h3&gt;

&lt;p&gt;Using atomic reference counting &lt;code&gt;(Arc&amp;lt;T&amp;gt;)&lt;/code&gt; is crucial for safe concurrent access to shared resources without the overhead of locks.&lt;br&gt;
This pattern:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Allows safe sharing of read-only or thread-safe data across components&lt;/li&gt;
&lt;li&gt;Eliminates expensive deep copies of data&lt;/li&gt;
&lt;li&gt;Ensures resources are properly cleaned up when no longer needed&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you've used React's context API or Redux, you're familiar with the concept of shared state. This pattern is similar but for memory management.&lt;br&gt;
Instead of copying data between components:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Multiple threads share references to the same data&lt;/li&gt;
&lt;li&gt;The system tracks how many threads are using the data&lt;/li&gt;
&lt;li&gt;When no one needs it anymore, it's automatically cleaned up&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This dramatically reduces memory overhead and improves performance by eliminating expensive copies while maintaining safety.&lt;/p&gt;
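&lt;p&gt;A minimal sketch of the pattern (the symbol list is an illustrative stand-in for a large read-only dataset): cloning an &lt;code&gt;Arc&lt;/code&gt; bumps a reference count instead of deep-copying the data, each worker thread reads the same allocation, and the data is freed automatically when the last reference drops.&lt;/p&gt;

```rust
use std::sync::Arc;
use std::thread;

fn main() {
    // One reference-counted allocation, shared by all workers:
    // no deep copies, freed when the last Arc is dropped.
    let symbols: Arc<Vec<String>> =
        Arc::new(vec!["EURUSD".into(), "GBPUSD".into(), "USDJPY".into()]);

    let handles: Vec<_> = (0..3usize)
        .map(|i| {
            let symbols = Arc::clone(&symbols); // bumps the count, copies nothing
            thread::spawn(move || symbols[i].len())
        })
        .collect();

    let total: usize = handles.into_iter().map(|h| h.join().unwrap()).sum();
    assert_eq!(total, 18); // three 6-character symbols
    assert_eq!(Arc::strong_count(&symbols), 1); // workers' references already dropped
}
```

For mutable shared state you would pair `Arc` with an atomic, `Mutex`, or `RwLock`; on its own, `Arc` is the zero-copy sharing half of the pattern.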

&lt;h2&gt;
  
  
  NUMA-Aware Application Design
&lt;/h2&gt;

&lt;p&gt;NUMA (Non-Uniform Memory Access) architecture is a fundamental concept when designing systems that need to scale across multiple processors. Let me explain this in a way that builds on our previous discussions.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is NUMA Architecture?
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxx4y9wf63wmc16f5g52z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxx4y9wf63wmc16f5g52z.png" alt="numa-architecture" width="763" height="522"&gt;&lt;/a&gt;&lt;br&gt;
As we discussed earlier, in a NUMA system:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The computer has multiple CPUs (physical processors)&lt;/li&gt;
&lt;li&gt;Each CPU has its own memory controller and directly attached memory&lt;/li&gt;
&lt;li&gt;Each CPU also has its own L1/L2/L3 cache hierarchy&lt;/li&gt;
&lt;li&gt;Memory access speeds vary depending on whether a CPU is accessing its local memory or remote memory (belonging to another CPU)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This creates a critical performance consideration: accessing local memory might be 1.5-3x faster than accessing remote memory, depending on the system.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why NUMA Awareness Matters
&lt;/h3&gt;

&lt;p&gt;In a high-frequency trading system, these access differences can dramatically impact performance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A thread accessing remote memory might experience 30-100ns additional latency per access. For applications making millions of memory accesses, this adds up quickly&lt;/li&gt;
&lt;li&gt;Memory-intensive operations might run twice as slow if data is placed inappropriately&lt;/li&gt;
&lt;li&gt;Cache coherency traffic across NUMA nodes creates additional overhead&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Key NUMA-Aware Design Principles
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Thread and Memory Locality&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Keep threads and their data on the same NUMA node&lt;/li&gt;
&lt;li&gt;Minimise cross-node communication and data sharing&lt;/li&gt;
&lt;li&gt;Use thread affinity to bind threads to specific CPUs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Memory Allocation Strategies&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Allocate memory on the same node where the processing thread runs&lt;/li&gt;
&lt;li&gt;Use NUMA-specific allocation functions when available&lt;/li&gt;
&lt;li&gt;Consider first-touch policies (memory is allocated on the node of the thread that first writes to it)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Data Partitioning&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Divide work and data along NUMA boundaries&lt;/li&gt;
&lt;li&gt;Each node handles a specific subset of the workload&lt;/li&gt;
&lt;li&gt;Minimize shared data structures that cross NUMA boundaries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;NUMA-Aware Data Structures&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Design data structures that respect NUMA topology&lt;/li&gt;
&lt;li&gt;Consider node-local queues feeding into global coordination&lt;/li&gt;
&lt;li&gt;Avoid false sharing across NUMA boundaries&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Translating Low-Latency Principles to AWS
&lt;/h2&gt;

&lt;p&gt;Understanding hardware-level performance concepts doesn't just matter for HFT systems - it can significantly improve latency-sensitive applications in the cloud. Even though HFT developers build their critical infrastructure on premises rather than in the cloud, the principles they operate on translate to the workflow of Backend and Cloud Engineers.&lt;/p&gt;

&lt;p&gt;Here's how to apply these principles when developing on AWS:&lt;/p&gt;

&lt;h3&gt;
  
  
  CPU Cache Optimisation in the Cloud
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Instance Selection:&lt;/strong&gt; Choose compute-optimised instances (c7g, c6g) with higher CPU-to-memory ratios for cache-sensitive workloads&lt;br&gt;
&lt;strong&gt;Data Locality:&lt;/strong&gt; Structure your applications to maintain data locality even in virtualised environments&lt;br&gt;
&lt;strong&gt;Workload Placement:&lt;/strong&gt; Run related services on the same instance to benefit from shared L3 cache&lt;br&gt;
&lt;strong&gt;Bare Metal Options:&lt;/strong&gt; For critical paths, consider AWS bare metal instances to eliminate hypervisor overhead and gain direct access to physical CPU caches&lt;/p&gt;

&lt;h3&gt;
  
  
  Memory Ordering and Atomics in Distributed Systems
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Single-Instance Performance:&lt;/strong&gt; Optimise your critical paths using the atomics and memory ordering techniques on high-CPU instances&lt;br&gt;
&lt;strong&gt;Service Boundaries:&lt;/strong&gt; Design service interfaces to minimise cross-process synchronisation needs&lt;br&gt;
&lt;strong&gt;Local Processing:&lt;/strong&gt; Process data where it's stored whenever possible to avoid network round-trips&lt;br&gt;
&lt;strong&gt;Consistent Instance Types:&lt;/strong&gt; Use the same instance family for services that need predictable memory behaviour&lt;/p&gt;

&lt;h3&gt;
  
  
  Lock-Free Programming in AWS
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Serverless Coordination:&lt;/strong&gt; Use DynamoDB with optimistic concurrency control instead of traditional locks&lt;br&gt;
&lt;strong&gt;SQS FIFO Queues:&lt;/strong&gt; Leverage SQS FIFO queues for producer-consumer patterns with guaranteed ordering&lt;br&gt;
&lt;strong&gt;Lambda Concurrency:&lt;/strong&gt; Design Lambda functions to be truly independent, avoiding shared state that requires locking&lt;br&gt;
&lt;strong&gt;ElastiCache Redis:&lt;/strong&gt; Use Redis atomic operations for distributed coordination instead of application-level locks&lt;/p&gt;

&lt;h3&gt;
  
  
  Multithreading Patterns for AWS Services
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Readers-Writer Pattern:&lt;/strong&gt; Implement with DynamoDB's strongly consistent vs. eventually consistent reads&lt;br&gt;
&lt;strong&gt;Producer-Consumer:&lt;/strong&gt; Use SQS and SNS for decoupled, high-throughput communication&lt;br&gt;
&lt;strong&gt;Shared Ownership:&lt;/strong&gt; Leverage ElastiCache for shared state across distributed components&lt;/p&gt;

&lt;h3&gt;
  
  
  NUMA-Aware Design in AWS
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Placement Groups:&lt;/strong&gt; Use cluster placement groups to ensure instances run on the same underlying hardware with low-latency networking&lt;br&gt;
&lt;strong&gt;Enhanced Networking:&lt;/strong&gt; Enable ENA (Elastic Network Adapter) for improved latency between instances&lt;br&gt;
&lt;strong&gt;Topology Awareness:&lt;/strong&gt; Create availability zone-aware designs that keep related components physically close&lt;br&gt;
&lt;strong&gt;Instance Size Selection:&lt;/strong&gt; Choose appropriately sized instances - splitting a workload across multiple smaller instances might hit NUMA-like boundaries&lt;/p&gt;

&lt;h3&gt;
  
  
  AWS-Specific Optimisations
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Direct VPC Endpoints:&lt;/strong&gt; Reduce latency by connecting directly to AWS services without traversing the public internet&lt;br&gt;
&lt;strong&gt;Enhanced Networking:&lt;/strong&gt; Use Elastic Network Adapter (ENA) and Elastic Fabric Adapter (EFA) for high-throughput, low-latency networking&lt;br&gt;
&lt;strong&gt;Nitro System:&lt;/strong&gt; Leverage AWS Nitro-based instances for hardware acceleration and reduced virtualisation overhead&lt;br&gt;
&lt;strong&gt;Graviton Processors:&lt;/strong&gt; Consider ARM-based Graviton instances which can offer better performance characteristics for certain workloads&lt;/p&gt;

&lt;p&gt;By applying these low-level performance principles to your AWS architecture, you can build systems that achieve consistent low latency even in virtualised cloud environments. The key is understanding that while the physical hardware may be abstracted, the principles of CPU caching, memory access patterns, and concurrent programming still directly impact your application's performance.&lt;/p&gt;

</description>
      <category>rust</category>
      <category>aws</category>
      <category>programming</category>
    </item>
    <item>
      <title>[Boost]</title>
      <dc:creator>Zubair Maqsood</dc:creator>
      <pubDate>Fri, 21 Feb 2025 11:48:14 +0000</pubDate>
      <link>https://dev.to/zubairmaqsood866/rust-5fn4</link>
      <guid>https://dev.to/zubairmaqsood866/rust-5fn4</guid>
      <description></description>
    </item>
    <item>
      <title>The Cracked Engineer: From JavaScript to Rust: The Engineer’s Guide to Systems Programming</title>
      <dc:creator>Zubair Maqsood</dc:creator>
      <pubDate>Wed, 12 Feb 2025 02:31:15 +0000</pubDate>
      <link>https://dev.to/zubairmaqsood866/the-cracked-engineer-moving-from-javascript-to-rust-the-basics-3ncl</link>
      <guid>https://dev.to/zubairmaqsood866/the-cracked-engineer-moving-from-javascript-to-rust-the-basics-3ncl</guid>
      <description>&lt;p&gt;&lt;em&gt;In my previous article, I dissected JavaScript's inner workings and mechanics. If you haven't read it yet, I recommend starting &lt;a href="https://dev.to/gho5t_97/the-cracked-engineers-guide-to-javascript-mechanics-magic-and-misconceptions-28jm"&gt;here&lt;/a&gt;. Today, we're diving into Rust, but first, let me explain why I chose it. Coming from a Cloud/Node.js background, I'm intimately familiar with interpreted languages. However, I felt something was missing – a deep understanding of compiled and systems programming languages. While C and C++ were options, their notorious compiler issues steered me away. Enter Rust: a modern systems language that's been gaining significant traction&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Art of Ownership: Memory Management Made Clear
&lt;/h2&gt;

&lt;p&gt;Rust uses the &lt;strong&gt;Ownership&lt;/strong&gt; model. Ownership is a set of rules that governs how a Rust program manages memory, and the compiler checks those rules. If they are violated, the program will not compile. &lt;em&gt;TLDR: the Rust compiler has rules, and if your code breaks them, it simply won't compile&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This &lt;em&gt;"strictness"&lt;/em&gt; allows Rust to have these attributes&lt;br&gt;
1) Control over memory&lt;br&gt;
2) To be Error free&lt;br&gt;
3) Faster runtimes&lt;br&gt;
4) Smaller program size&lt;/p&gt;

&lt;p&gt;At compile time, Rust decides whether variables and the results of functions should be placed in one of two core memory regions: &lt;strong&gt;the Stack&lt;/strong&gt; and &lt;strong&gt;the Heap&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I went over the stack data structure briefly. To put it simply, it is a &lt;strong&gt;LIFO (Last In, First Out)&lt;/strong&gt; structure. The heap is something different entirely. As a data structure, a heap is tree-like (like a binary search tree) but with a twist: unlike the binary search tree, there is no implied ordering between siblings and no implied sequence for in-order traversal. (The heap used for memory allocation shares the name, but is really just a large pool of memory for dynamically sized data.)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhoyzhft9z9wkoiqqjagf.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhoyzhft9z9wkoiqqjagf.jpg" alt="Binary Search Tree vs Heap" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So at compile time, Rust decides whether variables should be pushed onto the stack or allocated on the heap. For something to go on the stack, its value must have a known, fixed size at compile time &lt;em&gt;(primitives such as bool or char are examples of this)&lt;/em&gt;, whereas something goes on the heap when it is dynamically sized - usually large data, perhaps a batch of JSON returned from an API call that needs to be stored, or a String that can be altered dynamically &lt;em&gt;(we can create one easily using the String::new() function)&lt;/em&gt;&lt;/p&gt;
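&lt;p&gt;A minimal sketch of that split (variable names are illustrative): fixed-size primitives live on the stack, while a &lt;code&gt;String&lt;/code&gt; is a small stack-resident handle - pointer, length, capacity - whose character data lives on the heap and can grow at runtime.&lt;/p&gt;

```rust
fn main() {
    // Size known at compile time -> stored on the stack.
    let flag: bool = true;
    let letter: char = 'z';

    // A String's handle lives on the stack, but its character data
    // is heap-allocated and can grow dynamically.
    let mut body = String::new();
    body.push_str("response from some API"); // e.g. data whose size we only know at runtime

    assert!(body.starts_with("response"));
    assert!(flag && letter == 'z');
}
```

When `body` goes out of scope at the end of `main`, its owner is dropped and the heap allocation is freed - no garbage collector required.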

&lt;h2&gt;
  
  
  Borrowing: Share Nicely or Don't Share at All
&lt;/h2&gt;

&lt;p&gt;Rust uses a borrowing mechanism to access data without taking ownership of it. Instead of moving a value, we can pass a reference to it using the &lt;strong&gt;&amp;amp;&lt;/strong&gt; operator&lt;/p&gt;

&lt;p&gt;&lt;code&gt;let message = String::from("Hello");  // message owns this string&lt;/code&gt;&lt;br&gt;
&lt;code&gt;let greeting = message;  // ownership moves to greeting&lt;/code&gt;&lt;br&gt;
&lt;code&gt;// println!("{}", message);  // ❌ Error: message has been moved&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;// This is different from JavaScript where:&lt;/code&gt;&lt;br&gt;
&lt;code&gt;// let message = "Hello";&lt;/code&gt;&lt;br&gt;
&lt;code&gt;// let greeting = message;  // Creates a copy/reference&lt;/code&gt;&lt;br&gt;
&lt;code&gt;// console.log(message);    // ✅ Works fine in JS&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;// Now let's look at borrowing&lt;/code&gt;&lt;br&gt;
&lt;code&gt;let original = String::from("Hello");&lt;/code&gt;&lt;br&gt;
&lt;code&gt;let borrowed = &amp;amp;original;  // Immutable borrow with &amp;amp;&lt;/code&gt;&lt;br&gt;
&lt;code&gt;println!("{}", borrowed);  // Prints: Hello&lt;/code&gt;&lt;br&gt;
&lt;code&gt;println!("{}", original);  // ✅ Still works! original maintains ownership&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;In the example above, the first line initialises the message variable, which owns the value String::from("Hello") stored on the heap; the stack holds a pointer telling the program where that value is located. In the second line, ownership is transferred from message to a new variable called greeting. The third line shows the error that signifies the ownership mechanism: message no longer has access to the string. Contrast this with JavaScript, where it works fine because there is no ownership mechanism; the greeting variable merely copies the value from message.&lt;/p&gt;

&lt;p&gt;In the third example, if we want to print both the original and borrowed variables successfully, we need the &amp;amp; operator. Borrowing does not transfer ownership of the value; it allows the borrowed variable to do exactly what the name suggests, &lt;em&gt;to borrow the value&lt;/em&gt;. The original variable still owns the string, however!&lt;/p&gt;

&lt;p&gt;Think of this concept as a pencil: you are its owner, and when a friend needs a pencil for something you let him borrow it, but it is still yours.&lt;/p&gt;

&lt;p&gt;There is one crucial caveat, however: the &amp;amp; operator creates an immutable reference, meaning the variable that is doing the borrowing cannot mutate the original value. So how can we mutate the value through the borrower?&lt;/p&gt;

&lt;p&gt;We use an &lt;strong&gt;&amp;amp;mut&lt;/strong&gt; reference (and the variable itself must be declared with the &lt;strong&gt;mut&lt;/strong&gt; keyword):&lt;/p&gt;

&lt;p&gt;&lt;code&gt;let mut value = String::from("Hello");&lt;/code&gt;&lt;br&gt;
&lt;code&gt;let mut_borrow = &amp;amp;mut value;  // Mutable borrow with &amp;amp;mut&lt;/code&gt;&lt;br&gt;
&lt;code&gt;mut_borrow.push_str(" World");  // Modify through mutable reference&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;By taking an &amp;amp;mut reference, we allow the borrower to mutate the original value. So now, if we printed value, it would output "Hello World". Note that Rust allows only one mutable reference to a value at a time; the compiler enforces this to prevent data races.&lt;/p&gt;
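&lt;p&gt;Putting the pieces together, here is a small runnable sketch (my own example, with a hypothetical append_world helper, not code from the engine) showing a mutable borrow in action; the borrower gets temporary, exclusive write access, and the owner can read the value again once the borrow ends:&lt;/p&gt;

```rust
// A mutable borrow (&mut) grants temporary, exclusive write access.
fn append_world(s: &mut String) {
    s.push_str(" World"); // modify through the mutable reference
}

fn main() {
    let mut value = String::from("Hello");
    append_world(&mut value); // lend `value` mutably for the call
    // The borrow ends when the call returns, so the owner can read it again.
    // A second `&mut value` while the first borrow was live would not compile.
    println!("{}", value); // prints: Hello World
}
```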

&lt;h2&gt;
  
  
  Thread Safety Without the Headaches
&lt;/h2&gt;

&lt;p&gt;In multithreaded languages like Java or C++, an independent unit of execution is called a thread, and in highly performant systems like the trading engine of an FX exchange, you can have multiple threads working at the same time. This multithreading approach helps performance, but it can lead to problems such as &lt;strong&gt;race conditions and deadlocks&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;But before that, we need to become familiar with some types from Rust's standard library that are commonly used to build thread-safe programs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1) Arc from &lt;code&gt;std::sync&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Arc stands for Atomically Reference Counted.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;use std::sync::Arc;&lt;/code&gt;&lt;br&gt;
&lt;code&gt;// Arc enables multiple ownership of the same data across threads&lt;/code&gt;&lt;br&gt;
&lt;code&gt;let data = Arc::new(vec![1, 2, 3]);&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;// We can clone the Arc to share ownership&lt;/code&gt;&lt;br&gt;
&lt;code&gt;let data_clone = Arc::clone(&amp;amp;data);  // This is thread-safe&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;// Both data and data_clone point to the same memory&lt;/code&gt;&lt;br&gt;
&lt;code&gt;// Arc keeps track of how many references exist&lt;/code&gt;&lt;br&gt;
&lt;code&gt;// Memory is freed when the last Arc is dropped&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Think of Arc as a smart pointer that keeps track of everything that is referencing the data. On its own, it allows read-only access from multiple threads.&lt;/p&gt;
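&lt;p&gt;As a quick illustration (my own sketch, not code from the engine), here is Arc sharing a vector across several threads for read-only access:&lt;/p&gt;

```rust
use std::sync::Arc;
use std::thread;

fn main() {
    // Arc lets several threads share read-only ownership of the same Vec.
    let data = Arc::new(vec![1, 2, 3]);

    let mut handles = Vec::new();
    for _ in 0..3 {
        let data = Arc::clone(&data); // bumps the reference count, not the data
        handles.push(thread::spawn(move || {
            data.iter().sum::<i32>() // each thread only reads
        }));
    }

    for handle in handles {
        assert_eq!(handle.join().unwrap(), 6);
    }
    // The vector is freed when the last Arc is dropped.
}
```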

&lt;p&gt;&lt;strong&gt;2) Mutex from &lt;code&gt;std::sync&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Mutex stands for mutual exclusion. It allows only one thread to access some data at any given time. For a thread to access data in a mutex, it must first acquire the mutex's lock: a data structure, part of the mutex, that keeps track of who currently has exclusive access to the data. Only one thread can hold the lock at a time. This prevents data races.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;use std::sync::Mutex;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;// Mutex wraps data to ensure only one thread can access it at a time&lt;/code&gt;&lt;br&gt;
&lt;code&gt;let counter = Mutex::new(0);&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;// To access data, you must lock() the Mutex&lt;/code&gt;&lt;br&gt;
&lt;code&gt;{&lt;/code&gt;&lt;br&gt;
&lt;code&gt;let mut num = counter.lock().unwrap();&lt;/code&gt;&lt;br&gt;
&lt;code&gt;*num += 1;&lt;/code&gt;&lt;br&gt;
&lt;code&gt;}  // Lock is automatically released here&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;// Trying to access while locked blocks the thread until lock is released&lt;/code&gt;&lt;/p&gt;
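&lt;p&gt;In practice, Arc and Mutex are combined: Arc shares ownership of the counter across threads, while Mutex serialises the writes. A minimal runnable sketch (my own example):&lt;/p&gt;

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Arc for shared ownership across threads, Mutex for exclusive access.
    let counter = Arc::new(Mutex::new(0));
    let mut handles = Vec::new();

    for _ in 0..4 {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            let mut num = counter.lock().unwrap(); // block until the lock is free
            *num += 1;
        })); // lock released when `num` goes out of scope
    }

    for handle in handles {
        handle.join().unwrap();
    }

    println!("count = {}", *counter.lock().unwrap()); // prints: count = 4
}
```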

&lt;p&gt;&lt;strong&gt;3) Channels, using the mpsc module from &lt;code&gt;std::sync&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Another way of ensuring thread safety is channels, through which threads communicate by sending each other messages.&lt;/p&gt;

&lt;p&gt;The best analogy is what I got from the official &lt;a href="https://doc.rust-lang.org/book/ch16-02-message-passing.html" rel="noopener noreferrer"&gt;Rust documentation&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;You can imagine a channel in programming as being like a directional channel of water, such as a stream or a river. If you put something like a rubber duck into a river, it will travel downstream to the end of the waterway.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;A channel has two halves: a transmitter and a receiver. The transmitter half is the upstream location where you put rubber ducks into the river, and the receiver half is where the rubber duck ends up downstream. One part of your code calls methods on the transmitter with the data you want to send, and another part checks the receiving end for arriving messages. A channel is said to be closed if either the transmitter or receiver half is dropped.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;use std::sync::mpsc;  // mpsc = Multi-Producer, Single-Consumer&lt;/code&gt;&lt;br&gt;
&lt;code&gt;use std::thread;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;fn main() {&lt;/code&gt;&lt;br&gt;
&lt;code&gt;// Create a channel with sender and receiver&lt;/code&gt;&lt;br&gt;
&lt;code&gt;let (sender, receiver) = mpsc::channel();&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;// Clone sender for multiple producers&lt;/code&gt;&lt;br&gt;
&lt;code&gt;let sender_clone = sender.clone();&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;// Producer thread 1&lt;/code&gt;&lt;br&gt;
&lt;code&gt;thread::spawn(move || {&lt;/code&gt;&lt;br&gt;
&lt;code&gt;sender.send("Hello from thread 1").unwrap();&lt;/code&gt;&lt;br&gt;
&lt;code&gt;});&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;// Producer thread 2&lt;/code&gt;&lt;br&gt;
&lt;code&gt;thread::spawn(move || {&lt;/code&gt;&lt;br&gt;
&lt;code&gt;sender_clone.send("Hello from thread 2").unwrap();&lt;/code&gt;&lt;br&gt;
&lt;code&gt;});&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;// Main thread receives&lt;/code&gt;&lt;br&gt;
&lt;code&gt;for message in receiver {&lt;/code&gt;&lt;br&gt;
&lt;code&gt;println!("{}", message);&lt;/code&gt;&lt;br&gt;
&lt;code&gt;}&lt;/code&gt;&lt;br&gt;
&lt;code&gt;}&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Channels are useful for async communication, event handling, data streaming, and message passing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building Blocks: Rust's Type System
&lt;/h2&gt;

&lt;p&gt;One thing we haven't gone over in this issue is Rust's syntax, which I've heard is very similar to C or C++ (I don't have an opinion, as I have never used either of them). So let's go over it briefly, and let's also compare it to JavaScript's analogous features.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1) Structs&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A struct in Rust is like a blueprint for creating data objects. Think of it as a more rigid, type-safe version of JavaScript classes/objects:&lt;/p&gt;

&lt;p&gt;Rust enforces:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Exact field types (String, bool, u64)&lt;/li&gt;
&lt;li&gt;All fields must be initialised&lt;/li&gt;
&lt;li&gt;Fields can't be added/removed after creation&lt;/li&gt;
&lt;li&gt;Type checking at compile time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;JavaScript classes are more flexible but less safe, for the following reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Dynamic typing&lt;/li&gt;
&lt;li&gt;Properties can be added/removed&lt;/li&gt;
&lt;li&gt;No compile-time type checking (unless using TypeScript)&lt;/li&gt;
&lt;li&gt;More prone to runtime errors&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk43qysqtpe5lzdmkzlfo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk43qysqtpe5lzdmkzlfo.png" alt="Classes vs Structs" width="515" height="582"&gt;&lt;/a&gt;&lt;/p&gt;
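&lt;p&gt;As a small sketch of the points above (the Order struct and its fields are hypothetical, chosen for the trading theme, not taken from the engine):&lt;/p&gt;

```rust
// Every field has an exact type, and all fields must be
// initialised when the value is created.
#[derive(Debug)]
struct Order {
    symbol: String,
    is_buy: bool,
    quantity: u64,
}

fn main() {
    let order = Order {
        symbol: String::from("EURUSD"),
        is_buy: true,
        quantity: 1_000,
    };
    // order.notes = "late"; // ❌ fields cannot be added after creation
    println!("{:?}", order); // type-checked at compile time
}
```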

&lt;p&gt;&lt;strong&gt;2) Traits&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Traits are Rust's way of defining shared behaviour, similar to interfaces in TypeScript or abstract classes in JavaScript. The key characteristics are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Defining behaviour contracts&lt;/li&gt;
&lt;li&gt;Can have default implementations&lt;/li&gt;
&lt;li&gt;Multiple traits can be implemented&lt;/li&gt;
&lt;li&gt;Enforced at compile time&lt;/li&gt;
&lt;li&gt;Used for operator overloading and type conversion&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The Tradeable example below shows:&lt;/p&gt;

&lt;p&gt;Required methods (get_price, update_price)&lt;br&gt;
Self reference (&amp;amp;self)&lt;br&gt;
Return type specifications (-&amp;gt; f64)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fehwsza2w3xvngle4i7ag.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fehwsza2w3xvngle4i7ag.png" alt="Traits vs Interfaces" width="526" height="710"&gt;&lt;/a&gt;&lt;/p&gt;
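&lt;p&gt;In code, the Tradeable trait sketched above might look like this (CurrencyPair is a hypothetical implementing type I've added for illustration):&lt;/p&gt;

```rust
// A behaviour contract: any type implementing Tradeable must
// provide these methods, enforced at compile time.
trait Tradeable {
    fn get_price(&self) -> f64;            // read through &self
    fn update_price(&mut self, price: f64); // mutate through &mut self
}

// A hypothetical asset type implementing the contract.
struct CurrencyPair {
    price: f64,
}

impl Tradeable for CurrencyPair {
    fn get_price(&self) -> f64 {
        self.price
    }
    fn update_price(&mut self, price: f64) {
        self.price = price;
    }
}

fn main() {
    let mut pair = CurrencyPair { price: 1.08 };
    pair.update_price(1.09);
    println!("{}", pair.get_price()); // prints: 1.09
}
```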

&lt;p&gt;&lt;strong&gt;3) Implementations&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Implementations in Rust separate the data (struct) from behavior (methods). This is different from JavaScript's class-based approach.&lt;/p&gt;

&lt;p&gt;The key differences are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Clear separation of data and behaviour&lt;/li&gt;
&lt;li&gt;Multiple impl blocks allowed&lt;/li&gt;
&lt;li&gt;Static methods (new)&lt;/li&gt;
&lt;li&gt;Instance methods with explicit self reference&lt;/li&gt;
&lt;li&gt;Mutable vs immutable methods (&amp;amp;mut self vs &amp;amp;self)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;JavaScript lumps everything together in the class definition, which can be less organised but is more familiar to OOP developers.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkcut0wlqdmbxxcv0cfkn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkcut0wlqdmbxxcv0cfkn.png" alt="Implementations vs Class methods" width="549" height="587"&gt;&lt;/a&gt;&lt;/p&gt;
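&lt;p&gt;A short sketch of these points (the Position struct is hypothetical, not taken from the engine):&lt;/p&gt;

```rust
// Data lives in the struct; behaviour lives in separate impl blocks.
struct Position {
    symbol: String,
    quantity: i64,
}

impl Position {
    // Static method (no self): the conventional constructor.
    fn new(symbol: &str) -> Position {
        Position { symbol: symbol.to_string(), quantity: 0 }
    }

    // Immutable method: reads state through &self.
    fn quantity(&self) -> i64 {
        self.quantity
    }
}

// A second impl block for the same type is perfectly legal.
impl Position {
    // Mutable method: changes state through &mut self.
    fn add(&mut self, amount: i64) {
        self.quantity += amount;
    }
}

fn main() {
    let mut pos = Position::new("EURUSD");
    pos.add(100);
    println!("{} {}", pos.symbol, pos.quantity()); // prints: EURUSD 100
}
```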

&lt;p&gt;&lt;strong&gt;4) Attributes&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Attributes in Rust are metadata attached to code elements, and they're more powerful than JavaScript decorators. They're like special tags or labels that give extra instructions or capabilities to your code; think of them as sticky notes that tell the compiler "hey, do something special with this code."&lt;/p&gt;

&lt;p&gt;These are the common use cases:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Deriving common traits (#[derive(Debug)])&lt;/li&gt;
&lt;li&gt;Conditional compilation (#[cfg(test)])&lt;/li&gt;
&lt;li&gt;Testing (#[test])&lt;/li&gt;
&lt;li&gt;Documentation&lt;/li&gt;
&lt;li&gt;Compiler optimisations&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;JavaScript decorators are more limited and primarily used for class and method modifications.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5eolirckvw7zx4sjuxjq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5eolirckvw7zx4sjuxjq.png" alt="Attributes vs Decorators" width="563" height="485"&gt;&lt;/a&gt;&lt;/p&gt;
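&lt;p&gt;Here is a small sketch of the first three use cases (the Tick struct and mid_price function are hypothetical examples of my own):&lt;/p&gt;

```rust
// #[derive(...)] asks the compiler to generate trait implementations.
#[derive(Debug, Clone, PartialEq)]
struct Tick {
    price: f64,
}

fn mid_price(a: &Tick, b: &Tick) -> f64 {
    (a.price + b.price) / 2.0
}

// #[cfg(test)] compiles this module only for `cargo test`;
// #[test] marks a function as a test case.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn mid_price_averages() {
        let a = Tick { price: 1.0 };
        let b = Tick { price: 2.0 };
        assert_eq!(mid_price(&a, &b), 1.5);
    }
}

fn main() {
    let tick = Tick { price: 1.08 };
    println!("{:?}", tick.clone()); // Debug and Clone both come from the derive
}
```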

&lt;h2&gt;
  
  
  Putting It All Together
&lt;/h2&gt;

&lt;p&gt;In this deep dive into Rust, we've explored what makes it a powerful systems programming language, especially when compared to interpreted languages like JavaScript. From its strict ownership model to its thread safety guarantees, Rust offers a unique approach to building reliable and performant software.&lt;/p&gt;

&lt;p&gt;Moving from JavaScript to Rust represents more than just learning a new syntax - it's about adopting a different mindset towards software development. Where JavaScript offers flexibility and ease of use, Rust demands precision and thoughtfulness. This trade-off brings us significant rewards: better performance, fewer runtime bugs, and the ability to build systems that are correct by construction.&lt;/p&gt;

</description>
      <category>rust</category>
      <category>systems</category>
      <category>compiling</category>
      <category>programming</category>
    </item>
    <item>
      <title>The Cracked Engineer’s Guide to JavaScript: Mechanics, Magic, and Misconceptions</title>
      <dc:creator>Zubair Maqsood</dc:creator>
      <pubDate>Thu, 06 Feb 2025 01:37:33 +0000</pubDate>
      <link>https://dev.to/zubairmaqsood866/the-cracked-engineers-guide-to-javascript-mechanics-magic-and-misconceptions-28jm</link>
      <guid>https://dev.to/zubairmaqsood866/the-cracked-engineers-guide-to-javascript-mechanics-magic-and-misconceptions-28jm</guid>
      <description>&lt;p&gt;After a career break from tech, I found myself going back to the roots of Software Engineering, exploring different areas such as AI and Rust. Along the way, I realised it would be valuable to document my findings and build a guide on how to be "cracked" as a Software Engineer—not just writing code, but truly understanding how things work under the hood.&lt;/p&gt;

&lt;p&gt;The reason is simple: &lt;em&gt;with the rise of Generative AI, the barrier to entry for software engineers is lower than ever, but at the same time, the demands on existing engineers are only increasing. I believe we are shifting from an era of specialism to an era of breadth.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Before tools like ChatGPT and Claude, companies hired experts in specific frameworks or tech stacks. Now, learning new technologies has become easier and faster, with automation handling many repetitive tasks. But this means that true engineering knowledge is more important than ever.&lt;/p&gt;

&lt;p&gt;Anyone can code. Few can engineer. To stand out, it’s crucial to understand the internals of the technologies we use—how they work, how they scale, and how they solve complex modern-day problems.&lt;br&gt;
With that in mind, let’s begin with JavaScript—the mechanics and magic behind it all.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding JavaScript’s Execution Model
&lt;/h2&gt;

&lt;p&gt;JavaScript is a single-threaded programming language: unlike compiled languages such as Rust and C++, it can only execute one operation at a time. Here are some key features of JavaScript to note:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It uses a call stack to keep track of function execution&lt;/li&gt;
&lt;li&gt;When a function calls another function, the inner call is added to the call stack and popped off when it has completed execution&lt;/li&gt;
&lt;li&gt;Blocking code (nested for loops or synchronous network calls) freezes everything, as JavaScript can't move forward until it is finished&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the question may be: how does JavaScript handle tasks like API calls and timers if it is single-threaded? Read through the whole page and you will find out!&lt;/p&gt;

&lt;h2&gt;
  
  
  The Call Stack – "The Brain of JavaScript Execution"
&lt;/h2&gt;

&lt;p&gt;Before we jump in on how JavaScript handles things like API calls and timers, it is crucial to understand one core aspect of the language: the Call Stack. You can think of it as the "brain" of JavaScript, as the subtitle of this section suggests.&lt;/p&gt;

&lt;p&gt;But before that, let’s briefly go over what a stack is in the realm of Computer Science.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwciq3plbn3jmenf5ympp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwciq3plbn3jmenf5ympp.png" alt=" " width="487" height="257"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;stack&lt;/strong&gt; is a data structure that can contain a list of elements, which has two operations. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Push&lt;/strong&gt; - adds an element into the Stack&lt;br&gt;
&lt;strong&gt;Pop&lt;/strong&gt; - removes the recently added element from the Stack.&lt;/p&gt;

&lt;p&gt;When JavaScript executes a function, it adds it to the call stack; if that function calls another function inside, that function is added on top of the call stack. Only when its execution is finished is a function popped off the stack.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fajjbty0mbspb295d0ub8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fajjbty0mbspb295d0ub8.png" alt=" " width="800" height="245"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So we have a function named addition. When it is called at the bottom of the script, it is the first one added to the stack. Inside the addition function there's a call to anotherFunction(), which is the second one added to the stack, and inside that, someOtherFunction() is called, which is the last one added to the stack.&lt;/p&gt;

&lt;p&gt;Once &lt;em&gt;someOtherFunction&lt;/em&gt; is finished executing, then it will be the first one to be popped off the stack, and the same goes for the other functions when they are finished with execution.&lt;/p&gt;
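&lt;p&gt;The same call chain can be sketched in code (the function bodies here are my own illustration; only the names come from the example above):&lt;/p&gt;

```javascript
// Each call is pushed onto the stack; the innermost call
// finishes (and is popped) first.
function someOtherFunction() {
  return 2; // top of the stack, popped first
}

function anotherFunction() {
  return someOtherFunction() + 1; // pushed on top of anotherFunction
}

function addition() {
  return anotherFunction() + 1; // pushed on top of addition
}

console.log(addition()); // prints: 4
```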

&lt;h2&gt;
  
  
  The Event Loop &amp;amp; Async JavaScript – "How JavaScript Never Sleeps"
&lt;/h2&gt;

&lt;p&gt;At the beginning, I asked how JavaScript handles things like API calls if it is single-threaded. Enter the &lt;strong&gt;event loop&lt;/strong&gt;. This is what allows &lt;em&gt;concurrency&lt;/em&gt;: it enables non-blocking code by offloading work to the browser or the Node.js runtime environment.&lt;/p&gt;

&lt;p&gt;So there are three components to bear in mind:&lt;/p&gt;

&lt;p&gt;1) &lt;strong&gt;The Call Stack&lt;/strong&gt; - we went over this already in the previous section&lt;br&gt;
2) &lt;strong&gt;Web APIs&lt;/strong&gt; - JavaScript delegates asynchronous tasks (such as setTimeout, fetch, and DOM events) to the Web APIs provided by the browser (or Node.js).&lt;br&gt;
3) &lt;strong&gt;Callback Queue &amp;amp; Microtask Queue&lt;/strong&gt; - Once an async task completes, it can't execute immediately. Instead, it goes into a task queue and waits for the event loop to process it. There are two types of queues:&lt;br&gt;
a) &lt;strong&gt;Callback Queue (Macrotask Queue)&lt;/strong&gt; - contains setTimeout, setInterval, and I/O tasks (e.g., file reading in Node.js)&lt;br&gt;
b) &lt;strong&gt;Microtask Queue (Higher Priority)&lt;/strong&gt; - contains Promise.then() / async-await continuations and MutationObserver (used in the DOM)&lt;/p&gt;

&lt;p&gt;How is this all relevant to the event loop?&lt;/p&gt;

&lt;p&gt;Well, think of the event loop as a continuous process: it repeatedly checks whether the call stack is empty and whether the queues have any pending tasks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Overall Process:&lt;/strong&gt;&lt;br&gt;
1️⃣ The call stack executes all synchronous code.&lt;br&gt;
2️⃣ Once the stack is clear, the event loop checks the microtask queue and runs any pending microtasks.&lt;br&gt;
3️⃣ If the microtask queue is empty, the event loop moves to the callback queue (macrotasks) and executes tasks one by one.&lt;br&gt;
4️⃣ Repeat forever (until the program finishes).&lt;/p&gt;
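&lt;p&gt;A tiny script makes this ordering visible (my own example):&lt;/p&gt;

```javascript
// Synchronous code runs first, then microtasks (Promises),
// then macrotasks (setTimeout callbacks).
const order = [];

setTimeout(() => order.push("timeout"), 0);          // callback (macrotask) queue
Promise.resolve().then(() => order.push("promise")); // microtask queue
order.push("sync");                                  // runs immediately

setTimeout(() => {
  console.log(order.join(" -> ")); // prints: sync -> promise -> timeout
}, 10);
```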

&lt;h2&gt;
  
  
  Hoisting – "Variables Before They Exist?"
&lt;/h2&gt;

&lt;p&gt;Hoisting is JavaScript's default behaviour of moving all declarations to the top of the current scope.&lt;/p&gt;

&lt;p&gt;This means you can use a function declaration, or a variable declared with var, before its declaration without getting a ReferenceError.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs7h3o36kt52xbmok1mk1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs7h3o36kt52xbmok1mk1.png" alt=" " width="299" height="95"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you use a variable before it is defined using the &lt;strong&gt;var&lt;/strong&gt; keyword, you can access it, but it will be logged as undefined: the declaration var a is hoisted to the top of the scope, but the assignment = 10 is not.&lt;/p&gt;

&lt;p&gt;If you attempt the same thing with &lt;strong&gt;let&lt;/strong&gt; and &lt;strong&gt;const&lt;/strong&gt;, you will get a &lt;strong&gt;ReferenceError&lt;/strong&gt;, as the variable definitions cannot be accessed before initialisation.&lt;/p&gt;

&lt;p&gt;A similar split can be seen between &lt;strong&gt;function declarations&lt;/strong&gt; and &lt;strong&gt;function expressions&lt;/strong&gt;. Function declarations are fully hoisted, meaning both the function name and body move to the top of the scope, so you can call the function before it is defined. Function expressions are not hoisted, so you cannot access them before they are assigned.&lt;/p&gt;
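&lt;p&gt;All three behaviours in one runnable sketch (my own example):&lt;/p&gt;

```javascript
// var declarations are hoisted (initialised to undefined);
// function declarations are hoisted with their bodies.
console.log(a);         // prints: undefined (declaration hoisted, assignment not)
var a = 10;

console.log(square(3)); // prints: 9 (function declarations are fully hoisted)
function square(n) {
  return n * n;
}

try {
  console.log(b);       // let/const sit in the "temporal dead zone"
} catch (e) {
  console.log(e.name);  // prints: ReferenceError
}
let b = 20;
```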

&lt;h2&gt;
  
  
  The &lt;strong&gt;this&lt;/strong&gt; Keyword – "The Most Misunderstood Keyword in JavaScript"
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;this&lt;/strong&gt; keyword refers to the context in which a piece of code executes; usually it is used in object methods, where it refers to the object the method was called on. So let me show you an example using the class syntax that JavaScript provides to create objects.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fykv6dx206us7gnzqgecv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fykv6dx206us7gnzqgecv.png" alt=" " width="717" height="474"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Inside the constructor of the class, this.name and this.age are set to the name and age passed in as arguments. At this stage, name and age are properties of the Person instance, and if you want to reference them anywhere else in the class, as we do in the introduce() method, you need the &lt;strong&gt;this&lt;/strong&gt; keyword to refer back to the values assigned in the constructor.&lt;/p&gt;

&lt;p&gt;Similarly, if you have another method in the class, and you want to call the introduce() method inside it, then you would have to do &lt;em&gt;this.introduce()&lt;/em&gt; to refer back to the method that was defined inside the class.&lt;/p&gt;
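&lt;p&gt;A minimal version of that Person class (the greet method and the exact wording of introduce are my own additions for illustration):&lt;/p&gt;

```javascript
class Person {
  constructor(name, age) {
    // `this` is the instance being constructed.
    this.name = name;
    this.age = age;
  }

  introduce() {
    return `Hi, I'm ${this.name} and I'm ${this.age} years old.`;
  }

  greet() {
    // Calling another method on the same instance also goes through `this`.
    return this.introduce();
  }
}

const zubair = new Person("Zubair", 30);
console.log(zubair.greet()); // prints: Hi, I'm Zubair and I'm 30 years old.
```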

&lt;h2&gt;
  
  
  The Prototype Chain – "How JavaScript Inherits Without Classes"
&lt;/h2&gt;

&lt;p&gt;The prototype chain is a mechanism in JavaScript that allows inheritance and property/method lookup. JavaScript in itself is a prototype-based language, meaning objects can inherit properties and methods from other objects.&lt;/p&gt;

&lt;p&gt;Key Concepts:&lt;br&gt;
1) &lt;strong&gt;Prototype Object&lt;/strong&gt;: Every object in JavaScript has a hidden [[Prototype]] property (accessible via &lt;code&gt;__proto__&lt;/code&gt; or Object.getPrototypeOf()). This property points to another object, which is its prototype.&lt;br&gt;
2) &lt;strong&gt;Prototype Chain&lt;/strong&gt;: When you access a property or method on an object, JavaScript first checks if the object itself has that property. If not, it looks up the prototype chain by following the [[Prototype]] link until it finds the property or reaches the end of the chain (where [[Prototype]] is null).&lt;br&gt;
3) &lt;strong&gt;Constructor Functions&lt;/strong&gt;: Functions in JavaScript can act as constructors. When you create an object using new, the object’s [[Prototype]] is set to the constructor’s prototype property.&lt;br&gt;
4) &lt;strong&gt;Object.prototype&lt;/strong&gt;: This is the top of the prototype chain. Most objects ultimately inherit from Object.prototype, which provides methods like toString(), hasOwnProperty(), etc&lt;/p&gt;
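&lt;p&gt;A short sketch of these concepts (the Animal constructor is my own example):&lt;/p&gt;

```javascript
// Constructor function: objects created with `new` link to Animal.prototype.
function Animal(name) {
  this.name = name;
}

Animal.prototype.speak = function () {
  return `${this.name} makes a sound`;
};

const dog = new Animal("Rex");

// Lookup walks the chain: dog -> Animal.prototype -> Object.prototype -> null
console.log(dog.speak());                                     // found on Animal.prototype
console.log(Object.getPrototypeOf(dog) === Animal.prototype); // prints: true
console.log(dog.hasOwnProperty("name"));                      // method inherited from Object.prototype
console.log(Object.getPrototypeOf(Object.prototype));         // prints: null (end of chain)
```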

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh40klpa5nrukvhfeowlh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh40klpa5nrukvhfeowlh.png" alt=" " width="523" height="286"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Memory Management &amp;amp; Garbage Collection – "The Hidden Cost of JavaScript"
&lt;/h2&gt;

&lt;p&gt;Garbage collection is the process of automatically freeing up memory by removing objects that are no longer needed by the program. JavaScript handles memory allocation and deallocation automatically; however, Cracked Software Engineers should still be mindful of how memory is used to avoid performance issues.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Memory Lifecycle:&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Allocation&lt;/strong&gt;: Memory is allocated when objects, functions, or primitives are created.&lt;br&gt;
&lt;strong&gt;Usage&lt;/strong&gt;: Memory is used when reading or writing to variables.&lt;br&gt;
&lt;strong&gt;Release&lt;/strong&gt;: Memory is released when objects become unreachable and are garbage collected.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Memory Allocation:&lt;/strong&gt;&lt;br&gt;
Primitives (numbers, strings, booleans, etc.) are stored directly in memory.&lt;br&gt;
Objects, arrays, and functions are stored in the heap, and references to them are stored in variables.&lt;/p&gt;
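&lt;p&gt;The lifecycle in miniature (my own sketch; note that JavaScript gives us no way to force collection, we can only drop references):&lt;/p&gt;

```javascript
// Allocation: the object and array live on the heap;
// the variable holds a reference to them.
let cache = { entries: new Array(1000).fill(0) };

// Usage: reading and writing through the reference.
cache.entries[0] = 42;

// Release: once no reference can reach the object, the garbage
// collector is free to reclaim it at some later point.
cache = null;

console.log(cache); // prints: null
```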

&lt;h2&gt;
  
  
  Conclusion &amp;amp; Takeaways
&lt;/h2&gt;

&lt;p&gt;I have been using JavaScript in my career for almost 4 to 5 years now. It seems easy at first, but under the hood there is a lot to know and master to become a Cracked Software Engineer. In an era where AI tools like Windsurf and DeepSeek are gaining momentum, truly understanding the internals is what will differentiate you from a web developer or a simple coder in the job market.&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>programming</category>
      <category>beginners</category>
    </item>
  </channel>
</rss>
