Introduction:
In recent years, the HD-2D art style, made popular by games like Octopath Traveler and Triangle Strategy, has brought new life to pixel art. This style places 2D pixel sprites into realistic 3D worlds with modern lighting, creating a look that feels like a living diorama.
For my university graphics project, I wanted to understand how this style works and recreate it myself. I built a custom renderer designed to produce the same visual effect.
This project was not just about making things look good—it was also a technical challenge. Traditional rendering pipelines are built for 3D models, not flat 2D sprites with transparency. A big part of the work was figuring out how to make these two ideas work together.
In this post, I explain the research I did, the design choices I made (such as choosing between Deferred and Forward rendering), and how I implemented key features like shadows and cinematic depth of field.
Pipeline Architecture: Why I Chose Deferred Rendering
The first big technical choice was deciding how to render the scene. While researching similar games, I found that studios take two different approaches.
The Octopath approach:
Square Enix uses Unreal Engine 4, which uses Deferred Rendering by default. This makes it easy to support many dynamic lights, such as torches, street lamps, and spell effects. These lights are very important for the dramatic, high-contrast look of HD-2D games.
The modern indie approach:
Some newer indie games, like Sea of Stars, use custom rendering pipelines. These are often based on Forward+ rendering, which works well for 2D lighting and avoids some of the transparency problems that Deferred rendering has.
I decided to use a Deferred Rendering pipeline.
Deferred rendering makes transparency harder to handle, so sprites often need extra forward passes or special techniques. However, it has a big advantage: lighting cost is separated from geometry complexity. This was important for my project.
Using Deferred rendering allowed me to treat 2D sprites like real 3D objects. I stored their color, normals, and depth in the G-Buffer, so they could react to lighting in the same way as 3D models.
Step 1: The Unified G-Buffer Layout:
To make this work, I standardized the output of all my geometry-pass shaders. Whether drawing a 3D mesh or a 2D sprite, both target this exact structure:
// This is the G_Buffer pixel shader
// Common Output Structure (Used by both 3D PBR and 2D Sprite Shaders)
struct PixelShaderOutput
{
// Target 0: Albedo (R8G8B8A8)
// Stores Base Color. For Sprites, this is the Texture Color.
float4 Albedo : SV_Target0;
// Target 1: World Normal (R16G16B16A16_FLOAT)
// High precision is critical for smooth lighting on "flat" surfaces.
float4 Normal : SV_Target1;
// Target 2: Material Data (R8G8B8A8)
// Packed: Roughness (R), Metalness (G), Ambient Occlusion (B)
float4 Material : SV_Target2;
};
Making Sprites React to Light
One problem I discovered was that normal sprite rendering looks flat when lit. To fix this, I added normal maps to the sprites.
To create the normal maps for my sprites, I used Laigter, a popular free tool that generates a normal map from any sprite or texture you load into it.
Step 2: Solving "Flat" Sprites with Normal Mapping
A naive implementation of 2D sprites in a 3D engine looks "paper-thin" because the geometric normal always points straight back at the camera (0,0,-1).
To give sprites volume, I implemented Tangent-Space Normal Mapping within the sprite shader. This cheats the physics by transforming the 2D texture normals (bumps on the sprite) into 3D world space.
I encountered a specific issue where the Green channel (Y) was flipped compared to my engine's coordinate system, making light come from the wrong angle. I fixed this by manually constructing the TBN (Tangent-Bitangent-Normal) matrix:
// from shaders/SpriteShader_ps.hlsl
// 1. Sample the Normal Map
float3 normalTangent = g_NormalTexture.Sample(g_Sampler, IN.UV).rgb;
// Unpack from [0,1] to [-1, 1] range
normalTangent = normalize(normalTangent * 2.0f - 1.0f);
// 2. Fix Coordinate Mismatch
// My engine uses a coordinate system where Y is opposite to the texture generator
normalTangent.y = -normalTangent.y;
// 3. Construct TBN Matrix on the fly
// We use the Vertex Tangent (Right) and Geometric Normal (Back) to find Up
float3 N = normalize(IN.Normal);
float3 T = normalize(IN.Tangent.xyz);
float3 B = cross(N, T) * IN.Tangent.w;
float3x3 TBN = float3x3(T, B, N);
// 4. Transform 2D Bump to 3D World Normal
float3 finalNormal = normalize(mul(normalTangent, TBN));
// Now the flat sprite writes a "3D" normal to the G-Buffer!
OUT.Normal = float4(finalNormal, 1.0f);
Step 3: The Lighting Pass
Once the G-Buffer is packed, I perform a Deferred Lighting Pass. This is a full-screen operation that calculates lighting for every pixel.
Optimization: The Full-Screen Triangle
Instead of rendering a generic Quad mesh, I use a Vertex Shader trick to generate a single triangle that covers the entire screen. This is slightly faster than a Quad because it avoids the diagonal edge down the middle of the screen (quads are 2 triangles), preventing helper-pixel waste along that edge.
// from shaders/LightingShader_vs.hlsl
VS_Output main(uint VertexID : SV_VertexID)
{
VS_Output output;
// Algorithmically generate a triangle that covers UV space [0,0] to [1,1]
// ID 0 -> (-1, 1)
// ID 1 -> ( 3, 1)
// ID 2 -> (-1, -3)
float2 texCoord = float2((VertexID << 1) & 2, VertexID & 2);
output.Position = float4(texCoord * float2(2, -2) + float2(-1, 1), 0, 1);
output.UV = texCoord;
return output;
}
Lighting Logic: Reconstructing World Position
Since I am using a Deferred renderer, I don't have the "World Position" of the geometry anymore—I only have a 2D image. To calculate lighting (which depends on distance), I implemented a mathematical reconstruction using the Hardware Depth Buffer.
This saves massive amounts of memory bandwidth because I don't need to write an extra XYZ_FLOAT position texture to the G-Buffer.
// from shaders/LightingShader_ps.hlsl
float3 ReconstructWorldPos(float2 uv, float depth)
{
// 1. Convert Screen UV [0,1] back to NDC [-1, 1]
float x = uv.x * 2.0f - 1.0f;
float y = (1.0f - uv.y) * 2.0f - 1.0f;
// 2. Create a vector in Clip Space
float4 clipPos = float4(x, y, depth, 1.0f);
// 3. Multiply by Inverse View-Projection Matrix
// This effectively "un-projects" the pixel back into the 3D world
float4 worldPos = mul(InverseViewProj, clipPos);
return worldPos.xyz / worldPos.w;
}
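To show how these pieces fit together, here is a simplified sketch of the lighting pass pixel shader. The per-light math is plain Lambert for brevity (the real pass also reads the Material target for PBR), and the light-list and G-Buffer texture names (PointLight, g_Lights, LightCount, g_Albedo, g_Normal) are illustrative rather than my exact layout:
// LightingShader_ps.hlsl - simplified sketch of the full-screen lighting pass
struct PointLight { float3 Position; float3 Color; };

StructuredBuffer<PointLight> g_Lights : register(t4); // illustrative binding
cbuffer LightParams : register(b1) { uint LightCount; };

float4 main(VS_Output input) : SV_Target
{
    // Read back what the geometry pass wrote for this pixel
    float3 albedo   = g_Albedo.Sample(g_Sampler, input.UV).rgb;
    float3 normal   = normalize(g_Normal.Sample(g_Sampler, input.UV).xyz);
    float  depth    = g_Depth.Sample(g_Sampler, input.UV).r;
    float3 worldPos = ReconstructWorldPos(input.UV, depth);

    // Every dynamic light is accumulated here, once per pixel
    float3 color = 0.0f;
    for (uint i = 0; i < LightCount; i++)
    {
        float3 toLight = g_Lights[i].Position - worldPos;
        float  dist    = length(toLight);
        float3 L       = toLight / dist;
        float  atten   = 1.0f / (1.0f + dist * dist); // simple quadratic falloff
        color += albedo * g_Lights[i].Color * saturate(dot(normal, L)) * atten;
    }
    return float4(color, 1.0f);
}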
By combining these techniques, I achieved a pipeline where:
- Sprites and meshes are indistinguishable to the lighting engine.
- Expensive lighting is only calculated once per pixel, not per object.
- Flat pixel art gains "volume" and reacts correctly to rotating lights.
The Shadow Problem: Making 2D Sprites Feel Grounded
One of the hardest problems was getting shadows to look right.
2D sprites are basically flat images with no thickness. When light hits them from certain angles (like sunlight from above), they don’t cast proper shadows because a flat plane has no real volume.
What I Learned from Research
Shadow proxies:
Studying games like Octopath Traveler and Live A Live suggests that they use invisible 3D shapes to cast shadows. These hidden models give characters proper, solid shadows even though the visible character is just a flat sprite.
However, for this engine, I wanted a more precise solution that respects the pixel art's silhouette.
Solution A: Alpha-Tested Shadow Mapping
I implemented a specialized "Shadow Pass" that runs before the main rendering. This pass renders the scene from the perspective of the sun (LightViewProj).
For standard 3D meshes, this is a simple Z-buffer write. But for 2D sprites, I wrote a custom Pixel Shader that performs an Alpha Test.
The shader reads the sprite's texture. If a pixel is transparent (alpha < 0.9), it discards the pixel. This prevents the empty parts of the rectangular quad from casting a shadow, ensuring the shadow perfectly matches the character's pixel art shape.
// from shaders/ShadowShader_ps.hlsl
void main(PS_Input input)
{
// 1. Sample the sprite texture alpha
float alpha = g_Texture.Sample(g_Sampler, input.UV).a;
// 2. Discard invisible pixels
// By discarding here, we prevent the Depth Buffer from recording a hit.
// This allows light to pass through the transparent details of the sprite.
if (alpha < 0.9f)
{
discard;
}
// If we survive the discard, the hardware automatically writes the depth.
}
Contact shadow issues:
Standard shadow maps often have a problem called “Peter Panning,” where the shadow appears slightly separated from the character’s feet. This makes sprites look like they’re floating above the ground.
I integrated Shadow Mapping into my Deferred Rendering pipeline using the scene's depth data. However, shadow maps alone didn't fully solve the floating effect.
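For context, the classic shadow-map test in the lighting pass looks roughly like this (a sketch; the texture bindings and the bias value are placeholders rather than my exact code):
// LightingShader_ps.hlsl - sketch of the standard shadow-map visibility test
Texture2D    g_ShadowMap     : register(t5); // placeholder binding
SamplerState g_ShadowSampler : register(s1);

float SampleShadowMap(float3 worldPos)
{
    // Project the pixel into the sun's clip space
    float4 lightClip = mul(LightViewProj, float4(worldPos, 1.0f));
    float3 proj      = lightClip.xyz / lightClip.w;
    float2 shadowUV  = proj.xy * float2(0.5f, -0.5f) + 0.5f;

    // The bias hides "shadow acne" on 3D surfaces, but it is also what
    // pushes the shadow slightly away from a sprite's feet.
    const float bias = 0.002f;
    float storedDepth = g_ShadowMap.Sample(g_ShadowSampler, shadowUV).r;
    return (proj.z - bias) > storedDepth ? 0.0f : 1.0f; // 0 = in shadow, 1 = lit
}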
Solution B: Screen Space Contact Shadows (SSCS)
Even with correct shadow shapes, sprites often look like they are floating. This is caused by "Shadow Bias", a small offset required to prevent visual glitches (shadow acne) on 3D surfaces. This bias detaches the shadow from the sprite's feet.
To fix this, I implemented Screen Space Contact Shadows.
This technique works in the Lighting Shader. It fires a short ray from every pixel towards the light source and checks the Depth Buffer at each step to see if something nearby is blocking the light. Because it works in Screen Space, it has per-pixel precision and doesn't need the large bias that pushes shadow-map shadows away from the geometry.
This technique makes the sprites feel firmly attached to the ground. It adds visual weight and realism that regular shadow maps alone could not provide.
Here is the core raymarching loop I implemented:
// LightingShader_ps.hlsl
float ScreenSpaceContactShadows(float3 worldPos, float3 lightDir, float2 uv)
{
// Raymarch Setup: start the ray at this pixel's world position
float3 rayPos = worldPos;
float3 rayDir = -lightDir;
float stepSize = RayLength / MaxSteps;
// ... Dithering logic to hide banding ...
for(int i = 0; i < MaxSteps; i++)
{
rayPos += rayDir * stepSize;
// 1. Project Ray to Screen UVs
float4 clipPos = mul(ViewProj, float4(rayPos, 1.0));
float2 rayUV = clipPos.xy / clipPos.w;
// ... transform to [0,1] ...
// 2. Check the Depth Buffer
// If the ray is BEHIND something in the Depth Buffer, it's occluded!
float bufferDepth = LinearizeDepth(g_Depth.SampleLevel(g_Sampler, rayUV, 0).r);
float rayDepth = clipPos.w;
if (rayDepth > bufferDepth && (rayDepth - bufferDepth) < Thickness)
{
// HIT! We are in shadow.
return 0.0f;
}
}
return 1.0f; // No hit, fully lit.
}
By combining these two techniques—Shadow Mapping for large global shadows and SSCS for tiny grounding details—the 2D sprites feel physically present in the 3D world.
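In the lighting shader, combining them is just a matter of taking the darker of the two visibility terms (a short sketch reusing the names assumed above; g_SunDirection and g_SunColor are illustrative):
// LightingShader_ps.hlsl - combining both shadow terms for the sun light (sketch)
float shadowTerm  = SampleShadowMap(worldPos);                                     // large-scale shadow map
float contactTerm = ScreenSpaceContactShadows(worldPos, g_SunDirection, input.UV); // short-range grounding
float visibility  = min(shadowTerm, contactTerm);                                  // the darker term wins
color += albedo * g_SunColor * saturate(dot(normal, -g_SunDirection)) * visibility;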
The Cinematic Lens: Depth of Field
The miniature or tilt-shift look is one of the most important parts of the HD-2D style. It makes the scene feel like a small toy world, not a life-size environment.
Research and Design Choices:
I explored different Depth of Field (DoF) techniques that work in real time.
- Gaussian blur was not suitable because it looks soft and fake.
- Instead, I focused on Bokeh Depth of Field, using a technique called Scatter-as-Gather.
This approach allows bright lights in the background to blur into the shape of the camera aperture (such as circles or hexagons). This effect is essential for achieving the HD-2D cinematic look.
I added a post-processing step that calculates how blurry each pixel should be. This is done using the Circle of Confusion (CoC), which is based on the depth stored in the G-Buffer.
The Circle of Confusion (CoC) is computed from four values:
- z is the pixel depth
- P is the focus plane
- f is the focal length
- A controls the aperture size
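In shader terms, a standard thin-lens version of this calculation looks like the sketch below; the function name, the clamp, and the parameter plumbing (ComputeCoC, MaxCoC) are illustrative rather than my exact code:
// DoF_ps.hlsl - thin-lens Circle of Confusion (sketch)
float ComputeCoC(float z)
{
    // z: pixel depth, P: focus plane distance, f: focal length, A: aperture size
    float coc = A * f * (z - P) / (z * (P - f));
    // The sign separates near-field from far-field; clamp to the largest blur we allow
    return clamp(coc, -MaxCoC, MaxCoC);
}
The absolute value of this result drives the blur radius used by the gather loop further down.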
Implementation Detail of the "Scatter-as-Gather" Depth of Field:
As I said, my research indicated that a simple Gaussian Blur looks too "foggy". For the cinematic look, we need Bokeh, where bright points of light expand into visible shapes (circles/hexagons).
I implemented a Golden Angle Spiral kernel. Instead of grid sampling (which looks blocky), I generate sample points in a spiral pattern on the fly. This creates a natural, organic blur.
Solving the "Ghosting" Problem: With only 64 samples, a static spiral pattern creates visible bands ("ghosting"). To fix this, I implemented Interleaved Gradient Noise to rotate the spiral randomly for every pixel. This trades banding for high-frequency noise, which looks like film grain—much more acceptable for a cinematic aesthetic.
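The InterleavedGradientNoise function used below is the widely published formulation from Jorge Jimenez's post-processing work:
// Stable per-pixel noise in [0,1]; cheap enough to evaluate every frame
float InterleavedGradientNoise(float2 pixelPos)
{
    return frac(52.9829189f * frac(dot(pixelPos, float2(0.06711056f, 0.00583715f))));
}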
//DoF_ps.hlsl
// 0. Accumulator; 'radius' (used below) is this pixel's blur radius derived from its CoC
float4 accumColor = 0.0f;
// 1. Generate Noise based on Screen Position
float noise = InterleavedGradientNoise(input.UV * ScreenRes);
// 2. Create Random Rotation Matrix
float theta = noise * 6.28;
float c = cos(theta), s = sin(theta);
float2x2 rotation = float2x2(c, -s, s, c);
// 3. Golden Angle Spiral Loop (64 Samples)
for (int i = 0; i < 64; i++)
{
// Generate Spiral Offset
float r = sqrt((float)i + 0.5f) / sqrt(64.0f);
float phi = i * 2.39996f; // Golden Angle
float2 offset = float2(r * cos(phi), r * sin(phi));
// Rotate the kernel randomly per-pixel to hide artifacts
offset = mul(rotation, offset);
// Sample Scene Color
accumColor += g_SceneColor.Sample(g_Sampler, input.UV + offset * radius);
}
// 4. Average the samples
accumColor /= 64.0f;
Handling Foreground and Background Issues:
One common problem with DoF is color bleeding, where blurry background colors leak onto sharp foreground objects.
To fix this, I used a depth-weighted sampling kernel. This ensures that background blur does not affect nearby focused sprites, keeping characters sharp while the background stays soft and cinematic.
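One simple way to express that weighting inside the gather loop is to compare each sample's depth against the center pixel's depth; in the sketch below, DepthTolerance and the exact weight curve are illustrative rather than my exact code:
// DoF_ps.hlsl - depth-weighted gather (sketch)
float centerDepth = LinearizeDepth(g_Depth.Sample(g_Sampler, input.UV).r);
// ... inside the 64-sample loop, replacing the plain accumulation ...
float2 sampleUV    = input.UV + offset * radius;
float  sampleDepth = LinearizeDepth(g_Depth.Sample(g_Sampler, sampleUV).r);
// Samples from a very different depth layer contribute little, so a blurry
// background cannot smear across a focused sprite's silhouette.
float  weight = saturate(1.0f - abs(sampleDepth - centerDepth) / DepthTolerance);
accumColor  += g_SceneColor.Sample(g_Sampler, sampleUV) * weight;
totalWeight += weight;
// ... after the loop: accumColor /= max(totalWeight, 0.001f);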
ACES Tone Mapping:
Since I am using PBR lighting, my light values can easily exceed 1.0 (HDR). If I just output this to the screen, bright colors get clamped to ugly flat white.
To fix this, I implemented the ACES Filmic Tone Mapping curve. This mathematically compresses the High Dynamic Range (HDR) values into the visible monitor range (LDR), preserving detail in bright highlights (like fire and magic effects) while keeping contrast in the shadows.
// final shader pass
float3 ACESFilm(float3 x)
{
// Standard ACES fitted curve
float a = 2.51f;
float b = 0.03f;
float c = 2.43f;
float d = 0.59f;
float e = 0.14f;
return clamp((x * (a * x + b)) / (x * (c * x + d) + e), 0.0, 1.0);
}
// ... In Main Shader ...
float3 mapped = finalHDRColor.rgb * Exposure;
mapped = ACESFilm(mapped);
return float4(mapped, 1.0f);
By combining the Golden Angle Bokeh with ACES Tone Mapping, the engine transforms the raw, jagged render into a soft, cohesive image that mimics real photography.
Reflections and Future Work
While I successfully implemented the core pillars of the HD-2D look (Deferred Lighting, Normal-Mapped Sprites, Shadows, and DoF), the scope of a block project meant some advanced features remain as future work:
Volumetric Fog: A key component of Octopath's atmosphere is volumetric god-rays. Currently, my engine lacks a volumetric rendering pass.
Translucency Sorting: My Deferred pipeline relies on alpha testing (cutout). Implementing true translucency (for glass or water) would require a separate Forward pass or "Separate Translucency" buffer, which was out of scope for this block.
References & Further Reading
On HD-2D Architecture:
- [1] Square Enix & Epic Games. "The Fusion of Nostalgia and Novelty in the Development of Octopath Traveler." Unreal Fest Europe 2019. Watch Presentation (Primary inspiration for the Deferred Rendering choice).
- [2] 3DGEP. "Forward+ Rendering." Read Article (Used for the architectural trade-off analysis).
On Shadows & Integration:
- [3] Epic Games. "Proxy Geometry Shadows in Unreal Engine." Unreal Engine Documentation. Read Article (Research on shadow proxies).
- [4] Panos Karabelas. "Screen Space Shadows." Read Article (The mathematical basis for the Contact Shadows implementation).
- [5] Unity Technologies. "Forward and Deferred Rendering in HDRP." Read Article
On Post-Processing (DoF & Tone Mapping):
- [6] Earl Hammon, Jr. "Practical Post-Process Depth of Field." GPU Gems 3, Chapter 28. Read Paper (The foundational algorithm for "Scatter-as-Gather" Bokeh).
- [7] The Realm of MJP. "Bokeh." Read Article (In-depth analysis of physical lens simulation).
- [8] Krzysztof Narkowicz. "ACES Filmic Tone Mapping Curve." Read Article (Source for the tone-mapping math used in the shader).