RYU JAEMIN
[BlindSpot] Log 02. C# Socket Optimization : Zero-Allocation & GC Free

Why Refactor?

Since BlindSpot is an online FPS, low latency and stutter-free frames are essential.
Therefore, I performed a major refactoring to minimize the garbage collection (GC) load and to improve structural scalability.

1. Using a Sliding Window instead of a List

Before

// BlindSpotClient/Assets/Scripts/Network/NetworkManager.cs

//Add received data to assemble buffer
byte[] temp = new byte[bytesRead];   // GC Overhead
Array.Copy(recvBuffer, 0, temp, 0, bytesRead);
assembleBuffer.AddRange(temp);
// Process complete packets
while (assembleBuffer.Count >= 4)
{
    ushort packetSize = BitConverter.ToUInt16(assembleBuffer.ToArray(), 0); // GC Overhead

    if (assembleBuffer.Count < packetSize) break;

    ushort packetID = BitConverter.ToUInt16(assembleBuffer.ToArray(), 2); // GC Overhead

    byte[] payload = new byte[packetSize - 4];
    Array.Copy(assembleBuffer.ToArray(), 4, payload, 0, payload.Length);

    HandlePacket((PacketID)packetID, payload);

    assembleBuffer.RemoveRange(0, packetSize);
}

After

_currentLength += bytesRead;

int processOffset = 0;
while(_currentLength - processOffset >= 4) // Check for minimum header size
{
    //Read Size without ToArray()
    ushort packetSize = BitConverter.ToUInt16(_recvBuffer, processOffset);

    //Check packetSize validation
    if (packetSize < 4 || packetSize > Buffersize)
    {
        Debug.LogError($"[Client] Invalid Packet Size: {packetSize}");
        // Modify to reconnect logic
        CloseConnection();
        return;
    }

    if(_currentLength - processOffset >= packetSize) // Check if full packet is received
    {
        ushort packetId = BitConverter.ToUInt16(_recvBuffer, processOffset + 2);
        byte[] payload = new byte[packetSize - 4]; //payload array
        Array.Copy(_recvBuffer, processOffset + 4, payload, 0, payload.Length);
        HandlePacket((PacketID)packetId, payload);
        processOffset += packetSize; // Go to next packet
    }
    else
    {
        break; // Wait for more data
    }
}
if(processOffset > 0)
{
    int remaining = _currentLength - processOffset;
    if (remaining > 0)
    {
        Buffer.BlockCopy(_recvBuffer, processOffset, _recvBuffer, 0, remaining);
    }
    _currentLength = remaining;
}

This code reuses a single fixed-size (8 KB) byte[] buffer and only advances an index (offset), instead of allocating a new buffer on every read.
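Because TCP is a stream, a packet can arrive split across two reads. The following standalone console sketch exercises the sliding-window logic above under exactly that condition (the `Feed` harness and its names are hypothetical; in the game this work happens in the receive callback):

```csharp
using System;

class SlidingWindowDemo
{
    const int BufferSize = 8192;                       // single fixed-size buffer, reused forever
    static readonly byte[] _recvBuffer = new byte[BufferSize];
    static int _currentLength;

    // Simulates one completed socket read of `data`.
    static void Feed(byte[] data)
    {
        Buffer.BlockCopy(data, 0, _recvBuffer, _currentLength, data.Length);
        _currentLength += data.Length;

        int processOffset = 0;
        while (_currentLength - processOffset >= 4)    // minimum header size
        {
            ushort packetSize = BitConverter.ToUInt16(_recvBuffer, processOffset);
            if (_currentLength - processOffset < packetSize) break; // wait for more data

            ushort packetId = BitConverter.ToUInt16(_recvBuffer, processOffset + 2);
            Console.WriteLine($"packet id={packetId} size={packetSize}");
            processOffset += packetSize;               // go to next packet
        }

        // Compact: shift the unprocessed tail to the front, no allocation.
        int remaining = _currentLength - processOffset;
        if (processOffset > 0 && remaining > 0)
            Buffer.BlockCopy(_recvBuffer, processOffset, _recvBuffer, 0, remaining);
        _currentLength = remaining;
    }

    static void Main()
    {
        // One 6-byte packet (size=6, id=1, 2-byte payload), split across two reads.
        Feed(new byte[] { 6, 0, 1 });        // fragment 1: parser waits
        Feed(new byte[] { 0, 0xAA, 0xBB });  // fragment 2: prints "packet id=1 size=6"
    }
}
```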

2. Producer-Consumer Pattern with ConcurrentQueue

The receive callback (OnReceiveData) registered with the socket's BeginReceive (or NetworkStream.BeginRead) runs on a thread-pool thread, not Unity's main thread.
Accessing Unity's Transform or UI from inside this callback will throw an exception or crash the game.

// BlindSpotClient/Assets/Scripts/Network/NetworkManager.cs

// Network Thread
private void HandlePacket(PacketID id, byte[] payload) {
    // Insert safely without lock
    _packetQueue.Enqueue(new PacketMessage(id, payload));
}

// Main Thread (Unity Update)
void Update() {
    // Take packets out of the queue and process them
    while (_packetQueue.TryDequeue(out PacketMessage packet))
    {
        try
        {
            ProcessPacket(packet);
        }
        catch (Exception e)
        {
            Debug.LogError($"[Client] Packet Processing Error: {e.Message}");
        }
    }
}

A plain Queue<T> must be guarded with a lock, which degrades performance under thread contention.
ConcurrentQueue<T> uses a lock-free algorithm internally, so data can be handed off safely between threads without that bottleneck.
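The handoff can be demonstrated outside Unity. This console sketch (the harness is hypothetical; in the game, draining happens in Update()) has two "network threads" enqueue concurrently while the "main thread" drains afterwards, with no explicit lock anywhere:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class QueueDemo
{
    static readonly ConcurrentQueue<int> _packetQueue = new ConcurrentQueue<int>();

    static void Main()
    {
        // "Network threads": enqueue concurrently without any explicit lock.
        Task.WaitAll(
            Task.Run(() => { for (int i = 0; i < 1000; i++) _packetQueue.Enqueue(i); }),
            Task.Run(() => { for (int i = 0; i < 1000; i++) _packetQueue.Enqueue(i); }));

        // "Main thread": drain the queue, exactly like the Update() loop.
        int count = 0;
        while (_packetQueue.TryDequeue(out _)) count++;
        Console.WriteLine(count); // 2000 — nothing lost, nothing double-processed
    }
}
```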

3. Remove switch-case

A switch-case is fast when there are only a few packet types, but as the game grows to 100 or 200 packets, the statement becomes long and hard to read. Furthermore, branch mispredictions can cause a slight performance degradation.

// BlindSpotClient/Assets/Scripts/Network/NetworkManager.cs
private Action<byte[], int>[] _packetHandlers = new Action<byte[], int>[MaxPacketID];

private void RegisterHandlers()
{
    _packetHandlers[(int)PacketID.IdLoginResponse] = HandleLoginResponse;
    _packetHandlers[(int)PacketID.IdJoinRoomResponse] = HandleJoinRoomResponse;
    _packetHandlers[(int)PacketID.IdMakeRoomResponse] = HandleMakeRoomResponse;
}

private void ProcessPacket(PacketMessage packet)
{
    int id = (int)packet.Id;
    if (id >= 0 && id < _packetHandlers.Length && _packetHandlers[id] != null)
    {
        _packetHandlers[id](packet.Payload, packet.Size); 
    }
}

The time complexity for packet processing has been reduced from O(N) (worst case) or O(log N) to constant time O(1).
Furthermore, the structure adheres to the Open-Closed Principle (OCP) because adding a new packet does not require modifying the existing logic (ProcessPacket).
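The OCP benefit is easy to see in isolation: adding a packet type means registering one more delegate, while the dispatch code never changes. A minimal self-contained sketch (the PacketID values and handler bodies here are illustrative, not the game's real ones):

```csharp
using System;

class DispatchDemo
{
    enum PacketID { LoginResponse = 1, ChatMessage = 2, MaxPacketID = 64 }

    static readonly Action<byte[], int>[] _handlers =
        new Action<byte[], int>[(int)PacketID.MaxPacketID];

    static void Main()
    {
        // Registration: the ONLY place that grows when a new packet is added.
        _handlers[(int)PacketID.LoginResponse] = (p, n) => Console.WriteLine($"login, {n} bytes");
        _handlers[(int)PacketID.ChatMessage]   = (p, n) => Console.WriteLine($"chat, {n} bytes");

        // Dispatch: a single O(1) array index, identical for every packet type.
        Dispatch(PacketID.ChatMessage, new byte[8], 8); // prints "chat, 8 bytes"
    }

    static void Dispatch(PacketID id, byte[] payload, int size)
    {
        int i = (int)id;
        if (i >= 0 && i < _handlers.Length && _handlers[i] != null)
            _handlers[i](payload, size);
    }
}
```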

4. Use ArrayPool in Sliding Window

The sliding-window code still allocates a new byte[] for every payload.
Creating an array for every received packet keeps feeding the Garbage Collector, so I switched to ArrayPool<byte>.
Previous: byte[] payload = new byte[size];
Changed: byte[] payload = ArrayPool<byte>.Shared.Rent(size);

However, for performance, ArrayPool hands out arrays that are at least the requested size (typically rounded up to a power-of-two bucket).
Therefore, if we trust payload.Length and parse the Protobuf as-is, we read leftover bytes from the array's previous use (garbage data), and parsing fails.
So, I added a size field to the PacketMessage struct and made each handler treat only the first size bytes as payload.
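The PacketMessage struct itself is not shown in the post; one plausible shape, matching how it is used below (field names are my assumption), is:

```csharp
// Hypothetical shape of the PacketMessage mentioned above. Payload may be
// longer than Size, because ArrayPool<byte>.Shared.Rent(n) may return any
// array with Length >= n (in practice, a power-of-two bucket).
// PacketID is the enum already defined in the project.
public readonly struct PacketMessage
{
    public readonly PacketID Id;
    public readonly byte[] Payload; // rented from ArrayPool<byte>.Shared
    public readonly int Size;       // valid bytes in Payload — parse only this many

    public PacketMessage(PacketID id, byte[] payload, int size)
    {
        Id = id;
        Payload = payload;
        Size = size;
    }
}
```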

ushort packetId = BitConverter.ToUInt16(_recvBuffer, processOffset + 2);
int payloadSize = packetSize - 4;
byte[] payload = ArrayPool<byte>.Shared.Rent(packetSize - 4);
Array.Copy(_recvBuffer, processOffset + 4, payload, 0, payloadSize);
HandlePacket((PacketID)packetId, payloadSize, payload);
processOffset += packetSize; // Go to next packet

//...

void HandleLoginResponse(byte[] payload, int size)
{
    CodedInputStream stream = new CodedInputStream(payload, 0, size);
    LoginResponse pkt = LoginResponse.Parser.ParseFrom(stream);
    Debug.Log($"[Server] Login Result: {pkt.Success}, Msg: {pkt.Message}");
    Debug.Log($"Session ID: {pkt.SessionKey}, PlayerId: {pkt.PlayerId}");
}

Rented arrays must be returned to the pool after use; otherwise the pooling benefit is lost and the arrays become garbage again. A finally block ensures they are returned unconditionally, even if an exception occurs.

void Update()
{
    while (_packetQueue.TryDequeue(out PacketMessage packet))
    {
        try
        {
            ProcessPacket(packet);
        }
        catch (Exception e)
        {
            Debug.LogError($"[Client] Packet Processing Error: {e.Message}");
        }
        finally
        {
            if(packet.Payload != null) 
                ArrayPool<byte>.Shared.Return(packet.Payload);
        }
    }
}

In Conclusion

Of course, this code isn't completely zero-allocation.
However, by using ArrayPool to eliminate the per-packet byte[] allocation and deallocation, it removes the bulk allocations and memory fragmentation that are major causes of GC spikes.
