🧠 Solving LeetCode Until I Become Top 1% — Day `49`

🔹 Problem: 1948 Delete Duplicate Folders in System

Difficulty: #Damn
Tags: #Trie, #DFS, #StringSerialization, #Hashing, #Tree, #Map, #FolderStructure

📝 Problem Summary

You're given a list of folder paths in your file system. If two subtrees (aka folders and all their descendants) have the same structure and names, you must delete both of them. Return the folder structure excluding the duplicates.

It’s like trying to find evil clone folders in your OS and mercilessly deleting them... even if they just wanted to live.

🧠 My Thought Process

Brute Force Idea:
"I can probably use a hashmap to count folders?" — me, before entering the abyss.
Optimized Strategy (According to the solution):
❌ Build a Trie.
❌ Serialize every subtree using post-order DFS into a weird string-based signature like a(b(c())).
❌ Count how many times each signature appears.
❌ If any signature appears more than once, nuke the whole subtree from orbit.
✅ Return what's left.
🤯 At this point I had 4 tabs open, a cup of tea, 3 brain cells left, and tears in my eyes.
Algorithm Used:
Trie + Post-order DFS + Signature Hashing + My Soul Leaving My Body

⚙️ Code Implementation (Python)

from collections import defaultdict

class Node:
    def __init__(self, name):
        self.name = name
        self.children = {}
        self.signature = ""

class Solution:
    def deleteDuplicateFolder(self, paths):
        root = Node("")
        for path in paths:
            node = root
            for folder in path:
                if folder not in node.children:
                    node.children[folder] = Node(folder)
                node = node.children[folder]

        signature_count = defaultdict(int)

        # Post-order DFS to generate signatures
        def dfs(node):
            if not node.children:
                node.signature = ""
                return ""
            child_signatures = []
            for name, child in sorted(node.children.items()):
                child_signature = dfs(child)
                child_signatures.append(f"{name}({child_signature})")
            node.signature = "".join(child_signatures)
            signature_count[node.signature] += 1
            return node.signature

        dfs(root)

        result = []
        current_path = []

        # Filter out duplicated structures
        def dfs2(node):
            if node.children and signature_count[node.signature] >= 2:
                return
            current_path.append(node.name)
            result.append(current_path.copy())
            for name, child in sorted(node.children.items()):
                dfs2(child)
            current_path.pop()

        for name, child in sorted(root.children.items()):
            dfs2(child)

        return result

⏱️ Time & Space Complexity

Time: O(N * L * log L) N = number of paths, L = average folder length. The log L is from sorting children.
Space: O(N * L) Because apparently we love wasting memory with trees, maps, and strings.

🧩 Key Takeaways

✅ Learned that "folder duplication detection" is a valid algorithmic problem, and not a virus scanner.
💡 The trick is to turn a subtree into a unique serial string — like generating a DNA for each subtree — and match them.
💭 Next time I see a problem that smells like a filesystem, I’ll just light a candle, take a deep breath, and prepare for a DFS + Trie battle.

🔁 Reflection (Self-Check)

[ ] Could I solve this without help? ❌ Not even in an alternate timeline where I was raised by Trie monks.
[ ] Did I write code from scratch? ❌ I wrote tears.
[ ] Did I understand why it works? ⚠️ Eventually… after sacrificing an hour and part of my ego.
[ ] Will I be able to recall this in a week? ✅ If I tattoo the signature logic on my hand.