DEV Community

jayk0001
jayk0001

Posted on

DSA Fundamentals: Arrays & Strings - From Theory to LeetCode Practice

Data Structures and Algorithms form the foundation of efficient programming. Today, we'll dive deep into Arrays and Strings - two fundamental data structures that appear in virtually every coding interview and real-world application.

This comprehensive guide combines theoretical understanding with practical problem-solving, featuring solutions to essential LeetCode problems that demonstrate core concepts.

Understanding Arrays: Static vs Dynamic

Array Fundamentals

Arrays are mutable (changeable) collections that store elements in contiguous memory locations, providing efficient random access.

# Array mutability example
A = [1, 2, 3, 4, 5]
A[3] = 1  # Modify element at index 3
print(A)  # Output: [1, 2, 3, 1, 5]
Enter fullscreen mode Exit fullscreen mode

Static Arrays vs Dynamic Arrays

Static Arrays:

  • Fixed size at creation time
  • To change size: copy all elements → create new array → paste elements
  • Memory allocated at compile time

Dynamic Arrays:

  • Size can grow/shrink during runtime
  • Programming language automatically handles expansion
  • Memory allocated at runtime (Python lists, JavaScript arrays)

Array Operations Time Complexity

Operation Time Complexity Notes
Random Access O(1) Direct index access
Search (in operator) O(n) Linear scan required
Append to end O(1)* *Amortized average
Pop from end O(1) Direct removal
Insert (not at end) O(n) Shift elements required
Delete (not at end) O(n) Shift elements required
Modify element O(1) Direct access and change

Understanding Strings: Immutability Matters

Strings are immutable - once created, they cannot be changed. Any "modification" creates a new string object.

String vs Array Operations Comparison

Operation Array/List String (Immutable)
Appending to end O(1)* Amortized O(n)
Popping from end O(1) O(n)
Insertion (not from end) O(n) O(n)
Deletion (not from end) O(n) O(n)
Modifying an element O(1) O(n)
Random Access O(1) O(1)
Searching O(n) O(n)

Key Insight: String immutability means every modification operation creates a new string, leading to O(n) complexity even for simple operations.

Essential LeetCode Problems & Solutions

Let's explore fundamental array and string problems that demonstrate core algorithmic patterns.

1. Valid Sudoku (LeetCode 36)

Problem: Determine if a 9x9 Sudoku board is valid.

Approach: Use hash sets to track seen numbers in rows, columns, and 3x3 boxes.

def isValidSudoku(board):
    # Create sets for rows, columns, and 3x3 boxes
    rows = [set() for _ in range(9)]
    cols = [set() for _ in range(9)]
    boxes = [set() for _ in range(9)]

    for row in range(9):
        for col in range(9):
            current_val = board[row][col]

            # Skip empty cells
            if current_val == '.':
                continue

            # Calculate box index: (row//3) * 3 + (col//3)
            box_index = (row // 3) * 3 + (col // 3)

            # Check if number already exists
            if (current_val in rows[row] or 
                current_val in cols[col] or 
                current_val in boxes[box_index]):
                return False

            # Add to respective sets
            rows[row].add(current_val)
            cols[col].add(current_val)
            boxes[box_index].add(current_val)

    return True
Enter fullscreen mode Exit fullscreen mode

Time Complexity: O(1) - Fixed 9x9 grid = 81 operations
Space Complexity: O(1) - Fixed number of sets

Key Concepts:

  • Hash Set Usage: Efficient O(1) lookup for duplicates
  • Box Index Calculation: (row//3) * 3 + (col//3) maps 2D coordinates to box number
  • Early Termination: Return False immediately when duplicate found

2. Longest Consecutive Sequence (LeetCode 128)

Problem: Find the length of the longest consecutive elements sequence in O(n) time.

Approach: Use hash set to identify sequence starts and extend sequences.

def longestConsecutive(nums):
    if not nums:
        return 0

    # Convert to set for O(1) lookup and remove duplicates
    num_set = set(nums)
    longest = 0

    for num in num_set:
        # Check if this can be a sequence start
        if num - 1 not in num_set:
            current_num = num
            current_length = 1

            # Extend the sequence
            while current_num + 1 in num_set:
                current_num += 1
                current_length += 1

            # Update longest sequence found
            longest = max(longest, current_length)

    return longest
Enter fullscreen mode Exit fullscreen mode

Time Complexity: O(n) - Each number visited at most twice
Space Complexity: O(n) - Hash set storage

Key Concepts:

  • Sequence Start Identification: Only start counting when num - 1 not in set
  • Hash Set Optimization: O(1) lookup prevents nested loops
  • Duplicate Handling: Set automatically removes duplicates

3. Product of Array Except Self (LeetCode 238)

Problem: Return array where each element is the product of all elements except itself (no division allowed).

Approach: Two-pass algorithm - left products, then right products.

def productExceptSelf(nums):
    n = len(nums)
    result = [1] * n

    # First pass: accumulate products to the left
    for i in range(1, n):
        result[i] = result[i-1] * nums[i-1]

    # Second pass: multiply by products to the right
    right_product = 1
    for i in range(n-1, -1, -1):
        result[i] *= right_product
        right_product *= nums[i]

    return result

# Example walkthrough:
# nums = [1, 2, 3, 4]
# After first pass:  [1, 1, 2, 6]  (left products)
# After second pass: [24, 12, 8, 6] (left * right products)
Enter fullscreen mode Exit fullscreen mode

Time Complexity: O(n) - Two passes through array
Space Complexity: O(1) - Only using result array (required output)

Key Concepts:

  • Two-Pass Strategy: Separate left and right product calculations
  • Space Optimization: Use result array to store intermediate values
  • No Division Constraint: Handles edge cases like zeros naturally

4. Top K Frequent Elements (LeetCode 347)

Problem: Return k most frequent elements from an array.

Approach 1: Counter + Sorting

from collections import Counter

def topKFrequent(nums, k):
    # Count frequencies
    counter = Counter(nums)

    # Sort by frequency (descending)
    sorted_items = sorted(counter.items(), key=lambda x: x[1], reverse=True)

    # Extract first k elements
    result = []
    for i in range(k):
        result.append(sorted_items[i][0])

    return result
Enter fullscreen mode Exit fullscreen mode

Time Complexity: O(n log n) - Sorting dominates
Space Complexity: O(n) - Counter storage

Approach 2: Counter + Bucket Sort (Optimal)

from collections import Counter

def topKFrequent(nums, k):
    counter = Counter(nums)
    n = len(nums)

    # Create buckets for each possible frequency (0 to n)
    buckets = [[] for _ in range(n + 1)]

    # Place numbers in buckets based on frequency
    for num, freq in counter.items():
        buckets[freq].append(num)

    # Collect k most frequent elements (iterate backwards)
    result = []
    for freq in range(n, -1, -1):
        for num in buckets[freq]:
            result.append(num)
            if len(result) == k:
                return result

    return result
Enter fullscreen mode Exit fullscreen mode

Time Complexity: O(n) - Linear bucket sort
Space Complexity: O(n) - Bucket array

Key Concepts:

  • Bucket Sort Application: Frequency as bucket index
  • Trade-off Analysis: Sorting vs bucket sort based on constraints
  • Early Termination: Stop when k elements found

5. Group Anagrams (LeetCode 49)

Problem: Group strings that are anagrams of each other.

Approach 1: Sorting as Key

from collections import defaultdict

def groupAnagrams(strs):
    groups = defaultdict(list)

    for s in strs:
        # Use sorted string as key
        key = ''.join(sorted(s))
        groups[key].append(s)

    return list(groups.values())
Enter fullscreen mode Exit fullscreen mode

Time Complexity: O(n * m log m) where n = number of strings, m = average string length
Space Complexity: O(n * m) - Storage for groups

Approach 2: Character Count as Key (Optimal)

from collections import defaultdict

def groupAnagrams(strs):
    groups = defaultdict(list)

    for s in strs:
        # Create character count array
        char_count = [0] * 26
        for char in s:
            char_count[ord(char) - ord('a')] += 1

        # Use tuple of counts as key (arrays aren't hashable)
        key = tuple(char_count)
        groups[key].append(s)

    return list(groups.values())
Enter fullscreen mode Exit fullscreen mode

Time Complexity: O(n * m) - Linear in string length
Space Complexity: O(n * m) - Storage for groups

Key Concepts:

  • Anagram Property: Same character frequencies
  • Key Generation: Sorted string vs character count
  • Hashable Keys: Tuple conversion for dictionary keys

6. Two Sum (LeetCode 1)

Problem: Find two numbers that add up to target.

Approach: Hash map for complement lookup.

def twoSum(nums, target):
    seen = {}  # value -> index mapping

    for i, num in enumerate(nums):
        complement = target - num

        if complement in seen:
            return [seen[complement], i]

        seen[num] = i

    return []  # No solution found
Enter fullscreen mode Exit fullscreen mode

Time Complexity: O(n) - Single pass
Space Complexity: O(n) - Hash map storage

Key Concepts:

  • Complement Strategy: target - current = needed value
  • Hash Map Optimization: O(1) lookup vs O(n) nested loops
  • Index Tracking: Store positions for result

7. Valid Anagram (LeetCode 242)

Problem: Determine if two strings are anagrams.

Approach 1: Sorting

def isAnagram(s, t):
    return sorted(s) == sorted(t)
Enter fullscreen mode Exit fullscreen mode

Time Complexity: O(n log n) - Sorting
Space Complexity: O(1) - Ignoring sort space

Approach 2: Character Frequency (Optimal)

from collections import Counter

def isAnagram(s, t):
    return Counter(s) == Counter(t)

# Or manual counting:
def isAnagram(s, t):
    if len(s) != len(t):
        return False

    char_count = {}

    # Count characters in s
    for char in s:
        char_count[char] = char_count.get(char, 0) + 1

    # Subtract characters in t
    for char in t:
        if char not in char_count:
            return False
        char_count[char] -= 1
        if char_count[char] == 0:
            del char_count[char]

    return len(char_count) == 0
Enter fullscreen mode Exit fullscreen mode

Time Complexity: O(n) - Linear scan
Space Complexity: O(k) where k = unique characters

8. Contains Duplicate (LeetCode 217)

Problem: Check if array contains any duplicates.

Approach 1: Hash Set

def containsDuplicate(nums):
    seen = set()

    for num in nums:
        if num in seen:
            return True
        seen.add(num)

    return False
Enter fullscreen mode Exit fullscreen mode

Approach 2: Set Length Comparison

def containsDuplicate(nums):
    return len(set(nums)) != len(nums)
Enter fullscreen mode Exit fullscreen mode

Time Complexity: O(n) - Single pass or set creation
Space Complexity: O(n) - Hash set storage

Key Concepts:

  • Early Termination: Return immediately when duplicate found
  • Set Properties: Automatic duplicate removal
  • Length Comparison: Elegant one-liner solution

Core Algorithm Patterns from Today's Problems

1. Hash-Based Solutions (Primary Pattern)

  • When to use: Fast lookups, duplicate detection, complement finding
  • Examples: Two Sum, Valid Sudoku, Contains Duplicate, Longest Consecutive Sequence
  • Key insight: Trade O(n) space for O(1) lookup time
  • Implementation: Hash Sets for membership, Hash Maps for key-value pairs

2. Frequency Analysis (Primary Pattern)

  • When to use: Anagrams, character counting, top-k problems
  • Examples: Group Anagrams, Valid Anagram, Top K Frequent Elements
  • Tools: Counter, hash maps, bucket sort
  • Optimization: Linear time frequency-based solutions instead of sorting

3. Multi-Pass Array Processing

  • When to use: Complex calculations requiring multiple scans
  • Example: Product of Array Except Self (left pass + right pass)
  • Benefit: Maintain O(n) time while avoiding division or complex logic
  • Pattern: Accumulate information in one direction, then process in reverse

4. Set Operations for Uniqueness

  • When to use: Removing duplicates, finding unique elements
  • Examples: Longest Consecutive Sequence (duplicate removal), Contains Duplicate
  • Advantage: Automatic duplicate handling with O(1) operations
  • Common trick: Compare len(set(array)) vs len(array) for duplicate detection

Performance Optimization Tips

1. Choose Right Data Structure

# Slow: List membership testing
if item in my_list:  # O(n)

# Fast: Set membership testing  
if item in my_set:   # O(1)
Enter fullscreen mode Exit fullscreen mode

2. Minimize String Operations

# Slow: String concatenation in loop
result = ""
for char in chars:
    result += char  # O(n) each iteration

# Fast: Join operation
result = ''.join(chars)  # O(n) total
Enter fullscreen mode Exit fullscreen mode

3. Early Termination

# Stop as soon as condition is met
for item in items:
    if condition(item):
        return True  # Don't continue unnecessary iterations
return False
Enter fullscreen mode Exit fullscreen mode

4. Space-Time Trade-offs

# Time-optimized (more space)
seen = set(nums)  # O(n) space
return len(seen) != len(nums)  # O(n) time

# Space-optimized (more time)  
nums.sort()  # O(1) space (in-place)
for i in range(1, len(nums)):  # O(n log n) time
    if nums[i] == nums[i-1]:
        return True
Enter fullscreen mode Exit fullscreen mode

Common Pitfalls and Solutions

1. Index Out of Bounds

# Dangerous
for i in range(len(arr)):
    if arr[i+1] > arr[i]:  # Error when i = len(arr)-1

# Safe
for i in range(len(arr) - 1):
    if arr[i+1] > arr[i]:
Enter fullscreen mode Exit fullscreen mode

2. Modifying List While Iterating

# Dangerous
for item in my_list:
    if condition(item):
        my_list.remove(item)  # Changes list size during iteration

# Safe
my_list = [item for item in my_list if not condition(item)]
Enter fullscreen mode Exit fullscreen mode

3. String Immutability Oversight

# Inefficient for large strings
def reverse_string(s):
    result = ""
    for char in reversed(s):
        result += char  # O(n²) due to string immutability
    return result

# Efficient
def reverse_string(s):
    return s[::-1]  # O(n) built-in operation
Enter fullscreen mode Exit fullscreen mode

Conclusion

Arrays and strings form the foundation of algorithmic problem-solving. Key takeaways:

Core Concepts Mastered:

  • Array mutability vs string immutability and their performance implications
  • Static vs dynamic arrays and when to use each
  • Time complexity analysis for common operations
  • Hash-based optimizations for O(1) lookups

Essential Patterns Mastered Today:

  • Hash-based lookups (Two Sum, Valid Sudoku, Contains Duplicate)
  • Frequency analysis (anagrams, Top K problems)
  • Multi-pass processing (Product Except Self)
  • Set operations (duplicate detection, uniqueness)

Problem-Solving Strategies:

  • Choose appropriate data structures based on operation requirements
  • Consider space-time trade-offs for optimization
  • Use early termination when possible
  • Leverage built-in functions for common operations

Next Steps:

Building on this foundation, future topics will cover:

  • Linked Lists and Pointers
  • Stack and Queue Applications
  • Tree and Graph Traversals
  • Dynamic Programming Patterns

The problems covered here represent fundamental patterns that appear across technical interviews and real-world applications. Master these concepts, and you'll have a solid foundation for tackling more complex algorithmic challenges.


References:

This comprehensive guide combines theoretical understanding with practical problem-solving, providing a solid foundation for data structures and algorithms mastery.

Top comments (0)