jayk0001

Posted on Nov 15

DSA Fundamentals: Arrays & Strings - From Theory to LeetCode Practice

#dsa #arrays #leetcode #algorithms

Data Structures and Algorithms form the foundation of efficient programming. Today, we'll dive deep into Arrays and Strings - two fundamental data structures that appear in virtually every coding interview and real-world application.

This comprehensive guide combines theoretical understanding with practical problem-solving, featuring solutions to essential LeetCode problems that demonstrate core concepts.

Understanding Arrays: Static vs Dynamic

Array Fundamentals

Arrays are mutable (changeable) collections that store elements in contiguous memory locations, providing efficient random access.

# Array mutability example
A = [1, 2, 3, 4, 5]
A[3] = 1  # Modify element at index 3
print(A)  # Output: [1, 2, 3, 1, 5]

Static Arrays vs Dynamic Arrays

Static Arrays:

Fixed size at creation time
To change size: copy all elements → create new array → paste elements
Memory allocated at compile time

Dynamic Arrays:

Size can grow/shrink during runtime
Programming language automatically handles expansion
Memory allocated at runtime (Python lists, JavaScript arrays)

Array Operations Time Complexity

Operation	Time Complexity	Notes
Random Access	O(1)	Direct index access
Search (`in` operator)	O(n)	Linear scan required
Append to end	O(1)*	*Amortized average
Pop from end	O(1)	Direct removal
Insert (not at end)	O(n)	Shift elements required
Delete (not at end)	O(n)	Shift elements required
Modify element	O(1)	Direct access and change

Understanding Strings: Immutability Matters

Strings are immutable - once created, they cannot be changed. Any "modification" creates a new string object.

String vs Array Operations Comparison

Operation	Array/List	String (Immutable)
Appending to end	O(1)* Amortized	O(n)
Popping from end	O(1)	O(n)
Insertion (not from end)	O(n)	O(n)
Deletion (not from end)	O(n)	O(n)
Modifying an element	O(1)	O(n)
Random Access	O(1)	O(1)
Searching	O(n)	O(n)

Key Insight: String immutability means every modification operation creates a new string, leading to O(n) complexity even for simple operations.

Essential LeetCode Problems & Solutions

Let's explore fundamental array and string problems that demonstrate core algorithmic patterns.

1. Valid Sudoku (LeetCode 36)

Problem: Determine if a 9x9 Sudoku board is valid.

Approach: Use hash sets to track seen numbers in rows, columns, and 3x3 boxes.

def isValidSudoku(board):
    # Create sets for rows, columns, and 3x3 boxes
    rows = [set() for _ in range(9)]
    cols = [set() for _ in range(9)]
    boxes = [set() for _ in range(9)]

    for row in range(9):
        for col in range(9):
            current_val = board[row][col]

            # Skip empty cells
            if current_val == '.':
                continue

            # Calculate box index: (row//3) * 3 + (col//3)
            box_index = (row // 3) * 3 + (col // 3)

            # Check if number already exists
            if (current_val in rows[row] or 
                current_val in cols[col] or 
                current_val in boxes[box_index]):
                return False

            # Add to respective sets
            rows[row].add(current_val)
            cols[col].add(current_val)
            boxes[box_index].add(current_val)

    return True

Time Complexity: O(1) - Fixed 9x9 grid = 81 operations
Space Complexity: O(1) - Fixed number of sets

Key Concepts:

Hash Set Usage: Efficient O(1) lookup for duplicates
Box Index Calculation: (row//3) * 3 + (col//3) maps 2D coordinates to box number
Early Termination: Return False immediately when duplicate found

2. Longest Consecutive Sequence (LeetCode 128)

Problem: Find the length of the longest consecutive elements sequence in O(n) time.

Approach: Use hash set to identify sequence starts and extend sequences.

def longestConsecutive(nums):
    if not nums:
        return 0

    # Convert to set for O(1) lookup and remove duplicates
    num_set = set(nums)
    longest = 0

    for num in num_set:
        # Check if this can be a sequence start
        if num - 1 not in num_set:
            current_num = num
            current_length = 1

            # Extend the sequence
            while current_num + 1 in num_set:
                current_num += 1
                current_length += 1

            # Update longest sequence found
            longest = max(longest, current_length)

    return longest

Time Complexity: O(n) - Each number visited at most twice
Space Complexity: O(n) - Hash set storage

Key Concepts:

Sequence Start Identification: Only start counting when num - 1 not in set
Hash Set Optimization: O(1) lookup prevents nested loops
Duplicate Handling: Set automatically removes duplicates

3. Product of Array Except Self (LeetCode 238)

Problem: Return array where each element is the product of all elements except itself (no division allowed).

Approach: Two-pass algorithm - left products, then right products.

def productExceptSelf(nums):
    n = len(nums)
    result = [1] * n

    # First pass: accumulate products to the left
    for i in range(1, n):
        result[i] = result[i-1] * nums[i-1]

    # Second pass: multiply by products to the right
    right_product = 1
    for i in range(n-1, -1, -1):
        result[i] *= right_product
        right_product *= nums[i]

    return result

# Example walkthrough:
# nums = [1, 2, 3, 4]
# After first pass:  [1, 1, 2, 6]  (left products)
# After second pass: [24, 12, 8, 6] (left * right products)

Time Complexity: O(n) - Two passes through array
Space Complexity: O(1) - Only using result array (required output)

Key Concepts:

Two-Pass Strategy: Separate left and right product calculations
Space Optimization: Use result array to store intermediate values
No Division Constraint: Handles edge cases like zeros naturally

4. Top K Frequent Elements (LeetCode 347)

Problem: Return k most frequent elements from an array.

Approach 1: Counter + Sorting

from collections import Counter

def topKFrequent(nums, k):
    # Count frequencies
    counter = Counter(nums)

    # Sort by frequency (descending)
    sorted_items = sorted(counter.items(), key=lambda x: x[1], reverse=True)

    # Extract first k elements
    result = []
    for i in range(k):
        result.append(sorted_items[i][0])

    return result

Time Complexity: O(n log n) - Sorting dominates
Space Complexity: O(n) - Counter storage

Approach 2: Counter + Bucket Sort (Optimal)

from collections import Counter

def topKFrequent(nums, k):
    counter = Counter(nums)
    n = len(nums)

    # Create buckets for each possible frequency (0 to n)
    buckets = [[] for _ in range(n + 1)]

    # Place numbers in buckets based on frequency
    for num, freq in counter.items():
        buckets[freq].append(num)

    # Collect k most frequent elements (iterate backwards)
    result = []
    for freq in range(n, -1, -1):
        for num in buckets[freq]:
            result.append(num)
            if len(result) == k:
                return result

    return result

Time Complexity: O(n) - Linear bucket sort
Space Complexity: O(n) - Bucket array

Key Concepts:

Bucket Sort Application: Frequency as bucket index
Trade-off Analysis: Sorting vs bucket sort based on constraints
Early Termination: Stop when k elements found

5. Group Anagrams (LeetCode 49)

Problem: Group strings that are anagrams of each other.

Approach 1: Sorting as Key

from collections import defaultdict

def groupAnagrams(strs):
    groups = defaultdict(list)

    for s in strs:
        # Use sorted string as key
        key = ''.join(sorted(s))
        groups[key].append(s)

    return list(groups.values())

Time Complexity: O(n * m log m) where n = number of strings, m = average string length
Space Complexity: O(n * m) - Storage for groups

Approach 2: Character Count as Key (Optimal)

from collections import defaultdict

def groupAnagrams(strs):
    groups = defaultdict(list)

    for s in strs:
        # Create character count array
        char_count = [0] * 26
        for char in s:
            char_count[ord(char) - ord('a')] += 1

        # Use tuple of counts as key (arrays aren't hashable)
        key = tuple(char_count)
        groups[key].append(s)

    return list(groups.values())

Time Complexity: O(n * m) - Linear in string length
Space Complexity: O(n * m) - Storage for groups

Key Concepts:

Anagram Property: Same character frequencies
Key Generation: Sorted string vs character count
Hashable Keys: Tuple conversion for dictionary keys

6. Two Sum (LeetCode 1)

Problem: Find two numbers that add up to target.

Approach: Hash map for complement lookup.

def twoSum(nums, target):
    seen = {}  # value -> index mapping

    for i, num in enumerate(nums):
        complement = target - num

        if complement in seen:
            return [seen[complement], i]

        seen[num] = i

    return []  # No solution found

Time Complexity: O(n) - Single pass
Space Complexity: O(n) - Hash map storage

Key Concepts:

Complement Strategy: target - current = needed value
Hash Map Optimization: O(1) lookup vs O(n) nested loops
Index Tracking: Store positions for result

7. Valid Anagram (LeetCode 242)

Problem: Determine if two strings are anagrams.

Approach 1: Sorting

def isAnagram(s, t):
    return sorted(s) == sorted(t)

Time Complexity: O(n log n) - Sorting
Space Complexity: O(1) - Ignoring sort space

Approach 2: Character Frequency (Optimal)

from collections import Counter

def isAnagram(s, t):
    return Counter(s) == Counter(t)

# Or manual counting:
def isAnagram(s, t):
    if len(s) != len(t):
        return False

    char_count = {}

    # Count characters in s
    for char in s:
        char_count[char] = char_count.get(char, 0) + 1

    # Subtract characters in t
    for char in t:
        if char not in char_count:
            return False
        char_count[char] -= 1
        if char_count[char] == 0:
            del char_count[char]

    return len(char_count) == 0

Time Complexity: O(n) - Linear scan
Space Complexity: O(k) where k = unique characters

8. Contains Duplicate (LeetCode 217)

Problem: Check if array contains any duplicates.

Approach 1: Hash Set

def containsDuplicate(nums):
    seen = set()

    for num in nums:
        if num in seen:
            return True
        seen.add(num)

    return False

Approach 2: Set Length Comparison

def containsDuplicate(nums):
    return len(set(nums)) != len(nums)

Time Complexity: O(n) - Single pass or set creation
Space Complexity: O(n) - Hash set storage

Key Concepts:

Early Termination: Return immediately when duplicate found
Set Properties: Automatic duplicate removal
Length Comparison: Elegant one-liner solution

Core Algorithm Patterns from Today's Problems

1. Hash-Based Solutions (Primary Pattern)

When to use: Fast lookups, duplicate detection, complement finding
Examples: Two Sum, Valid Sudoku, Contains Duplicate, Longest Consecutive Sequence
Key insight: Trade O(n) space for O(1) lookup time
Implementation: Hash Sets for membership, Hash Maps for key-value pairs

2. Frequency Analysis (Primary Pattern)

When to use: Anagrams, character counting, top-k problems
Examples: Group Anagrams, Valid Anagram, Top K Frequent Elements
Tools: Counter, hash maps, bucket sort
Optimization: Linear time frequency-based solutions instead of sorting

3. Multi-Pass Array Processing

When to use: Complex calculations requiring multiple scans
Example: Product of Array Except Self (left pass + right pass)
Benefit: Maintain O(n) time while avoiding division or complex logic
Pattern: Accumulate information in one direction, then process in reverse

4. Set Operations for Uniqueness

When to use: Removing duplicates, finding unique elements
Examples: Longest Consecutive Sequence (duplicate removal), Contains Duplicate
Advantage: Automatic duplicate handling with O(1) operations
Common trick: Compare len(set(array)) vs len(array) for duplicate detection

Performance Optimization Tips

1. Choose Right Data Structure

# Slow: List membership testing
if item in my_list:  # O(n)

# Fast: Set membership testing  
if item in my_set:   # O(1)

2. Minimize String Operations

# Slow: String concatenation in loop
result = ""
for char in chars:
    result += char  # O(n) each iteration

# Fast: Join operation
result = ''.join(chars)  # O(n) total

3. Early Termination

# Stop as soon as condition is met
for item in items:
    if condition(item):
        return True  # Don't continue unnecessary iterations
return False

4. Space-Time Trade-offs

# Time-optimized (more space)
seen = set(nums)  # O(n) space
return len(seen) != len(nums)  # O(n) time

# Space-optimized (more time)  
nums.sort()  # O(1) space (in-place)
for i in range(1, len(nums)):  # O(n log n) time
    if nums[i] == nums[i-1]:
        return True

Common Pitfalls and Solutions

1. Index Out of Bounds

# Dangerous
for i in range(len(arr)):
    if arr[i+1] > arr[i]:  # Error when i = len(arr)-1

# Safe
for i in range(len(arr) - 1):
    if arr[i+1] > arr[i]:

2. Modifying List While Iterating

# Dangerous
for item in my_list:
    if condition(item):
        my_list.remove(item)  # Changes list size during iteration

# Safe
my_list = [item for item in my_list if not condition(item)]

3. String Immutability Oversight

# Inefficient for large strings
def reverse_string(s):
    result = ""
    for char in reversed(s):
        result += char  # O(n²) due to string immutability
    return result

# Efficient
def reverse_string(s):
    return s[::-1]  # O(n) built-in operation

Conclusion

Arrays and strings form the foundation of algorithmic problem-solving. Key takeaways:

Core Concepts Mastered:

Array mutability vs string immutability and their performance implications
Static vs dynamic arrays and when to use each
Time complexity analysis for common operations
Hash-based optimizations for O(1) lookups

Essential Patterns Mastered Today:

Hash-based lookups (Two Sum, Valid Sudoku, Contains Duplicate)
Frequency analysis (anagrams, Top K problems)
Multi-pass processing (Product Except Self)
Set operations (duplicate detection, uniqueness)

Problem-Solving Strategies:

Choose appropriate data structures based on operation requirements
Consider space-time trade-offs for optimization
Use early termination when possible
Leverage built-in functions for common operations

Next Steps:

Building on this foundation, future topics will cover:

Linked Lists and Pointers
Stack and Queue Applications
Tree and Graph Traversals
Dynamic Programming Patterns

The problems covered here represent fundamental patterns that appear across technical interviews and real-world applications. Master these concepts, and you'll have a solid foundation for tackling more complex algorithmic challenges.

References:

Greg Hogg DSA Algorithm YouTube Channel
Algorithm Design Manual by Steven Skiena
Introduction to Algorithms by Cormen, Leiserson, Rivest, and Stein
Grokking Algorithms by Aditya Bhargava

This comprehensive guide combines theoretical understanding with practical problem-solving, providing a solid foundation for data structures and algorithms mastery.

DEV Community

DSA Fundamentals: Arrays & Strings - From Theory to LeetCode Practice

Understanding Arrays: Static vs Dynamic

Array Fundamentals

Static Arrays vs Dynamic Arrays

Array Operations Time Complexity

Understanding Strings: Immutability Matters

String vs Array Operations Comparison

Essential LeetCode Problems & Solutions

1. Valid Sudoku (LeetCode 36)

2. Longest Consecutive Sequence (LeetCode 128)

3. Product of Array Except Self (LeetCode 238)

4. Top K Frequent Elements (LeetCode 347)

5. Group Anagrams (LeetCode 49)

6. Two Sum (LeetCode 1)

7. Valid Anagram (LeetCode 242)

8. Contains Duplicate (LeetCode 217)

Core Algorithm Patterns from Today's Problems

1. Hash-Based Solutions (Primary Pattern)

2. Frequency Analysis (Primary Pattern)

3. Multi-Pass Array Processing

4. Set Operations for Uniqueness

Performance Optimization Tips

1. Choose Right Data Structure

2. Minimize String Operations

3. Early Termination

4. Space-Time Trade-offs

Common Pitfalls and Solutions

1. Index Out of Bounds

2. Modifying List While Iterating

3. String Immutability Oversight

Conclusion

Core Concepts Mastered:

Essential Patterns Mastered Today:

Problem-Solving Strategies:

Next Steps:

Top comments (0)