Daniel Keya

String Union in Go: Merging Two Strings Without Duplicates

The Full Function

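package main

import "os"

// Union prints the characters of the first two command-line arguments,
// each at most once, in the order they are first encountered.
func Union() {
    // os.Args[0] is the binary name, so two strings means len(os.Args) >= 3.
    if len(os.Args) < 3 {
        os.Stdout.Write([]byte("\n"))
        return
    }

    s1 := os.Args[1]
    s2 := os.Args[2]

    result := ""
    seen := make(map[rune]bool)

    for _, char := range s1 {
        if !seen[char] {
            result += string(char)
            seen[char] = true
        }
    }

    for _, char := range s2 {
        if !seen[char] {
            result += string(char)
            seen[char] = true
        }
    }

    os.Stdout.Write([]byte(result + "\n"))
}

func main() {
    Union()
}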

What Does It Do?

It takes two strings from the command line, walks through both of them in order, and builds a result string containing each character at most once — in the order it was first encountered.

Think of it as a set union on the characters of two strings.

For example:

  • "hello" + "world" → "helowrd": l and o were already seen in s1, so they are skipped in s2
  • "abc" + "bcd" → "abcd": b and c are already covered, so only d is new

Algorithm Walkthrough

Step 1 — Validate arguments

if len(os.Args) < 3 {
    os.Stdout.Write([]byte("\n"))
    return
}

At least 3 arguments are needed: the binary name plus 2 strings. Unlike a strict != 3 check, this uses < 3, so extra arguments are silently ignored rather than rejected.
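To make the difference between the two policies concrete, here is a minimal sketch that puts them side by side. The helper name pickArgs and its strict flag are hypothetical and not part of the original function; the sketch only illustrates which inputs a != 3 check would reject that the < 3 check accepts.

```go
package main

import (
	"fmt"
	"os"
)

// pickArgs is a hypothetical helper (not in the original article).
// Lenient mode mirrors the article's `< 3` check and ignores extras;
// strict mode requires exactly two strings after the program name.
func pickArgs(args []string, strict bool) (s1, s2 string, ok bool) {
	if len(args) < 3 {
		return "", "", false
	}
	if strict && len(args) != 3 {
		return "", "", false
	}
	return args[1], args[2], true
}

func main() {
	s1, s2, ok := pickArgs(os.Args, true)
	if !ok {
		fmt.Println() // same fallback as the article: a bare newline
		return
	}
	fmt.Println(s1, s2)
}
```

With three strings on the command line, lenient mode would use the first two, while strict mode falls back to the bare newline.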


Step 2 — Set up the seen-set

result := ""
seen := make(map[rune]bool)

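seen is a map used as a set: it records which characters have already been added to result. Using rune as the key (instead of byte) means each Unicode code point counts as one character, so accented letters, CJK glyphs, and single-code-point emoji work correctly. Characters built from several code points (combining accents, emoji joined with zero-width joiners) are still deduplicated one code point at a time.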
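The choice of key type can be illustrated with a small contrast. In the sketch below, dedupeBytes and dedupeRunes are hypothetical single-string helpers written for this comparison only; the byte-keyed one is not something the article's function does. The last call also shows the limit of rune keys, assuming the input uses combining accents.

```go
package main

import "fmt"

// dedupeBytes is a hypothetical byte-keyed variant, shown only for contrast.
func dedupeBytes(s string) string {
	seen := make(map[byte]bool)
	out := make([]byte, 0, len(s))
	for i := 0; i < len(s); i++ {
		if !seen[s[i]] {
			seen[s[i]] = true
			out = append(out, s[i])
		}
	}
	return string(out)
}

// dedupeRunes mirrors the article's rune-keyed loop for a single string.
func dedupeRunes(s string) string {
	seen := make(map[rune]bool)
	out := make([]rune, 0, len(s))
	for _, r := range s {
		if !seen[r] {
			seen[r] = true
			out = append(out, r)
		}
	}
	return string(out)
}

func main() {
	// U+00E9 (é) and U+00E8 (è) both start with the UTF-8 byte 0xC3.
	s := "\u00e9\u00e8"
	fmt.Printf("%+q\n", dedupeRunes(s)) // "\u00e9\u00e8"
	fmt.Printf("%+q\n", dedupeBytes(s)) // "\u00e9\xa8" (second 0xC3 dropped)

	// Combining accents are separate runes, so the second U+0301 is skipped.
	fmt.Printf("%+q\n", dedupeRunes("e\u0301a\u0301")) // "e\u0301a"
}
```

The byte-keyed version corrupts the second letter because it treats the shared lead byte as a duplicate, which is the failure that rune keys avoid.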


Step 3 — Walk s1, add unseen characters

for _, char := range s1 {
    if !seen[char] {
        result += string(char)
        seen[char] = true
    }
}

range over a string in Go yields rune values — so multi-byte characters are handled as single units. Each character is checked against seen: if it's new, it's appended and marked.
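A short sketch of what range reports for a string that contains a multi-byte character. The helper runeOffsets is hypothetical and exists only to make the byte offsets visible.

```go
package main

import "fmt"

// runeOffsets is a hypothetical helper that records the byte offset at which
// each rune of s starts, as reported by range.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	s := "h\u00e9llo" // "héllo": é takes two bytes in UTF-8
	for i, r := range s {
		fmt.Printf("offset %d: %c\n", i, r)
	}
	fmt.Println(runeOffsets(s))         // [0 1 3 4 5]
	fmt.Println(len(s), len([]rune(s))) // 6 5
}
```

The offset jumps from 1 to 3 because range decodes the two bytes of é as one rune, which is why the loop body never sees half a character.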


Step 4 — Walk s2, add only new characters

for _, char := range s2 {
    if !seen[char] {
        result += string(char)
        seen[char] = true
    }
}

Same logic, but now seen already contains everything from s1. Only characters that didn't appear in s1 (or earlier in s2) get added.


Step 5 — Print the result

os.Stdout.Write([]byte(result + "\n"))

The deduplicated union string is written to stdout, followed by a newline.


Dry Run: "hello" and "world"

Processing s1 = "hello"

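| char | seen before? | result after | seen map |
| --- | --- | --- | --- |
| h | no | "h" | {h} |
| e | no | "he" | {h,e} |
| l | no | "hel" | {h,e,l} |
| l | yes | "hel" | {h,e,l} |
| o | no | "helo" | {h,e,l,o} |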

Processing s2 = "world"

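| char | seen before? | result after | seen map |
| --- | --- | --- | --- |
| w | no | "helow" | {h,e,l,o,w} |
| o | yes | "helow" | {h,e,l,o,w} |
| r | no | "helowr" | {h,e,l,o,w,r} |
| l | yes | "helowr" | {h,e,l,o,w,r} |
| d | no | "helowrd" | {h,e,l,o,w,r,d} |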

Output: helowrd


Complexity

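Time: O(m + n), where m = len(s1) and n = len(s2)
Space: O(k), where k = number of unique characters

Each character is visited exactly once, and map lookups and inserts are O(1) on average. The time bound counts loop iterations only: each result += copies the string built so far, which adds up to O(k²) extra copying in the worst case. In practice, k is bounded by the size of the character set (128 for ASCII, 1,114,112 code points for all of Unicode), so space is effectively constant.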
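When the input is known to be ASCII, the bounded character set allows a fixed-size array in place of the map. The following is a sketch of that alternative under that assumption, not the article's method; the name unionASCII and its boolean "input was ASCII" result are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// unionASCII is a hypothetical ASCII-only variant. It uses a 128-entry array
// as the seen-set and reports false if it meets a byte outside 7-bit ASCII.
func unionASCII(a, b string) (string, bool) {
	var seen [128]bool
	var sb strings.Builder
	for _, s := range []string{a, b} {
		for i := 0; i < len(s); i++ {
			c := s[i]
			if c >= 128 {
				return "", false // not ASCII: the caller should use the map version
			}
			if !seen[c] {
				seen[c] = true
				sb.WriteByte(c)
			}
		}
	}
	return sb.String(), true
}

func main() {
	out, ok := unionASCII("hello", "world")
	fmt.Println(out, ok) // helowrd true
}
```

The array makes the constant space bound explicit, at the cost of rejecting any input that the rune-keyed map would have handled.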


Edge Cases

| Scenario | Behaviour |
| --- | --- |
| Fewer than 2 string arguments | Writes \n and returns immediately |
| Both strings identical | Result equals the deduplicated version of either string |
| Empty s1 | Result is the deduplicated version of s2 |
| Empty s2 | Result is the deduplicated version of s1 |
| Both empty | Result is an empty string followed by \n |
| Unicode characters | Deduplicated per code point: range yields rune, and the map key is rune |
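The string-level rows above can be checked mechanically. This sketch uses a hypothetical union helper, a standalone copy of the article's two-loop logic that returns the result instead of printing it; the case names and inputs are illustrative choices.

```go
package main

import (
	"fmt"
	"strings"
)

// union is a hypothetical standalone copy of the article's two-loop logic,
// returning the result so that it can be compared against expectations.
func union(s1, s2 string) string {
	var sb strings.Builder
	seen := make(map[rune]bool)
	for _, s := range []string{s1, s2} {
		for _, char := range s {
			if !seen[char] {
				sb.WriteRune(char)
				seen[char] = true
			}
		}
	}
	return sb.String()
}

func main() {
	cases := []struct{ name, s1, s2, want string }{
		{"identical", "abc", "abc", "abc"},
		{"empty s1", "", "aab", "ab"},
		{"empty s2", "aab", "", "ab"},
		{"both empty", "", "", ""},
		// 日本 + 本語 should give 日本語 (written with escapes for clarity)
		{"unicode", "\u65e5\u672c", "\u672c\u8a9e", "\u65e5\u672c\u8a9e"},
	}
	for _, c := range cases {
		got := union(c.s1, c.s2)
		fmt.Printf("%-10s ok=%v\n", c.name, got == c.want)
	}
}
```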

Running It

# basic union
go run main.go hello world
# Output: helowrd

# no overlap
go run main.go abc xyz
# Output: abcxyz

# full overlap
go run main.go abc abc
# Output: abc

# unicode (é and e are different runes, so both are kept)
go run main.go "café" "face"
# Output: cafée

A Note on String Concatenation

Inside the loop, the function uses:

result += string(char)

This works fine for short strings, but in Go, strings are immutable — every += creates a new string allocation. For large inputs, prefer strings.Builder:

var sb strings.Builder

for _, char := range s1 {
    if !seen[char] {
        sb.WriteRune(char)
        seen[char] = true
    }
}
// repeat for s2...

result := sb.String()

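strings.Builder appends into a single growing buffer, so building the result takes amortized O(n) time with only a handful of reallocations, whereas repeated += copies the whole string each time, which is O(n²) in the worst case.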
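If the remaining reallocations matter, the buffer can be sized up front with the Builder's Grow method. The sketch below is one possible variant rather than something the article requires; the name unionPresized is hypothetical, and it assumes valid UTF-8 input, for which the result can never be longer than the two inputs combined.

```go
package main

import (
	"fmt"
	"strings"
)

// unionPresized is a hypothetical variant that reserves capacity once.
// For valid UTF-8, len(a)+len(b) bytes is an upper bound on the result size.
func unionPresized(a, b string) string {
	var sb strings.Builder
	sb.Grow(len(a) + len(b)) // reserve space so later writes need not reallocate
	seen := make(map[rune]bool)
	for _, s := range []string{a, b} {
		for _, char := range s {
			if !seen[char] {
				sb.WriteRune(char)
				seen[char] = true
			}
		}
	}
	return sb.String()
}

func main() {
	fmt.Println(unionPresized("hello", "world")) // helowrd
}
```

The trade-off is that the reservation may be larger than needed when the inputs contain many repeated characters.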


Refactored: Pure Function

Separating the union logic from CLI parsing makes it reusable and testable:

// StringUnion returns the union of characters from a and b,
// preserving first-seen order, with no duplicates.
func StringUnion(a, b string) string {
    var sb strings.Builder
    seen := make(map[rune]bool)

    for _, char := range a {
        if !seen[char] {
            sb.WriteRune(char)
            seen[char] = true
        }
    }

    for _, char := range b {
        if !seen[char] {
            sb.WriteRune(char)
            seen[char] = true
        }
    }

    return sb.String()
}

func Union() {
    if len(os.Args) < 3 {
        os.Stdout.Write([]byte("\n"))
        return
    }
    result := StringUnion(os.Args[1], os.Args[2])
    os.Stdout.Write([]byte(result + "\n"))
}

Wrapping Up

Union is a clean example of the seen-set pattern, one of the most reusable tools in string processing. A hash map tracking visited elements lets you deduplicate in a single linear pass, and using rune as the key gives you code-point-level Unicode support for free.

The same pattern shows up in removing duplicates from slices, finding first non-repeating characters, and building character frequency counters.
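Two of those relatives can be sketched briefly. Dedupe and FirstUnique are hypothetical names chosen for this illustration, and the generic version assumes Go 1.18 or later.

```go
package main

import "fmt"

// Dedupe is a hypothetical generic helper: the seen-set pattern applied to a
// slice of any comparable type, preserving first-seen order.
func Dedupe[T comparable](in []T) []T {
	seen := make(map[T]struct{}, len(in))
	out := make([]T, 0, len(in))
	for _, v := range in {
		if _, ok := seen[v]; !ok {
			seen[v] = struct{}{}
			out = append(out, v)
		}
	}
	return out
}

// FirstUnique is a hypothetical helper that returns the first rune occurring
// exactly once in s, using a frequency map instead of a boolean set.
func FirstUnique(s string) (rune, bool) {
	counts := make(map[rune]int)
	for _, r := range s {
		counts[r]++
	}
	for _, r := range s {
		if counts[r] == 1 {
			return r, true
		}
	}
	return 0, false
}

func main() {
	fmt.Println(Dedupe([]int{1, 2, 1, 3, 2})) // [1 2 3]
	r, ok := FirstUnique("swiss")
	fmt.Printf("%c %v\n", r, ok) // w true
}
```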


Written with ❤️ in Go


If you enjoyed this, feel free to check out more of my work on GitHub 👉 keyadaniel56 — always building something new in Go and beyond.
