
Giving Go another chance: hashtag parsing

Dmitry Yakimenko ・3 min read

Last time I added the --tag/-t flag. This time I'd like to add hashtag parsing. I'd like to simply mention tags in the comment and have the tool fish them out for me. For example:

$ klk in 'Writing some #tests to find a #bug'

My first impulse is to look for a library. And there is one for exactly this purpose. But who wants to introduce a left-pad timebomb into their own codebase? (Answer: almost anybody). I'd rather reinvent the wheel and solve the problem with a regex. Two problems are better than one.

Go has the regexp package in its standard library just for this purpose. So it's really a two-liner:

func extractTags(text string) []string {
    re := regexp.MustCompile("#\\S+")
    // TODO: Strip out #
    return re.FindAllString(text, -1)
}

Notice the TODO? That's one of those cases where you have to take a dive from Python heights into C depths and roll your own loop. The two-liner becomes a six-liner. This is where I start missing Ruby, Python, C#, Scala, Kotlin, hell, even C++.

func extractTags(comment string) []string {
    re := regexp.MustCompile("#\\S+")
    tags := re.FindAllString(comment, -1)

    // Strip #s
    for i, tag := range tags {
        tags[i] = strings.TrimLeft(tag, "#")
    }

    return tags
}
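
One limitation of `#\S+` is that it swallows trailing punctuation, so `#bug,` becomes the tag `bug,`. A possible alternative, sketched here under the assumption that tags consist only of ASCII letters, digits and underscores, is to put the tag body in a capture group and read it with `FindAllStringSubmatch`. The names `tagPattern` and `extractTagsSubmatch` are made up for this illustration and are not part of the tool.

```go
package main

import (
	"fmt"
	"regexp"
)

// tagPattern is compiled once at package level. The capture group holds
// the tag without the leading '#'. In Go's regexp syntax \w is ASCII-only.
var tagPattern = regexp.MustCompile(`#(\w+)`)

// extractTagsSubmatch returns the hashtags found in text, without the '#'.
func extractTagsSubmatch(text string) []string {
	matches := tagPattern.FindAllStringSubmatch(text, -1)
	tags := make([]string, 0, len(matches))
	for _, m := range matches {
		tags = append(tags, m[1]) // m[0] is the whole match, m[1] the group
	}
	return tags
}

func main() {
	fmt.Println(extractTagsSubmatch("Writing some #tests to find a #bug, finally"))
	// prints [tests bug]
}
```

The loop does not go away, but the stripping step does, and the pattern is no longer recompiled on every call. Whether `\w+` is the right definition of a tag is a judgment call: it rejects tags like `#c++` that `\S+` would accept.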

Done with C? Back to Python. Now we have to join tags that come from the --tag switch with the ones we fished out from the comment. That is surprisingly easy:

tags := append(tagFlag, extractTags(comment)...)
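
One thing to keep in mind with this one-liner: `append` may reuse the backing array of its first argument when that slice has spare capacity, so the result can share memory with `tagFlag`. That is harmless in a short-lived command-line tool, but if it ever matters, a sketch that always allocates a fresh slice could look like the following. The helper name `joinTags` is hypothetical.

```go
package main

import "fmt"

// joinTags returns a new slice holding flagTags followed by commentTags.
// It never writes into the backing array of either input.
func joinTags(flagTags, commentTags []string) []string {
	joined := make([]string, 0, len(flagTags)+len(commentTags))
	joined = append(joined, flagTags...)
	joined = append(joined, commentTags...)
	return joined
}

func main() {
	fmt.Println(joinTags([]string{"coding"}, []string{"tests", "bug"}))
	// prints [coding tests bug]
}
```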

As promised earlier, we have another problem now. Tags get duplicated when the same tag comes from different places or when the same tag is mentioned more than once. That should be quick to fix in a language with such an amazing runtime, right? No, not really. One solution is to create a map, put all the tags in it, then iterate over its keys and collect them into a slice.

tagSet := map[string]bool{}

for _, tag := range tags {
    tagSet[tag] = true
}

uniqueTags := []string{}
for tag := range tagSet {
    uniqueTags = append(uniqueTags, tag)
}

Compare to Ruby:

uniqueTags = tags.uniq
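
For what it's worth, Go 1.21 and later (released well after this post was written) ship a `slices` package in the standard library that gets closer to the Ruby version. A minimal sketch, assuming sorted output is acceptable, since `slices.Compact` only removes adjacent duplicates and so needs sorted input; the function name `uniqueSorted` is made up for illustration.

```go
package main

import (
	"fmt"
	"slices"
)

// uniqueSorted returns the distinct tags in sorted order.
// It works on a copy so the caller's slice is left untouched.
func uniqueSorted(tags []string) []string {
	sorted := slices.Clone(tags)
	slices.Sort(sorted)
	return slices.Compact(sorted) // drops adjacent duplicates
}

func main() {
	fmt.Println(uniqueSorted([]string{"tests", "coding", "tests", "bug"}))
	// prints [bug coding tests]
}
```

Unlike Ruby's `uniq`, this does not keep the original order of the tags.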

Anywho, the final code looks like this:

func getTags(comment string) []string {
    tags := map[string]bool{}

    // Add all tags from --tag
    for _, tag := range tagFlag {
        tags[tag] = true
    }

    // Add all tags from #tag mentions in the comment
    for _, tag := range extractTags(comment) {
        tags[tag] = true
    }

    // The map keys are already unique; collect them into a slice
    uniqueTags := []string{}
    for tag := range tags {
        uniqueTags = append(uniqueTags, tag)
    }

    return uniqueTags
}

After all this hard work I can write this in the terminal:

$ klk in --at '15 min ago' --tag coding "Writing some #tests"
Adding an entry: Writing some #tests at Fri Jan 25 01:53:05 CET 2019 with tags #coding #tests
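
The output above lists #coding before #tests, but Go does not specify the iteration order of a map, so the map-based version may print the tags in a different order from run to run. If a stable order matters, one option is to keep the map only as a "seen" set and build the result slice as tags arrive. This is a sketch, not the tool's actual code: `getTagsOrdered` is a hypothetical name, and it takes both tag sources as parameters instead of reading the `tagFlag` variable.

```go
package main

import "fmt"

// getTagsOrdered merges tags from the --tag flag and from the comment,
// dropping duplicates while keeping first-seen order.
func getTagsOrdered(flagTags, commentTags []string) []string {
	seen := make(map[string]bool, len(flagTags)+len(commentTags))
	unique := make([]string, 0, len(flagTags)+len(commentTags))
	for _, group := range [][]string{flagTags, commentTags} {
		for _, tag := range group {
			if seen[tag] {
				continue // already collected
			}
			seen[tag] = true
			unique = append(unique, tag)
		}
	}
	return unique
}

func main() {
	fmt.Println(getTagsOrdered([]string{"coding"}, []string{"tests", "coding"}))
	// prints [coding tests]
}
```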

The dream time tracking tool is coming along. And I'm relearning myself some C loops. Double win!


Google searches that went into getting this to work:

  • golang tag parser
  • golang hashtag parser
  • go regex
  • golang map
  • golang unique slice of strings
  • golang merge slices unique
  • golang merge slices
  • golang set
  • golang map keys as slice
  • golang map get keys as slice
  • go for
  • go foreach
  • go for range modify
  • go for range update

Time spent: 60 minutes
Total time spent: 6:35 hours


Dmitry Yakimenko

@detunized

Grew up in Russia, lived in the States, moved to Germany, sometimes live in Spain. I have been programming since I was 13. I used to program games and maps, and now I reverse engineer password managers and other stuff.
