
Kazuki Higashiguchi

Go 1.18: Cut added to strings/bytes

#go

Key takeaways

  • Go 1.18 is expected to be released in February 2022.
  • A new function, Cut, has been added to the strings and bytes packages; it makes a lot of everyday Go code simpler.

Go 1.18

Go 1.18 includes several major new features, such as generics (type parameters) and fuzzing.

The official release notes are available at the following link.

https://tip.golang.org/doc/go1.18

Go 1.18 is expected to be released in February 2022, but you can already download and use the beta version, Go 1.18 Beta 2, with the following commands:

$ go install golang.org/dl/go1.18beta2@latest
$ go1.18beta2 download

$ go1.18beta2 version
go version go1.18beta2 darwin/arm64

⚠️ Installing executables with go get is deprecated in Go 1.17. You should use go install instead as shown in the above command example. ref: Deprecation of 'go get' for installing executables

strings.Cut / bytes.Cut

As always, the release also brings various minor changes and updates to the standard library. This article focuses on the Cut function added to the strings and bytes packages.

The Go 1.18 release notes describe the bytes version of Cut like this:

The new Cut function slices a []byte around a separator. It can replace and simplify many common uses of Index, IndexByte, IndexRune, and SplitN.

The signature of bytes.Cut is as follows; strings.Cut has the same shape, with string in place of []byte:

func Cut(s, sep []byte) (before, after []byte, found bool)
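To make the three results concrete, here is a minimal sketch of calling bytes.Cut on a key=value record. The helper name splitKeyValue and the sample inputs are made up for illustration; the snippet needs Go 1.18 or later.

```go
package main

import (
	"bytes"
	"fmt"
)

// splitKeyValue is a hypothetical helper that cuts a "key=value" record
// around the first "=" using bytes.Cut.
func splitKeyValue(rec []byte) (key, value []byte, found bool) {
	return bytes.Cut(rec, []byte("="))
}

func main() {
	// Separator present: only the first "=" is used as the cut point.
	k, v, found := splitKeyValue([]byte("lang=go=1.18"))
	fmt.Printf("%q %q %v\n", k, v, found) // "lang" "go=1.18" true

	// Separator absent: before is the whole input, after is nil.
	k, v, found = splitKeyValue([]byte("lang"))
	fmt.Printf("%q %q %v\n", k, v, found) // "lang" "" false
}
```

Note that when the separator is missing, before holds the entire input rather than an empty slice, so the found result is the reliable way to tell the two cases apart.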

Let's take a look at an example: parsing the username and password from a header string used for basic authentication.

First, with Go 1.17 and earlier:

package main

import (
    "encoding/base64"
    "fmt"
    "strings"
)

const prefix = "Basic "

func main() {
    auth := "Basic R28xMTg6d2VsY29tZSBjdXQ="
    // Strip the "Basic " scheme prefix (declared above), then base64-decode the credentials.
    base64Decoded, err := base64.StdEncoding.DecodeString(auth[len(prefix):])
    if err != nil {
        fmt.Printf("error while base64 decoding: %v\n", err)
        return
    }

    decodedString := string(base64Decoded)
    fmt.Printf("decoded: %s\n", decodedString)
    // decoded: Go118:welcome cut

    separatorIndex := strings.IndexByte(decodedString, ':')
    if separatorIndex < 0 {
        fmt.Println("not a basic authentication format.")
        return
    }
    username := decodedString[:separatorIndex]
    password := decodedString[separatorIndex+1:]
    fmt.Printf("username: %q, password: %q\n", username, password)
    // username: "Go118", password: "welcome cut"
}

(Available on Go playground)

This example finds the index of the separator ":" with strings.IndexByte, then slices out the username "Go118" and the password "welcome cut".

separatorIndex := strings.IndexByte(decodedString, ':')
if separatorIndex < 0 {
    fmt.Println("not a basic authentication format.")
    return
}
username := decodedString[:separatorIndex]
password := decodedString[separatorIndex+1:]

The Cut function makes this code simpler. Let's modify this code using strings.Cut.

username, password, ok := strings.Cut(decodedString, ":")
if !ok {
    fmt.Println("not a basic authentication format.")
    return
}

(Available on Go playground)
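For reference, here is one way the complete program might look with strings.Cut. This is a sketch rather than the exact playground code: the parseBasicAuth helper, its error messages, and the strings.HasPrefix guard are additions made for illustration.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

const prefix = "Basic "

// parseBasicAuth is a hypothetical helper that extracts the username and
// password from a header value such as "Basic <base64(user:pass)>".
func parseBasicAuth(auth string) (string, string, error) {
	// Guard against values that lack the scheme prefix before slicing.
	if !strings.HasPrefix(auth, prefix) {
		return "", "", errors.New("missing Basic prefix")
	}
	decoded, err := base64.StdEncoding.DecodeString(auth[len(prefix):])
	if err != nil {
		return "", "", fmt.Errorf("error while base64 decoding: %w", err)
	}
	// Cut splits around the first ":", so the password may itself contain ":".
	username, password, ok := strings.Cut(string(decoded), ":")
	if !ok {
		return "", "", errors.New("not a basic authentication format")
	}
	return username, password, nil
}

func main() {
	username, password, err := parseBasicAuth("Basic R28xMTg6d2VsY29tZSBjdXQ=")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("username: %q, password: %q\n", username, password)
	// username: "Go118", password: "welcome cut"
}
```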

Awesome, the code is much simpler! In Go 1.18 and later, you will probably find many opportunities to use the Cut function in your day-to-day Go code.

The problem the Cut function addresses

So why did Go add the Cut function? Let's take a look at the proposal.

https://github.com/golang/go/issues/46336

The proposal demonstrated the usefulness of Cut by searching the main Go repository for uses of strings.Index, strings.IndexByte, or strings.IndexRune that could use strings.Cut instead.

That leaves 285 calls. Of those, 221 were better written as Cut, leaving 64 that were not.
That is, 77% of Index calls are more clearly written using Cut. That's incredible!

77% of Index calls is a remarkable number. The survey showed that Cut could replace and simplify the overwhelming majority of these calls. For example, it replaces:

eq := strings.IndexByte(rec, '=')
if eq == -1 {
    return "", "", s, ErrHeader
}
k, v = rec[:eq], rec[eq+1:]

with

k, v, ok = strings.Cut(rec, "=")
if !ok {
    return "", "", s, ErrHeader
}
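The release notes also list SplitN among the functions Cut can replace. As a rough sketch of that case, the two hypothetical helpers below (keyValueSplitN and keyValueCut, names invented here) split a key=value string both ways and should return the same results for a non-empty separator.

```go
package main

import (
	"fmt"
	"strings"
)

// keyValueSplitN uses the pre-1.18 idiom: SplitN with a limit of 2,
// followed by a length check and indexing into the slice.
func keyValueSplitN(s string) (string, string, bool) {
	parts := strings.SplitN(s, "=", 2)
	if len(parts) != 2 {
		return s, "", false
	}
	return parts[0], parts[1], true
}

// keyValueCut expresses the same operation directly with strings.Cut.
func keyValueCut(s string) (string, string, bool) {
	return strings.Cut(s, "=")
}

func main() {
	for _, s := range []string{"k=v", "k=v=w", "k"} {
		k1, v1, ok1 := keyValueSplitN(s)
		k2, v2, ok2 := keyValueCut(s)
		fmt.Printf("%q: SplitN=(%q, %q, %v) Cut=(%q, %q, %v)\n", s, k1, v1, ok1, k2, v2, ok2)
	}
}
```

Besides being shorter, the Cut version avoids allocating the intermediate slice that SplitN returns.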

In fact, many standard library packages were updated to use Cut once it was implemented.

https://go-review.googlesource.com/c/go/+/351711

Implementation of Cut

The implementation of Cut is short and simple. It was added to the strings and bytes packages in change 351710 ("bytes, strings: add Cut"). The bytes version looks like this:

func Cut(s, sep []byte) (before, after []byte, found bool) {
    if i := Index(s, sep); i >= 0 {
        return s[:i], s[i+len(sep):], true
    }
    return s, nil, false
}

The first version of the current signature was proposed on November 12, 2020.

nightlyone's comment on the issue

Later, the third result was renamed from ok to found to avoid confusion with the comma-ok form used for map access, channel receive, and so on.

If you look at other languages, you'll see that Python's str.partition and Rust's split_once offer similar functionality.

  • str.partition: Split the string at the first occurrence of sep, and return a 3-tuple containing the part before the separator, the separator itself, and the part after the separator.
    before, sep, after = text.partition("delimiter")
  • split_once: Splits the string on the first occurrence of the specified delimiter and returns prefix before delimiter and suffix after delimiter.
    pub fn split_once<'a, P>(&'a self, delimiter: P) -> Option<(&'a str, &'a str)>

In particular, Python's experience with str.partition helped convince the Go team to accept Cut.

That is, the fact that str.partition is useful in Python is added evidence for Cut, but we need not adopt the Pythonic signature.

Quoted from issue #46336

History leading up to acceptance

This feature was first discussed in issue 40135 and then moved to issue 46336, which was accepted. The following timeline led to the acceptance of the proposal:

A diagram illustrating the history until accepted

  1. July 9, 2020: Incoming (opened the issue #40135)
  2. May 24, 2021: Incoming (moved to the issue #46336)
  3. June 3, 2021: Incoming → Active
  4. June 17, 2021: Active → Likely Accept
  5. July 15, 2021: Likely Accept → Accepted

By the way, a group of Go team members holds proposal review meetings roughly weekly. You can read the meeting minutes on the issue. "Incoming," "Active," and so on refer to proposal states.

A state graph of proposals

Strictly speaking, the golang/proposal repository defines the following proposal states.

| State | Description |
| --- | --- |
| Incoming | New proposal. |
| Active | Watch for emerging consensus in the discussions. |
| Likely Accept | Discussion seems to have reached a consensus to accept the proposal. |
| Likely Decline | Discussion seems to have reached a consensus to decline the proposal. |
| Accepted | Accepted and it is moved out of the Proposal milestone into a work milestone. |
| Declined | Declined, and it is closed. |
| Declined as Duplicate | Declined because it duplicates a previously decided proposal. |
| Declined as Infeasible | Directly contradicts the core design of the language or of a package, or impossible to implement efficiently. |
| Declined as Retracted | Closed or retracted in a comment by the original author. |
| Hold | Waiting for design revisions or additional information, which would take a couple of weeks or more. |
