
James [Undefined]


The most underrated but useful Rust standard library type

The Rust standard library is full of many useful types, traits, and abstractions. Today I'll be talking about one that I think is underrated, but is quite useful. That type is Cow.

What is Cow?

According to the standard library docs, std::borrow::Cow is:

a smart pointer providing clone-on-write functionality: it can enclose and provide immutable access to borrowed data, and clone the data lazily when mutation or ownership is required. The type is designed to work with general borrowed data via the Borrow trait.

I feel like this description is accurate, but it just doesn't do it justice.

Cow is actually just an enum:

pub enum Cow<'a, B: ?Sized + 'a>
where
    B: ToOwned,
{
    /// Borrowed data.
    Borrowed(&'a B),

    /// Owned data.
    Owned(<B as ToOwned>::Owned),
}


This definition not only makes clone-on-write semantics possible (borrowed data is turned into owned data before it can be accessed mutably), but it also allows you to store data that is potentially owned. The ToOwned abstraction is what lets you use Cow<'_, str> to store either a &str or a String. Without this abstraction, Cow wouldn't be as helpful, because you would only be able to hold e.g. a &str or a Box<str> (which is definitely not as useful as String).

You could of course just write your own definition of a PotentiallyOwned<'a, T> type, but if it already exists in the standard library, you might as well use it! Plus, the clone-on-write semantics are a helpful addition to a "potentially owned" type.
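To make the clone-on-write part concrete, here is a minimal sketch. The function name ensure_exclaimed is hypothetical (it is not from the standard library); the only Cow API it relies on is to_mut, which clones borrowed data into an owned String the first time it is called.

```rust
use std::borrow::Cow;

/// Hypothetical helper: appends '!' only when `s` does not already end with one.
fn ensure_exclaimed(s: &str) -> Cow<'_, str> {
    // Start out borrowed: no allocation yet.
    let mut text = Cow::Borrowed(s);
    if !text.ends_with('!') {
        // `to_mut` converts Borrowed into Owned (cloning the data) on first use,
        // then hands back a `&mut String` we can modify.
        text.to_mut().push('!');
    }
    text
}
```

If the input already ends with '!', the function returns the Borrowed variant untouched; otherwise the data is cloned exactly once and the Owned variant comes back.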

Why is it helpful?

The "potentially owned" semantics of Cow make it helpful in cases where you have a large string that you may need to modify, but don't want to clone unnecessarily. This is an example that's not brought up in the standard library docs (I argue that it definitely should be).

Here is a simple (albeit, somewhat contrived) example of how this is helpful:

fn foo(s: &str, some_condition: bool) -> &str {
    if some_condition {
        &s.replace("foo", "bar")
    } else {
        s
    }
}

Let me explain this code. This function may replace all instances of "foo" with "bar" in the string s, if the condition some_condition is true. But there's a problem!

The function returns a &str, but str::replace returns a String, and you can't return a reference to a value created inside the function, because that value is dropped when the function returns!

Okay, so let's make the function return String.

fn foo(s: &str, some_condition: bool) -> String {
    if some_condition {
        s.replace("foo", "bar")
    } else {
        s.to_string()
    }
}

This function works, but notice that it always allocates a new String, even when some_condition is false and nothing needs to change. This is generally bad, because if the string is long, this might be super expensive! So what do we do if we don't want to unnecessarily clone?

Well, we can use Cow! It allows us to return data that is potentially owned.

Here is a version of the function that doesn't unnecessarily clone:

use std::borrow::Cow;

fn foo<'a>(s: &'a str, some_condition: bool) -> Cow<'a, str> {
    if some_condition {
        Cow::from(s.replace("foo", "bar"))
    } else {
        Cow::from(s)
    }
}

And it works! Our code passes the borrow checker, and it doesn't allocate when the string is returned unchanged!
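As a rough sketch of what calling this looks like, the block below repeats foo and adds two helpers. Both is_borrowed and describe are hypothetical names made up for this illustration; they only show that a caller can check which variant came back, and can otherwise treat the result like a &str because Cow<str> dereferences to str.

```rust
use std::borrow::Cow;

// Same function as in the article above.
fn foo<'a>(s: &'a str, some_condition: bool) -> Cow<'a, str> {
    if some_condition {
        Cow::from(s.replace("foo", "bar"))
    } else {
        Cow::from(s)
    }
}

// Hypothetical helper: true when the Cow still borrows the input (no allocation).
fn is_borrowed(c: &Cow<'_, str>) -> bool {
    matches!(c, Cow::Borrowed(_))
}

// Hypothetical caller: the same code handles both variants,
// since `len` and `Display` are available through the deref to `str`.
fn describe(s: &str, some_condition: bool) -> String {
    let result = foo(s, some_condition);
    let kind = if is_borrowed(&result) { "borrowed" } else { "owned" };
    format!("{}: {} ({} bytes)", kind, result, result.len())
}
```

Calling describe("foo fighters", false) reports a borrowed result, while describe("foo fighters", true) reports an owned one, and the caller never has to match on the enum unless it wants to.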

Why is it confusing?

Note: This section is mostly just my speculations so don't take it as fact.

I personally feel that Cow is a lot more confusing than it needs to be, thanks to the borrow checker. Don't get me wrong, the borrow checker is amazing. It definitely prevents memory safety bugs that would otherwise be present without it. But lifetimes are quite a papercut for new Rust users, and I think this is one reason why Cow is kind of underrated--it's kind of hard to use.

Anything producing or using a Cow needs to have a lifetime attached, and that makes it just a bit harder to use. It's also potentially hard for new users to reason about unsized types, so using a Cow<'_, str> might not make sense to a new user, as they may think something along the lines of "but wait, I thought you would only use &str, not just str?"
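A small sketch may help with both points. The function names abs_all, detach, and default_name are hypothetical, chosen only for illustration. The type parameter names the unsized target (str, [i32]); the Borrowed variant stores a reference to it, and the Owned variant stores the matching owned type (String, Vec<i32>).

```rust
use std::borrow::Cow;

// The elided lifetime `'_` ties the returned Cow to `input`.
// B = [i32] is unsized: Borrowed holds a &[i32], Owned holds a Vec<i32>.
fn abs_all(input: &[i32]) -> Cow<'_, [i32]> {
    if input.iter().any(|&x| x < 0) {
        // At least one negative value: build a new Vec with absolute values.
        Cow::Owned(input.iter().map(|x| x.saturating_abs()).collect())
    } else {
        // Nothing to change: hand the slice back without copying.
        Cow::Borrowed(input)
    }
}

// `into_owned` detaches the data from the lifetime, cloning only if it was borrowed.
fn detach(c: Cow<'_, str>) -> String {
    c.into_owned()
}

// A Cow borrowing a string literal can use the 'static lifetime.
fn default_name() -> Cow<'static, str> {
    Cow::Borrowed("anonymous")
}
```

When the lifetime becomes a burden (for example, when storing the value in a long-lived struct), into_owned is one way out, at the cost of a clone in the borrowed case.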

These aren't really problems that can be solved with some "quick fix", so this will continue to be a potential papercut.

Another reason why I find Cow confusing is that the naming and documentation cloud its potential use for potentially owned data. A new user wondering "can I have borrowed data returned by this part of a function and owned data by this part" will probably not immediately think "oh, I need a clone-on-write smart pointer" (and they might end up trying to implement their own "potentially owned" data type, and may run into some papercuts with lifetimes or the lack of ToOwned). I don't think that there's really a way to fix the naming part without a lot of backwards-incompatibility, but there is definitely a way to fix the documentation.

Conclusion

Cow is a very useful type with somewhat hidden capabilities. I argue that these capabilities should be more documented in Rust documentation and books. As far as I know, there's no mention of Cow in the Rust Book or Rust By Example.

Thanks for reading this, and have a good morning/afternoon/evening/night/etc.
-James

If you liked this article, you may like some of my writings on my journey through writing Calypso, a programming language that I'm implementing in Rust:

(I will be writing another post in that series soon(tm))
