Span<T>

#textprocessing #dotnet #csharp

When Microsoft introduced Memory<T> and Span<T>, there was some confusion about what the types were even for. Because of the tremendous opportunities they provide, I'd like to explain what role they serve, especially with regard to Stringier.

Both Memory<T> and Span<T> are what's called reference slices. That is, they are references to existing objects in memory, and they are slices of that object rather than references to the whole thing. Confusing? Yeah, a bit. Generic concepts aren't all that easy to explain. But you can think of it like this: in both cases, they behave like arrays, only they reuse the memory of an existing array instead of allocating their own.
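To make the "reuses an existing array" part concrete, here is a minimal sketch. The class and method names (`SliceDemo`, `WriteThroughSlice`) are made up for illustration; only `AsSpan`, `AsMemory`, and `Memory<T>.Span` come from the framework.

```csharp
using System;

public static class SliceDemo
{
    // Hypothetical helper: writes through two slices and returns the backing array.
    public static int[] WriteThroughSlice()
    {
        int[] numbers = { 1, 2, 3, 4, 5 };

        // A view over elements 1..3 of the same array; nothing is copied.
        Span<int> middle = numbers.AsSpan(1, 3);
        middle[0] = 20; // writes to numbers[1]

        // Memory<T> can be stored on the heap; its Span property gives the same kind of view.
        Memory<int> tail = numbers.AsMemory(3);
        tail.Span[0] = 40; // writes to numbers[3]

        return numbers; // now { 1, 20, 3, 40, 5 }
    }
}
```

Because both slices point into `numbers`, the writes show up in the original array, and no second array is ever allocated.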

Now, more specifically, these are used for two scenarios. First and foremost, they are an optimization tool. Because you're not allocating on every slice, you can wind up using way less memory, which means less time spent allocating objects on the heap, less time the garbage collector spends doing work, and more. But there's another purpose I don't think Microsoft mentioned anywhere near well enough: they provide a generalization of contiguous data that required duplication of code before. What do I mean by this? Well, thinking of Span<T> as a reference to part of an array is only part of the picture. It's a reference to any contiguous memory. What does this mean? Well, heap arrays, stackalloc arrays, fixed buffers, even fat pointers (a pointer paired with a length): they can all be referenced through one common type, Span<T>.
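As a rough sketch of that generalization, the single `Sum` method below accepts a heap array, a stackalloc buffer, and a slice without any overloads. `ContiguousDemo` and its methods are hypothetical names for illustration. Fixed buffers and raw pointers work the same way but need an unsafe context, so they are left out here.

```csharp
using System;

public static class ContiguousDemo
{
    // One implementation for any contiguous run of ints.
    public static int Sum(ReadOnlySpan<int> values)
    {
        int total = 0;
        foreach (int value in values)
        {
            total += value;
        }
        return total;
    }

    public static int SumHeapArray()
    {
        int[] heap = { 1, 2, 3, 4 };
        return Sum(heap); // int[] converts implicitly to ReadOnlySpan<int>
    }

    public static int SumStackBuffer()
    {
        // stackalloc assigned to a Span<T> needs no unsafe context.
        Span<int> stack = stackalloc int[] { 10, 20, 30 };
        return Sum(stack); // Span<int> converts implicitly to ReadOnlySpan<int>
    }

    public static int SumSlice()
    {
        int[] heap = { 1, 2, 3, 4 };
        return Sum(heap.AsSpan(1, 2)); // only the elements 2 and 3
    }
}
```

Before spans, covering these cases typically meant an overload per storage kind, or copying everything into an array first.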

This means, for all practical purposes, ReadOnlySpan<Char> is a String. But even better, it's a "string" that can reuse portions of existing strings! And using it allows for a much more comprehensive API surface, because things that are conceptually the same can be used as the same thing, through one single method.
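Here is a small sketch of what that looks like in practice. `TextDemo` and `CountDigits` are invented for this example and are not part of Stringier; the point is only that one method taking `ReadOnlySpan<char>` can serve strings, substrings, and char arrays alike.

```csharp
using System;

public static class TextDemo
{
    // Counts ASCII digits in any contiguous run of chars.
    public static int CountDigits(ReadOnlySpan<char> text)
    {
        int count = 0;
        foreach (char c in text)
        {
            if (c >= '0' && c <= '9')
            {
                count++;
            }
        }
        return count;
    }

    // A string converts implicitly to ReadOnlySpan<char>.
    public static int FromString() => CountDigits("abc123");

    // AsSpan(4) views "23" inside the original string; no new string is allocated.
    public static int FromSubstring() => CountDigits("abc123".AsSpan(4));

    // A char[] converts implicitly as well.
    public static int FromCharArray() => CountDigits(new[] { 'a', '1' });
}
```

Compare `"abc123".AsSpan(4)` with `"abc123".Substring(4)`: the second allocates a new string just to hand it to the method, while the first only points into the one that already exists.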

Now, this isn't entirely perfect. I've got a lot of complaints about the way generics are handled in many programming languages, and as you've probably noticed, some of the problems Span<T> solves are clearly workarounds for the flawed C# generics system. But hey, it still means way less duplication of code, and a way better API surface.

Simply put, Memory<T> and Span<T> are abstractions and generalizations that, unusually for abstractions, also allow for more efficient code.

Oldest comments (4)

Jakob Christensen • Aug 29 '20

I am curious. In what ways do you consider C# generics to be flawed?

Patrick Kelly • Aug 29 '20

I'll have to do a deep dive into this sometime, but the gist is this:

They are far too limited in what you can express. Two things I've seen in other languages, and greatly appreciated, are generic value parameters, like an integer value as part of the template, and generic subroutine parameters, which are almost like a callback, but instead of taking a subroutine pointer, the code is actually templated on the subroutine instance. F# has the latter, which is a delight when working in it. Ada supports both, but lacks inline instantiation, which is a huge problem. The broader generic constraints are also too vague: for example, it's impossible to write a generic function for any numeric type, something that's possible in both Ada and F#.

Jakob Christensen • Aug 31 '20

Thanks, Patrick.

I am not sure I am smart enough to fully understand what you are saying. You ought to write a post on your points regarding generics in F# vs. C# 😉

Patrick Kelly • Sep 9 '20

Done! dev.to/entomy/generics-systems-83n