DEV Community

darkmage

A Weird Way to Substring in C++

/*

author:  mike bell 
twitter: @therealdarkmage
date:    Fri Aug 9 5:12 PM
website: https://mikebell.xyz

I was playing around and discovered a weird way to perform a substring on a string in C++.
Index into the string at the position to substring from, then take the address of that character.
Works with constant and arbitrary strings! Cool/weird!
Compiles on current macOS g++ as of this writing.
*/
#include <iostream>
#include <string>
using std::string;
using std::cout;
using std::endl;
int main() {
    string s = &"Hello, World"[7];
    string ss = &s[2];
    cout << s << endl; // prints "World"
    cout << ss << endl; // prints "rld"
    return 0;
}

Top comments (4)

Jason C. McDonald • Edited

The second use, ss, isn't particularly weird, actually. std::string is basically a wrapper around a c-string (char array).

A few things to consider:

  • C-strings are a linear data structure, stored in contiguous memory (as opposed to a linked list).

  • C-strings have to end with the null terminator \0, which marks the end of the string. If a c-string lacked this, there would be no way to know when to stop; the actual length of a c-string is not stored anywhere. (A std::string, on the other hand, does store its length, so size() doesn't have to scan for the terminator.)

  • Applying the [] operator to a pointer is just pointer arithmetic plus a dereference: p[n] is the same as *(p + n).

  • & before an expression returns its memory address. Because [] binds more tightly than &, &s[2] means &(s[2]): the address of the third character in the string's internal c-string, not the address of the s object itself.
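The points above can be checked with a small sketch. The helper names (scan_length, skip, points_into_buffer) are made up for illustration and are not part of any library; the sketch assumes C++11 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: finds the length the way C code has to, by walking
// forward until it reaches the null terminator.
std::size_t scan_length(const char* p) {
    std::size_t n = 0;
    while (p[n] != '\0') {
        ++n;
    }
    return n;
}

// Hypothetical helper: &p[n] and p + n name the same address.
const char* skip(const char* p, std::size_t n) {
    assert(&p[n] == p + n);
    return &p[n];
}

// Hypothetical helper: &s[n] parses as &(s[n]), so it is the address of a
// character inside the string's buffer. Only valid for n <= s.size().
bool points_into_buffer(const std::string& s, std::size_t n) {
    assert(n <= s.size());
    return &s[n] == s.data() + n;
}
```

For instance, skip("Hello, World", 7) points at the W, which is why constructing a std::string from it yields "World".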

The "magic" is all happening on string ss = &s[2];...

  1. s[2] runs first, because [] binds more tightly than &. std::string's operator[] returns a reference to the third character of the c-string inside of s.

  2. & takes the address of that character. That means you're pointing to the third character in memory now, two past the beginning of the c-string.

  3. The c-string is read starting at that pointer, and goes until it encounters the \0.

  4. Said c-string is passed to the constructor for std::string, and is used to create ss.
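One way to spell out what string ss = &s[2]; does, with one operation per line. This is only a sketch: tail_by_address is a hypothetical name, and the bounds assert is an addition that the original one-liner does not have.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper that unrolls `std::string ss = &s[pos];`.
std::string tail_by_address(const std::string& s, std::size_t pos) {
    assert(pos <= s.size());   // s[s.size()] is the terminator, still valid
    const char& ch = s[pos];   // operator[] yields a reference to one char
    const char* p = &ch;       // its address points into the internal buffer
    std::string ss = p;        // the constructor copies until it hits '\0'
    return ss;
}
```

Called as tail_by_address("World", 2), it goes through the same operations as the original line and produces "rld".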

My only caution with this method is that it relies on std::string storing its characters contiguously with a null terminator at the end. The standard has guaranteed that since C++11, but if you use another string class, it might not behave the same.

The safer way to do the same thing is to save the pointer to the c-string inside of s via s.c_str(), and then work with that pointer directly. And even then, you're still not as safe as if you just used std::string's own member functions, such as s.substr(2), because if you get your pointer arithmetic wrong, you're going to have memory errors.
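A minimal sketch of the two safer routes mentioned here, assuming C++11 or later. The function names tail_substr and tail_c_str are hypothetical; substr, c_str and std::out_of_range are standard.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Member-function route: substr checks the position for you and throws
// std::out_of_range when pos > s.size().
std::string tail_substr(const std::string& s, std::size_t pos) {
    return s.substr(pos);
}

// Pointer route: c_str() gives the internal c-string, but the bounds check
// is now your job. Without it, a bad pos reads past the buffer.
std::string tail_c_str(const std::string& s, std::size_t pos) {
    if (pos > s.size()) {
        throw std::out_of_range("pos is past the end of the string");
    }
    return std::string(s.c_str() + pos);
}
```

The two agree for strings without embedded '\0' characters. If the string does contain one, the pointer route stops at it while substr keeps going, which is one more reason to prefer the member function.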

Barney Wilks

A "fun" side effect of the C++ (and C) subscript operator being just pointer arithmetic is that

const char* array[] = { "Well", "This", "Is", "Strange" };

// value contains "Strange"
const char* value = (1 + 1) [array + 1];

is actually valid C++ 😆.

Because x [y] is the same as *(x + y), that means that

const char* x = (1 + 1) [array + 1];

is just the same as

const char* x = *(1 + 1 + array + 1);
// Or
const char* x = *(array + 3); // array [3]
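A quick way to convince yourself that all of these spellings land on the same element. This is just a sketch; subscript_forms_agree is a made-up name.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical check: every spelling below reads array[3].
bool subscript_forms_agree() {
    const char* array[] = { "Well", "This", "Is", "Strange" };

    const char* a = (1 + 1)[array + 1];  // *((1 + 1) + (array + 1))
    const char* b = *(array + 3);
    const char* c = array[3];
    const char* d = 3[array];            // operands swapped, same element

    return a == b && b == c && c == d && std::strcmp(a, "Strange") == 0;
}
```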

It's worth noting that this only works where the subscript operator really is pointer arithmetic (C-style arrays and pointers, mostly, I think) and not where the [] operator is overloaded, so you can't do

std::string v = "Hello World!";
char second_letter = 1 [v];
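For contrast, a sketch of what does compile. The swapped form is fine again once you are holding a raw pointer, for example the one returned by c_str(). The function name second_letter_demo is hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical demo: the overloaded operator[] needs the string on the
// left, while the built-in subscript on a raw pointer accepts either order.
char second_letter_demo() {
    std::string v = "Hello World!";

    // char bad = 1[v];               // does not compile: 1 is an int, and
    //                                // v does not convert to a pointer
    char via_member = v[1];           // std::string::operator[]
    char via_pointer = 1[v.c_str()];  // built-in subscript on const char*

    assert(via_member == via_pointer);
    return via_member;
}
```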

So I guess that means the original example of

string s = &"Hello, World"[7];
string ss = &s[2];

can be rewritten as

std::string s = & (2 * 3) ["Hello, World" + 1];

if you were so inclined 😆.
Sometimes I worry about C++ - it doesn't exactly help itself at times... 😄
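Under the x[y] == *(x + y) rule, any way of splitting the index 7 between the two operands reaches the same character, as long as every intermediate pointer stays inside the literal (forming a pointer before its first character is undefined behavior). A sketch, with made-up function names:

```cpp
#include <cassert>
#include <string>

// Hypothetical variants of `&"Hello, World"[7]`.
std::string world_plain() {
    const char* lit = "Hello, World";
    return &lit[7];
}

std::string world_swapped() {
    const char* lit = "Hello, World";
    return &(2 * 3 + 1)[lit];    // 7[lit] is *(7 + lit)
}

std::string world_split() {
    const char* lit = "Hello, World";
    return &(2 * 3)[lit + 1];    // *(6 + (lit + 1)), still index 7
}
```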

Obligatory godbolt for this:
godbolt.org/z/nHJOgs

Jason C. McDonald

Sometimes I worry about C++ - it doesn't exactly help itself at times... 😄

Yes, but at least we get to play with esoteric hackery without needing a different language.

...and sometimes, ever so rarely, if the stars are aligned, that hackery comes in handy.

darkmage

I knew about the pointer arithmetic aspect from college, but I had no idea you could also do it with string constants! Thank you for the explanation!
