DEV Community

loading...

Learning Algorithms and Data Structures: Strings

lareenmelo profile image Lareen Melo ✨ ・5 min read

A string is a data-type in programming that holds a sequence of characters. More fancily said, "a string is often implemented as an array of bytes (or words) that stores a sequence of elements (typically characters) using some sort of character encoding".

Depending on the programming language, strings can either be mutable or immutable. When a string is immutable it means that it's fixed allocated. So your string it's unique and can't be modified. If you change it, a copy of that string is created and the original string is deallocated instead. When a string is mutable it means that it's dynamically allocated, so your string can be modified (or not, it all depends on how you declare it and if the programming language you're using allows it).

Strings are interesting.

Alt Text

Handling strings is a must in programming. Companies love to do tricky things with them when interviewing simple mortals like me. Reverse a string, split a string after a special character, palindromes, anagrams... this is why strings are important. Because there are so many ways we can manipulate this data type that we can easily go wrong and make our algorithm become inefficient.

Strings behavior

For that reason, I spent a few days trying to understand how strings work in C++ and Swift (I decided to add Swift to the challenge).

Strings in Swift ♥️

In Swift, strings conform to the Collection Type, meaning that they possess similar behaviors as arrays, dictionaries, and sets. Strings in Swift are mutable unless defined as a constant (let). They're also a value type, so if you create a string, that object will get copied whenever it's assigned or passed to a function.

To access the characters within the string you can either use subscript syntax or the methods and properties within the String class. Since the string type conforms to the collection type, to access the characters within a string you have to use indices.

However, due to how Swift handles strings and characters, you must iterate over each element from start to end to find the element you're looking for. Which is why you can't access characters within a string using integer values. Swift documents this very well but here's the overall comment as to why this is a thing: "[...] different characters can require different amounts of memory to store, so in order to determine which Character is at a particular position, you must iterate over each Unicode scalar from the start or end of that String. For this reason, Swift strings can’t be indexed by integer values.”

Strings in C++ 💖

I learned that there are two types of ways we can work with strings in C++. First, using the C string. The C string is an array of characters terminated by the null char "\0". The second way is using the String class from STL. Apparently, the String class is pretty efficient when handling strings and also provides with a handful of functions to manipulate strings, however with C strings we don't really get those functions so that's a limitation. Also, the C strings are fixed-sized, whereas using the String class provides both dynamic and fixed allocation for strings.

Strings vs. char arrays

Char arrays: characteristics.

  • Static-sized
  • Fast access
  • Few built-in methods to manipulate strings
  • A char array doesn’t define a data type

Strings: characteristics.

  • Slower access
  • Define a data type
  • Dynamic allocation
  • More built-in functions to support string manipulations

Operations

There are many operations and functions we can use on strings. Some languages have custom functions to handle strings, and some barely have any but basic operations. Here are a few of the standard and basic operations there are to manipulate and handle strings that are supported by almost all of the programming languages there are.

  • Concatenation: operation of joining character strings.

    • on mutable strings: string concatenation on mutable strings is dynamic. No new memory allocation will be made if concatenating an already declared string, the reference to that string will be modified or expanded.
    • on immutable strings: since you cannot edit the value of an immutable string, each concatenation results in a new initialized string in memory.
  • Access characters: access the string to search or retrieve a character.

  • Size: gets the length of the string. In Swift it's important to understand that to get the length of a string you use the .count method of the Collection Type, but if you're going to check if a string it's empty use the .isEmtpy property because it's more efficient.

  • Substrings: a contiguous sequence of characters within a string. It's important to understand the behavior of substrings of your selected programming language(s). Common substrings are suffixes and prefixes.

    • Suffix: a substring that occurs at the end of a string.
    • Prefix: a substring that occurs at the beginning of a string.

Substrings

Swift substrings ♥️

Substrings in Swift should be treated carefully because they can cause a memory leak. We should only use substrings for a short time in our code and then assign it as a String to some variable/constant because they reuse the storage of the original string, so the entire string must be kept in memory as long as any of its substrings are being used. They have the same methods as the String type.

Apple has really good documentation on Substrings: https://docs.swift.org/swift-book/LanguageGuide/StringsAndCharacters.html

C++ substrings 💖

Substrings in C++ are pretty straightforward. A substring is a copy of the string. So creating a substring is creating an initialized String object.

The StringBuilder/StringBuffer issue.

When I was reviewing the data structures chapter on CTCI I read about StringBuilder and StringBuffer for the first time. These are solutions to programming languages that have a default immutable string implementation, for instance, Java or C#.

In Java, strings are immutable by default. Immutable strings are beneficial because you can assure the state of your string and hence share it across, either threads or functions and not worry much.

But once you have to deal with string manipulation, immutable strings aren't really useful. Java generates a new string and then discards the older string for its garbage collector, which is a heavy operation and generates a lot of garbage in heap.

A solution to immutable strings in Java is StringBuffer and StringBuilder, which handle strings in a mutable way and provides efficient string manipulation methods.

We could say immutable strings are like your normal, fixe-sized array and mutable strings are the dynamic-sized implementation of that array.

Problem-solving 🔥

Noticed some progress.

  • Managed to solve 5 problems in a day.
  • Practiced first thinking, then coding.

I still need to work on thinking first because I tend to leave things at a general level and then when I have to write the code for a specific part of it, I waste my time thinking instead of having it already written down. Also, I need to learn how to stop testing every single thing I code because, in real life, I won't have all that time.

ON TO THE NEXT ONE: Linked Lists.

Alt Text

If you've read something that I didn't quite manage to explain well or something I said that's just wrong, please let me know.

Follow my competitive programming journey on https://github.com/lareenmelo/algorithms 💓

Discussion (1)

Collapse
kyonru profile image
Roberto Junior Amarante Calderón

o: I learned a lot about strings.... Thanks for adding Swift!
I really like your posts! :fireemoji:

Forem Open with the Forem app