Optimise your String Algorithms in Java

Taosif Jamal — Sat, 10 Sep 2022 18:39:53 +0000

While practicing DSA string problems, I kept looking for ways to make my algorithms faster at scale. I often had to choose between one approach and another, so I would check the time and space complexities of the string operations involved. It took a few Google searches to find a satisfactory explanation each time, and I had to search again for every operation.

To spare you this pain and save you time, I am listing the time and space complexities of various string operations with straightforward explanations. Note that all of this is with respect to Java SE 7 (version 1.7).

Instantiation: new String("a")

Time Complexity: O(n)

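Constructing a string from characters, as in new String(char[]), allocates a char array and copies every character into it. Note that new String("a") on an existing string may simply share that string's backing array in OpenJDK, which is cheaper, but it still creates a needless extra object.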
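To make the copying concrete, here is a small sketch. The class and method names (InstantiationDemo, fromChars) are made up for illustration; only the String constructors themselves come from the JDK.

```java
final class InstantiationDemo {
    // Hypothetical helper: builds a String from a char array.
    // The constructor copies the characters, which is the O(n) part.
    static String fromChars(char[] chars) {
        return new String(chars);
    }

    public static void main(String[] args) {
        char[] chars = {'a', 'b', 'c'};
        String s = fromChars(chars);
        chars[0] = 'z'; // mutating the source array does not affect the String
        System.out.println(s); // abc

        String literal = "a";
        String copy = new String("a");
        System.out.println(literal.equals(copy)); // true: same contents
        System.out.println(literal == copy);      // false: a distinct object
    }
}
```

In practice you rarely need new String("a") at all; the literal "a" is already a String.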

Concatenation: "a" + "b"

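Time Complexity: O(m + n) for one concatenation, where m and n are the lengths of the two strings

Strings are immutable in Java, so every time you concatenate, a new buffer is created and the contents of both strings are copied over.

Example: Let's say you concatenate two variables holding "abc" and "def"
Under the hood, Java is performing these operations:
• Allocate a new buffer
• Copy "abc" into the buffer
• Copy "def" after it
• Build a new string "abcdef" from the buffer
So one concatenation copies m + n characters. The trouble starts in loops: each += copies everything accumulated so far, so building a string of final length n by appending one character at a time copies 1 + 2 + ... + n characters, which is O(n^2).

Space Complexity: O(m + n) for the result. In a loop, every intermediate string is a separate allocation that quickly becomes garbage.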

Better Approach: StringBuilder

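Use StringBuilder with the .append() method. It keeps one internal buffer that grows only on demand, so appending strings of lengths m and n costs amortized O(m + n), and the total stays linear in the final length even inside a loop.
String.concat is another option for joining two strings, but it allocates a new string on every call, so it does not help with repeated concatenation.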
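A minimal sketch of the two approaches side by side. ConcatDemo and its method names are hypothetical; both methods return the same result and differ only in how much copying they do.

```java
final class ConcatDemo {
    // Repeated +=: every iteration copies everything accumulated so far.
    static String repeatWithPlus(String piece, int times) {
        String result = "";
        for (int i = 0; i < times; i++) {
            result += piece;
        }
        return result;
    }

    // StringBuilder: appends into one growable buffer, then copies once in toString().
    static String repeatWithBuilder(String piece, int times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append(piece);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(repeatWithPlus("ab", 3));    // ababab
        System.out.println(repeatWithBuilder("ab", 3)); // ababab
    }
}
```

For a handful of pieces the difference is negligible; it starts to matter when the loop runs thousands of times.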

.length()

Time Complexity: O(1)

Because the string stores its length, so nothing has to be counted.

.equals()

Time Complexity: O(n)

In most cases it is actually faster than O(n), because it does some cheap checks first: if both references point to the same object it returns true, and if the lengths differ it returns false immediately. It also compares characters only up to the first mismatch. In the worst case, though, it is O(n).

Space Complexity: O(1)
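The checks described above can be sketched roughly like this. This is an illustration of the idea, not the JDK source; EqualsSketch and contentEquals are hypothetical names.

```java
final class EqualsSketch {
    // Compares two strings the way the section describes:
    // cheap checks first, then a linear scan that stops at the first mismatch.
    static boolean contentEquals(String a, String b) {
        if (a == b) {
            return true;               // same object (or both null)
        }
        if (a == null || b == null) {
            return false;
        }
        if (a.length() != b.length()) {
            return false;              // O(1) early exit
        }
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return false;          // stop at the first difference
            }
        }
        return true;                   // worst case: all n characters compared
    }

    public static void main(String[] args) {
        System.out.println(contentEquals("java", "java"));  // true
        System.out.println(contentEquals("java", "javac")); // false
    }
}
```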

.charAt()

Time Complexity: O(1)

In Java, strings are backed by a char array, and arrays support random access. Hence the character at any index can be read in O(1).

.substring()

Time Complexity: O(n)

As mentioned before, strings are immutable in Java, so creating a substring of length n means creating a new string instance. Under the hood, Java allocates a char array of length n and copies the characters from the original string into it. (This holds from Java 7 update 6 onwards; earlier versions shared the original's backing array, which made substring O(1).)

Space Complexity: O(n)

It'll need space to store the contents of the new array.
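When you only need to compare a slice of a string, you can often avoid the copy entirely. The sketch below counts occurrences of a word in two ways; SubstringDemo and its method names are hypothetical, while substring and regionMatches are standard String methods.

```java
final class SubstringDemo {
    // Allocates a new String at every position before comparing.
    static int countWithSubstring(String text, String word) {
        int count = 0;
        for (int i = 0; i + word.length() <= text.length(); i++) {
            if (text.substring(i, i + word.length()).equals(word)) {
                count++;
            }
        }
        return count;
    }

    // Compares in place with regionMatches, so no substring is created.
    static int countWithRegionMatches(String text, String word) {
        int count = 0;
        for (int i = 0; i + word.length() <= text.length(); i++) {
            if (text.regionMatches(i, word, 0, word.length())) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWithSubstring("aaaa", "aa"));     // 3
        System.out.println(countWithRegionMatches("aaaa", "aa")); // 3
    }
}
```

Both versions do the same number of character comparisons; the second just skips the extra allocation and copy per position.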

.toCharArray()

Time Complexity: O(n)

It should be O(1) since a string is nothing but a char array, right? Yes, but no. The toCharArray method copies the contents of the string into a new char array, hence O(n).

Space Complexity: O(n)

Obviously, it'll need space to store the contents of the new array.
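A typical use of this copy is in-place work that a String itself does not allow. Here is a small sketch that reverses a string; CharArrayDemo and reverse are hypothetical names.

```java
final class CharArrayDemo {
    // Reverses a string by swapping characters in a private copy.
    static String reverse(String s) {
        char[] chars = s.toCharArray();        // O(n) copy of the contents
        for (int i = 0, j = chars.length - 1; i < j; i++, j--) {
            char tmp = chars[i];
            chars[i] = chars[j];
            chars[j] = tmp;
        }
        return new String(chars);              // another O(n) copy
    }

    public static void main(String[] args) {
        String original = "hello";
        System.out.println(reverse(original)); // olleh
        System.out.println(original);          // hello (unchanged)
    }
}
```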

.contains() and .indexOf()

Time Complexity: O(mn)

.contains() calls the .indexOf() method and returns true if the index is greater than -1.
It is quite surprising that Java uses a naive method (a loop inside a loop) to find one string in another, which is O(mn) in the worst case for a text of length n and a pattern of length m. Several algorithms, such as KMP, do the same job in O(m + n), but they carry extra space and setup costs. The engineers at Sun/Oracle presumably tested various algorithms and decided that the naive method works best on average across typical inputs.

Space Complexity: O(1)
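The "loop inside a loop" idea can be sketched as follows. This is a simplified illustration of naive search, not the JDK's actual code; NaiveSearch is a hypothetical class name.

```java
final class NaiveSearch {
    // Returns the first index where pattern occurs in text, or -1.
    // Outer loop: candidate start positions. Inner loop: compare characters.
    static int indexOf(String text, String pattern) {
        int n = text.length();
        int m = pattern.length();
        for (int i = 0; i + m <= n; i++) {
            int j = 0;
            while (j < m && text.charAt(i + j) == pattern.charAt(j)) {
                j++;
            }
            if (j == m) {
                return i; // every character of pattern matched at position i
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("hello world", "world")); // 6
        System.out.println(indexOf("hello world", "xyz"));   // -1
    }
}
```

The worst case shows up with inputs like text "aaaaaaab" and pattern "aab", where the inner loop nearly completes at every position before failing.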

.replace(char, char)

Time Complexity: O(n)

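It goes through each character in the string and replaces every occurrence of the old character with the new one. There is one optimisation: if the old character does not occur at all, the original string is returned and nothing is allocated.

Space Complexity: O(n)

Since strings are immutable, a new string needs to be created whenever something is actually replaced.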
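A quick sketch showing both cases: a replacement that creates a new string, and one that finds nothing to replace. ReplaceCharDemo and dashesToUnderscores are hypothetical names; the behaviour of returning the same reference when the character is absent is stated in the String.replace(char, char) documentation.

```java
final class ReplaceCharDemo {
    // Replaces every '-' with '_' in a single pass.
    static String dashesToUnderscores(String s) {
        return s.replace('-', '_');
    }

    public static void main(String[] args) {
        System.out.println(dashesToUnderscores("kebab-case-name")); // kebab_case_name

        String original = "snake_case";
        String result = dashesToUnderscores(original);
        // No '-' present, so the very same object comes back.
        System.out.println(result == original); // true
    }
}
```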

.replace(target, replacement) and .replaceAll(regex, replacement)

Time Complexity: Oh(it's complicated)

Note that .replace(target, replacement) treats its target as a literal string, while .replaceAll() treats it as a regular expression. In Java 7, though, both go through the Pattern class under the hood to find matches and then replace them, whether the target is a simple string like "red" or a complex pattern. Let's try to understand the nature of the time complexity here.
Suppose your regex is he(ll|ter|r)o
It can match hello, hetero, and hero in your string.

The regex engine goes through the string, and when it encounters he it matches it. Right after that it tries to match ll; if that succeeds, it looks for o. If ll or o is not found, it goes back and tries ter instead, and so on. You get the point.

Well, the time complexity of this function is not in our control, but we can write our regex so that it backtracks less. In the previous example, the shared prefix he and suffix o are factored out, so the engine matches them only once; the flat form (hello|hetero|hero) would generally re-match the prefix for every alternative it tries. Putting the most likely alternative first also saves attempts. Here's a good resource if you want to learn more about optimising regex patterns.
Bonus: I found a tool to Visualise Regex

Space Complexity: O(n)

Remember we can’t mutate a string in Java?
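One thing that is in our control is how often the pattern gets compiled. Here is a sketch, assuming the same pattern is reused many times: compile it once into a Pattern and reuse it. RegexReplaceDemo and maskGreetings are hypothetical names.

```java
import java.util.regex.Pattern;

final class RegexReplaceDemo {
    // Compiled once and reused; String.replaceAll would recompile on every call.
    private static final Pattern GREETING = Pattern.compile("he(ll|ter|r)o");

    // Replaces every match of the pattern with "X".
    static String maskGreetings(String input) {
        return GREETING.matcher(input).replaceAll("X");
    }

    public static void main(String[] args) {
        System.out.println(maskGreetings("say hello to my hero")); // say X to my X

        // replace() takes the target literally; replaceAll() takes a regex.
        System.out.println("a.b.c".replace(".", "-"));    // a-b-c
        System.out.println("a.b.c".replaceAll(".", "-")); // -----
    }
}
```

The last two lines are worth remembering: in a regex, "." matches any character, so replaceAll replaces all five characters.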

.split(separator)

If the separator is a single character and not one of ".$|()[{^?*+\":

Time Complexity: O(n)

Space Complexity: O(n)

Otherwise (for example, a multi-character separator or one of the special characters above), Java compiles the separator into a regex Pattern to find the split points, and as discussed in the .replace() section, its time complexity depends on the nature of the pattern.

Time Complexity: Oh(it's complicated)

Space Complexity: O(n), because, y'know, immutable
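The special characters are also a common source of bugs, not just slowness. Here is a small sketch of the different cases; SplitDemo and splitOnDoubleColon are hypothetical names, and the precompiled Pattern is just one way to avoid recompiling a multi-character separator on each call.

```java
import java.util.regex.Pattern;

final class SplitDemo {
    // Pattern.quote makes the separator literal; compiling once avoids repeat work.
    private static final Pattern DOUBLE_COLON = Pattern.compile(Pattern.quote("::"));

    static String[] splitOnDoubleColon(String input) {
        return DOUBLE_COLON.split(input);
    }

    public static void main(String[] args) {
        System.out.println("a,b,c".split(",").length);   // 3: plain single character
        System.out.println("a.b.c".split(".").length);   // 0: "." is a regex wildcard
        System.out.println("a.b.c".split("\\.").length); // 3: escaped dot
        System.out.println(splitOnDoubleColon("a::b::c").length); // 3
    }
}
```

The second line returns an empty array because every character matches the separator, leaving only empty strings, which split drops from the end of the result.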

.toLowerCase() and .toUpperCase()

Time Complexity: O(n)

Space Complexity: O(n)

Straightforward: each character of the string is checked one by one, and the result is stored in a new string.
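If the only reason for lowercasing is a case-insensitive comparison, the copies can be skipped. Here is a sketch, assuming plain ASCII text (the two approaches can disagree for some locale-sensitive characters); CaseCompareDemo and its method names are hypothetical.

```java
final class CaseCompareDemo {
    // Creates two temporary lowercase strings before comparing: O(n) extra space.
    static boolean sameIgnoringCaseWithCopies(String a, String b) {
        return a.toLowerCase().equals(b.toLowerCase());
    }

    // Compares character by character without creating new strings.
    static boolean sameIgnoringCase(String a, String b) {
        return a.equalsIgnoreCase(b);
    }

    public static void main(String[] args) {
        System.out.println(sameIgnoringCaseWithCopies("Hello", "HELLO")); // true
        System.out.println(sameIgnoringCase("Hello", "HELLO"));           // true
        System.out.println(sameIgnoringCase("Hello", "World"));           // false
    }
}
```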

That's all, folks. I hope this article helps you optimise your string functions. Did I miss any essential function? Let me know.
Thanks.

DEV Community: Taosif Jamal
