## DEV Community

CodeWithML

Posted on • Updated on

# String Compression

### Problem statement : Write a function to Compress a given string. Only return the compressed string if it saves space.

#### Test Cases

• 'AAABAACCDDDD' ---> 'A3BA2C2D4'
• 'BAAACCDDDD' ---> 'BA3C2D4'
• 'AABBCC' --> 'AABBCC'
• 'BBBAACCCCDD' --> 'B3A2C4D2'

#### Algorithm

• Setup a pointer at first character of string, initialise an empty string and a counter.
• Iterate through the string and check if character is equal to the first character of the string.
• if both are equal increment counter.
• Else,

• add the first character and it's count to result string.
• if count is greater than 1 then only add count to the result string.
• Else, add an empty string to the result string.
• now, the first character will be the current character.
• reset the counter value to 1.
• Add the last character and it's count to result string.

• Return the resultant compressed string if it's length is less than the given string, else return the given string.

#### Time and Space Complexity

• Time complexity: O(n)
• Space complexity: O(n)

### Code

``````class CompressString(object):
def compress(self, input):
if input is None or not input:
return input
result = ''
count = 0
prev_char = input[0]
for char in input:
if char == prev_char:
count += 1
else:
result += prev_char + (str(count) if count > 1 else '')
prev_char = char
count = 1
result += prev_char + (str(count) if count > 1 else '')
return result if len(result) < len(input) else input
``````

More precise compression

``````class CompressString(object):
def compress(self, input):
if input is None or not input:
return input
result = ''
count = 0
prev_char = input[0]
for char in input:
if char == prev_char:
count += 1
else:
if count > 2:
result += prev_char + (str(count))
prev_char = char
count = 1
elif count == 1:
result += prev_char
count = 1
prev_char = char
else:
result += prev_char
result += prev_char
count = 1
prev_char = char
result += prev_char + (str(count) if count > 2 else prev_char)
return result if len(result) < len(input) else input
``````

### Test

``````import unittest
from compressString import CompressString

class TestCompress(unittest.TestCase):

def testCompress(self, func):
self.assertEqual(func(None), None)
self.assertEqual(func(''), '')
self.assertEqual(func('AABBCC'), 'AABBCC')
self.assertEqual(func('AAABCCDDDDE'), 'A3BC2D4E')
self.assertEqual(func('BAAACCDDDD'), 'BA3C2D4')
self.assertEqual(func('AAABAACCDDDD'), 'A3BA2C2D4')
print('Success: testCompress')

def main():
test = TestCompress()
compress_string = CompressString()
test.testCompress(compress_string.compress)

if __name__ == '__main__':
main()
``````

Original article

Github solution

### Happy Coding !

Ryszard Grodzicki

Simple improvement proposal: you could remove the last check by replacing chars only if there are more than 2 in a row.
Replacing 'CC' with 'C2' never saves space. Replacing more than 2 characters always saves space. So you would either return the same input string or return compressed string that's shorter than original.

CodeWithML

I could do something like if count is more than 2 than add the character with count, if count is 1, than only add the character , and lastly if count is exactly 2, then add the character 2 times, so it would remain same as original string.

CodeWithML

I didn't know that, looks really neat. Thanks for the tip. ðŸŒŸ