DEV Community

Christian Heilmann

Code puzzle - GSM-7 or UCS-2?

Intro

SMS messages can be encoded in different character encodings to optimize space. The two main types of encodings are:

  • GSM-7: This encoding packs the most commonly used letters and symbols in many languages into 7 bits per character. This allows up to 160 characters in a single SMS segment.
  • UCS-2: This encoding uses 16 bits per character and is the fallback for characters not supported by GSM-7. This allows up to 70 characters in a single SMS segment.

When an SMS message contains even one character outside the GSM-7 character set, the whole message must be encoded in UCS-2. This cuts the capacity of a single SMS segment from 160 characters to 70.
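The 160 and 70 figures can be checked with a little arithmetic. The sketch below assumes the standard 140-octet user-data payload of a single SMS segment, a number that is not stated in this post; the names `SEGMENT_PAYLOAD_BITS` and `chars_per_segment` are made up for illustration.

```python
# Assumption: one SMS segment carries 140 octets (1120 bits) of user data.
SEGMENT_PAYLOAD_BITS = 140 * 8


def chars_per_segment(bits_per_char: int) -> int:
    """How many fixed-width characters fit into a single segment."""
    return SEGMENT_PAYLOAD_BITS // bits_per_char


print(chars_per_segment(7))   # GSM-7: 160
print(chars_per_segment(16))  # UCS-2: 70
```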

Don’t worry: when you use Twilio Programmable Messaging, this is done automatically for you :)

Challenge

Write a function that, given a text string representing the body of an SMS, determines:

  1. Whether the message can be encoded using GSM-7 or requires UCS-2 encoding.
  2. The length of the message in bits for the required encoding.

Input

  • A text string representing the body of an SMS.

Output

  • The encoding type (GSM-7 or UCS-2).
  • The length of the message in bits.

Example Input / Output

  • “Visit the Twilio booth at Hall A 03 during WeAreDeveloper World Congress” => GSM-7, 504 bits
  • “Rumors say there will be free healthy smoothies at the Twilio booth 🥤🍓🍍” => UCS-2, 1184 bits
  • “Ahoy World” => GSM-7, 70 bits
  • “This is a test message with special characters: ñáéíóú.” => UCS-2, 880 bits

These test cases are available in the dataset.json file; use them while building your function.
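Here is one possible approach, sketched in Python under a few assumptions. The function name `analyze_sms` and the two constants are made up for illustration. The alphabet is taken from the GSM 03.38 basic and extension tables. Characters from the extension table are counted as two septets (escape plus character), which none of the examples above exercise, so treat that part as a guess about the expected output. For UCS-2 the sketch counts 16-bit code units, so an emoji outside the Basic Multilingual Plane counts twice.

```python
# Basic GSM 03.38 alphabet: each character costs one septet (7 bits).
GSM7_BASIC = frozenset(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension table: sent as an escape septet plus the character itself.
# Assumption: such a character therefore counts as 14 bits.
GSM7_EXTENDED = frozenset("\f^{}\\[~]|€")


def analyze_sms(body: str) -> tuple[str, int]:
    """Return (encoding, length_in_bits) for an SMS body."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in body):
        septets = sum(2 if ch in GSM7_EXTENDED else 1 for ch in body)
        return "GSM-7", septets * 7
    # UCS-2 fallback: count 16-bit code units rather than Python characters,
    # so a surrogate pair (for example an emoji) contributes 32 bits.
    code_units = len(body.encode("utf-16-be")) // 2
    return "UCS-2", code_units * 16


print(analyze_sms("Ahoy World"))  # ('GSM-7', 70)
print(analyze_sms("This is a test message with special characters: ñáéíóú."))  # ('UCS-2', 880)
```

Note that ñ and é are part of GSM-7, while á, í, ó and ú are not, which is why the last example falls back to UCS-2.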

Helpful Background Knowledge

Submitting your answer

Fill out this form, point us to your code solution, and tell us how you solved the problem. We will pick one lucky winner from the submissions to receive a VIP ticket worth more than 1,000 euros for the WeAreDevelopers World Congress in Berlin, Germany, on 17-19 July.

Good luck!
