Vivek Kumar

Converting a string to Base64 manually

As a web developer, you have probably heard of Base64 encoding, or even used it. Usually we convert some text or a URL to a Base64-encoded string, either with a method built into the programming language or with a third-party tool.

In this tutorial we are going to learn how we can encode any text into Base64 manually.

Let's say we want to encode the text "hi" in Base64. Base64 is a binary-to-text encoding: it takes the bytes of the input and represents them using 64 printable characters.

Step 1: Text to ASCII (text is "hi")

First, each character in the string is converted to its ASCII code. For example:

h: 104       i: 105     
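As a rough sketch, the same lookup can be done in JavaScript with charCodeAt. This assumes the text is plain ASCII, as in this tutorial; the variable names are only illustrative.

```javascript
// Look up the character code of each character in "hi".
// Assumption: the input is ASCII-only, so every code fits in one byte.
const text = "hi";
const codes = Array.from(text, (ch) => ch.charCodeAt(0));

console.log(codes); // [ 104, 105 ]
```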

Step 2: Convert each ASCII code to 8-bit binary

01101000 01101001
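One way to get these bit strings in JavaScript is toString(2) followed by padStart, since toString(2) drops leading zeroes. A minimal sketch, with the codes from step 1 hard-coded:

```javascript
// Convert each character code to an 8-bit binary string.
// toString(2) gives "1101000" for 104, so padStart restores the leading zero.
const codes = [104, 105];
const bytes = codes.map((code) => code.toString(2).padStart(8, "0"));

console.log(bytes.join(" ")); // 01101000 01101001
```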

Step 3: Split these binary numbers into groups of 6 bits

011010 000110 1001
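The split can be sketched with a regular expression that takes up to 6 characters at a time. The 16 bits from step 2 are joined into one string first; the last group comes out short, which is what the padding step deals with.

```javascript
// Join the two bytes into one bit string, then cut it into chunks of up to 6 bits.
const bitString = "01101000" + "01101001";
const groups = bitString.match(/.{1,6}/g);

console.log(groups.join(" ")); // 011010 000110 1001
```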

Step 4: Padding (if required)

If the last group has fewer than 6 bits, add zeroes at the end to make it 6 bits.

011010 000110 100100

Just remember one thing here:
if we add two zeroes, we will put one "=" at the end of the Base64 string;
if we add four zeroes, we will put "==" at the end of the Base64 string.
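The padding rule can be written down as a small calculation. This is only a sketch: because the input is whole bytes, the last group always has 6, 4, or 2 bits, so the number of zeroes added is 0, 2, or 4.

```javascript
// Fill the last group with zeroes and work out how many "=" signs are needed.
const lastGroup = "1001";
const zeroesAdded = (6 - lastGroup.length) % 6; // 2 for a 4-bit group
const paddedGroup = lastGroup.padEnd(6, "0");
const equalsSigns = "=".repeat(zeroesAdded / 2); // two zeroes -> "=", four zeroes -> "=="

console.log(paddedGroup); // 100100
console.log(equalsSigns); // =
```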

Step 5: Convert each group to decimal

26 6 36
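In JavaScript, parseInt with a radix of 2 does this conversion. A short sketch using the three padded groups from step 4:

```javascript
// Read each 6-bit group as a base-2 number.
const groups = ["011010", "000110", "100100"];
const values = groups.map((group) => parseInt(group, 2));

console.log(values); // [ 26, 6, 36 ]
```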

Step 6: Use the Base64 table to find the corresponding character for each value

Base64 Table - it contains 64 characters

ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/

   Value Encoding  Value Encoding  Value Encoding  Value Encoding
         0 A            17 R            34 i            51 z
         1 B            18 S            35 j            52 0
         2 C            19 T            36 k            53 1
         3 D            20 U            37 l            54 2
         4 E            21 V            38 m            55 3
         5 F            22 W            39 n            56 4
         6 G            23 X            40 o            57 5
         7 H            24 Y            41 p            58 6
         8 I            25 Z            42 q            59 7
         9 J            26 a            43 r            60 8
        10 K            27 b            44 s            61 9
        11 L            28 c            45 t            62 +
        12 M            29 d            46 u            63 /
        13 N            30 e            47 v
        14 O            31 f            48 w
        15 P            32 g            49 x
        16 Q            33 h            50 y         (pad) =

Note: the URL-safe variant, base64url, uses - (minus) for 62 and _ (underscore) for 63 instead of + and /.

So,
26 -> a
6 -> G
36 -> k

aGk
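Since the table is just the 64-character alphabet in order, the lookup amounts to indexing into that string. A minimal sketch with the values from step 5:

```javascript
// The position of a character in this string is its Base64 value.
const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
const values = [26, 6, 36];
const encoded = values.map((value) => alphabet[value]).join("");

console.log(encoded); // aGk
```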

Step 7: Now add the padding (we added two zeroes, so we will use one "=" sign)

aGk=

You can also verify this with the JavaScript function btoa("hi"). It prints the same result.
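To check the whole procedure at once, the seven steps can be put together in one function and compared against btoa. This is a sketch rather than a production encoder: manualBase64 is a made-up name, and it assumes ASCII-only input (like btoa, it is not meant for arbitrary Unicode text).

```javascript
const ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper that follows the manual steps; assumes ASCII-only input.
function manualBase64(text) {
  // Steps 1-2: character codes as 8-bit binary, joined into one bit string.
  let bits = "";
  for (const ch of text) {
    bits += ch.charCodeAt(0).toString(2).padStart(8, "0");
  }

  // Step 3: groups of up to 6 bits (match returns null for an empty string).
  const groups = bits.match(/.{1,6}/g) || [];

  // Step 4: count the zeroes needed to fill the last group (0, 2, or 4).
  const lastLength = groups.length > 0 ? groups[groups.length - 1].length : 6;
  const zeroesAdded = (6 - lastLength) % 6;

  // Steps 5-6: pad each group to 6 bits, read it as a number, look up the character.
  const chars = groups.map((group) => ALPHABET[parseInt(group.padEnd(6, "0"), 2)]);

  // Step 7: one "=" for two added zeroes, "==" for four.
  return chars.join("") + "=".repeat(zeroesAdded / 2);
}

console.log(manualBase64("hi")); // aGk=

// btoa is a global in browsers and recent Node.js versions; guard in case it is missing.
if (typeof btoa === "function") {
  console.log(manualBase64("hi") === btoa("hi")); // true
}
```

Trying it on "Man", "Ma", and "M" covers the three padding cases: no padding, one "=", and "==".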

Top comments (12)

Harlin Seritt •

Wonderful, thanks for the write up, Vivek! I've had to do this recently at work.

Sohail Pathan •

Nice explanation, Vivek. Now I know what goes on behind in the encoding and decoding part.
Since you mentioned:

"To do this, we either use programming languages' methods or we use some third-party tools."

One of the third-party tools is ApyHub as well, which supports base64 inputs for certain APIs like file conversions.

Sharing as a resource. :)

Vivek Kumar •

Thanks for sharing

oOosys • • Edited

Hmmm ... what is the point of adding one more to the myriads of explanations already available on Internet? Covering "hi" instead of "Man" and "Ma" and "M" which is a more complete set of examples???

Mahmoud Ibn Samy •

I liked how you simplified the whole process splitting it into small steps, but why would we convert text into base64? are there any applicable uses of that?

João Godinho •

Nice!! I also have a post explaining how base64 works and its use cases: dev.to/godinhojoao/base64-encoding...

david wyatt •

Nice

Nguyễn Văn Tài •

Good

nxquan •

Good

Arnaud Dagnelies • • Edited

That's not base64, it's base64url. Confusing both leads to trouble for everyone. The (deprecated) function btoa encodes to base64 with + and / instead of base64url with - and _.

FJones •

btoa isn't deprecated anymore afaik, it just comes with documentation warning against use with Unicode.

Arnaud Dagnelies •

Yes, you are right. btoa is not deprecated, actually it will probably stay forever. It's just my IDE that misled me recently by annotating it that way.
