Do you have weird desires to do things in shell that shouldn't be done in shell? Your opinion doesn't matter, I have already written the blog post :P
Before we venture any further, please be reminded that the wonderful websocat exists, so there is really no reason to do this in shell (except for fun :)
Why would anyone do this?
Live reloads for web projects are pretty standard nowadays. If you are already using socat as a SimpleHTTPServer, it is only logical to also use socat to notify the client side to reload :D
In this case, a complete WebSocket server is not necessary. I only need to send messages to the client and don't need to handle incoming messages, which makes the data flow way easier to handle (easier to handle in socat+shell, it probably doesn't make much difference in other languages).
I was mostly referencing this blog post by another nice gentleman on implementing a WebSocket server in Ruby, so for those who want to learn more about the boring details, you know where to go.
The journey
If you have sniffed a WebSocket request and response before, you will probably recognize that a WebSocket request is actually a normal HTTP GET request. The response starts with a header section, just like a normal HTTP response, followed by the payload. For our server-to-client-only scenario, it looks just like (and in essence, is) Server-sent events, with data dripping to the client unidirectionally.
To handle the request, we could utilize socat's SYSTEM address type to echo out the header:
#!/bin/sh
websocket_script="
echo 'HTTP/1.1 101 Switching Protocols';
echo 'Sec-WebSocket-Accept: UdMEX53kyT/LBV+MbgNRheSRFvQ=';
echo 'Connection: Upgrade';
echo 'Upgrade: websocket';
echo '';
"
socat TCP-LISTEN:8080,crlf SYSTEM:"$websocket_script"
But if you actually try this, you will find that life is never that simple.
You see, there is a special header called Sec-WebSocket-Accept that the browser will validate to ensure that our life isn't that easy.
From MDN, the algorithm to generate this is:
The server takes the value of the Sec-WebSocket-Key sent in the handshake request, appends 258EAFA5-E914-47DA-95CA-C5AB0DC85B11, takes SHA-1 of the new value, and is then base64 encoded.
Well, that seems doable. We can grep the key from the request header and printf to append the magic string, then we have sha1sum to get the hash and base64 for encoding. How hard could it be?
Well... I have no idea why head -n1 got stuck and couldn't close the stream[1]. But grep already has -m NUM to stop reading a file after NUM matching lines, so that should suffice.
Hmm... I think base64 takes binary as input. But that looks like hex?
xxd to the rescue
It was indeed hex. A quick Google search tells us we can use xxd to convert between hex and binary.
$ printf "a\n" | xxd -p
610a
$ printf "610a" | xxd -r -p
a
Now we have everything we need to get the hashed key, and plugging it into the other echo statements should just work™:
#!/bin/sh
websocket_magic_string=258EAFA5-E914-47DA-95CA-C5AB0DC85B11
websocket_script="
echo 'HTTP/1.1 101 Switching Protocols';
accept=\$(
grep -m1 'Sec-WebSocket-Key' |
cut -d' ' -f2 |
xargs printf '%s$websocket_magic_string' |
sha1sum | xxd -r -p | base64
);
echo \"Sec-WebSocket-Accept: \$accept\";
echo 'Connection: Upgrade';
echo 'Upgrade: websocket';
echo '';
"
socat TCP-LISTEN:8080,crlf SYSTEM:"$websocket_script"
Now with the easy part out of the way, let's see how to format a WebSocket-compliant payload.
The rules are as follows:
print the frame's first byte (10000001: FIN bit + text opcode) as char
if message.length < 126
    print message.length as char
else if message.length < 2**16
    print 126 as char # i.e. the byte with value 126
    for each byte of (message.length as unsigned 16-bit integer, most significant byte first)
        print the byte as char
else
    print 127 as char
    for each byte of (message.length as unsigned 64-bit integer, most significant byte first)
        print the byte as char
print the message as chars
(again, you can read the blog post I am referencing for details)
The first part was easy enough, just printing a char[2] on screen. However, there are no types or type casts in shell. We only have raw bytes, and raw bytes that happen to be printable.
printf to the rescue
printf is capable of printing printable and non-printable characters with escape sequences, making it especially useful for emitting raw bytes. Just like printf '\n' prints a newline, running printf "\\$(printf %o '65')" will print "A". This gives us the ability to print the first byte and handle the first case.
while IFS= read -r line; do
    printf "\\201"; # first frame byte 10000001: FIN bit + text opcode
    size=$(printf '%s' "$line" | wc -c); # '%s' keeps % and \ in the message literal
    if { test $size -lt 126; }; then
        printf "\\$(printf %o $size)";
...
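To convince ourselves that the trick really emits one raw byte, here is a small sketch that wraps it in a helper and inspects the result with od. The helper name byte is my own invention for illustration, it is not part of the script in this post.

```shell
#!/bin/sh
# byte N: emit the single byte whose decimal value is N.
# (hypothetical helper name, wrapping the printf-inside-printf trick)
byte() {
    printf "\\$(printf '%o' "$1")"
}

byte 65; echo              # prints A
byte 129 | od -An -tx1     # hex dump shows 81, i.e. binary 10000001
```

The inner printf turns the decimal value into octal digits, and the outer printf interprets the resulting \NNN escape as a raw byte.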
For the second part, we need to coerce a string of decimal digits into two chars that together represent the decimal value.
For example, we should convert "16737" into "Aa".
----------------------------
| dec | 65 * 256 + 97 |
----------------------------
| char | A a |
----------------------------
You may have noticed that we essentially need to convert the decimal value into binary bytes (as opposed to a single char, meaning the printf trick doesn't work anymore).
We will need to change the radix of the number, so brush up your math skills and prepare for some fun.
dc to the rescue
Instead of showing off my non-existent math skills here, a basic desk calculator should do the job. And by desk calculator I mean dc (it stands for desk calculator), which conveniently has parameters to control the input/output radix. With a simple dc -e "16 o $size p" you too can safely forget how to do math!
# 16 o => set 16 as output radix
# 11 => input value (by default base 10)
# p => print out the result
$ dc -e "16 o 11 p"
B
Well, we need 2 chars, but a single hex digit is only 4 bits. What if our input size is under 4096 (0x1000)? dc won't give us 4 hex digits!
Luckily we are trying to represent an unsigned integer, which means the leading bits are zeros. If we prepend more than enough zeros and take from the end, we don't even need to calculate how many zeros to prepend.
$ echo 'AA' | xargs printf '%04d%s' 0 | tail -c4
00AA
# or
$ { printf '%04d' '0'; printf 'AA'; } | tail -c4
00AA
Now we have a hex string. If only there were a way to convert between hex and binary...
$ dc -e "16 o 16737 p" |
> xargs printf "%04d%s" 0 |
> tail -c4 |
> xxd -r -p
Aa
With that, followed by printing the actual text, we are now able to send WebSocket-compliant payload data.
while IFS= read -r line; do
    size=$(printf '%s' "$line" | wc -c); # '%s' keeps % and \ in the message literal
    printf "\\201"; # first frame byte 10000001: FIN bit + text opcode
    if { test $size -lt 126; }; then
        printf "\\$(printf %o $size)";
    elif { test $size -lt 65536; }; then
        printf "\\$(printf %o 126)";
        dc -e "16 o $size p" |
            xargs printf '%04d%s' 0 |
            tail -c4 |
            xxd -r -p
    else
        printf "\\$(printf %o 127)";
        dc -e "16 o $size p" |
            xargs printf '%016d%s' 0 |
            tail -c16 |
            xxd -r -p
    fi
    printf '%s' "$line";
done
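A quick way to sanity-check the framing rules is to dump a frame as hex. The sketch below is not the loop above: it swaps the dc and xxd pipeline for plain shell arithmetic (the math I claimed to be avoiding), so it only needs a POSIX sh and od. The function names byte and ws_frame are hypothetical, made up for this check.

```shell
#!/bin/sh
# byte N: emit the single byte whose decimal value is N.
byte() {
    printf "\\$(printf '%o' "$1")"
}

# ws_frame MESSAGE: emit one unmasked text frame (hypothetical helper).
ws_frame() {
    size=$(printf '%s' "$1" | wc -c | tr -d ' ')
    byte 129                          # 10000001: FIN bit + text opcode
    if [ "$size" -lt 126 ]; then
        byte "$size"
    elif [ "$size" -lt 65536 ]; then
        byte 126
        byte $((size / 256))          # high byte of the 16-bit length
        byte $((size % 256))          # low byte
    else
        byte 127
        for bits in 56 48 40 32 24 16 8 0; do
            byte $(( (size >> bits) & 255 ))
        done
    fi
    printf '%s' "$1"
}

ws_frame 'hi' | od -An -tx1                         # 81 02 68 69

# A 200-byte payload needs the 16-bit extended length (200 = 0x00c8).
payload=$(printf '%0200d' 0)
ws_frame "$payload" | head -c 4 | od -An -tx1       # 81 7e 00 c8
```

If the hex dump of the real loop's output matches these bytes for the same message, the dc and xxd plumbing is doing its job.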
The last 90%
Now that the main chunk of code is done, we can go through some remaining "minor" issues.
Input
Normally socat only accepts two addresses, so we only get to pick 2 out of "STDIO", "TCP-LISTEN", and "SYSTEM". Since the server itself is a must, that leaves us choosing between stdin and our script.
Fortunately, a fork-ed command inherits its parent's file descriptors. We can duplicate stdin to a new file descriptor and reference it inside the command.
A drawback is that, because stdin is a stream, if we have multiple copies of "SYSTEM" (by running the server with the fork option), only one of them will actually receive each message. Of course we could instead write to a normal file and have the commands tail -f from said file, or even use socat to run another UDP server. However, I find these solutions less appealing and, in this use case, I can live with only having a single client.
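The file descriptor duplication is easy to try without socat. In this sketch, sh -c stands in for the SYSTEM command and /dev/null stands in for the socket that takes over its stdin. Both stand-ins, and the function name fd_demo, are assumptions for the demo only.

```shell
#!/bin/sh
# fd_demo: the child's stdin is /dev/null, but descriptor 3 is a copy
# of the original stdin (here, the pipe fed by printf).
fd_demo() {
    # Redirections apply left to right: first copy stdin to fd 3,
    # then replace stdin with /dev/null.
    printf 'from the pipe\n' | sh -c 'cat <&3' 3<&0 </dev/null
}

fd_demo    # prints: from the pipe
```

The child never sees the pipe on its stdin, yet cat <&3 still drains it, which is exactly what the SYSTEM script relies on.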
CRLF
We commonly use socat TCP with the crlf option, which means socat will prepend \r before each \n. But if you think this is a common use case, you are in for a surprise (or I am very interested in what you commonly use socat for).
Since we are sending integers as chars, when our message size is 10 we will be sending ASCII character 10, the newline character. socat will then make our life miserable by prepending an extra \r to it. So we will have to drop the crlf option and attach \r ourselves.
Portability
Since xxd isn't portable[3], we will have to find a replacement. As we previously discovered, printf is fully capable of printing raw bytes, but we will have to split the hex string into one chunk of two hex digits per char in advance.
We can use sed 's/../&\n/g' to replace each two characters with themselves plus a new line, or we could use the simpler fold -w2 to do the same.
$ echo '1234' | sed 's/../&\n/g'
12
34
# or
$ echo '5678' | fold -w2
56
78
Finally, the end result
#!/bin/sh
websocket_magic_string=258EAFA5-E914-47DA-95CA-C5AB0DC85B11
websocket_script="{
printf 'HTTP/1.1 101 Switching Protocols\r\n';
printf 'Sec-WebSocket-Accept: ';
grep -m1 'Sec-WebSocket-Key' |
sed -e 's/\r//g' |
cut -d' ' -f2 |
xargs printf '%s$websocket_magic_string' |
sha1sum |
head -c 40 |
fold -w2 |
xargs -I{} printf '%o\n' '0x{}' |
xargs -I{} printf '\\\\{}' |
base64 |
xargs printf '%s\r\n';
printf 'Connection: Upgrade\r\n';
printf 'Upgrade: websocket\r\n';
printf '\r\n';
cat <&3;
}"
while IFS= read -r line; do
    printf '\201'; # first frame byte 10000001: FIN bit + text opcode
    size=$(printf "%s" "$line" | wc -c);
    if { test $size -lt 126; }; then
        printf "\\$(printf %o $size)";
        extended_payload_length_size=0;
    elif { test $size -lt 65536; }; then
        printf "\\$(printf %o 126)";
        extended_payload_length_size=4;
    else
        printf "\\$(printf %o 127)";
        extended_payload_length_size=16;
    fi
    {
        printf '%016d' '0';
        dc -e "16o${size}p" | tr -d "\n";
    } |
        tail -c $extended_payload_length_size |
        fold -w2 |
        xargs -I{} printf '%o\n' '0x{}' |
        xargs -I{} printf '\{}';
    printf '%s' "$line";
done |
socat TCP-LISTEN:8080,reuseaddr,fork SYSTEM:"$websocket_script" 3<&0
That's it. For those who want to implement client-to-server message handling, maybe you will be able to do that with a dual type address and some named pipes. But then again, why would anyone do that ;)
[1] The reason was related to grep buffering its output when the destination isn't a tty. I still lack a thorough understanding of this, so maybe I will write a blog post when I can wrap my head around it better.
[2] char in this post are 8-bit, unsigned.
[3] I wasn't concerned about proper metrics like POSIX compliance. Instead, I just wanted it to run on the docker alpine/socat image.