Network issues always show up in production.
We tend to, at most, perform some stress tests to check what happens in
(unrealistically) high-load situations, or stop some critical services,
like databases, to see how resilient our application or service is.
But the truth is that there can be worse situations...
I mean: if your database goes down, you'll quickly get errors because TCP
connection attempts are rejected.
But when network speed drops due to heavy traffic, network errors and so on,
you may end up with lots of pending requests that never complete, causing
much more trouble; trouble that could easily be avoided if we had properly
detected the situation.
To be able to address those situations we need to reproduce them, at least to
some degree.
As a first approach, I tried iptables DROP rules, but that is a drastic
solution: either you have a perfect connection or all packets are lost. And I
wanted to play with several PostgreSQL
timeouts and their effects in
different situations.
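For reference, this is the kind of rule I mean (the port matches PostgreSQL's default; run it as root and remember to remove it afterwards). While it is in place, every packet to that port is silently dropped, so connections simply hang:

iptables -A INPUT -p tcp --dport 5432 -j DROP
iptables -D INPUT -p tcp --dport 5432 -j DROP   # remove the rule again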
Finally I ended up implementing my own tool, netjam, to simulate different
levels of traffic congestion.
I don't know if there are other similar tools out there (at least I haven't
found any so far; if you know of any, please tell me in the comments).
...And it's still far from perfect.
But, at least, it helped me learn a lot about how PostgreSQL timeouts work
and the effects they can have in different situations.
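In case you want to explore them too, these are the kind of settings I mean. They are standard PostgreSQL parameters (the values here are just illustrative):

SET statement_timeout = '5s';                      -- cancel queries that run longer than this
SET idle_in_transaction_session_timeout = '30s';   -- kill sessions stuck inside a transaction
SET tcp_user_timeout = '10s';                      -- give up on unacknowledged TCP data (PostgreSQL 12+)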
And it is a very simple tool to use, especially now that I have bundled it as
an npm package so that, if you have Node/NPM on your system, you only need to
execute the following to get started:
npx netjam --help
For example, to create a jamable TCP tunnel to a local PostgreSQL server
listening on its default port (5432), you only need to execute the following
command:
npx netjam localhost 5432
Then you will get something like this:
Server listening on port 5000
STATUS:
┌─────────────┬────────────────────────────┐
│   (index)   │           Values           │
├─────────────┼────────────────────────────┤
│ remoteHost  │        'localhost'         │
│ remotePort  │           '5432'           │
│ listenPort  │            5000            │
│  timestamp  │ '2024-08-06T18:52:11.042Z' │
│   waiting   │             0              │
│    open     │             0              │
│   closed    │             0              │
│  withError  │             0              │
│     tx      │             0              │
│     rx      │             0              │
│ inputDelay  │             0              │
│ outputDelay │             0              │
│ logInterval │       '0 (Disabled)'       │
└─────────────┴────────────────────────────┘
AVAILABLE COMMANDS:
inputDelay - Sets input delay to specified value
outputDelay - Sets output delay to specified value
delay - Sets overall balanced delay to specified value
logInterval - Show/Set status (stderr) logging interval in msecs
quit - Quit the program
>
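From another terminal you can now point any PostgreSQL client at the tunnel port instead of 5432. For example, with psql (connect_timeout is a standard libpq parameter; the values here are just illustrative):

psql "host=localhost port=5000 dbname=postgres connect_timeout=5"

Then, back at the netjam prompt, something like the following should set an overall delay (presumably in milliseconds, as with logInterval):

> delay 200

From that moment on, every byte crossing the tunnel is held back accordingly, and you can watch how the client and its timeouts react.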
For now it is capable of slowing down transmission and reception by
introducing small delays between a packet's transmission and its reception at
the other side.
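Conceptually, it is just a TCP proxy that holds every chunk back for a while before forwarding it. This is not netjam's actual code, only a minimal Node/TypeScript sketch of the idea (host, ports and delay are made up, and real code would also need to handle backpressure):

import net from "node:net";

const LISTEN_PORT = 5000;        // hypothetical local port
const REMOTE_HOST = "localhost"; // hypothetical target host
const REMOTE_PORT = 5432;        // e.g. a local PostgreSQL server
const DELAY_MS = 200;            // artificial per-chunk delay

net.createServer((client) => {
  const upstream = net.connect(REMOTE_PORT, REMOTE_HOST);

  // Forward each chunk after an artificial delay. Timers with equal
  // delays fire in insertion order, so chunk order is preserved:
  // data arrives intact, just late, like on a congested link.
  const jam = (from: net.Socket, to: net.Socket) => {
    from.on("data", (chunk) => setTimeout(() => to.write(chunk), DELAY_MS));
    from.on("end", () => setTimeout(() => to.end(), DELAY_MS));
    from.on("error", () => to.destroy());
  };

  jam(client, upstream);  // client -> server direction
  jam(upstream, client);  // server -> client direction
}).listen(LISTEN_PORT);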
In the future it could be extended by introducing:
- Random (with customizable probability) transmission errors through data
mangling (see the sketch after this list).
- Random packet loss.
- Delay configuration as ranges, so it would take a random value between the
given bounds for each tx/rx packet.
- Who knows... Any ideas?
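For instance, the data-mangling idea could be as simple as flipping one random bit of a chunk with some probability before forwarding it. Again, just an illustrative sketch, not planned code:

const MANGLE_PROBABILITY = 0.01; // hypothetical: mangle ~1% of chunks

function maybeMangle(chunk: Buffer): Buffer {
  if (Math.random() < MANGLE_PROBABILITY && chunk.length > 0) {
    const byte = Math.floor(Math.random() * chunk.length);
    const bit = Math.floor(Math.random() * 8);
    chunk[byte] ^= 1 << bit; // flip a single random bit in place
  }
  return chunk;
}

Plugged into the proxy sketch above, it would be called on each chunk right before writing it to the other side.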