DEV Community

Supervisor Intensity, what is it?

bakenator on March 22, 2019

Did you know that Elixir Supervisors will stop trying to restart a child process if they detect something has gone haywire in the child? They do!...
 
Jan Wedel

Thanks for the article! So how did you eventually configure the supervisor for the TCP socket example? I guess the failure could happen hundreds of times per second, couldn't it?

 
bakenator • Edited

For this failing example, my server was using spawn_link(process_request_function). You can get a lot of added safety by switching this to spawn(process_request_function), because a crash in an unlinked request process no longer takes the server process down with it.
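To make the difference concrete, here is a minimal sketch rather than the actual server code. The module name SpawnDemo and both functions are hypothetical; the handlers simply call exit(:boom) to stand in for a failing process_request_function.

```elixir
defmodule SpawnDemo do
  # Hypothetical demo, not the server from the article.

  # Unlinked handler: the caller only observes the crash through a monitor.
  def run_unlinked do
    # The handler waits for :go so the monitor is in place before it exits.
    pid =
      spawn(fn ->
        receive do
          :go -> exit(:boom)
        end
      end)

    ref = Process.monitor(pid)
    send(pid, :go)

    receive do
      {:DOWN, ^ref, :process, ^pid, reason} -> {:handler_down, reason}
    after
      1_000 -> :timeout
    end
  end

  # Linked handler: the exit signal propagates and kills the "server" too.
  def run_linked do
    # The wrapper plays the role of the server process.
    {server, ref} =
      spawn_monitor(fn ->
        spawn_link(fn -> exit(:boom) end)
        Process.sleep(:infinity)
      end)

    receive do
      {:DOWN, ^ref, :process, ^server, reason} -> {:server_down, reason}
    after
      1_000 -> :timeout
    end
  end
end
```

In run_unlinked the calling process survives and just receives a :DOWN message. In run_linked the wrapper process dies with the same reason as the handler, which is what would make a supervisor restart it.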

For fun, though, I did figure out a way to restart the server safely.

def start_link(port: port, dispatch: dispatch) do
    case :gen_tcp.listen(port, active: false, packet: :http_bin, reuseaddr: true) do
        {:ok, socket} ->
            Logger.info("Accepting connections on port #{port}")
            # Save the socket so it can be closed if a later start attempt fails.
            MyApp.SocketStore.set(socket)
            {:ok, spawn_link(Http, :accept, [socket, dispatch])}
        _ ->
            # Listening failed (e.g. the port is still bound): close the stored socket.
            # close/1 returns :ok, not {:ok, pid}, so this start attempt fails.
            :gen_tcp.close(MyApp.SocketStore.get())
    end
end

Right after the server starts, I save the socket in a singleton GenServer.
Then, when a later start attempt fails to listen, I close the saved socket and let the start fail.
It fails because start_link does not return an {:ok, pid} tuple.

The next time the Supervisor restarts the process, it should succeed, since no socket is bound to the port anymore.

Not the most robust, but it works for this toy example.
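MyApp.SocketStore itself is not shown above. A minimal sketch of such a singleton GenServer might look like the following; only the module name and the set/get calls come from the example, while the callbacks and the use of nil as the initial state are assumptions.

```elixir
defmodule MyApp.SocketStore do
  # Assumed implementation: a name-registered GenServer holding one socket.
  use GenServer

  # Registering under the module name gives one instance reachable from anywhere.
  def start_link(_opts \\ []) do
    GenServer.start_link(__MODULE__, nil, name: __MODULE__)
  end

  # Remember the most recent listening socket.
  def set(socket), do: GenServer.call(__MODULE__, {:set, socket})

  # Return the stored socket, or nil if none has been saved yet.
  def get, do: GenServer.call(__MODULE__, :get)

  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_call({:set, socket}, _from, _old), do: {:reply, :ok, socket}

  def handle_call(:get, _from, socket), do: {:reply, socket, socket}
end
```

For this to help, the store would have to be started before the server and keep running when the server is restarted, for example as an earlier child under a :one_for_one supervisor.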

 
Jan Wedel

Thanks for the example. I remember that a port sometimes stays occupied for a while; I've seen this a couple of times even when no process was actually using it. There is just some delay until the OS frees it. So in that case it still would not help, right?

 
bakenator

I haven't used this handmade server very much, but the delay you describe in freeing the port sounds like it could happen.

There may be a blocking call I am unfamiliar with that checks whether the port is free. Otherwise, I would think of putting a sleep in there to give the OS time to free it.
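As a rough sketch of the sleep idea (the ListenRetry module, its listen/4 function, and the default attempt count and delay are all made up for illustration), the listen call could be retried a few times while the port still reports as in use:

```elixir
defmodule ListenRetry do
  # Hypothetical helper: retry :gen_tcp.listen/2 while the port is still in use.
  def listen(port, opts, attempts \\ 5, delay_ms \\ 200) do
    case :gen_tcp.listen(port, opts) do
      {:ok, socket} ->
        {:ok, socket}

      # Port still bound: wait, then try again while attempts remain.
      {:error, :eaddrinuse} when attempts > 1 ->
        Process.sleep(delay_ms)
        listen(port, opts, attempts - 1, delay_ms)

      # Out of attempts, or a different error: give up and report it.
      {:error, reason} ->
        {:error, reason}
    end
  end
end
```

One caveat: a child's start_link runs inside the supervisor process, so sleeping there blocks the supervisor for the whole retry window. The reuseaddr: true option already passed in the example should also cover the common case where the port is only held by old connections that are still winding down.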