loading...

Discussion on: Use cases for persistent logs with NATS Streaming

Collapse
byronruth profile image
Byron Ruth Author

Both the NATS and STAN clients reconnect automatically. One configurable setting is how many reconnects should take place and you set this to unlimited.

Another is producer may need to store outgoing events into own persistent outgoing queue before trying to send them to STAN to avoid missing events on producer restarts.

On a publish, the client will return an error if it cannot contact the server or if the server fails to ack. If you do this asynchronously (not in the request hot path for example), then you are correct you will need to handle this locally by storing the events and retrying perpetually. However if this is part of a request transaction, then presumably any effects could be compensated for and an error would returned to the caller.

This won't work unless updating this file can be done atomically with handling message.

Yes absolutely, the devil is in the details. The general case is that if you are doing at least two kinds of I/O in a transaction, you will need some kind of two-phase commit or consensus + quorum since any of the I/O can fail.

The intent for the examples here is to demonstrates patterns, not the failure cases (which are very interesting in their own right!) But thanks for pointing out a couple (very serious) failure cases!

Collapse
powerman profile image
Alex Efros

Both the NATS and STAN clients reconnect automatically.

NATS - maybe, but STAN doesn't reconnect.

Thread Thread
byronruth profile image
Byron Ruth Author

MaxReconnects option for NATS, setting to -1 will cause the client to retry indefinitely. For the STAN client, you can achieve the same thing by passing in the NATS connection using this option. So..

nc, _ := nats.Connect("nats://localhost:4222", nats.MaxReconnects(-1))
sc, _ := stan.Connect("test", "test", stan.NatsConn(nc))

The STAN client just build on a NATS connection, so any NATS connection options will be respected in the STAN client.

Thread Thread
powerman profile image
Alex Efros

STAN doesn't reconnect automatically, just test it yourself. Also, if no nats connection provided to STAN then it create default connection with same -1 setting.

Thread Thread
byronruth profile image
Byron Ruth Author

From the README, I read that the intent is for reconnection to happen transparently, but there are cases where this fails (there appears to have been quite a few changes and improvements since I wrote this article). It describes the case that the server doesn't respond to PINGs and the client decided to close the connection. Likewise if there is a network partition or something, the server may disconnect, but the client (in theory) could re-establish the connection after the interruption.

I am not arguing that it can't happen, but I am simply stating that the intent is for the client to reconnect. If you are observing otherwise, then that may be an issue to bring up to the NATS team.

Thread Thread
powerman profile image
Alex Efros

Team is aware about this. I agree intent is to support reconnects, but for now you have to implement it manually, and it's unclear when (if ever) this will be fixed in library. Main issue with reconnects is needs to re-subscribe, and how this should be done is depends on application, so it's not ease to provide some general way to restore subscriptions automatically - this is why it's not implemented yet and unlikely will be implemented soon.

Thread Thread
byronruth profile image
Byron Ruth Author

Good to know. Agreed, in the case of consumers how subscriptions need or should be re-established can vary based on the application.