On quick search, I found this article on gapless sequences. Using what was described there would also enforce a totally ordered sequence number. This will have a performance impact, not sure how much. Perhaps a single writer is a better solution overall for now.
I am concerned that business logic (command processing) will have to scale compute resources. And currently I planned to colocate the writer with the command processing, hence the multiple concurrent writers as command processing scales. I guess good problems to have and figure out when I get there. But I don't like the idea of the single writer being its own separate service.
Here is a quick sketch of what the solution would look like that supported multiple concurrent writers. Using the strategy from that link. This is accomplished by making sequence number generation part of the parent transaction. I even added some rules to make the Event table append-only and the sequence counter update only to prevent maintenance mistakes. Also inspired by that link.
Here, I also renamed SequenceNum to Position.
--DROP TABLE IF EXISTS PositionCounter CASCADE;--DROP FUNCTION IF EXISTS NextPosition();--DROP TABLE IF EXISTS Event CASCADE;--DROP TRIGGER IF EXISTS trg_EventRecorded ON Event;--DROP FUNCTION IF EXISTS NotifyEvent();-- transactional sequence numberCREATETABLEIFNOTEXISTSPositionCounter(PositionbigintNOTNULL);INSERTINTOPositionCounterVALUES(0);-- prevent removal / additional rowsCREATERULErule_positioncounter_noinsertASONINSERTTOPositionCounterDOINSTEADNOTHING;CREATERULErule_positioncounter_nodeleteASONDELETETOPositionCounterDOINSTEADNOTHING;-- function to get next sequence numberCREATEFUNCTIONNextPosition()RETURNSbigintAS$$DECLAREnextPosbigint;BEGINUPDATEPositionCounterSETPosition=Position+1;SELECTINTOnextPosPositionFROMPositionCounter;RETURNnextPos;END;$$LANGUAGEplpgsql;-- event tableCREATETABLEIFNOTEXISTSEvent(PositionbigintNOTNULL,TenantIduuidNOTNULL,StreamIduuidNOTNULL,VersionintNOTNULL,TypetextNOTNULL,MetajsonbNOTNULL,Datajsonb,LogDatetimestamptzNOTNULLDEFAULTnow(),CONSTRAINTpk_event_positionPRIMARYKEY(TenantId,Position),CONSTRAINTuk_event_streamid_versionUNIQUE(TenantId,StreamId,Version))PARTITIONBYLIST(TenantId);-- Append onlyCREATERULErule_event_nodeleteASONDELETETOEventDOINSTEADNOTHING;CREATERULErule_event_noupdateASONUPDATETOEventDOINSTEADNOTHING;-- notificationCREATEFUNCTIONNotifyEvent()RETURNStriggerAS$$DECLAREpayloadtext;BEGIN-- { position }/{ tenantId }/{ streamId }/{ version }/{ event type }SELECTCONCAT_WS('/',NEW.Position,NEW.TenantId,REPLACE(CAST(NEW.StreamIdAStext),'-',''),NEW.Version,NEW.Type)INTOpayload;-- using lower case channel name or else LISTEN would require quoted identifier.PERFORMpg_notify('eventrecorded',payload);RETURNNULL;END;$$LANGUAGEplpgsql;CREATETRIGGERtrg_EventRecordedAFTERINSERTONEventFOREACHROWEXECUTEPROCEDURENotifyEvent();
And here's how to append an event with the transactional sequence number.
-- inside a transactionINSERTINTOEvent(Position,TenantId,StreamId,Version,Type,Meta,Data)VALUES(NextPosition(),'3791B53677C840FC81CA65BFDAF34F1F','A88F94DB6E7A439E9861485F63CC8A13',1,'EmptyEvent','{}',NULL);
Would it not be simpler instead to use explicit table lock when inserting events instead of PositionCounter? It limits performance but I believe result is same as PositionCounter table because it too needs a row level lock
begin;
lock table event;
insert into event ...
commit
I really appreciate your feedback!
On quick search, I found this article on gapless sequences. Using what was described there would also enforce a totally ordered sequence number. This will have a performance impact, not sure how much. Perhaps a single writer is a better solution overall for now.
I am concerned that business logic (command processing) will have to scale compute resources. And currently I planned to colocate the writer with the command processing, hence the multiple concurrent writers as command processing scales. I guess good problems to have and figure out when I get there. But I don't like the idea of the single writer being its own separate service.
Here is a quick sketch of what the solution would look like that supported multiple concurrent writers. Using the strategy from that link. This is accomplished by making sequence number generation part of the parent transaction. I even added some rules to make the Event table append-only and the sequence counter update only to prevent maintenance mistakes. Also inspired by that link.
Here, I also renamed SequenceNum to Position.
And here's how to append an event with the transactional sequence number.
Thanks for putting this together, great read!
Working on porting this to YugaByteDB.
Would it not be simpler instead to use explicit table lock when inserting events instead of
PositionCounter
? It limits performance but I believe result is same as PositionCounter table because it too needs a row level lockSimpler? It would be less to setup vs a Position table. In usage it's adding a lock statement vs calling
NextPosition()
in the insert.I suppose you will just have to test for the performance difference between the two.