이관호(Gwanho LEE)
`mithril-signer`: State Machine Flow, KES Period, and Safe Registration

Why I Studied mithril-signer

After contributing to the Mithril project and reading the client/aggregator modules, I realized that the signer is one of the most critical components in the whole protocol. The signer is operated by stake pool operators (SPOs) and is responsible for producing the individual signatures that the aggregator later combines into a Mithril certificate. If signers crash or register incorrect data, participation drops and certificate production can become slow or unstable. Understanding the signer is therefore not just “reading code”: it means understanding the real operational reliability of Mithril.

In this post, I summarize how mithril-signer works from an engineer’s point of view: the state machine (status) flow, how registration and signing rounds work, what kind of data moves between signer and aggregator, and why KES period handling is a correctness-sensitive part of registration.


What the Signer Does (High-level Responsibility)

You can think of the signer as a long-running daemon that repeats the following loop:

1) Observe chain context from a Cardano node (epoch, chain tip / slot context, and other values needed by the protocol).

2) Register itself for the current epoch (publish its verification key material and metadata so it can participate).

3) When the aggregator opens a message to be certified, run the lottery locally to decide eligibility.

4) If eligible, produce one or more signatures and submit them to the aggregator.

5) Persist local “already signed” markers so it does not re-sign the same entity repeatedly.

6) Retry safely when the node is still syncing or any dependency is temporarily unavailable.

The most important detail here is that the signer must behave safely under real-world conditions: node syncing, transient network failures, partial DB corruption, and other imperfect states. Reliability is part of the protocol’s real security because low signer participation affects liveness.
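As a rough illustration of "retry safely", the sketch below separates temporary failures from unrecoverable ones and backs off between retries. It is not the actual mithril-signer runtime: `CycleError`, `NextAction`, and `next_action` are hypothetical names, and the exponential backoff policy is an assumption chosen for the example.

```rust
use std::time::Duration;

/// Hypothetical classification of what went wrong during one loop cycle.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum CycleError {
    /// Temporary condition (node still syncing, aggregator unreachable).
    Transient(String),
    /// Unrecoverable condition (for example an invalid configuration).
    Critical(String),
}

/// What the daemon should do after a cycle.
#[derive(Debug, PartialEq)]
enum NextAction {
    Continue { sleep: Duration },
    Stop,
}

/// Decide the next step: keep looping on success, back off on transient
/// errors, and stop only on critical ones.
fn next_action(
    result: &Result<(), CycleError>,
    consecutive_failures: u32,
    base: Duration,
    max: Duration,
) -> NextAction {
    match result {
        Ok(()) => NextAction::Continue { sleep: base },
        Err(CycleError::Transient(_)) => {
            // Delay is base * 2^consecutive_failures, capped at `max`.
            let factor = 2u32.saturating_pow(consecutive_failures.min(16));
            let sleep = base.saturating_mul(factor).min(max);
            NextAction::Continue { sleep }
        }
        Err(CycleError::Critical(_)) => NextAction::Stop,
    }
}
```

With a 1 second base and a 60 second cap, a third consecutive transient failure waits 8 seconds, and later failures stay at the cap instead of hammering the node or the aggregator.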


Signer Status / State Machine (How the Process Moves)

mithril-signer is implemented as a state machine. The exact enum names can vary by version, but the conceptual states are stable:

  • INIT: signer starts, loads configuration, initializes DB, logger, and internal services.
  • UNREGISTERED: signer is not registered for the current epoch. It should attempt registration (or wait if the registration round is not open).
  • READY_TO_SIGN: signer is registered and eligible to participate in signing rounds.
  • REGISTERED_NOT_ABLE_TO_SIGN: signer is registered but cannot sign for some reason (not eligible in this epoch/round, stake conditions, or protocol constraints).

A key engineering point is that the runtime should not crash for ordinary “not ready yet” situations. In many flows, the correct behavior is to keep state and retry later, not panic or register with fake data.
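The conceptual states above can be sketched as a small transition function. This is a simplified model, not the real runtime: the `Observation` struct and its fields are hypothetical, and the real signer evaluates richer conditions. The point it illustrates is that "not ready yet" keeps the current state instead of failing.

```rust
/// Conceptual signer states (names mirror the list above, not necessarily the real enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SignerState {
    Init,
    Unregistered,
    ReadyToSign,
    RegisteredNotAbleToSign,
}

/// Hypothetical facts gathered during one cycle of the loop.
#[derive(Debug, Clone, Copy)]
struct Observation {
    epoch_changed: bool,
    registration_open: bool,
    registration_succeeded: bool,
    can_sign: bool,
}

/// Compute the next state from the current state and what was observed.
fn next_state(state: SignerState, obs: Observation) -> SignerState {
    use SignerState::*;
    match state {
        // Initialization done: the signer still has to register.
        Init => Unregistered,
        // A new epoch invalidates the previous registration.
        ReadyToSign | RegisteredNotAbleToSign if obs.epoch_changed => Unregistered,
        // Registration worked: decide whether signing is possible.
        Unregistered if obs.registration_open && obs.registration_succeeded => {
            if obs.can_sign {
                ReadyToSign
            } else {
                RegisteredNotAbleToSign
            }
        }
        // Not ready yet (round closed or registration failed): stay and retry later.
        Unregistered => Unregistered,
        ReadyToSign => ReadyToSign,
        RegisteredNotAbleToSign => RegisteredNotAbleToSign,
    }
}
```

Because every branch returns a state, an ordinary delay (registration round not open, node still syncing) can never be expressed as a panic in this model; it is just another cycle in `Unregistered`.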


What Data Moves Between Signer and Aggregator?

The signer communicates with the aggregator through HTTP endpoints. The data exchange has two main categories:

1) Registration data

This usually includes:

  • signer identity (pool id / party id),
  • signer verification key material,
  • protocol metadata (version, network),
  • additional fields required by the protocol for the epoch (e.g., KES-related fields depending on the build path).

This happens on startup and at epoch transitions.

2) Signature submission data

This includes:

  • a signature (cryptographic bytes encoded in JSON),
  • “won indexes” (the lottery results that justify why the signer can submit signatures),
  • identifiers tying this signature to the current open message.

This happens repeatedly during the operations phase whenever the aggregator has an open message.
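To make the two categories concrete, here is a minimal sketch of how the payloads could be modeled. All struct and field names are hypothetical and chosen for illustration; they are not the actual Mithril message types or JSON schema.

```rust
/// Hypothetical shape of the registration data sent for an epoch.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct RegistrationPayload {
    /// Signer identity (pool id / party id).
    party_id: String,
    /// Verification key material, encoded as text.
    verification_key: String,
    /// Protocol metadata.
    protocol_version: String,
    network: String,
    /// KES-related field; whether it is present depends on the build path.
    kes_evolutions: Option<u64>,
}

/// Hypothetical shape of a signature submission for the current open message.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct SignatureSubmission {
    /// Identifier tying the signature to the open message.
    signed_entity_id: String,
    /// Signature bytes, encoded as text for transport in JSON.
    signature: String,
    /// Lottery indexes that justify the submission.
    won_indexes: Vec<u64>,
}

/// Build a submission only when the lottery produced at least one won index.
/// With no won index the signer is not eligible and has nothing to send.
fn build_submission(
    signed_entity_id: &str,
    signature: &str,
    won_indexes: Vec<u64>,
) -> Option<SignatureSubmission> {
    if won_indexes.is_empty() {
        return None;
    }
    Some(SignatureSubmission {
        signed_entity_id: signed_entity_id.to_string(),
        signature: signature.to_string(),
        won_indexes,
    })
}
```

Returning `Option` makes "lost the lottery" an ordinary outcome the caller has to handle, rather than an error.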


What Is “Beacon” and Why Do We Store It?

In the signer code, “beacon” is a compact identifier for “what was signed.” You can treat it like:

A unique key that represents a specific signed entity at a specific time/epoch.

The signer stores signed beacons locally so it can answer this question:

“Have I already signed this entity? If yes, do not sign again.”

This is idempotency and it protects both:

  • protocol correctness (avoid duplicates)
  • operational health (avoid useless work)

When Does the Signer Check “Already Signed” Beacons?

This check typically happens after the signer has candidate messages to sign but before it generates signatures and submits them. A simplified flow looks like:

1) Determine which signed entities are candidates for this round.

2) Query local DB to filter out entities already signed (filter_out_already_signed_entities).

3) Only sign entities that are not already signed.

4) Store the signed beacon and submit signature(s) to the aggregator.

This is a critical control path. If the DB lookup on this path fails, the signer may crash on every cycle or re-sign entities it has already signed.
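The filter-then-record flow above can be sketched with an in-memory set standing in for the local DB. `Beacon`, `SignedBeaconStore`, and the method names are assumptions made for this example; the real signer persists beacons in its database and its types differ.

```rust
use std::collections::HashSet;

/// Hypothetical compact identifier for "what was signed".
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Beacon {
    entity_type: String,
    epoch: u64,
}

/// In-memory stand-in for the signer's local store of signed beacons.
#[derive(Default)]
struct SignedBeaconStore {
    signed: HashSet<Beacon>,
}

impl SignedBeaconStore {
    /// Keep only the candidates that have not been signed yet.
    fn filter_out_already_signed(&self, candidates: Vec<Beacon>) -> Vec<Beacon> {
        candidates
            .into_iter()
            .filter(|beacon| !self.signed.contains(beacon))
            .collect()
    }

    /// Record a beacon as signed. Returns false if it was already recorded,
    /// so calling this twice for the same beacon is harmless.
    fn mark_signed(&mut self, beacon: Beacon) -> bool {
        self.signed.insert(beacon)
    }
}
```

A typical round would call `filter_out_already_signed` on the candidates, sign what remains, then call `mark_signed` for each entity that was signed.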


KES Period: What It Is and Why It Appears in Registration

KES stands for Key Evolving Signature, a Cardano mechanism in which a node’s signing key evolves over time to limit the impact of a key compromise. The KES period is an index derived from chain time: the current slot divided by the number of slots per KES period. In Mithril signer registration, the KES period may be used to compute a derived value such as the number of KES evolutions, which indicates how far the current KES period is from the start period encoded in the operational certificate.

The key idea is that KES-related values represent real chain state. If they are unknown, it is safer to treat them as “not ready” and retry rather than sending incorrect values.
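A minimal sketch of this "unknown means not ready" rule is shown below. The function and error names are hypothetical, not the signer's real API; the arithmetic assumes the KES period is the slot divided by the slots-per-KES-period parameter, and that evolutions are the current period minus the operational certificate's start period.

```rust
/// Why KES evolutions could not be computed (hypothetical error type).
#[derive(Debug, PartialEq, Eq)]
enum KesError {
    /// The current KES period is not known yet (e.g. node still syncing).
    PeriodUnavailable,
    /// The certificate's start period is after the current period.
    StartAfterCurrent { current: u64, start: u64 },
}

/// Derive the KES period from a slot number.
/// Returns None if the divisor is zero (misconfigured parameter).
fn kes_period_from_slot(slot: u64, slots_per_kes_period: u64) -> Option<u64> {
    slot.checked_div(slots_per_kes_period)
}

/// Compute how many evolutions separate the current KES period from the
/// start period of the operational certificate.
/// An unknown period is reported as an error instead of defaulting to 0.
fn kes_evolutions(
    current_kes_period: Option<u64>,
    opcert_start_kes_period: u64,
) -> Result<u64, KesError> {
    let current = current_kes_period.ok_or(KesError::PeriodUnavailable)?;
    current
        .checked_sub(opcert_start_kes_period)
        .ok_or(KesError::StartAfterCurrent {
            current,
            start: opcert_start_kes_period,
        })
}
```

The caller can treat `PeriodUnavailable` as a signal to stay unregistered and retry on the next cycle. Substituting a default such as 0 would let registration proceed with a value that does not reflect chain state.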