In this post we will look at what a data availability (DA) layer is, and explore options such as Ethereum itself, Arbitrum's solution, and Celestia.
L1 chains like Ethereum have three main responsibilities:
- executing transactions
- achieving consensus on transaction ordering
- guaranteeing the availability of the transaction data
Data availability refers to the last point. The idea is that all transaction-related data is available to the nodes on the network, which is what allows nodes to independently verify transactions and compute the blockchain state.
In other words, it is the ability of nodes to download the data within every block. Validators do this routinely: they download the transactions from a newly proposed block, re-execute them to confirm the block is correct according to the consensus rules, and add the block to the head of the chain once it is valid.
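The verification loop above can be sketched in a few lines. This is a toy model, not a real client: `execute`, `state_root`, and the transaction format are invented stand-ins, and the "state root" is just a hash of the state dictionary rather than a Merkle root.

```python
import hashlib

def state_root(state: dict) -> str:
    # Toy stand-in for a Merkle state root: a hash of the sorted state items.
    return hashlib.sha256(repr(sorted(state.items())).encode()).hexdigest()

def execute(state: dict, tx: dict) -> None:
    # Toy transfer transaction: debit the sender, credit the receiver.
    state[tx["from"]] = state.get(tx["from"], 0) - tx["amount"]
    state[tx["to"]] = state.get(tx["to"], 0) + tx["amount"]

def verify_block(pre_state: dict, txs: list, claimed_root: str) -> bool:
    # Re-execute every transaction -- this is why the full transaction data
    # must be available -- then compare the recomputed root with the one
    # the proposer put in the block header.
    state = dict(pre_state)
    for tx in txs:
        execute(state, tx)
    return state_root(state) == claimed_root
```

If even one transaction is missing or the claimed root is wrong, the recomputed root will not match and the block is rejected, which is exactly why withheld data is a problem.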
There are some problems with this, the first being throughput. Chains like Ethereum require nodes to store large amounts of data so they can verify it and serve it to any peer that needs it. Partly as a result, Ethereum is only able to process about 15-20 transactions per second. Storing all data on-chain also keeps growing the size of the blockchain, which drives up the hardware requirements for full nodes that must store an ever-increasing amount of state data. Rising hardware costs, in turn, discourage individual participation.
A data availability layer is a system that stores transaction data and provides consensus on its availability. In short, it is the place where transaction data is published.
Rollups play an important role in scaling Ethereum by moving computation and state storage out of Ethereum's own environment. A rollup posts its transaction data in batches to the parent chain, which in cases like Optimism and Arbitrum is Ethereum.
Block data posted from a rollup to Ethereum is publicly available, allowing anyone to execute the transactions and validate the rollup chain's state, all while maintaining the principles of a blockchain.
But making data available on-chain can also be expensive, depending on which DA layer you select. For example, storing data on Ethereum is costly due to its high transaction fees. DA is a large part of the cost of running a rollup, and the efficiency of the DA solution determines how much activity the rollup can process at once and its overall performance.
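To get a feel for the cost, here is a rough back-of-the-envelope calculation. The per-byte gas cost is Ethereum's post-EIP-2028 calldata price for nonzero bytes; the batch size and gas price are purely illustrative assumptions.

```python
# Rough cost of posting one rollup batch as Ethereum calldata.
NONZERO_BYTE_GAS = 16      # gas per nonzero calldata byte (EIP-2028)
batch_bytes = 100_000      # assumed 100 KB batch; worst case, all bytes nonzero
gas_price_gwei = 20        # assumed gas price

gas_used = batch_bytes * NONZERO_BYTE_GAS
cost_eth = gas_used * gas_price_gwei / 1e9   # gwei -> ETH
print(gas_used, cost_eth)  # 1600000 gas, 0.032 ETH per batch
```

At these assumed numbers that is 0.032 ETH per 100 KB batch, and it scales linearly with activity, which is why rollups look for cheaper DA options.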
Because posting directly to Ethereum is expensive, third-party solutions like Celestia have been developed to provide cheap and efficient data availability for rollups. Some L3s that use an L2 as their settlement layer also use that L2 as the data availability layer, thanks to its low cost of submitting transactions.
How does Arbitrum provide a DA solution?
Arbitrum offers two modes for the DA layer. In Rollup mode, all transaction data is included on the parent chain, either in the calldata of the batch-posting transaction or in blobs submitted with it.
In AnyTrust mode, the transaction data is first submitted to a group of nodes known as the Data Availability Committee (DAC). The DAC stores and distributes the data, and instead of including the entire dataset on-chain, only a cryptographic attestation that the data is held by the DAC is submitted to the parent chain. This significantly reduces the amount of data stored on-chain, and with it the cost.
Data flow on Arbitrum AnyTrust:
- the sequencer queues the transactions and batches them together
- the batch is submitted to the parent chain; in AnyTrust mode the sequencer first sends the batch to the DAC, then submits the Data Availability Certificate that the DAC generates and returns to the parent chain
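The flow above can be sketched as follows. This is a hypothetical, heavily simplified model: the class and function names are invented, and the real DAC uses BLS aggregate signatures rather than the toy "signatures" shown here.

```python
import hashlib

class DACMember:
    # One committee member: keeps the full batch off-chain and attests to it.
    def __init__(self, name: str):
        self.name = name
        self.storage = {}

    def store_and_sign(self, batch: bytes) -> str:
        h = hashlib.sha256(batch).hexdigest()
        self.storage[h] = batch              # full data stays off-chain
        return f"sig({self.name},{h})"       # toy signature over the hash

def sequencer_post_batch(batch: bytes, committee: list, threshold: int) -> dict:
    # Send the batch to every committee member and collect attestations.
    signatures = [m.store_and_sign(batch) for m in committee]
    if len(signatures) < threshold:
        raise RuntimeError("not enough DAC signatures for a certificate")
    # Only this small certificate goes to the parent chain, not the batch.
    return {"data_hash": hashlib.sha256(batch).hexdigest(),
            "signatures": signatures}
```

The point of the design is visible in the return value: whatever the batch size, the certificate posted on-chain is just a hash plus a fixed number of signatures.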
How does Celestia work?
The idea behind Celestia is to decouple transaction execution from the consensus layer. The consensus layer is responsible only for transaction ordering and for guaranteeing data availability.
In Celestia, a block is considered valid only if the data behind it is available. This prevents block producers from releasing block headers without releasing the data behind them, which would stop clients from reading the transactions they need to compute the state of their applications.
Celestia introduces a new primitive, data availability sampling (DAS). It provides an efficient solution to the DA problem by requiring resource-limited light nodes to sample only a small number of random shares from each block to verify availability. As more light nodes participate in sampling, the amount of data the network can safely handle increases. This enables larger blocks without increasing the cost of verifying the chain.
The traditional approach to verifying DA is:
- you must download all the data
- light nodes cannot do this, since it requires too much bandwidth and storage
- so only full nodes can do it
- which is difficult to scale
So you either sacrifice decentralization by having only a few full nodes, or keep blocks small by capping throughput.
Celestia's data availability sampling lets light nodes verify data by downloading only a few tiny random samples instead of everything. Each light node randomly samples a few small pieces of the block's data.
Light nodes can thus verify availability with minimal bandwidth, block size can scale without forcing nodes to download everything, and more light nodes means stronger security. Decentralization is maintained even with massive blocks.
The only tradeoff is that the guarantee is probabilistic rather than deterministic. With sufficient samples, the probability of failure is very low.
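"Very low" can be quantified. Assuming, as in Celestia's 2D erasure-coding scheme, that an adversary must withhold at least 25% of the shares to make a block unrecoverable, the chance that k independent random samples all miss the withheld portion shrinks exponentially with k:

```python
def miss_probability(withheld_fraction: float, samples: int) -> float:
    # Probability that every sample lands on an available share, i.e. that
    # an unavailable block slips past one light node. This models sampling
    # with replacement; without replacement the odds are even better.
    return (1 - withheld_fraction) ** samples

for k in (10, 20, 30):
    print(k, miss_probability(0.25, k))
# 30 samples already push the failure probability below 0.02%.
```

And this is per light node; for the network to be fooled, every sampling node would have to miss, which is why more light nodes means stronger security.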