Salikh Osmanov

Posted on Jul 30 • Edited on Aug 10

Computing the actual size of a TON smart contract storage

#tvm #smartcontract #func #ton

Sometimes, it’s useful to know the exact size of the data stored by a smart contract. A contract must pay a storage fee, which depends on the size of the data stored.

In TON, data is stored in the c4 register of TVM as a Cell. Let’s briefly recall what a Cell is.

A Cell is a data structure that can contain up to 1023 bits of data and up to 4 references to other cells.
Read more: TON Documentation - Cell

The Cell is the fundamental building block of TVM.
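To make these limits concrete, here is a minimal sketch of a cell as a plain TypeScript object. It is only an illustrative model: `ModelCell` and `makeCell` are hypothetical names, not part of @ton/core or TVM, and the data bits are represented as a string of "0" and "1" characters for readability.

```typescript
// Illustrative model only: this is not the @ton/core Cell class.
interface ModelCell {
  bits: string;       // data as a string of "0"/"1" characters
  refs: ModelCell[];  // ordered references to other cells
}

const MAX_BITS = 1023; // data limit per cell
const MAX_REFS = 4;    // reference limit per cell

// Builds a model cell and rejects anything that exceeds the limits above.
function makeCell(bits: string = "", refs: ModelCell[] = []): ModelCell {
  if (!/^[01]*$/.test(bits)) throw new Error("bits must contain only 0 and 1");
  if (bits.length > MAX_BITS) throw new Error("a cell holds at most 1023 bits");
  if (refs.length > MAX_REFS) throw new Error("a cell holds at most 4 references");
  return { bits, refs: [...refs] };
}

const leafCell = makeCell("0");            // 1 data bit, no references
const parentCell = makeCell("", [leafCell]); // no data, 1 reference
```

Larger payloads therefore have to be split across several cells linked by references, which is why the size of a contract's storage is a property of a whole cell graph rather than of a single cell.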

How to Compute the Storage Size?

According to the official documentation, the FunC standard library provides the following functions to determine the size of a cell and a slice:

compute_data_size?
slice_compute_data_size?
compute_data_size
slice_compute_data_size

It’s essential to understand how the size is actually computed.

Let’s take a look at the description of the compute_data_size? function:

"Returns (x, y, z, -1) or (null, null, null, 0). It recursively calculates the number of unique cells x, data bits y, and cell references z in the directed acyclic graph (DAG) at cell c. This provides the total storage used by the DAG while recognizing identical cells. "

The key concepts here are:

Unique cells
Identical cell recognition

So, what does “unique cells” mean? And how does the function recognize identical cells?

Let’s explore this through research and practical tests.

Project description

To conduct this research, I created a project based on the Blueprint template. You can find the project here.

Note: I won’t include source code here to keep the article clean and focused on the results.

The project includes:

A smart contract for computing storage size
Sandbox tests
A script for running production tests

The tests cover only the compute_data_size? and slice_compute_data_size? functions. Their strict counterparts (compute_data_size, slice_compute_data_size) perform the same computation; the only difference is that they raise an exception, instead of returning nulls, when the number of unique cells in the cell structure exceeds the max_cells parameter passed to the functions.
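The relationship between the two flavours can be sketched in TypeScript as follows. This is a model of the behaviour only, not FunC and not the contract's code: `Sizes`, `quietSize`, and `strictSize` are hypothetical names, and the measurement itself is replaced by a precomputed value.

```typescript
// Illustrative model only; the names below are hypothetical.
type Sizes = { cells: number; bits: number; refs: number };

// Quiet flavour: signals failure with null when the unique-cell count exceeds maxCells.
function quietSize(actual: Sizes, maxCells: number): Sizes | null {
  return actual.cells > maxCells ? null : actual;
}

// Strict flavour: same computation, but failure becomes an exception.
function strictSize(actual: Sizes, maxCells: number): Sizes {
  const result = quietSize(actual, maxCells);
  if (result === null) throw new Error("cell overflow: max_cells exceeded");
  return result;
}

const withinLimit = quietSize({ cells: 2, bits: 0, refs: 2 }, 2); // returned unchanged
const overLimit = quietSize({ cells: 3, bits: 2, refs: 2 }, 2);   // null
```

Because the strict versions add nothing but the exception, testing the quiet versions exercises the same counting logic.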

Prerequisites

You need to have Node.js installed to run the project locally. For an IDE, Visual Studio Code is recommended, along with the FunC Language Support plugin by Whales Corp, which can be found here.

Smart contract description

The smart contract includes the following functionality:

Filling the storage with a custom Cell structure via an internal message that carries a special opcode and the custom cell data.
Computing storage size using compute_data_size? via the get_results_for_cell method.
Computing storage size using slice_compute_data_size? via the get_results_for_slice method.

The contract also computes the size of the input data while the storage is being filled, to verify that computations on the raw input data match those on the stored data. This uses the ~dump and ~strdump functions (debug primitives), whose output can be seen when running the Sandbox tests.

Test Cases & Observations

To understand how compute_data_size? and slice_compute_data_size? behave under different scenarios, a variety of test cases were designed using custom cell structures. These tests aim to explore how the computation distinguishes between unique and identical cells, how data and references affect uniqueness, and how the max_cells limit behaves. Each test case builds upon the previous one, gradually increasing in complexity and providing insights through both predictable and edge-case configurations.

An empty cell

First of all, let's compute the size of an empty cell — a cell with no data and no references.

const root = beginCell().endCell();

The functions return the following results:

uniqueCells: 1
dataBits: 0
references: 0
success: true

Nothing surprising. Let’s try to fill the cell with some data.

Non-empty cell

In this case, we fill the cell with some data and get the results again.

const root = beginCell().storeStringTail("value 1").endCell();

Now the cell holds 7 characters of data, which is 56 bits in total (8 bits per character).

The function returns the following results:

uniqueCells: 1
dataBits: 56
references: 0
success: true
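The 56-bit figure can be checked by hand. The sketch below assumes the string is stored as UTF-8 bytes, so every ASCII character takes exactly one byte; `utf8BitLength` is a hypothetical helper written for this check, not a library function.

```typescript
// Counts the bits a string occupies when encoded as UTF-8.
function utf8BitLength(text: string): number {
  let byteCount = 0;
  for (const ch of text) {            // iterates by Unicode code point
    const cp = ch.codePointAt(0)!;
    byteCount += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
  }
  return byteCount * 8;
}

const valueBits = utf8BitLength("value 1"); // 7 bytes * 8 = 56
```

Note that a non-ASCII character would take more than 8 bits under this assumption, so the "8 bits per character" rule holds only for ASCII text.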

Again, the results are predictable. So how about references?

One reference

Now we add one more cell to see the results for two cells.

const cell_1 = beginCell().endCell();
const root = beginCell().storeRef(cell_1).endCell();

We have two cells with no data. The root cell refers to cell_1. cell_1 has no references.

This time we get the following results:

uniqueCells: 2
dataBits: 0
references: 1
success: true

No surprises here. There is one reference from root to cell_1, no data bits, and two cells total.

Let’s make our cell structure a bit more complicated.

No references / no value

In this case, we add a third cell which is identical to cell_1 (no data, no references) and observe how the functions work.

const cell_2 = beginCell().endCell();
const cell_1 = beginCell().endCell();
const root = beginCell().storeRef(cell_1).storeRef(cell_2).endCell();

We want to see how two identical cells (cell_1 and cell_2) are treated when computing the size of the cell structure.

Now we have the following results:

uniqueCells: 2
dataBits: 0
references: 2
success: true

The interesting part here is that we have only 2 unique cells. Therefore, we can make our first conclusion:

Cells with no data and no references are treated as one unique cell.

No references / the same value

But how about the data? What results do we get if we have no empty cells? First let’s try with the same data.

const cell_2 = beginCell().storeUint(0,1).endCell();
const cell_1 = beginCell().storeUint(0,1).endCell();
const root = beginCell().storeRef(cell_1).storeRef(cell_2).endCell();

cell_1 and cell_2 now contain the same 1-bit uint value 0.

After invoking functions, we get the results:

uniqueCells: 2
dataBits: 1
references: 2
success: true

We see that cell_1 and cell_2 are again treated as one unique cell. As a result, only 1 data bit and 2 unique cells are reported.

Our next conclusion is:

Cells with no references but with the same values are treated as one unique cell.

No references / different values

Logically, with different values, cells cannot be treated as identical. But programmers must verify hypotheses.

Let’s fill cell_1 and cell_2 with different values this time.

const cell_2 = beginCell().storeUint(1,1).endCell();
const cell_1 = beginCell().storeUint(0,1).endCell();
const root = beginCell().storeRef(cell_1).storeRef(cell_2).endCell();

cell_1 has value 0, and cell_2 has value 1.

After invoking functions, we get the results:

uniqueCells: 3
dataBits: 2
references: 2
success: true

This time, all cells are treated as unique.

We are done with no-reference cells. Let’s experiment with references.

References to the same cell

We complicate our cell structure by adding a fourth cell. We want to understand how two cells with no value (or the same values) and references to the same cell are treated.

const the_same_cell = beginCell().endCell();
const cell_2 = beginCell().storeRef(the_same_cell).endCell();
const cell_1 = beginCell().storeRef(the_same_cell).endCell();
const root = beginCell().storeRef(cell_1).storeRef(cell_2).endCell();

After running functions, we get:

uniqueCells: 3
dataBits: 0
references: 3
success: true

We see that cell_1 and cell_2 are treated as one unique cell. But there’s one more interesting thing: 3 references are reported, although the structure actually contains 4 (two from root, plus one each from cell_1 and cell_2).

The two references from root are both counted, even though they point to identical cells. The outgoing references of cell_1 and cell_2, however, are counted only once, because these two cells are treated as one unique cell.

We can conclude:

Cells with no value or the same values that refer to the same cell are treated as one unique cell.
References from cells treated as one unique cell are counted only once.
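One way to make sense of these numbers is to model the traversal. The sketch below assumes that cells are deduplicated by content and that an already counted cell is skipped together with its outgoing references. It is a model that reproduces the observed results, not the TVM implementation; `ModelCell`, `contentKey`, and `measure` are hypothetical names.

```typescript
// Illustrative model only; not the TVM implementation.
interface ModelCell { bits: string; refs: ModelCell[] }

// Canonical description of a cell: its data plus the keys of its refs, in order.
function contentKey(cell: ModelCell): string {
  return `${cell.bits}(${cell.refs.map((ref) => contentKey(ref)).join(",")})`;
}

function measure(start: ModelCell) {
  const seen = new Set<string>();
  const total = { cells: 0, bits: 0, refs: 0 };
  const visit = (cell: ModelCell): void => {
    const key = contentKey(cell);
    if (seen.has(key)) return;        // already counted: skip the cell and its refs
    seen.add(key);
    total.cells += 1;
    total.bits += cell.bits.length;
    total.refs += cell.refs.length;   // outgoing refs are counted once per unique cell
    cell.refs.forEach(visit);
  };
  visit(start);
  return total;
}

const sharedCell: ModelCell = { bits: "", refs: [] };
const cellA: ModelCell = { bits: "", refs: [sharedCell] };
const cellB: ModelCell = { bits: "", refs: [sharedCell] };
const rootCell: ModelCell = { bits: "", refs: [cellA, cellB] };

const sizes = measure(rootCell); // { cells: 3, bits: 0, refs: 3 }
```

In this model, root contributes 2 references and the first of the two identical middle cells contributes 1. The second one is recognized as already seen, so its reference is never added, which gives 3 instead of 4.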

References to identical cells

With references to the same cell understood, how about references to identical cells? In this article, "identical cells" means separately built cells with the same content, which the functions count as a single unique cell.

Let’s add one more cell to see how it works.

const cell_2_1 = beginCell().endCell();
const cell_1_1 = beginCell().endCell();
const cell_2 = beginCell().storeRef(cell_2_1).endCell();
const cell_1 = beginCell().storeRef(cell_1_1).endCell();
const root = beginCell().storeRef(cell_1).storeRef(cell_2).endCell();

Here, cell_1_1 and cell_2_1 are identical (no data, no references), so they should count as one unique cell.

The functions give the following results:

uniqueCells: 3
dataBits: 0
references: 3
success: true

We see that cell_1 and cell_2 are treated as one unique cell, and so are cell_1_1 and cell_2_1. That gives 3 unique cells in total, including root. References from identical cells are counted only once, so the two actual references (from cell_1 to cell_1_1 and from cell_2 to cell_2_1) are counted as one reference.

Our next conclusion:

Cells with no data (or the same data) referring to identical cells are treated as one unique cell.

References to non-identical cells

To confirm that references to non-identical cells prevent the referring cells from being treated as one unique cell, let’s check with code.

const cell_2_1 = beginCell().storeUint(1, 1).endCell();
const cell_1_1 = beginCell().storeUint(0, 1).endCell();
const cell_2 = beginCell().storeRef(cell_2_1).endCell();
const cell_1 = beginCell().storeRef(cell_1_1).endCell();
const root = beginCell().storeRef(cell_1).storeRef(cell_2).endCell();

This time, cell_1_1 and cell_2_1 contain different values.

After invoking functions, we get:

uniqueCells: 5
dataBits: 2
references: 4
success: true

All 5 cells are different. Q.E.D.

Different references order

Now, we refer to identical cells, but the order of references is different. We use more than one reference per cell to check this, so our structure is more complex.

const bottom_cell_2 = beginCell().endCell();
const bottom_cell_1 = beginCell().storeUint(0,1).endCell();
const cell_2 = beginCell().storeRef(bottom_cell_2).storeRef(bottom_cell_1).endCell();
const cell_1 = beginCell().storeRef(bottom_cell_1).storeRef(bottom_cell_2).endCell();
const root = beginCell().storeRef(cell_1).storeRef(cell_2).endCell();

Both cell_1 and cell_2 refer to the same cells bottom_cell_1 and bottom_cell_2, but in different orders:

cell_1.ref1 --> bottom_cell_1
cell_1.ref2 --> bottom_cell_2

cell_2.ref1 --> bottom_cell_2
cell_2.ref2 --> bottom_cell_1

After invoking functions, we get:

uniqueCells: 5
dataBits: 1
references: 6
success: true

This leads to the conclusion:

If cells have the same data (or no data) and refer to the same cells, the order of references must be the same to treat them as one unique cell.
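The rule can be written down as a recursive comparison. The following sketch is a model of the conclusion above, not of how TVM compares cells internally (TON identifies cells by a hash of their content rather than by walking two structures side by side). `ModelCell` and `sameCell` are hypothetical names.

```typescript
// Illustrative model only; not the TVM implementation.
interface ModelCell { bits: string; refs: ModelCell[] }

// Two cells match when their data match and their refs match pairwise, in order.
function sameCell(a: ModelCell, b: ModelCell): boolean {
  return a.bits === b.bits
    && a.refs.length === b.refs.length
    && a.refs.every((ref, i) => sameCell(ref, b.refs[i]));
}

const bottom1: ModelCell = { bits: "0", refs: [] };
const bottom2: ModelCell = { bits: "", refs: [] };

const inOrder: ModelCell = { bits: "", refs: [bottom1, bottom2] };
const swapped: ModelCell = { bits: "", refs: [bottom2, bottom1] };
const sameOrderCopy: ModelCell = { bits: "", refs: [bottom1, bottom2] };

const swappedMatches = sameCell(inOrder, swapped);     // false: same refs, different order
const copyMatches = sameCell(inOrder, sameOrderCopy);  // true: same refs, same order
```

Because the comparison is positional, swapping two references is enough to turn otherwise matching cells into two unique ones.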

Identical structures except for the deepest level

Let’s check a scenario where two branches of our structure are absolutely identical at all intermediate levels, but differ in the very last referenced cell.

const cell_2_1_1 = beginCell().storeUint(3, 2).endCell();
const cell_1_1_1 = beginCell().storeUint(2, 2).endCell();

const cell_2_1 = beginCell().storeRef(cell_2_1_1).endCell();
const cell_1_1 = beginCell().storeRef(cell_1_1_1).endCell();

const cell_2 = beginCell().storeRef(cell_2_1).endCell();
const cell_1 = beginCell().storeRef(cell_1_1).endCell();

const root = beginCell().storeRef(cell_1).storeRef(cell_2).endCell();

Here, cell_1 and cell_2 refer to cells that have no value and exactly one reference each. But the deepest cells they point to — cell_1_1_1 and cell_2_1_1 — are different because they contain different values.

Results from compute_data_size?:

uniqueCells: 7
dataBits: 4
references: 6
success: true

Observation:
Even though the intermediate cells of the two branches look identical when viewed in isolation (no data, exactly one reference), the difference at the bottom means that every cell in both branches is treated as unique. This happens because uniqueness is checked recursively: if a referenced cell is different, its parent is automatically considered different, all the way up to the root.

Conclusion:
A difference at the deepest level propagates upward, making the entire branch unique in the computation.

Max cells limitation

As a final test, let’s look at the behavior when max_cells is set below the actual number of unique cells.

const cell_2 = beginCell().storeUint(1,1).endCell();
const cell_1 = beginCell().storeUint(0,1).endCell();
const root = beginCell().storeRef(cell_1).storeRef(cell_2).endCell();

We have 3 unique cells. Let’s set the max_cells parameter to 2 and see the results.

The functions return:

uniqueCells: null
dataBits: null
references: null
success: false

When the number of unique cells in the structure exceeds max_cells, the functions return nulls and report failure.

But here we had 3 unique cells. How about 3 cells but only 2 unique ones?

Max cells limitation for actual / unique cells imbalance

We again have 3 cells but only 2 unique ones, and max_cells is still set to 2.

const cell_2 = beginCell().endCell();
const cell_1 = beginCell().endCell();
const root = beginCell().storeRef(cell_1).storeRef(cell_2).endCell();

The function invocation returns:

uniqueCells: 2
dataBits: 0
references: 2
success: true

This lets us conclude:

max_cells limits only the number of unique cells.
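The two max_cells tests can be reproduced with a small model in which the limit is checked only when a new unique cell is about to be counted. The exact abort rule is an assumption chosen to match the results above, and the keys are made-up strings standing in for cell contents; `countUniqueWithLimit` is a hypothetical name.

```typescript
// Illustrative model only: each string stands in for the content of one visited cell.
function countUniqueWithLimit(visitKeys: string[], maxCells: number): number | null {
  const seen = new Set<string>();
  for (const key of visitKeys) {
    if (seen.has(key)) continue;              // repeats never count against the limit
    if (seen.size >= maxCells) return null;   // one more unique cell would exceed it
    seen.add(key);
  }
  return seen.size;
}

// 3 cells, 3 unique, max_cells = 2: fails.
const threeDistinct = countUniqueWithLimit(["root", "leaf:0", "leaf:1"], 2);       // null

// 3 cells, 2 unique, max_cells = 2: succeeds.
const twoDistinct = countUniqueWithLimit(["root", "leaf:empty", "leaf:empty"], 2); // 2
```

In practice this means a structure with many repeated subtrees can pass a max_cells check that a structure of the same raw size, but with distinct cells, would fail.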

Production Tests

The project includes a script (testInProduction.ts) to run all tests on-chain.
Results in production match Sandbox results.

Possible Improvements

To reduce testing costs, modify the smart contract to return excess TON after storage is filled.

Final Conclusions

The functions (compute_data_size?, slice_compute_data_size?, compute_data_size, slice_compute_data_size) calculate:

Unique cells
Data bits
References

Cells are identical if:

They have the same data (or no data)
They refer to identical cells
The order of their references is the same

Additional rules and nuances:

References from identical cells are counted once
max_cells limits unique cells only
A difference in the deepest referenced cells propagates upward, making all parent cells along the branch unique, even if intermediate cells have the same data and structure

Afterword

The test cases explored in this article helped demystify the behavior of compute_data_size? and slice_compute_data_size? in the TON Virtual Machine. By analyzing different cell configurations, the results provided a clear understanding of how data size is calculated, how identical cells are treated, and how reference structures influence outcomes.

These insights are essential for developers aiming to optimize smart contract storage and cost in TON, ensuring both accuracy and efficiency in contract design.
