Sometimes, it’s useful to know the exact size of the data stored by a smart contract. A contract must pay a storage fee, which depends on the size of the data stored.
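To make the dependence on size concrete, here is a rough sketch of the storage fee rule as I understand it from the TON fee documentation: the fee grows linearly with the number of cells, the number of bits, and the elapsed time. The function name storageFee and its parameters are illustrative assumptions. Real prices come from the blockchain configuration, and the real fee covers the whole account state, not only the contract data.

```typescript
// Rough model of the storage fee rule (an assumption based on the TON fee docs):
//   fee = ceil((cells * cellPrice + bits * bitPrice) * seconds / 2^16)
// cellPrice and bitPrice are placeholders for values taken from the network config.
function storageFee(
  cells: number,
  bits: number,
  seconds: number,
  cellPrice: number,
  bitPrice: number
): number {
  return Math.ceil(((cells * cellPrice + bits * bitPrice) * seconds) / 65536);
}
```

Whatever the exact prices are, the two inputs that matter are the cell count and the bit count, which is why the rest of the article focuses on how they are computed.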
In TON, data is stored in the c4 register of TVM as a Cell. Let’s briefly recall what a Cell is.
A Cell is a data structure that can contain up to 1023 bits and 4 references to other cells.
Read more: TON Documentation - Cell
The Cell is the fundamental building block of TVM.
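The two limits above can be captured in a small plain-TypeScript model. This is only a sketch for reasoning about cell structures. ModelCell and makeCell are hypothetical names, not part of @ton/core, and the model ignores everything about real cells except their data bits and references.

```typescript
// Simplified stand-in for a TVM cell (not the @ton/core Cell class).
interface ModelCell {
  bits: string;      // data bits written as "0"/"1" characters
  refs: ModelCell[]; // references to other cells, in order
}

const MAX_BITS = 1023;
const MAX_REFS = 4;

// Builds a model cell and enforces the two size limits of a cell.
function makeCell(bits: string, refs: ModelCell[] = []): ModelCell {
  if (!/^[01]*$/.test(bits)) throw new Error("bits must contain only 0 and 1");
  if (bits.length > MAX_BITS) throw new Error("a cell holds at most 1023 bits");
  if (refs.length > MAX_REFS) throw new Error("a cell holds at most 4 references");
  return { bits, refs };
}
```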
How to Compute the Storage Size?
According to the official documentation, the FunC standard library provides the following functions to determine the size of a cell and a slice:
compute_data_size?
slice_compute_data_size?
compute_data_size
slice_compute_data_size
It’s essential to understand how the size is actually computed.
Let’s take a look at the description of the compute_data_size? function:
"Returns (x, y, z, -1) or (null, null, null, 0). It recursively calculates the number of unique cells x, data bits y, and cell references z in the directed acyclic graph (DAG) at cell c. This provides the total storage used by the DAG while recognizing identical cells. "
The key concepts here are:
- Unique cells
- Identical cell recognition
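One plausible reading of the description is a depth-first traversal that remembers which cells it has already seen and adds a cell's bits and references only on the first visit. The sketch below models that reading in plain TypeScript. ModelCell, cellKey, and computeDataSizeModel are illustrative names, and the string key is only a stand-in for however TVM actually recognizes identical cells. It is a model to reason with, not the TVM implementation.

```typescript
// Simplified stand-in for a TVM cell (not the @ton/core Cell class).
interface ModelCell {
  bits: string;      // data bits written as "0"/"1" characters
  refs: ModelCell[]; // referenced cells, in order
}

// Canonical key: two cells get the same key only if they have the same bits
// and, recursively, the same referenced cells in the same order.
function cellKey(c: ModelCell): string {
  return c.bits + "[" + c.refs.map(cellKey).join(",") + "]";
}

// Counts unique cells, data bits, and references, visiting each unique cell once.
function computeDataSizeModel(root: ModelCell): { cells: number; bits: number; refs: number } {
  const seen: Record<string, boolean> = {};
  let cells = 0;
  let bits = 0;
  let refs = 0;
  const visit = (c: ModelCell): void => {
    const k = cellKey(c);
    if (seen[k]) return; // an identical cell was already counted
    seen[k] = true;
    cells += 1;
    bits += c.bits.length;
    refs += c.refs.length;
    c.refs.forEach(visit);
  };
  visit(root);
  return { cells, bits, refs };
}
```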
So, what does “unique cells” mean? And how does the function recognize identical cells?
Let’s explore this through research and practical tests.
Project description
To conduct this research, I created a project based on the Blueprint template. You can find the project here.
Note: I won’t include source code here to keep the article clean and focused on the results.
The project includes:
- A smart contract for computing storage size
- Sandbox tests
- A script for running production tests
The tests cover only the compute_data_size? and slice_compute_data_size? functions, because the only difference between these and their strict versions (compute_data_size, slice_compute_data_size) is that the strict versions raise an exception if the number of unique cells in the cell structure exceeds the max_cells parameter passed to the functions.
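The relationship between the two flavors can be sketched as a thin wrapper: the quiet version reports failure as a value, and the strict version turns the same failure into an exception. SizeResult and toStrict are hypothetical names used for illustration. The error text follows the FunC documentation as I recall it (a cell overflow exception, exit code 8) and should be treated as an assumption.

```typescript
type SizeResult = { cells: number; bits: number; refs: number };

// null stands in for the quiet result (null, null, null, 0).
// A strict-style call fails loudly instead of returning that tuple.
function toStrict(quiet: SizeResult | null): SizeResult {
  if (quiet === null) {
    // Assumption: the strict FunC functions throw a cell overflow exception here.
    throw new Error("cell overflow (exit code 8)");
  }
  return quiet;
}
```

Since the counting itself is the same in both flavors, testing the quiet versions is enough to learn how sizes are computed.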
Prerequisites
You need to have Node.js installed to run the project locally. For an IDE, Visual Studio Code is recommended, along with the FunC Language Support plugin by Whales Corp, which can be found here.
Smart contract description
The smart contract includes the following functionality:
- Filling the storage with a custom Cell structure via an internal message with a special opcode and custom cell data.
- Computing storage size using compute_data_size? via the get_results_for_cell method.
- Computing storage size using slice_compute_data_size? via the get_results_for_slice method.
The contract also computes the size of the input data during storage filling, verifying that computations on raw input data match those on stored internal data. This uses the ~dump and ~strdump functions (debug primitives), and the output can be seen when running Sandbox tests.
Test Cases & Observations
To understand how compute_data_size? and slice_compute_data_size? behave under different scenarios, a variety of test cases were designed using custom cell structures. These tests aim to explore how the computation distinguishes between unique and identical cells, how data and references affect uniqueness, and how the max_cells limit behaves. Each test case builds upon the previous one, gradually increasing in complexity and providing insights through both predictable and edge-case configurations.
An empty cell
First of all, let's compute the size of an empty cell — a cell with no data and no references.
const root = beginCell().endCell();
The functions return the following results:
- uniqueCells: 1
- dataBits: 0
- references: 0
- success: true
Nothing surprising. Let’s try to fill the cell with some data.
A non-empty cell
In this case, we fill the cell with some data and get the results again.
const root = beginCell().storeStringTail("value 1").endCell();
Now the cell has 7 symbols of data, which is 56 bits in total (8 bits per symbol).
The function returns the following results:
- uniqueCells: 1
- dataBits: 56
- references: 0
- success: true
Again, the results are predictable. So how about references?
One reference
Now we add one more cell to see the results for two cells.
const cell_1 = beginCell().endCell();
const root = beginCell().storeRef(cell_1).endCell();
We have two cells with no data. The root cell refers to cell_1. cell_1 has no references.
This time we get the following results:
- uniqueCells: 2
- dataBits: 0
- references: 1
- success: true
No surprises here. There is one reference from root to cell_1, no data bits, and two cells total.
Let’s make our cell structure a bit more complicated.
No references / no value
In this case, we add a third cell which is identical to cell_1 (no data, no references) and observe how the functions work.
const cell_2 = beginCell().endCell();
const cell_1 = beginCell().endCell();
const root = beginCell().storeRef(cell_1).storeRef(cell_2).endCell();
We want to see how two identical cells (cell_1 and cell_2) are treated when computing the size of the cell structure.
Now we have the following results:
- uniqueCells: 2
- dataBits: 0
- references: 2
- success: true
The interesting part here is that we have only 2 unique cells. Therefore, we can make our first conclusion:
Cells with no data and no references are treated as one unique cell.
No references / the same value
But how about the data? What results do we get if we have no empty cells? First let’s try with the same data.
const cell_2 = beginCell().storeUint(0,1).endCell();
const cell_1 = beginCell().storeUint(0,1).endCell();
const root = beginCell().storeRef(cell_1).storeRef(cell_2).endCell();
cell_1 and cell_2 now contain the same 1-bit uint value 0.
After invoking functions, we get the results:
- uniqueCells: 2
- dataBits: 1
- references: 2
- success: true
We see that cell_1 and cell_2 are again treated as one unique cell. Due to that, data bits are 1 and unique cells are 2.
Our next conclusion is:
Cells with no references but with the same values are treated as one unique cell.
No references / different values
Logically, with different values, cells cannot be treated as identical. But programmers must verify hypotheses.
Let’s fill cell_1 and cell_2 with different values this time.
const cell_2 = beginCell().storeUint(1,1).endCell();
const cell_1 = beginCell().storeUint(0,1).endCell();
const root = beginCell().storeRef(cell_1).storeRef(cell_2).endCell();
cell_1 has value 0, and cell_2 has value 1.
After invoking functions, we get the results:
- uniqueCells: 3
- dataBits: 2
- references: 2
- success: true
This time, all cells are treated as unique.
We are done with no-reference cells. Let’s experiment with references.
References to the same cell
We complicate our cell structure by adding a fourth cell. We want to understand how two cells with no value (or the same values) and references to the same cell are treated.
const the_same_cell = beginCell().endCell();
const cell_2 = beginCell().storeRef(the_same_cell).endCell();
const cell_1 = beginCell().storeRef(the_same_cell).endCell();
const root = beginCell().storeRef(cell_1).storeRef(cell_2).endCell();
After running functions, we get:
- uniqueCells: 3
- dataBits: 0
- references: 3
- success: true
We see that cell_1 and cell_2 are treated as one unique cell. There is one more interesting detail: 3 references are reported, although the structure actually contains 4.
The two references held by root are both counted, even though they point to identical cells. The outgoing references of cell_1 and cell_2, however, are counted only once, because those two cells are treated as one unique cell.
We can conclude:
- Cells with no value or the same values that refer to the same cell are treated as one unique cell.
- References from cells treated as one unique cell are counted only once.
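To see where the 3 comes from, it helps to list each unique cell once together with the number of references it holds. The sketch below does that on a plain-TypeScript model of this test case. ModelCell, cellKey, uniqueCellRefs, and sameCellExample are illustrative names, and identifying cells by a string key is an assumption made for the model, not a description of TVM internals.

```typescript
// Simplified stand-in for a TVM cell (not the @ton/core Cell class).
interface ModelCell {
  bits: string;
  refs: ModelCell[];
}

// Same bits and same children in the same order give the same key.
function cellKey(c: ModelCell): string {
  return c.bits + "[" + c.refs.map(cellKey).join(",") + "]";
}

// Lists every unique cell once with the number of references it holds.
function uniqueCellRefs(root: ModelCell): { key: string; refs: number }[] {
  const seen: Record<string, boolean> = {};
  const rows: { key: string; refs: number }[] = [];
  const visit = (c: ModelCell): void => {
    const k = cellKey(c);
    if (seen[k]) return;
    seen[k] = true;
    rows.push({ key: k, refs: c.refs.length });
    c.refs.forEach(visit);
  };
  visit(root);
  return rows;
}

// The structure from this test case, rebuilt in the model.
function sameCellExample(): { key: string; refs: number }[] {
  const theSameCell: ModelCell = { bits: "", refs: [] };
  const cell2: ModelCell = { bits: "", refs: [theSameCell] };
  const cell1: ModelCell = { bits: "", refs: [theSameCell] };
  const root: ModelCell = { bits: "", refs: [cell1, cell2] };
  return uniqueCellRefs(root);
}
```

In this model the list has three rows: root with 2 references, the merged cell_1/cell_2 with 1, and the_same_cell with 0, which adds up to the 3 references reported by the contract.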
References to identical cells
With references to the same cell understood, how about references to identical cells? In this article, identical cells are separately built cells with the same content, which the functions count as one unique cell.
Let’s add one more cell to see how it works.
const cell_2_1 = beginCell().endCell();
const cell_1_1 = beginCell().endCell();
const cell_2 = beginCell().storeRef(cell_2_1).endCell();
const cell_1 = beginCell().storeRef(cell_1_1).endCell();
const root = beginCell().storeRef(cell_1).storeRef(cell_2).endCell();
Here, cell_1_1 and cell_2_1 are basically one unique cell (no data / no references).
The functions give the following results:
- uniqueCells: 3
- dataBits: 0
- references: 3
- success: true
We see that cell_1 and cell_2 are treated as one unique cell, and so are cell_1_1 and cell_2_1. References from identical cells are counted only once, so the two actual references (from cell_1 to cell_1_1 and from cell_2 to cell_2_1) are counted as one reference.
Our next conclusion:
Cells with no data (or the same data) referring to identical cells are treated as one unique cell.
References to non-identical cells
To confirm that references to non-identical cells prevent our cells from being treated as one unique cell, let’s check with code.
const cell_2_1 = beginCell().storeUint(1, 1).endCell();
const cell_1_1 = beginCell().storeUint(0, 1).endCell();
const cell_2 = beginCell().storeRef(cell_2_1).endCell();
const cell_1 = beginCell().storeRef(cell_1_1).endCell();
const root = beginCell().storeRef(cell_1).storeRef(cell_2).endCell();
This time, cell_1_1 and cell_2_1 contain different values.
After invoking functions, we get:
- uniqueCells: 5
- dataBits: 2
- references: 4
- success: true
All 5 cells are different. Q.E.D.
Different reference order
Now, we refer to identical cells, but the order of references is different. We use more than one reference per cell to check this, so our structure is more complex.
const bottom_cell_2 = beginCell().endCell();
const bottom_cell_1 = beginCell().storeUint(0,1).endCell();
const cell_2 = beginCell().storeRef(bottom_cell_2).storeRef(bottom_cell_1).endCell();
const cell_1 = beginCell().storeRef(bottom_cell_1).storeRef(bottom_cell_2).endCell();
const root = beginCell().storeRef(cell_1).storeRef(cell_2).endCell();
Both cell_1 and cell_2 refer to the same cells bottom_cell_1 and bottom_cell_2, but in different orders:
cell_1.ref1 --> bottom_cell_1
cell_1.ref2 --> bottom_cell_2
cell_2.ref1 --> bottom_cell_2
cell_2.ref2 --> bottom_cell_1
After invoking functions, we get:
- uniqueCells: 5
- dataBits: 1
- references: 6
- success: true
This leads to the conclusion:
If cells have the same data (or no data) and refer to the same cells, the order of references must be the same to treat them as one unique cell.
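The effect of ordering is easy to reproduce in a small model where a cell's identity is a string built from its bits followed by the identities of its references, in stored order. ModelCell, cellKey, and orderExample are illustrative names, and the string key is an assumption made for the model rather than the way TVM identifies cells.

```typescript
// Simplified stand-in for a TVM cell (not the @ton/core Cell class).
interface ModelCell {
  bits: string;
  refs: ModelCell[];
}

// Identity of a cell: its bits plus the identities of its references,
// kept in the order in which the references were stored.
function cellKey(c: ModelCell): string {
  return c.bits + "[" + c.refs.map(cellKey).join(",") + "]";
}

// The same two children attached in opposite order, as in this test case.
function orderExample(): { key1: string; key2: string } {
  const bottomCell2: ModelCell = { bits: "", refs: [] };
  const bottomCell1: ModelCell = { bits: "0", refs: [] };
  const cell2: ModelCell = { bits: "", refs: [bottomCell2, bottomCell1] };
  const cell1: ModelCell = { bits: "", refs: [bottomCell1, bottomCell2] };
  return { key1: cellKey(cell1), key2: cellKey(cell2) };
}
```

The two keys differ only in the order of their parts, which is enough for the model to treat cell_1 and cell_2 as two distinct cells.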
Identical structures except for the deepest level
Let’s check a scenario where two branches of our structure are absolutely identical at all intermediate levels, but differ in the very last referenced cell.
const cell_2_1_1 = beginCell().storeUint(3, 2).endCell();
const cell_1_1_1 = beginCell().storeUint(2, 2).endCell();
const cell_2_1 = beginCell().storeRef(cell_2_1_1).endCell();
const cell_1_1 = beginCell().storeRef(cell_1_1_1).endCell();
const cell_2 = beginCell().storeRef(cell_2_1).endCell();
const cell_1 = beginCell().storeRef(cell_1_1).endCell();
const root = beginCell().storeRef(cell_1).storeRef(cell_2).endCell();
Here, cell_1 and cell_2 refer to cells that have no value and exactly one reference each. But the deepest cells they point to (cell_1_1_1 and cell_2_1_1) are different because they contain different values.
Results from compute_data_size?:
- uniqueCells: 7
- dataBits: 2
- references: 6
- success: true
Observation:
Even though the structures above the leaves are visually identical, the difference at the bottom means that all cells in both branches are treated as unique. This happens because uniqueness is checked recursively — if a referenced cell is different, its parent is automatically considered different, all the way up to the root.
Conclusion:
A difference at the deepest level propagates upward, making the entire branch unique in the computation.
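The upward propagation can be checked level by level in a small model. The sketch below builds two chains of empty cells that differ only in their leaf and compares the cells at each depth. ModelCell, cellKey, chain, and identicalPerLevel are hypothetical helpers, and the string key is a modeling assumption rather than TVM's actual mechanism.

```typescript
// Simplified stand-in for a TVM cell (not the @ton/core Cell class).
interface ModelCell {
  bits: string;
  refs: ModelCell[];
}

// Identity of a cell includes the identities of everything below it.
function cellKey(c: ModelCell): string {
  return c.bits + "[" + c.refs.map(cellKey).join(",") + "]";
}

// Builds a chain of `depth` empty cells ending in a leaf with the given bits.
function chain(leafBits: string, depth: number): ModelCell {
  let cell: ModelCell = { bits: leafBits, refs: [] };
  for (let i = 0; i < depth; i++) {
    cell = { bits: "", refs: [cell] };
  }
  return cell;
}

// For each level, from the top cell down to the leaf, reports whether
// the two chains hold identical cells at that level.
function identicalPerLevel(a: ModelCell, b: ModelCell): boolean[] {
  const result: boolean[] = [];
  let x: ModelCell | undefined = a;
  let y: ModelCell | undefined = b;
  while (x !== undefined && y !== undefined) {
    result.push(cellKey(x) === cellKey(y));
    x = x.refs[0];
    y = y.refs[0];
  }
  return result;
}
```

For example, identicalPerLevel(chain("10", 2), chain("11", 2)) mirrors the two branches of this test case (leaf values 2 and 3 stored in 2 bits) and yields false at every level, while two chains with the same leaf yield true at every level.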
Max cells limitation
As a final test, let’s look at the behavior when max_cells is limited to less than the actual number of cells.
const cell_2 = beginCell().storeUint(1,1).endCell();
const cell_1 = beginCell().storeUint(0,1).endCell();
const root = beginCell().storeRef(cell_1).storeRef(cell_2).endCell();
We have 3 unique cells. Let’s set the max_cells parameter to 2 and see the results.
The functions return:
- uniqueCells: null
- dataBits: null
- references: null
- success: false
When the actual size of the cell structure exceeds max_cells, we get nulls.
But here we had 3 unique cells. How about 3 cells but only 2 unique ones?
Max cells limitation for actual / unique cells imbalance
We again have 3 cells but only 2 unique ones.
const cell_2 = beginCell().endCell();
const cell_1 = beginCell().endCell();
const root = beginCell().storeRef(cell_1).storeRef(cell_2).endCell();
The function invocation returns:
- uniqueCells: 2
- dataBits: 0
- references: 2
- success: true
This lets us conclude:
max_cells limits only the number of unique cells.
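Both max_cells cases can be reproduced with a small model in which the limit is checked only when a cell that has not been seen before is about to be counted. ModelCell, cellKey, SizeResult, and computeDataSizeQuietModel are illustrative names. The exact point where TVM performs the check is an assumption here, chosen so that the model agrees with the two results above.

```typescript
// Simplified stand-in for a TVM cell (not the @ton/core Cell class).
interface ModelCell {
  bits: string;
  refs: ModelCell[];
}

function cellKey(c: ModelCell): string {
  return c.bits + "[" + c.refs.map(cellKey).join(",") + "]";
}

type SizeResult = { cells: number; bits: number; refs: number };

// Quiet-style model: returns null, standing in for (null, null, null, 0),
// as soon as a new unique cell would push the count past maxCells.
function computeDataSizeQuietModel(root: ModelCell, maxCells: number): SizeResult | null {
  const seen: Record<string, boolean> = {};
  const total: SizeResult = { cells: 0, bits: 0, refs: 0 };
  const visit = (c: ModelCell): boolean => {
    const k = cellKey(c);
    if (seen[k]) return true; // repeats never count against the limit
    if (total.cells >= maxCells) return false;
    seen[k] = true;
    total.cells += 1;
    total.bits += c.bits.length;
    total.refs += c.refs.length;
    return c.refs.every(visit); // stops at the first failure
  };
  return visit(root) ? total : null;
}
```

With a limit of 2, a root with two different children fails in this model, while a root with two identical empty children succeeds with 2 unique cells and 2 references.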
Production Tests
The project includes a script (testInProduction.ts) to run all tests on-chain.
Results in production match Sandbox results.
Possible Improvements
To reduce testing costs, modify the smart contract to return excess TON after storage is filled.
Final Conclusions
The functions (compute_data_size?, slice_compute_data_size?, compute_data_size, slice_compute_data_size) calculate:
- Unique cells
- Data bits
- References
Cells are identical if all of the following hold:
- They have the same data (or none)
- They refer to identical cells
- Their references are in the same order
Additional rules and nuances:
- References from identical cells are counted once
- max_cells limits unique cells only
- A difference in the deepest referenced cells propagates upward, making all parent cells along the branch unique, even if intermediate cells have the same data and structure
Afterword
The test cases explored in this article helped demystify the behavior of compute_data_size? and slice_compute_data_size? in the TON Virtual Machine. By analyzing different cell configurations, the results provided a clear understanding of how data size is calculated, how identical cells are treated, and how reference structures influence outcomes.
These insights are essential for developers aiming to optimize smart contract storage and cost in TON, ensuring both accuracy and efficiency in contract design.