Vahid Yousefzadeh

Posted on Oct 2

Oracle vs PostgreSQL: How Undo Tablespace and MVCC Handle Updates and Deletes

#oracle #devops #postgres #database

As you know, when in a transaction a row is deleted or updated, sometimes it is necessary to revert the performed change (rollback). For example, a user updates a row and immediately realizes that this change must be undone. In most modern relational databases that fully support ACID, this possibility exists, but the execution mechanism in each database can be different.

In this article, we intend to compare Oracle and PostgreSQL solutions for transaction management with a rollback perspective. Considering the length of the topic, this text will focus only on the rollback aspect, and other aspects such as locking mechanisms will be examined in a separate article.

Oracle’s Solution for Rollback

In Oracle Database, when executing a DELETE or UPDATE command, first the current version of the row(before change or deletion) is stored in Undo, and then the intended change is applied to the target table. Therefore, no new row is added to the table.

In other words, the old versions of rows are kept in separate datafiles and are not stored in the datafiles related to the table. If the user decides to ROLLBACK the operation, the changes are restored through the Undo space and the records return to their original state.

Also, Oracle, by using the data in the Undo Tablespace, can guarantee Read Consistency and answer users’ concurrent queries without being affected by current transactions.

See the example below:

SQL> select * from vahid.tb;
  ID NAME      LAST_NAME     
---- -------   ------------- 
   1 Vahid     Yousefzadeh

We want to change the id from 1 to 2:

--session 1
SQL> update vahid.tb set id=2 where id=1;
1 row updated.

The change is not yet committed, therefore it is visible only in this Session:

--session 1:
SQL> select * from vahid.tb;
ID NAME     LAST_NAME      
--- -------  -------------- 
  2 Vahid    Yousefzadeh

Also, other sessions can see the previous value through undo until the user commits or rollbacks the transaction:

--session 2
SQL> select * from vahid.tb;
        ID NAME                 LAST_NAME
---------- -------------------- --------------------
         1 Vahid                Yousefzadeh

This row has been read from the undo tablespace, which has separate file(s). To prove this, it is enough to enable trace for both sessions. Before enabling trace, it is better to flush the buffer cache so that Oracle is forced to transfer information from disk to memory:

SQL> alter system flush BUFFER_CACHE;
System altered.

Also, we must identify the datafile numbers related to the undo tablespace and users (the main table is stored in the users tablespace):

#user tablespace
SQL> select file# from v$datafile where name like '%o1_mf_tbs%';
     FILE#
----------
        16

#undo tablespace
SQL> select file# from v$datafile where name like '%undotb000.dbf';
     FILE#
----------
        17

The trace file shows that Oracle, to respond to the above query in Session 2, has also gone to the undo tablespace datafile (file#=17):

WAIT #139653680870088: nam='db file sequential read' ela=10 file#=17 block#=80 blocks=1 obj#=0 tim=27539759546
WAIT #139653680870088: nam='db file sequential read' ela=10 file#=17 block#=506 blocks=1 obj#=0 tim=27539759622

But in Session 1, which applied the changes and has a transaction running, only datafile 16 is referenced (file#=16):

select * from vahid.tb
END OF STMT
PARSE #140311865900152:c=102250,e=118238,p=65,cr=167,cu=0,mis=1,r=0,dep=0,og=1,plh=2878482057,tim=27466892306
EXEC #140311865900152:c=13,e=13,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=2878482057,tim=27466892499
WAIT #140311865900152: nam='db file sequential read' ela=13 file#=16 block#=810 blocks=1 obj#=72703 tim=27466892582
WAIT #140311865900152: nam='db file scattered read' ela=14 file#=16 block#=811 blocks=5 obj#=72703 tim=27466892678

Checking Table and Undo Size After Heavy DELETE and UPDATE
In Oracle database, when a large number of rows are deleted or updated, the table size does not change, but the undo tablespace temporarily grows. Of course, undo space is reusable for other transactions after the transaction ends, and the contents related to that transaction in undo are also overwritable (unless settings are applied by the DBA).

See the following scenario.

Undo file size:

SQL> select bytes/1024/1024 size_MB from v$datafile where name like '%undotb000.dbf';
   SIZE_MB
----------
         1

Table vahid.tb size:

SQL> select bytes/1024/1024 size_MB from dba_segments where segment_name='TB';
   SIZE_MB
----------
       446

Updating the table records:

SQL> update vahid.tb set OWNER='VAHID';
2721736 rows updated.

Check the size of the table and undo datafile again:

SQL> select bytes/1024/1024 size_MB from v$datafile where name like '%undotb000.dbf';
   SIZE_MB
----------
       438

SQL> select bytes/1024/1024 size_MB from dba_segments where segment_name='TB';
   SIZE_MB
----------
       446

As you can see, the undo size has increased and approached the table size, but the table itself has not grown. If we are concerned about the space used, we can drop the undo tablespace and create another one (if no active transaction is on it, it can be dropped).

Deleting the table rows:

SQL>  alter database datafile '/opt/oracle/oradata/FREE/FREEPDB1/undotb000.dbf' autoextend off;
Database altered.

SQL> delete vahid.tb;
2721736 rows deleted.

Check the size again:

SQL> select bytes/1024/1024 size_MB from v$datafile where name like '%undotb000.dbf';
   SIZE_MB
----------
       461

SQL> select bytes/1024/1024 size_MB from dba_segments where segment_name='TB';
   SIZE_MB
----------
       446

Only a slight change happened in undo size, while the table size remained the same.

The important point is that when inserting a row into table vahid.tb, by default the previous free space is reused. Of course, this space can also be fully reclaimed by shrinking the table.

For example, inserting tb1 back into tb:

SQL> insert into vahid.tb select * from vahid.tb1;
2721736 rows created.

Check the table size:

SQL> select bytes/1024/1024 size_MB from dba_segments where segment_name='TB';
   SIZE_MB
----------
       454

As you see, the table size had only a slight change, meaning the free space of table vahid.tb was used.

*One of the weaknesses of using an undo tablespace arises when a very large volume of records needs to be rolled back; in this case, reversing the changes can engage the database for hours and consume significant server resources.

PostgreSQL’s Solution for Rollback

In PostgreSQL, transaction management and ensuring Read Consistency is done by MVCC (Multi-Version Concurrency Control). This means that when an UPDATE or DELETE command is executed, PostgreSQL creates a new version of the row and the old row still remains in the table so that other transactions can see their snapshot.

These old versions are stored in the same table and are deleted by autovacuum when no transaction needs them (if autovacuum is not disabled).

As a result, if a large transaction (with significant updates or deletes) remains uncommitted, the table size increases (table bloat) and performance decreases. Even after committing the transaction, if the old versions (dead tuples) are not VACUUMed, this challenge remains.

Example:

vahiddb=# select * from tb;
 id | name  |  last_name
----+-------+-------------
  1 | Vahid | Yousefzadeh
(1 row)

Since autocommit and autovacuum are enabled by default in PostgreSQL, we disable them:

vahiddb=# \set AUTOCOMMIT off
vahiddb=# ALTER TABLE tb SET (autovacuum_enabled = false);
ALTER TABLE

Update the desired record in session 1:

-session 1
vahiddb=# update tb set id=2 where id=1;
UPDATE 1
vahiddb=*# select * from tb;
 id | name  |  last_name
----+-------+-------------
  2 | Vahid | Yousefzadeh
(1 row)

This change is not visible in session 2:

vahiddb=# select * from tb;
 id | name  |  last_name
----+-------+-------------
  1 | Vahid | Yousefzadeh
(1 row)

PostgreSQL has hidden columns for all tables that can provide more information about records:

--session 1:
vahiddb=*# select ctid,xmin,xmax,id,name,last_name from tb;
 ctid  | xmin | xmax | id | name  |  last_name
-------+------+------+----+-------+-------------
 (0,2) |  546 |    0 |  2 | Vahid | Yousefzadeh
(1 row)

--session 2:
vahiddb=# select ctid,xmin,xmax,id,name,last_name from tb;
 ctid  | xmin | xmax | id | name  |  last_name
-------+------+------+----+-------+-------------
 (0,1) |  545 |  546 |  1 | Vahid | Yousefzadeh
(1 row)

ctid: shows the physical address of the record in the page and row, and is different in two sessions because session 1 has executed a new transaction.
xmin: Transaction ID that created the record. In session 1, xmin=546, meaning the record was created by transaction 546, while in session 2 it shows 545.
xmax: Transaction ID that deleted or updated the row. Value 0 means the record has not been deleted yet, value 546 in session 2 means the record was updated or deleted by transaction 546.
As you see, both rows exist in the table. Now we commit the transaction.

--session 1:
vahiddb=*# commit;
COMMIT

--session 2:
vahiddb=# select ctid,xmin,xmax,id,name,last_name from tb;
 (0,2) |  546 |    0 |  2 | Vahid | Yousefzadeh

The Next Scenario: Table Bloat in PostgreSQL
Table bloat in PostgreSQL happens when old row versions remain after updates or deletes, making the table grow unnecessarily until VACUUM reclaims the space. See the following scenario:

Check the table size:

vahiddb=# \dt+ tb
 public | tb   | table | postgres | permanent   | 16 kB |

Now suppose column id is updated frequently. We update it 100,000 times in a loop:

DO $$
BEGIN
    FOR i IN 1..100000 LOOP
        UPDATE tb SET id = id + 1;
    END LOOP;
END
$$;

After these updates, the table size has increased from 16 KB to 5136 KB:

vahiddb=# \dt+ tb
 public | tb   | table | postgres | permanent   | 5136 kB |

Even though it has only one record:

vahiddb=# select * from tb;
   id   | name  |  last_name
--------+-------+-------------
 100001 | Vahid | Yousefzadeh

CTID shows the active record has moved after 100,000 updates:

vahiddb=# select ctid,xmin,xmax,id,name,last_name from tb;
 (636,149) |  558 |    0 | 100001 | Vahid | Yousefzadeh

Therefore, with each update, the record is copied to a new place and the previous version becomes a dead tuple. As a result, the table experiences bloat and requires a VACUUM.

Since autovacuum was disabled, we perform this operation manually:

vahiddb=# VACUUM FULL tb;
VACUUM
Time: 227.936 ms

*VACUUM FULL physically compacts the table and, of course, it can also be quite costly.

After VACUUM FULL, the table size is reduced back to 16 KB:

vahiddb=# \dt+ tb
 public | tb   | table | postgres | permanent   | 16 kB |

In practice, autovacuum is usually active, preventing table bloat unless large transactions or specific settings are applied.

Conclusion
Oracle uses a separate space called the Undo tablespace for rollback, and previous records are not stored in the main table. In contrast, PostgreSQL with MVCC keeps old versions in the same table, leading to table bloat, which requires regular vacuum to reclaim space.

DEV Community

Oracle vs PostgreSQL: How Undo Tablespace and MVCC Handle Updates and Deletes

Oracle’s Solution for Rollback

PostgreSQL’s Solution for Rollback

Top comments (0)