fernandosimoes

What happens when a row is updated?

When we update a tuple in a page, a tuple-oriented dbms does the following steps:

Let’s understand this better with a simple test table in Postgres.

postgres=# insert into test (value) values ('t1');
INSERT 0 1
postgres=# insert into test (value) values ('t2');
INSERT 0 1
postgres=# select ctid, value from test;
 ctid  | value
-------+-------
 (0,1) | t1
 (0,2) | t2
(2 rows)

Here we have a table test with a column value. We inserted two tuples into the table with values t1 and t2. The ctid column is a pseud column. The first number indicates the page the tuple is in; the second number represents the offset of that tuple in that specific page. Based on this information, we know that the two inserted tuples are in the page 0, and that t1 has the offset 1 and t2 has the offset 2.

Now let’s try to update t2.

postgres=# update test set value = 't2n' where value = 't2';
UPDATE 1
postgres=# select ctid, value from test;
 ctid  | value
-------+-------
 (0,1) | t1
 (0,3) | t2n
(2 rows)

Postgres didn’t update the data in the second offset. Instead, it deleted that tuple, and inserted a new one at the bottom of the page, hence the offset 3 for the inserted tuple. VACUUM FULL2 will also move the ctid.

Let’s see what happens:

postgres=# select ctid, value from test;
 ctid  | value
-------+-------
 (0,1) | t1
 (0,3) | t2n
(2 rows)

postgres=# delete from test where value = 't1';
DELETE 1
postgres=# select ctid, value from test;
 ctid  | value
-------+-------
 (0,3) | t2n
(1 row)

postgres=# vacuum full verbose test;
INFO:  vacuuming "public.test"
INFO:  "public.test": found 2 removable, 1 nonremovable row versions in 1 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM
postgres=# select ctid, value from test;
 ctid  | value
-------+-------
 (0,1) | t2n
(1 row)

But why Postgres handles updates this way?

Postgres uses Multi-Version Concurrency Control (MVCC) to handle concurrent transactions. When a tuple is updated, the old version of the tuple is kept in the page, and the new version is inserted at the bottom of the page. This way, the old version of the tuple is still available for other transactions that might be using it.

Postgres keeps track of the old version of the tuple using the xmin and xmax columns. When a tuple is updated, the xmax column is set to the transaction id of the transaction that updated the tuple. This way, Postgres knows that the tuple is no longer valid and should not be used by other transactions.

postgres=# CREATE EXTENSION pageinspect;
postgres=# SELECT lp as tuple, t_xmin, t_xmax, t_field3 as t_cid, t_ctid
                FROM heap_page_items(get_raw_page('test', 0));
 tuple | t_xmin | t_xmax | t_cid | t_ctid
-------+--------+--------+-------+--------
     1 |    907 |    912 |     0 | (0,3)
     2 |    910 |    911 |     0 | (0,2)
     3 |    912 |      0 |     0 | (0,3)
(3 rows)

Here we can see the two tuples we deleted before, and the updated tuple.

Just to make sure all things make sense, let’s check the current state of the table:


postgres=# select ctid, value, xmax from test;
 ctid  | value | xmax
-------+-------+------
 (0,3) | t2n  |    0
(1 row)