As I am currently preparing my session for the Swiss PGDay, which is about some of the new features of PostgreSQL 11, I thought this one is worth a blog post as well. Up to PostgreSQL 10, when you add a column with a non-null default value to a table, the whole table needed to be rewritten. With PostgreSQL 11 this is no longer the case: adding a column in such a way is almost instant.

We start by creating a test table in PostgreSQL 10:

postgres=# create table test ( a int, b text, c timestamp );
postgres=# insert into test (a, b, c)
           select aa.*, md5(aa::text), now()
           from generate_series(1, 1000000) aa;

This gave us 1'000'000 rows, and what I want to do is check the number of sequential scans against the table before and after the ALTER TABLE:

postgres=# select seq_scan from pg_stat_user_tables where relid = 'test'::regclass;
postgres=# alter table test add column d text default 'a';
postgres=# select seq_scan from pg_stat_user_tables where relid = 'test'::regclass;

As you can see, a sequential scan happened when the ALTER TABLE was performed, and it took more than a second to complete. Running the same test on PostgreSQL 11 shows no sequential scan at all, and the ALTER TABLE only took 5 ms to complete.

The question is: how does that work in the background? Actually, the idea is quite simple. The catalog table pg_attribute got two new columns called "attmissingval" and "atthasmissing":

     Column    |   Type   | Collation | Nullable | Default
---------------+----------+-----------+----------+---------
 atthasmissing | boolean  |           | not null |
 attmissingval | anyarray |           |          |
Indexes:
    "pg_attribute_relid_attnam_index" UNIQUE, btree (attrelid, attname)
    "pg_attribute_relid_attnum_index" UNIQUE, btree (attrelid, attnum)

As soon as a new column with a non-null default value is added to a table, these columns get populated. We can see that when we check our current table: the column we added has them set in pg_attribute. We know that all the rows in that table should have the new default value, but we also know that the table was not rewritten.

Running ALTER TABLE … ADD COLUMN can have side effects which have the potential to cause serious issues in production. Changing data structures is an important issue and comes up often, so it is important to understand what is really going on. Let's dive in and see how to run ALTER TABLE … ADD COLUMN in a way that doesn't hurt operations due to locking or extensive I/O.

Getting started with ALTER TABLE … ADD COLUMN

The following statement creates a simple table which is going to serve as a test dummy for the operations we plan to run:

test=# CREATE TABLE t_sample AS
           SELECT *
           FROM generate_series(1, 100000000) AS a1;

The initial table is around 3.4 GB in size, as shown in the next listing:

 Schema |   Name   | Type  | Owner | Persistence | Access method |  Size   | Description
--------+----------+-------+-------+-------------+---------------+---------+-------------
 public | t_sample | table | hs    | permanent   | heap          | 3458 MB |

Adding a column to a table in PostgreSQL

But what if columns are added? How does it impact the database? We have to address various scenarios here. The simplest one is to add a column without any default value:

test=# ALTER TABLE t_sample ADD COLUMN a2 int;

The important point here is: this operation is really fast, because all it does is run some magic on the system catalog. We do need a full table lock, BUT it is a really short lock, because PostgreSQL does not actually write this column to disk. The system catalog knows that one more column should be on disk – as nothing is present, the engine can safely assume that the value is indeed NULL.

A similar trick can be applied if we add a constant default value:
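To see what that looks like in practice, here is a minimal sketch. It assumes PostgreSQL 11 or later and the t_sample table created above; the column name a3 and the default value 42 are purely illustrative and not taken from the original test:

-- Sketch only: assumes PostgreSQL 11+ and the t_sample table created above.
-- The column name a3 and the default value 42 are illustrative.
ALTER TABLE t_sample ADD COLUMN a3 int DEFAULT 42;

-- The default is remembered in the system catalog instead of being written to every row:
SELECT attname, atthasmissing, attmissingval
FROM pg_attribute
WHERE attrelid = 't_sample'::regclass
  AND attname = 'a3';

If atthasmissing is true, PostgreSQL returns the value stored in attmissingval for every row that was written before the column existed, which is why no table rewrite is needed.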
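Another rough way to convince yourself that no rewrite takes place is to compare the physical size of the relation before and after the ALTER TABLE. Again, this is only a sketch under the same assumptions (PostgreSQL 11 or later, the test table from the first example); the column name e and the default value 'x' are made up:

-- Rough sketch: assumes PostgreSQL 11+ and the test table from the first example.
SELECT pg_size_pretty(pg_relation_size('test'));   -- size before adding the column

ALTER TABLE test ADD COLUMN e text DEFAULT 'x';

SELECT pg_size_pretty(pg_relation_size('test'));   -- unchanged size: no rewrite happened

-- The sequential scan counter should not move either:
SELECT seq_scan FROM pg_stat_user_tables WHERE relid = 'test'::regclass;

On PostgreSQL 10 the same ALTER TABLE would rewrite the whole table, which is exactly what the sequential scan counter in the first example showed.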