PostgreSQL 9.5 – ALLOW FOREIGN TABLES TO PARTICIPATE IN INHERITANCE.

Allow foreign tables to participate in inheritance.
 
Foreign tables can now be inheritance children, or parents.  Much of the
system was already ready for this, but we had to fix a few things of
course, mostly in the area of planner and executor handling of row locks.
 
As side effects of this, allow foreign tables to have NOT VALID CHECK
constraints (and hence to accept ALTER ... VALIDATE CONSTRAINT), and to
accept ALTER SET STORAGE and ALTER SET WITH/WITHOUT OIDS.  Continuing to
disallow these things would've required bizarre and inconsistent special
cases in inheritance behavior.  Since foreign tables don't enforce CHECK
constraints anyway, a NOT VALID one is a complete no-op, but that doesn't
mean we shouldn't allow it.  And it's possible that some FDWs might have
use for SET STORAGE or SET WITH OIDS, though doubtless they will be no-ops
for most.
 
An additional change in support of this is that when a ModifyTable node
has multiple target tables, they will all now be explicitly identified
in EXPLAIN output, for example:
 
 Update on pt1  (cost=0.00..321.05 rows=3541 width=46)
   Update on pt1
   Foreign Update on ft1
   Foreign Update on ft2
   Update on child3
   ->  Seq Scan on pt1  (cost=0.00..0.00 rows=1 width=46)
   ->  Foreign Scan on ft1  (cost=100.00..148.03 rows=1170 width=46)
   ->  Foreign Scan on ft2  (cost=100.00..148.03 rows=1170 width=46)
   ->  Seq Scan on child3  (cost=0.00..25.00 rows=1200 width=46)
 
This was done mainly to provide an unambiguous place to attach "Remote SQL"
fields, but it is useful for inherited updates even when no foreign tables
are involved.
 
Shigeru Hanada and Etsuro Fujita, reviewed by Ashutosh Bapat and Kyotaro
Horiguchi, some additional hacking by me

Initially I almost dismissed it. Another cool feature for hard-code nerds. But then it hit me. This is really big. Why? Read on …

Let's see how to get it working, and what we can use it for

I'll make myself 5 databases, named master and shard_1, shard_2, shard_3, and shard_4.

$ echo master shard_{1,2,3,4} | xargs -n1 createdb

Of course in real life scenario the databases would be on separate machines/instances, but this is just test environment, so it has to be enough.

Now, in each of these databases, I'll make a poor-man sharding table for users:

for a in {1..4}
do
    echo "
        create table users (id serial primary key, username text not null);
        alter sequence users_id_seq increment by 4 restart with $a;
    " | psql -d shard_$a; done
done

It creates the users table, but in each shard it will give different ids. In shard 1 it will be 1, 5, 9, …, in shard 2 it will be 2, 6, 10, … and so on.

With the tables in place, let's make a master table and make all other inherit from it.

$ create table users (id serial primary key, username text not null);
$ create extension postgres_fdw;
$ create server shard_1 foreign data wrapper postgres_fdw options( dbname 'shard_1' );
$ create server shard_2 foreign data wrapper postgres_fdw options( dbname 'shard_2' );
$ create server shard_3 foreign data wrapper postgres_fdw options( dbname 'shard_3' );
$ create server shard_4 foreign data wrapper postgres_fdw options( dbname 'shard_4' );
$ create user mapping for depesz server shard_1 options ( user 'depesz' );
$ create user mapping for depesz server shard_2 options ( user 'depesz' );
$ create user mapping for depesz server shard_3 options ( user 'depesz' );
$ create user mapping for depesz server shard_4 options ( user 'depesz' );
$ create foreign table users_shard_1 () INHERITS (users) server shard_1 options ( table_name 'users' );
$ create foreign table users_shard_2 () INHERITS (users) server shard_2 options ( table_name 'users' );
$ create foreign table users_shard_3 () INHERITS (users) server shard_3 options ( table_name 'users' );
$ create foreign table users_shard_4 () INHERITS (users) server shard_4 options ( table_name 'users' );

Tedious. But it can be make more automatic.

Now, with this in place I should be able to select, insert, update, and delete rows directly to shards:

[master] $ insert into users_shard_1 (username) values ('Alice') returning *;
 id | username 
----+----------
  1 | Alice
(1 row)
 
INSERT 0 1
 
[master] $ insert into users_shard_1 (username) values ('Bob') returning *;
 id | username 
----+----------
  2 | Bob
(1 row)
 
INSERT 0 1
 
[master] $ ^C
 
[master] $ insert into users_shard_2 (username) values ('Carol') returning *;
 id | username 
----+----------
  3 | Carol
(1 row)
 
INSERT 0 1
 
[master] $ insert into users_shard_2 (username) values ('Dan') returning *;
 id | username 
----+----------
  4 | Dan
(1 row)
 
INSERT 0 1
 
[master] $ insert into users_shard_3 (username) values ('Eve') returning *;
 id | username 
----+----------
  5 | Eve
(1 row)
 
INSERT 0 1
 
[master] $ insert into users_shard_3 (username) values ('Frank') returning *;
 id | username 
----+----------
  6 | Frank
(1 row)
 
INSERT 0 1
 
[master] $ insert into users_shard_4 (username) values ('George') returning *;
 id | username 
----+----------
  7 | George
(1 row)
 
INSERT 0 1
 
[master] $ insert into users_shard_4 (username) values ('Harold') returning *;
 id | username 
----+----------
  8 | Harold
(1 row)

Immediately I can see a problem – my sequences on shards do not give the ids I thought they will. Reason is simple – ID is assigned on master. So, to make it work sensibly, I would have to make 4 sequences on master, and do the magic there. Given that values up to 8 are already taken, I'll have to change restart values too:

$ create sequence users_id_seq_shard_1 increment by 4 restart with 9;
$ create sequence users_id_seq_shard_2 increment by 4 restart with 10;
$ create sequence users_id_seq_shard_3 increment by 4 restart with 11;
$ create sequence users_id_seq_shard_4 increment by 4 restart with 12;
$ alter foreign table users_shard_1 alter COLUMN id set default nextval('users_id_seq_shard_1');
$ alter foreign table users_shard_2 alter COLUMN id set default nextval('users_id_seq_shard_2');
$ alter foreign table users_shard_3 alter COLUMN id set default nextval('users_id_seq_shard_3');
$ alter foreign table users_shard_4 alter COLUMN id set default nextval('users_id_seq_shard_4');

And now, new inserts, will have ids assigned correctly:

[master] $ insert into users_shard_1 (username) values ('ian'), ('james') returning *;
 id | username 
----+----------
  9 | ian
 13 | james
(2 rows)
 
INSERT 0 2
 
[master] $ insert into users_shard_2 (username) values ('kenneth'), ('larry') returning *;
 id | username 
----+----------
 10 | kenneth
 14 | larry
(2 rows)
 
INSERT 0 2
 
[master] $ insert into users_shard_3 (username) values ('michael'), ('nicholas') returning *;
 id | username 
----+----------
 11 | michael
 15 | nicholas
(2 rows)
 
INSERT 0 2
 
[master] $ insert into users_shard_4 (username) values ('oscar'), ('paul') returning *;
 id | username 
----+----------
 12 | oscar
 16 | paul
(2 rows)
 
INSERT 0 2

Better.

Now, thanks to this new patch, I can scan them all together:

[master] $ explain analyze select count(*) from users;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=883.69..883.70 rows=1 width=0) (actual time=3.392..3.392 rows=1 loops=1)
   ->  Append  (cost=0.00..849.56 rows=13653 width=0) (actual time=0.880..3.381 rows=16 loops=1)
         ->  Seq Scan on users  (cost=0.00..0.00 rows=1 width=0) (actual time=0.001..0.001 rows=0 loops=1)
         ->  Foreign Scan on users_shard_1  (cost=100.00..212.39 rows=3413 width=0) (actual time=0.878..0.879 rows=4 loops=1)
         ->  Foreign Scan on users_shard_2  (cost=100.00..212.39 rows=3413 width=0) (actual time=0.864..0.864 rows=4 loops=1)
         ->  Foreign Scan on users_shard_3  (cost=100.00..212.39 rows=3413 width=0) (actual time=0.768..0.769 rows=4 loops=1)
         ->  Foreign Scan on users_shard_4  (cost=100.00..212.39 rows=3413 width=0) (actual time=0.857..0.860 rows=4 loops=1)
 Planning time: 0.292 ms
 Execution time: 4.871 ms
(9 rows)

Delete rows from them:

[master] $ delete from users where random() < 0.3 returning *;
 id | username 
----+----------
 10 | kenneth
 15 | nicholas
  7 | George
 12 | oscar
(4 rows)

and of course update them:

[master] $ explain analyze update users set username = lower(username) where username <> lower(username) returning *;
                                                       QUERY PLAN                                                        
-------------------------------------------------------------------------------------------------------------------------
 Update on users  (cost=0.00..625.61 rows=4941 width=42) (actual time=0.775..2.344 rows=7 loops=1)
   Update on users
   Foreign Update on users_shard_1
   Foreign Update on users_shard_2
   Foreign Update on users_shard_3
   Foreign Update on users_shard_4
   ->  Seq Scan on users  (cost=0.00..0.00 rows=1 width=42) (actual time=0.001..0.001 rows=0 loops=1)
         Filter: (username <> lower(username))
   ->  Foreign Scan on users_shard_1  (cost=100.00..156.40 rows=1235 width=42) (actual time=0.507..0.510 rows=2 loops=1)
   ->  Foreign Scan on users_shard_2  (cost=100.00..156.40 rows=1235 width=42) (actual time=0.313..0.315 rows=2 loops=1)
   ->  Foreign Scan on users_shard_3  (cost=100.00..156.40 rows=1235 width=42) (actual time=0.294..0.296 rows=2 loops=1)
   ->  Foreign Scan on users_shard_4  (cost=100.00..156.40 rows=1235 width=42) (actual time=0.261..0.262 rows=1 loops=1)
 Planning time: 0.135 ms
 Execution time: 2.898 ms
(14 rows)

Now, you have to notice that sharding in this way doesn't really make the queries faster – all foreign scans are done sequentially.

But, assuming some a bit larger dataset:

[master] $ insert into users_shard_1 (username) select 'random user ' || i from generate_series(1,100000) i;
INSERT 0 100000
 
[master] $ insert into users_shard_2 (username) select 'random user ' || i from generate_series(1,100000) i;
INSERT 0 100000
 
[master] $ insert into users_shard_3 (username) select 'random user ' || i from generate_series(1,100000) i;
INSERT 0 100000
 
[master] $ insert into users_shard_4 (username) select 'random user ' || i from generate_series(1,100000) i;
INSERT 0 100000

We can see that with fast queries it still makes sense.

[master] $ explain analyze select * from users where id = 123124;
QUERY PLAN
———————————————————————————————————————-
Append (cost=0.00..508.81 rows=29 width=36) (actual time=4.270..4.271 rows=1 loops=1)
-> Seq Scan on users (cost=0.00..0.00 rows=1 width=36) (actual time=0.001..0.001 rows=0 loops=1)
Filter: (id = 123124)
-> Foreign Scan on users_shard_1 (cost=100.00..127.20 rows=7 width=36) (actual time=1.020..1.020 rows=0 loops=1)
-> Foreign Scan on users_shard_2 (cost=100.00..127.20 rows=7 width=36) (actual time=1.247..1.247 rows=0 loops=1)
-> Foreign Scan on users_shard_3 (cost=100.00..127.20 rows=7 width=36) (actual time=1.004..1.004 rows=0 loops=1)
-> Foreign Scan on users_shard_4 (cost=100.00..127.20 rows=7 width=36) (actual time=0.993..0.994 rows=1 loops=1)
Planning time: 0.349 ms
Execution time: 6.070 ms
(9 rows)

This doesn't look so good. But, PostgreSQL has built in partitioning. Which we can now use for some great effects:

[master] $ alter foreign table  users_shard_1 add check ( id % 4 = 1 );
ALTER FOREIGN TABLE
 
[master] $ alter foreign table  users_shard_2 add check ( id % 4 = 2 );
ALTER FOREIGN TABLE
 
[master] $ alter foreign table  users_shard_3 add check ( id % 4 = 3 );
ALTER FOREIGN TABLE
 
[master] $ alter foreign table  users_shard_4 add check ( id % 4 = 0 );
ALTER FOREIGN TABLE

and with this in place:

[master] $ explain analyze select * from users where id = 123124 and (id % 4) = (123124 % 4);
                                                      QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..133.91 rows=2 width=36) (actual time=0.905..0.907 rows=1 loops=1)
   ->  Seq Scan on users  (cost=0.00..0.00 rows=1 width=36) (actual time=0.002..0.002 rows=0 loops=1)
         Filter: ((id = 123124) AND ((id % 4) = 0))
   ->  Foreign Scan on users_shard_4  (cost=100.00..133.91 rows=1 width=36) (actual time=0.902..0.903 rows=1 loops=1)
 Planning time: 0.448 ms
 Execution time: 1.451 ms
(6 rows)

This is much nicer.

Only two “tables" got scanned – users – empty parent of all shards, and the single shard that we needed.

If I used partition/sharding key differently (using ranges for example) PostgreSQL could have be even smart enough to pick the right shard without guiding with (id % 4) = (123124 % 4).

But this little thing doesn't really spoil the fun. Lo and behold it looks like we got sharding ( of course my choice of names for databases did spoil the fun a bit ) solution that will actually work.

With some simple triggers on users() table (to redirect inserts to partitions/shards) we'll be golden.

The only thing missing would be some way to run the foreign scans in parallel (when there is more than one). Maybe it will materialize one day, we'll see.

Great THANK YOU goes to all authors, reviewers and editors. This is really huge thing.

Search This Blog

TutorialDBA - Support | Training | Consultant

PostgreSQL 9.5 – ALLOW FOREIGN TABLES TO PARTICIPATE IN INHERITANCE.

Comments

Post a Comment

Popular posts from this blog

How to find the server is whether standby (slave) or primary(master) in Postgresql replication ?

How to Get Table Size, Database Size, Indexes Size, schema Size, Tablespace Size, column Size in PostgreSQL Database

7 Steps to configure BDR replication in postgresql

Postgres Streaming Replication Configuration

vacuumlo - removing large objects orphans from a database PostgreSQL