
Parallel VACUUMing

The vacuumdb utility now supports parallel jobs, specified with the -j option, just as in pg_dump and pg_restore. Vacuuming a database can therefore complete much faster, especially when tables are spread across multiple tablespaces. It also starts vacuuming the largest relations first.
For example:
 vacuumdb -j4 productiondb
This would vacuum the database named "productiondb" by spawning 4 vacuum jobs to run simultaneously.

This is great for multi-table vacuums/analyzes. Previously, running "vacuumdb some_database" vacuumed all tables one after another, in whatever order the server chose.
We can see this by creating some test tables:
for a in $( pwgen -A0 6 20 ); do
    psql -c "create table test_$a as select generate_series(1, 1000000)::int4 as i"
done
This created the following tables:
$ \d
           List of relations
 Schema |    Name     | Type  | Owner  
--------+-------------+-------+--------
 public | test_aeshae | table | depesz
 public | test_aijoox | table | depesz
 public | test_eeyies | table | depesz
 public | test_eicaeb | table | depesz
 public | test_ieveov | table | depesz
 public | test_isoexu | table | depesz
 public | test_jaidaf | table | depesz
 public | test_meijie | table | depesz
 public | test_ohceec | table | depesz
 public | test_okaeph | table | depesz
 public | test_ooghap | table | depesz
 public | test_oojuxo | table | depesz
 public | test_sheeru | table | depesz
 public | test_shofae | table | depesz
 public | test_thaepu | table | depesz
 public | test_uuyohl | table | depesz
 public | test_uyuque | table | depesz
 public | test_wainie | table | depesz
 public | test_yiegah | table | depesz
 public | test_zeecie | table | depesz
(20 rows)
If I run vacuumdb now, with no extra options, it shows:
=$ vacuumdb -e
vacuumdb: vacuuming database "depesz"
I.e. just "VACUUM" was run, without any table names – a database-wide vacuum that works in a single backend and processes all tables sequentially.
I could have influenced the order by doing something like:
vacuumdb -e -t test_zeecie -t test_yiegah -t test_wainie -t test_uyuque -t test_uuyohl -t test_thaepu -t test_shofae -t test_sheeru -t test_oojuxo -t test_ooghap -t test_okaeph -t test_ohceec -t test_meijie -t test_jaidaf -t test_isoexu -t test_ieveov -t test_eicaeb -t test_eeyies -t test_aijoox -t test_aeshae
vacuumdb: vacuuming database "depesz"
VACUUM test_zeecie;
VACUUM test_yiegah;
VACUUM test_wainie;
VACUUM test_uyuque;
VACUUM test_uuyohl;
VACUUM test_thaepu;
VACUUM test_shofae;
VACUUM test_sheeru;
VACUUM test_oojuxo;
VACUUM test_ooghap;
VACUUM test_okaeph;
VACUUM test_ohceec;
VACUUM test_meijie;
VACUUM test_jaidaf;
VACUUM test_isoexu;
VACUUM test_ieveov;
VACUUM test_eicaeb;
VACUUM test_eeyies;
VACUUM test_aijoox;
VACUUM test_aeshae;
But all tables were still vacuumed sequentially, one after another, in a single backend.
Now I can add the -j option, and it will parallelize processing:
$ vacuumdb -e -j8
SELECT c.relname, ns.nspname FROM pg_class c, pg_namespace ns
 WHERE relkind IN ('r', 'm') AND c.relnamespace = ns.oid
 ORDER BY c.relpages DESC;
VACUUM public.test_ohceec;
VACUUM public.test_zeecie;
VACUUM public.test_jaidaf;
VACUUM public.test_aeshae;
VACUUM public.test_ieveov;
VACUUM public.test_yiegah;
VACUUM public.test_shofae;
VACUUM public.test_thaepu;
VACUUM public.test_sheeru;
VACUUM public.test_uuyohl;
VACUUM public.test_eicaeb;
VACUUM public.test_ooghap;
VACUUM public.test_oojuxo;
VACUUM public.test_okaeph;
VACUUM public.test_meijie;
VACUUM public.test_aijoox;
VACUUM public.test_eeyies;
VACUUM public.test_isoexu;
VACUUM public.test_uyuque;
VACUUM public.test_wainie;
VACUUM pg_catalog.pg_proc;
VACUUM pg_catalog.pg_depend;
VACUUM pg_catalog.pg_attribute;
VACUUM pg_catalog.pg_description;
VACUUM pg_catalog.pg_statistic;
VACUUM pg_catalog.pg_operator;
VACUUM pg_catalog.pg_rewrite;
VACUUM pg_catalog.pg_type;
VACUUM pg_catalog.pg_class;
VACUUM information_schema.sql_features;
VACUUM pg_catalog.pg_amop;
VACUUM pg_catalog.pg_amproc;
VACUUM pg_catalog.pg_conversion;
VACUUM pg_catalog.pg_opclass;
VACUUM pg_catalog.pg_index;
VACUUM pg_catalog.pg_aggregate;
VACUUM pg_catalog.pg_collation;
VACUUM pg_catalog.pg_opfamily;
VACUUM pg_catalog.pg_cast;
VACUUM pg_catalog.pg_ts_config_map;
VACUUM pg_catalog.pg_language;
VACUUM pg_catalog.pg_authid;
VACUUM pg_catalog.pg_constraint;
VACUUM pg_catalog.pg_am;
VACUUM pg_catalog.pg_namespace;
VACUUM pg_catalog.pg_database;
VACUUM pg_catalog.pg_tablespace;
VACUUM pg_catalog.pg_pltemplate;
VACUUM pg_catalog.pg_shdepend;
VACUUM pg_catalog.pg_shdescription;
VACUUM pg_catalog.pg_ts_config;
VACUUM pg_catalog.pg_ts_dict;
VACUUM pg_catalog.pg_ts_parser;
VACUUM pg_catalog.pg_ts_template;
VACUUM pg_catalog.pg_extension;
VACUUM pg_catalog.pg_range;
VACUUM information_schema.sql_implementation_info;
VACUUM information_schema.sql_languages;
VACUUM information_schema.sql_parts;
VACUUM information_schema.sql_sizing;
VACUUM information_schema.sql_packages;
VACUUM information_schema.sql_sizing_profiles;
VACUUM pg_catalog.pg_seclabel;
VACUUM pg_catalog.pg_default_acl;
VACUUM pg_catalog.pg_policy;
VACUUM pg_catalog.pg_foreign_table;
VACUUM pg_catalog.pg_foreign_server;
VACUUM pg_catalog.pg_foreign_data_wrapper;
VACUUM pg_catalog.pg_auth_members;
VACUUM pg_catalog.pg_db_role_setting;
VACUUM pg_catalog.pg_event_trigger;
VACUUM pg_catalog.pg_trigger;
VACUUM pg_catalog.pg_largeobject_metadata;
VACUUM pg_catalog.pg_inherits;
VACUUM pg_catalog.pg_attrdef;
VACUUM pg_catalog.pg_enum;
VACUUM pg_catalog.pg_shseclabel;
VACUUM pg_catalog.pg_user_mapping;
VACUUM pg_catalog.pg_largeobject;
As we can see – it first fetches the list of tables to vacuum, sorted by relpages descending, which is basically each table's size on disk.
It then vacuums them in parallel, using multiple backends. In pg_stat_activity it looks like:
$ select pid, query from pg_stat_activity where pid <> pg_backend_pid()
 pid  |           query            
------+----------------------------
 5981 | VACUUM public.test_ohceec;
 5982 | VACUUM public.test_zeecie;
 5983 | VACUUM public.test_jaidaf;
 5984 | VACUUM public.test_aeshae;
 5985 | VACUUM public.test_ieveov;
 5986 | VACUUM public.test_yiegah;
 5987 | VACUUM public.test_shofae;
 5988 | VACUUM public.test_thaepu;
(8 rows)
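The scheduling above – take the table list sorted by relpages descending, then hand tables to a fixed pool of workers – can be sketched as a toy model in Python. This is not vacuumdb's actual implementation (that is C using libpq connections); parallel_vacuum and run are made-up names, and run stands in for a connection issuing "VACUUM <table>":

```python
# Toy model of vacuumdb -j style scheduling (NOT the real C implementation):
# sort tables by relpages descending, then feed them to a fixed worker pool.
from concurrent.futures import ThreadPoolExecutor

def parallel_vacuum(tables, jobs, run):
    """tables: iterable of (name, relpages); jobs: number of workers;
    run: callback standing in for a backend issuing VACUUM <name>."""
    # Mirror the query's ORDER BY c.relpages DESC: largest tables first.
    ordered = sorted(tables, key=lambda t: t[1], reverse=True)
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        for name, _pages in ordered:
            pool.submit(run, name)
        # leaving the `with` block waits for all workers to finish

# Demo with a stub "run" that just records the order; jobs=1 keeps the
# recorded order deterministic (with jobs=8, completion order may vary).
issued = []
parallel_vacuum([("pg_am", 1), ("test_ohceec", 4425), ("pg_proc", 81)],
                jobs=1, run=issued.append)
print(issued)   # largest first: ['test_ohceec', 'pg_proc', 'pg_am']
```

The design point the sketch captures is why largest-first matters: with a fixed pool, starting the biggest jobs early avoids one huge table running alone at the end while the other workers sit idle.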
Nice. Starting with the largest tables first is great. Thanks, guys.

