autovacuum database

- November 11, 2017

Make sure your largest database tables are vacuumed and analyzed frequently by setting stricter table-level auto-vacuum settings. The example below triggers an automatic VACUUM once more than 5,000 rows have been updated or deleted, and an automatic ANALYZE once more than 5,000 rows have been inserted, updated, or deleted.

ALTER TABLE table_name SET (autovacuum_vacuum_scale_factor = 0.0);
ALTER TABLE table_name SET (autovacuum_vacuum_threshold = 5000);
ALTER TABLE table_name SET (autovacuum_analyze_scale_factor = 0.0);
ALTER TABLE table_name SET (autovacuum_analyze_threshold = 5000);

The Symptoms

At Lob, we’ve built an internal website to track business metrics, assist our customer support team, and track the order status of our Postcard API and our Letter API. Aggregating some of this data, such as monthly revenue, requires complex queries against tables containing millions of rows. We’re accustomed to these types of queries taking several seconds, and for the most part we’re fine with the speed, given that the site is not customer facing. Performance is not our primary concern here.

As our tables continued to grow, however, queries began to take 20 seconds or more. The internal website became unusable.

The Diagnosis

We began to diagnose the cause of the slow queries by running them on the database with different parameters and monitoring execution times. We looked for patterns to relate the types of queries to their latency. An interesting correlation between a date filter and performance surfaced. A query that fetched all rows inserted over a month ago would return in ~1 second, while the same query run on rows from the current month was taking 20+ seconds.

With this discovery, the next step was to figure out why the performance of these queries differed by so much. PostgreSQL’s EXPLAIN statement was an essential tool. When Postgres receives a query, the first thing it does is try to optimize how the query will be executed based on its knowledge of the table structure, size, and indices. Prefixing any query with EXPLAIN will print out this execution plan without actually running it.

What EXPLAIN Told Us

When we compared the outputs of EXPLAIN on the fast and slow queries, the problem became immediately evident. When joining two tables on an indexed foreign key, Postgres was performing an efficient Hash Join for the fast running query, and an inefficient Nested Loop for the slower query.

Here’s the output for the fast query:

Sort  (cost=43509.92..43523.14 rows=5291 width=20)
  Sort Key: (date(charges.date_created))
  ->  HashAggregate  (cost=43116.55..43182.69 rows=5291 width=20)
        ->  Hash Join  (cost=186.57..43076.07 rows=5397 width=20)
              Hash Cond: ((charges.account_id)::text = (accounts.id)::text)
              ->  Seq Scan on charges  (cost=0.00..42774.71 rows=5409 width=41)
                    Filter: ((NOT deleted) AND (date_created > '2015-03-16 00:00:00'::timestamp without time zone) AND (date_created < '2015-03-24 00:00:00'::timestamp without time zone))
              ->  Hash  (cost=121.32..121.32 rows=5220 width=21)
                    ->  Seq Scan on accounts  (cost=0.00..121.32 rows=5220 width=21)
                          Filter: (NOT admin)

And here’s the explain output for the slow query:

GroupAggregate  (cost=42961.29..42961.32 rows=1 width=20)
  ->  Sort  (cost=42961.29..42961.30 rows=1 width=20)
        Sort Key: (date(charges.date_created)), charges.type
        ->  Nested Loop  (cost=0.00..42961.28 rows=1 width=20)
              Join Filter: ((charges.account_id)::text = (accounts.id)::text)
              ->  Seq Scan on charges  (cost=0.00..42774.71 rows=1 width=41)
                    Filter: ((NOT deleted) AND (date_created > '2015-03-17 00:00:00'::timestamp without time zone) AND (date_created < '2015-03-22 00:00:00'::timestamp without time zone))
              ->  Seq Scan on accounts  (cost=0.00..121.32 rows=5220 width=21)
                    Filter: (NOT admin)

Note that both of these queries were identical, except for the date ranges filtering the rows.
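One detail worth reading off the two plans is the planner's row estimate for the scan on charges: rows=5409 in the fast plan versus rows=1 in the slow one. When the planner expects a single row, a Nested Loop looks cheap on paper. Below is a minimal Python sketch for pulling those estimates out of EXPLAIN text. The helper name estimated_rows and its regular expression are illustrative assumptions, not part of any PostgreSQL tool, and the sample lines are copied from the plans above.

```python
import re

# Matches plan node lines such as:
#   ->  Seq Scan on charges  (cost=0.00..42774.71 rows=5409 width=41)
# Filter / Sort Key / Hash Cond lines carry no "(cost=...)" and are skipped.
NODE_RE = re.compile(
    r"^\s*(?:->\s*)?(?P<node>.+?)\s+\(cost=\S+ rows=(?P<rows>\d+) width=\d+\)"
)


def estimated_rows(plan_text):
    """Map each plan node description to the planner's estimated row count."""
    estimates = {}
    for line in plan_text.splitlines():
        match = NODE_RE.match(line)
        if match:
            estimates[match.group("node")] = int(match.group("rows"))
    return estimates


# Excerpts copied from the two plans in this post.
fast_plan = """\
        ->  Hash Join  (cost=186.57..43076.07 rows=5397 width=20)
              Hash Cond: ((charges.account_id)::text = (accounts.id)::text)
              ->  Seq Scan on charges  (cost=0.00..42774.71 rows=5409 width=41)
"""
slow_plan = """\
        ->  Nested Loop  (cost=0.00..42961.28 rows=1 width=20)
              Join Filter: ((charges.account_id)::text = (accounts.id)::text)
              ->  Seq Scan on charges  (cost=0.00..42774.71 rows=1 width=41)
"""

fast = estimated_rows(fast_plan)
slow = estimated_rows(slow_plan)
print(fast["Seq Scan on charges"], slow["Seq Scan on charges"])
```

An estimate that collapses to one row for a similar date range is a hint that the planner's statistics for the table may be out of date.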

The Cure

As mentioned above, when Postgres builds the query plan, it optimizes based on what it knows about the structure and size of the database. However, its knowledge of the database is not always up-to-date. Without accurate insight about the database tables, suboptimal query executions can be planned. In our case, the query optimizer created slower query plans for the newest rows. This explained how the same query was fast for older rows, of which the database had accurate knowledge, and slow for the youngest rows.

The solution was to VACUUM and ANALYZE the table. Vacuuming reclaims the space held by dead row versions left behind by updates and deletes, and analyzing refreshes the statistics the query planner keeps about the table's contents. We saw an immediate decrease in execution time for our complex queries, and as a result, a much more user-friendly internal website.

VACUUM ANALYZE table_name;

You can check the last time your tables were vacuumed and analyzed with the query below. In our case, we had tables that hadn’t been cleaned up in weeks.

SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_all_tables
WHERE schemaname = 'public';
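The result of that query can also be screened in code. Below is a minimal Python sketch, assuming the rows have already been fetched as tuples in the column order of the query above. The function name stale_tables, the seven-day cutoff, and the sample rows (including the events table) are made up for illustration.

```python
from datetime import datetime, timedelta


def stale_tables(rows, now, max_age=timedelta(days=7)):
    """Return names of tables with no manual or automatic ANALYZE within max_age.

    Each row is (relname, last_vacuum, last_autovacuum, last_analyze,
    last_autoanalyze). Timestamps are None when the action never ran.
    """
    stale = []
    for relname, _last_vacuum, _last_autovacuum, last_analyze, last_autoanalyze in rows:
        analyzed = [t for t in (last_analyze, last_autoanalyze) if t is not None]
        if not analyzed or now - max(analyzed) > max_age:
            stale.append(relname)
    return stale


# Hypothetical sample data. A real driver would likely return timezone-aware
# timestamps, in which case `now` must be timezone-aware as well.
now = datetime(2015, 3, 24)
rows = [
    ("charges", None, datetime(2015, 2, 20), None, datetime(2015, 2, 20)),
    ("accounts", None, datetime(2015, 3, 23), None, datetime(2015, 3, 23)),
    ("events", None, None, None, None),
]
print(stale_tables(rows, now))  # ['charges', 'events']
```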

Setting Up Auto Vacuum

To keep our tables from getting messy again and to avoid running VACUUM ANALYZE by hand, we made the default auto-vacuum settings stricter. Postgres runs an autovacuum daemon that periodically vacuums and analyzes tables. By default, a table is auto-vacuumed once the number of updated or deleted rows exceeds 20% of the table plus 50 rows, and auto-analyzed once the number of inserted, updated, or deleted rows exceeds 10% of the table plus 50 rows. These settings work fine for smaller tables, but once a table grows to millions of rows, hundreds of thousands of rows can change before the table is vacuumed and analyzed.

In our case, we set much more aggressive thresholds for our largest tables, using the commands below. With these settings, a table is vacuumed once more than 5,000 of its rows have been updated or deleted, and analyzed once more than 5,000 rows have been inserted, updated, or deleted.

ALTER TABLE table_name
SET (autovacuum_vacuum_scale_factor = 0.0);

ALTER TABLE table_name
SET (autovacuum_vacuum_threshold = 5000);

ALTER TABLE table_name
SET (autovacuum_analyze_scale_factor = 0.0);

ALTER TABLE table_name
SET (autovacuum_analyze_threshold = 5000);

The threshold to auto-vacuum is calculated by:

vacuum threshold = autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * number of rows in table

Similarly, the threshold to auto-analyze is calculated by:

analyze threshold = autovacuum_analyze_threshold + autovacuum_analyze_scale_factor * number of rows in table
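To see what these formulas mean in practice, here is a small Python sketch that plugs in the default settings (threshold 50, scale factors 0.2 and 0.1) and the tuned settings from this post (threshold 5000, scale factor 0.0). The function name autovacuum_trigger and the 5,000,000-row table size are assumptions chosen for illustration.

```python
def autovacuum_trigger(threshold, scale_factor, row_count):
    """Number of changed rows that must be exceeded before autovacuum acts."""
    return threshold + scale_factor * row_count


rows_in_table = 5_000_000  # hypothetical table size

# PostgreSQL defaults: threshold 50, scale factor 0.2 (vacuum) and 0.1 (analyze).
default_vacuum = autovacuum_trigger(50, 0.2, rows_in_table)
default_analyze = autovacuum_trigger(50, 0.1, rows_in_table)

# Tuned per-table settings from this post: threshold 5000, scale factor 0.0.
tuned = autovacuum_trigger(5000, 0.0, rows_in_table)

print(f"default vacuum trigger:  {default_vacuum:,.0f} rows")
print(f"default analyze trigger: {default_analyze:,.0f} rows")
print(f"tuned trigger:           {tuned:,.0f} rows")
```

With a scale factor of 0.0 the trigger no longer depends on table size, so a five-million-row table is treated the same as a five-thousand-row one.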
