PostgreSQL 9.6 Parallel query

- September 13, 2017

The first 9.6 feature presented at PGCon is also the new version's most prominent: parallel query. Haas, who leads a team of contributors that has been working on this feature for the last three years, presented on what's in 9.6 and what is still being worked on. Over those three years, the team has been adding multiple backend features to support parallelism. Haas and most of these contributors work for the PostgreSQL support company EnterpriseDB.

Parallel query is the ability to make use of multiple cores to execute the same query in order to speed up operations that can be parallelized. This feature is desirable for big data workloads and has been available in Oracle and DB2 for some time. The goal is to allow all of the resources of the system to be used to answer a single query if that query is the only one running. Parallel query has been on the PostgreSQL to-do list for over a decade.

With version 9.6, some queries are now parallelizable. Specifically, it can now execute sequential scans, some aggregates, and some joins in parallel. Haas said that his team plans to increase the number of query operations that can be parallelized gradually over the next few years.

The first operation made parallel was sequential scans, otherwise known as full table scans. A sequential scan is when the PostgreSQL engine performs an exhaustive scan of all the 8KB disk pages of the base table; it is normally needed when either the user has requested a large percentage of the table or when there are no useful indexes available. The team picked this operation first because it is relatively simple to parallelize. Also, as sequential scans over large tables can be quite slow, speeding them up appeals to users.

For example, as part of an audit you might want to total up all account balances per department in a financial system. Without parallel scan that can be quite slow:

bench=# explain (costs off, analyze on, timing off)
 select bid, sum(abalance) from accounts
 group by bid;

    HashAggregate (actual rows=500 loops=1)
      Group Key: bid
 ->  Seq Scan on accounts
     (actual rows=50000000 loops=1)
      Planning time: 0.050 ms
      Execution time: 10488.074 ms

With parallelism, you can speed up the operation a great deal, cutting execution time by as much as 75% using four "workers":

bench=# explain (costs off, analyze on, timing off)
 select bid, sum(abalance) from accounts
 group by bid;

    Finalize GroupAggregate (actual rows=500 loops=1)
  Group Key: bid
  ->  Sort (actual rows=2500 loops=1)
        Sort Key: bid
        Sort Method: quicksort  Memory: 233kB
        ->  Gather (actual rows=2500 loops=1)
       Workers Planned: 4
       Workers Launched: 4
       ->  Partial HashAggregate
    (actual rows=500 loops=5)
      Group Key: bid
      ->  Parallel Seq Scan
          on accounts
          (actual rows=10000000
    loops=5)
       Planning time: 0.118 ms
       Execution time: 2739.373 ms

As shown in the example, PostgreSQL 9.6 also includes support for parallel aggregation. Simple aggregates, like SUM() and AVG(), can now be executed in parallel in order to double or quadruple throughput. This doesn't work with windowing aggregates, ordered aggregates, or grouping sets yet; to be supported, each individual aggregate function needs additional code for a parallel context.

The third parallel operation included in the beta is joins, specifically hash joins and nested loop joins. These can now be parallelized in some cases in order to join two large tables more quickly. Merge joins, the third major type of join, are not supported because the industry standard algorithms are not parallelizable. According to Haas, the PostgreSQL project will need to invent a new algorithm in order to add the feature.

Parallel query works by creating several "dynamic background workers" for each query. As PostgreSQL operates on a multi-process model, each of these workers is a process. Workers take on part of the query and then the results are fed to the parent process via a "gather node" that collates the results. For example, in a parallel sequential scan, each worker reads one page of the relation at a time in order. The team tested more sophisticated algorithms for partitioning the scan, but they did not improve throughput.

This feature depends on many underlying changes, not just in version 9.6, but in the last several PostgreSQL releases. For example, PostgreSQL 9.4 included dynamic background workers and dynamic shared-memory allocation. PostgreSQL 9.5 introduced the parallel context and group locking.

Version 9.6 includes "parallel-aware executor nodes." PostgreSQL divides up each query, during the planning phase and again during the execution phase, into "nodes" that each represent one task to be performed, such as sorting data. Some of the code for these nodes is now different depending on whether the node is being executed in a parallel context or a non-parallel context. Parallelizing more query operations in future versions will be largely a matter of adding parallelization code to more types of executor nodes, such as index scan nodes.

Because parallel query is multi-process, the team also had to add a facility for passing messages between the master query process and the workers. This is achieved using a shared-memory message queue, in which the calling process allocates a chunk of shared memory for all of the workers to read and update. This permits forwarding errors and notices as well as data rows and addresses.

During the talk, an audience member asked Haas about resource management. In the beta, there are two settings available: max_parallel_degree, which sets the number of parallel workers for queries in the current session, and max_background_workers, which limits the total number of workers in the system as a whole. The latter is intended as an administrative control to prevent overwhelming the processor. Note that these configuration parameters are likely to change during the beta in order to provide better resource management.

While there is still a lot of work to do on parallel query, the current features already show great improvements to PostgreSQL's performance in the TPC-H data warehousing benchmark. Contributors ran tests on a donated IBM Power 7 system housed at the OSU Open Source Lab, using max_parallel_degree=4. Of the 22 long-running queries that make up part of the benchmark, three were four times faster, eleven were twice as fast, and the remaining three were either slightly faster or unchanged.

There are a few other limitations on the parallel query features. First, only read queries are currently run in parallel. Second, parallel queries don't work inserializable mode. They also don't work together with cursors. Contributors expect to fix some of these limitations in the next version of PostgreSQL.

Haas expects that various contributors will be working on adding more parallel operations to PostgreSQL for several years along with improving performance and flexibility. What's in 9.6 provides a framework that will allow more contributors to help with that effort.

Search This Blog

TutorialDBA - Support | Training | Consultant

PostgreSQL 9.6 Parallel query

Comments

Post a Comment

Popular posts from this blog

PostgreSQL Index

VMWARE WORKSTATION 3,4,5,6,7,8,9,10,11,12,14,15...etc LICENSE KEYS COLLECTION

How to CreateYour Own AWS Account Alias?

PostgreSQL Database startup / shutdown /restart

How to find the server is whether standby (slave) or primary(master) in Postgresql replication ?