
Oracle RAC STATSPACK/AWR performance reports

Interpreting an Oracle RAC (Real Application Clusters) STATSPACK/AWR performance report can be tricky, and it is quite different from reading a standard STATSPACK report.

The most obvious difference is the shared "cache fusion" component, which can skew the report data and often hides the root cause of a performance problem.

For example, consider this RAC STATSPACK report:

Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                   ---------------       ---------------
                  Redo size:              7,134.04              8,461.68
              Logical reads:             43,060.81             51,074.43
              Block changes:                 21.39                 25.38
             Physical reads:                154.16                182.84
            Physical writes:                 79.52                 94.31
                 User calls:                 68.52                 81.28
                     Parses:                 22.02                 26.12
                Hard parses:                  0.20                  0.24
                      Sorts:                 24.93                 29.57
                     Logons:                  1.29                  1.53
                   Executes:                 29.36                 34.83
               Transactions:                  0.84
 
  % Blocks changed per Read:    0.05    Recursive Call %:    18.71
 Rollback per transaction %:   21.31       Rows per Sort:    37.21
 
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            Buffer Nowait %:  100.00       Redo NoWait %:  100.00
            Buffer  Hit   %:   99.82    In-memory Sort %:  100.00
            Library Hit   %:   99.52        Soft Parse %:   99.09
         Execute to Parse %:   25.00         Latch Hit %:   99.92
Parse CPU to Parse Elapsd %:    2.98     % Non-Parse CPU:   99.32
 
 Shared Pool Statistics        Begin   End
                               ------  ------
             Memory Usage %:   81.11   83.06
    % SQL with executions>1:   71.32   71.46
  % Memory for SQL w/exec>1:   51.65   53.13
 
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                                     % Total
Event                                               Waits    Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
PX Deq Credit: send blkd                           14,513       1,864    46.27
CPU time                                                        1,002    24.89
library cache pin                                     413         486    12.07
PX qref latch                                         254         117     2.90
PX Deq: Table Q Sample                              1,272         110     2.74

Here we see high buffer hit ratios (which are deceptive), and note the "PX" entries in the top-5 timed events, which indicate that parallel query is turned on and being used.  This, in turn, points to full-scan activity (multiblock reads from large-table full-table scans and/or index fast full scans).
Because this is an OLTP database, we next check whether parallel_automatic_tuning is turned on.  It is, and in this case that is a very bad thing, because it has made full-scan access paths look cheaper to the optimizer than they should be:
parallel_adaptive_multi_user  TRUE
parallel_automatic_tuning     TRUE
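These settings are easy to confirm from SQL*Plus, and on an OLTP system like this one the fix is to turn automatic parallel tuning off.  A minimal sketch, assuming ALTER SYSTEM privilege, an spfile, and that the change suits your workload:

-- Confirm the parallel-query settings on this instance
show parameter parallel_automatic_tuning
show parameter parallel_adaptive_multi_user

-- Disable automatic parallel tuning on every RAC instance;
-- the parameter is typically static, so the change lands in the
-- spfile and only takes effect at the next instance restart
alter system set parallel_automatic_tuning = false scope=spfile sid='*';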
We also see relatively high pinging on this RAC node.  This database cannot be partitioned (load balanced) by end-user usage (placing different "types" of end-users on different nodes), so inter-node pinging is inevitable as common data blocks are requested between the active nodes:
Global Cache Service - Workload Characteristics
-----------------------------------------------
 
Global cache hit ratio:                                    0.2
Ratio of current block defers:                             0.0
% of messages sent for buffer gets:                        0.2
% of remote buffer gets:                                   0.0
Ratio of I/O for coherence:                                0.6
Ratio of local vs remote work:                             6.0
Ratio of fusion vs physical writes:                        0.0
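To see how much cache-fusion (interconnect) block traffic each node is generating, the underlying counters can be pulled straight from gv$sysstat.  A minimal sketch; the statistic names depend on the release ("global cache ..." in 9i, "gc ..." from 10g onward), so the LIKE patterns below cast a wide net:

-- Global cache (cache fusion) block-transfer counters, per instance
select inst_id, name, value
from   gv$sysstat
where  name like 'global cache%blocks%'   -- 9i-style statistic names
or     name like 'gc%blocks%'             -- 10g+ statistic names
order  by inst_id, name;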
The core performance problems:
Upon closer inspection of this database we find the following problems, none of which were clear from the high-level summary statistics:
  • Unnecessary parallelism - Each node has only two CPUs, and parallel_automatic_tuning needed to be turned off.
     
  • Missing indexes - Missing indexes were forcing unnecessary large-table full-table scans (see the sketch after this list).
     
  • I/O bottlenecks - A closer look at the I/O subsystem reveals extremely slow I/O timings and clear evidence of disk enqueues.
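The missing-index suspects can be spotted by listing the statements in the shared pool that are still full-scanning large tables.  A minimal sketch, assuming SELECT access to v$sql_plan and dba_segments; the 100 MB cutoff is an arbitrary illustration, not a value taken from this report:

-- SQL in the shared pool doing full scans of large tables
select p.object_owner,
       p.object_name,
       round(s.bytes/1024/1024) as mb,
       count(*)                 as plan_lines
from   v$sql_plan   p,
       dba_segments s
where  p.operation    = 'TABLE ACCESS'
and    p.options      = 'FULL'
and    s.owner        = p.object_owner
and    s.segment_name = p.object_name
and    s.segment_type = 'TABLE'
and    s.bytes        > 100*1024*1024     -- arbitrary "large table" cutoff
group  by p.object_owner, p.object_name, s.bytes
order  by mb desc;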

Obfuscated details
Unlike a single-node STATSPACK/AWR report, a RAC report is more subtle, and the top problems are not readily apparent from the top timed events.  In this case, even though the data buffer hit ratio is 99%, there is still a severe I/O contention issue (note the ###### entries in the Av Rd(ms) column below, read times too large for the report's column format), which can be remedied with a larger db_cache_size:
Tablespace
------------------------------
                 Av      Av     Av                    Av        Buffer Av Buf
         Reads Reads/s Rd(ms) Blks/Rd       Writes Writes/s      Waits Wt(ms)
-------------- ------- ------ ------- ------------ -------- ---------- ------
VPPP_SP
        37,754      10 ######     4.7          137        0          7    1.4
CBPO_SP
        33,015       9 ######     1.2           15        0        716    2.0
TEMP
         9,265       3    1.7    30.1        9,350        3          0    0.0
PRODMAN
        12,560       3 ######     4.0          125        0          0    0.0
PRODDTAI
         1,219       0    7.5     1.0          990        0          0    0.0
PERFSTAT
           555       0    5.1     1.0        1,434        0          0    0.0
UNDOTBS1
             0       0    0.0                1,465        0          0    0.0
SYSTEM
           507       0 ######     6.8           56        0          0    0.0
FSDPDM_SP
            85       0 ######     3.2          136        0          0    0.0
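Because db_cache_size is a dynamic parameter, the buffer cache can be grown without an outage, provided each node has memory to spare (and sga_max_size allows it).  A minimal sketch; the 2G figure is purely illustrative, not a size derived from this report:

-- Current buffer cache size on this instance
show parameter db_cache_size

-- Grow the default buffer cache on every RAC instance
-- (requires an spfile; 2G is an arbitrary illustrative value)
alter system set db_cache_size = 2G scope=both sid='*';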
