Drop Down MenusCSS Drop Down MenuPure CSS Dropdown Menu

PostgreSQL Architecture

 LIBPQ   -Library Pooled Quota
  • Details About Connected Users Using tools
  • libpq is the C application programmer's interface to PostgreSQL. libpq is a set of library functions that allow client programs to pass queries to the PostgreSQL backend server and to receive the results of these queries.
  • Client programs that use libpq must include the header file libpq-fe.h and must link with the libpq library
  • Here are also several complete examples of libpq applications in the directory src/test/examples in the source code distribution.
CLIENTS PROCESS:
  • Whenever we issue a query or the action made by us (client) is called the client process     
  • It is front end.
  • Front end may be a text application, graphical application or web server page.
  • Through TCP/IP clients access the server
  • Many users at a time can access the DB
  • FORKS – This process makes multi user access possible. It don’t disturb the postgres process
POSTMASTER:
  • The work of postmaster is that it authenticates the port (5432) and allocates process for users.
SERVER PROCESS:
  • It is also called as postgres. It accepts the connection from the clients(we) like database files and manages the database action.

Postgresql​ 10 naming updates:

9.6 Directory | 10 Directory
--------------+-------------
      pg_xlog |  pg_wal
      pg_clog |  pg_xact
       pg_log |  log

       
Postgres Server is Divided into Two parts 

I.Instance 

II.Storage


I.Instance is divide into two types 
1.Memory buffer
2.Utility Process

1.Memory Buffer:
a)Shared_buffer:
  • Sets the amount of memory the database server uses for shared memory buffers. The default is typically 128 megabytes (128MB), but might be less if your kernel settings will not support it (as determined during initdb). This setting must be at least 128 kilobytes. (Non-default values of BLCKSZ change the minimum.) However, settings significantly higher than the minimum are usually needed for good performance. This parameter can only be set at server start.
  • If you have a dedicated database server with 1GB or more of RAM, a reasonable starting value for shared_buffers is 25% of the memory in your system. There are some workloads where even large settings for shared_buffers are effective, but because PostgreSQL also relies on the operating system cache, it is unlikely that an allocation of more than 40% of RAM to shared_buffers will work better than a smaller amount. Larger settings for shared_buffers usually require a corresponding increase in checkpoint_segments, in order to spread out the process of writing large quantities of new or changed data over a longer period of time.
  • On systems with less than 1GB of RAM, a smaller percentage of RAM is appropriate, so as to leave adequate space for the operating system. Also, on Windows, large values for shared_buffers aren't as effective. You may find better results keeping the setting relatively low and using the operating system cache more instead. 
  • The useful range for shared_buffers on Windows systems is generally from 64MB to 512MB.

b)Wall_buffer:
  • The amount of shared memory used for WAL data that has not yet been written to disk. The default setting of -1 selects a size equal to 1/32nd (about 3%) of shared_buffers, but not less than 64kB nor more than the size of one WAL segment, typically 16MB. This value can be set manually if the automatic choice is too large or too small, but any positive value less than 32kB will be treated as 32kB. 
  • This parameter can only be set at server start.
  • The contents of the WAL buffers are written out to disk at every transaction commit, so extremely large values are unlikely to provide a significant benefit. However, setting this value to at least a few megabytes can improve write performance on a busy server where many clients are committing at once. The auto-tuning selected by the default setting of -1 should give reasonable results in most cases.
c)CLOG Buffers:
  • CLOG BUFFERS are one of the SLRU-style buffers oriented toward circular "rings" of data, like which transaction numbers have been committed or rolled back.
d)Temp_Buffer:
  • Sets the maximum number of temporary buffers used by each database session. 
  • These are session-local buffers used only for access to temporary tables. The default is eight megabytes (8MB). 
  • The setting can be changed within individual sessions, but only before the first use of temporary tables within the session; subsequent attempts to change the value will have no effect on that session.
  • A session will allocate temporary buffers as needed up to the limit given by temp_buffers. The cost of setting a large value in sessions that do not actually need many temporary buffers is only a buffer descriptor, or about 64 bytes, per increment in temp_buffers. However if a buffer is actually used an additional 8192 bytes will be consumed for it (or in general, BLCKSZ bytes).
e)Work_mem:
  • Specifies the amount of memory to be used by internal sort operations and hash tables before writing to temporary disk files. The value defaults to four megabytes (4MB). 
  • Note that for a complex query, several sort or hash operations might be running in parallel; each operation will be allowed to use as much memory as this value specifies before it starts to write data into temporary files. 
  • Also, several running sessions could be doing such operations concurrently. Therefore, the total memory used could be many times the value of work_mem; it is necessary to keep this fact in mind when choosing the value. Sort operations are used for ORDER BY, DISTINCT, and merge joins. 
  • Hash tables are used in hash joins, hash-based aggregation, and hash-based processing of IN subqueries.
f)Maintenance_work_mem:
  • Specifies the maximum amount of memory to be used by maintenance operations, such as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY. It defaults to 64 megabytes (64MB). 
  • Since only one of these operations can be executed at a time by a database session, and an installation normally doesn't have many of them running concurrently, it's safe to set this value significantly larger than work_mem. 
  • Larger settings might improve performance for vacuuming and for restoring database dumps.
Note that when autovacuum runs, up to autovacuum_max_workers times this memory may be allocated, so be careful not to set the default value too high. It may be useful to control for this by separately setting autovacuum_work_mem.

2.Utility(Background) Process:
a)BGWriter:
  • There is a separate server process called the background writer, whose function is to issue writes of "dirty" (new or modified) shared buffers. 
  • It writes shared buffers so server processes handling user queries seldom or never need to wait for a write to occur. 
  • However, the background writer does cause a net overall increase in I/O load, because while a repeatedly-dirtied page might otherwise be written only once per checkpoint interval, the background writer might write it several times as it is dirtied in the same interval. 
  • The parameters discussed in this subsection can be used to tune the behavior for local needs.
b)WallWriter:
  • WAL buffers are written out to disk at every transaction commit, so extremely large values are unlikely to provide a significant benefit. However, setting this value to at least a few megabytes can improve write performance on a busy server where many clients are committing at once. The auto-tuning selected by the default setting of -1 should give reasonable results in most cases.
  • specifies the delay between activity rounds for the WAL writer. In each round the writer will flush WAL to disk. It then sleeps for wal_writer_delay milliseconds, and repeats. 
  • The default value is 200 milliseconds (200ms). 
  • Note that on many systems, the effective resolution of sleep delays is 10 milliseconds; setting wal_writer_delay to a value that is not a multiple of 10 might have the same results as setting it to the next higher multiple of 10. This parameter can only be set in the postgresql.conf file or on the server command line.
c)SysLogger:Error Reporting and Logging
  • As per the figure, it is clearly understood that all – the utility processes + user backends + Postmaster Daemon are attached to syslogger process for logging the information about their activities. Every process information is logged under $PGDATA/pg_log with the file .log.
  • Debugging more on the process information will cause overhead on the server. Minimal tuning is always recommended. However, increasing the debug level when required. Click Here for further on logging parameters
  • logging collector, which is a background process that captures log messages sent to stderr and redirects them into log files
  • log_directory- data directory
  • log_filename -The default is postgresql-%Y-%m-%d_%H%M%S.log
  • the default permissions are 0600
d)CHECKPOINTS:
checkpoint will be occurs following scenarios:
  1. pg_start_backup,
  2. CREATE DATABASE, 
  3. pg_ctl stop|restart,
  4. pg_stop_backup,
  5. issue of commit,
  6. pages is dirty 
  7. and a few others.
  • It writes all dirty pages from memory to disk and cleans the shared_buffers area.
  • If your Postgres server crashed, you can measure the data loss between last Checkpoint value time and PostgreSQL stopped time. you can also recover your system using this information.
  • If we increase the checkpoint_segments then checkpoint will occur less and so I/O will be less as it need to write less to disk. 
  • IF large amount of data is inserted there is more generation of checkpoints.
  • Write-Ahead Logging (WAL) puts a checkpoint in the transaction log every so often. 
  • The CHECKPOINT command forces an immediate checkpoint when the command is issued, without waiting for a scheduled checkpoint.
  • A checkpoint is a point in the transaction log sequence at which all data files have been updated to reflect the information in the log. All data files will be flushed to disk.
  • If executed during recovery, the CHECKPOINT command will force a restartpoint rather than writing a new checkpoint.
  • Only superusers can call CHECKPOINT. The command is not intended for use during normal operation.
  • Force switch to a new transaction log file (restricted to superusers) or Switch to new xlog file pg_switch_xlog()
  • If you want to force the switch of a log file, you can execute the following query on Postgres:

      SELECT pg_switch_xlog();
More About CheckPoint
e)Stats Collector:
  • PostgreSQL's statistics collector is a subsystem that supports collection and reporting of information about server activity then update the information to optimizer(pg_catalog),optimizer generate query plan using pg_catalog. 
  •  the collector can count accesses to tables and indexes in both disk-block and individual-row terms. It also tracks the total number of rows in each table, and information about vacuum and analyze actions for each table. 
  • It can also count calls to user-defined functions and the total time spent in each one.
  • PostgreSQL also supports reporting of the exact command currently being executed by other server processes. This facility is independent of the collector process.
  • The statistics collector transmits the collected information to other PostgreSQL processes through temporary files. These files are stored in the directory named by the stats_temp_directory parameter, pg_stat_tmp by default. 
  • For better performance, stats_temp_directory can be pointed at a RAM-based file system, decreasing physical I/O requirements. 
  • When the server shuts down cleanly, a permanent copy of the statistics data is stored in the pg_stat subdirectory, so that statistics can be retained across server restarts. 
  • When recovery is performed at server start (e.g. after immediate shutdown, server crash, and point-in-time recovery), all statistics counters are reset.
f)Archiver:
  • Achiver process is optional process, default is OFF.
  • Setting up the database in Archive mode means to capture the WAL data of each segment file once it is filled and save that data somewhere before the segment file is recycled for reuse.
  • On Database Archivelog mode, once the WAL data is filled in the WAL Segment, that filled segment named file is created under PGDATA/pg_xlog/archive_status by the WAL Writer naming the file as “.ready”. File naming will be “segment-filename.ready”.
  • Archiver Process triggers on finding the files which are in “.ready” state created by the WAL Writer process. 
  • Archiver process picks the ‘segment-file_number’ of .ready file and copies the file from $PGDATA/pg_xlog location to its concerned Archive destination given in ‘archive_command’ parameter(postgresql.conf).
  • On successful completion of copy from source to destination, archiver process renames the “segment-filename.ready” to “segment-filename.done”. This completes the archiving process.
  • It is understood that, if any files named “segement-filename.ready” found in $PGDATA/pg_xlog/archive_status. They are the pending files still to be copied to Archive destination.
II.Storage:
PostgreSQL cluster 2.svg

how to find out the database and table a file path refers to on PostgreSQL database ?

There are three main patterns for paths:
  • 1.For files in the default tablespace: base/database_oid/filenode id for the relation
  • 2.For files in Non-default tablespace: pg_tblspc / tablespace_oid / tablespace_version_subdir / database_oid / filenode id for the relation
  • 3.For shared relations (see below): global/filenode id for the relation
1.For files in the default tablespace:
Tables’ file names aren’t necessarily the same as their oids in pg_class and  can change when VACUUM FULL, TRUNCATE, etc are run. 
For example:

billing_db=# \dt+
                    List of relations
 Schema | Name | Type  |  Owner   |  Size   | Description
--------+------+-------+----------+---------+-------------
 public | t1   | table | postgres | 0 bytes |
(1 row)

billing_db=# SELECT pg_relation_filepath('t1');
 pg_relation_filepath
----------------------
 base/18144/18146
(1 row)

billing_db=# insert into t1 values(2);
INSERT 0 1
billing_db=# insert into t1 values(2);
INSERT 0 1
billing_db=# insert into t1 values(2);
INSERT 0 1

billing_db=# SELECT pg_relation_filepath('t1');
 pg_relation_filepath
----------------------
 base/18144/18146
(1 row)

billing_db=# update t1 set id=1;
UPDATE 3
billing_db=# SELECT pg_relation_filepath('t1');
 pg_relation_filepath
----------------------
 base/18144/18146
(1 row)


billing_db=# vacuum t1;
VACUUM

billing_db=# SELECT pg_relation_filepath('t1');
 pg_relation_filepath
----------------------
 base/18144/18146
(1 row)

billing_db=# vacuum full t1;
VACUUM

billing_db=# SELECT pg_relation_filepath('t1');
 pg_relation_filepath
----------------------
 base/18144/18149
(1 row)
Note this Table id is changed from 18146 to 18149 after vacuum full



2.Non-default tablespace paths:
SELECT pg_relation_filepath('t1');
 pg_relation_filepath
----------------------
pg_tblspc / 16709 / PG_9.3_201306121 / 16499/19401
So the filename pattern breaks down to:
  • pg_tblspc: it’s in a non-default tablespace
  • 16709: it’s in the tablespace with oid 16709
  • PG_9.3_201306121: used by PostgreSQL 9.3 with catalog version 201306121.
  • 16499: in the database with oid 16499
  • 19401 the table with relfilenode id 19401

Example:
data_directory:
  • Specifies the directory to use for data storage. This parameter can only be set at server start.
config_file:
  • Specifies the main server configuration file (customarily called postgresql.conf). This parameter can only be set on the postgres command line.
hba_file:
  • Specifies the configuration file for host-based authentication (customarily called pg_hba.conf). This parameter can only be set at server start.
ident_file:
  • Specifies the configuration file for user name mapping (customarily called pg_ident.conf). This parameter can only be set at server start.
external_pid_file :
  • Specifies the name of an additional process-ID (PID) file that the server should create for use by server administration programs. This parameter can only be set at server start.
PG_LOG:
  • It is not an actual postgres directory, it is the directory where RHEL stores the actual textual LOG.
PG_XLOG:
  •  Here the write ahead logs are stored. It is the log file, where all the logs are stored of committed and uncommitted transaction. It contains max 6 logs, and last one overwrites. If archiver is on, it moves there.
PG_CLOG:
  •  It contains the commit log files, used for recovery for instance  crash
PG_VERSION:
  •  A file containing the major version number of PostgreSQL
Base:
  •  Subdirectory containing per-database subdirectories
Global:
  •  Subdirectory containing cluster-wide tables, such as pg_database
PG_MULTIXACT:
  •  Subdirectory containing multitransaction status data (used for shared row locks)
PG_SUBTRANS:
  •  Subdirectory containing subtransaction status data
PG_TBLSPC:
  • Subdirectory containing symbolic links to tablespaces
PG_TWOPHASE:
  • Subdirectory containing state files for prepared transactions
POSTMASTER.OPTS:
  • A file recording the command-line options the postmaster was last started with
POSTMASTER.PID: 
  • A lock file recording the current postmaster PID and shared memory segment ID (not present after postmaster shutdown)

7 comments:

  1. This comment has been removed by the author.

    ReplyDelete
  2. Can anyone explain clearly about postmaster.pid ??? i can't understand the concept of it...Help me out please...

    ReplyDelete
  3. Mr.manoj kumar
    Postmaster.pid put a entry after the authentication successful.

    ReplyDelete
  4. Tell the clear concept of optimizer

    ReplyDelete
    Replies
    1. Yes it's having old connection information,new user authenticate by pg_hba,old means postmaster.pid

      Delete
  5. Tell clear concept of optimizer

    ReplyDelete
    Replies
    1. Dear kirubha,
      Optimizer work is generating new plan if query is old means no need to generate new plan, query will be executed using old plan

      Delete