Drop Down MenusCSS Drop Down MenuPure CSS Dropdown Menu

Commonly used linux commands in postgresql

The "pwd" command displays the current directory:

"ls" command lists all files and directories in the specified directory:
 If no location is defined it acts on the current directory. The "-a" flag lists hidden "." files. The "-l" flag lists file details.
ls /u01
ls -al
More Examples:
  • ls -al | pg  do a full directory listing and prompt to stop stuff whizzing off the page.
  • ls | wc -l  count the files in the current directory.
  • ls -alt    list files in date order
  • ls -alt | head -10  as above but only display the first 10
  • ls -l $ORACLE_HOME/reports60/printer/admin/spoolcmd.sh Verify that the spoolcmd.sh file has execute permissions
  • ls -s | awk '{if ($1 > 50) print $1 " " $2 }' list all files over 50 blocks in size.
  • ls -alq List files with hidden characters. Very useful when you cannot delete a file for an unknown reason, as sometimes a file can be created with hidden control characters. (very common when stty not set properly)
  • ls -1 Shows the files in a list (just the file names, this option is useful in shell scripts where the files names need to be fed into another program or command for manipulation)
  • ls -1h The option "-h" comes handy to display the size of the files in a human readable form.
  • ls -lr The parameter -r shows the output in the reverse order
  • ls -lR The -R operator makes the ls command execute recursively—that is, go under to the subdirectories and show those files too
The "cd" command is used to change directories:
cd /u01/app/oracle

The "touch" command is used to create a new empty file with the default permissions:
touch my.log

The "rm" command is used to delete files and directories. The "-R" flag tells the command to recurse through subdirectories.
rm my.log
rm -R /archive

The "mv" command is used to move or rename files and directories. The "." represents the current directory
mv [from] [to]
mv my.log my1.log
mv * /archive
mv /archive/* .

The "cp" command is used to copy files and directories:
cp [from] [to]
cp my.log my1.log
cp * /archive
cp /archive/* .

The "mkdir" command is used to create new directories:
mkdir archive

The "rmdir" command is used to delete directories:
rmdir archive

The "grep" command performs a search for a specified string or pattern.
ps -eaf | grep postgres    Show all processes owned by postgres.

The "find" command can be used to find the location of specific files:

The "/" flag represents the staring directory for the search. Wildcards such as "dbms*" can be used for the filename.
find / -name dbmspool.sql
find / -print | grep dbmspool.sql   Search everywhere for the specified file
find . -exec grep "DISPLAY" {} \; -print | pg   Search all files for the text string "DISPLAY" - takes a while to run !
Tip for postgres Users:
postgres produces many extraneous files: archive,xlog,log files, dump files, and so on. Unless they are cleaned periodically, they can fill up the filesystem and bring the database to a halt.To ensure that doesn't happen, simply search for the files with extension "trc" and remove them if they are more than three days old. 

A simple command does the trick:
find . -name "*.trc" -ctime +3 -exec rm -f {} \;

More Reference : find exercises

The "wc" utility displays a count of the number of characters, words and lines in a file. The switches for this utility are:
-l print line count
-c print character count
-w print word count

Exercises :
wc -l README.txt
    85 README.txt

The "more" or "cat" commands lets you display the contents of a file:
cat file.txt
more file.txt
Join file1 to file2 and output to file3 :
cat file1 file2 > file3

The "tail" command let you see a specified number of lines from the end of the file
tail -n filename

The "head" command let you see the specified number of lines from the top of the file
head -n filename

The "diff" command displays the differences between file1 and file2. Options:
diff README.txt README2.txt 
-t = ignore white spaces and tabs
-i = ignore 'case' letters (A=a)
Another option, -y, shows the same output, but side by side:
 diff -y file1 file2 -W 120

The "alias" command, creates an alias to some commands. Examples:
alias ls='ls -al'  Alias the command 'ls -al'  to ls
alias os='echo $ORACLE_HOME'   alias the command to os

Here is a list of some very useful aliases I like to define:
alias l='ls -d .* --color=tty'
alias ll='ls -l --color=tty'
alias oh='cd $ORACLE_HOME'
alias os='echo $ORACLE_SID'
alias tns='cd $ORACLE_HOME/network/admin'
alias bdump='cd $ORACLE_BASE/admin/$ORACLE_SID/bdump'

The "echo" command, echo strings to screen
echo $DISPLAY -->>display the contents of the DISPLAY variable to screen.

With the "du" and "df" commands, you can display hard disk information. (-k  Use 1024 byte blocks instead of the default 512)
du -->>Display disk usage for all directories and subdirectories under the current directory.
df -k -->>  Displays disk space free on each filesystem. Very useful.
df -h -->>  Displays disk usage human readable

The "ftp" comamnd Invoke the file transfer protocol file exchange program:
ftp diego.domain.com ftp to that machine (it will prompt you for a login.)
Once logged in and at the ftp prompt, you have many options:
bin    Change transfer mode to binary mode (essential for moving oracle files, dmp, zip, etc).
ascii Change transfer mode to ascii mode
  • send myfile   Transfer 'myfile' from your local machine
  • get fred  Receive the file from the host into my local machine.
  • mget *   Transfer all files in current directory of the host to your local machine.
  • !pwd   Check the directory of your local machine
  • pwd   Check current directory of host machine
More Reference :ftp exercises

The "ln" command let you create a link to a file. You use this during the postgres Software installation

Build and Install

cd .../postgresql-8.1.0

LDFLAGS=-R/opt/sfw/lib ./configure --prefix=/usr/local/${PWD##*/} --with-includes=/opt/sfw/include --with-libraries=/opt/sfw/lib
make install


groupadd postgres
useradd -g postgres -d /var/pgsl postgres
passwd -l postgres

mkdir -p /var/pgsql/data
chown -R postgres /var/pgsql 
Create the initial database files:
su - postgres
/usr/local/postgresql-8.1.0/bin/initdb -D /var/pgsql/data
Create an init.d script:
cat <<EOF > /etc/init.d/pgsql
#! /bin/sh

su - postgres -c "/usr/local/${PWD##*/}/bin/pg_ctl -D /var/pgsql/data -l /var/pgsql/pgsql.log \$1"
chmod a+x /etc/init.d/pgsql
ln /etc/init.d/pgsql /etc/rc3.d/S50pgsql
ln /etc/init.d/pgsql /etc/rc3.d/K50pgsql

More Reference : ln -s exercises

The "sed" cpmmand invokes the stream editor. It's helpful to do a global search and replace on a file:
ls | sed 's/$/<br>/g' > my_ls.html  

Place the html command <br> at the end of each line of the output of 'ls.' Good for formatting the ouptut of unix commands into html for cgi scripts.

The "awk" command it has its own scripting language:
For example, to display only the 6th field of the output from 'who am i.' (Field 6 is the IP address of your own terminal session / PC.) you can use:
who am i | awk '{print $6}'  
This can be used to automatically set the DISPLAY environment variable for users' logins.

The "cksum" command provides a checksum of a file.  

It's very useful for comparing two files of the same size that you suspect are different.
cksum <filename>

The "split" command can split up a file into smaller chunks.
split -10000 bug123456.z    Splits 'bug123456' into minifiles of 10000 lines each.

The "gzip" and "compress" commands  allows you to compress files.

The gzip command results in a compressed copy of the original file with a ".gz" extension . The gunzip command reverses this process. The compress command results in a compressed copy of the original file with a ".Z" extension. The uncompress command reverses this process:

zip [.zip file name] [file to be compress]
zip” is one of the compression method in linux. It compresses the normal file and stores in single zip file view more
gzip [file name]
gzip” is one of the compression method in linux. It compresses the normal file and stores in single .gz extension file, this compression is much better compared to zip compression view more
bzip2 [file name]
bzip2” is one of the compression method in linux, high-quality data compressor. It typically compresses files to within 10% to 15% of the best available technique (the PPM family of statistical compressors) It compresses the normal file and stores in single .bz2 extension file, this compression is much better compared to gzip compression view more
tar -cvf [.tar file name] [list of files to be archive]
“tar” is not a compressed files, it is a collection of files within a single file uncompressed. If you want to compress the “tar” file use gzip or bzip2 with tar. view more
The "rsync" is a great file copier or command to SYNC directories
More Reference : rsync exercises

The "ssh" command lets you connect to a remote box. 
 Run command on $HOST as $USER (default command=shell)
    ssh $USER@$HOST command
Run GUI command on $HOSTNAME as $USER
   ssh -f -Y $USER@$HOSTNAME xeyes 

server using sheena as user and create .sshdirectory under it, using following command.
[tutorialdba@nijam ~]$ ssh sheena@ mkdir -p .ssh
The authenticity of host ' (' can't be established.
RSA key fingerprint is 45:0e:28:11:d6:81:62:16:04:3f:db:38:02:la:22:4e.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '' (ECDSA) to the list of known hosts.
sheena@'s password: [Enter Your Password Here] 
Forward connections to $HOSTNAME:8080 out to $HOST:80
    ssh -g -L 8080:localhost:80 root@$HOST
Forward connections from $HOST:1434 in to imap:143
    ssh -R 1434:imap:143 root@$HOST  

The "scp" command lets you perform remote copy operations
 Copy with permissions to $USER's home directory on $HOST
    scp -p -r $USER@$HOST: file dir/

For some scp examples:
Here i sending full data directories,control101.ctl files and orapwk2 to192.168.62.129 server and i given destination location as well as oracle user access
scp /home/oracle/control01.ctl  oracle@
scp /home/oracle/k1/oradata/data/* oracle@
scp /home/oracle/orapwk2   oracle@

what is the difference between rsync and scp in linux ? 



File Permissions

The "umask" command  What is UMASK? :
When user creates a file or directory it will be created with permission. The default value is 022.
If we create a file or directory the file will be created with 755 permission.
the calculation is the difference between 777 and the  umask value.
ie. 777-022=755.

If umask is 022 then
Owner will have read, write, execute permissions
Group will have read, execute permissions
Others will have read, execute permissions

The "chmod" command is used to alter file permissions after the file has been created:
chmod 777 *.log
Owner      Group      World      Permission
=========  =========  =========  ======================
7 (u+rwx)  7 (g+rwx)  7 (o+rwx)  read + write + execute
6 (u+wx)   6 (g+wx)   6 (o+wx)   write + execute
5 (u+Rx)   5 (g+Rx)   5 (o+Rx)   read + execute
4 (u+r)    4 (g+r)    4 (o+r)    read only
2 (u+w)    2 (g+w)    2 (o+w)    write only
1 (u+x)    1 (g+x)    1 (o+x)    execute only 
Character eqivalents can be used in the chmod command:
chmod o+rwx *.log
chmod g+r   *.log
chmod -Rx   *.log

 More References chmod:

The "chown" command is used to change the ownership of files after creation :
Let’s create directories for backups and archives (login as root USER)
mkdir -p /Archive/Location
mkdir /backups
chown postgres:postgres -R Archive        
chown postgres:postgres /backups
"chgrp" command  :
 Finally the "chgrp" command is used to change the group to a file:
chgrp <directory> group

The following example changes the ownership on every single file in current directory and lower directories to oracle (useful if someone has done an install erroneously as root.)find . -exec chown oracle {} \; -print 

OS Users Management
The "useradd" command is used to add OS users:
useradd -G oinstall -g dba -d /usr/users/my_user -m -s /bin/ksh my_user
  • The "-G" flag specifies the primary group.
  • The "-g" flag specifies the secondary group.
  • The "-d" flag specifies the default directory.
  • The "-m" flag creates the default directory.
  • The "-s" flag specifies the default shell.
The "usermod" command is used to modify the user settings after a user has been created:
usermod -s /bin/csh my_user
The "userdel" command is used to delete existing users. The "-r" flag removes the default directory.
userdel -r my_user
The "passwd" command is used to set, or reset, the users login password:
passwd my_user

More References linux useradd:

Process Management
The "who" command can be used to list all users who have OS connections:
who | head -5
who | tail -5
who | grep -i postgres
who | wc -l
  • The "head -5" command restricts the output to the first 5 lines of the who command.
  • The "tail -5" command restricts the output to the last 5 lines of the who command.
  • The "grep -i postgres" command restricts the output to lines containing "ora".
  • The "wc -l" command returns the number of lines from "who", and hence the number of connected users.
Some users could be just logged on but actually doing nothing. You can check how long they have been idle, a command especially useful if you are the boss,
 by using the -u option.who -uH
NAME     LINE         TIME          IDLE          PID COMMENT
postgres pts/2        Jan  8 15:57   .          18127 (
postgres pts/3        Jan  8 15:57  00:26       18127 (
root     pts/1        Dec 26 13:42   old         6451 (:0.0)
root     :0           Oct 23 15:32    ?         24215
The new column IDLE shows how long they have been idle in hh:mm format. Note the value “old” in that column? It means that the user has been idle for more than 1 day.
The PID column shows the process ID of their shell connection.

The "
ps" command lists current process information:
ps -ef | grep -i post

Specific processes can be killed by specifying the process id in the 
"kill" command 

the -9 forces to kill that process.
kill -9 12345
Check the postmaster process id before killing it the process id is 9590 
[root@p1 data]# ps -ef |grep post
postgres  9590     1  0 10:54 pts/1    00:00:00 /opt/PostgreSQL/9.3/bin/postgres -D /opt/PostgreSQL/9.3/data
postgres  9593  9590  0 10:54 ?        00:00:00 postgres: checkpointer process                              
postgres  9594  9590  0 10:54 ?        00:00:00 postgres: writer process                                    
postgres  9595  9590  0 10:54 ?        00:00:00 postgres: wal writer process                                
postgres  9596  9590  0 10:54 ?        00:00:00 postgres: autovacuum launcher process                       
postgres  9597  9590  0 10:54 ?        00:00:00 postgres: archiver process                                  
postgres  9598  9590  0 10:54 ?        00:00:00 postgres: stats collector process                           
root      9668 25224  0 May07 pts/1    00:00:00 su postgres
postgres  9669  9668  0 May07 pts/1    00:00:00 bash
postgres 10275  9590  0 11:14 ?        00:00:00 postgres: logger process  
--kill the postmaster process 
[root@p1 data]# kill -9 9590
--check the process whether its running or not
[root@p1 data]# ps -ef |grep post
root     10582  9215  0 11:29 pts/4    00:00:00 grep post
root     15754 15727  0 May08 pts/2    00:00:00 su postgres
postgres 15755 15754  0 May08 pts/2    00:00:00 bash
--create any table its for checking 
postgres=# create table t11(id int);
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
--If you kill the postmaster process means  postgres server  will not restart the server automatically,  once  postgres server down means you need to start the server manually after restarted the postgres server postmaster process id will be varied from 9590 to 10734 
bash-3.2$ ./pg_ctl start  -D /opt/PostgreSQL/9.3/data 
pg_ctl: another server might be running; trying to start server anyway
server starting
bash-3.2$ 2017-05-10 11:35:59 EDT LOG:  redirecting log output to logging collector process
2017-05-10 11:35:59 EDT HINT:  Future log output will appear in directory "pg_log".
--check the postmaster server process id this  will be varried from 9590  to  10734  
-bash-3.2# ps -ef |grep post
root     10702 10672  0 11:34 pts/4    00:00:00 su postgres
postgres 10703 10702  0 11:34 pts/4    00:00:00 bash
postgres 10734     1  0 11:35 pts/4    00:00:00 /opt/PostgreSQL/9.3/bin/postgres -D /opt/PostgreSQL/9.3/data
postgres 10735 10734  0 11:35 ?        00:00:00 postgres: logger process                                    
postgres 10737 10734  0 11:36 ?        00:00:00 postgres: checkpointer process                               
postgres 10738 10734  0 11:36 ?        00:00:00 postgres: writer process                                    
postgres 10739 10734  0 11:36 ?        00:00:00 postgres: wal writer process                                
postgres 10740 10734  0 11:36 ?        00:00:00 postgres: autovacuum launcher process                       
postgres 10741 10734  0 11:36 ?        00:00:00 postgres: archiver process                                  
postgres 10742 10734  0 11:36 ?        00:00:00 postgres: stats collector process                           
root     11744 15782  0 12:34 pts/2    00:00:00 grep post
root     15754 15727  0 May08 pts/2    00:00:00 su postgres
postgres 15755 15754  0 May08 pts/2    00:00:00 bash
--check the postmaster.pid file postmaster process id will be varried
-bash-3.2# cat postmaster.pid 
  5432001   3866625

uname and hostname
The "uname" and "hostname" commands can be used to get information about the host:
uname -a
Linux HPLINUX 2.4.21-20.ELsmp #1 SMP Wed Aug 18 20:46:40 EDT 2004 i686 i686 i386 GNU/Linux

$ uname -a | awk '{ print $2 }'
$ hostname
Some helpful commands

To see the name of a user from his unix id (Provided your UNIX admin keeps them!)
grep userid /etc/passwd

To see if port number 1521 is reserved for postgres
grep 5432 /etc/services

To see the latest 20 lines in the  log file:
tail -20 postgres.log

To see the first 20 lines in the log file:
head -20 postgresql.log

To find a file named "whereare.you" under all sub-directories of /usr/postgres
find /usr/postgres -name whereare.you -print

To remove/delete all the files under /usr/postgres which end with .tmp
find /usr/postgres -name "*.tmp" -print -exec rm -f {} \;

Remove/Delete files older than N number of days  (Useful in delete log, trace, tmp file )
find . -name ‘*.*’ -mtime +[N in days]  -exec rm {} \;

To list all files under /usr/postgres which are older than a week.
find /usr/postgres -mtime +7 -print

To list all files under /usr/postgres which are modified within a week.
find /usr/postgres -mtime -7 -print  -> Solaris
find . -mtime -7 -exec ls -lt {} \;  -> Linux

To compress all files which end with .dmp and are more than 1 MB.

find /usr/postgres -size +1048576c -name "*.dmp" -print -exec compress {} \;
To see the space used and available on /postgres mount point
    df -k /postgres

To convert the contents of a text file to UPPERCASE
tr "[a-z]" "[A-Z]" < filename > newfilename

To convert the contents of a text file to lowercase.
tr "[A-Z]" "[a-z]" < filename > newfilename

To see the postgres processes
ps -ef | grep postgres

To see lines 100 to 120 of a file
head -120 filename | tail -20

Check Swap Activity (Metalink Note 225451.1)
/sbin/swapon -s
free -t
cat /proc/swaps 

The recommended SWap size is two to three times the amount of Physical Memory for Swap space (unless the system exceeds 1 GB of Physical Memory, where two times the amount of Physical Memory for Swap space is sufficient)
Swap space in Linux is used when the amount of physical memory (RAM) is full.If the system needs more memory resources and the physical memory is full, inactive pages in memory are moved to the swap space. While swap space can help machines with a small amount of RAM, it should not be considered a replacement for more RAM. Swap space is located on hard drives, which have a slower access time than physical memory.
Swapping is one of the Unix mechanisms to accommodate the size limitation of memory by moving entire processes to disk to reclaim memory.
Paging is another Unix machanism to manage the limitations of memory. Unlike swapping, where entire processes are moved in and out of memory, paging moves only individual pages of processes to disk. Paging is not as serious a problem as swapping, as the entire program does not have to reside in memory to run. A small amount of paging may not noticeably affect the performance of a system. However, the performance of a system may degraderapidly as paging activity increases.
Swap space can have a dedicated swap partition (recommended), a swap file, or a combination of swap partitions and swap files.
When analyzing your UNIX machine, make sure that the machine is not swapping at all and at worst paging lightly. This indicates a system with a healthy amount of memory available.

How can I enable Swap in LINUX ? 

First check is Swap is enabled:
/sbin/swapon -s
Filename                        Type            Size    Used    Priority
/dev/sda3                       partition       2040244 453180  -1

--To enable swap, check for swap entries in your /etc/fstab
grep swap /etc/fstab
/dev/sda3               swap                    swap    defaults        0 0

And use the '/sbin/swapon -a' command to enable all Swap partitions listed in /etc/fstab. 

How to add a swapfile?
Determine the size of the new swap file and multiple by 1024 to determine the block size. For example, the block size of a 64 MB swap file is 65536.

At a shell prompt as root, type the following command with count being equal to the desired block size:
dd if=/dev/zero of=/data2/swapfile1 bs=1024 count=65536

Setup the swap file with the command:
/sbin/mkswap /data2/swapfile1

To enable the swap file immediately but not automatically at boot time:
/sbin/swapon /data2/swapfile

To enable it at boot time, edit /etc/fstab to include:
/data2/swapfile swap swap defaults 0 0

The next time the system boots, it will enable the new swap file.

Check Services Running and stop them if not used Services that should be removed: 

r* (shell or rsh, login or rlogin, exec or rexec, rcp), telnet, ftp, sendmail, exim, postfix, printer, qmail, http, portmap, SMBD (Samba)
chkconfig --list                                    --> Show services running and its level
chkconfig --del servicename                --> Stop that service
chkconfig --level 345 servicename off --> Stop that service for level 3,4,5

Also it could be necessary to check the file /etc/inetd.conf  because it has references to some services, if any service that I want to stop is there, comment that line and reboot the server or run:
/etc/init.d/inetd restart

Enable FTP and TELNET Services
cd to /etc/xinetd.d
vi wu-ftpd

Change the disable field from "yes" to "no" and save changes.
vi telnet
Change the disable field from "yes" to "no" and save changes.

Network Information
Display network interface configuration parameters
ifconfig -a

Address resolution display and control
arp -a

Check Routes:
netstat -rn 

Change network, change it's ip, mask, bcast and gateway. 
The easiest way is to execute sys-unconfig.
After the process finishes power down the box and move it to the new network.
When you boot the box it will ask the appropriate questions about the network configuration
Important Network LINUX files:
Making the following gross assumptions:
Your IP is:          
Your Gateway is:
Your netmask is: 
Your nameservers are:,, and

/etc/sysconfig/network  File

/etc/hosts  File     localhost.localdomain             localhost   your_machine_name.company.com  your_machine_name your_gateway.company.com       your_gateway
(You don't absolutely *need* your gateway in the hosts file, but I feel it does sometimes speed up some operations)

/etc/sysconfig/network-scripts/ifcfg-eth0  File

/etc/resolv.conf File
search gateway compay_gateway
(The 'search' line is optional. You can have up to 3 'nameserver' lines,and they don't need to be inside your network)

/etc/resolv.conf  File

domain        domain_name
search        domain_name
Get OS File System Block Size 64 bit or 32 bit
On Linux
$uname -a
64 Bits
Linux gaylord.stata.com 2.6.11-1.27_FC3 #1 Tue May 17 20:24:57 EDT 2005 x86_64 x86_64 x86_64 GNU/Linux

64 Bits
Linux caddo.stata.com 2.6.9-5.0.5.EL #1 SMP Fri Apr 8 14:20:58 EDT 2005 ia64 ia64 ia64 GNU/Linux

32 Bits
Linux tango.stata.com 2.6.10-1.771_FC2smp #1 SMP Mon Mar 28 01:10:51 EST 2005 i686 i686 i386 GNU/Linux  
$uname -m
It seems like the uname -m actually gives
    * x86_64 when it is an kernel 64 bits
    * i686 for 32 bits kernel 
$getconf LONG_BIT
which returns either 32 or 64 
On Solaris
isainfo -b -v
/usr/bin/isainfo -kv 
$ getconf -a | grep KERN
$ file /usr/lib/boot/unix* 
/usr/bin/ getconf KERNEL_BITS
/usr/bin/file /stand/vmunix
General Performance
The free command let you identify the amoung of memory used by all the apps on the box. If the amount of  memory used is bigger than the available RAM, then the box starts to swap.
If you use this command with the -m
 option, it will show the numbers in MB.
# free -m
             total       used       free     shared    buffers     cached
Mem:          1772       1654        117          0         18        618
-/+ buffers/cache:       1017        754
Swap:         1983       1065        918
Here we can see that the box has 1772 MB of RAM, currently using 1654 MB, and only 117 MB of free memory.
The next line shows the changes on the size of the cache and buffers in the memory.
Finally the third one shows the amount of swap memory that is being used.
The –t options shows you the totals at the end of the output (adds physical memory plus swap memory):
# free -m -t
             total       used       free     shared    buffers     cached
Mem:          1772       1644        127          0         16        613
-/+ buffers/cache:       1014        757
Swap:         1983       1065        918
Total:        3756       2709       1046
Shows the percentage of used memory:
# free -m | grep Mem | awk '{print ($3 / $2)*100}' 
Shows the percentage of swap memory:
free -m | grep -i Swap | awk '{print ($3 / $2)*100}'
The top command is probably the most useful one for an postgres DBA managing a database on Linux.
Note that unlike other commands, 
top does not produce an output and sits still. It refreshes the screen to display new information. So, if you just issue top
 and leave the screen up, the most current information is always up. To stop and exit to shell, you can press Control-C.
$ top
18:46:13  up 11 days, 21:50,  5 users,  load average: 0.11, 0.19, 0.18 
151 processes: 147 sleeping, 4 running, 0 zombie, 0 stopped 
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle 
           total   12.5%    0.0%    6.7%   0.0%     0.0%    5.3%   75.2% 
Mem:  1026912k av,  999548k used,   27364k free,       0k shrd,  116104k buff 
                    758312k actv,  145904k in_d,   16192k in_c 
Swap: 2041192k av,  122224k used, 1918968k free              590140k cached 
  451 postgres  15   0  6044 4928  4216 S     0.1  0.4   0:20   0 postgres 
 8991 postgres  15   0  1248 1248   896 R     0.1  0.1   0:00   0 top 
    1 root      19   0   440  400   372 S     0.0  0.0   0:04   0 init 
    2 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 keventd 
    3 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kapmd 
    4 root      34  19     0    0     0 SWN   0.0  0.0   0:00   0 ksoftirqd/0 
    7 root      15   0     0    0     0 SW    0.0  0.0   0:01   0 bdflush 
    5 root      15   0     0    0     0 SW    0.0  0.0   0:33   0 kswapd 
    6 root      15   0     0    0     0 SW    0.0  0.0   0:14   0 kscand 
    8 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kupdated 
    9 root      25   0     0    0     0 SW    0.0  0.0   0:00   0 mdrecoveryd
... output snipped ...
Let's examine the different types of information produced. 

The first line: 18:46:13 up 11 days, 21:50, 5 users, load average: 0.11, 0.19, 0.18 

shows the current time (18:46:13),that system has been up for 11 days; that the system has been working for 21 hours 50 seconds.
The load average of the system is shown (0.11, 0.19, 0.18) for the last 1, 5 and 15 minutes respectively. (By the way, you can also get this information by issuing the uptime command.)
If the load average is not required, press the letter "l" (lowercase L); it will turn it off. To turn it back on press l again.  Ideally Load average should be less than 1, otherwise the processes are fully burdened 
The second line:  151 processes: 147 sleeping, 4 running, 0 zombie, 0 stopped
shows the number of processes, Total number of tasks,
Number of sleeping tasks, Number of running tasks, Number of zombie tasks, Number of stopped tasks,
The third and fourth lines:
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
           total   12.5%    0.0%    6.7%   0.0%     0.0%    5.3%   75.2%
show the CPU utilization details. The above line shows that user processes consume 12.5% and system consumes 6.7%. The user processes include the Oracle processes. Press "t" to turn these three lines off and on. If there are more than one CPU, you will see one line per CPU.
The next two lines:
Mem:  1026912k av, 1000688k used,  26224k free,    0k shrd,  113624k buff
                    758668k actv,  146872k in_d,  14460k in_c
Swap: 2041192k av, 122476k used,   1918716k free             591776k cached
show the memory available and utilized. Total memory is "1026912k av", approximately 1GB, of which only 26224k or 26MB is free. The swap space is 2GB; but it's almost not used. To turn it off and on, press "m".

The rest of the display shows the processes in a tabular format. Here is the explanation of the columns:
The process ID of the process
The user running the process
The priority of the process
The nice value: The higher the value, the lower the priority of the task
Memory used by this process (code+data+stack)
The physical memory used by this process
The shared memory used by this process
The status of this process, shown in code. Some major status codes are:
R – Running
S –Sleeping
Z – Zombie
T – Stopped
You can also see second and third characters, which indicate:
W – Swapped out process
N – positive nice value
The percentage of CPU used by this process
The percentage of memory used by this process
The total CPU time used by this process
If this is a multi-processor system, this column indicates the ID of the CPU this process is running on.
The command issued by this process

While the top is being displayed, you can press a few keys to format the display as you like. Pressing the uppercase M key sorts the output by memory usage. (Note that using lowercase m will turn the memory summary lines on or off at the top of the display.) This is very useful when you want to find out who is consuming the memory. 
Here is sample output:
31903 postgres    15   0 75760  72M 72508 S     0.0  7.2   0:01   0 postgres 
31909 postgres    15   0 68944  66M 64572 S     0.0  6.6   0:03   0 postgres 
31897 postgres    15   0 53788  49M 48652 S     0.0  4.9   0:00   0 postgres
Now that you learned how to interpret the output, let's see how to use command line parameters.
    -h -
Show the current version
    -c -
This toggles the command column between showing command and program name
    -d -
Specify the delay time between refreshing the screen
    -o -
Sorts by the named field
    -p -
Only show processes with specified process IDs
    -u -
Show only processes by the specified user
    -i -
Do not show idle tasks

Only show processes with specified process IDs
The other useful option is -p. If you want to monitor only a few processes, not all, you can specify only those after the -p option. To monitor processes 13609, 13608 and 13554, issue:
top -p 13609 -p 13608 -p 13554
This will show results in the same format as the top command, but only those specific processes.

 Show The Current Version
Type the following to show the current version details for top:
    top -h

Specify A Delay Time Between Screen Refreshes
To specify a delay between the screen refreshes whilst using top type the following:
    top -d
To refresh every 5 seconds type top -d 5

Obtain A List Of Columns To Sort By
To get a list of the columns with which you can sort the top command by type the following:
    top -O

There are a lot of columns so you might wish to pipe the output to less as follows:
    top -O | less
Sort The Columns In The Top Command By A Column Name
Use the previous section to find a column to sort by and then use the following syntax to sort by that column:
    top -o

To sort by %CPU type the following:
    top -o %CPU

Only Show The Processes For A Specific User
To show only the processes that a specific user is running use the following syntax:
    top -u

For example to show all the processes that the user postgres is running type the following:
    top -u postgres

Hide Idle Tasks
The default top view can seem cluttered and if you want to see only active processes (i.e those that are not idle) then you can ran the top command using the following command:
    top -i

skill and snice
From the previous discussion you learned how to identify a CPU consuming resource. What if you find that a process is consuming a lot of CPU and memory, but you don't want to kill it? Consider the top output below:
$ top -c -p 16514

23:00:44  up 12 days,  2:04,  4 users,  load average: 0.47, 0.35, 0.31 
1 processes: 1 sleeping, 0 running, 0 zombie, 0 stopped 
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle 
           total    0.0%    0.6%    8.7%   2.2%     0.0%   88.3%    0.0% 
Mem:  1026912k av, 1010476k used,   16436k free,       0k shrd,   52128k buff 
                    766724k actv,  143128k in_d,   14264k in_c 
Swap: 2041192k av,   83160k used, 1958032k free                  799432k cached 
16514 postgres 19   4  28796  26M  20252  D N  7.0  2.5   0:03   0 postgres
Now that you confirmed the process 16514 is consuming a lot of memory, you can "freeze" it—but not kill it—using the skill command.
$ skill -STOP 16514
--After this, check the top output:
$ top -c -p 16514
23:01:11  up 12 days,  2:05,  4 users,  load average: 1.20, 0.54, 0.38 
1 processes: 0 sleeping, 0 running, 0 zombie, 1 stopped 
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle 
           total    2.3%    0.0%    0.3%   0.0%     0.0%    2.3%   94.8% 
Mem:  1026912k av, 1008756k used,   18156k free,       0k shrd,    3976k buff 
                    770024k actv,  143496k in_d,   12876k in_c 
Swap: 2041192k av,   83152k used, 1958040k free                  851200k cached 
16514 postgres  19   4 28796  26M 20252 T N   0.0  2.5   0:04   0 postgres
The CPU is now 94% idle from 0%. The process is effectively frozen. After some time, you may want to revive the process from coma:
$ skill -CONT 16514
This approach is immensely useful for temporarily freezing processes to make room for more important processes to complete.

The command is very versatile. If you want to stop all processes of the user "postgres", only one command does it all:
$ skill -STOP postgres

As you can see, skill decides that argument you entered—a process ID, userid, or command—and acts appropriately. This may cause an issue in some cases, where you may have a user and a command in the same name. The best example is the "postgres" process, which is typically run by the user "postgres". So, when you want to stop the process called "postgres" and you issue:
$ skill -STOP postgres
 all the processes of user "postgres" stop, including the session you may be on. To be completely unambiguous you can optionally give a new parameter to specify the type of the parameter. 

To stop a command called postgres, you can give:
$ skill -STOP -c postgres
--The command snice is similar. Instead of stopping a process it makes its priority a lower one. First, check the top output:
$ top
    3 root      15   0     0    0     0 RW    0.0  0.0   0:00   0 kapmd  
 13680 postgres 15   0  11336  10M  8820 T    0.0  1.0   0:00   0 postgres  
 13683 postgres 15   0  9972 9608  7788 T     0.0  0.9   0:00   0 postgres
 13686 postgres 15   0  9860 9496  7676 T     0.0  0.9   0:00   0 postgres  
 13689 postgres 15   0 10004 9640  7820 T     0.0  0.9   0:00   0 postgres
 13695 postgres 15   0  9984 9620  7800 T     0.0  0.9   0:00   0 postgres
 13698 postgres 15   0 10064 9700  7884 T     0.0  0.9   0:00   0 postgres
 13701 postgres 15   0 22204  21M 16940 T     0.0  2.1   0:00   0 postgres
Now, drop the priority of the processes of "postgres" by four points. 
Note that the higher the number, the lower the priority.
$ snice +4 -u postgres

  16894 postgres  20  4  38904  32M 26248 D N   5.5  3.2   0:01   0 postgres
Note how the NI column (for nice values) is now 4 and the priority is now set to 20, instead of 15. This is quite useful in reducing priorities.
For More Reference about Nice and renice Value:

This utility provides a report that covers process activity, paging, memory usage, disk I/O, and CPU usage (also you can use xosview). When analyzing your UNIX machine, make sure that the machine is not swapping at all and at worst paging lightly.

Having any processes in the b or w columns is a sign of a problem system.
Having an id of 0 is a sign that the cpu is overburdoned.
Having high values in pi and po show excessive paging.

 $ vmstat 5 3   (Displays system statistics (5 seconds apart; 3 times))

procs -----------memory---------- ---swap-- -----io---- --system--  ----cpu----   
r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs  us sy id wa  
0  0 329476  54880  91600 613852    0    1     4     2    0     0   1  1  3  1  
0  0 329476  54560  91600 613852    0    0     0    36  118   128  25  0 74  0  
0  0 329476  54564  91600 613860    0    0     1    48  127   143  25  0 74  1 
  • procs (Reports the number of processes in each of the following states)
    • r : The number of processes waiting to be run. The more the load on the system, the more the number of processes waiting to get CPU cycles to run.
    • b : Uninterruptible sleeping processes, also known as “blocked” processes. These processes are most likely waiting for I/O but could be for something else too. Ideally close to 0.
    • w : shows the number of potential processes that have been swapped out and written to disk. If the value is very high, then swapping occurs and the system is short of memory.
  • memory (Reports on usage of virtual and real memory)
    • swpd : the amount of virtual (swap) memory used.  (Kbytes)
    • free : Amount of free physical memory (Kbytes)
    • buff : Amount of memory used like buffer (Kbytes)
    • cache : Amount of memory used like cache (Kbytes)
    • inact: the amount of inactive memory. (-a option)
    • active: the amount of active memory. (-a option)
  • swap (Reports information about page faults and paging activity (units per second). Swap-ins and swap-outs should always be zero
    • si: Rate at which the memory is swapped back from the disk to the physical RAM (in KB/sec). Ideally 0
    • so: Rate at which the memory is swapped out to the disk from physical RAM (in KB/sec). Ideally 0
  • io
    • bi: Rate at which the system sends data to the block devices (in blocks/sec). 
    • bo: Rate at which the system reads the data from block devices (in blocks/sec)
  • system
    • in : The number of interrupts per second, including the clock.
    • cs : The number of context switches per second.
  • cpu (Reports the breakdown of percentage usage of CPU time (averaged across all CPUs)). The final block is probably the most used
    • us : Shows the percentage of CPU spent in user processes. The Oracle processes come in this category.
    • sy : Percentage of CPU used by system processes, such as all root processes
    • id : Percentage of free CPU
    • wa: Percentage spent in “waiting for I/O” 
Let’s see how to interpret these values. The first line of the output is an average of all the metrics since the system was restarted. So, ignore that line since it does not show the current status. The other lines show the metrics in real time.
Ideally, the number of processes waiting or blocking (under the “ procs” heading) should be 0 or close to 0. If they are high, then the system either does not have enough resources like CPU, memory, or I/O. This information comes useful while diagnosing performance issues.
The data under “swap” indicates if excessive swapping is going on. If that is the case, then you may have inadequate physical memory. You should either reduce the memory demand or increase the physical RAM.
The data under “ io” indicates the flow of data to and from the disks. This shows how much disk activity is going on, which does not necessarily indicate some problem. If you see some large number under “ proc” and then “ b” column (processes being blocked) and high I/O, the issue could be a severe I/O contention.
The most useful information comes under the “ cpu” heading. The “ id” column shows idle CPU. If you subtract that number from 100, you get how much percent the CPU is busy. Remember the top command described in another installment of this series? That also shows a CPU free% number. The difference is: top shows that free% for each CPU whereas vmstat shows the consolidated view for all CPUs.
The vmstat command also shows the breakdown of CPU usage: how much is used by the Linux system, how much by a user process, and how much on waiting for I/O. From this breakdown you can determine what is contributing to CPU consumption. If system CPU load is high, could there be some root process such as backup running?
The system load should be consistent over a period of time. If the system shows a high number, use the top command to identify the system process consuming CPU.
Tip for Oracle Users
Oracle processes (the background processes and server processes) and the user processes (sqlplus, apache, etc.) come under “ us”. If this number is high, use top to identify the processes. If the “ wa” column shows a high number, it indicates the I/O system is unable to catch up with the amount of reading or writing. This could occasionally shoot up as a result of spikes in heavy updates in the database causing log switch and a subsequent spike in archiving processes. But if it consistently shows a large number, then you may have an I/O bottleneck.
I/O blockages in an Oracle database can cause serious problems. Apart from performance issues, the slow I/O could cause controlfile writes to be slow, which may cause a process to wait to acquire a controlfile enqueue. If the wait is more that 900 seconds, and the waiter is a critical process like LGWR, it brings down the database instance.
If you see a lot of swapping, perhaps the SGA is sized too large to fit in the physical memory. You should either reduce the SGA size or increase the physical memory.

 $ vmstat -S 5 3    (Displays system statistics (5 seconds apart; 3 times))

  • procs (Reports the number of processes in each of the following states)
    • r : number of processes in the run queue
    • b : number of processes blocked and waiting for resources such as disk or terminal input. (I/O, paging etc.). Ideally close to 0.
    • w : number of runnable processes, or processes swapped but in a sleep state of less than twenty seconds. If this is to high, you may need more memory.
  • memory (Reports on usage of virtual and real memory)
    • swpd : Available swap space (Kbytes)
    • free : Amount of free physical memory (Kbytes)
  • page (Reports information about page faults and paging activity (units per second)
    • re : The number of pages reclaimed. 
    • mf : The number of major and minor faults. 
    • pi : Kbytes paged in per second. 
    • po : Kbytes paged out per second
    • fr : Kbytes freed
    • de : The amount of anticipated memory needed by processes that have recently swapped in. (Kbytes)
    • sr : The pages scanned by the page daemon. High scan rates are caused by a shortage of available memory
  • disk (lists the number of disk operations per second and can show data for up to four disks at a time)
  • faults (Reports the trap/interupt rates (per second)
    • in : number if device interrupts per second
    • sy : number of system calls
    • cs : CPU context switches
  • cpu (Reports the amount of time spent in user mode, kernel mode and idle are reported. )
    • us : user time
    • sy : system time
    • id : idle time
vmstat can also report on swapping and cache flushing. The -S adds two fields to the beginning of the paging statistics. These are si, which lists the number of pages swapped in per second, and so, which gives the number of entire processes swapped out.

The buffer memory is used to save metadata from files like i-nodes.
The cache memory is used for file data.

Here there are NO pageouts (po or so) occurring on this system. It is OK and normal to have page out (po or so) activity. You should get worried when the number of page ins (pi or si) starts rising. This indicates that you system is starting to page
There are no processes that are waiting to be run (r), blocked (b), or waiting for IO (w) in the RUN QUEUE (When a process is ready to be processed by a CPU it will be placed on the waiting line or RUN-QUEUE). You want to keep the RUN-QUEUE under 5-6 for a single CPU machine.

CPU Usage

Another useful command to get CPU related stats is mpstat
$ mpstat -P ALL 5 2
Reports per-processor statistics (5 seconds apart; 2 times):

10:42:38 PM  CPU   %user   %nice %system %iowait    %irq   %soft   %idle    intr/s
10:42:43 PM  all    6.89    0.00   44.76    0.10    0.10    0.10   48.05   1121.60
10:42:43 PM    0    9.20    0.00   49.00    0.00    0.00    0.20   41.60    413.00
10:42:43 PM    1    4.60    0.00   40.60    0.00    0.20    0.20   54.60    708.40
10:42:43 PM  CPU   %user   %nice %system %iowait    %irq   %soft   %idle    intr/s
10:42:48 PM  all    7.60    0.00   45.30    0.30    0.00    0.10   46.70   1195.01
10:42:48 PM    0    4.19    0.00    2.20    0.40    0.00    0.00   93.21   1034.53
10:42:48 PM    1   10.78    0.00   88.22    0.40    0.00    0.00    0.20    160.48
Average:     CPU   %user   %nice %system %iowait    %irq   %soft   %idle    intr/s
Average:     all    7.25    0.00   45.03    0.20    0.05    0.10   47.38   1158.34
Average:       0    6.69    0.00   25.57    0.20    0.00    0.10   67.43    724.08

Average:       1    7.69    0.00   64.44    0.20    0.10    0.10   27.37    434.17

It shows the various stats for the CPUs in the system. The –P ALL options directs the command to display stats for all the CPUs, not just a specific one. The parameters 5 2 directs the command to run every 5 seconds and for 2 times. The above output shows the metrics for all the CPUs first (aggregated) and for each CPU individually. Finally, the average for all the CPUs has been shown at the end.
Let’s see what the column values mean:

Indicates the percentage of the processing for that CPU consumes by user processes. User processes are non-kernel processes used for applications such as an Oracle database. In this example output, the user CPU %age is very little.
Indicates the percentage of CPU when a process was downgraded by nice command. The command nice has been described in an earlier installment. It brief, the command nice changes the priority of a process.
Indicates the CPU percentage consumed by kernel processes
Shows the percentage of CPU time consumed by waiting for an I/O to occur
Indicates the %age of CPU used to handle system interrupts
Indicates %age consumed for software interrupts
Shows the idle time of the CPU
Shows the total number of interrupts received by the CPU per second
Tip for Oracle UsersYou may be wondering about the purpose of the mpstat command when you have vmstat, described earlier. There is a huge difference: mpstat can show the per processor stats, whereas vmstat shows a consolidated view of all processors. So, it’s possible that a poorly written application not using multi-threaded architecture runs on a multi-processor machine but does not use all the processors. As a result, one CPU overloads while others remain free. You can easily diagnose these sorts of issues via mpstat.
Similar to vmstat, the mpstat command also produces CPU related stats so all the discussion related to CPU issues applies to mpstat as well. When you see a low %idle figure, you know you have CPU starvation. When you see a higher %iowait figure, you know there is some issue with the I/O subsystem under the current load. This information comes in very handy in troubleshooting Oracle database performance.
A key part of the performance assessment is disk performance. The iostat command gives the performance metrics of the storage interfaces.

# iostat
Linux 2.6.9-55.0.9.ELlargesmp (prolin3)     12/27/2008
avg-cpu:  %user   %nice    %sys %iowait   %idle
          15.71    0.00    1.07    3.30   79.91
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
cciss/c0d0        4.85        34.82       130.69  307949274 1155708619
cciss/c0d0p1      0.08         0.21         0.00    1897036       3659
cciss/c0d0p2     18.11        34.61       130.69  306051650 1155700792
cciss/c0d1        0.96        13.32        19.75  117780303  174676304
cciss/c0d1p1      2.67        13.32        19.75  117780007  174676288
sda               0.00         0.00         0.00        184          0
sdb               1.03         5.94        18.84   52490104  166623534
sdc               0.00         0.00         0.00        184          0
sdd               1.74        38.19        11.49  337697496  101649200
sde               0.00         0.00         0.00        184          0
sdf               1.51        34.90         6.80  308638992   60159368
sdg               0.00         0.00         0.00        184          0
... and so on ...

The beginning portion of the output shows metrics such as CPU free and I/O waits as you have seen from the mpstat command.
The next part of the output shows very important metrics for each of the disk devices on the system. Let’s see what these columns mean:
The name of the device
Number of transfers per second, i.e. number of I/O operations per second. Note: this is just the number of I/O operations; each operation could be huge or small.
Number of blocks read from this device per second. Blocks are usually of 512 bytes in size. This is a better value of the disk’s utilization.
Number of blocks written to this device per second
Number of blocks read from this device so far. Be careful; this is not what is happening right now. These many blocks have already been read from the device. It’s possible that nothing is being read now. Watch this for some time to see if there is a change.
Number of blocks written to the device

In a system with many devices, the output might scroll through several screens—making things a little bit difficult to examine, especially if you are looking for a specific device. You can get the metrics for a specific device only by passing that device as a parameter.
# iostat sdaj   
Linux 2.6.9-55.0.9.ELlargesmp (prolin3)     12/27/2008
avg-cpu:  %user   %nice    %sys %iowait   %idle
          15.71    0.00    1.07    3.30   79.91
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sdaj              1.58        31.93        10.65  282355456   94172401

The CPU metrics shown at the beginning may not be very useful. To suppress the CPU related stats shown in the beginning of the output, use the -d option.

You can place optional parameters at the end to let iostat display the device stats in regular intervals. To get the stats for this device every 5 seconds for 10 times, issue the following:
# iostat -d sdaj 5 10

You can display the stats in kilobytes instead of just bytes using the -k option:
# iostat -k -d sdaj    
Linux 2.6.9-55.0.9.ELlargesmp (prolin3)     12/27/2008
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sdaj              1.58        15.96         5.32  141176880   47085232

While the above output can be helpful, there is lot of information not readily displayed. For instance, one of the key causes of disk issues is the disk service time, i.e. how fast the disk gets the data to the process that is asking for it. To get that level of metrics, we have to get the “extended” stats on the disk, using the -x option.
# iostat -x sdaj
Linux 2.6.9-55.0.9.ELlargesmp (prolin3)     12/27/2008
avg-cpu:  %user   %nice    %sys %iowait   %idle
          15.71    0.00    1.07    3.30   79.91
Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sdaj         0.00   0.00  1.07  0.51   31.93   10.65    15.96     5.32    27.01     0.01    6.26   6.00   0.95

Let’s see what the columns mean:
The name of the device
The number of read requests merged per second. The disk requests are queued. Whenever possible, the kernel tries to merge several requests to one. This metric measures the merge requests for read transfers.
Similar to reads, this is the number of write requests merged.
The number of read requests per second issued to this device
Likewise, the number of write requests per second
The number of sectors read from this device per second
The number of sectors written to the device per second
Data read per second from this device, in kilobytes per second
Data written to this device, in kb/s
Average size of the read requests, in sectors
Average length of the request queue for this device
Average elapsed time (in milliseconds) for the device for I/O requests. This is a sum of service time + waiting time in the queue.
Average service time (in milliseconds) of the device
Bandwidth utilization of the device. If this is close to 100 percent, the device is saturated.

How to Use ItWell, that’s a lot of information and may present a challenge as to how to use it effectively. The next section shows how to use the output.
You can use a combination of the commands to get some meaning information from the output. Remember, disks could be slow in getting the request from the processes. The amount of time the disk takes to get the data from it to the queue is called service time. If you want to find out the disks with the highest service times, you issue:

# iostat -x | sort -nrk13
sdat         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00    18.80     0.00   64.06  64.05   0.00
sdv          0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00    17.16     0.00   18.03  17.64   0.00
sdak         0.00   0.00  0.00  0.14    0.00    1.11     0.00     0.55     8.02     0.00   17.00  17.00   0.24
sdm          0.00   0.00  0.00  0.19    0.01    1.52     0.01     0.76     8.06     0.00   16.78  16.78   0.32
... and so on ...

This shows that the disk sdat has the highest service time (64.05 ms). Why is it so high? There could be many possibilities but three are most likely:
1.     The disk gets a lot of requests so the average service time is high.
2.     The disk is being utilized to the maximum possible bandwidth.
3.     The disk is inherently slow.
Looking at the output we see that reads/sec and writes/sec are 0.00 (almost nothing is happening), so we can rule out #1. The utilization is also 0.00% (the last column), so we can rule out #2. That leaves #3. However, before we draw a conclusion that the disk is inherently slow, we need to observe that disk a little more closely. We can examine that disk alone every 5 seconds for 10 times.
# iostat -x sdat 5 10
If the output shows the same average service time, read rate and utilization, we can conclude that #3 is the most likely factor. If they change, then we can get further clues to understand why the service time is high for this device.
Similarly, you can sort on the read rate column to display the disk under constant read rates.
# iostat -x | sort -nrk6 
sdj          0.00   0.00  1.86  0.61   56.78   12.80    28.39     6.40    28.22     0.03   10.69   9.99   2.46
sdah         0.00   0.00  1.66  0.52   50.54   10.94    25.27     5.47    28.17     0.02   10.69  10.00   2.18
sdd          0.00   0.00  1.26  0.48   38.18   11.49    19.09     5.75    28.48     0.01    3.57   3.52   0.61
... and so on ...  
The information helps you to locate a disk that is “hot”—that is, subject to a lot of reads or writes. If the disk is indeed hot, you should identify the reason for that; perhaps a filesystem defined on the disk is subject to a lot of reading. If that is the case, you should consider striping the filesystem across many disks to distribute the load, minimizing the possibility that one specific disk will be hot.
From the earlier discussions, one common thread emerges: Getting real time metrics is not the only important thing; the historical trend is equally important.
Furthermore, consider this situation: how many times has someone reported a performance problem, but when you dive in to investigate, everything is back to normal? Performance issues that have occurred in the past are difficult to diagnose without any specific data as of that time. Finally, you will want to examine the performance data over the past few days to decide on some settings or to make adjustments.
The sar utility accomplishes that goal. sar stands for System Activity Recorder, which records the metrics of the key components of the Linux system—CPU, Memory, Disks, Network, etc.—in a special place: the directory /var/log/sa. The data is recorded for each day in a file named sa<nn> where <nn> is the two digit day of the month. For instance the file sa27 holds the data for the date 27th of that month. This data can be queried by the command sar.
The simplest way to use sar is to use it without any arguments or options. Here is an example:
# sar
Linux 2.6.9-55.0.9.ELlargesmp (prolin3)     12/27/2008
12:00:01 AM       CPU     %user     %nice   %system   %iowait     %idle
12:10:01 AM       all     14.99      0.00      1.27      2.85     80.89
12:20:01 AM       all     14.97      0.00      1.20      2.70     81.13
12:30:01 AM       all     15.80      0.00      1.39      3.00     79.81
12:40:01 AM       all     10.26      0.00      1.25      3.55     84.93
... and so on ...

The output shows the CPU related metrics collected in 10 minute intervals. The columns mean
The CPU identifier; “all” means all the CPUs
The percentage of CPU used for user processes. Oracle processes come under this category.
The %ge of CPU utilization while executing under nice priority
The %age of CPU executing system processes
The %age of CPU waiting for I/O
The %age of CPU idle waiting for work

From the above output, you can see that the system has been well balanced; actually severely under-utilized (as seen from the high degree of %age idle number). Going further through the output we see the following:

... continued from above ...
03:00:01 AM       CPU     %user     %nice   %system   %iowait     %idle
03:10:01 AM       all     44.99      0.00      1.27      2.85     40.89
03:20:01 AM       all     44.97      0.00      1.20      2.70     41.13
03:30:01 AM       all     45.80      0.00      1.39      3.00     39.81
03:40:01 AM       all     40.26      0.00      1.25      3.55     44.93
... and so on ...

This tells a different story: the system was loaded by some user processes between 3:00 and 3:40. Perhaps an expensive query was executing; or perhaps an RMAN job was running, consuming all that CPU. This is where the sar command is useful--it replays the recorded data showing the data as of a certain time, not now. This is exactly what you wanted to accomplish the three objectives outlined in the beginning of this section: getting historical data, finding usage patterns and understanding trends.
If you want to see a specific day’s sar data, merely open sar with that file name, using the -f option as shown below (to open the data for 26th)

# sar -f /var/log/sa/sa26

It can also display data in real time, similar to vmstat or mpstat. To get the data every 5 seconds for 10 times, use:

# sar 5 10

Linux 2.6.9-55.0.9.ELlargesmp (prolin3)     12/27/2008   
01:39:16 PM       CPU     %user     %nice   %system   %iowait     %idle 
01:39:21 PM       all     20.32      0.00      0.18      1.00     78.50 
01:39:26 PM       all     23.28      0.00      0.20      0.45     76.08 
01:39:31 PM       all     29.45      0.00      0.27      1.45     68.83 
01:39:36 PM       all     16.32      0.00      0.20      1.55     81.93 
… and so on 10 times …
Did you notice the “all” value under CPU? It means the stats were rolled up for all the CPUs. In a single processor system that is fine; but in multi-processor systems you may want to get the stats for individual CPUs as well as an aggregate one. The -P ALL option accomplishes that.
#sar -P ALL 2 2
Linux 2.6.9-55.0.9.ELlargesmp (prolin3)     12/27/2008
01:45:12 PM       CPU     %user     %nice   %system   %iowait     %idle
01:45:14 PM       all     22.31      0.00     10.19      0.69     66.81
01:45:14 PM         0      8.00      0.00     24.00      0.00     68.00
01:45:14 PM         1     99.00      0.00      1.00      0.00      0.00
01:45:14 PM         2      6.03      0.00     18.59      0.50     74.87
01:45:14 PM         3      3.50      0.00      8.50      0.00     88.00
01:45:14 PM         4      4.50      0.00     14.00      0.00     81.50
01:45:14 PM         5     54.50      0.00      6.00      0.00     39.50
01:45:14 PM         6      2.96      0.00      7.39      2.96     86.70
01:45:14 PM         7      0.50      0.00      2.00      2.00     95.50
01:45:14 PM       CPU     %user     %nice   %system   %iowait     %idle
01:45:16 PM       all     18.98      0.00      7.05      0.19     73.78
01:45:16 PM         0      1.00      0.00     31.00      0.00     68.00
01:45:16 PM         1     37.00      0.00      5.50      0.00     57.50
01:45:16 PM         2     13.50      0.00     19.00      0.00     67.50
01:45:16 PM         3      0.00      0.00      0.00      0.00    100.00
01:45:16 PM         4      0.00      0.00      0.50      0.00     99.50
01:45:16 PM         5     99.00      0.00      1.00      0.00      0.00
01:45:16 PM         6      0.50      0.00      0.00      0.00     99.50
01:45:16 PM         7      0.00      0.00      0.00      1.49     98.51
Average:          CPU     %user     %nice   %system   %iowait     %idle
Average:          all     20.64      0.00      8.62      0.44     70.30
Average:            0      4.50      0.00     27.50      0.00     68.00
Average:            1     68.00      0.00      3.25      0.00     28.75
Average:            2      9.77      0.00     18.80      0.25     71.18
Average:            3      1.75      0.00      4.25      0.00     94.00
Average:            4      2.25      0.00      7.25      0.00     90.50
Average:            5     76.81      0.00      3.49      0.00     19.70
Average:            6      1.74      0.00      3.73      1.49     93.03
Average:            7      0.25      0.00      1.00      1.75     97.01

This shows the CPU identifier (starting with 0) and the stats for each. At the very end of the output you will see the average of runs against each CPU.
The command sar is not only fro CPU related stats. It’s useful to get the memory related stats as well. The -r option shows the extensive memory utilization.
# sar -r
Linux 2.6.9-55.0.9.ELlargesmp (prolin3)     12/27/2008
12:00:01 AM kbmemfree kbmemused  %memused kbbuffers  kbcached kbswpfree kbswpused  %swpused  kbswpcad
12:10:01 AM    712264  32178920     97.83   2923884  25430452  16681300     95908      0.57       380
12:20:01 AM    659088  32232096     98.00   2923884  25430968  16681300     95908      0.57       380
12:30:01 AM    651416  32239768     98.02   2923920  25431448  16681300     95908      0.57       380
12:40:01 AM    651840  32239344     98.02   2923920  25430416  16681300     95908      0.57       380
12:50:01 AM    700696  32190488     97.87   2923920  25430416  16681300     95908      0.57       380

Let’s see what each column means:
The free memory available in KB at that time
The memory used in KB at that time
%age of memory used
This %age of memory was used as buffers
This %age of memory was used as cache
The free swap space in KB at that time
The swap space used in KB at that time
The %age of swap used at that time
The cached swap in KB at that time
At the very end of the output, you will see the average figure for time period.
You can also get specific memory related stats. The -B option shows the paging related activity.
# sar -B
Linux 2.6.9-55.0.9.ELlargesmp (prolin3)     12/27/2008
12:00:01 AM  pgpgin/s pgpgout/s   fault/s  majflt/s
12:10:01 AM    134.43    256.63   8716.33      0.00
12:20:01 AM    122.05    181.48   8652.17      0.00
12:30:01 AM    129.05    253.53   8347.93      0.00
... and so on ...

The column shows metrics at that time, not currently.
The amount of paging into the memory from disk, per second
The amount of paging out to the disk from memory, per second
Page faults per second
Major page faults per second
To get a similar output for swapping related activity, you can use the -W option.
# sar -W
Linux 2.6.9-55.0.9.ELlargesmp (prolin3)     12/27/2008
12:00:01 AM  pswpin/s pswpout/s
12:10:01 AM      0.00      0.00
12:20:01 AM      0.00      0.00
12:30:01 AM      0.00      0.00
12:40:01 AM      0.00      0.00
... and so on ...

The columns are probably self-explanatory; but here is the description of each anyway:
Pages of memory swapped back into the memory from disk, per second
Pages of memory swapped out to the disk from memory, per second
If you see a lot of swapping, you may be running low on memory. It’s not a foregone conclusion but rather something that may be a strong possibility.
To get the disk device statistics, use the -d option:
# sar -d
Linux 2.6.9-55.0.9.ELlargesmp (prolin3)     12/27/2008
12:00:01 AM       DEV       tps  rd_sec/s  wr_sec/s
12:10:01 AM    dev1-0      0.00      0.00      0.00
12:10:01 AM    dev1-1      5.12      0.00    219.61
12:10:01 AM    dev1-2      3.04     42.47     22.20
12:10:01 AM    dev1-3      0.18      1.68      1.41
12:10:01 AM    dev1-4      1.67     18.94     15.19
... and so on ...
Average:      dev8-48      4.48    100.64     22.15
Average:      dev8-64      0.00      0.00      0.00
Average:      dev8-80      2.00     47.82      5.37
Average:      dev8-96      0.00      0.00      0.00
Average:     dev8-112      2.22     49.22     12.08

Here is the description of the columns. Again, they show the metrics at that time.
Transfers per second. Transfers are I/O operations. Note: this is just number of operations; each operation may be large or small. So, this, by itself, does not tell the whole story.
Number of sectors read from the disk per second
Number of sectors written to the disk per second

To get the historical network statistics, you use the -n option:
# sar -n DEV | more
Linux 2.6.9-42.0.3.ELlargesmp (prolin3)     12/27/2008
12:00:01 AM     IFACE   rxpck/s   txpck/s   rxbyt/s   txbyt/s   rxcmp/s   txcmp/s  rxmcst/s
12:10:01 AM        lo      4.54      4.54    782.08    782.08      0.00      0.00      0.00
12:10:01 AM      eth0      2.70      0.00    243.24      0.00      0.00      0.00      0.99
12:10:01 AM      eth1      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:10:01 AM      eth2      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:10:01 AM      eth3      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:10:01 AM      eth4    143.79    141.14  73032.72  38273.59      0.00      0.00      0.99
12:10:01 AM      eth5      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:10:01 AM      eth6      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:10:01 AM      eth7      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:10:01 AM     bond0    146.49    141.14  73275.96  38273.59      0.00      0.00      1.98
… and so on …
Average:        bond0    128.73    121.81  85529.98  27838.44      0.00      0.00      1.98
Average:         eth8      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         eth9      3.52      6.74    251.63  10179.83      0.00      0.00      0.00
Average:         sit0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
In summary, you have these options for the sar command to get the metrics for the components:
Use this option …
… to get the stats on:
Specific CPU(s)
What if you want to get the all the available stats on one output? Instead of calling sar with all these options, you can use the -A option which shows all the stats stored in the sar files.
If the %idle is near zero, your CPU is overloaded. If the %iowait is large, your disks are overloaded.
Once it is established that the system has high CPU usage, the next step is to find out who is using the CPU.

ps -fe | grep smon orps -e -o pcpu -o pid -o user -o args | sort -k 1 | tail -21r Displays the top 10 CPU users on the system. 
28180 1      0 Oct 31 - 0:48  ora_smon_V734 usupport 
30262 1      0 Nov 01 - 0:00  ora_smon_VKHILL usupport 
30900 1      0 Oct 14 - 9:03  ora_smon_V806 usupport 
31958 1    111 Oct 24 - 3:31  ora_smon_V815 <-- Notice the C column usupport 
37986 1      0 Nov 06 - 14:00 ora_smon_V805

Here we can see a smon of the database V815 using a lot of CPU by looking at the C column which reflects the CPU units of processing that are being used.
There are 100 units per CPU so the reason why this number is above 100 is that this machine has 2 cpus.

$ ps -e -o pcpu -o pid -o user -o args | sort -k 1 | tail -21r Displays the top 20 CPU users on the system.
oracleDDDS2 (LOCAL=NO)
oracleDDDS2 (LOCAL=NO)

The PID column can then be matched with the SPID column on the V$PROCESS view to provide more information on the process:
SELECT a.username, a.osuser, a.program, spid, sid, a.serial#   FROM  v$session a, v$process b   WHERE a.paddr = b.addr     AND spid = '&pid';

Automatic Startup Scripts on Linux
Create a file in the /etc/init.d/ directory, in this case the file is called myservice, containing the commands you wish to run at startup and/or shutdown.

Use the chmod command to set the privileges to 750:

chmod 750 /etc/init.d/myservice
Link the file into the appropriate run-level script directories:

ln -s /etc/init.d/myservice /etc/rc0.d/K10myservice
ln -s /etc/init.d/myservice /etc/rc3.d/S99myservice
Associate the myservice service with the appropriate run levels:

chkconfig --level 345 dbora on
The script should now be automatically run at startup and shutdown (with "start" or "stop" as a commandline parameter) like other service initialization scripts.

CRON and & Command
Jobs in background
You can add the '&'  command at the end of any command to run in background
cp * /tmp &
You can also use the "nohup" command to avoid the termination of a background job even if the shell terminates
nohup cp * /tmp &

You can use "bg" to take a job to the background. Before issuing this command, press ^Z, to suspend the process and then use bg, to put it in the background

You can use "fg"  to bring a background job to foreground.

Finally, the command "jobs" will list the current jobs in the shell.

Cron Commands
Cron is a unix utility that allows tasks to be automatically run in the background at regular intervals by the cron daemon often termed as cron jobs.
Crontab (CRON TABLE) is a file which contains the schedule of cron entries to be run and at specified times, you can invoke it with the "crontab -e" command.

A crontab file has five fields for specifying day , date and time  followed by the command to be run at that interval. You can also specify a range of values.

*     *     *     *     *  command to be executed -     -     -     -     - |     |     |     |     | |     |     |     |     +-----> day of week (1 - 7) (monday = 1) |     |     |     +-----------> month (1 - 12) |     |     +-----------------> day of month (1 - 31) |     +-----------------------> hour (0 - 23) +-----------------------------> min (0 - 59) The first 5 fields can be specified using the following rules:

*       - All available values or "first-last". 3-4     - A single range representing each possible from the start to the end of the range inclusive. 1,2,5,6 - A specific list of values. 1-3,5-8 - A specific list of ranges. 0-23/2  - Every other value in the specified range.

The following entry runs a cleanup script a 01:00 each Sunday. Any output or errors from the script are piped to /dev/null to prevent a buildup of mails to root: 0 1 * * 0 /u01/app/oracle/dba/weekly_cleanup > /dev/null 2>&1 # Execute the file save.sh every day at 0.05 and send results to a log file: 5 0 * * * /home/oracle/save.sh.sh 1>>/home/oracle/log 2>&1  # Execute at 2:15pm the first day of each month and do not send the results: 15 14 1 * * /home/oracle/mensual.sh 1>/dev/null 2>&1  # Execute from Monday to Friday at 10PM0 22 * * 1-5 shutdown -h now 1>/dev/null 2>&1  # Execute every minute* * * * * /home/oracle/espia.sh

Cluster Wide CRON Jobs On Tru64
On clustered systems cron is node-specific. If you need a job to fire once per cluster, rather than once per node you need an alternative approach to the standard cron job. One approach is put forward in the HP best practices document (Using cron in a TruCluster Server Cluster), but in my opinion a more elegant solution is proposed by Jason Orendorf of HP Tru64 Unix Enterprise Team (TruCluster Clustercron).

In his solution Jason creates a file called /bin/cronrun with the following contents:

set -- $(/usr/sbin/cfsmgr -F raw /)
shift 12
[[ "$1" = "$(/bin/hostname -s)" ]] && exit 0
exit 1
This script returns TRUE (0) only on the node which is the CFS serving cluster_root.

All cluster wide jobs should have a crontab entry on each node of the cluster like:

5 * * * /bin/cronrun && /usr/local/bin/myjob
Although the cron jobs fire on all nodes, the "/bin/cronrun &&" part of the entry prevents the script from running on all nodes except the current CFS serving cluster_root.

NFS Mount (Sun)
The following deamons must be running for the share to be seen by a PC:

  • /usr/lib/nfs/nfsd -a
  • /usr/lib/nfs/mountd
  • /opt/SUNWpcnfs/sbin/rpc.pcnfsd
To see a list of the nfs mounted drives already present type:

First the mount point must be shared so it can be seen by remote machines:

share -F nfs -o ro /cdrom
Next the share can be mounted on a remote machine by root using:

mkdir /cdrom#1

mount -o ro myhost:/cdrom /cdrom#1

NFS Mount (Tru64)
On the server machine:

If NFS is not currently setup do the following:

  • Application Manager -> System Admin -> Configuration -> NFS
  • Select the "Configure system as an NFS server" option.
  • Accept all defaults.
Create mount point directory:

mkdir /u04/backup
Append the following entry to the "/etc/exports" file:

Make sure the correct permissions are granted on the directory:

chmod -R 777 /u04/backup
On the client machine:

If NFS is not currently setup do the following:

  • Application Manager -> System Admin -> Configuration -> NFS
  • Select the "Configure system as an NFS client" option.
  • Accept all defaults.
Create mount point directory:

mkdir /backup
Append an following entry to the "/etc/fstab" file:

nfs-server-name:/u04/backup     /backup         nfs rw,bg,intr 0 0
Finally, mount the fileset:

mount /backup
At this point you can start to use the mount point from your client machine. Thanks to Bryan Mills for his help with Tru64.

PC XStation Configuration
- Download the CygWin setup.exe from http://www.cygwin.com.
- Install, making sure to select all the X11R6 (or XFree86 in older versions) optional packages.
- If you need root access add the following entry into the /etc/securettys file on each server:

- From the command promot on the PC do the following:

set PATH=PATH;c:cygwinbin;c:cygwinusrX11R6bin
XWin.exe :0 -query <server-name>
- The X environment should start in a new window.

- Many Linux distributions do not start XDMCP by default. To allow XDMCP access from Cygwin edit the "/etc/X11/gdm/gdm.conf" file. Under the "[xdmcp]" section set "Enable=true".

- If you are starting any X applications during the session you will need to set the DISPLAY environment variable. Remember, you are acting as an XStation, not the server itself, so this variable must be set as follows:

DISPLAY=<client-name>:0.0; export DISPLAY

Useful Profile Settings
The following .profile settings rely on the default shell for the user being set to the Korn shell (/bin/ksh).

The backspace key can be configured by adding the following entry:

stty erase "^H"
The command line history can be accessed using the [Esc][k] by adding the following entry:

set -o vi
Auto completion of paths using a double strike of the [Esc] key can be configured by adding the following entry:

set filec

Useful Files
Here are some files that may be of use:

User settings
Group settings for users.
Hostname lookup information.
Kernel parameters for Solaris.
Kernel parameters for Tru64.


Popular posts from this blog

rman interview and scenario based questions and answer and also rman(backup and recovery) discussions panel

How to Get Table Size, Database Size, Indexes Size, schema Size, Tablespace Size, column Size in PostgreSQL Database

How to find the server is whether standby (slave) or primary(master) in Postgresql replication ?

7 Steps to configure BDR replication in postgresql

PostgreSQL Introduction