Drop Down MenusCSS Drop Down MenuPure CSS Dropdown Menu

PostgreSQL Json Data Type

In this tutorial, i will explain you how to work with PostgreSQL JSON data type.
what is JSON ?
  • It is one of the data type in postgres.
  • Json + PostgreSQL =NoSQL
  • JSON STANDS FOR JavaScript Object Notation
  • JSON data types stored value is valid according to the JSON rules.
  • JSON data types are for storing JSON data as multi-level, dynamically structured object graphs.
  • serialised object is stored in a text column. The json type takes care of deserialising it back to object graph while reading values from that column.
  • The main usage of JSON is to transport data between a server and web application. Unlike other formats, JSON is human-readable text.
  • PostgreSQL supports JSON data type since version 9.2. 
  • It provides many functions and operators  for manipulating JSON data.
There are two JSON data types: 
1.json 
2.jsonb
  • The json data type stores an exact copy of the input text.
  • jsonb data is stored in a decomposed binary format 
  • jsonb Insertion  makes it slightly slower to input due to added conversion overhead.
  • jsonb also supports indexing
  • JSONB does not keep duplicate object keys. If duplicate keys are specified in the input, only the last value is kept.
PRACTICAL 1. CREATING SIMPLE JSON TABLE:

postgres=# CREATE TABLE SALES (
 ID INT NOT NULL PRIMARY KEY,
 SALES_INFO json NOT NULL
);
From above  Sales  table consists of two columns:
  • The id column is the primary key column that identifies the sales id.
  • The sales_info column stores the data in the form of JSON data types.
--Describe the sales table using \d

postgres=# \d sales
       Table "public.sales"
   Column   |  Type   | Modifiers 
------------+---------+-----------
 id         | integer | not null
 sales_info | json    | not null
Indexes:
    "sales_pkey" PRIMARY KEY, btree (id) 
PRACTICAL 2. INSERTING  JSON DATA ON SALES TABLE:

INSERT INTO SALES VALUES
 (1,'{ "customer": "NIJAM", "PRODUCTS": {"product_name": "choclate","total_item": 6}}'
 );

INSERT INTO SALES VALUES
 (2,'{ "customer": "ABU", "PRODUCTS": {"product_name": "badam","total_item": 5}}'
 );

INSERT INTO SALES VALUES
 (3,'{ "customer": "UMAR", "PRODUCTS": {"product_name": "mobile","total_item": 1}}'
 );  
--List down the sales table

postgres=# select * from sales;
 id |                                    sales_info                                    
----+----------------------------------------------------------------------------------
  1 | { "customer": "NIJAM", "PRODUCTS": {"product_name": "choclate","total_item": 6}}
  2 | { "customer": "ABU", "PRODUCTS": {"product_name": "badam","total_item": 5}}
  3 | { "customer": "UMAR", "PRODUCTS": {"product_name": "mobile","total_item": 1}}
(3 rows)  
--list down the json column only

postgres=# select sales_info from sales;
                                    sales_info                                    
----------------------------------------------------------------------------------
 { "customer": "NIJAM", "PRODUCTS": {"product_name": "choclate","total_item": 6}}
 { "customer": "ABU", "PRODUCTS": {"product_name": "badam","total_item": 5}}
 { "customer": "UMAR", "PRODUCTS": {"product_name": "mobile","total_item": 1}}
(3 rows)  
--List down the id column only for understanding purposes.

postgres=# select id from sales;
 id 
----
  1
  2
  3
(3 rows)  
Deleting json objects:

postgres=# select * from sales;
 id |                                  sales_info                                   
----+-------------------------------------------------------------------------------
  2 | { "customer": "ABU", "PRODUCTS": {"product_name": "badam","total_item": 5}}
  3 | { "customer": "UMAR", "PRODUCTS": {"product_name": "mobile","total_item": 1}}
  4 | { "customer": "junaith", "PRODUCTS": {"product_name": "pen","total_item": 8}}
  7 | { "customer": "daniel", "PRODUCTS": {"product_name": "car","total_item": 8}}
  8 | { "customer": "daniel", "PRODUCTS": {"product_name": "car","total_item": 8}}
(5 rows)

postgres=# delete from sales where sales_info ->'PRODUCTS'->>'total_item'='8';
DELETE 3
postgres=# select * from sales;                                               
 id |                                  sales_info                                   
----+-------------------------------------------------------------------------------
  2 | { "customer": "ABU", "PRODUCTS": {"product_name": "badam","total_item": 5}}
  3 | { "customer": "UMAR", "PRODUCTS": {"product_name": "mobile","total_item": 1}}
(2 rows)  

PRACTICAL 3.HOW TO QUERYING & FILTERING JSON DATA:
the two operators -> and ->> to help you query JSON data.
  • -> will return the attribute as a JSON object key(original JSON type).
  • ->> will return the property as integer or text (the parsed form of the attribute).
--See the below example first two query  did'nt specify column name that is why it showing  ?column? after that i specified the column name as customer_name 

postgres=# select sales_info -> 'customer' from sales;
 ?column? 
----------
 "NIJAM"
 "ABU"
 "UMAR"
(3 rows)

postgres=# select sales_info ->> 'customer' from sales;
 ?column? 
----------
 NIJAM
 ABU
 UMAR
(3 rows)

postgres=# select sales_info ->> 'customer' as customer_name from sales;
 customer_name 
---------------
 NIJAM
 ABU
 UMAR
(3 rows)

postgres=# select sales_info -> 'PRODUCTS'  from sales;                 
                   ?column?                   
----------------------------------------------
 {"product_name": "choclate","total_item": 6}
 {"product_name": "badam","total_item": 5}
 {"product_name": "mobile","total_item": 1}                                              
  • Using -> operator returns a JSON object, you can chain it with the operator ->> to retrieve a specific node.
  • JSON IS CASESENSITIVE DEFAULTY, from below example first i try to retrieve the data using "products"(lowercase) that is why it showed empty value after that i chaneged "PRODUCTS"(UPPER CASE)  then it showing values.  
postgres=# select sales_info -> 'products' ->>'product_name' from sales;
 ?column? 
----------
 
 
 
(3 rows)


postgres=# select sales_info -> 'PRODUCTS' ->'product_name' from sales;
  ?column?  
------------
 "choclate"
 "badam"
 "mobile"
(3 rows)

postgres=# select sales_info -> 'PRODUCTS' ->>'product_name' from sales;
 ?column? 
----------
 choclate
 badam
 mobile
(3 rows  
FROM above example First  sales_info -> 'PRODUCTS'  returns  as JSON objects. And then sales_info -> 'PRODUCTS' ->>'product_name' returns all products as text.

PRACTICAL 4.HOW TO USE WHERE CLAUSE ON JSON DATA:
--Let us take the sales table for using WHERE CLAUSE

postgres=# select * from sales;
 id |                                    sales_info                                    
----+----------------------------------------------------------------------------------
  1 | { "customer": "NIJAM", "PRODUCTS": {"product_name": "choclate","total_item": 6}}
  2 | { "customer": "ABU", "PRODUCTS": {"product_name": "badam","total_item": 5}}
  3 | { "customer": "UMAR", "PRODUCTS": {"product_name": "mobile","total_item": 1}}
(3 rows)  
--To find out what are the products  bought customer UMAR I use the following query

postgres=# SELECT sales_info ->>'customer' as customer_name,sales_info ->>'PRODUCTS' ->>'product_name' 
as name_of_product from sales where sales_info ->>'customer'='UMAR';
ERROR:  operator does not exist: text ->> unknown
LINE 1: ...stomer' as customer_name,sales_info ->>'PRODUCTS' ->>'produc...
                                                             ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.  
SOLUTION:
  • FIRST TIME WE SPECIFIED sales_info ->>'PRODUCTS' ->>'product_name' SO WE NEED TO GIVE  sales_info ->'PRODUCTS' ->>'product_name' 
  • postgres=# SELECT sales_info ->>'customer' as customer_name,sales_info ->'PRODUCTS' ->>'product_name' 
    as name_of_product from sales where sales_info ->>'customer'='UMAR';
     customer_name | name_of_product 
    ---------------+-----------------
     UMAR          | mobile
    (1 row) 
  • Defaulty FROM above example First  sales_info -> 'PRODUCTS'  returns  as JSON objects. And then 'PRODUCTS' ->>'product_name' returns all products as text.
  • when you call json nested data first you call like ->(json object key format)then Finally  you need to call ->>(text format or json object key format).
  • if you specify firstly json field as text(->>) format when calling nested node ,after that if you specify -> or ->> values  json will not know first text format(->>) value that is why it throw error.
  • JSON displayed  text format but actually json work as json object key(->) format .
  • postgres=# SELECT sales_info ->>'customer' as customer_name,sales_info ->'PRODUCTS' ->'product_name' 
    as name_of_product from sales where sales_info ->>'customer'='UMAR';
     customer_name | name_of_product 
    ---------------+-----------------
     UMAR          | "mobile"
    (1 row) 
  • HERE i specified first json object(->) format finnaly i called json object format (->) this is not a problem.
  • other wise if you specify firstly as json text format (->>) finally if you give json object format (->) or  json text format (->>) what ever it is json will not know fist values surely it will throw error.

PRACTICAL 5.HOW TO USE FUNCTION ON JSON DATA:

json_each() function allows us to expand the outermost JSON object into a set of key-value pairs. 

postgres=# select sales_info from sales;
                                    sales_info                                    
----------------------------------------------------------------------------------
 { "customer": "NIJAM", "PRODUCTS": {"product_name": "choclate","total_item": 6}}
 { "customer": "ABU", "PRODUCTS": {"product_name": "badam","total_item": 5}}
 { "customer": "UMAR", "PRODUCTS": {"product_name": "mobile","total_item": 1}}
(3 rows)  
--To get a set of keys in the outermost JSON object use function json_object_keys function

postgres=# select json_each (sales_info) from sales;
                            json_each                            
-----------------------------------------------------------------
 (customer,"""NIJAM""")
 (PRODUCTS,"{""product_name"": ""choclate"",""total_item"": 6}")
 (customer,"""ABU""")
 (PRODUCTS,"{""product_name"": ""badam"",""total_item"": 5}")
 (customer,"""UMAR""")
 (PRODUCTS,"{""product_name"": ""mobile"",""total_item"": 1}")
(6 rows)  
--If you want calculate  array size of json function

select json_array_length('[1,2,3,4,5,6,7,8]');  
PRACTICAL 5.HOW TO CREATE ARRAY TABLE ON JSON DATA:
Here we are creating array table using json data type and and also explained how to   retrieve the array data from json table and how to understand the array data.

postgres=# create table TAB_ARRAY (id INT,BOOKS json);
CREATE TABLE  
--Let us inserting array  values

insert into tab_array values (1, '{
  "book_name": "story",
  "book_id": ["111", "232", "353", "484"] }' 
 );

insert into tab_array values (2, '{
  "book_name": "the king",
  "book_id": ["1231", "78", "7", "9"] }' 
 ); 

--List down the tab_array table

postgres=# select * from tab_array;
 id |                    books                    
----+---------------------------------------------
  1 | {                                          +
    |   "book_name": "story",                    +
    |   "book_id": ["111", "232", "353", "484"] }
  2 | {                                          +
    |   "book_name": "the king",                 +
    |   "book_id": ["1231", "78", "7", "9"] }
(2 rows)  
--filter the json data

postgres=# select id, books->'book_id'->>2 from tab_array;
 id | ?column? 
----+----------
  1 | 353
  2 | 7
(2 rows)

postgres=# select id, books->'book_id'->>0 from tab_array where id=1;
 id | ?column? 
----+----------
  1 | 111
(1 row)  
Note:Array start from 0 ...N you will undestand json array below example
["111", "232", "353", "484"]     ------>array values
    0        1         2        3         ------> array start from 0...3
from above notes 
if u give  where id=1 and books->'book_id'->>1 means it return 232
if u give  where id=1 and books->'book_id'->>2 means it return 353
if u give  where id=1 and books->'book_id'->>3 means it return 484
if u give  where id=1 and books->'book_id'->>0 means it return 111



PRACTICAL 6.HOW TO CREATE INDEX ON JSON TABLE:
  • index is fast fetching data without going full table scan.
Here  i will explain how to create index on json data type table and how to  work index on json data type.
--check the sales table for already having any index or not.

postgres=# \d sales
       Table "public.sales"
   Column   |  Type   | Modifiers 
------------+---------+-----------
 id         | integer | not null
 sales_info | json    | not null
Indexes:
    "sales_pkey" PRIMARY KEY, btree (id)


postgres=# \di sales
No matching relations found.  
--already we have sales table now let us go to create indicies on any (even nested) JSON field:

create unique index ind_name
on sales ((sales_info ->'PRODUCTS'->>'total_item'));
--Now check the table index is properly created or not using \di,\dt,pg_indexes.

postgres=# \di ind_name
              List of relations
 Schema |   Name   | Type  |  Owner   | Table 
--------+----------+-------+----------+-------
 public | ind_name | index | postgres | sales
(1 row)



Deleting json objectspostgres=# \d sales
       Table "public.sales"
   Column   |  Type   | Modifiers 
------------+---------+-----------
 id         | integer | not null
 sales_info | json    | not null
Indexes:
    "sales_pkey" PRIMARY KEY, btree (id)
    "ind_name" UNIQUE, btree (((sales_info -> 'PRODUCTS'::text) ->> 'total_item'::text))



postgres=# select tablename,indexname,indexdef from pg_indexes where tablename='sales';
 tablename | indexname  |                                                    indexdef                       
                              
-----------+------------+-----------------------------------------------------------------------------------
------------------------------
 sales     | sales_pkey | CREATE UNIQUE INDEX sales_pkey ON sales USING btree (id)
 sales     | ind_name   | CREATE UNIQUE INDEX ind_name ON sales USING btree ((((sales_info -> 'PRODUCTS'::te
xt) ->> 'total_item'::text)))
(2 rows)
 From below example index is  properly working becouse we already created unique index on on json column of "total_item" unique index will not accept duplicate value if you try to insert duplicate values it will throw errors,i hope you understood it from below example.

postgres=# select * from sales;
 id |                                    sales_info                                    
----+----------------------------------------------------------------------------------
  2 | { "customer": "ABU", "PRODUCTS": {"product_name": "badam","total_item": 5}}
  3 | { "customer": "UMAR", "PRODUCTS": {"product_name": "mobile","total_item": 1}}
  5 | { "customer": "NIJAM", "PRODUCTS": {"product_name": "choclate","total_item": 9}}
  1 | { "customer": "ABU", "PRODUCTS": {"product_name": "badam","total_item": 6}}
  4 | { "customer": "UMAR", "PRODUCTS": {"product_name": "mobile","total_item": 8}}
(5 rows)


postgres=# INSERT INTO SALES VALUES
 (9,'{ "customer": "UMAR", "PRODUCTS": {"product_name": "mobile","total_item": 8}}'
 );
ERROR:  duplicate key value violates unique constraint "ind_name"
DETAIL:  Key (((sales_info -> 'PRODUCTS'::text) ->> 'total_item'::text))=(8) already exists. 
another example for json index is properly working or not
HERE LET US consider  sales table having above 20000 json records for making practical json index , This way we can begin to see some performance implications when dealing with JSON data in Postgres, as well as how to solve them.

SELECT count(*) FROM sales WHERE sales_info ->'PRODUCTS'->>'total_item'= '1';
count
-------
4937 
(1 row) 
--taking explain plan for without index

postgres=# explain analyze SELECT sales_info ->>'customer' as customer_name from sales 
where sales_info ->'PRODUCTS'->>'total_item'= '1';
Aggregate (cost=335.12..335.13 rows=1 width=0) (actual time=4.421..4.421 rows=1 loops=1) -> Seq Scan on cards (cost=0.00..335.00 rows=50 width=0) (actual time=0.016..3.961 rows=4938 loops=1) 
    Filter: (((sales_info -> 'PRODUCTS'::text) ->> 'total_item'::text) = '1'::text) 
    Rows Removed by Filter: 5062 
Total runtime: 4.475 ms$cd /db_data/data/pg_log
$tail -f  postgresql-Tue.log  
Now, that wasn’t that slow of a query at 4.5ms, but let’s see if we can improve it.

CREATE INDEX ind_name1 ON sales ((sales_info ->'PRODUCTS'->>'total_item')); 
If we run the same query which now has an index, we end up cutting the time in half.

SELECT count(*) FROM sales WHERE sales_info ->'PRODUCTS'->>'total_item'= '1';
count
-------
4937 
(1 row)

postgres=# explain analyze SELECT sales_info ->>'customer' as customer_name from sales 
where sales_info ->'PRODUCTS'->>'total_item'= '1';
Aggregate (cost=118.97..118.98 rows=1 width=0) (actual time=2.122..2.122 rows=1 loops=1) -> Bitmap Heap Scan on cards (cost=4.68..118.84 rows=50 width=0) (actual time=0.711..1.664 rows=4938 loops=1) 
    Recheck Cond: (((sales_info -> 'PRODUCTS'::text) ->> 'total_item'::text) = '1'::text)  
    Heap Blocks: exact=185 
    -> Bitmap Index Scan on idxfinished (cost=0.00..4.66 rows=50 width=0) (actual time=0.671..0.671 rows=4938 loops=1) 
    Index Cond:(((sales_info -> 'PRODUCTS'::text) ->> 'total_item'::text) = '1'::text) 
Total runtime: 2.175 ms 
Our query is now taking advantage of the ind_name1 index we created, and the query time has been approximately cut in half.



No comments:

Post a Comment