PostgreSQL Json Data Type
In this tutorial, i will explain you how to work with PostgreSQL JSON data type.
what is JSON ?
1.json
2.jsonb
PRACTICAL 3.HOW TO QUERYING & FILTERING JSON DATA:
the two operators -> and ->> to help you query JSON data.
FROM above example First sales_info -> 'PRODUCTS' returns as JSON objects. And then sales_info -> 'PRODUCTS' ->>'product_name' returns all products as text.
PRACTICAL 4.HOW TO USE WHERE CLAUSE ON JSON DATA:
--Let us take the sales table for using WHERE CLAUSE
--To get a set of keys in the outermost JSON object use function json_object_keys function
Here we are creating array table using json data type and and also explained how to retrieve the array data from json table and how to understand the array data.
what is JSON ?
- It is one of the data type in postgres.
- Json + PostgreSQL =NoSQL
- JSON STANDS FOR JavaScript Object Notation
- JSON data types stored value is valid according to the JSON rules.
- JSON data types are for storing JSON data as multi-level, dynamically structured object graphs.
- serialised object is stored in a text column. The json type takes care of deserialising it back to object graph while reading values from that column.
- The main usage of JSON is to transport data between a server and web application. Unlike other formats, JSON is human-readable text.
- PostgreSQL supports JSON data type since version 9.2.
- It provides many functions and operators for manipulating JSON data.
1.json
2.jsonb
- The json data type stores an exact copy of the input text.
- jsonb data is stored in a decomposed binary format
- jsonb Insertion makes it slightly slower to input due to added conversion overhead.
- jsonb also supports indexing
- JSONB does not keep duplicate object keys. If duplicate keys are specified in the input, only the last value is kept.
postgres=# CREATE TABLE SALES ( ID INT NOT NULL PRIMARY KEY, SALES_INFO json NOT NULL );From above Sales table consists of two columns:
- The id column is the primary key column that identifies the sales id.
- The sales_info column stores the data in the form of JSON data types.
postgres=# \d sales
Table "public.sales"
Column | Type | Modifiers
------------+---------+-----------
id | integer | not null
sales_info | json | not null
Indexes:
"sales_pkey" PRIMARY KEY, btree (id)
PRACTICAL 2. INSERTING JSON DATA ON SALES TABLE:
--List down the sales tableINSERT INTO SALES VALUES
(1,'{ "customer": "NIJAM", "PRODUCTS": {"product_name": "choclate","total_item": 6}}'
);
INSERT INTO SALES VALUES
(2,'{ "customer": "ABU", "PRODUCTS": {"product_name": "badam","total_item": 5}}'
);
INSERT INTO SALES VALUES
(3,'{ "customer": "UMAR", "PRODUCTS": {"product_name": "mobile","total_item": 1}}'
);
postgres=# select * from sales;
id | sales_info
----+----------------------------------------------------------------------------------
1 | { "customer": "NIJAM", "PRODUCTS": {"product_name": "choclate","total_item": 6}}
2 | { "customer": "ABU", "PRODUCTS": {"product_name": "badam","total_item": 5}}
3 | { "customer": "UMAR", "PRODUCTS": {"product_name": "mobile","total_item": 1}}
(3 rows)
--list down the json column onlypostgres=# select sales_info from sales;
sales_info
----------------------------------------------------------------------------------
{ "customer": "NIJAM", "PRODUCTS": {"product_name": "choclate","total_item": 6}}
{ "customer": "ABU", "PRODUCTS": {"product_name": "badam","total_item": 5}}
{ "customer": "UMAR", "PRODUCTS": {"product_name": "mobile","total_item": 1}}
(3 rows)
--List down the id column only for understanding purposes.postgres=# select id from sales;
id
----
1
2
3
(3 rows)
Deleting json objects:postgres=# select * from sales;
id | sales_info
----+-------------------------------------------------------------------------------
2 | { "customer": "ABU", "PRODUCTS": {"product_name": "badam","total_item": 5}}
3 | { "customer": "UMAR", "PRODUCTS": {"product_name": "mobile","total_item": 1}}
4 | { "customer": "junaith", "PRODUCTS": {"product_name": "pen","total_item": 8}}
7 | { "customer": "daniel", "PRODUCTS": {"product_name": "car","total_item": 8}}
8 | { "customer": "daniel", "PRODUCTS": {"product_name": "car","total_item": 8}}
(5 rows)
postgres=# delete from sales where sales_info ->'PRODUCTS'->>'total_item'='8';
DELETE 3
postgres=# select * from sales;
id | sales_info
----+-------------------------------------------------------------------------------
2 | { "customer": "ABU", "PRODUCTS": {"product_name": "badam","total_item": 5}}
3 | { "customer": "UMAR", "PRODUCTS": {"product_name": "mobile","total_item": 1}}
(2 rows)
the two operators -> and ->> to help you query JSON data.
- -> will return the attribute as a JSON object key(original JSON type).
- ->> will return the property as integer or text (the parsed form of the attribute).
postgres=# select sales_info -> 'customer' from sales;
?column?
----------
"NIJAM"
"ABU"
"UMAR"
(3 rows)
postgres=# select sales_info ->> 'customer' from sales;
?column?
----------
NIJAM
ABU
UMAR
(3 rows)
postgres=# select sales_info ->> 'customer' as customer_name from sales;
customer_name
---------------
NIJAM
ABU
UMAR
(3 rows)
postgres=# select sales_info -> 'PRODUCTS' from sales;
?column?
----------------------------------------------
{"product_name": "choclate","total_item": 6}
{"product_name": "badam","total_item": 5}
{"product_name": "mobile","total_item": 1}
- Using -> operator returns a JSON object, you can chain it with the operator ->> to retrieve a specific node.
- JSON IS CASESENSITIVE DEFAULTY, from below example first i try to retrieve the data using "products"(lowercase) that is why it showed empty value after that i chaneged "PRODUCTS"(UPPER CASE) then it showing values.
postgres=# select sales_info -> 'products' ->>'product_name' from sales;
?column?
----------
(3 rows)
postgres=# select sales_info -> 'PRODUCTS' ->'product_name' from sales;
?column?
------------
"choclate"
"badam"
"mobile"
(3 rows)
postgres=# select sales_info -> 'PRODUCTS' ->>'product_name' from sales;
?column?
----------
choclate
badam
mobile
(3 rows
PRACTICAL 4.HOW TO USE WHERE CLAUSE ON JSON DATA:
--Let us take the sales table for using WHERE CLAUSE
postgres=# select * from sales;
id | sales_info
----+----------------------------------------------------------------------------------
1 | { "customer": "NIJAM", "PRODUCTS": {"product_name": "choclate","total_item": 6}}
2 | { "customer": "ABU", "PRODUCTS": {"product_name": "badam","total_item": 5}}
3 | { "customer": "UMAR", "PRODUCTS": {"product_name": "mobile","total_item": 1}}
(3 rows)
--To find out what are the products bought customer UMAR I use the following querypostgres=# SELECT sales_info ->>'customer' as customer_name,sales_info ->>'PRODUCTS' ->>'product_name'
as name_of_product from sales where sales_info ->>'customer'='UMAR';
ERROR: operator does not exist: text ->> unknown
LINE 1: ...stomer' as customer_name,sales_info ->>'PRODUCTS' ->>'produc...
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
SOLUTION:
- FIRST TIME WE SPECIFIED sales_info ->>'PRODUCTS' ->>'product_name' SO WE NEED TO GIVE sales_info ->'PRODUCTS' ->>'product_name'
postgres=# SELECT sales_info ->>'customer' as customer_name,sales_info ->'PRODUCTS' ->>'product_name' as name_of_product from sales where sales_info ->>'customer'='UMAR'; customer_name | name_of_product ---------------+----------------- UMAR | mobile (1 row)
- Defaulty FROM above example First sales_info -> 'PRODUCTS' returns as JSON objects. And then 'PRODUCTS' ->>'product_name' returns all products as text.
- when you call json nested data first you call like ->(json object key format)then Finally you need to call ->>(text format or json object key format).
- if you specify firstly json field as text(->>) format when calling nested node ,after that if you specify -> or ->> values json will not know first text format(->>) value that is why it throw error.
- JSON displayed text format but actually json work as json object key(->) format .
postgres=# SELECT sales_info ->>'customer' as customer_name,sales_info ->'PRODUCTS' ->'product_name' as name_of_product from sales where sales_info ->>'customer'='UMAR'; customer_name | name_of_product ---------------+----------------- UMAR | "mobile" (1 row)
- HERE i specified first json object(->) format finnaly i called json object format (->) this is not a problem.
- other wise if you specify firstly as json text format (->>) finally if you give json object format (->) or json text format (->>) what ever it is json will not know fist values surely it will throw error.
PRACTICAL 5.HOW TO USE FUNCTION ON JSON DATA:
json_each() function allows us to expand the outermost JSON object into a set of key-value pairs.
postgres=# select sales_info from sales;
sales_info
----------------------------------------------------------------------------------
{ "customer": "NIJAM", "PRODUCTS": {"product_name": "choclate","total_item": 6}}
{ "customer": "ABU", "PRODUCTS": {"product_name": "badam","total_item": 5}}
{ "customer": "UMAR", "PRODUCTS": {"product_name": "mobile","total_item": 1}}
(3 rows)
postgres=# select json_each (sales_info) from sales;
json_each
-----------------------------------------------------------------
(customer,"""NIJAM""")
(PRODUCTS,"{""product_name"": ""choclate"",""total_item"": 6}")
(customer,"""ABU""")
(PRODUCTS,"{""product_name"": ""badam"",""total_item"": 5}")
(customer,"""UMAR""")
(PRODUCTS,"{""product_name"": ""mobile"",""total_item"": 1}")
(6 rows)
--If you want calculate array size of json functionselect json_array_length('[1,2,3,4,5,6,7,8]');
PRACTICAL 5.HOW TO CREATE ARRAY TABLE ON JSON DATA:Here we are creating array table using json data type and and also explained how to retrieve the array data from json table and how to understand the array data.
postgres=# create table TAB_ARRAY (id INT,BOOKS json);
CREATE TABLE
--Let us inserting array valuesinsert into tab_array values (1, '{
"book_name": "story",
"book_id": ["111", "232", "353", "484"] }'
);
insert into tab_array values (2, '{
"book_name": "the king",
"book_id": ["1231", "78", "7", "9"] }'
);
--List down the tab_array table
--already we have sales table now let us go to create indicies on any (even nested) JSON field:
--taking explain plan for without index
postgres=# select * from tab_array;
id | books
----+---------------------------------------------
1 | { +
| "book_name": "story", +
| "book_id": ["111", "232", "353", "484"] }
2 | { +
| "book_name": "the king", +
| "book_id": ["1231", "78", "7", "9"] }
(2 rows)
--filter the json datapostgres=# select id, books->'book_id'->>2 from tab_array;
id | ?column?
----+----------
1 | 353
2 | 7
(2 rows)
postgres=# select id, books->'book_id'->>0 from tab_array where id=1;
id | ?column?
----+----------
1 | 111
(1 row)
Note:Array start from 0 ...N you will undestand json array below example
["111", "232", "353", "484"] ------>array values
0 1 2 3 ------> array start from 0...3
from above notes
if u give where id=1 and books->'book_id'->>1 means it return 232
if u give where id=1 and books->'book_id'->>2 means it return 353
if u give where id=1 and books->'book_id'->>3 means it return 484
if u give where id=1 and books->'book_id'->>0 means it return 111
PRACTICAL 6.HOW TO CREATE INDEX ON JSON TABLE:
- index is fast fetching data without going full table scan.
Here i will explain how to create index on json data type table and how to work index on json data type.
--check the sales table for already having any index or not.
postgres=# \d sales
Table "public.sales"
Column | Type | Modifiers
------------+---------+-----------
id | integer | not null
sales_info | json | not null
Indexes:
"sales_pkey" PRIMARY KEY, btree (id)
postgres=# \di sales
No matching relations found.
create unique index ind_name on sales ((sales_info ->'PRODUCTS'->>'total_item'));--Now check the table index is properly created or not using \di,\dt,pg_indexes.
postgres=# \di ind_name List of relations Schema | Name | Type | Owner | Table --------+----------+-------+----------+------- public | ind_name | index | postgres | sales (1 row) Deleting json objectspostgres=# \d sales Table "public.sales" Column | Type | Modifiers ------------+---------+----------- id | integer | not null sales_info | json | not null Indexes: "sales_pkey" PRIMARY KEY, btree (id) "ind_name" UNIQUE, btree (((sales_info -> 'PRODUCTS'::text) ->> 'total_item'::text)) postgres=# select tablename,indexname,indexdef from pg_indexes where tablename='sales'; tablename | indexname | indexdef -----------+------------+----------------------------------------------------------------------------------- ------------------------------ sales | sales_pkey | CREATE UNIQUE INDEX sales_pkey ON sales USING btree (id) sales | ind_name | CREATE UNIQUE INDEX ind_name ON sales USING btree ((((sales_info -> 'PRODUCTS'::te xt) ->> 'total_item'::text))) (2 rows)
From below example index is properly working becouse we already created unique index on on json column of "total_item" unique index will not accept duplicate value if you try to insert duplicate values it will throw errors,i hope you understood it from below example.
postgres=# select * from sales;
id | sales_info
----+----------------------------------------------------------------------------------
2 | { "customer": "ABU", "PRODUCTS": {"product_name": "badam","total_item": 5}}
3 | { "customer": "UMAR", "PRODUCTS": {"product_name": "mobile","total_item": 1}}
5 | { "customer": "NIJAM", "PRODUCTS": {"product_name": "choclate","total_item": 9}}
1 | { "customer": "ABU", "PRODUCTS": {"product_name": "badam","total_item": 6}}
4 | { "customer": "UMAR", "PRODUCTS": {"product_name": "mobile","total_item": 8}}
(5 rows)
postgres=# INSERT INTO SALES VALUES
(9,'{ "customer": "UMAR", "PRODUCTS": {"product_name": "mobile","total_item": 8}}'
);
ERROR: duplicate key value violates unique constraint "ind_name"
DETAIL: Key (((sales_info -> 'PRODUCTS'::text) ->> 'total_item'::text))=(8) already exists.
another example for json index is properly working or not
HERE LET US consider sales table having above 20000 json records for making practical json index , This way we can begin to see some performance implications when dealing with JSON data in Postgres, as well as how to solve them.
SELECT count(*) FROM sales WHERE sales_info ->'PRODUCTS'->>'total_item'= '1';
count
-------
4937
(1 row)
postgres=# explain analyze SELECT sales_info ->>'customer' as customer_name from sales
where sales_info ->'PRODUCTS'->>'total_item'= '1';
Aggregate (cost=335.12..335.13 rows=1 width=0) (actual time=4.421..4.421 rows=1 loops=1) -> Seq Scan on cards (cost=0.00..335.00 rows=50 width=0) (actual time=0.016..3.961 rows=4938 loops=1)
Filter: (((sales_info -> 'PRODUCTS'::text) ->> 'total_item'::text) = '1'::text)
Rows Removed by Filter: 5062
Total runtime: 4.475 ms$cd /db_data/data/pg_log
$tail -f postgresql-Tue.log
Now, that wasn’t that slow of a query at 4.5ms, but let’s see if we can improve it.CREATE INDEX ind_name1 ON sales ((sales_info ->'PRODUCTS'->>'total_item'));
If we run the same query which now has an index, we end up cutting the time in half.SELECT count(*) FROM sales WHERE sales_info ->'PRODUCTS'->>'total_item'= '1';
count
-------
4937
(1 row)
postgres=# explain analyze SELECT sales_info ->>'customer' as customer_name from sales
where sales_info ->'PRODUCTS'->>'total_item'= '1';
Aggregate (cost=118.97..118.98 rows=1 width=0) (actual time=2.122..2.122 rows=1 loops=1) -> Bitmap Heap Scan on cards (cost=4.68..118.84 rows=50 width=0) (actual time=0.711..1.664 rows=4938 loops=1)
Recheck Cond: (((sales_info -> 'PRODUCTS'::text) ->> 'total_item'::text) = '1'::text)
Heap Blocks: exact=185
-> Bitmap Index Scan on idxfinished (cost=0.00..4.66 rows=50 width=0) (actual time=0.671..0.671 rows=4938 loops=1)
Index Cond:(((sales_info -> 'PRODUCTS'::text) ->> 'total_item'::text) = '1'::text)
Total runtime: 2.175 ms
Our query is now taking advantage of the ind_name1 index we created, and the query time has been approximately cut in half.
Comments
Post a Comment