Even Wikipedia only offers a negative definition:
«A database that focuses on data that is modeled in means other than tabular relations»
Repository at github.com/dbu/postgres-json
1 | CREATE TABLE auction_jsonb ( |
2 | id SERIAL NOT NULL, |
3 | title VARCHAR(255) NOT NULL, |
4 | current_price INT DEFAULT NULL, |
5 | item JSONB NOT NULL, |
6 | PRIMARY KEY(id) |
7 | ); |
JSON: Untyped, stored raw | JSONB: Stored structured |
---|---|
|
|
=> Use JSONB unless you rely on non-standard behaviour.
item->'author'
= '"David"'
or, 2.5 times faster
item->>'author'
= 'David'
> The JSON information |
>> Extract primitive type |
1 | SELECT COUNT(*) |
2 | FROM auction_jsonb |
3 | WHERE item->>'author' = 'David'; |
Rows | json | jsonb |
---|---|---|
10k | 136 | 31 |
100k | 319 | 46 |
1M | 1'280 | 157 |
10M | 15'505 | 2'768 |
Query time in milliseconds
1 | CREATE INDEX json_author |
2 | ON auction_json_indexed ((item->>'author')); |
Rows | json | json indexed | jsonb | jsonb indexed |
---|---|---|---|---|
10k | 136 | 25 | 31 | 20 |
100k | 319 | 12 | 46 | 13 |
1M | 1'280 | 3 | 157 | 3 |
10M | 15'505 | 7 | 2'768 | 6 |
Query time in milliseconds
At 10M rows, postgres stopped using the index and went for full table scan
1 | -- Value <= 1.15 makes postgres use the index |
2 | ALTER TABLESPACE pg_default |
3 | SET (random_page_cost=1.1); |
Thanks Frank Heikens on stackoverflow for hinting random_page_cost
1 | CREATE TABLE auction_generated_fields ( |
2 | id SERIAL NOT NULL, |
3 | author VARCHAR(255) generated always |
4 | as (item->>'author') stored, |
5 | item JSONB NOT NULL, PRIMARY KEY(id)); |
Rows | jsonb indexed | generated | generated indexed |
---|---|---|---|
10k | 20 | 21 | 18 |
100k | 13 | 30 | 10 |
1M | 3 | 88 | 3 |
10M | 6 | 786 | 4 |
Query time in milliseconds
01 | CREATE FUNCTION text_to_timestamp(text) RETURNS TIMESTAMP |
02 | LANGUAGE sql IMMUTABLE AS |
03 | $$ |
04 | SELECT CASE |
05 | WHEN $1 ~ '^\d{4}-\d{2}-\d{2}[ |T]\d{2}:\d{2}:\d{2}(\.\d+)?(\+00:00)?$' |
06 | THEN CAST($1 AS timestamp without time zone) |
07 | END |
08 | $$; |
09 |
10 | CREATE TABLE auction_generated_fields ( |
11 | id SERIAL NOT NULL, |
12 | start_date TIMESTAMP(0) generated always |
13 | AS (text_to_timestamp(item->>'startDate')) stored, |
14 | item JSONB NOT NULL, PRIMARY KEY(id)); |
item->type
JSON_QUERY(item, '$.author')
item #> '{type}'
item['type']
No sharing of indexes.
=> Decide which construct you use
contains
item @> '{"author": "David"}'
key exists
item ? 'author'
jsonb_exists(item, 'author')
any key exists
item ?| array['author','foo']
jsonb_exists_any(item, array['a..', 'f..']
all keys exist
item ?& array['author','title']
jsonb_exists_all(item, array['a..', 'f..']
1 | CREATE INDEX auction_json_gin_idx |
2 | ON auction_jsonb_gin USING GIN (item); |
3 |
4 | SELECT COUNT(*) |
5 | FROM auction_jsonb_gin |
6 | WHERE item @> '{"author": "David"}'; |
Rows | ->> indexed | @> no index | @> with index | @> with GIN |
---|---|---|---|---|
10k | 20 | 5 | 5 | 5 |
100k | 13 | 20 | 19 | 5 |
1M | 3 | 144 | 149 | 13 |
10M | 6 | 1'239 | 1'235 | 76 |
Query time in milliseconds
Speed up arbitrary JSON contains queries, but not other queries
Data distribution of the indexed column makes a huge difference
Variants | json | json idx | jsonb | jsonb idx | GIN >> | GIN @ | gen | gen idx |
---|---|---|---|---|---|---|---|---|
10 | 12'372 | 784 | 1'895 | 676 | 1'882 | 1'364 | 1'030 | 44 |
1'000 | 12'424 | 6 | 1'583 | 6 | 1'260 | 58 | 931 | 3 |
100'000 | 13'197 | 3 | 1'710 | 3 | 1'501 | 5 | 838 | 3 |
Response times in milliseconds, table with 10 million rows
Jsonb, indexes and generated fields do not come for free
Rows | json | json idx | jsonb | jsonb idx | GIN | gen | gen idx |
---|---|---|---|---|---|---|---|
10k | 60 | 90 | 90 | 120 | 130 | 94 | 143 |
100k | 550 | 740 | 820 | 990 | 2'110 | 890 | 1'000 |
1M | 5'150 | 8'500 | 7'670 | 11'340 | 24'000 | 8'440 | 11'773 |
10M | 55'000 | 96'000 | 97'000 | 115'000 | 256'000 | 84'000 | 116'000 |
Inserting dummy rows, duration in milliseconds
Manipulate JSON instead of replacing it completely
1 | -- update, ignore if not exists |
2 | UPDATE auction_jsonb |
3 | SET item = jsonb_set(item, '{author}', '"value"', false) |
4 | WHERE id=42; |
5 |
6 | -- update or create |
7 | UPDATE auction_jsonb |
8 | SET item = jsonb_set(item, '{author}', 'value', true) |
9 | WHERE id=42; |
Postgres is replacing the whole JSON, query is not faster
01 | -- delete attribute |
02 | UPDATE auction_jsonb |
03 | SET item = item - 'key' |
04 | WHERE ID=42; |
05 |
06 | -- delete attribute alternate |
07 | UPDATE auction_jsonb |
08 | SET item = jsonb_set_lax(item, '{author}', null, true, 'delete_key') |
09 | WHERE ID=42; |
10 |
11 | -- delete multiple |
12 | item - array['key1', 'key2'] |
raise_exception | Fail if value is null |
use_json_null | Set JSON field to null (default) |
delete_key | Delete the key on null value |
return_target | Leave previous value when new is null |
1 | set_jsonb_lax(item, '{author}', null, true, 'return_target'); |
pg_jsonschema
/ postgres-json-schema
)
||
concat array / merge objectsjsonb_build_object('key', 'value', 'k2', 'v2')
1 | #[ORM\Column(type: Types::JSON)] |
2 | private array $item ; |
3 |
4 | #[ORM\Column(type: Types::JSON, options: [ 'jsonb' => true])] |
5 | private array $item ; |
Index can not be defined in Doctrine.
You have to manually add them to the migration.
1 | $this ->addSql( 'CREATE INDEX json_author ON auction_json_indexed ((item->>\'author\'))' ); |
1 | #[ORM\Column( |
2 | length: 255, |
3 | nullable: true, |
4 | insertable: false, |
5 | updatable: false, |
6 | columnDefinition: "VARCHAR(255) generated always as (item->>'author') stored" , |
7 | generated: "ALWAYS" |
8 | )] |
9 | private ?string $author = null; |
Note: the column definition is native SQL, not DQL
You can create native queries with Doctrine DBAL
Or use scienta/doctrine-json-functions
1 | # config/packages/doctrine.yaml |
2 | doctrine: |
3 | orm: |
4 | dql: |
5 | string_functions: |
6 | JSON_GET_TEXT: Scienta\...\Postgresql\JsonGetText |
7 | JSONB_CONTAINS: Scienta\...\Postgresql\JsonbContains |
Register boolean functions as string and compare with = true
1 | $qb = $this ->createQueryBuilder( 'a' ); |
2 | $qb ->select( 'COUNT(a)' ) |
3 | // Need to qualify with a. in front of item |
4 | ->where( "JSON_GET_TEXT(a.item, 'author') = :author" ) |
5 | ->setParameter( 'author' , $authorName ) |
6 | ; |
7 |
8 | return $qb ->getQuery()->getSingleScalarResult(); |
01 | $qb = $this ->createQueryBuilder( 'a' ); |
02 | $connection = $this ->getEntityManager()->getConnection(); |
03 | // Could not figure out how to pass author as parameter. |
04 | // Not sure if escaping is safe, |
05 | // needs to be verified for direct user input |
06 | $author = str_replace ( '"' , '\"' , $authorName ); |
07 | $author = trim( $connection ->quote( $author ), "'" ); |
08 | $qb ->select( 'COUNT(a)' ) |
09 | ->where( "JSONB_CONTAINS(a.item, '{\"author\": \"$author\"}') = true" ) |
10 | ; |
11 |
12 | return $qb ->getQuery()->getSingleScalarResult(); |
Rows | jsonb | dbal | dbal, batch 1k | entities |
---|---|---|---|---|
10k | 60 | 4519 | 163 | 6290 |
100k | 550 | 49964 | 1587 | 53823 |
See pgdaash.io
json_each(item)
jsonb_each(item)
SELECT *
FROM jsonb_each
((SELECT item FROM auction_jsonb WHERE id=1));