On 4th of April 2022, Andrew Dunstan committed patch:
JSON_TABLE

This feature allows jsonb data to be treated as a table and
thus used in a FROM clause like other tabular data. Data can be
selected from the jsonb using jsonpath expressions, and hoisted
out of nested structures in the jsonb to form multiple rows, more
or less like an outer join.

Nikita Glukhov

Reviewers have included (in no particular order) Andres Freund,
Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers,
Zhihong Yu (whose name I previously misspelled), Himanshu Upadhyaya,
Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/7e2cb85d-24cf-4abb-30a5-1a33715959bd@postgrespro.ru
Another SQL/JSON patch.
This time it's something that allows getting data from complicated JSON values into a recordset.
Let's start with some simple example:
=$ SELECT *
   FROM json_table(
       '[{"a":10,"b":20},{"a":30,"b":40}]'::jsonb,
       '$[*]' COLUMNS (
           column_a int4 PATH '$.a',
           column_b int4 PATH '$.b'
       )
   );
 column_a | column_b
----------+----------
       10 |       20
       30 |       40
(2 rows)
I assume the example is easy to understand.
Now, let's assume we want to add a serial-like column named id. I'll also show two more tricks:
=$ SELECT *
   FROM json_table(
       '[{"a":10,"b":20},{"a":30,"b":40}]'::jsonb,
       '$[*]' COLUMNS (
           id FOR ORDINALITY,
           column_a int4 PATH '$.a',
           column_b int4 PATH '$.b',
           a int4,
           b int4,
           c text
       )
   );
 id | column_a | column_b | a  | b  | c
----+----------+----------+----+----+---
  1 |       10 |       20 | 10 | 20 |
  2 |       30 |       40 | 30 | 40 |
(2 rows)
Adding the serial-like id worked. Also, please note that I added columns a and b without specifying their paths. When the path is simply ‘$.SOMETHING' and you want the column to be named SOMETHING, you don't need to state the path explicitly.
If there is no such field in the JSON, the column will simply be null.
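To make both points concrete, here is a minimal sketch (the column names are hypothetical, and this assumes a build with the patch applied): column a uses an implicit path, and column c has no matching key, so it should come back as NULL.

```sql
SELECT *
FROM json_table(
    '[{"a":1}]'::jsonb,
    '$[*]' COLUMNS (
        a int4,   -- implicit PATH '$.a', taken from the column name
        c text    -- there is no "c" key in the JSON, so this is NULL
    )
);
```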
Now, pretty commonly one has nested structures. For example, we could have json like this:
=$ SELECT jsonb_pretty(j) FROM sample;
              jsonb_pretty
-----------------------------------------
 [                                      +
     {                                  +
         "title": "first post",         +
         "author": "depesz",            +
         "comments": [                  +
             {                          +
                 "body": "comment #1",  +
                 "author": "hubert"     +
             },                         +
             {                          +
                 "body": "comment #3",  +
                 "author": "lubaczewski"+
             },                         +
             {                          +
                 "body": "comment #5",  +
                 "author": "someone"    +
             }                          +
         ]                              +
     },                                 +
     {                                  +
         "title": "second post",        +
         "author": "depesz",            +
         "comments": [                  +
             {                          +
                 "body": "comment #2",  +
                 "author": "depesz"     +
             },                         +
             {                          +
                 "body": "comment #6",  +
                 "author": "anyone"     +
             }                          +
         ]                              +
     },                                 +
     {                                  +
         "title": "third post",         +
         "author": "someone else",      +
         "comments": [                  +
             {                          +
                 "body": "comment #4",  +
                 "author": "whoever"    +
             }                          +
         ]                              +
     }                                  +
 ]
(1 row)
Single row, nested structure.
With json_table I can:
=$ SELECT jt.*
   FROM sample,
        LATERAL json_table(
            j,
            '$[*]' COLUMNS (
                id FOR ORDINALITY,
                author text,
                title text,
                NESTED PATH '$.comments[*]' COLUMNS (
                    comment_author text PATH '$.author',
                    comment_body text PATH '$.body'
                )
            )
        ) AS jt;
 id |    author    |    title    | comment_author | comment_body
----+--------------+-------------+----------------+--------------
  1 | depesz       | first post  | hubert         | comment #1
  1 | depesz       | first post  | lubaczewski    | comment #3
  1 | depesz       | first post  | someone        | comment #5
  2 | depesz       | second post | depesz         | comment #2
  2 | depesz       | second post | anyone         | comment #6
  3 | someone else | third post  | whoever        | comment #4
(6 rows)
There are, as previously, many additional options:
- default clauses for fields
- exists clauses that return boolean or int depending on whether the given jsonpath matches anything in that place
- format specifier (json/jsonb) that allows extracting a value from JSON, as JSON, even if the column's datatype is, for example, text
- with wrapper, which is somehow related to wrapping multiple values into an array
- omit quotes, which, to be fair, I don't understand
and possibly more.
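Since the docs aren't out, the following is a speculative sketch based on the SQL/JSON standard clauses mentioned in the patch; I haven't verified all of these, so treat the exact syntax as an assumption:

```sql
SELECT *
FROM json_table(
    '[{"a":1, "obj":{"x":1}, "s":"hello"}]'::jsonb,
    '$[*]' COLUMNS (
        -- default clause: used when the path finds nothing
        a int4 PATH '$.a' DEFAULT 0 ON EMPTY,
        -- exists clause: true/false depending on whether '$.b' matches
        has_b bool EXISTS PATH '$.b',
        -- format specifier: keep the value as JSON text in a text column
        obj text FORMAT JSON PATH '$.obj',
        -- with wrapper: wrap the matched value(s) into an array
        arr text FORMAT JSON PATH '$.a' WITH WRAPPER,
        -- omit quotes: return the bare string, without surrounding quotes
        s text FORMAT JSON PATH '$.s' OMIT QUOTES
    )
);
```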
Docs are not there yet (at the moment), so I can't check all of them.
This is amazing. I'm waiting for the docs so I can read more about various functionalities, and options.
Huge thanks to everyone involved.
Writing a materialized view with this, refreshed concurrently on a trigger or a schedule, could make jsonb more attractive for accepting flexible data, while still allowing easy normalization into a standard table, which gives the planner statistics for better query plans.
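The idea above could be sketched like this, assuming the sample table from earlier in the post; the view name posts_flat and the index are my own hypothetical choices:

```sql
-- Normalize the jsonb into a flat, statistics-friendly relation.
CREATE MATERIALIZED VIEW posts_flat AS
SELECT jt.*
FROM sample,
     LATERAL json_table(
         j,
         '$[*]' COLUMNS (
             id FOR ORDINALITY,
             author text,
             title text
         )
     ) AS jt;

-- REFRESH ... CONCURRENTLY requires a unique index on the view.
CREATE UNIQUE INDEX ON posts_flat (id, title);

-- Run this from a trigger function or on a schedule (e.g. cron):
REFRESH MATERIALIZED VIEW CONCURRENTLY posts_flat;
```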