Waiting for PostgreSQL 19 – json format for COPY TO

On 20th of March 2026, Andrew Dunstan committed patch:

json format for COPY TO
 
This introduces the JSON format option for the COPY TO command, allowing
users to export query results or table data directly as a stream of JSON
objects (one per line, NDJSON style).
 
The JSON format is currently supported only for COPY TO operations; it
is not available for COPY FROM.
 
JSON format is incompatible with some standard text/CSV formatting
options, including HEADER, DEFAULT, NULL, DELIMITER, FORCE QUOTE,
FORCE NOT NULL, and FORCE NULL.
 
Column list support is included: when a column list is specified, only
the named columns are emitted in each JSON object.
 
Regression tests covering valid JSON exports and error handling for
incompatible options have been added to src/test/regress/sql/copy.sql.
 
Author: Joe Conway <mail@joeconway.com>
Author: jian he <jian.universality@gmail.com>
Co-Authored-By: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
Reviewed-by: Davin Shearer <davin@apache.org>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/CALvfUkBxTYy5uWPFVwpk_7ii2zgT07t3d-yR_cy4sfrrLU%3Dkcg%40mail.gmail.com
Discussion: https://postgr.es/m/6a04628d-0d53-41d9-9e35-5a8dc302c34c@joeconway.com

This sounds pretty cool.

So, we have the COPY command, which can very quickly output (or input) large numbers of rows.

So far, we could use two (well, three, with BINARY) formats:

  • normal:
    =$ copy sample to stdout;
    6       john "politico" doe     wiggles 2016-08-21 12:30:17.891926+02   2017-01-22 22:50:17.584726+01   t
    7       jousts  pearliest shimming      2017-11-06 05:57:15.789526+01   2018-07-24 08:45:25.754326+02   f
    8       lade\ndinned    a\n\tb\n\tc     2024-07-13 08:06:12.995926+02   2025-02-05 10:42:56.512726+01   t
    9       crueller        vineyard        2023-10-22 11:32:33.943126+02   2024-03-25 16:12:38.829526+01   t
    10      zoological abc  disputations    2016-08-08 22:32:58.970326+02   2017-07-06 16:30:24.227926+02   t
  • csv formatted:
    =$ copy sample to stdout with csv;
    6,"john ""politico"" doe",wiggles,2016-08-21 12:30:17.891926+02,2017-01-22 22:50:17.584726+01,t
    7,jousts,pearliest shimming,2017-11-06 05:57:15.789526+01,2018-07-24 08:45:25.754326+02,f
    8,"lade
    dinned","a
            b
            c",2024-07-13 08:06:12.995926+02,2025-02-05 10:42:56.512726+01,t
    9,crueller,vineyard,2023-10-22 11:32:33.943126+02,2024-03-25 16:12:38.829526+01,t
    10,zoological abc,disputations,2016-08-08 22:32:58.970326+02,2017-07-06 16:30:24.227926+02,t

It's important to note that while the normal (text) output makes sure that each record stays on a single line – whitespace characters are escaped as two-character sequences like \t and \n – in case of CSV we get the data as-is, including embedded spaces, tab characters, and newlines. The only escaping is doubling of " characters into "".

While this works, apparently a non-trivial number of CSV parsers don't handle multi-line records well.

Plus – recently there have been a lot of tools that handle interoperability using the JSON format.

Now, thanks to the work of the developers listed above, we will get native export to JSON:

=$ copy sample to stdout with json;
{"id":6,"full_name":"john \"politico\" doe","other_column":"wiggles","created":"2016-08-21T12:30:17.891926+02:00","updated":"2017-01-22T22:50:17.584726+01:00","is_active":true}
{"id":7,"full_name":"jousts","other_column":"pearliest shimming","created":"2017-11-06T05:57:15.789526+01:00","updated":"2018-07-24T08:45:25.754326+02:00","is_active":false}
{"id":8,"full_name":"lade\ndinned","other_column":"a\n\tb\n\tc","created":"2024-07-13T08:06:12.995926+02:00","updated":"2025-02-05T10:42:56.512726+01:00","is_active":true}
{"id":9,"full_name":"crueller","other_column":"vineyard","created":"2023-10-22T11:32:33.943126+02:00","updated":"2024-03-25T16:12:38.829526+01:00","is_active":true}
{"id":10,"full_name":"zoological abc","other_column":"disputations","created":"2016-08-08T22:32:58.970326+02:00","updated":"2017-07-06T16:30:24.227926+02:00","is_active":true}

It's important to note that each record is its own object, but there is no outer array/object that would wrap the whole table's content. If you need that for some reason, there is the force_array option:

=$ psql -qAtX -c "copy sample to stdout with (format json, force_array)"
[
 {"id":6,"full_name":"john \"politico\" doe","other_column":"wiggles","created":"2016-08-21T12:30:17.891926+02:00","updated":"2017-01-22T22:50:17.584726+01:00","is_active":true}
,{"id":7,"full_name":"jousts","other_column":"pearliest shimming","created":"2017-11-06T05:57:15.789526+01:00","updated":"2018-07-24T08:45:25.754326+02:00","is_active":false}
,{"id":8,"full_name":"lade\ndinned","other_column":"a\n\tb\n\tc","created":"2024-07-13T08:06:12.995926+02:00","updated":"2025-02-05T10:42:56.512726+01:00","is_active":true}
,{"id":9,"full_name":"crueller","other_column":"vineyard","created":"2023-10-22T11:32:33.943126+02:00","updated":"2024-03-25T16:12:38.829526+01:00","is_active":true}
,{"id":10,"full_name":"zoological abc","other_column":"disputations","created":"2016-08-08T22:32:58.970326+02:00","updated":"2017-07-06T16:30:24.227926+02:00","is_active":true}
]
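As the commit message notes, the JSON format is incompatible with text/CSV-specific options like HEADER, DELIMITER, or NULL. So, for example, something like this will simply be rejected with an error (I haven't checked the exact error message, so I'm not quoting it):

=$ copy sample to stdout with (format json, header);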

I think that this format is too verbose because of the repetition of column names, but I fully get that there are cases where this is exactly what is needed.
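The commit message also mentions column list support – when you provide a column list, only the named columns show up in each JSON object. A quick sketch (output shape inferred from the rows shown above; I haven't verified it on a real build):

=$ copy sample (id, is_active) to stdout with (format json);
{"id":6,"is_active":true}
{"id":7,"is_active":false}
…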

Of course, we could have done it previously using "smart" queries, like:

=$ copy (select to_json(s) from sample s) to stdout;
{"id":6,"full_name":"john \\"politico\\" doe","other_column":"wiggles","created":"2016-08-21T12:30:17.891926+02:00","updated":"2017-01-22T22:50:17.584726+01:00","is_active":true}
{"id":7,"full_name":"jousts","other_column":"pearliest shimming","created":"2017-11-06T05:57:15.789526+01:00","updated":"2018-07-24T08:45:25.754326+02:00","is_active":false}
{"id":8,"full_name":"lade\\ndinned","other_column":"a\\n\\tb\\n\\tc","created":"2024-07-13T08:06:12.995926+02:00","updated":"2025-02-05T10:42:56.512726+01:00","is_active":true}
{"id":9,"full_name":"crueller","other_column":"vineyard","created":"2023-10-22T11:32:33.943126+02:00","updated":"2024-03-25T16:12:38.829526+01:00","is_active":true}
{"id":10,"full_name":"zoological abc","other_column":"disputations","created":"2016-08-08T22:32:58.970326+02:00","updated":"2017-07-06T16:30:24.227926+02:00","is_active":true}

But this approach (COPY from a SELECT) was slower and used more memory because of how the result set is produced. Note also that in the output above the text-format escaping doubled the backslashes (\\" instead of \"), so the emitted lines are not directly parseable as JSON without first undoing that escaping.
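If you want to check the difference on your own data, simple timing from the shell is enough (I'm deliberately not quoting any numbers here – they would depend entirely on your hardware and table size):

=$ time psql -qAtX -c "copy sample to stdout with (format json)" > /dev/null
=$ time psql -qAtX -c "copy (select to_json(s) from sample s) to stdout" > /dev/null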

This looks like a great addition that will definitely find many uses. Thanks a lot to everyone who worked on it 🙂
