On 20th of March 2026, Andrew Dunstan committed patch:
json format for COPY TO

This introduces the JSON format option for the COPY TO command, allowing users to export query results or table data directly as a stream of JSON objects (one per line, NDJSON style). The JSON format is currently supported only for COPY TO operations; it is not available for COPY FROM. JSON format is incompatible with some standard text/CSV formatting options, including HEADER, DEFAULT, NULL, DELIMITER, FORCE QUOTE, FORCE NOT NULL, and FORCE NULL. Column list support is included: when a column list is specified, only the named columns are emitted in each JSON object. Regression tests covering valid JSON exports and error handling for incompatible options have been added to src/test/regress/sql/copy.sql.

Author: Joe Conway <mail@joeconway.com>
Author: jian he <jian.universality@gmail.com>
Co-Authored-By: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
Reviewed-by: Davin Shearer <davin@apache.org>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/CALvfUkBxTYy5uWPFVwpk_7ii2zgT07t3d-yR_cy4sfrrLU%3Dkcg%40mail.gmail.com
Discussion: https://postgr.es/m/6a04628d-0d53-41d9-9e35-5a8dc302c34c@joeconway.com
This sounds pretty cool.
So, we have the COPY command that can very quickly output (or input) a large number of rows.
We could use two (well, three, with BINARY) modes:
- normal:
=$ copy sample to stdout;
6	john "politico" doe	wiggles	2016-08-21 12:30:17.891926+02	2017-01-22 22:50:17.584726+01	t
7	jousts	pearliest shimming	2017-11-06 05:57:15.789526+01	2018-07-24 08:45:25.754326+02	f
8	lade\ndinned	a\n\tb\n\tc	2024-07-13 08:06:12.995926+02	2025-02-05 10:42:56.512726+01	t
9	crueller	vineyard	2023-10-22 11:32:33.943126+02	2024-03-25 16:12:38.829526+01	t
10	zoological abc	disputations	2016-08-08 22:32:58.970326+02	2017-07-06 16:30:24.227926+02	t
- csv formatted:
=$ copy sample to stdout with csv;
6,"john ""politico"" doe",wiggles,2016-08-21 12:30:17.891926+02,2017-01-22 22:50:17.584726+01,t
7,jousts,pearliest shimming,2017-11-06 05:57:15.789526+01,2018-07-24 08:45:25.754326+02,f
8,"lade
dinned","a
	b
	c",2024-07-13 08:06:12.995926+02,2025-02-05 10:42:56.512726+01,t
9,crueller,vineyard,2023-10-22 11:32:33.943126+02,2024-03-25 16:12:38.829526+01,t
10,zoological abc,disputations,2016-08-08 22:32:58.970326+02,2017-07-06 16:30:24.227926+02,t
It's important to note that while the normal output makes sure that each record is on a single line, escaping whitespace characters as the two-character sequences \t and \n, in case of CSV we get the data exactly as it is stored – including inline spaces, tab characters, and newlines. The only escaping is doubling of " characters into "".
While this works, apparently a non-trivial number of CSV parsers don't handle multi-line records well.
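To see why multi-line records trip up naive consumers, here is a small sketch (using made-up data shaped like the row 8 above) contrasting line-by-line splitting with a real CSV parser:

```python
import csv
import io

# Made-up two-column CSV where the second field of the first record
# contains an embedded newline, like the "lade/dinned" row above.
data = '8,"lade\ndinned"\n9,crueller\n'

# Naive line-by-line splitting sees three "records" and tears the
# quoted field apart:
naive = data.splitlines()

# A proper CSV parser reassembles the multi-line record correctly:
rows = list(csv.reader(io.StringIO(data)))
```

Any parser that treats "one line = one record" will silently corrupt such data, which is exactly the problem mentioned above.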
Plus – recently a lot of tools have appeared that handle interoperability using the JSON format.
Now, thanks to the work of the developers listed above, we will get native export to JSON:
=$ copy sample to stdout with json;
{"id":6,"full_name":"john \"politico\" doe","other_column":"wiggles","created":"2016-08-21T12:30:17.891926+02:00","updated":"2017-01-22T22:50:17.584726+01:00","is_active":true}
{"id":7,"full_name":"jousts","other_column":"pearliest shimming","created":"2017-11-06T05:57:15.789526+01:00","updated":"2018-07-24T08:45:25.754326+02:00","is_active":false}
{"id":8,"full_name":"lade\ndinned","other_column":"a\n\tb\n\tc","created":"2024-07-13T08:06:12.995926+02:00","updated":"2025-02-05T10:42:56.512726+01:00","is_active":true}
{"id":9,"full_name":"crueller","other_column":"vineyard","created":"2023-10-22T11:32:33.943126+02:00","updated":"2024-03-25T16:12:38.829526+01:00","is_active":true}
{"id":10,"full_name":"zoological abc","other_column":"disputations","created":"2016-08-08T22:32:58.970326+02:00","updated":"2017-07-06T16:30:24.227926+02:00","is_active":true}
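Consuming this NDJSON-style stream is trivial: one json.loads() per line. A minimal sketch, using made-up two-line sample data shaped like the output above:

```python
import json

# Pretend this is what `copy sample to stdout with json` wrote;
# the column names just mimic the example output above.
ndjson = (
    '{"id":6,"full_name":"john \\"politico\\" doe","is_active":true}\n'
    '{"id":7,"full_name":"jousts","is_active":false}\n'
)

# Each line is a complete, independent JSON object:
rows = [json.loads(line) for line in ndjson.splitlines()]
```

Because every line stands on its own, a consumer can stream arbitrarily large exports without ever holding the whole dataset in memory.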
It's important to note that each record is its own object, but there is no external array/object that wraps the whole content of the table. If you need it for some reason, there is a fancy force_array option:
=$ psql -qAtX -c "copy sample to stdout with (format json, force_array)"
[
{"id":6,"full_name":"john \"politico\" doe","other_column":"wiggles","created":"2016-08-21T12:30:17.891926+02:00","updated":"2017-01-22T22:50:17.584726+01:00","is_active":true}
,{"id":7,"full_name":"jousts","other_column":"pearliest shimming","created":"2017-11-06T05:57:15.789526+01:00","updated":"2018-07-24T08:45:25.754326+02:00","is_active":false}
,{"id":8,"full_name":"lade\ndinned","other_column":"a\n\tb\n\tc","created":"2024-07-13T08:06:12.995926+02:00","updated":"2025-02-05T10:42:56.512726+01:00","is_active":true}
,{"id":9,"full_name":"crueller","other_column":"vineyard","created":"2023-10-22T11:32:33.943126+02:00","updated":"2024-03-25T16:12:38.829526+01:00","is_active":true}
,{"id":10,"full_name":"zoological abc","other_column":"disputations","created":"2016-08-08T22:32:58.970326+02:00","updated":"2017-07-06T16:30:24.227926+02:00","is_active":true}
]
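With force_array the whole stream is one valid JSON document, so a consumer that insists on a top-level array can parse it in a single call. A quick sketch, with made-up data laid out like the output above:

```python
import json

# Made-up payload mimicking the force_array layout above:
# opening bracket, one record per line with leading commas, closing bracket.
payload = (
    '[\n'
    '{"id":6,"full_name":"jousts"}\n'
    ',{"id":7,"full_name":"crueller"}\n'
    ']'
)

# The entire export parses as one JSON array:
rows = json.loads(payload)
```

The trade-off is that, unlike the plain line-per-object form, this generally requires buffering the whole document before parsing.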
I think that this format is too verbose because of the repetition of column names, but I fully get that there are cases where this is exactly what is needed.
Of course, we could have done it previously using "smart" queries, like:
=$ copy (select to_json(s) from sample s) to stdout;
{"id":6,"full_name":"john \\"politico\\" doe","other_column":"wiggles","created":"2016-08-21T12:30:17.891926+02:00","updated":"2017-01-22T22:50:17.584726+01:00","is_active":true}
{"id":7,"full_name":"jousts","other_column":"pearliest shimming","created":"2017-11-06T05:57:15.789526+01:00","updated":"2018-07-24T08:45:25.754326+02:00","is_active":false}
{"id":8,"full_name":"lade\\ndinned","other_column":"a\\n\\tb\\n\\tc","created":"2024-07-13T08:06:12.995926+02:00","updated":"2025-02-05T10:42:56.512726+01:00","is_active":true}
{"id":9,"full_name":"crueller","other_column":"vineyard","created":"2023-10-22T11:32:33.943126+02:00","updated":"2024-03-25T16:12:38.829526+01:00","is_active":true}
{"id":10,"full_name":"zoological abc","other_column":"disputations","created":"2016-08-08T22:32:58.970326+02:00","updated":"2017-07-06T16:30:24.227926+02:00","is_active":true}
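Note the doubled backslashes: COPY's text format escapes every backslash in the to_json() output, so these lines are not directly valid JSON. A minimal sketch of undoing that on the client side, under the assumption (true for to_json() output, which escapes control characters itself) that collapsing doubled backslashes is the only decoding needed:

```python
import json

# A made-up line shaped like the COPY text-format output above:
# every backslash the JSON contained has been doubled by COPY.
raw = r'{"full_name":"john \\"politico\\" doe","other_column":"a\\n\\tb"}'

# Assumption of this sketch: the JSON contains no literal newlines/tabs,
# so undoing COPY's escaping reduces to collapsing doubled backslashes.
decoded = raw.replace('\\\\', '\\')
row = json.loads(decoded)
```

This extra decoding step is one more reason the native `format json` output, which needs no post-processing, is nicer to work with.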
But this approach (with COPY-FROM-SELECT) was slower, and used more memory, because of how the resultset is produced.
This looks like a great addition that will definitely find many uses. Thanks a lot to everyone who worked on it 🙂