Waiting for PostgreSQL 14 – pg_stat_statements: track number of rows processed by some utility commands.

Title: Waiting for PostgreSQL 14 – pg_stat_statements: track number of rows processed by some utility commands.

On 29th of July 2020, Fujii Masao committed patch:

pg_stat_statements: track number of rows processed by some utility commands.
 
This commit makes pg_stat_statements track the total number
of rows retrieved or affected by CREATE TABLE AS, SELECT INTO,
CREATE MATERIALIZED VIEW and FETCH commands.
 
Suggested-by: Pascal Legrand
Author: Fujii Masao
Reviewed-by: Asif Rehman
Discussion: https://postgr.es/m/1584293755198-0.post@n3.nabble.com

Commit log seems to be clear, but just in case, let's see how it looks.

Of course, to have pg_stat_statements available, you have to load it by adding pg_stat_statements to your shared_preload_libraries:

$ SHOW shared_preload_libraries;
 shared_preload_libraries 
──────────────────────────
 pg_stat_statements
(1 ROW)

If you don't have it, you can add it from SQL:

ALTER system SET shared_preload_libraries = 'pg_stat_statements';

but it will require PostgreSQL restart anyway.

Afterwards, simple:

$ CREATE extension pg_stat_statements ;
CREATE EXTENSION

And you can use it.

There are many columns in pg_stat_statements view, but the important for now are:

  • query – well, what query this is about
  • rows – how many rows were processed

So, let's see it. First, let's make sure the stats are empty, to avoid having too much data to display:

$ SELECT pg_stat_statements_reset();
 pg_stat_statements_reset 
──────────────────────────
 
(1 ROW)

Now, let's make a table, and insert 10 sample rows:

$ CREATE TABLE test (i int4);
CREATE TABLE
 
$ INSERT INTO test (i) SELECT generate_series(1,10);
INSERT 0 10

Stats now look like this:

$ SELECT query, ROWS FROM pg_stat_statements WHERE query !~ 'pg_catalog';
                       query                        │ ROWS 
────────────────────────────────────────────────────┼──────
 CREATE TABLE test (i int4)0
 SELECT pg_stat_statements_reset()1
 INSERT INTO test (i) SELECT generate_series($1,$2)10
(4 ROWS)

As you can see it correctly stored information that 10 rows were processed by insert into.

So, let's add some more:

$ INSERT INTO test (i) SELECT generate_series(1,10);
INSERT 0 10
 
$ SELECT query, ROWS FROM pg_stat_statements WHERE query !~ 'pg_catalog';
                            query                             │ ROWS 
──────────────────────────────────────────────────────────────┼──────
 CREATE TABLE test (i int4)0
 SELECT pg_stat_statements_reset()1
 SELECT query, ROWS FROM pg_stat_statements WHERE query !~ $1 │    3
 INSERT INTO test (i) SELECT generate_series($1,$2)20
(5 ROWS)

Sweet. So, let's test the other commands too:

$ CREATE TABLE test2 AS SELECT i FROM generate_series(1,1231) i;
SELECT 1231
 
$ SELECT generate_series(1,100) INTO test3;
SELECT 100
 
$ BEGIN;
BEGIN
 
*$ DECLARE tc cursor FOR SELECT * FROM test;
DECLARE CURSOR
 
*$ fetch 5 FROM tc;
 i 
───
 1
 2
 3
 4
 5
(5 ROWS)
 
*$ ROLLBACK;
ROLLBACK

And now, stats look like:

$ SELECT query, ROWS FROM pg_stat_statements WHERE query !~ 'pg_catalog';
                             query                             │ ROWS 
───────────────────────────────────────────────────────────────┼──────
 CREATE TABLE test2 AS SELECT i FROM generate_series(1,1231) i │ 1231
 fetch 5 FROM tc                                               │    5
 ROLLBACK0
 CREATE TABLE test (i int4)0
 SELECT pg_stat_statements_reset()1
 BEGIN0
 SELECT query, ROWS FROM pg_stat_statements WHERE query !~ $1  │    9
 SELECT generate_series(1,100) INTO test3                      │  100
 DECLARE tc cursor FOR SELECT * FROM test                      │    0
 SELECT query, ROWS FROM pg_stat_statements                    │    4
 INSERT INTO test (i) SELECT generate_series($1,$2)20
 CREATE materialized VIEW test4 AS SELECT * FROM test          │   20
(12 ROWS)

Great, all row counts are as expected.

That's definitely helpful for all admins, thanks a lot to all involved.