Waiting for PostgreSQL 14 – Improvements for handling large number of connections


There is a thread on pgsql-hackers mailing list, dating back to 1st of March 2020 about changes to Pg to optimize handling of larger number of connections.

The thread is pretty long, and the discussion took 5 months, but lately we got couple of commits:

  1. snapshot scalability: Don't compute global horizons while building snapshots.
  2. snapshot scalability: Move PGXACT->xmin back to PGPROC.
  3. snapshot scalability: Move PGXACT->vacuumFlags to ProcGlobal->vacuumFlags.
  4. snapshot scalability: Move subxact info to ProcGlobal, remove PGXACT.
  5. snapshot scalability: Introduce dense array of in-progress xids.
  6. snapshot scalability: cache snapshots using a xact completion counter.
  7. Fix race condition in snapshot caching when 2PC is used.

that (supposedly) should improve handling of large numbers of concurrent connections.

Unfortunately, I don't have ready test servers that would allow me to sensibly tests thousands of connections, but I think we can all safely assume that PostgreSQL 14 should handle large connection counts better than any PostgreSQL before. Thanks a lot, to all involved, but specifically to mastermind of this project: Andres Freund.

Parallel dumping of databases

Some time ago I wrote a piece on speeding up dump/restore process using custom solution that was parallelizing process.

Later on I wrote some tools (“fast dump and restore") to do it in more general way.

But all of them had a problem – to get consistent dump you need to stop all concurrent access to your database. Why, and how to get rid of this limitation?

Continue reading Parallel dumping of databases