Waiting for PostgreSQL 16 – Add array_sample() and array_shuffle() functions.

On 7th of April 2023, Tom Lane committed patch:

Add array_sample() and array_shuffle() functions.
 
These are useful in Monte Carlo applications.
 
Martin Kalcher, reviewed/adjusted by Daniel Gustafsson and myself
 
Discussion: https://postgr.es/m/9d160a44-7675-51e8-60cf-6d64b76db831@aboutsource.net

Continue reading Waiting for PostgreSQL 16 – Add array_sample() and array_shuffle() functions.

Waiting for PostgreSQL 16 – Invent random_normal() to provide normally-distributed random numbers.

On 9th of January 2023, Tom Lane committed patch:

Invent random_normal() to provide normally-distributed random numbers.
 
There is already a version of this in contrib/tablefunc, but it
seems sufficiently widely useful to justify having it in core.
 
Paul Ramsey
 
Discussion: https://postgr.es/m/CACowWR0DqHAvOKUCNxTrASFkWsDLqKMd6WiXvVvaWg4pV1BMnQ@mail.gmail.com

Continue reading Waiting for PostgreSQL 16 – Invent random_normal() to provide normally-distributed random numbers.

Picking random element, with weights

Whenever I'm doing some testing I need sample data. Easiest way to do it is to generate data using some random/generate_series queries.

But what if I need specific frequencies?

For example, I need to generate 10,000,000 rows, where there will be 10% of ‘a', 20% of ‘b', and the rest will be split equally between ‘c' and ‘d'?

Continue reading Picking random element, with weights

Waiting for PostgreSQL 12 – Log all statements from a sample of transactions

On 3rd of April 2019, Alvaro Herrera committed patch:

Log all statements from a sample of transactions
 
This is useful to obtain a view of the different transaction types in an
application, regardless of the durations of the statements each runs.
 
Author: Adrien Nayrat

Continue reading Waiting for PostgreSQL 12 – Log all statements from a sample of transactions

Generate short, random, textual IDs

Couple of people asked, on irc, about how to generate unique, short, ids for rows.

Since I wrote code for this for explain.depesz.com, I figured that instead of pointing to sources, and letting everyone to extract the functions, I will, instead, write a set of functions that what is needed in reusable way.

Continue reading Generate short, random, textual IDs

Waiting for 9.5 – TABLESAMPLE, SQL Standard and extensible

On 15th of May, Simon Riggs committed patch:

TABLESAMPLE, SQL Standard and extensible
 
Add a TABLESAMPLE clause to SELECT statements that allows
user to specify random BERNOULLI sampling or block level
SYSTEM sampling. Implementation allows for extensible
sampling functions to be written, using a standard API.
Basic version follows SQLStandard exactly. Usable
concrete use cases for the sampling API follow in later
commits.
 
Petr Jelinek
 
Reviewed by Michael Paquier and Simon Riggs

Continue reading Waiting for 9.5 – TABLESAMPLE, SQL Standard and extensible

Explaining the unexplainable – part 5

In previous posts in this series, I talked about how to read EXPLAIN output, and what each line (operation/node) means.

Now, in the final post, I will try to explain how it happens that Pg chooses “Operation X" over “Operation Y".

Continue reading Explaining the unexplainable – part 5