Picking random element, with weights

Whenever I'm doing some testing I need sample data. Easiest way to do it is to generate data using some random/generate_series queries.

But what if I need specific frequencies?

For example, I need to generate 10,000,000 rows, where there will be 10% of ‘a', 20% of ‘b', and the rest will be split equally between ‘c' and ‘d'?

Continue reading Picking random element, with weights

Waiting for PostgreSQL 16 – Add support for regexps on database and user entries in pg_hba.conf

On 24th of October 2022, Michael Paquier committed patch:

Add support for regexps on database and user entries in pg_hba.conf
 
As of this commit, any database or user entry beginning with a slash (/)
is considered as a regular expression.  This is particularly useful for
users, as now there is no clean way to match pattern on multiple HBA
lines.  For example, a user name mapping with a regular expression needs
first to match with a HBA line, and we would skip the follow-up HBA
entries if the ident regexp does *not* match with what has matched in
the HBA line.
 
pg_hba.conf is able to handle multiple databases and roles with a
comma-separated list of these, hence individual regular expressions that
include commas need to be double-quoted.
 
At authentication time, user and database names are now checked in the
following order:
- Arbitrary keywords (like "all", the ones beginning by '+' for
membership check), that we know will never have a regexp.  A fancy case
is for physical WAL senders, we *have* to only match "replication" for
the database.
- Regular expression matching.
- Exact match.
The previous logic did the same, but without the regexp step.
 
We have discussed as well the possibility to support regexp pattern
matching for host names, but these happen to lead to tricky issues based
on what I understand, particularly with host entries that have CIDRs.
 
This commit relies heavily on the refactoring done in a903971 and
fc579e1, so as the amount of code required to compile and execute
regular expressions is now minimal.  When parsing pg_hba.conf, all the
computed regexps needs to explicitely free()'d, same as pg_ident.conf.
 
Documentation and TAP tests are added to cover this feature, including
cases where the regexps use commas (for clarity in the docs, coverage
for the parsing logic in the tests).
 
Note that this introduces a breakage with older versions, where a
database or user name beginning with a slash are treated as something to
check for an equal match.  Per discussion, we have discarded this as
being much of an issue in practice as it would require a cluster to
have database and/or role names that begin with a slash, as well as HBA
entries using these.  Hence, the consistency gained with regexps in
pg_ident.conf is more appealing in the long term.
 
**This compatibility change should be mentioned in the release notes.**
 
Author: Bertrand Drouvot
Reviewed-by: Jacob Champion, Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/fff0d7c1-8ad4-76a1-9db3-0ab6ec338bf7@amazon.com

Continue reading Waiting for PostgreSQL 16 – Add support for regexps on database and user entries in pg_hba.conf