Using JSON: json vs. jsonb, pglz vs. lz4, key optimization, parsing speed?

Recently(ish) I had a conversation on one of PostgreSQL support chats (IRC, Slack, or Discord) about efficient storage of JSON data, which compression to use, which datatype.

Unrelated to this, some people (at least two over the last year or so) said that they aren't sure if PostgreSQL doesn't optimize storage between columns, for example, storing attribute names once per column, and not once per value.

Decided to investigate…

Continue reading Using JSON: json vs. jsonb, pglz vs. lz4, key optimization, parsing speed?

Picking random element, with weights

Whenever I'm doing some testing I need sample data. Easiest way to do it is to generate data using some random/generate_series queries.

But what if I need specific frequencies?

For example, I need to generate 10,000,000 rows, where there will be 10% of ‘a', 20% of ‘b', and the rest will be split equally between ‘c' and ‘d'?

Continue reading Picking random element, with weights