Yesterday (August, 7th), Tom Lane committed:
Log Message: ----------- Recognize functional dependency on primary keys. This allows a table's other columns to be referenced without listing them in GROUP BY, so long as the primary key column(s) are listed in GROUP BY. Eventually we should also allow functional dependency on a UNIQUE constraint when the columns are marked NOT NULL, but that has to wait until NOT NULL constraints are represented in pg_constraint, because we need to have pg_constraint OIDs for all the conditions needed to ensure functional dependency. Peter Eisentraut, reviewed by Alex Hunsaker and Tom Lane
One of the most common sources of problems when people move to PostgreSQL from MySQL, is that such query:
select field_a, field_b, count(*) from table group by field_a
is legal and working in MySQL, but not working in PostgreSQL.
Now, with this new patch, subset of such queries will work in PostgreSQL.
First of all – why subset? Simple – if field_a is PRIMARY KEY of given table, and we group by it – it is understandable that you can't have more than 1 value of field_b for it. But if it's not PRIMARY KEY – it can be that you have many different values for field_b for given field_a – and in such case, the query will still not work.
Let's see it on example:
CREATE TABLE people ( id serial PRIMARY KEY, firstname TEXT, lastname TEXT ); CREATE TABLE visits ( id serial PRIMARY KEY, person_id INT4 NOT NULL references people (id), when_logged timestamptz ); INSERT INTO people (firstname, lastname) VALUES ('Bill', 'Hicks'), ('George', 'Carlin'), ('Louis', 'C.K.'), ('Robin', 'Williams'), ('Zach', 'Galifianakis'); INSERT INTO visits (person_id, when_logged) SELECT 1 + floor(random() * 5), now() - '1 year'::INTERVAL * random() FROM generate_series(1,1000);
Now, let's assume we want list of people with count of visits. Previously one would need to:
SELECT p.id, count(*) FROM people p JOIN visits v on p.id = v.person_id GROUP BY p.id;
SELECT p.id, p.firstname, p.lastname, count(*) FROM people p JOIN visits v on p.id = v.person_id GROUP BY p.id, p.firstname, p.lastname;
(or some trick with min(firstname)).
But now, we can:
SELECT p.id, p.firstname, p.lastname, count(*) FROM people p JOIN visits v on p.id = v.person_id GROUP BY p.id;
And it just works. Nice.