On 19th of January 2019, Tomas Vondra committed patch:
Allow COPY FROM to filter data using WHERE conditions Extends the COPY FROM command with a WHERE condition, which allows doing various types of filtering while importing the data (random sampling, condition on a data column, etc.). Until now such filtering required either preprocessing of the input data, or importing all data and then filtering in the database. COPY FROM ... WHERE is an easy-to-use and low-overhead alternative for most simple cases. Author: Surafel Temesgen Discussion: https://www.postgresql.org/message-id/flat/CALAY4q_DdpWDuB5-Zyi-oTtO2uSk8pmy+dupiRe3AvAc++1imA@mail.gmail.com
I assume most of you are aware of COPY command that can be used to quickly load or dump data to/from files.
Since 9.3 we also have ability to copy to/from pipes.
And now, we got one more thing – filtering of input.
When outputting data from database, you could have always use syntax like:
=$ copy ( SELECT a, b, c FROM d WHERE e = f ) TO '/tmp/whatever';
but when loading data into pg, it wasn't so simple. Sure, you could have filter the data before, or use grep as your input program, but now, we can do something much more nicer – apply where clause on loaded data!
Let's see how that would work. First, obviously, I need some table:
=$ CREATE TABLE sysusers ( username text PRIMARY KEY, password text, user_id int4 NOT NULL UNIQUE, group_id int4 NOT NULL, gecos text, homedir text, shell text );
now, I can copy data in there:
=$ copy sysusers FROM '/etc/passwd' WITH delimiter ':'; COPY 42
I loaded 42 users. But, let's assume I just want real system users, and my definition is: with user_id less than 1000:
=$ TRUNCATE sysusers; TRUNCATE TABLE =$ copy sysusers FROM '/etc/passwd' WITH ( delimiter ':' ) WHERE user_id < 1000; COPY 40
The difference is not big, but it's clearly there. And where condition is applied to already parsed data using normal semantics for their datatypes.
This is pretty cool. Thanks to all involved 🙂