Waiting for 9.5 – Row-Level Security Policies (RLS)

On 19th of September, Stephen Frost committed patch:

Row-Level Security Policies (RLS)
 
Building on the updatable security-barrier views work, add the
ability to define policies on tables to limit the set of rows
which are returned from a query and which are allowed to be added
to a table.  Expressions defined by the policy for filtering are
added to the security barrier quals of the query, while expressions
defined to check records being added to a table are added to the
with-check options of the query.
 
New top-level commands are CREATE/ALTER/DROP POLICY and are
controlled by the table owner.  Row Security is able to be enabled
and disabled by the owner on a per-table basis using
ALTER TABLE .. ENABLE/DISABLE ROW SECURITY.
 
Per discussion, ROW SECURITY is disabled on tables by default and
must be enabled for policies on the table to be used.  If no
policies exist on a table with ROW SECURITY enabled, a default-deny
policy is used and no records will be visible.
 
By default, row security is applied at all times except for the
table owner and the superuser.  A new GUC, row_security, is added
which can be set to ON, OFF, or FORCE.  When set to FORCE, row
security will be applied even for the table owner and superusers.
When set to OFF, row security will be disabled when allowed and an
error will be thrown if the user does not have rights to bypass row
security.
 
Per discussion, pg_dump sets row_security = OFF by default to ensure
that exports and backups will have all data in the table or will
error if there are insufficient privileges to bypass row security.
A new option has been added to pg_dump, --enable-row-security, to
ask pg_dump to export with row security enabled.
 
A new role capability, BYPASSRLS, which can only be set by the
superuser, is added to allow other users to be able to bypass row
security using row_security = OFF.
 
Many thanks to the various individuals who have helped with the
design, particularly Robert Haas for his feedback.
 
Authors include Craig Ringer, KaiGai Kohei, Adam Brightwell, Dean
Rasheed, with additional changes and rework by me.
 
Reviewers have included all of the above, Greg Smith,
Jeff McCormick, and Robert Haas.

So this is actually huge change. I found some mails about related changes from 2007!

Most of the thing is described pretty clearly in the commit log, but let's see how it really looks when used.

For this, we'll create simple test table, couple of test users, and will see how it will work.

CREATE TABLE clients (
    id serial PRIMARY KEY,
    account_name text NOT NULL UNIQUE,
    account_manager text NOT NULL
);
CREATE TABLE
 
CREATE USER peter;
CREATE ROLE
 
CREATE USER joanna;
CREATE ROLE
 
CREATE USER bill;
CREATE ROLE
 
GRANT ALL ON TABLE clients TO peter, joanna, bill;
GRANT
 
GRANT ALL ON SEQUENCE clients_id_seq TO peter, joanna, bill;
GRANT
 
INSERT INTO clients (account_name, account_manager)
    VALUES ('initrode', 'peter'), ('initech', 'bill'), ('chotchkie''s', 'joanna');
INSERT 0 3

With this in place, we can run some test. Obviously, each of the users can now query whole table:

$ \c - peter
You are now connected TO DATABASE "depesz" AS USER "peter".
 
$ SELECT * FROM clients;
 id | account_name | account_manager
----+--------------+-----------------
  1 | initrode     | peter
  2 | initech      | bill
  3 | chotchkie's  | joanna
(3 rows)

Thanks to security policies we should be able to make it so that user can select only its “own" rows.

CREATE policy just_own_clients ON clients
    FOR ALL
    TO public
    USING ( account_manager = CURRENT_USER );
CREATE POLICY
 
ALTER TABLE clients ENABLE ROW LEVEL SECURITY;
ALTER TABLE

And that's it. Now, I can only see rows belonging to myself:

$ SELECT * FROM clients;
 id | account_name | account_manager 
----+--------------+-----------------
  1 | initrode     | peter
(1 ROW)
 
$ \c - joanna
 
$ SELECT * FROM clients;
 id | account_name | account_manager 
----+--------------+-----------------
  3 | chotchkie's  | joanna
(1 row)

What's more – you can't even insert rows that you wouldn't be able to see:

$ \c - peter
 
$ INSERT INTO clients (account_name, account_manager) VALUES ('hack', 'bill');
ERROR:  NEW ROW violates WITH CHECK OPTION FOR "clients"
DETAIL:  Failing ROW contains (4, hack, bill).
 
$ INSERT INTO clients (account_name, account_manager) VALUES ('hack', 'peter');
INSERT 0 1

To be honest, I'm a bit at loss why just “USING ()" in the policy works for both insert and select, but I'm quite happy with it.

Of course, you can override the default, and make it possible, for example, for anyone to create new clients for Bill, so he'll have less time for whatever he's doing when not working:

DROP policy just_own_clients ON clients;
DROP POLICY
 
CREATE policy just_own_clients ON clients
    FOR ALL
    TO public
    USING ( account_manager = CURRENT_USER )
    WITH CHECK ( account_manager IN ( 'bill', CURRENT_USER ) );
CREATE POLICY

Now, one can still only see it's own clients:

$ SELECT * FROM clients;
 id | account_name | account_manager 
----+--------------+-----------------
  1 | initrode     | peter
  5 | hack         | peter
(2 ROWS)

But can easily add more work for Bill:

$ INSERT INTO clients (account_name, account_manager) VALUES ('hack2', 'bill');
INSERT 0 1

Of course your policies can be arbitrarily complex – after all, it's a normal check constraint, which can do anything you want, including, with appropriate extensions, querying external systems.

It's a great addon, and a welcome addition to PostgreSQL. It took some time to get here, but I, for one, definitely appreciate the work of everyone involved. Thanks a lot.

10 thoughts on “Waiting for 9.5 – Row-Level Security Policies (RLS)”

  1. One thing that kept tripping me up when testing this feature was that overlapping polices are permissive rather than restrictive, in that they are combined using OR rather than AND. So if one policy says the account_name must match the current user, but another says the id must be between 1 and 20, the targeted user would see rows that would violate the first policy but don’t violate the second, and vice versa.

    e.g. user bob would see

    id account_name
    1 jane
    2 joe
    3 carl
    4 melissa

    20 rob
    33 bob

  2. @Thom:
    it definitely can be a problem. Perhaps we could get some way to change the “OR” into “AND”?

  3. @Depesz Yeah, I had thought about whether it might be worthwhile choosing a mode for the policy.

    CREATE [ PERMISSIVE | RESTRICTIVE ] POLICY policy_name;

    e.g.

    CREATE RESTRICTIVE POLICY must_match_user… — user_col = current_user
    CREATE PERMISSIVE POLICY can_match_high_id… — id > 50
    CREATE PERMISSIVE POLICY can_match_country… — country in (‘UK’,’Ireland’)

    SELECT * FROM mytable;

    gets planned to…

    SELECT * FROM mytable
    WHERE (user_col = current_user) — must_match_user
    AND ((id > 50) — can_match_high_id
    OR (country in (‘UK’,’Ireland’))); — can_match_country

    Permissive could just be the default, and having only permissive would be no different from the current functionality. Having only restrictive means all policies have to be met. Having a mix means all restrictive polices must be met, and at least one of the permissive policies. Of course, rather than creating several policies, one could roll them into a single policy.

    I haven’t given the above idea much thought, so there could be a glaring problem with it, but I did expect a restrictive model in the first place.

  4. Pity column level security doesn’t let SELECT * work where you only have access to a subset of the fields in the table

  5. @Thom:
    Polices work like roles. Not just PostgreSQL roles, but roles in any operating system or application. If you assign multiple roles to a user, the access is OR’ed together. This is in line with just about all ACL technology. Policies work the same way here.

  6. @Matt Yes, I discussed the nature of RLS with the author of the feature previously, and he described it in terms of granting permission. However, the check options can easily switch one into the mindset of using check constraints, where overlapping check constraints would effectively be connected via AND rather than OR.

    I think there’s usefulness in both modes, but I don’t think it comes all that naturally to think of it as being permissive. I’ve looked at what happens in Oracle since and discovered that their model is restrictive, so it’s going to be equally surprising to those coming from that database system too.

  7. @Thom I have used exclusion constraints before (very useful for rejecting overlaps among rows!) and am somewhat used to the CHECK keyword having more than one concept behind it. For me, check constraints on columns have always been about enforcing data integrity within the row. RLS is about security, a very different subject matter. I would expect that anyone diving into this would start with the concept and end with the syntax. Your concern is valid if they were to start at the syntax and make a non-security related restriction out of it. This seems like a very “tail wagging the dog” way of building code. I would expect that they would consult the PostgreSQL documentation at some point in process of making out the syntax and usage.

    For me, when I think “lets restrict rows”, the first and only thing that comes to mind is roles, and the OR behavior does not violate the “principle of least surprise” like an AND would. I guess I have been wanting simple row level security for a long time, and any other type of row restriction purpose, I believe, belongs in the original query. I just do not think that DBAs should be putting application layer concepts in the database where it affects row retrieval for *EVERYTHING*. But for security, this makes sense, and it even takes database privileges to bypass RLS. But if RLS where used for non-security purposes, then eventually, someone has to bypass RLS for an application logic reason using database privileges. By choosing “AND” RLS checks, it seems like we are encoring what I think are bad designs.

    I would not want to see PostgreSQL model Oracle behavior. Oracle is an unnecessarily complex database with technologies that were implemented either before the implications were fully understood, rushed into releases, or built to support old behavior and syntax, no matter how unintuitive and inconvenient they might be. Open source projects eventually get forked or replaced for going the route that they take. Its current RLS AND behavior sounds like it was chosen based on syntax as opposed to semantics, or perhaps just the feel good concept of being “restrictive”. It probably made more sense to a manager who was not involved in the details.

  8. It would have been nice if RFS took a stored procedure (OR Function) to do the authentication.
    With that flexibility it would be easy to push down application level security to DB. (At least in applications where I work.)

  9. I searched in other websites also . all are using In create policy using clause current_user only.
    But as of my requirement i want to use the session id value .
    EX.) create table test(id integer,name
    varchar(50),session_id integer default
    pg_backend_pid());
    create policy test_policy on test
    using( session_id=pg_backend_pid());
    please suggest me how to use like this.

    Thank you.

Comments are closed.