Let’s talk dirty

Important disclaimer: the module that I'm writing about was written by my colleague Phil Sorber.

We all have been in, or heard about, situation like this:

$ UPDATE users SET password = '...'; WHERE id = 123;

(hint: first ; is before where).

Of course you should have backups, and you can protect yourself from it. But what if backup is too old, and you didn't protect yourself?

We (well, technically Phil) were faced with this problem. He managed to recover enough data using different means, but this made him think, and write pg_dirtyread.

Supposedly, it should make it possible to read removed data. So let's test this claim.

Installation is simple enough:

=$ git clone https://github.com/omniti-labs/pgtreats
...
 
=$ cd pgtreats/contrib/pg_dirtyread/
 
=$ make
...
 
=$ make install
/bin/mkdir -p '/opt/pgbrew/9.1.3/share/postgresql/extension'
/bin/mkdir -p '/opt/pgbrew/9.1.3/lib/postgresql'
/bin/sh /opt/pgbrew/9.1.3/lib/postgresql/pgxs/src/makefiles/../../config/install-sh -c -m 644 ./pg_dirtyread.control '/opt/pgbrew/9.1.3/share/postgresql/extension/'
/bin/sh /opt/pgbrew/9.1.3/lib/postgresql/pgxs/src/makefiles/../../config/install-sh -c -m 644 ./pg_dirtyread--1.0.sql  '/opt/pgbrew/9.1.3/share/postgresql/extension/'
/bin/sh /opt/pgbrew/9.1.3/lib/postgresql/pgxs/src/makefiles/../../config/install-sh -c -m 755  pg_dirtyread.so '/opt/pgbrew/9.1.3/lib/postgresql/'

Couple of things – it comes as a Pg extension, and contains basically just one .so file (plus the extension things, like .control and of course sql.

Having it installed, let's load this to Pg:

=$ psql -c 'create extension pg_dirtyread'
CREATE EXTENSION

Note of warning though – it did work on 9.1 for me, but did not on 9.2. Trying to load pg_dirtyread on 9.2 ended up with:

ERROR:  could not load library "/home/pgdba/work/lib/postgresql/pg_dirtyread.so": /home/pgdba/work/lib/postgresql/pg_dirtyread.so: undefined symbol: RelationGetDescr

No idea why, I let Phil know, but I don't think it's very big problem, since 9.2 is not released yet, so noone is (hopefully) storing important information in it 🙂

Whole extension is just one, simple function:

$ \dx+ pg_dirtyread
Objects IN extension "pg_dirtyread"
     Object Description
────────────────────────────
 FUNCTION pg_dirtyread(oid)
(1 ROW)

Let's see how it works.

To test it, I will create some table, enter there some rows, delete some, and will try to recover. All with disabled autovacuum (just in case).

$ SHOW autovacuum;
 autovacuum
────────────
 off
(1 ROW)
 
$ CREATE TABLE test AS SELECT i AS id, 'depesz #' || i AS username FROM generate_series(1,10) i;
SELECT 10
 
$ SELECT * FROM test;
 id │  username
────┼────────────
  1 │ depesz #1
  2 │ depesz #2
  3 │ depesz #3
  4 │ depesz #4
  5 │ depesz #5
  6 │ depesz #6
  7 │ depesz #7
  8 │ depesz #8
  9 │ depesz #9
 10 │ depesz #10
(10 ROWS)

OK. Data is ready, let's delete some rows:

$ DELETE FROM test WHERE 0 = id % 3;
DELETE 3
 
$ SELECT * FROM test;
 id │  username
────┼────────────
  1 │ depesz #1
  2 │ depesz #2
  4 │ depesz #4
  5 │ depesz #5
  7 │ depesz #7
  8 │ depesz #8
 10 │ depesz #10
(7 ROWS)

Now, let' try to use the function to recover deleted rows:

$ SELECT * FROM pg_dirtyread( 'test'::regclass );
ERROR:  a COLUMN definition list IS required FOR functions returning "record"
LINE 1: SELECT * FROM pg_dirtyread( 'test'::regclass );
                      ^

Oops. We need to specify columns. It's simple because we know them 🙂

$ SELECT * FROM pg_dirtyread( 'test'::regclass ) t (id int4, username text);
 id │  username
────┼────────────
  1 │ depesz #1
  2 │ depesz #2
  3 │ depesz #3
  4 │ depesz #4
  5 │ depesz #5
  6 │ depesz #6
  7 │ depesz #7
  8 │ depesz #8
  9 │ depesz #9
 10 │ depesz #10
(10 ROWS)

That's great. The value really got recovered!

So let's test some more cruel things. What about vacuumed table?

$ vacuum test;
VACUUM
 
$ SELECT * FROM pg_dirtyread( 'test'::regclass ) t (id int4, username text);
 id │  username
────┼────────────
  1 │ depesz #1
  2 │ depesz #2
  4 │ depesz #4
  5 │ depesz #5
  7 │ depesz #7
  8 │ depesz #8
 10 │ depesz #10
(7 ROWS)

OK – vacuum makes the data fully gone. So you have to be fast. But – will the dirty read work also for toasted values? Let's see it:

$ INSERT INTO test (id, username) VALUES (11,repeat('depesz #11', 100000));
INSERT 0 1
 
$ SELECT id, md5(username), LENGTH(username) FROM test WHERE id = 11;
 id │               md5                │ LENGTH
────┼──────────────────────────────────┼─────────
 11 │ 05a472c082f47348d85af868d2d7861d │ 1000000
(1 ROW)

OK. So I entered 1 million characters in the column, got md5 of it (because I will not compare 1 million of bytes to show you that it worked). So, let's delete it, and try to recover:

$ SELECT id, md5(username), LENGTH(username) FROM pg_dirtyread( 'test'::regclass ) t (id int4, username text);
ERROR:  unexpected chunk NUMBER 4 (expected 0) FOR toast VALUE 16395 IN pg_toast_16389

OK. Toasted values are also not available. Which is probably relatively large problem.

But, the whole “read dirty" thing, made me think about reading uncommitted data. Let's see. First – I vacuumed test table to skip issues with row #11 and it's missing toast. Then I did:

session #1 session #2
$ begin;
$ INSERT INTO test (id, username)
    VALUES (12,repeat('depesz #12', 100000));
INSERT 0 1
$ SELECT id, md5(username), LENGTH(username)
FROM pg_dirtyread( 'test'::regclass ) t (id int4, username text);
 id │               md5                │ LENGTH
────┼──────────────────────────────────┼────────
  1 │ e7b3a18efa4dccdba1dbe3b1ee261a0a │      9
  2 │ 71ff75c0e4e0f97d5571b1494036a8cb │      9
  4 │ ad2193303191f9778e1eb567b4253c51 │      9
  5 │ c36973dd5eec47eec490963be7346f35 │      9
  7 │ 5a57a3a8f9b71656939ddced280987a3 │      9
  8 │ eba729acefe975ee08d0b213f2cd322f │      9
 10 │ defc1a12ff09c42c4e168784e31e4915 │     10
(7 ROWS)
$ SELECT id, md5(username), LENGTH(username)
FROM pg_dirtyread( 'test'::regclass ) t (id int4, username text);
 id │               md5                │ LENGTH
────┼──────────────────────────────────┼─────────
  1 │ e7b3a18efa4dccdba1dbe3b1ee261a0a │       9
  2 │ 71ff75c0e4e0f97d5571b1494036a8cb │       9
 12 │ c50f23ab00c5d98d016a67a711bd1223 │ 1000000
  4 │ ad2193303191f9778e1eb567b4253c51 │       9
  5 │ c36973dd5eec47eec490963be7346f35 │       9
  7 │ 5a57a3a8f9b71656939ddced280987a3 │       9
  8 │ eba729acefe975ee08d0b213f2cd322f │       9
 10 │ defc1a12ff09c42c4e168784e31e4915 │      10
(8 ROWS)
$ rollback;
$ SELECT id, md5(username), LENGTH(username)
FROM pg_dirtyread( 'test'::regclass ) t (id int4, username text);
 id │               md5                │ LENGTH
────┼──────────────────────────────────┼─────────
  1 │ e7b3a18efa4dccdba1dbe3b1ee261a0a │       9
  2 │ 71ff75c0e4e0f97d5571b1494036a8cb │       9
 12 │ c50f23ab00c5d98d016a67a711bd1223 │ 1000000
  4 │ ad2193303191f9778e1eb567b4253c51 │       9
  5 │ c36973dd5eec47eec490963be7346f35 │       9
  7 │ 5a57a3a8f9b71656939ddced280987a3 │       9
  8 │ eba729acefe975ee08d0b213f2cd322f │       9
 10 │ defc1a12ff09c42c4e168784e31e4915 │      10
(8 ROWS)

Which is pretty interesting – not only you can read data inserted by other transactions before commit – you can also read it after rollback – up until it will get vacuumed.

And what happens in case of updates?

$ UPDATE test SET username = 'xxx' WHERE id = 1;
UPDATE 1
 
$ SELECT id, username FROM pg_dirtyread( 'test'::regclass ) t (id int4, username text) WHERE id = 1;
 id │ username
────┼───────────
  1 │ depesz #1
  1 │ xxx
(2 ROWS)

We just see all versions of given row available.

All of this is pretty cool, and can definitely be useful. We just need some changes to make it truly a great tool:

  • Fix compilation issue on 9.2 (and test on all other supported Pg versions, and maybe even some unsupported by Pg
  • Do something about toast. If it's technically not possible to get the value from toast – make it return “null" for the column, or whatever – just don't fail fully – otherwise it will break for virtually any real-world table.
  • I would definitely add information about xmin/xmax to outputted columns – this is required to be able to tell which row is latest, and which was previous
  • I don't much like the interface – requirement to name all columns, while possible, can be made nicer. Maybe returning set of texts, where each text is representation of row? So you could query like: select disk_xmin, disk_xmax, (row::table).* from pg_dirtyread(‘table'::regclass) ?

All things said – this tool shows great premise, and can become invaluable for any serious DBA, but it has some issues that need to be addressed.

23 thoughts on “Let’s talk dirty”

  1. Very nice feature. It ought to be a part of standard postgres distribution not only an extension.

  2. I bet the 9.2 problem is just a missing #include “utils/rel.h” in the source.

  3. Depesz,

    Thanks for the review!

    I had not tested it against 9.2 yet, so I will look into that. I also need to work on backward compatibility. I am using some macro’s that I am not sure exist before 9.1.

    I have intentions of including xmin/xmax and friends in the output but haven’t gotten that far yet.

    I was unaware of the toast issue and that is the first thing I will fix. As you mentioned it is nearly useless without it.

    The interface is a tough one. I tried very hard to use the generic record type so that I could return rows like the underlying table. My thought was that you would want to select into a temp table and manipulate the data there. There are no indexes in action so you are doing a full table scan every time you access it. It’s not really meant to be an everyday use function, though I am open to the possibility of optimizing it. I’ll have to think more on this one.

    @marek, I would imagine this as updated select functionality than a standalone function if it were to make the postgres core. Perhaps some sort of “USING SNAPSHOT ANY” syntax.

  4. Very nice demonstration, Depesz. Especially the access to uncommitted writes from other transactions shows how useful this can be beyond data recovery.

  5. Yeah, I’d need xmin/xmax data as well. ctid would be nice too, so that you can *select* the specific row version you want.

  6. How often is an unqualified UPDATE or DELETE not a mistake?

    Will Leinweber suggests that those two commands should always require a WHERE clause, at least in interactive mode. If you really, really want to update all the rows, just add a “WHERE true” clause.

    Precedent for this has been pretty well established thanks to protection against mistakes like inserting a space along the lines of `rm -rf ~ /foo/*`.

  7. Great idea — I have always felt we needed a way to access this data. FYI, I think you will also find that a SELECT on a full page will also make the old data unrecoverable — see my MVCC Unmasked talk for details.

  8. @peter
    I definitely agree with you. Those statements are not meant to be used unqualified – ever. Should be a SQL standard to always have a condition. I never got burned with it but I think it’s just a matter of time.

  9. @Peter, @Jorge: sure, but how’s that related? Besides, it’s trivial to protect against it.

  10. Hello! HELP! I need to install this tool in my postgres.
    I read your page but when I follow the steps I get the following error:
    Makefile: 8: /usr/lib/postgresql/9.1/lib/pgxs/src/makefiles/pgxs.mk: No such file or directory.
    If I go to my /usr/lib/postgresql/9.1/lib I notice that there is no subdirectory there.
    How do I install the developer package without breaking more things?.
    Thank you very much and sorry for my bad English.
    Guillermo

    guillermovil at gmail

  11. @Guillermo:
    Based on path, I guess that you’re using Debian or Ubuntu. In such case, apt-get install postgresql-server-dev-9.1

  12. thank you very much DEPESZ, yes my s.o. is ubuntu if I try:
    apt-get install postgresql-server-dev-9.1
    I got:
    Leyendo lista de paquetes… Hecho
    Creando árbol de dependencias
    Leyendo la información de estado… Hecho
    Se instalarán los siguientes paquetes NUEVOS:
    postgresql-server-dev-9.1
    0 actualizados, 1 se instalarán, 0 para eliminar y 10 no actualizados.
    Necesito descargar 920 kB de archivos.
    Se utilizarán 4.653 kB de espacio de disco adicional después de esta operación.
    Err http://ppa.launchpad.net/pitti/postgresql/ubuntu/ natty/main postgresql-server-dev-9.1 amd64 9.1.6-1~natty1
    404 Not Found
    Imposible obtener http://ppa.launchpad.net/pitti/postgresql/ubuntu/pool/main/p/postgresql-9.1/postgresql-server-dev-9.1_9.1.6-1~natty1_amd64.deb 404 Not Found
    E: No se pudieron obtener algunos archivos, ¿quizás deba ejecutar «apt-get update» o deba intentarlo de nuevo con –fix-missing?

    What can I do?

  13. @Guillermo:
    no idea – I don’t understand the message. Perhaps try someone who knows ubuntu and speaks the language.

  14. Hi depesz,

    It’s possible make a jdbc version to read the deleted rows in postgresql ?

    Thanks

  15. @Alejandro:
    1. have you *ever* seen anything about jdbc on depesz.com? hint: no. I don’t use it, I don’t know anything about it, I don’t write about it.
    2. as far as I can understand, jdbc is protocol to send queries to pg from java application. so any query that you can send from psql, you should be able to send via jdbc. so – the solution that I describe in here – should work via jdbc, just as fine as via *any* database interface – it’s just a query.

  16. Hi,

    After of very tried to compile this PostgreSQL C function for Windows, I compile that (with VS C++ 2008), but the function get a error when try to read a deleted row. The example:

    CREATE FUNCTION pg_dirtyread(oid)
    RETURNS setof record
    AS E’$libdir/pg_dirtyread’, ‘pg_finfo_pg_dirtyread’ LANGUAGE C STRICT ;

    Create table hola(
    id bigserial,
    dato1 varchar(199) not null,
    fecha date
    );

    /*insert 3 rows and delete 1 row*/

    select * from pg_dirtyread(‘hola’::regclass) t (id bigint,dato1 varchar(199), fecha date);

    ERROR: invalid memory alloc request size 1850015748

    Thanks

  17. @Alejandro:
    if you are reporting bug, or requesting help withcompilation, I suggest you contact author, and not some random blogger who wrote about it. Besides, I dont have any windows machines, nor I have any access to such box.

  18. Every time I got error:
    FEHLER: Error converting tuple descriptors!
    DETAIL: Anzahl der zurückgegebenen Spalten (6) entspricht nicht der erwarteten Spaltenanzahl (1).

  19. @Marina:
    1. I have no idea what it means – I don’t speak German.
    2. If/when you’ll translate the error – it probably is better to write to author of the software, not some random blogging guy.

  20. hello, this extension is valid for the latest versions of Postgres ( 9.3,9.4,9.5 ) ? Thanks!!

Comments are closed.