OmniPITR

Thanks to the company I work for, OmniTI, I have been working on a pretty cool project. The project is called OmniPITR, and here is what it is, why it exists, how it works, and where to get it.

We use WAL replication quite extensively. We also make hot backups a lot. What bugged me (and possibly others, though it could be that I just whined and whined, and they simply said: OK, if you don't like it, fix it) was that the setup was always complicated.

We had some program to archive WAL segments. Then there was a cronjob to move them to two places (the backup server and the slave server), and then we used pg_standby, sometimes with %r and sometimes without (%r is what removes obsolete segments).

On top of that there are backups, which get tricky: we need to make them on the master, which means we have to keep WAL segments on the master, and we need something to clean up the WAL archive on the master afterwards. All in all, it worked, but I always felt it could be done in a more self-contained way.

Now, this way has materialized.

As of now the project doesn't have its own page, but you can browse or download (via svn co) the sources. Which work. And don't require compilation.

What is OmniPITR? It's one simple set of tools, where each tool has its own set of tasks and the number of tools is kept as small as possible, which lets me handle all parts of WAL replication and backup:

  • archiving wal segments
  • making hot backups
  • restoration of wal segments on slave

These are done already. And they work. And they are tested, at least to the extent of our possibilities.
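
To give a feeling for how the archiving part plugs in, here is a minimal sketch of what goes into postgresql.conf on the master. The paths are made up, and the option names are only from my recollection of the docs, so treat the whole thing as illustrative and check the perldoc of omnipitr-archive for the authoritative list:

    # postgresql.conf on the master (illustrative sketch, not copied from the docs)
    # archive_mode exists from 8.3 on; on 8.2 setting archive_command alone is enough
    archive_mode    = on
    archive_command = '/opt/omnipitr/bin/omnipitr-archive --log /var/log/omnipitr/archive.log --dst-local /mnt/wal_archive --dst-remote gzip=postgres@slave:/mnt/wal_archive "%p"'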

We also work on:

  • making hot backups on slave
  • single tool to monitor state of replication/backups

Now. What's cool about it? Let me list some features:

  • It has documentation. Actually, the documentation is written before the code.
  • It has a full step-by-step howto on setting up replication.
  • It works transparently with compressed WAL segments, i.e. it uses less disk space. gzip, bzip2 and lzma compression are supported.
  • It has the ability to apply WAL segments on the slave with a delay, for example to protect against a 'TRUNCATE TABLE users' executed on the master being propagated to the slave(s) (see the sketch right after this list).
  • PostgreSQL license; basically, do what you want.
  • Is fully configurable using command line options
  • Did I mention that there are docs?
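
For the delay feature specifically, the whole thing boils down to pointing restore_command at omnipitr-restore. Here is a rough sketch of a recovery.conf on a slave; option names are again from my memory of the docs, so verify them with perldoc before copying anything:

    # recovery.conf on the slave (illustrative sketch)
    # --recovery-delay keeps the slave roughly 900 seconds behind the master,
    # --remove-unneeded cleans up WAL segments that are no longer required
    restore_command = '/opt/omnipitr/bin/omnipitr-restore --log /var/log/omnipitr/restore.log --source gzip=/mnt/wal_archive --recovery-delay 900 --remove-unneeded %f %p'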

To use it you need:

  • Some will. It will not work on its own. You have to set it up (although there is a detailed howto provided).
  • Some Unix machine(s). I didn't test it on Windows, and I just don't have any. It might work, but currently it simply hasn't been run there.
  • Perl. I know: Perl sucks, is unreadable, there are better languages, and so on. Right. You still need Perl.
  • The perldoc program. It should come with Perl; if not, it is in some separate package, like perl-doc. It is not strictly required, but it helps with reading the docs (example after this list).
  • PostgreSQL. We tested it from 8.2 on.
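
As for reading the docs with perldoc, it's as simple as pointing it at the .pod files in the sources (the path below is just an example of where you might have checked them out):

    perldoc /opt/omnipitr/doc/omnipitr-archive.pod
    # or, if you really want .html:
    pod2html --infile=/opt/omnipitr/doc/omnipitr-archive.pod --outfile=omnipitr-archive.html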

Missing features (backup on slave, monitoring, and whatever is in the TODO doc) will be added in the near future. Generally, I expect to have backup on slave done by the end of next week.

Now for some short Q&A:

  • Why are the docs in perldoc? Why not .html?
  • Because I like perldoc. If you want .html, there is the pod2html program bundled with Perl.
  • Really, why didn't you write it in some sensible language? Python? Ruby? Go? Java?
  • Well, Perl is sensible for me. It's there by default on most Unices, it behaves the same (or close to the same) everywhere, and it is powerful (or: powerful enough) to let me do anything I want.
  • Isn't it made obsolete by the fact that 9.0 has streaming replication?
  • Not really. You still need restore_command on the slave to start the procedure, and to handle cases where you have to "catch up" for whatever reason (network issues, slave downtime); see the recovery.conf sketch after this Q&A. Also, OmniPITR handles hot backups, which are not solved by streaming replication.
  • Why make it public/free?
  • First of all, I'm very selfish. I want my piece of software to be as good as possible. That means tests, and I can perform only so many tests myself. With new users I get a broader test base; I get to fix the occasional bug, and generally the product gets better. And second of all, I think it is a nice way of doing WAL replication, so why not share it if I can? I got my Pg for free, after all.
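
To illustrate the streaming replication point: on 9.0 the restore_command simply sits next to the streaming settings in recovery.conf and gets used whenever the slave has to catch up from the archive. A sketch (connection details obviously made up):

    # recovery.conf on a 9.0 slave (illustrative sketch)
    standby_mode     = 'on'
    primary_conninfo = 'host=master port=5432 user=replication'
    restore_command  = '/opt/omnipitr/bin/omnipitr-restore --log /var/log/omnipitr/restore.log --source gzip=/mnt/wal_archive %f %p'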

That's about it. If you want to play with it, fine, go for it. If you have problems, please let me know. If you think the docs are wrong in any way, please let me know. And have some fun.

16 thoughts on “OmniPITR”

  1. I don’t agree with your assessment of Perl. I think it’s beautiful, can be made readable (every language can easily be made unreadable), it’s practical, it’s efficient and incredibly powerful (especially if you use CPAN).

  2. Probably worth mentioning, we are using this in production in a number of places, including multiple versions of Postgres, and on several different flavors of *nix OSes. This isn’t to say people won’t find bugs as it is tried in more scenarios, indeed we welcome feedback like that, but we have a reasonable level of confidence that this won’t eat your data. 🙂

  3. Why didn’t you just use walmgr or pitrtools as a basis? They are already well-defined tools with existing user bases. They are also both open source.

  4. @Joshua:
    Nope. Wrote it from scratch. I was not aware of pitrtools, and as for walmgr (and now that I know of pitrtools, the same thing applies), it’s in Python. While it would be interesting to learn Python better, I definitely prefer Perl.

  5. Hi,

    I have some suggestions(?) on the omnipitr-backup-master:

    It would be really nice if omnipitr-backup-master could push the archive directly to the standby machine (without creating a temp file) or even directly rsync the data directory there.

    It would make it much easier to create an initial snapshot on the standby machine without first creating a temp file, then pushing it to the standby and then unpacking it there.

    With best regards,

    — Valentine Gogichashvili

  6. @Valentine:
    I don’t think of it as an interesting feature. backup-master is a tool for making hot backups, i.e. set it in cron and have it running every day (sketch at the end of this comment). Adding special functionality (which you can get by just running the rsync program directly) that only helps with a side task, while technically possible, doesn’t strike me as very crucial. After all, helping with the setup of a pitr slave is just a side effect, and omnipitr-backup-master is not even particularly good at it, since it also backs up xlogs, which are useless for setting up replication.

    That said, if I ever have more time, I will probably make another script (not an extension of backup-*) to help with setting up a pitr slave. It’s just that currently there are other priorities (the biggest being finishing the backup-slave tool).
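
    To make the "set it in cron" part concrete, something along these lines would do; paths and option names are examples, the real ones are in the perldoc:

        # crontab of the postgres user on the master (illustrative sketch)
        0 3 * * * /opt/omnipitr/bin/omnipitr-backup-master --data-dir /var/lib/pgsql/data --xlogs /mnt/wal_archive_for_backup --dst-local /mnt/backups --log /var/log/omnipitr/backup.log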

  9. Thank you for your response. I just found your comment… the blogging service doesn’t send any notification about new comments 🙁 anyway, my fault 🙂

    Yes, you are right about the separate script. The reason why I asked for it is that I could not find any nice way to control omnipitr-archive when I want to run rsync manually.

    — Valentine Gogichashvili

  8. @Valentine:
    there is an RSS feed for comments that you can subscribe to.

    In Firefox, in the address bar, there is an RSS icon; when you press it, it lists all available RSS channels, and that includes comments.

    I’m not sure I understand what you need/want. Care to elaborate? Why would you want to control omnipitr-archive separately? What is the flow you’re trying to achieve?

  9. Hm… maybe I do not understand something about initial pushing of the live system to the standby server.

    As WAL files that are successfully processed by the archive_command get dropped, I thought that one had to stop archive_command from copying WAL files during the initial backup to the standby server.

    Now, when I (re)think about it, the WAL files are shipped to the standby server by the archive_command, and they don’t need to be included in the rsync (as I was doing before).

    I hope I now understand where I was wrong, and will try to just use rsync without stopping WAL shipping.

    Thanks for the right question,

    — Valentine

  10. @Valentine:
    Generally, for the initial send, simply skip the pg_xlog directory; it should be clean when starting the slave server anyway, so there is no point in rsyncing it.
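
    For the record, a bare-bones version of that initial sync could look like this (paths and hostname are just examples):

        psql -U postgres -c "SELECT pg_start_backup('initial-sync');"
        rsync -a --delete --exclude=pg_xlog/ --exclude=postmaster.pid /var/lib/pgsql/data/ slave:/var/lib/pgsql/data/
        psql -U postgres -c "SELECT pg_stop_backup();"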

  11. How do you get the backup program to back up multiple directories? I have our main db in say /db1 and our tablespaces in /db2. The omnipitr-backup-master does not seem to cleanly handle this, or am I missing something? Thanks for the great work.

  12. @Chris:
    Fair point. I totally forgot about it.

    Well – currently omnipitr doesn’t support it.

    I will need to think about how to implement it sensibly, as there are many more problems with it than with the single-tablespace approach (for example, you should keep the same paths after recovering).

  13. Hi

    We are trying to use the tool in a study for pg 9.2 we are doing for our company.

    However, omnipitr-monitor does not seem to work:

    Can’t locate object method “new” via package “OmniPITR::Program::Monitor” at /var/opt/hosting/build/omniti-labs-omnipitr-e499c60/bin/omnipitr-monitor line 10.

    I checked, and the omnipitr/lib/OmniPITR/Program/Monitor.pm file is very light…

    If you can help..

    Thanks

    Nicolas

  14. @Nicolas:
    The answer is simple, but unlikely to be helpful: monitor is simply not yet implemented. It is in the queue and will be under development soon-ish, but there is no ETA as of now.

Comments are closed.