How to install and configure PostgreSQL Debian/Ubuntu – for developer use – part 1

Recently I spent some time thinking about what can be improved when it comes to helping new users start using PostgreSQL.

One thing that almost immediately jumped to my mind is – how to install PostgreSQL? The task is theoretically simple. But there are always some caveats – which packages to use, what to configure in the beginning, where to find config files and logs.

With that in mind I decided to write a howto based on my ideas on what is right. These do not necessarily mean that these are the best for everybody, but I think this is a good start for anyone wanting to start their adventure with PostgreSQL.

Final note of warning – this post is for installing and setting PostgreSQL on developer workstation. As in: server where user can do anything, and we don't really care about security. Please do not configure production servers using this howto.

To make sure I start in as vanilla configuration as possible, I made myself two virtual servers:

testubuntu – running Ubuntu 20.10 groovy
testdebian – Debian GNU/Linux 10 (buster)

Both were updated to newest versions of packages before any work was done.

Now, before I will go any further – I tend to not work from root account, unless I absolutely must. So, I need some way to get to root for things like package installation.

Ubuntu ships with sudo configured in such a way that my user (first created) already belongs to sudo group, thanks to which I can:

depesz@testubuntu:~$ sudo id
[sudo] password for depesz: 
uid=0(root) gid=0(root) groups=0(root)

I need to give my password (password to account “depesz"), but clearly sudo works.

Unfortunately – debian has sudo installed, but my user is not added to sudo group by default.

So, first things I need to do (to be able to effectively use sudo), is:

depesz@testdebian:~$ su -c "/usr/sbin/adduser depesz sudo"
Password: 
Adding user `depesz' to group `sudo' ...
Adding user depesz to group sudo
Done.

Afterwards, I need to logout and login again (to make sure new group is loaded properly), and afterwards I can:

depesz@testdebian:~$ sudo id
[sudo] password for depesz: 
uid=0(root) gid=0(root) groups=0(root)

Now let's install PostgreSQL. Both ubuntu and debian do provide packaged PostgreSQL:

depesz@testubuntu:~$ apt-cache search ^postgresql-[0-9] | grep -E '^postgresql-[0-9.]+ '
postgresql-12 - object-relational SQL database, version 12 server
 
depesz@testdebian:~$ apt-cache search ^postgresql-[0-9] | grep -E '^postgresql-[0-9.]+ '
postgresql-11 - object-relational SQL database, version 11 server

but, there is only one version available, and as you can quickly check these are not the latest. Latest version is 13 (at the time when I write it).

Luckily, there is another, better source. apt.postgresql.org repository – maintained by PostgreSQL community.

To make it possible to use the repo we need to execute couple of steps, as described in the docs linked above:

sudo apt-get install curl ca-certificates gnupg

curl https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -

sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'

```
sudo apt-get update
```

These commands do:

Installs packages that will be necessary to fetch, and verify, cryptographic signatures for the packages
Downloads, and installs cryptographic key that is used to sign the packages
Actually tells apt, by writing to /etc/apt/sources.list.d/pgdg.list file, to download packages from apt.postgresql.org
Forces apt to fetch new indexes (lists of packages) from all sources, including newly added apt.postgresql.org

It is possible that when running last command you will see something like:

...
Fetched 559 kB in 1s (485 kB/s)
Reading package lists... Done
N: Skipping acquire of configured file 'main/binary-i386/Packages' as repository 'http://apt.postgresql.org/pub/repos/apt groovy-pgdg InRelease' doesn't support architecture 'i386'

While the last line, about i386 packages, looks dangerous, it really isn't.

Now, with this new package repository configured our list of available PostgreSQL versions is much more interesting:

depesz@testubuntu:~$ apt-cache search ^postgresql-[0-9] | grep -E '^postgresql-[0-9.]+ '
postgresql-12 - object-relational SQL database, version 12 server
postgresql-10 - object-relational SQL database, version 10 server
postgresql-11 - object-relational SQL database, version 11 server
postgresql-13 - object-relational SQL database, version 13 server
postgresql-9.5 - object-relational SQL database, version 9.5 server
postgresql-9.6 - object-relational SQL database, version 9.6 server

and on debian, even more:

depesz@testdebian:~$ apt-cache search ^postgresql-[0-9] | grep -E '^postgresql-[0-9.]+ '
postgresql-11 - object-relational SQL database, version 11 server
postgresql-10 - object-relational SQL database, version 10 server
postgresql-12 - object-relational SQL database, version 12 server
postgresql-13 - object-relational SQL database, version 13 server
postgresql-8.2 - object-relational SQL database, version 8.2 server
postgresql-8.3 - object-relational SQL database, version 8.3 server
postgresql-8.4 - object-relational SQL database, version 8.4 server
postgresql-9.0 - object-relational SQL database, version 9.0 server
postgresql-9.1 - object-relational SQL database, version 9.1 server
postgresql-9.2 - object-relational SQL database, version 9.2 server
postgresql-9.3 - object-relational SQL database, version 9.3 server
postgresql-9.4 - object-relational SQL database, version 9.4 server
postgresql-9.5 - object-relational SQL database, version 9.5 server
postgresql-9.6 - object-relational SQL database, version 9.6 server

One, generally, should always use the latest version – it has the most features, and the most bugs removed.

So – let's use postgresql-13:

$ sudo apt-get install postgresql-13

Command is the same on both systems.

After it will finish, you can see that PostgreSQL is running:

$ sudo pg_lsclusters 
Ver Cluster Port Status Owner    Data directory              Log file
13  main    5432 online postgres /var/lib/postgresql/13/main /var/log/postgresql/postgresql-13-main.log

What do all the fields mean? Well, each line is so called “cluster", which is rather unfortunate term, as it's not cluster in the generally used meaning. Rather it basically means “installation". You can have multiple installations of PostgreSQL on the same server – with the same, or different, versions. Meaning of the fields is:

Ver which PostgreSQL version is this cluster using
Cluster name of this cluster. It has to be unique within given version. As in – you can have two clusters named main, in different versions, but only one in single version
Port which port PostgreSQL is listening on. These are both for TCP/IP connections, and connections over Unix Socket.
Status well, what is the status of this cluster. For now, it's safe to assume you will only see statuses online, down, and potentially suffixed by ,recovery if given cluster is starting now and executing recovery.
Owner which system user PostgreSQL data belongs to. Generally you will see there postgres, unless you will do some manual changes.
Data directory where does PostgreSQL keep all the data files related to this cluster, or at least – main data files.
Log file where you can find logs for this PostgreSQL cluster

All is simple, I hope.

Now. Let's actually use it.

First, let's start with default, always available, command line interface to PostgreSQL – namely psql. If i'll run psql now, I will be greeted with error:

$ psql
psql: error: FATAL:  role "depesz" does not exist

This requires some explanation.

There are system accounts (like root, postgres, depesz, and others. Defined in /etc/passwd (by default).

These are not so interesting to PostgreSQL.

To be able to login to PostgreSQL, you have to have account within PostgreSQL.

Immediately after installation there is only one account created, named postgres. This account is also so called superuser – which means someone logged using it can do anything to any database within given cluster.

Interestingly, we will not be able to use it to login, at least not immediately:

$ psql -U postgres
psql: error: FATAL:  Peer authentication failed for user "postgres"

Now, the problem is that, by default, configuration of PostgreSQL will only allow user postgres when logging from system account postgres.

So I can do it, using:

$ sudo -u postgres psql
psql (13.1 (Debian 13.1-1.pgdg100+1))
Type "help" for help.
 
postgres=#

How does that work?

Normally, when you do “sudo some_command", sudo logs to root account, and then executed some_command.

But if I'd use “sudo -u postgres some_command" – then sudo will log not to root, but rather to account that I named as -u option.

This means that the psql I started will be running from postgres system account.

Now, a bit more info – if you don't provide -U option to psql (or any other way to provide database user name) – psql assumes that your database user is the same as your system user.

Similarly, if you don't provide -d option (name of database you want to connect to), it will assume that you want to connect to database named the same way as your database user.

Since this howto is about setting Pg for developer, let's make it simpler to log using varying accounts.

In directory /etc/postgresql/13/main you will find file pg_hba.conf (please note that directory depends on version and cluster name, so if you installed different version, the path will be slightly different.

If we'd remove comments, it will look like this:

local   all             postgres                                peer
local   all             all                                     peer
host    all             all             127.0.0.1/32            md5
host    all             all             ::1/128                 md5
local   replication     all                                     peer
host    replication     all             127.0.0.1/32            md5
host    replication     all             ::1/128                 md5

Comments provide pretty good explanation on what's there, but for now, let's just change all peer and md5 at the end of lines to trust

This is insecure, as it basically tells PostgreSQL that anyone can log using any database account, without checking of any password or other ways of authentication.

Which is of course very important in production, but not so much on developer workstation.

You can do the change using this one-liner:

$ sudo perl -pi -e 's/(peer|md5)$/trust/' /etc/postgresql/13/main/pg_hba.conf

or just use some editor, but please note that you will have to edit it using root or postgres system user.

After the change we need to reload configuration.

This can be done using quite a lot of various methods:

in shell: sudo pg_ctlcluster 13 main reload
also in shell: sudo systemctl reload postgresql@13-main.service
or, if you have psql open, logged in as superuser, you can simply: select pg_reload_conf();

All will reload pg_hba.conf, and from this moment on, from your normal system account you will be able to run psql using superuser:

$ psql -U postgres
psql (13.1 (Ubuntu 13.1-1.pgdg20.10+1))
Type "help" for help.
 
postgres=#

I find it preferable to using sudo, but you might find otherwise.

Now, within the Pg cluster, there are, by default three databases (which you can see by calling psql -l, or by executing \l in open psql):

template0 – inaccessible to users, used, sometimes, when creating new databases
template1 – basic template for new database. Whenever you create new database (not cluster, database within cluster), and you don't specify otherwise, template1 will be copied and renamed to your requested name.
postgres – generally empty database that is used by various tools as “default database to connect to" when user doesn't specify another db. Generally it's not advised to change anything there.

To make it possible/sensible to use PostgreSQL as your own user, you need to create database user.

You can do it using shell:

$ createuser -U postgres depesz

or using SQL, within psql (or other client program, we'll get to it):

$ psql -U postgres
...
postgres=# CREATE USER depesz;
CREATE ROLE

I had to use -U postgres option to psql or createuser, because otherwise the tools would try to connect to database as user depesz (name of system account I'm currently logged in), and there was no such user before.

Now, I should be able to connect to database as myself:

$ psql
psql: error: FATAL:  database "depesz" does not exist

Well, it did log in as “depesz" account, but, since I didn't provide database name – it tried to connect to database “depesz" – which still doesn't exist. So let's make it. Again, two ways:

shell: createdb -U postgres -O depesz depesz
in SQL, logged as superuser: create database depesz with owner depesz;

Both will do the same – make new database (copy of template1), name it depesz, and make depesz database user owner of this database (which means that it has more privileges in there).

And now, finally:

$ psql
...
depesz=> select current_user, current_database();
 current_user | current_database 
--------------+------------------
 depesz       | depesz
(1 row)

please note that “depesz" in psql prompt means database name, not user name. This can be checked quickly by connecting to another database:

$ psql -d postgres
...
postgres=> select current_user, current_database();
 current_user | current_database 
--------------+------------------
 depesz       | postgres
(1 row)

With this in place, you can now use psql to do all normal SQL operations. Defining data structures, adding data, modifying data, deleting, selecting. All is accessible.

But – lots of people, apparently, don't really like psql.

For such cases, let's install, and configure, pgAdmin. There are many more tools, but pgAdmin has the benefit of being readily available from the same source:

$ sudo apt-get install pgadmin4

This will take some time, but eventually will end.

Afterwards, let's launch it:

$ pgAdmin4

Startup takes some time – when it will finish, in the console you'll see:

 * Serving Flask app "pgadmin" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off

and your web browser will open pgAdmin page. First thing – it will ask you for a password – this will be used to protect all communication details with actual databases. Since it's devel machine, it doesn't really matter what you will choose, as long as you will remember it.

It should look basically like this:

[ first-window.png PNG 1075x700 1075x700+0+0 8-bit sRGB 79166B 0.000u 0:00.000 ]

In the quick links section, there is a button “Add New Server".

After you will click it you will be presented with dialog to define new server:

[ new-server-window.png PNG 1075x700 1075x700+0+0 8-bit sRGB 100885B 0.000u 0:00.000 ]

Please note the tabs at the top – General / Connection / SSL, and so on.

For now, let's put “Personal" as Name in General. Then in “Connection" Tab:

Host: 127.0.0.1 – as it will connect to localhost only
Port should be the same as pg_lsclusters showed, but by default it's correct 5432
Username – in my case it should be “depesz"

I could use username “postgres", but I made myself personal user so I can use it.

After clicking save, I got page looking like:

[ server-added.png PNG 1075x700 1075x700+0+0 8-bit sRGB 76583B 0.000u 0:00.000 ]

And in there, you can do whatever you want. Generally you'd first unroll Databases, and select your own database, like here:

[ db-selected.png PNG 1075x700 1075x700+0+0 8-bit sRGB 87108B 0.000u 0:00.000 ]

How to use pgAdmin – sorry, I just don't know it that well. Started it, looked at couple of things, and disconnected. I'm very old-school, so I use console and psql exclusively.

But I know that others like GUI tools.

Basically, at this moment, you can install any database application, that supports PostgreSQL, have it connect to db server on you machine, using host 127.0.0.1, port 5432, and your own username and database name, and it should work.

If something is wrong – you can see into logs:

=$ sudo tail -n 10 /var/log/postgresql/postgresql-13-main.log
[sudo] password for depesz:
2020-12-21 13:09:42.279 CET [215197] LOG:  starting PostgreSQL 13.1 (Ubuntu 13.1-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit
2020-12-21 13:09:42.279 CET [215197] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2020-12-21 13:09:42.284 CET [215197] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2020-12-21 13:09:42.291 CET [215198] LOG:  database system was shut down at 2020-12-21 13:09:41 CET
2020-12-21 13:09:42.297 CET [215197] LOG:  database system is ready to accept connections
2020-12-21 13:17:19.279 CET [215197] LOG:  received SIGHUP, reloading configuration files

That should be good enough for development. For production, or multi-user development, we will need some more settings, but this will have to wait for next blogpost.

Questions? Comments?

=$
|

How to install and configure PostgreSQL Debian/Ubuntu – for developer use – part 1

One thought on “How to install and configure PostgreSQL Debian/Ubuntu – for developer use – part 1”