The general knowledge is that numerics are slower than integers/float, but offer precision and ranges that are better.

While I understand what is slow, I don't really know how much slower numerics are. So let's test it.

There are couple of things to consider:

table size (related to disk usage per column)
speed of inserts
speed of indexing
searching using equality operator
searching using range operator
doing some math on the values

Table size. Initial thinking is that int2 column would make the table 4 times smaller than int8. But is it true?

First, let's consider very simplistic table:

CREATE TABLE test_type (
    some_column TYPE
);

where type is one of:

int2
int4
int8
float4
float8
numeric
numeric(10,0) (in this case table name will be test_numeric10)
numeric(20,0) (in this case table name will be test_numeric20)

In each of these tables, let's insert 1 million rows. In all but int2 case these will be unique, but int2 range is too small to fit 1 million unique values, so I'll end up with duplicates:

INSERT INTO test_int2 SELECT i % 32767 FROM generate_series(1,1000000) i;
INSERT INTO test_int4 SELECT i FROM generate_series(1,1000000) i;
INSERT INTO test_int8 SELECT i FROM generate_series(1,1000000) i;
INSERT INTO test_float4 SELECT i FROM generate_series(1,1000000) i;
INSERT INTO test_float8 SELECT i FROM generate_series(1,1000000) i;
INSERT INTO test_numeric SELECT i FROM generate_series(1,1000000) i;
INSERT INTO test_numeric10 SELECT i FROM generate_series(1,1000000) i;
INSERT INTO test_numeric20 SELECT i FROM generate_series(1,1000000) i;

Table sizes:

SELECT relname, pg_size_pretty( pg_table_size(oid) )
FROM pg_class WHERE relname ~ '^test_';
    relname     | pg_size_pretty 
----------------+----------------
 test_float4    | 35 MB
 test_float8    | 35 MB
 test_int2      | 35 MB
 test_int4      | 35 MB
 test_int8      | 35 MB
 test_numeric   | 35 MB
 test_numeric10 | 35 MB
 test_numeric20 | 35 MB
(8 ROWS)

Surprise, surprise.

But, let's redo the test, this time with 2 columns. Each table will get “some_column2", of the same datatype it had before. and all the data will be reinserted, to both columns (tables will get dropped and recreated to avoid any bloat). How will that change?

INSERT INTO test_int2 SELECT i % 32767, i % 32767 FROM generate_series(1,1000000) i;
INSERT INTO test_int4 SELECT i, i FROM generate_series(1,1000000) i;
INSERT INTO test_int8 SELECT i, i FROM generate_series(1,1000000) i;
INSERT INTO test_float4 SELECT i, i FROM generate_series(1,1000000) i;
INSERT INTO test_float8 SELECT i, i FROM generate_series(1,1000000) i;
INSERT INTO test_numeric SELECT i, i FROM generate_series(1,1000000) i;
INSERT INTO test_numeric10 SELECT i, i FROM generate_series(1,1000000) i;
INSERT INTO test_numeric20 SELECT i, i FROM generate_series(1,1000000) i;

and the sizes now:

SELECT relname, pg_size_pretty( pg_table_size(oid) )
FROM pg_class WHERE relname ~ '^test_';
    relname     | pg_size_pretty 
----------------+----------------
 test_float4    | 35 MB
 test_float8    | 42 MB
 test_int2      | 35 MB
 test_int4      | 35 MB
 test_int8      | 42 MB
 test_numeric   | 42 MB
 test_numeric10 | 42 MB
 test_numeric20 | 42 MB
(8 ROWS)

Yeah. So, basically what we are seeing here is effect of two things:

alignment – data is aligned to certain number of bytes. In my case – 8 (that's why int2 and int4 take as much as int8, but when I added next column – int8 grew and int2/int4 didn't). This means that if something is using less than 8 bytes, then it will be “padded" to 8 bytes. It doesn't mean that each value will take 8 bytes – it's a bit more complex than that, and if you're looking for specific details – I guess sources have definitive answers
what a lot of people don't seem to remember – each row contains more than what you just see. There are system columns. And these take 26 bytes. Per row.

So, as far as data size is concerned – there is difference, but it's not really all that big.

What about inserting speed? I'll exclude int2 from the tests, as its smaller range requires limiting range, and this makes the test more complicated (i.e. there are additional operations to do).

So, what about the other datatypes?

I inserted 1 million rows into each of the tables, between tests doing “truncate" on the table, and repeated the process 10 times to get sensible average. Results:

table	time	more than fastest
test_int4	1285.952ms	0.51%
test_int8	1304.867ms	1.99%
test_float4	1284.393ms	0.39%
test_float8	1279.404ms	0.00%
test_numeric	1533.078ms	19.83%
test_numeric10	1553.848ms	21.45%
test_numeric20	1550.829ms	21.21%

That's pretty interesting. I also ran the test, where there was unique index on each of the tables (only on one column). Results for the same insert test with index in place:

table	time	more than fastest
test_int4	2716.123ms	0.00%
test_int8	2927.931ms	7.80%
test_float4	3079.118ms	13.36%
test_float8	3208.430ms	18.13%
test_numeric	3915.411ms	44.15%
test_numeric10	3925.478ms	44.53%
test_numeric20	3909.128ms	43.92%

Clearly indexing has its toll on larger datatypes. Up to ~ 50% slower than int4.

So, we know now, what about writes. Let's see how searches perform.

For equality searching, I will search, using 10 different queries for values (in indexed column): 1, 100000, 200000, 300000 … 1000000.

Since the searches are fast, I repeated each 1000 times. Averages:

table	time	more than fastest
test_int4	0.205ms	0.00%
test_int8	0.205ms	0.08%
test_float4	0.210ms	2.50%
test_float8	0.210ms	2.39%
test_numeric	0.208ms	1.68%
test_numeric10	0.207ms	1.19%
test_numeric20	0.207ms	0.73%

For me it means that there is no difference.

Now range queries. I'll check queries that return 10 rows, 1000 rows and 10000 rows.

All queries were in form of:

SELECT * FROM TABLE WHERE some_column BETWEEN ... AND ...

First the 10 row:

table	time	more than fastest
test_int4	0.235ms	0.49%
test_int8	0.233ms	0.00%
test_float4	0.249ms	6.81%
test_float8	0.249ms	6.75%
test_numeric	0.248ms	6.29%
test_numeric10	0.247ms	5.66%
test_numeric20	0.245ms	5.02%

1000 rows:

test_int4 : 1.294ms (+ 0.00%)
test_int8 : 1.373ms (+ 6.05%)
test_float4 : 2.537ms (+ 96.03%)
test_float8 : 2.561ms (+ 97.86%)
test_numeric : 1.528ms (+ 18.04%)
test_numeric10 : 1.510ms (+ 16.65%)
test_numeric20 : 1.505ms (+ 16.26%)

table	time	more than fastest

And the final, 10k row test:

table	time	more than fastest
test_int4	8.401ms	0.00%
test_int8	9.366ms	11.49%
test_float4	17.959ms	113.77%
test_float8	18.196ms	116.60%
test_numeric	8.954ms	6.59%
test_numeric10	10.072ms	19.89%
test_numeric20	10.106ms	20.30%

That's actually pretty interesting – searching using floats is slower than using numerics?!

And now for some math. I ran a test, which was running commands like:

SELECT 12::int2 * 13::int2

Decided to test 3 operators: addition, multiplication and modulo.

Results for addition test (123 + 456):

table	time	more than fastest
int4	0.162ms	0.00%
int8	0.165ms	2.18%
float4	0.168ms	4.11%
float8	0.168ms	4.03%
numeric	0.168ms	3.76%
numeric(10,0)	0.177ms	9.38%
numeric(20,0)	0.176ms	8.92%

Multiplication (123 * 99):

table	time	more than fastest
int4	0.159ms	0.00%
int8	0.163ms	2.25%
float4	0.166ms	4.31%
float8	0.166ms	4.17%
numeric	0.165ms	4.02%
numeric(10,0)	0.174ms	9.62%
numeric(20,0)	0.173ms	9.12%

Modulo (12345 % 9876). This test couldn't be done with floats, as there is no “float % float" operator:

table	time	more than fastest
int4	0.177ms	0.00%
int8	0.180ms	1.82%
numeric	0.183ms	3.56%
numeric(10,0)	0.193ms	8.77%
numeric(20,0)	0.193ms	8.74%

So, what does it all tell us? It all depends on use case. Indexing is biggest problem, but math itself, or searching is not that big. Whether it's OK for your situation – you have to assess yourself.

UPDATE

Dim from irc asked me to check aggregates too. So I did.

Results for select sum(some_column):

table	time	more than fastest
test_float4	63.163 ms	+ 10.28%
test_float8	62.684 ms	+ 9.44%
test_int2	57.685 ms	+ 0.71%
test_int4	57.276 ms	+ 0.00%
test_int8	152.288 ms	+ 165.88%
test_numeric	118.889 ms	+ 107.57%
test_numeric10	118.179 ms	+ 106.33%
test_numeric20	118.100 ms	+ 106.20%

and select avg(some_column):

table	time	more than fastest
test_float4	69.261 ms	+ 12.53%
test_float8	69.909 ms	+ 13.58%
test_int2	62.018 ms	+ 0.76%
test_int4	61.551 ms	+ 0.00%
test_int8	153.370 ms	+ 149.17%
test_numeric	120.552 ms	+ 95.86%
test_numeric10	120.645 ms	+ 96.01%
test_numeric20	120.597 ms	+ 95.93%

This is pretty interesting. Especially how bad int8 seems to be …

8 thoughts on “How much slower are numerics?”

edo says:

2014-12-07 at 22:00

1. what about disk usage on table+indexes?
2. may be range test result would be better with explicit type specification like “between 1::float8 and 100::float8”?
did says:

2014-12-08 at 02:00

hi
In my understanding search is not slower with float, ‘displaying’ float is slow, eg if you run explain analyse float select is faster than numeric.
ndwijaya says:

2014-12-08 at 03:48

how about comparing int8 vs char(8) and varchar(8).
since they are using the same size of memory and disk how about the performance ?
sftf says:

2014-12-08 at 11:21

“This is pretty interesting. Especially how bad int8 seems to be …”
Maybe 32-bitness of tested PostgreSQL leads to such results?
How about the same on 64-bit PostgreSQL?
marian says:

2014-12-08 at 12:47

What’s postgres version, 64 or 32 bit?
depesz says:

2014-12-08 at 13:35

ndwijaya:
Isn’t that enough: https://www.depesz.com/2012/06/07/123-vs-depesz-what-is-faster/
depesz says:

2014-12-08 at 13:36

@sftf and @marian:

It was (and is) 64 bit on 64bit platform – 9.5devel on 64bit Linux.
glyn says:

2014-12-20 at 12:43

Nice to actually see some quantifiable speed differences. I wonder why sum/avg is so slow for int8 and if older versions of postgres give the same result. I also hadn’t expected to see quite the insert penalty for numeric, is that mainly going to be down to varlena overhead/size?

Comments are closed.