Does varchar(n) use less disk space than varchar() or text?

Some time ago on Slack some person said:

varchar is better (storage efficiency), i recommend using it for less than 2048 chars, for the best : TEXT

There was discussion that followed, the person that claimed this efficiency never backed their claim, saying only that:

VARChar takes much less ‘place' than TEXT … but have to face it to believe it

and when I asked for

Show me db fiddle with queries that show this difference and I'll believe

I got:

HA ha ! i let you dig it  i've some work ….

I reacted (over reacted), and the thread died.

I know (knew?) that varchar/text are the same internally (wrote about it), but perhaps there have been recent change?

So. Since the person claiming that there is difference didn't provide any proof, I decided to look into it myself.

Since the difference is supposedly around (less than) 2kB, I decided to test string lengths:

  • 10 characters
  • 100 characters
  • 500 characters
  • 1000 characters
  • 1500 characters
  • 2000 characters
  • 2500 characters
  • 4000 characters
  • 8000 characters
  • 10000 characters

I made three tables for each length:

  • vc_n_NUMBER – having single column with varchar datatype, no length limit
  • vc_NUMBER_NUMBER – having single column with varchar(NUMBER + 2) length limit
  • text_NUMBER – having single volumn with text datatype

Then, into all three tables, I loaded the same set of 1 million values, where each value has NUMBER length, and each value is random combination of ‘a'..'z', ‘A'..'Z', ‘0'..'9′.

To do this, I wrote this simple ruby script:

chars = ("a".."z").to_a + ("A".."Z").to_a + ("0".."9").to_a
[ 10, 100, 500, 1000, 1500, 2000, 2500, 4000, 8000, 10000 ].each do |len|
  File.open("strings-#{len}.data", "w") do |f|
    1000000.times do
      f.write((1..len).map { chars.sample }.join('') + "\n")
    end
  end
end

Loaded it all and got results:

String length Table size, in bytes
vc_n_NUMBER vc_NUMBER_NUMBER text_NUMBER
10 44,335,104 44,326,912 44,335,104
100 134,373,376 134,365,184 134,373,376
500 546,324,480 546,324,480 546,324,480
1,000 1,170,653,184 1,170,653,184 1,170,653,184
1,500 1,638,891,520 1,638,891,520 1,638,891,520
2,000 2,048,598,016 2,048,598,016 2,048,598,016
2,500 2,828,615,680 2,828,615,680 2,828,615,680
4,000 5,582,495,744 5,582,495,744 5,582,495,744
8,000 8,420,007,936 8,420,007,936 8,420,007,936
10,000 12,478,431,232 12,478,431,232 12,478,431,232

It might be hard to see, but there actually IS a difference.

In case of lengths 10 and 100, table with varchar(NUMBER) column is smaller.

In both cases difference is exactly 8192 bytes.

Upon closer examination, the difference stems from the fact that for these tables PostgreSQL didn't make toast tables:

$ SELECT relname, reltoastrelid FROM pg_class  WHERE relname ~ '^(text|vc).*_(10|100|500)$';
  relname   │ reltoastrelid 
────────────┼───────────────
 vc_n_10    │        261293
 vc_10_10   │             0
 text_10    │        261301
 vc_n_100   │        261306
 vc_100_100 │             0
 text_100   │        261314
 vc_n_500   │        261319
 vc_500_500 │        261324
 text_500   │        261329
(9 ROWS)

Verified with pg_relation_size() calls that the base tables are the same size. So the 8kB difference is simply because of TOAST table that was created, though it's empty. At least that's how it looks for me.

I SO wish people would stop spreading misinformation, but apparently it's not going to happen. Luckily had some time to run tests, and provide this blogpost that shows that there is no REAL difference between storage size for text, varchar, and varchar(n).

2 thoughts on “Does varchar(n) use less disk space than varchar() or text?”

  1. Really interesting.
    I would expect also vc_500_500 to not have any toast associated, since I would think that 502 bytes per tuple can fit easily in 8kB data page, even leaving room for tids and other bits. Apparently there is a threshold over which PostgreSQL decides to toast anyway.

  2. Good post!
    @Luca Ferrari: In UTF-8 encoding, a character can have up to 4 bytes. 500 * 4 = 2000, which exceeds TOAST_TUPLE_TARGET.

Comments are closed.