October 15th, 2009 by depesz | Tags: , , , , | 4 comments »
Did it help? If yes - maybe you can help me?

Andrew Gierth (RhodiumToad on irc) wrote, and Tom Lane committed:

Assorted improvements in contrib/hstore.
 
Remove the 64K limit on the lengths of keys and values within an hstore.
(This changes the on-disk format, but the old format can still be read.)
Add support for btree/hash opclasses for hstore --- this is not so much
for actual indexing purposes as to allow use of GROUP BY, DISTINCT, etc.
Add various other new functions and operators.
 
Andrew Gierth

So, I assume that you know/understand what hstore is. If not – please read the docs, but in short it can be said that it's a hash (associative array, dictionary) in single field in PostgreSQL.

So, Andrew added some new things. But what things?

First of all – you can use hstore for group/distinct. And this is very cool. Let me show you how it works:

# CREATE TABLE test (h hstore);
# INSERT INTO test (h) VALUES ('a=>123');
# INSERT INTO test (h) VALUES ('a=>123');
# INSERT INTO test (h) VALUES ('a=>256');
# INSERT INTO test (h) VALUES ('a=>256,b=>512');
# SELECT h, count(*) FROM test group by h;
h │ count
────────────────────────┼───────
"a"=>"123" │ 2
"a"=>"256", "b"=>"512" │ 1
"a"=>"256" │ 1
(3 rows)

Of course you'll usually want some finer grained statistics, but if you have relatively low number of various fields in hstore column – it can be quite good way to check what's what.

As for new functions and operators, let me show you some:

  • -> operator (which is used to get value from hstore, for given key) works now also with arrays:
    # select 'a=>1, b=>2, c=>3, d=>4'::hstore -> ARRAY['b', 'd', 'c'];
    ?column?
    ──────────
    {2,4,3}
    (1 row)

    And even multi-dimensional arrays:

    # select 'a=>1, b=>2, c=>3, d=>4'::hstore -> ARRAY[['b', 'd'],['c', 'd']];
    ?column?
    ───────────────
    {{2,4},{3,4}}
    (1 row)

  • ? operator checks if given key exists in hstore:
    # select 'a=>1, b=>2, c=>3, d=>4'::hstore ? 'a';
    ?column?
    ──────────
    t
    (1 row)
     
    (depesz@[local]:5850) 13:43:42 [depesz]
    # select 'a=>1, b=>2, c=>3, d=>4'::hstore ? 'e';
    ?column?
    ──────────
    f
    (1 row)
  • ?| and ?& operators – which take array, and check if any or all (?| -> any, ?& -> all) keys from array are in hstore:
    # with t as ( select 'a=>1, b=>2, c=>3, d=>4'::hstore as h )
    select h ?| array['a', 'b'], h ?| array['a','x'], h ?| array['x', 'y'] from t;
    ?column? │ ?column? │ ?column?
    ──────────┼──────────┼──────────
    t │ t │ f
    (1 row)
     
    # with t as ( select 'a=>1, b=>2, c=>3, d=>4'::hstore as h )
    select h ?& array['a', 'b'], h ?& array['a','x'], h ?& array['x', 'y'] from t;
    ?column? │ ?column? │ ?column?
    ──────────┼──────────┼──────────
    t │ f │ f
    (1 row)
  • – new operator, which works as delete(hstore, …), and delete() has been extended to support deletes of arrays and hstores:
    # with t as ( select 'a=>1, b=>2, c=>3, d=>4'::hstore as h )
    select h - 'a'::text, h - ARRAY['a', 'c'], h - 'b=>2,e=>5,d=>10'::hstore from t;
    ?column? │ ?column? │ ?column?
    ──────────────────────────────┼────────────────────┼──────────────────────────────
    "b"=>"2", "c"=>"3", "d"=>"4" │ "b"=>"2", "d"=>"4" │ "a"=>"1", "c"=>"3", "d"=>"4"
    (1 row)

    Important note – please see that you have to cast to text datatype when substracting text from hstore.

  • =\> operator – new functionality – build multi-key hstores in single call:
    # select ARRAY['a','b','c'] => ARRAY['1','2','3'];
    ?column?
    ──────────────────────────────
    "a"=>"1", "b"=>"2", "c"=>"3"
    (1 row)

    or even getting slices of hstore:

    # select 'a=>1, b=>2, c=>3, d=>4'::hstore => ARRAY['a','b','e'];
    ?column?
    ────────────────────
    "a"=>"1", "b"=>"2"
    (1 row)

  • conversion from arrays to hstore:

    # select array['a','1','b','2','c','3']::hstore;
    array
    ──────────────────────────────
    "a"=>"1", "b"=>"2", "c"=>"3"
    (1 row)

    Of course it breaks for odd number of elements:

    # select array['a','1','b','2','c']::hstore;
    ERROR: array must have even number of elements

    Also – you can do the conversion from multidimensional array:

    # select array[['a','1'],['b','2'],['c','3'],['d','4']]::hstore;
    array
    ────────────────────────────────────────
    "a"=>"1", "b"=>"2", "c"=>"3", "d"=>"4"
    (1 row)

    These operations can be reversed with hstore_to_array() and hstore_to_matrix() functions.

  • Conversion of records to hstore:

    # select hstore(record) from pg_shdescription as record;
    hstore
    ───────────────────────────────────────────────────────────────────────────────
    "objoid"=>"1", "classoid"=>"1262", "description"=>"default template database"
    (1 row)

And probably some others I missed, just check the diff of sql.

  1. 4 comments

  2. # Snatch
    Oct 15, 2009

    Is it possible to nest hashes? Let’s say I’d like to achieve something like a => 1, b => { c=> 2, d => 3 }, e => 1

  3. Oct 15, 2009

    @Snatch:
    Values are basically just text. Not arrays, not anything. Text. Of course you can technically serialize hstore to text, and store it as value in another hstore, but it seems to be bad idea. why do you need it?

  4. # Snatch
    Oct 16, 2009

    @Depesz:
    You won’t like the idea as it’s ORM related 😉 So, ActiveRecord – the ORM of RubyOnRails is able to serialize hash to string and store it in database field. It’s often used to store properties which needn’t separate columns. Properties hash often is often nested. I was wondering if using hstore will improve performance of that solution. However if it’s just a text it doesn’t make a difference.

  5. # moltonel
    Oct 20, 2009

    Very nice functionality.

    But these operators are awfully ugly. Seriously, anybody reading a “?|” (for example) in the code will have a hard time figuring out what the ?&|=>! it is. Looks more like sql-generating middleware syntax to me. Can’t we have operators or functions named “has_key” or similar instead ?

Leave a comment