Writing sprintf, and overcoming limitations in pl/perl

Having new VARIADIC functions, I decided it would be cool to be able to write sprintf() function.

Basically the idea is simple, plperl function, which takes “format", and list of arguments and returns generated output.

Of course – I could have done it with array as second argument, but it just doesn't look good:

SELECT sprintf('This is %s data from %s place.', ARRAY['some', 'some']);

I would prefer to write it as:

SELECT sprintf('This is %s data from %s place.', 'some', 'some');

And that's why VARIADIC is helpful.

Knowing that all variadic arguments will be passed as array to function is OK. Is it? Actually – not. That's why:

# CREATE OR REPLACE FUNCTION testit(VARIADIC TEXT[]) RETURNS VOID AS $$
>> die("-->[ " . $_[0] . " ]<--");
>> $$ LANGUAGE plperl;
CREATE FUNCTION
 
# SELECT testit('a', 'b', 'c');
ERROR:  error FROM Perl FUNCTION "testit": -->[ {a,b,c} ]<-- at line 2.

Apparently arrays are passed to plperl function in their “textout" format.

This might looks simple to parse, but actually it is not.

And since plperl doesn't allow me to load/use another modules, if I would ever wrote a function to parse the “array" into perl array, I would have to put this code into every function that will ever use it.

Is there no hope? Luckily, there is.

I can write the function to convert “arrays" into perl arrays, and store it's reference in %_SHARED:

CREATE OR REPLACE FUNCTION prepare_array_parser() RETURNS bool AS $BODY$
$_SHARED{'array_parser'} = sub {
    my $string = shift || '';
    RETURN unless $string =~ s/\A\{(.*)\}\z/$1/;
    my @elements = ();
    my $current = '';
    my $in_quotes = undef;
    my @chars = split //, $string;
    FOR (my $i = 0 ; $i < scalar @chars; $i++) {
        my $char = $chars[$i];
        IF ($char eq ',') {
            IF ($in_quotes) {
                $current .= $char;
            } ELSE {
                push @elements, $current;
                $current = '';
            }
        } elsif ( $char eq '"') {
            $in_quotes = !$in_quotes;
        } elsif ( ($char eq '\\') && ( $in_quotes ) ) {
            $i++;
            $current .= $chars[$i];
        } ELSE {
            $current .= $char;
        }
    }
    push @elements, $current;
    RETURN \@elements;
};
RETURN 1;
$BODY$ LANGUAGE plperl;

Now, having it available in %_SHARED, I can simply write sprintf as:

CREATE OR REPLACE FUNCTION sprintf(IN TEXT, VARIADIC TEXT[]) RETURNS TEXT AS $BODY$
my ($text, $args_as_string) = @_;
my $args = $_SHARED{'array_parser'}->($args_as_string);
RETURN sprintf $text, @$args;
$BODY$ LANGUAGE plperl;

And let's test it:

# SELECT sprintf('Current (%s) version is: %s', now()::text, version());
                                                                        sprintf
-------------------------------------------------------------------------------------------------------------------------------------------------------
 CURRENT (2008-08-01 12:35:59.737822+02) version IS: PostgreSQL 8.4devel ON i686-pc-linux-gnu, compiled BY GCC gcc (GCC) 4.2.3 (Ubuntu 4.2.3-2ubuntu7)
(1 ROW)

There is just one slight problem.

When I create new connection, and call sprintf function, it bails out:

# SELECT sprintf('Current (%s) version is: %s', now()::text, version());
ERROR:  error FROM Perl FUNCTION "sprintf": Undefined subroutine &main:: called at line 3.

This is because the array_parser function in %_SHARED is not defined.

So, I should call select prepare_array_parser(); before it. But, do I really have to remember it? No. I can modify sprintf function to:

And now the sprintf() works even when called as the first function after connect.

Possible usecases? Think simplifying calls to execute() function from grantall example.

SELECT EXECUTE(sprintf('GRANT ALL ON TABLE %s.%s TO depesz', n.nspname, c.relname)) FROM ...

But I can bet that you can find more interesting uses.

9 thoughts on “Writing sprintf, and overcoming limitations in pl/perl”

  1. Side comment: I think that array handling in pl/* (with the exception of plpgsql) should be rewritten. The way it is now makes it nearly impossible to use.

    Requiring users to write “csv parsers” to pass array to function is nonsensical.

  2. It would be far simpler to just extend PL/Perl. Right now, your return value if an array reference is automatically turned into a string ‘{“…”,…}’ which is handed off to PostgreSQL’s array parser. If you planned to continue working within the current constraints, you could do simple PL/Perl hacking and add a useful array parser on input.

    Better would be to not have to do this foo[] -> text and foo[] -> text stuff but until that day…

  3. I agree with jjore on writing a PL/Perl patch around array parameter processing. In the meantime though.. If you have a simpler array of text values then you can use a simpler “parser”:

    $arrtext =~ s/\{(.*)\}/\1/; # remove curly braces
    @arr = split(‘,’,$arrtext); # create the array

    Cheers,
    /cm

  4. @cod3monk3y:
    unfortunatelly it is not so easy.
    while your approach will work for array like this:

    # select ARRAY[‘a’, ‘b’, ‘c’];
    array
    ———
    {a,b,c}
    (1 row)

    it will not work correctly for things like this:

    # select ARRAY[‘1,23’, ‘a b’, ‘a, c’];
    array
    ———————–
    {“1,23″,”a b”,”a, c”}
    (1 row)

  5. sprintf, really? you gotta be kidding. the true sprintf allows mixed arguments of various string, integer, hex, etc types to be passed in arbitrary order. Until that day, i would use

    select ‘Current (‘ || now()::text || ‘) version is: ‘|| version();

  6. @Kostas:
    sorry, I don’t understand your sarcasm.

    You can pass any argument to function – as long as it is possible to convert it to string – which is not too strange limitation given the fact that in the end you will get string from the function.

    as for arbitrary order – hmm .. the implementations of sprintf that i’ve seen required format first, arguments later. so i don’t really see how is my approach at sprintf “a joke”. care to explain?

    and don’t you think that:

    select sprintf(‘Current (%s) version is: %s’, now()::text, version()); is more readable than yours approach with || ?

  7. I wrote a sprintf variant in plpgsql in case it interests you or your readers:

    http://wiki.postgresql.org/wiki/Sprintf

    It’s not very pretty and it still requires everything to be casted to text, but it doesn’t require the initializing helper function.

    I disagree with Kostas that the concatenation stuff is “good enough”. It’s actually barely readable. Someday we will have “anyelement2”, “anyelement3” etc in Postgres that will allow us passing args of arbitrary types, but until that day arrives we have to live with what we have (what we actually need here is arrays of mixed types).

  8. David Wheeler wrote better version of ‘parse_array’ sub – it handles nulls in array:

        $_SHARED{parse_array} = sub {
            my $string = shift || '';
            return unless $string =~ s/\A\{(.*)\}\z/$1/;
            my @elements;
            my $current = '';
            my $in_quotes = undef;
            my @chars = split //, $string;
            for (my $i = 0 ; $i < @chars; $i++) {
                my $char = $chars[$i];
                if ($char eq ',') {
                    if ($in_quotes) {
                        $current .= $char;
                    } else {
                        push @elements, $current eq 'NULL' && $chars[$i-1] ne '"' ? undef : $current;
                        $current = '';
                    }
                } elsif ( $char eq '"') {
                    $in_quotes = !$in_quotes;
                } elsif ( $char eq '\\' && $in_quotes ) {
                    $i++;
                    $current .= $chars[$i];
                } else {
                    $current .= $char;
                }
            }
            push @elements, $current eq 'NULL' && $chars[-1] ne '"' ? undef : $current;
            return \@elements;
        };

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.