# Converting list of integers into list of ranges

i've a query that returns sequential numbers with gaps (generate_series + join) and my question is: can is somehow construct ranges out of the returned values? sort of range_agg or something?

There was no further discussion, aside from me saying

sure you can. not trivial task, but possible.
you'd need window functions.

but it got me thinking …

First, it would be good to have test data. To avoid retyping it for every query let's make a table with some integers:

```=\$ create table test (i int4);
CREATE TABLE

=\$ insert into test (i) values (1), (2), (5), (6), (7), (8), (9), (10), (20), (30);
INSERT 0 10```

This gives us nice table input data:

```=\$ select i from test order by i;
i
----
1
2
5
6
7
8
9
10
20
30
(10 rows)```

Based on data above, we'd like to get ranges:

• 1 to 2
• 5 to 10
• 20
• 30

First, we need a way to tell if given row starts new range. This can be trivially done by checking whether previous value of i smaller by 1:

```=\$ select
i,
i - lag(i) over (order by i) as diff
from test
order by i;
i  |  diff
----+--------
1 | [null]
2 |      1
5 |      3
6 |      1
7 |      1
8 |      1
9 |      1
10 |      1
20 |     10
30 |     10
(10 rows)```

Nice. If diff is null or > 1 then current i starts new range.

Now, let's make it so that all rows in given range will have the same group id. To do so, I'll assign group_id (being just copy of value from i) for first rows in range, and in other cases, I'll make it null:

```=\$ with with_group_id as (
select
i,
case
when i - lag(i) over (order by 1) < 2 then null
else i
end as group_id
from test
)
select * from with_group_id
order by i;
i  | group_id
----+----------
1 |        1
2 |   [null]
5 |        5
6 |   [null]
7 |   [null]
8 |   [null]
9 |   [null]
10 |   [null]
20 |       20
30 |       30
(10 rows)```

Great. Now, let's fill group_id for non-first rows:

```=\$ with with_group_id as (
select
i,
case
when i - lag(i) over (order by 1) < 2 then null
else i
end as group_id
from test
)
select
i,
max(group_id) over (order by i) as group_id
from with_group_id
order by i;
i  | group_id
----+----------
1 |        1
2 |        1
5 |        5
6 |        5
7 |        5
8 |        5
9 |        5
10 |        5
20 |       20
30 |       30
(10 rows)```

Great. Now, all I need is to get ranges using normal grouping:

```=\$ with with_group_id as (
select
i,
case
when i - lag(i) over (order by 1) < 2 then null
else i
end as group_id
from test
), full_group_id as (
select
i,
max(group_id) over (order by i) as group_id
from with_group_id
)
select
min(i) as range_from,
max(i) as range_to
from
full_group_id
group by group_id
order by group_id;
range_from | range_to
------------+----------
1 |        2
5 |       10
20 |       20
30 |       30
(4 rows)```

This can be also used to generate proper PostgreSQL ranges:

```=\$ with with_group_id as (
select
i,
case
when i - lag(i) over (order by 1) < 2 then null
else i
end as group_id
from test
), full_group_id as (
select
i,
max(group_id) over (order by i) as group_id
from with_group_id
)
select
int4range(
min(i),
max(i),
'[]'
) as range
from
full_group_id
group by group_id
order by group_id;
range
---------
[1,3)
[5,11)
[20,21)
[30,31)
(4 rows)```

Pretty cool, isn't it?

## 5 thoughts on “Converting list of integers into list of ranges”

1. Mark Guyatt says:

Long time no write.

I had to solve exactly this problem in a work related task. At the time, the approach I took was direct SQL, no CTEs at all. I must admit the CTEs definitely make it easier to explain the solution here.

The approach I took was (using your table):

```SELECT
int4range(
i,
CASE WHEN last THEN i ELSE j END,
'[]'
) AS range
FROM (
SELECT
i,
first,
last,
LEAD(i) OVER (ORDER BY i) AS j
FROM (
SELECT
i,
COALESCE(i - LAG(i) OVER(ORDER BY i), 0)  1 AS first,
COALESCE(LEAD(i) OVER (ORDER BY i) - i, 0)  1 AS last
FROM test
) a
WHERE first OR last
) b
WHERE first
ORDER BY i;```

This has the nice side effect of reducing the amount of work to be done when there are large contiguous ranges.

2. Corey says:

I did a similar thing a while back, but for dates. The var `dates` was a `date[]` for this code:

```select  daterange(min(d), max(d)+1) as date_ranges
from    (   select d, sum(adj_num) over(order by d) as range_group
from (  select  e.d,
case
when lag(d) over(order by d) = d-1 then 0
else 1
from (  select  distinct d.d
from    unnest(dates) as d(d)
) e
) f
) g
group by g.range_group;```
3. Dan Scott says:

I believe that this solution is simplest (sorry, no idea where you are getting your nice formatting):

```SELECT MIN(i) AS range_from, MAX(i) AS range_to
FROM
(SELECT i, i-row_number() over (order by i) AS group_num FROM test ORDER BY i) AS t
GROUP BY group_num
ORDER BY group_num```
4. depesz says:

@Dan:

This is beautiful. Abuses certain facts about data, but works, and is much simpler. Thanks.

5. Dan Scott says:

Thanks. 🙂
As I look at it again, it appears that I really could have dropped the second “ORDER BY i” inside the subselect. After all, the aggregate functions don’t care what order the rows of t appear in. I probably put it there to aid with verifying the approach, as I was working on it. So, if I am correct, it could be made slightly simpler, still.