Skip to main content

Recursive SQL queries

·564 words·3 mins
Sql Tutorial

Did you know that SQL queries can be recusive? I didn’t! Now I do. It’s glorious.

The problem
#

I had some data in the following format:

thingsub_thing_idstart_timeend_time
11Sep 1Sep 3
12Sep 1Sep 7

And I wanted it in the following format:

thingdatenumber_of_sub_things
1Sep 12
1Sep 22
1Sep 32
1Sep 41
1Sep 51
1Sep 61
1Sep 71

How do?!

The solution: A recursive common table expression
#

Ah ha!

A recursive CTE allows you to join all the levels of a hierarchy without knowing in advance how many levels there are.

tl;dr, the query that worked was:

select
  exp.thing,
  to_date(exp.start_time) as date,
  count(*) as number_of_sub_things
  from
    (with cte
      as (select
              thing,
              sub_thing,
              start_time,
              end_time
          from some_wide_table where start_time is not null
          union all
          select thing, sub_thing,
                dateadd(day, 1, start_time),
                end_time
          from cte
          where to_date(start_time) <= (to_date(end_time)-1))
    select *
      from cte) as exp
group by 1, 2
order by dte asc;

But whaaat is happening here? I shall explain:

  1. First, create a recursive common table expression (CTE): (with cte ... from cte) as exp
  2. In that CTE, create your “anchor clause” (select thing ... from some_wide_table) and union all it to your “recursive clause” (select thing ... from cte). Your anchor clause is the first, base set of rows you will be comparing your next select statement against. Your recursive clause adds its resulting rows back to that base, and then compares that new “anchor” against the same “recursive clause” select statement.
  3. It keeps doing this until the where to_date(start_time) <= (to_date(end_time)-1)) statement is no longer satisfied. So it’s (kinda) recuuuursive! Magic! And, like all recursive functions, there is a risk that you fall into an infinite loop if your where clause never catches.
  4. Once you’ve completed recursing around, you select * from cte - this is selecting all the resulting rows that were found as the anchor and recursive clauses did their dance.
  5. I then needed to group by the top-level thing and count(*) how many sub-things were present on each date.

Voila!

The next problem
#

So I actually ended up not being able to use this, because I wanted to do this arbitrary recursion over thousands of dates, millions of things, and tens of millions of sub-things. I quickly got to a max iteration count reached error from the database; a safety mechanism it has in place to prevent infinite recursions. It wasn’t infinite, but it was certainly a bit of a combinatorial explosion.

So what do now?!

tl;dr: There was a much more efficient, non-recursive way to do this. Basically, make a table of dates on the fly and join against that. D’oh.

select
  exp.thing,
  to_date(exp.this_date) as dte,
  count(*) as number_of_sub_things
  from
    (select
      x.thing,
      x.sub_thing,
      dates.this_date
    from some_wide_table as x
    right join
      (select
        dateadd(day, seq4(), '2011-01-01') as this_date
        from table(generator(rowcount => 9*365))) as dates
    on to_date(x.start_timee) <= to_date(dates.this_date) and to_date(x.end_time) >= to_date(dates.this_date)
    ) as exp
  group by 1, 2
;

I don’t love this solution, namely because I need to decide (and hard-code) the range of dates to compare against. For now, I’m creating a list of dates from 2011-01-01 to 9 years hence (9*365). It feels hacky. But it took a fraction of the time and no combinatorial problems! So it’s good enough.

Related

Keeping it 100 in the terminal
·762 words·4 mins
Zsh Bash Tutorial
Boy, do I love my dotfiles.
Yet another dotfiles repo
·760 words·4 mins
Tutorial Bash Zsh Cool Tools
marcello
Social media is cigarettes
·1061 words·5 mins
Attention Economy Surveillance Capitalism Tutorial
Or, a moment from the eternal battle between me and the engagement algorithms. Or, a way to defang Twitter virality (tutorial).