Recursive Sql Queries

Over the past few days, I spent my time working on recursive SQL queries. These queries help in querying hierarchical or transitive data. We can implement these queries using Common Table Expressions (CTEs). A common table expression is a temporary named result set which can store result of a query inside it while the query executes. Here is how a CTE looks like.

WITH first_query AS (
  SELECT 'Learn' as column1
),
second_query AS (
 SELECT 'SQL' as column2
)
SELECT * FROM first_query, second_query

CTE Example

With recursive query, the first select statement will initialise with some base result and store it in result set. When second select runs, it will query against the rows already stored in result set and store its result back in result set. This continues recursively against rows returned from last result and stops only when the second select return empty result set. The UNION ALL merges all rows into the final result set. Below is an example on how recursive query works.

-- first "select" initialize n with 10 for the result to be used in next iteration
-- second "select" is a recursive query which read rows from previous iteration to produce a sub result
-- second "select" continue recursion until no more rows are found or condition is not satisfied
-- now union all the result from each iteration and store in cte named print_numbers
WITH RECURSIVE  print_numbers AS  (
	
  -- initialize n with 10
  SELECT 10 AS n 
	  
  UNION ALL
		
  -- continue recursion using previous result until no more rows are found
  SELECT n - 1 FROM print_numbers WHERE n > 1
	  
)
SELECT * FROM print_numbers;

This recursive query will print the following output

simple_recursive_sql_output.png

Now let’s create a table which will store relationship between a child and its parent. So creating table with the following structure.

CREATE TABLE groups (
  id int PRIMARY KEY,
  parent_group text,
  child_group text
)

After inserting some data and querying for the child-parent relationship

sql_query_parent_child_output.png Result representing hierarchical data

Basically, it looks likes this

hierarchical_chart.png

Finding All Groups In A Sub tree

This is basically top-down approach. When we need to find all children of a tree starting at particular group in sub tree and keep moving down until all children having been found. Here the first Select will get the initial rows. Then second Select will run the query from the last iteration result and continue doing recursion until no more rows are found. Then it union all the results to produce a final result set. This is one great example to query hierarchical data and recursive SQL query can be a great boon.

WITH RECURSIVE find_child_groups AS (
  SELECT child_group FROM groups
  WHERE parent_group = 'ParentGroupA'
  
  UNION ALL 
  
  SELECT g.child_group FROM groups g, find_child_groups cg
  WHERE cg.child_group = g.parent_group
)
SELECT * FROM find_child_groups;

The Result from above recursive query is

parent_child_query_output.png

Finding A Group Parents

This query will follow bottom-up approach. This will help us find all the parents of a group. Again, the first Select will get the initial rows. Then second Select will run the query from the last iteration and continue doing recursion until no more rows are found and union the final result set.

WITH RECURSIVE find_parent_groups AS (
  SELECT parent_group FROM groups
  WHERE child_group = 'GrandChildGroupA21'
  
  UNION ALL 
  
  SELECT g.parent_group FROM groups g, find_parent_groups pg
  WHERE pg.parent_group = g.child_group
)
SELECT * from find_parent_groups;

The Result from above recursive query is

group_parent_output.png

So, this way a recursive query can prove so much helpful in querying hierarchical data. We don’t need to introduce different tables to keep a direct relationship between every table data. Thus, it reduces so much overhead to keep denormalized data in correct state.

Published 5 Oct 2019

Ramblings of a Software Engineer
Deepak Sharma on Twitter