SQL is obviously not meant for this but I'm wondering if it's possible to do this with SQL as a sort of challenge. Given an arithmetic parse tree such as:
7-2*3
Which could be represented as:
-
/ \
7 *
/ \
2 3
And stored in SQL as the following (assuming +-/* binary operations on integers only):
CREATE TABLE expression_tree AS (
SELECT * FROM VALUES
(1, 1, '-', NULL, NULL), -- "-" at top
(1, 2, '7', 'L', 1),
(1, 3, '*', 'R', 1),
(1, 4, '2', 'L', 3),
(1, 5, '3', 'R', 3)
AS tmp(expression_id, node_id, value, side, parent_node_id)
)
┌───────────────┬─────────┬───────┬──────┬────────────────┐
│ expression_id ┆ node_id ┆ value ┆ side ┆ parent_node_id │
╞═══════════════╪═════════╪═══════╪══════╪════════════════╡
│ 1 ┆ 1 ┆ - ┆ ┆ │
│ 1 ┆ 2 ┆ 7 ┆ L ┆ 1 │
│ 1 ┆ 3 ┆ * ┆ R ┆ 1 │
│ 1 ┆ 4 ┆ 2 ┆ L ┆ 3 │
│ 1 ┆ 5 ┆ 3 ┆ R ┆ 3 │
└───────────────┴─────────┴───────┴──────┴────────────────┘
Would it be possible to write a WITH RECURSIVE cte that could return 7-2*3 for the above? To start I would have something like:
WITH RECURSIVE expression(node_id, node_value, expr_value) AS (
SELECT node_id, value, value FROM expression_tree WHERE parent_node_id IS NULL
UNION ALL
SELECT et.node_id, et.value,
CASE WHEN side='L'
THEN CONCAT(value, expr_value)
ELSE CONCAT(expr_value, value)
END
FROM expression_tree et JOIN expression e ON et.parent_node_id=e.node_id
)
SELECT * FROM expression
┌─────────┬────────────┬────────────┐
│ node_id ┆ node_value ┆ expr_value │
╞═════════╪════════════╪════════════╡
│ 1 ┆ - ┆ - │
│ 2 ┆ 7 ┆ 7- │
│ 3 ┆ * ┆ -* │
│ 4 ┆ 2 ┆ 2-* │
│ 5 ┆ 3 ┆ -*3 │
└─────────┴────────────┴────────────┘
Here is the start of it: https://sqlfiddle.com/sql-server/online-compiler?id=fad7e1c0-985a-49c6-9f1b-fbbe9b83c5eb.
My difficulty in the above is basically that I need both node child node values at the same time otherwise, it will keep overwriting the L child when it gets the R child. How could that be fixed?




Recursive aggregation is difficult if not impossible using a recursive CTE. It is often easier and more efficient to use a temp table and loop until you have no rows left to aggregate.
If you want to keep it as a single query, then in SQL Server you can use a recursive scalar function.
This is made more complicated by the fact that the table is not properly normalized. The values and operators should be in separate columns, if not separate tables. Also
expression_idseems unnecessary, as the rootnode_idshould uniquely identify an expression.Finally just select the parent node
db<>fiddle