admin管理员组文章数量:1289834
I'm having some trouble formulating the correct recursive behaviour to get what I want in SQL. I'm limited to the BigQuery environment and I want to avoid using any JavaScript if I can, so I wanted to try and recreate this behaviour just using CTEs if possible. I can do exactly what I want in Python, but have been unable to do it in SQL. Yes I know SQL is not really designed for this, but I do have a reason for it so would like to try and get it to work. I want to be able to provide any length array and return a list of permutations. For example:
INPUT [1, 1, 1] OUTPUT [0, 0, 0]
INPUT [2, 1, 1] OUTPUT [[0, 0, 0], [1, 0, 0]]
INPUT [2, 1, 2] OUTPUT [[0, 0, 0], [1, 0, 0], [0, 0, 1], [1, 0, 1]]
INPUT [1, 5] OUTPUT [[0,0], [0,1], [0,2], [0,3], [0,4]]
I don't mind how the inputs and outputs are formatted as long as I can parse them one way or another. E.g. returning rows of strings are fine, each element in a separate row is fine as I can use array_agg, etc.
In Python doing this recursively or using for-loops is pretty trivial. I start with a list of zeroes and work my way along each index, incrementing the value at that index until it reaches input value-1 and store each permutation before moving on.
In SQL this is what I have so far, which is only really the first step:
WITH RECURSIVE
OrigSeq AS (SELECT [2, 1, 2] as some_numbers), -- input sequence
BaseSeq AS (SELECT ARRAY(SELECT 0 as a FROM UNNEST(OrigSeq.some_numbers)) AS base FROM OrigSeq), -- get array of 0s of same size as input sequence
Sequences AS (
(SELECT 0 AS perm_id, 0 as idx, BaseSeq.base[0] as iter, OrigSeq.some_numbers[0]-1 AS orig, BaseSeq.base as base, OrigSeq.some_numbers as num FROM OrigSeq, BaseSeq)
UNION ALL
-- increment index
SELECT perm_id, idx+1 as idx, base[idx+1] as iter, num[idx+1]-1 as orig, base, num FROM Sequences WHERE idx < ARRAY_LENGTH(num)-1
),
Seq2 AS (
SELECT * FROM Sequences
UNION ALL
-- parse iters - not doing this quite right and my brain stops functioning around here
SELECT perm_id+1, idx, base[idx]+1 as iter, orig, base, num FROM Sequences where iter <= orig
),
Seq3 AS (
Select * From Seq2
UNION ALL
SELECT perm_id, idx, iter, orig, base, num from Sequences
)
SELECT perm_id, idx, orig, iter
FROM Seq3
ORDER BY perm_id, idx
Any guidance appreciated.
I'm having some trouble formulating the correct recursive behaviour to get what I want in SQL. I'm limited to the BigQuery environment and I want to avoid using any JavaScript if I can, so I wanted to try and recreate this behaviour just using CTEs if possible. I can do exactly what I want in Python, but have been unable to do it in SQL. Yes I know SQL is not really designed for this, but I do have a reason for it so would like to try and get it to work. I want to be able to provide any length array and return a list of permutations. For example:
INPUT [1, 1, 1] OUTPUT [0, 0, 0]
INPUT [2, 1, 1] OUTPUT [[0, 0, 0], [1, 0, 0]]
INPUT [2, 1, 2] OUTPUT [[0, 0, 0], [1, 0, 0], [0, 0, 1], [1, 0, 1]]
INPUT [1, 5] OUTPUT [[0,0], [0,1], [0,2], [0,3], [0,4]]
I don't mind how the inputs and outputs are formatted as long as I can parse them one way or another. E.g. returning rows of strings are fine, each element in a separate row is fine as I can use array_agg, etc.
In Python doing this recursively or using for-loops is pretty trivial. I start with a list of zeroes and work my way along each index, incrementing the value at that index until it reaches input value-1 and store each permutation before moving on.
In SQL this is what I have so far, which is only really the first step:
WITH RECURSIVE
OrigSeq AS (SELECT [2, 1, 2] as some_numbers), -- input sequence
BaseSeq AS (SELECT ARRAY(SELECT 0 as a FROM UNNEST(OrigSeq.some_numbers)) AS base FROM OrigSeq), -- get array of 0s of same size as input sequence
Sequences AS (
(SELECT 0 AS perm_id, 0 as idx, BaseSeq.base[0] as iter, OrigSeq.some_numbers[0]-1 AS orig, BaseSeq.base as base, OrigSeq.some_numbers as num FROM OrigSeq, BaseSeq)
UNION ALL
-- increment index
SELECT perm_id, idx+1 as idx, base[idx+1] as iter, num[idx+1]-1 as orig, base, num FROM Sequences WHERE idx < ARRAY_LENGTH(num)-1
),
Seq2 AS (
SELECT * FROM Sequences
UNION ALL
-- parse iters - not doing this quite right and my brain stops functioning around here
SELECT perm_id+1, idx, base[idx]+1 as iter, orig, base, num FROM Sequences where iter <= orig
),
Seq3 AS (
Select * From Seq2
UNION ALL
SELECT perm_id, idx, iter, orig, base, num from Sequences
)
SELECT perm_id, idx, orig, iter
FROM Seq3
ORDER BY perm_id, idx
Any guidance appreciated.
Share Improve this question edited Feb 20 at 2:41 Dale K 27.5k15 gold badges58 silver badges83 bronze badges asked Feb 20 at 2:12 dgBPdgBP 1,6936 gold badges27 silver badges42 bronze badges2 Answers
Reset to default 2Recursive solution example for BigQuery SQL
Source:
WITH RECURSIVE
OrigSeq as(
select [1, 1, 1] as some_numbers -- [0, 0, 0]
union all
select [2, 1, 1] -- [[0, 0, 0], [1, 0, 0]]
union all
select [2, 1, 2] -- [[0, 0, 0], [1, 0, 0], [0, 0, 1], [1, 0, 1]]
union all
select [1, 5] as some_numbers -- [[0,0], [0,1], [0,2], [0,3], [0,4]]
)
Query:
- It is necessary to distinguish the rows somehow. Just give some kind of Id.
- Expand every array element from 0 to some_numbers[N]-1
- Recursively: Cartesian product sets {0,...,N1}X{0,...,N2}X{0,...,N3}...
- Concat back
,OrigSeqOrdered as(
select * ,row_number()over(order by (select null)) id
from OrigSeq
)
,r as(
select id, 0 lvl, nn,some_numbers,ARRAY_LENGTH(some_numbers) l
,[cast(nn as string)] arr
from OrigSeqOrdered
cross join UNNEST(GENERATE_ARRAY(0, some_numbers[0]-1)) as nn
union all
select id,lvl+1, nn,some_numbers,l
,array_concat(arr,[cast(nn as string)]) arr
from r
cross join UNNEST(GENERATE_ARRAY(0, some_numbers[lvl+1]-1)) as nn
where (lvl+1)<l
)
select id ,concat("[ [",array_to_string(array_agg(array_to_string(arr,",")),"], [:"),"] ]") sa
from r
where lvl=(l-1)
group by id
order by id
id | sa |
---|---|
1 | "[ [0,0,0] ]" |
2 | "[ [1,0,0], [:0,0,0] ]" |
3 | "[ [1,0,1], [:1,0,0], [:0,0,1], [:0,0,0] ]" |
4 | "[ [0,1], [:0,2], [:0,3], [:0,4], [:0,0] ]" |
Or as array<Int64>
r as(
select id,0 lvl, nn,some_numbers
,ARRAY_LENGTH(some_numbers) l -- array length, used for recursion stop
,[nn] arr
from OrigSeqOrdered
cross join UNNEST(GENERATE_ARRAY(0, some_numbers[0]-1)) as nn
union all
select id
,lvl+1, nn,some_numbers,l
,array_concat(arr,[nn]) arr
from r
cross join UNNEST(GENERATE_ARRAY(0, some_numbers[lvl+1]-1)) as nn
where (lvl+1)<l
)
select id ,some_numbers, arr,lvl,nn
from r
where lvl=(l-1) -- only completed array's
order by id,lvl
Details for cartesian product of sets: Set A{0,1}, set B{0,1}, Set C{0,1,2}.( For OP array[2,2,3])
Create array for some_numbers[0]
GENERATE_ARRAY(0, some_numbers[0]-1)
3->(3-1)
Output: [0,1,2]
Expand array to set of rows
UNNEST([0,1,2])
Output:{0}
{1}
{2}
Recursive Cartesian product
Anchor part of recursive query
| First recursion
| | Next recursion
↓ ↓ ↓
A x B = x C =
a,b a,b,c
{0} {0} {0,0} ---→{0,0,0}
{1} {1} {1,0} |-→{0,0,1}
{0,1} └-→{0,0,2}
{1,1} {1,0,0}
{1,0,1}
{1,0,2}
{0,1,0}
{0,1,1}
{0,1,2}
{1,1,0}
{1,1,1}
{1,1,2}
Concatenating arrays
first array_concat([0],0)->[0,0]
then array_concat([0,0],1)->[0,0,1]
Fiddle
Recursivity to append
A first recursive loop can compute all possible values at each position,
then a second recursive one will append to an array starting from the empty array.
WITH RECURSIVE
OrigSeq AS (SELECT ARRAY[2, 1, 2] AS a),
-- List the indices of our array (which will give us the number of cells too):
Pos AS (SELECT n, pos FROM OrigSeq, UNNEST(a) AS n WITH OFFSET AS pos),
-- For each position list its possible values:
Poss AS
(
SELECT n - 1 AS n, Pos.pos FROM Pos -- Per specification, "2" should give a range from 0 to 1 (2 is the count, not the max), so start one step beyond.
UNION ALL
SELECT n - 1, pos FROM Poss WHERE n > 0
),
-- Start from an empty array, generate as many arrays of 1 element as there are possibilities for cell 1, and so on:
Combi AS
(
SELECT [] a, 0 next
UNION ALL
SELECT ARRAY_CONCAT(a, [ n ]), next + 1
FROM Combi, Poss
WHERE Poss.pos = next
)
-- Keep only the completed arrays:
SELECT * from Combi WHERE next = (SELECT COUNT(1) FROM Pos);
a | lvl |
---|---|
{1,0,1} | 3 |
{0,0,1} | 3 |
{1,0,0} | 3 |
{0,0,0} | 3 |
See the PostgreSQL equivalent in a fiddle:
Recursivity to decrement
There is a way to first generate all positions as posToDecrement
,
then recursively starting from the initial numbers array as a
, joining unnest(a)
to posToDecrement
to generate the different combinations, and array_agg(a.value) over (order by pos rows between unbounded preceding and unbounded following)
, discarding rows where any value falls under 0 (min() < 0).
You'll find a fiddle proof-of-concept for PostgreSQL.
However my intuition tells there is room for improvement:
- given a start array of 3 elements, on each iteration I'd like to generate 3 triplets per previous triplet (thus 1 triplet on first iteration, 3 on the second one, 9 on the third, etc.) where each new triplet is the previous, minus 1 on 1 of its fields (thus if iteration 0 was
[3,2,4]
, iteration 1 should produce[2,2,4],[3,1,4],[3,2,3]
); relying on aUNION non-ALL
to discard duplicates (at iteration 2 two paths lead to[2,1,4]
:[2,2-1,4]
and[3-1,1,4]
). - as UNNEST generates 3 rows per triplet, and a windowed ARRAY_AGG can generate a new array of 3 elements for each of those rows, there should be no need to join anything else than the
UNNEST
peudo-table. - however I'm stuck trying to
ARRAY_AGG(prev.n - CASE WHEN v.pos = CURRENT_ROW THEN 1 ELSE 0 END) FROM UNNEST(…) v
(which already made questions on SO) - thus for now I have only the suboptimal
DISTINCT
after having joinedPos
(listing all possible indices) - another way could be to
ARRAY_CONCAT(ARRAY_AGG(v.n) OVER (… TO 1 PRECEDING), v.n - 1, ARRAY_AGG(v.n) OVER (… FROM 1 FOLLOWING))
本文标签: google bigquerySQL recursive CTEs to perm arrayStack Overflow
版权声明:本文标题:google bigquery - SQL recursive CTEs to perm array - Stack Overflow 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://www.betaflare.com/web/1741462020a2380092.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
发表评论