admin管理员组文章数量:1333710
I need to get the cartesian product of two "sets" via Excel 365 Formulas (not VBA, not Power Query). For instance, my two sets are the two left tables, and the expected cartesian product is the right table:
Cartesian product of two sets in Excel 365
How to get the cartesian product with one Excel formula?
If possible, I'd like a general answer, to work for any two sets, whatever values they contain, whatever the number of columns is in the two sets.
The two sets:
Column1 | Column2 | Column1 | Column2 | |
---|---|---|---|---|
4500005010 | A | PO start | D | |
4500005011 | B | header merge | E | |
4500005012 | C | PO activate | F |
I need to get the cartesian product of two "sets" via Excel 365 Formulas (not VBA, not Power Query). For instance, my two sets are the two left tables, and the expected cartesian product is the right table:
Cartesian product of two sets in Excel 365
How to get the cartesian product with one Excel formula?
If possible, I'd like a general answer, to work for any two sets, whatever values they contain, whatever the number of columns is in the two sets.
The two sets:
Column1 | Column2 | Column1 | Column2 | |
---|---|---|---|---|
4500005010 | A | PO start | D | |
4500005011 | B | header merge | E | |
4500005012 | C | PO activate | F |
4 Answers
Reset to default 3A generalized option could be:
=CROSSJOIN(Table1,Table2)
Where CROSSJOIN
is a custom function defined in Name Manager:
=LAMBDA(array1,array2,
LET(
i, SEQUENCE(ROWS(array1)),
j, SEQUENCE(, ROWS(array2)),
HSTACK(
CHOOSEROWS(array1, TOCOL(IF(j, i))),
CHOOSEROWS(array2, TOCOL(IF(i, j)))
)
)
)
Source: https://stackoverflow/a/78244826/22459970
If you also wanted the option to handle header rows, CROSSJOIN
can be defined as follows:
=LAMBDA(array1,array2,[headers],
LET(
incl_h, CHOOSE(headers + 1, 0, 1, 0, 1),
show_h, CHOOSE(headers + 1, 0, 0, 1, 1),
a, DROP(array1, incl_h),
b, DROP(array2, incl_h),
i, SEQUENCE(ROWS(a)),
j, SEQUENCE(, ROWS(b)),
v, HSTACK(
CHOOSEROWS(a, TOCOL(IF(j, i))),
CHOOSEROWS(b, TOCOL(IF(i, j)))
),
IF(
show_h,
VSTACK(
IF(
incl_h,
HSTACK(
TAKE(array1, 1),
TAKE(array2, 1)
),
HSTACK(
"tbl1.Col" & SEQUENCE(, COLUMNS(a)),
"tbl2.Col" & SEQUENCE(, COLUMNS(b))
)
),
v
),
v
)
)
)
The optional [headers] argument will behave in a similar fashion to that of the [field_headers] argument of GROUPBY
, where the following options are accepted:
- 0 | No
- 1 | Yes but don't show
- 2 | No but generate
- 3 | Yes and show
For example:
=CROSSJOIN(Table1,Table2,2)
Or:
=CROSSJOIN(Table1[#All],Table2[#All],3)
Please note, the line breaks and spaces/indentations were used for improved readability. They can be removed when defining the function in Name Manager.
Without using LAMBDA()
helper function:
=LET(
a, TOROW(DataTwo[Column1]&"|"&DataTwo[Column2]),
b, TOCOL(DataOne[Column1]&"|"&DataOne[Column2]&"|"&a),
TEXTSPLIT(TEXTAFTER("|"&b,"|",SEQUENCE(,4)),"|"))
• Or, Option Two:
=LET(
a, ROWS(DataTwo),
b, ROWS(DataOne),
c, CHOOSEROWS(DataOne,INT((SEQUENCE(a*b)-1)/a)+1),
d, CHOOSEROWS(DataTwo,MOD((SEQUENCE(a*b)-1),a)+1),
HSTACK(c,d))
Change variables at the end to evaluate:
=LET(
a, ROWS(DataTwo),
b, ROWS(DataOne),
c, SEQUENCE(a*b),
d, CHOOSEROWS(DataOne,INT((c-1)/a)+1),
e, CHOOSEROWS(DataTwo,MOD((c-1),a)+1),
f, HSTACK(d,e),
f)
There's a solution based on MAKEARRAY
:
=MAKEARRAY(
ROWS(Set_1)*ROWS(Set_2),
COLUMNS(Set_1)+COLUMNS(Set_2),
LAMBDA(row,col,
LET(set_num,IF(col<=COLUMNS(Set_1),1,2),CHOOSE(set_num,
INDEX(Set_1,QUOTIENT(row-1,3)+1,col),
INDEX(Set_2,MOD(row-1,3)+1,col-COLUMNS(Set_1))))))
Cartesian Product
=LET(data1,Table11,data2,Table12,
Repeat,LAMBDA(fs,ss,by_col,
TOCOL(IF(TOROW(ss),fs),,by_col)),
fs,SEQUENCE(ROWS(data1)),
ss,SEQUENCE(ROWS(data2)),
HSTACK(CHOOSEROWS(data1,Repeat(fs,ss,0)),
CHOOSEROWS(data2,Repeat(ss,fs,1))))
TOCOL/IF/SEQUENCE
Efficiency
- Not once have I encountered this combo being more efficient than the
INT/MOD
combo. Not when using them as they are (no visible difference) but when combined with other functions. - In this particular case, on two tables of 1000 rows by 5 columns, this combo took 10s while the
INT/MOD
solution by Mayukh Bhattacharya (the nice one withc, SEQUENCE(a*b),
) took 14s. YourMAKEARRAY
solution took 23s.
IF/SEQUENCE
- To get more familiar with this combination, feel free to download Repeat Column from my
OneDrive
.
本文标签: How to get the cartesian product (cross join) of two sets via Excel FormulasStack Overflow
版权声明:本文标题:How to get the cartesian product (cross join) of two sets via Excel Formulas - Stack Overflow 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://www.betaflare.com/web/1742335362a2455515.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
LAMBDA()
helper:=TEXTSPLIT(TEXTAFTER("|"&TOCOL(DataOne[Column1]&"|"&DataOne[Column2]&"|"&TOROW(DataTwo[Column1]&"|"&DataTwo[Column2])),"|",{1,2,3,4}),"|")
– Mayukh Bhattacharya Commented Nov 20, 2024 at 19:24