admin管理员组文章数量:1332881
I have a table with parent-child relationship. For a dashboard I am creating, I need to create a table so the dashboard (SAS Viya) can work with the table. For that I need to get for every company id (cid) all parents and children, including the level within the structure (e.g. if it is the 3-rd level child) and a sort_chain which is a chain of numbers from the parent to that child, so be able to group/sort all sort_cid's in the dashboard.
(The name sort_ is because I will later use these columns in a dashboard to display/sort the data properly.)
In short, I'm trying to go from this table:
cid | parent_cid |
---|---|
a | |
b | a |
c | b |
d | a |
e | |
f | |
g | f |
I have a table with parent-child relationship. For a dashboard I am creating, I need to create a table so the dashboard (SAS Viya) can work with the table. For that I need to get for every company id (cid) all parents and children, including the level within the structure (e.g. if it is the 3-rd level child) and a sort_chain which is a chain of numbers from the parent to that child, so be able to group/sort all sort_cid's in the dashboard.
(The name sort_ is because I will later use these columns in a dashboard to display/sort the data properly.)
In short, I'm trying to go from this table:
cid | parent_cid |
---|---|
a | |
b | a |
c | b |
d | a |
e | |
f | |
g | f |
To this table using PROC SQL in SAS:
EDIT: I have updated sort_chain below to reflect what I believe most people feel as more 'logical'.
cid | sort_cid | sort_chain | sort_level |
---|---|---|---|
a | a | 1.0.0 | 1 |
a | b | 1.1.0 | 2 |
a | c | 1.1.1 | 3 |
a | d | 1.2.0 | 2 |
b | a | 1.0.0 | 1 |
b | b | 1.1.0 | 2 |
b | c | 1.1.1 | 3 |
c | a | 1.0.0 | 1 |
c | b | 1.1.0 | 2 |
c | c | 1.1.1 | 3 |
d | a | 1.0.0 | 1 |
d | d | 1.2.0 | 2 |
e | e | 2.0.0 | 1 |
f | f | 3.0.0 | 1 |
f | g | 3.1.0 | 2 |
g | f | 3.0.0 | 1 |
g | g | 3.1.0 | 2 |
How can I achieve this?
What I have tried:
First step was setting all missing parent_cid to itself (lets call this table1):
cid | parent_cid |
---|---|
a | a |
b | a |
c | b |
d | a |
e | e |
f | f |
g | f |
Then I tried self-joining this table to get all known daughters, but then you get an extra column, instead of extra rows:
SELECT
a.cid, a.parent_cid, b.cid AS sort_chain
FROM
table1 a
INNER JOIN
table1 b
ON
b.parent_company_id = apany_id;
Same for self-join to get all known parents.
I also tried a cross-join, but then the table gets everything x everything, which doesn't make sense.
In short: I feel like I'm thinking in the right direction, but missing something or messing something up. Hope someone can provide me an example code or point me in the right direction. Thanks!
Share Improve this question edited Nov 21, 2024 at 13:31 Maanloper asked Nov 20, 2024 at 15:39 MaanloperMaanloper 11 silver badge2 bronze badges 7- there is a CONNECT BY syntax that you may research – Randy Commented Nov 20, 2024 at 17:12
- It seems you want to generate a three part string for SORT_CHAIN because the maximum depth of any subtree is just 3. What do you want when you have a tree that is deeper? What if you add a node H that is a child of D ? – Tom Commented Nov 20, 2024 at 20:05
- CONNECT BY is not available in SAS Proc SQL. Certainly is in ORACLE. – Richard Commented Nov 20, 2024 at 22:05
- Suppose you have a chain a.b.c.d do you really want on output row for each order dependent pair along the chain? (i.e. aa ab ac ad bb bc bd cc cd dd (and backward) ba ca da cb db dc ? – Richard Commented Nov 20, 2024 at 22:12
- @Tom, the depth must be in principal unlimited. – Maanloper Commented Nov 21, 2024 at 13:25
1 Answer
Reset to default 1Use a HASH() to find the depth/level and top node for each observation.
data want;
* Load the cid->parent_cid mapping into a hash ;
if _n_=1 then do;
declare hash h(dataset:'have');
h.definekey('cid');
h.definedata('parent_cid');
h.definedone();
end;
* Load next CID value ;
set have ;
* Find their top ancestor and how deep they are ;
do depth=1 by 1 until(h.find());
top=cid;
cid=parent_cid;
end;
* Re-load the same CID value to reset CID and PARENT_CID ;
set have;
run;
To generate your counters use BY group processing. Make sure to make enough counters for the maximum depth your data has.
proc sort;
by top depth cid ;
run;
* find how many levels will be needed ;
proc sql noprint;
select max(depth) into :md trimmed from want;
quit;
* Use FIRST. processing to get the counters ;
* Concatenate to make index string ;
data numbered;
set want;
by top depth cid;
array c[&md] ;
c[depth]+1;
if first.depth then do j=depth+1 to &md;
c[j]=1;
end;
drop j ;
sort_chain=catx('.',of c[*]);
run;
Results
And if we add a new observation CID='h' and PARENT_CID='c' to your example data we will automatically get 4 level sort chain values:
本文标签:
版权声明:本文标题:sql - Get all parents and children for each record, including child-level and chain from main parent to child - Stack Overflow 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://www.betaflare.com/web/1742347879a2457892.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
发表评论