Here's my Postgres table schema: db<>fiddle
create table my_table
(id, name, status, realm_id)as values
(1, 'cash', 'denied', 123)
,(2, 'check', 'closed', 123)
,(3, 'payroll','denied', 123)
,(4, 'cash', 'pending', 456)
,(5, 'deposit','suspended', 456)
,(6, 'lending','expired', 456)
,(7, 'loan', 'trial', 456)
,(8, 'crypto', 'active', 456)
,(9, 'payroll','closed', 456);
The result that I'd like to get is something like this:
realm_id | status |
---|---|
123 | inactive |
456 | active |
So there are two dimensions of aggregation:
- aggregate based on `realm_id` first;
- aggregate based on `status`: as long as the `realm_id` has a name whose status is neither `closed` nor `denied`, it'll be marked as `active`; otherwise, it's `inactive`.
I've tried to use aggregate and left outer join, but no luck thus far.
Any ideas would be greatly appreciated!
Asked Mar 29 at 16:06 by Fisher Coder; edited Mar 30 at 15:44 by Zegarek.
- `9 | payroll | closed | 456`: so why does this appear as active in your result, given the rule that status closed = inactive? – P.Salmon, Mar 29 at 16:15
- @P.Salmon Point 2 of their requirements is that there must be at least one row per realm_id with a value not in ('closed','denied'); then the status should be active for this realm_id. That's the case for realm_id 456, which even has 5 such rows. – Jonas Metzler, Mar 29 at 16:21
3 Answers
You can use `EVERY` combined with a `GROUP BY` clause to check whether every row per realm_id has status closed or denied. Using a `CASE` expression, you can then set the status to active or inactive.
SELECT
realm_id,
CASE
WHEN EVERY(status in ('closed', 'denied'))
THEN 'inactive'
ELSE 'active' END AS status
FROM my_table
GROUP BY realm_id
ORDER BY realm_id;
I would even prefer to skip the `CASE` expression and simply return t or f in a column named inactive, but that's a matter of taste:
SELECT
realm_id,
EVERY(status in ('closed', 'denied')) AS inactive
FROM my_table
GROUP BY realm_id
ORDER BY realm_id;
See this db<>fiddle with your sample data.
You can use the `bool_or()` aggregate function.
`bool_or()` aggregates boolean values across the rows of a group: it returns true if at least one value in the group is true, and false if all values are false.
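As a quick illustration of how the boolean aggregates behave, here is a minimal sketch on literal values. The column aliases are made up for the example; the one NULL input is there to show that these aggregates skip NULLs rather than treating them as false.

```sql
-- Three inputs: true, false and NULL (NULL inputs are ignored by these aggregates).
select bool_or(v)  as any_true    -- true: at least one non-null input is true
     , bool_and(v) as all_true    -- false: one non-null input is false
     , every(v)    as all_true_2  -- false: every() is the SQL-standard spelling of bool_and()
from (values (true), (false), (null::boolean)) as t(v);
```

Because `every`/`bool_and` and `bool_or` are related by negation, `bool_or(x)` gives the same answer as `not every(not x)` whenever the group has at least one non-null input.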
Query:
SELECT realm_id,
CASE WHEN bool_or(status NOT IN ('closed', 'denied'))
THEN 'active' ELSE 'inactive'
END AS status
FROM my_table
GROUP BY realm_id
ORDER BY realm_id;
Output:
realm_id | status |
---|---|
123 | inactive |
456 | active |
[fiddle](https://dbfiddle.uk/gkfZmwlr)
For speed, add an expression index:
create index on my_table(realm_id asc,(status not in('closed','denied')) desc);
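This lets you emulate an index skip scan (also known as a loose index scan) with a recursive CTE, which visits each distinct `realm_id` once instead of reading every row: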
with recursive cte as(
(select realm_id
, exists(select from my_table as b
where status not in('closed','denied')
and a.realm_id=b.realm_id) as is_active
from my_table as a
order by 1 limit 1
)union all
select s.*
from cte cross join lateral
(select c.realm_id
, exists(select from my_table as d
where status not in('closed','denied')
and c.realm_id=d.realm_id) as is_active
from my_table as c
where c.realm_id>cte.realm_id
order by 1 limit 1) as s
where cte.realm_id is not null)
select*from cte
where realm_id is not null;
Simple aggregation with `every`/`bool_and`/`bool_or` is nice, short and clear:
select realm_id
, not bool_and(status=any('{closed,denied}')) as is_active
from my_table
group by 1;
Unfortunately, it can't use an index and has to run a sequential scan, which is why it takes 575ms on average on 2M rows. The skip scan runs more than 50x faster, under 11ms, using index-only scans:
demo at db<>fiddle
query_name | avg | min | max | stddev |
---|---|---|---|---|
index skip scan | 00:00:00.010736 | 00:00:00.010207 | 00:00:00.014184 | 00:00:00.000849 |
bool_and, every, bool_or | 00:00:00.575871 | 00:00:00.564045 | 00:00:00.647492 | 00:00:00.014413 |
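These timings depend on the data and the machine, so it's worth checking which plan you actually get on your own table. A minimal sketch, assuming the `my_table` from the question and the expression index above:

```sql
-- Refresh planner statistics and the visibility map;
-- index-only scans fall back to heap fetches on pages not marked all-visible.
vacuum analyze my_table;

-- Show the chosen plan together with real run time and buffer usage.
explain (analyze, buffers)
select realm_id
     , bool_or(status not in ('closed','denied')) as is_active
from my_table
group by realm_id;
```

In the output, a `Seq Scan` node means the whole table is being read, while `Index Only Scan` nodes with a low `Heap Fetches` count are what the skip scan relies on. Running the same `explain` on the recursive CTE lets you compare the two side by side.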
What would be even faster is having that information collected ahead of time. You can use a tally table to track which `realm_id` is active, keep a count of its records, and hold whatever other aggregate info you regularly need to check. It requires a simple trigger to maintain it by running an increment/decrement, boolean flip, or append/pop after each DML statement on the table, but the information is then always readily available with no need to recalculate it.
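One way the tally-table idea could look is sketched below. The names `realm_tally`, `my_table_tally_trg` and `my_table_tally` are hypothetical, `realm_id` is assumed to be an integer as in the sample data, and the sketch only covers row-level `INSERT`/`UPDATE`/`DELETE` (not `TRUNCATE`). It is an illustration of the approach, not a tuned implementation.

```sql
-- Hypothetical summary table: one row per realm_id with running counts.
create table realm_tally (
    realm_id     int primary key,
    total_count  bigint not null default 0,
    active_count bigint not null default 0  -- rows whose status is not closed/denied
);

-- Backfill from the existing rows (do this while nothing else writes to my_table).
insert into realm_tally (realm_id, total_count, active_count)
select realm_id
     , count(*)
     , count(*) filter (where status not in ('closed','denied'))
from my_table
group by realm_id;

create function my_table_tally_trg() returns trigger
language plpgsql as $$
begin
    -- UPDATE and DELETE: take the old row's contribution back out.
    if tg_op in ('UPDATE', 'DELETE') then
        update realm_tally
           set total_count  = total_count - 1
             , active_count = active_count
                 - coalesce(old.status not in ('closed','denied'), false)::int
         where realm_id = old.realm_id;
    end if;
    -- INSERT and UPDATE: add the new row's contribution, creating the tally row if needed.
    if tg_op in ('INSERT', 'UPDATE') then
        insert into realm_tally as t (realm_id, total_count, active_count)
        values ( new.realm_id
               , 1
               , coalesce(new.status not in ('closed','denied'), false)::int)
        on conflict (realm_id) do update
           set total_count  = t.total_count + 1
             , active_count = t.active_count + excluded.active_count;
    end if;
    return null;  -- the return value of an AFTER row trigger is ignored
end $$;

-- "execute function" needs PostgreSQL 11 or later; older versions spell it "execute procedure".
create trigger my_table_tally
after insert or update or delete on my_table
for each row execute function my_table_tally_trg();

-- Reading the result no longer touches my_table at all.
select realm_id
     , active_count > 0 as is_active
from realm_tally
where total_count > 0
order by realm_id;
```

The trade-off is that every write to `my_table` now also updates one row in the tally table, so concurrent writers to the same `realm_id` will queue on that row's lock. Whether that cost is acceptable depends on how write-heavy the table is compared to how often the status is read.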