I have two columns in SQL, IP and Agent_ID. Each Agent_ID can have different IPs associated with it, and the same IP should refer to the same user even with a different Agent_ID. How can I create a new unique identifier in SQL that puts different Agent_IDs with the same IP into one group, while also ensuring that rows with the same Agent_ID end up in the same group? For example,
user data:
IP | Agent_ID |
---|---|
192.168.1.1 | a |
192.168.1.1 | a |
192.168.2.1 | b |
192.168.2.2 | b |
192.168.3.1 | c |
192.168.3.1 | d |
Query output:
IP | Agent_ID | Group |
---|---|---|
192.168.1.1 | a | 1 |
192.168.1.1 | a | 1 |
192.168.2.1 | b | 2 |
192.168.2.2 | b | 2 |
192.168.3.1 | c | 3 |
192.168.3.1 | d | 3 |
2 Answers
Similar to the other answer, but a slightly shorter version in Snowflake.
For each agent_id, first find the minimum IP, which is an indirect way of grouping similar agents:
SELECT ip, agent_id,
MIN(IP) OVER (PARTITION BY agent_id) AS min_ip
FROM test11
which generates
IP | AGENT_ID | MIN_IP |
---|---|---|
192.168.1.1 | a | 192.168.1.1 |
192.168.1.1 | a | 192.168.1.1 |
192.168.2.1 | b | 192.168.2.1 |
192.168.2.2 | b | 192.168.2.1 |
192.168.3.1 | c | 192.168.3.1 |
192.168.3.1 | d | 192.168.3.1 |
Then we rank the rows by min_ip using DENSE_RANK(), which ranks without gaps.
Final Query
WITH groups AS (
    SELECT ip, agent_id,
           MIN(ip) OVER (PARTITION BY agent_id) AS min_ip
    FROM test11
)
SELECT ip, agent_id,
       DENSE_RANK() OVER (ORDER BY min_ip) AS group_id
FROM groups;
Output
IP | AGENT_ID | GROUP_ID |
---|---|---|
192.168.1.1 | a | 1 |
192.168.1.1 | a | 1 |
192.168.2.1 | b | 2 |
192.168.2.2 | b | 2 |
192.168.3.1 | c | 3 |
192.168.3.1 | d | 3 |
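To try the query above without a Snowflake account, here is a minimal sketch that runs it against an in-memory SQLite database from Python. Several things here are assumptions and not part of the original answer: SQLite (3.25 or newer, for window functions) stands in for Snowflake, the CTE is renamed to the hypothetical `agent_min` to avoid any clash with the `GROUPS` keyword, and the final `ORDER BY` is added only to make the output order deterministic.

```python
import sqlite3

# Sample rows from the question: (ip, agent_id).
ROWS = [
    ("192.168.1.1", "a"),
    ("192.168.1.1", "a"),
    ("192.168.2.1", "b"),
    ("192.168.2.2", "b"),
    ("192.168.3.1", "c"),
    ("192.168.3.1", "d"),
]

# Same logic as the answer's query; IPs are compared as plain text.
QUERY = """
WITH agent_min AS (
    SELECT ip, agent_id,
           MIN(ip) OVER (PARTITION BY agent_id) AS min_ip
    FROM test11
)
SELECT ip, agent_id,
       DENSE_RANK() OVER (ORDER BY min_ip) AS group_id
FROM agent_min
ORDER BY ip, agent_id
"""


def group_by_min_ip(rows):
    """Load rows into an in-memory test11 table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE test11 (ip TEXT, agent_id TEXT)")
        conn.executemany(
            "INSERT INTO test11 (ip, agent_id) VALUES (?, ?)", rows
        )
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in group_by_min_ip(ROWS):
        print(row)
```

One limitation worth checking with this harness: the grouping relies on agents that share an IP also sharing the same minimum IP. If agent c additionally had a row with 192.168.3.0, its min_ip would differ from d's, and c and d would land in different groups even though they share 192.168.3.1.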
As far as I understand the question, a user is represented by an agent (Agent_ID). If another agent has the same IP, then it is the same user.
So for each IP we can take min(Agent_ID) and call it first_Agent_ID.
This first_Agent_ID is your "unique identifier" for the user.
See the example:
select *
,(select min(Agent_ID) from data d2 where d2.IP=d1.IP) first_Agent_ID
from data d1
id | IP | Agent_ID | first_Agent_ID |
---|---|---|---|
1 | 192.168.1.1 | a | a |
2 | 192.168.1.1 | a | a |
3 | 192.168.2.1 | b | b |
4 | 192.168.2.2 | b | b |
5 | 192.168.3.1 | c | c |
6 | 192.168.3.1 | d | c |
If you want this identifier to be a number, wrap first_Agent_ID with the rank() or dense_rank() function.
select *
,dense_rank()over(order by (select min(Agent_ID) from data d2 where d2.IP=d1.IP)) GroupN
from data d1
id | IP | Agent_ID | GroupN |
---|---|---|---|
1 | 192.168.1.1 | a | 1 |
2 | 192.168.1.1 | a | 1 |
3 | 192.168.2.1 | b | 2 |
4 | 192.168.2.2 | b | 2 |
5 | 192.168.3.1 | c | 3 |
6 | 192.168.3.1 | d | 3 |
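The min(Agent_ID)-per-IP idea can be exercised locally with a small sketch like the one below. It is an illustration under assumptions, not the answer's exact query: it uses an in-memory SQLite database (3.25 or newer) from Python, computes first_Agent_ID in a derived table (the alias `t` is hypothetical) before ranking it, and adds an `ORDER BY` so the output order is deterministic.

```python
import sqlite3

# Sample rows from the question: (IP, Agent_ID).
ROWS = [
    ("192.168.1.1", "a"),
    ("192.168.1.1", "a"),
    ("192.168.2.1", "b"),
    ("192.168.2.2", "b"),
    ("192.168.3.1", "c"),
    ("192.168.3.1", "d"),
]

# first_Agent_ID = smallest Agent_ID seen on the row's IP,
# then DENSE_RANK() turns it into a gap-free group number.
QUERY = """
SELECT t.IP, t.Agent_ID,
       DENSE_RANK() OVER (ORDER BY t.first_Agent_ID) AS GroupN
FROM (
    SELECT d1.IP, d1.Agent_ID,
           (SELECT MIN(d2.Agent_ID)
            FROM data d2
            WHERE d2.IP = d1.IP) AS first_Agent_ID
    FROM data d1
) AS t
ORDER BY t.IP, t.Agent_ID
"""


def group_by_first_agent(rows):
    """Load rows into an in-memory data table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE data (id INTEGER PRIMARY KEY, IP TEXT, Agent_ID TEXT)"
        )
        conn.executemany(
            "INSERT INTO data (IP, Agent_ID) VALUES (?, ?)", rows
        )
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in group_by_first_agent(ROWS):
        print(row)
```

On the question's data this reproduces groups 1, 2, 3. If a hypothetical agent e is added on both 192.168.2.2 and 192.168.3.1, its two rows receive different group numbers, because each row is classified by its IP alone. That is the situation where a single pass is not enough.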
I added the Id column to the table just to illustrate the example. This column is not included in the queries.
You haven't answered an interesting question from @Zegarek yet. In his example, your table becomes a connectivity graph and requires a recursive solution.
Perhaps you need to consider this particular case.
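For the connectivity-graph case, one possible recursive formulation is sketched below. It is a hedged illustration, not a tested Snowflake solution: it runs in an in-memory SQLite database (3.25 or newer) from Python, the CTE names `edges`, `reach` and `comp` are hypothetical, and it relies on SQLite's `UNION` removing duplicate rows so that the recursion terminates. Other engines may require `UNION ALL` plus an explicit cycle guard. The extra agent e in `CHAIN_ROWS` is invented test data.

```python
import sqlite3

# Question data plus a hypothetical agent "e" that bridges b's and c/d's IPs.
CHAIN_ROWS = [
    ("192.168.1.1", "a"),
    ("192.168.1.1", "a"),
    ("192.168.2.1", "b"),
    ("192.168.2.2", "b"),
    ("192.168.3.1", "c"),
    ("192.168.3.1", "d"),
    ("192.168.2.2", "e"),
    ("192.168.3.1", "e"),
]

QUERY = """
WITH RECURSIVE
edges(a, b) AS (
    -- Two agents are directly linked when they share an IP.
    SELECT DISTINCT t1.Agent_ID, t2.Agent_ID
    FROM data t1
    JOIN data t2 ON t1.IP = t2.IP
),
reach(agent_id, linked) AS (
    -- Every agent reaches itself; then follow links transitively.
    SELECT DISTINCT Agent_ID, Agent_ID FROM data
    UNION
    SELECT r.agent_id, e.b
    FROM reach r
    JOIN edges e ON e.a = r.linked
),
comp AS (
    -- The smallest reachable Agent_ID labels the connected component.
    SELECT agent_id, MIN(linked) AS root
    FROM reach
    GROUP BY agent_id
)
SELECT d.IP, d.Agent_ID,
       DENSE_RANK() OVER (ORDER BY c.root) AS group_id
FROM data d
JOIN comp c ON c.agent_id = d.Agent_ID
ORDER BY d.IP, d.Agent_ID
"""


def group_transitively(rows):
    """Group rows by connected component of the agent/IP graph."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE data (IP TEXT, Agent_ID TEXT)")
        conn.executemany("INSERT INTO data VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in group_transitively(CHAIN_ROWS):
        print(row)
```

Here b, c, d and e all end up in one group because e links them through shared IPs, while a stays alone. The `reach` CTE materializes the full transitive closure, which can grow quadratically with the number of agents, so for large tables an iterative label-propagation job or a union-find pass outside SQL may be a better fit.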
Agent_ID=e with 192.168.1.2? The first agent b shares neither that IP nor Agent_ID, only the second b does. If you need all three to become one group in this scenario, you'll need iterative evaluation (recursive CTE or connect by) where at the end it's possible everyone turns out to be related transitively, through a chain of indirect links. – Zegarek Commented Feb 10 at 15:00