admin管理员组

文章数量:1416529

I have an inherited query outputting the below table where there is a table with sales and Fees, the dates of them, and a Client ID.

The rule is that on the date when a Sale is made, we can charge a fee, or choose not to, but what we can't do is charge a fee to a client if not attached to a sale on the same day.

I need to look to return just row 5 where we have a fee on its own for Client 1 on the 8th January 2025 and no sale. Rows 1 and 2 for Client 1 are fine as both a Sale and a Fee were present on the same day. Rows 3 and 4, and also 6 are also fine.

But I want to return row 5 on its own so that I can monitor what Fees are being charged on any given day, where a sale is not present.

Event Event Date Client ID
Sale 2025-01-04 1
Fee 2025-01-04 1
Sale 2025-01-06 3
Sale 2025-01-06 3
Fee 2025-01-08 1
Sale 2025-01-09 2

I have an inherited query outputting the below table where there is a table with sales and Fees, the dates of them, and a Client ID.

The rule is that on the date when a Sale is made, we can charge a fee, or choose not to, but what we can't do is charge a fee to a client if not attached to a sale on the same day.

I need to look to return just row 5 where we have a fee on its own for Client 1 on the 8th January 2025 and no sale. Rows 1 and 2 for Client 1 are fine as both a Sale and a Fee were present on the same day. Rows 3 and 4, and also 6 are also fine.

But I want to return row 5 on its own so that I can monitor what Fees are being charged on any given day, where a sale is not present.

Event Event Date Client ID
Sale 2025-01-04 1
Fee 2025-01-04 1
Sale 2025-01-06 3
Sale 2025-01-06 3
Fee 2025-01-08 1
Sale 2025-01-09 2

By ignoring CTEs and Unioning tables together and tackling my problem from a different angle.

Share Improve this question edited Mar 14 at 13:57 Thom A 96.3k11 gold badges61 silver badges95 bronze badges asked Mar 14 at 13:06 Paul HarperPaul Harper 573 silver badges8 bronze badges 2
  • 2 A minimal reproducible example would make this much clearer. – jarlh Commented Mar 14 at 13:09
  • Why ignore CTEs? – MatBailie Commented Mar 15 at 21:15
Add a comment  | 

2 Answers 2

Reset to default 1

It's a left-join question. Let me first provide the query:

select [Fee].[Event], [Fee].[Event Date], [Fee].[Client ID]
from yourtable [Fee]
join yourtable [Sale]
on [Fee].[Client ID] = [Sale].[Client ID] and [Fee].[Event Date] = [Sale].[Event Date] and [Fee].[Event] = 'Fee' and [Sale].[Event] = 'Sale'
where [Sale].[Client ID] is null

Explanation:

  • we self-join yourtable
  • aliasing the first tuple as Fee, for it will represent the fee on that day
  • aliasing the second tuple as Sale for it will represent the sale on that day (it's another matter that we expect the sale to be nonexistent, more on that later)
  • the join criteria is:
    • the client is the same
    • the event date is the same
    • Fee is a fee and Sale is a sale
  • finally, we check for the client of the sale to not exist, which makes sure that from all the fee-sale pairs we choose only those, where there are 0 sale matches for the day for the client

Caveat: you may have multiple fees for the same day without a sale. In such cases those fees will all be separate records in terms of the query above. If you want to group them so a record will represent a single day / client and aggregate the fees, then you will need to group by.

You can achieve this with NOT EXISTS too, as

select [Fee].[Event], [Fee].[Event Date], [Fee].[Client ID]
from yourtable [Fee]
WHERE NOT EXISTS (
    SELECT 1
    FROM yourtable [Sale]
    WHERE [Sale].[Event Date] = [Fee].[Event Date] and
          [Sale].[Client ID] = [Fee].[Client ID]
)

I'm not sure about its performance though. My choice for a left join was due to my opinion that it will be likely more performant than a not exists subquery, but I could be wrong.

If you want unmatched sales, perhaps you can do something like this:

SELECT  *
INTO #data
FROM
(
    VALUES  (N'Sale', N'2025-01-04', 1)
    ,   (N'Fee', N'2025-01-04', 1)
    ,   (N'Sale', N'2025-01-06', 3)
    ,   (N'Sale', N'2025-01-06', 3)
    ,   (N'Fee', N'2025-01-08', 1)
    ,   (N'Sale', N'2025-01-09', 2)
) t (Event,[Event Date],[Client ID])


SELECT  *
FROM    (
    SELECT  *
    ,   COUNT(CASE WHEN Event = 'Fee' THEN 1 END) OVER(PARTITION BY [Event Date], [Client ID]) AS fees
    ,   COUNT(CASE WHEN Event = 'Sale' THEN 1 END) OVER(PARTITION BY [Event Date], [Client ID]) AS sales
    FROM    #data
    ) d
WHERE   d.fees > d.sales

COUNT(CASE WHEN Event = 'Fee' THEN 1 END) OVER(PARTITION BY [Event Date], [Client ID]) calculates how many fees there is per date/client combo. The same is done for sales. Finally you can just take rows where number of fees are higher than number of sales.

This might be fasted than self-joins, especially if you want to see all related data as well.

If your index is by ClientId + EventDate and not EventDate + ClientId you can flip those in the partition by clause.

Output is:

Event Event Date Client ID fees sales
Fee 2025-01-08 1 1 0

本文标签: sql serverSQL ExceptUnion AssistanceStack Overflow