admin管理员组

文章数量:1124395

I have a query that groups data by week. I would like to include a partial week at the start and end of the result but the partial week does not count the records.

SELECT
    series_interval,
    coalesce(conversion_count, 0) AS conversion_count
FROM (
      SELECT
        series::date AS series_interval
      FROM
        generate_series(
          date_trunc('week', to_date('2024-12-10', 'YYYY-MM-DD')), -- Added 6 days to start date to get following week
          date_trunc('week', to_date('2024-12-23', 'YYYY-MM-DD')),
          INTERVAL '1 week'
        ) AS series
      UNION
      SELECT to_date('2024-12-04', 'YYYY-MM-DD')
    ) AS series -- Start date (middle of week)
    LEFT JOIN (
    SELECT
        date_trunc('week', "conversions"."created_at") AS agg_interval,
        count("conversions"."id") AS conversion_count
    FROM
        "conversions"
        INNER JOIN "commissions" ON "commissions"."conversion_id" = "conversions"."id"
    WHERE
        "conversions"."created_at" >= '2024-12-02' 
             AND "conversions"."created_at" <= '2024-12-23 23:59:59.999999'
    GROUP BY
        agg_interval
) c ON c."agg_interval" = series."series_interval";

The date buckets are:

4th => 8th (5 days)
9th => 15th (7 days)
16th => 22nd (7 days)
23rd => 23rd (1 day - restricted by WHERE so final partial week is what I want)

And the query result is:

series_interval conversion_count
2024-12-04 0
2024-12-09 411
2024-12-16 368
2024-12-23 32

I have a query that groups data by week. I would like to include a partial week at the start and end of the result but the partial week does not count the records.

SELECT
    series_interval,
    coalesce(conversion_count, 0) AS conversion_count
FROM (
      SELECT
        series::date AS series_interval
      FROM
        generate_series(
          date_trunc('week', to_date('2024-12-10', 'YYYY-MM-DD')), -- Added 6 days to start date to get following week
          date_trunc('week', to_date('2024-12-23', 'YYYY-MM-DD')),
          INTERVAL '1 week'
        ) AS series
      UNION
      SELECT to_date('2024-12-04', 'YYYY-MM-DD')
    ) AS series -- Start date (middle of week)
    LEFT JOIN (
    SELECT
        date_trunc('week', "conversions"."created_at") AS agg_interval,
        count("conversions"."id") AS conversion_count
    FROM
        "conversions"
        INNER JOIN "commissions" ON "commissions"."conversion_id" = "conversions"."id"
    WHERE
        "conversions"."created_at" >= '2024-12-02' 
             AND "conversions"."created_at" <= '2024-12-23 23:59:59.999999'
    GROUP BY
        agg_interval
) c ON c."agg_interval" = series."series_interval";

The date buckets are:

4th => 8th (5 days)
9th => 15th (7 days)
16th => 22nd (7 days)
23rd => 23rd (1 day - restricted by WHERE so final partial week is what I want)

And the query result is:

series_interval conversion_count
2024-12-04 0
2024-12-09 411
2024-12-16 368
2024-12-23 32

I understand why I get 0 for the first row 0 - because 2024-12-04 is not a Monday and this is how the date truncation / grouping works.

How can I get the count for 4th => 8th?

Share Improve this question edited 2 days ago Tim Fletcher asked 2 days ago Tim FletcherTim Fletcher 7,3741 gold badge36 silver badges33 bronze badges 4
  • What is date '2024-12-05' in output? You have start date '2024-02-04', filter from '2024-12-02' to '2024-12-23 23:59:59.999999'. – ValNik Commented 2 days ago
  • Apologies, I've updated the question with the correct dates / result. – Tim Fletcher Commented 2 days ago
  • I would simply disregard the partial weeks in the series relation and limit the range via the WHERE clause on conversions only. If you don't want to see 2024-12-02 in the output, use LEAST in the top-level selection – Bergi Commented 2 days ago
  • @Bergi I think I'm following. That first data bucket would aggregated on the full week (starting on 2nd Dec) but the overall data that is selected would be restricted by the conversions date range to the 5 days. I do want that first row to have 2024-12-04 as the date though. – Tim Fletcher Commented 2 days ago
Add a comment  | 

2 Answers 2

Reset to default 0

For simplicity, I'l use CTE params to set start_date and end_date for query. Further, the literal constants are not used in the query.
Source data for example see in fiddle .

Generate series (report frame) include start_date, start date of weeks between start_date and end_date.

with params as(
  select '2024-12-04 00:00:00.00000'::timestamp start_date
     ,'2024-12-23 23:59:59.999999'::timestamp end_date
)
SELECT start_date report_date ,date_trunc('week',start_date) series_interval
from params p

UNION
SELECT series::date AS report_date,series::date AS series_interval
FROM params p
cross join generate_series(
          date_trunc('week',start_date+interval '6 day'), -- Added 6 days to start date to get following week
          date_trunc('week', end_date),
          INTERVAL '1 week'
        ) AS series
report_date series_interval
2024-12-04 00:00:00 2024-12-02 00:00:00
2024-12-16 00:00:00 2024-12-16 00:00:00
2024-12-09 00:00:00 2024-12-09 00:00:00
2024-12-23 00:00:00 2024-12-23 00:00:00

Column report_date is used as output, series_interval - for join with grouped data.
Extra dates are cut off by the filter WHERE created_at between start_date and end_date.

with params as(
  select '2024-12-04 00:00:00.00000'::timestamp start_date
     ,'2024-12-23 23:59:59.999999'::timestamp end_date
)
SELECT report_date::date, series_interval,
    coalesce(conversion_count, 0) AS conversion_count
FROM (
    SELECT start_date report_date ,date_trunc('week',start_date) series_interval
    from params p
    UNION
    SELECT series::date AS report_date,series::date AS series_interval
    FROM params p
    cross join generate_series(
          date_trunc('week',start_date+interval '6 day'), -- Added 6 days to start date to get following week
          date_trunc('week', end_date),
          INTERVAL '1 week'
        ) AS series
  ) AS series -- Start date (middle of week)
    LEFT JOIN (
      SELECT
        date_trunc('week', "conversions"."created_at") AS agg_interval,
        count("conversions"."id") AS conversion_count
      FROM "conversions"
      cross join params p
--        INNER JOIN "commissions" ON "commissions"."conversion_id" = "conversions"."id"
      WHERE "conversions"."created_at" between start_date and end_date
      GROUP BY agg_interval
 ) c ON c."agg_interval" = series."series_interval"
 order by report_date;
report_date series_interval conversion_count
2024-12-04 2024-12-02 00:00:00 5
2024-12-09 2024-12-09 00:00:00 7
2024-12-16 2024-12-16 00:00:00 7
2024-12-23 2024-12-23 00:00:00 1

fiddle

If you want first data bucket would aggregated on the full week (starting on 2nd Dec) - (It's not clear why) use where filter

WHERE "conversions"."created_at" between date_trunc('week',start_date) and end_date

You have two non-standard week definitions: 1 starting other the Monday and 1 not being a full week. This complicates the matter somewhat but not excessively. The following uses a recursive CTE to generate the base periods beginning with your non-standard starting week and following periods each ending on the following each Monday and beginning the next period (it does however not deal with the short week). Once the periods are defined a second CTE handles the short week while converting each to a daterange. The main query then just counts the rows falling into each date range. See demo here.

with recursive 
     intvl(sdate,edate,mdate) as
          ( select '2024-12-04'::date                   -- initial date 
                 , next_day('2024-12-04 '::date, 'Mon') -- to Monday following
                 , '2024-12-24'::date                   -- maximun date to process
            union all 
            select edate                                -- beginning of period
                 , next_day(edate,'Mon')                -- to next Monday
                 , mdate                                -- keeping Max date
              from intvl
             where sdate < '2024-12-23'::date 
          ) -- select * from intvl;
    , wk_range(agg_interval) as
         (select daterange(sdate, least(edate,mdate), '[)')
            from intvl
         ) -- select * from wk_range;
select lower(wk.agg_interval)  "Series Interval" 
     , count(j.created_dt)     "Conversion Count" 
  from wk_range wk  
  left join ( select created_at::date created_dt
                   , c.id id, cm.conversion_id jid
                from conversions  c
                join commissions  cm
                  on cm.conversion_id = c.id
                  order by 1
            ) j
         on j.created_dt <@ wk.agg_interval
      group by lower(wk.agg_interval)
      order by lower(wk.agg_interval);

NOTE: The above makes use of a next_day() function. Oracle provides a function to do this. I found it so useful that I wrote a Postgres version to do the same. It is provided in the demo.

本文标签: postgresqlAggregating by week with a partial starting week in PostgresStack Overflow