We have an extremely large S3 bucket that is divided into folders by date, e.g.:

  • dt=2024-11-19
  • dt=2024-11-20
  • dt=2024-11-21

I run queries through Redshift, and was instructed to always filter by the dt field to keep costs down.

Now I'm trying to write a script that will dynamically query the last 2 days of data, and am wondering which method would be fastest/cheapest:

SELECT ... FROM src WHERE dt >= CAST(DATEADD(day, -1, GETDATE()) AS DATE);

CREATE TEMPORARY TABLE var AS (SELECT CAST(DATEADD(day, -1, GETDATE()) AS DATE) AS yday);
SELECT ... FROM src WHERE dt >= (SELECT yday FROM var);

CREATE TEMPORARY TABLE var AS (SELECT CAST(DATEADD(day, -1, GETDATE()) AS DATE) AS yday);
SELECT ... FROM src JOIN var ON src.dt >= var.yday;

Or is there a better way that I haven't thought of yet?
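
One further variant I could try, since this runs from a script anyway, is computing the date in the script and sending it to Redshift as a plain literal, so the dt filter contains no function call, subquery, or join. Below is a minimal sketch of that idea in Python. The helper name build_query and the days_back parameter are made up for illustration, and I have not measured whether this is any cheaper than the three options above. If the partition-scan statistics in SVL_S3PARTITION apply to this table, comparing them across the variants might settle it.

```python
from datetime import date, timedelta


def build_query(today: date, days_back: int = 1) -> str:
    """Return the query text with the dt filter written as a date literal.

    build_query and days_back are hypothetical names used for this sketch;
    "SELECT ..." is the same placeholder as in the queries above.
    """
    start = today - timedelta(days=days_back)
    # isoformat() yields YYYY-MM-DD, the same shape as the dt=2024-11-20 folders.
    return f"SELECT ... FROM src WHERE dt >= '{start.isoformat()}'"


if __name__ == "__main__":
    # Yesterday and today relative to the machine running the script.
    print(build_query(date.today()))
```

The date comes from the client clock rather than GETDATE(), so near midnight the two could disagree if the script and the cluster are in different time zones.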
