admin管理员组

文章数量:1296390

I've a csv file named s3-durations.csv which consists of two headers called duration(numeric type) and timestamp (timestamptz).

This is how I read the csv into duckdb -

SELECT *
  FROM read_csv('s3-durations.csv',
      delim = ',',
      header = true,
      columns = {
          'duration': 'numeric',
          'timestamp': 'timestamptz'
      });

I want to add new column to the query result called start_time which is mathematically timestamp - INTERVAL $duration MILLISECONDS.

I'm trying to achieve that using below query -

SELECT duration, timestamp, timestamp - INTERVAL duration MILLISECONDS
  FROM read_csv('s3-durations.csv',
      delim = ',',
      header = true,
      columns = {
          'duration': 'numeric',
          'timestamp': 'timestamptz'
      });

The error is -

Parser Error: syntax error at or near "MILLISECONDS" LINE 1: ...mestamp, timestamp - INTERVAL duration MILLISECONDS ^

But the below query is working fine which uses some constant.

SELECT duration, timestamp, timestamp - INTERVAL 5 MILLISECONDS
  FROM read_csv('s3-durations.csv',
      delim = ',',
      header = true,
      columns = {
          'duration': 'numeric',
          'timestamp': 'timestamptz'
      });

How to fix my duckdb query?

I've a csv file named s3-durations.csv which consists of two headers called duration(numeric type) and timestamp (timestamptz).

This is how I read the csv into duckdb -

SELECT *
  FROM read_csv('s3-durations.csv',
      delim = ',',
      header = true,
      columns = {
          'duration': 'numeric',
          'timestamp': 'timestamptz'
      });

I want to add new column to the query result called start_time which is mathematically timestamp - INTERVAL $duration MILLISECONDS.

I'm trying to achieve that using below query -

SELECT duration, timestamp, timestamp - INTERVAL duration MILLISECONDS
  FROM read_csv('s3-durations.csv',
      delim = ',',
      header = true,
      columns = {
          'duration': 'numeric',
          'timestamp': 'timestamptz'
      });

The error is -

Parser Error: syntax error at or near "MILLISECONDS" LINE 1: ...mestamp, timestamp - INTERVAL duration MILLISECONDS ^

But the below query is working fine which uses some constant.

SELECT duration, timestamp, timestamp - INTERVAL 5 MILLISECONDS
  FROM read_csv('s3-durations.csv',
      delim = ',',
      header = true,
      columns = {
          'duration': 'numeric',
          'timestamp': 'timestamptz'
      });

How to fix my duckdb query?

Share asked Feb 11 at 23:21 user51user51 10.2k29 gold badges91 silver badges185 bronze badges
Add a comment  | 

2 Answers 2

Reset to default 2

It can be done like this:

  with d as ( SELECT *
    FROM read_csv('s3-durations.csv',
        delim = ',',
        header = true,
        columns = {
            'duration': 'numeric',
            'timestamp': 'timestamptz'
        }))
  select 
   duration, 
   timestamp, 
   timestamp - (duration || ' MILLISECONDS')::interval as start 
from d;

or like this:

  with d as ( SELECT *
    FROM read_csv('s3-durations.csv',
        delim = ',',
        header = true,
        columns = {
            'duration': 'numeric',
            'timestamp': 'timestamptz'
        }))
  select 
   duration, 
   timestamp, 
   timestamp - to_milliseconds(duration::int) as start 
from d;
D select 42 as a, INTERVAL (a) MILLISECONDS;
┌───────┬────────────────────────────────────┐
│   a   │ to_milliseconds(CAST(a AS DOUBLE)) │
│ int32 │              interval              │
├───────┼────────────────────────────────────┤
│  42   │ 00:00:00.042                       │
└───────┴────────────────────────────────────┘

You'll need to add parenthesis, the parser error is kind of horrible, I agree. The (future) switch to a PEG-based parser should remedy this, for now I don't think this can be fixed sadly.

本文标签: How to add new column to duckdb query result based on a function that uses one of the columnStack Overflow