admin管理员组

文章数量:1356924

Title: Issue Passing Dynamic Variables from Shell Script into Dataform (v2.6.7)

Post:

Hi Dataform Community,

I'm encountering issues dynamically passing variables (specifically dataset names) from a shell script into my Dataform project (version 2.6.7). Here’s a simplified and anonymized example clearly illustrating my scenario:

Scenario Overview

I have four key files:

1. datasets.csv

datasetA,datasetB

2. dataform.json

{
  "warehouse": "bigquery",
  "defaultSchema": "sample_schema",
  "defaultDatabase": "sample_project",
  "vars": {
    "PROJECT_ID": "sample_project",
    "SCHEMA": "sample_schema",
    "DATASETS": ""
  }
}

3. filter_data.sqlx

config {
  type: "table",
  schema: "sample_schema"
}

SELECT *
FROM ${ref("source_table")}
WHERE dataset_id IN (${dataform.projectConfig.vars.DATASETS.split(",").map(d => `'${d.trim()}'`).join(",")})

4. run_dataform.sh

#!/bin/bash

DATASETS=$(awk 'NF' datasets.csv | paste -sd, -)

cp dataform.json dataform.json.bak

jq --arg DATASETS "$DATASETS" '.vars.DATASETS=$DATASETS' dataform.json > temp.json && mv temp.json dataform.json

dataform run --credentials "./envs/.df-credentials.json"

mv dataform.json.bak dataform.json

Issue Explanation

When I execute the script locally, the variables update correctly, and the Dataform job runs successfully. However, once deployed in a cloud workflow environment, Dataform executes successfully but returns empty tables. It seems like the variables from datasets.csv aren't being recognized or passed correctly in the workflow context.

Desired Outcome

  • I want the compiled SQL in Dataform to reflect the dynamic datasets passed from datasets.csv:
dataset_id IN ('datasetA','datasetB')

My questions to the community

  • Is passing dynamic variables via modification of dataform.json considered best practice?
  • Is there a better, officially supported method (perhaps environment variables or CLI parameters) to dynamically pass variables into Dataform at runtime?

Any help, advice, or examples would be greatly appreciated. Thanks in advance!

本文标签: pythonIssue Passing Dynamic Variables CSV from Shell Script into Dataform (v267)Stack Overflow