Performance of BigQuery API Client vs BigQuery BigFrames
In a BigQuery notebook, if I want to run a query and store the results into a dataframe, I can use the API client:

from google.cloud import bigquery

client = bigquery.Client()
df = client.query(query).to_dataframe()

or I can use BigFrames:

import bigframes.pandas as bpd

df = bpd.read_gbq(query)

Which of these libraries is more performant?



1 Answer

BigFrames is generally the more performant choice because of its scalability, its optimized execution for larger datasets, and its use of Arrow and the BigQuery Storage Read API. It also keeps computation in BigQuery rather than in your notebook, so as your data warehouse grows and performance becomes more important, BigFrames is the recommended option.
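As a rough illustration of how BigFrames keeps work on the server side, here is a minimal sketch; the project ID is a placeholder, and the public dataset and aggregation are only examples:

import bigframes.pandas as bpd

# Placeholder project ID.
bpd.options.bigquery.project = "my-project"

# read_gbq returns a BigFrames DataFrame; the query and the
# transformations below run inside BigQuery, not locally.
df = bpd.read_gbq(
    "SELECT name, number, state "
    "FROM `bigquery-public-data.usa_names.usa_1910_2013`"
)

top_names = (
    df[df["state"] == "TX"]
    .groupby("name")["number"]
    .sum()
    .sort_values(ascending=False)
    .head(10)
)

# Only this call materializes results into a local pandas DataFrame.
local_df = top_names.to_pandas()
print(local_df)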

On the other hand, the BigQuery API client (client.query().to_dataframe()) is simpler for small queries: it gives you a structured way to query data, load data, and manage ETL work, but its performance is limited by query complexity, network latency, and the size of the result set you download to the client.
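If you stay with the API client, part of the download cost for larger results can be reduced by letting to_dataframe() use the BigQuery Storage Read API. A minimal sketch, assuming the google-cloud-bigquery-storage package is installed and using a placeholder query:

from google.cloud import bigquery

client = bigquery.Client()

query = (
    "SELECT name, number "
    "FROM `bigquery-public-data.usa_names.usa_1910_2013` "
    "LIMIT 100000"
)

# With create_bqstorage_client=True (the default in recent client versions
# when the storage package is installed), results are streamed via the
# BigQuery Storage Read API instead of the slower REST API.
df = client.query(query).to_dataframe(create_bqstorage_client=True)
print(df.shape)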
