Has anyone adopted BigQuery's generative AI functions in their ETL or ad hoc processes who can speak to the cost implications of using them? We are considering ml.generate_text to enrich our data, playing around with Gemini 1.5 Flash. The direct integration with BigQuery would allow for easy application within our existing ETL processes. This is brand new territory for me.
I'm struggling to understand costs, and the best practices for reducing them, when it comes to generative AI. An example use case would be taking a distinct list of customer email domains and determining which are public (i.e. gmail, yahoo, outlook, etc.) and which would be considered private. For this, let's say there are over 100,000 distinct domains to classify. The examples I have come across apply ml.generate_text to each row, such as:
with dist_domain as (
  select
    split(email, '@')[safe_offset(1)] as email_domain
  from
    subscriptions_table
  group by
    all
  order by
    count(*) desc
  limit 10 -- while testing
)
select
  prompt,
  email_domain,
  ml_generate_text_llm_result
from
  ml.generate_text(
    model `bqml-llm-model`, -- using Gemini 1.5 Flash
    (
      select
        concat(
          'Using the following email domain: (',
          email_domain,
          ') provide results in JSON with the following structure: ',
          chr(10),
          '- is_common: BOOLEAN result if the domain is widely used by the public, ',
          'comes from a common email provider, ',
          'or comes from an internet service provider. ',
          'Domains like gmail/yahoo/hotmail/etc are considered public domains. ',
          chr(10),
          'Do not perform any external lookups or searches. Base your decision on your existing knowledge. ',
          'An explanation is not required. Just the formatted results.'
        ) as prompt,
        email_domain
      from
        dist_domain
    ),
    struct(
      true as flatten_json_output
    )
  );
where the email_domain column is used within the prompt, which gets applied to each row. The prompt for gmail returns:
{
  "is_common": true
}
Is it common practice to run ml.generate_text against 100k+ rows (or more)? This seems like it would take a long time AND send a large volume of prompts. Are costs applied in aggregate against the number of tokens returned across all prompts, or are you charged for each prompt? Would BigQuery costs increase with the extra time the query spends waiting on Gemini's results?
1 Answer
Based on the BigQuery ML pricing documentation, remote model training, inference, and tuning costs are incurred for remote model types. In this case the remote model is Gemini 1.5 Flash, which falls under Google models hosted on Vertex AI, and the inference function is ML.GENERATE_TEXT, so the Vertex AI charges for the model apply in addition to the normal BigQuery charges for the query itself.
Gemini models such as Gemini 1.5 Flash are billed per modality: characters for text, plus images and seconds of video/audio where applicable. Text is charged per 1,000 characters of input (prompt) and per 1,000 characters of output (response). Characters are counted by UTF-8 code points with whitespace excluded from the count, which works out to approximately 4 characters per token. Prediction requests that lead to filtered responses are charged for the input only. At the end of each billing cycle, fractions of one cent ($0.01) are rounded to one cent. In other words, you are not charged a flat fee per prompt; the cost is the aggregate of input and output characters across all the rows you send.
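As a rough, unofficial sketch of how to size the prompt side of that bill before committing to a run, you can total the characters your prompts would contain. This assumes the dist_domain CTE from your question; note that CHAR_LENGTH counts whitespace, which billing excludes, so the whitespace-stripped figure is the closer approximation:
-- Estimate input (prompt) characters before calling ML.GENERATE_TEXT.
with dist_domain as (
  select
    split(email, '@')[safe_offset(1)] as email_domain
  from
    subscriptions_table
  group by
    all
),
prompts as (
  select
    -- same prompt as in the question, abbreviated here
    concat('Using the following email domain: (', email_domain, ') ...') as prompt
  from
    dist_domain
)
select
  count(*) as prompt_count,
  sum(char_length(prompt)) as total_chars,
  -- billing counts UTF-8 code points excluding whitespace
  sum(char_length(regexp_replace(prompt, r'\s', ''))) as billable_chars_approx
from
  prompts;
Dividing billable_chars_approx by 1,000 and multiplying by the current per-1,000-character input rate on the Vertex AI pricing page gives a ballpark for the prompt side; output characters are billed the same way but can only be measured after the fact.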
The key to cost-effective processing with generative AI is batching. Instead of sending 100,000 individual prompts, try to group multiple email domains into a single prompt. If your use case involves handling large datasets and doesn’t require immediate feedback, batch processing will be more efficient.
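As a minimal sketch of that idea, assuming the dist_domain CTE from your question and a hypothetical batch size of 50 domains per prompt, you could pack the domains into batches and ask the model for a JSON array keyed by domain:
-- Pack ~50 domains into each prompt instead of one prompt per row.
-- The batch size of 50 is an assumption; tune it against the model's
-- context window and how reliably it returns well-formed output.
with dist_domain as (
  select
    split(email, '@')[safe_offset(1)] as email_domain
  from
    subscriptions_table
  group by
    all
),
batched as (
  select
    div(row_number() over (order by email_domain) - 1, 50) as batch_id,
    email_domain
  from
    dist_domain
),
batch_prompts as (
  select
    concat(
      'For each email domain in the following list, return a JSON array of ',
      'objects with the fields "domain" (STRING) and "is_common" (BOOLEAN), ',
      'where is_common indicates a public provider such as gmail/yahoo/hotmail. ',
      'Do not perform any external lookups. Domains: ',
      string_agg(email_domain, ', ' order by email_domain)
    ) as prompt
  from
    batched
  group by
    batch_id
)
select
  prompt,
  ml_generate_text_llm_result
from
  ml.generate_text(
    model `bqml-llm-model`, -- same remote model as in the question
    (select prompt from batch_prompts),
    struct(true as flatten_json_output)
  );
With 100,000 domains this turns roughly 100,000 requests into about 2,000, and the fixed instruction text is paid for once per batch instead of once per row. The trade-off is parsing the returned array yourself (e.g. with JSON_EXTRACT_ARRAY) and handling the occasional malformed batch.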
Batch prediction is also worth considering for applying models to large datasets efficiently: instead of issuing one online request per data point, you submit a batch of inputs to Gemini as a single job, saving time and computational resources.