In MySQL, it's possible to do something like:
-- imagine there are 100 columns
SELECT * FROM table GROUP BY col1, col2
I use this feature so much, without having to write out FIRST or MIN or whatever aggregate function would make it deterministic. Anyways, what's the simplest way to do this in BigQuery?
2 Answers
You can define a template for yourself which
- uses information_schema.columns to introspect your tables,
- constructs a query,
- and runs it using EXECUTE IMMEDIATE:
EXECUTE IMMEDIATE
(
  WITH tc AS (SELECT 'population_by_zip_2010' t, ['geo_id','zipcode'] c) -- ← When invoking you just have to mention your table name and group by columns here.
  SELECT CONCAT
  (
    'SELECT ', STRING_AGG(CASE WHEN column_name IN UNNEST(c) THEN column_name ELSE CONCAT('MIN(', column_name, ')') END),
    ' FROM `bigquery-public-data`.census_bureau_usa.', table_name, -- ← The schema here has to be adapted once for all.
    ' GROUP BY ', ARRAY_TO_STRING(c, ',')
  ) q
  FROM tc, `bigquery-public-data`.census_bureau_usa.INFORMATION_SCHEMA.COLUMNS -- ← as well as here.
  WHERE t = table_name
  GROUP BY table_name, c
);
/!\ Beware that, under on-demand pricing, each query on information_schema is billed for at least 10 MB of processed data.
Theoretically, you may even be able to create a procedure from it (the idea comes from this SO answer, but I cannot test it due to restrictions on the BigQuery playground):
CREATE OR REPLACE PROCEDURE selstargroupby(t STRING, c ARRAY<STRING>)
BEGIN
  EXECUTE IMMEDIATE
  (
    -- No need for table tc here; its contents are the procedure's parameters.
    SELECT CONCAT
    …
    GROUP BY table_name, c
  );
END;
CALL selstargroupby('population_by_zip_2010', ['geo_id','zipcode']);
In BigQuery, SQL syntax is stricter than in MySQL when using GROUP BY. In MySQL, you can use SELECT * in combination with GROUP BY (when the ONLY_FULL_GROUP_BY SQL mode is disabled), but BigQuery requires that any column in the SELECT clause be either:
- part of the GROUP BY clause, or
- used in an aggregate function.
So, if you want to select all columns from a table and group by certain columns (e.g., col1, col2), you cannot use SELECT * without applying aggregate functions to the other columns.
The query below uses aggregate functions with GROUP BY:
SELECT
  col1,
  col2,
  ANY_VALUE(col3) AS col3,
  ANY_VALUE(col4) AS col4,
  -- Continue with other columns
FROM
  your_table
GROUP BY
  col1, col2;
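When listing every column is impractical, one variation is to aggregate the whole row instead of each column. This is a minimal sketch and not part of the answer above: it assumes that ANY_VALUE accepts the table alias t as a STRUCT and that SELECT AS VALUE turns that STRUCT back into ordinary columns. The inline your_table data is hypothetical and only makes the query self-contained.

```sql
-- Hypothetical inline data standing in for your_table.
WITH your_table AS (
  SELECT 1 AS col1, 'a' AS col2, 10 AS col3, 'x' AS col4 UNION ALL
  SELECT 1, 'a', 20, 'y' UNION ALL
  SELECT 2, 'b', 30, 'z'
)
-- ANY_VALUE(t) picks one arbitrary whole row (as a STRUCT) per group;
-- SELECT AS VALUE exposes the STRUCT's fields as the result columns.
SELECT AS VALUE ANY_VALUE(t)
FROM your_table AS t
GROUP BY col1, col2;
```

As with ANY_VALUE on individual columns, which row is returned for each (col1, col2) group is arbitrary. The difference is that all returned values come from the same source row.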
row_number() over (partition by col1, col2)? – jarlh, commented Apr 2 at 6:50
with tbl as (select 1 union all select 2) select * from tbl limit 1. – David542, commented 2 days ago
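The ROW_NUMBER() suggestion from the comments could be sketched as follows. This is an illustration under assumptions, not a tested recipe from the thread: the inline your_table data and the ORDER BY col3 tie-breaker are hypothetical, and QUALIFY is used to filter on the window function so that SELECT * keeps working. The WHERE TRUE is a harmless filter, included in case the BigQuery version in use requires QUALIFY to be accompanied by WHERE, GROUP BY, or HAVING.

```sql
-- Hypothetical inline data standing in for your_table.
WITH your_table AS (
  SELECT 1 AS col1, 'a' AS col2, 10 AS col3, 'x' AS col4 UNION ALL
  SELECT 1, 'a', 20, 'y' UNION ALL
  SELECT 2, 'b', 30, 'z'
)
SELECT *
FROM your_table
WHERE TRUE
-- Keep only the first row of each (col1, col2) group.
-- ORDER BY col3 is an assumed tie-breaker that makes the choice deterministic;
-- drop it to get an arbitrary row per group, as in the MySQL behaviour.
QUALIFY ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY col3) = 1;
```

With the sample data this keeps the rows (1, 'a', 10, 'x') and (2, 'b', 30, 'z'), in no guaranteed order.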