I have a table with id and variants fields. The variants field is of type VARIANT and holds an array of JSON objects; each row can have from 100 to 1000 such objects inside the variants column. I need to calculate the total number of variant objects in the table. I tried to do it with flatten and with array_size, but either the query is invalid or the result is very small: I have 150m rows in the table, yet sometimes I get results like 50k, which is certainly not correct. I'm new to Snowflake and appreciate any help!


Asked Nov 21, 2024 at 16:19 by Bogdan Dubyk; edited Dec 11, 2024 at 5:52 by DarkBee.
  • Summing values from a JSON array in Snowflake – Lukasz Szozda Commented Nov 21, 2024 at 16:34

2 Answers

It would have been helpful if you had shared sample data and the expected output for debugging purposes.

However, I created a sample data set in Snowflake based on what you described in the question. Please let me know if this is not what you are looking for, and I will remove the answer.

The query below returns 9 for my sample data, which is the number of JSON objects in the variants column.

SELECT 
    COUNT(*) AS total_variant_objects
FROM 
    my_table,
    LATERAL FLATTEN(input => my_table.variants) AS flattened_variants;
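
The sample data set itself was not included in the answer. As a rough sketch of what it could look like, the CTE below stands in for my_table; the rows and the "sku" key are invented for illustration and are not the answerer's actual data. Three arrays of 4, 3, and 2 objects give 9 flattened rows.

```sql
-- Hypothetical stand-in for my_table: three rows holding 4, 3 and 2 JSON objects.
WITH my_table AS (
    SELECT 1 AS id,
           PARSE_JSON('[{"sku": "a"}, {"sku": "b"}, {"sku": "c"}, {"sku": "d"}]') AS variants
    UNION ALL
    SELECT 2, PARSE_JSON('[{"sku": "e"}, {"sku": "f"}, {"sku": "g"}]')
    UNION ALL
    SELECT 3, PARSE_JSON('[{"sku": "h"}, {"sku": "i"}]')
)
-- FLATTEN emits one row per array element, so COUNT(*) counts the objects.
SELECT
    COUNT(*) AS total_variant_objects
FROM
    my_table,
    LATERAL FLATTEN(input => my_table.variants) AS flattened_variants;
-- 9
```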

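Since only the total number of JSON objects in the variant column is needed, there is no need to flatten the arrays into rows. Summing a per-row count, computed either with the higher-order function REDUCE or simply with the built-in ARRAY_SIZE, should be more performant.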

with raw_items as (
    select parse_json('[{"item" : "banana"}, {"item": "apple"}]') as items
    union all
    select parse_json('[{"item" : "bread"}, {"item": "coffee"}]') as items
    union all
    select parse_json('[{"item": "tea"}]') as items
    union all
    select parse_json('[]') as items
)
select
    SUM(
            REDUCE(items,
                   0,
                   (acc, val) -> acc + 1)
    ) as total_objs,
    sum(array_size(items)) as total_objs_2
from raw_items
;
-- 5
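
If the total still comes out far too small on the real table, one possible cause is that many rows do not hold a parsed array at all (for example, the JSON was loaded as a string inside the VARIANT). In that case ARRAY_SIZE returns NULL and FLATTEN emits no rows for them, so those rows silently drop out of the count. The following is only a diagnostic sketch, assuming the table and column are named my_table and variants as in the question:

```sql
-- Group rows by the kind of value stored in the VARIANT column.
-- Rows whose type is not ARRAY contribute NULL to ARRAY_SIZE and are ignored by SUM.
SELECT
    TYPEOF(variants)          AS variant_type,
    COUNT(*)                  AS row_count,
    SUM(ARRAY_SIZE(variants)) AS objects_if_array
FROM my_table
GROUP BY variant_type
ORDER BY row_count DESC;
```

If most rows show up as VARCHAR rather than ARRAY, counting with something like SUM(ARRAY_SIZE(TRY_PARSE_JSON(variants::STRING))) may give the expected total; whether that applies depends on how the data was loaded.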
