I have a json object per row in an Azure dataflow and want to append all values to an array and then flatten it, so that each element of the array is a value rather than all values for that specific row.
My input data looks like this:
Column |
---|
{"Energy to Utility_kWh_15m": "ef9033a5-ca4c-44eb-9f20-5c8c0d4ca7d6", "Output Energy_kWh_15m": "849871d1-b5f5-4ae8-86ad-5030ce16cce5", "Plant Availability_%_15m": "5db1004a-bcdc-4973-9816-124262893d21"} |
{"Energy to Utility_kWh_15m": "97046418-371d-41d3-a213-5e9715847a34", "Output Energy_kWh_15m": "6dc86c06-1a5c-11e9-9358-42010afa015a", "Plant Availability_%_15m": "6dcac67c-1a5c-11e9-9358-42010afa015a"} |
... |
and I want my final output to look like:
New Column |
---|
"ef9033a5-ca4c-44eb-9f20-5c8c0d4ca7d6" |
"849871d1-b5f5-4ae8-86ad-5030ce16cce5" |
"5db1004a-bcdc-4973-9816-124262893d21" |
"97046418-371d-41d3-a213-5e9715847a34" |
"6dc86c06-1a5c-11e9-9358-42010afa015a" |
"6dcac67c-1a5c-11e9-9358-42010afa015a" |
... |
so that I can use the data in a ForEach pipeline activity and loop through each id.
I have the below solution that provides my expected output, where each select activity following the flatten selects a specific column (one of the key-value pairing). This is not a good solution because as my keys expand so too will the select activities required. I would like this to be dynamic, based on the keys in the json.
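To make the goal concrete, here is a minimal Python sketch (not the Data Factory solution itself) of the desired transformation: parse each row's JSON object and collect every value, regardless of key names, into one flat list. The sample rows are taken from the question.

```python
import json

# Two sample rows from the question, each holding one JSON object as a string.
rows = [
    '{"Energy to Utility_kWh_15m": "ef9033a5-ca4c-44eb-9f20-5c8c0d4ca7d6", '
    '"Output Energy_kWh_15m": "849871d1-b5f5-4ae8-86ad-5030ce16cce5", '
    '"Plant Availability_%_15m": "5db1004a-bcdc-4973-9816-124262893d21"}',
    '{"Energy to Utility_kWh_15m": "97046418-371d-41d3-a213-5e9715847a34", '
    '"Output Energy_kWh_15m": "6dc86c06-1a5c-11e9-9358-42010afa015a", '
    '"Plant Availability_%_15m": "6dcac67c-1a5c-11e9-9358-42010afa015a"}',
]

# Collect every value from every row into one flat list, without naming any key.
ids = [value for row in rows for value in json.loads(row).values()]
```

The point is that no key is ever named explicitly, which is what the Select-per-column approach fails to achieve.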
Asked Feb 21 at 10:46 by York; edited Feb 21 at 14:18 by Rakesh Govindula.
Comments:
- Do you want a new kind of solution, or can I provide the next steps after the previous SO answer? – Rakesh Govindula, Feb 21 at 11:38
- Ideally I'd like a new solution where I don't have to add Select activities for each key/column. I will be reusing the structure of this solution for a separate dataset that contains many more key-value pairings, so the current solution would be a problem: it would likely require another 12 Select activities on top. – York, Feb 21 at 11:44
- In this case the key names are the same in every row, but you don't know how many there will be, right? – Rakesh Govindula, Feb 21 at 11:46
- Yes, exactly. Only the values differ; the keys are consistent. I just need a way to avoid adding Select items for every possible key, because another dataset will have at least 20. – York, Feb 21 at 12:09
- Since you don't have control over the structure or keys of the JSON string, I will try string operations to achieve your requirement. I will update here. – Rakesh Govindula, Feb 21 at 12:37
1 Answer
You can try the approach below, but note that it only works when the JSON strings contain no nested structures and the values themselves contain no double-quote character (").
First, take a Derived Column transformation after your source and create a new column sub_arr with the expression below:
slice(map(split(Column,'": "'),split(#item,'"')[1]),2)
This first splits the JSON string on '": "'. Then, for each resulting segment, it splits again on '"' and takes the first item (Data Flow arrays are 1-based, so [1] is the first element). Slicing from position 2 drops the leading key fragment, leaving an array of the values for each JSON row.
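The same string logic can be traced in a short Python sketch (0-based indexing replaces the expression language's 1-based indexing), using the first sample row from the question:

```python
# One sample row from the question.
row = ('{"Energy to Utility_kWh_15m": "ef9033a5-ca4c-44eb-9f20-5c8c0d4ca7d6", '
       '"Output Energy_kWh_15m": "849871d1-b5f5-4ae8-86ad-5030ce16cce5", '
       '"Plant Availability_%_15m": "5db1004a-bcdc-4973-9816-124262893d21"}')

# split(Column, '": "')
segments = row.split('": "')
# map(..., split(#item, '"')[1]) -- ADF's [1] is the first element, Python's [0]
mapped = [s.split('"')[0] for s in segments]
# slice(..., 2) keeps everything from the second element onward,
# dropping the leading '{' fragment produced by the first key
sub_arr = mapped[1:]
```

Running this leaves sub_arr holding exactly the three GUID values of the row, which mirrors what the Derived Column produces.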
Next, to combine the per-row arrays, take an Aggregate transformation and create the required res_arr column in the Aggregates section with the expression below; no column is needed in the Group By section:
flatten(collect(sub_arr))
Here, collect(sub_arr) gathers every row's array into one array of arrays, and flatten() merges them into the single expected array of values.
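The collect-then-flatten step can also be sketched in Python. This is only an illustration of the semantics, assuming two per-row arrays with abbreviated GUID values:

```python
# Per-row value arrays, as produced by the Derived Column step (values abbreviated).
sub_arrs = [
    ["ef9033a5", "849871d1", "5db1004a"],
    ["97046418", "6dc86c06", "6dcac67c"],
]

# collect(sub_arr) gathers the arrays across all rows into an array of arrays;
# flatten() then merges them into one flat array of values.
res_arr = [value for arr in sub_arrs for value in arr]
```

The resulting single-column array is what the ForEach activity can then iterate over, one id per iteration.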