I'm using AWS Glue to read data from a DynamoDB table where the sort key `sk` (string) is a timestamp in the format `2024-04-10T00:00:00.000000+00:00`. I'm trying to apply a `push_down_predicate` to filter records within a specific time range, but I'm getting unexpected results, including timestamps outside the specified range.
What I've Tried:
- DynamoDB Query: When I query directly from DynamoDB using the same timestamp format, the results are as expected.
- AWS Glue Job:
    dynamic_frame = glueContext.create_dynamic_frame.from_catalog(
        database="my_database",
        table_name="my_dynamodb_table",
        push_down_predicate=f"sk >= '{start_timestamp}' AND sk < '{end_timestamp}'",
    )
Here, `start_timestamp` and `end_timestamp` match the format in DynamoDB.
Observed Behavior: Instead of getting filtered results within the specified timestamp range, I'm seeing a mix of timestamps, including many outside the range.
Question:
Why isn't the `push_down_predicate` filtering the DynamoDB data as expected through AWS Glue, and how can I correctly apply this filter to get only the timestamps within the specified range?
1 Answer
The DynamoDB connector does not support push-down predicate filtering:
https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-connect-dynamodb-home.html
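Since the predicate is not pushed down, one possible workaround is to filter the records after the read. The following is a minimal sketch, not a definitive fix. It assumes every `sk` value uses the same fixed-width format and `+00:00` offset shown in the question, so that plain string comparison matches chronological order. The helper name `sk_in_range` and the sample values are hypothetical and only serve the illustration. In a Glue job the same predicate could be passed to the `Filter` transform; that call is left as a comment because it needs the Glue runtime.

```python
from functools import partial


def sk_in_range(record, start, end):
    """Return True when record["sk"] lies in the half-open range [start, end).

    Assumption: all timestamps share one fixed-width ISO 8601 format and the
    same UTC offset, so lexicographic order equals chronological order.
    """
    sk = record["sk"]
    return start <= sk < end


# Hypothetical range bounds, in the format given in the question.
start_timestamp = "2024-04-10T00:00:00.000000+00:00"
end_timestamp = "2024-04-11T00:00:00.000000+00:00"

# Bind the bounds so the predicate takes a single record argument.
keep = partial(sk_in_range, start=start_timestamp, end=end_timestamp)

# Local check with plain dicts standing in for the table's records.
sample = [
    {"sk": "2024-04-09T23:59:59.999999+00:00"},  # before the range
    {"sk": "2024-04-10T00:00:00.000000+00:00"},  # start is inclusive
    {"sk": "2024-04-10T12:30:00.000000+00:00"},  # inside the range
    {"sk": "2024-04-11T00:00:00.000000+00:00"},  # end is exclusive
]
kept = [r["sk"] for r in sample if keep(r)]

# Inside a Glue job (not run here), the same predicate could be applied
# to the DynamicFrame after it has been read, without push_down_predicate:
#
#   from awsglue.transforms import Filter
#   filtered = Filter.apply(frame=dynamic_frame, f=keep)
```

Note that this filters after the connector has already read the table, so it fixes the result set but does not reduce the amount of data read from DynamoDB.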