How Do I fix my AWS S3 to BigQuery Data Transfer?
I am trying to transfer CSV data from an AWS S3 bucket into BigQuery for analysis and querying, but only about half of the data in the bucket is actually being transferred. In BigQuery's data transfers I had to specify a destination table, and for that table I wrote a schema covering what the CSV files contain. One issue is that not all files in the S3 bucket have the same schema, but the schema I wrote is a superset of all possible columns of any given CSV file. So I would expect that a file with a differing schema, as long as it is a subset of the overall one, would still load, just with NULL values for the missing columns. Essentially I had to raise the allowed-error count very high just to get half the data to transfer, and I need help or ideas from anyone who has been in this position on getting the other half into the same table.
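To make this concrete, here is a minimal sketch of the kind of transfer config I'm creating. This assumes the google-cloud-bigquery-datatransfer Python client; the project, dataset, table, bucket path, and credentials are all placeholders, not my real values:

```python
from google.cloud import bigquery_datatransfer

client = bigquery_datatransfer.DataTransferServiceClient()

# All names below are placeholders for illustration only.
transfer_config = bigquery_datatransfer.TransferConfig(
    destination_dataset_id="my_dataset",
    display_name="s3-csv-to-bq",
    data_source_id="amazon_s3",
    params={
        "destination_table_name_template": "my_superset_table",
        "data_path": "s3://my-bucket/*.csv",
        "access_key_id": "AKIA...",        # redacted
        "secret_access_key": "...",        # redacted
        "file_format": "CSV",
        "max_bad_records": "5000",         # the error allowance I keep raising
        "allow_jagged_rows": "true",       # hoping short rows load with NULLs
    },
    schedule="every 24 hours",
)

transfer_config = client.create_transfer_config(
    parent="projects/my-project/locations/us",  # placeholder project/location
    transfer_config=transfer_config,
)
print(f"Created transfer config: {transfer_config.name}")
```

My understanding is that CSV loading is positional and allow_jagged_rows only tolerates missing trailing columns, so I suspect the files whose columns are not a prefix of my superset schema are the ones being rejected, but I'd welcome confirmation or a workaround.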
Things I have tried: schema auto-detection, which just automates the table-schema step and produces the same issue; and creating a table based on a connection source (S3) rather than running a data transfer into an already created table (see the sketch below), with similar results.
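For the connection-based attempt, this is roughly the DDL I ran, issued here through the google-cloud-bigquery Python client; the connection name, dataset, and bucket are placeholders:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Placeholder names; assumes a BigQuery Omni connection to AWS already
# exists in aws-us-east-1, and the dataset lives in that same AWS region.
ddl = """
CREATE OR REPLACE EXTERNAL TABLE my_dataset.s3_csv_external
WITH CONNECTION `aws-us-east-1.my_s3_connection`
OPTIONS (
  format = 'CSV',
  uris = ['s3://my-bucket/*.csv']
)
"""
client.query(ddl).result()  # waits for the DDL job to finish
```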