I am writing output to an Azure Synapse table that contains a varbinary(8000) column. When writing with Spark it fails with UNSUPPORTED_DATATYPE, because I am trying to limit the length from the default MAX to 8000 (MAX is not supported by Azure Synapse). Below is the error:
org.apache.spark.sql.catalyst.parser.ParseException: [UNSUPPORTED_DATATYPE] Unsupported data type "VARBINARY(8000)"
ds.write()
    .format(FORMAT_JDBC)
    .option("url", JDBC_URL)
    .option("user", USER_VALUE)
    .option("password", PASSWORD_VALUE)
    .option(DRIVER_CLASS_NAME, "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .option(DB_TABLE, "\"ImageStore\"")
    .option("createTableColumnTypes", "ImageName varchar(8000), ImageData VARBINARY(8000)")
    .mode(SaveMode.Overwrite)
    .save();
Note: I don't need to append to a pre-existing table; I need to create a new table. Also, I don't want to create a heap table in Synapse but the default CCI table.
1 Answer
Regarding the error when using Spark: it occurs because Spark's SQL parser has no VARBINARY data type (Spark only exposes a plain binary type), so the column definition passed via createTableColumnTypes cannot be parsed. However, you can create the table first using SQL and then write data into it using Spark.
ERROR: org.apache.spark.sql.catalyst.parser.ParseException: [UNSUPPORTED_DATATYPE] Unsupported data type "VARBINARY(8000)"
Since you mentioned that you are using a Synapse table and want to limit the column to VARBINARY(8000), I tried the following in a dedicated SQL pool:
CREATE TABLE dbo.ImageStore2 (
ImageName VARCHAR(8000),
ImageData VARBINARY(8000)
)
WITH (DISTRIBUTION = ROUND_ROBIN, HEAP);
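The question notes that the table should be the default CCI table rather than a heap. A minimal variant of the same DDL, assuming varbinary(8000) is accepted by the clustered columnstore index in your dedicated SQL pool (the MAX variants are not, which is why the length is limited to 8000; the table name is taken from the question's DB_TABLE option):

-- Sketch: same columns, but with the default clustered columnstore structure
CREATE TABLE dbo.ImageStore (
    ImageName VARCHAR(8000),
    ImageData VARBINARY(8000)
)
WITH (DISTRIBUTION = ROUND_ROBIN, CLUSTERED COLUMNSTORE INDEX);

A clustered columnstore index is also what you get if you omit the index option entirely, since it is the default table structure in a dedicated SQL pool.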
Since you are trying to create the table from Spark, and you mentioned that you don't need to append to a pre-existing table but need to create a new one, you can use the Azure Synapse Dedicated SQL Pool Connector for Apache Spark:
Write to Azure Synapse Dedicated SQL Pool:
Ingest large volumes of data into both Internal and External tables.
Supports the following DataFrame save modes:
- Append
- ErrorIfExists
- Ignore
- Overwrite
Reference: Azure Synapse Dedicated SQL Pool Connector for Apache Spark
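If you prefer to stay on the plain JDBC writer rather than the connector, a minimal sketch of the answer's approach (create the table in SQL first, then let Spark only insert rows) reuses the constants from the question and switches to SaveMode.Append, so Spark never has to parse a VARBINARY column definition:

// Sketch only: assumes dbo.ImageStore already exists in the dedicated SQL pool
// (created with the DDL above), so Spark issues no DDL containing VARBINARY.
ds.write()
    .format(FORMAT_JDBC)
    .option("url", JDBC_URL)
    .option("user", USER_VALUE)
    .option("password", PASSWORD_VALUE)
    .option(DRIVER_CLASS_NAME, "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .option(DB_TABLE, "dbo.ImageStore")
    // no createTableColumnTypes: the schema is owned by the database
    .mode(SaveMode.Append)
    .save();

If each run needs a fresh load, SaveMode.Overwrite combined with .option("truncate", "true") tells Spark to truncate the existing table instead of dropping and re-creating it, which preserves the pre-created column types and index.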
Spark has no varbinary data type; as far as I know, just binary. – Andrew, Mar 27 at 15:12