pyspark - unable to connect to cassandra from apache spark: com.datastax.oss.driver.api.core.connection.ClosedConnectionException
Cassandra is running in Docker on Windows, and I am running Spark from WSL2. I launch the shell with:

spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.12:3.5.1

(everything below is from the spark-shell session started by the command above)
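For context, data is evidently a Cassandra table scan RDD (the stack trace below goes through CassandraTableScanRDD), created along these lines; the exact read isn't shown in the post, so the keyspace and table names here are placeholders:

import com.datastax.spark.connector._
// placeholder keyspace/table names; the actual read is not shown in the post
val data = sc.cassandraTable("my_keyspace", "my_table")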
When I then try to run

data.collect().foreach(println)

I get this error:
scala> data.collect().foreach(println)
24/11/21 23:57:24 WARN ControlConnection: [s0] Error connecting to Node(endPoint=localhost/127.0.0.1:9042, hostId=null, hashCode=62a6b9af), trying next node (ConnectionInitException: [s0|control|id: 0x29a613af, L:/127.0.0.1:43036 - R:localhost/127.0.0.1:9042] Protocol initialization request, step 1 (OPTIONS): unexpected failure (com.datastax.oss.driver.api.core.connection.ClosedConnectionException: Lost connection to remote peer))
java.io.IOException: Failed to open native connection to Cassandra at {localhost:9042} :: Could not reach any contact point, make sure you've provided valid addresses (showing first 1 nodes, use getAllErrors() for more): Node(endPoint=localhost/127.0.0.1:9042, hostId=null, hashCode=62a6b9af): [com.datastax.oss.driver.api.core.connection.ConnectionInitException: [s0|control|id: 0x29a613af, L:/127.0.0.1:43036 - R:localhost/127.0.0.1:9042] Protocol initialization request, step 1 (OPTIONS): unexpected failure (com.datastax.oss.driver.api.core.connection.ClosedConnectionException: Lost connection to remote peer)]
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.verify(CassandraTableScanRDD.scala:59)
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.getPartitions(CassandraTableScanRDD.scala:261)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:294)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:290)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2458)
at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1049)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:410)
at org.apache.spark.rdd.RDD.collect(RDD.scala:1048)
... 49 elided
Caused by: com.datastax.oss.driver.api.core.AllNodesFailedException: Could not reach any contact point, make sure you've provided valid addresses (showing first 1 nodes, use getAllErrors() for more): Node(endPoint=localhost/127.0.0.1:9042, hostId=null, hashCode=62a6b9af): [com.datastax.oss.driver.api.core.connection.ConnectionInitException: [s0|control|id: 0x29a613af, L:/127.0.0.1:43036 - R:localhost/127.0.0.1:9042] Protocol initialization request, step 1 (OPTIONS): unexpected failure (com.datastax.oss.driver.api.core.connection.ClosedConnectionException: Lost connection to remote peer)]
at com.datastax.spark.connector.cql.CassandraConnector$.createSession(CassandraConnector.scala:167)
... 73 more
Suppressed: com.datastax.oss.driver.api.core.connection.ConnectionInitException: [s0|control|id: 0x29a613af, L:/127.0.0.1:43036 - R:localhost/127.0.0.1:9042] Protocol initialization request, step 1 (OPTIONS): unexpected failure (com.datastax.oss.driver.api.core.connection.ClosedConnectionException: Lost connection to remote peer)
at com.datastax.oss.driver.shaded.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at com.datastax.oss.driver.shaded.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at com.datastax.oss.driver.shaded.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:750)
Caused by: com.datastax.oss.driver.api.core.connection.ClosedConnectionException: Lost connection to remote peer
I checked the version compatibility matrix at https://github.com/datastax/spark-cassandra-connector.

My versions: Cassandra 5.0.2, cqlsh 6.2.0 (compatible with Cassandra 5.0.2), Spark 3.5.3, Scala 2.12.18, Java 1.8.0_432 (Java 8).

As for the Cassandra connector, I tried both 3.5.1 and 3.5.0, but neither worked. I also tried the same setup on Windows (running Spark in Docker as well), and the issue persisted.
asked Nov 22, 2024 by Arka Dash; edited Nov 22, 2024 by eshirvana

2 Answers
I believe you have not set up the Cassandra connection properly. Replace 127.0.0.1 with the proper IP address of the Cassandra host:
$SPARK_HOME/bin/spark-shell --conf spark.cassandra.connection.host=127.0.0.1 \
  --packages com.datastax.spark:spark-cassandra-connector_2.12:3.5.1 \
  --conf spark.sql.extensions=com.datastax.spark.connector.CassandraSparkExtensions
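Once the shell is up, you can check connectivity directly before touching any RDDs. A minimal sketch, assuming the host was set via spark.cassandra.connection.host as above:

import com.datastax.spark.connector.cql.CassandraConnector

// Build a connector from the Spark configuration and open a short-lived session
val connector = CassandraConnector(sc.getConf)
connector.withSessionDo { session =>
  // Query a system table, so no user keyspace is required
  println(session.execute("SELECT release_version FROM system.local").one())
}

If this fails with the same ClosedConnectionException, the problem is network reachability to Cassandra rather than anything in the Spark job itself.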
I was exposing port 7000 of Cassandra to 0.0.0.0:9042, when I actually needed to map port 9042 of Cassandra to 0.0.0.0:9042, because Cassandra listens for CQL client connections on port 9042 (port 7000 is the inter-node cluster communication port).

In Docker:

docker stop cassandradb
docker rm cassandradb
docker run -d --name cassandradb -p 9042:9042 cassandra

This fixed the Cassandra connection.
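To confirm the mapping, docker port cassandradb should now show 9042/tcp -> 0.0.0.0:9042, and from the Spark side telnet localhost 9042 (or the host's IP) should connect.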
Comments on the question:

"You are using localhost:9042 to connect to cassandradb; is that really reachable from where you are running Spark? If not, you have to set up the right IP address, etc." – eshirvana, Nov 22, 2024 at 14:24

"I tried telnet localhost 9042, where the output is "Connected to localhost.". And I tested both WSL and Windows." – Arka Dash, Nov 22, 2024 at 14:51