
I am running Cassandra in a Docker container on Windows, and I am running Spark from WSL2. I start the shell with:

spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.12:3.5.1

[Screenshot: spark-shell running after the command above, with the connector package resolved]
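For context, data is a Cassandra table scan RDD (the stack trace below shows CassandraTableScanRDD); its exact definition was in the screenshot. A minimal sketch of how such an RDD is typically created, with placeholder keyspace and table names:

import com.datastax.spark.connector._

// Placeholder names; the actual keyspace/table were in the screenshot
val data = sc.cassandraTable("my_keyspace", "my_table")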

When I then try data.collect().foreach(println), I get this error:


scala> data.collect().foreach(println)
24/11/21 23:57:24 WARN ControlConnection: [s0] Error connecting to Node(endPoint=localhost/127.0.0.1:9042, hostId=null, hashCode=62a6b9af), trying next node (ConnectionInitException: [s0|control|id: 0x29a613af, L:/127.0.0.1:43036 - R:localhost/127.0.0.1:9042] Protocol initialization request, step 1 (OPTIONS): unexpected failure (com.datastax.oss.driver.api.core.connection.ClosedConnectionException: Lost connection to remote peer))
java.io.IOException: Failed to open native connection to Cassandra at {localhost:9042} :: Could not reach any contact point, make sure you've provided valid addresses (showing first 1 nodes, use getAllErrors() for more): Node(endPoint=localhost/127.0.0.1:9042, hostId=null, hashCode=62a6b9af): [com.datastax.oss.driver.api.core.connection.ConnectionInitException: [s0|control|id: 0x29a613af, L:/127.0.0.1:43036 - R:localhost/127.0.0.1:9042] Protocol initialization request, step 1 (OPTIONS): unexpected failure (com.datastax.oss.driver.api.core.connection.ClosedConnectionException: Lost connection to remote peer)]
  at com.datastax.spark.connector.rdd.CassandraTableScanRDD.verify(CassandraTableScanRDD.scala:59)
  at com.datastax.spark.connector.rdd.CassandraTableScanRDD.getPartitions(CassandraTableScanRDD.scala:261)
  at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:294)
  at scala.Option.getOrElse(Option.scala:189)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:290)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2458)
  at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1049)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
  at org.apache.spark.rdd.RDD.withScope(RDD.scala:410)
  at org.apache.spark.rdd.RDD.collect(RDD.scala:1048)
  ... 49 elided
Caused by: com.datastax.oss.driver.api.core.AllNodesFailedException: Could not reach any contact point, make sure you've provided valid addresses (showing first 1 nodes, use getAllErrors() for more): Node(endPoint=localhost/127.0.0.1:9042, hostId=null, hashCode=62a6b9af): [com.datastax.oss.driver.api.core.connection.ConnectionInitException: [s0|control|id: 0x29a613af, L:/127.0.0.1:43036 - R:localhost/127.0.0.1:9042] Protocol initialization request, step 1 (OPTIONS): unexpected failure (com.datastax.oss.driver.api.core.connection.ClosedConnectionException: Lost connection to remote peer)]
  at com.datastax.spark.connector.cql.CassandraConnector$.createSession(CassandraConnector.scala:167)
  ... 73 more
  Suppressed: com.datastax.oss.driver.api.core.connection.ConnectionInitException: [s0|control|id: 0x29a613af, L:/127.0.0.1:43036 - R:localhost/127.0.0.1:9042] Protocol initialization request, step 1 (OPTIONS): unexpected failure (com.datastax.oss.driver.api.core.connection.ClosedConnectionException: Lost connection to remote peer)
    at com.datastax.oss.driver.shaded.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at com.datastax.oss.driver.shaded.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at com.datastax.oss.driver.shaded.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run(Thread.java:750)
  Caused by: com.datastax.oss.driver.api.core.connection.ClosedConnectionException: Lost connection to remote peer

I checked the version compatibilities against the compatibility table at https://github.com/datastax/spark-cassandra-connector.

My versions:
  • Cassandra: 5.0.2
  • cqlsh: 6.2.0 (compatible with Cassandra 5.0.2)
  • Spark: 3.5.3
  • Scala: 2.12.18
  • Java: 1.8.0_432 (Java 8)

As for the Cassandra connector, I tried both 3.5.1 and 3.5.0, but neither worked. I also tried the same setup entirely on Windows (running Spark in Docker as well), and the issue persisted.

  • Did you set up the Cassandra host correctly in Spark? As you can see, Spark is looking at localhost:9042 to connect to Cassandra. Is that really reachable from where you are running Spark? If not, you have to set the right IP address. – eshirvana, Nov 22, 2024 at 14:24
  • I am running Cassandra from Docker, exposed on 9042, and yes, I am able to connect to the port: telnet localhost 9042 prints "Connected to localhost." I tested this from both WSL and Windows. – Arka Dash, Nov 22, 2024 at 14:51
  • Try your Cassandra Docker IP address or the local IP address instead of localhost in Spark and try again. – eshirvana, Nov 22, 2024 at 15:32
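(The telnet check mentioned in the comments can also be reproduced from the Spark shell's own JVM, which rules out environment differences between the shell that ran telnet and the one running Spark. Host and port here mirror the values above:)

import java.net.{InetSocketAddress, Socket}

// Throws an exception if localhost:9042 is not reachable within 2 seconds.
// Note this only tests TCP reachability, like telnet, not the CQL protocol.
val s = new Socket()
s.connect(new InetSocketAddress("localhost", 9042), 2000)
s.close()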

2 Answers


I believe you have not set up the Cassandra connection properly. Set the connection host explicitly, replacing 127.0.0.1 below with the proper IP address of the Cassandra host:

$SPARK_HOME/bin/spark-shell --conf spark.cassandra.connection.host=127.0.0.1 \
                            --packages com.datastax.spark:spark-cassandra-connector_2.12:3.5.1 \
                            --conf spark.sql.extensions=com.datastax.spark.connector.CassandraSparkExtensions
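Once the shell is up with the host configured, you can verify the connection by reading a table. A minimal sketch using the connector's DataFrame API (the keyspace and table names here are placeholders):

// Placeholder keyspace/table; substitute your own
val df = spark.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "my_keyspace", "table" -> "my_table"))
  .load()
df.show()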

I was mapping Cassandra's port 7000 to 0.0.0.0:9042 on the host. I needed to map Cassandra's port 9042 to 0.0.0.0:9042 instead, since CQL client connections listen on port 9042 (7000 is the inter-node cluster communication port).

In Docker:

docker stop cassandradb
docker rm cassandradb
# Recreate the container, publishing the CQL native transport port 9042 on the host
docker run -d --name cassandradb -p 9042:9042 cassandra

This fixed the Cassandra connection.
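(As a quick sanity check from the same spark-shell session, with the same placeholder keyspace and table names used in the question:)

import com.datastax.spark.connector._

// Should now print rows instead of throwing AllNodesFailedException
sc.cassandraTable("my_keyspace", "my_table").collect().foreach(println)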
