I'm deploying an Apache Ignite cluster in Kubernetes and using a startupProbe to ensure each node has joined the Ignite baseline topology. I need to be sure that a server node is completely ready for work (it is connected to the topology and its caches are loaded). For this I wrote a custom bash script that uses the Ignite REST API to check whether the node is part of that topology.

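#!/bin/bash
# Install the tools the check needs (this runs on every probe invocation).
apk add --no-cache curl jq
NODE_HOSTNAME=$(hostname)
echo "Hostname: $NODE_HOSTNAME"

# Look up this node in the current topology: select entries with a TCP host name
# that starts with the local hostname and print their consistentId.
NODE_ID=$(curl -s "http://localhost:8080/ignite?cmd=top" | \
         jq -r --arg node_hostname "$NODE_HOSTNAME" '.response[] | select(.tcpHostNames[] | startswith($node_hostname)) | .consistentId')
echo "NODE_ID: $NODE_ID"

# Keep only the baseline entries that are equal to this node's ID.
BASELINE=$(curl -s "http://localhost:8080/ignite?cmd=baseline" | \
          jq -r --arg node_id "$NODE_ID" '.response.baseline[] | select(. == $node_id)')
echo "Baseline result: $BASELINE"

# An empty result means no match was found, so the probe fails.
if [ -z "$BASELINE" ]; then
  echo "Node is NOT part of the baseline topology yet."
  exit 1
else
  echo "Node is part of the baseline topology."
  exit 0
fi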
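
As a rough sketch, the final membership decision can be isolated into a small function that works on values already fetched, so it can be exercised without a running cluster. The function name `in_baseline` and the sample IDs are hypothetical and not part of the script above. It assumes the baseline IDs arrive as one ID per line. It also rejects an empty or multi-line node ID, which could occur if the topology lookup matched nothing, or matched several nodes whose hostnames share a prefix.

```shell
#!/bin/sh
# Hypothetical helper: decide baseline membership from already-fetched values.
# $1 = consistent ID of this node
# $2 = baseline consistent IDs, one per line (assumed format)
in_baseline() {
  node_id=$1
  baseline_ids=$2
  nl='
'
  # Reject an empty ID (nothing matched) or a multi-line ID (several matches).
  case $node_id in
    "" | *"$nl"*) return 1 ;;
  esac
  # -F fixed string, -x whole-line match, -q quiet: no partial or prefix matches.
  printf '%s\n' "$baseline_ids" | grep -Fxq -- "$node_id"
}

# Example with made-up IDs.
if in_baseline "node-a" "node-a
node-b"; then
  echo "Node is part of the baseline topology."
else
  echo "Node is NOT part of the baseline topology yet."
fi
```

The exit status of the function maps directly onto the probe result: 0 for success, non-zero for failure.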

And my Kubernetes deployment's startupProbe is defined as:

startupProbe:
  exec:
    command:
      - /bin/sh
      - -c
      - /scripts/check_baseline.sh
  initialDelaySeconds: 10
  periodSeconds: 10
  failureThreshold: 60
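
For reference, the time budget these settings give the node can be estimated with a little arithmetic. This is only a sketch: the variable names are illustrative, and the result is an approximation of how long the container may keep failing the probe before Kubernetes restarts it.

```shell
#!/bin/sh
# Values copied from the startupProbe definition above.
initial_delay_seconds=10
period_seconds=10
failure_threshold=60

# Rough upper bound: the initial delay plus one period per allowed failure.
max_startup_seconds=$(( initial_delay_seconds + period_seconds * failure_threshold ))
echo "Approximate startup budget: ${max_startup_seconds}s"
```

With the values shown, the node has roughly ten minutes to appear in the baseline before the container is killed and restarted.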

I tested the script in containers and it works as expected. But when I applied it as a startupProbe, I ran into a problem: the readiness probe does not start until the startup probe succeeds, so incoming traffic cannot reach the Pod, and the Ignite client cannot join and form the topology. The result is a deadlock: the startupProbe waits for the node to join the cluster, but the node cannot join because incoming traffic is blocked until the probe passes.

Is there any way to work around this? Or is there another way to add such a health check?
