Description

A cluster of two VMs was deployed for HA, following a number of well-known guides.

Pipeline

HAproxy + Keepalived

More specifically

Hosts:

  • haproxy1.mycompany (10.10.199.101, MASTER)
  • haproxy2.mycompany (10.10.199.102, BACKUP)

Floating IP address: myservice.mycompany (10.10.199.150)

Expected behavior

Failing over the VIP via Keepalived should not cause lost requests or broken connections.

Testing HA via services

  • R/W operations against S3 object storage (MinIO)
  • Queries against PostgreSQL
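The exact S3-side check is not shown in the post; a minimal client-side sketch, assuming the VIP hostname from the setup above and MinIO's documented liveness endpoint, could poll the API during failover and log any gaps:

```shell
#!/bin/bash
# Hypothetical availability probe for the S3 endpoint behind the VIP.
# myservice.mycompany:9000 is the HAProxy frontend from this setup;
# /minio/health/live is MinIO's documented liveness endpoint.
URL="https://myservice.mycompany:9000/minio/health/live"

for i in $(seq 1 5); do          # a short run; the real test ran much longer
  if curl -fsk --max-time 2 "$URL" > /dev/null; then
    echo "probe $i: ok"
  else
    echo "probe $i: FAILED"
  fi
  sleep 1
done
```

Each FAILED line marks a second during which the VIP (or the service behind it) was unreachable, which makes the length of the failover gap visible.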

Configs

Keepalived MASTER

global_defs {
  router_id haproxya1
  enable_script_security
  script_user root
}

vrrp_script haproxy_check {
  script "/usr/libexec/keepalived/haproxy_check.sh"
  interval 2
  weight 100
}

vrrp_instance VI_1 {
  interface enp6s18
  state MASTER
  priority 100
  virtual_router_id 101
  advert_int 1
  unicast_src_ip 10.10.199.101
  unicast_peer {
    10.10.199.102
  }
  virtual_ipaddress {
    10.10.199.150
  }
  track_script {
    haproxy_check
  }
}
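The tracked script itself is not included in the post; a minimal sketch of a typical /usr/libexec/keepalived/haproxy_check.sh (a pgrep-based variant, which is one common implementation and not necessarily the author's) looks like this:

```shell
#!/bin/bash
# Minimal sketch of /usr/libexec/keepalived/haproxy_check.sh (the actual
# script is not shown in the post). Keepalived runs it every 2 s: exit 0
# keeps the +100 weight bonus on this node, while a non-zero exit drops the
# bonus so the peer's effective priority wins and the VIP fails over.
haproxy_alive() {
  # true iff a process named exactly "haproxy" exists
  pgrep -x haproxy > /dev/null
}

# In the real check script this would simply be `haproxy_alive` (its exit
# status becomes the script's exit code); printed here so the sketch is
# runnable on any machine:
if haproxy_alive; then
  echo "check OK: haproxy is running"
else
  echo "check FAILED: haproxy is not running"
fi
```

Note that with weight 100 and priorities 100/50, a failed check on the MASTER leaves it at 100 while a passing check on the BACKUP raises it to 150, which is what triggers the takeover.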

Keepalived BACKUP

global_defs {
  router_id haproxya2
  enable_script_security
  script_user root
}

vrrp_script haproxy_check {
  script "/usr/libexec/keepalived/haproxy_check.sh"
  interval 2
  weight 100
}

vrrp_instance VI_1 {
  interface enp6s18
  state BACKUP
  priority 50
  virtual_router_id 101
  advert_int 1
  unicast_src_ip 10.10.199.102
  unicast_peer {
    10.10.199.101
  }
  virtual_ipaddress {
    10.10.199.150
  }
  track_script {
    haproxy_check
  }
}

Common HAproxy config

global
    maxconn 100000
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

defaults
    mode tcp
    log global
    retries 10
    option redispatch
    timeout client 30m
    timeout connect 30s
    timeout server 30m
    timeout check 5s
    option http-keep-alive
    no option http-server-close
    http-reuse always

listen hapserver
    mode http
    bind 10.10.199.150:7000
    stats enable
    stats uri /

# PostgreSQL
listen postgres
    bind 10.10.199.150:5000
    server postgres 10.10.199.123:5430 check

# S3 MinIO
listen s3minioWEB
    bind myservice.mycompany:9001 ssl crt /etc/haproxy/ssl/myservice.mycompany.pem
    mode http
    balance leastconn
    server minio1 minio1.mycompany:9001 check inter 2s
    server minio2 minio2.mycompany:9001 check inter 2s
    server minio3 minio3.mycompany:9001 check inter 2s
    server minio4 minio4.mycompany:9001 check inter 2s

listen s3minioAPI
    bind myservice.mycompany:9000 ssl crt /etc/haproxy/ssl/myservice.mycompany.pem
    mode http
    balance leastconn
    server minio1 minio1.mycompany:9000 check inter 2s
    server minio2 minio2.mycompany:9000 check inter 2s
    server minio3 minio3.mycompany:9000 check inter 2s
    server minio4 minio4.mycompany:9000 check inter 2s

During testing, a long-running task against one of the services (PostgreSQL, S3) was started from another computer on the same corporate LAN. While the task was in progress, the MASTER (haproxy1) was stopped. As expected, the VIP was brought up on the other host.
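One way to confirm the VIP movement on the hosts themselves (an assumed procedure, not quoted from the post): stop keepalived or haproxy on the MASTER, then run a check like this on the BACKUP:

```shell
#!/bin/bash
# Check whether the floating IP from the Keepalived config above is
# currently assigned to this host.
VIP="10.10.199.150"

vip_present() {
  ip -4 addr show 2>/dev/null | grep -q "inet ${VIP}/"
}

if vip_present; then
  echo "VIP ${VIP} is on this host"
else
  echo "VIP ${VIP} is not on this host"
fi
```

Running it in a loop on both nodes during the failover shows exactly when the address leaves one host and appears on the other.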

However, when testing S3 (MinIO) the connection was always closed on failover, while with PostgreSQL the error appeared only about every other time. Queries were tested with the following script:

#!/bin/bash

export DB_HOST="10.10.199.150"   # the VIP; HAProxy's postgres frontend binds here
export DB_PORT="5000"
export DB_NAME="db"
export DB_USER="user"
export PGPASSWORD='password'

# ------------------------------------

start_time=$(date +%s)
end_time=$(( start_time + 50 ))

iterator=1

echo "Start"
while [[ $(date +%s) -le $end_time ]]; do
  echo "cur: $iterator"
  psql -h "$DB_HOST" -p "$DB_PORT" -U "$DB_USER" -d "$DB_NAME" -c "INSERT INTO haptest (num, timest) VALUES ($iterator, CURRENT_TIMESTAMP)"
  iterator=$((iterator+1))
done

iterator=$((iterator-1))
echo "Made records: $iterator"
echo "End"
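The script assumes the haptest table already exists; a one-off setup might look like the following (the column types are assumptions inferred from the INSERT statement, not confirmed by the post):

```shell
#!/bin/bash
# Hypothetical one-off setup for the haptest table used by the test script.
export DB_HOST="10.10.199.150" DB_PORT="5000" DB_NAME="db" DB_USER="user"
export PGPASSWORD='password'
export PGCONNECT_TIMEOUT=2

SETUP_SQL="CREATE TABLE IF NOT EXISTS haptest (num integer, timest timestamptz)"

# Guarded so the sketch stays runnable on a machine without psql:
if command -v psql > /dev/null; then
  psql -h "$DB_HOST" -p "$DB_PORT" -U "$DB_USER" -d "$DB_NAME" \
    -c "$SETUP_SQL" || echo "could not connect to $DB_HOST:$DB_PORT"
else
  echo "psql not found; run manually: $SETUP_SQL"
fi
```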

Could you help me figure out whether I've made a mistake somewhere, or whether I'm testing this the wrong way?
