I. Problem Description

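During an IPv6 migration, the Redis version actually running on site turned out to be 3.0.4, which does not support IPv6, so it had to be upgraded to Redis 4.0.14 (the latest 4.0 release). The following error was then raised during deployment: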

<internal:/usr/local/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:85:in `require': /usr/local/lib/ruby/site_ruby/3.0.0/x86_64-linux/zlib.so: undefined symbol: inflateReset - /usr/local/lib/ruby/site_ruby/3.0.0/x86_64-linux/zlib.so (LoadError)
	from <internal:/usr/local/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
	from /usr/local/lib/ruby/3.0.0/rubygems/package.rb:188:in `initialize'
	from /usr/local/lib/ruby/3.0.0/rubygems/package.rb:153:in `new'
	from /usr/local/lib/ruby/3.0.0/rubygems/package.rb:153:in `new'
	from /usr/local/lib/ruby/3.0.0/rubygems/source/specific_file.rb:19:in `initialize'
	from /usr/local/lib/ruby/3.0.0/rubygems/dependency_installer.rb:293:in `new'
	from /usr/local/lib/ruby/3.0.0/rubygems/dependency_installer.rb:293:in `resolve_dependencies'
	from /usr/local/lib/ruby/3.0.0/rubygems/commands/install_command.rb:198:in `install_gem'
	from /usr/local/lib/ruby/3.0.0/rubygems/commands/install_command.rb:223:in `block in install_gems'
	from /usr/local/lib/ruby/3.0.0/rubygems/commands/install_command.rb:216:in `each'
	from /usr/local/lib/ruby/3.0.0/rubygems/commands/install_command.rb:216:in `install_gems'
	from /usr/local/lib/ruby/3.0.0/rubygems/commands/install_command.rb:164:in `execute'
	from /usr/local/lib/ruby/3.0.0/rubygems/command.rb:323:in `invoke_with_build_args'
	from /usr/local/lib/ruby/3.0.0/rubygems/command_manager.rb:178:in `process_args'
	from /usr/local/lib/ruby/3.0.0/rubygems/command_manager.rb:147:in `run'
	from /usr/local/lib/ruby/3.0.0/rubygems/gem_runner.rb:53:in `run'
	from /usr/local/bin/gem:21:in `<main>'

II. Root Cause Analysis and Fix

1. The error above was raised while running the following command:

gem install -l ./redis-4.2.1.gem  # install the Redis gem from a local file

2. On-site checks showed that the cause was Ruby's zlib extension library (the zlib.so file). The fix was as follows:

ruby --version
ruby 3.0.0p0 (2020-12-25 revision 95aff21468) [x86_64-linux]

yum install zlib-devel -y
yum list installed|grep zlib
zlib.x86_64                           1.2.7-18.el7                    @centos7u3
zlib-devel.x86_64                     1.2.7-18.el7     
cd /home/software/ruby-3.0.0/ext/zlib
make && make install

# then run the install again
gem install -l ./redis-4.2.1.gem  # output:
Successfully installed redis-4.2.1
Parsing documentation for redis-4.2.1
Installing ri documentation for redis-4.2.1
Done installing documentation for redis after 0 seconds
1 gem installed
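
Before retrying the gem install, the rebuilt extension can be checked by loading it directly, and the system libz can be checked for the symbol named in the error. The following is a minimal sketch and not part of the original procedure: the function names `ruby_ext_loads` and `lib_exports_symbol` are made up for illustration, and the library path in the usage note is an assumption about a 64-bit CentOS 7 host.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not from the original post.

# Succeeds if the given Ruby extension (e.g. zlib, openssl) can be required.
ruby_ext_loads() {
    ruby -e 'require ARGV[0]' "$1" >/dev/null 2>&1
}

# Succeeds if shared library $1 exports the dynamic symbol $2.
# Uses nm from binutils; newer versions may print "symbol@@VERSION",
# so both the plain and the versioned form are accepted.
lib_exports_symbol() {
    nm -D --defined-only "$1" 2>/dev/null |
        awk -v sym="$2" '$NF == sym || index($NF, sym "@") == 1 { found = 1 } END { exit !found }'
}
```

Possible usage, assuming libz lives at /usr/lib64/libz.so.1: `lib_exports_symbol /usr/lib64/libz.so.1 inflateReset && ruby_ext_loads zlib && echo "zlib OK"`.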

3. Error: OpenSSl is not available. Install OpenSSL and rebuild Ruby (preferred) or use non-HTTPS sources

gem install openssl  # fails with:
ERROR:  While executing gem ... (Gem::Exception)
    OpenSSl is not available. Install OpenSSL and rebuild Ruby (preferred) or use non-HTTPS sources

cd /usr/local/project/ruby-3.0.0/ext/openssl
yum install openssl-devel -y
ruby extconf.rb --with-openssl-include=/usr/include/openssl/ --with-openssl-lib=/usr/lib64/openssl/

 make && make install
……
compiling ossl_x509revoked.c
compiling ossl_x509store.c
linking shared-object openssl.so
/usr/bin/install -c -m 0755 openssl.so /usr/local/lib/ruby/site_ruby/3.0.0/x86_64-linux
installing default openssl libraries

# reinstall
gem install redis-4.2.1.gem   # a little slow; output below
Successfully installed redis-4.2.1
Parsing documentation for redis-4.2.1
Installing ri documentation for redis-4.2.1
Done installing documentation for redis after 1 seconds
WARNING:  Unable to pull data from 'https://rubygems/': no such name (https://rubygems/specs.4.8.gz)
1 gem installed



4. Error:

/usr/local/lib/ruby/gems/3.0.0/gems/redis-4.2.1/lib/redis/client.rb:127:in `call': ERR Slot 3277 is already busy (Redis::CommandError)
	from /usr/local/lib/ruby/gems/3.0.0/gems/redis-4.2.1/lib/redis.rb:3311:in `block in cluster'
	from /usr/local/lib/ruby/gems/3.0.0/gems/redis-4.2.1/lib/redis.rb:69:in `block in synchronize'
	from /usr/local/lib/ruby/3.0.0/monitor.rb:202:in `synchronize'
	from /usr/local/lib/ruby/3.0.0/monitor.rb:202:in `mon_synchronize'
	from /usr/local/lib/ruby/gems/3.0.0/gems/redis-4.2.1/lib/redis.rb:69:in `synchronize'
	from /usr/local/lib/ruby/gems/3.0.0/gems/redis-4.2.1/lib/redis.rb:3310:in `cluster'

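The error above occurs because the old Redis data and configuration were not completely cleaned up before the cluster was built. Even after a manual cleanup, some of it can survive because of a timing gap. To recover, log in to every node with redis-cli, run flushall and cluster reset, and then retry. After the restart, a cluster status check reported another error (the screenshot from the original post is not available).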
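
The per-node cleanup (flushall, then cluster reset) could be scripted roughly as below. This is a sketch under assumptions: `reset_node`, `reset_all_nodes` and the `DRY_RUN` switch are illustrative names, the node list is supplied by the reader, and `-a password` would have to be added to the redis-cli call if requirepass is set. The order matters because cluster reset is refused on a master that still holds keys.

```shell
#!/bin/sh
# Sketch only: function names and DRY_RUN are illustrative, not from the original post.
# Nodes are given as "host port" pairs so IPv6 addresses never need splitting on ':'.

# Prints (DRY_RUN=1, the default) or runs the cleanup commands for one node.
reset_node() {
    host=$1
    port=$2
    for cmd in "flushall" "cluster reset"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "redis-cli -h $host -p $port $cmd"
        else
            # $cmd is left unquoted on purpose so "cluster reset" becomes two arguments;
            # stdin is redirected so redis-cli cannot consume the node list.
            redis-cli -h "$host" -p "$port" $cmd </dev/null
        fi
    done
}

# Reads "host port" pairs from stdin, one per line, and resets each node.
reset_all_nodes() {
    while read -r host port; do
        if [ -n "$host" ]; then
            reset_node "$host" "$port"
        fi
    done
}
```

With the default dry run, `printf '%s\n' '2409:8:0::11 7000' | reset_all_nodes` only prints the two redis-cli commands for review; setting `DRY_RUN=0` runs them.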


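The error reported by the cluster status check means that the nodes disagree about the cluster state: the node-to-slot assignment recorded in one node's cluster information (cluster nodes) differs from what the other nodes have recorded. A manual handshake can be performed as follows: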

# dump each node's view of the cluster
redis-cli -h ip -p port -a password cluster nodes  > ip-port.nodes  2>/dev/null 
# manually redo the handshake
redis-cli -p 7004 -h 2409:8:0::11 cluster meet 2409:8:0::12 7000
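
Once every node's view has been dumped to a file as in the first command, the files can be compared to find the node that disagrees. The sketch below assumes the usual `cluster nodes` layout, in which the first field is the node id and fields 9 onward are the slots the node owns. `slot_view` and `compare_slot_views` are illustrative names.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, not from the original post.

# Prints "node-id slot-range..." for every node that owns slots, sorted by id.
# Input is saved `cluster nodes` output; fields 9 and up are the slots.
slot_view() {
    awk 'NF >= 9 { line = $1; for (i = 9; i <= NF; i++) line = line " " $i; print line }' "$1" | sort
}

# Compares every dump against the first one and reports the ones that differ.
# Returns non-zero if any dump disagrees with the reference.
compare_slot_views() {
    ref=$1
    shift
    status=0
    for f in "$@"; do
        if [ "$(slot_view "$ref")" != "$(slot_view "$f")" ]; then
            echo "slot view differs: $f vs $ref"
            status=1
        fi
    done
    return $status
}
```

For example, `compare_slot_views *.nodes` uses the first dump as the reference and lists every other dump whose slot assignment differs from it.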

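Kill the slave process, manually edit that slave instance's nodes.conf (the file named by the cluster-config-file directive) so that its node and slot information matches the other nodes, then restart the slave node. Check the cluster status again to verify.

If persistence is configured, a wrong appendfilename can also cause this error, for example when several instances write to the same file. Each instance should be given its own file.

A Redis cluster needs at least 3 masters, and with at least one slave per master that makes at least 6 nodes in total. The on-site cluster had 10 nodes, but the cluster status check showed all of them as masters, and info replication showed connected_slaves:0. The configuration files turned out to have no password set in the masterauth field. A slave can also be assigned manually, for example with slaveof 127.0.0.1 6379. Following the hints printed by the redis-trib.rb script, the following was run manually: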

# check the configuration
grep -E "^(port|cluster|dbfilename|logfile|masterauth|requirepass)" ./etc/redis.conf
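
The masterauth check can be automated across many instance configs. The sketch below assumes that every node uses the same password, so requirepass and masterauth are expected to match within each file, and that the values are unquoted single words. `conf_value` and `check_auth_config` are illustrative names.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, not from the original post.

# Prints the value of a directive from a redis.conf-style file (last occurrence
# wins), or nothing if the directive is absent or commented out.
conf_value() {
    awk -v key="$2" '$1 == key { val = $2 } END { if (val != "") print val }' "$1"
}

# Warns when requirepass is set but masterauth is missing or different,
# a combination that keeps a replica from authenticating to its master.
check_auth_config() {
    req=$(conf_value "$1" requirepass)
    auth=$(conf_value "$1" masterauth)
    if [ -n "$req" ] && [ "$req" != "$auth" ]; then
        echo "WARN $1: requirepass is set but masterauth does not match"
        return 1
    fi
    echo "OK $1"
}
```

A loop such as `for f in ./etc/redis*.conf; do check_auth_config "$f"; done` would then flag each affected instance; the glob is only an example.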

redis-cli -p 7000 -h 2409:8::0:11
[2409:8::0:11]:7000> slaveof 2409:8::0:12 7003   # responds with:
[error] ERR SLAVEOF not allowed in cluster mode.  # Redis replication is set up with the SLAVEOF command in standalone mode and with the CLUSTER REPLICATE command in cluster mode; SLAVEOF is not supported in cluster mode
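
In cluster mode the equivalent step is `cluster replicate <node-id>`, run on the node that should become the slave, and it takes the master's node id rather than its address. The sketch below looks the id up in a saved `cluster nodes` dump and prints the command for review. `node_id_for` and `attach_replica` are illustrative names. How an IPv6 address appears in the address field was not checked here, so copy it from the dump rather than typing it. A node that is currently a master can only be turned into a slave this way if it owns no slots and holds no keys.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, not from the original post.

# Looks up the node id for a given "ip:port" in saved `cluster nodes` output.
# The address field is "ip:port" or "ip:port@busport"; the bus part is ignored.
node_id_for() {
    awk -v addr="$2" '{ a = $2; sub(/@.*/, "", a); if (a == addr) { print $1; exit } }' "$1"
}

# Prints the redis-cli command that would attach a replica to a master.
# Arguments: dump file, master "ip:port", replica host, replica port.
attach_replica() {
    id=$(node_id_for "$1" "$2")
    if [ -z "$id" ]; then
        echo "no node with address $2 in $1" >&2
        return 1
    fi
    echo "redis-cli -h $3 -p $4 cluster replicate $id"
}
```

The printed command can be inspected and then pasted into a shell, which keeps the sketch from changing a live cluster by accident.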

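# Node 11 was later found to still hold information from the previous cluster; shutting down the instances on both sides and restarting them fixed that

# After rebuilding the cluster, all nodes still showed up as M (master); iptables was suspected to be the remaining cause, so the ip6tables service was stopped as a test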
service ip6tables stop
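
If the firewall does turn out to be the cause, a narrower alternative to stopping the service is to open only the ports the cluster needs. Each node listens on its client port and on a cluster bus port, which is the client port plus 10000. The sketch below only prints candidate rules. `cluster_fw_rules` is an illustrative name, and the rules assume plain TCP accept rules on the INPUT chain are appropriate for the host.

```shell
#!/bin/sh
# Sketch only: cluster_fw_rules is an illustrative name; review the printed
# rules before applying any of them.

# For each client port given, prints ip6tables rules that accept TCP traffic
# on the port itself and on the cluster bus port (client port + 10000).
cluster_fw_rules() {
    for port in "$@"; do
        bus=$((port + 10000))
        echo "ip6tables -I INPUT -p tcp --dport $port -j ACCEPT"
        echo "ip6tables -I INPUT -p tcp --dport $bus -j ACCEPT"
    done
}
```

For example, `cluster_fw_rules 7000 7003 7004` prints six rules covering ports 7000, 7003, 7004 and their bus ports 17000, 17003, 17004.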
