Connect to redis:
./redis-cli -h ip -p port
List the cluster nodes (master/replica roles, slot ranges)
cluster nodes
Show the slot a key maps to
cluster keyslot key
Show the mapping between slots and nodes
cluster slots
The serialized length of a key can be checked with
debug object key
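The slot reported by cluster keyslot can also be reproduced on the client side. According to the Redis Cluster specification, the slot is CRC16(key) mod 16384 (the XMODEM variant of CRC16), and if the key contains a non-empty {...} hash tag, only the part inside the braces is hashed. Below is a minimal sketch of that rule; the class and method names (KeySlotSketch, crc16, slot) are made up for illustration and are not the Jedis API.

```java
import java.nio.charset.StandardCharsets;

class KeySlotSketch {
    static final int SLOT_COUNT = 16384;

    // CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0, no bit reflection.
    static int crc16(byte[] data, int from, int to) {
        int crc = 0;
        for (int i = from; i < to; i++) {
            crc ^= (data[i] & 0xFF) << 8;
            for (int bit = 0; bit < 8; bit++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF; // keep the register at 16 bits
            }
        }
        return crc;
    }

    // Slot of a key, honoring the hash-tag rule: hash only the text between
    // the first '{' and the first '}' after it, if that text is non-empty.
    static int slot(String key) {
        byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
        int from = 0;
        int to = bytes.length;
        int open = -1;
        for (int i = 0; i < bytes.length; i++) {
            if (bytes[i] == '{') { open = i; break; }
        }
        if (open >= 0) {
            int close = -1;
            for (int i = open + 1; i < bytes.length; i++) {
                if (bytes[i] == '}') { close = i; break; }
            }
            if (close > open + 1) { // non-empty tag found
                from = open + 1;
                to = close;
            }
        }
        return crc16(bytes, from, to) % SLOT_COUNT;
    }

    public static void main(String[] args) {
        System.out.println("slot(foo) = " + slot("foo"));
        System.out.println("same slot for tagged keys: "
                + (slot("{user1000}.following") == slot("{user1000}.followers")));
    }
}
```

For the key foo the result should agree with what cluster keyslot foo reports (12182 in the Redis Cluster tutorial). Keys that share a hash tag land in the same slot, which is what makes multi-key operations on them possible in a cluster.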
redis.clients.jedis.exceptions.JedisClusterMaxRedirectionsException: Too many Cluster redirections?
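Redis processes requests on a single thread. If one command runs very slowly, newly arriving requests are placed in the TCP queue to wait; once the number of pending commands exceeds what the TCP queue can hold, further requests are rejected.
The stack trace of the exception corresponds to the process described below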
redis.clients.jedis.exceptions.JedisClusterMaxRedirectionsException: Too many Cluster redirections?
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:97)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:152)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:131)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:152)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:131)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:152)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:131)
at redis.clients.jedis.JedisClusterCommand.run(JedisClusterCommand.java:30)
at redis.clients.jedis.JedisCluster.get(JedisCluster.java:81)
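1. jedis-client first computes the slot from the key, then looks the slot up in its slot-to-server mapping table. Assuming server1 is responsible for this key, it sends the request directly to server1.
2. server1 is busy with a slow command and its TCP queue is full, so it rejects jedis-client outright.
3. After being rejected, jedis-client suspects that server1 is unavailable and sends the request to a randomly chosen server, server2.
4. server2 tells jedis-client that server1 is still responsible for the slot of this key.
5. jedis-client sends the request to server1 again.
6. server1 is still running the slow command and its TCP queue is still full, so it rejects jedis-client again.
Once the number of requests exceeds 5, a JedisClusterMaxRedirectionsException is thrown.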
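The loop in steps 1 to 6 can be sketched as a small simulation. This is a simplified model and not the Jedis implementation: it assumes that every hop (a request to server1 or to the randomly chosen server2) is charged against one shared budget of 5 attempts, and the names RedirectionSketch, run and server1Accepts are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

class RedirectionSketch {
    // Simulates the client loop. server1Accepts is a hypothetical stand-in for
    // "server1 accepts the n-th request it receives" (n starts at 1).
    // Returns the list of hops on success, throws when the budget runs out.
    static List<String> run(int maxAttempts, IntPredicate server1Accepts) {
        List<String> hops = new ArrayList<>();
        int attemptsLeft = maxAttempts;
        int server1Tries = 0;
        boolean askRandomNode = false;

        while (attemptsLeft > 0) {
            attemptsLeft--; // assumption: every hop consumes one attempt
            if (askRandomNode) {
                // steps 3 and 4: server2 only points back to the slot owner
                hops.add("server2 -> MOVED to server1");
                askRandomNode = false;
                continue;
            }
            server1Tries++;
            if (server1Accepts.test(server1Tries)) {
                hops.add("server1 -> OK");
                return hops;
            }
            // steps 2 and 6: server1 is blocked by a slow command
            hops.add("server1 -> rejected");
            askRandomNode = true;
        }
        throw new IllegalStateException(
                "Too many Cluster redirections? (" + hops.size() + " hops)");
    }

    public static void main(String[] args) {
        // server1 recovers on its second request: the call succeeds
        System.out.println(run(5, n -> n == 2));
        // server1 never recovers: the budget is exhausted
        try {
            run(5, n -> false);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the model is that the client bounces between the blocked slot owner and other nodes without making progress, so the exception message mentions redirections even though the underlying problem is a server that cannot accept requests.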
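Solutions
1. Use the slow log to find and optimize slow commands; see "Enabling the Redis Slowlog".
2. Scale out the Redis cluster. This reduces the total number of commands each Redis process has to handle, and it also indirectly increases the total TCP queue capacity.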
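References:
- Finding which node of a Redis cluster holds a given key
- Java: saving an exception's full stack trace to a string (explained in detail)
- REDIS (Jedis client) Too Many Cluster Redirections exception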
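After logging the full call chain of the exception, the actual cause turned out to be
an Unexpected end of stream exception.
The fix is to enable idle-connection checking in the connection pool: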
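setTestWhileIdle(true); // enable validation of idle connections
setMinEvictableIdleTimeMillis(60000); // a connection idle for this long (60 s) is evicted
setTimeBetweenEvictionRunsMillis(30000); // interval between idle-connection checks, 30 s
setNumTestsPerEvictionRun(-1); // -1 means every connection is checked on each run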
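The idle check is driven by setTimeBetweenEvictionRunsMillis(30000);
in practice a PING command is sent every 30 s, so the idle connection does not actually get removed (the client is configured with 60 s, the server with 10 min).
This was verified with a packet capture: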
tcpdump -nn '(src host 10.100.7.207) and ((port 6379) or (port 7379) or (port 7380))' -A
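Capture packets on interface eth0 with source address 10.100.1.25, destination address 10.102.16.126 and destination port 6379 (requests sent to 10.102.16.126), and save them to aaa.cap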
tcpdump -i eth0 -nn -vv -s0 src host 10.100.1.25 and dst host 10.102.16.126 and dst port 6379 -w aaa.cap
Read the capture file back with tcpdump
tcpdump -nn -A -r aaa.cap
// Capture with tcpdump, save to a file, then analyze the file in wireshark
tcpdump -i eth0 'port 1988' -vv -nn -s 0 -w zk14.cap
// All packets sent and received by host 192.168.1.11 on interface eth0
tcpdump -i eth0 '(host 192.168.1.11)' -vv -nn -s 0 -w save.cap
// Packets exchanged between host 192.168.1.11 and host 210.27.48.2 or 210.27.48.3
tcpdump -i eth0 '(host 192.168.1.11 and (210.27.48.2 or 210.27.48.3))' -vv -nn -s 0 -w save.cap
References:
Fixing the Unexpected end of stream exception thrown by Jedis
 