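The previous article covered Sentinel concepts and deployment notes. This article walks through a hands-on deployment and the checks used to verify it.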

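Two machines are used: machine A at 10.10.10.126 and machine B at 10.10.10.118.

Machine A runs the Redis master on port 6379, plus Sentinel1 on port 26379, Sentinel2 on port 26479, and Sentinel3 on port 26579.

Machine B runs the Redis slave on port 6379.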

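For how to install Redis, see the earlier article "Redis Installation Guide".

Sentinel is really just a Redis server running in a special mode, so it is installed the same way as Redis.


The rest of this article assumes that Redis and Sentinel are already installed.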

Installation directories on machine A

Master: /data/apps/redis-3.0.7_6379, config file: /data/apps/redis-3.0.7_6379/redis.conf

Sentinel1: /data/apps/redis-3.0.7_26379, config file: /data/apps/redis-3.0.7_26379/sentinel.conf

Sentinel2: /data/apps/redis-3.0.7_26479, config file: /data/apps/redis-3.0.7_26479/sentinel.conf

Sentinel3: /data/apps/redis-3.0.7_26579, config file: /data/apps/redis-3.0.7_26579/sentinel.conf


Installation directories on machine B

Slave: /data/apps/redis-3.0.7_6379, config file: /data/apps/redis-3.0.7_6379/redis.conf


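Keep sentinel.conf minimal. For example, Sentinel1 uses the following configuration (the trailing 2 on the monitor line is the quorum, the number of Sentinels that must agree the master is down):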
port 26379
daemonize yes
sentinel monitor mymaster 10.10.10.126 6379 2
sentinel down-after-milliseconds mymaster 5000

Sentinel2 and Sentinel3 use the same configuration; only the port needs to change.
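
Since the three files differ only in the port, they can be generated from one template. The following is a minimal sketch, not part of the original setup: the function name render_sentinel_conf and the constants are made up for illustration, and the values are the ones used in this article.

```python
# Sketch: render the minimal sentinel.conf shown above for each Sentinel port.
# All names here are illustrative; only the directive values come from the article.
MASTER_NAME = "mymaster"
MASTER_IP = "10.10.10.126"
MASTER_PORT = 6379
QUORUM = 2            # Sentinels that must agree the master is down
DOWN_AFTER_MS = 5000  # unreachable time before a Sentinel marks the master down


def render_sentinel_conf(port):
    """Return the text of a minimal sentinel.conf for one Sentinel."""
    lines = [
        f"port {port}",
        "daemonize yes",
        f"sentinel monitor {MASTER_NAME} {MASTER_IP} {MASTER_PORT} {QUORUM}",
        f"sentinel down-after-milliseconds {MASTER_NAME} {DOWN_AFTER_MS}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for port in (26379, 26479, 26579):
        print(f"# sentinel.conf for port {port}")
        print(render_sentinel_conf(port))
```

Writing each result to the matching /data/apps/redis-3.0.7_<port>/sentinel.conf is left out on purpose, because Sentinel rewrites that file itself once it is running.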


1. Start the master

/data/apps/redis-3.0.7_6379/bin/redis-server /data/apps/redis-3.0.7_6379/redis.conf


2. Start the slave

The master-slave relationship has to be configured first. Log in to 10.10.10.118 and edit redis.conf.

Find the line # slaveof <masterip> <masterport>, replace it with slaveof 10.10.10.126 6379, and then start the server:

/data/apps/redis-3.0.7_6379/bin/redis-server /data/apps/redis-3.0.7_6379/redis.conf
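
If the edit has to be repeated on several slaves, it can be scripted. The sketch below is only an illustration of the manual step above: set_slaveof is a hypothetical helper, and it works on the file content as a string rather than touching redis.conf directly.

```python
import re

# Matches either the commented-out template or an active slaveof directive.
SLAVEOF_RE = re.compile(r"^[ \t]*#?[ \t]*slaveof[ \t]+\S+[ \t]+\S+[ \t]*$", re.MULTILINE)


def set_slaveof(conf_text, master_ip, master_port):
    """Return conf_text with the first slaveof line pointing at the given master.

    If no slaveof line (commented or not) is found, the directive is appended.
    """
    directive = f"slaveof {master_ip} {master_port}"
    new_text, count = SLAVEOF_RE.subn(directive, conf_text, count=1)
    if count == 0:
        new_text = conf_text.rstrip("\n") + "\n" + directive + "\n"
    return new_text


if __name__ == "__main__":
    sample = "port 6379\n# slaveof <masterip> <masterport>\ndaemonize yes\n"
    print(set_slaveof(sample, "10.10.10.126", 6379))
```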


3. Check the replication status

On 10.10.10.126, run /data/apps/redis-3.0.7_6379/bin/redis-cli info. The Replication section of the output looks like this:

# Replication
role:master
connected_slaves:1
slave0:ip=10.10.10.118,port=6379,state=online,offset=1373,lag=0
master_repl_offset:1373
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:1372

On 10.10.10.118, run /data/apps/redis-3.0.7_6379/bin/redis-cli info. The Replication section of the output looks like this:

# Replication
role:slave
master_host:10.10.10.126
master_port:6379
master_link_status:up
master_last_io_seconds_ago:3
master_sync_in_progress:0
slave_repl_offset:1513
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1047675
repl_backlog_histlen:18258

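The master reports connected_slaves:1 with slave0 in state=online, and the slave reports master_link_status:up, so replication is working normally.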
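
The same check can be automated by parsing the info output. This is a minimal sketch under the assumption that the text of the Replication section has already been captured (for example from redis-cli); parse_info and replication_ok are illustrative names, and the sample strings are excerpts of the output above.

```python
def parse_info(text):
    """Parse 'key:value' lines of INFO output into a dict, skipping '# Section' headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def replication_ok(master_info, slave_info):
    """True when the master sees at least one slave and the slave's link is up."""
    return (
        master_info.get("role") == "master"
        and int(master_info.get("connected_slaves", 0)) >= 1
        and slave_info.get("role") == "slave"
        and slave_info.get("master_link_status") == "up"
    )


MASTER_SAMPLE = """# Replication
role:master
connected_slaves:1
slave0:ip=10.10.10.118,port=6379,state=online,offset=1373,lag=0
"""

SLAVE_SAMPLE = """# Replication
role:slave
master_host:10.10.10.126
master_port:6379
master_link_status:up
"""

if __name__ == "__main__":
    print(replication_ok(parse_info(MASTER_SAMPLE), parse_info(SLAVE_SAMPLE)))
```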


4. Start the Sentinel cluster

Run the following on machine A:

/data/apps/redis-3.0.7_26379/bin/redis-sentinel /data/apps/redis-3.0.7_26379/sentinel.conf
/data/apps/redis-3.0.7_26479/bin/redis-sentinel /data/apps/redis-3.0.7_26479/sentinel.conf
/data/apps/redis-3.0.7_26579/bin/redis-sentinel /data/apps/redis-3.0.7_26579/sentinel.conf

After startup, open /data/apps/redis-3.0.7_26379/sentinel.conf again (for example with vi) and you will find that some lines have been added:


# Generated by CONFIG REWRITE
dir "/data/apps"
maxmemory 3gb
sentinel config-epoch mymaster 0
sentinel leader-epoch mymaster 0
sentinel known-slave mymaster 10.10.10.118 6379
sentinel known-sentinel mymaster 10.10.10.126 26579 3833571a3ddc514b9e83a0367ce85e9bfa9fa251
sentinel known-sentinel mymaster 10.10.10.126 26479 33a66a47a2b6bc39a9404dd63452313ac3d56e24
sentinel current-epoch 0

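Sentinel wrote these lines itself. They record the configuration epoch (config-epoch), the leader election epoch (leader-epoch), the slaves it has discovered, and the other Sentinels it has discovered.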
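
To see at a glance what a Sentinel has discovered, the rewritten lines can be parsed. The sketch below is illustrative only: parse_sentinel_state is a made-up helper, and it handles just the known-slave and known-sentinel lines in the Redis 3.0.7 format shown above.

```python
def parse_sentinel_state(conf_text):
    """Collect the slaves and Sentinels recorded in a rewritten sentinel.conf."""
    state = {"slaves": [], "sentinels": []}
    for line in conf_text.splitlines():
        parts = line.split()
        if len(parts) < 2 or parts[0] != "sentinel":
            continue
        if parts[1] == "known-slave" and len(parts) == 5:
            # sentinel known-slave <master-name> <ip> <port>
            state["slaves"].append((parts[3], int(parts[4])))
        elif parts[1] == "known-sentinel" and len(parts) == 6:
            # sentinel known-sentinel <master-name> <ip> <port> <runid>
            state["sentinels"].append((parts[3], int(parts[4]), parts[5]))
    return state


REWRITTEN_SAMPLE = """sentinel config-epoch mymaster 0
sentinel leader-epoch mymaster 0
sentinel known-slave mymaster 10.10.10.118 6379
sentinel known-sentinel mymaster 10.10.10.126 26579 3833571a3ddc514b9e83a0367ce85e9bfa9fa251
sentinel known-sentinel mymaster 10.10.10.126 26479 33a66a47a2b6bc39a9404dd63452313ac3d56e24
sentinel current-epoch 0
"""

if __name__ == "__main__":
    print(parse_sentinel_state(REWRITTEN_SAMPLE))
```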


5. Check the state of the Sentinel cluster

On machine A, run /data/apps/redis-3.0.7_26379/bin/redis-cli -p 26379. The same cluster information can be obtained from any of the Sentinels.

1) sentinel master mymaster shows information about the Redis master

127.0.0.1:26379> sentinel master mymaster
1) "name"
2) "mymaster"
3) "ip"
4) "10.10.10.126"
5) "port"
6) "6379"
7) "runid"
8) "954a9e9296f85dcaf86fa472218a9f4a515bb6a0"
9) "flags"
10) "master"
11) "pending-commands"
12) "0"
13) "last-ping-sent"
14) "0"
15) "last-ok-ping-reply"
16) "734"
17) "last-ping-reply"
18) "735"
19) "down-after-milliseconds"
20) "5000"
21) "info-refresh"
22) "7838"
23) "role-reported"
24) "master"
25) "role-reported-time"
26) "479362"
27) "config-epoch"
28) "0"
29) "num-slaves"
30) "1"
31) "num-other-sentinels"
32) "2"
33) "quorum"
34) "2"
35) "failover-timeout"
36) "180000"
37) "parallel-syncs"
38) "1"
127.0.0.1:26379>

2) sentinel slaves mymaster shows information about the slaves

127.0.0.1:26379> sentinel slaves mymaster
1)  1) "name"
   2) "10.10.10.118:6379"
   3) "ip"
   4) "10.10.10.118"
   5) "port"
   6) "6379"
   7) "runid"
   8) "e98476d58e00e59a4acdc0c1f2c22d3a79e19815"
   9) "flags"
  10) "slave"
  11) "pending-commands"
  12) "0"
  13) "last-ping-sent"
  14) "0"
  15) "last-ok-ping-reply"
  16) "408"
  17) "last-ping-reply"
  18) "408"
  19) "down-after-milliseconds"
  20) "5000"
  21) "info-refresh"
  22) "270"
  23) "role-reported"
  24) "slave"
  25) "role-reported-time"
  26) "582165"
  27) "master-link-down-time"
  28) "0"
  29) "master-link-status"
  30) "ok"
  31) "master-host"
  32) "10.10.10.126"
  33) "master-port"
  34) "6379"
  35) "slave-priority"
  36) "100"
  37) "slave-repl-offset"
  38) "115050"
127.0.0.1:26379>


3) sentinel sentinels mymaster shows information about the other Sentinels in the cluster

127.0.0.1:26379> sentinel sentinels mymaster
1)  1) "name"
   2) "10.10.10.126:26579"
   3) "ip"
   4) "10.10.10.126"
   5) "port"
   6) "26579"
   7) "runid"
   8) "3833571a3ddc514b9e83a0367ce85e9bfa9fa251"
   9) "flags"
  10) "sentinel"
  11) "pending-commands"
  12) "0"
  13) "last-ping-sent"
  14) "0"
  15) "last-ok-ping-reply"
  16) "691"
  17) "last-ping-reply"
  18) "691"
  19) "down-after-milliseconds"
  20) "5000"
  21) "last-hello-message"
  22) "1665"
  23) "voted-leader"
  24) "?"
  25) "voted-leader-epoch"
  26) "0"
2)  1) "name"
   2) "10.10.10.126:26479"
   3) "ip"
   4) "10.10.10.126"
   5) "port"
   6) "26479"
   7) "runid"
   8) "33a66a47a2b6bc39a9404dd63452313ac3d56e24"
   9) "flags"
  10) "sentinel"
  11) "pending-commands"
  12) "0"
  13) "last-ping-sent"
  14) "0"
  15) "last-ok-ping-reply"
  16) "691"
  17) "last-ping-reply"
  18) "691"
  19) "down-after-milliseconds"
  20) "5000"
  21) "last-hello-message"
  22) "1397"
  23) "voted-leader"
  24) "?"
  25) "voted-leader-epoch"
  26) "0"
127.0.0.1:26379>

4) sentinel get-master-addr-by-name mymaster returns the IP and port of the current master

127.0.0.1:26379> sentinel get-master-addr-by-name mymaster
1) "10.10.10.126"
2) "6379"
127.0.0.1:26379>

The fields above are fairly self-explanatory, so they are not described one by one here.
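
The replies to sentinel master, sentinel slaves, and sentinel sentinels are flat lists that alternate field name and value. A client usually turns them into a dictionary before reading individual fields. A minimal sketch, assuming the reply has already been received as a list of strings (reply_to_dict is an illustrative name, and the sample is an excerpt of the sentinel master reply above):

```python
def reply_to_dict(flat_reply):
    """Convert ['name', 'mymaster', 'ip', '10.10.10.126', ...] into a dict."""
    if len(flat_reply) % 2 != 0:
        raise ValueError("expected an even number of elements (field/value pairs)")
    return dict(zip(flat_reply[0::2], flat_reply[1::2]))


MASTER_REPLY_SAMPLE = [
    "name", "mymaster",
    "ip", "10.10.10.126",
    "port", "6379",
    "flags", "master",
    "num-slaves", "1",
    "num-other-sentinels", "2",
    "quorum", "2",
]

if __name__ == "__main__":
    info = reply_to_dict(MASTER_REPLY_SAMPLE)
    # The values are strings, exactly as Sentinel returns them.
    print(info["ip"], info["port"], info["num-other-sentinels"])
```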


6. Failover

Running the following command, for example, forces a failover:

127.0.0.1:26379> sentinel failover mymaster
OK

Looking at sentinel.conf again shows that it has been rewritten and the master has changed:

sentinel monitor mymaster 10.10.10.118 6379 2  


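You can also log in to Redis and run the info command; the replication section shows that the master and slave have swapped roles.


Another way to verify failover is to kill the master process on machine A directly: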

[root@test1 bin]# ps -ef|grep redis
root     18410     1  0 13:09 ?        00:00:00 ./redis-server *:6379      
root     18455     1  0 13:29 ?        00:00:00 /data/apps/redis-3.0.7_26379/bin/redis-sentinel *:26379 [sentinel]                        
root     18465     1  0 13:29 ?        00:00:00 /data/apps/redis-3.0.7_26479/bin/redis-sentinel *:26479 [sentinel]                        
root     18469     1  0 13:29 ?        00:00:00 /data/apps/redis-3.0.7_26579/bin/redis-sentinel *:26579 [sentinel]                        
root     18483 17823  0 13:34 pts/0    00:00:00 /data/apps/redis-3.0.7_26379/bin/redis-cli -p 26379
root     18513 17941  0 13:49 pts/1    00:00:00 grep redis
[root@test1 bin]#
[root@test1 bin]# kill -9 18410
[root@test1 bin]#


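Once the master has been unreachable for longer than down-after-milliseconds (5000 ms here) and the quorum of 2 Sentinels agrees that it is down, 10.10.10.118 is automatically promoted to master.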


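7. Viewing Pub/Sub messages

Log in to Redis on machine A and machine B and run psubscribe * to listen on all channels. Messages arrive continuously, as shown in the figure below.

[Figure: messages received after running psubscribe *]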

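This shows that Sentinel keeps publishing messages to the master and slaves it monitors. Each message contains the Sentinel's IP, port, and run ID, the name of the monitored master, the master's IP and port, and the epoch values.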
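
These messages can be split into their fields. The sketch below assumes the comma-separated layout used by Redis 3.x on the __sentinel__:hello channel (sentinel ip, sentinel port, run ID, current epoch, master name, master ip, master port, master config epoch); that layout is not stated in this article, so verify it against your Redis version. The sample payload is constructed for illustration from values that appear elsewhere in this article, and parse_hello is a made-up name.

```python
from collections import namedtuple

# Field order is an assumption based on Redis 3.x; check your version.
Hello = namedtuple(
    "Hello",
    "sentinel_ip sentinel_port run_id current_epoch "
    "master_name master_ip master_port master_config_epoch",
)


def parse_hello(payload):
    """Split one hello payload into named fields, converting numbers to int."""
    parts = payload.split(",")
    if len(parts) != 8:
        raise ValueError(f"expected 8 comma-separated fields, got {len(parts)}")
    return Hello(
        sentinel_ip=parts[0],
        sentinel_port=int(parts[1]),
        run_id=parts[2],
        current_epoch=int(parts[3]),
        master_name=parts[4],
        master_ip=parts[5],
        master_port=int(parts[6]),
        master_config_epoch=int(parts[7]),
    )


# Illustrative payload, not captured output.
HELLO_SAMPLE = (
    "10.10.10.126,26579,3833571a3ddc514b9e83a0367ce85e9bfa9fa251,0,"
    "mymaster,10.10.10.126,6379,0"
)

if __name__ == "__main__":
    print(parse_hello(HELLO_SAMPLE))
```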


On machine A, connect to a Sentinel with /data/apps/redis-3.0.7_6379/bin/redis-cli -p 26379 and run psubscribe *.

When a failover is forced with sentinel failover mymaster, the following messages appear:

127.0.0.1:26379> psubscribe *
Reading messages... (press Ctrl-C to quit)
1) "psubscribe"
2) "*"
3) (integer) 1
1) "pmessage"
2) "*"
3) "+new-epoch"
4) "11"
1) "pmessage"
2) "*"
3) "+try-failover"
4) "master mymaster 10.10.10.126 6379"
1) "pmessage"
2) "*"
3) "+vote-for-leader"
4) "e05c0177a13fba5672ca6bf24dc96ce2fff208c6 11"
1) "pmessage"
2) "*"
3) "+elected-leader"
4) "master mymaster 10.10.10.126 6379"
1) "pmessage"
2) "*"
3) "+failover-state-select-slave"
4) "master mymaster 10.10.10.126 6379"
1) "pmessage"
2) "*"
3) "+role-change"
4) "slave 10.10.10.118:6379 10.10.10.118 6379 @ mymaster 10.10.10.126 6379 new reported role is slave"
1) "pmessage"
2) "*"
3) "+selected-slave"
4) "slave 10.10.10.118:6379 10.10.10.118 6379 @ mymaster 10.10.10.126 6379"
1) "pmessage"
2) "*"
3) "+failover-state-send-slaveof-noone"
4) "slave 10.10.10.118:6379 10.10.10.118 6379 @ mymaster 10.10.10.126 6379"
1) "pmessage"
2) "*"
3) "+failover-state-wait-promotion"
4) "slave 10.10.10.118:6379 10.10.10.118 6379 @ mymaster 10.10.10.126 6379"
1) "pmessage"
2) "*"
3) "-role-change"
4) "slave 10.10.10.118:6379 10.10.10.118 6379 @ mymaster 10.10.10.126 6379 new reported role is master"
1) "pmessage"
2) "*"
3) "+promoted-slave"
4) "slave 10.10.10.118:6379 10.10.10.118 6379 @ mymaster 10.10.10.126 6379"
1) "pmessage"
2) "*"
3) "+failover-state-reconf-slaves"
4) "master mymaster 10.10.10.126 6379"
1) "pmessage"
2) "*"
3) "+failover-end"
4) "master mymaster 10.10.10.126 6379"
1) "pmessage"
2) "*"
3) "+switch-master"
4) "mymaster 10.10.10.126 6379 10.10.10.118 6379"
1) "pmessage"
2) "*"
3) "+slave"
4) "slave 10.10.10.126:6379 10.10.10.126 6379 @ mymaster 10.10.10.118 6379"
1) "pmessage"
2) "*"
3) "-role-change"
4) "slave 10.10.10.126:6379 10.10.10.126 6379 @ mymaster 10.10.10.118 6379 new reported role is master"
1) "pmessage"
2) "*"
3) "+role-change"
4) "slave 10.10.10.126:6379 10.10.10.126 6379 @ mymaster 10.10.10.118 6379 new reported role is slave"


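Clients use exactly these messages to switch over automatically. The client subscribes to a channel such as "+switch-master"; when a failover happens it receives the message published by Sentinel, parses it to obtain the new master's address, and then rebuilds its JedisPool so that it operates on the new master.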
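
The parsing step can be sketched as follows. This is an illustration in Python rather than Jedis code: parse_switch_master and needs_new_pool are hypothetical names, and the payload layout (master name, old ip, old port, new ip, new port) is taken from the +switch-master message in the output above.

```python
from collections import namedtuple

SwitchMaster = namedtuple("SwitchMaster", "master_name old_addr new_addr")


def parse_switch_master(payload):
    """Parse '<name> <old-ip> <old-port> <new-ip> <new-port>' from +switch-master."""
    parts = payload.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    name, old_ip, old_port, new_ip, new_port = parts
    return SwitchMaster(name, (old_ip, int(old_port)), (new_ip, int(new_port)))


def needs_new_pool(event, watched_master, current_addr):
    """True when the event is for our master and points somewhere new."""
    return event.master_name == watched_master and event.new_addr != current_addr


if __name__ == "__main__":
    event = parse_switch_master("mymaster 10.10.10.126 6379 10.10.10.118 6379")
    if needs_new_pool(event, "mymaster", ("10.10.10.126", 6379)):
        # A real client would close the old pool here and create one for new_addr.
        print("reconnect to", event.new_addr)
```

Checking the master name first matters when one Sentinel group monitors several masters, because every switch is published on the same channel.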