Introduction and Features

Pacemaker works at the resource allocation layer and provides resource manager functionality.
Corosync provides the cluster messaging layer, passing heartbeat and cluster transaction information.
Together, Pacemaker + Corosync implement a high-availability cluster architecture.

Cluster Setup

Run the following on all three nodes:
# yum install pcs -y
# systemctl start pcsd ; systemctl enable pcsd
# echo 'hacluster' | passwd --stdin hacluster
# yum install haproxy rsyslog -y
# echo 'net.ipv4.ip_nonlocal_bind = 1' >> /etc/sysctl.conf  # allow services to start even when the VIP is not assigned locally
# echo 'net.ipv4.ip_forward = 1' >> /etc/sysctl.conf        # enable kernel IP forwarding
# sysctl -p
On any one node, create the user that HAProxy will use to monitor MariaDB:

MariaDB [(none)]> CREATE USER 'haproxy'@'%' ;
Configure HAProxy as the load balancer:
[root@controller1 ~]# egrep -v "^#|^$" /etc/haproxy/haproxy.cfg
global
    log         127.0.0.1 local2
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon
    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option                  http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 4000
listen galera_cluster
    mode tcp
    bind 192.168.0.10:3306
    balance source
    option mysql-check user haproxy
    server controller1 192.168.0.11:3306 check inter 2000 rise 3 fall 3 backup
    server controller2 192.168.0.12:3306 check inter 2000 rise 3 fall 3
    server controller3 192.168.0.13:3306 check inter 2000 rise 3 fall 3 backup
listen memcache_cluster
    mode tcp
    bind 192.168.0.10:11211
    balance source
    option tcplog
    server controller1 192.168.0.11:11211 check inter 2000 rise 3 fall 3
    server controller2 192.168.0.12:11211 check inter 2000 rise 3 fall 3
    server controller3 192.168.0.13:11211 check inter 2000 rise 3 fall 3
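Before distributing the file, the backend definitions can be sanity-checked mechanically. A minimal sketch, not from the original post: the awk one-liner prints the name and address of every "server" line so IPs and ports can be eyeballed. The /tmp path and the stub config are illustrative only; on a real node point awk at /etc/haproxy/haproxy.cfg.

```shell
# Write a small stub config (stands in for /etc/haproxy/haproxy.cfg).
cfg=/tmp/haproxy-check.cfg
cat > "$cfg" <<'EOF'
listen galera_cluster
    server controller1 192.168.0.11:3306 check inter 2000 rise 3 fall 3 backup
    server controller2 192.168.0.12:3306 check inter 2000 rise 3 fall 3
EOF
# List each backend's name and address for review.
awk '$1 == "server" { print $2, $3 }' "$cfg"
```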
Notes:
(1) Make sure the HAProxy configuration is correct; it is best to adjust the IPs and ports first and start HAProxy once to confirm it works.
(2) MariaDB-Galera and RabbitMQ listen on 0.0.0.0 by default; change them to listen only on the node's own address, 192.168.0.x.
(3) Copy the verified HAProxy configuration to the other nodes; there is no need to start the haproxy service manually.
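Note (3) can be scripted. A sketch assuming passwordless SSH between the controllers; the "echo" makes this a dry run, so remove it to actually copy:

```shell
# Dry run: print the scp command that would push the config to each peer.
for node in controller2 controller3; do
    echo scp /etc/haproxy/haproxy.cfg "${node}:/etc/haproxy/haproxy.cfg"
done
```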
Configure logging for HAProxy (run on all controller nodes):
# vim /etc/rsyslog.conf
…
$ModLoad imudp
$UDPServerRun 514
…
local2.*    /var/log/haproxy/haproxy.log
…

# mkdir -pv /var/log/haproxy/
mkdir: created directory ‘/var/log/haproxy/’

# systemctl restart rsyslog
Start HAProxy and verify:
# systemctl start haproxy
[root@controller1 ~]# netstat -ntplu | grep ha
tcp   0   0 192.168.0.10:3306    0.0.0.0:*   LISTEN   15467/haproxy
tcp   0   0 192.168.0.10:11211   0.0.0.0:*   LISTEN   15467/haproxy
udp   0   0 0.0.0.0:43268        0.0.0.0:*            15466/haproxy
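The listener check can be automated. A sketch, not from the original post: the netstat output is stubbed here for illustration; on a real node capture it with listeners=$(netstat -ntpl) instead.

```shell
# Stubbed listener table (illustrative; replace with real netstat output).
listeners='tcp 0 0 192.168.0.10:3306 0.0.0.0:* LISTEN 15467/haproxy
tcp 0 0 192.168.0.10:11211 0.0.0.0:* LISTEN 15467/haproxy'
# Confirm both VIP listeners are present.
for port in 3306 11211; do
    if printf '%s\n' "$listeners" | grep -q "192\.168\.0\.10:${port} "; then
        echo "port ${port}: OK"
    else
        echo "port ${port}: MISSING"
    fi
done
```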
Verification succeeded; stop HAProxy:

# systemctl stop haproxy
Run on the controller1 node:

[root@controller1 ~]# pcs cluster auth controller1 controller2 controller3 -u hacluster -p hacluster --force
controller3: Authorized
controller2: Authorized
controller1: Authorized

Create the cluster:
[root@controller1 ~]# pcs cluster setup --name openstack-cluster controller1 controller2 controller3 --force
Destroying cluster on nodes: controller1, controller2, controller3...
controller3: Stopping Cluster (pacemaker)...
controller2: Stopping Cluster (pacemaker)...
controller1: Stopping Cluster (pacemaker)...
controller3: Successfully destroyed cluster
controller1: Successfully destroyed cluster
controller2: Successfully destroyed cluster

Sending 'pacemaker_remote authkey' to 'controller1', 'controller2', 'controller3'
controller3: successful distribution of the file 'pacemaker_remote authkey'
controller1: successful distribution of the file 'pacemaker_remote authkey'
controller2: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
controller1: Succeeded
controller2: Succeeded
controller3: Succeeded

Synchronizing pcsd certificates on nodes controller1, controller2, controller3...
controller3: Success
controller2: Success
controller1: Success
Restarting pcsd on the nodes in order to reload the certificates...
controller3: Success
controller2: Success
controller1: Success
Start all cluster nodes:

[root@controller1 ~]# pcs cluster start --all
controller2: Starting Cluster...
controller1: Starting Cluster...
controller3: Starting Cluster...
[root@controller1 ~]# pcs cluster enable --all
controller1: Cluster Enabled
controller2: Cluster Enabled
controller3: Cluster Enabled
View cluster status:

[root@controller1 ~]# pcs status
Cluster name: openstack-cluster
WARNING: no stonith devices and stonith-enabled is not false
Stack: corosync
Current DC: controller3 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
Last updated: Thu Nov 30 19:30:43 2017
Last change: Thu Nov 30 19:30:17 2017 by hacluster via crmd on controller3

3 nodes configured
0 resources configured

Online: [ controller1 controller2 controller3 ]

No resources

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@controller1 ~]# pcs cluster status
Cluster Status:
 Stack: corosync
 Current DC: controller3 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
 Last updated: Thu Nov 30 19:30:52 2017
 Last change: Thu Nov 30 19:30:17 2017 by hacluster via crmd on controller3
 3 nodes configured
 0 resources configured

PCSD Status:
  controller2: Online
  controller3: Online
  controller1: Online
All three nodes are online.

By default, the quorum rules recommend an odd number of nodes in the cluster, no fewer than 3. With only 2 nodes, if 1 node fails, quorum is lost under the default rules, cluster resources do not fail over, and the cluster as a whole becomes unusable. Setting no-quorum-policy="ignore" works around this two-node case, but it should not be used in production. In other words, production still needs at least 3 nodes.
pe-warn-series-max, pe-input-series-max, and pe-error-series-max set how much policy-engine history (warnings, inputs, errors) is kept.
cluster-recheck-interval is how often the cluster state is re-checked.
[root@controller1 ~]# pcs property set pe-warn-series-max=1000 pe-input-series-max=1000 pe-error-series-max=1000 cluster-recheck-interval=5min
Disable STONITH:

STONITH refers to fencing devices that can power a node off on command; this environment has no such device, and if the option is left enabled, pcs commands keep printing errors about it.

[root@controller1 ~]# pcs property set stonith-enabled=false

With only two nodes, ignore the quorum requirement:

[root@controller1 ~]# pcs property set no-quorum-policy=ignore

Verify the cluster configuration:

[root@controller1 ~]# crm_verify -L -V

Configure a virtual IP for the cluster:

[root@controller1 ~]# pcs resource create ClusterIP ocf:heartbeat:IPaddr2 \
    ip="192.168.0.10" cidr_netmask=32 nic=eno16777736 op monitor interval=30s
At this point, Pacemaker + Corosync exist to serve HAProxy; add the haproxy resource to the Pacemaker cluster:

[root@controller1 ~]# pcs resource create lb-haproxy systemd:haproxy --clone

Note: this creates a clone resource, which is started on every node; here haproxy starts automatically on all three nodes.

Check the Pacemaker resources:

[root@controller1 ~]# pcs resource
 ClusterIP (ocf::heartbeat:IPaddr2): Started controller1   # the VIP resource
 Clone Set: lb-haproxy-clone [lb-haproxy]                  # the haproxy clone resource
     Started: [ controller1 controller2 controller3 ]

Note: the resources must be colocated here; otherwise every node runs haproxy, causing access confusion.

Bind the two resources to the same node:

[root@controller1 ~]# pcs constraint colocation add lb-haproxy-clone ClusterIP INFINITY

Binding succeeded:
[root@controller1 ~]# pcs resource
 ClusterIP (ocf::heartbeat:IPaddr2): Started controller3
 Clone Set: lb-haproxy-clone [lb-haproxy]
     Started: [ controller1 ]
     Stopped: [ controller2 controller3 ]

Configure the start order so the VIP starts first and haproxy starts after it, because haproxy binds to the VIP:

[root@controller1 ~]# pcs constraint order ClusterIP then lb-haproxy-clone

Manually pin the resource to a preferred node; because the two resources are colocated, moving one moves the other automatically.
[root@controller1 ~]# pcs constraint location ClusterIP prefers controller1
[root@controller1 ~]# pcs resource
 ClusterIP (ocf::heartbeat:IPaddr2): Started controller1
 Clone Set: lb-haproxy-clone [lb-haproxy]
     Started: [ controller1 ]
     Stopped: [ controller2 controller3 ]
[root@controller1 ~]# pcs resource defaults resource-stickiness=100   # set resource stickiness, so automatic fail-back does not destabilize the cluster

The VIP is now bound to the controller1 node:
[root@controller1 ~]# ip a | grep global
    inet 192.168.0.11/24 brd 192.168.0.255 scope global eno16777736
    inet 192.168.0.10/32 brd 192.168.0.255 scope global eno16777736
    inet 192.168.118.11/24 brd 192.168.118.255 scope global eno33554992
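Spotting the VIP in that output can be scripted. A hypothetical helper, not from the original post: the "ip a" output is stubbed for illustration; on a real node use addrs=$(ip a | grep global) instead.

```shell
VIP="192.168.0.10"
# Stubbed address list (illustrative; replace with real `ip a` output).
addrs='inet 192.168.0.11/24 brd 192.168.0.255 scope global eno16777736
inet 192.168.0.10/32 brd 192.168.0.255 scope global eno16777736'
# Report whether this node currently holds the VIP.
if printf '%s\n' "$addrs" | grep -q "inet ${VIP}/"; then
    echo "VIP ${VIP} is active on this node"
else
    echo "VIP ${VIP} is not on this node"
fi
```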
! `* L+ p- `/ @; T, x- x尝试通过vip连接数据库
' Q/ D3 W. R% |( b( A" e- A% rController1:
* d. z% m' s8 X0 \4 c/ ^, t
) U% w( w9 d+ Z5 p/ F2 |* o; D[root@controller1 haproxy]# mysql -ugalera -pgalera -h 192.168.0.108 K% E# U) v4 J8 L1 s
8 V+ L6 I* t3 Q- F. X. }
7 N; J! t# x1 r( X. ^
Controller2:( W0 N9 z% ]+ F$ ~+ C$ l
8 Z( R9 T/ O7 V% U8 Y, r
( I8 m, m- F$ d/ q# V高可用配置成功。) b/ M6 B) b' z% t( P
: \% o ^5 _$ y( |5 s7 _# s& d* C& j
Test whether high availability works:

Run poweroff -f directly on the controller1 node:

[root@controller1 ~]# poweroff -f

The VIP fails over to the controller2 node almost immediately.

Try accessing the database again; it works without any problem, so the test succeeded.
View cluster status:

[root@controller2 ~]# pcs status
Cluster name: openstack-cluster
Stack: corosync
Current DC: controller3 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
Last updated: Thu Nov 30 23:57:28 2017
Last change: Thu Nov 30 23:54:11 2017 by root via crm_attribute on controller1

3 nodes configured
4 resources configured

Online: [ controller2 controller3 ]
OFFLINE: [ controller1 ]    # controller1 is now offline

Full list of resources:

 ClusterIP (ocf::heartbeat:IPaddr2): Started controller2
 Clone Set: lb-haproxy-clone [lb-haproxy]
     Started: [ controller2 ]
     Stopped: [ controller1 controller3 ]

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled