Introduction and Features

Pacemaker: works at the resource-allocation layer and provides the resource-manager functionality.
Corosync: provides the cluster messaging (infrastructure) layer, carrying heartbeat and cluster transaction information.
Pacemaker + Corosync together implement a highly available cluster architecture.

Cluster Setup
Run the following on all three nodes:

# yum install pcs -y
# systemctl start pcsd ; systemctl enable pcsd
# echo 'hacluster' | passwd --stdin hacluster
# yum install haproxy rsyslog -y
# echo 'net.ipv4.ip_nonlocal_bind = 1' >> /etc/sysctl.conf    # allow services to bind to the VIP even when it is not assigned locally
# echo 'net.ipv4.ip_forward = 1' >> /etc/sysctl.conf          # enable kernel IP forwarding
# sysctl -p
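The two appends above are not idempotent: re-running the setup duplicates the lines in /etc/sysctl.conf. A minimal sketch of a guarded append, demonstrated on a temp file here (point CONF at /etc/sysctl.conf on a real node, then run `sysctl -p` as before):

```shell
# Append each kernel setting only if it is not already present.
# CONF is a temp file for demonstration; use /etc/sysctl.conf on a node.
CONF=$(mktemp)
for kv in 'net.ipv4.ip_nonlocal_bind = 1' 'net.ipv4.ip_forward = 1'; do
    grep -qxF "$kv" "$CONF" || echo "$kv" >> "$CONF"
done
# Running the loop again adds nothing:
for kv in 'net.ipv4.ip_nonlocal_bind = 1' 'net.ipv4.ip_forward = 1'; do
    grep -qxF "$kv" "$CONF" || echo "$kv" >> "$CONF"
done
wc -l < "$CONF"   # still 2 lines, not 4
```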
On any one node, create the user that haproxy will use to monitor MariaDB:

MariaDB [(none)]> CREATE USER 'haproxy'@'%' ;

Configure haproxy as the load balancer:
[root@controller1 ~]# egrep -v "^#|^$" /etc/haproxy/haproxy.cfg
global
    log         127.0.0.1 local2
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon
    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 4000
listen galera_cluster
    mode tcp
    bind 192.168.0.10:3306
    balance source
    option mysql-check user haproxy
    server controller1 192.168.0.11:3306 check inter 2000 rise 3 fall 3 backup
    server controller2 192.168.0.12:3306 check inter 2000 rise 3 fall 3
    server controller3 192.168.0.13:3306 check inter 2000 rise 3 fall 3 backup
listen memcache_cluster
    mode tcp
    bind 192.168.0.10:11211
    balance source
    option tcplog
    server controller1 192.168.0.11:11211 check inter 2000 rise 3 fall 3
    server controller2 192.168.0.12:11211 check inter 2000 rise 3 fall 3
    server controller3 192.168.0.13:11211 check inter 2000 rise 3 fall 3
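In the galera_cluster block two of the three servers are marked `backup`, so haproxy sends traffic to a single Galera node at a time (avoiding conflicts from concurrent writers) while still failing over if it dies. Since the three `server` lines differ only in node name and IP, they can be generated; a small sketch mirroring the node list and single-active choice in the config above:

```shell
# Generate the repeated "server" lines for the galera_cluster block.
# controller2 is the single active backend; the other two are "backup".
lines=""
i=1
for node in controller1 controller2 controller3; do
    flags=" backup"
    [ "$node" = "controller2" ] && flags=""
    lines="$lines    server $node 192.168.0.1$i:3306 check inter 2000 rise 3 fall 3$flags
"
    i=$((i + 1))
done
printf '%s' "$lines"
```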
Note:
(1) Make sure the haproxy configuration is correct; it is a good idea to first adjust the IPs and ports and verify that haproxy starts successfully.
(2) MariaDB-Galera and RabbitMQ listen on 0.0.0.0 by default; change them to listen on the node's own address, 192.168.0.x.
(3) Copy the verified haproxy configuration to the other nodes; there is no need to start the haproxy service manually.

Configure logging for haproxy (run on all controller nodes):
# vim /etc/rsyslog.conf
…
$ModLoad imudp
$UDPServerRun 514
…
local2.*    /var/log/haproxy/haproxy.log
…

# mkdir -pv /var/log/haproxy/
mkdir: created directory '/var/log/haproxy/'

# systemctl restart rsyslog
Start haproxy and verify:

# systemctl start haproxy
[root@controller1 ~]# netstat -ntplu | grep ha
tcp    0    0 192.168.0.10:3306     0.0.0.0:*    LISTEN    15467/haproxy
tcp    0    0 192.168.0.10:11211    0.0.0.0:*    LISTEN    15467/haproxy
udp    0    0 0.0.0.0:43268         0.0.0.0:*              15466/haproxy

Verification succeeded; stop haproxy again:
# systemctl stop haproxy
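The listening ports can also be checked in a script rather than by eye. A sketch that extracts the port numbers from `netstat -ntpl`-style output; the two sample lines from the output above stand in for a live pipe (on a node you would use `netstat -ntpl | grep haproxy | awk '{split($4,a,":"); print a[2]}'`):

```shell
# Extract listening ports (field 4 is local addr:port) to confirm
# both frontends -- 3306 and 11211 -- are up.
ports=$(printf '%s\n' \
    'tcp 0 0 192.168.0.10:3306 0.0.0.0:* LISTEN 15467/haproxy' \
    'tcp 0 0 192.168.0.10:11211 0.0.0.0:* LISTEN 15467/haproxy' |
    awk '{split($4, a, ":"); print a[2]}')
echo "$ports"
```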
On the controller1 node, run:

[root@controller1 ~]# pcs cluster auth controller1 controller2 controller3 -u hacluster -p hacluster --force
controller3: Authorized
controller2: Authorized
controller1: Authorized
Create the cluster:

[root@controller1 ~]# pcs cluster setup --name openstack-cluster controller1 controller2 controller3 --force
Destroying cluster on nodes: controller1, controller2, controller3...
controller3: Stopping Cluster (pacemaker)...
controller2: Stopping Cluster (pacemaker)...
controller1: Stopping Cluster (pacemaker)...
controller3: Successfully destroyed cluster
controller1: Successfully destroyed cluster
controller2: Successfully destroyed cluster

Sending 'pacemaker_remote authkey' to 'controller1', 'controller2', 'controller3'
controller3: successful distribution of the file 'pacemaker_remote authkey'
controller1: successful distribution of the file 'pacemaker_remote authkey'
controller2: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
controller1: Succeeded
controller2: Succeeded
controller3: Succeeded

Synchronizing pcsd certificates on nodes controller1, controller2, controller3...
controller3: Success
controller2: Success
controller1: Success
Restarting pcsd on the nodes in order to reload the certificates...
controller3: Success
controller2: Success
controller1: Success
Start the cluster on all nodes:

[root@controller1 ~]# pcs cluster start --all
controller2: Starting Cluster...
controller1: Starting Cluster...
controller3: Starting Cluster...
[root@controller1 ~]# pcs cluster enable --all
controller1: Cluster Enabled
controller2: Cluster Enabled
controller3: Cluster Enabled
Check the cluster status:

[root@controller1 ~]# pcs status
Cluster name: openstack-cluster
WARNING: no stonith devices and stonith-enabled is not false
Stack: corosync
Current DC: controller3 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
Last updated: Thu Nov 30 19:30:43 2017
Last change: Thu Nov 30 19:30:17 2017 by hacluster via crmd on controller3

3 nodes configured
0 resources configured

Online: [ controller1 controller2 controller3 ]

No resources

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@controller1 ~]# pcs cluster status
Cluster Status:
 Stack: corosync
 Current DC: controller3 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
 Last updated: Thu Nov 30 19:30:52 2017
 Last change: Thu Nov 30 19:30:17 2017 by hacluster via crmd on controller3
 3 nodes configured
 0 resources configured

PCSD Status:
  controller2: Online
  controller3: Online
  controller1: Online
All three nodes are online.

The default voting rules expect an odd number of nodes, at least 3. With only 2 nodes, if one fails the survivor cannot satisfy the default quorum rule, so resources do not fail over and the cluster as a whole stays unavailable. Setting no-quorum-policy="ignore" works around the two-node case, but should not be used in production; in other words, production still needs at least 3 nodes.
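The majority rule behind this is easy to make concrete: a partition keeps quorum only with strictly more than half the votes, i.e. floor(n/2) + 1, so 3 nodes tolerate one failure while 2 nodes tolerate none:

```shell
# Quorum needs a strict majority of votes: floor(n/2) + 1.
summary=$(for n in 2 3 5; do
    need=$((n / 2 + 1))
    echo "$n nodes: quorum needs $need votes, tolerates $((n - need)) failure(s)"
done)
echo "$summary"
```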
pe-warn-series-max, pe-input-series-max and pe-error-series-max control the history depth of the policy-engine logs; cluster-recheck-interval is how often node state is rechecked.
[root@controller1 ~]# pcs property set pe-warn-series-max=1000 pe-input-series-max=1000 pe-error-series-max=1000 cluster-recheck-interval=5min
Disable stonith:
stonith refers to fencing hardware that can power-cycle a node on command. This environment has no such device, and if the option is left enabled every pcs command prints an error about it.
[root@controller1 ~]# pcs property set stonith-enabled=false
With two nodes, ignore the quorum requirement:
[root@controller1 ~]# pcs property set no-quorum-policy=ignore
Validate the cluster configuration:
[root@controller1 ~]# crm_verify -L -V
Configure a virtual IP for the cluster:
[root@controller1 ~]# pcs resource create ClusterIP ocf:heartbeat:IPaddr2 \
    ip="192.168.0.10" cidr_netmask=32 nic=eno16777736 op monitor interval=30s
Pacemaker + Corosync exist here to serve haproxy, so add haproxy as a resource to the pacemaker cluster:
[root@controller1 ~]# pcs resource create lb-haproxy systemd:haproxy --clone
Note: this creates a clone resource, which is started on every node -- here haproxy starts automatically on all three nodes.
Check the Pacemaker resources:
[root@controller1 ~]# pcs resource
 ClusterIP      (ocf::heartbeat:IPaddr2):       Started controller1    # the VIP resource
 Clone Set: lb-haproxy-clone [lb-haproxy]                              # the haproxy clone resource
     Started: [ controller1 controller2 controller3 ]
Note: the resources must be colocated, otherwise haproxy keeps running on every node and access becomes inconsistent.
Bind the two resources to the same node:
[root@controller1 ~]# pcs constraint colocation add lb-haproxy-clone ClusterIP INFINITY
Colocation succeeded:
[root@controller1 ~]# pcs resource
 ClusterIP      (ocf::heartbeat:IPaddr2):       Started controller3
 Clone Set: lb-haproxy-clone [lb-haproxy]
     Started: [ controller1 ]
     Stopped: [ controller2 controller3 ]
Configure the start order so the VIP comes up first and haproxy starts after it, since haproxy binds to the VIP:
[root@controller1 ~]# pcs constraint order ClusterIP then lb-haproxy-clone
Manually pin the resource to a preferred node; because the two resources are colocated, moving one moves the other automatically.

[root@controller1 ~]# pcs constraint location ClusterIP prefers controller1
[root@controller1 ~]# pcs resource
 ClusterIP      (ocf::heartbeat:IPaddr2):       Started controller1
 Clone Set: lb-haproxy-clone [lb-haproxy]
     Started: [ controller1 ]
     Stopped: [ controller2 controller3 ]
[root@controller1 ~]# pcs resource defaults resource-stickiness=100    # set resource stickiness so resources do not migrate back automatically and destabilize the cluster
The VIP is now bound to the controller1 node:
[root@controller1 ~]# ip a | grep global
    inet 192.168.0.11/24 brd 192.168.0.255 scope global eno16777736
    inet 192.168.0.10/32 brd 192.168.0.255 scope global eno16777736
    inet 192.168.118.11/24 brd 192.168.118.255 scope global eno33554992
Try connecting to the database through the VIP.

Controller1:

[root@controller1 haproxy]# mysql -ugalera -pgalera -h 192.168.0.10

Controller2:

The high-availability setup works.

Test that failover behaves correctly.
On the controller1 node, run poweroff -f directly:
[root@controller1 ~]# poweroff -f
The VIP moves to the controller2 node almost immediately.

Try the database again: no problems at all, the test succeeds.
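The failover window can also be measured by probing the VIP endpoint in a loop while controller1 goes down. A sketch, shown as a dry run -- the mysql command is echoed rather than executed; remove the echo on a live node (the galera user and VIP are the ones used above):

```shell
# Probe the VIP-backed MySQL endpoint once per second, three times.
# Dry run: echo prints each probe command instead of running it.
out=$(for i in 1 2 3; do
    echo "mysql -ugalera -pgalera -h 192.168.0.10 -e 'SELECT @@hostname;'"
    sleep 1
done)
printf '%s\n' "$out"
```

On a live cluster, any gap in successful probes during the poweroff test is the failover window.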
Check the cluster status:

[root@controller2 ~]# pcs status
Cluster name: openstack-cluster
Stack: corosync
Current DC: controller3 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
Last updated: Thu Nov 30 23:57:28 2017
Last change: Thu Nov 30 23:54:11 2017 by root via crm_attribute on controller1

3 nodes configured
4 resources configured

Online: [ controller2 controller3 ]
OFFLINE: [ controller1 ]    # controller1 is down

Full list of resources:

 ClusterIP      (ocf::heartbeat:IPaddr2):       Started controller2
 Clone Set: lb-haproxy-clone [lb-haproxy]
     Started: [ controller2 ]
     Stopped: [ controller1 controller3 ]

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled