Recovery procedure when all mon nodes fail, or when a single mon node fails

Ceph could not run the ceph -s command successfully at any point.

The Ceph distributed storage cluster kept reporting: monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]

2024-10-17T22:33:47.295+0800 7f20fe7fc700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
2024-10-17T22:33:47.297+0800 7f20ff7fe700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]

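To confirm that a client is hitting this specific authentication failure, the log text can be scanned for the message shown above. This is a minimal sketch: count_auth_bad_method is a hypothetical helper name, not a Ceph command, and it only counts matching lines on standard input.

```shell
# Count lines on stdin that contain the monclient auth failure shown above.
# count_auth_bad_method is a hypothetical helper, not part of Ceph.
count_auth_bad_method() {
    # -F: fixed string, -c: print the number of matching lines.
    # grep exits non-zero when nothing matches, so mask that with "|| true".
    grep -F -c 'monclient(hunting): handle_auth_bad_method' || true
}

# Possible usage on a live system (assumes coreutils "timeout" is available,
# because ceph -s may hang while no monitor is reachable):
#   timeout 10 ceph -s 2>&1 | count_auth_bad_method
```

A non-zero count means the symptom described in this post is present; a count of 0 points to a different problem.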
In this environment only gm268-3 was still intact, because its reboot had failed and hung; the mon data on gm268-1 and gm268-2 was already damaged. The only option was to work from gm268-3.

Resolution steps:

1. On gm268-3, generate the monmap file with monmaptool:

$ monmaptool --create --clobber --fsid ce68aab8-8f46-11ed-88c0-ac1f6b3a30b9 --add gm268-3 10.12.3.2:6789 --add gm268-2 10.12.2.2:6789 --add gm268-1 10.12.1.2:6789 /tmp/monmap
monmaptool: monmap file /tmp/monmap
monmaptool: set fsid to ce68aab8-8f46-11ed-88c0-ac1f6b3a30b9
monmaptool: writing epoch 0 to /tmp/monmap (3 monitors)

When generating the monmap, put the healthy node first and then add all of the damaged nodes after it.

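The ordering described above can be sketched as a small helper that only prints the monmaptool command instead of running it. build_monmap_cmd and its name=address argument format are assumptions made for this illustration; the monmaptool flags are the ones used in step 1.

```shell
# Print (do not run) a monmaptool command, healthy monitor first.
# build_monmap_cmd is a hypothetical helper; arguments after the output
# path are name=address pairs in the order they should be added.
build_monmap_cmd() {
    fsid=$1
    mm_path=$2
    shift 2
    cmd="monmaptool --create --clobber --fsid $fsid"
    for pair in "$@"; do
        name=${pair%%=*}   # text before the first "="
        addr=${pair#*=}    # text after the first "="
        cmd="$cmd --add $name $addr"
    done
    printf '%s %s\n' "$cmd" "$mm_path"
}

# Healthy node gm268-3 first, then the two damaged nodes.
build_monmap_cmd ce68aab8-8f46-11ed-88c0-ac1f6b3a30b9 /tmp/monmap \
    gm268-3=10.12.3.2:6789 gm268-2=10.12.2.2:6789 gm268-1=10.12.1.2:6789
```

Printing the command first makes it easy to review the fsid and addresses before anything is written to /tmp/monmap.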
Inspect the generated file:

$ monmaptool --print /tmp/monmap
monmaptool: monmap file /tmp/monmap
epoch 0
fsid ce68aab8-8f46-11ed-88c0-ac1f6b3a30b9
last_changed 2024-10-18T13:17:03.645872+0800
created 2024-10-18T13:17:03.645872+0800
min_mon_release 0 (unknown)
0: v1:10.12.1.2:6789/0 mon.gm268-1
1: v1:10.12.2.2:6789/0 mon.gm268-2
2: v1:10.12.3.2:6789/0 mon.gm268-3

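Before copying the map around, it can help to reduce the monmaptool --print output to a plain list of monitor names and addresses. The sketch below assumes the v1-only line format shown in this post (rank, address, mon.NAME); list_monmap_members is a hypothetical helper name.

```shell
# Read "monmaptool --print" output on stdin and print "name address" per
# monitor. Assumes lines of the form "0: v1:10.12.1.2:6789/0 mon.gm268-1".
# list_monmap_members is a hypothetical helper, not part of Ceph.
list_monmap_members() {
    awk '/^[0-9]+: / {
        addr = $2
        sub(/^v[0-9]+:/, "", addr)   # drop the "v1:" protocol prefix
        sub(/\/[0-9]+$/, "", addr)   # drop the trailing "/0" suffix
        name = $3
        sub(/^mon\./, "", name)      # drop the "mon." prefix
        print name, addr
    }'
}

# Possible usage:
#   monmaptool --print /tmp/monmap | list_monmap_members
```

Header lines such as epoch, fsid and created do not start with a rank number, so they are skipped.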
2. On gm268-1 and gm268-2, locate the /var/lib/ceph/mon directory, back it up, and then delete it. The files in it had been modified and were no longer consistent, which is what broke authentication. Do not delete the existing /etc/ceph/ directory.

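The backup-then-delete in step 2 can be done in one move, so the damaged data is kept under a timestamped name rather than lost. This is a sketch under assumptions: backup_mon_dir and the .bak.TIMESTAMP naming are choices made for this illustration, and moving instead of copying then deleting is this sketch's substitution for the two separate actions.

```shell
# Move a mon data directory aside under a timestamped name and print the
# backup path. backup_mon_dir is a hypothetical helper; it never touches
# /etc/ceph.
backup_mon_dir() {
    mon_dir=$1
    if [ ! -d "$mon_dir" ]; then
        echo "no such directory: $mon_dir" >&2
        return 1
    fi
    # Strip one trailing slash so the suffix lands on the directory name.
    backup="${mon_dir%/}.bak.$(date +%Y%m%d%H%M%S)"
    mv -- "$mon_dir" "$backup" && printf '%s\n' "$backup"
}

# Possible usage on gm268-1 and gm268-2 (as root):
#   backup_mon_dir /var/lib/ceph/mon
```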
3. Copy the keyring from the healthy node and the generated monmap file to the other two nodes:

scp /var/lib/ceph/mon/ceph-gm268-3/keyring gm268-2:/tmp/
scp /var/lib/ceph/mon/ceph-gm268-3/keyring gm268-1:/tmp/

scp /tmp/monmap gm268-2:/tmp/
scp /tmp/monmap gm268-1:/tmp/

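Because both files have to reach every damaged node, a loop avoids copying to the same host twice by mistake. The sketch below only prints the scp commands; print_copy_cmds is a hypothetical helper, and the paths are the ones used in step 3.

```shell
# Print the scp commands that send the keyring and monmap from the healthy
# node to each damaged node. print_copy_cmds is a hypothetical helper:
# first argument is the healthy node, the rest are the target nodes.
print_copy_cmds() {
    healthy=$1
    shift
    for node in "$@"; do
        printf 'scp /var/lib/ceph/mon/ceph-%s/keyring %s:/tmp/\n' "$healthy" "$node"
        printf 'scp /tmp/monmap %s:/tmp/\n' "$node"
    done
}

print_copy_cmds gm268-3 gm268-1 gm268-2
```

Piping the output to sh on gm268-3 would run the copies; reviewing it first is the safer habit.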
4. Rebuild the mon on gm268-1 and gm268-2.

On gm268-1:
ceph-mon --cluster ceph -i gm268-1 --mkfs --monmap /tmp/monmap --keyring /tmp/keyring -c /etc/ceph/ceph.conf

Change to the /var/lib/ceph directory (the parent of mon/) and run:
chown -R ceph:ceph mon/

Start the mon service:
systemctl start ceph-mon@gm268-1.service

Check the service:

$ systemctl status ceph-mon@gm268-1.service
● ceph-mon@gm268-1.service - Ceph cluster monitor daemon
   Loaded: loaded (/usr/lib/systemd/system/ceph-mon@.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2024-10-18 13:21:24 CST; 38min ago
 Main PID: 664542 (ceph-mon)
    Tasks: 27
   Memory: 286.0M
   CGroup: /system.slice/system-ceph\x2dmon.slice/ceph-mon@gm268-1.service
           └─664542 /usr/bin/ceph-mon -f --cluster ceph --id gm268-1 --setuser ceph --setgroup ceph

Oct 18 13:21:24 gm268-1 systemd[1]: Started Ceph cluster monitor daemon.
Oct 18 13:21:24 gm268-1 ceph-mon[664542]: 2024-10-18T13:21:24.793+0800 7fcc5f804700 -1 mon.gm268-1@0(probing) e11 stashing newest monmap 11 for next startup
Oct 18 13:21:24 gm268-1 ceph-mon[664542]: ignoring --setuser ceph since I am not root
Oct 18 13:21:24 gm268-1 ceph-mon[664542]: ignoring --setgroup ceph since I am not root

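The status check can also be scripted by looking for the Active: active (running) line in the output above. mon_is_running is a hypothetical helper for this illustration; it reads systemctl status text on standard input and succeeds only if that line is present.

```shell
# Succeed (exit 0) if the systemctl status text on stdin reports the unit
# as "active (running)". mon_is_running is a hypothetical helper.
mon_is_running() {
    grep -Eq '^[[:space:]]*Active: active \(running\)'
}

# Possible usage:
#   systemctl status ceph-mon@gm268-1.service | mon_is_running && echo "mon is up"
```

On a live host, systemctl is-active ceph-mon@gm268-1.service gives the same answer directly; the text-based check is mainly useful for saved output.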
The node is now repaired.

On the second node (gm268-2):

ceph-mon --cluster ceph -i gm268-2 --mkfs --monmap /tmp/monmap --keyring /tmp/keyring -c /etc/ceph/ceph.conf

Change to the /var/lib/ceph directory (the parent of mon/) and run:
chown -R ceph:ceph mon/

Start the mon service:
systemctl start ceph-mon@gm268-2.service

$ systemctl status ceph-mon@gm268-2.service
● ceph-mon@gm268-2.service - Ceph cluster monitor daemon
   Loaded: loaded (/usr/lib/systemd/system/ceph-mon@.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2024-10-18 13:09:42 CST; 51min ago
 Main PID: 157382 (ceph-mon)
    Tasks: 27
   Memory: 587.1M
   CGroup: /system.slice/system-ceph\x2dmon.slice/ceph-mon@gm268-2.service
           └─157382 /usr/bin/ceph-mon -f --cluster ceph --id gm268-2 --setuser ceph --setgroup ceph

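Since step 4 is identical on both nodes apart from the mon ID, the sequence can be written once and parameterized. The sketch below only prints the commands for one node; print_rebuild_cmds is a hypothetical helper, and the absolute path in the chown line is this sketch's choice so that the command does not depend on the current directory.

```shell
# Print the rebuild sequence from step 4 for a single node.
# print_rebuild_cmds is a hypothetical helper; run the printed commands
# as root on the node named in the argument.
print_rebuild_cmds() {
    node=$1
    # Recreate the mon store from the copied monmap and keyring.
    printf 'ceph-mon --cluster ceph -i %s --mkfs --monmap /tmp/monmap --keyring /tmp/keyring -c /etc/ceph/ceph.conf\n' "$node"
    # Hand the new store to the ceph user (absolute path: an assumption).
    printf 'chown -R ceph:ceph /var/lib/ceph/mon/\n'
    # Start the monitor and show its state.
    printf 'systemctl start ceph-mon@%s.service\n' "$node"
    printf 'systemctl status ceph-mon@%s.service\n' "$node"
}

print_rebuild_cmds gm268-2
```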