Recovery procedure when all mon nodes have failed, or when a single mon node has failed

The ceph -s command kept failing on the cluster.

The Ceph distributed storage cluster reported this monclient error:
monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]

2024-10-17T22:33:47.295+0800 7f20fe7fc700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
2024-10-17T22:33:47.297+0800 7f20ff7fe700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]

In this environment only the gm268-3 node was still intact, because its restart had failed and hung. gm268-1 and gm268-2 were both already damaged, so the recovery had to start from node 3.

Resolution process:

1. On the gm268-3 node, create the monmap file:

$ monmaptool --create --clobber --fsid ce68aab8-8f46-11ed-88c0-ac1f6b3a30b9 --add gm268-3 10.12.3.2:6789 --add gm268-2 10.12.2.2:6789 --add gm268-1 10.12.1.2:6789 /tmp/monmap
monmaptool: monmap file /tmp/monmap
monmaptool: set fsid to ce68aab8-8f46-11ed-88c0-ac1f6b3a30b9
monmaptool: writing epoch 0 to /tmp/monmap (3 monitors)

When creating the monmap, list the healthy node first, then add all of the damaged nodes after it.

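The ordering rule above can be sketched as a small helper that assembles the monmaptool command from a list of monitors, healthy node first. This is only an illustration: build_monmap_cmd is a hypothetical name, not a Ceph tool, and it prints the command rather than running it. The FSID, node names and addresses are the ones from this post.

```shell
# Sketch: assemble a `monmaptool --create` command line.
# build_monmap_cmd is a hypothetical helper, not part of Ceph.
# Usage: build_monmap_cmd FSID OUTFILE NAME=IP:PORT [NAME=IP:PORT ...]
build_monmap_cmd() {
    local fsid="$1" out="$2" pair
    shift 2
    local cmd="monmaptool --create --clobber --fsid $fsid"
    for pair in "$@"; do
        # Split NAME=IP:PORT on the first '=' into name and address.
        cmd="$cmd --add ${pair%%=*} ${pair#*=}"
    done
    printf '%s\n' "$cmd $out"
}

# Print (do not run) the command for this cluster, healthy node gm268-3 first.
build_monmap_cmd ce68aab8-8f46-11ed-88c0-ac1f6b3a30b9 /tmp/monmap \
    gm268-3=10.12.3.2:6789 gm268-2=10.12.2.2:6789 gm268-1=10.12.1.2:6789
```

The printed line can be reviewed and then run by hand on the healthy node. Note that in the monmaptool --print output in this post the monitors appear as ranks 0 to 2 in a different order from the --add arguments.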
Inspect the generated file:

$ monmaptool --print /tmp/monmap
monmaptool: monmap file /tmp/monmap
epoch 0
fsid ce68aab8-8f46-11ed-88c0-ac1f6b3a30b9
last_changed 2024-10-18T13:17:03.645872+0800
created 2024-10-18T13:17:03.645872+0800
min_mon_release 0 (unknown)
0: v1:10.12.1.2:6789/0 mon.gm268-1
1: v1:10.12.2.2:6789/0 mon.gm268-2
2: v1:10.12.3.2:6789/0 mon.gm268-3

2. On the gm268-1 and gm268-2 nodes, locate the /var/lib/ceph/mon directory, back it up, then delete it. The files there had been modified and were in an abnormal state, which is what led to the authentication problem. Do not delete the existing /etc/ceph/ directory.

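A minimal sketch of the backup step, assuming it is acceptable to move the directory aside rather than copy and delete it. backup_mon_dir is a hypothetical helper name and the timestamped .bak suffix is an arbitrary choice. Recreating an empty directory afterwards is also an assumption, made so that later steps find the parent path in place.

```shell
# backup_mon_dir is a hypothetical helper: move a mon data directory aside
# (instead of deleting it outright) and leave an empty directory in its place.
# Prints the backup path on success.
backup_mon_dir() {
    local mon_dir="$1"
    local backup="${mon_dir}.bak.$(date +%Y%m%d%H%M%S)"
    if [ ! -d "$mon_dir" ]; then
        echo "no such directory: $mon_dir" >&2
        return 1
    fi
    mv -- "$mon_dir" "$backup" || return 1
    mkdir -p -- "$mon_dir" || return 1   # assumption: keep an empty parent path
    printf '%s\n' "$backup"
}

# Example, to be run on gm268-1 and gm268-2 with the mon stopped
# (/etc/ceph is not touched):
# backup_mon_dir /var/lib/ceph/mon
```

Keeping the old store around costs only disk space and leaves a way back if the rebuild does not work out.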
3. Copy the keyring from the healthy node, together with the generated monmap file, to the other two nodes:

scp /var/lib/ceph/mon/ceph-gm268-3/keyring gm268-2:/tmp/
scp /var/lib/ceph/mon/ceph-gm268-3/keyring gm268-1:/tmp/

scp /tmp/monmap gm268-1:/tmp/
scp /tmp/monmap gm268-2:/tmp/

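Since both files must reach both damaged nodes, a loop makes it harder to miss one of the four copies. The sketch below only prints the scp commands. print_copy_cmds is a hypothetical helper name, and the paths and host names are those used in this post.

```shell
# print_copy_cmds is a hypothetical helper: print one scp command per
# (file, node) pair. FILES and NODES are space-separated lists, so this
# sketch assumes paths and host names without spaces.
print_copy_cmds() {
    local files="$1" nodes="$2" f n
    for n in $nodes; do
        for f in $files; do
            printf 'scp %s %s:/tmp/\n' "$f" "$n"
        done
    done
}

# Keyring and monmap from the healthy node to both damaged nodes.
print_copy_cmds "/var/lib/ceph/mon/ceph-gm268-3/keyring /tmp/monmap" "gm268-1 gm268-2"
```

After checking the printed commands, they can be run one by one on gm268-3.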
4. Rebuild the mon on the gm268-1 and gm268-2 nodes. On gm268-1:

ceph-mon --cluster ceph -i gm268-1 --mkfs --monmap /tmp/monmap --keyring /tmp/keyring -c /etc/ceph/ceph.conf

Change to the /var/lib/ceph directory (the parent of mon/) and run:

chown -R ceph:ceph mon/

Start the mon service:

systemctl start ceph-mon@gm268-1.service

Check the service:

$ systemctl status ceph-mon@gm268-1.service
● ceph-mon@gm268-1.service - Ceph cluster monitor daemon
   Loaded: loaded (/usr/lib/systemd/system/ceph-mon@.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2024-10-18 13:21:24 CST; 38min ago
 Main PID: 664542 (ceph-mon)
    Tasks: 27
   Memory: 286.0M
   CGroup: /system.slice/system-ceph\x2dmon.slice/ceph-mon@gm268-1.service
           └─664542 /usr/bin/ceph-mon -f --cluster ceph --id gm268-1 --setuser ceph --setgroup ceph

Oct 18 13:21:24 gm268-1 systemd[1]: Started Ceph cluster monitor daemon.
Oct 18 13:21:24 gm268-1 ceph-mon[664542]: 2024-10-18T13:21:24.793+0800 7fcc5f804700 -1 mon.gm268-1@0(probing) e11 stashing newest monmap 11 for next startup
Oct 18 13:21:24 gm268-1 ceph-mon[664542]: ignoring --setuser ceph since I am not root
Oct 18 13:21:24 gm268-1 ceph-mon[664542]: ignoring --setgroup ceph since I am not root

The repair of this node is complete.
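
The same per-node sequence (mkfs, fix ownership, start, check) applies to each damaged mon, so it can be written once and parameterised by the mon id. This is a sketch only: rebuild_mon_cmds is a hypothetical helper that prints the commands, and it assumes the monmap and keyring were copied to /tmp on the target node as in step 3. The chown uses the absolute path, which is equivalent to running chown on mon/ from inside /var/lib/ceph.

```shell
# rebuild_mon_cmds is a hypothetical helper: print the rebuild commands
# for one mon id. Nothing is executed here.
rebuild_mon_cmds() {
    local id="$1"
    cat <<EOF
ceph-mon --cluster ceph -i ${id} --mkfs --monmap /tmp/monmap --keyring /tmp/keyring -c /etc/ceph/ceph.conf
chown -R ceph:ceph /var/lib/ceph/mon
systemctl start ceph-mon@${id}.service
systemctl status ceph-mon@${id}.service
EOF
}

# Print the sequence for both damaged nodes; each block is meant to be
# run locally on the node it names.
for node in gm268-1 gm268-2; do
    rebuild_mon_cmds "$node"
done
```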
On node 2 (gm268-2):

ceph-mon --cluster ceph -i gm268-2 --mkfs --monmap /tmp/monmap --keyring /tmp/keyring -c /etc/ceph/ceph.conf

Change to the /var/lib/ceph directory (the parent of mon/) and run:

chown -R ceph:ceph mon/

Start the mon service:

systemctl start ceph-mon@gm268-2.service

$ systemctl status ceph-mon@gm268-2.service
● ceph-mon@gm268-2.service - Ceph cluster monitor daemon
   Loaded: loaded (/usr/lib/systemd/system/ceph-mon@.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2024-10-18 13:09:42 CST; 51min ago
 Main PID: 157382 (ceph-mon)
    Tasks: 27
   Memory: 587.1M
   CGroup: /system.slice/system-ceph\x2dmon.slice/ceph-mon@gm268-2.service
           └─157382 /usr/bin/ceph-mon -f --cluster ceph --id gm268-2 --setuser ceph --setgroup ceph
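
The original symptom was that ceph -s would not run, so a natural last check is to run it again once both mon services are active. The monitors may need a moment before they answer, so the sketch below retries a few times. retry is a hypothetical helper, and the attempt count and delay in the example are arbitrary choices.

```shell
# retry is a hypothetical helper: run a command up to ATTEMPTS times,
# sleeping DELAY seconds between failed attempts.
# Usage: retry ATTEMPTS DELAY COMMAND [ARGS...]
retry() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while :; do
        "$@" && return 0                      # success: stop retrying
        [ "$i" -ge "$attempts" ] && return 1  # out of attempts
        i=$((i + 1))
        sleep "$delay"
    done
}

# Example (commented out because it needs a reachable cluster and an
# admin keyring on the node where it runs):
# retry 10 5 ceph --connect-timeout 5 -s
```

If the example command keeps failing after all attempts, the mon logs on each node are the next place to look.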