1. Problem:
# ceph health
HEALTH_WARN application not enabled on 1 pool(s)

Solution:
# ceph health detail
HEALTH_WARN application not enabled on 1 pool(s)
POOL_APP_NOT_ENABLED application not enabled on 1 pool(s)
    application not enabled on pool 'kube'
    use 'ceph osd pool application enable <pool-name> <app-name>', where <app-name> is 'cephfs', 'rbd', 'rgw', or freeform for custom applications.
# ceph osd pool application enable kube rbd
enabled application 'rbd' on pool 'kube'
# ceph health
HEALTH_OK

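When several pools trigger this warning, the pool names can be extracted from the ceph health detail text instead of being read off by eye. A minimal sketch, assuming the message wording shown in the output above; the function name pools_without_app is made up for illustration:

```shell
# pools_without_app (hypothetical helper): read `ceph health detail` text on
# stdin and print the name of every pool reported without an application.
pools_without_app() {
    sed -n "s/.*application not enabled on pool '\([^']*\)'.*/\1/p"
}
```

For example: ceph health detail | pools_without_app. Each printed pool can then be tagged with ceph osd pool application enable; which application to pick (rbd, cephfs, rgw) depends on how that pool is actually used.
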
2. Problem:
# ceph -s
  cluster:
    id:     e781a2e4-097d-4867-858d-bdbd3a264435
    health: HEALTH_WARN
            clock skew detected on mon.ceph02, mon.ceph03

Solution:
#### Check that the NTP service is working properly
# systemctl status ntpd
#### Raise the clock-drift thresholds in the Ceph configuration
# vim /etc/ceph/ceph.conf
### Add under the [global] section:
mon clock drift allowed = 2
mon clock drift warn backoff = 30
#### Push the configuration file to the mon nodes that need it
# cd /etc/ceph/
# ceph-deploy --overwrite-conf config push ceph{01..03}
#### Restart the mon service and verify
# systemctl restart ceph-mon.target
# ceph -s
  cluster:
    id:     e781a2e4-097d-4867-858d-bdbd3a264435
    health: HEALTH_OK

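Raising the thresholds only quiets the warning; it does not synchronize the clocks. It can therefore help to first list which monitors are reported as skewed and check NTP on exactly those hosts. A rough sketch, assuming the warning wording shown in the ceph -s output above (skewed_mons is a hypothetical helper name):

```shell
# skewed_mons (hypothetical helper): read `ceph -s` or `ceph health detail`
# text on stdin and print one monitor name per line, without the "mon." prefix.
skewed_mons() {
    sed -n 's/.*clock skew detected on //p' |
        tr ',' '\n' |
        sed 's/^ *//; s/ *$//; s/^mon\.//'
}
```

For example, ceph -s | skewed_mons would print ceph02 and ceph03 for the output above, which are the hosts where systemctl status ntpd is worth checking first.
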
3. Problem:
# rbd map abc/zhijian --id admin
rbd: sysfs write failed
RBD image feature set mismatch. Try disabling features unsupported by the kernel with "rbd feature disable".
In some cases useful info is found in syslog - try "dmesg | tail".
rbd: map failed: (6) No such device or address

Solution:
The kernel does not support some of the features enabled on the block device image, so the mapping fails. Disable those features and map the image again:
# rbd feature disable abc/zhijian exclusive-lock object-map fast-diff deep-flatten
# rbd info abc/zhijian
rbd image 'zhijian':
    size 1024 MB in 256 objects
    order 22 (4096 kB objects)
    block_name_prefix: rbd_data.1011074b0dc51
    format: 2
    features: layering
    flags:
    create_timestamp: Sun May 6 13:35:21 2018
# rbd map abc/zhijian --id admin
/dev/rbd0

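The list of features to disable can be derived from the image itself rather than typed from memory. A minimal sketch, assuming (as in the example above) that layering is the only feature the kernel client supports; newer kernels support more, so the keep-list is an assumption to adjust. The name features_to_disable is hypothetical:

```shell
# features_to_disable (hypothetical helper): read `rbd info` text on stdin and
# print, space-separated, every enabled feature except "layering".
features_to_disable() {
    sed -n 's/^[[:space:]]*features:[[:space:]]*//p' |
        tr ',' '\n' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
        grep -v -e '^layering$' -e '^$' |
        tr '\n' ' ' |
        sed 's/ $//'
}
```

A possible use, relying on word splitting of the unquoted substitution: rbd feature disable abc/zhijian $(rbd info abc/zhijian | features_to_disable). If the helper prints nothing, there is nothing to disable.
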
4. Problem:
# ceph osd pool delete cephfs_data
Error EPERM: WARNING: this will *PERMANENTLY DESTROY* all data stored in pool cephfs_data. If you are *ABSOLUTELY CERTAIN* that is what you want, pass the pool name *twice*, followed by --yes-i-really-really-mean-it.
# ceph osd pool delete cephfs_data cephfs_data --yes-i-really-really-mean-it
Error EPERM: pool deletion is disabled; you must first set the mon_allow_pool_delete config option to true before you can destroy a pool

Solution:
Add the following to the end of /etc/ceph/ceph.conf:
# tail -n 2 /etc/ceph/ceph.conf
[mon]
mon allow pool delete = true

Push the configuration file to the mon nodes that need it:
# cd /etc/ceph/
# ceph-deploy --overwrite-conf config push ceph{01..03}

Restart the mon service and verify:
# systemctl restart ceph-mon.target
# ceph osd pool delete cephfs_data cephfs_data --yes-i-really-really-mean-it
pool 'cephfs_data' removed

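Before restarting the monitors, it is cheap to confirm that the option really is present in the pushed file. A small sketch; the helper name pool_delete_allowed is hypothetical, and it only inspects a file, not the configuration of the running monitors:

```shell
# pool_delete_allowed (hypothetical helper): succeed if the ceph.conf given as
# $1 sets "mon allow pool delete" to the lowercase value "true".
# Ceph accepts spaces, underscores or dashes between the words of an option.
pool_delete_allowed() {
    grep -Eq '^[[:space:]]*mon[ _-]allow[ _-]pool[ _-]delete[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$1"
}
```

For example: pool_delete_allowed /etc/ceph/ceph.conf && echo "pool deletion enabled in file". Since the option exists to prevent accidents, setting it back to false after the cleanup is a reasonable precaution.
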
5. Problem:
# ceph osd pool rm cephfs_data cephfs_data --yes-i-really-really-mean-it
Error EBUSY: pool 'cephfs_data' is in use by CephFS

Solution:
The pool belongs to a CephFS filesystem, so the filesystem has to be removed first, which in turn requires stopping the MDS daemons:
# ceph fs ls
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
# ceph fs rm cephfs --yes-i-really-mean-it
Error EINVAL: all MDS daemons must be inactive before removing filesystem
# systemctl stop ceph-mds.target
# ceph fs rm cephfs
Error EPERM: this is a DESTRUCTIVE operation and will make data in your filesystem permanently inaccessible. Add --yes-i-really-mean-it if you are sure you wish to continue.
# ceph fs rm cephfs --yes-i-really-mean-it
# ceph fs ls
No filesystems enabled

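With more than one filesystem, it may not be obvious which one holds the pool. The lookup can be sketched as a filter over the ceph fs ls output, assuming the one-line-per-filesystem format shown above; fs_using_pool is a hypothetical helper name:

```shell
# fs_using_pool (hypothetical helper): read `ceph fs ls` text on stdin and
# print the name of each filesystem that uses pool $1 for metadata or data.
fs_using_pool() {
    awk -v pool="$1" '
        /^name: / {
            name = $2; sub(/,$/, "", name)
            # Text between "metadata pool: " and the next comma.
            meta = $0; sub(/.*metadata pool: /, "", meta); sub(/,.*/, "", meta)
            # Text between "data pools: [" and the closing bracket.
            data = $0; sub(/.*data pools: \[/, "", data); sub(/\].*/, "", data)
            n = split(meta " " data, pools, /[ \t]+/)
            for (i = 1; i <= n; i++)
                if (pools[i] == pool) { print name; break }
        }'
}
```

For example, ceph fs ls | fs_using_pool cephfs_data prints cephfs for the output above. Removing that filesystem destroys access to its data, so the result is meant as information, not as input to an automatic ceph fs rm.
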
6. Problem:
A pod created with a static PV stays in the ContainerCreating state:
# kubectl get pod ceph-pod1
NAME        READY   STATUS              RESTARTS   AGE
ceph-pod1   0/1     ContainerCreating   0          10s
......
# kubectl describe pod ceph-pod1
  Warning  FailedMount  41s (x8 over 1m)  kubelet, node01  MountVolume.WaitForAttach failed for volume "ceph-pv" : fail to check rbd image status with: (executable file not found in $PATH), rbd output: ()
  Warning  FailedMount  0s                kubelet, node01  Unable to mount volumes for pod "ceph-pod1_default(14e3a07d-93a8-11e8-95f6-000c29b1ec26)": timeout expired waiting for volumes to attach or mount for pod "default"/"ceph-pod1". list of unmounted volumes=[ceph-vol1]. list of unattached volumes=[ceph-vol1 default-token-v9flt]
Solution: install the latest version of ceph-common on the node. The Ceph cluster runs the latest release (Mimic), while the version provided by the base repository is too old, which causes this problem.

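The "executable file not found in $PATH" part of the event points at a missing client binary on the node. A quick check that can be run on each node, as a sketch; missing_tools is a hypothetical helper name, and the tool names passed to it are up to the reader:

```shell
# missing_tools (hypothetical helper): print every named command that cannot
# be resolved through $PATH, one per line; print nothing if all are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, missing_tools rbd on node01 would print rbd if the kubelet has no rbd client to call, which is the situation that installing ceph-common is meant to fix.
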
7. Problem:
With dynamic PV provisioning, the PVC stays in the Pending state:
# kubectl get pvc -n ceph
NAME       STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
ceph-pvc   Pending                                      ceph-rbd       2m
# kubectl describe pvc -n ceph
......
  Warning  ProvisioningFailed  27s  persistentvolume-controller  Failed to provision volume with StorageClass "ceph-rbd": failed to create rbd image: exit status 1, command output: 2018-07-31 11:10:33.395991 7faa3558b7c0 -1 did not load config file, using default settings.
rbd: extraneous parameter --image-feature
Solution:
The persistentvolume-controller runs on the master node as part of kube-controller-manager, so the ceph-common package must also be installed on the master node.
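The "extraneous parameter --image-feature" message suggests an rbd client that does not understand that option, which would fit a client much older than the cluster. One way to spot such a mismatch is to compare the major version reported on each machine. A sketch, assuming the usual "ceph version X.Y.Z (...)" line printed by rbd --version or ceph --version; ceph_major_version is a made-up helper name:

```shell
# ceph_major_version (hypothetical helper): read version text on stdin and
# print the major number from a line of the form "ceph version X.Y.Z ...".
ceph_major_version() {
    sed -n 's/^ceph version \([0-9][0-9]*\)\..*/\1/p'
}
```

For example, rbd --version | ceph_major_version run on the master and on a Ceph node should print the same number; Mimic releases are numbered 13.x.
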