1. Problem:
# ceph health
HEALTH_WARN application not enabled on 1 pool(s)

Solution:
# ceph health detail
HEALTH_WARN application not enabled on 1 pool(s)
POOL_APP_NOT_ENABLED application not enabled on 1 pool(s)
    application not enabled on pool 'kube'
    use 'ceph osd pool application enable <pool-name> <app-name>', where <app-name> is 'cephfs', 'rbd', 'rgw', or freeform for custom applications.
# ceph osd pool application enable kube rbd
enabled application 'rbd' on pool 'kube'
# ceph health
HEALTH_OK

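When several pools are flagged, the pool names can be pulled out of the `ceph health detail` text rather than read by eye. The following is a minimal sketch; `pools_without_app` is a hypothetical helper name, and it assumes the message wording shown in the output above.

```shell
# Hypothetical helper: read `ceph health detail` output on stdin and
# print the name of each pool flagged by POOL_APP_NOT_ENABLED.
# Assumes the message format "application not enabled on pool '<name>'".
pools_without_app() {
    sed -n "s/.*application not enabled on pool '\([^']*\)'.*/\1/p"
}

# Usage on a cluster node (assumes the ceph CLI and an admin keyring):
#   ceph health detail | pools_without_app
```

Each printed name can then be passed to `ceph osd pool application enable` together with the application that actually uses the pool.
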
2. Problem:
# ceph -s
  cluster:
    id:     e781a2e4-097d-4867-858d-bdbd3a264435
    health: HEALTH_WARN
            clock skew detected on mon.ceph02, mon.ceph03

Solution:
#### Confirm that the NTP service is working properly
# systemctl status ntpd
#### Adjust the clock drift thresholds in the Ceph configuration
# vim /etc/ceph/ceph.conf
### Add the following under the [global] section:
mon clock drift allowed = 2
mon clock drift warn backoff = 30
#### Push the configuration file to the mon nodes that need it
# cd /etc/ceph/
# ceph-deploy --overwrite-conf config push ceph{01..03}
#### Restart the mon service and verify
# systemctl restart ceph-mon.target
# ceph -s
  cluster:
    id:     e781a2e4-097d-4867-858d-bdbd3a264435
    health: HEALTH_OK

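Before raising the thresholds, it can help to check NTP on exactly the monitors named in the warning. The sketch below extracts those names from `ceph -s` output; `skewed_mons` is a hypothetical helper, and it assumes that each monitor ID matches its host name, as it does in this cluster (mon.ceph02 runs on ceph02).

```shell
# Hypothetical helper: read `ceph -s` (or `ceph health detail`) output on
# stdin and print one monitor ID per line, without the "mon." prefix.
# Assumes the warning text "clock skew detected on mon.a, mon.b".
skewed_mons() {
    sed -n 's/.*clock skew detected on //p' |
        tr -d ' ' |
        tr ',' '\n' |
        sed 's/^mon\.//'
}

# Usage (assumes passwordless ssh to the mon hosts):
#   for h in $(ceph -s | skewed_mons); do ssh "$h" systemctl is-active ntpd; done
```
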
3. Problem:
# rbd map abc/zhijian --id admin
rbd: sysfs write failed
RBD image feature set mismatch. Try disabling features unsupported by the kernel with "rbd feature disable".
In some cases useful info is found in syslog - try "dmesg | tail".
rbd: map failed: (6) No such device or address

Solution:
The mapping fails because the kernel does not support some of the features enabled on the block device image. Disable those features, then map the image again.
# rbd feature disable abc/zhijian exclusive-lock, object-map, fast-diff, deep-flatten
# rbd info abc/zhijian
rbd image 'zhijian':
    size 1024 MB in 256 objects
    order 22 (4096 kB objects)
    block_name_prefix: rbd_data.1011074b0dc51
    format: 2
    features: layering
    flags:
    create_timestamp: Sun May 6 13:35:21 2018
# rbd map abc/zhijian --id admin
/dev/rbd0

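The list of features to disable can be derived from the image itself instead of being typed by hand. This is a minimal sketch that reads `rbd info` output and prints every enabled feature other than `layering`; `features_to_disable` is a hypothetical helper, and treating `layering` as the only kernel-supported feature is an assumption that matches the result above but depends on the kernel version.

```shell
# Hypothetical helper: read `rbd info` output on stdin and print, one per
# line, every enabled feature except "layering".
# Assumption: the kernel client supports only "layering".
features_to_disable() {
    sed -n 's/^[[:space:]]*features:[[:space:]]*//p' |
        tr -d ' ' |
        tr ',' '\n' |
        grep -v -x -e 'layering' -e ''
}

# Usage (assumes the rbd CLI and access to the pool):
#   rbd info abc/zhijian | features_to_disable
```

An empty result means there is nothing left to disable, in which case `dmesg | tail` is the next place to look.
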
4. Problem:
# ceph osd pool delete cephfs_data
Error EPERM: WARNING: this will *PERMANENTLY DESTROY* all data stored in pool cephfs_data. If you are *ABSOLUTELY CERTAIN* that is what you want, pass the pool name *twice*, followed by --yes-i-really-really-mean-it.
# ceph osd pool delete cephfs_data cephfs_data --yes-i-really-really-mean-it
Error EPERM: pool deletion is disabled; you must first set the mon_allow_pool_delete config option to true before you can destroy a pool

Solution:
# tail -n 2 /etc/ceph/ceph.conf
[mon]
mon allow pool delete = true

Push the configuration file to the mon nodes that need it:
# cd /etc/ceph/
# ceph-deploy --overwrite-conf config push ceph{01..03}

Restart the mon service and verify:
# systemctl restart ceph-mon.target
# ceph osd pool delete cephfs_data cephfs_data --yes-i-really-really-mean-it
pool 'cephfs_data' removed

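After pushing the configuration, a quick text check on each mon node confirms that the option actually reached the file before the service is restarted. The sketch below is only that: `pool_delete_enabled` is a hypothetical helper, it looks at the file contents alone, and it does not verify which section the line is in or what value the running monitor uses.

```shell
# Hypothetical helper: succeed if the given ceph.conf-style file contains
# "mon allow pool delete = true" (spaces or underscores in the option name).
# $1: path to the file; defaults to /etc/ceph/ceph.conf.
pool_delete_enabled() {
    grep -Eq '^[[:space:]]*mon[ _]allow[ _]pool[ _]delete[[:space:]]*=[[:space:]]*true[[:space:]]*$' \
        "${1:-/etc/ceph/ceph.conf}"
}

# Usage on a mon node:
#   pool_delete_enabled || echo "mon allow pool delete is not set" >&2
```
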
5. Problem:
# ceph osd pool rm cephfs_data cephfs_data --yes-i-really-really-mean-it
Error EBUSY: pool 'cephfs_data' is in use by CephFS

Solution:
# ceph fs ls
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
# ceph fs rm cephfs --yes-i-really-mean-it
Error EINVAL: all MDS daemons must be inactive before removing filesystem
# systemctl stop ceph-mds.target
# ceph fs rm cephfs
Error EPERM: this is a DESTRUCTIVE operation and will make data in your filesystem permanently inaccessible. Add --yes-i-really-mean-it if you are sure you wish to continue.
# ceph fs rm cephfs --yes-i-really-mean-it
# ceph fs ls
No filesystems enabled

6. Problem:
A pod created with a static PV stays in the ContainerCreating state:
# kubectl get pod ceph-pod1
NAME        READY   STATUS              RESTARTS   AGE
ceph-pod1   0/1     ContainerCreating   0          10s
......
# kubectl describe pod ceph-pod1
Warning FailedMount 41s (x8 over 1m) kubelet, node01 MountVolume.WaitForAttach failed for volume "ceph-pv" : fail to check rbd image status with: (executable file not found in $PATH), rbd output: ()
Warning FailedMount 0s kubelet, node01 Unable to mount volumes for pod "ceph-pod1_default(14e3a07d-93a8-11e8-95f6-000c29b1ec26)": timeout expired waiting for volumes to attach or mount for pod "default"/"ceph-pod1". list of unmounted volumes=[ceph-vol1]. list of unattached volumes=[ceph-vol1 default-token-v9flt]

Solution:
Installing the latest ceph-common package on the node fixes the problem. The Ceph cluster runs the latest Mimic release, while the version in the base repository is too old, which is why the problem occurs.

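The "executable file not found in $PATH" part of the event means the kubelet could not find the `rbd` binary on the node. A small check like the following, run on each node, shows whether the client is present; `have_cmd` is a hypothetical helper name, and it says nothing about whether the installed version is new enough.

```shell
# Hypothetical helper: succeed if the named command can be found in PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Usage on a Kubernetes node:
#   have_cmd rbd || echo "rbd not found in PATH; install ceph-common" >&2
```
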
7. Problem:
A dynamically provisioned PV is requested, but the PVC stays in the Pending state:
# kubectl get pvc -n ceph
NAME       STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
ceph-pvc   Pending                                      ceph-rbd       2m
# kubectl describe pvc -n ceph
......
Warning ProvisioningFailed 27s persistentvolume-controller Failed to provision volume with StorageClass "ceph-rbd": failed to create rbd image: exit status 1, command output: 2018-07-31 11:10:33.395991 7faa3558b7c0 -1 did not load config file, using default settings.
rbd: extraneous parameter --image-feature

Solution:
The persistentvolume-controller runs on the master node under kube-controller-manager, so the ceph-common package must be installed on the master node as well.