1. Problem:
# ceph health
HEALTH_WARN application not enabled on 1 pool(s)

Solution:
# ceph health detail
HEALTH_WARN application not enabled on 1 pool(s)
POOL_APP_NOT_ENABLED application not enabled on 1 pool(s)
    application not enabled on pool 'kube'
    use 'ceph osd pool application enable <pool-name> <app-name>', where <app-name> is 'cephfs', 'rbd', 'rgw', or freeform for custom applications.
# ceph osd pool application enable kube rbd
enabled application 'rbd' on pool 'kube'
# ceph health
HEALTH_OK

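When several pools trigger this warning, the pool names can be pulled out of the `ceph health detail` text instead of being copied by hand. The following is a minimal sketch; `pools_without_app` is a hypothetical helper name, and it assumes the message wording shown in the output above.

```shell
# Hypothetical helper: read `ceph health detail` output on stdin and
# print the name of every pool that has no application enabled.
# Assumes the message format "application not enabled on pool '<name>'".
pools_without_app() {
    sed -n "s/.*application not enabled on pool '\([^']*\)'.*/\1/p"
}
```

It could be used as `ceph health detail | pools_without_app`, with each printed name then passed to `ceph osd pool application enable <pool-name> <app-name>`. Which application to enable (rbd, cephfs, rgw) still has to be decided per pool.
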
2. Problem:
# ceph -s
  cluster:
    id:     e781a2e4-097d-4867-858d-bdbd3a264435
    health: HEALTH_WARN
            clock skew detected on mon.ceph02, mon.ceph03

Solution:
#### Confirm that the NTP service is working properly
# systemctl status ntpd
#### Adjust the clock-skew thresholds in the Ceph configuration
# vim /etc/ceph/ceph.conf
### Add under the [global] section:
mon clock drift allowed = 2
mon clock drift warn backoff = 30
#### Push the configuration file to the mon nodes that need it
# cd /etc/ceph/
# ceph-deploy --overwrite-conf config push ceph{01..03}
#### Restart the mon service and verify
# systemctl restart ceph-mon.target
# ceph -s
  cluster:
    id:     e781a2e4-097d-4867-858d-bdbd3a264435
    health: HEALTH_OK

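Before raising the thresholds, it helps to know exactly which monitors are reported as skewed so that NTP can be checked on those hosts first. A minimal sketch follows; `skewed_mons` is a hypothetical helper name, and it assumes the "clock skew detected on mon.X, mon.Y" wording shown above.

```shell
# Hypothetical helper: read `ceph -s` (or `ceph health detail`) output on
# stdin and print one monitor name per line, without the "mon." prefix.
skewed_mons() {
    sed -n 's/.*clock skew detected on //p' \
        | tr ',' '\n' \
        | sed -e 's/^ *mon\.//' -e 's/ *$//'
}
```

For the output above, `ceph -s | skewed_mons` would list ceph02 and ceph03. If the monitor names match the hostnames (an assumption that holds for this cluster's naming but not in general), those are the machines on which to run `systemctl status ntpd`.
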
3. Problem:
# rbd map abc/zhijian --id admin
rbd: sysfs write failed
RBD image feature set mismatch. Try disabling features unsupported by the kernel with "rbd feature disable".
In some cases useful info is found in syslog - try "dmesg | tail".
rbd: map failed: (6) No such device or address

Solution:
The mapping fails because the kernel does not support some of the features enabled on the block device image, so disable them:
# rbd feature disable abc/zhijian exclusive-lock, object-map, fast-diff, deep-flatten
# rbd info abc/zhijian
rbd image 'zhijian':
        size 1024 MB in 256 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.1011074b0dc51
        format: 2
        features: layering
        flags:
        create_timestamp: Sun May  6 13:35:21 2018
# rbd map abc/zhijian --id admin
/dev/rbd0

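The list of features to disable can be derived from `rbd info` rather than typed from memory. The sketch below assumes, as in the case above, that `layering` is the only feature the kernel client supports; newer kernels support more, so treat that as an assumption to verify. `features_to_disable` is a hypothetical helper name.

```shell
# Hypothetical helper: read `rbd info` output on stdin and print, on one
# line, every enabled feature except "layering".
# Assumption: the kernel client supports only the layering feature.
features_to_disable() {
    sed -n 's/^[[:space:]]*features:[[:space:]]*//p' \
        | tr ',' '\n' \
        | sed -e 's/^ *//' -e 's/ *$//' -e '/^layering$/d' -e '/^$/d' \
        | tr '\n' ' ' \
        | sed 's/ $//'
}
```

The result of `rbd info abc/zhijian | features_to_disable` can then be passed as the feature arguments of `rbd feature disable abc/zhijian`. An empty result means there is nothing to disable.
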
4. Problem:
# ceph osd pool delete cephfs_data
Error EPERM: WARNING: this will *PERMANENTLY DESTROY* all data stored in pool cephfs_data. If you are *ABSOLUTELY CERTAIN* that is what you want, pass the pool name *twice*, followed by --yes-i-really-really-mean-it.
# ceph osd pool delete cephfs_data cephfs_data --yes-i-really-really-mean-it
Error EPERM: pool deletion is disabled; you must first set the mon_allow_pool_delete config option to true before you can destroy a pool

Solution:
# tail -n 2 /etc/ceph/ceph.conf
[mon]
mon allow pool delete = true

Push the configuration file to the mon nodes that need it:
# cd /etc/ceph/
# ceph-deploy --overwrite-conf config push ceph{01..03}

Restart the mon service and verify:
# systemctl restart ceph-mon.target
# ceph osd pool delete cephfs_data cephfs_data --yes-i-really-really-mean-it
pool 'cephfs_data' removed

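An alternative to editing ceph.conf and restarting the monitors is to flip the option at runtime with `ceph tell mon.* injectargs`, delete the pool, and switch the option back. This is a different technique from the one used above, and the change does not survive a monitor restart. The sketch below is illustrative only: `delete_pool` and the `DRY_RUN` variable are hypothetical names, and setting `DRY_RUN` makes the function print the commands instead of running them.

```shell
# Hypothetical helper: temporarily allow pool deletion, delete one pool,
# then disable pool deletion again.
# Set DRY_RUN=1 to print the commands instead of executing them.
delete_pool() {
    pool_name=$1
    run=${DRY_RUN:+echo}   # expands to "echo" in dry-run mode, else to nothing

    $run ceph tell 'mon.*' injectargs '--mon-allow-pool-delete=true' || return 1
    $run ceph osd pool delete "$pool_name" "$pool_name" --yes-i-really-really-mean-it
    status=$?
    # Re-disable deletion even if the delete itself failed.
    $run ceph tell 'mon.*' injectargs '--mon-allow-pool-delete=false'
    return "$status"
}
```

Running `DRY_RUN=1 delete_pool cephfs_data` first is a cheap way to review the three commands before anything destructive happens.
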
5. Problem:
# ceph osd pool rm cephfs_data cephfs_data --yes-i-really-really-mean-it
Error EBUSY: pool 'cephfs_data' is in use by CephFS

Solution:
# ceph fs ls
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
# ceph fs rm cephfs --yes-i-really-mean-it
Error EINVAL: all MDS daemons must be inactive before removing filesystem
# systemctl stop ceph-mds.target
# ceph fs rm cephfs
Error EPERM: this is a DESTRUCTIVE operation and will make data in your filesystem permanently inaccessible. Add --yes-i-really-mean-it if you are sure you wish to continue.
# ceph fs rm cephfs --yes-i-really-mean-it
# ceph fs ls
No filesystems enabled

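Once the filesystem is removed, its metadata and data pools are no longer in use and can be deleted. It is worth recording which pools belong to the filesystem before running `ceph fs rm`. A minimal sketch, assuming the single-line `ceph fs ls` format shown above; `fs_pools` is a hypothetical helper name.

```shell
# Hypothetical helper: read `ceph fs ls` output on stdin and print the
# metadata pool and data pools of the named filesystem, one per line.
fs_pools() {
    fs_name=$1
    sed -n "s/^name: ${fs_name}, metadata pool: \([^,]*\), data pools: \[\(.*\)].*/\1 \2/p" \
        | tr -s ' ' '\n' \
        | sed '/^$/d'
}
```

For the cluster above, `ceph fs ls | fs_pools cephfs` would list cephfs_metadata and cephfs_data, which are the pools that `ceph osd pool rm` previously refused to remove.
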
6. Problem:
A pod created with a static PV stays in the ContainerCreating state:
# kubectl get pod ceph-pod1
NAME        READY     STATUS              RESTARTS   AGE
ceph-pod1   0/1       ContainerCreating   0          10s
......
# kubectl describe pod ceph-pod1
Warning FailedMount 41s (x8 over 1m) kubelet, node01 MountVolume.WaitForAttach failed for volume "ceph-pv" : fail to check rbd image status with: (executable file not found in $PATH), rbd output: ()
Warning FailedMount 0s kubelet, node01 Unable to mount volumes for pod "ceph-pod1_default(14e3a07d-93a8-11e8-95f6-000c29b1ec26)": timeout expired waiting for volumes to attach or mount for pod "default"/"ceph-pod1". list of unmounted volumes=[ceph-vol1]. list of unattached volumes=[ceph-vol1 default-token-v9flt]

Solution:
Install the latest ceph-common package on the node. The Ceph cluster runs the latest Mimic release, while the version in the base repository is too old, which causes this problem.

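The "executable file not found in $PATH" message means the kubelet could not find the `rbd` binary on that node. A quick check that can be run on each node is sketched below; `check_rbd_client` is a hypothetical helper name, and the optional argument exists only so the check can be tried against other commands.

```shell
# Hypothetical helper: report whether the rbd client (or another command
# given as $1) is available in PATH on this machine.
check_rbd_client() {
    cmd=${1:-rbd}
    if command -v "$cmd" >/dev/null 2>&1; then
        echo "$cmd found at $(command -v "$cmd")"
    else
        echo "$cmd not found in PATH; install ceph-common" >&2
        return 1
    fi
}
```

When the binary is present, comparing the output of `rbd --version` on the node with the cluster's release is a reasonable next step, since an outdated client is the cause described above.
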
7. Problem:
When creating a dynamic PV, the PVC stays in the Pending state:
# kubectl get pvc -n ceph
NAME       STATUS    VOLUME    CAPACITY   ACCESS MODES   STORAGECLASS   AGE
ceph-pvc   Pending                                       ceph-rbd       2m
# kubectl describe pvc -n ceph
......
Warning ProvisioningFailed 27s persistentvolume-controller Failed to provision volume with StorageClass "ceph-rbd": failed to create rbd image: exit status 1, command output: 2018-07-31 11:10:33.395991 7faa3558b7c0 -1 did not load config file, using default settings.
rbd: extraneous parameter --image-feature

Solution:
The persistentvolume-controller runs on the master node as part of kube-controller-manager, so the ceph-common package must also be installed on the master node.
9 y- B7 y1 D5 D+ o
R5 |! Y2 ^5 m& g9 w
6 |- Q8 b( b5 `, S0 O5 e2 ~# L2 @; f |
|