1. Problem:
# ceph health
HEALTH_WARN application not enabled on 1 pool(s)

Solution:
# ceph health detail
HEALTH_WARN application not enabled on 1 pool(s)
POOL_APP_NOT_ENABLED application not enabled on 1 pool(s)
    application not enabled on pool 'kube'
    use 'ceph osd pool application enable <pool-name> <app-name>', where <app-name> is 'cephfs', 'rbd', 'rgw', or freeform for custom applications.
# ceph osd pool application enable kube rbd
enabled application 'rbd' on pool 'kube'
# ceph health
HEALTH_OK

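When several pools raise this warning, the fix above can be scripted. The following is a minimal sketch, not part of the original post: `pools_without_app` is a hypothetical helper that reads `ceph health detail` output on stdin and relies on the line format shown above. Using `rbd` as the application name is an assumption and must match how each pool is really used.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): read `ceph health detail`
# output on stdin and print the name of every pool that has no application
# enabled, one per line. It relies on the line format shown above:
#     application not enabled on pool 'kube'
pools_without_app() {
    sed -n "s/.*application not enabled on pool '\([^']*\)'.*/\1/p"
}

# Usage on a node with an admin keyring. The application name (rbd here) is an
# assumption; pick cephfs, rbd, or rgw according to how the pool is used.
#
#   ceph health detail | pools_without_app | while read -r pool; do
#       ceph osd pool application enable "$pool" rbd
#   done
```

The helper only parses text, so it can be tried on saved output before running anything against the cluster.
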
2. Problem:
# ceph -s
  cluster:
    id:     e781a2e4-097d-4867-858d-bdbd3a264435
    health: HEALTH_WARN
            clock skew detected on mon.ceph02, mon.ceph03

Solution:
#### Check that the NTP service is working properly
# systemctl status ntpd
#### Raise the clock skew thresholds in the Ceph configuration
# vim /etc/ceph/ceph.conf
### Add under the [global] section:
mon clock drift allowed = 2
mon clock drift warn backoff = 30
#### Push the configuration file to the mon nodes that need it
# cd /etc/ceph/
# ceph-deploy --overwrite-conf config push ceph{01..03}
#### Restart the mon service and verify
# systemctl restart ceph-mon.target
# ceph -s
  cluster:
    id:     e781a2e4-097d-4867-858d-bdbd3a264435
    health: HEALTH_OK

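After the push, it can help to confirm on each mon node that the options really landed in the `[global]` section of its ceph.conf. The sketch below is an illustration, not part of the original post: `conf_get` is a hypothetical helper that assumes a plain INI-style file, and it treats spaces and underscores in option names as equivalent, as Ceph does. Note that raising `mon clock drift allowed` only widens the tolerance (the default is 0.05 s); the clocks themselves still need a working NTP setup.

```shell
#!/bin/sh
# Hypothetical helper: print the value of an option from one section of an
# INI-style ceph.conf. Spaces and underscores in option names are treated as
# equivalent, so "mon clock drift allowed" matches "mon_clock_drift_allowed".
conf_get() {
    # $1 = config file, $2 = section name without brackets, $3 = option name
    awk -v section="$2" -v key="$3" '
        BEGIN { gsub(/[ _]+/, "_", key) }
        /^[ \t]*\[/ {
            # Section header: remember whether it is the requested one.
            header = $0
            gsub(/^[ \t]+|[ \t]+$/, "", header)
            in_section = (header == "[" section "]")
            next
        }
        in_section && index($0, "=") > 0 {
            name = substr($0, 1, index($0, "=") - 1)
            value = substr($0, index($0, "=") + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            gsub(/[ _]+/, "_", name)
            if (name == key) print value
        }
    ' "$1"
}

# Usage on a mon node:
#
#   conf_get /etc/ceph/ceph.conf global mon_clock_drift_allowed
#   conf_get /etc/ceph/ceph.conf global mon_clock_drift_warn_backoff
```

An empty result means the option is missing from that section, which usually points at a push that did not reach the node.
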
3. Problem:
# rbd map abc/zhijian --id admin
rbd: sysfs write failed
RBD image feature set mismatch. Try disabling features unsupported by the kernel with "rbd feature disable".
In some cases useful info is found in syslog - try "dmesg | tail".
rbd: map failed: (6) No such device or address

Solution:
The kernel does not support some features of the block device image, so the mapping fails. Disable those features and map again:
# rbd feature disable abc/zhijian exclusive-lock, object-map, fast-diff, deep-flatten
# rbd info abc/zhijian
rbd image 'zhijian':
        size 1024 MB in 256 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.1011074b0dc51
        format: 2
        features: layering
        flags:
        create_timestamp: Sun May  6 13:35:21 2018
# rbd map abc/zhijian --id admin
/dev/rbd0

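The list of features to disable can be derived from `rbd info` instead of typed by hand. This is a rough sketch under two assumptions that the post does not state: the kernel client supports only `layering` (the one feature left enabled above), and the `features:` line is a comma-separated list in the format shown above. `unsupported_features` and the `KEEP` variable are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper: read `rbd info` output on stdin and print every enabled
# feature that is not listed in KEEP (default: layering), space-separated.
unsupported_features() {
    awk -v keep="${KEEP:-layering}" '
        BEGIN {
            n = split(keep, k, " ")
            for (i = 1; i <= n; i++) wanted[k[i]] = 1
        }
        $1 == "features:" {
            out = ""
            for (i = 2; i <= NF; i++) {
                f = $i
                sub(/,$/, "", f)            # strip the trailing comma
                if (f != "" && !(f in wanted))
                    out = out (out == "" ? "" : " ") f
            }
            print out
        }
    '
}

# Usage (image name taken from the example above):
#
#   feats=$(rbd info abc/zhijian | unsupported_features)
#   [ -n "$feats" ] && rbd feature disable abc/zhijian $feats
```

`$feats` is left unquoted on purpose so that each feature becomes its own argument.
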
4. Problem:
# ceph osd pool delete cephfs_data
Error EPERM: WARNING: this will *PERMANENTLY DESTROY* all data stored in pool cephfs_data. If you are *ABSOLUTELY CERTAIN* that is what you want, pass the pool name *twice*, followed by --yes-i-really-really-mean-it.
# ceph osd pool delete cephfs_data cephfs_data --yes-i-really-really-mean-it
Error EPERM: pool deletion is disabled; you must first set the mon_allow_pool_delete config option to true before you can destroy a pool

Solution:
# tail -n 2 /etc/ceph/ceph.conf
[mon]
mon allow pool delete = true

Push the configuration file to the mon nodes that need it:
# cd /etc/ceph/
# ceph-deploy --overwrite-conf config push ceph{01..03}

Restart the mon service and verify:
# systemctl restart ceph-mon.target
# ceph osd pool delete cephfs_data cephfs_data --yes-i-really-really-mean-it
pool 'cephfs_data' removed

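As an alternative to editing ceph.conf and restarting the monitors, the option can be changed at runtime with `ceph tell ... injectargs`. The original post does not use this approach; the sketch below is a hedged illustration. `run` and `delete_pool` are hypothetical helpers, and `DRY_RUN=1` only prints the commands so the sequence can be reviewed first.

```shell
#!/bin/sh
# Print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper: allow pool deletion at runtime, delete one pool, then
# switch the protection back on. Nothing is written to ceph.conf.
delete_pool() {
    pool=$1
    run ceph tell 'mon.*' injectargs '--mon-allow-pool-delete=true' || return 1
    run ceph osd pool delete "$pool" "$pool" --yes-i-really-really-mean-it
    status=$?
    # Re-enable the safety switch whether or not the delete succeeded.
    run ceph tell 'mon.*' injectargs '--mon-allow-pool-delete=false'
    return "$status"
}

# Usage:
#
#   DRY_RUN=1 delete_pool cephfs_data   # show the commands only
#   delete_pool cephfs_data             # really delete the pool
```

Turning the option back off afterwards keeps the cluster protected against accidental deletions, which a permanent `mon allow pool delete = true` in ceph.conf does not.
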
5. Problem:
# ceph osd pool rm cephfs_data cephfs_data --yes-i-really-really-mean-it
Error EBUSY: pool 'cephfs_data' is in use by CephFS

Solution:
The pool still belongs to a CephFS filesystem. Stop the MDS daemons and remove the filesystem first:
# ceph fs ls
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
# ceph fs rm cephfs --yes-i-really-mean-it
Error EINVAL: all MDS daemons must be inactive before removing filesystem
# systemctl stop ceph-mds.target
# ceph fs rm cephfs
Error EPERM: this is a DESTRUCTIVE operation and will make data in your filesystem permanently inaccessible. Add --yes-i-really-mean-it if you are sure you wish to continue.
# ceph fs rm cephfs --yes-i-really-mean-it
# ceph fs ls
No filesystems enabled

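Before removing a pool, it can be useful to check which filesystem still references it. The following sketch is an illustration only: `fs_using_pool` is a hypothetical helper that parses the `ceph fs ls` line format shown above and is not guaranteed to match other Ceph releases.

```shell
#!/bin/sh
# Hypothetical helper: read `ceph fs ls` output on stdin and print the name of
# each filesystem that uses the given pool as its metadata pool or a data pool.
# Expected line format (as shown above):
#     name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
fs_using_pool() {
    awk -v pool="$1" '
        /^name: / {
            line = $0
            gsub(/,/, " ", line)
            gsub(/\[/, " ", line)
            gsub(/\]/, " ", line)
            n = split(line, w, " ")
            # w[1] is "name:" and w[2] is the filesystem name.
            for (i = 3; i <= n; i++) {
                # Skip the labels "metadata pool:" and "data pools:".
                if (w[i] ~ /:$/ || w[i + 1] ~ /^pools?:$/) continue
                if (w[i] == pool) { print w[2]; break }
            }
        }
    '
}

# Usage:
#
#   ceph fs ls | fs_using_pool cephfs_data
```

An empty result means no filesystem lists the pool, so an EBUSY error on deletion would have a different cause.
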
6. Problem:
A pod created with a static PV stays in the ContainerCreating state:
# kubectl get pod ceph-pod1
NAME        READY     STATUS              RESTARTS   AGE
ceph-pod1   0/1       ContainerCreating   0          10s
......
# kubectl describe pod ceph-pod1
Warning  FailedMount  41s (x8 over 1m)  kubelet, node01  MountVolume.WaitForAttach failed for volume "ceph-pv" : fail to check rbd image status with: (executable file not found in $PATH), rbd output: ()
Warning  FailedMount  0s  kubelet, node01  Unable to mount volumes for pod "ceph-pod1_default(14e3a07d-93a8-11e8-95f6-000c29b1ec26)": timeout expired waiting for volumes to attach or mount for pod "default"/"ceph-pod1". list of unmounted volumes=[ceph-vol1]. list of unattached volumes=[ceph-vol1 default-token-v9flt]

Solution: Install the latest ceph-common package on the node. The Ceph cluster runs the latest release (mimic), while the version in the base repository is too old, which causes this error.

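The message "executable file not found in $PATH" means the kubelet could not find an `rbd` binary on the node. A quick check along these lines can be run on every node; `have_cmd` is a hypothetical helper, and the `yum` command assumes a CentOS-style node with a Ceph mimic repository already configured, which the post implies but does not show.

```shell
#!/bin/sh
# Succeeds if the named executable can be found in $PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Usage on each Kubernetes node (package manager and repository are
# assumptions, see the note above):
#
#   have_cmd rbd || yum install -y ceph-common
```

The check uses the PATH of the current shell, which may differ from the PATH the kubelet service runs with.
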
7. Problem:
When creating a dynamic PV, the PVC stays in the Pending state:
# kubectl get pvc -n ceph
NAME       STATUS    VOLUME    CAPACITY   ACCESS MODES   STORAGECLASS   AGE
ceph-pvc   Pending                                       ceph-rbd       2m
# kubectl describe pvc -n ceph
......
Warning  ProvisioningFailed  27s  persistentvolume-controller  Failed to provision volume with StorageClass "ceph-rbd": failed to create rbd image: exit status 1, command output: 2018-07-31 11:10:33.395991 7faa3558b7c0 -1 did not load config file, using default settings.
rbd: extraneous parameter --image-feature

Solution:
The persistentvolume-controller runs on the master node as part of kube-controller-manager, so the master node also needs the ceph-common package installed.

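Because the provisioner calls the `rbd` binary on the master, comparing the client version there with the version on the worker nodes and the cluster can confirm the fix. The sketch below is an illustration: `client_version` is a hypothetical helper, and it assumes the usual `ceph version X.Y.Z (...)` output format of `rbd --version`.

```shell
#!/bin/sh
# Hypothetical helper: read `rbd --version` (or `ceph --version`) output on
# stdin and print only the version number, for example 13.2.1.
client_version() {
    awk '$1 == "ceph" && $2 == "version" { print $3; exit }'
}

# Usage on the master and on each node:
#
#   rbd --version | client_version
```

If the master reports a much older version than the cluster, updating ceph-common there is the likely next step.
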