
Handling the "pgs not deep-scrubbed in time" warning

Posted on 2022-12-20 17:00:15
I noticed a warning in the cluster status. It does not affect normal use of the cluster, but it bothered me too much to leave alone, so here is what I did. First, look at the detailed message:
HEALTH_WARN 2 pgs not deep-scrubbed in time
PG_NOT_DEEP_SCRUBBED 2 pgs not deep-scrubbed in time
    pg 18.41 not deep-scrubbed since 2022-12-07 20:15:50.550606
    pg 5.16d not deep-scrubbed since 2022-12-07 22:21:58.141071

Trigger a deep scrub manually on the affected PGs:
[root@controller1 ~]# ceph pg deep-scrub 18.41
instructing pg 18.41 on osd.6 to deep-scrub
[root@controller1 ~]# ceph pg deep-scrub 5.16
instructing pg 5.16 on osd.13 to deep-scrub

Note: the warning lists pg 5.16d, but the second command targets pg 5.16. To deep-scrub the flagged PG, pass the full ID 5.16d.
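Copying PG IDs by hand is easy to get wrong, so the manual step can be scripted. The sketch below assumes the warning text comes from `ceph health detail` (the post does not name the command) and that each flagged PG appears on a line of the form `pg <id> not deep-scrubbed since ...`. The function name `stale_deep_scrub_pgs` is made up for this example.

```shell
#!/bin/sh
# Read `ceph health detail`-style text on stdin and print the ID of every PG
# reported as "not deep-scrubbed", one ID per line.
stale_deep_scrub_pgs() {
    awk '$1 == "pg" && $2 ~ /^[0-9]+\.[0-9a-f]+$/ && $3 == "not" && $4 == "deep-scrubbed" { print $2 }'
}

# Hypothetical usage on a live cluster (not executed here):
#   ceph health detail | stale_deep_scrub_pgs | while read -r pg; do
#       ceph pg deep-scrub "$pg"
#   done
```

Keeping the parser separate from the `ceph pg deep-scrub` call lets you check the list of IDs before anything is sent to the cluster.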

Next, check the current deep-scrub interval and temporarily raise it from 604800 seconds (7 days) to 3628800 seconds (42 days):
[root@controller1 ~]# ceph daemon osd.6 config show |grep osd_deep_scrub_interval
    "osd_deep_scrub_interval": "604800.000000",
[root@controller1 ~]# ceph config set global osd_deep_scrub_interval 3628800
[root@controller1 ~]# ceph daemon osd.6 config show |grep osd_deep_scrub_interval
    "osd_deep_scrub_interval": "3628800.000000",

Once the cluster is back to normal, change the value back:
[root@controller1 ~]# ceph -s
  cluster:
    id:     9d22e36a-2bdd-4d2d-8394-48af75ead777
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 5M)
    mgr: ceph1(active, since 19M), standbys: ceph2,ceph3
    osd: 40 osds: 40 up (since 3w), 40 in (since 12M)
    rgw: 3 daemons active (host09, host10, host11)

  task status:

  data:
    pools:   16 pools, 3072 pgs
    objects: 4.20M objects, 16 TiB
    usage:   40 TiB used, 107 TiB / 148 TiB avail
    pgs:     3067 active+clean
             5    active+clean+scrubbing+deep

  io:
    client:   403 KiB/s rd, 9.5 MiB/s wr, 514 op/s rd, 466 op/s wr

[root@ ~]# ceph config set global osd_deep_scrub_interval 604800
[root@ ~]#

systemctl restart ceph-osd@6.service

Original poster | Posted on 2022-12-20 17:00:16
Root cause
RHCS 4 introduced two new parameters that control the scrub-interval warnings. Their default values are:
"mon_warn_pg_not_deep_scrubbed_ratio": "0.750000",
"mon_warn_pg_not_scrubbed_ratio": "0.500000",
These ratios are applied to the following parameters, whose default values are:
"osd_scrub_max_interval": "604800.000000",
"osd_deep_scrub_interval": "604800.000000"
When a PG has not been scrubbed or deep-scrubbed for longer than the interval extended by the configured ratio, a "pgs not scrubbed in time" or "pgs not deep-scrubbed in time" warning appears in the Ceph status.
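To see when the warning would fire with the defaults quoted above, here is a small worked calculation. It assumes the monitor warns once a PG's last scrub is older than interval + ratio × interval. That formula is my reading of the parameter descriptions and is not stated in the quoted article. The helper name `warn_after_seconds` is made up for this example.

```shell
#!/bin/sh
# Assumed rule: warn when the time since the last (deep) scrub exceeds
#   interval + ratio * interval
# Usage: warn_after_seconds INTERVAL_SECONDS RATIO
warn_after_seconds() {
    awk -v interval="$1" -v ratio="$2" 'BEGIN { printf "%d\n", interval * (1 + ratio) }'
}

warn_after_seconds 604800 0.75   # deep scrub defaults: 1058400 s = 12.25 days
warn_after_seconds 604800 0.50   # regular scrub defaults: 907200 s = 10.5 days
```

Under that assumption the numbers roughly match this thread: pg 18.41 was last deep-scrubbed on 2022-12-07 and the warning was reported on 2022-12-20, a little under 13 days later and just past the 12.25-day mark.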

Setting "mon_warn_pg_not_deep_scrubbed_ratio" or "mon_warn_pg_not_scrubbed_ratio" to 0 disables the corresponding warning. Evaluate the cluster's past usage carefully before doing so.

Sometimes the warning still appears even though osd_deep_scrub_interval has been increased. This happens when the scrubbing parameters are not applied globally, because these settings are used by both OSDs and MONs.
The OSDs use them to decide when to run a scrub, and the MONs/MGRs use them to decide whether to show the warning.
You can set the value globally as:

# ceph config set global osd_deep_scrub_interval 3628800
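A quick way to spot the mismatch described above is to compare the value each daemon type resolves for the option. The sketch below uses `ceph config get <who> <option>`, which reads the centralized configuration database. It will not see values set only in a local ceph.conf or injected at runtime, so treat it as a first check. The function name `option_is_consistent` and the `CEPH` override variable are made up for this example.

```shell
#!/bin/sh
# CEPH names the CLI to call; point it at a stub to dry-run the function.
CEPH="${CEPH:-ceph}"

# Exit status 0 only if every listed section reports the same value.
# Usage: option_is_consistent OPTION SECTION...
option_is_consistent() {
    option="$1"; shift
    expected=""
    for who in "$@"; do
        value=$("$CEPH" config get "$who" "$option") || return 2
        if [ -z "$expected" ]; then
            expected="$value"
        elif [ "$value" != "$expected" ]; then
            echo "mismatch: $who has $value, expected $expected" >&2
            return 1
        fi
    done
    return 0
}

# Hypothetical usage on a live cluster (not executed here):
#   option_is_consistent osd_deep_scrub_interval mon mgr osd
```

If the check fails, setting the option at the `global` level, as in the command above, gives the OSDs and the MONs/MGRs the same value.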
Diagnostic steps
Check the scrub-related settings through the admin socket:
ceph --admin-daemon /var/run/ceph/<admin_socket_name>.asok config show | grep scrub
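`grep scrub` returns every scrub-related option, which can be a long list. A narrower filter that keeps only the four settings discussed in this thread might look like the sketch below. It assumes the daemon prints one `"name": "value",` pair per line, as in the transcripts above, and it only shows whichever of the four options that daemon actually reports. The function name `scrub_warning_settings` is made up for this example.

```shell
#!/bin/sh
# Keep only the interval and warning-ratio options from `config show` output
# read on stdin.
scrub_warning_settings() {
    grep -E '"(osd_deep_scrub_interval|osd_scrub_max_interval|mon_warn_pg_not_(deep_)?scrubbed_ratio)"'
}

# Hypothetical usage (not executed here):
#   ceph daemon osd.6 config show | scrub_warning_settings
```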

Original poster | Posted on 2022-12-20 17:00:17
You can set the deep scrub period to 2 weeks to stretch the deep scrub window. Instead of

osd_deep_scrub_interval = 604800

use:

osd_deep_scrub_interval = 1209600
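The interval is given in seconds, so the values in this thread are easier to check as days. A tiny helper for the conversion (the name `days_to_seconds` is made up for this example):

```shell
#!/bin/sh
# Convert a whole number of days to seconds (86400 seconds per day).
days_to_seconds() {
    echo $(( $1 * 86400 ))
}

days_to_seconds 7    # 604800, the default interval
days_to_seconds 14   # 1209600, the two-week value suggested above
days_to_seconds 42   # 3628800, the temporary value used earlier in the thread
```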