How to resolve Ceph pool getting active+remapped+backfill_toofull

admin · posted on 2022-7-22 11:11:17

First Attempt: Reweighting the OSDs
I previously had a similar issue where an OSD was nearfull, and running a reweight helped resolve it, so I tried the same thing here:
ceph osd reweight-by-utilization
This is what the cluster looked like before starting the reweight process.
[root@osd1 ~]# ceph -s
  cluster:
    id:     ffdb9e09-fdca-48bb-b7fb-cd17151d5c09
    health: HEALTH_ERR
            1 backfillfull osd(s)
            2 pool(s) backfillfull
            26199/6685016 objects misplaced (0.392%)
            Degraded data redundancy (low space): 1 pg backfill_toofull

  services:
    mon: 3 daemons, quorum osd1,osd2,osd3
    mgr: osd1(active), standbys: osd2
    mds: cephfs-2/2/2 up  {0=osd1=up:active,1=osd2=up:active}, 1 up:standby
    osd: 15 osds: 15 up, 15 in; 1 remapped pgs

  data:
    pools:   2 pools, 256 pgs
    objects: 3264k objects, 12342 GB
    usage:   24773 GB used, 18898 GB / 43671 GB avail
    pgs:     26199/6685016 objects misplaced (0.392%)
             255 active+clean
             1   active+remapped+backfill_toofull

[root@osd1 ~]# ceph osd status
+----+-------------------------+-------+-------+--------+---------+--------+---------+------------------------+
| id |          host       |  used | avail | wr ops | wr data | rd ops | rd data |       state       |
+----+-------------------------+-------+-------+--------+---------+--------+---------+------------------------+
| 0  | osd1.example.com | 1741G | 1053G | 0 |    0 | 0 |    0 |    exists,up       |
| 1  | osd2.example.com | 2034G |  760G | 0 |    0 | 0 |    0 |    exists,up       |
| 2  | osd3.example.com | 1937G |  857G | 0 |    0 | 0 |    0 |    exists,up       |
| 3  | osd4.example.com | 2031G |  763G | 0 |    0 | 0 |    0 |    exists,up       |
| 4  | osd1.example.com | 2032G |  761G | 0 |    0 | 0 |    0 |    exists,up       |
| 5  | osd1.example.com | 2033G |  761G | 0 |    0 | 0 |    0 |    exists,up       |
| 6  | osd2.example.com |  485G |  446G | 0 |    0 | 0 |    0 |    exists,up       |
| 7  | osd3.example.com |  677G |  254G | 0 |    0 | 0 |    0 |    exists,up       |
| 8  | osd3.example.com |  869G | 61.7G | 0 |    0 | 0 |    0 | backfillfull,exists,up |
| 9  | osd4.example.com |  676G |  255G | 0 |    0 | 0 |    0 |    exists,up       |
| 10 | osd4.example.com |  194G |  736G | 0 |    0 | 0 |    0 |    exists,up       |
| 11 | osd5.example.com | 2806G | 2782G | 0 |    0 | 0 |    0 |    exists,up       |
| 12 | osd5.example.com | 1938G | 3650G | 0 |    0 | 0 |    0 |    exists,up       |
| 13 | osd5.example.com | 2901G | 2687G | 0 |    0 | 0 |    0 |    exists,up       |
| 14 | osd5.example.com | 2412G | 3067G | 0 |    0 | 0 |    0 |    exists,up       |
+----+-------------------------+-------+-------+--------+---------+--------+---------+------------------------+
[root@osd1 ~]# ceph osd reweight-by-utilization
moved 9 / 512 (1.75781%)
avg 34.1333
stddev 16.7087 -> 16.5484 (expected baseline 5.64427)
min osd.6 with 8 -> 8 pgs (0.234375 -> 0.234375 * mean)
max osd.13 with 60 -> 60 pgs (1.75781 -> 1.75781 * mean)
oload 120
max_change 0.05
max_change_osds 4
average_utilization 0.5673
overload_utilization 0.6807
osd.8 weight 0.6501 -> 0.6001
osd.1 weight 0.7501 -> 0.7001
osd.5 weight 0.8852 -> 0.8353
osd.4 weight 0.9500 -> 0.9000
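The figures in this output can be cross-checked by hand. The Python sketch below recomputes the overload threshold (average utilization times oload / 100) and ranks the OSDs by utilization, using the rounded used/avail values from the `ceph osd status` table above. It is a simplified illustration of how the candidates appear to be chosen, not Ceph's actual implementation; the names `OSDS`, `utilization` and `pick_overloaded` are made up for this example.

```python
# (osd id, used GB, avail GB), copied from the `ceph osd status` table above.
OSDS = [
    (0, 1741, 1053), (1, 2034, 760), (2, 1937, 857), (3, 2031, 763),
    (4, 2032, 761), (5, 2033, 761), (6, 485, 446), (7, 677, 254),
    (8, 869, 61.7), (9, 676, 255), (10, 194, 736), (11, 2806, 2782),
    (12, 1938, 3650), (13, 2901, 2687), (14, 2412, 3067),
]


def utilization(used_gb, avail_gb):
    """Fraction of the OSD's capacity that is in use."""
    return used_gb / (used_gb + avail_gb)


def pick_overloaded(osds, average_utilization, oload=120, max_change_osds=4):
    """Return (threshold, ids of the fullest OSDs above it), fullest first.

    Assumption: an OSD is a candidate when its utilization exceeds
    average_utilization * oload / 100, and at most max_change_osds
    candidates are changed per run.
    """
    threshold = average_utilization * oload / 100
    ranked = sorted(osds, key=lambda o: utilization(o[1], o[2]), reverse=True)
    over = [osd_id for osd_id, used, avail in ranked
            if utilization(used, avail) > threshold]
    return threshold, over[:max_change_osds]


# 0.5673 is the average_utilization reported in the output above
# (it also matches 24773 GB used / 43671 GB total from `ceph -s`).
threshold, picked = pick_overloaded(OSDS, average_utilization=0.5673)
print(f"overload threshold: {threshold:.4f}")
print("reweight candidates:", picked)
```

With these inputs the threshold comes out at about 0.68, and the four fullest OSDs above it are osd.8, osd.1, osd.5 and osd.4, the same four whose weights were lowered by 0.05 in the output above.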
This process takes a while to run, depending on the size of your cluster and your configuration.
For me it took about 24 hours to complete, and it didn’t resolve my issue. I attempted another reweight, and after a further 24 hours I had two OSDs with a status of backfillfull, so I obviously needed to find another way of getting this resolved.
Second Attempt: Increasing the PG Count
I did some additional checking and looked further into the issue.
I worked through the OSD troubleshooting guide first and then the PG troubleshooting guide, and tracked the problem down to a PG issue.
It looks like pg 1.33 is short on space on its backfill target and is not continuing with the backfill. The good news is that we have misplaced objects rather than missing objects, and the cluster keeps running during this process.
[root@osd1 ~]# ceph health detail
HEALTH_ERR 2 backfillfull osd(s); 2 pool(s) backfillfull; 70105/6685016 objects misplaced (1.049%); Degraded data redundancy (low space): 1 pg backfill_toofull
OSD_BACKFILLFULL 2 backfillfull osd(s)
    osd.8 is backfill full
    osd.9 is backfill full
POOL_BACKFILLFULL 2 pool(s) backfillfull
    pool 'cephfs_data' is backfillfull
    pool 'cephfs_metadata' is backfillfull
OBJECT_MISPLACED 70105/6685016 objects misplaced (1.049%)
PG_DEGRADED_FULL Degraded data redundancy (low space): 1 pg backfill_toofull
    pg 1.33 is active+remapped+backfill_toofull, acting [12,4]
[root@osd1 ~]# ceph osd status
+----+-------------------------+-------+-------+--------+---------+--------+---------+------------------------+
| id |          host       |  used | avail | wr ops | wr data | rd ops | rd data |       state       |
+----+-------------------------+-------+-------+--------+---------+--------+---------+------------------------+
| 0  | osd1.example.com | 1741G | 1053G | 0 |    0 | 0 |    0 |    exists,up       |
| 1  | osd2.example.com | 1937G |  856G | 0 |    0 | 0 |    0 |    exists,up       |
| 2  | osd3.example.com | 2033G |  760G | 0 |    0 | 0 |    0 |    exists,up       |
| 3  | osd4.example.com | 2180G |  614G | 0 |    0 | 0 |    0 |    exists,up       |
| 4  | osd1.example.com | 1936G |  857G | 0 |    0 | 0 |    0 |    exists,up       |
| 5  | osd1.example.com | 1840G |  954G | 0 |    0 | 0 |    0 |    exists,up       |
| 6  | osd2.example.com |  485G |  446G | 0 |    0 | 0 |    0 |    exists,up       |
| 7  | osd3.example.com |  677G |  254G | 0 |    0 | 0 |    0 |    exists,up       |
| 8  | osd3.example.com |  869G | 61.7G | 0 |    0 | 0 |    0 | backfillfull,exists,up |
| 9  | osd4.example.com |  867G | 64.3G | 0 |    0 | 0 |    0 | backfillfull,exists,up |
| 10 | osd4.example.com |  194G |  737G | 0 |    0 | 0 |    0 |    exists,up       |
| 11 | osd5.example.com | 2806G | 2782G | 0 |    0 | 0 |    0 |    exists,up       |
| 12 | osd5.example.com | 1938G | 3650G | 0 |    0 | 0 |    0 |    exists,up       |
| 13 | osd5.example.com | 2901G | 2687G | 0 |    0 | 0 |    0 |    exists,up       |
| 14 | osd5.example.com | 2412G | 3067G | 0 |    0 | 0 |    0 |    exists,up       |
+----+-------------------------+-------+-------+--------+---------+--------+---------+------------------------+
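To see how close each OSD is to the thresholds, the used and avail columns can be turned into a fill ratio. The Python sketch below assumes the stock Ceph ratios of 0.85 (nearfull), 0.90 (backfillfull) and 0.95 (full); the values actually in effect on a cluster are listed by `ceph osd dump | grep ratio`. The function names are hypothetical.

```python
def fill_ratio(used_gb, avail_gb):
    """Used space as a fraction of the OSD's total capacity."""
    return used_gb / (used_gb + avail_gb)


def fullness_state(used_gb, avail_gb,
                   nearfull=0.85, backfillfull=0.90, full=0.95):
    """Classify an OSD against the (assumed default) fullness ratios."""
    ratio = fill_ratio(used_gb, avail_gb)
    if ratio >= full:
        return "full"
    if ratio >= backfillfull:
        return "backfillfull"
    if ratio >= nearfull:
        return "nearfull"
    return "ok"


# A few rows from the table above: (osd id, used GB, avail GB).
for osd_id, used, avail in [(3, 2180, 614), (8, 869, 61.7),
                            (9, 867, 64.3), (12, 1938, 3650)]:
    print(osd_id, f"{fill_ratio(used, avail):.1%}", fullness_state(used, avail))
```

osd.8 and osd.9 both work out to roughly 93% full, which is above the assumed backfillfull ratio, while the larger drives on osd5 are only between about a third and a half full.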
Today I have 2 OSDs that are backfillfull, which isn’t good, and pg 1.33 still seems to be the one giving us a problem.
After some additional research I worked out that when I set up my Ceph cluster I had fewer than 10 OSDs, and now I’m running 16 OSDs (the status output above lists 15 of them up and in). I had wrongly assumed there was a single OSD per server, but each OSD manages an individual storage device, so with 4 drives in each server we get 4 OSDs per physical server.
According to the Ceph documentation, the number of PGs you want can be estimated as (OSDs * 100) / Replicas. In my case I now have 16 OSDs and 2 copies of each object:
16 * 100 / 2 = 800
The PG count should be a power of 2, so the result is rounded up to the next power of 2, which is 1024. So I checked the pool’s current PG settings and tried adjusting them to see if that helps.
Remember that when you change pg_num you also need to raise pgp_num to match; until pgp_num follows, the new PGs are not rebalanced across the OSDs.
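The sizing rule above can be sketched in a few lines of Python. This is only the rule of thumb quoted from the documentation, which presents it as a total for the cluster, so a cluster with several pools would split the result between them. The function name `suggested_pg_count` and the `pgs_per_osd` parameter are made up for this example.

```python
def suggested_pg_count(osds, replicas, pgs_per_osd=100):
    """(OSDs * pgs_per_osd) / replicas, rounded up to the next power of 2."""
    raw = osds * pgs_per_osd / replicas
    power = 1
    while power < raw:
        power *= 2
    return power


# 16 OSDs with 2 replicas gives 800, which rounds up to 1024.
print(suggested_pg_count(16, 2))
# The 15 OSDs shown in the status output round up to the same value.
print(suggested_pg_count(15, 2))
```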
[root@osd1 ~]# ceph osd lspools
1 cephfs_data,2 cephfs_metadata,
[root@osd1 ~]# ceph osd pool get cephfs_data size
size: 2
[root@osd1 ~]# ceph osd pool get cephfs_data min_size
min_size: 1
[root@osd1 ~]# ceph osd pool get cephfs_data pg_num
pg_num: 128
[root@osd1 ~]# ceph osd pool get cephfs_data pgp_num
pgp_num: 128
We can see that when I created the pool I used the default of 128, not realizing that I would be adding OSDs over time and that pg_num and pgp_num are meant to be adjusted as the number of OSDs grows. So I attempted to increase pg_num from 128 to 1024.
[root@osd1 ~]# ceph osd pool set cephfs_data pg_num 1024
Error E2BIG: specified pg_num 1024 is too large (creating 920 new PGs on ~15 OSDs exceeds per-OSD max of 32)
The monitor refuses such a radical jump from 128 to 1024 in one step, because it would create more new PGs per OSD than the limit of 32 quoted in the error, so I did a smaller increase from 128 to 256.
[root@osd1 ~]# ceph osd pool set cephfs_data pg_num 256
set pool 1 pg_num to 256
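Reading the error message literally, a single pg_num change may add at most 32 new PGs per OSD. The Python sketch below applies that reading to check which targets would be accepted from a starting point of 128 PGs on 15 OSDs. The rule is an assumption taken from the wording of the message, not a statement of Ceph's exact check, and the function names are hypothetical.

```python
def max_new_pgs(osd_count, per_osd_max=32):
    """Largest number of PGs one pg_num change may add (assumed rule)."""
    return osd_count * per_osd_max


def split_allowed(current_pg_num, new_pg_num, osd_count, per_osd_max=32):
    """True if the increase stays within the assumed per-change limit."""
    new_pgs = new_pg_num - current_pg_num
    return new_pgs <= max_new_pgs(osd_count, per_osd_max)


# Try the power-of-2 targets from the current value of 128.
for target in (256, 512, 1024):
    print(target, split_allowed(128, target, osd_count=15))
```

Under this assumption the limit is 480 new PGs per change, which is consistent with 128 to 256 being accepted and 128 to 1024 being rejected. It also suggests that reaching 1024 takes several smaller steps, with time for the cluster to recover in between.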
This has started the changes in my pool, and it will take some time for the cluster to recover. I’m going to wait for that to complete before making any further adjustments.
So you can see what my Ceph health check looks like, this is where we are after making that change.
[root@osd1 ~]# ceph -s
  cluster:
    id:     ffdb9e09-fdca-48bb-b7fb-cd17151d5c09
    health: HEALTH_ERR
            2 backfillfull osd(s)
            2 pool(s) backfillfull
            2830303/6685016 objects misplaced (42.338%)
            Degraded data redundancy: 2/6685016 objects degraded (0.000%), 1 pg degraded
            Degraded data redundancy (low space): 2 pgs backfill_toofull

  services:
    mon: 3 daemons, quorum osd1,osd2,osd3
    mgr: osd1(active), standbys: osd2
    mds: cephfs-2/2/2 up  {0=osd1=up:active,1=osd2=up:active}, 1 up:standby
    osd: 15 osds: 15 up, 15 in; 130 remapped pgs

  data:
    pools:   2 pools, 384 pgs
    objects: 3264k objects, 12342 GB
    usage:   24915 GB used, 18756 GB / 43671 GB avail
    pgs:     2/6685016 objects degraded (0.000%)
             2830303/6685016 objects misplaced (42.338%)
             253 active+clean
             120 active+remapped+backfill_wait
             8   active+remapped+backfilling
             2   active+remapped+backfill_wait+backfill_toofull
             1   active+recovery_wait+degraded

  io:
    recovery: 95900 kB/s, 24 objects/s

[root@osd1 ~]# ceph health detail
HEALTH_ERR 2 backfillfull osd(s); 2 pool(s) backfillfull; 2792612/6685016 objects misplaced (41.774%); Degraded data redundancy: 2/6685016 objects degraded (0.000%), 1 pg degraded; Degraded data redundancy (low space): 2 pgs backfill_toofull
OSD_BACKFILLFULL 2 backfillfull osd(s)
    osd.8 is backfill full
    osd.9 is backfill full
POOL_BACKFILLFULL 2 pool(s) backfillfull
    pool 'cephfs_data' is backfillfull
    pool 'cephfs_metadata' is backfillfull
OBJECT_MISPLACED 2792612/6685016 objects misplaced (41.774%)
PG_DEGRADED Degraded data redundancy: 2/6685016 objects degraded (0.000%), 1 pg degraded
    pg 1.3a is active+recovery_wait+degraded, acting [11,2]
PG_DEGRADED_FULL Degraded data redundancy (low space): 2 pgs backfill_toofull
    pg 1.33 is active+remapped+backfill_wait+backfill_toofull, acting [12,4]
    pg 1.a6 is active+remapped+backfill_wait+backfill_toofull, acting [7,14]
When I started, only pg 1.33 was showing backfill_toofull; now pg 1.33 and pg 1.a6 are both showing it. Let’s wait for the dust to settle after our last change before making any more adjustments.
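While waiting, progress can be followed by tallying the PG state counts from the `pgs:` section of `ceph -s`. The Python sketch below does this for the counts shown above, pasted in as text; the helper names `tally_pg_states` and `summarize` are made up for this example.

```python
# State counts copied from the `pgs:` section of the `ceph -s` output above.
PG_STATES_AFTER_SPLIT = """\
253 active+clean
120 active+remapped+backfill_wait
8 active+remapped+backfilling
2 active+remapped+backfill_wait+backfill_toofull
1 active+recovery_wait+degraded
"""


def tally_pg_states(text):
    """Parse '<count> <state>' lines into a {state: count} dict."""
    counts = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].isdigit():
            counts[parts[1]] = int(parts[0])
    return counts


def summarize(counts):
    """Return (total PGs, fraction active+clean, PGs in backfill_toofull)."""
    total = sum(counts.values())
    clean = counts.get("active+clean", 0)
    toofull = sum(n for state, n in counts.items()
                  if "backfill_toofull" in state)
    return total, clean / total, toofull


counts = tally_pg_states(PG_STATES_AFTER_SPLIT)
total, clean_fraction, toofull = summarize(counts)
print(f"{total} PGs, {clean_fraction:.1%} clean, {toofull} backfill_toofull")
```

The counts add up to the 384 PGs reported for the two pools, with about two thirds of them active+clean right after the change.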
The Recovery Process
After 24 hours it’s looking better, but the cluster is still going through the recovery process. We’re down from 42% to 18% objects misplaced, and none of our OSDs is flagged backfillfull any more, although health is still HEALTH_ERR because 5 PGs report backfill_toofull. It looks like we’re on the right path.
[root@osd1 ~]# ceph -s
  cluster:
    id:     ffdb9e09-fdca-48bb-b7fb-cd17151d5c09
    health: HEALTH_ERR
            1235611/6685016 objects misplaced (18.483%)
            Degraded data redundancy (low space): 5 pgs backfill_toofull

  services:
    mon: 3 daemons, quorum osd1,osd2,osd3
    mgr: osd1(active), standbys: osd2
    mds: cephfs-2/2/2 up  {0=osd1=up:active,1=osd2=up:active}, 1 up:standby
    osd: 15 osds: 15 up, 15 in; 57 remapped pgs

  data:
    pools:   2 pools, 384 pgs
    objects: 3264k objects, 12342 GB
    usage:   25062 GB used, 18609 GB / 43671 GB avail
    pgs:     1235611/6685016 objects misplaced (18.483%)
             327 active+clean
             49  active+remapped+backfill_wait
             5   active+remapped+backfill_wait+backfill_toofull
             3   active+remapped+backfilling

  io:
    recovery: 38584 kB/s, 9 objects/s

[root@osd1 ~]# ceph -s
  cluster:
    id:     ffdb9e09-fdca-48bb-b7fb-cd17151d5c09
    health: HEALTH_ERR
            1235327/6685016 objects misplaced (18.479%)
            Degraded data redundancy (low space): 5 pgs backfill_toofull

  services:
    mon: 3 daemons, quorum osd1,osd2,osd3
    mgr: osd1(active), standbys: osd2
    mds: cephfs-2/2/2 up  {0=osd1=up:active,1=osd2=up:active}, 1 up:standby
    osd: 15 osds: 15 up, 15 in; 57 remapped pgs

  data:
    pools:   2 pools, 384 pgs
    objects: 3264k objects, 12342 GB
    usage:   25063 GB used, 18608 GB / 43671 GB avail
    pgs:     1235327/6685016 objects misplaced (18.479%)
             327 active+clean
             49  active+remapped+backfill_wait
             5   active+remapped+backfill_wait+backfill_toofull
             3   active+remapped+backfilling

  io:
    recovery: 32430 kB/s, 8 objects/s

[root@osd1 ~]# ceph osd status
+----+-------------------------+-------+-------+--------+---------+--------+---------+-----------+
| id |          host       |  used | avail | wr ops | wr data | rd ops | rd data | state |
+----+-------------------------+-------+-------+--------+---------+--------+---------+-----------+
| 0  | osd1.example.com | 1789G | 1004G | 0 |    0 | 0 |    0 | exists,up |
| 1  | osd2.example.com | 2228G |  566G | 0 |    0 | 0 |    0 | exists,up |
| 2  | osd3.example.com | 2270G |  524G | 0 |    0 | 0 |    0 | exists,up |
| 3  | osd4.example.com | 2164G |  629G | 0 |    0 | 0 |    0 | exists,up |
| 4  | osd1.example.com | 2069G |  725G | 0 |    0 | 0 |    0 | exists,up |
| 5  | osd1.example.com | 1454G | 1339G | 0 |    0 | 0 |    0 | exists,up |
| 6  | osd2.example.com |  485G |  446G | 0 |    0 | 0 |    0 | exists,up |
| 7  | osd3.example.com |  437G |  494G | 0 |    0 | 0 |    0 | exists,up |
| 8  | osd3.example.com |  627G |  303G | 0 |    0 | 0 |    0 | exists,up |
| 9  | osd4.example.com |  771G |  159G | 0 |    0 | 0 |    0 | exists,up |
| 10 | osd4.example.com |  339G |  591G | 0 |    0 | 0 |    0 | exists,up |
| 11 | osd5.example.com | 2464G | 3124G | 0 |    0 | 0 |    0 | exists,up |
| 12 | osd5.example.com | 2174G | 3414G | 0 |    0 | 0 |    0 | exists,up |
| 13 | osd5.example.com | 3418G | 2170G | 0 |    0 | 0 |    0 | exists,up |
| 14 | osd5.example.com | 2367G | 3112G | 0 |    0 | 0 |    0 | exists,up |
+----+-------------------------+-------+-------+--------+---------+--------+---------+-----------+
[root@osd1 ~]#
The recovery process is looking good.  I’ll check back again tomorrow to make sure it’s finished and all of our alerts have cleared.
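A rough idea of how much longer the recovery might take can be worked out from the misplaced-object counts and the recovery rate in `ceph -s`. The Python sketch below is a back-of-the-envelope estimate only: it assumes the rate stays constant and treats the misplaced count and the objects/s figure as comparable, neither of which is guaranteed. The "24 hours" between the two snapshots is the approximate interval mentioned above, and the function names are hypothetical.

```python
def misplaced_percent(misplaced, total):
    """Misplaced objects as a percentage of all object copies."""
    return 100.0 * misplaced / total


def observed_rate(misplaced_before, misplaced_after, elapsed_hours):
    """Average objects per second moved between two snapshots."""
    return (misplaced_before - misplaced_after) / (elapsed_hours * 3600)


def eta_hours(misplaced, objects_per_second):
    """Hours left if the remaining objects move at a constant rate."""
    return misplaced / objects_per_second / 3600


# Counts taken from the two `ceph -s` outputs, roughly 24 hours apart.
rate = observed_rate(2830303, 1235611, elapsed_hours=24)
print(f"misplaced now: {misplaced_percent(1235611, 6685016):.3f}%")
print(f"average rate so far: {rate:.1f} objects/s")
print(f"ETA at average rate: {eta_hours(1235611, rate):.1f} h")
print(f"ETA at current 9 objects/s: {eta_hours(1235611, 9):.1f} h")
```

On these numbers the remaining backfill would take somewhere between roughly 19 hours (at the average rate seen so far) and 38 hours (at the 9 objects/s shown most recently), so checking again the next day is a reasonable interval.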
Once that is done, I’ll make one more adjustment to pg_num (and pgp_num) to bring it up to the right level for our number of OSDs.
