I am seeing a similar issue when trying to upgrade from 4.3z1 to 5.3 (latest). After running the cephadm-preflight playbook, the Ceph services (mon, mgr, OSDs) failed on all the nodes, but ceph.target was still running. As a result, ceph commands hang.
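For reference, this is roughly the shape of the preflight run and of the checks used to confirm the daemon state. The /usr/share/cephadm-ansible path, the inventory file name, and ceph_origin=rhcs are assumptions based on a stock cephadm-ansible install, so adjust them to your environment:

# Preflight from the admin node (stock cephadm-ansible layout assumed)
cd /usr/share/cephadm-ansible
ansible-playbook -i hosts cephadm-preflight.yml --extra-vars "ceph_origin=rhcs"

# List every ceph unit systemd knows about, then pull recent logs for the failed ones
systemctl list-units --all 'ceph*'
journalctl -u 'ceph-mon@*' -u 'ceph-mgr@*' --since "2 hours ago" --no-pager | tail -n 100

The output from the affected nodes follows.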
[root@ceph-msaini-taooh8-node2 ~]# systemctl | grep ceph
  ceph-crash                    loaded active running Ceph crash dump collector
● ceph-mgr                      loaded failed failed  Ceph Manager
● ceph-mon                      loaded failed failed  Ceph Monitor
  system-ceph\x2dcrash.slice    loaded active active  system-ceph\x2dcrash.slice
  system-ceph\x2dmds.slice      loaded active active  system-ceph\x2dmds.slice
  system-ceph\x2dmgr.slice      loaded active active  system-ceph\x2dmgr.slice
  system-ceph\x2dmon.slice      loaded active active  system-ceph\x2dmon.slice
  ceph.target                   loaded active active  ceph target allowing to start/stop all ceph*@.service instances at once

[root@ceph-msaini-taooh8-node5 ~]# systemctl status ceph.target
● ceph.target - ceph target allowing to start/stop all ceph*@.service instances at once
   Loaded: loaded (/etc/systemd/system/ceph.target; enabled; vendor preset: enabled)
   Active: active since Wed 2023-12-20 14:58:47 EST; 4h 5min ago
Dec 20 14:58:47 ceph-msaini-taooh8-node5 systemd[1]: Reached target ceph target allowing to start/stop all ceph*@.service instances at once.

======================== Upgrade logs ========================

[root@ceph-msaini-taooh8-node1-installer ~]# ceph --version
ceph version 14.2.22-128.el8cp (40a2bf9c4e79e39754d69a95cd51bd60991284be) nautilus (stable)

[root@ceph-msaini-taooh8-node1-installer ~]# ceph versions
{
    "mon": {
        "ceph version 14.2.22-128.el8cp (40a2bf9c4e79e39754d69a95cd51bd60991284be) nautilus (stable)": 3
    },
    "mgr": {
        "ceph version 14.2.22-128.el8cp (40a2bf9c4e79e39754d69a95cd51bd60991284be) nautilus (stable)": 3
    },
    "osd": {
        "ceph version 14.2.22-128.el8cp (40a2bf9c4e79e39754d69a95cd51bd60991284be) nautilus (stable)": 12
    },
    "mds": {
        "ceph version 14.2.22-128.el8cp (40a2bf9c4e79e39754d69a95cd51bd60991284be) nautilus (stable)": 3
    },
    "rgw": {
        "ceph version 14.2.22-128.el8cp (40a2bf9c4e79e39754d69a95cd51bd60991284be) nautilus (stable)": 4
    },
    "overall": {
        "ceph version 14.2.22-128.el8cp (40a2bf9c4e79e39754d69a95cd51bd60991284be) nautilus (stable)": 25
    }
}

[root@ceph-msaini-taooh8-node1-installer ~]# ceph -s
  cluster:
    id:     07cd16a8-f925-4d09-a041-6d725b939582
    health: HEALTH_WARN
            1 pool(s) have non-power-of-two pg_num
            1 pools have too few placement groups
            3 pools have too many placement groups
            mons are allowing insecure global_id reclaim

  services:
    mon: 3 daemons, quorum ceph-msaini-taooh8-node3,ceph-msaini-taooh8-node2,ceph-msaini-taooh8-node1-installer (age 45m)
    mgr: ceph-msaini-taooh8-node1-installer(active, since 43m), standbys: ceph-msaini-taooh8-node2, ceph-msaini-taooh8-node3
    mds: cephfs:1 {0=ceph-msaini-taooh8-node2=up:active} 2 up:standby
    osd: 12 osds: 12 up (since 38m), 12 in (since 57m)
    rgw: 4 daemons active (ceph-msaini-taooh8-node5.rgw0, ceph-msaini-taooh8-node5.rgw1, ceph-msaini-taooh8-node6.rgw0, ceph-msaini-taooh8-node6.rgw1)

  data:
    pools:   13 pools, 676 pgs
    objects: 382 objects, 456 MiB
    usage:   13 GiB used, 227 GiB / 240 GiB avail
    pgs:     676 active+clean

  io:
    client:   2.5 KiB/s rd, 2 op/s rd, 0 op/s wr

[root@ceph-msaini-taooh8-node1-installer ~]# podman ps
CONTAINER ID  IMAGE                                                             COMMAND               CREATED         STATUS         PORTS  NAMES
b4bc2bbf0671  registry.redhat.io/openshift4/ose-prometheus-node-exporter:v4.6  --path.procfs=/ro...  54 minutes ago  Up 54 minutes         node-exporter
288dbf3d1416  registry.redhat.io/rhceph/rhceph-4-rhel8:latest                                        49 minutes ago  Up 49 minutes         ceph-mon-ceph-msaini-taooh8-node1-installer
e02558859efb  registry.redhat.io/rhceph/rhceph-4-rhel8:latest                                        46 minutes ago  Up 46 minutes         ceph-mgr-ceph-msaini-taooh8-node1-installer
fdc68705313e  registry.redhat.io/rhceph/rhceph-4-rhel8:latest                                        30 minutes ago  Up 30 minutes         ceph-crash-ceph-msaini-taooh8-node1-installer

[root@ceph-msaini-taooh8-node1-installer ceph-ansible]# sudo ansible-playbook -i hosts infrastructure-playbooks/rolling_update.yml --extra-vars "health_osd_check_retries=50 health_osd_check_delay=30"

PLAY RECAP *********************************************************************
ceph-msaini-taooh8-node1-installer : ok=375  changed=59  unreachable=0  failed=0  skipped=633  rescued=0  ignored=0
ceph-msaini-taooh8-node2           : ok=370  changed=39  unreachable=0  failed=0  skipped=685  rescued=0  ignored=0
ceph-msaini-taooh8-node3           : ok=370  changed=39  unreachable=0  failed=0  skipped=690  rescued=0  ignored=0
ceph-msaini-taooh8-node4           : ok=252  changed=28  unreachable=0  failed=0  skipped=460  rescued=0  ignored=0
ceph-msaini-taooh8-node5           : ok=379  changed=38  unreachable=0  failed=0  skipped=625  rescued=0  ignored=0
ceph-msaini-taooh8-node6           : ok=368  changed=37  unreachable=0  failed=0  skipped=645  rescued=0  ignored=0
ceph-msaini-taooh8-node7           : ok=319  changed=38  unreachable=0  failed=0  skipped=495  rescued=0  ignored=0
localhost                          : ok=1    changed=1   unreachable=0  failed=0  skipped=1    rescued=0  ignored=0

[root@ceph-msaini-taooh8-node1-installer ceph-ansible]# ansible-playbook -vvvv infrastructure-playbooks/rolling_update.yml -i hosts
  stdout: |-
    {
        "mon": {
            "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 3
        },
        "mgr": {
            "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 3
        },
        "osd": {
            "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 12
        },
        "mds": {
            "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 3
        },
        "rgw": {
            "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 4
        },
        "rgw-nfs": {
            "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 1
        },
        "overall": {
            "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 26
        }
    }
  stdout_lines: <omitted>
META: ran handlers
META: ran handlers

PLAY RECAP *********************************************************************
ceph-msaini-taooh8-node1-installer : ok=372  changed=51  unreachable=0  failed=0  skipped=626  rescued=0  ignored=0
ceph-msaini-taooh8-node2           : ok=363  changed=27  unreachable=0  failed=0  skipped=676  rescued=0  ignored=0
ceph-msaini-taooh8-node3           : ok=364  changed=28  unreachable=0  failed=0  skipped=680  rescued=0  ignored=0
ceph-msaini-taooh8-node4           : ok=249  changed=21  unreachable=0  failed=0  skipped=453  rescued=0  ignored=0
ceph-msaini-taooh8-node5           : ok=375  changed=27  unreachable=0  failed=0  skipped=616  rescued=0  ignored=0
ceph-msaini-taooh8-node6           : ok=370  changed=27  unreachable=0  failed=0  skipped=629  rescued=0  ignored=0
ceph-msaini-taooh8-node7           : ok=317  changed=29  unreachable=0  failed=0  skipped=489  rescued=0  ignored=0
localhost                          : ok=1    changed=1   unreachable=0  failed=0  skipped=1    rescued=0  ignored=0

[root@ceph-msaini-taooh8-node1-installer ceph-ansible]# ceph --version
ceph version 14.2.22-128.el8cp (40a2bf9c4e79e39754d69a95cd51bd60991284be) nautilus (stable)

[root@ceph-msaini-taooh8-node1-installer ceph-ansible]# ceph versions
{
    "mon": {
        "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 3
    },
    "osd": {
        "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 12
    },
    "mds": {
        "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 3
    },
    "rgw": {
        "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 4
    },
    "rgw-nfs": {
        "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 1
    },
    "overall": {
        "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 26
    }
}

[root@ceph-msaini-taooh8-node1-installer ceph-ansible]# podman ps
CONTAINER ID  IMAGE                                                                                                           COMMAND               CREATED         STATUS         PORTS  NAMES
6ca1e2071341  registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.3-rhel-8-containers-candidate-88814-20231215195330                        16 minutes ago  Up 16 minutes         ceph-mon-ceph-msaini-taooh8-node1-installer
f518b6b7588d  registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.3-rhel-8-containers-candidate-88814-20231215195330                        13 minutes ago  Up 13 minutes         ceph-mgr-ceph-msaini-taooh8-node1-installer
74a1b25bee9e  registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.3-rhel-8-containers-candidate-88814-20231215195330                        3 minutes ago   Up 3 minutes          ceph-crash-ceph-msaini-taooh8-node1-installer
38e14828d9ae  registry.redhat.io/openshift4/ose-prometheus-node-exporter:v4.6                                                --path.procfs=/ro...  2 minutes ago   Up 2 minutes          node-exporter

[root@ceph-msaini-taooh8-node1-installer ceph-ansible]# systemctl | grep ceph
  ceph-crash                  loaded active running Ceph crash dump collector
  ceph-mgr                    loaded active running Ceph Manager
  ceph-mon                    loaded active running Ceph Monitor
  system-ceph\x2dcrash.slice  loaded active active  system-ceph\x2dcrash.slice
  system-ceph\x2dmgr.slice    loaded active active  system-ceph\x2dmgr.slice
  system-ceph\x2dmon.slice    loaded active active  system-ceph\x2dmon.slice
  ceph-mgr.target             loaded active active  ceph target allowing to start/stop all ceph-mgr@.service instances at once
  ceph-mon.target             loaded active active  ceph target allowing to start/stop all ceph-mon@.service instances at once
  ceph.target                 loaded active active  ceph target allowing to start/stop all ceph*@.service instances at once

[root@ceph-msaini-taooh8-node1-installer ceph-ansible]# ansible-playbook infrastructure-playbooks/cephadm-adopt.yml -i hosts

TASK [add ceph label for core component] ***************************************************************************
fatal: [ceph-msaini-taooh8-node2 -> ceph-msaini-taooh8-node1-installer]: FAILED! => changed=false
  cmd:
  - podman
  - run
  - --rm
  - --net=host
  - -v
  - /etc/ceph:/etc/ceph:z
  - -v
  - /var/lib/ceph:/var/lib/ceph:ro
  - -v
  - /var/run/ceph:/var/run/ceph:z
  - --entrypoint=ceph
  - registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.3-rhel-8-containers-candidate-88814-20231215195330
  - --cluster
  - ceph
  - orch
  - host
  - label
  - add
  - ceph-msaini-taooh8-node2
  - ceph
  delta: '0:00:01.795436'
  end: '2023-12-20 18:15:08.390207'
  msg: non-zero return code
  rc: 22
  start: '2023-12-20 18:15:06.594771'
  stderr: 'Error EINVAL: host ceph-msaini-taooh8-node2 does not exist'
  stderr_lines: <omitted>
  stdout: ''
  stdout_lines: <omitted>
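The "Error EINVAL: host ceph-msaini-taooh8-node2 does not exist" means the orchestrator in the new cluster has no record of that host at the point the playbook tries to label it. A minimal diagnostic sketch, not a confirmed fix, run from a node where the adopted mgr is reachable; <node2-ip> is a placeholder, and "host add" only works once the cluster's SSH key is in place on the target node:

# Check which hosts the orchestrator currently knows about
cephadm shell -- ceph orch host ls

# If ceph-msaini-taooh8-node2 is missing, add it and apply the label manually,
# then re-run cephadm-adopt.yml (<node2-ip> stands in for the node's real address)
cephadm shell -- ceph orch host add ceph-msaini-taooh8-node2 <node2-ip>
cephadm shell -- ceph orch host label add ceph-msaini-taooh8-node2 ceph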
[root@ceph-msaini-taooh8-node1-installer ceph-ansible]# dnf install cephadm-ansible
Updating Subscription Management repositories.
Last metadata expiration check: 0:01:06 ago on Wed 20 Dec 2023 06:22:31 PM EST.
Dependencies resolved.
================================================================================
 Package                               Architecture  Version            Repository                              Size
================================================================================
Installing:
 cephadm-ansible                       noarch        1.17.0-1.el8cp     rhceph-5-tools-for-rhel-8-x86_64-rpms   32 k
Installing dependencies:
 ansible-collection-ansible-posix      noarch        1.2.0-1.el8cp.1    rhceph-5-tools-for-rhel-8-x86_64-rpms  131 k
 ansible-collection-community-general  noarch        4.0.0-1.1.el8cp.1  rhceph-5-tools-for-rhel-8-x86_64-rpms  1.5 M
 ansible-core                          x86_64        2.15.3-1.el8       rhel-8-for-x86_64-appstream-rpms       3.6 M
 mpdecimal                             x86_64        2.5.1-3.el8        rhel-8-for-x86_64-appstream-rpms        93 k
 python3.11                            x86_64        3.11.5-1.el8_9     rhel-8-for-x86_64-appstream-rpms        30 k
 python3.11-cffi                       x86_64        1.15.1-1.el8       rhel-8-for-x86_64-appstream-rpms       293 k
 python3.11-cryptography               x86_64        37.0.2-5.el8       rhel-8-for-x86_64-appstream-rpms       1.1 M
 python3.11-libs                       x86_64        3.11.5-1.el8_9     rhel-8-for-x86_64-appstream-rpms        10 M
 python3.11-pip-wheel                  noarch        22.3.1-4.el8       rhel-8-for-x86_64-appstream-rpms       1.4 M
 python3.11-ply                        noarch        3.11-1.el8         rhel-8-for-x86_64-appstream-rpms       135 k
 python3.11-pycparser                  noarch        2.20-1.el8         rhel-8-for-x86_64-appstream-rpms       147 k
 python3.11-pyyaml                     x86_64        6.0-1.el8          rhel-8-for-x86_64-appstream-rpms       214 k
 python3.11-setuptools-wheel           noarch        65.5.1-2.el8       rhel-8-for-x86_64-appstream-rpms       720 k
 sshpass                               x86_64        1.09-4.el8ap       labrepo                                 30 k

Transaction Summary
================================================================================
Install  15 Packages

Total download size: 20 M
Installed size: 78 M
Is this ok [y/N]: y

[root@ceph-msaini-taooh8-node1-installer cephadm-ansible]# systemctl | grep ceph
  ceph-crash                  loaded active running Ceph crash dump collector
● ceph-mgr                    loaded failed failed  Ceph Manager
● ceph-mon                    loaded failed failed  Ceph Monitor
  system-ceph\x2dcrash.slice  loaded active active  system-ceph\x2dcrash.slice
  system-ceph\x2dmgr.slice    loaded active active  system-ceph\x2dmgr.slice
  system-ceph\x2dmon.slice    loaded active active  system-ceph\x2dmon.slice
  ceph.target                 loaded active active  ceph target allowing to start/stop all ceph*@.service instances at once
[root@ceph-msaini-taooh8-node1-installer cephadm-ansible]# systemctl status ceph.target
● ceph.target - ceph target allowing to start/stop all ceph*@.service instances at once
   Loaded: loaded (/etc/systemd/system/ceph.target; enabled; vendor preset: enabled)
   Active: active since Wed 2023-12-20 14:58:46 EST; 3h 54min ago
Dec 20 14:58:46 ceph-msaini-taooh8-node1-installer systemd[1]: Reached target ceph target allowing to start/stop all ceph*@.service instances at once.

[root@ceph-msaini-taooh8-node1-installer cephadm-ansible]# ceph -s

[root@ceph-msaini-taooh8-node1-installer cephadm-ansible]# systemctl -l status ceph-mgr
● ceph-mgr - Ceph Manager
   Loaded: loaded (/etc/systemd/system/ceph-mgr@.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2023-12-20 18:34:26 EST; 37min ago
 Main PID: 110855 (code=exited, status=143)

Dec 20 18:34:24 ceph-msaini-taooh8-node1-installer ceph-mgr-ceph-msaini-taooh8-node1-installer[110855]: 2023-12-20T18:34:24.431-0500 7f0a8fddb700  0 log_channel(cluster) log [DBG] : pgmap v677: 701 pgs: 701 active+clean; 456 MiB data, 2.3 G>
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer systemd[1]: Stopping Ceph Manager...
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer ceph-mgr-ceph-msaini-taooh8-node1-installer[110855]: teardown: managing teardown after SIGTERM
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer ceph-mgr-ceph-msaini-taooh8-node1-installer[110855]: teardown: Sending SIGTERM to PID 54
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer ceph-mgr-ceph-msaini-taooh8-node1-installer[110855]: teardown: Waiting PID 54 to terminate .
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer ceph-mgr-ceph-msaini-taooh8-node1-installer[110855]: teardown: Process 54 is terminated
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer sh[142533]: f518b6b7588de6ed1793a6f58a4fa9ca41df91f58a7543dd90d97508e6f612e5
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer systemd[1]: ceph-mgr: Main process exited, code=exited, status=143/n/a
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer systemd[1]: ceph-mgr: Failed with result 'exit-code'.
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer systemd[1]: Stopped Ceph Manager.

[root@ceph-msaini-taooh8-node1-installer cephadm-ansible]# systemctl -l status ceph-mon
● ceph-mon - Ceph Monitor
   Loaded: loaded (/etc/systemd/system/ceph-mon@.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2023-12-20 18:34:26 EST; 38min ago
 Main PID: 106377 (code=exited, status=143)

Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer ceph-mon-ceph-msaini-taooh8-node1-installer[106377]: debug 2023-12-20T18:34:26.595-0500 7f4c0a23b880  1 rocksdb: close waiting for compaction thread to stop
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer ceph-mon-ceph-msaini-taooh8-node1-installer[106377]: debug 2023-12-20T18:34:26.595-0500 7f4c0a23b880  1 rocksdb: close compaction thread to stopped
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer ceph-mon-ceph-msaini-taooh8-node1-installer[106377]: debug 2023-12-20T18:34:26.595-0500 7f4c0a23b880  4 rocksdb: [db_impl/db_impl.cc:397] Shutdown: canceling all background work
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer ceph-mon-ceph-msaini-taooh8-node1-installer[106377]: debug 2023-12-20T18:34:26.599-0500 7f4c0a23b880  4 rocksdb: [db_impl/db_impl.cc:573] Shutdown complete
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer ceph-mon-ceph-msaini-taooh8-node1-installer[106377]: teardown: Waiting PID 86 to terminate .
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer ceph-mon-ceph-msaini-taooh8-node1-installer[106377]: teardown: Process 86 is terminated
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer sh[142608]: 6ca1e2071341cf2fa0140bced76763b36ec0f17f55ddba50794aa25a1245099e
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer systemd[1]: ceph-mon: Main process exited, code=exited, status=143/n/a
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer systemd[1]: ceph-mon: Failed with result 'exit-code'.
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer systemd[1]: Stopped Ceph Monitor.
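status=143 is just SIGTERM, i.e. the mon and mgr were stopped (presumably by the adopt flow) rather than crashing on their own. A sketch of how to tell whether a node's daemons are still legacy (ceph-ansible style) or already converted, and then start whichever form is present; the unit and daemon names below are assumptions based on the usual ceph-ansible / cephadm naming, so adjust to what cephadm ls actually reports:

# Show every daemon cephadm can see on this node and whether its
# "style" is "legacy" (ceph-ansible) or "cephadm:v1" (already adopted)
cephadm ls | grep -E '"name"|"style"'

# If still legacy, the templated ceph-ansible units can be started again,
# e.g. on the installer node (instance name is usually the short hostname)
systemctl start ceph-mon@ceph-msaini-taooh8-node1-installer ceph-mgr@ceph-msaini-taooh8-node1-installer

# If already adopted, the cephadm-managed units are named per fsid instead, e.g.
# systemctl start ceph-<fsid>@mon.ceph-msaini-taooh8-node1-installer.service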
[root@ceph-msaini-taooh8-node1-installer cephadm-ansible]# rpm -qa | grep ceph
ceph-grafana-dashboards-14.2.22-128.el8cp.noarch
libcephfs2-16.2.10-208.el8cp.x86_64
cephadm-ansible-1.17.0-1.el8cp.noarch
python3-ceph-common-16.2.10-208.el8cp.x86_64
ceph-base-16.2.10-208.el8cp.x86_64
cephadm-16.2.10-220.el9cp.noarch
python3-ceph-argparse-16.2.10-208.el8cp.x86_64
python3-cephfs-16.2.10-208.el8cp.x86_64
ceph-common-16.2.10-208.el8cp.x86_64
ceph-selinux-16.2.10-208.el8cp.x86_64