Software versions:

Software  Version
ceph      octopus (15.2.16)
centos    7.9

Ceph release information on the official site:
docs.ceph.com/en/latest/r…

Machine list:

Hostname  IP            Block devices
master0   12.70.10.161  /dev/vdb and /dev/vdc
master1   12.70.10.162  /dev/vdb and /dev/vdc
master2   12.70.10.163  /dev/vdb and /dev/vdc

Preparation

Ansible is used only for the preparation steps. Once the actual deployment starts it is no longer needed, because cephadm is itself a cluster management tool.
The Ansible playbooks below can be run from any machine outside the cluster, as long as Ansible is installed on it; if anything is unclear, consult the Ansible usage documentation.
The Ansible hosts file is configured as follows (these machines were previously used for a k8s cluster, so the host names were left unchanged):

master0 ansible_host=12.70.10.161 ansible_port=22 ansible_user=root ansible_password=*** host_name=master0
master1 ansible_host=12.70.10.162 ansible_port=22 ansible_user=root ansible_password=*** host_name=master1
master2 ansible_host=12.70.10.163 ansible_port=22 ansible_user=root ansible_password=*** host_name=master2

[all]
master0
master1
master2

Upgrade the system kernel

Upgrade the system kernel to 5.17.
The Ansible playbook (1.kenel.yaml) is as follows:

- name: update kernel
  hosts: all
  gather_facts: True
  vars:
  tasks:
    - name: create workspace
      file:
        path: /tmp/workspace/
        state: directory
      tags: workspace
    - name: yum install elrepo
      copy:
        src: templates/rpm.sh.j2
        dest: /tmp/workspace/rpm.sh
        mode: "0755"  # quoted so that Ansible reads the value as octal
      tags:
        - copy-rmp
    - name: sh rpm.sh
      shell: sh /tmp/workspace/rpm.sh
      tags:
        - sh-rmp
    - name: list available kernel packages
      yum:
        list: available
        disablerepo: "*"
        enablerepo: "elrepo-kernel"
    - name: install kernel
      yum:
        name:
          - kernel-ml
          #- kernel-lt.x86_64
          #- kernel-lt-devel.x86_64
        enablerepo: elrepo-kernel
    - name: show the kernel boot order
      shell: awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg
    - name: set the default boot entry in grub
      lineinfile:
        path: /etc/default/grub
        regexp: "^GRUB_DEFAULT"
        line: "GRUB_DEFAULT=0"  # entry 0 is normally the newly installed kernel
        backrefs: yes
        state: present
      tags:
        - grub
    - name: regenerate the grub configuration
      shell: grub2-mkconfig -o /boot/grub2/grub.cfg
    - name: Reboot the machine
      reboot:
        reboot_timeout: 300
    - name: uname -r
      shell: uname -r
      register: version
    - name: debug
      debug:
        msg: "{{ version }}"

The rpm setup script is at templates/rpm.sh.j2, in the same directory as the playbook. Its contents:

#!/bin/bash
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
#rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
rpm -Uvh https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm
echo 0

Run ansible-playbook -v 1.kenel.yaml to upgrade the kernel on every machine in the cluster.
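
After the reboot it is worth confirming that each node really booted the new kernel. The helper below is a minimal sketch and not part of the original playbook: the function name kernel_at_least and the 5.17 threshold are illustrative assumptions, and the comparison relies on GNU sort -V being available.

```bash
# Hypothetical helper (not from the original playbook): succeeds when a
# `uname -r` style version string is at least the given minimum version.
kernel_at_least() (
    current="${1%%-*}"   # drop the release suffix: 5.17.1-1.el7.elrepo.x86_64 -> 5.17.1
    minimum="$2"
    # `sort -V` orders version strings; if the minimum sorts first
    # (or both are equal), the kernel is new enough.
    lowest="$(printf '%s\n%s\n' "$minimum" "$current" | sort -V | head -n 1)"
    [ "$lowest" = "$minimum" ]
)

# Example: check the kernel of the machine this runs on.
if kernel_at_least "$(uname -r)" 5.17; then
    echo "kernel is 5.17 or newer"
else
    echo "kernel is older than 5.17"
fi
```

The same check can be fed from all nodes at once, for example with the output of ansible all -m shell -a 'uname -r'.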

Install required tools and disable conflicting settings

The playbook 2.config.yaml is as follows:

- name: set config
  hosts: all
  gather_facts: True
  vars:
  handlers:
    - name: update_yum
      shell: |
        sed -i -e '/mirrors.cloud.aliyuncs.com/d' -e '/mirrors.aliyuncs.com/d' /etc/yum.repos.d/CentOS-Base.repo
        yum clean all
        yum makecache -y
    - name: noswap_service  # disable swap
      systemd:
        name: noswap
        state: started      # desired service state: stopped, started, reloaded or restarted
        enabled: yes        # start at boot: yes to enable, no to disable
        daemon_reload: yes  # reload systemd so that unit file changes take effect
  tasks:
    - name: back up repo
      shell: |
        mv /etc/yum.repos.d/epel.repo /etc/yum.repos.d/epel.repo.backup
        mv /etc/yum.repos.d/epel-testing.repo /etc/yum.repos.d/epel-testing.repo.backup
        mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup
    - name: delete /etc/systemd/system/noswap.service  # disable swap
      file:
        path: /etc/systemd/system/noswap.service
        state: absent
    - name: download repo  # download the yum repo files
      get_url:
        url: "{{ item.url }}"
        dest: "{{ item.dest }}"
        force: yes
      with_items:
        - {url: "https://mirrors.aliyun.com/repo/Centos-7.repo", dest: "/etc/yum.repos.d/CentOS-Base.repo"}
        - {url: "http://mirrors.aliyun.com/repo/epel-7.repo", dest: "/etc/yum.repos.d/epel.repo"}
        - {url: "http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo", dest: "/etc/yum.repos.d/docker-ce.repo"}
      notify: update_yum  # refresh yum (see the handlers above)
    - name: install net-tools  # install the required tools
      yum:
        name:
          - net-tools
          - vim
          - rsync
          - chrony
        state: present
        update_cache: true
      tags: tools
    - name: chrony_service
      systemd:
        name: chronyd
        state: started      # desired service state: stopped, started, reloaded or restarted
        enabled: yes        # start at boot: yes to enable, no to disable
        daemon_reload: yes  # reload systemd so that unit file changes take effect
      tags: chronyd
    - name: config /etc/hosts  # key step: generate the hosts file
      template:
        src: hosts.j2
        dest: /etc/hosts
        mode: 0644
        backup: false
      tags:
        - hosts
    - name: set hostname  # set the host name
      hostname:
        name: "{{ host_name }}"
    - name: set timezone to Asia-Shanghai  # set the time zone
      shell: |
        /usr/bin/timedatectl set-timezone Asia/Shanghai
        chronyc -a makestep
      tags:
        - set_timezone
    - name: stop firewalld service  # disable the firewall
      service:
        name: firewalld.service
        state: stopped
        enabled: no
      register: firewalld_service_result
      failed_when: "firewalld_service_result is failed and 'Could not find the requested service' not in firewalld_service_result.msg"
      tags: stop-firewall
    - name: Write noswap systemd service config file
      template:
        src: noswap.service.j2
        dest: /etc/systemd/system/noswap.service
        owner: root
        group: root
        mode: 0644
      notify: noswap_service
    - name: Disabling SELinux state  # disable SELinux
      selinux:
        state: disabled
    - name: Reboot the machine  # reboot the machine
      reboot:
        reboot_timeout: 300

Contents of templates/hosts.j2:

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

{% for h in play_hosts %}
{{hostvars[h]['ansible_default_ipv4']['address']}} {{hostvars[h]['inventory_hostname']}}
{% endfor -%}

Run ansible-playbook -v 2.config.yaml to apply this configuration.
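
The loop in hosts.j2 emits one "address hostname" line for each host in the play. The sketch below reproduces that rendering in plain shell for the three machines of this article, which can help when checking /etc/hosts by hand. render_hosts is a hypothetical helper and the name=address argument format is an assumption made for the example; the real template takes the address from the gathered fact ansible_default_ipv4, which may differ from ansible_host on multi-homed machines.

```bash
# Hypothetical helper: print /etc/hosts entries for "name=address" arguments,
# mirroring the "address inventory_hostname" line produced by hosts.j2.
render_hosts() {
    for pair in "$@"; do
        name="${pair%%=*}"      # text before the first '='
        address="${pair#*=}"    # text after the first '='
        printf '%s %s\n' "$address" "$name"
    done
}

render_hosts master0=12.70.10.161 master1=12.70.10.162 master2=12.70.10.163
```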

Install docker

According to the official documentation, either docker or podman can be used; this article uses docker.
The playbook 3.docker.yaml is as follows:

- name: install docker
  hosts: all
  gather_facts: True
  vars:
    DOCKERHUB_URL: registry.aliyuncs.com/google_containers
  handlers:
  tasks:
    - name: mkdir -p /etc/docker/
      file:
        path: /etc/docker/
        state: directory
    - name: install docker
      yum: name=docker-ce
    - name: change config of docker
      shell: |
        cat > /etc/docker/daemon.json <<EOF
        {"exec-opts": ["native.cgroupdriver=systemd"],
        "registry-mirrors": ["https://registry.aliyuncs.com","https://registry.cn-beijing.aliyuncs.com"]
        }
        EOF
      tags:
        - config
    - name: add systemctl
      systemd:
        name: docker.service
        state: started      # desired service state: stopped, started, reloaded or restarted
        enabled: yes        # start at boot: yes to enable, no to disable
        daemon_reload: yes  # reload systemd so that unit file changes take effect
      tags:
        - docker-daemon
    - name: docker login  # an Alibaba Cloud container registry account, used to mirror images that cannot be pulled directly (https://cr.console.aliyun.com/cn-beijing/instance/credentials)
      shell: docker login --username cyxinda@163.com --password *** registry.cn-beijing.aliyuncs.com
      tags: login

Run ansible-playbook -v 3.docker.yaml to install docker.

Create the Ceph cluster

Install cephadm

Ansible is used once more to install cephadm on every machine in the cluster.
The playbook cephadm.yaml is as follows:

- name: download ceph
  hosts: 127.0.0.1
  connection: local
  gather_facts: yes
  tasks:
    - name: download cephadm
      get_url:
        url: https://github.com/ceph/ceph/raw/octopus/src/cephadm/cephadm
        dest: /tmp/ceph/cephadm
        force: yes
        mode: "0755"  # quoted so that Ansible reads the value as octal
        timeout: 600
- name: download cephadm and install
  hosts: all
  gather_facts: True
  tasks:
    - name: mkdir workdir
      file:
        path: /tmp/ceph
        state: directory
    - name: distribute the cephadm installer
      copy:
        src: "{{ item.src }}"
        dest: "{{ item.dest }}"
        owner: root
        group: root
        mode: "{{ item.mode }}"
      with_items:
        - {src: "/tmp/ceph/cephadm", dest: "/tmp/ceph/cephadm", mode: "755"}
      tags: cp-cephadm
    - name: install python3
      yum:
        name:
          - python3
          - chrony
        state: present
    - name: add systemd
      systemd:
        name: chronyd
        state: started      # desired service state: stopped, started, reloaded or restarted
        enabled: yes        # start at boot: yes to enable, no to disable
        daemon_reload: yes  # reload systemd so that unit file changes take effect
    - name: add ceph release
      shell: /tmp/ceph/cephadm add-repo --release octopus
    - name: install cephadm
      shell: /tmp/ceph/cephadm install
    - name: which cephadm
      shell: which cephadm
      register: which_cephadm
      tags: which-ceph
    - name: show
      debug: var=which_cephadm verbosity=0  # printing only which_cephadm.stdout would be easier to read
      tags: show-result

Run ansible-playbook -v cephadm.yaml to install cephadm.

Bootstrap the cluster

Use master0 as the bootstrap host and run the following commands on it:

[root@master0 ~]# mkdir -p /etc/ceph
[root@master0 ~]# cephadm bootstrap --mon-ip 12.70.10.161
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
podman|docker (/usr/bin/docker) is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: e3386564-bb02-11ec-af56-525400299ff7
Verifying IP 172.70.10.161 port 3300 ...
Verifying IP 172.70.10.161 port 6789 ...
Mon IP 172.70.10.161 is in CIDR network 172.70.10.0/24
Pulling container image quay.io/ceph/ceph:v15...
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
Assimilating anything we can from ceph.conf...
Generating new minimal ceph.conf...
Restarting the monitor...
Setting mon public_network...
Creating mgr...
Verifying port 9283 ...
Wrote keyring to /etc/ceph/ceph.client.admin.keyring
Wrote config to /etc/ceph/ceph.conf
Waiting for mgr to start...
Waiting for mgr...
mgr not available, waiting (1/10)...
mgr not available, waiting (2/10)...
mgr not available, waiting (3/10)...
mgr is available
Enabling cephadm module...
Waiting for the mgr to restart...
Waiting for Mgr epoch 5...
Mgr epoch 5 is available
Setting orchestrator backend to cephadm...
Generating ssh key...
Wrote public SSH key to to /etc/ceph/ceph.pub
Adding key to root@localhost's authorized_keys...
Adding host master0...
Deploying mon service with default placement...
Deploying mgr service with default placement...
Deploying crash service with default placement...
Enabling mgr prometheus module...
Deploying prometheus service with default placement...
Deploying grafana service with default placement...
Deploying node-exporter service with default placement...
Deploying alertmanager service with default placement...
Enabling the dashboard module...
Waiting for the mgr to restart...
Waiting for Mgr epoch 13...
Mgr epoch 13 is available
Generating a dashboard self-signed certificate...
Creating initial admin user...
Fetching dashboard port number...
Ceph Dashboard is now available at:

         URL: https://master0:8443/
        User: admin
    Password: vym1bdeajd

You can access the Ceph CLI with:

    sudo /usr/sbin/cephadm shell --fsid e3386564-bb02-11ec-af56-525400299ff7 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring

Please consider enabling telemetry to help improve Ceph:

    ceph telemetry on

For more information see:

    https://docs.ceph.com/docs/master/mgr/telemetry/

Bootstrap complete.

While the command runs, it prints the following:

Ceph Dashboard is now available at:
         URL: https://master0:8443/
        User: admin
    Password: vymdeajd

Following this prompt, the dashboard can be opened in a browser; after logging in, the management page is shown.

According to the installation documentation, this command will:
Create a monitor and a manager daemon for the new cluster on the local host.
Generate a new SSH key for the Ceph cluster and add it to the root user's /root/.ssh/authorized_keys file.
Write a minimal configuration file needed to communicate with the new cluster to /etc/ceph/ceph.conf.
Write a copy of the client.admin administrative (privileged!) secret key to /etc/ceph/ceph.client.admin.keyring.
Write a copy of the public key to /etc/ceph/ceph.pub.
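
The dashboard URL and initial password are printed only during bootstrap, so it can be useful to save the bootstrap output and pull the credentials out of it later. A minimal sketch, assuming the output was saved to a file: the function name dashboard_credentials and the log path in the usage note are illustrative and not part of cephadm.

```bash
# Hypothetical helper: read saved `cephadm bootstrap` output on stdin and
# print the dashboard URL, user and password as key=value lines.
dashboard_credentials() {
    awk '
        $1 == "URL:"      { print "url=" $2 }
        $1 == "User:"     { print "user=" $2 }
        $1 == "Password:" { print "password=" $2 }
    '
}

# Example with the prompt shown above.
dashboard_credentials <<'EOF'
Ceph Dashboard is now available at:
         URL: https://master0:8443/
        User: admin
    Password: vymdeajd
EOF
```

One way to capture the output in the first place is cephadm bootstrap --mon-ip 12.70.10.161 2>&1 | tee /root/bootstrap.log, followed by dashboard_credentials < /root/bootstrap.log.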

Enable the Ceph CLI (required)

Still on the bootstrap host, run the following command to open a ceph shell client.
Note: all ceph commands from here on must be run inside the ceph shell environment.

[root@master0 ~]# cephadm shell
Inferring fsid e3386564-bb02-11ec-af56-525400299ff7
Inferring config /var/lib/ceph/e3386564-bb02-11ec-af56-525400299ff7/mon.master0/config
Using recent ceph image quay.io/ceph/ceph@sha256:1b0ceef23cbd6a1af6ba0cbde344ebe6bde4ae183f545c1ded9c7c684239947f

[ceph: root@master0 /]# ceph -v
ceph version 15.2.16 (d46a73d6d0a67a79558054a3a5a72cb561724974) octopus (stable)

[ceph: root@master0 /]# ceph -s
  cluster:
    id:     e3386564-bb02-11ec-af56-525400299ff7
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3

  services:
    mon: 1 daemons, quorum master0 (age 48m)
    mgr: master0.ojikws(active, since 47m)
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:
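
HEALTH_WARN is expected at this point because the cluster has no OSDs yet. When scripting the later steps it helps to read the health value programmatically. The sketch below extracts it from ceph -s text output; cluster_health is a hypothetical helper name, and parsing the human-readable output is a simplification chosen for the example (the ceph health command prints the same value directly).

```bash
# Hypothetical helper: print the health value (for example HEALTH_OK)
# found in `ceph -s` output read from stdin.
cluster_health() {
    awk '$1 == "health:" { print $2; exit }'
}

# Example with the status shown above; inside the ceph shell this would be
# `ceph -s | cluster_health`.
cluster_health <<'EOF'
  cluster:
    id:     e3386564-bb02-11ec-af56-525400299ff7
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3
EOF
```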

Add hosts to the cluster

Install the cluster's public SSH key in the root user's authorized_keys file on each new host:

[root@master0 ~]# ssh-copy-id -f -i /etc/ceph/ceph.pub root@master1
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/etc/ceph/ceph.pub"
The authenticity of host 'master1 (12.70.10.162)' can't be established.
ECDSA key fingerprint is SHA256:J40vT3JXLYRku40nj9oOq1XQMbnkTXZ2Qc5IDFAy4xc.
ECDSA key fingerprint is MD5:8d:ef:46:df:ce:06:7d:86:05:e9:04:ad:68:12:40:8c.
Are you sure you want to continue connecting (yes/no)? yes
root@master1's password:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh 'root@master1'"
and check to make sure that only the key(s) you wanted were added.

[root@master0 ~]# ssh-copy-id -f -i /etc/ceph/ceph.pub root@master2
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/etc/ceph/ceph.pub"
The authenticity of host 'master2 (12.70.10.163)' can't be established.
ECDSA key fingerprint is SHA256:J40vT3JXLYRku40nj9oOq1XQMbnkTXZ2Qc5IDFAy4xc.
ECDSA key fingerprint is MD5:8d:ef:46:df:ce:06:7d:86:05:e9:04:ad:68:12:40:8c.
Are you sure you want to continue connecting (yes/no)? yes
root@master2's password:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh 'root@master2'"
and check to make sure that only the key(s) you wanted were added.

This step could also have been done with Ansible, but with only two machines it was not worth writing a playbook.
Add the machines to the cluster. The transcript below uses hosts named ceph1 to ceph6; for the machines in this article the same command takes the form ceph orch host add master1 12.70.10.162.

[ceph: root@ceph1 /]# ceph orch host add ceph2 12.70.10.162
Added host 'ceph2'
[ceph: root@ceph1 /]# ceph orch host add ceph3 12.70.10.163
Added host 'ceph3'
[ceph: root@ceph1 /]# ceph orch host add ceph4 12.70.10.164
Added host 'ceph4'
[ceph: root@ceph1 /]# ceph orch host add ceph5 12.70.10.165
Added host 'ceph5'
[ceph: root@ceph1 /]# ceph orch host add ceph6 12.70.10.166
Added host 'ceph6'

Add more monitors

Configure the monitor subnet:

[ceph: root@master0 /]# ceph config set mon public_network 12.70.10.0/24
## to run three monitors, adjust the monitor count:
[ceph: root@master0 /]# ceph orch apply mon 3
Scheduled mon update...
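
Three monitors are a common choice because monitors need a strict majority to form a quorum, so three can keep working with one of them down. The arithmetic can be sketched as follows; the function names are illustrative and not Ceph commands.

```bash
# Smallest number of monitors that forms a strict majority of $1 monitors.
mon_quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of monitors that can fail while the rest still hold a majority.
mon_failures_tolerated() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for count in 1 3 5; do
    echo "mons=$count quorum=$(mon_quorum_size "$count") tolerated_failures=$(mon_failures_tolerated "$count")"
done
```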

According to the official documentation, cephadm only deploys new monitor daemons on hosts that have an IP address in the configured subnet.
To deploy monitors on a specific set of hosts, be sure to include the first (bootstrap) host in the list.

[ceph: root@master0 /]# ceph orch apply mon master0,master1,master2
Scheduled mon update...
## add labels
[ceph: root@master0 /]# ceph orch host label add master0 mon
Added label mon to host master0
[ceph: root@master0 /]# ceph orch host label add master1 mon
Added label mon to host master1
[ceph: root@master0 /]# ceph orch host label add master2 mon
Added label mon to host master2

[ceph: root@master0 /]# ceph orch host ls
HOST     ADDR     LABELS  STATUS
master0  master0  mon
master1  master1  mon
master2  master2  mon

[ceph: root@master0 /]# ceph -s
  cluster:
    id:     e3386564-bb02-11ec-af56-525400299ff7
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3

  services:
    mon: 3 daemons, quorum master0,master1,master2 (age 88s)
    mgr: master0.ojikws(active, since 73m), standbys: master1.uxevld
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:

Add OSDs

The three machines in the cluster have six disks attached in total:

[ceph: root@master0 /]# ceph orch device ls
Hostname  Path      Type  Serial  Size   Health   Ident  Fault  Available
master0   /dev/vdb  hdd           536G   Unknown  N/A    N/A    Yes
master0   /dev/vdc  hdd           536G   Unknown  N/A    N/A    Yes
master1   /dev/vdb  hdd           536G   Unknown  N/A    N/A    Yes
master1   /dev/vdc  hdd           536G   Unknown  N/A    N/A    Yes
master2   /dev/vdb  hdd           536G   Unknown  N/A    N/A    Yes
master2   /dev/vdc  hdd           536G   Unknown  N/A    N/A    Yes

The block devices (/dev/vdb and /dev/vdc) can also be listed on each machine:

[ceph: root@master0 /]# lsblk
NAME            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sr0              11:0    1  1024M  0 rom
vda             252:0    0   500G  0 disk
|-vda1          252:1    0     1G  0 part /rootfs/boot
`-vda2          252:2    0   499G  0 part
  |-centos-root 253:0    0    50G  0 lvm  /rootfs
  |-centos-swap 253:1    0   7.9G  0 lvm
  `-centos-home 253:2    0 441.1G  0 lvm  /rootfs/home
vdb             252:16   0   500G  0 disk
vdc             252:32   0   500G  0 disk

According to the official documentation, a storage device is considered available if all of the following conditions are met:

The device must have no partitions.
The device must not have any LVM state.
The device must not be mounted.
The device must not contain a file system.
The device must not contain a Ceph BlueStore OSD.
The device must be larger than 5 GB.

Ceph refuses to provision an OSD on a device that is not available.
The following command adds every available disk to the Ceph cluster:

[ceph: root@master0 /]# ceph orch apply osd --all-available-devices
Scheduled osd.all-available-devices update...
[ceph: root@master0 /]# ceph -s
  cluster:
    id:     e3386564-bb02-11ec-af56-525400299ff7
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum master0,master1,master2 (age 12m)
    mgr: master0.ojikws(active, since 84m), standbys: master1.uxevld
    osd: 6 osds: 6 up (since 22s), 6 in (since 22s)

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   6.0 GiB used, 2.9 TiB / 2.9 TiB avail
    pgs:     1 active+clean

Disks can also be added one at a time:

[ceph: root@master0 /]# ceph orch daemon add osd master0:/dev/vdb
[ceph: root@master0 /]# ceph orch daemon add osd master0:/dev/vdc
[ceph: root@master0 /]# ceph orch daemon add osd master1:/dev/vdb
[ceph: root@master0 /]# ceph orch daemon add osd master1:/dev/vdc
[ceph: root@master0 /]# ceph orch daemon add osd master2:/dev/vdb
[ceph: root@master0 /]# ceph orch daemon add osd master2:/dev/vdc

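
With more disks, the per-device commands can be generated from the ceph orch device ls table instead of being typed by hand. A minimal sketch, assuming the table layout shown earlier, in which Available is the last column: osd_add_commands is a hypothetical helper, the row marked No in the sample is made up for illustration, and the commands are only printed for review, not executed.

```bash
# Hypothetical helper: read `ceph orch device ls` output on stdin and print
# one "ceph orch daemon add osd host:path" command per available device.
osd_add_commands() {
    # Skip the header row; keep rows whose last column is "Yes".
    awk 'NR > 1 && $NF == "Yes" { print "ceph orch daemon add osd " $1 ":" $2 }'
}

# Sample input; the second row is an invented unavailable device.
osd_add_commands <<'EOF'
Hostname  Path      Type  Serial  Size   Health   Ident  Fault  Available
master0   /dev/vdb  hdd           536G   Unknown  N/A    N/A    Yes
master0   /dev/vdc  hdd           536G   Unknown  N/A    N/A    No
EOF
```

Printing the commands first keeps the decision about which disks to hand over to Ceph with the operator.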
在前端看到:% X! B' k7 k1 O+ r3 L
9 m+ I5 f' K* p+ i
- U& s: K% ~4 ~+ k挂载块设备! ], _/ u& Y9 I3 K' I6 r' {
bash 体验AI代码助手 代码解读复制代码
' }/ e$ g# N8 a) ]
[root@ceph101 tmp]# ceph osd pool create test_rbd 32
pool 'test_rbd' created
[root@ceph101 tmp]# ceph df
--- RAW STORAGE ---
CLASS  SIZE     AVAIL    USED     RAW USED  %RAW USED
hdd    5.9 TiB  5.8 TiB  1.1 GiB    13 GiB       0.22
TOTAL  5.9 TiB  5.8 TiB  1.1 GiB    13 GiB       0.22

--- POOLS ---
POOL                       ID  PGS  STORED   OBJECTS  USED     %USED  MAX AVAIL
device_health_metrics       1    1      0 B        0      0 B      0    1.9 TiB
.rgw.root                  24   32   22 KiB       36  6.6 MiB      0    1.9 TiB
zone_01.rgw.log            31   32   26 KiB      965   55 MiB      0    1.9 TiB
zone_01.rgw.control        32   32      0 B        8      0 B      0    1.9 TiB
zone_01.rgw.meta           33    8  5.5 KiB       16  2.6 MiB      0    1.9 TiB
zone_01.rgw.buckets.index  34    8  672 KiB       55  2.0 MiB      0    1.9 TiB
zone_01.rgw.buckets.data   35   32  2.2 MiB       12  7.9 MiB      0    1.9 TiB
zone_01.rgw.otp            36   32      0 B        0      0 B      0    1.9 TiB
cp_pool                    42   32  1.4 MiB        2  4.4 MiB      0    1.9 TiB
test_rbd                   43   32      0 B        0      0 B      0    1.9 TiB

## Create an image
[root@ceph101 tmp]# rbd create test_rbd_image_1 --size 10240 -p test_rbd
[root@ceph101 tmp]# rbd -p test_rbd ls
test_rbd_image_1

## Show detailed information about an RBD image
[root@ceph101 tmp]# rbd --image test_rbd_image_1 info -p test_rbd
rbd image 'test_rbd_image_1':
        size 10 GiB in 2560 objects
        order 22 (4 MiB objects)
        snapshot_count: 0
        id: 2836b2f53ea86
        block_name_prefix: rbd_data.2836b2f53ea86
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten, journaling
        op_features:
        flags:
        create_timestamp: Tue May 10 10:38:16 2022
        access_timestamp: Tue May 10 10:38:16 2022
        modify_timestamp: Tue May 10 10:38:16 2022
        journal: 2836b2f53ea86
        mirroring state: disabled
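
The `size 10 GiB in 2560 objects` and `order 22 (4 MiB objects)` lines are related: the object size is 2^order bytes, and the object count is the image size divided by that. A small sketch of the arithmetic (the function name `rbd_object_count` is hypothetical, not an rbd command):

```shell
#!/bin/sh
# Number of backing objects for an RBD image.
#   $1: image size in MiB (the value passed to "rbd create --size")
#   $2: object order (object size is 2^order bytes)
rbd_object_count() {
  size_bytes=$(( $1 * 1024 * 1024 ))
  object_bytes=$(( 1 << $2 ))
  # Round up so that a partially filled last object is counted.
  echo $(( (size_bytes + object_bytes - 1) / object_bytes ))
}

rbd_object_count 10240 22   # 10240 MiB image with 4 MiB objects
```

For the image created above this prints 2560, which matches the `rbd info` output.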

[root@ceph101 tmp]# rbd pool stats -p test_rbd
Total Images: 1
Total Snapshots: 0
Provisioned Size: 10 GiB
[root@ceph101 tmp]# rbd showmapped
id  pool     namespace  image   snap  device
0   cp_pool             image2  -     /dev/rbd0

[root@ceph101 test_rbd]# umount -f /dev/rbd0
# or
[root@ceph101 test_rbd]# rbd unmap -f /dev/rbd0
## Then associate the pool with an application
[root@ceph101 ~]# ceph osd pool application enable test_rbd rbd
enabled application 'rbd' on pool 'test_rbd'
[root@ceph101 ~]# rbd map test_rbd/test_rbd_image_1
rbd: sysfs write failed
RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable test_rbd/test_rbd_image_1 journaling".
In some cases useful info is found in syslog - try "dmesg | tail".
rbd: map failed: (6) No such device or address
[root@ceph101 ~]# uname -r
5.17.6-1.el7.elrepo.x86_64
[root@ceph101 ~]# dmesg | tail
[    9.843030] random: crng init done
[    9.843034] random: 7 urandom warning(s) missed due to ratelimiting
[   10.669178] Bridge firewalling registered
[   22.239147] process '/bin/ip' started with executable stack
[ 8185.140070] Key type ceph registered
[ 8185.140395] libceph: loaded (mon/osd proto 15/24)
[ 8185.141923] rbd: loaded (major 251)
[ 8185.158536] libceph: mon3 (1)12.70.10.184:6789 session established
[ 8185.160696] libceph: client175843 fsid 7a367006-c449-11ec-9566-525400ce981f
[ 8185.288221] rbd: image test_rbd_image_1: image uses unsupported features: 0x40
## The message `[ 8185.288221] rbd: image test_rbd_image_1: image uses unsupported features: 0x40` shows that the feature the kernel does not support is hexadecimal 0x40. In decimal that is 4*16 + 0*1 = 64 = 2^6, which is the journaling bit.
# layering: layering (clone) support (2^0)
# striping: striping v2 support (2^1)
# exclusive-lock: exclusive locking (2^2)
# object-map: object map support, depends on exclusive-lock (2^3)
# fast-diff: fast diff calculation, depends on object-map (2^4)
# deep-flatten: snapshot flatten support (2^5)
# journaling: records IO operations, depends on exclusive-lock (2^6)
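
The bit positions listed above are enough to decode any mask the kernel prints. The helper below is a minimal sketch (the name `decode_rbd_features` is made up for this article; it is not part of the rbd CLI) that walks the seven bits in the order of the list and prints the names of the ones that are set.

```shell
#!/bin/sh
# Decode an RBD feature bitmask (decimal or 0x-prefixed hex) into feature names.
# The bit order follows the list above: layering is bit 0, journaling is bit 6.
decode_rbd_features() {
  mask=$(( $1 ))
  bit=0
  out=""
  for name in layering striping exclusive-lock object-map fast-diff deep-flatten journaling; do
    if [ $(( (mask >> bit) & 1 )) -eq 1 ]; then
      out="${out:+$out }$name"
    fi
    bit=$(( bit + 1 ))
  done
  echo "$out"
}

decode_rbd_features 0x40   # the value reported by dmesg above
```

For 0x40 this prints `journaling`, which is the feature that the `rbd map` error message suggests disabling.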
[root@ceph101 ~]# rbd feature disable test_rbd/test_rbd_image_1 journaling
[root@ceph101 ~]# rbd map test_rbd/test_rbd_image_1
/dev/rbd0

# Check the disks with lsblk
[root@ceph101 test_rbd]# rbd showmapped
id  pool      namespace  image             snap  device
0   test_rbd             test_rbd_image_1  -     /dev/rbd0
[root@ceph101 ~]# lsblk
NAME             MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
rbd0             251:0    0    10G  0 disk
vdb              252:16   0   500G  0 disk
└─ceph--cbd3517f--a42b--41b9--bdb5--350597fb4873-osd--block--da454e2d--c289--430f--a685--9b437b5a3e00 253:4 0 500G 0 lvm
sr0               11:0    1  1024M  0 rom
vdc              252:32   0   500G  0 disk
└─ceph--797b51d7--f835--43d7--a987--1316a2438933-osd--block--40e4dc65--08e9--4971--9187--2d05208bbb0d 253:3 0 500G 0 lvm
vda              252:0    0   500G  0 disk
├─vda2           252:2    0   499G  0 part
│ ├─centos-swap  253:1    0   7.9G  0 lvm
│ ├─centos-home  253:2    0 441.1G  0 lvm  /home
│ └─centos-root  253:0    0    50G  0 lvm  /
└─vda1           252:1    0     1G  0 part /boot

# Format the disk
[root@ceph101 ~]# mkfs.ext4 /dev/rbd0
mke2fs 1.42.9 (28-Dec-2013)
Discarding device blocks: done
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=16 blocks, Stripe width=16 blocks
655360 inodes, 2621440 blocks
131072 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=2151677952
80 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632

Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

## Create the mount point
[root@ceph101 ~]# mkdir test_rbd/
# Mount
[root@ceph101 ~]# mount /dev/rbd0 /root/test_rbd
[root@ceph101 ~]# lsblk
NAME             MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
rbd0             251:0    0    10G  0 disk /root/test_rbd
vdb              252:16   0   500G  0 disk
└─ceph--cbd3517f--a42b--41b9--bdb5--350597fb4873-osd--block--da454e2d--c289--430f--a685--9b437b5a3e00 253:4 0 500G 0 lvm
sr0               11:0    1  1024M  0 rom
vdc              252:32   0   500G  0 disk
└─ceph--797b51d7--f835--43d7--a987--1316a2438933-osd--block--40e4dc65--08e9--4971--9187--2d05208bbb0d 253:3 0 500G 0 lvm
vda              252:0    0   500G  0 disk
├─vda2           252:2    0   499G  0 part
│ ├─centos-swap  253:1    0   7.9G  0 lvm
│ ├─centos-home  253:2    0 441.1G  0 lvm  /home
│ └─centos-root  253:0    0    50G  0 lvm  /
└─vda1           252:1    0     1G  0 part /boot

# Move a text file (a novel) into the mounted directory
[root@ceph101 ~]# mv bcsj.txt test_rbd/
[root@ceph101 ~]# md5sum test_rbd/bcsj.txt
0d615ccd0e1c55f62002134f5cac81cc  test_rbd/bcsj.txt
[root@ceph101 ~]# df -lh
Filesystem      Size  Used Avail Use% Mounted on
/dev/rbd0       9.7G   15M  9.2G   1% /root/test_rbd

[root@ceph101 ~]# ceph df
--- RAW STORAGE ---
CLASS  SIZE     AVAIL    USED     RAW USED  %RAW USED
hdd    5.9 TiB  5.8 TiB  1.5 GiB    14 GiB       0.23
TOTAL  5.9 TiB  5.8 TiB  1.5 GiB    14 GiB       0.23

--- POOLS ---
POOL                       ID  PGS  STORED   OBJECTS  USED     %USED  MAX AVAIL
device_health_metrics       1    1      0 B        0      0 B      0    1.9 TiB
.rgw.root                  24   32   22 KiB       36  6.6 MiB      0    1.9 TiB
zone_01.rgw.log            31   32   26 KiB      965   55 MiB      0    1.9 TiB
zone_01.rgw.control        32   32      0 B        8      0 B      0    1.9 TiB
zone_01.rgw.meta           33    8  5.5 KiB       16  2.6 MiB      0    1.9 TiB
zone_01.rgw.buckets.index  34    8  672 KiB       55  2.0 MiB      0    1.9 TiB
zone_01.rgw.buckets.data   35   32  2.2 MiB       12  7.9 MiB      0    1.9 TiB
zone_01.rgw.otp            36   32      0 B        0      0 B      0    1.9 TiB
cp_pool                    42   32  2.2 MiB        2  7.0 MiB      0    1.9 TiB
test_rbd                   43   32  148 MiB       57  446 MiB      0    1.9 TiB

Object gateway

yum install ceph-radosgw -y

[ceph: root@ceph1 ceph]# radosgw-admin user create --uid='s3_admin' --display-name='s3_rgw_admin' --access-key='s3_rgw_admin_access_key' --secret-key='s3_rgw_admin_secret_key'
{
    "user_id": "s3_admin",
    "display_name": "s3_rgw_admin",
    "email": "",
    "suspended": 0,
    "max_buckets": 1000,
    "subusers": [],
    "keys": [
        {
            "user": "s3_admin",
            "access_key": "s3_rgw_admin_access_key",
            "secret_key": "s3_rgw_admin_secret_key"
        }
    ],
    "swift_keys": [],
    "caps": [],
    "op_mask": "read, write, delete",
    "default_placement": "",
    "default_storage_class": "",
    "placement_tags": [],
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "user_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "temp_url_keys": [],
    "type": "rgw",
    "mfa_ids": []
}

[ceph: root@ceph1 ceph]#
[ceph: root@ceph1 ceph]# radosgw-admin user info --uid='s3_admin'
{
    "user_id": "s3_admin",
    "display_name": "s3_rgw_admin",
    "email": "",
    "suspended": 0,
    "max_buckets": 1000,
    "subusers": [],
    "keys": [
        {
            "user": "s3_admin",
            "access_key": "s3_rgw_admin_access_key",
            "secret_key": "s3_rgw_admin_secret_key"
        }
    ],
    "swift_keys": [],
    "caps": [],
    "op_mask": "read, write, delete",
    "default_placement": "",
    "default_storage_class": "",
    "placement_tags": [],
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "user_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "temp_url_keys": [],
    "type": "rgw",
    "mfa_ids": []
}
ceph tell mon.* injectargs --mon_max_pg_per_osd=100

#----------------------------------------------------------------------------
radosgw-admin user create --uid=rgw_admin --display-name=rgw_admin --system
radosgw-admin user info --uid rgw_admin --system

radosgw-admin realm create --rgw-realm=realm_1 --default
radosgw-admin zonegroup create \
    --rgw-realm=realm_1 \
    --rgw-zonegroup=zone_group_1 \
    --endpoints http://ceph101:80 \
    --master --default

radosgw-admin zone modify \
    --rgw-realm=realm_1 \
    --rgw-zonegroup=zone_group_1 \
    --rgw-zone=zone_01 \
    --endpoints http://ceph101:80 \
    --access-key=IAWL6PLNFMNM0SLQNWQ0 \
    --secret=pZTNQ8HThJVXOHBnx5VCP1qJgPGfT9LTMpmwjhAo \
    --master --default
radosgw-admin period update --commit
radosgw-admin period update --rgw-realm=realm_1
ceph orch apply rgw realm_1 zone_01 --placement="1 ceph101"

Run the commands above only while the cluster is healthy; otherwise the rgw Docker container will not start. When everything works, the following services are listed:

[root@ceph101 ~]# ceph orch ls
NAME                 RUNNING  REFRESHED  AGE  PLACEMENT                                IMAGE NAME                                IMAGE ID
alertmanager             1/1  4m ago     2d   count:1                                  quay.io/prometheus/alertmanager:v0.20.0   0881eb8f169f
crash                    6/6  4m ago     2d   *                                        quay.io/ceph/ceph:v15                     3edede73a7c4
grafana                  1/1  4m ago     2d   count:1                                  quay.io/ceph/ceph-grafana:6.7.4           557c83e11646
mgr                      2/2  4m ago     2d   count:2                                  quay.io/ceph/ceph:v15                     3edede73a7c4
mon                      5/5  4m ago     2d   ceph101;ceph102;ceph103;ceph104;ceph105  quay.io/ceph/ceph:v15                     3edede73a7c4
node-exporter            6/6  4m ago     2d   *                                        quay.io/prometheus/node-exporter:v0.18.1  e5a616e4b9cf
osd.None                12/0  4m ago     -    <unmanaged>                              quay.io/ceph/ceph:v15                     3edede73a7c4
prometheus               1/1  4m ago     2d   count:1                                  quay.io/prometheus/prometheus:v2.18.1     de242295e225
rgw.realm_1.zone_01      1/1  4m ago     75m  ceph101;count:1                          quay.io/ceph/ceph:v15                     3edede73a7c4

If the rgw daemon fails to start, the cluster may be unhealthy. Check the logs:
ceph log last cephadm
/var/log/ceph/cephadm.log

[root@ceph101 system]# ceph log last cephadm
2022-04-27T07:57:57.347323+0000 mgr.ceph101.qhgzmi (mgr.14164) 95889 : cephadm [ERR] Failed to apply rgw.realm_1.zone_group_1acementSpec(hostname='ceph101', network='', name=''), HostPlacementSpec(hostname='ceph102', network='', name=''), HostPlaceme': 'rgw', 'service_id': 'realm_1.zone_group_1', 'unmanaged': False, 'preview_only': False, 'rgw_realm': 'realm_1', 'rgw_zone' 'rgw_frontend_ssl_certificate': None, 'rgw_frontend_ssl_key': None, 'ssl': False}): Health not ok, will try again when healt
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 412, in _apply_all_services
    if self._apply_service(spec):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 511, in _apply_service
    rgw_config_func(cast(RGWSpec, spec), daemon_id)
  File "/usr/share/ceph/mgr/cephadm/services/cephadmservice.py", line 539, in config
    self.create_realm_zonegroup_zone(spec, rgw_id)
  File "/usr/share/ceph/mgr/cephadm/services/cephadmservice.py", line 617, in create_realm_zonegroup_zone
    raise OrchestratorError('Health not ok, will try again when health ok')
orchestrator._interface.OrchestratorError: Health not ok, will try again when health ok
2022-04-27T07:57:57.353366+0000 mgr.ceph101.qhgzmi (mgr.14164) 95890 : cephadm [ERR] Failed to apply rgw.realm_1.zone_01 specntSpec(hostname='ceph101', network='', name=''), HostPlacementSpec(hostname='ceph102', network='', name=''), HostPlacementSpegw', 'service_id': 'realm_1.zone_01', 'unmanaged': False, 'preview_only': False, 'rgw_realm': 'realm_1', 'rgw_zone': 'zone_01ssl_certificate': None, 'rgw_frontend_ssl_key': None, 'ssl': False}): Health not ok, will try again when health ok
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 412, in _apply_all_services
    if self._apply_service(spec):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 511, in _apply_service
    rgw_config_func(cast(RGWSpec, spec), daemon_id)
  File "/usr/share/ceph/mgr/cephadm/services/cephadmservice.py", line 539, in config
    self.create_realm_zonegroup_zone(spec, rgw_id)
  File "/usr/share/ceph/mgr/cephadm/services/cephadmservice.py", line 617, in create_realm_zonegroup_zone
    raise OrchestratorError('Health not ok, will try again when health ok')
orchestrator._interface.OrchestratorError: Health not ok, will try again when health ok
2022-04-27T08:07:22.102133+0000 mgr.ceph101.qhgzmi (mgr.14164) 96175 : cephadm [INF] refreshing ceph104 facts
2022-04-27T08:07:22.103197+0000 mgr.ceph101.qhgzmi (mgr.14164) 96176 : cephadm [INF] refreshing ceph103 facts
2022-04-27T08:07:22.105047+0000 mgr.ceph101.qhgzmi (mgr.14164) 96177 : cephadm [INF] refreshing ceph106 facts
2022-04-27T08:07:22.105643+0000 mgr.ceph101.qhgzmi (mgr.14164) 96178 : cephadm [INF] refreshing ceph105 facts
2022-04-27T08:07:22.106985+0000 mgr.ceph101.qhgzmi (mgr.14164) 96179 : cephadm [INF] refreshing ceph102 facts
2022-04-27T08:07:22.910395+0000 mgr.ceph101.qhgzmi (mgr.14164) 96181 : cephadm [INF] refreshing ceph101 facts
2022-04-27T08:07:23.599992+0000 mgr.ceph101.qhgzmi (mgr.14164) 96182 : cephadm [ERR] Failed to apply rgw.realm_1.zone_group_1acementSpec(hostname='ceph101', network='', name=''), HostPlacementSpec(hostname='ceph102', network='', name=''), HostPlaceme': 'rgw', 'service_id': 'realm_1.zone_group_1', 'unmanaged': False, 'preview_only': False, 'rgw_realm': 'realm_1', 'rgw_zone' 'rgw_frontend_ssl_certificate': None, 'rgw_frontend_ssl_key': None, 'ssl': False}): Health not ok, will try again when healt
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 412, in _apply_all_services
    if self._apply_service(spec):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 511, in _apply_service
    rgw_config_func(cast(RGWSpec, spec), daemon_id)
  File "/usr/share/ceph/mgr/cephadm/services/cephadmservice.py", line 539, in config
    self.create_realm_zonegroup_zone(spec, rgw_id)
  File "/usr/share/ceph/mgr/cephadm/services/cephadmservice.py", line 617, in create_realm_zonegroup_zone
    raise OrchestratorError('Health not ok, will try again when health ok')
orchestrator._interface.OrchestratorError: Health not ok, will try again when health ok
2022-04-27T08:07:23.615964+0000 mgr.ceph101.qhgzmi (mgr.14164) 96183 : cephadm [ERR] Failed to apply rgw.realm_1.zone_01 specntSpec(hostname='ceph101', network='', name=''), HostPlacementSpec(hostname='ceph102', network='', name=''), HostPlacementSpegw', 'service_id': 'realm_1.zone_01', 'unmanaged': False, 'preview_only': False, 'rgw_realm': 'realm_1', 'rgw_zone': 'zone_01ssl_certificate': None, 'rgw_frontend_ssl_key': None, 'ssl': False}): Health not ok, will try again when health ok
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 412, in _apply_all_services
    if self._apply_service(spec):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 511, in _apply_service
    rgw_config_func(cast(RGWSpec, spec), daemon_id)
  File "/usr/share/ceph/mgr/cephadm/services/cephadmservice.py", line 539, in config
    self.create_realm_zonegroup_zone(spec, rgw_id)
  File "/usr/share/ceph/mgr/cephadm/services/cephadmservice.py", line 617, in create_realm_zonegroup_zone
    raise OrchestratorError('Health not ok, will try again when health ok')
orchestrator._interface.OrchestratorError: Health not ok, will try again when health ok
2022-04-27T08:07:23.784884+0000 mgr.ceph101.qhgzmi (mgr.14164) 96184 : cephadm [ERR] Failed to apply rgw.realm_1.zone_group_1acementSpec(hostname='ceph101', network='', name=''), HostPlacementSpec(hostname='ceph102', network='', name=''), HostPlaceme': 'rgw', 'service_id': 'realm_1.zone_group_1', 'unmanaged': False, 'preview_only': False, 'rgw_realm': 'realm_1', 'rgw_zone' 'rgw_frontend_ssl_certificate': None, 'rgw_frontend_ssl_key': None, 'ssl': False}): Health not ok, will try again when healt
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 412, in _apply_all_services
    if self._apply_service(spec):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 511, in _apply_service
    rgw_config_func(cast(RGWSpec, spec), daemon_id)
  File "/usr/share/ceph/mgr/cephadm/services/cephadmservice.py", line 539, in config
    self.create_realm_zonegroup_zone(spec, rgw_id)
  File "/usr/share/ceph/mgr/cephadm/services/cephadmservice.py", line 617, in create_realm_zonegroup_zone
    raise OrchestratorError('Health not ok, will try again when health ok')
orchestrator._interface.OrchestratorError: Health not ok, will try again when health ok
2022-04-27T08:07:23.788497+0000 mgr.ceph101.qhgzmi (mgr.14164) 96185 : cephadm [ERR] Failed to apply rgw.realm_1.zone_01 specntSpec(hostname='ceph101', network='', name=''), HostPlacementSpec(hostname='ceph102', network='', name=''), HostPlacementSpegw', 'service_id': 'realm_1.zone_01', 'unmanaged': False, 'preview_only': False, 'rgw_realm': 'realm_1', 'rgw_zone': 'zone_01ssl_certificate': None, 'rgw_frontend_ssl_key': None, 'ssl': False}): Health not ok, will try again when health ok
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 412, in _apply_all_services
    if self._apply_service(spec):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 511, in _apply_service
    rgw_config_func(cast(RGWSpec, spec), daemon_id)
  File "/usr/share/ceph/mgr/cephadm/services/cephadmservice.py", line 539, in config
    self.create_realm_zonegroup_zone(spec, rgw_id)
  File "/usr/share/ceph/mgr/cephadm/services/cephadmservice.py", line 617, in create_realm_zonegroup_zone
    raise OrchestratorError('Health not ok, will try again when health ok')
orchestrator._interface.OrchestratorError: Health not ok, will try again when health ok
2022-04-27T08:16:33.025623+0000 mgr.ceph101.qhgzmi (mgr.14164) 96463 : cephadm [INF] Saving service rgw.realm_1.zone_01 spec
2022-04-27T08:16:33.240231+0000 mgr.ceph101.qhgzmi (mgr.14164) 96464 : cephadm [INF] refreshing ceph101 facts
2022-04-27T08:16:33.248641+0000 mgr.ceph101.qhgzmi (mgr.14164) 96465 : cephadm [INF] refreshing ceph102 facts
2022-04-27T08:16:33.250945+0000 mgr.ceph101.qhgzmi (mgr.14164) 96466 : cephadm [INF] refreshing ceph103 facts
2022-04-27T08:16:33.252787+0000 mgr.ceph101.qhgzmi (mgr.14164) 96467 : cephadm [INF] refreshing ceph104 facts
2022-04-27T08:16:33.254250+0000 mgr.ceph101.qhgzmi (mgr.14164) 96468 : cephadm [INF] refreshing ceph105 facts
2022-04-27T08:16:33.256573+0000 mgr.ceph101.qhgzmi (mgr.14164) 96469 : cephadm [INF] refreshing ceph106 facts
2022-04-27T08:16:34.288319+0000 mgr.ceph101.qhgzmi (mgr.14164) 96470 : cephadm [ERR] Failed to apply rgw.realm_1.zone_group_1acementSpec(hostname='ceph101', network='', name=''), HostPlacementSpec(hostname='ceph102', network='', name=''), HostPlaceme': 'rgw', 'service_id': 'realm_1.zone_group_1', 'unmanaged': False, 'preview_only': False, 'rgw_realm': 'realm_1', 'rgw_zone' 'rgw_frontend_ssl_certificate': None, 'rgw_frontend_ssl_key': None, 'ssl': False}): Health not ok, will try again when healt
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 412, in _apply_all_services
    if self._apply_service(spec):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 511, in _apply_service
    rgw_config_func(cast(RGWSpec, spec), daemon_id)
  File "/usr/share/ceph/mgr/cephadm/services/cephadmservice.py", line 539, in config
    self.create_realm_zonegroup_zone(spec, rgw_id)
  File "/usr/share/ceph/mgr/cephadm/services/cephadmservice.py", line 617, in create_realm_zonegroup_zone
    raise OrchestratorError('Health not ok, will try again when health ok')
orchestrator._interface.OrchestratorError: Health not ok, will try again when health ok
2022-04-27T08:16:34.292193+0000 mgr.ceph101.qhgzmi (mgr.14164) 96471 : cephadm [ERR] Failed to apply rgw.realm_1.zone_01 specntSpec(hostname='ceph101', network='', name='')]), 'service_type': 'rgw', 'service_id': 'realm_1.zone_01', 'unmanaged': Falsene_01', 'subcluster': None, 'rgw_frontend_port': None, 'rgw_frontend_ssl_certificate': None, 'rgw_frontend_ssl_key': None, 's
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 412, in _apply_all_services
    if self._apply_service(spec):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 511, in _apply_service
    rgw_config_func(cast(RGWSpec, spec), daemon_id)
  File "/usr/share/ceph/mgr/cephadm/services/cephadmservice.py", line 539, in config
    self.create_realm_zonegroup_zone(spec, rgw_id)
  File "/usr/share/ceph/mgr/cephadm/services/cephadmservice.py", line 617, in create_realm_zonegroup_zone
    raise OrchestratorError('Health not ok, will try again when health ok')
orchestrator._interface.OrchestratorError: Health not ok, will try again when health ok

Enter the Docker container:

docker exec -it 736e1816f245 /bin/sh
sh-4.4# cd /usr/share/ceph/mgr/cephadm/services/
sh-4.4# ls
__init__.py  cephadmservice.py  container.py  iscsi.py  monitoring.py  nfs.py  osd.py
sh-4.4# vim cephadmservice.py
sh: vim: command not found
sh-4.4# vi cephadmservice.py
## The Python code checks the cluster status, finds "Health not ok", and therefore refuses to continue.
# After bringing the cluster back to a healthy state and retrying, the deployment succeeds.
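
Because the rgw deployment is only applied while the cluster reports a healthy state, it can help to wait for that state before running `ceph orch apply rgw`. The function below is a hedged sketch: `wait_for_health_ok` is a made-up name, and the defaults of 30 retries and 10 seconds between them are arbitrary assumptions. It relies only on `ceph health`, whose output starts with `HEALTH_OK`, `HEALTH_WARN`, or `HEALTH_ERR`.

```shell
#!/bin/sh
# Poll "ceph health" until the cluster reports HEALTH_OK.
#   $1: maximum number of checks (assumed default: 30)
#   $2: seconds to sleep between checks (assumed default: 10)
# Returns 0 once healthy, 1 if the cluster never became healthy.
wait_for_health_ok() {
  retries=${1:-30}
  delay=${2:-10}
  i=0
  status=""
  while [ "$i" -lt "$retries" ]; do
    # The first word of the output is the overall health state.
    status=$(ceph health 2>/dev/null | awk '{print $1}')
    if [ "$status" = "HEALTH_OK" ]; then
      return 0
    fi
    i=$(( i + 1 ))
    sleep "$delay"
  done
  echo "cluster still not healthy: ${status:-unknown}" >&2
  return 1
}
```

A possible use with the command from this article: `wait_for_health_ok && ceph orch apply rgw realm_1 zone_01 --placement="1 ceph101"`.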

To access the S3 object service from Windows, install S3 Browser.

After I created the user, it was deleted automatically for reasons I could not determine. I recreated the user and bound it to the zone again, after which S3 Browser could access the service normally.

Other frequently used commands:

radosgw-admin realm delete --rgw-realm=realm_1
radosgw-admin zonegroup delete --rgw-zonegroup=zone_group_1
radosgw-admin zone delete --rgw-zone=zone_1
radosgw-admin realm list
radosgw-admin zonegroup list
radosgw-admin zone list
radosgw-admin zone list --rgw-zonegroup default
radosgw-admin user info --uid rgw_admin --system

s3cmd mb s3://sec --region=zone_group_1

Configuring the dashboard:
radosgw-admin user list
radosgw-admin user info --uid rgw_admin
# paste the user's access_key into this file
vim access.key
# paste the user's secret_key into this file
vim secret.key
ceph dashboard set-rgw-api-access-key -i access.key
ceph dashboard set-rgw-api-secret-key -i secret.key
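
The two key files can also be filled without an editor. This is a rough sketch under stated assumptions: `json_field` is a hypothetical helper, it expects the pretty-printed JSON that `radosgw-admin user info` emits (one `"name": "value"` pair per line), and it does not unescape JSON sequences such as `\/`, so check the files if the secret contains a slash.

```bash
#!/bin/sh
# Hypothetical helper: print the first string value of a JSON field.
# $1 = field name; the JSON text is read from stdin.
json_field() {
    sed -n 's/.*"'"$1"'": *"\([^"]*\)".*/\1/p' | head -n 1
}

# Example (commented out; requires a running cluster):
# radosgw-admin user info --uid rgw_admin > user.json
# json_field access_key < user.json > access.key
# json_field secret_key < user.json > secret.key
# ceph dashboard set-rgw-api-access-key -i access.key
# ceph dashboard set-rgw-api-secret-key -i secret.key
```

If the user has several key pairs, only the first one is returned.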
The result is then visible in the dashboard front end.

Setting up active-active replication

radosgw-admin realm pull --url=http://12.70.10.181:80 --access-key=IAWL6PLNFMNM0SLQNWQ0 --secret=pZTNQ8HThJVXOHBnx5VCP1qJgPGfT9LTMpmwjhAo

radosgw-admin zone modify --rgw-zonegroup=zone_group_1 \
    --rgw-zone=zone_02 --url=http://ceph101:80 \
    --access-key=IAWL6PLNFMNM0SLQNWQ0 --secret=pZTNQ8HThJVXOHBnx5VCP1qJgPGfT9LTMpmwjhAo \
    --endpoints=http://ceph1:80
radosgw-admin period update --commit
ceph orch apply rgw realm_1 zone_02 --placement="1 ceph1"

Viewing the applied configuration
[root@ceph5 f88b0b1a-c467-11ec-a2b8-525400299ff7]# ceph config dump

[root@ceph5 f88b0b1a-c467-11ec-a2b8-525400299ff7]# pwd
/var/lib/ceph/f88b0b1a-c467-11ec-a2b8-525400299ff7
[root@ceph5 f88b0b1a-c467-11ec-a2b8-525400299ff7]# ll
total 0
drwx------ 4 ceph  ceph   92 Apr 27 18:52 crash
drwx------ 2 ceph  ceph  133 Apr 25 16:07 crash.ceph5
drwx------ 3 ceph  ceph  190 Apr 27 17:04 mon.ceph5
drwx------ 2 65534 65534 104 Apr 25 16:07 node-exporter.ceph5
drwx------ 2 ceph  ceph  241 Apr 28 19:42 osd.4
drwx------ 4 root  root   88 Apr 28 10:59 removed
drwx------ 2 ceph  ceph  133 Apr 28 19:57 rgw.realm_1.zone_03.ceph5.tapydb

[root@ceph5 f88b0b1a-c467-11ec-a2b8-525400299ff7]# ceph config show rgw.realm_1.zone_03.ceph5.tapydb
NAME                   VALUE                                     SOURCE    OVERRIDES  IGNORES
admin_socket           $run_dir/$cluster-$name.$pid.$cctid.asok  default
container_image        quay.io/ceph/ceph:v15                     mon
daemonize              false                                     override
debug_rgw              1/5                                       default
keyring                $rgw_data/keyring                         default
log_stderr_prefix      debug                                     default
log_to_file            false                                     default
log_to_stderr          true                                      default
mon_host               [v2:12.70.10.161:3300/0,v1:12.70.10.161:6789/0] [v2:12.70.10.162:3300/0,v1:12.70.10.162:6789/0] [v2:12.70.10.163:3300/0,v1:12.70.10.163:6789/0] [v2:12.70.10.164:3300/0,v1:12.70.10.164:6789/0] [v2:12.70.10.165:3300/0,v1:12.70.10.165:6789/0]  file
no_config_file         false                                     override
objecter_inflight_ops  24576                                     default
rbd_default_features   61                                        default
rgw_frontends          beast port=80                             mon
rgw_realm              realm_1                                   mon
rgw_zone               zone_03                                   mon
setgroup               ceph                                      cmdline
setuser                ceph                                      cmdline

To change the port that rgw listens on, edit the configuration file

## Within one realm and one zonegroup you can create several zones, and each zone can host an rgw, so create another rgw gateway
radosgw-admin realm pull --url=http://12.70.10.181:80 --access-key=IAWL6PLNFMNM0SLQNWQ0 --secret=pZTNQ8HThJVXOHBnx5VCP1qJgPGfT9LTMpmwjhAo

radosgw-admin zone modify --rgw-zonegroup=zone_group_1 \
    --rgw-zone=zone_03 --url=http://ceph101:80 \
    --access-key=IAWL6PLNFMNM0SLQNWQ0 --secret=pZTNQ8HThJVXOHBnx5VCP1qJgPGfT9LTMpmwjhAo \
    --endpoints=http://ceph5:8088
radosgw-admin period update --commit
ceph orch apply rgw realm_1 zone_03 --placement="1 ceph1"

[root@ceph5 rgw.realm_1.zone_03.ceph5.crocaj]# pwd
/var/lib/ceph/f88b0b1a-c467-11ec-a2b8-525400299ff7/rgw.realm_1.zone_03.ceph5.crocaj
[root@ceph5 rgw.realm_1.zone_03.ceph5.crocaj]# vim
`# minimal ceph.conf for f88b0b1a-c467-11ec-a2b8-525400299ff7
[global]
fsid = f88b0b1a-c467-11ec-a2b8-525400299ff7
mon_host = [v2:12.70.10.161:3300/0,v1:12.70.10.161:6789/0] [v2:12.70.10.162:3300/0,v1:12.70.10.162:6789/0] [v2:12.70.10.163:3300/0,v1:12.70.10.163:6789/0] [v2:12.70.10.164:3300/0,v1:12.70.10.164:6789/0] [v2:12.70.10.165:3300/0,v1:12.70.10.165:6789/0]
[client]
rgw_frontends = "beast port=8088"
`
# Then restart the rgw service.
# Alternatively, run ceph config set client.rgw.realm_1.zone_03 rgw_frontends "beast port=18888",
# remove the setting from the config file and restart the service; this also takes effect.
# As for a more specific section such as:
`
[client.rgw.realm_1.zone_03]
rgw_frontends = "beast port=8888"
`
# I tried this many times and it never worked; the official documentation is unclear on this point.
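
Because several places can set `rgw_frontends`, it is worth checking which value the running daemon actually uses after a restart. The sketch below is illustrative only: `frontend_port` is a hypothetical helper that pulls the number after `port=` out of a frontends string, and the daemon name in the commented example is the one from the `ceph config show` output in these notes.

```bash
#!/bin/sh
# Hypothetical helper: print the port from an rgw_frontends value
# such as "beast port=8088". If the string contains several "port="
# settings (for example ssl_port), the last one is printed.
frontend_port() {
    printf '%s\n' "$1" | sed -n 's/.*port=\([0-9][0-9]*\).*/\1/p'
}

# Example (commented out; requires a running cluster):
# frontend_port "$(ceph config show rgw.realm_1.zone_03.ceph5.tapydb rgw_frontends)"
```

Comparing this value with the SOURCE column of `ceph config show` indicates whether the file or the monitor config database took effect.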

See:

Specifying the rgw data pool

Removing an OSD
From the command line, the steps are as follows.
Mark the OSD as out (it no longer serves data):
[root@ceph1 ~]# ceph osd out osd.5
Remove the OSD from the CRUSH map (if you skip this, it still counts toward the host's CRUSH weight):
[root@ceph1 ~]# ceph osd crush remove osd.5
Delete the OSD:
[root@ceph1 ~]# ceph osd rm osd.5
Delete the OSD's auth entry (if you skip this, its ID stays occupied):
[root@ceph1 ~]# ceph auth ls
[root@ceph1 ~]# ceph auth del osd.5
[root@ceph1 ~]# ceph orch daemon stop osd.5
Scheduled to stop osd.5 on host 'ceph6'
[root@ceph1 ~]# ceph health detail
HEALTH_WARN 1 failed cephadm daemon(s)
[WRN] CEPHADM_FAILED_DAEMON: 1 failed cephadm daemon(s)
    daemon osd.5 on ceph6 is in error state
[root@ceph1 ~]# ceph orch daemon rm osd.5 --force
Removed osd.5 from host 'ceph6'
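
The removal steps above can be collected into one helper. This is a sketch, not the author's script: `run` and `remove_osd` are hypothetical names, dry-run is on by default so nothing is executed unless `DRY_RUN=0` is set, and the daemon is stopped earlier than in the transcript because `ceph osd rm` refuses to remove an OSD that is still up.

```bash
#!/bin/sh
# Hypothetical wrapper: print the command in dry-run mode (the default),
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: remove OSD number $1 using the commands from
# these notes. The chain stops at the first command that fails.
remove_osd() {
    _osd="osd.$1"
    run ceph osd out "$_osd" &&
    run ceph orch daemon stop "$_osd" &&
    run ceph osd crush remove "$_osd" &&
    run ceph auth del "$_osd" &&
    run ceph osd rm "$_osd" &&
    run ceph orch daemon rm "$_osd" --force
}

# Example: show what would be executed for osd.5
# remove_osd 5
```

Running it once in dry-run mode and reading the printed commands is a cheap check before setting `DRY_RUN=0`.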

Recovering the disk:
There are two ways to find the disk that is no longer in use:
[root@ceph6 osd.5_2022-05-11T02:14:55.313464Z]# ceph osd metadata
and
[root@ceph6 osd.5_2022-05-11T02:14:55.313464Z]# pwd
/var/lib/ceph/f88b0b1a-c467-11ec-a2b8-525400299ff7/removed/osd.5_2022-05-11T02:14:55.313464Z
[root@ceph6 osd.5_2022-05-11T02:14:55.313464Z]# ll
total 52
lrwxrwxrwx 1 ceph ceph   93 May 10 12:47 block -> /dev/ceph-2e1cc736-34d6-4dac-8d7c-f78db028a9eb/osd-block-c831faa6-7cc6-4c04-9709-c33fb29a45f3
-rw------- 1 ceph ceph   37 May 10 12:47 ceph_fsid
-rw------- 1 ceph ceph  377 Apr 28 19:43 config
-rw------- 1 ceph ceph   37 May 10 12:47 fsid
-rw------- 1 ceph ceph   55 May 10 12:47 keyring
-rw------- 1 ceph ceph    6 May 10 12:47 ready
-rw------- 1 ceph ceph    3 Apr 25 17:04 require_osd_release
-rw------- 1 ceph ceph   10 May 10 12:47 type
-rw------- 1 ceph ceph   38 Apr 28 19:43 unit.configured
-rw------- 1 ceph ceph   48 Apr 25 17:04 unit.created
-rw------- 1 ceph ceph   22 Apr 28 19:43 unit.image
-rw------- 1 ceph ceph  931 Apr 28 19:43 unit.poststop
-rw------- 1 ceph ceph 2035 Apr 28 19:43 unit.run
-rw------- 1 ceph ceph    2 May 10 12:47 whoami

[root@ceph6 osd.5_2022-05-11T02:14:55.313464Z]# lsblk
NAME                                                                                                  MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
vdb                                                                                                   252:16   0   500G  0 disk
└─ceph--2e1cc736--34d6--4dac--8d7c--f78db028a9eb-osd--block--c831faa6--7cc6--4c04--9709--c33fb29a45f3 253:4    0   500G  0 lvm
sr0                                                                                                    11:0    1  1024M  0 rom
vdc                                                                                                   252:32   0   500G  0 disk
└─ceph--beabac1e--6d55--481b--99e4--db03786e8f78-osd--block--7f9316bd--d011--426f--be86--52b8bfad4c0b 253:3    0   500G  0 lvm
vda                                                                                                   252:0    0   500G  0 disk
├─vda2                                                                                                252:2    0   499G  0 part
│ ├─centos-swap                                                                                       253:1    0   7.9G  0 lvm
│ ├─centos-home                                                                                       253:2    0 441.1G  0 lvm  /home
│ └─centos-root                                                                                       253:0    0    50G  0 lvm  /
└─vda1                                                                                                252:1    0     1G  0 part /boot
Matching the block symlink against the lsblk output confirms that the disk is the /dev/vdb block device on ceph6.
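
The match between the `block` symlink and the `lsblk` line is easier to see once the LV path is converted to its device-mapper name: the volume group and logical volume names each have their hyphens doubled and are then joined with a single hyphen. A small sketch of that conversion follows; `lv_to_dm_name` is a hypothetical helper name.

```bash
#!/bin/sh
# Hypothetical helper: convert /dev/<vg>/<lv> to the device-mapper name
# shown by lsblk (hyphens inside each name doubled, names joined by "-").
lv_to_dm_name() {
    _vg="$(printf '%s\n' "$1" | cut -d/ -f3 | sed 's/-/--/g')"
    _lv="$(printf '%s\n' "$1" | cut -d/ -f4 | sed 's/-/--/g')"
    printf '%s-%s\n' "$_vg" "$_lv"
}

# Example (commented out; run inside the removed OSD directory):
# lsblk | grep -B1 "$(lv_to_dm_name "$(readlink block)")"
```

With `grep -B1`, the line printed before the match is the parent disk, which is `vdb` in the listing above.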
# Zap the disk so that it can be used again
[root@ceph1 ~]# ceph orch device zap ceph6 /dev/vdb --force
/bin/docker: stderr --> Zapping: /dev/vdb
/bin/docker: stderr Running command: /usr/bin/dd if=/dev/zero of=/dev/vdb bs=1M count=10 conv=fsync
/bin/docker: stderr stderr: 10+0 records in
/bin/docker: stderr 10+0 records out
/bin/docker: stderr stderr: 10485760 bytes (10 MB, 10 MiB) copied, 1.1166 s, 9.4 MB/s
/bin/docker: stderr --> Zapping successful for: <Raw Device: /dev/vdb>
[root@ceph1 ~]# ceph orch daemon add osd ceph106:/dev/vdc
ceph orch daemon rm osd.5 --force
ceph osd out osd.5
ceph osd rm osd.5
ceph osd crush rm osd.5

Handling PG failures

1. Check the cluster health
[root@k8snode001 ~]# ceph health detail
HEALTH_ERR 1/973013 objects unfound (0.000%); 17 scrub errors; Possible data damage: 1 pg recovery_unfound, 8 pgs inconsistent, 1 pg repair; Degraded data redundancy: 1/2919039 objects degraded (0.000%), 1 pg degraded
OBJECT_UNFOUND 1/973013 objects unfound (0.000%)
    pg 2.2b has 1 unfound objects
OSD_SCRUB_ERRORS 17 scrub errors
PG_DAMAGED Possible data damage: 1 pg recovery_unfound, 8 pgs inconsistent, 1 pg repair
    pg 2.2b is active+recovery_unfound+degraded, acting [14,22,4], 1 unfound
    pg 2.44 is active+clean+inconsistent, acting [14,8,21]
    pg 2.73 is active+clean+inconsistent, acting [25,14,8]
    pg 2.80 is active+clean+scrubbing+deep+inconsistent+repair, acting [4,8,14]
    pg 2.83 is active+clean+inconsistent, acting [14,13,6]
    pg 2.ae is active+clean+inconsistent, acting [14,3,2]
    pg 2.c4 is active+clean+inconsistent, acting [8,21,14]
    pg 2.da is active+clean+inconsistent, acting [23,14,15]
    pg 2.fa is active+clean+inconsistent, acting [14,23,25]
PG_DEGRADED Degraded data redundancy: 1/2919039 objects degraded (0.000%), 1 pg degraded
    pg 2.2b is active+recovery_unfound+degraded, acting [14,22,4], 1 unfound

2. Check the pg map
[root@k8snode001 ~]# ceph pg map 2.2b
osdmap e10373 pg 2.2b (2.2b) -> up [14,22,4] acting [14,22,4]
The pg map shows that pg 2.2b is placed on OSDs [14,22,4].

3. Check the pool status
[root@k8snode001 ~]# ceph osd pool stats k8s-1
pool k8s-1 id 2
  1/1955664 objects degraded (0.000%)
  1/651888 objects unfound (0.000%)
  client io 271 KiB/s wr, 0 op/s rd, 52 op/s wr

[root@k8snode001 ~]# ceph osd pool ls detail|grep k8s-1
pool 2 'k8s-1' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 88 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd

4. Try to recover the lost object in pg 2.2b
[root@k8snode001 ~]# ceph pg repair 2.2b
If the repair keeps failing, query the stuck PG for details, paying particular attention to recovery_state:
[root@k8snode001 ~]# ceph pg 2.2b query
If repair cannot fix it, there are two options: revert the object to an older version, or delete it outright.
Revert to the older version:
[root@k8snode001 ~]# ceph pg 2.2b mark_unfound_lost revert
Delete outright:
[root@k8snode001 ~]# ceph pg 2.2b mark_unfound_lost delete
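
The health report above also lists several inconsistent PGs besides the unfound one. As a sketch of how those could be collected for `ceph pg repair`, the hypothetical helper `inconsistent_pgs` below filters `ceph health detail` output on stdin; it assumes the `pg <id> is <state>, ...` line format shown above and also returns PGs that are already being repaired.

```bash
#!/bin/sh
# Hypothetical helper: read `ceph health detail` output on stdin and
# print the IDs of PGs whose state contains "inconsistent".
inconsistent_pgs() {
    awk '$1 == "pg" && $3 == "is" && $4 ~ /inconsistent/ { print $2 }' | sort -u
}

# Example (commented out; requires a running cluster):
# ceph health detail | inconsistent_pgs | while read -r pg; do
#     ceph pg repair "$pg"
# done
```

Reviewing the printed list before piping it into the repair loop is the safer way to use it.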

Reference: "记一次ceph pg unfound处理过程" (notes on handling a Ceph PG unfound incident)

Uninstalling
Uninstall procedure:
Alias the command: alias ceph='cephadm shell -- ceph'
# Find the fsid
[root@master1 ~]# ceph -s
Inferring fsid 008a0d2e-b163-11ec-ba7a-5254004c51c6
Inferring config /var/lib/ceph/008a0d2e-b163-11ec-ba7a-5254004c51c6/mon.master1/config
Using recent ceph image quay.io/ceph/ceph@sha256:1b0ceef23cbd6a1af6ba0cbde344ebe6bde4ae183f545c1ded9c7c684239947f
2022-04-01T03:20:37.240+0000 7f27a2b6b700 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2022-04-01T03:20:37.240+0000 7f27a2b6b700 -1 AuthRegistry(0x7f279c05ec00) no keyring found at /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,, disabling cephx
2022-04-01T03:20:37.242+0000 7f27a2b6b700 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2022-04-01T03:20:37.242+0000 7f27a2b6b700 -1 AuthRegistry(0x7f27a2b69f90) no keyring found at /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,, disabling cephx
2022-04-01T03:20:37.243+0000 7f27a0907700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [1]
2022-04-01T03:20:37.243+0000 7f27a2b6b700 -1 monclient: authenticate NOTE: no keyring found; disabled cephx authentication
[errno 13] RADOS permission denied (error connecting to the cluster)

# Delete the cluster using the fsid
[root@master1 ~]# cephadm rm-cluster --fsid 008a0d2e-b163-11ec-ba7a-5254004c51c6 --force
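
The fsid does not have to be copied by hand: even when authentication fails, the "Inferring fsid" line is still printed, as the transcript above shows. A minimal sketch of extracting it follows; `infer_fsid` is a hypothetical helper that only parses text, and the commented pipeline assumes the line may arrive on either stdout or stderr.

```bash
#!/bin/sh
# Hypothetical helper: read cephadm output on stdin and print the fsid
# from the first "Inferring fsid <uuid>" line.
infer_fsid() {
    sed -n 's/^Inferring fsid \([0-9a-f-]*\).*/\1/p' | head -n 1
}

# Example (commented out; destructive, removes the whole cluster):
# fsid="$(cephadm shell -- ceph -s 2>&1 | infer_fsid)"
# [ -n "$fsid" ] && cephadm rm-cluster --fsid "$fsid" --force
```

Printing the fsid and checking it against the directory name under /var/lib/ceph before running `rm-cluster` avoids deleting the wrong cluster.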

Distributing ceph.pub
playbook -v 2.mon1.yaml -t "find_pub,show_pub,push_pub" --extra-vars "ceph_pub_path=/tmp/ceph/master1/etc/ceph/ceph.pub"
or
playbook -v 2.mon1.yaml -t "find_pub,show_pub,push_pub" -e "ceph_pub_path=/tmp/ceph/master1/etc/ceph/ceph.pub"

Run ansible master0 -m setup to see all available variables.
Reference: ansible.com.cn/docs/playbo…
Jinja2 reference:
stackoverflow.com/questions/3…
playbook -v 2.mon1.yaml
playbook -v 3.push.pub.key.yaml -e "pub_key=/tmp/ceph/master1/etc/ceph/ceph.pub"
playbook -v 4.add.host.yaml -t "weave-script" -e "mon1=master1"

This part covers the subcommands of ceph mon; used together, they manage and configure the MONs.

add: adds a MON service named <name> at the given address.
Example:
ceph mon add <name> <IPaddr[:port]>
dump: prints formatted information about a specific version of the MON map. The version is optional and is given as an epoch:
ceph mon dump {<int[0-]>}
ceph mon dump 1
getmap: fetches a specific version of the MON map. The output is binary and can be saved to a file. Format:
ceph mon getmap {<int[0-]>}
Example:
ceph mon getmap 2 -o filename
remove: removes the MON service with the given name. Format:
ceph mon remove <name>
Example:
ceph mon remove osd3
stat: prints a summary of the MON status. Format:
ceph mon stat
mon_status: reports the MON status in more detail. Format:
ceph mon_status
Ceph commands: ceph mon (Monitor management)

|
|