|
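I. Environment preparation

The installation environment is a set of VMware Workstation virtual machines.

1. Prepare five virtual machines

ceph-deploy: the admin node. All subsequent ceph-deploy commands are run on this node.
ceph-node1, ceph-node2, ceph-node3: each acts as both a mon node and an OSD node and has three disks. The first two disks host two OSDs, and the third disk is split into two equal partitions that serve as the journal partitions for those two OSDs. The cluster therefore runs six OSD daemons and three monitor daemons.
ceph-client: the client, used to mount storage provided by the Ceph cluster for testing.

2. Disk layout

ceph-node1, ceph-node2 and ceph-node3 each have three disks: /dev/sdb and /dev/sdc are data disks, and /dev/sdd is the journal disk.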
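Before going further, it can be worth confirming that each OSD node really exposes the three planned disks. The following is a minimal sketch and not part of the original procedure: missing_disks is a hypothetical helper name, and the device names are the ones from the layout above.

```shell
# Print every path from the argument list that is NOT a block device.
# missing_disks is a hypothetical helper, not a Ceph tool.
missing_disks() {
    for dev in "$@"; do
        [ -b "$dev" ] || echo "$dev"
    done
    return 0
}

# Usage on each of ceph-node1..3 (prints nothing when all three disks exist):
#   missing_disks /dev/sdb /dev/sdc /dev/sdd
```

An empty result means all listed devices are present; any printed path still has to be attached to the virtual machine.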

3. Edit /etc/hosts on every node

    192.168.128.110 ceph-deploy
    192.168.128.111 ceph-node1
    192.168.128.112 ceph-node2
    192.168.128.113 ceph-node3
    192.168.128.114 ceph-client

4. Disable firewalld

    systemctl stop firewalld
    systemctl disable firewalld

5. Disable SELinux

    vi /etc/sysconfig/selinux

    SELINUX=disabled

This setting takes effect after a reboot.

6. Install ntp

    yum install ntp ntpdate ntp-doc
    systemctl enable ntpd
    systemctl start ntpd

Set the system time zone to Shanghai time:

    ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime

Check that the time is correct:

    date

/etc/ntp.conf on the deploy node:

    …
    restrict 192.168.128.0 mask 255.255.255.0 nomodify notrap
    server 127.127.1.0 iburst
    fudge 127.127.1.0 stratum 10
    …

/etc/ntp.conf on the other Ceph nodes:

    …
    server ceph-deploy iburst
    …

    systemctl restart ntpd

On the other nodes, check synchronization:

    watch ntpq -p
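Ceph monitors are sensitive to clock drift, so it helps to read the offset column of ntpq -p rather than just glancing at it. The sketch below is an illustration only: max_ntp_offset_ms and offset_within are hypothetical helpers, the parsing assumes the usual ntpq -p layout with two header lines and the offset (in milliseconds) in column 9, and the 50 ms limit corresponds to the default mon_clock_drift_allowed of 0.05 s.

```shell
# Read `ntpq -p` output on stdin and print the largest absolute peer offset in ms.
max_ntp_offset_ms() {
    awk 'NR > 2 {
             off = $9 + 0
             if (off < 0) off = -off
             if (off > max) max = off
         }
         END { printf "%.3f\n", max + 0 }'
}

# Succeed when the offset $1 (ms) is below the limit $2 (ms, default 50).
offset_within() {
    awk -v off="$1" -v limit="${2:-50}" 'BEGIN { exit !(off + 0 < limit + 0) }'
}

# Usage on a mon node:
#   offset_within "$(ntpq -p | max_ntp_offset_ms)" || echo "clock offset too large"
```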
II. Configure the Ceph repository and install dependencies

Add the Ceph repository on every node and switch the EPEL repository to the Aliyun mirror.

    cd /etc/yum.repos.d
    vi ceph.repo

    [ceph]
    name=Ceph packages
    baseurl=http://mirrors.aliyun.com/ceph/rpm-jewel/el7/x86_64/
    enabled=1
    gpgcheck=1
    priority=2
    type=rpm-md
    gpgkey=http://mirrors.aliyun.com/ceph/keys/release.asc

    [ceph-noarch]
    baseurl=http://mirrors.aliyun.com/ceph/rpm-jewel/el7/noarch/
    enabled=1
    gpgcheck=1
    priority=2
    type=rpm-md
    gpgkey=http://mirrors.aliyun.com/ceph/keys/release.asc

    vi epel.repo

    [epel]
    name=Extra Packages for Enterprise Linux 7 - $basearch
    baseurl=http://mirrors.aliyun.com/epel/7/$basearch
            http://mirrors.aliyuncs.com/epel/7/$basearch
    #mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-7&arch=$basearch
    failovermethod=priority
    enabled=1
    gpgcheck=0
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7

The following packages must be installed on every Ceph node:

    yum install -y yum-utils snappy leveldb gdisk python-argparse gperftools-libs
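Since the same packages are needed on every Ceph node, the install can be driven from one machine. This is a sketch under assumptions: it presumes root SSH access to the nodes (key-based login is only set up in the next section, so until then each connection prompts for a password), and RUN, PKGS and install_deps are names invented for this example. By default it only prints the commands.

```shell
# Dry-run by default: RUN=echo prints each command instead of executing it.
# Call as `RUN= install_deps ...` (empty RUN) to really execute.
RUN=${RUN-echo}

PKGS="yum-utils snappy leveldb gdisk python-argparse gperftools-libs"

# Install the dependency packages on every host given as an argument.
install_deps() {
    for node in "$@"; do
        $RUN ssh "root@$node" "yum install -y $PKGS" </dev/null
    done
}

install_deps ceph-node1 ceph-node2 ceph-node3
```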
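III. Install the ceph-deploy tool

ceph-deploy is the official Ceph deployment tool. It logs in to the other nodes over SSH and runs commands there to carry out the deployment. It must be installed on the ceph-deploy node:

    yum install -y ceph-deploy

We use /data/ceph/deploy on the ceph-deploy node as the deployment directory. The configuration files, keys and logs generated during deployment are all placed there, so every deployment command below should be run from this directory.

    mkdir -p /data/ceph/deploy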
ceph-deploy is run as root here, so it connects to each Ceph node over SSH as root. For convenience, set up passwordless login from the ceph-deploy node to every node. If ceph-deploy logs in as a regular user instead, that user must be able to run sudo without a password.

    ssh-keygen
    ssh-copy-id ceph-node1
    ssh-copy-id ceph-node2
    ssh-copy-id ceph-node3
    ssh-copy-id ceph-client

IV. Install the Ceph cluster

1. Install the Ceph packages

First install the Ceph packages on the three nodes. The Ceph repository was already configured above, so pass --no-adjust-repos to stop ceph-deploy from rewriting it.

    cd /data/ceph/deploy
    ceph-deploy install --no-adjust-repos ceph-node1 ceph-node2 ceph-node3
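The install command above stalls or fails if any node still asks for a password, so a quick connectivity check is useful. A minimal sketch, assuming the OpenSSH client: unreachable_nodes and the SSH override variable are hypothetical names used only here. BatchMode=yes makes ssh fail instead of prompting.

```shell
# Print every host that cannot be reached with key-based (non-interactive) SSH.
# Set SSH to another command name to substitute the client, e.g. for testing.
unreachable_nodes() {
    for node in "$@"; do
        ${SSH:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$node" true \
            </dev/null 2>/dev/null || echo "$node"
    done
    return 0
}

# Usage on the ceph-deploy node (prints nothing when every node is reachable):
#   unreachable_nodes ceph-node1 ceph-node2 ceph-node3 ceph-client
```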

2. Create the Ceph cluster and define the initial mon nodes

    ceph-deploy new ceph-node1 ceph-node2 ceph-node3

This step writes a ceph.conf configuration file and a monitor keyring into the current deployment directory. ceph.conf contains three parameters: fsid, mon_initial_members and mon_host.

Ceph uses the cluster name ceph by default. To create a cluster with a different name, use:

    ceph-deploy --cluster {cluster-name} new {host [host], ...}
Ceph monitors communicate on port 6789 by default, and OSDs use ports in the 6800-7300 range. When running several clusters, make sure their ports do not conflict.

3. Edit the configuration file

Edit ceph.conf in the deployment directory /data/ceph/deploy:

    vi /data/ceph/deploy/ceph.conf

Add the following parameters:

    osd_journal_size = 10000    # 10G
    osd_pool_default_size = 2
    osd_pool_default_pg_num = 512
    osd_pool_default_pgp_num = 512
    rbd_default_features = 3
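The pg_num value of 512 is consistent with a commonly quoted rule of thumb: aim for roughly 100 PGs per OSD, divide by the replica count, and round up to a power of two. The sketch below only reproduces that arithmetic. suggest_pg_count is a hypothetical helper, and the result is a cluster-wide target across all pools rather than a per-pool requirement.

```shell
# Rule-of-thumb PG target: (OSDs * 100) / replicas, rounded up to a power of two.
# $1 = number of OSDs, $2 = replica count (osd_pool_default_size).
suggest_pg_count() (
    target=$(( $1 * 100 / $2 ))
    pg=1
    while [ "$pg" -lt "$target" ]; do
        pg=$(( pg * 2 ))
    done
    echo "$pg"
)

suggest_pg_count 6 2    # this cluster: 6 OSDs, size 2 -> prints 512
```

For six OSDs and two replicas the intermediate target is 300, and the next power of two is 512.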

4. Add the mons

We create three monitors here:

    ceph-deploy mon create ceph-node1 ceph-node2 ceph-node3

On each node (ceph-node1 shown here), the command does the following:

a. Writes the cluster configuration to /etc/ceph/{cluster}.conf
b. Generates /var/lib/ceph/mon/ceph-node1/keyring
c. systemctl enable ceph-mon@ceph-node1
d. systemctl start ceph-mon@ceph-node1

When adding a monitor on a host that was not defined by the ceph-deploy new command, you must add public network to the ceph.conf configuration file.

5. Key management

Gather the authentication keys for the nodes:

    ceph-deploy gatherkeys ceph-node1 ceph-node2 ceph-node3

If needed, the keys in the local directory on the admin host can be removed with:

    ceph-deploy forgetkeys
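The OSD steps that follow rely on the keys collected by gatherkeys, so it is worth checking that they landed in the deployment directory. A sketch only: missing_keyrings is a hypothetical helper, and the two file names assume the default cluster name ceph.

```shell
# Print the expected keyring files that are absent from directory $1 (default: .).
missing_keyrings() (
    dir=${1:-.}
    for f in ceph.client.admin.keyring ceph.bootstrap-osd.keyring; do
        [ -f "$dir/$f" ] || echo "$f"
    done
    exit 0
)

# Usage on the ceph-deploy node (prints nothing when both files are present):
#   missing_keyrings /data/ceph/deploy
```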

6. Create the OSDs

Once the cluster has been created, the Ceph packages installed and the keys gathered, the OSDs can be created.

Prepare the OSDs:

    ceph-deploy osd prepare ceph-node1:sdb:/dev/sdd ceph-node1:sdc:/dev/sdd ceph-node2:sdb:/dev/sdd ceph-node2:sdc:/dev/sdd ceph-node3:sdb:/dev/sdd ceph-node3:sdc:/dev/sdd

Several OSDs can be prepared in one command.

ceph-node1:sdb:/dev/sdd means: create an OSD on node1 that uses disk sdb as its data disk, with the OSD journal partition carved out of disk sdd.

Each node has two OSD disks, sd{b,c}, which share the journal disk /dev/sdd. During prepare, Ceph automatically creates two journal partitions on /dev/sdd for the two OSDs. The size of each journal partition is set by osd_journal_size = 10000 (10G) from the previous step; adjust this value to suit your own disks.

The prepare command only prepares the OSD. On most operating systems the activate phase runs automatically once the disk partitions have been created (through Ceph's udev rules), even without the activate command.

Activate the OSDs:

    ceph-deploy osd activate ceph-node1:sdb1:/dev/sdd1 ceph-node1:sdc1:/dev/sdd2 ceph-node2:sdb1:/dev/sdd1 ceph-node2:sdc1:/dev/sdd2 ceph-node3:sdb1:/dev/sdd1 ceph-node3:sdc1:/dev/sdd2

sdd1 and sdd2 are the two journal partitions that Ceph created automatically.

The activate command brings the OSD into the up and in state, and it takes the same paths as prepare. When one node runs several OSD daemons that share a journal disk, treat the whole node as the smallest CRUSH failure domain, because if that SSD fails, every OSD daemon that journals on it fails too.
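The long host:disk:journal argument list for prepare is easy to mistype, so it can be generated instead. A minimal sketch: osd_specs is a hypothetical helper, and it assumes every node uses the same data disks and the same journal disk, as in this setup.

```shell
# Build "host:disk:journal" arguments for every node/data-disk combination.
# $1 = space-separated nodes, $2 = space-separated data disks, $3 = journal disk.
osd_specs() (
    nodes=$1
    disks=$2
    journal=$3
    specs=""
    for node in $nodes; do
        for disk in $disks; do
            specs="$specs $node:$disk:$journal"
        done
    done
    echo "${specs# }"
)

osd_specs "ceph-node1 ceph-node2 ceph-node3" "sdb sdc" /dev/sdd

# Usage from the deployment directory:
#   ceph-deploy osd prepare $(osd_specs "ceph-node1 ceph-node2 ceph-node3" "sdb sdc" /dev/sdd)
```

The output is a single line with the same six arguments used in the prepare command above.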
7. Verify the installation

Run ceph -s on a mon node:

    [root@ceph-node1 ~]# ceph -s
        cluster 2ba8f88a-1513-445b-93d0-4ec7717c6777
         health HEALTH_WARN
                clock skew detected on mon.ceph-node2, mon.ceph-node3
                too few PGs per OSD (21 < min 30)
                Monitor clock skew detected
         monmap e1: 3 mons at {ceph-node1=192.168.128.111:6789/0,ceph-node2=192.168.128.112:6789/0,ceph-node3=192.168.128.113:6789/0}
                election epoch 6, quorum 0,1,2 ceph-node1,ceph-node2,ceph-node3
         osdmap e30: 6 osds: 6 up, 6 in
                flags sortbitwise,require_jewel_osds
          pgmap v63: 64 pgs, 1 pools, 0 bytes data, 0 objects
                201 MB used, 299 GB / 299 GB avail
                      64 active+clean

The status is HEALTH_WARN with "too few PGs per OSD (21 < min 30)". The following command shows that a single pool, rbd, was created by default with a pg_num of 64:

    [root@ceph-node1 ~]# ceph osd pool ls detail
    pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0

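The 21 in the warning can be reproduced by hand: each PG is stored size times, and those copies are spread across the OSDs. The following is a minimal sketch of that arithmetic, assuming every pool uses the same replica size. pgs_per_osd is a hypothetical helper, and it approximates what the monitor reports rather than reproducing its code.

```shell
# Approximate PG copies per OSD: total_pgs * replica_size / osd_count (integer).
pgs_per_osd() {
    echo $(( $1 * $2 / $3 ))
}

pgs_per_osd 64 2 6     # one pool of 64 PGs, size 2, 6 OSDs -> prints 21
```

With the 448 PGs that the cluster reports once more pools exist, the same formula gives 149.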
Create a few more pools in preparation for the later OpenStack integration; this also brings the PG count up.

    ceph osd pool create cinder-volumes 128
    ceph osd pool set cinder-volumes size 2
    ceph osd pool create nova-vms 128
    ceph osd pool set nova-vms size 2
    ceph osd pool create glance-images 64
    ceph osd pool set glance-images size 2
    ceph osd pool create cinder-backups 64
    ceph osd pool set cinder-backups size 2

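The eight commands above follow one pattern, so they can also be generated from a small table. A sketch, with RUN and create_pools being names made up for this example: it prints the commands by default and only runs them when RUN is set to the empty string.

```shell
# Dry-run by default: RUN=echo prints each command instead of executing it.
# Call as `RUN= create_pools` (empty RUN) to really execute.
RUN=${RUN-echo}

# Create each pool listed as "<name> <pg_num>" and set its replica size to 2.
create_pools() {
    while read -r pool pgs; do
        $RUN ceph osd pool create "$pool" "$pgs" </dev/null
        $RUN ceph osd pool set "$pool" size 2 </dev/null
    done <<EOF
cinder-volumes 128
nova-vms 128
glance-images 64
cinder-backups 64
EOF
}

create_pools
```

Keeping the pool names and PG counts in one place makes it easier to see that the four new pools add 384 PGs to the 64 of the default rbd pool.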
The status is still HEALTH_WARN, now only with "Monitor clock skew detected":

    [root@ceph-node1 ~]# ceph -s
        cluster 2ba8f88a-1513-445b-93d0-4ec7717c6777
         health HEALTH_WARN
                clock skew detected on mon.ceph-node2, mon.ceph-node3
                Monitor clock skew detected
         monmap e1: 3 mons at {ceph-node1=192.168.128.111:6789/0,ceph-node2=192.168.128.112:6789/0,ceph-node3=192.168.128.113:6789/0}
                election epoch 6, quorum 0,1,2 ceph-node1,ceph-node2,ceph-node3
         osdmap e42: 6 osds: 6 up, 6 in
                flags sortbitwise,require_jewel_osds
          pgmap v123: 448 pgs, 5 pools, 0 bytes data, 0 objects
                209 MB used, 299 GB / 299 GB avail
                     448 active+clean

Time synchronization is the problem. On inspection, the ntp service on ceph-node1 was not running:

    [root@ceph-node1 ~]# systemctl status ntpd
    ● ntpd.service - Network Time Service
       Loaded: loaded (/usr/lib/systemd/system/ntpd.service; disabled; vendor preset: disabled)
       Active: inactive (dead)
    [root@ceph-node1 ~]# systemctl start ntpd

The unit is also listed as disabled, so run systemctl enable ntpd as well if it should start again after a reboot.

Checking again, the cluster is now healthy:

    [root@ceph-node1 ~]# ceph -s
        cluster 2ba8f88a-1513-445b-93d0-4ec7717c6777
         health HEALTH_OK
         monmap e1: 3 mons at {ceph-node1=192.168.128.111:6789/0,ceph-node2=192.168.128.112:6789/0,ceph-node3=192.168.128.113:6789/0}
                election epoch 10, quorum 0,1,2 ceph-node1,ceph-node2,ceph-node3
         osdmap e42: 6 osds: 6 up, 6 in
                flags sortbitwise,require_jewel_osds
          pgmap v123: 448 pgs, 5 pools, 0 bytes data, 0 objects
                209 MB used, 299 GB / 299 GB avail
                     448 active+clean

V. Ceph client test

1. Prepare the client node

Run the following commands on the ceph-deploy node:

    ceph-deploy install --no-adjust-repos ceph-client
    ceph-deploy admin ceph-client

2. Create a 1G block device

    rbd create --pool rbd --size 1024 test_image

List and inspect the new block device:

    rbd list
    rbd info test_image

3. Map the block device provided by Ceph on ceph-client

    rbd map --pool rbd test_image

4. List the block devices currently mapped on the system

    [root@ceph-client ~]# rbd showmapped
    id pool image      snap device
    0  rbd  test_image -    /dev/rbd0
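With the image mapped, it can be formatted and mounted like any other block device before it is unmapped again. This goes slightly beyond the original steps and is a sketch under assumptions: the device is /dev/rbd0 as shown above, XFS is an arbitrary choice of filesystem, /mnt/rbd-test is a hypothetical mount point, and RUN and try_rbd_mount are names invented here. It prints the commands by default.

```shell
# Dry-run by default: RUN=echo prints each command instead of executing it.
# Call as `RUN= try_rbd_mount ...` (empty RUN) to really execute, as root.
RUN=${RUN-echo}

# Format the mapped device, mount it, show its usage, then unmount it again.
# $1 = mapped RBD device, $2 = mount point. Stops at the first failing step.
try_rbd_mount() {
    dev=${1:-/dev/rbd0}
    mnt=${2:-/mnt/rbd-test}
    $RUN mkfs.xfs "$dev" &&
    $RUN mkdir -p "$mnt" &&
    $RUN mount "$dev" "$mnt" &&
    $RUN df -h "$mnt" &&
    $RUN umount "$mnt"
}

try_rbd_mount /dev/rbd0 /mnt/rbd-test
```

The final umount matters: the device has to be unmounted before it can be unmapped.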

5. Unmap the block device

    rbd unmap /dev/rbd0

This installation drew on several documents found online; many thanks to their authors! |