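I. Environment preparation

The installation environment is a set of VMware Workstation virtual machines.

1. Prepare five virtual machines

ceph-deploy: the admin node; all later ceph-deploy commands are run from this node.
ceph-node1, ceph-node2, ceph-node3: each acts as both a mon node and an OSD node. Each has three disks: the first two each host one OSD, and the third is split into two equal partitions that serve as the journal partitions for those two OSDs. The cluster therefore runs 6 OSD daemons and 3 monitor daemons.
ceph-client: the client, used to mount storage provided by the Ceph cluster for testing.

2. Disk layout

ceph-node1, ceph-node2 and ceph-node3 each have three disks: /dev/sdb and /dev/sdc are data disks, and /dev/sdd is the journal disk.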

3. Edit /etc/hosts on every node

    192.168.128.110 ceph-deploy
    192.168.128.111 ceph-node1
    192.168.128.112 ceph-node2
    192.168.128.113 ceph-node3
    192.168.128.114 ceph-client

4. Disable firewalld

    systemctl stop firewalld
    systemctl disable firewalld

5. Disable SELinux

    vi /etc/sysconfig/selinux

    SELINUX=disabled

The change takes effect after a reboot.

6. Install NTP

    yum install ntp ntpdate ntp-doc
    systemctl enable ntpd
    systemctl start ntpd

Set the system time zone to Shanghai time:

    ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime

Check that the time is correct:

    date

/etc/ntp.conf on the deploy node:

    …
    restrict 192.168.128.0 mask 255.255.255.0 nomodify notrap
    server 127.127.1.0 iburst
    fudge 127.127.1.0 stratum 10
    …

/etc/ntp.conf on the other Ceph nodes:

    …
    server ceph-deploy iburst
    …

Restart ntpd after editing the file:

    systemctl restart ntpd

Check synchronization on the other nodes:

    watch ntpq -p
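Since the /etc/hosts entries from step 3 have to be added on all five machines, it can help to append them idempotently so that re-running the setup does not create duplicates. The following is a minimal sketch, not part of the original procedure: the function names and the HOSTS_FILE override are hypothetical and exist only so the logic can be tried on a scratch file first.

```shell
# Sketch: append a host entry only if the name is not already present.
# HOSTS_FILE is a hypothetical override for trying this on a scratch file;
# it defaults to the real /etc/hosts.
add_host_entry() {
    ip=$1
    name=$2
    file=${HOSTS_FILE:-/etc/hosts}
    # Look for the name as a whole whitespace-delimited field (not the IP column).
    if awk -v n="$name" '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' "$file"; then
        echo "skip $name (already present)"
    else
        printf '%s %s\n' "$ip" "$name" >> "$file"
        echo "add  $name"
    fi
}

# Addresses and names are the ones listed in step 3.
add_cluster_hosts() {
    add_host_entry 192.168.128.110 ceph-deploy
    add_host_entry 192.168.128.111 ceph-node1
    add_host_entry 192.168.128.112 ceph-node2
    add_host_entry 192.168.128.113 ceph-node3
    add_host_entry 192.168.128.114 ceph-client
}
```

Calling add_cluster_hosts twice leaves the file with exactly one line per host; setting HOSTS_FILE to a temporary file beforehand lets you inspect the result before touching /etc/hosts.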
II. Configure the Ceph repository and install dependencies

Add the Ceph repository on every node, and point the EPEL repository at the Aliyun mirror.

    cd /etc/yum.repos.d
    vi ceph.repo

    [ceph]
    name=Ceph packages
    baseurl=http://mirrors.aliyun.com/ceph/rpm-jewel/el7/x86_64/
    enabled=1
    gpgcheck=1
    priority=2
    type=rpm-md
    gpgkey=http://mirrors.aliyun.com/ceph/keys/release.asc

    [ceph-noarch]
    baseurl=http://mirrors.aliyun.com/ceph/rpm-jewel/el7/noarch/
    enabled=1
    gpgcheck=1
    priority=2
    type=rpm-md
    gpgkey=http://mirrors.aliyun.com/ceph/keys/release.asc
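Because the same ceph.repo has to exist on every node, writing it from a script avoids copy-and-paste mistakes such as several keys ending up on one line. This is a sketch only: REPO_DIR and CEPH_RELEASE are hypothetical variables introduced for illustration, and their defaults reproduce the values used in this guide. Note that priority= only has an effect when the yum priorities plugin is installed.

```shell
# Sketch: write ceph.repo with one key per line.
# REPO_DIR and CEPH_RELEASE are hypothetical knobs, not ceph or yum settings.
write_ceph_repo() {
    repo_dir=${REPO_DIR:-/etc/yum.repos.d}
    release=${CEPH_RELEASE:-jewel}
    base=http://mirrors.aliyun.com/ceph
    cat > "$repo_dir/ceph.repo" <<EOF
[ceph]
name=Ceph packages
baseurl=$base/rpm-$release/el7/x86_64/
enabled=1
gpgcheck=1
priority=2
type=rpm-md
gpgkey=$base/keys/release.asc

[ceph-noarch]
baseurl=$base/rpm-$release/el7/noarch/
enabled=1
gpgcheck=1
priority=2
type=rpm-md
gpgkey=$base/keys/release.asc
EOF
}
```

Running write_ceph_repo with REPO_DIR set to a scratch directory is a safe way to review the generated file before installing it under /etc/yum.repos.d.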
    vi epel.repo

    [epel]
    name=Extra Packages for Enterprise Linux 7 - $basearch
    baseurl=http://mirrors.aliyun.com/epel/7/$basearch
            http://mirrors.aliyuncs.com/epel/7/$basearch
    #mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-7&arch=$basearch
    failovermethod=priority
    enabled=1
    gpgcheck=0
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7

The following packages must be installed on every Ceph node:

    yum install -y yum-utils snappy leveldb gdisk python-argparse gperftools-libs
III. Install the ceph-deploy tool

ceph-deploy is the official Ceph deployment tool. It logs in to the other nodes over SSH and runs commands there to carry out the deployment. Install it on the ceph-deploy node:

    yum install -y ceph-deploy

We use /data/ceph/deploy on the ceph-deploy node as the deployment directory. The configuration files, keys and logs generated during deployment are all written there, so every deployment command below should be run from this directory.

    mkdir -p /data/ceph/deploy

ceph-deploy connects to each Ceph node over SSH as the user that runs it, which is root in this setup. For convenience, set up passwordless login from the ceph-deploy node to every other node. If ceph-deploy logs in as a regular user instead, that user must be able to use sudo without a password.

    ssh-keygen
    ssh-copy-id ceph-node1
    ssh-copy-id ceph-node2
    ssh-copy-id ceph-node3
    ssh-copy-id ceph-client

IV. Install the Ceph cluster

1. Install the Ceph packages

First install the Ceph packages on the three nodes. The Ceph repository was already configured above, so pass --no-adjust-repos to stop ceph-deploy from changing the repository settings.

    cd /data/ceph/deploy
    ceph-deploy install --no-adjust-repos ceph-node1 ceph-node2 ceph-node3
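Every ceph-deploy command depends on the passwordless SSH login configured in section III, so a quick pre-flight check can save time when a later step hangs on a password prompt. The sketch below is an assumption about how such a check might look; check_ssh is a hypothetical helper, and it relies only on the standard OpenSSH options BatchMode and ConnectTimeout.

```shell
# Sketch: verify that each host accepts key-based login without prompting.
# BatchMode=yes makes ssh fail instead of asking for a password.
check_ssh() {
    failed=0
    for host in "$@"; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "ok   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    # Non-zero exit status if any host could not be reached.
    return "$failed"
}
```

A typical call from the ceph-deploy node would be check_ssh ceph-node1 ceph-node2 ceph-node3 ceph-client; any host reported as FAIL needs ssh-copy-id to be repeated.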
2. Create the Ceph cluster and define the new mon nodes

    ceph-deploy new ceph-node1 ceph-node2 ceph-node3

This step writes a ceph.conf configuration file and a monitor keyring into the current deployment directory. ceph.conf contains three parameters: fsid, mon_initial_members and mon_host.

Ceph uses the cluster name "ceph" by default. To create a cluster with a different name, use:

    ceph-deploy --cluster {cluster-name} new {host [host], ...}

Ceph monitors communicate on port 6789 by default, and OSDs use ports in the range 6800:7300. When running several clusters, make sure their ports do not conflict.

3. Edit the configuration file

Edit ceph.conf in the deployment directory /data/ceph/deploy:

    vi /data/ceph/deploy/ceph.conf

Add the following parameters:

    osd_journal_size = 10000    # in MB, about 10 GB
    osd_pool_default_size = 2
    osd_pool_default_pg_num = 512
    osd_pool_default_pgp_num = 512
    rbd_default_features = 3
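The osd_journal_size value above is given in megabytes. As a rough guide for choosing it, the Ceph FileStore documentation suggests a journal of about twice the expected throughput multiplied by filestore max sync interval (5 seconds by default). The sketch below applies that rule and also computes how much space a shared journal disk needs; the function names and the 100 MB/s throughput figure are assumptions for illustration.

```shell
# Rule of thumb from the Ceph FileStore documentation:
#   osd journal size (MB) = 2 * expected throughput (MB/s) * filestore max sync interval (s)
journal_size_mb() {
    throughput_mb_s=$1
    sync_interval_s=${2:-5}   # 5 s is the default filestore max sync interval
    echo $(( 2 * throughput_mb_s * sync_interval_s ))
}

# Space needed on a shared journal disk: one journal partition per OSD.
journal_disk_needed_mb() {
    journal_mb=$1
    osds_per_disk=$2
    echo $(( journal_mb * osds_per_disk ))
}

journal_size_mb 100               # assumed disk throughput of 100 MB/s
journal_disk_needed_mb 10000 2    # this guide: two 10000 MB journals on /dev/sdd
```

With the values used in this guide, /dev/sdd must be at least 20000 MB for both journal partitions to fit.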
4. Add the monitors

We create three monitors here:

    ceph-deploy mon create ceph-node1 ceph-node2 ceph-node3

On each node (ceph-node1 is shown), the command does the following:

a. Writes the cluster configuration to /etc/ceph/{cluster}.conf
b. Generates /var/lib/ceph/mon/{cluster}-ceph-node1/keyring
c. systemctl enable ceph-mon@ceph-node1
d. systemctl start ceph-mon@ceph-node1

When adding a monitor on a host that was not defined by the ceph-deploy new command, you must add a public network setting to ceph.conf.

5. Key management

Gather the authentication keys for the nodes:

    ceph-deploy gatherkeys ceph-node1 ceph-node2 ceph-node3

If needed, the keys in the local directory on the admin host can be removed with:

    ceph-deploy forgetkeys
6. Create the OSDs

Once the cluster has been created, the Ceph packages installed and the keys gathered, the OSDs can be created.

Prepare the OSDs:

    ceph-deploy osd prepare ceph-node1:sdb:/dev/sdd ceph-node1:sdc:/dev/sdd ceph-node2:sdb:/dev/sdd ceph-node2:sdc:/dev/sdd ceph-node3:sdb:/dev/sdd ceph-node3:sdc:/dev/sdd

Several OSDs can be prepared in one command. ceph-node1:sdb:/dev/sdd means: create an OSD on ceph-node1, use disk sdb as the data disk, and carve the OSD journal partition out of disk sdd.

Each node has two OSD disks, sd{b,c}, that share the journal disk /dev/sdd. During prepare, Ceph automatically creates two journal partitions on /dev/sdd, one for each OSD. The partition size comes from the osd_journal_size = 10000 (10 GB) setting in the earlier step; adjust that value to suit your disks.

The prepare command only prepares the OSD. On most operating systems the activate phase runs automatically once the partitions have been created (through Ceph's udev rules), even without an explicit activate command.

Activate the OSDs:

    ceph-deploy osd activate ceph-node1:sdb1:/dev/sdd1 ceph-node1:sdc1:/dev/sdd2 ceph-node2:sdb1:/dev/sdd1 ceph-node2:sdc1:/dev/sdd2 ceph-node3:sdb1:/dev/sdd1 ceph-node3:sdc1:/dev/sdd2

sdd1 and sdd2 are the two journal partitions that Ceph created automatically.

The activate command brings the OSD up and in, and it uses the partition paths created by prepare. When a node runs several OSD daemons whose journals share one disk, treat the whole node as the minimum CRUSH failure domain: if that SSD fails, every OSD daemon that journals to it fails as well.
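The six HOST:DATA:JOURNAL arguments passed to ceph-deploy osd prepare in step 6 follow a regular pattern, so they can be generated rather than typed by hand. The helper below is a hypothetical sketch that assumes the disk layout of this guide (data disks sdb and sdc on every node, one shared journal disk).

```shell
# Sketch: build the HOST:DATA:JOURNAL arguments for "ceph-deploy osd prepare".
# Usage: osd_prepare_args JOURNAL_DISK HOST [HOST ...]
osd_prepare_args() {
    journal=$1
    shift
    args=""
    for host in "$@"; do
        # Data disks are fixed to sdb and sdc, matching the disk layout above.
        for data in sdb sdc; do
            args="$args $host:$data:$journal"
        done
    done
    # Drop the leading space before printing.
    echo "${args# }"
}

osd_prepare_args /dev/sdd ceph-node1 ceph-node2 ceph-node3
```

The output can be passed on unquoted, for example ceph-deploy osd prepare $(osd_prepare_args /dev/sdd ceph-node1 ceph-node2 ceph-node3), so that the shell splits it back into six arguments.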
7. Verify the installation

Run ceph -s on a mon node:

    [root@ceph-node1 ~]# ceph -s
        cluster 2ba8f88a-1513-445b-93d0-4ec7717c6777
         health HEALTH_WARN
                clock skew detected on mon.ceph-node2, mon.ceph-node3
                too few PGs per OSD (21 < min 30)
                Monitor clock skew detected
         monmap e1: 3 mons at {ceph-node1=192.168.128.111:6789/0,ceph-node2=192.168.128.112:6789/0,ceph-node3=192.168.128.113:6789/0}
                election epoch 6, quorum 0,1,2 ceph-node1,ceph-node2,ceph-node3
         osdmap e30: 6 osds: 6 up, 6 in
                flags sortbitwise,require_jewel_osds
          pgmap v63: 64 pgs, 1 pools, 0 bytes data, 0 objects
                201 MB used, 299 GB / 299 GB avail
                      64 active+clean
The status shows HEALTH_WARN too few PGs per OSD (21 < min 30). The following command shows that a single pool, rbd, was created by default with pg_num 64:

    [root@ceph-node1 ~]# ceph osd pool ls detail
    pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
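The figure 21 in the warning can be reproduced by hand: each of the 64 PGs is stored twice (size 2), and the resulting PG copies are spread over 6 OSDs. The monitor warns when this number falls below its mon_pg_warn_min_per_osd threshold, which is 30 here. A small sketch of the arithmetic, with a hypothetical function name:

```shell
# PG copies per OSD for one pool = pg_num * replica size / number of OSDs
# (integer division, as in the health message).
pgs_per_osd() {
    pg_num=$1
    size=$2
    osds=$3
    echo $(( pg_num * size / osds ))
}

pgs_per_osd 64 2 6    # prints 21, below the minimum of 30
```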
Create a few more pools in preparation for integrating with OpenStack later; this also raises the PG count:

    ceph osd pool create cinder-volumes 128
    ceph osd pool set cinder-volumes size 2
    ceph osd pool create nova-vms 128
    ceph osd pool set nova-vms size 2
    ceph osd pool create glance-images 64
    ceph osd pool set glance-images size 2
    ceph osd pool create cinder-backups 64
    ceph osd pool set cinder-backups size 2
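To judge whether the pg_num values chosen for these pools are reasonable, the Ceph documentation of this era offers a rule of thumb: aim for a cluster-wide total of about OSDs × 100 / replica count, rounded up to the next power of two. The sketch below sums the pools of this guide and compares the total against that target; the function names are hypothetical.

```shell
# Pool name and pg_num pairs: the default rbd pool plus the four created above.
pools="rbd:64 cinder-volumes:128 nova-vms:128 glance-images:64 cinder-backups:64"

# Sum the pg_num part of each NAME:PG_NUM entry.
total_pgs() {
    sum=0
    for entry in $1; do
        sum=$(( sum + ${entry#*:} ))
    done
    echo "$sum"
}

# Rule-of-thumb target: OSDs * 100 / replica count, rounded up to a power of two.
target_total_pgs() {
    osds=$1
    size=$2
    raw=$(( osds * 100 / size ))
    pow=1
    while [ "$pow" -lt "$raw" ]; do
        pow=$(( pow * 2 ))
    done
    echo "$pow"
}

echo "current total: $(total_pgs "$pools")"
echo "rule-of-thumb target: $(target_total_pgs 6 2)"
```

For 6 OSDs and 2 replicas the target works out to 512, so the 448 PGs created here sit just below it and well above the warning threshold.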
The status still shows HEALTH_WARN Monitor clock skew detected:

    [root@ceph-node1 ~]# ceph -s
        cluster 2ba8f88a-1513-445b-93d0-4ec7717c6777
         health HEALTH_WARN
                clock skew detected on mon.ceph-node2, mon.ceph-node3
                Monitor clock skew detected
         monmap e1: 3 mons at {ceph-node1=192.168.128.111:6789/0,ceph-node2=192.168.128.112:6789/0,ceph-node3=192.168.128.113:6789/0}
                election epoch 6, quorum 0,1,2 ceph-node1,ceph-node2,ceph-node3
         osdmap e42: 6 osds: 6 up, 6 in
                flags sortbitwise,require_jewel_osds
          pgmap v123: 448 pgs, 5 pools, 0 bytes data, 0 objects
                209 MB used, 299 GB / 299 GB avail
                     448 active+clean
Time synchronization is the problem. On inspection, the ntp service on ceph-node1 had not been started:

    [root@ceph-node1 ~]# systemctl status ntpd
    ● ntpd.service - Network Time Service
       Loaded: loaded (/usr/lib/systemd/system/ntpd.service; disabled; vendor preset: disabled)
       Active: inactive (dead)
    [root@ceph-node1 ~]# systemctl start ntpd

The unit is also reported as disabled, so run systemctl enable ntpd on this node as well; otherwise ntpd will not start after the next reboot.
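Ceph monitors raise the clock skew warning when their clocks differ by more than mon clock drift allowed, which defaults to 0.05 seconds. The offset column of ntpq -p is reported in milliseconds, so a small helper can tell whether a given offset would trip the warning. This is a sketch under those assumptions; skew_exceeds is a hypothetical name and the sample offset of -72.5 ms is made up for illustration.

```shell
# Succeeds (exit status 0) when the absolute offset exceeds the allowed drift.
#   offset_ms: clock offset in milliseconds (for example from ntpq -p)
#   allowed_s: allowed drift in seconds; 0.05 is the Ceph default
skew_exceeds() {
    offset_ms=$1
    allowed_s=${2:-0.05}
    awk -v o="$offset_ms" -v a="$allowed_s" 'BEGIN {
        if (o < 0) o = -o          # compare the absolute value
        exit !(o > a * 1000)       # seconds to milliseconds
    }'
}

# Hypothetical offset of -72.5 ms, which is more than the 50 ms allowed.
if skew_exceeds -72.5; then
    echo "skew too large"
else
    echo "within limits"
fi
```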
Checking again, the cluster is now healthy:

    [root@ceph-node1 ~]# ceph -s
        cluster 2ba8f88a-1513-445b-93d0-4ec7717c6777
         health HEALTH_OK
         monmap e1: 3 mons at {ceph-node1=192.168.128.111:6789/0,ceph-node2=192.168.128.112:6789/0,ceph-node3=192.168.128.113:6789/0}
                election epoch 10, quorum 0,1,2 ceph-node1,ceph-node2,ceph-node3
         osdmap e42: 6 osds: 6 up, 6 in
                flags sortbitwise,require_jewel_osds
          pgmap v123: 448 pgs, 5 pools, 0 bytes data, 0 objects
                209 MB used, 299 GB / 299 GB avail
                     448 active+clean

V. Ceph client test

1. Prepare the client node

Run the following commands from the ceph-deploy node:

    ceph-deploy install --no-adjust-repos ceph-client
    ceph-deploy admin ceph-client
2. Create a 1 GB block device

    rbd create --pool rbd --size 1024 test_image

List and inspect the block device:

    rbd list
    rbd info test_image

3. Map the block device provided by Ceph on ceph-client

    rbd map --pool rbd test_image

4. List the block devices mapped on the system

    [root@ceph-client ~]# rbd showmapped
    id pool image      snap device
    0  rbd  test_image -    /dev/rbd0
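While the image is mapped, it can be exercised like any other disk, for example by creating a filesystem on it and writing a test file. The sketch below is an assumption about how such a smoke test might look and is not part of the original walkthrough: the DRY_RUN switch, the mount point /mnt/rbd-test and the choice of XFS are all illustrative. Because mkfs destroys any data on the device, the commands are only printed unless DRY_RUN=0 is set.

```shell
# Sketch: print each command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Format, mount, write 100 MB, and unmount again.
smoke_test_rbd() {
    dev=$1
    mnt=${2:-/mnt/rbd-test}   # hypothetical mount point
    run mkfs.xfs "$dev"
    run mkdir -p "$mnt"
    run mount "$dev" "$mnt"
    run dd if=/dev/zero of="$mnt/testfile" bs=1M count=100
    run umount "$mnt"
}

smoke_test_rbd /dev/rbd0
```

Review the printed commands first; running DRY_RUN=0 smoke_test_rbd /dev/rbd0 on ceph-client then performs them for real.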
5. Unmap the block device

    rbd unmap /dev/rbd0

This installation and configuration drew on several documents found online; many thanks to their authors!