Integrating Ironic with native Neutron

Deployment and configuration:
- Ironic has its own dhcp-server, which is used during the inspect phase.
- neutron-dhcp is used during the provision phase.
- The tftp servers used in the inspect and provision phases may differ.

Register phase
The user enrolls the node in Ironic, including its IPMI details and other information.

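Inspect phase
This phase uses the Inspect Network. Requirements:
- The Ironic dhcp-server can receive DHCP requests from the BM (bare-metal) node.
- Once the BM node has an IP, it can reach tftp-server-1 (layer-3 reachability).

The user obtains the BM node's information:
- Ironic sets the BM node to PXE boot via IPMI.
- Ironic powers on the BM node via IPMI, and the node PXE boots.
- The BM node gets an IP from the Ironic dhcp-server. At this point its request packets carry no VLAN tag and use the native VLAN of the uplink access switch (default tag=1).
- With an IP assigned, the BM node downloads a small image from tftp-server-1 (the ramdisk, which contains the Ironic Python Agent).
- The agent runs a set of operations to collect detailed information about the BM node.
- The BM node is powered off. The ramdisk runs in memory, so it is lost on power-off.
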
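Provision phase
This phase uses the Provisioning Network (created by Neutron). Requirements:
- The BM node, glance-api, ironic-api, ironic-conductor and neutron-dhcp-agent must all have connectivity on the provisioning network.

The user requests a physical machine, installs the operating system, configures the service NICs, and so on:
- The request enters through nova.
- Ironic powers on the BM node via IPMI, and the node PXE boots.
- At this point the BM node must get its IP from the neutron-dhcp-server (over the native VLAN). Because the Ironic dhcp-server also accepts requests arriving on the native VLAN, the DHCP request must be guaranteed to be handled by the Neutron dhcp-server.
- With an IP assigned, the BM node downloads the small image (the ramdisk, which contains the Ironic Python Agent) from tftp-server-2, which may differ from the tftp server used during inspect.
- (How is this step controlled?) The image requested by the user is downloaded from glance and installed. The assigned IP must be able to reach glance-api.
- After installation, cloud-init applies the corresponding VLAN tag inside the BM operating system. That VLAN tag must already be configured on the access switch.

Key issues:
- Both the Ironic dhcp-server and the Neutron dhcp-server accept DHCP requests arriving on the native VLAN. If one BM node is in Inspect while another is in Provision, the two may conflict. Possible approaches:
  - Merge the two DHCP servers. However, the Neutron dhcp-server works from a whitelist, and during the Inspect phase the dhcp-server does not yet know anything about the BM node, so the whitelist cannot be configured.
  - Strictly separate the Inspect and Provision phases. During data-center initialization, start the Ironic dhcp-server and shut it down once Inspect is finished; or have EPC enforce that Provision operations are disabled while Inspect is running.
  * The ZTE solution for the tier-1 private cloud merged the two DHCP servers and runs them on the ToR switch.
- The BM node's tenant VLAN must be configured on the access switch in advance. If that is not possible, the switch has to be configured dynamically.
- The neutron-dhcp-agent needs to be on the service network.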
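
The "strictly separate Inspect and Provision" approach can be sketched as a small helper that decides what to do with the Ironic-side DHCP server for a given phase. This is only an illustration: it assumes the Ironic DHCP server is the `dhcpd.service` unit that appears in the logs in this post, and the function names and phase names are hypothetical. It prints the command instead of running it.

```shell
#!/bin/sh
# Sketch only: map a deployment phase to the action to take on the
# Ironic-side dhcpd. Phase names and function names are hypothetical.

dhcp_action_for_phase() {
    case "$1" in
        inspect)   echo "start" ;;  # BM nodes must reach the Ironic dhcp-server
        provision) echo "stop" ;;   # only neutron-dhcp may answer on the native VLAN
        *)         echo "unknown phase: $1" >&2; return 1 ;;
    esac
}

# Print (rather than execute) the systemctl command for the phase.
dhcp_command_for_phase() {
    action=$(dhcp_action_for_phase "$1") || return 1
    echo "systemctl ${action} dhcpd.service"
}
```

For example, `dhcp_command_for_phase provision` prints `systemctl stop dhcpd.service`. Printing keeps the sketch safe to try; a real wrapper would also have to make sure no node is still in the other phase before switching.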

Suzhou Ironic environment
10.142.24.12 root/@IDC_host4321

Zhejiang Ironic test environment

Ironic DHCP
[root@csv-yglcs17 ~]# cat /etc/dhcp/dhcpd.conf
option domain-name "test.com";
option domain-name-servers 8.8.8.8, 61.88.88.88;
default-lease-time 60000;
max-lease-time 720000;
subnet 20.26.34.0 netmask 255.255.255.0 {
    range 20.26.34.10 20.26.34.100;    # <== DHCP range
    option routers 20.26.34.1;
    next-server 20.26.33.26;           # <== tftp server
    filename "pxelinux.0";
}
# The conductor node only has an IP in the 20.26.33.0 network. Without this
# subnet declaration, dhcpd fails at startup with the error shown below.
subnet 20.26.33.0 netmask 255.255.255.0 {
}

Problem:
Apr 19 14:30:21 csv-yglcs17 systemd: Starting DHCPv4 Server Daemon...
Apr 19 14:30:21 csv-yglcs17 dhcpd: Internet Systems Consortium DHCP Server 4.2.5
Apr 19 14:30:21 csv-yglcs17 dhcpd: Copyright 2004-2013 Internet Systems Consortium.
Apr 19 14:30:21 csv-yglcs17 dhcpd: All rights reserved.
Apr 19 14:30:21 csv-yglcs17 dhcpd: For info, please visit https://www.isc.org/software/dhcp/
Apr 19 14:30:21 csv-yglcs17 dhcpd: Not searching LDAP since ldap-server, ldap-port and ldap-base-dn were not specified in the config file
Apr 19 14:30:21 csv-yglcs17 dhcpd: Wrote 15 leases to leases file.
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: No subnet declaration for eno33557248 (no IPv4 addresses).
Apr 19 14:30:21 csv-yglcs17 dhcpd: ** Ignoring requests on eno33557248. If this is not what
Apr 19 14:30:21 csv-yglcs17 dhcpd: you want, please write a subnet declaration
Apr 19 14:30:21 csv-yglcs17 dhcpd: in your dhcpd.conf file for the network segment
Apr 19 14:30:21 csv-yglcs17 dhcpd: to which interface eno33557248 is attached. **
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: No subnet declaration for virbr0 (192.168.122.1).
Apr 19 14:30:21 csv-yglcs17 dhcpd: ** Ignoring requests on virbr0. If this is not what
Apr 19 14:30:21 csv-yglcs17 dhcpd: you want, please write a subnet declaration
Apr 19 14:30:21 csv-yglcs17 dhcpd: in your dhcpd.conf file for the network segment
Apr 19 14:30:21 csv-yglcs17 dhcpd: to which interface virbr0 is attached. **
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: No subnet declaration for eno16777984 (20.26.33.26).
Apr 19 14:30:21 csv-yglcs17 dhcpd: ** Ignoring requests on eno16777984. If this is not what
Apr 19 14:30:21 csv-yglcs17 dhcpd: you want, please write a subnet declaration
Apr 19 14:30:21 csv-yglcs17 dhcpd: in your dhcpd.conf file for the network segment
Apr 19 14:30:21 csv-yglcs17 dhcpd: to which interface eno16777984 is attached. **
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: Not configured to listen on any interfaces!
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: This version of ISC DHCP is based on the release available
Apr 19 14:30:21 csv-yglcs17 dhcpd: on ftp.isc.org. Features have been added and other changes
Apr 19 14:30:21 csv-yglcs17 dhcpd: have been made to the base software release in order to make
Apr 19 14:30:21 csv-yglcs17 dhcpd: it work better with this distribution.
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: Please report for this software via the CentOS Bugs Database:
Apr 19 14:30:21 csv-yglcs17 dhcpd: http://bugs.centos.org/
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: exiting.
Apr 19 14:30:21 csv-yglcs17 systemd: dhcpd.service: main process exited, code=exited, status=1/FAILURE
Apr 19 14:30:21 csv-yglcs17 systemd: Failed to start DHCPv4 Server Daemon.
Apr 19 14:30:21 csv-yglcs17 systemd: Unit dhcpd.service entered failed state.
Apr 19 14:30:21 csv-yglcs17 systemd: dhcpd.service failed.
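
The log shows dhcpd exiting because none of the host's interfaces (here eno16777984 with 20.26.33.26) matched a subnet declaration. A quick pre-flight check before restarting the service might look like the sketch below. The helper name `has_subnet_decl` is hypothetical, and it only recognizes declarations written on one line as `subnet NETWORK netmask NETMASK {`.

```shell
#!/bin/sh
# has_subnet_decl CONF NETWORK NETMASK
# Succeeds if CONF has a line starting with "subnet NETWORK netmask NETMASK".
# Hypothetical helper; only handles the single-line form used in this post.
has_subnet_decl() {
    awk -v net="$2" -v mask="$3" '
        $1 == "subnet" && $2 == net && $3 == "netmask" && $4 == mask { found = 1 }
        END { exit !found }
    ' "$1"
}
```

With it, `has_subnet_decl /etc/dhcp/dhcpd.conf 20.26.33.0 255.255.255.0 || echo "missing subnet for conductor network"` would flag the situation in the log before dhcpd is started.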

Ironic Inspector
[root@csv-yglcs17 pxelinux.cfg]# pwd
/tftpboot/pxelinux.cfg
[root@csv-yglcs17 pxelinux.cfg]# cat default
default introspect
label introspect
kernel /tftpboot/ironic-inspector/inspector-kernel
append initrd=/tftpboot/ironic-inspector/inspector-ramdisk ipa-inspection-callback-url=http://20.26.33.26:5050/v1/continue systemd.journald.forward_to_console=yes ipa-collect-lldp=True
ipappend 3

The inspector runs on 20.26.33.26.

Ironic Provisioning
provisioning_network in ironic.conf has not been configured yet, and neither has cleaning_network.
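
To see quickly whether those two options are set, a small INI lookup can help. The sketch below is an assumption-laden illustration: the helper name `ini_get` is hypothetical, and it assumes both options live in the `[neutron]` section of ironic.conf, which should be checked against the Ironic release in use.

```shell
#!/bin/sh
# ini_get FILE SECTION KEY
# Prints the value of KEY inside [SECTION]; fails if the key is unset or
# commented out. Hypothetical helper, not part of Ironic.
ini_get() {
    awk -v section="[$2]" -v key="$3" '
        /^[ \t]*\[/ { gsub(/[ \t]/, "", $0); current = $0; next }
        current == section && index($0, "=") > 0 {
            k = substr($0, 1, index($0, "=") - 1)
            gsub(/[ \t]/, "", k)
            if (k == key) {
                v = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", v)
                print v
                found = 1
                exit
            }
        }
        END { exit !found }
    ' "$1"
}
```

Usage would be along the lines of `ini_get /etc/ironic/ironic.conf neutron provisioning_network || echo "provisioning_network is not set"`, and the same for cleaning_network.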

Check IPMI
[root@csv-yglcs17 ~]# ipmitool -P Huawei12#$ -H 50.1.65.245 -U root -p 623 -I lanplus power status
Chassis Power is on
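
Passing the password with `-P` leaves it in the shell history and the process list. ipmitool can instead read it from the `IPMI_PASSWORD` environment variable when `-E` is given. A minimal wrapper sketch follows; the function name `ipmi` and the `IPMITOOL` and `IPMI_PORT` override variables are hypothetical conveniences, not part of ipmitool.

```shell
#!/bin/sh
# ipmi HOST USER ARGS...
# Runs ipmitool over lanplus and takes the password from the IPMI_PASSWORD
# environment variable (-E) rather than from -P on the command line.
# IPMITOOL and IPMI_PORT are hypothetical overrides for this sketch, e.g. to
# point IPMITOOL at a stub for a dry run.
ipmi() {
    host=$1
    user=$2
    shift 2
    "${IPMITOOL:-ipmitool}" -I lanplus -H "$host" -p "${IPMI_PORT:-623}" -U "$user" -E "$@"
}
```

After `export IPMI_PASSWORD=...`, the check above becomes `ipmi 50.1.65.245 root power status`, and the same wrapper works for the `chassis bootparam get 5` and `chassis bootdev disk` calls used later in this post.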

== Operations ==

ironic node-create --chassis_uuid dbb588b3-75e8-4028-b851-110671e05e58 \
    --driver agent_ipmitool \
    --name pc-zjnacthd01 \
    -i ipmi_address=50.1.65.245 \
    -i ipmi_username=root \
    -i ipmi_password=Huawei12#$ \
    -i ipmi_port=623 \
    -i deploy_kernel=4c1855e5-9b6b-47e2-89e5-3bc351c2ae2e \
    -i deploy_ramdisk=2f603c85-de92-44ea-b4d0-1396b91102cc

Update 5/25: An Ironic AZ feature is under development. The AZ attribute is added to the node via node-update and synced to the nova database, so that nova boot only needs to specify the AZ to create a machine.

Update 5/12:

After inspect succeeds:

Inspect failed; for the cause, see "Problem 2".

Configure provisioning_network:

After Inspect succeeds:

Upload the image used by Ironic:
glance image-create --name CentOS-7-64bit-ironic.qcow2 --disk-format qcow2 --container-format bare --file CentOS-7-64bit-ironic.qcow2 --is-public True --human-readable --progress
glance image-update 40928b81-9be1-402a-8684-4e2d2fcf330f --property hypervisor_type=baremetal

nova boot --flavor 2 --image 40928b81-9be1-402a-8684-4e2d2fcf330f --nic net-id=3a151049-ff3f-4bc5-88a1-b9084ec24bc9 pc-zjnacthd01

== Problems ==
- Is there a restriction on node names?

- The first Inspect failed
  2017-04-20 15:29:16.409 28596 ERROR ironic_inspector.main   File "/usr/lib/python2.7/site-packages/keystoneauth1/access/service_catalog.py", line 228, in url_for
  2017-04-20 15:29:16.409 28596 ERROR ironic_inspector.main     raise exceptions.EndpointNotFound(msg)
  2017-04-20 15:29:16.409 28596 ERROR ironic_inspector.main EndpointNotFound: public endpoint for baremetal service in RegionFour region not found

  Resolved by restarting the ironic service.

- The second inspect failed: the BM node could not get an IP
  The DHCP request had already reached the dhcp server:

- cleaning_network not found during inspect
  Fix: configure cleaning_network (= provide_network).

- nova boot failed, conductor.log:

  Resolved after updating the nova code on the controller node, the ironic code on the ironic node, and the ironicclient code on the compute node.

- nova boot failed, compute.log:

  The cause is that this ironic node's driver_info had not been updated yet:

  Update it:
  ironic node-update baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 add driver_info/deploy_kernel=f8205536-070b-4286-8d0c-35e3b8647741
  ironic node-update baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 add driver_info/deploy_ramdisk=302e6438-4d31-429b-8bae-47e225d4ed67
  Update 05/12:
  ironic node-update baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 add driver_info/deploy_kernel=4c1855e5-9b6b-47e2-89e5-3bc351c2ae2e
  ironic node-update baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 add driver_info/deploy_ramdisk=2f603c85-de92-44ea-b4d0-1396b91102cc

- nova boot failed, image not found, compute.log:

  The glance-api setting in nova.conf on the compute node was wrong:

  Also set the glance API version to 1 in ironic.conf on the ironic-conductor node:

  glance_api_version=1

- nova boot failed, ironic-conductor.log:

  Verified from the command line that a port can be created on the provisioning network d5a284c3-41d3-4eb3-a11f-58a99d3e2eb1.

  The cause is that LLDP was not enabled. After enabling it, delete the existing ports and port groups:
  ironic port-list | awk '{print $2}' | egrep '[0-9]+' | xargs -I 'X' ironic port-delete 'X'
  ironic portgroup-list | awk '{print $2}' | egrep '[0-9]+' | xargs -I 'X' ironic portgroup-delete 'X'
  Then re-run Inspect:
  ironic node-set-provision-state baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 manage
  ironic node-set-provision-state baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 inspect

- nova boot failed, user image not found
  The cause is that the database setting in glance-registry.conf was wrong.

- nova boot failed, ramdisk not found

  This image UUID is the one configured in the ironic node's driver_info; the image has to be uploaded to glance.
  Upload the images:

  Update the Ironic node information:

- nova boot failed, insufficient permissions to access tftp

  chown -R ironic:ironic /tftpboot/

- nova boot failed, the physical machine's DHCP request was captured by ironic-dhcp
  Fix: stop ironic-dhcp.

- nova boot failed, the physical machine could not get an IP from neutron DHCP
  On the controller node, neutron DHCP runs as a dnsmasq process inside a network namespace. The DHCP relay's destination address is the controller node's management-network IP (eno16777984), whereas dnsmasq listens on the namespace's tap interface, whose IP is 20.26.34.91, so dnsmasq never receives the DHCP requests.
  Current workaround: manually start a dnsmasq on the controller node with the same configuration as the neutron dhcp.

- After getting an IP the node boots into the ramdisk system, but after the reboot it does not boot into the user image's operating system
  Checking the boot device order shows: Boot Device Selector : No override
  ironic-conductor.log shows that 20.26.34.70:9999 cannot be reached. This is the IPA's address and listening port. The ironic-conductor node must be able to connect to it, but the connection does fail.

  Yao Jun suggested that after the ramdisk boots, two network interfaces may have obtained IP addresses, which confuses routing, and recommended deleting the second address once the ramdisk is up.

  05/04 update: add a static route on the provisioning network, with destination = the controller node's subnet and nexthop = the provisioning network gateway.
  05/11 update: neutron subnet-update aca03dd8-3d2a-4c54-99de-7a8a7bac4f53 --host-route destination=20.26.33.0/24,nexthop=20.26.34.1
  Updated subnet: aca03dd8-3d2a-4c54-99de-7a8a7bac4f53

  Verified to work: the port is reachable and the user image can be downloaded. ==> Why do multiple NICs obtain an IP, and how can this be solved at the code level?

  Query the boot order via IPMI: ipmitool -P Huawei12#$ -H 50.1.65.245 -U root -p 623 -I lanplus chassis bootparam get 5
  Set boot from disk: ipmitool -P Huawei12#$ -H 50.1.65.245 -U root -p 623 -I lanplus chassis bootdev disk

- The user image was written to /dev/sdl instead of the first disk, and the whole boot process timed out

  a. Yao Jun modified the ramdisk so that it always writes to /dev/sda.
  b. Set deploy_callback_timeout=900 in ironic.conf.

  Update 05/04:
  Li Hao: ironic node-update 4fae2ae3-0935-4585-8be2-00298015f8f3 replace properties/root_device='{"name": "/dev/sda"}'

- The image was written to /dev/sda, but ironic-conductor did not reboot the machine, so the boot hung
  Use journalctl -fu python-ironic-agent to view the logs inside the IPA:
  journalctl --no-pager

- After the image was written to /dev/sda, IPA failed to run partprobe /dev/sda

  The ironic-lib inside the ramdisk needs this patch: https://review.openstack.org/#/c/444061/