Integrating Ironic with native Neutron

Deployment and configuration:
- Ironic runs its own dhcp-server, which is used during inspection.
- neutron-dhcp is used during provisioning.
- The TFTP servers used for inspection and provisioning may differ.

Register phase
The user enrolls the Ironic node, including its IPMI details and related information.

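Inspect phase
This phase uses the Inspect Network. Requirements:
- The Ironic dhcp-server can receive DHCP requests from the BM (bare-metal) node.
- Once the BM node has an IP, it can reach tftp-server-1 (layer-3 reachability).

Collecting the BM node's information:
- Ironic sets the BM node to PXE boot via IPMI.
- Ironic powers on the BM node via IPMI, and the node PXE boots.
- The BM node obtains an IP from the Ironic dhcp-server. Its requests carry no VLAN tag at this point and use the native VLAN of the uplink access switch (default tag=1).
- With the IP in hand, the BM node downloads a small image (the ramdisk, which contains the Ironic Python Agent) from tftp-server-1.
- The ramdisk runs a set of operations to collect detailed information about the BM node.
- The BM node is powered off. The ramdisk runs in memory and is lost on shutdown.
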
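Provision phase
This phase uses the Provisioning Network (created by neutron). Requirements:
- BM, glance-api, ironic-api, ironic-conductor, and neutron-dhcp-agent must all have connectivity on the provisioning network.

The user requests a physical machine, installs the operating system, configures the service NICs, and so on:
- The request enters through nova.
- Ironic powers on the BM node via IPMI, and the node PXE boots.
- At this point the BM node must get its IP from the neutron-dhcp-server (over the native VLAN). Because the Ironic dhcp-server also accepts requests arriving on the native VLAN, the DHCP request must be guaranteed to be handled by the Neutron dhcp-server.
- With the IP in hand, the BM node downloads the small image (the ramdisk, which contains the Ironic Python Agent) from tftp-server-2, which may differ from the TFTP server used during inspection.
- (How is this step controlled?) The image requested by the user is downloaded from glance and installed. The assigned IP must be able to reach glance-api.
- After installation, cloud-init applies the corresponding VLAN tag inside the BM operating system. That VLAN tag must already be configured on the access switch.
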
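Key issues:
- Both the Ironic dhcp-server and the Neutron dhcp-server accept DHCP requests arriving on the native VLAN, so two BM nodes running Inspect and Provision at the same time may conflict.

  - Merge the two DHCP servers. However, the Neutron dhcp-server works as a whitelist, and during the Inspect phase the dhcp-server does not yet know anything about the BM node, so no whitelist entry can be configured.
  - Strictly separate the Inspect and Provision phases. During data-center initialization, turn on the Ironic dhcp-server and shut it down once Inspect is finished; or have EPC enforce that Provision operations are disabled while an Inspect is in progress.

  * The ZTE solution for the tier-1 private cloud merges the two DHCP servers and runs them on the ToR switch.

- A BM node's tenant VLAN must be configured on the access switch in advance; if that is not possible, the switch has to be configured dynamically.
- The neutron-dhcp-agent must be on the service network.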
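
The DHCP conflict described above comes down to which server should answer a request that arrives on the native VLAN. The following is only a sketch of that decision, assuming a hypothetical set of MAC addresses for which Neutron already has ports (the "whitelist") and a hypothetical helper named choose_dhcp_server; it does not reflect how either service is implemented.

```python
def normalize_mac(mac: str) -> str:
    """Lower-case a MAC address and use ':' as the separator."""
    return mac.strip().lower().replace("-", ":")


def choose_dhcp_server(mac: str, neutron_known_macs: set) -> str:
    """Return which DHCP server should answer a request from `mac`.

    Hypothetical policy: a node that already has a Neutron port is being
    provisioned; any other node is unknown and is being inspected.
    """
    known = {normalize_mac(m) for m in neutron_known_macs}
    if normalize_mac(mac) in known:
        return "neutron-dhcp"
    return "ironic-dhcp"


if __name__ == "__main__":
    # Example MAC addresses are made up for illustration.
    known = {"52:54:00:AA:BB:01"}
    print(choose_dhcp_server("52-54-00-aa-bb-01", known))
    print(choose_dhcp_server("52:54:00:aa:bb:02", known))
```

A merged DHCP server would need exactly this kind of fallback for unknown MAC addresses, which is what a pure whitelist cannot express.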

Suzhou Ironic environment
10.142.24.12 root/@IDC_host4321

Zhejiang Ironic test environment

Ironic DHCP
[root@csv-yglcs17 ~]# cat /etc/dhcp/dhc
dhclient.d/ dhcpd6.conf dhcpd.conf
[root@csv-yglcs17 ~]# cat /etc/dhcp/dhcpd.conf
option domain-name "test.com";
option domain-name-servers 8.8.8.8, 61.88.88.88;
default-lease-time 60000;
max-lease-time 720000;
subnet 20.26.34.0 netmask 255.255.255.0 {
    range 20.26.34.10 20.26.34.100;    # DHCP range
    option routers 20.26.34.1;
    next-server 20.26.33.26;           # tftp server
    filename "pxelinux.0";
}
# The conductor node only has an IP in the 20.26.33.0 subnet. Without this
# (empty) subnet declaration, dhcpd fails at startup with the error below.
subnet 20.26.33.0 netmask 255.255.255.0 {
}

Error:
Apr 19 14:30:21 csv-yglcs17 systemd: Starting DHCPv4 Server Daemon...
Apr 19 14:30:21 csv-yglcs17 dhcpd: Internet Systems Consortium DHCP Server 4.2.5
Apr 19 14:30:21 csv-yglcs17 dhcpd: Copyright 2004-2013 Internet Systems Consortium.
Apr 19 14:30:21 csv-yglcs17 dhcpd: All rights reserved.
Apr 19 14:30:21 csv-yglcs17 dhcpd: For info, please visit https://www.isc.org/software/dhcp/
Apr 19 14:30:21 csv-yglcs17 dhcpd: Not searching LDAP since ldap-server, ldap-port and ldap-base-dn were not specified in the config file
Apr 19 14:30:21 csv-yglcs17 dhcpd: Wrote 15 leases to leases file.
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: No subnet declaration for eno33557248 (no IPv4 addresses).
Apr 19 14:30:21 csv-yglcs17 dhcpd: ** Ignoring requests on eno33557248. If this is not what
Apr 19 14:30:21 csv-yglcs17 dhcpd: you want, please write a subnet declaration
Apr 19 14:30:21 csv-yglcs17 dhcpd: in your dhcpd.conf file for the network segment
Apr 19 14:30:21 csv-yglcs17 dhcpd: to which interface eno33557248 is attached. **
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: No subnet declaration for virbr0 (192.168.122.1).
Apr 19 14:30:21 csv-yglcs17 dhcpd: ** Ignoring requests on virbr0. If this is not what
Apr 19 14:30:21 csv-yglcs17 dhcpd: you want, please write a subnet declaration
Apr 19 14:30:21 csv-yglcs17 dhcpd: in your dhcpd.conf file for the network segment
Apr 19 14:30:21 csv-yglcs17 dhcpd: to which interface virbr0 is attached. **
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: No subnet declaration for eno16777984 (20.26.33.26).
Apr 19 14:30:21 csv-yglcs17 dhcpd: ** Ignoring requests on eno16777984. If this is not what
Apr 19 14:30:21 csv-yglcs17 dhcpd: you want, please write a subnet declaration
Apr 19 14:30:21 csv-yglcs17 dhcpd: in your dhcpd.conf file for the network segment
Apr 19 14:30:21 csv-yglcs17 dhcpd: to which interface eno16777984 is attached. **
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: Not configured to listen on any interfaces!
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: This version of ISC DHCP is based on the release available
Apr 19 14:30:21 csv-yglcs17 dhcpd: on ftp.isc.org. Features have been added and other changes
Apr 19 14:30:21 csv-yglcs17 dhcpd: have been made to the base software release in order to make
Apr 19 14:30:21 csv-yglcs17 dhcpd: it work better with this distribution.
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: Please report for this software via the CentOS Bugs Database:
Apr 19 14:30:21 csv-yglcs17 dhcpd: http://bugs.centos.org/
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: exiting.
Apr 19 14:30:21 csv-yglcs17 systemd: dhcpd.service: main process exited, code=exited, status=1/FAILURE
Apr 19 14:30:21 csv-yglcs17 systemd: Failed to start DHCPv4 Server Daemon.
Apr 19 14:30:21 csv-yglcs17 systemd: Unit dhcpd.service entered failed state.
Apr 19 14:30:21 csv-yglcs17 systemd: dhcpd.service failed.
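
The failure in this log can be reasoned about offline: dhcpd only listens on an interface whose address falls inside one of the declared subnets. Below is a minimal sketch of that check using Python's ipaddress module. The helper name interfaces_without_subnet is hypothetical, and the interface data is copied from the log; this approximates the rule and is not dhcpd's own logic.

```python
import ipaddress


def interfaces_without_subnet(interfaces, declared_subnets):
    """Return the interfaces that no declared subnet covers.

    interfaces: dict of interface name -> IPv4 address string, or None
    declared_subnets: list of CIDR strings, one per `subnet` block
    """
    networks = [ipaddress.ip_network(s) for s in declared_subnets]
    uncovered = []
    for name, addr in interfaces.items():
        covered = addr is not None and any(
            ipaddress.ip_address(addr) in net for net in networks
        )
        if not covered:
            uncovered.append(name)
    return uncovered


if __name__ == "__main__":
    # Interfaces and addresses as reported in the log.
    host_interfaces = {
        "eno33557248": None,
        "virbr0": "192.168.122.1",
        "eno16777984": "20.26.33.26",
    }
    for subnets in (["20.26.34.0/24"], ["20.26.34.0/24", "20.26.33.0/24"]):
        missing = interfaces_without_subnet(host_interfaces, subnets)
        usable = len(missing) < len(host_interfaces)
        print(subnets, "uncovered:", missing, "dhcpd can listen:", usable)
```

With only the 20.26.34.0/24 declaration every interface is uncovered, which corresponds to "Not configured to listen on any interfaces!"; adding the empty 20.26.33.0/24 declaration covers eno16777984.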

Ironic Inspector
[root@csv-yglcs17 pxelinux.cfg]# pwd
/tftpboot/pxelinux.cfg
[root@csv-yglcs17 pxelinux.cfg]# cat default
default introspect
label introspect
kernel /tftpboot/ironic-inspector/inspector-kernel
append initrd=/tftpboot/ironic-inspector/inspector-ramdisk ipa-inspection-callback-url=http://20.26.33.26:5050/v1/continue systemd.journald.forward_to_console=yes ipa-collect-lldp=True
ipappend 3

The inspector runs on 20.26.33.26.
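
The kernel arguments on the append line decide where the inspection ramdisk reports back, so the BM node has to be able to reach that host and port. The sketch below splits such a line into key/value pairs and extracts the callback host and port. parse_append is a hypothetical helper written for illustration and is not part of Ironic or PXELINUX.

```python
import shlex
from urllib.parse import urlparse


def parse_append(line):
    """Split a PXELINUX `append` line into a dict of kernel parameters.

    Parameters without '=' are stored with the value None.
    """
    tokens = shlex.split(line)
    if tokens and tokens[0] == "append":
        tokens = tokens[1:]
    params = {}
    for token in tokens:
        # Split on the first '=' only, so URL values stay intact.
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


if __name__ == "__main__":
    append_line = (
        "append initrd=/tftpboot/ironic-inspector/inspector-ramdisk "
        "ipa-inspection-callback-url=http://20.26.33.26:5050/v1/continue "
        "systemd.journald.forward_to_console=yes ipa-collect-lldp=True"
    )
    params = parse_append(append_line)
    callback = urlparse(params["ipa-inspection-callback-url"])
    print(callback.hostname, callback.port)
```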

Ironic Provisioning
provisioning_network in ironic.conf has not been configured yet, and neither has cleaning_network.

Check IPMI
[root@csv-yglcs17 ~]# ipmitool -P 'Huawei12#$' -H 50.1.65.245 -U root -p 623 -I lanplus power status
Chassis Power is on

== Operations ==

ironic node-create --chassis_uuid dbb588b3-75e8-4028-b851-110671e05e58 \
    --driver agent_ipmitool \
    --name pc-zjnacthd01 \
    -i ipmi_address=50.1.65.245 \
    -i ipmi_username=root \
    -i ipmi_password='Huawei12#$' \
    -i ipmi_port=623 \
    -i driver_info/deploy_kernel=4c1855e5-9b6b-47e2-89e5-3bc351c2ae2e \
    -i driver_info/deploy_ramdisk=2f603c85-de92-44ea-b4d0-1396b91102cc

Update 5/25: An Ironic AZ feature is under development. node-update adds an AZ attribute to the node, which is synced to the nova database, so nova boot only needs to specify the AZ when creating a machine.

Update 5/12:

After a successful inspect:

Inspect failed; see "Problem 2" for the cause.

Configure provisioning_network:

After a successful inspect:

Upload the image used by Ironic:
glance image-create --name CentOS-7-64bit-ironic.qcow2 --disk-format qcow2 --container-format bare --file CentOS-7-64bit-ironic.qcow2 --is-public True --human-readable --progress
glance image-update 40928b81-9be1-402a-8684-4e2d2fcf330f --property hypervisor_type=baremetal

nova boot --flavor 2 --image 40928b81-9be1-402a-8684-4e2d2fcf330f --nic net-id=3a151049-ff3f-4bc5-88a1-b9084ec24bc9 pc-zjnacthd01

== Problems ==
- Are there restrictions on the node name?

- The first inspect failed:
2017-04-20 15:29:16.409 28596 ERROR ironic_inspector.main File "/usr/lib/python2.7/site-packages/keystoneauth1/access/service_catalog.py", line 228, in url_for
2017-04-20 15:29:16.409 28596 ERROR ironic_inspector.main raise exceptions.EndpointNotFound(msg)
2017-04-20 15:29:16.409 28596 ERROR ironic_inspector.main EndpointNotFound: public endpoint for baremetal service in RegionFour region not found

  Resolved by restarting the ironic service.

- The second inspect failed: the BM node could not obtain an IP.
  The DHCP request did reach the dhcp server:

- cleaning_network could not be found during inspect.
  Fix: configure cleaning_network (same value as provisioning_network).

- nova boot failed, conductor.log:

  Resolved after updating the nova code on the control node, the ironic code on the ironic node, and the ironicclient code on the compute node.

- nova boot failed, compute.log:

  The cause was that this Ironic node's driver_info had not been updated:

  Update it:
  ironic node-update baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 add driver_info/deploy_kernel=f8205536-070b-4286-8d0c-35e3b8647741
  ironic node-update baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 add driver_info/deploy_ramdisk=302e6438-4d31-429b-8bae-47e225d4ed67
  Update 05/12:
  ironic node-update baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 add driver_info/deploy_kernel=4c1855e5-9b6b-47e2-89e5-3bc351c2ae2e
  ironic node-update baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 add driver_info/deploy_ramdisk=2f603c85-de92-44ea-b4d0-1396b91102cc

- nova boot failed, image not found, compute.log:

  glance-api was misconfigured in nova.conf on the compute node:

  Add the glance API version to ironic.conf on the ironic-conductor node:

  glance_api_version=1

- nova boot failed, ironic-conductor.log:

  Verified from the command line that a port can be created on provisioning network d5a284c3-41d3-4eb3-a11f-58a99d3e2eb1.

  The cause was that LLDP was not enabled. After enabling it, delete the existing ports and port groups:

  ironic port-list | awk '{print $2}' | egrep '[0-9]+' | xargs -I 'X' ironic port-delete 'X'
  ironic portgroup-list | awk '{print $2}' | egrep '[0-9]+' | xargs -I 'X' ironic portgroup-delete 'X'
  Re-run inspect:
  ironic node-set-provision-state baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 manage
  ironic node-set-provision-state baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 inspect
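
The awk/egrep pipeline used for the cleanup treats any second column containing a digit as an ID. A stricter variant, sketched here in Python, keeps only cells that are well-formed UUIDs. The table layout and the sample row are assumptions about what the CLI prints, and first_column_uuids is a hypothetical helper.

```python
import re

UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def first_column_uuids(table_text):
    """Return the UUIDs found in the first column of an ASCII table."""
    uuids = []
    for line in table_text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # A data row looks like "| <uuid> | ... |", so cells[0] is empty
        # and cells[1] is the first column. Border rows contain no '|'.
        if len(cells) > 1 and UUID_RE.fullmatch(cells[1]):
            uuids.append(cells[1])
    return uuids


if __name__ == "__main__":
    # Made-up sample output; real output may have different columns.
    sample = """\
+--------------------------------------+-------------------+
| UUID                                 | Address           |
+--------------------------------------+-------------------+
| 11111111-2222-3333-4444-555555555555 | 52:54:00:aa:bb:01 |
+--------------------------------------+-------------------+
"""
    for uuid in first_column_uuids(sample):
        print(uuid)
```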

- nova boot failed: the user image could not be found.
  The cause was a wrong database setting in glance-registry.conf.

- nova boot failed: the ramdisk could not be found.

  This image UUID is configured in the Ironic node's driver_info; the image has to be uploaded to glance.

  Upload the image:

  Update the Ironic node information:

- nova boot failed: insufficient permission to access tftp.

  chown -R ironic:ironic /tftpboot/

- nova boot failed: the physical machine's DHCP request was caught by ironic-dhcp.
  Fix: shut down ironic-dhcp.
- nova boot failed: the physical machine could not get an IP from neutron DHCP.
  On the control node, neutron DHCP is a dnsmasq started inside a namespace. The relay's destination address is the control node's management IP (eno16777984), whereas dnsmasq listens on the namespace's tap device, whose IP is 20.26.34.91, so it never receives the DHCP request.
  Current workaround: manually start a dnsmasq on the control node with the same configuration as neutron DHCP.

- After getting an IP the node boots into the ramdisk, but after the reboot it does not boot into the user image's operating system.
  The BIOS boot device order shows: Boot Device Selector : No override
  ironic-conductor.log shows that 20.26.34.70:9999 cannot be reached. This is the IPA address and listening port; the ironic-conductor node must be able to connect to it, but it is indeed unreachable.

  Yao Jun suggested that after the ramdisk boots, two NICs may obtain IP addresses, which confuses routing, and recommended deleting the second address once the ramdisk is up.
  05/04 update: add a static route on the provisioning network: destination = control node subnet, nexthop = provisioning network gateway.
  05/11 update: neutron subnet-update aca03dd8-3d2a-4c54-99de-7a8a7bac4f53 --host-route destination=20.26.33.0/24,nexthop=20.26.34.1
  Updated subnet: aca03dd8-3d2a-4c54-99de-7a8a7bac4f53

  Verified: the port is now reachable and the user image can be downloaded. ==> Why do multiple NICs obtain an IP, and how can this be solved at the code level?

  Query the boot order over IPMI: ipmitool -P 'Huawei12#$' -H 50.1.65.245 -U root -p 623 -I lanplus chassis bootparam get 5
  Set boot from disk: ipmitool -P 'Huawei12#$' -H 50.1.65.245 -U root -p 623 -I lanplus chassis bootdev disk
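
The host route added in the 05/11 update helps because route selection prefers the most specific matching prefix, so traffic to the control-node subnet is pinned to the provisioning gateway. Below is a minimal sketch of longest-prefix matching with the addresses from this item. The next_hop helper and the default gateway 192.0.2.1 are assumptions for illustration, and a real kernel routing table also takes interfaces and metrics into account.

```python
import ipaddress


def next_hop(dest, on_link, host_routes, default_gw):
    """Pick the next hop for `dest` by longest-prefix match.

    on_link: CIDR of the directly connected subnet
    host_routes: list of (destination CIDR, nexthop) tuples
    Returns None when `dest` is directly connected.
    """
    addr = ipaddress.ip_address(dest)
    candidates = [(ipaddress.ip_network(on_link), None)]
    candidates += [(ipaddress.ip_network(d), gw) for d, gw in host_routes]
    matches = [(net, gw) for net, gw in candidates if addr in net]
    if not matches:
        return default_gw
    # The most specific (longest) prefix wins.
    _, gateway = max(matches, key=lambda match: match[0].prefixlen)
    return gateway


if __name__ == "__main__":
    routes = [("20.26.33.0/24", "20.26.34.1")]
    # Conductor on the control-node subnet: routed via the host route.
    print(next_hop("20.26.33.26", "20.26.34.0/24", routes, "192.0.2.1"))
    # IPA address on the provisioning subnet: directly connected.
    print(next_hop("20.26.34.70", "20.26.34.0/24", routes, "192.0.2.1"))
```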
- The user image was written to /dev/sdl instead of the first disk, and the whole boot process timed out.

  a. Yao Jun modified the ramdisk to always use /dev/sda as the target disk.
  b. Set deploy_callback_timeout=900 in ironic.conf.

  Update 05/04:
  Li Hao: ironic node-update 4fae2ae3-0935-4585-8be2-00298015f8f3 replace properties/root_device='{"name": "/dev/sda"}'
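
The root_device value is a JSON object, and the shell quoting around it is easy to get wrong when typed by hand. This small sketch builds the argument from a Python dict; root_device_arg is a hypothetical helper, and only the name hint used above is shown.

```python
import json
import shlex


def root_device_arg(hints):
    """Build a shell-safe `properties/root_device=...` argument."""
    value = json.dumps(hints)
    # shlex.quote wraps the JSON in single quotes for the shell.
    return "properties/root_device=" + shlex.quote(value)


if __name__ == "__main__":
    print(root_device_arg({"name": "/dev/sda"}))
```

The printed argument can be pasted after `replace` in the node-update command.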

- The image was written to /dev/sda, but ironic-conductor did not reboot the machine, so the boot hung.
  Use journalctl -fu python-ironic-agent to view the logs inside IPA.
  journalctl --no-pager
- After the image was written to /dev/sda, IPA failed to run partprobe /dev/sda.

  The ironic-lib in the ramdisk needs this patch: https://review.openstack.org/#/c/444061/