Integrating Ironic with native Neutron

Deployment and configuration:
- Ironic runs its own dhcp-server, used during the inspect phase.
- neutron-dhcp is used during the provision phase.
- The inspect and provision phases may use different tftp servers.

Register phase
The user registers the ironic node, including its IPMI details and related information.

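Inspect phase
This phase uses the Inspect Network. Requirements:
- The Ironic dhcp-server must be able to receive DHCP requests from the BM node.
- Once the BM node has an IP, it must be able to reach tftp-server-1 (layer-3 reachability).
The user collects the BM node's information:
- Ironic sets the BM node to PXE boot via IPMI.
- Ironic powers on the BM node via IPMI, and the node PXE boots.
- The BM node gets an IP from the Ironic dhcp-server. At this point its requests carry no vlan tag and use the native vlan of the upstream access switch (default tag=1).
- With the IP assigned, the BM node downloads a small image from tftp-server-1 (a ramdisk containing the Ironic Python Agent).
- The ramdisk performs some operations to collect detailed information about the BM node.
- The BM node is powered off. The ramdisk runs in memory, so it is lost on power-off.
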
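The two IPMI steps of the Inspect phase can be reproduced by hand when debugging. The sketch below is only a manual approximation, not necessarily the exact calls Ironic makes. The variable names (IPMI_HOST, IPMI_USER, IPMI_PASS, IPMI_PORT, DRY_RUN) and the function names are assumptions made for this example; the ipmitool subcommands are the standard `chassis bootdev` and `chassis power`.

```shell
#!/bin/sh
# Sketch of the IPMI part of the Inspect phase: request PXE for the next
# boot, then power the BM node on. DRY_RUN defaults to 1, so nothing is
# sent to a BMC unless DRY_RUN=0 is set explicitly.
# IPMI_HOST, IPMI_USER, IPMI_PASS, IPMI_PORT and DRY_RUN are assumed names.

ipmi() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Only show what would be run; the password is not echoed.
        echo "ipmitool -I lanplus -H $IPMI_HOST -p ${IPMI_PORT:-623} -U $IPMI_USER $*"
    else
        ipmitool -I lanplus -H "$IPMI_HOST" -p "${IPMI_PORT:-623}" \
            -U "$IPMI_USER" -P "$IPMI_PASS" "$@"
    fi
}

pxe_boot_node() {
    ipmi chassis bootdev pxe || return 1   # next boot goes to the network
    ipmi chassis power on                  # power the node on
}
```

With IPMI_HOST and IPMI_USER set, calling pxe_boot_node in the default dry-run mode just prints the two ipmitool command lines, which is a cheap way to check the parameters before touching a real machine.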
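Provision phase
This phase uses the Provisioning Network (created by neutron). Requirements:
- The BM node, glance-api, ironic-api, ironic-conductor and neutron-dhcp-agent must all have connectivity on the PROVISION NETWORK.
The user requests a physical machine, installs the operating system, configures the service NICs, and so on:
- The request enters through nova.
- Ironic powers on the BM node via IPMI, and the node PXE boots.
- At this point the BM node must get its IP from the neutron-dhcp-server (over the native vlan). Because the Ironic-dhcp-server also accepts requests arriving on the native vlan, the DHCP request must be guaranteed to be handled by the Neutron-dhcp-server.
- With the IP assigned, the BM node downloads a small image (a ramdisk containing the Ironic Python Agent) from tftp-server-2, which may differ from the tftp server used in the Inspect phase.
- (How is this step controlled?) The image requested by the user is downloaded from glance and installed. The assigned IP must be able to reach glance-api.
- After installation, cloud-init applies the corresponding vlan tag inside the BM operating system. That vlan tag must already be configured on the access switch.
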
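Key issues:
- Both the Ironic-dhcp-server and the Neutron-dhcp-server accept DHCP requests arriving on the native vlan, so two BM nodes running Inspect and Provision at the same time may conflict. Two options:
- Option 1: merge the two DHCP servers. However, the Neutron-dhcp-server works from a whitelist, and during the Inspect phase the dhcp-server does not yet know anything about the BM node, so no whitelist entry can be configured.
- Option 2: strictly separate Inspect and Provision. During datacenter initialization, enable the Ironic-dhcp-server and shut it down once Inspect is done; or have EPC disable Provision operations while Inspect is in progress.
  The ZTE solution for the tier-1 private cloud merged the two DHCP servers and runs them on the ToR switch.
- The tenant vlan of a BM node must be pre-configured on the access switch. If that is not possible, the switch has to be configured dynamically.
- The Neutron-dhcp-agent needs to be on the service network.
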
Suzhou Ironic environment
10.142.24.12 root/@IDC_host4321

Zhejiang Ironic test environment

Ironic DHCP
[root@csv-yglcs17 ~]# cat /etc/dhcp/dhcpd.conf
option domain-name "test.com";
option domain-name-servers 8.8.8.8, 61.88.88.88;
default-lease-time 60000;
max-lease-time 720000;
subnet 20.26.34.0 netmask 255.255.255.0 {
    range 20.26.34.10 20.26.34.100;    # DHCP range
    option routers 20.26.34.1;
    next-server 20.26.33.26;           # tftp server
    filename "pxelinux.0";
}
# The conductor node only has an IP in the 33.0 subnet. Without this
# subnet declaration, dhcpd fails at startup with the error shown under "Problem:".
subnet 20.26.33.0 netmask 255.255.255.0 {
}

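The empty 20.26.33.0 declaration matters because dhcpd only listens on an interface whose address falls inside some declared subnet. A minimal sketch of that membership check follows; the function names are made up for this example and the arithmetic is plain POSIX shell, so it is an illustration of the rule rather than anything dhcpd itself runs.

```shell
#!/bin/sh
# Sketch: decide whether an interface address falls inside a subnet that
# dhcpd.conf declares. Function names are made up for this example.

# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# ip_in_subnet ADDRESS NETWORK NETMASK -> exit status 0 if ADDRESS is inside.
ip_in_subnet() {
    _addr=$(ip_to_int "$1")
    _net=$(ip_to_int "$2")
    _mask=$(ip_to_int "$3")
    [ $(( _addr & _mask )) -eq $(( _net & _mask )) ]
}

# Check the conductor address against the two subnets from dhcpd.conf.
for subnet in 20.26.34.0 20.26.33.0; do
    if ip_in_subnet 20.26.33.26 "$subnet" 255.255.255.0; then
        echo "20.26.33.26 is inside $subnet/24"
    else
        echo "20.26.33.26 is outside $subnet/24"
    fi
done
```

The conductor's only address is in 20.26.33.0/24, so with the 20.26.34.0 declaration alone no interface matches and dhcpd has nothing to listen on.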
Problem:
Apr 19 14:30:21 csv-yglcs17 systemd: Starting DHCPv4 Server Daemon...
Apr 19 14:30:21 csv-yglcs17 dhcpd: Internet Systems Consortium DHCP Server 4.2.5
Apr 19 14:30:21 csv-yglcs17 dhcpd: Copyright 2004-2013 Internet Systems Consortium.
Apr 19 14:30:21 csv-yglcs17 dhcpd: All rights reserved.
Apr 19 14:30:21 csv-yglcs17 dhcpd: For info, please visit https://www.isc.org/software/dhcp/
Apr 19 14:30:21 csv-yglcs17 dhcpd: Not searching LDAP since ldap-server, ldap-port and ldap-base-dn were not specified in the config file
Apr 19 14:30:21 csv-yglcs17 dhcpd: Wrote 15 leases to leases file.
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: No subnet declaration for eno33557248 (no IPv4 addresses).
Apr 19 14:30:21 csv-yglcs17 dhcpd: ** Ignoring requests on eno33557248. If this is not what
Apr 19 14:30:21 csv-yglcs17 dhcpd: you want, please write a subnet declaration
Apr 19 14:30:21 csv-yglcs17 dhcpd: in your dhcpd.conf file for the network segment
Apr 19 14:30:21 csv-yglcs17 dhcpd: to which interface eno33557248 is attached. **
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: No subnet declaration for virbr0 (192.168.122.1).
Apr 19 14:30:21 csv-yglcs17 dhcpd: ** Ignoring requests on virbr0. If this is not what
Apr 19 14:30:21 csv-yglcs17 dhcpd: you want, please write a subnet declaration
Apr 19 14:30:21 csv-yglcs17 dhcpd: in your dhcpd.conf file for the network segment
Apr 19 14:30:21 csv-yglcs17 dhcpd: to which interface virbr0 is attached. **
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: No subnet declaration for eno16777984 (20.26.33.26).
Apr 19 14:30:21 csv-yglcs17 dhcpd: ** Ignoring requests on eno16777984. If this is not what
Apr 19 14:30:21 csv-yglcs17 dhcpd: you want, please write a subnet declaration
Apr 19 14:30:21 csv-yglcs17 dhcpd: in your dhcpd.conf file for the network segment
Apr 19 14:30:21 csv-yglcs17 dhcpd: to which interface eno16777984 is attached. **
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: Not configured to listen on any interfaces!
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: This version of ISC DHCP is based on the release available
Apr 19 14:30:21 csv-yglcs17 dhcpd: on ftp.isc.org. Features have been added and other changes
Apr 19 14:30:21 csv-yglcs17 dhcpd: have been made to the base software release in order to make
Apr 19 14:30:21 csv-yglcs17 dhcpd: it work better with this distribution.
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: Please report for this software via the CentOS Bugs Database:
Apr 19 14:30:21 csv-yglcs17 dhcpd: http://bugs.centos.org/
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: exiting.
Apr 19 14:30:21 csv-yglcs17 systemd: dhcpd.service: main process exited, code=exited, status=1/FAILURE
Apr 19 14:30:21 csv-yglcs17 systemd: Failed to start DHCPv4 Server Daemon.
Apr 19 14:30:21 csv-yglcs17 systemd: Unit dhcpd.service entered failed state.
Apr 19 14:30:21 csv-yglcs17 systemd: dhcpd.service failed.

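A log like the one above is long, but only two facts matter: which interfaces had no matching subnet declaration, and whether dhcpd ended up with nothing to listen on. A small sketch that pulls both out of syslog text read on stdin follows; the function names are made up for this example, and it assumes the message wording shown above.

```shell
#!/bin/sh
# Sketch: summarize a dhcpd startup log read on stdin.
# Function names are hypothetical; the patterns match the messages above.

# Print each interface dhcpd skipped for lack of a subnet declaration.
unmatched_ifaces() {
    sed -n 's/.*dhcpd: No subnet declaration for \([^ ]*\) .*/\1/p' | sort -u
}

# Exit status 0 if dhcpd reported that it listens on no interface at all.
dhcpd_listens_nowhere() {
    grep -q 'Not configured to listen on any interfaces!'
}
```

For example, `grep dhcpd: /var/log/messages | unmatched_ifaces` lists the skipped interfaces, assuming the messages are written to /var/log/messages as on this CentOS host. If the interface that faces the BM nodes is in that list, dhcpd.conf is missing a subnet declaration for it.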
Ironic Inspector
[root@csv-yglcs17 pxelinux.cfg]# pwd
/tftpboot/pxelinux.cfg
[root@csv-yglcs17 pxelinux.cfg]# cat default
default introspect
label introspect
kernel /tftpboot/ironic-inspector/inspector-kernel
append initrd=/tftpboot/ironic-inspector/inspector-ramdisk ipa-inspection-callback-url=http://20.26.33.26:5050/v1/continue systemd.journald.forward_to_console=yes ipa-collect-lldp=True
ipappend 3
The inspector runs on 20.26.33.26.

Ironic Provisioning
provisioning_network in ironic.conf is not configured yet; neither is cleaning_network.

Check IPMI
[root@csv-yglcs17 ~]# ipmitool -P Huawei12#$ -H 50.1.65.245 -U root -p 623 -I lanplus power status
Chassis Power is on

== Operations ==

ironic node-create --chassis_uuid dbb588b3-75e8-4028-b851-110671e05e58 \
    --driver agent_ipmitool \
    --name pc-zjnacthd01 \
    -i ipmi_address=50.1.65.245 \
    -i ipmi_username=root \
    -i ipmi_password=Huawei12#$ \
    -i ipmi_port=623 \
    -i deploy_kernel=4c1855e5-9b6b-47e2-89e5-3bc351c2ae2e \
    -i deploy_ramdisk=2f603c85-de92-44ea-b4d0-1396b91102cc

Update 5/25: An Ironic AZ feature is under development. The AZ attribute is added to the node with node-update and synchronized to the nova database, so nova boot only needs to specify the AZ to create the machine.

Update 5/12:

After a successful inspect:

Inspect failed; for the cause see "Problem 2".

Configure provisioning_network:

After a successful Inspect:

Upload the image used by Ironic:
glance image-create --name CentOS-7-64bit-ironic.qcow2 --disk-format qcow2 --container-format bare --file CentOS-7-64bit-ironic.qcow2 --is-public True --human-readable --progress
glance image-update 40928b81-9be1-402a-8684-4e2d2fcf330f --property hypervisor_type=baremetal

nova boot --flavor 2 --image 40928b81-9be1-402a-8684-4e2d2fcf330f --nic net-id=3a151049-ff3f-4bc5-88a1-b9084ec24bc9 pc-zjnacthd01

== Problems ==
- Are there restrictions on node names?

- First Inspect failure
2017-04-20 15:29:16.409 28596 ERROR ironic_inspector.main File "/usr/lib/python2.7/site-packages/keystoneauth1/access/service_catalog.py", line 228, in url_for
2017-04-20 15:29:16.409 28596 ERROR ironic_inspector.main raise exceptions.EndpointNotFound(msg)
2017-04-20 15:29:16.409 28596 ERROR ironic_inspector.main EndpointNotFound: public endpoint for baremetal service in RegionFour region not found

Resolved by restarting the ironic service.

- Second inspect failure: the BM node cannot get an IP
The DHCP request did reach the dhcp server:

- inspect cannot find cleaning_network
Fix: configure cleaning_network (=provide_network).

- nova boot failed, conductor.log:

Resolved after updating the nova code on the control node, the ironic code on the ironic node, and the ironicclient code on the compute node.

- nova boot failed, compute.log

The cause was that this ironic node's driver_info had not been updated yet:

Update it:
ironic node-update baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 add driver_info/deploy_kernel=f8205536-070b-4286-8d0c-35e3b8647741
ironic node-update baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 add driver_info/deploy_ramdisk=302e6438-4d31-429b-8bae-47e225d4ed67
Update 05/12:
ironic node-update baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 add driver_info/deploy_kernel=4c1855e5-9b6b-47e2-89e5-3bc351c2ae2e
ironic node-update baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 add driver_info/deploy_ramdisk=2f603c85-de92-44ea-b4d0-1396b91102cc

- nova boot failed: image not found, compute.log

glance-api was misconfigured in nova.conf on the compute node:

Add the glance API version setting to ironic.conf on the ironic-conductor node:

glance_api_version=1

- nova boot failed, ironic-conductor.log:

Verified from the command line that a port can be created on provisioning network d5a284c3-41d3-4eb3-a11f-58a99d3e2eb1.

The cause was that LLDP was not enabled. After enabling it, delete the existing ports and portgroups:
ironic port-list | awk '{print $2}' | egrep '[0-9]+' | xargs -I 'X' ironic port-delete 'X'
ironic portgroup-list | awk '{print $2}' | egrep '[0-9]+' | xargs -I 'X' ironic portgroup-delete 'X'
Re-run Inspect:
ironic node-set-provision-state baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 manage
ironic node-set-provision-state baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 inspect

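The cleanup pipelines in this entry pick every second field that contains a digit. That works for the default table output, but a slightly stricter filter that only lets full UUIDs through is easy to write. This is a sketch under the assumption that the UUID is the first column of the CLI table, as in the pipelines above; the function name is made up.

```shell
#!/bin/sh
# Sketch: print the UUIDs found in the first column of an OpenStack-style
# ASCII table read on stdin. The function name is hypothetical.
first_column_uuids() {
    # With whitespace splitting, field 1 is the leading "|" and field 2
    # is the first column; border and header rows fail the UUID pattern.
    awk '{print $2}' |
        grep -E '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
}
```

It drops into the same place as the awk and egrep pair, for example `ironic port-list | first_column_uuids | xargs -I 'X' ironic port-delete 'X'`.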
- nova boot failed: user image not found
The cause was a wrong database setting in glance-registry.conf.

- nova boot failed: ramdisk not found

This image UUID is configured in the ironic node's driver_info, and the image has to be uploaded to glance.
Upload the image:

Update the Ironic node information:

- nova boot failed: insufficient permissions to access tftp

chown -R ironic:ironic /tftpboot/

- nova boot failed: the physical machine's DHCP request was caught by ironic-dhcp
Fix: shut down ironic-dhcp.

- nova boot failed: the physical machine could not get an IP from neutron DHCP
On the control node, neutron dhcp runs as dnsmasq inside a namespace. The relay destination is the control node's management IP (eno16777984), whereas dnsmasq listens on the namespace's tap interface with IP 20.26.34.91, so it never receives the DHCP request.
Current workaround: manually start a dnsmasq on the control node with the same configuration as neutron dhcp.

- After getting an IP the node boots into the ramdisk system, but after the reboot it does not enter the user image's operating system
Checking the BIOS boot device order showed: Boot Device Selector : No override
ironic-conductor.log shows that it cannot connect to 20.26.34.70:9999. This is the IPA's address and listening port. The ironic-conductor node must be able to reach it, and it was indeed unreachable.

Yao Jun suggested that after the ramdisk boots, two NICs may have obtained IP addresses and confused the routing, and recommended deleting the second address once the ramdisk is up.

05/04 update: add a static route on the provisioning network: destination = control node subnet, nexthop = provisioning network GW
05/11 update: neutron subnet-update aca03dd8-3d2a-4c54-99de-7a8a7bac4f53 --host-route destination=20.26.33.0/24,nexthop=20.26.34.1
Updated subnet: aca03dd8-3d2a-4c54-99de-7a8a7bac4f53

Verified to work: the port is reachable and the user image downloads. ==> Open question: why do multiple NICs get an IP, and how can this be fixed in code?

Query the boot order over IPMI: ipmitool -P Huawei12#$ -H 50.1.65.245 -U root -p 623 -I lanplus chassis bootparam get 5
Set disk boot: ipmitool -P Huawei12#$ -H 50.1.65.245 -U root -p 623 -I lanplus chassis bootdev disk

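To confirm the multiple-NIC theory from inside the ramdisk, one can list which interfaces actually hold an IPv4 address. The sketch below parses the one-line-per-address output of `ip -o -4 addr show`; the function names are made up for this example, and it is only a diagnostic aid, not the fix that was applied to the ramdisk.

```shell
#!/bin/sh
# Sketch: spot a ramdisk where more than one NIC obtained an IPv4 address.
# Both functions read the output of `ip -o -4 addr show` on stdin.
# Function names are hypothetical.

# Print "<interface> <address/prefix>" for every non-loopback IPv4 address.
ipv4_addrs() {
    awk '$3 == "inet" && $2 != "lo" {print $2, $4}'
}

# Print how many distinct non-loopback interfaces hold an IPv4 address.
count_ipv4_ifaces() {
    ipv4_addrs | awk '{print $1}' | sort -u | wc -l | tr -d ' '
}
```

Running `ip -o -4 addr show | ipv4_addrs` in the ramdisk shows the addresses; a count above 1 from `ip -o -4 addr show | count_ipv4_ifaces` would support the routing explanation above.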
- The user image was written to /dev/sdl instead of the first disk, and the whole boot process timed out

a. Yao Jun modified the ramdisk to always use /dev/sda as the target disk
b. Set deploy_callback_timeout=900 in ironic.conf

Update 05/04:
Li Hao: ironic node-update 4fae2ae3-0935-4585-8be2-00298015f8f3 replace properties/root_device='{"name": "/dev/sda"}'

- The image was written to /dev/sda, but ironic-conductor did not reboot the machine, so the boot hung
Use journalctl -fu python-ironic-agent to follow the logs inside the IPA
journalctl --no-pager

- After the image is written to /dev/sda, the IPA fails to run partprobe /dev/sda

The ironic-lib inside the ramdisk needs this patch: https://review.openstack.org/#/c/444061/