Automated kolla-ansible deployment of OpenStack + GPU passthrough

Posted on 2021-6-25 11:36:52
1. Configuring GPU passthrough for virtual machines on CentOS 7.x-8.x

1. Edit /etc/modules (vim /etc/modules) and add the following modules:
# On CentOS, systemd loads modules listed in /etc/modules-load.d/*.conf at boot rather than /etc/modules,
# so the list can go in a file there instead. On AMD hosts use kvm_amd in place of kvm_intel.
pci_stub
vfio
vfio_iommu_type1
vfio_pci
kvm
kvm_intel

2. Enable IOMMU on the KVM host
# For Intel chips:
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on"
# For AMD chips:
GRUB_CMDLINE_LINUX_DEFAULT="iommu=pt iommu=1"

Edit /etc/default/grub and add the parameter to the kernel command line. In the CentOS example below it is appended to GRUB_CMDLINE_LINUX:
vim /etc/default/grub

GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet intel_iommu=on"
GRUB_DISABLE_RECOVERY="true"

3. Regenerate the GRUB configuration
# EFI:
grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg
# Non-EFI (BIOS):
grub2-mkconfig -o /boot/grub2/grub.cfg
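
Which of the two output paths applies can be read from the running system, because a host that booted through EFI exposes /sys/firmware/efi. The following is only a sketch under that assumption, using the CentOS paths from this step; the function name grub_cfg_path and its optional root argument are made up for illustration.

```shell
# Hypothetical helper: print the grub.cfg path that matches the boot mode.
# The optional first argument is an alternative filesystem root, used only
# to try the function against a test tree; it defaults to /.
grub_cfg_path() {
    sysroot="${1:-/}"
    # /sys/firmware/efi exists only when the kernel was booted via EFI.
    if [ -d "${sysroot%/}/sys/firmware/efi" ]; then
        echo "/boot/efi/EFI/centos/grub.cfg"
    else
        echo "/boot/grub2/grub.cfg"
    fi
}
```

It could then be used as: grub2-mkconfig -o "$(grub_cfg_path)"
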
4. Blacklist the following modules so that the host does not claim the GPU. Edit the file:
vim /etc/modprobe.d/blacklist.conf
blacklist snd_hda_intel
blacklist amd76x_edac
blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv
blacklist nvidia
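
If this step is scripted rather than done in vim, it helps to make it repeatable so that a second run does not duplicate entries. A minimal sketch; add_blacklist is a hypothetical helper name, not part of any package.

```shell
# Hypothetical helper: append "blacklist <module>" lines to a file, skipping
# entries that are already present so the step can be repeated safely.
add_blacklist() {
    conf="$1"
    shift
    touch "$conf"
    for mod in "$@"; do
        # -x matches the whole line, -F treats the pattern as a fixed string.
        grep -qxF "blacklist $mod" "$conf" || echo "blacklist $mod" >> "$conf"
    done
}
```

Example call with the modules listed above: add_blacklist /etc/modprobe.d/blacklist.conf snd_hda_intel amd76x_edac vga16fb nouveau rivafb nvidiafb rivatv nvidia
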

5. Find the GPU's vendor ID and product ID:
yum install pciutils -y
lspci -nn | grep NVIDIA
Example output:
[root@stein-a ~]# lspci -nn | grep NVIDIA
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104GL [Quadro P4000] [10de:1bb1] (rev a1)
03:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)

6. Edit the vfio configuration file, listing both the GPU and its audio function:
vim /etc/modprobe.d/vfio.conf
# create new: for [ids=***], specify [vendor-ID:device-ID]
options vfio-pci ids=10de:1bb1,10de:10f0
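
The IDs in the options line are the bracketed vendor:device pairs from the lspci -nn output in step 5. As a rough sketch of that mapping, the function below reads lspci -nn output on stdin and prints the line for vfio.conf; the name vfio_ids_line is hypothetical, and it assumes one ID pair per NVIDIA line in lowercase hex, as in the sample output.

```shell
# Hypothetical helper: read `lspci -nn` output on stdin and print the
# options line for /etc/modprobe.d/vfio.conf covering every NVIDIA function.
vfio_ids_line() {
    awk '
        BEGIN {
            h = "[0-9a-f]"
            # Matches a bracketed vendor:device pair such as [10de:1bb1].
            re = "\\[" h h h h ":" h h h h "\\]"
        }
        tolower($0) ~ /nvidia/ && match($0, re) {
            # Strip the surrounding brackets from the matched pair.
            id = substr($0, RSTART + 1, RLENGTH - 2)
            ids = (ids == "" ? id : ids "," id)
        }
        END {
            if (ids == "") exit 1
            print "options vfio-pci ids=" ids
        }
    '
}
```

Usage: lspci -nn | vfio_ids_line
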

7. Load vfio-pci at boot
echo 'vfio-pci' > /etc/modules-load.d/vfio-pci.conf

8. Regenerate the initramfs (the current image is kept as a .bak copy)
mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
dracut -v /boot/initramfs-$(uname -r).img $(uname -r)

9. Reboot the system
reboot

10. Verify
lspci -nnk -d 10de:1bb1
dmesg | grep -i vfio
The GPU is bound correctly when "Kernel driver in use" reports vfio-pci:
[root@stein-a ~]# lspci -nnk -d 10de:1bb1
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104GL [Quadro P4000] [10de:1bb1] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:11a3]
        Kernel driver in use: vfio-pci
        Kernel modules: nouveau
[root@stein-a ~]# dmesg | grep -i vfio
[    2.503115] VFIO - User Level meta-driver version: 0.3
[    2.515645] vfio_pci: add [10de:1bb1[ffff:ffff]] class 0x000000/00000000
[    2.515752] vfio_pci: add [10de:10f0[ffff:ffff]] class 0x000000/00000000
[root@stein-a ~]#
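
The verification can also be done by a script instead of by eye. The sketch below only checks for the "Kernel driver in use: vfio-pci" line shown in the output above; bound_to_vfio is a made-up name.

```shell
# Hypothetical helper: succeed only if the `lspci -nnk` output on stdin
# reports vfio-pci as the driver in use.
bound_to_vfio() {
    grep -q 'Kernel driver in use: vfio-pci'
}
```

Usage: lspci -nnk -d 10de:1bb1 | bound_to_vfio && echo "GPU is bound to vfio-pci"
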

2. Configuring GPU passthrough for virtual machines on Ubuntu 18.04

1. Edit /etc/modules (vim /etc/modules) and add the following modules:
pci_stub
vfio
vfio_iommu_type1
vfio_pci
kvm
kvm_intel

2. Enable IOMMU on the KVM host
# For Intel chips:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on"
# For AMD chips:
GRUB_CMDLINE_LINUX_DEFAULT="iommu=pt iommu=1"

vim /etc/default/grub

GRUB_DEFAULT=0
GRUB_TIMEOUT_STYLE=hidden
GRUB_TIMEOUT=0
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on"
GRUB_CMDLINE_LINUX=""
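
For unattended setups, the edit to /etc/default/grub can be scripted. This is a sketch rather than a tested deployment tool: add_kernel_param is a hypothetical name, it relies on GNU sed (-i, -E), it assumes the variable is written on one line with double quotes as in the file above, and it assumes the parameter contains no characters that are special to sed or grep.

```shell
# Hypothetical helper: append a kernel parameter to a quoted variable in a
# grub defaults file, unless that parameter is already listed there.
add_kernel_param() {
    file="$1"; var="$2"; param="$3"
    # Nothing to do if the parameter is already part of this variable's value.
    if grep -qE "^${var}=\"([^\"]* )?${param}( [^\"]*)?\"" "$file"; then
        return 0
    fi
    # Append the parameter before the closing quote, then drop the leading
    # space that appears when the value was empty.
    sed -i -E \
        -e "s|^(${var}=\")([^\"]*)\"|\1\2 ${param}\"|" \
        -e "s|^(${var}=\") |\1|" \
        "$file"
}
```

Usage: add_kernel_param /etc/default/grub GRUB_CMDLINE_LINUX_DEFAULT intel_iommu=on
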

3. Regenerate the GRUB configuration
# On Ubuntu the same command covers EFI and non-EFI hosts (the grub2-mkconfig paths from the CentOS section do not exist here); it writes /boot/grub/grub.cfg:
update-grub
4. Blacklist the following modules so that the host does not claim the GPU. Edit the file:
vim /etc/modprobe.d/blacklist.conf
blacklist snd_hda_intel
blacklist amd76x_edac
blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv
blacklist nvidia

5. Find the GPU's vendor ID and product ID:
apt install pciutils -y
lspci -nn | grep NVIDIA
Example output:
[root@stein-a ~]# lspci -nn | grep NVIDIA
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104GL [Quadro P4000] [10de:1bb1] (rev a1)
03:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)

6. Edit the vfio configuration file, listing both the GPU and its audio function:
vim /etc/modprobe.d/vfio.conf
# create new: for [ids=***], specify [vendor-ID:device-ID]
options vfio-pci ids=10de:1bb1,10de:10f0

7. Load vfio-pci at boot
echo 'vfio-pci' > /etc/modules-load.d/vfio-pci.conf

8. Regenerate the initramfs
# Ubuntu builds its initramfs with initramfs-tools, so use update-initramfs rather than dracut:
update-initramfs -u

9. Reboot the system
reboot

10. Verify
lspci -nnk -d 10de:1bb1
dmesg | grep -i vfio
The GPU is bound correctly when "Kernel driver in use" reports vfio-pci:
root@kvm:~# lspci -nnk -d 10de:1bb1
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104GL [Quadro P4000] [10de:1bb1] (rev a1)
        Subsystem: NVIDIA Corporation GP104GL [Quadro P4000] [10de:11a3]
        Kernel driver in use: vfio-pci
        Kernel modules: nvidiafb, nouveau
root@kvm:~# dmesg | grep -i vfio
[    3.838714] VFIO - User Level meta-driver version: 0.3
[    3.846238] vfio-pci 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
[    3.866370] vfio_pci: add [10de:1bb1[ffffffff:ffffffff]] class 0x000000/00000000
[    3.886375] vfio_pci: add [10de:10f0[ffffffff:ffffffff]] class 0x000000/00000000
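
VFIO hands devices to a guest per IOMMU group, so after the reboot it can be worth checking which devices share a group with the GPU. The listing below is a sketch that walks the /sys/kernel/iommu_groups tree; the function name and the optional root argument (there only so the function can be tried on a test directory) are assumptions for illustration.

```shell
# Hypothetical helper: print one "group N: <PCI address>" line per device.
# The optional argument overrides the sysfs directory, for testing only.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        # Skip the unexpanded glob when no groups exist (IOMMU disabled).
        [ -e "$dev" ] || continue
        group=${dev#"$root"/}   # drop the leading directory
        group=${group%%/*}      # keep only the group number
        echo "group $group: ${dev##*/}"
    done
}
```

Usage: list_iommu_groups | grep 03:00 (the bus address of the GPU in the output above). No output at all would suggest that the IOMMU is not active.
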

# For a single-node (all-in-one) deployment, configure that node.
# For a high-availability deployment, configure all three controller nodes.
1. Add the PCI device
vim /etc/kolla/config/nova/nova-compute.conf
[libvirt]
inject_password = true
cpu_mode = host-passthrough
virt_type = kvm
[pci]
passthrough_whitelist = { "vendor_id": "10de", "product_id": "1bb1" }

2. Modify nova.conf
vim /etc/kolla/config/nova.conf
[DEFAULT]
service_down_time = 120
cpu_allocation_ratio = 4.0
disk_allocation_ratio = 1.0
ram_allocation_ratio = 1.0
reserved_host_disk_mb = 4096
reserved_host_memory_mb = 4096
allow_resize_to_same_host = True
remove_unused_base_images = False
image_cache_manager_interval = 0
resume_guests_state_on_host_boot = True

[pci]
alias = { "vendor_id":"10de", "product_id":"1bb1", "device_type":"type-PCI", "name":"quadro-p4000" }
[filter_scheduler]
enabled_filters = RetryFilter, AvailabilityZoneFilter, ComputeFilter, ComputeCapabilitiesFilter, ImagePropertiesFilter, ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter, PciPassthroughFilter
available_filters = nova.scheduler.filters.all_filters
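
The vendor_id and product_id in the whitelist and alias entries are the two halves of the ID that lspci -nn prints for the GPU (10de:1bb1 here). A small sketch of that mapping for other cards; nova_pci_lines is a hypothetical helper, and the device_type value is simply copied from the alias above.

```shell
# Hypothetical helper: turn a "vendor:product" ID as printed by `lspci -nn`
# and an alias name into the whitelist and alias lines for the [pci] section.
nova_pci_lines() {
    id="$1"
    name="$2"
    vendor=${id%%:*}    # text before the colon
    product=${id#*:}    # text after the colon
    printf 'passthrough_whitelist = { "vendor_id": "%s", "product_id": "%s" }\n' "$vendor" "$product"
    printf 'alias = { "vendor_id":"%s", "product_id":"%s", "device_type":"type-PCI", "name":"%s" }\n' "$vendor" "$product" "$name"
}
```

Usage: nova_pci_lines 10de:1bb1 quadro-p4000. Following the steps above, the first line belongs in nova-compute.conf and the second in nova.conf.
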

3. Create a flavor for GPU instances (the alias in the property must match the "name" of the alias in nova.conf)
openstack flavor create --vcpus 4 --ram 8192 --disk 30 --property "pci_passthrough:alias"="quadro-p4000:1" g1.4c.8m.p400
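
The alias named in the flavor property has to be the same string as the "name" field of the alias entry in nova.conf, otherwise the request cannot be matched to a device. One way to avoid a typo is to read the name back from the config file. A sketch, assuming the single-line JSON alias format shown in step 2; alias_name is a made-up helper.

```shell
# Hypothetical helper: print the "name" value of the first alias entry
# found in the given nova.conf override file.
alias_name() {
    sed -n -E 's/^alias[[:space:]]*[:=].*"name"[[:space:]]*:[[:space:]]*"([^"]+)".*/\1/p' "$1" | head -n 1
}
```

Usage: openstack flavor create --vcpus 4 --ram 8192 --disk 30 --property "pci_passthrough:alias"="$(alias_name /etc/kolla/config/nova.conf):1" g1.4c.8m.p400
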

3. Installing the GPU driver on CentOS 7.x

1. Check whether an NVIDIA GPU is present
lspci | grep -i NVIDIA
# The output below shows one NVIDIA GPU
[root@train-all ~]#  lspci | grep -i NVIDIA
04:00.0 VGA compatible controller: NVIDIA Corporation GP104GL [Quadro P4000] (rev a1)
04:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
[root@train-all ~]#

2. Import the ELRepo GPG key
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org

3. Install the ELRepo repository
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm

4. Install nvidia-detect
yum install nvidia-detect -y

5. Run nvidia-detect
nvidia-detect -v

6. Search for the driver package
yum search kmod-nvidia

7. Install the driver
yum install kmod-nvidia.x86_64 -y

8. Check that Nouveau is disabled
lsmod | grep nouveau
# No output means nouveau is disabled; otherwise continue with the steps below.
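
In a script, the same check can be expressed as an exit status. The sketch below matches only the module-name column of lsmod; nouveau_loaded is a hypothetical name.

```shell
# Hypothetical helper: succeed if the `lsmod` output on stdin lists a
# module whose name (first column) is exactly "nouveau".
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}
```

Usage: if lsmod | nouveau_loaded; then echo "nouveau is still loaded"; fi
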

9. Create the file /etc/modprobe.d/blacklist-nouveau.conf with the following content:
vi /etc/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
options nouveau modeset=0

10. Regenerate the initramfs
dracut --force

11. Reboot the system
reboot

12. Test
nvidia-smi
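
For an automated check after the reboot, nvidia-smi -L prints one line per detected GPU, each starting with "GPU <index>:". The sketch below counts those lines; count_gpus is a made-up helper and assumes that line format.

```shell
# Hypothetical helper: count the GPUs in `nvidia-smi -L` output read from
# stdin. grep -c exits non-zero when the count is 0, hence the "|| true".
count_gpus() {
    grep -cE '^GPU [0-9]+:' || true
}
```

Usage: [ "$(nvidia-smi -L | count_gpus)" -ge 1 ] && echo "driver sees the GPU"
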