易陆发现 Internet Technology Forum

Automated kolla-ansible deployment of OpenStack with GPU passthrough

Posted 2021-6-25 11:36:52

1. CentOS 7.x-8.x: configure GPU passthrough for virtual machines

1. Edit /etc/modules (vim /etc/modules) and add the modules listed below. CentOS loads boot-time modules from /etc/modules-load.d/*.conf (/etc/modules is the Debian/Ubuntu convention), so on CentOS put the same list in a .conf file in that directory. On AMD hosts use kvm_amd instead of kvm_intel.
pci_stub
vfio
vfio_iommu_type1
vfio_pci
kvm
kvm_intel

2. Enable IOMMU on the KVM host
# For Intel CPUs:
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on"
# For AMD CPUs:
GRUB_CMDLINE_LINUX_DEFAULT="iommu=pt iommu=1"

Edit /etc/default/grub (vim /etc/default/grub). In the CentOS example below (Intel host) the parameter is appended to the existing GRUB_CMDLINE_LINUX line:

GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet intel_iommu=on"
GRUB_DISABLE_RECOVERY="true"

3. Regenerate the GRUB configuration
   EFI:
   grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg
   Non-EFI (BIOS):
   grub2-mkconfig -o /boot/grub2/grub.cfg
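
Which of the two commands applies depends on how the host was booted. A minimal sketch, assuming the usual Linux convention that /sys/firmware/efi exists only on EFI-booted systems. The function name grub_cfg_path and its optional directory argument are illustrative and not part of the original post:

```shell
# Print the grub.cfg path from step 3 that matches the boot mode.
# $1 (optional) overrides the directory probed for EFI; it exists only
# to make the function easy to try out and is an assumption of this sketch.
grub_cfg_path() {
    efi_dir=${1:-/sys/firmware/efi}
    if [ -d "$efi_dir" ]; then
        echo /boot/efi/EFI/centos/grub.cfg
    else
        echo /boot/grub2/grub.cfg
    fi
}

# Show the command that would be run; remove "echo" to run it as root.
echo grub2-mkconfig -o "$(grub_cfg_path)"
```

Printing the command first makes it easy to confirm the target path before overwriting grub.cfg.
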
4. Blacklist the following modules so that the host does not claim the GPU. Edit /etc/modprobe.d/blacklist.conf:
vim /etc/modprobe.d/blacklist.conf
blacklist snd_hda_intel
blacklist amd76x_edac
blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv
blacklist nvidia
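
A quick way to confirm that no entry was missed is to compare the file against the module list. The sketch below is only an illustration; the helper name missing_blacklist is hypothetical, and it only checks for plain "blacklist <module>" lines in one file:

```shell
# Print every module named after the first argument that has no
# "blacklist <module>" line in the file given as the first argument.
missing_blacklist() {
    conf=$1
    shift
    for mod in "$@"; do
        if ! grep -Eq "^[[:space:]]*blacklist[[:space:]]+${mod}[[:space:]]*\$" "$conf" 2>/dev/null; then
            echo "$mod"
        fi
    done
}

# Lists the modules from step 4 that are not yet blacklisted
# (all of them if the file does not exist).
missing_blacklist /etc/modprobe.d/blacklist.conf \
    snd_hda_intel amd76x_edac vga16fb nouveau rivafb nvidiafb rivatv nvidia
```

An empty result means every module from the list is covered.
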

5. Find the GPU's vendor ID and product ID:
yum install pciutils -y
lspci -nn | grep NVIDIA
Example output:
[root@stein-a ~]# lspci -nn | grep NVIDIA
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104GL [Quadro P4000] [10de:1bb1] (rev a1)
03:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)

6. Edit /etc/modprobe.d/vfio.conf
vim /etc/modprobe.d/vfio.conf
# create new: for [ids=***], specify [vendor-ID:device-ID]
options vfio-pci ids=10de:1bb1,10de:10f0
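
The ids value is simply the bracketed vendor:device pairs from the lspci output in step 5, joined with commas. A minimal sketch of deriving that line automatically; the function name vfio_ids_line is hypothetical, and matching on the string "NVIDIA" mirrors the grep used in step 5:

```shell
# Read `lspci -nn` output on stdin and print an "options vfio-pci ids=..."
# line covering every NVIDIA function found (GPU and its audio device).
vfio_ids_line() {
    vfio_ids=$(grep -i 'nvidia' \
        | grep -oE '\[[0-9a-fA-F]{4}:[0-9a-fA-F]{4}\]' \
        | tr -d '[]' \
        | awk '!seen[$0]++' \
        | paste -sd, -)
    if [ -n "$vfio_ids" ]; then
        printf 'options vfio-pci ids=%s\n' "$vfio_ids"
    else
        return 1
    fi
}

# Preview only: nothing is written to /etc/modprobe.d/vfio.conf here.
if command -v lspci >/dev/null 2>&1; then
    lspci -nn | vfio_ids_line || echo "no NVIDIA device found" >&2
fi
```

Fed the two lines shown in step 5, the function prints the same options line as above. Review the output before redirecting it into vfio.conf, since a host with several NVIDIA cards would have all of them handed to vfio-pci.
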

7. Load vfio-pci at boot
echo 'vfio-pci' > /etc/modules-load.d/vfio-pci.conf

8. Regenerate the initramfs (the current image is backed up first)
mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
dracut -v /boot/initramfs-$(uname -r).img $(uname -r)

9. Reboot the system
reboot

10. Verify that the GPU is bound to vfio-pci (look for "Kernel driver in use: vfio-pci")
lspci -nnk -d 10de:1bb1
dmesg | grep -i vfio
[root@stein-a ~]# lspci -nnk -d 10de:1bb1
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104GL [Quadro P4000] [10de:1bb1] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:11a3]
        Kernel driver in use: vfio-pci
        Kernel modules: nouveau
[root@stein-a ~]# dmesg | grep -i vfio
[    2.503115] VFIO - User Level meta-driver version: 0.3
[    2.515645] vfio_pci: add [10de:1bb1[ffff:ffff]] class 0x000000/00000000
[    2.515752] vfio_pci: add [10de:10f0[ffff:ffff]] class 0x000000/00000000
[root@stein-a ~]#
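
The check in step 10 can be scripted so that it flags any NVIDIA function still held by another driver. This is a sketch under the assumption that lspci -nnk prints one unindented line per device followed by indented detail lines, as in the output above; the function name not_bound_to_vfio is hypothetical:

```shell
# Read `lspci -nnk` output on stdin and print the PCI address of every
# device whose "Kernel driver in use" is missing or is not vfio-pci.
not_bound_to_vfio() {
    awk '
        /^[0-9a-fA-F]/ {                       # a new device record starts
            if (addr != "" && drv != "vfio-pci") print addr
            addr = $1; drv = ""
        }
        /Kernel driver in use:/ { drv = $NF }
        END { if (addr != "" && drv != "vfio-pci") print addr }
    '
}

# "-d 10de:" selects every device of vendor 10de (NVIDIA).
if command -v lspci >/dev/null 2>&1; then
    lspci -nnk -d 10de: | not_bound_to_vfio
fi
```

No output means every NVIDIA function is bound to vfio-pci.
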
2. Ubuntu 18.04: configure GPU passthrough for virtual machines

1. Edit /etc/modules (vim /etc/modules) and add the following modules:
pci_stub
vfio
vfio_iommu_type1
vfio_pci
kvm
kvm_intel

2. Enable IOMMU on the KVM host
# For Intel CPUs:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on"
# For AMD CPUs:
GRUB_CMDLINE_LINUX_DEFAULT="iommu=pt iommu=1"

Edit /etc/default/grub (vim /etc/default/grub); example for an Intel host:

GRUB_DEFAULT=0
GRUB_TIMEOUT_STYLE=hidden
GRUB_TIMEOUT=0
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on"
GRUB_CMDLINE_LINUX=""

3. Regenerate the GRUB configuration. Ubuntu provides update-grub, a wrapper around grub-mkconfig -o /boot/grub/grub.cfg, which applies to both EFI and non-EFI systems:
update-grub
4. Blacklist the following modules so that the host does not claim the GPU. Edit /etc/modprobe.d/blacklist.conf:
vim /etc/modprobe.d/blacklist.conf
blacklist snd_hda_intel
blacklist amd76x_edac
blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv
blacklist nvidia

5. Find the GPU's vendor ID and product ID:
apt install pciutils -y
lspci -nn | grep NVIDIA
Example output:
[root@stein-a ~]# lspci -nn | grep NVIDIA
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104GL [Quadro P4000] [10de:1bb1] (rev a1)
03:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)

6. Edit /etc/modprobe.d/vfio.conf
vim /etc/modprobe.d/vfio.conf
# create new: for [ids=***], specify [vendor-ID:device-ID]
options vfio-pci ids=10de:1bb1,10de:10f0

7. Load vfio-pci at boot
echo 'vfio-pci' > /etc/modules-load.d/vfio-pci.conf

8. Regenerate the initramfs. Ubuntu uses initramfs-tools rather than dracut by default:
update-initramfs -u

9. Reboot the system
reboot

10. Verify that the GPU is bound to vfio-pci (look for "Kernel driver in use: vfio-pci")
lspci -nnk -d 10de:1bb1
dmesg | grep -i vfio
root@kvm:~# lspci -nnk -d 10de:1bb1
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104GL [Quadro P4000] [10de:1bb1] (rev a1)
        Subsystem: NVIDIA Corporation GP104GL [Quadro P4000] [10de:11a3]
        Kernel driver in use: vfio-pci
        Kernel modules: nvidiafb, nouveau
root@kvm:~# dmesg | grep -i vfio
[    3.838714] VFIO - User Level meta-driver version: 0.3
[    3.846238] vfio-pci 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
[    3.866370] vfio_pci: add [10de:1bb1[ffffffff:ffffffff]] class 0x000000/00000000
[    3.886375] vfio_pci: add [10de:10f0[ffffffff:ffffffff]] class 0x000000/00000000
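
If vfio-pci does not show up as the driver in use, it is worth confirming that the IOMMU parameter from step 2 actually reached the running kernel. A minimal sketch, assuming the parameters used in this post (intel_iommu=on, or iommu=pt for AMD) and the standard /proc/cmdline and /sys/kernel/iommu_groups locations; the function name cmdline_has_iommu is hypothetical:

```shell
# Succeed if the kernel command line in $1 carries one of the IOMMU
# parameters used in step 2 (intel_iommu=on, or iommu=pt for AMD).
cmdline_has_iommu() {
    case " $1 " in
        *" intel_iommu=on "*|*" iommu=pt "*) return 0 ;;
        *) return 1 ;;
    esac
}

if [ -r /proc/cmdline ]; then
    if cmdline_has_iommu "$(cat /proc/cmdline)"; then
        echo "IOMMU parameter present on the kernel command line"
    else
        echo "IOMMU parameter missing from the kernel command line"
    fi
fi

# /sys/kernel/iommu_groups is populated only when the IOMMU is active.
if [ -d /sys/kernel/iommu_groups ]; then
    echo "IOMMU groups: $(ls /sys/kernel/iommu_groups | wc -l)"
fi
```

A group count of zero usually points to the IOMMU being disabled in firmware (VT-d or AMD-Vi) even though the kernel parameter is set.
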
# For a single-node (all-in-one) deployment, configure this on that node.
# For a high-availability deployment, configure this on all three controller nodes.
1. Add the PCI settings
vim /etc/kolla/config/nova/nova-compute.conf
[libvirt]
inject_password=true
cpu_mode=host-passthrough
virt_type = kvm
[pci]
passthrough_whitelist = { "vendor_id": "10de", "product_id": "1bb1" }

2. Modify nova.conf
vim /etc/kolla/config/nova.conf
[DEFAULT]
service_down_time = 120
cpu_allocation_ratio = 4.0
disk_allocation_ratio = 1.0
ram_allocation_ratio = 1.0
reserved_host_disk_mb = 4096
reserved_host_memory_mb = 4096
allow_resize_to_same_host = True
remove_unused_base_images = False
image_cache_manager_interval = 0
resume_guests_state_on_host_boot = True

[pci]
alias = { "vendor_id":"10de", "product_id":"1bb1", "device_type":"type-PCI", "name":"quadro-p4000" }
[filter_scheduler]
enabled_filters = RetryFilter, AvailabilityZoneFilter, ComputeFilter, ComputeCapabilitiesFilter, ImagePropertiesFilter, ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter, PciPassthroughFilter
available_filters = nova.scheduler.filters.all_filters

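3. Create a flavor for GPU instances. The alias in the pci_passthrough:alias property must match the alias name defined in nova.conf (quadro-p4000); the trailing :1 requests one device:
openstack flavor create --vcpus 4 --ram 8192 --disk 30 --property "pci_passthrough:alias"="quadro-p4000:1" g1.4c.8m.p400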
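
Because the scheduler matches the flavor property against the alias name literally, a one-character mismatch leaves GPU instances unschedulable. The sketch below reads the name back from the config file instead of typing it twice. The function names pci_alias_name and flavor_pci_property are hypothetical, and the sed pattern assumes the alias is written on a single line as in step 2:

```shell
# Print the "name" value from the alias line of a nova.conf-style file.
pci_alias_name() {
    sed -n 's/^alias[[:space:]]*[=:].*"name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Print the flavor property for that alias; $2 is the number of devices.
flavor_pci_property() {
    name=$(pci_alias_name "$1")
    [ -n "$name" ] || return 1
    printf 'pci_passthrough:alias=%s:%s\n' "$name" "${2:-1}"
}

conf=/etc/kolla/config/nova.conf
if [ -r "$conf" ]; then
    flavor_pci_property "$conf" 1 || echo "no alias found in $conf" >&2
fi
```

The printed string can be passed to openstack flavor create via --property. Files under /etc/kolla/config normally take effect only after a kolla-ansible reconfigure run against your own inventory.
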
3. CentOS 7.x: install the GPU driver

1. Check whether an NVIDIA GPU is present
lspci | grep -i NVIDIA
# The output below shows one NVIDIA card
[root@train-all ~]#  lspci | grep -i NVIDIA
04:00.0 VGA compatible controller: NVIDIA Corporation GP104GL [Quadro P4000] (rev a1)
04:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
[root@train-all ~]#

2. Import the ELRepo GPG key
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org

3. Install the ELRepo repository
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm

4. Install nvidia-detect
yum install nvidia-detect -y

5. Run nvidia-detect
nvidia-detect -v

6. Search for the driver package
yum search kmod-nvidia

7. Install the driver
yum install kmod-nvidia.x86_64 -y

8. Check whether nouveau is disabled
lsmod | grep nouveau
# No output means nouveau is already disabled; otherwise continue with the steps below

9. Create /etc/modprobe.d/blacklist-nouveau.conf with the following content:
vi /etc/modprobe.d/blacklist-nouveau.conf
Add:
blacklist nouveau
options nouveau modeset=0

10. Regenerate the kernel initramfs
dracut --force

11. Reboot the system
reboot

12. Test
nvidia-smi
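
Steps 8 and 12 can be combined into one post-reboot check. This is only a sketch: the function name nouveau_loaded is hypothetical, and the nvidia-smi query prints just the GPU name and driver version rather than the full status table:

```shell
# Succeed if `lsmod` output on stdin lists the nouveau module.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

if command -v lsmod >/dev/null 2>&1 && lsmod | nouveau_loaded; then
    echo "nouveau is still loaded; recheck steps 9 to 11"
fi

# Query name and driver version only when the NVIDIA tools are installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
fi
```

Matching on the first lsmod column avoids false positives from modules that merely list nouveau as a dependent.
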