Environment

    OS        --> CentOS 7
    OpenStack -->
    Ceph      -->

Although OpenStack ships with migration and evacuation mechanisms, they are not guaranteed to succeed 100% of the time. This article covers recovering instances after an evacuation fails.

Background

At a customer site, a physical machine's system disk failed and all data on it was lost. The first idea was evacuation, run directly from the dashboard or from a controller node terminal:

    nova evacuate node09

(Strictly speaking, nova evacuate takes a server name or ID, as the help text at the end of this article shows; the per-host form is nova host-evacuate node09.)

That was not the end of it, though: two VMs simply would not evacuate. The data mattered most, and since nova, glance, and cinder all use Ceph as their backend, meaning all of the data lives in Ceph, recovery was still possible. The steps are as follows.

Rebuild the libvirt.xml file

First, look up the instance's information:

    nova show 6bb3bc65-91c3-486c-b853-2a5c879a5395

The details:

    +--------------------------------------+----------------------------------------------------------+
    | Property                             | Value                                                    |
    +--------------------------------------+----------------------------------------------------------+
    | network                              | 192.168.10.25, xx.xx.xx.xx                               |
    | OS-DCF:diskConfig                    | AUTO                                                     |
    | OS-EXT-AZ:availability_zone          | nova                                                     |
    | OS-EXT-SRV-ATTR:host                 | node03                                                   |
    | OS-EXT-SRV-ATTR:hypervisor_hostname  | node03                                                   |
    | OS-EXT-SRV-ATTR:instance_name        | instance-000001ae                                        |
    | OS-EXT-STS:power_state               | 1                                                        |
    | OS-EXT-STS:task_state                | -                                                        |
    | OS-EXT-STS:vm_state                  | active                                                   |
    | OS-SRV-USG:launched_at               | 2017-06-23T06:03:20.000000                               |
    | OS-SRV-USG:terminated_at             | -                                                        |
    | accessIPv4                           |                                                          |
    | accessIPv6                           |                                                          |
    | config_drive                         |                                                          |
    | created                              | 2017-06-23T06:01:44Z                                     |
    | flavor                               | yunwei01 (d76ea4cd-0c46-423c-9a2a-0ab15c5a1b0a)          |
    | hostId                               | 7081812a3c654f417fc545300ecd03252d8ac4bf992b54272bcfee61 |
    | id                                   | 6bb3bc65-91c3-486c-b853-2a5c879a5395                     |
    | image                                | centos65 (0488b591-1755-4da0-abe9-8e5c2a6931b5)          |
    | key_name                             | -                                                        |
    | metadata                             | {}                                                       |
    | name                                 | rhea-2                                                   |
    | os-extended-volumes:volumes_attached | [{"id": "7cd2936a-b4ef-4e85-8159-eace4c1b7981"}]         |
    | progress                             | 0                                                        |
    | security_groups                      | default                                                  |
    | status                               | ACTIVE                                                   |
    | tenant_id                            | 290f841df5644709847a51d6604a228f                         |
    | updated                              | 2017-12-21T06:46:13Z                                     |
    | user_id                              | d158732076c740aead5286969273faea                         |
    +--------------------------------------+----------------------------------------------------------+

Record this information; it will be very useful later.
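Before going further, it is also worth confirming that the instance's disks actually survived in Ceph. A minimal sanity check, assuming the pool names vms and volumes that appear in the libvirt.xml shown later in this article (adjust to match your deployment):

    # List this instance's RBD images in the nova ephemeral pool.
    rbd ls vms | grep 6bb3bc65-91c3-486c-b853-2a5c879a5395

    # Inspect the root disk, the swap disk, and the attached cinder volume.
    rbd info vms/6bb3bc65-91c3-486c-b853-2a5c879a5395_disk
    rbd info vms/6bb3bc65-91c3-486c-b853-2a5c879a5395_disk.swap
    rbd info volumes/volume-7cd2936a-b4ef-4e85-8159-eace4c1b7981

If these images are missing, the method below cannot recover the instance.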
Next, on the platform, find a running instance with the same flavor and image, note its uuid, and view its details:

    nova show $uuid

Then log in to that instance's compute node and copy its libvirt.xml and console.log:

    cd /var/lib/nova/instances/$uuid/
    cp * /mnt/bakrecovery
    cd /mnt/bakrecovery
    more libvirt.xml

The file looks like this:

    <domain type="kvm">
      <uuid>6bb3bc65-91c3-486c-b853-2a5c879a5395</uuid>
      <name>instance-000001ae</name>
      <memory>16777216</memory>
      <vcpu>16</vcpu>
      <metadata>
        <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
          <nova:package version="13.1.0-1.el7"/>
          <nova:name>rhea-2</nova:name>
          <nova:creationTime>2017-12-21 06:46:11</nova:creationTime>
          <nova:flavor name="yunwei01">
            <nova:memory>16384</nova:memory>
            <nova:disk>40</nova:disk>
            <nova:swap>2048</nova:swap>
            <nova:ephemeral>0</nova:ephemeral>
            <nova:vcpus>16</nova:vcpus>
          </nova:flavor>
          <nova:owner>
            <nova:user uuid="3eb5de8526c14da495488f1a264915f6">admin</nova:user>
            <nova:project uuid="10d0b378c82d4b5da2288ef852ca5bd1">admin</nova:project>
          </nova:owner>
          <nova:root type="image" uuid="0488b591-1755-4da0-abe9-8e5c2a6931b5"/>
        </nova:instance>
      </metadata>
      <sysinfo type="smbios">
        <system>
          <entry name="manufacturer">Fedora Project</entry>
          <entry name="product">OpenStack Nova</entry>
          <entry name="version">13.1.0-1.el7</entry>
          <entry name="serial">d3c7a080-2534-4b5f-9d86-9cd152acc671</entry>
          <entry name="uuid">6bb3bc65-91c3-486c-b853-2a5c879a5395</entry>
          <entry name="family">Virtual Machine</entry>
        </system>
      </sysinfo>
      <os>
        <type>hvm</type>
        <boot dev="hd"/>
        <smbios mode="sysinfo"/>
      </os>
      <features>
        <acpi/>
        <apic/>
      </features>
      <cputune>
        <shares>16384</shares>
      </cputune>
      <clock offset="utc">
        <timer name="pit" tickpolicy="delay"/>
        <timer name="rtc" tickpolicy="catchup"/>
        <timer name="hpet" present="no"/>
      </clock>
      <cpu mode="host-model" match="exact">
        <topology sockets="16" cores="1" threads="1"/>
      </cpu>
      <devices>
        <disk type="network" device="disk">
          <driver type="raw" cache="writeback"/>
          <source protocol="rbd" name="vms/6bb3bc65-91c3-486c-b853-2a5c879a5395_disk">
            <host name="172.16.2.5" port="6789"/>
            <host name="172.16.2.6" port="6789"/>
            <host name="172.16.2.7" port="6789"/>
          </source>
          <auth username="cinder">
            <secret type="ceph" uuid="14fdf1bb-44d7-40ad-a98a-16fe7e65115b"/>
          </auth>
          <target bus="virtio" dev="vda"/>
        </disk>
        <disk type="network" device="disk">
          <driver type="raw" cache="writeback"/>
          <source protocol="rbd" name="vms/6bb3bc65-91c3-486c-b853-2a5c879a5395_disk.swap">
            <host name="172.16.2.5" port="6789"/>
            <host name="172.16.2.6" port="6789"/>
            <host name="172.16.2.7" port="6789"/>
          </source>
          <auth username="cinder">
            <secret type="ceph" uuid="14fdf1bb-44d7-40ad-a98a-16fe7e65115b"/>
          </auth>
          <target bus="virtio" dev="vdb"/>
        </disk>
        <disk type="network" device="disk">
          <driver name="qemu" type="raw" cache="writeback"/>
          <source protocol="rbd" name="volumes/volume-7cd2936a-b4ef-4e85-8159-eace4c1b7981">
            <host name="172.16.2.5" port="6789"/>
            <host name="172.16.2.6" port="6789"/>
            <host name="172.16.2.7" port="6789"/>
          </source>
          <auth username="cinder">
            <secret type="ceph" uuid="14fdf1bb-44d7-40ad-a98a-16fe7e65115b"/>
          </auth>
          <target bus="virtio" dev="vdc"/>
          <serial>7cd2936a-b4ef-4e85-8159-eace4c1b7981</serial>
        </disk>
        <interface type="bridge">
          <mac address="fa:16:3e:34:e3:fe"/>
          <model type="virtio"/>
          <source bridge="qbrff441fbc-bb"/>
          <target dev="tapff441fbc-bb"/>
        </interface>
        <serial type="file">
          <source path="/var/lib/nova/instances/6bb3bc65-91c3-486c-b853-2a5c879a5395/console.log"/>
        </serial>
        <serial type="pty"/>
        <input type="tablet" bus="usb"/>
        <graphics type="vnc" autoport="yes" keymap="en-us" listen="0.0.0.0"/>
        <video>
          <model type="cirrus"/>
        </video>
        <memballoon model="virtio">
          <stats period="10"/>
        </memballoon>
      </devices>
    </domain>

The following parts need to be modified.

1. The name and uuid:

    <uuid>6bb3bc65-91c3-486c-b853-2a5c879a5395</uuid>
    <name>instance-000001ae</name>

The information recorded earlier comes in handy right away.

2. The name and creation time:

    <nova:name>rhea-2</nova:name>
    <nova:creationTime>2017-12-21 06:46:11</nova:creationTime>

This name is the one we gave the instance when it was created; the instance name above (instance-000001ae) is generated automatically.

3. The related uuids (image, project, and user):

    <nova:user uuid="3eb5de8526c14da495488f1a264915f6">admin</nova:user>
    <nova:project uuid="10d0b378c82d4b5da2288ef852ca5bd1">admin</nova:project>
    </nova:owner>
    <nova:root type="image" uuid="0488b591-1755-4da0-abe9-8e5c2a6931b5"/>

4. The VM uuid and serial in the smbios block; comparing files shows that the serial is the same for every instance on a given compute node:

    <entry name="serial">d3c7a080-2534-4b5f-9d86-9cd152acc671</entry>
    <entry name="uuid">6bb3bc65-91c3-486c-b853-2a5c879a5395</entry>
    <entry name="family">Virtual Machine</entry>

5. The block device names, in the form <vm uuid>_disk and <vm uuid>_disk.swap:

    <source protocol="rbd" name="vms/6bb3bc65-91c3-486c-b853-2a5c879a5395_disk">
    <source protocol="rbd" name="vms/6bb3bc65-91c3-486c-b853-2a5c879a5395_disk.swap">

6. The id of the volume allocated through cinder:

    <target bus="virtio" dev="vdc"/>
    <serial>7cd2936a-b4ef-4e85-8159-eace4c1b7981</serial>
    </disk>

7. The VM NIC's MAC address and port. The MAC address can be found in the dashboard under the admin account: Admin --> Networks --> (network) uuid --> Ports:

    <interface type="bridge">
      <mac address="fa:16:3e:34:e3:fe"/>
      <model type="virtio"/>
      <source bridge="qbrff441fbc-bb"/>
      <target dev="tapff441fbc-bb"/>

8. The console log path:

    <source path="/var/lib/nova/instances/6bb3bc65-91c3-486c-b853-2a5c879a5395/console.log"/>

9. When the edits are done, double-check every field carefully against the record from nova show to avoid problems (a scripted cross-check is sketched after this list), then put the files in place:

    mkdir -p /var/lib/nova/instances/$uuid
    cp libvirt.xml /var/lib/nova/instances/$uuid
    cp console.log /var/lib/nova/instances/$uuid

Fix the ownership:

    chown nova:nova /var/lib/nova/instances/$uuid/libvirt.xml
    chown qemu:qemu /var/lib/nova/instances/$uuid/console.log
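Hand-editing this many fields across a long XML file is easy to get wrong. As a cross-check on steps 1 through 8, the bulk of the substitutions can also be scripted; a rough sketch, where the OLD_* values are illustrative placeholders standing in for the reference instance's identifiers (take the real values from its own nova show output):

    # Placeholders for the reference (donor) instance -- substitute your own.
    OLD_UUID=11111111-2222-3333-4444-555555555555
    OLD_NAME=instance-000001b0
    OLD_MAC=fa:16:3e:aa:bb:cc

    # Values recorded from `nova show` for the instance being recovered.
    NEW_UUID=6bb3bc65-91c3-486c-b853-2a5c879a5395
    NEW_NAME=instance-000001ae
    NEW_MAC=fa:16:3e:34:e3:fe

    # Replace every occurrence in the copied file; the uuid substitution
    # also covers the RBD disk names and the console.log path.
    sed -i -e "s/$OLD_UUID/$NEW_UUID/g" \
           -e "s/$OLD_NAME/$NEW_NAME/g" \
           -e "s/$OLD_MAC/$NEW_MAC/g" libvirt.xml

The owner/image uuids, cinder volume id and serial, and the port suffix (the qbr/tap names) still need individual attention, so a final manual comparison of the edited fields remains essential.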
Next, update the node information in the database (the instances table in the nova database):

    update instances set host='node10', node='node10' where uuid='6bb3bc65-91c3-486c-b853-2a5c879a5395';

Restart the nova-compute service on that node:

    systemctl restart openstack-nova-compute

Now start the instance. Networking may not come up after boot; if so, a migration will fix it.

When a compute node loses power unexpectedly or otherwise fails so that it can no longer work, how do you move its VMs to another available compute node?

Because the nova-compute service on the dead node is already down, shelve and migrate operations cannot run and will report errors.

Nova provides the evacuate command, which moves a VM off a dead or failed compute node and rebuilds it on an available one. The command is only available on the command line, and to preserve the disk contents it requires shared storage between the failed node and the target node.

    [root@01 ~]# nova help evacuate
    usage: nova evacuate [--password <password>] [--force] <server> [<host>]

    Evacuate server from failed host.

    Positional arguments:
      <server>                Name or ID of server.
      <host>                  Name or ID of the target host. If no host is
                              specified, the scheduler will choose one.

    Optional arguments:
      --password <password>   Set the provided admin password on the evacuated
                               server. Not applicable if the server is on shared
                               storage.
      --force                 Force to not verify the scheduler if a host is
                              provided. (Supported by API versions '2.29' -
                              '2.latest')

    [root@01 ~]# openstack server list
    +--------------------------------------+------+---------+--------------------+------------+
    | ID                                   | Name | Status  | Networks           | Image Name |
    +--------------------------------------+------+---------+--------------------+------------+
    | 07ff8391-47d4-4304-8b76-6db5bd65a8c0 | demo | SHUTOFF | private63=10.0.0.6 |            |
    +--------------------------------------+------+---------+--------------------+------------+

    [root@01 ~]# nova evacuate demo

In the dashboard you can then see that the VM has been moved to an available compute node and is running normally.
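A final note: evacuating a failed host's instances one at a time gets tedious. python-novaclient also ships a host-evacuate subcommand that runs an evacuate for every instance on the given host; a sketch of how it could be applied to the failure described at the start of this article (exact options vary between client versions, so check nova help host-evacuate first):

    # Evacuate everything that was running on the failed node node09;
    # the scheduler chooses target hosts. As with nova evacuate, shared
    # storage (Ceph here) is needed to preserve the disk contents.
    nova host-evacuate node09

    # Or pin the rebuilds to a specific healthy node:
    nova host-evacuate --target_host node10 node09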