During instance creation, OpenStack ultimately builds the VM's disk file from a Glance image. The code lives in _create_image in /nova/virt/libvirt.py, and honestly the process is a bit painful: the logic is not especially clear, and some of the method names do a fine job of misleading the reader, so I can only guess that _create_image was written by an ops colleague :^). Lately a coworker has been complaining to me about OpenStack's architecture: things that should be coupled are split apart, things that should be split are coupled, and developers who don't understand ops are doing devops work, so the people who can't run it write the code and the people who can't write code end up running it, and nobody finds the result pleasant to use. That is why dedicated OpenStack service companies exist, grow large, and earn real recognition, at least in the US. Anyway, let's walk through the create_image flow.

# image_backend = Backend()  # file: /nova/virt/libvirt/imagebackend.py
# Backend().image() returns a Qcow2() instance
def image(fname, image_type=FLAGS.libvirt_images_type):
    return self.image_backend.image(instance['name'], fname + suffix, image_type)

# root_fname is the cache file name under /var/lib/nova/instances/_base/<root_fname>
# root_fname = hashlib.sha1(image_uuid).hexdigest()
image('disk').cache(fetch_func=libvirt_utils.fetch_image,
                    context=context,
                    filename=root_fname,
                    size=size,
                    image_id=disk_images['image_id'],
                    user_id=instance['user_id'],
                    project_id=instance['project_id'])


# A look at the cache() method:
def cache(self, fetch_func, filename, size=None, *args, **kwargs):
    """Creates image from template.

    Ensures that template and image are not already present.
    Ensures that the base directory exists.
    Synchronizes on template fetching.

    :fetch_func: Function that creates the base image
                 Should accept `target` argument.
    :filename: Name of the file in the image directory
    :size: Size of created image in bytes (optional)
    """

    @utils.synchronized(filename, external=True, lock_path=self.lock_path)
    def call_if_not_exists(target, *args, **kwargs):
        if not os.path.exists(target):
            fetch_func(target=target, *args, **kwargs)

    if not os.path.exists(self.path):
        base_dir = os.path.join(FLAGS.instances_path, '_base')
        if not os.path.exists(base_dir):
            utils.ensure_tree(base_dir)
        base = os.path.join(base_dir, filename)

        self.create_image(call_if_not_exists, base, size,
                          *args, **kwargs)
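The heart of cache() is "fetch once, under an external lock". A simplified standalone sketch of that pattern, with hypothetical names and a plain flock in place of nova's utils.synchronized:

```python
import fcntl
import os


def fetch_if_missing(target, fetch_func, lock_dir='/tmp'):
    """Download `target` via fetch_func exactly once, even across processes.

    A simplified analogue of call_if_not_exists: an exclusive flock on a
    sidecar lock file serializes fetchers; whoever wins the race fetches,
    everyone else sees the file already present and returns immediately.
    """
    lock_path = os.path.join(lock_dir, os.path.basename(target) + '.lock')
    with open(lock_path, 'w') as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)
        try:
            if not os.path.exists(target):
                fetch_func(target=target)
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

The lock is external (a file), which is what lets separate nova-compute worker processes on the same host share one _base cache safely.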


# And the create_image() method:
class Qcow2(Image):
    def __init__(self, instance, name):
        super(Qcow2, self).__init__("file", "qcow2", is_block_dev=False)

        self.path = os.path.join(FLAGS.instances_path,
                                 instance, name)

    def create_image(self, prepare_template, base, size, *args, **kwargs):
        @utils.synchronized(base, external=True, lock_path=self.lock_path)
        def copy_qcow2_image(base, target, size):
            qcow2_base = base
            if size:
                size_gb = size / (1024 * 1024 * 1024)
                qcow2_base += '_%d' % size_gb
                if not os.path.exists(qcow2_base):
                    with utils.remove_path_on_error(qcow2_base):
                        libvirt_utils.copy_image(base, qcow2_base)
                        disk.extend(qcow2_base, size)
            libvirt_utils.create_cow_image(qcow2_base, target)

        prepare_template(target=base, *args, **kwargs)
        # NOTE(cfb): Having a flavor that sets the root size to 0 and having
        #            nova effectively ignore that size and use the size of the
        #            image is considered a feature at this time, not a bug.
        if size and size < disk.get_disk_size(base):
            LOG.error('%s virtual size larger than flavor root disk size %s' %
                      (base, size))
            raise exception.ImageTooLarge()
        with utils.remove_path_on_error(self.path):
            copy_qcow2_image(base, self.path, size)
The logical flow of create_image is:
1. fetch the image from Glance and save it into _base as <cache_id>.part
2. if the image format is not raw, run: qemu-img convert -O raw /var/lib/nova/instances/_base/<cache_id>.part /var/lib/nova/instances/_base/<cache_id>.converted
3. rm <cache_id>.part
4. rename <cache_id>.converted to <cache_id>
5. copy_qcow2_image inside _base: cp <cache_id> <cache_id>_<root_disk_size>
6. disk.extend(<cache_id>_<root_disk_size>, root_disk_size)
7. create_cow_image: qemu-img create -f qcow2 -o backing_file=<cache_id>_<root_disk_size> /var/lib/nova/instances/instance-xxxxxxxx/disk
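The seven steps above can be sketched as command construction. This is purely illustrative: the helper name, default paths, and the use of qemu-img resize in place of nova's disk.extend are my own assumptions, not nova's actual code.

```python
import hashlib


def build_image_commands(image_uuid, instance_name, root_disk_size_gb,
                         base_dir='/var/lib/nova/instances/_base',
                         instances_dir='/var/lib/nova/instances'):
    """Render steps 1-7 above as shell command strings (illustration only)."""
    cache_id = hashlib.sha1(image_uuid.encode('utf-8')).hexdigest()
    base = '%s/%s' % (base_dir, cache_id)
    sized = '%s_%d' % (base, root_disk_size_gb)
    target = '%s/%s/disk' % (instances_dir, instance_name)
    return [
        # steps 2-4: normalize the fetched image to raw under the cache name
        'qemu-img convert -O raw %s.part %s.converted' % (base, base),
        'rm %s.part' % base,
        'mv %s.converted %s' % (base, base),
        # step 5: size-specific copy of the cached base
        'cp %s %s' % (base, sized),
        # step 6: disk.extend() corresponds roughly to a qemu-img resize
        'qemu-img resize %s %dG' % (sized, root_disk_size_gb),
        # step 7: the instance disk is a qcow2 overlay on the sized base
        'qemu-img create -f qcow2 -o backing_file=%s %s' % (sized, target),
    ]
```

Note how everything except the last command operates inside _base: only the thin qcow2 overlay lives in the instance directory.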

Because of this cache, two things must be done after an image template is modified:
1. delete the cached copy under _base; the mapping between cache_id and image is:
    root_fname = hashlib.sha1(image_id).hexdigest()
2. update the image's checksum in the database (write a small tool, or just md5sum it):
    glance image-update <uuid> --checksum=xxxxxxxx

Sometimes, though, md5sum's result does not match what the code computes. Glance calculates the image checksum with the method below (you can copy it directly, although I recommend typing it out yourself; working through OpenStack code by hand is a great way to learn, given how many excellent people have contributed to the project):

#!/usr/bin/env python
# -*- coding: utf-8 -*-


import hashlib
import sys


def chunkreadable(iter, chunk_size=65536):
    """
    Wrap a readable iterator with a reader yielding chunks of
    a preferred size, otherwise leave iterator unchanged.

    :param iter: an iter which may also be readable
    :param chunk_size: maximum size of chunk
    """
    return chunkiter(iter, chunk_size) if hasattr(iter, 'read') else iter


def chunkiter(fp, chunk_size=65536):
    """
    Return an iterator to a file-like obj which yields fixed size chunks

    :param fp: a file-like object
    :param chunk_size: maximum size of chunk
    """
    while True:
        chunk = fp.read(chunk_size)
        if chunk:
            yield chunk
        else:
            break


def calc_md5(img_path):
    checksum = hashlib.md5()
    bytes_read = 0

    # open in binary mode: images are binary data
    with open(img_path, 'rb') as f:
        for buf in chunkreadable(f, 65536):
            bytes_read += len(buf)
            checksum.update(buf)

    checksum_hex = checksum.hexdigest()
    print "%s = %s" % (img_path, checksum_hex)
    return checksum_hex


def main(uuid):
    calc_md5("/var/lib/glance/images/%s" % uuid)


if __name__ == "__main__":
    """usage: python glance-md5.py <image-uuid>"""
    main(sys.argv[1])

That is a basic tour of the create_image flow. Honestly, I think OpenStack networking is the truly complicated part. We are still on VlanManager, but I don't know iptables well (NAT, SNAT and friends), so when floating IP problems come up I don't really know how to debug them and end up guessing my way through with arp, tcpdump and tracepath. That's all for now.

[References]
[The life of an OpenStack libvirt image]

qemu-img -h
usage: qemu-img command [command options]
QEMU disk image utility

Command syntax:
  check [-f fmt] [-r [leaks | all]] filename
  create [-f fmt] [-o options] filename [size]

qemu-img create -f raw ./images/cirros.raw 1G
>Formatting './images/cirros.raw', fmt=raw size=1073741824
ll -h
>-rw-r--r-- 1 root root 1.0G Jul 13 21:53 cirros.raw
qemu-system-x86_64 /images/cirros.raw

>backing_file
qemu-img create -f qcow2 cirros_backing.qcow2 -o backing_file=cirros.raw 1G
ll -h
>-rw-r--r-- 1 root root 193K Jul 13 22:28 cirros_backing.qcow2
qemu-img info cirros_backing.qcow2
>image: cirros_backing.qcow2
>file format: qcow2
>virtual size: 1.0G (1073741824 bytes)
>disk size: 196K
>cluster_size: 65536
>backing file: cirros.raw
>a template created with backing_file can be converted to raw; the result contains the whole template
>the convert below merges cirros_backing.qcow2 with cirros.raw into a new raw-format template
qemu-img convert -O raw cirros_backing.qcow2 cirros_backing.raw
qemu-img info cirros_backing.raw
>image: cirros_backing.raw
>file format: raw
>virtual size: 1.0G (1073741824 bytes)
>disk size: 0
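When scripting around backing chains, `qemu-img info --output=json` is easier to consume than the human-readable output above. A small parser sketch; the sample JSON is abbreviated, and the field names assume a reasonably recent qemu-img:

```python
import json

# Abbreviated sample of `qemu-img info --output=json cirros_backing.qcow2`
# (values taken from the example above; assumed, not captured verbatim).
sample_info = '''{
    "virtual-size": 1073741824,
    "filename": "cirros_backing.qcow2",
    "cluster-size": 65536,
    "format": "qcow2",
    "actual-size": 200704,
    "backing-filename": "cirros.raw"
}'''


def backing_chain_entry(info_json):
    """Return (format, virtual size, backing file) from qemu-img info JSON.

    backing-filename is absent for images with no backing file, hence .get().
    """
    info = json.loads(info_json)
    return (info['format'], info['virtual-size'], info.get('backing-filename'))
```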

commit [-f fmt] [-t cache] filename
convert [-c] [-p] [-f fmt] [-t cache] [-O output_fmt] [-o options] [-s snapshot_name] [-S sparse_size] filename [filename2 [...]] output_filename

qemu-img convert -c -f raw -O qcow2 cirros.raw cirros.qcow2
ll -h
>-rw-r--r-- 1 root root 193K Jul 13 21:58 cirros.qcow2
>-rw-r--r-- 1 root root 1.0G Jul 13 21:53 cirros.raw
qemu-img convert -c -f raw -O qcow2 cirros.raw cirros.qcow2 -o ?
>Supported options:
>size             Virtual disk size
>compat           Compatibility level (0.10 or 1.1)
>backing_file     File name of a base image
>backing_fmt      Image format of the base image
>encryption       Encrypt the image
>cluster_size     qcow2 cluster size
>preallocation    Preallocation mode (allowed values: off, metadata)
>lazy_refcounts   Postpone refcount updates

info [-f fmt] [--output=ofmt] [--backing-chain] filename

qemu-img info cirros.raw
>image: cirros.raw
>file format: raw
>virtual size: 1.0G (1073741824 bytes)
>disk size: 0
>although ll shows the file as 1G, the actual disk usage is 0. This is a sparse file
qemu-img info cirros.qcow2
>image: cirros.qcow2
>file format: qcow2
>virtual size: 1.0G (1073741824 bytes)
>disk size: 136K
>cluster_size: 65536
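The sparse-file effect can be observed from Python too: st_size is the apparent size that ls reports, while st_blocks counts 512-byte blocks actually allocated, which is what du reports.

```python
import os


def sizes(path):
    """Return (apparent_size, allocated_bytes) for a file.

    A sparse file has a large apparent size but few allocated blocks,
    exactly like cirros.raw above.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512
```

Creating a 1 GiB file with os.truncate and then calling sizes() on it shows a 1 GiB apparent size with (near) zero bytes allocated.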
snapshot [-l | -a snapshot | -c snapshot | -d snapshot] filename
>only qcow2 supports snapshots
qemu-img snapshot -l cirros.qcow2
qemu-img snapshot -c tag1 cirros.qcow2
qemu-img snapshot -c tag2 cirros.qcow2
qemu-img snapshot -l cirros.qcow2
>Snapshot list:
>ID        TAG                 VM SIZE                DATE       VM CLOCK
>1         tag1                      0 2013-07-13 22:14:31   00:00:00.000
>2         tag2                      0 2013-07-13 22:15:31   00:00:00.000
qemu-img snapshot -a 2 cirros.qcow2
qemu-img snapshot -d 1 cirros.qcow2
qemu-img snapshot -l cirros.qcow2
>Snapshot list:
>ID        TAG                 VM SIZE                DATE       VM CLOCK
>2         tag2                      0 2013-07-13 22:15:31   00:00:00.000

rebase [-f fmt] [-t cache] [-p] [-u] -b backing_file [-F backing_fmt] filename
resize filename [+ | -]size

>resize is demonstrated here on a raw image (a qcow2 image can also be resized, as long as it has no internal snapshots)
qemu-img resize cirros.raw +1GB
>Image resized.
ll -h
>-rw-r--r-- 1 root root 193K Jul 13 21:58 cirros.qcow2
>-rw-r--r-- 1 root root 2.0G Jul 13 22:08 cirros.raw


Command parameters:
'filename' is a disk image filename
'fmt' is the disk image format. It is guessed automatically in most cases
'cache' is the cache mode used to write the output disk image, the valid options are: 'none', 'writeback' (default, except for convert), 'writethrough', 'directsync' and 'unsafe' (default for convert)
'size' is the disk image size in bytes. Optional suffixes 'k' or 'K' (kilobyte, 1024), 'M' (megabyte, 1024k), 'G' (gigabyte, 1024M) and T (terabyte, 1024G) are supported. 'b' is ignored.
'output_filename' is the destination disk image filename
'output_fmt' is the destination format
'options' is a comma separated list of format specific options in a name=value format. Use -o ? for an overview of the options supported by the used format
'-c' indicates that target image must be compressed (qcow format only)
'-u' enables unsafe rebasing. It is assumed that old and new backing file match exactly. The image doesn't need a working backing file before rebasing in this case (useful for renaming the backing file)
'-h' with or without a command shows this help and lists the supported formats
'-p' show progress of command (only certain commands)
'-S' indicates the consecutive number of bytes that must contain only zeros for qemu-img to create a sparse image during conversion
'--output' takes the format in which the output must be done (human or json)

Parameters to check subcommand:
'-r' tries to repair any inconsistencies that are found during the check. '-r leaks' repairs only cluster leaks, whereas '-r all' fixes all kinds of errors, with a higher risk of choosing the wrong fix or hiding corruption that has already occurred.

Parameters to snapshot subcommand: 'snapshot' is the name of the snapshot to create, apply or delete
'-a' applies a snapshot (revert disk to saved state)
'-c' creates a snapshot
'-d' deletes a snapshot
'-l' lists all snapshots in the given image

Supported formats: vvfat vpc vmdk vdi sheepdog raw host_cdrom host_floppy host_device file qed qcow2 qcow parallels nbd nbd nbd iscsi dmg tftp ftps ftp https http cow cloop bochs blkverify blkdebug

[1] Get My Host IP
[2] Looping Call


#### Get My Host IP ####

import socket


def _get_my_ip():
    """
    Returns the actual ip of the local machine.

    This code figures out what source address would be used if some traffic
    were to be sent out to some well known address on the Internet. In this
    case, a Google DNS server is used, but the specific address does not
    matter much. No traffic is actually sent.
    """
    try:
        csock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        csock.connect(('8.8.8.8', 80))
        (addr, port) = csock.getsockname()
        csock.close()
        return addr
    except socket.error:
        return "127.0.0.1"

#### Looping Call ####

import sys

from eventlet import event
from eventlet import greenthread

# LOG and _ come from the surrounding nova module (logging and i18n helpers).


class LoopingCallDone(Exception):
    """Exception to break out and stop a LoopingCall.

    The poll-function passed to LoopingCall can raise this exception to
    break out of the loop normally. This is somewhat analogous to
    StopIteration.

    An optional return-value can be included as the argument to the exception;
    this return-value will be returned by LoopingCall.wait()
    """

    def __init__(self, retvalue=True):
        """:param retvalue: Value that LoopingCall.wait() should return."""
        self.retvalue = retvalue


class LoopingCall(object):
    def __init__(self, f=None, *args, **kw):
        self.args = args
        self.kw = kw
        self.f = f
        self._running = False

    def start(self, interval, initial_delay=None):
        self._running = True
        done = event.Event()

        def _inner():
            if initial_delay:
                greenthread.sleep(initial_delay)

            try:
                while self._running:
                    self.f(*self.args, **self.kw)
                    if not self._running:
                        break
                    greenthread.sleep(interval)
            except LoopingCallDone, e:
                self.stop()
                done.send(e.retvalue)
            except Exception:
                LOG.exception(_('in looping call'))
                done.send_exception(*sys.exc_info())
                return
            else:
                done.send(True)

        self.done = done

        greenthread.spawn(_inner)
        return self.done

    def stop(self):
        self._running = False

    def wait(self):
        return self.done.wait()


Demo (how to use):

# Waiting for completion of live_migration.
timer = utils.LoopingCall(f=None)

def wait_for_live_migration():
    """waiting for live migration completion"""
    try:
        self.get_info(instance_ref)['state']
    except exception.NotFound:
        timer.stop()
        post_method(ctxt, instance_ref, dest, block_migration)

timer.f = wait_for_live_migration
timer.start(interval=0.5).wait()
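The pattern is easier to see without eventlet. Below is a minimal thread-based analogue (a hypothetical class of my own, using StopIteration where nova uses LoopingCallDone):

```python
import threading


class SimpleLoopingCall:
    """Thread-based sketch of the LoopingCall pattern above.

    Calls f every `interval` seconds until f raises StopIteration
    (the analogue of LoopingCallDone) or stop() is called.
    """

    def __init__(self, f, *args, **kw):
        self.f, self.args, self.kw = f, args, kw
        self._stop = threading.Event()
        self._thread = None

    def start(self, interval):
        def _inner():
            while not self._stop.is_set():
                try:
                    self.f(*self.args, **self.kw)
                except StopIteration:
                    break  # poll function asked us to finish
                self._stop.wait(interval)  # sleep, but wake early on stop()
        self._thread = threading.Thread(target=_inner)
        self._thread.start()
        return self

    def stop(self):
        self._stop.set()

    def wait(self):
        self._thread.join()
```

The real class additionally propagates a return value and exceptions back through an eventlet Event, which is what makes `timer.start(...).wait()` in the demo block on the caller's green thread.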

When neither live migration nor block migration is available, a customer may still need to move a heavily loaded VM onto a less loaded physical server.

Here are the steps for migrating a VM by hand, verified successfully in practice:

1. Release the floating IP
2. Shut down the VM; note its UUID and instance name (instance-00000xxx)

On the source host:
3. virsh undefine instance-000000xx
4. scp -r instance-000000xx eayun@xxx.xxx.xxx.xxx:/tmp

On the target host:
5. cd /var/lib/nova/instances
6. mv /tmp/instance-000000xx /var/lib/nova/instances
7. cd instance-000000xx
8. vi libvirt.xml
   <parameter name="DHCPSERVER" value="172.16.0.4"/>
   Change DHCPSERVER to the correct dhcp server address; check another VM in the same project for reference
9. Note the value of <filterref filter="nova-instance-instance-00000015-fa163e6b0c0b"> in libvirt.xml
10. vi nw.xml
    <filter name="nova-instance-instance-00000015-fa163e6b0c0b" chain="root">
      <filterref filter="nova-base" />
    </filter>
11. virsh nwfilter-define nw.xml
12. virsh define libvirt.xml
13. mysql -uroot -peayun2012
    use nova;
    update instances set host='target-hostname' where uuid = 'xxxx';
14. service nova-network restart

15. Start the VM
16. Re-associate the floating IP
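The nwfilter XML in steps 9-11 is easy to get wrong by hand; a tiny helper can render it from the instance name and MAC suffix. The function name and argument split are my own, and in a real migration you should copy the filter name verbatim from libvirt.xml rather than reconstruct it:

```python
def nwfilter_xml(instance_name, mac_suffix):
    """Render the per-instance nwfilter definition from steps 9-10.

    The filter name follows the nova-instance-<name>-<mac> convention
    seen in libvirt.xml; nova-base is nova's stock base filter.
    """
    name = 'nova-instance-%s-%s' % (instance_name, mac_suffix)
    return ('<filter name="%s" chain="root">\n'
            '  <filterref filter="nova-base" />\n'
            '</filter>' % name)
```

Write the result to nw.xml and feed it to `virsh nwfilter-define nw.xml` as in step 11.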

1. Install the base Python dependencies

sudo apt-get install build-essential git python-dev python-setuptools python-pip libxml2-dev libxslt-dev

2. Install MySQL

sudo apt-get install mysql-server mysql-client python-mysqldb

3. Create the keystone database

mysql -u root
create database keystone;
grant all privileges on keystone.* to 'keystone'@'localhost' identified by 'password' with grant option;
quit

4. Fetch the keystone / python-keystoneclient source

git clone git@github.com:openstack/keystone.git
git clone git@github.com:openstack/python-keystoneclient.git

5. Install keystone / python-keystoneclient

cd /opt/openstack/keystone
sudo pip install -r tools/pip-requires
sudo python setup.py install

This second install is not strictly necessary, since keystone's pip-requires already pulls in python-keystoneclient.
We install it from source anyway because, if you want to extend the keystone API, you will need to modify keystoneclient too.
cd /opt/openstack/python-keystoneclient
sudo pip install -r tools/pip-requires
sudo python setup.py install

6. Configure keystone

sudo mkdir /etc/keystone/
sudo cp ./etc/keystone.conf.sample /etc/keystone/keystone.conf
vi /etc/keystone/keystone.conf

connection = mysql://keystone:password@localhost/keystone

7. Testing

export OS_SERVICE_TOKEN=ADMIN
export OS_SERVICE_ENDPOINT='http://127.0.0.1:35357/v2.0'

7.1. List all users, just to check the API is working.

keystone user-list
+----+------+---------+-------+
| id | name | enabled | email |
+----+------+---------+-------+
+----+------+---------+-------+

7.2. create tenant

keystone tenant-create --name demo --description "demo tenant" --enabled true
+-------------+----------------------------------+
|   Property  |              Value               |
+-------------+----------------------------------+
| description |           demo tenant            |
|   enabled   |               True               |
|      id     | cae6a8e4472e46e9ac383d64c21a40ff |
|     name    |               demo               |
+-------------+----------------------------------+

7.3. create user

keystone user-create --tenant-id cae6a8e4472e46e9ac383d64c21a40ff --name demo --pass password --enabled true
+----------+----------------------------------+
| Property |              Value               |
+----------+----------------------------------+
|  email   |                                  |
| enabled  |               True               |
|    id    | b0a0b7f31e034352af3eb7ec637d4a91 |
|   name   |               demo               |
| tenantId | cae6a8e4472e46e9ac383d64c21a40ff |
+----------+----------------------------------+

7.4. create role

keystone role-create --name admin
+----------+----------------------------------+
| Property |              Value               |
+----------+----------------------------------+
|    id    | 8e88ac56af704ed7b2c1586fb41705a3 |
|   name   |              admin               |
+----------+----------------------------------+

7.5. add user to role

keystone user-role-add --user b0a0b7f31e034352af3eb7ec637d4a91 --tenant-id cae6a8e4472e46e9ac383d64c21a40ff --role 8e88ac56af704ed7b2c1586fb41705a3

7.6. show user role

keystone user-role-list --user demo --tenant demo
+----------------------------------+--------+----------------------------------+----------------------------------+
|                id                |  name  |             user_id              |            tenant_id             |
+----------------------------------+--------+----------------------------------+----------------------------------+
| 9fe2ff9ee4384b1894a90878d3e92bab | member | b0a0b7f31e034352af3eb7ec637d4a91 | cae6a8e4472e46e9ac383d64c21a40ff |
| 8e88ac56af704ed7b2c1586fb41705a3 | admin  | b0a0b7f31e034352af3eb7ec637d4a91 | cae6a8e4472e46e9ac383d64c21a40ff |
+----------------------------------+--------+----------------------------------+----------------------------------+

7.7 user token

curl -d '{"auth": {"tenantName": "demo", "passwordCredentials": {"username": "demo", "password": "password"}}}' \
     -H "Content-type: application/json" http://127.0.0.1:35357/v2.0/tokens | python -m json.tool
{
    "access": {
        "metadata": {
            "is_admin": 0,
            "roles": [
                "9fe2ff9ee4384b1894a90878d3e92bab",
                "8e88ac56af704ed7b2c1586fb41705a3"
            ]
        },
        "serviceCatalog": [],
        "token": {
            "expires": "2013-05-22T18:37:47Z",
            "id": "xxxxxxxxx",
            "issued_at": "2013-05-21T18:37:47.487814",
            "tenant": {
                "description": "demo tenant",
                "enabled": true,
                "id": "cae6a8e4472e46e9ac383d64c21a40ff",
                "name": "demo"
            }
        },
        "user": {
            "id": "b0a0b7f31e034352af3eb7ec637d4a91",
            "name": "demo",
            "roles": [
                {"name": "member"},
                {"name": "admin"}
            ],
            "roles_links": [],
            "username": "demo"
        }
    }
}
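The JSON body of that curl request is easy to mangle when quoting by hand; building it programmatically avoids that. A small sketch (the function name is my own; the structure mirrors the v2.0 tokens API shown above):

```python
import json


def token_request_body(tenant, username, password):
    """Build the JSON body for POST /v2.0/tokens, as in the curl call above."""
    return json.dumps({
        'auth': {
            'tenantName': tenant,
            'passwordCredentials': {'username': username,
                                    'password': password},
        }
    })
```

Pipe the result into curl (or pass it to any HTTP client) as the request body with Content-type: application/json.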

MAYBE: Signing error: Unable to load certificate - ensure you've configured PKI with 'keystone-manage pki_setup'


[References]
Openstack Hands on lab