Building a Cloud (How to Set Up an OpenStack Cloud Platform)

圈圈笔记 110

Welcome to this article on OpenStack and High Availability. I am going to explain how I built a highly available OpenStack cloud for my company.

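We start our cluster with two machines, each with one public and two private network interfaces.

· server1: 5.x.x.x (public IP), 10.0.0.1 (eth1), 10.1.0.1 (eth2)
· server2: 5.x.x.x (public IP), 10.0.0.2 (eth1), 10.1.0.2 (eth2)

The hosts file on both nodes (relevant part):

    10.0.0.1 server1
    10.0.0.2 server2

Installing Pacemaker and Corosync

First we need to install Pacemaker and Corosync:

    apt-get install pacemaker corosync

To configure Corosync, copy the following into /etc/corosync/corosync.conf on all nodes: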

    # Please read the corosync.conf.5 manual page
    compatibility: whitetank

    totem {
        version: 2
        secauth: off
        threads: 0
        interface {
            ringnumber: 0
            bindnetaddr: 10.8.0.0
            mcastaddr: 226.94.1.1
            mcastport: 5405
            ttl: 1
        }
    }

    logging {
        fileline: off
        to_stderr: no
        to_logfile: yes
        to_syslog: yes
        logfile: /var/log/cluster/corosync.log
        debug: off
        timestamp: on
        logger_subsys {
            subsys: AMF
            debug: off
        }
    }

    service {
        # Load the Pacemaker Cluster Resource Manager
        name: pacemaker
        ver: 1
    }

    amf {
        mode: disabled
    }

    quorum {
        provider: corosync_votequorum
        expected_votes: 2
    }

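For reasons I could not determine, you have to create the /var/log/cluster directory by hand to avoid the error "parse error in config: parse error in config: .":

    mkdir /var/log/cluster

We also need both services to start at boot. For Pacemaker, use:

    update-rc.d pacemaker start 50 1 2 3 4 5 . stop 01 0 6 .

For Corosync, edit /etc/default/corosync and set:

    START=yes

Note: the settings above must be applied on both hosts.

Checking the Corosync configuration

Start the corosync service:

    service corosync start

Check that everything is working:

    corosync-cfgtool -s
    Printing ring status.
    Local node ID 33556490
    RING ID 0
        id = 10.8.0.2
        status = ring 0 active with no faults

Also check the cluster nodes and the quorum votes:

    corosync-quorumtool -l
    Nodeid     Votes  Name
    16779274   1      server1
    33556490   1      server2

Checking the Pacemaker configuration

Once Corosync is confirmed to be working, let's configure Pacemaker. First, start the service:

    service pacemaker start

Now check whether it recognizes our cluster: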

    crm_mon -1
    ============
    Last updated: Mon Jul 16 15:01:57 2012
    Last change: Mon Jul 16 14:52:34 2012 via cibadmin on server1
    Stack: openais
    Current DC: server1 - partition with quorum
    Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
    2 Nodes configured, 2 expected votes
    2 Resources configured.
    ============

    Online: [ server1 server2 ]
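If you want to script this check rather than read the output by eye, the following is a minimal sketch. The function names online_nodes and all_online are hypothetical helpers, not part of Pacemaker, and the sketch assumes the "Online: [ ... ]" line has the format shown above.

```shell
# Hypothetical helper (not part of Pacemaker): print the node names found on
# the "Online: [ ... ]" line of `crm_mon -1` output read from stdin.
online_nodes() {
    sed -n 's/^Online: \[ *\(.*[^ ]\) *\]$/\1/p'
}

# Hypothetical helper: succeed only if every node named as an argument
# appears in the online list read from stdin.
all_online() {
    nodes=" $(online_nodes) "
    for n in "$@"; do
        case "$nodes" in
            *" $n "*) ;;   # node is listed, keep going
            *) echo "node $n is not online" >&2; return 1 ;;
        esac
    done
}
```

On a cluster node this could be used as: crm_mon -1 | all_online server1 server2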

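Because this is a two-node setup, you also need to disable the quorum check: when one of two nodes is lost (or the cluster splits, the split-brain case), the surviving node cannot hold a majority, so tell Pacemaker to ignore the loss of quorum:

    crm configure property no-quorum-policy=ignore

After that you can also disable STONITH if you do not need it:

    crm configure property stonith-enabled=false

Corosync and Pacemaker are now installed. The next step is to install MySQL and have Pacemaker make it highly available.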

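MySQL is at the heart of OpenStack: almost every component uses MySQL to read and store information. Let's see how to build a fully highly available MySQL endpoint. In the hosts file /etc/hosts, add this line:

    10.0.1.1 mysqlmaster

First we need to download the resource agent for MySQL: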

    cd /usr/lib/ocf/resource.d/
    mkdir percona
    cd percona
    wget -q https://github.com/y-trudeau/resource-agents-prm/raw/master/heartbeat/mysql
    chmod u+x mysql

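With this agent, when a MySQL server is promoted from slave to master, we also want the mysqlmaster IP bound to that node; when the failed server comes back, it will start MySQL in slave mode. So let's create our virtual IP:

    crm configure primitive mysqlmasterIP ocf:heartbeat:IPaddr2 params ip=10.0.1.1 cidr_netmask=16 nic=eth1 op monitor interval=10s

We can check our new IP by running the cluster monitor again: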

    ============
    Last updated: Mon Jul 16 16:10:34 2012
    Last change: Mon Jul 16 16:10:33 2012 via cibadmin on server1
    Stack: openais
    Current DC: server1 - partition with quorum
    Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
    2 Nodes configured, 3 expected votes
    2 Resources configured.
    ============

    Online: [ server1 server2 ]

    mysqlmasterIP (ocf::heartbeat:IPaddr2): Started server1

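Now that the virtual IP is configured, let's set up MySQL replication. Install the MySQL server on both nodes:

    apt-get install mysql-server

We will set up basic replication. On server1, edit /etc/mysql/my.cnf and, in the [mysqld] section (around line 85), uncomment these lines:

    server-id = 1
    log_bin = /var/log/mysql/mysql-bin.log

On the second server, in the same file, uncomment and edit:

    server-id = 2
    log_bin = /var/log/mysql/mysql-bin.log

Also make MySQL listen on all addresses:

    bind-address = 0.0.0.0

Then create a replication user and a test user by running the following in the mysql client on all servers: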

    grant replication client, replication slave on *.* to repl_user@'10.0.%.%' identified by 'password';
    grant replication client, replication slave, SUPER, PROCESS, RELOAD on *.* to repl_user@'localhost' identified by 'password';
    grant select ON mysql.user to test_user@'localhost' identified by 'password';
    FLUSH PRIVILEGES;

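Now stop MySQL from starting at boot. Because the init script has been converted to upstart, open /etc/init/mysql.conf on all nodes and comment out this line:

    start on runlevel [2345]

Now create the MySQL resource: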

    crm configure primitive clustermysql ocf:percona:mysql \
        params binary=/usr/bin/mysqld_safe log=/var/log/mysql.log socket=/var/run/mysqld/mysqld.sock \
            evict_outdated_slaves=false config=/etc/mysql/my.cnf pid=/var/run/mysqld/mysqld.pid \
            replication_user=repl_user replication_passwd=password \
            test_user=test_user test_passwd=password \
        op monitor interval=5s role=Master OCF_CHECK_LEVEL=1 \
        op monitor interval=2s role=Slave timeout=30 OCF_CHECK_LEVEL=1 \
        op start interval=0 timeout=120 \
        op stop interval=0 timeout=120

You will see MySQL running on one node:

    crm_mon -1
    ============
    Last updated: Mon Jul 16 17:36:22 2012
    Last change: Mon Jul 16 17:14:55 2012 via cibadmin on server1
    Stack: openais
    Current DC: server2 - partition with quorum
    Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
    2 Nodes configured, 3 expected votes
    3 Resources configured.
    ============

    Online: [ server1 server2 ]

    mysqlmasterIP (ocf::heartbeat:IPaddr2): Started server2
    clustermysql (ocf::heartbeat:mysql): Started server2

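Now we set up the master/slave controller. First we need to record each host's IP as a node attribute so that the agent can move the MySQL master. Use crm configure edit to change these lines:

    node server1 \
        attributes clustermysql_mysql_master_IP="10.0.0.1"
    node server2 \
        attributes clustermysql_mysql_master_IP="10.0.0.2"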

Then create the actual master/slave resource. To do so, simply define it through crm:

    crm configure ms ms_MySQL clustermysql \
        meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true globally-unique=false target-role=Master is-managed=true

MySQL should now start in master/slave mode, and crm_mon -1 will produce the following result:

    ============
    Last updated: Tue Jul 17 11:26:04 2012
    Last change: Tue Jul 17 11:00:34 2012 via cibadmin on server1
    Stack: openais
    Current DC: server1 - partition with quorum
    Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
    2 Nodes configured, 3 expected votes
    4 Resources configured.
    ============

    Online: [ server1 server2 ]

    Master/Slave Set: ms_MySQL [clustermysql]
        Masters: [ server1 ]
        Slaves: [ server2 ]
    mysqlmasterIP (ocf::heartbeat:IPaddr2): Started server1
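To find out from a script which node currently holds the master role, the "Masters:" line of the output above can be parsed. This is only a sketch: mysql_master is a hypothetical helper name, not a Pacemaker command, and it assumes the line format shown above.

```shell
# Hypothetical helper (not part of Pacemaker): print the node(s) listed on
# the "Masters: [ ... ]" line of `crm_mon -1` output read from stdin.
# Leading indentation before "Masters:" is tolerated.
mysql_master() {
    sed -n 's/^ *Masters: \[ *\(.*[^ ]\) *\]$/\1/p'
}
```

On a cluster node this could be used as: crm_mon -1 | mysql_master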

The last thing to do is make the master IP follow MySQL whenever a node is promoted to master. This is easily done with a colocation and an ordering constraint:

    crm configure colocation masterIP_on_mysqlMaster inf: mysqlmasterIP ms_MySQL:Master
    crm configure order mysqlPromote_before_IP inf: ms_MySQL:promote mysqlmasterIP:start

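That's it. Now when you stop the pacemaker service on one node, MySQL on the other node starts in master mode and the IP moves with it. After MySQL, we need to make RabbitMQ highly available.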

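Note: at present RabbitMQ only supports active/passive with DRBD. Active/active is also possible, but you need to change the queue declarations in the OpenStack code. A patch by Eugene Kirpichov is still in development and can be found here: https://review.openstack.org//c/10305/

So, as usual, we simply install RabbitMQ from the official repository. For an updated version of this part, wait for the patch: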

    echo "deb http://www.rabbitmq.com/debian/ testing main" > /etc/apt/sources.list.d/rabbitmq.list
    wget -q http://www.rabbitmq.com/rabbitmq-signing-key-public.asc -O- | sudo apt-key add -
    apt-get update
    apt-get install rabbitmq-server

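With the groundwork done, it is time to install Keystone and make it highly available. I will not cover the installation itself in this tutorial, because the manual already covers all of it.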

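There are only a few differences:

· You must install Keystone on both hosts, not just one.

· Set the MySQL host to clustermysql, so that it always points at the MySQL master.

· When you define the services, create a virtual IP for each service (here keystoneip, glanceip, novacomputeip and so on) and point to them when you create the endpoints.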

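Now that you have installed Keystone and created the users, roles, services and endpoints, let's make it highly available. We need to disable automatic startup at boot, so do this on both hosts:

    echo manual > /etc/init/keystone.override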

Now download the resource agent:

    mkdir /usr/lib/ocf/resource.d/openstack
    cd /usr/lib/ocf/resource.d/openstack/
    wget https://raw.github.com/madkiss/keystone/master/tools/ocf/keystone
    chmod u+x *

Then create the primitive for Keystone:

    crm configure primitive keystoneService ocf:openstack:keystone \
        params config=/etc/keystone/keystone.conf os_auth_url=http://clusterkeystone:5000/v2.0/ os_password=admin os_tenant_name=admin os_username=admin user=keystone client_binary=/usr/bin/keystone \
        op monitor interval=15s timeout=30s

Here clusterkeystone is the virtual IP assigned to Keystone, and the os_* values are the credentials of the admin user you set up when installing Keystone.

It is useful to group the virtual IP and the service so that they start on the same host:

    crm configure group Keystone keystoneIP keystoneService

To start Keystone only after the MySQL master IP is up, add an ordering constraint:

    crm configure order keystone_after_mysqlmasterIP inf: mysqlmasterIP:start Keystone

This gives you working Keystone failover in case a host fails.

An example, with both hosts running:

    ============
    Last updated: Mon Jul 30 15:03:40 2012
    Last change: Mon Jul 30 15:03:38 2012 via cibadmin on server2
    Stack: openais
    Current DC: server1 - partition with quorum
    Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
    2 Nodes configured, 2 expected votes
    5 Resources configured.
    ============

    Online: [ server1 server2 ]

    mysqlmasterIP (ocf::heartbeat:IPaddr2): Started server1
    Master/Slave Set: ms_MySQL [clustermysql]
        Masters: [ server1 ]
        Slaves: [ server2 ]
    Resource Group: Keystone
        keystoneIP (ocf::heartbeat:IPaddr2): Started server2
        keystoneService (ocf::openstack:keystone): Started server2

Now stop server1; a few seconds later you will get:

    ============
    Last updated: Mon Jul 30 15:08:34 2012
    Last change: Mon Jul 30 15:08:26 2012 via crm_attribute on server2
    Stack: openais
    Current DC: server2 - partition WITHOUT quorum
    Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
    2 Nodes configured, 2 expected votes
    5 Resources configured.
    ============

    Online: [ server2 ]
    OFFLINE: [ server1 ]

    mysqlmasterIP (ocf::heartbeat:IPaddr2): Started server2
    Master/Slave Set: ms_MySQL [clustermysql]
        Masters: [ server2 ]
        Stopped: [ clustermysql:0 ]
    Resource Group: Keystone
        keystoneIP (ocf::heartbeat:IPaddr2): Started server2
        keystoneService (ocf::openstack:keystone): Started server2
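After a failover like this one, it can be handy to confirm from a script where a given resource ended up. The following is a minimal sketch; resource_node is a hypothetical helper name, not a Pacemaker command, and it assumes resource lines of the form "name (agent): Started node" as in the output above.

```shell
# Hypothetical helper (not part of Pacemaker): print the node on which the
# resource named in $1 is reported as "Started" in `crm_mon -1` output read
# from stdin. Prints nothing if the resource is not started anywhere.
resource_node() {
    awk -v res="$1" '$1 == res && $(NF-1) == "Started" { print $NF }'
}
```

On a cluster node this could be used as: crm_mon -1 | resource_node keystoneService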

As in the Keystone part, I assume you have already installed Glance and have it working.

The differences from the usual setup are:

· When configuring /etc/glance/glance-api-paste.ini and /etc/glance/glance-registry-paste.ini, remember to edit the auth host as well:

    service_protocol = http
    service_host = clusterkeystone
    service_port = 5000
    auth_host = clusterkeystone
    auth_port = 35357
    auth_protocol = http
    auth_uri = http://clusterkeystone:5000/
    admin_tenant_name = service
    admin_user = glance
    admin_password = glance

· When configuring /etc/glance/glance-registry.conf, use clustermysql as the MySQL host.

· At the end of the setup, run glance-manage version_control 0 and glance-manage db_sync.

· Be sure to install curl. The glance-registry resource agent does not say so, but it needs it.

Now we want Pacemaker to run Glance when needed. First we stop the Glance services and disable them at boot, then download the resource agents. Do the following on all hosts:

    echo manual > /etc/init/glance-api.override
    echo manual > /etc/init/glance-registry.override
    service glance-api stop
    service glance-registry stop
    cd /usr/lib/ocf/resource.d/openstack/
    wget https://raw.github.com/madkiss/glance/ha/tools/ocf/glance-api
    wget https://raw.github.com/madkiss/glance/ha/tools/ocf/glance-registry
    chmod u+x *

Then add the resources:

    crm configure primitive glanceApiService ocf:openstack:glance-api \
        params config=/etc/glance/glance-api.conf os_auth_url=http://clusterkeystone:5000/v2.0/ os_password=admin os_tenant_name=admin os_username=admin user=glance client_binary=/usr/bin/glance \
        op monitor interval=15s timeout=30s
    crm configure primitive glanceRegistryService ocf:openstack:glance-registry \
        params config=/etc/glance/glance-registry.conf os_auth_url=http://clusterkeystone:5000/v2.0/ os_password=admin os_tenant_name=admin os_username=admin user=glance \
        op monitor interval=15s timeout=30s

Pacemaker can now run the Glance API and Registry on our cluster.

As usual, create the group and add the proper ordering for Glance:

    crm configure group Glance glanceIP glanceApiService glanceRegistryService

    crm configure order glance_after_keystone inf: Keystone Glance

I made Glance start after Keystone because it depends on both Keystone and MySQL. Since Keystone already starts after MySQL, ordering Glance after Keystone is enough.

This is the resulting configuration:

    ============
    Last updated: Mon Jul 30 16:14:09 2012
    Last change: Mon Jul 30 16:11:54 2012 via crm_attribute on server1
    Stack: openais
    Current DC: server1 - partition with quorum
    Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
    2 Nodes configured, 2 expected votes
    8 Resources configured.
    ============

    Online: [ server1 server2 ]

    mysqlmasterIP (ocf::heartbeat:IPaddr2): Started server1
    Master/Slave Set: ms_MySQL [clustermysql]
        Masters: [ server1 ]
        Slaves: [ server2 ]
    Resource Group: Keystone
        keystoneIP (ocf::heartbeat:IPaddr2): Started server2
        keystoneService (ocf::openstack:keystone): Started server2
    Resource Group: Glance
        glanceIP (ocf::heartbeat:IPaddr2): Started server1
        glanceApiService (ocf::openstack:glance-api): Started server1
        glanceRegistryService (ocf::openstack:glance-registry): Started server1

After MySQL, RabbitMQ, Keystone and Glance, we now install the Nova services, which will be managed by Pacemaker, and make them highly available. As in the other parts, when editing /etc/nova/api-paste.ini, also change the service host:

    service_protocol = http
    service_host = clusterkeystone
    service_port = 5000
    auth_host = clusterkeystone
    auth_port = 35357
    auth_protocol = http
    auth_uri = http://clusterkeystone:5000/
    admin_tenant_name = service
    admin_user = nova
    admin_password = nova

Also, before running nova-manage db sync, be sure to set the SQL host to clustermysql. This is how I set up /etc/nova/nova.conf:

    [DEFAULT]
    dhcpbridge_flagfile=/etc/nova/nova.conf
    dhcpbridge=/usr/bin/nova-dhcpbridge
    logdir=/var/log/nova
    state_path=/var/lib/nova
    lock_path=/run/lock/nova
    allow_admin_api=true
    use_deprecated_auth=false
    auth_strategy=keystone
    scheduler_driver=nova.scheduler.simple.SimpleScheduler
    s3_host=clusterglance
    ec2_host=clusterec2
    ec2_dmz_host=clusterec2
    rabbit_host=clusterrabbit
    cc_host=clusterec2
    nova_url=http://clusternova:8774/v1.1/
    glance_api_servers=clusterglance:9292
    image_service=nova.image.glance.GlanceImageService
    iscsi_ip_prefix=192.168.4
    sql_connection=mysql://novadbadmin:password@clustermysql/nova
    ec2_url=http://clusterec2:8773/services/Cloud
    keystone_ec2_url=http://clusterkeystone:5000/v2.0/ec2tokens
    api_paste_config=/etc/nova/api-paste.ini
    libvirt_type=kvm
    libvirt_use_virtio_for_bridges=true
    start_guests_on_host_boot=true
    resume_guests_state_on_host_boot=true
    novnc_enabled=true
    novncproxy_base_url=http://5.9.x.x:6080/vnc_auto.html
    vncserver_proxyclient_address=10.8.0.1
    vncserver_listen=0.0.0.0
    network_manager=nova.network.manager.FlatDHCPManager
    public_interface=eth0
    flat_interface=eth2
    flat_network_bridge=br100
    flat_injected=False
    force_dhcp_release=true
    iscsi_helper=tgtadm
    connection_type=libvirt
    root_helper=sudo nova-rootwrap
    verbose=True
    debug=True
    multi_host=true
    enabled_apis=ec2,osapi_compute,osapi_volume,metadata
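The nova.conf above refers to several virtual hostnames (clustermysql, clusterglance, clusterkeystone, clusterrabbit, clusterec2, clusternova), and each of them has to resolve on every node. A minimal sketch of a check against a hosts file follows; missing_hosts is a hypothetical helper name introduced here for illustration, and the file path is passed in so the check can be tried on a copy first.

```shell
# Hypothetical helper: print each of the given hostnames that has no
# (uncommented) entry in a hosts file.
# Usage: missing_hosts /etc/hosts clustermysql clusterglance ...
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Strip comments, then look for the name among the fields after the IP.
        if ! sed 's/#.*//' "$hosts_file" | awk -v n="$name" '
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit !found }'; then
            echo "$name"
        fi
    done
}
```

If the function prints nothing, every name was found. For example: missing_hosts /etc/hosts clustermysql clusterglance clusterkeystone clusterrabbit clusterec2 clusternova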

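Check your /etc/hosts once more and make sure that virtual IPs such as clustermysql and clusterglance are declared exactly as you used them in the Keystone installation (in the endpoint configuration) and for MySQL authentication. Now you can run the db_sync part of the official tutorial. After that, we must stop the services and let Pacemaker manage them: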

    service nova-api stop
    service nova-cert stop
    service nova-compute stop
    service nova-consoleauth stop
    service nova-network stop
    service nova-objectstore stop
    service nova-scheduler stop
    service nova-volume stop
    service novnc stop
    echo manual > /etc/init/nova-api.override
    echo manual > /etc/init/nova-cert.override
    echo manual > /etc/init/nova-compute.override
    echo manual > /etc/init/nova-consoleauth.override
    echo manual > /etc/init/nova-network.override
    echo manual > /etc/init/nova-objectstore.override
    echo manual > /etc/init/nova-scheduler.override
    echo manual > /etc/init/nova-volume.override
    echo manual > /etc/init/novnc.override
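The override step above repeats the same line for every service, so it can also be written as a loop. This is only a sketch: disable_upstart_jobs is a hypothetical helper name (not part of upstart or Nova), and the upstart directory is passed as a parameter so the function can be tried on a scratch directory before pointing it at /etc/init.

```shell
# Hypothetical helper: write an upstart "manual" override for each job name.
# $1 is the upstart directory (normally /etc/init); the remaining arguments
# are job names such as nova-api or novnc.
disable_upstart_jobs() {
    init_dir=$1
    shift
    for job in "$@"; do
        echo manual > "$init_dir/$job.override"
    done
}
```

For the services listed above this would be called as: disable_upstart_jobs /etc/init nova-api nova-cert nova-compute nova-consoleauth nova-network nova-objectstore nova-scheduler nova-volume novnc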

Download the resource agents for the services:

    cd /usr/lib/ocf/resource.d/openstack/
    wget https://raw.github.com/leseb/OpenStack-ra/master/nova-api-ra
    wget https://raw.github.com/leseb/OpenStack-ra/master/nova-cert-ra
    wget https://raw.github.com/leseb/OpenStack-ra/master/nova-consoleauth-ra
    wget https://raw.github.com/leseb/OpenStack-ra/master/nova-scheduler-ra
    wget https://raw.github.com/leseb/OpenStack-ra/master/nova-vnc-ra
    wget https://raw.github.com/alex88/nova-network-ra/master/nova-network-ra
    wget https://raw.github.com/alex88/nova-compute-ra/master/nova-compute-ra
    wget https://raw.github.com/alex88/nova-objectstore-ra/master/nova-objectstore-ra
    wget https://raw.github.com/alex88/nova-volume-ra/master/nova-volume-ra
    chmod +x *

Configure the services to be started by Pacemaker:

    crm configure primitive novaApiService ocf:openstack:nova-api-ra \
        params config=/etc/nova/nova.conf \
        op monitor interval=5s timeout=5s
    crm configure primitive novaCertService ocf:openstack:nova-cert-ra \
        params config=/etc/nova/nova.conf \
        op monitor interval=30s timeout=30s
    crm configure primitive novaConsoleauthService ocf:openstack:nova-consoleauth-ra \
        params config=/etc/nova/nova.conf \
        op monitor interval=30s timeout=30s
    crm configure primitive novaSchedulerService ocf:openstack:nova-scheduler-ra \
        params config=/etc/nova/nova.conf \
        op monitor interval=30s timeout=30s
    crm configure primitive novaVncService ocf:openstack:nova-vnc-ra \
        params config=/etc/nova/nova.conf \
        op monitor interval=30s timeout=30s
    crm configure primitive novaNetworkService ocf:openstack:nova-network-ra \
        params config=/etc/nova/nova.conf \
        op monitor interval=30s timeout=30s
    crm configure primitive novaComputeService ocf:openstack:nova-compute-ra \
        params config=/etc/nova/nova.conf \
        op monitor interval=30s timeout=30s
    crm configure primitive novaObjectstoreService ocf:openstack:nova-objectstore-ra \
        params config=/etc/nova/nova.conf \
        op monitor interval=30s timeout=30s
    crm configure primitive novaVolumeService ocf:openstack:nova-volume-ra \
        params config=/etc/nova/nova.conf \
        op monitor interval=30s timeout=30s
    crm configure clone novaVolume novaVolumeService \
        meta clone-max=2 clone-node-max=1
    crm configure clone novaNetwork novaNetworkService \
        meta clone-max=2 clone-node-max=1
    crm configure clone novaCompute novaComputeService \
        meta clone-max=2 clone-node-max=1
    crm configure clone novaApi novaApiService \
        meta clone-max=2 clone-node-max=1
    crm configure clone novaVnc novaVncService \
        meta clone-max=2 clone-node-max=1
    crm configure group novaServices novaConsoleauthService novaCertService novaSchedulerService
    crm configure order novaServices_after_keystone inf: Keystone novaServices

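Note: use the clone directive according to your own needs. I clone the API and Network services because I run a multi_host OpenStack. My nova.conf says that s3_host is the Glance IP, so be sure to edit the Glance group to include the nova-objectstore service. Run crm configure edit and make sure this line is present:

    group Glance glanceIP novaObjectstoreService glanceApiService glanceRegistryService

Now you can check the state of the OpenStack cluster: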

    Binary            Host     Zone  Status   State  Updated_At
    nova-compute      server1  nova  enabled
    nova-compute      server2  nova  enabled
    nova-network      server2  nova  enabled
    nova-network      server1  nova  enabled
    nova-scheduler    server2  nova  enabled
    nova-consoleauth  server2  nova  enabled
    nova-cert         server2  nova  enabled
    nova-volume       server1  nova  enabled
    nova-volume       server2  nova  enabled

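That's it: all of your OpenStack components are now managed by Corosync and Pacemaker. High availability for OpenStack virtual machines is still under development, so stay tuned for updates on that front.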
