Installing and Configuring a corosync Linux High-Availability (HA) Cluster

February 27, 2017

Prerequisites:

1) This setup uses two test nodes, node1.amd5.cn and node2.amd5.cn, with IP addresses 172.16.100.11 and 172.16.100.12 respectively;

2) the cluster service is Apache's httpd;

3) the address used to provide the web service is 172.16.100.1;

4) the operating system is RHEL 5.8.

1. Preparation

To configure a Linux host as an HA node, the following preparation is usually required:

1) Hostname resolution must work for all nodes, and each node's hostname must match the output of "uname -n". Therefore, /etc/hosts on both nodes must contain the following:

172.16.100.11   node1.amd5.cn node1

172.16.100.12   node2.amd5.cn node2
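A quick sanity check can confirm both names are present before continuing. The snippet below is a sketch; the HOSTS_FILE override is our own addition so the check can be demonstrated against any file, not part of the original setup:

```shell
# Sketch: verify that both node names appear in the hosts file.
# HOSTS_FILE is an assumed override; on a real node just use /etc/hosts.
HOSTS_FILE=${HOSTS_FILE:-/etc/hosts}
for n in node1.amd5.cn node2.amd5.cn; do
  if grep -q "$n" "$HOSTS_FILE"; then
    echo "$n: OK"
  else
    echo "$n: missing from $HOSTS_FILE"
  fi
done
```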

So that these hostnames survive a reboot, run commands similar to the following on each node:

Node1:

# sed -i 's@\(HOSTNAME=\).*@\1node1.amd5.cn@g'  /etc/sysconfig/network

# hostname node1.amd5.cn

Node2:

# sed -i 's@\(HOSTNAME=\).*@\1node2.amd5.cn@g' /etc/sysconfig/network

# hostname node2.amd5.cn
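The rename step on each node can be sketched as follows; the snippet demonstrates the same sed substitution on a temporary copy of /etc/sysconfig/network (NODE_NAME and the temp-file demonstration are our own additions; on a real node you would edit the real file as root and then run `hostname`):

```shell
# Demonstrate the HOSTNAME= rewrite on a temporary copy of the file.
# NODE_NAME is an assumed helper variable, not part of the original article.
NODE_NAME=node1.amd5.cn
TMP_NETWORK=$(mktemp)
echo 'HOSTNAME=localhost.localdomain' > "$TMP_NETWORK"

# Same substitution as in the text: rewrite whatever follows HOSTNAME=.
sed -i "s@\(HOSTNAME=\).*@\1${NODE_NAME}@g" "$TMP_NETWORK"
cat "$TMP_NETWORK"                 # -> HOSTNAME=node1.amd5.cn
rm -f "$TMP_NETWORK"

# On the real node you would additionally run:  hostname "$NODE_NAME"
```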

2) Set up key-based ssh between the two nodes, which can be done with commands similar to the following:

Node1:

# ssh-keygen -t rsa

# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node2

Node2:

# ssh-keygen -t rsa

# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node1
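If you want this step fully scriptable, ssh-keygen can run non-interactively with an empty passphrase. This is a sketch; note that -N '' stores the private key unencrypted, which is a security trade-off you must accept deliberately:

```shell
# Generate the key pair without prompts; KEY defaults to the usual path.
KEY=${KEY:-$HOME/.ssh/id_rsa}
[ -f "$KEY" ] || ssh-keygen -t rsa -N '' -f "$KEY" -q

# Then push it to the peer exactly as in the text (run on node1):
# ssh-copy-id -i "${KEY}.pub" root@node2
```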

2. Install the following RPM packages:

libibverbs, librdmacm, lm_sensors, libtool-ltdl, openhpi-libs, openhpi, perl-TimeDate

3. Install corosync and pacemaker. First download the following packages into a dedicated local directory (here /root/cluster):

cluster-glue

cluster-glue-libs

heartbeat

resource-agents

corosync

heartbeat-libs

pacemaker

corosynclib

libesmtp

pacemaker-libs

Download from http://clusterlabs.org/. Choose the packages matching your hardware platform and operating system; the latest version of each package is recommended.

Install them with:

# cd /root/cluster

# yum -y --nogpgcheck localinstall *.rpm

4. Configure corosync (the following commands are executed on node1.amd5.cn):

# cd /etc/corosync

# cp corosync.conf.example corosync.conf

Then edit corosync.conf and add the following:

service {

  ver:  0

  name: pacemaker

  # use_mgmtd: yes

}

aisexec {

  user: root

  group:  root

}

Also set the IP address after bindnetaddr in this file to the network address of the network your NIC is on. Our two nodes are on the 172.16.0.0 network, so we set it to 172.16.0.0, as follows:

bindnetaddr: 172.16.0.0
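If you are unsure what the network address is, it can be derived from any node IP and its netmask by ANDing the octets. A small POSIX-sh sketch (the netaddr helper is our own illustration, not part of corosync):

```shell
# Compute the network address for bindnetaddr from an IP and a netmask.
netaddr() {
  IFS=. ; set -- $1 $2 ; IFS=' '
  echo "$(($1 & $5)).$(($2 & $6)).$(($3 & $7)).$(($4 & $8))"
}

netaddr 172.16.100.11 255.255.0.0    # -> 172.16.0.0
```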

Generate the authentication key used for inter-node communication:

# corosync-keygen

Copy corosync.conf and authkey to node2:

# scp -p corosync.conf authkey node2:/etc/corosync/

Create the directory for corosync's logs on both nodes:

# mkdir /var/log/cluster

# ssh node2  'mkdir /var/log/cluster'

5. Try starting the service (the following commands are executed on node1):

# /etc/init.d/corosync start

Check whether the corosync engine started properly:

# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/messages

Jun 14 19:02:08 node1 corosync[5103]:   [MAIN  ] Corosync Cluster Engine ('1.2.7'): started and ready to provide service.

Jun 14 19:02:08 node1 corosync[5103]:   [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.

Jun 14 19:02:08 node1 corosync[5103]:   [MAIN  ] Corosync Cluster Engine exiting with status 8 at main.c:1397.

Jun 14 19:03:49 node1 corosync[5120]:   [MAIN  ] Corosync Cluster Engine ('1.2.7'): started and ready to provide service.

Jun 14 19:03:49 node1 corosync[5120]:   [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.

Check whether the initial membership notifications were sent properly:

# grep  TOTEM  /var/log/messages

Jun 14 19:03:49 node1 corosync[5120]:   [TOTEM ] Initializing transport (UDP/IP).

Jun 14 19:03:49 node1 corosync[5120]:   [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).

Jun 14 19:03:50 node1 corosync[5120]:   [TOTEM ] The network interface [172.16.100.11] is now up.

Jun 14 19:03:50 node1 corosync[5120]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.

Check whether any errors occurred during startup:

# grep ERROR: /var/log/messages | grep -v unpack_resources

Check whether pacemaker started properly:

# grep pcmk_startup /var/log/messages

Jun 14 19:03:50 node1 corosync[5120]:   [pcmk  ] info: pcmk_startup: CRM: Initialized

Jun 14 19:03:50 node1 corosync[5120]:   [pcmk  ] Logging: Initialized pcmk_startup

Jun 14 19:03:50 node1 corosync[5120]:   [pcmk  ] info: pcmk_startup: Maximum core file size is: 4294967295

Jun 14 19:03:50 node1 corosync[5120]:   [pcmk  ] info: pcmk_startup: Service: 9

Jun 14 19:03:50 node1 corosync[5120]:   [pcmk  ] info: pcmk_startup: Local hostname: node1.amd5.cn

If all of the above commands ran without problems, corosync can now be started on node2:

# ssh node2 -- /etc/init.d/corosync start

Note: start node2 from node1 with the command above; do not start it directly on node2.

Check the startup state of the cluster nodes with:

# crm status

============

Last updated: Tue Jun 14 19:07:06 2011

Stack: openais

Current DC: node1.amd5.cn - partition with quorum

Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87

2 Nodes configured, 2 expected votes

0 Resources configured.

============

Online: [ node1.amd5.cn node2.amd5.cn ]

The output shows that both nodes have started normally and the cluster is functioning.

Running `ps auxf` shows the processes corosync started:

root      4665  0.4  0.8  86736  4244 ?        Ssl  17:00   0:04 corosync

root      4673  0.0  0.4  11720  2260 ?        S    17:00   0:00  \_ /usr/lib/heartbeat/stonithd

101       4674  0.0  0.7  12628  4100 ?        S    17:00   0:00  \_ /usr/lib/heartbeat/cib

root      4675  0.0  0.3   6392  1852 ?        S    17:00   0:00  \_ /usr/lib/heartbeat/lrmd

101       4676  0.0  0.4  12056  2528 ?        S    17:00   0:00  \_ /usr/lib/heartbeat/attrd

101       4677  0.0  0.5   8692  2784 ?        S    17:00   0:00  \_ /usr/lib/heartbeat/pengine

101       4678  0.0  0.5  12136  3012 ?        S    17:00   0:00  \_ /usr/lib/heartbeat/crmd

6. Configure cluster properties: disable stonith

corosync enables stonith by default, but the current cluster has no stonith device, so this default configuration is not yet usable. This can be verified with:

# crm_verify -L 

crm_verify[5202]: 2011/06/14_19:10:38 ERROR: unpack_resources: Resource start-up disabled since no STONITH resources have been defined

crm_verify[5202]: 2011/06/14_19:10:38 ERROR: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option

crm_verify[5202]: 2011/06/14_19:10:38 ERROR: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity

Errors found during check: config not valid

  -V may provide more details

For now, we can disable stonith with:

# crm configure property stonith-enabled=false

View the current configuration with:

# crm configure show

node node1.amd5.cn

node node2.amd5.cn

property $id="cib-bootstrap-options" \

  dc-version="1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87" \

  cluster-infrastructure="openais" \

  expected-quorum-votes="2" \

  stonith-enabled="false"

  

This shows that stonith is now disabled.

The crm and crm_verify commands above are the command-line cluster management tools provided by pacemaker 1.0 and later; they can be run on any node in the cluster.

7. Add cluster resources

corosync supports resource agents of the heartbeat, LSB, and OCF classes; LSB and OCF are the most commonly used, while the stonith class is reserved for configuring stonith devices.

The resource agent classes supported by the current cluster can be listed with:

# crm ra classes 

heartbeat

lsb

ocf / heartbeat pacemaker

stonith

To list all resource agents in a given class, use commands like:

# crm ra list lsb

# crm ra list ocf heartbeat

# crm ra list ocf pacemaker

# crm ra list stonith

# crm ra info [class:[provider:]]resource_agent

For example:

# crm ra info ocf:heartbeat:IPaddr

8. Next, create an IP address resource for the web cluster, to be used when providing the web service through the cluster. This can be done as follows:

Syntax:

primitive <rsc> [<class>:[<provider>:]]<type>

          [params attr_list]

          [operations id_spec]

            [op op_type [<attribute>=<value>...] ...]

op_type :: start | stop | monitor

Example:

 primitive apcfence stonith:apcsmart \

          params ttydev=/dev/ttyS0 hostlist="node1 node2" \

          op start timeout=60s \

          op monitor timeout=60s

In our case:

# crm configure primitive WebIP ocf:heartbeat:IPaddr params ip=172.16.100.1
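The one-liner above defines the address but no monitor operation, so the cluster will not notice if the address silently disappears. A hedged variant follows; the 30s interval and 20s timeout are our assumptions, not values from the original setup:

```
# crm configure primitive WebIP ocf:heartbeat:IPaddr \
    params ip=172.16.100.1 \
    op monitor interval=30s timeout=20s
```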

The output of the following command shows that this resource has started on node1.amd5.cn:

# crm status

============

Last updated: Tue Jun 14 19:31:05 2011

Stack: openais

Current DC: node1.amd5.cn - partition with quorum

Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87

2 Nodes configured, 2 expected votes

1 Resources configured.

============

Online: [ node1.amd5.cn node2.amd5.cn ]

 WebIP  (ocf::heartbeat:IPaddr):  Started node1.amd5.cn

You can also run ifconfig on node1 to see the address active as an alias of eth0:

# ifconfig 

eth0:0    Link encap:Ethernet  HWaddr 00:0C:29:AA:DD:CF  

          inet addr:172.16.100.1  Bcast:172.16.100.255  Mask:255.255.255.0

          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

          Interrupt:67 Base address:0x2000 

          

Next, from node2, stop the corosync service on node1:

# ssh node1 -- /etc/init.d/corosync stop

Check the cluster status:

# crm status

============

Last updated: Tue Jun 14 19:37:23 2011

Stack: openais

Current DC: node2.amd5.cn - partition WITHOUT quorum

Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87

2 Nodes configured, 2 expected votes

1 Resources configured.

============

Online: [ node2.amd5.cn ]

OFFLINE: [ node1.amd5.cn ]

The output shows that node1.amd5.cn is offline, but the WebIP resource did not start on node2.amd5.cn. This is because the cluster state is now "WITHOUT quorum": quorum has been lost, and the cluster itself no longer meets the conditions for normal operation. For a cluster of only two nodes this behavior is unreasonable, so we can tell the cluster to ignore the quorum check:

# crm configure property no-quorum-policy=ignore

After a moment, the cluster starts the resource on the still-running node2, as shown below:

# crm status

============

Last updated: Tue Jun 14 19:43:42 2011

Stack: openais

Current DC: node2.amd5.cn - partition WITHOUT quorum

Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87

2 Nodes configured, 2 expected votes

1 Resources configured.

============

Online: [ node2.amd5.cn ]

OFFLINE: [ node1.amd5.cn ]

 WebIP  (ocf::heartbeat:IPaddr):  Started node2.amd5.cn

 

With that verified, start node1.amd5.cn again normally:

# ssh node1 -- /etc/init.d/corosync start

After node1.amd5.cn comes back, the WebIP resource will very likely move from node2.amd5.cn back to node1.amd5.cn. Every such move between nodes makes the resource temporarily unreachable, so we sometimes want a resource that has failed over to another node to stay there even after the original node recovers. This is achieved by defining resource stickiness, which can be set either when a resource is created or afterwards.

Stickiness value ranges and their effects:

0: the default. The resource is placed at the most suitable location in the system, which means it moves when a "better" or worse-loaded node becomes available. This is essentially automatic failback, except the resource may move to a node other than the previously active one;

greater than 0: the resource prefers to stay where it is, but will move if a more suitable node becomes available. Higher values mean a stronger preference to stay;

less than 0: the resource prefers to move away from its current location. Higher absolute values mean a stronger preference to leave;

INFINITY: unless the resource is forced off because the node can no longer run it (node shutdown, node standby, migration-threshold reached, or configuration change), it always stays put. This is almost equivalent to completely disabling automatic failback;

-INFINITY: the resource always moves away from its current location.

Here we can set a default stickiness value for resources as follows:

# crm configure rsc_defaults resource-stickiness=100

9. Combine the IP address resource configured above into an active/passive web (httpd) service cluster

To turn this cluster into a web (httpd) server cluster, first install httpd on each node and configure each to serve a local test page.

Node1:

# yum -y install httpd

# echo "<h1>Node1.amd5.cn</h1>" > /var/www/html/index.html

Node2:

# yum -y install httpd

# echo "<h1>Node2.amd5.cn</h1>" > /var/www/html/index.html

Then start httpd manually on each node and confirm it serves correctly. Next, stop httpd and make sure it will not start automatically (run on both nodes):

# /etc/init.d/httpd stop

# chkconfig httpd off

Now add the httpd service as a cluster resource. Two resource agents are available for httpd: lsb and ocf:heartbeat. For simplicity, we use the lsb class here.

First, view the metadata of the lsb httpd resource agent:

# crm ra info lsb:httpd

lsb:httpd

Apache is a World Wide Web server.  It is used to serve \

         HTML files and CGI.

Operations' defaults (advisory minimum):

    start         timeout=15

    stop          timeout=15

    status        timeout=15

    restart       timeout=15

    force-reload  timeout=15

    monitor       interval=15 timeout=15 start-delay=15

Next, create the WebSite resource:

# crm configure primitive WebSite lsb:httpd

View the resulting definition in the configuration:

node node1.amd5.cn

node node2.amd5.cn

primitive WebIP ocf:heartbeat:IPaddr \

  params ip="172.16.100.1"

primitive WebSite lsb:httpd

property $id="cib-bootstrap-options" \

  dc-version="1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87" \

  cluster-infrastructure="openais" \

  expected-quorum-votes="2" \

  stonith-enabled="false" \

  no-quorum-policy="ignore"

  

Check the resource status:

# crm status

============

Last updated: Tue Jun 14 19:57:31 2011

Stack: openais

Current DC: node2.amd5.cn - partition with quorum

Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87

2 Nodes configured, 2 expected votes

2 Resources configured.

============

Online: [ node1.amd5.cn node2.amd5.cn ]

 WebIP  (ocf::heartbeat:IPaddr):  Started node1.amd5.cn

 WebSite  (lsb:httpd):  Started node2.amd5.cn

 

The output shows that WebIP and WebSite may run on different nodes, which does not work for an application serving the web through this IP: the two resources must run on the same node.

So even when a cluster has all the required resources, it may still not handle them correctly. Resource constraints specify on which cluster nodes resources run, in what order they are loaded, and which other resources a given resource depends on. pacemaker provides three kinds of resource constraints:

1) Resource Location: defines on which nodes a resource may, may not, or preferably should run;

2) Resource Colocation: defines whether cluster resources may or may not run together on the same node;

3) Resource Order: defines the order in which cluster resources are started on a node.

When defining constraints, you also specify scores. Scores of all kinds are a central part of how the cluster works: everything from migrating resources to deciding which resources to stop in a degraded cluster is achieved by manipulating scores in some way. Scores are computed per resource; any node with a negative score for a resource cannot run that resource. After computing a resource's scores, the cluster chooses the node with the highest score. INFINITY is currently defined as 1,000,000. Adding and subtracting infinity follows three basic rules:

1) any value + INFINITY = INFINITY

2) any value - INFINITY = -INFINITY

3) INFINITY - INFINITY = -INFINITY

When defining resource constraints, you can also assign each constraint a score. The score represents the value attached to the constraint; constraints with higher scores are applied before those with lower scores. By creating multiple location constraints with different scores for a given resource, you can specify the order of the nodes it fails over to.
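The three infinity rules can be sketched as a toy saturating-addition function. This is illustrative only, not pacemaker's actual implementation; score_add and INF are our own names:

```shell
# Toy model of pacemaker score arithmetic: INFINITY saturates at 1000000.
INF=1000000
score_add() {
  a=$1 b=$2
  if [ "$a" -le "-$INF" ] || [ "$b" -le "-$INF" ]; then
    echo "-$INF"                     # rules 2 and 3: -INFINITY dominates
  elif [ "$a" -ge "$INF" ] || [ "$b" -ge "$INF" ]; then
    echo "$INF"                      # rule 1: any value + INFINITY = INFINITY
  else
    s=$((a + b))
    if [ "$s" -gt "$INF" ]; then s=$INF; fi      # ordinary sums clamp
    if [ "$s" -lt "-$INF" ]; then s="-$INF"; fi  # at +/- INFINITY
    echo "$s"
  fi
}

score_add 200 1000000        # -> 1000000
score_add 1000000 -1000000   # -> -1000000
```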

Thus, the problem of WebIP and WebSite possibly running on different nodes can be solved with:

# crm configure colocation website-with-ip INFINITY: WebSite WebIP

Next, we must ensure that WebIP is started before WebSite on a node, which can be done with:

# crm configure order httpd-after-ip mandatory: WebIP WebSite

In addition, since an HA cluster does not require its nodes to have equal or similar performance, we may want the service to normally run on a more powerful node. This is done with a location constraint:

# crm configure location prefer-node1 WebSite rule 200: #uname eq node1.amd5.cn

This constrains WebSite to node1 with a score of 200.
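Taken together, the three constraints above would show up in `crm configure show` roughly as follows (a sketch; the normalized `inf:` scores and the `#uname` rule form are our assumptions about how the crm shell renders them):

```
colocation website-with-ip inf: WebSite WebIP
order httpd-after-ip inf: WebIP WebSite
location prefer-node1 WebSite \
    rule 200: #uname eq node1.amd5.cn
```

A simpler alternative that covers both the colocation and the ordering at once is to put the two resources in a group, e.g. `crm configure group WebService WebIP WebSite`.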

Supplementary notes:

A multicast address identifies a group of hosts that have joined a multicast group. On Ethernet, a multicast address is a 48-bit identifier naming a set of stations that should receive a given frame on the network. In IPv4, multicast addresses were historically called class D addresses, ranging from 224.0.0.0 to 239.255.255.255, i.e. 224.0.0.0/4. In IPv6, multicast addresses have the prefix ff00::/8.

An Ethernet multicast address is any address whose first byte has its lowest bit set, e.g. 01-12-0f-00-00-02. The broadcast address, with all 48 bits set to 1, is also a multicast address; broadcast is a special case of multicast, just as a square is a special case of a rectangle.

colocation (collocation)

This constraint expresses the placement relation between two or more resources. If there are more than two resources, then the constraint is called a resource set. Collocation resource sets have an extra attribute to allow for sets of resources which don’t depend on each other in terms of state. The shell syntax for such sets is to put resources in parentheses.

Usage:

        colocation <id> <score>: <rsc>[:<role>] <rsc>[:<role>] ...

Example:

        colocation dummy_and_apache -inf: apache dummy

        colocation c1 inf: A ( B C )

order

This constraint expresses the order of actions on two resources or more resources. If there are more than two resources, then the constraint is called a resource set. Ordered resource sets have an extra attribute to allow for sets of resources whose actions may run in parallel. The shell syntax for such sets is to put resources in parentheses.

Usage:

        order <id> score-type: <rsc>[:<action>] <rsc>[:<action>] ...

          [symmetrical=<bool>]

        score-type :: advisory | mandatory | <score>

Example:

        order c_apache_1 mandatory: apache:start ip_1

        order o1 inf: A ( B C )

property

Set the cluster (crm_config) options.

Usage:

        property [$id=<set_id>] <option>=<value> [<option>=<value> ...]

Example:

        property stonith-enabled=true

rsc_defaults

Set defaults for the resource meta attributes.

Usage:

        rsc_defaults [$id=<set_id>] <option>=<value> [<option>=<value> ...]

Example:

        rsc_defaults failure-timeout=3m

Shadow CIB usage

Shadow CIB is a new feature. Shadow CIBs may be manipulated in the same way as the live CIB, but the changes have no effect on the cluster resources: nothing takes effect until the configure commit command is issued.

    crm(live)configure# cib new test-2

    INFO: test-2 shadow CIB created

    crm(test-2)configure# commit

Global Cluster Options

no-quorum-policy

ignore

The quorum state does not influence the cluster behavior at all, resource management is continued.

freeze

If quorum is lost, the cluster freezes. Resource management is continued: running resources are not stopped (but possibly restarted in response to monitor events), but no further resources are started within the affected partition.

stop (default value)

If quorum is lost, all resources in the affected cluster partition are stopped in an orderly fashion.

suicide

Fence all nodes in the affected cluster partition.

stonith-enabled

This global option defines whether to apply fencing, allowing STONITH devices to shoot failed nodes and nodes with resources that cannot be stopped.

Supported Resource Agent Classes

Legacy Heartbeat 1 Resource Agents

Linux Standards Base (LSB) Scripts

Open Cluster Framework (OCF) Resource Agents

STONITH Resource Agents

Types of Resources

Primitives

Groups

Clones

Masters

Resource Options (Meta Attributes)

Resource stickiness:

>0: the resource prefers to stay on its current node;

<0: the resource prefers to leave its current node;

=0: the HA software decides placement;

INFINITY / -INFINITY: the resource always stays / always leaves.

Scratch example: a group Web with a location score of 500 on node1 runs there; on Active/Passive failover it moves to node2, and with stickiness INFINITY on node2 it does not fail back.

Constraint types: location, order, colocation.

In a partitioned cluster, behavior is decided by votes and quorum.

# crm

crm(live)# cib new active

INFO: active shadow CIB created

crm(active)# configure clone WebIP ClusterIP \

    meta globally-unique="true" clone-max="2" clone-node-max="2"

crm(active)# configure show

node pcmk-1

node pcmk-2

primitive WebData ocf:linbit:drbd \

    params drbd_resource="wwwdata" \

    op monitor ...

primitive WebFS ocf:heartbeat:Filesystem \

    params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"

primitive WebSite ocf:heartbeat:apache \

    params ... \

    op monitor ...

primitive ClusterIP ocf:heartbeat:IPaddr2 \

    params ip="192.168.122.101" cidr_netmask="32" clusterip_hash="sourceip" \

    op monitor ...

ms WebDataClone WebData \

    meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"

clone WebIP ClusterIP \

    meta globally-unique="true" clone-max="2" clone-node-max="2"

colocation WebSite-with-WebFS inf: WebSite WebFS

colocation fs_on_drbd inf: WebFS WebDataClone:Master

colocation website-with-ip inf: WebSite WebIP

order WebFS-after-WebData inf: WebDataClone:promote WebFS:start

order WebSite-after-WebFS inf: WebFS WebSite

order apache-after-ip inf: WebIP WebSite

property $id="cib-bootstrap-options" \

    dc-version="1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87" \

    cluster-infrastructure="openais" \

    expected-quorum-votes="2" \

    stonith-enabled="false" \

    no-quorum-policy="ignore"

rsc_defaults $id="rsc-options" \

    resource-stickiness="100"
