Pacemaker + corosync on CentOS 6

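This is part of my study for LPIC-304. I have used Pacemaker on CentOS 7, but I have little experience with Pacemaker on CentOS 6, so I tried building a cluster. The setup uses a VIP resource and an httpd resource, so that accessing the VIP displays a web page.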

Configuration

Hostname   IP             VIP
centos6-1  192.168.0.102  192.168.0.100
centos6-2  192.168.0.156  (shared with centos6-1)

Installation

yum -y install pacemaker

# Install the crm command
wget -P /etc/yum.repos.d/ http://download.opensuse.org/repositories/network:ha-clustering:Stable/RedHat_RHEL-6/network:ha-clustering:Stable.repo
yum -y install crmsh

Authentication settings

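Apply the following settings on both nodes so that the contents match. Note that /etc/ha.d/authkeys is the key file read by Heartbeat. corosync itself reads /etc/corosync/authkey, and only when secauth is on; the corosync.conf in this article sets secauth: off.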

vi /etc/ha.d/authkeys
auth 1
1 sha1 secret

chmod 600 /etc/ha.d/authkeys

Config settings

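This time I am using servers built on CloudStack. CloudStack does not support multicast, so I use udpu (UDP unicast) as the transport. The config uses the older member notation; newer versions apparently use a node notation instead.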

https://qiita.com/tukiyo3/items/162e131007365fc4fe80

# cp /etc/corosync/corosync.conf.example.udpu /etc/corosync/corosync.conf
# vi /etc/corosync/corosync.conf

# cat /etc/corosync/corosync.conf
# Please read the corosync.conf.5 manual page
compatibility: whitetank
aisexec {
        user: root
        group: root
}
service {
        name: pacemaker
        ver: 0
        use_mgmtd: yes
}

totem {
        version: 2
        secauth: off
        interface {
                member {
                        memberaddr: 192.168.0.102
                }
                member {
                        memberaddr: 192.168.0.156
                }
                ringnumber: 0
                bindnetaddr: 192.168.0.0
                mcastport: 5405
                ttl: 1
        }
        transport: udpu
}

logging {
        fileline: off
        to_logfile: yes
        to_syslog: yes
        logfile: /var/log/cluster/corosync.log
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}
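The member entries have to be identical on both nodes, so it can help to generate them from a list of addresses rather than type them twice. The following is only a minimal sketch: gen_members is a hypothetical helper name, not something shipped with corosync, and the indentation simply mirrors the config above.

```shell
# Hypothetical helper (not part of corosync): print one "member { ... }"
# block per address, indented to sit inside totem/interface.
gen_members() {
    for addr in "$@"; do
        printf '                member {\n'
        printf '                        memberaddr: %s\n' "$addr"
        printf '                }\n'
    done
}

# Emit the blocks for the two nodes used in this article.
gen_members 192.168.0.102 192.168.0.156
```

The output can be pasted into the interface section of corosync.conf on each node.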

Change the log directory ownership and start the service

Do this on both nodes.

chown -R hacluster: /var/log/cluster
/etc/rc.d/init.d/corosync start

Verification

Without -1, crm_mon shows a continuously updating display; with -1, it prints the status once and exits.

[root@centos6-1 ~]# crm_mon -1
Stack: classic openais (with plugin)
Current DC: centos6-2 (version 1.1.18-3.el6-bfe4e80420) - partition with quorum
Last updated: Sat Jul 25 21:51:34 2020
Last change: Sat Jul 25 18:53:23 2020 by hacluster via crmd on centos6-2

2 nodes configured (2 expected votes)
0 resources configured

Online: [ centos6-1 centos6-2 ]

No active resources

[root@centos6-1 ~]# crm status
Stack: classic openais (with plugin)
Current DC: centos6-2 (version 1.1.18-3.el6-bfe4e80420) - partition with quorum
Last updated: Sat Jul 25 22:01:49 2020
Last change: Sat Jul 25 18:53:23 2020 by hacluster via crmd on centos6-2

2 nodes configured (2 expected votes)
0 resources configured

Online: [ centos6-1 centos6-2 ]

No resources
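For scripted checks, the Online line of this output is the part worth extracting. Below is a rough sketch that parses it; online_nodes is a hypothetical helper name, and it assumes the "Online: [ ... ]" line format shown above.

```shell
# Hypothetical helper: read `crm_mon -1` output on stdin and print the
# node names from the "Online: [ ... ]" line, one per line.
online_nodes() {
    sed -n 's/^Online: \[ \(.*\) ]$/\1/p' | tr ' ' '\n'
}

# Sample input copied from the output above. On a live node one would
# pipe `crm_mon -1` into the function instead.
sample='2 nodes configured (2 expected votes)
0 resources configured

Online: [ centos6-1 centos6-2 ]'

printf '%s\n' "$sample" | online_nodes
```

If the function prints fewer names than expected, one of the nodes has not joined the cluster.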

Initial settings and resource settings

# Disable STONITH
crm configure property \
  stonith-enabled=false

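# Ignore loss of quorum (a cluster left with half of its nodes or fewer is considered unable to provide service, and a two-node cluster reaches that state as soon as one node goes down)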
crm configure property \
  no-quorum-policy=ignore

# Stop resources from automatically moving back to the original node when it rejoins
crm configure rsc_defaults \
  resource-stickiness=100
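The reason for no-quorum-policy=ignore comes down to simple arithmetic: a partition has quorum only when it holds a strict majority of the votes. The sketch below is a simplified model of that rule, not Pacemaker's actual implementation, and has_quorum is a hypothetical helper name.

```shell
# has_quorum TOTAL ALIVE: succeed when ALIVE is a strict majority of TOTAL.
# Simplified model of majority quorum; hypothetical helper name.
has_quorum() {
    total=$1
    alive=$2
    [ $((alive * 2)) -gt "$total" ]
}

for alive in 2 1; do
    if has_quorum 2 "$alive"; then
        echo "2-node cluster, $alive alive: quorum"
    else
        echo "2-node cluster, $alive alive: no quorum"
    fi
done
```

With two nodes, one surviving node holds exactly half of the votes, which is not a majority, so without the ignore setting the survivor would not keep the resources running.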

Check the configuration

[root@centos6-1 ~]# crm configure show
node centos6-1
node centos6-2
property cib-bootstrap-options: \
        have-watchdog=false \
        dc-version=1.1.18-3.el6-bfe4e80420 \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        stonith-enabled=false \
        no-quorum-policy=ignore
rsc_defaults rsc-options: \
        resource-stickiness=100

VIP configuration

# crm configure primitive vip ocf:heartbeat:IPaddr2 \
params ip="192.168.0.100" \
nic="eth0" \
cidr_netmask="24" \
op start interval="0s" timeout="60s" \
op monitor interval="5s" timeout="20s" \
op stop interval="0s" timeout="60s"

# crm configure show
node centos6-1
node centos6-2
primitive vip IPaddr2 \
        params ip=192.168.0.100 nic=eth0 cidr_netmask=24 \
        op start interval=0s timeout=60s \
        op monitor interval=5s timeout=20s \
        op stop interval=0s timeout=60s
property cib-bootstrap-options: \
        have-watchdog=false \
        dc-version=1.1.18-3.el6-bfe4e80420 \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        stonith-enabled=false \
        no-quorum-policy=ignore
rsc_defaults rsc-options: \
        resource-stickiness=100

Confirm that the VIP has been assigned

[root@centos6-1 ~]# ip -4 a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 192.168.0.102/24 brd 192.168.0.255 scope global eth0
    inet 192.168.0.100/24 brd 192.168.0.255 scope global secondary eth0
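The same check can be scripted by looking for the VIP in the ip output. This is only a sketch; has_vip is a hypothetical helper name, and it assumes the "inet ADDR/PREFIX" format shown above.

```shell
# has_vip ADDR: read `ip -4 a` output on stdin; succeed if ADDR is assigned.
# Hypothetical helper; -F makes grep treat the dots in ADDR literally.
has_vip() {
    grep -qF "inet $1/"
}

# Sample input copied from the output above.
sample='2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 192.168.0.102/24 brd 192.168.0.255 scope global eth0
    inet 192.168.0.100/24 brd 192.168.0.255 scope global secondary eth0'

if printf '%s\n' "$sample" | has_vip 192.168.0.100; then
    echo "VIP is on this node"
else
    echo "VIP is not on this node"
fi
```

On a live node the input would come from `ip -4 a` instead of the sample text.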

Check the status at this point

[root@centos6-1 ~]# crm_mon -1
Stack: classic openais (with plugin)
Current DC: centos6-2 (version 1.1.18-3.el6-bfe4e80420) - partition with quorum
Last updated: Sat Jul 25 22:19:00 2020
Last change: Sat Jul 25 22:17:13 2020 by root via cibadmin on centos6-1

2 nodes configured (2 expected votes)
1 resource configured

Online: [ centos6-1 centos6-2 ]

Active resources:

 vip    (ocf::heartbeat:IPaddr2):       Started centos6-1

httpd configuration

Configure this on both servers.

Config settings

vi /etc/httpd/conf/httpd.conf
# Lines 921-926: uncomment and change the allowed access range
<Location /server-status>
   SetHandler server-status
   Order deny,allow
   Deny from all
   Allow from 127.0.0.1 10.0.0.0/24
</Location>

httpd resource and colocation settings

crm configure primitive httpd ocf:heartbeat:apache \
params configfile="/etc/httpd/conf/httpd.conf" \
port="80" \
op start interval="0s" timeout="60s" \
op monitor interval="5s" timeout="20s" \
op stop interval="0s" timeout="60s"

Colocation

crm configure colocation coloc_1 inf: vip httpd

Instead of a colocation constraint, you can also put the resources in a group.

crm configure group webserver vip httpd
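A group keeps its members on the same node and starts them in the listed order, so it covers more than the single colocation constraint does. As a rough illustration, the sketch below only prints (it does not run) the explicit constraints that would approximate a two-member group. print_group_equivalent and the constraint IDs are hypothetical names, and the Mandatory: keyword assumes a crmsh version that accepts it in order constraints.

```shell
# Hypothetical helper: print (do not run) crm commands that spell out
# roughly what `group NAME first second` implies for two resources.
print_group_equivalent() {
    first=$1
    second=$2
    # The second resource must run on the node where the first one runs.
    echo "crm configure colocation coloc_${second}_with_${first} inf: $second $first"
    # The first resource must start before the second one.
    echo "crm configure order order_${first}_then_${second} Mandatory: $first $second"
}

print_group_equivalent vip httpd
```

Printing the commands first makes it possible to review them before applying anything to the cluster.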

Verification (when grouped)

[root@centos6-1 ~]# crm status
Stack: classic openais (with plugin)
Current DC: centos6-2 (version 1.1.18-3.el6-bfe4e80420) - partition with quorum
Last updated: Sat Jul 25 22:28:32 2020
Last change: Sat Jul 25 22:28:17 2020 by hacluster via cibadmin on centos6-2

2 nodes configured (2 expected votes)
2 resources configured

Online: [ centos6-1 centos6-2 ]

Full list of resources:

 Resource Group: webserver
     vip        (ocf::heartbeat:IPaddr2):       Started centos6-1
     httpd      (ocf::heartbeat:apache):        Started centos6-1

Failed Actions:
* httpd_monitor_0 on centos6-1 'unknown error' (1): call=12, status=complete, exitreason='',
    last-rc-change='Sat Jul 25 22:24:16 2020', queued=0ms, exec=102ms


[root@centos6-1 ~]# crm configure show
node centos6-1
node centos6-2
primitive httpd apache \
        params configfile="/etc/httpd/conf/httpd.conf" port=80 \
        op start interval=0s timeout=60s \
        op monitor interval=5s timeout=20s \
        op stop interval=0s timeout=60s
primitive vip IPaddr2 \
        params ip=192.168.0.100 nic=eth0 cidr_netmask=24 \
        op start interval=0s timeout=60s \
        op monitor interval=5s timeout=20s \
        op stop interval=0s timeout=60s
group webserver vip httpd
property cib-bootstrap-options: \
        have-watchdog=false \
        dc-version=1.1.18-3.el6-bfe4e80420 \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        stonith-enabled=false \
        no-quorum-policy=ignore
rsc_defaults rsc-options: \
        resource-stickiness=100
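The crm status output above lists a failed httpd_monitor_0 operation even though httpd is reported as Started. Failure records like this stay in the status until they are cleared. The sketch below pulls the failed operation names out of the status text; failed_ops is a hypothetical helper name, and it assumes the "Failed Actions:" layout shown above.

```shell
# Hypothetical helper: read `crm status` output on stdin and print the
# operation names listed under "Failed Actions:" (lines starting with "* ").
failed_ops() {
    awk '/^Failed Actions:/ { in_failed = 1; next }
         in_failed && /^\* / { print $2 }'
}

# Sample input copied from the output above.
sample="Online: [ centos6-1 centos6-2 ]

Failed Actions:
* httpd_monitor_0 on centos6-1 'unknown error' (1): call=12, status=complete, exitreason='',
    last-rc-change='Sat Jul 25 22:24:16 2020', queued=0ms, exec=102ms"

printf '%s\n' "$sample" | failed_ops
```

Once the cause has been looked into, crm resource cleanup httpd clears the recorded failure.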

Try accessing it

Accessing the global IP looks like this.

[Screenshot of the web page]

Enter maintenance mode

Put the whole cluster into maintenance mode. The httpd service itself keeps running.

[root@centos6-1 ~]# crm configure property maintenance-mode=true
[root@centos6-1 ~]# crm status
Stack: classic openais (with plugin)
Current DC: centos6-2 (version 1.1.18-3.el6-bfe4e80420) - partition with quorum
Last updated: Sat Jul 25 22:57:31 2020
Last change: Sat Jul 25 22:57:28 2020 by root via cibadmin on centos6-1

2 nodes configured (2 expected votes)
2 resources configured

              *** Resource management is DISABLED ***
  The cluster will not attempt to start, stop or recover services

Online: [ centos6-1 centos6-2 ]

Full list of resources:

 Resource Group: webserver
     vip        (ocf::heartbeat:IPaddr2):       Started centos6-1 (unmanaged)
     httpd      (ocf::heartbeat:apache):        Started centos6-1 (unmanaged)

Failed Actions:
* httpd_monitor_0 on centos6-1 'unknown error' (1): call=12, status=complete, exitreason='',
    last-rc-change='Sat Jul 25 22:24:16 2020', queued=0ms, exec=102ms

[root@centos6-1 ~]# ip -4 a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 192.168.0.102/24 brd 192.168.0.255 scope global eth0
    inet 192.168.0.100/24 brd 192.168.0.255 scope global secondary eth0

Leave maintenance mode.

[root@centos6-1 ~]# crm configure property maintenance-mode=false
[root@centos6-1 ~]#
[root@centos6-1 ~]# crm status
Stack: classic openais (with plugin)
Current DC: centos6-2 (version 1.1.18-3.el6-bfe4e80420) - partition with quorum
Last updated: Sat Jul 25 22:58:49 2020
Last change: Sat Jul 25 22:58:45 2020 by root via cibadmin on centos6-1

2 nodes configured (2 expected votes)
2 resources configured

Online: [ centos6-1 centos6-2 ]

Full list of resources:

 Resource Group: webserver
     vip        (ocf::heartbeat:IPaddr2):       Started centos6-1
     httpd      (ocf::heartbeat:apache):        Started centos6-1

Failed Actions:
* httpd_monitor_0 on centos6-1 'unknown error' (1): call=12, status=complete, exitreason='',
    last-rc-change='Sat Jul 25 22:24:16 2020', queued=0ms, exec=102ms
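Since it is easy to forget to switch maintenance mode off again, the on and off steps can be wrapped around a task. This is a minimal sketch under assumptions: with_maintenance and the CRM_CMD variable are hypothetical names introduced here, and CRM_CMD exists only so the wrapper can be tried with a stand-in command.

```shell
# Command used to talk to the cluster. Overridable (hypothetical variable)
# so the wrapper can be exercised without a real cluster.
CRM_CMD="${CRM_CMD:-crm}"

# with_maintenance CMD [ARGS...]: enable maintenance mode, run CMD,
# then disable maintenance mode again and return CMD's exit status.
with_maintenance() {
    "$CRM_CMD" configure property maintenance-mode=true || return 1
    "$@"
    rc=$?
    "$CRM_CMD" configure property maintenance-mode=false
    return "$rc"
}

# Example (not executed here):
#   with_maintenance vi /etc/httpd/conf/httpd.conf
```

The wrapper does not handle interruption by signals, so after a Ctrl-C the property still has to be set back to false by hand.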

Try a failover

Let's try a failover. The resources are currently on centos6-1, so I move them to centos6-2. Running crm node standby [node_name] against the node that currently holds the resources triggers the failover.

[root@centos6-1 ~]# crm node standby centos6-1
[root@centos6-1 ~]# crm status
Stack: classic openais (with plugin)
Current DC: centos6-2 (version 1.1.18-3.el6-bfe4e80420) - partition with quorum
Last updated: Sat Jul 25 23:01:02 2020
Last change: Sat Jul 25 23:01:00 2020 by root via crm_attribute on centos6-1

2 nodes configured (2 expected votes)
2 resources configured

Node centos6-1: standby
Online: [ centos6-2 ]

Full list of resources:

 Resource Group: webserver
     vip        (ocf::heartbeat:IPaddr2):       Started centos6-2
     httpd      (ocf::heartbeat:apache):        Started centos6-2

Failed Actions:
* httpd_monitor_0 on centos6-1 'unknown error' (1): call=12, status=complete, exitreason='',
    last-rc-change='Sat Jul 25 22:24:16 2020', queued=0ms, exec=102ms
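To confirm a failover from a script, the node column of the resource lines can be extracted. The following is only a sketch; resource_node is a hypothetical helper name, and it assumes the "NAME (AGENT): Started NODE" layout shown above.

```shell
# resource_node NAME: read `crm status` output on stdin and print the node
# on which resource NAME is reported as Started. Hypothetical helper.
resource_node() {
    awk -v rsc="$1" '$1 == rsc && $3 == "Started" { print $4 }'
}

# Sample input copied from the output above.
sample=' Resource Group: webserver
     vip        (ocf::heartbeat:IPaddr2):       Started centos6-2
     httpd      (ocf::heartbeat:apache):        Started centos6-2'

printf '%s\n' "$sample" | resource_node vip
```

Comparing the result before and after crm node standby shows whether the resource actually moved.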

The IP has moved to centos6-2.

[root@centos6-2 ~]# ip -4 a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 192.168.0.156/24 brd 192.168.0.255 scope global eth0
    inet 192.168.0.100/24 brd 192.168.0.255 scope global secondary eth0

The web page looks like this.

[Screenshot of the web page]

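Bring the node back from standby. Even after it comes back, the resources stay on centos6-2, which is the effect of the resource-stickiness=100 setting made earlier.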

[root@centos6-1 ~]# crm node online centos6-1
[root@centos6-1 ~]# crm status
Stack: classic openais (with plugin)
Current DC: centos6-2 (version 1.1.18-3.el6-bfe4e80420) - partition with quorum
Last updated: Sat Jul 25 23:01:42 2020
Last change: Sat Jul 25 23:01:39 2020 by hacluster via crm_attribute on centos6-2

2 nodes configured (2 expected votes)
2 resources configured

Online: [ centos6-1 centos6-2 ]

Full list of resources:

 Resource Group: webserver
     vip        (ocf::heartbeat:IPaddr2):       Started centos6-2
     httpd      (ocf::heartbeat:apache):        Started centos6-2

Failed Actions:
* httpd_monitor_0 on centos6-1 'unknown error' (1): call=12, status=complete, exitreason='',
    last-rc-change='Sat Jul 25 22:24:16 2020', queued=0ms, exec=102ms