.. _tutorial-master-cluster:


==============
Master Cluster
==============

A clustered Salt Master has several advantages over Salt's traditional High
Availability options. First, a master cluster is meant to be served behind a
load balancer. Minions only need to know about the load balancer's IP address.
Therefore, masters can be added and removed from a cluster without the need to
re-configure minions. Another major benefit of master clusters over Salt's
older HA implimentations is that Masters in a cluster share the load of all
jobs. This allows Salt administrators to more easily scale their environments
to handle larger numbers of minions and larger jobs.

Minimum Requirements
====================

Running a cluster master requires all nodes in the cluster to have a shared
filesystem. The `cluster_pki_dir`, `cache_dir`, `file_roots` and `pillar_roots`
must all be on a shared filesystem. Most implementations will also serve the
masters publish and request server ports via a tcp load balancer. All of the
masters in a cluster are assumed to be running on a reliable local area
network.

Each master in a cluster maintains its own public and private key, and an in
memory aes key. Each cluster peer also has access to the `cluster_pki_dir`
where a cluster wide public and private key are stored. In addition, the cluster
wide aes key is generated and stored in the `cluster_pki_dir`. Further,
when operating as a cluster, minion keys are stored in the `cluster_pki_dir`
instead of the master's `pki_dir`.


Reference Implementation
========================

Gluster: https://docs.gluster.org/en/main/Quick-Start-Guide/Quickstart/

HAProxy:

.. code-block:: text

        frontend salt-master-pub
            mode tcp
            bind 10.27.5.116:4505
            option tcplog
            # This timeout is equal to the publish_session setting of the
            # masters.
            timeout client 86400s
            default_backend salt-master-pub-backend

        backend salt-master-pub-backend
            mode tcp
            #option log-health-checks
            log global
            balance roundrobin
            timeout connect 10s
            # This timeout is equal to the publish_session setting of the
            # masters.
            timeout server 86400s
            server rserve1 10.27.12.13:4505 check
            server rserve2 10.27.7.126:4505 check
            server rserve3 10.27.3.73:4505 check

        frontend salt-master-req
            mode tcp
            bind 10.27.5.116:4506
            option tcplog
            timeout client  1m
            default_backend salt-master-req-backend

        backend salt-master-req-backend
            mode tcp
            log global
            balance roundrobin
            timeout connect 10s
            timeout server 1m
            server rserve1 10.27.12.13:4506 check
            server rserve2 10.27.7.126:4506 check
            server rserve3 10.27.3.73:4506 check

Master Config:

.. code-block:: yaml

        id: 10.27.12.13
        cluster_id: master_cluster
        cluster_peers:
          - 10.27.7.126
          - 10.27.3.73
        cluster_pki_dir: /my/gluster/share/pki
        cachedir: /my/gluster/share/cache
        file_roots:
          base:
            - /my/gluster/share/srv/salt
        pillar_roots:
          base:
            - /my/gluster/share/srv/pillar
